On Monday, I was testing our Freedome VPN for Windows and eventually… I forgot that I was using our London exit node. And then I attempted to log in to Twitter.
This was the result:
And then I received this message via e-mail:
An unusual device or location?
In order to determine that I was attempting to log in from an "unusual" location, Twitter must be keeping a history of my previous IP addresses to compare against. This type of security feature is not new; Facebook has been doing this sort of thing for years already. But I've not yet seen it from Twitter. (A few years ago, Twitter seemed to be actively against such an idea.) Unlike Facebook, I don't see any place from which I can download my own connection history. Previous IP addresses are available to those who download a Facebook archive, but IP address information isn't in the Twitter archive that I downloaded today.
So the questions I now have for Twitter are these: for how long have my connections been logged and tracked? And when will a copy of the data be available to me?
March 11th update:
Eagle-eyed reader Tero Alhonen found the answer to one of my questions in Twitter's Privacy Policy. Twitter "may" receive information such as IP address and will "either delete Log Data or remove any common account identifiers" "after 18 months." The language about 18 months was first included in version 5 of the policy, June 23, 2011.
Tuesday, April 7, 2015
Smart Home Safe
Internet of Things (IoT) devices can help you save time and hassle and improve your quality of life. For example, you can check the contents of your fridge and turn on the oven while you're still at the grocery store, saving money, uncertainty, and time when preparing dinner for your family. This is great, and many people will benefit from features like these. However, as with all changes, the opportunity comes with risks. Most of these risks concern your online security and privacy, but some extend to the physical world as well. For example, being able to remotely open your front door lock for the plumber can be a great time saver, but it also means that anyone who hacks your cloud accounts can open your door too -- and possibly sell access to your home on dark markets. And it's not just about hacking: these gadgets collect data about what's happening in your home and life, and so they themselves present a risk to your privacy.
Image: The above image shows a typical smart home configuration and the kinds of attacks it can face. While the smart home is not a target at the moment due to its low adoption rate and high fragmentation, all of the layers can be attacked with existing techniques.
If you are extremely worried about your privacy and security, the only way to really stay safe is not to buy or use these gadgets. However, for most people, the time-saving convenience of IoT and the Smart Home will outweigh most of the privacy and security concerns. Also, IoT devices are not widely targeted at the moment, and even when they are, the attackers are after the computing power of the device -- not yet your data or your home. Actually, the biggest risk right now comes from the way the manufacturers of these devices handle your personal data. All that said, you shouldn't just blindly jump in. There are some things you can do to reduce the risks:
• Do not connect these devices directly to public internet addresses. Use a firewall or at least a NAT (Network Address Translation) router in front of the devices to make sure they are not discoverable from the Internet. You should disable UPnP (Universal Plug and Play) on your router if you want to make sure the devices cannot open a port on your public internet address. (A quick sketch for checking what a device exposes on your local network appears at the end of this post.)
• Go through the privacy and security settings of the device or service and remove everything you don't need. For many of these devices the currently available settings are precious few, however. Shut down features you don't need if you think they might have any privacy implications. For example, do you really use the voice commands feature in your Smart TV or gaming console? If you never use it, just disable it. You can always turn it back on if you want to give the feature a try later.
• When you register for the cloud service of an IoT device, use a strong and unique password and keep it safe. Change the password if you think there is a risk that someone has captured it. Also, as all of these services allow a password reset through your email account, make sure you secure the email account with a truly strong password and keep that password safe. Use 2-factor authentication (2FA) where available -- and for most popular email services it is available today. (A minimal passphrase-generator sketch follows this list.)
• Keep your PCs, tablets, and mobile phones clear of malware. Malware often steals passwords and may hence steal the password to your smart home service or the email account linked to it. You need to install security software onto devices where you use the passwords, keep your software updated with the latest security fixes, and, as an example, make sure you don't click on links or attachments in weird spam emails.
• Think carefully if you really want to use remotely accessible smart locks on your home doors. If you're one of those people who leave the key under the door mat or the flower pot, you're probably safer with a smart lock, though.
• If you install security cameras and nannycams, disconnect them from the network when you have no need for them. Consider doing the same for devices that constantly send audio from your home to the cloud unless you really do use them all the time. Remember that most IoT devices don't have much computing power and hence the audio and video processing is most likely done on some server in the cloud.
• Use encryption (preferably WPA2) in your home Wi-Fi. Use a strong Wi-Fi passphrase and keep it safe. Without a passphrase, with a weak passphrase, or when using an obsolete protocol such as WEP, your home Wi-Fi becomes an open network from a security perspective.
• Be careful when using open Wi-Fi networks such as the network in a coffee shop, a shopping mall, or a hotel. If you or your applications send your passwords in clear text, they can be stolen and you may become a victim of a man-in-the-middle (MitM) attack. Always use a VPN application when on open Wi-Fi. Again, your passwords are the key to your identity and also to your personal Internet of Things.
• Limit your attack surface. Don't install devices you know you're not going to need. Shut down and remove all devices that you no longer need or use. When you buy a top of the line washing machine, and you notice it can be connected through Wi-Fi, consider if you really want and need to connect it before you do. Disconnect the device from the network once you realize you actually don't use the online features at all.
• When selecting which manufacturer you buy your device from, check what they say about security and privacy and what their privacy principles are. Was the product rushed to the market and were any security corners cut? What is the motivation of the manufacturer to process your data? Do they sell it onwards to advertisers? Do they store any of your data and where do they store it?
• Go to your home router settings today. Make sure you disable services that are exposed to the Internet -- the WAN interface. Change the admin password to something strong and unique. Check that the DNS setting of the router points to your ISP's DNS server or some open service like OpenDNS or Google DNS and hasn't been tampered with.
• Make sure you keep your router's firmware up to date and consider replacing the router with a new one, especially if the manufacturer no longer provides security updates. Consider moving away from a manufacturer that doesn't do security updates or stops them after two years. The security of your home network starts from the router, and the router is exposed to the Internet.
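On the password advice above: a strong, unique passphrase doesn't have to be hard to produce. Below is a minimal sketch using Python's standard secrets module; the word-list path is an assumption (many Linux and macOS systems ship one at /usr/share/dict/words), and any large word list you trust will do.

    import secrets

    # Build a passphrase from randomly chosen dictionary words using a
    # cryptographically secure random source (the secrets module).
    # The word-list path is an assumption; point it at any large word list.
    def passphrase(n_words=5, wordlist_path="/usr/share/dict/words"):
        with open(wordlist_path) as f:
            words = [w.strip().lower() for w in f
                     if w.strip().isalpha() and len(w.strip()) > 3]
        return "-".join(secrets.choice(words) for _ in range(n_words))

    if __name__ == "__main__":
        # Generate one unique passphrase per cloud service and keep them
        # in a password manager rather than reusing any of them.
        print(passphrase())

Five random words from a large list give far more entropy than a typical human-chosen password, and using a different passphrase per service limits the damage if any one service is breached.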
The above list of actions is extensive and maybe a bit on the "band-aid on the webcam" paranoid side. However, it should give you an idea of what kinds of things you can do to stay in control of your security and privacy when taking a leap into the Internet of Things. Security in the IoT world is not that different from before: your passwords are just as important, as are the principles of deploying security patches and turning off services you don't need.
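As a companion to the first item in the list above, here is a rough sketch of how you might check which common service ports an IoT gadget answers on inside your own network. It only probes the local network, so it says nothing definitive about exposure to the public Internet; the device address and port list are examples.

    import socket

    # Common TCP service ports often found open on IoT gadgets (examples).
    COMMON_PORTS = {22: "ssh", 23: "telnet", 80: "http",
                    443: "https", 554: "rtsp", 8080: "http-alt"}

    def open_ports(host, timeout=0.5):
        """Return the ports from COMMON_PORTS that accept a TCP connection."""
        found = []
        for port in COMMON_PORTS:
            with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
                s.settimeout(timeout)
                if s.connect_ex((host, port)) == 0:  # 0 means the connect succeeded
                    found.append(port)
        return found

    if __name__ == "__main__":
        device_ip = "192.168.1.50"  # replace with your gadget's LAN address
        for port in open_ports(device_ip):
            print(f"{device_ip}:{port} ({COMMON_PORTS[port]}) is open")

If a device answers on telnet or an unexpected web port, review its settings and your router's port-forwarding and UPnP configuration.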
The Evolution of the eInk Reader
Sure, smartphones and tablets get all the press, and deservedly so. But if you place the original mainstream eInk device from 2007, the Amazon Kindle, side by side with today's model, the evolution of eInk devices is just as striking.
Each of these devices has a 6 inch eInk screen. Beyond that they're worlds apart.
Original Kindle (2007) | Kindle Voyage
8" × 5.3" × 0.8", 10.2 oz | 6.4" × 4.5" × 0.3", 6.3 oz
6" eInk display, 167 PPI, 4 level greyscale | 6" eInk display, 300 PPI, 16 level greyscale, backlight
256 MB | 4 GB
400 MHz CPU | 1 GHz CPU
$399 | $199
7 days battery life | 6 weeks battery life
USB | WiFi / Cellular
They may seem awfully primitive compared to smartphones, but that's part of
their charm – they are the scooter
to the motorcycle of the smartphone. Nowhere near as versatile, but as a form of
basic transportation, radically simpler, radically cheaper, and more durable.
There's an object lesson here in stripping things away to get to the core.
eInk devices are also pleasant in a paradoxical way because they basically suck at everything that isn't reading. That doesn't sound like something you'd want, except when you notice you spend every fifth page switching back to Twitter or Facebook or Tinder or Snapchat or whatever. eInk devices let you tune out the world and truly immerse yourself in reading.
I believe in the broadest sense, bits > atoms. Sure, we'll always read on whatever device we happen to hold in our hands that can display words and paragraphs. And the advent of retina class devices sure made reading a heck of a lot more pleasant on tablets and smartphones.
But this idea of ultra-cheap, pervasive eInk reading devices eventually replacing those ultra-cheap, pervasive paperbacks I used to devour as a kid has great appeal to me. I can't let it go. Reading is Fundamental, man!
That's why I'm in this weird place where I will buy, sight unseen, every new Kindle eInk device. I wasn't quite crazy enough to buy the original Kindle (I mean, look at that thing) but I've owned every model since the third generation Kindle was introduced in 2010.
I've also been tracking Kindle prices to see when they can get them down to $49 or lower. We're not quite there yet: the basic Kindle eInk reader, which by the way is still pretty darn amazing compared to that original 2007 model pictured above, is currently on sale for $59.
But this is mostly about their new flagship eInk device, the Kindle Voyage. Instead of being cheap, it's trying to be upscale. The absolute first thing you need to know is this is the first 300 PPI (aka "retina") eInk reader from Amazon. If you're familiar with the smartphone world before and after the iPhone 4, then you should already be lining up to own one of these.
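For a rough sense of what 300 PPI means, pixel density is just the diagonal pixel count divided by the diagonal size in inches. The quick Python check below assumes panel resolutions of 800 × 600 for the original Kindle and 1448 × 1072 for the Voyage; those numbers are my assumptions, not figures from this post.

    import math

    def ppi(width_px, height_px, diagonal_inches):
        """Pixels per inch from panel resolution and diagonal screen size."""
        return math.hypot(width_px, height_px) / diagonal_inches

    # Assumed panel resolutions for a 6-inch screen:
    print(round(ppi(800, 600, 6.0)))    # ~167 PPI, original-Kindle class
    print(round(ppi(1448, 1072, 6.0)))  # ~300 PPI, Voyage class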
When you experience 300 PPI in eInk, you really feel like you're looking at a high quality printed page rather than an array of RGB pixels. Yeah, it's still grayscale, but it is glorious. Here are some uncompressed screenshots I made from mine at native resolution.
Note that the real device is eInk, so there's a natural paper-like fuzziness that makes it seem even more high resolution than these raw bitmaps would indicate.
I finally have enough resolution to pick a thinner font than fat, sassy old Caecilia.
The backlight was new to the original Paperwhite, and it definitely had some teething pains. The third time's the charm; they've nailed the backlight aspect for improved overall contrast and night reading. The Voyage also adds an ambient light sensor so it automatically scales the backlight to anything from bright outdoors to a pitch-dark bedroom. It's like automatic night time headlights on a car – one less manual setting I have to deal with before I sit down and get to my reading. It's nice.
The Voyage also adds page turn buttons back into the mix, via pressure sensing zones on the left and right bezel. I'll admit I had some difficulty adjusting to these buttons, to the point that I wasn't sure I would, but I eventually did – and now I'm a convert. Not having to move your finger into the visible text on the page to advance, and being able to advance without moving your finger at all, just pushing it down slightly (which provides a little haptic buzz as a reward), does make for a more pleasant and efficient reading experience. But it is kind of subtle and it took me a fair number of page turns to get it down.
In my experience eInk devices are a bit more fragile than tablets and smartphones. So you'll want a case for automatic on/off and basic "throw it in my bag however" paperback book level protection. Unfortunately, the official Kindle Voyage case is a disaster. Don't buy it.
Previous Kindle cases were expensive, but they were actually very well designed. The Voyage case is expensive and just plain bad. Whoever came up with the idea of a weirdly foldable, floppy origami top opening case on a thing you expect to work like a typical side-opening book should be fired. I recommend something like this basic $14.99 case which works fine to trigger on/off and opens in the expected way.
It's not all sweetness and light, though. The typography issues that have plagued the Kindle are still present in full force. It doesn't personally bother me that much, but it is reasonable to expect more by now from a big company that ostensibly cares about reading. And has a giant budget with lots of smart people on its payroll.
"This is what text looks like on a kindle." — Justin Van Slembrou… (@jvanslem), February 6, 2014
Nexmo is launching a Chat App API
Companies are constantly on the lookout for new methods of interacting with their customer base but it can be hard to integrate these with existing systems.
Cloud communications firm Nexmo is launching a new API that allows a chat application to interact with a customer service platform.
The Nexmo Chat App API helps brands consolidate all chat messages into their existing communication platforms, eliminating the need to manually manage communications over individual chat applications. The Chat App API does this by automatically detecting and connecting brand messages with the appropriate chat application in real time.
It lets marketing, sales or customer support staff send one message and have it appear on all relevant chat applications at once. Nexmo also works directly with each chat application to ensure messages appear correctly on all platforms. In addition it informs brands which features are available on each chat application. Through Nexmo's carrier relations it knows the cultural restrictions in play on each network and can make them clear to brands.
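As a purely illustrative sketch of the "send one message, reach every relevant chat app" idea, the Python snippet below posts a single message to a unified chat endpoint. The URL, field names, and credentials are hypothetical placeholders, not Nexmo's documented Chat App API.

    # Hypothetical sketch -- the endpoint, fields, and auth are placeholders,
    # not Nexmo's documented Chat App API.
    import requests

    API_KEY = "your-api-key"        # placeholder credential
    API_SECRET = "your-api-secret"  # placeholder credential

    def send_brand_message(customer_id, text):
        """Send one message; the service fans it out to the right chat app."""
        response = requests.post(
            "https://api.example.com/v1/chat-app/messages",  # placeholder URL
            auth=(API_KEY, API_SECRET),
            json={"to": customer_id, "text": text},
            timeout=10,
        )
        response.raise_for_status()
        return response.json()  # e.g. which chat app (WeChat, Line) was used

    if __name__ == "__main__":
        print(send_brand_message("customer-123", "Your order has shipped!"))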
"We live in an always-on world, where customers expect to be engaged anytime, anywhere and on their preferred channels," says Nexmo CEO and co-founder, Tony Jamous. "This means the bar for customer engagement has risen dramatically. At the same time, solutions that are put in place need to be scalable, near real time and cost efficient, and that’s where we see the tremendous opportunity for the Chat App API. Adding the Chat App API to our portfolio of industry-leading messaging and voice APIs transforms Nexmo from a company helping brands navigate the current landscape of mobile communications to a resource that brands can come to as customer communications dynamically changes shape".
The Nexmo Chat App API currently supports messaging on WeChat and Line. The company will be adding support for additional chat apps, service platforms and new features in the coming months. You can find out more and sign up for beta access on the Nexmo website.
Image Credit: Rawpixel / Shutterstock
Now that many enterprises are seeing value in big data analysis, it may be time for their database administrators and data warehouse managers to get involved.
Oracle has released a new extension for its Oracle Data Integrator middleware that allows DBAs and data warehouse experts to treat big data repositories as just another data source, alongside their structured databases and data warehouses.
The Oracle Data Integrator for Big Data "makes a non-Hadoop developer instantly productive on Hadoop," said Jeff Pollock, Oracle vice president of product management.
Big data platforms such as Hadoop and Spark were initially geared more toward programmers than DBAs, using languages such as Java and Python, Pollock said. Yet traditional enterprise data analysis has largely been managed by DBAs and experts in ETL (extract, transform, and load) tools, using SQL and drag-and-drop, visually oriented interfaces.
The Data Integrator for Big Data extends Oracle's ODI product to handle big data sources.
ODI lets organizations pull together data from multiple sources and formats, such as relational data hosted in IBM or Microsoft databases and material residing in Teradata data warehouses. So it was a natural step to connect big data repositories to ODI as well.
With the extension, "you don't have to retrain a database administrator on Hive for Hadoop. We can now give them a toolkit that they will be naturally familiar with," Pollock said. The administrator can work with familiar concepts such as entities and relations, and 4GL data flow mapping. The software "automatically generates the code in the different underlying languages," needed to complete the job, Pollock said.
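To give a flavor of what "generating the code in the different underlying languages" can mean in practice, here is a rough PySpark sketch of the kind of multi-source join a declarative mapping might be turned into. It is not ODI output; the table names, JDBC URL, and columns are invented for illustration.

    # Hypothetical sketch of generated-style code; not actual ODI output.
    # Table names, the JDBC URL, and columns are invented for illustration.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("orders-enrichment")
        .enableHiveSupport()  # read Hive tables registered in the metastore
        .getOrCreate()
    )

    # Source 1: clickstream events already landed in Hadoop (a Hive table)
    clicks = spark.table("web_logs.clickstream")

    # Source 2: customer master data in a relational database, read over JDBC
    customers = (
        spark.read.format("jdbc")
        .option("url", "jdbc:sqlserver://example-host:1433;databaseName=crm")
        .option("dbtable", "dbo.customers")
        .load()
    )

    # The kind of join-and-aggregate a DBA would draw as a visual mapping
    result = (
        clicks.join(customers, on="customer_id", how="inner")
        .groupBy("customer_id", "region")
        .count()
    )
    result.write.mode("overwrite").saveAsTable("analytics.clicks_by_customer")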
The software can work with any Hadoop or Spark deployment, and doesn't require software installation on any of the data nodes. Using the power of distributed computing, Data Integrator for Big Data uses the nodes where the data is stored to carry out all the computations needed.
A retail organization could use the software to analyze its customers' purchasing histories. Real-time data capture systems such as Oracle GoldenGate 12c could move transactional data into a Hadoop cluster, where it then can be prepared for analysis by ODI.
Oracle is not alone in attempting to bridge the new big data tools with traditional data analysis software. Last week, Hewlett-Packard released a software package that allows customers to integrate HP's Vertica analysis database with HP Autonomy's IDOL (Intelligent Data Operating Layer) platform, providing a way for organizations to speedily analyze large amounts of unstructured data.
Dawn of the data center operating system
Virtualization has been a key driver behind every major trend in software, from search to social networks to SaaS, over the past decade. In fact, most of the applications we use -- and cloud computing as we know it today -- would not have been possible without the server utilization and cost savings that resulted from virtualization.
But now, new cloud architectures are reimagining the entire data center. Virtualization as we know it can no longer keep up.
As data centers transform, the core insight behind virtualization -- that of carving up a large, expensive server into several virtual machines -- is being turned on its head. Instead of divvying the resources of individual servers, large numbers of servers are aggregated into a single warehouse-scale (though still virtual!) “computer” to run highly distributed applications.
Every IT organization and developer will be affected by these changes, especially as scaling demands increase and applications get more complex every day. How can companies that have already invested in the current paradigm of virtualization understand the shift? What’s driving it? And what happens next?
Virtualization then and now
Perhaps the best way to approach the changes happening now is in terms of the shifts that came before it -- and the leading players behind each of them.
That story begins in the mainframe era, with IBM. Back in the 1960s and 1970s, the company needed a way to cleanly support older versions of its software on newer-generation hardware and to turn its powerful computers from a batch system that ran one program at a time to an interactive system that could support multiple users and applications. IBM engineers came up with the concept of a “virtual machine” as a way to carve up resources and essentially timeshare the system across applications and users while preserving compatibility.
This approach cemented IBM’s place as the market leader in mainframe computing.
Fast-forward to the early 2000s and a different problem was brewing. Enterprises were faced with data centers full of expensive servers that were running at very low utilization levels. Furthermore, thanks to Moore’s Law, processor clock speeds had doubled every 18 months and processors had moved to multiple cores -- yet the software stack was unable to effectively utilize the newer processors and all those cores.
Again, the solution was a form of virtualization. VMware, then a startup out of Stanford, enabled enterprises to dramatically increase the utilization of their servers by allowing them to pack multiple applications into a single server box. By embracing all software (old and new), VMware also bridged the gap between the lagging software stack and modern, multicore processors. Finally, VMware enabled both Windows and Linux virtual machines to run on the same physical hosts -- thereby removing the need to allocate separate physical servers to those clusters within the same data center.
Virtualization thus established a stranglehold in every enterprise data center.
But in the late 2000s, a quiet technology revolution got under way at companies like Google and Facebook. Faced with the unprecedented challenge of serving billions of users in real time, these Internet giants quickly realized they needed to build custom-tailored data centers with a hardware and software stack that aggregated (versus carved) thousands of servers and replaced larger, more expensive monolithic systems.
What these smaller and cheaper servers lacked in computing power they made up in number, and sophisticated software glued it all together to build a massively distributed computing infrastructure. The shape of the data center changed. It may have been made up of commodity parts, but the results were still orders of magnitude more powerful than traditional, state-of-the-art data centers. Linux became the operating system of choice for these hyperscale data centers, and as the field of devops emerged as a way to manage both development and operations, virtualization lost one of its core value propositions: the ability to simultaneously run different “guest” operating systems (that is, both Linux and Windows) on the same physical server.
Microservices as a key driver
But the most interesting changes driving the aggregation of virtualization are on the application side, through a new software design pattern known as microservices architecture. Instead of monolithic applications, we now have distributed applications composed of many smaller, independent processes that communicate with each other using language-agnostic protocols (HTTP/REST, AMQP). These services are small and highly decoupled, and they're focused on doing a single small task.
Microservices quickly became the design pattern of choice for a few reasons.
First, microservices enable rapid cycle times. The old software development model of releasing an application once every few months was too slow for Internet companies, which needed to deploy new releases several times during a week -- or even on a single day in response to engagement metrics or similar. Monolithic applications were clearly unsuitable for this kind of agility due to their high change costs.
Second, microservices allow selective scaling of application components. The scaling requirements for different components within an application are typically different, and microservices allowed Internet companies to scale only the functions that needed to be scaled. Scaling older monolithic applications, on the other hand, was tremendously inefficient. Often the only way was to clone the entire application.
Third, microservices support platform-agnostic development. Because microservices communicate across language-agnostic protocols, an application can be composed of microservices running on different platforms (Java, PHP, Ruby, Node, Go, Erlang, and so on) without any issue, thereby benefiting from the strengths of each individual platform. This was much more difficult (if not impractical) to implement in a monolithic application framework.
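As a minimal illustration of that language-agnostic, HTTP-based communication, here is a tiny single-purpose service written with nothing but Python's standard library; the endpoint and payload are invented for the example, and any client in any language could consume it.

    # Minimal sketch of a single-purpose microservice speaking plain HTTP/JSON.
    # The /health endpoint and payload are invented for illustration.
    import json
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class HealthHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path == "/health":
                body = json.dumps({"status": "ok", "service": "inventory"}).encode()
                self.send_response(200)
                self.send_header("Content-Type", "application/json")
                self.send_header("Content-Length", str(len(body)))
                self.end_headers()
                self.wfile.write(body)
            else:
                self.send_response(404)
                self.end_headers()

    if __name__ == "__main__":
        HTTPServer(("0.0.0.0", 8080), HealthHandler).serve_forever()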
Delivering microservices
The promise of the microservices architecture would have remained unfulfilled in the world of virtual machines. To meet the demands of scaling and costs, microservices require both a light footprint and lightning-fast boot times, so hundreds of microservices can be run on a single physical machine and launched at a moment’s notice. Virtual machines lack both qualities.
That’s where Linux-based containers come in.
Both virtual machines and containers are means of isolating applications from hardware. However, unlike virtual machines -- which virtualize the underlying hardware and contain an OS along with the application stack -- containers virtualize only the operating system and contain only the application. As a result, containers have a very small footprint and can be launched in mere seconds. A physical machine can accommodate four to eight times more containers than VMs.
Containers aren’t actually new. They have existed since the days of FreeBSD Jails, Solaris Zones, OpenVZ, LXC, and so on. They’re taking off now, however, because they represent the best delivery mechanism for microservices. Looking ahead, every application of scale will be a distributed system consisting of tens if not hundreds of microservices, each running in its own container. For each such application, the ops platform will need to keep track of all of its constituent microservices -- and launch or kill those as necessary to guarantee the application-level SLA.
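To make the footprint-and-speed point concrete, the sketch below launches a throwaway container with the Docker SDK for Python and times it. It assumes a local Docker daemon and the docker package (pip install docker); the image and command are just examples.

    # Sketch: a container starts in seconds, versus minutes for a full VM.
    # Assumes a running Docker daemon and `pip install docker`.
    import time
    import docker

    client = docker.from_env()

    start = time.time()
    output = client.containers.run(
        "alpine",                            # small example image
        ["echo", "hello from a container"],  # example command
        remove=True,                         # clean up after it exits
    )
    print(output.decode().strip())
    print(f"container ran and exited in {time.time() - start:.2f} seconds")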
Why we need a data center operating system
All data centers, whether public or private or hybrid, will soon adopt these hyperscale cloud architectures -- that is, dumb commodity hardware glued together by smart software, containers, and microservices. This trend will bring to enterprise computing a whole new set of cloud economics and cloud scale, and it will introduce entirely new kinds of businesses that simply were not possible earlier.
What does this mean for virtualization?
Virtual machines aren’t dead. But they can’t keep up with the requirements of microservices and next-generation applications, which is why we need a new software layer that will do exactly the opposite of what server virtualization was designed to do: Aggregate (not carve up!) all the servers in a data center and present that aggregation as one giant supercomputer. Though this new level of abstraction makes an entire data center seem like a single computer, in reality the system is composed of millions of microservices within their own Linux-based containers -- while delivering the benefits of multitenancy, isolation, and resource control across all those containers.
Think of this software layer as the “operating system” for the data center of the future, though the implications of it go beyond the hidden workings of the data center. The data center operating system will allow developers to more easily and safely build distributed applications without constraining themselves to the plumbing or limitations (or potential loss) of the machines, and without having to abandon their tools of choice. They will become more like users than operators.
This emerging smart software layer will soon free IT organizations -- traditionally perceived as bottlenecks on innovation -- from the immense burden of manually configuring and maintaining individual apps and machines, and allow them to focus on being agile and efficient. They too will become more strategic users than maintainers and operators.
The aggregation of virtualization is really an evolution of the core insight behind virtual machines in the first place. But it’s an important step toward a world where distributed computing is the norm, not the exception.
Sudip Chakrabarti is a partner at a16z where he focuses on infrastructure software, security, and big data investments. Peter Levine is a general partner at Andreessen Horowitz. He has been a lecturer at both MIT and Stanford business schools and was the former CEO of XenSource, which was acquired by Citrix in 2007. Prior to XenSource, Peter was EVP of Strategic and Platform Operations at Veritas Software. Follow him on his blog, http://peter.a16z.com/, and on Twitter @Peter_Levine.
New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to newtechforum@infoworld.com.
Microsoft closes acquisition of R software and services provider
Microsoft today closed its acquisition of Revolution Analytics, a commercial provider of software and services for the R programming language, making it a wholly owned subsidiary.
"R is the world's most popular programming language for statistical computing and predictive analytics, used by more than 2 million people worldwide," says Joseph Sirosh, corporate vice president of Information Management & Machine Learning at Microsoft.
[ Go deep with R: Sharon Machlis reveals R data manipulation tricks at your fingertips. And read her beginner's guide to R | Matt Asay explores whether Microsoft can make R easy. ]
"Revolution has made R enterprise-ready with speed and scalability for the largest data warehouses and Hadoop systems," he adds. "For example, by leveraging Intel's Math Kernel Library (MKL), the freely available Revolution R Open executes a typical R benchmark 2.5 times faster than the standard R distribution and some functions, such as linear regression, run up to 20 times faster. With its parallel external memory algorithms, Revolution R Enterprise is able to deliver speeds 42 times faster than competing technology from SAS."
[Related: Learn R for Beginners With Our PDF ]
Microsoft announced its plans to acquire Revolution Analytics in January, citing its desire to help use the power of R and data science to unlock insights with advanced analytics.
With the acquisition now closed, Sirosh says Microsoft plans to build R and Revolution's technology into its data platform products, making it available on-premises, on Azure public cloud environments and in hybrid environments.
[ Related: 60+ R Resources to Improve Your Data Skills ]
"For example, we will build R into SQL Server to provide fast and scalable in-database analytics that can be deployed in an enterprise customer's datacenter, on Azure or in a hybrid combination," Sirosh says.
"In addition, we will integrate Revolution's scalable R distribution into Azure HDInsight and Azure Machine Learning, making it much easier and faster to analyze big data, and to operationalize R code for production purposes," Sirosh says. "We will also continue to support running Revolution R Enterprise across heterogeneous platforms including Linux, Teradata and Hadoop deployments. No matter where data lives, customers and partners will be able to take advantage of R more quickly, simply and cost effectively than ever before."
Microsoft also plans to continue Revolution's education and training efforts around R, and Sirosh notes it will leverage its global programs and partner ecosystem to do so.
Revolution Analytics CEO Dave Rich is now general manager of Advanced Analytics at Microsoft.
[Related: Learn to Crunch Big Data with R ]
"The CIO and CDO will need an easy-to-use, integrated platform and a vendor partner who simultaneously understands end-user productivity, cloud computing and data platforms," Rich says, describing the "Decision Process Engineering" that he sees dominating the next decade. "Who better to deliver this to companies large and small than Microsoft? All Microsoft needed was a bridge to crowd-sourced innovation on the advanced analytics algorithms and tools power results from big data. Who better than Revolution Analytics? Stay tuned. Now it gets interesting."
Follow Thor on Google+
"R is the world's most popular programming language for statistical computing and predictive analytics, used by more than 2 million people worldwide," says Joseph Sirosh, corporate vice president of Information Management & Machine Learning at Microsoft.
"Revolution has made R enterprise-ready with speed and scalability for the largest data warehouses and Hadoop systems," he adds. "For example, by leveraging Intel's Math Kernel Library (MKL), the freely available Revolution R Open executes a typical R benchmark 2.5 times faster than the standard R distribution and some functions, such as linear regression, run up to 20 times faster. With its parallel external memory algorithms, Revolution R Enterprise is able to deliver speeds 42 times faster than competing technology from SAS."
Microsoft announced its plans to acquire Revolution Analytics in January, citing its desire to help use the power of R and data science to unlock insights with advanced analytics.
With the acquisition now closed, Sirosh says Microsoft plans to build R and Revolution's technology into its data platform products, making it available on-premises, on Azure public cloud environments and in hybrid environments.
"For example, we will build R into SQL Server to provide fast and scalable in-database analytics that can be deployed in an enterprise customer's datacenter, on Azure or in a hybrid combination," Sirosh says.
"In addition, we will integrate Revolution's scalable R distribution into Azure HDInsight and Azure Machine Learning, making it much easier and faster to analyze big data, and to operationalize R code for production purposes," Sirosh says. "We will also continue to support running Revolution R Enterprise across heterogeneous platforms including Linux, Teradata and Hadoop deployments. No matter where data lives, customers and partners will be able to take advantage of R more quickly, simply and cost effectively than ever before."
Open source loves its R
Sirosh adds that Microsoft considers the active and passionate open source community around R an essential element to the programming language's success, and it plans to "support and amplify" Revolution's open source projects, including the Revolution R Open distribution, the ParallelR collection of packages for distributed programming, Rhadoop for running R on Hadoop nodes, DeployR for deploying R analytics in web and dashboard applications, the Reproducible R Toolkit and RevoPemaR for writing parallel external memory algorithms.
Microsoft also plans to continue Revolution's education and training efforts around R, and Sirosh notes it will leverage its global programs and partner ecosystem to do so.
Revolution Analytics CEO Dave Rich is now general manager of Advanced Analytics at Microsoft.
"The CIO and CDO will need an easy-to-use, integrated platform and a vendor partner who simultaneously understands end-user productivity, cloud computing and data platforms," Rich says, describing the "Decision Process Engineering" that he sees dominating the next decade. "Who better to deliver this to companies large and small than Microsoft? All Microsoft needed was a bridge to crowd-sourced innovation on the advanced analytics algorithms and tools power results from big data. Who better than Revolution Analytics? Stay tuned. Now it gets interesting."
This story, "Microsoft closes acquisition of R software and services provider" was originally published by CIO.