There's been a lot of discussion over the past couple of days about an analysis by Guy Rosen, in which he estimates that Amazon Web Services (AWS) is provisioning 50K EC2 server instances per day.
He created this estimate by examining EC2 resource IDs (if you read his post, you'll see how he broke down resource IDs to understand their meaning) and doing a time-series analysis on how much the IDs are incremented per hour.
From this analysis, Rosen concluded that AWS is provisioning around 50,000 EC2 instances per day.
A 50K/day run rate would imply a yearly total of over 18 million provisioned instances. Rosen admits that his understanding of the resource ID might be incorrect, thereby creating flaws in his analysis; however, even if he's off by an order of magnitude, that would imply a yearly run rate of 1.8 million provisioned instances.
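The run-rate arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name and the 365-day year are my own assumptions, and the 50K/day figure is Rosen's estimate, not an Amazon statistic.

```python
DAYS_PER_YEAR = 365  # assumption: a plain calendar year, no leap day


def yearly_run_rate(instances_per_day: int) -> int:
    """Scale a daily provisioning estimate to a yearly total."""
    return instances_per_day * DAYS_PER_YEAR


# Rosen's estimate of 50K instances per day
estimate = yearly_run_rate(50_000)

# The same estimate if it is off by an order of magnitude
pessimistic = yearly_run_rate(50_000 // 10)

print(f"{estimate:,} instances per year")      # a bit over 18 million
print(f"{pessimistic:,} instances per year")   # roughly 1.8 million
```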
I'm not aware of Amazon announcing its total EC2 statistics, but it has announced S3 stats (S3 is Amazon's storage service).
In February of this year, Amazon announced S3 contained 40 billion objects.
By August, the number was 64 billion objects. That is growth of 24 billion objects in six months, or 4 billion S3 objects per month, which works out to about 133 million new S3 objects per day.
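For readers who want to check the S3 growth figures, here is a minimal sketch. It assumes growth was linear between the two announcements, that February to August counts as six months, and that a month has 30 days; the function and parameter names are hypothetical.

```python
def growth_rates(start_count: float, end_count: float,
                 months: int, days_per_month: int = 30):
    """Return (objects per month, objects per day), assuming linear growth."""
    per_month = (end_count - start_count) / months
    per_day = per_month / days_per_month
    return per_month, per_day


# 40 billion objects in February, 64 billion in August
per_month, per_day = growth_rates(40e9, 64e9, months=6)

print(f"{per_month / 1e9:.0f} billion objects per month")
print(f"{per_day / 1e6:.0f} million objects per day")
```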
Given the growth in S3, 50K EC2 instances being provisioned each day doesn't seem far-fetched at all, making the yearly estimate of 18 million provisioned server instances plausible.
By way of comparison, total server shipments for Q2 2009 were around 1.4 million, according to IDC.
Of course, comparing server shipments to EC2 provisioned instances is not a direct comparison. For one thing, each of the servers shipped in Q2 was very likely going to be virtualized, implying a much larger number of virtual machines being installed, which would be a more appropriate comparison to EC2 instances.
If each server hosts five virtual machines, that would imply a quarterly VM instance count of 7 million and a yearly total of 28 million. The real number will probably be higher, perhaps significantly so: the 1.4 million physical servers come at a time of historically low sales, so yearly shipments could be significantly higher than 5.6 million, which would raise the total number of virtual machines being hosted as well.
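The back-of-the-envelope VM estimate can be reproduced as follows. This is a sketch under the same assumptions as the text: five virtual machines per server is a guess rather than a measured ratio, and every quarter is assumed to match IDC's Q2 shipment figure. The function name is hypothetical.

```python
def vm_estimate(servers_per_quarter: int, vms_per_server: int,
                quarters: int = 4):
    """Return (quarterly VMs, yearly servers, yearly VMs)."""
    quarterly_vms = servers_per_quarter * vms_per_server
    yearly_servers = servers_per_quarter * quarters
    yearly_vms = quarterly_vms * quarters
    return quarterly_vms, yearly_servers, yearly_vms


quarterly_vms, yearly_servers, yearly_vms = vm_estimate(1_400_000, 5)

print(f"{quarterly_vms:,} VMs per quarter")
print(f"{yearly_servers:,} physical servers per year")
print(f"{yearly_vms:,} VMs per year")
```

Changing `vms_per_server` shows how sensitive the comparison with EC2 is to that one guess.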
Moreover, while one can confidently state that each physical server represents a true increment to the pool, one cannot make the same claim about EC2 instances. A single Amazon Machine Image (the virtual machine) may be launched multiple times as an EC2 instance, thereby indicating that the true number of individual Amazon servers may be lower, perhaps much lower, than 50K per day.
Of course, one could make the same observation about the virtual machines hosted on the physical server count, so the quarterly VM instance count of 7 million might be somewhat lower as well.
Without overstating the accuracy of this analysis, what can we conclude from Rosen's analysis?
People are putting a lot of servers up on Amazon: Whether the real number is 1.8 million or 18 million EC2 instances, it's clear that a lot of computing is being done up on Amazon. And even if many of those instances are "double-dippers" (i.e., represent a single AMI that gets launched multiple times), there are still a lot of EC2 instances running on the AWS framework.
People are putting a lot of servers up on Amazon because it's cheap: There's lots of debate about whether cloud computing through an external provider can be less expensive than via an internal data center.
I've addressed this question in previous posts.
Notwithstanding the larger question of TCO, there's no denying that it's dirt cheap to get started via the cloud. I heard one anecdote about NASDAQ's AWS application: when they got started, one executive was astonished that their bills were running US$5 per month.
It's common in the early stages of a project that little actual computing is done: designs are worked on, a small prototype is put up and run, problems are identified, and the prototype is taken down while the code is worked on.
In a traditional environment where the server has to be paid for upfront even if little work is done on it for weeks or months, it's typical that a lot of money is spent for little actual use.
With Amazon, people can get started on applications for, literally, pennies (dimes, anyhow). Amazon's growth story indicates how attractive that value proposition is.
People are putting a lot of servers up on Amazon because it's easy: Something we discuss with companies all the time is the reduced friction in using cloud computing.
Instead of the lengthy and tiresome resource request process common in IT organizations, cloud computing resources can be available with practically no overhead. Request resources via a web page, indicating parameters like amount of storage, etc., and press a button: minutes later resources are available.
If you've ever bought a book on Amazon, you're qualified to begin cloud computing (the process for the just-announced vCloud Express product from VMware and its service provider partners is nearly as painless).
The benefit of reduced friction is widely under-appreciated, but vastly important. The easier it is to do something, the more likely one is to do it. There is a Best Buy no more than five miles from my house. But I often choose to purchase electronic goods from Amazon, because its two-day shipping makes it so easy to get stuff.
The reduced friction of electronic ordering and delivery to my door trumps close access and immediate purchase. There's no doubt that the ease of deploying compute resources on Amazon leads to people doing lots more provisioning.
Rosen's analysis is fascinating, and certainly timely. Many people pooh-pooh the phenomenon of cloud computing, dismissing it as only used by a few companies, or only startups, or only for trivial applications.
It's hard to look at these numbers and not conclude that something big is going on, and not just in "toy" applications.
Bernard Golden is CEO of consulting firm HyperStratus, which specializes in virtualization, cloud computing and related issues. He is also the author of "Virtualization for Dummies," the best-selling book on virtualization to date. Follow Bernard Golden on Twitter @bernardgolden.