Last month at Amazon’s AWS re:Invent user conference (see full coverage here), 9,000 attendees gathered to dig deep into the top cloud computing offering in the marketplace. Amazon Web Services (AWS) is very secretive about its business; everything from revenue to the underlying infrastructure of AWS is undisclosed since, as AWS SVP Andy Jassy told analysts attending the conference, “customers don’t care about this”. So while no one is allowed to tour the AWS data centers, we were given some rare insights into AWS’s design methodologies and philosophy by VP & Distinguished Engineer James Hamilton (see his interview on theCUBE from re:Invent).
Every day, AWS adds enough new server capacity to support all of Amazon’s global infrastructure when it was a $7B annual revenue enterprise.
Where specialization meets scale
Scale matters, but AWS is not an undifferentiated collection of commodity gear. Hamilton said that 10 years ago he believed that the architecture should be a giant pool of commodity gear with software providing most of the value. He now believes that thinking was wrong and that it is through hyper-specialization that Amazon can continue to deliver innovation. The scale of AWS S3 is trillions of objects served at over 1.5M requests per second. Not only is the scale massive, it is predictable: Amazon DynamoDB consistently delivers 3ms average latency across all APIs. When asked about Facebook’s methodology discussed at the Open Compute Summit earlier this year, which is to standardize on five compute configurations, Hamilton said, “I have many more configurations than that now and will have even more next year.” Adding more configurations does carry more overhead, but at scale it beats having just a handful of configurations.
While this message runs counter to the discussion that large public clouds save money through homogeneous deployments that reduce operational costs, Hamilton points out that AWS is not a typical data center:
- General market offerings must work in a wide range of data center environments; AWS solutions are optimized for specific, well known data center parameters.
- While many solutions are built for specific application requirements, AWS builds each application to a scale that is unmatched and therefore doesn’t lose economies of scale.
- Amazon designs and integrates the entire solution: hardware, software, and data center.
Compute is our density
Amazon is a large consumer of Open Source Software (OSS) but is not a public contributor. James Hamilton is himself a strong proponent of OSS initiatives, and in his presentation at re:Invent he discussed the advantages of using commodity hardware. For the compute layer, Hamilton said that while a rack of Quanta servers weighs ¾ ton (up to 600 disk drives in a 42U rack, which matches the densest commercially available architectures for Hadoop), Amazon’s configurations are even denser at over 1 ton per rack! AWS has also added a number of flash-optimized instances. Recent industry figures show that ODM servers like those used at Amazon make up a sizable portion of the marketplace ($783M in 3Q13, representing 45% y/y growth). Amazon is not content to simply take components off the shelf; Hamilton stated that it has two engineers working solely on server power supplies, where redesigns that are pennies cheaper or a fraction more efficient translate into huge savings.
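A back-of-the-envelope calculation shows why pennies matter at fleet scale. The per-unit saving and deployment volume below are hypothetical assumptions for illustration only; AWS does not disclose its server counts.

```python
# Hypothetical figures -- AWS does not publish fleet size or component costs.
saving_per_psu_usd = 0.05              # assume a redesign shaves five cents per power supply
servers_deployed_per_year = 1_000_000  # assumed annual server deployment

annual_saving = saving_per_psu_usd * servers_deployed_per_year
print(f"${annual_saving:,.0f} saved per year")  # → $50,000 saved per year
```

The saving scales linearly with volume, which is why dedicating engineers full-time to something as narrow as power-supply redesign can pay for itself at hyperscale.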
Hyperscale storage paradigm
While S3 may be the largest storage array in the world, it is made up entirely of compute-resident disk and flash. Server-based storage architectures can also be designed for the enterprise. Large cloud providers have used “Distributed DAS” architectures for many years, and vendors have been adding more features to these solutions. Service providers and enterprise accounts need scalable solutions (although not at the same order of magnitude as Googlezon) that are more feature-rich and don’t require a team of PhDs. ScaleIO (acquired by EMC) fits into this new category of solution; here’s a blog about a 1,000-node configuration. Compute-based storage solutions fit best in test environments and larger-scale configurations. As discussed in VMware VSAN vs the Simplicity of Hyperconvergence, the overhead of building, testing, optimizing and supporting this sort of architecture makes the total cost more expensive for smaller configurations.
Networking becomes just another programmable component
From a networking perspective, Hamilton shared that AWS uses custom routers and protocol stacks. He is publicly supportive of white-box networking solutions and even wrote a blog post about Cumulus Networks bringing Linux to the networking world. By using merchant silicon, networking can follow a path similar to Moore’s Law, leading to lower costs; this matters especially at scale, since networking is one of the few resources whose cost does not typically fall at larger volumes. Since Amazon builds its own devices and stack, it can ship fixes in a day that would otherwise take months if it had to wait for a vendor to spin code. Amazon’s network and every service are heavily monitored so that every metric can be tracked.
When building any infrastructure, you pay for the peak but only monetize the average. In a typical data center, even with a heavily virtualized environment, reaching 30% utilization is considered great. The cloud methodology is to combine non-correlated workloads over infrastructure at scale so that the law of large numbers shrinks the gap between peak and average demand. Amazon’s low-margin “cycle of innovation” is to iterate on this path:
- Listen to customers,
- Drive down costs & improve processes,
- Pass on value to customers & re-invest in features.
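The law-of-large-numbers argument can be illustrated with a quick simulation. This is a hedged sketch: the uniform demand model, workload counts, and sample sizes are assumptions for illustration, not AWS data. Pooling more non-correlated workloads drives the peak-to-average ratio toward 1, meaning far less capacity sits idle.

```python
import random

def peak_to_average(num_workloads, samples=10_000, seed=42):
    """Simulate pooling independent (non-correlated) workloads and
    return the peak-to-average ratio of their combined demand."""
    rng = random.Random(seed)
    totals = []
    for _ in range(samples):
        # Each workload's instantaneous demand: uniform between 0 and 100 units.
        totals.append(sum(rng.uniform(0, 100) for _ in range(num_workloads)))
    return max(totals) / (sum(totals) / len(totals))

# As independent workloads are pooled, the ratio of peak to average shrinks,
# so a provider needs proportionally less headroom over average demand.
for n in (1, 10, 100):
    print(f"{n:>3} workloads: peak/average = {peak_to_average(n):.2f}")
```

A single workload needs roughly double its average demand provisioned for its peak, while a large pool of uncorrelated workloads needs only a thin margin above average, which is the economic core of the cloud model described above.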
Action Item: Scale matters, and not all clouds are created equal. Amazon Web Services is continuing to innovate in secrecy in an attempt to keep ahead of contenders to its cloud leadership. CIOs need to pay attention to the hyperscale players, which herald the direction of technology. The forecast of public versus private cloud usage over the next few years is hotly debated, but there is no doubt that infrastructure designs and operational models are seeing seismic shifts, and Amazon is a key disruptor.
Footnotes: Stu Miniman is a Principal Research Contributor for Wikibon. He focuses on networking, virtualization, converged infrastructure and cloud technologies. Stu can be reached via email (firstname.lastname@example.org) or Twitter (@stu).