ORIGINALLY PUBLISHED JUNE 2011
Analyst's Note: Since this note was originally published in June 2011, there have been significant developments in the Hadoop market. In particular, the table in this research note is now outdated. Since then, Hortonworks has shifted Rob Bearden to CEO and Eric Baldeschwieler to CTO, and added Herb Cunitz as President and Greg Pavlik as Vice President of Engineering. Hortonworks also has added over 50 paying customers as of March 2013. Cloudera, meanwhile, has since added to its funding, with total funding of $141 million as of March 2013. Wikibon provides a detailed assessment of the market as of June 2012 in Hadoop: From Innovative Up-Start to Enterprise-Grade Big Data Platform and will likewise soon publish another update on the Hadoop market for Spring/early Summer 2013.
Originating Author: Jeff Kelly, With David Vellante and John Furrier
From the time of its inception in 2009 until last summer, Cloudera was far and away the leading Hadoop distribution vendor on the market, both in terms of market share and mindshare. Its only real competitor was MapR, whose largely proprietary Hadoop distribution never caught on with the open source Hadoop community. Cloudera was viewed by most as the favorite to become the Red Hat of the Hadoop world.
Red Hat has a $10B market cap, however, and competitors don’t want to let Cloudera run away with the Hadoop prize. In May 2011, EMC joined forces with MapR and announced its intention to distribute its own version of Hadoop and not fall in line with Cloudera’s offering. It was a clear shot across the bow at Cloudera. Nonetheless, while EMC is a formidable competitor, the Wikibon community’s general reaction to this news was that Cloudera was still on solid ground. Our view was that Cloudera had three things going for it:
- A deep bench of Hadoop experts;
- Major contributions to the Hadoop open source community; and
- A solid head start.
Essentially we saw Cloudera as the lone open source champion of Hadoop and the EMC/Greenplum/MapR initiative as a more closed and proprietary long shot.
That all changed in June 2011 when Yahoo spun off its Hadoop engineering team into a new company called Hortonworks, backed by Benchmark Capital. While the rest of Yahoo was struggling to find its way, over the past several years Yahoo’s Hadoop team quietly contributed the majority of code to the Apache Hadoop project and built one of the largest, most innovative Hadoop deployments in the world. Yahoo executives, looking for a winning formula, made the decision to form Hortonworks in order to grab its piece of the Hadoop pie and prevent Cloudera from running away with the market uncontested.
Overnight Cloudera faced a stiff and determined competitor with open source ‘street cred’ and funds from a high-profile, aggressive VC. Hortonworks quickly set about de-positioning Cloudera as an undeserving darling of the Big Data open source community. It did so by aggressively courting the analyst community and IT press corps with a clear and consistent message: Hortonworks is committed to growing the Hadoop ecosystem by keeping the Big Data framework 100% open source and, as CEO Eric Baldeschwieler told Wikibon, “We are always going to ship Hadoop for free.”
It followed up its messaging with a fully open source, freely available Hadoop distribution of its own, Hortonworks Data Platform (HDP), released in November 2011, along with paid training and technical support services.
The implicit contrast with Cloudera’s freemium model was clear. While largely based on the open source Apache Hadoop distribution, Cloudera Enterprise includes a proprietary management console called Cloudera Management Suite. Customers must pay Cloudera to license the software or be content with Cloudera’s free Hadoop distribution, CDH, which lacks the management console and related support services.
Hortonworks’ other implicit message to the market was that it, not Cloudera, is in the better position to transition Proof-of-Concept Hadoop deployments to full-on enterprise-level deployments thanks to its experience at Yahoo.
Squinting Through the Hortonworks Hype Machine
Benchmark-backed companies have a reputation for aggressive PR and marketing. Hortonworks is using the Benchmark playbook to go after the obvious angle, namely that open source communities tend to embrace the most open source approach to a given technology. That’s why Linux won out over Unix. As such, Hortonworks’ “100% open” message clearly resonates with the open source Hadoop community and is the source of much of the momentum behind Hortonworks. The result of all these moves is that, in just four months of life, from a market optics perspective, Hortonworks has pulled even with Cloudera in the court of public opinion.
The reality is that Hortonworks lacks one critical ingredient in this game: paying customers. The fact is there’s more sizzle than steak right now with Hortonworks, as the company only delivered an actual product to market in November 2011, and that offering is still in preview mode.
Hortonworks cannot claim all the credit for this incredible ascension in such a short period of time, however. Many believe that Cloudera has hurt its own cause by not countering Hortonworks’ frontal assault with an aggressive marketing campaign of its own. Specifically, many Hadoop watchers, including us, have indicated that other than a few blog posts written by Cloudera execs, the company has not aggressively countered Hortonworks’ marketing campaign with a concise value proposition. To be clear, we believe Cloudera has a significant marketing opportunity and the potential to take back the mantle—but it must act fast (see below).
Cloudera v. Hortonworks: Tale of the Tape
Cloudera has plenty to boast about. It has in fact contributed significantly to the open source Apache Hadoop project and its Hadoop distribution is in production at high-profile Web companies like Groupon and Klout. It launched an innovative partner and certification program in September and Cloudera engineers continue to develop new features to help Hadoop meet enterprise-level uptime and security requirements.
In addition, Cloudera has a two-year head start over Hortonworks servicing a small but growing customer base. No question the Hortonworks team learned many valuable lessons working at Yahoo, but supporting an internal Hadoop deployment at one large technology company is a lot different than supporting a large and varied customer base of both technology and non-technology companies. In order for Hortonworks to become a self-sufficient Hadoop support juggernaut, Baldeschwieler’s stated goal, the company needs to prove it can deliver.
Finally, consider the competing Hadoop distributions themselves. Their cores are both based on the open source Apache Hadoop distribution and related sub-projects, with the real differentiation being the installation and administration management add-on tools. Cloudera Management Suite, while proprietary, includes important enterprise-level features such as automated, wizard-based Hadoop deployment capabilities, dashboards for configuration management and a resource management module for capacity and expansion planning. Ambari, Hortonworks' answer to Cloudera Management Suite, is open but is less mature and currently lacks advanced cluster management capabilities.
The reality is that Cloudera’s Hadoop distribution is largely open source and the risk of vendor lock-in due to its relatively few proprietary components is, in Wikibon’s opinion, lower than what Hortonworks marketing implies. Organizations that come to rely on Cloudera Enterprise for crucial parts of the business but later decide to move to a different Hadoop distribution or competing Big Data approach should be able to do so with little difficulty.
That said, Hortonworks’ 100% open approach means that updates and improvements to its distribution are likely to come quicker than those to Cloudera’s distribution and that partners may find it easier to integrate with HDP than with Cloudera Enterprise. These are not insignificant factors that potential customers must consider.
Hortonworks Takes Cloudera to PR/Marketing School
In Wikibon’s opinion, Cloudera needs to immediately answer a number of fundamental questions and begin marketing its message aggressively to counter Hortonworks’ implicit criticisms. We believe Hadoop World 2011 is Cloudera’s best near-term opportunity for this effort. Specifically, there are four areas in which we believe Cloudera must sharpen its marketing message:
- 1) Clearly articulate the benefits of Cloudera’s Hadoop management console over HDP’s Ambari.
- 2) Communicate what role services will play at Cloudera in both the short- and long-term.
- 3) Explain how Cloudera’s two-year head start in the commercial Hadoop market better positions it to support enterprise customers than Hortonworks.
- 4) Re-affirm its commitment to openness and counteract the Hortonworks marketing machine.
In addition, the company must make it clear why its strategy of becoming a software company rather than a services company is the most viable approach for customers. If Cloudera can answer these and similar questions, and effectively communicate the answers to the market, it has a good chance of regaining the momentum Hortonworks has snatched from it. The stakes are high for both vendors. Wikibon believes the Big Data market will reach the multi-billion dollar level in the next five years and many tens of billions of dollars in market value are at stake. While there is likely room for both Cloudera and Hortonworks to build credible businesses, there’s a big difference between being the top vendor in a large and growing market and being a distant second or third place.
From a buyer’s perspective, users must consider the viability of the vendors with Hadoop offerings. Both Cloudera and Hortonworks are venture-funded start-ups with products and services running on top of the Apache distribution. While both claim to be in it for the long haul, there remains the risk that either could sell out to a larger vendor (think MySQL, Java and Oracle) or, conversely, the companies may never turn a profit and cease operating. There are also alternative Big Data approaches to consider, such as HPCC Systems and MPP data warehousing from vendors like Teradata Aster Data, HP Vertica and EMC Greenplum.
Enterprises must weigh all these factors and make the best decision for their particular situation. Even once a decision has been made, practitioners must keep a close eye on developments in the Big Data landscape as it is developing rapidly. Early adopters are very much the pioneers of the Big Data world. Flexibility and a willingness to experiment are key to success.
Action Item: Enterprises evaluating Big Data approaches must determine which vendor (Cloudera, Hortonworks or MapR) brings the greatest business value with the lowest cost and least risk. For some, the value of fast business impact on revenue or profit will outweigh the risks of vendor lock-in and potentially higher capex. For others taking the long view, the potential improvements that come with an open approach and the flexibility to change direction may be more important than any short-term value gained. The bottom line is we are early in the race and while Cloudera is the favorite to be in the winner’s circle, Hortonworks’ aggressive marketing threatens to tip the balance if not counteracted by Cloudera.
Footnotes: On November 7th, Cloudera announced it secured $40M in Series D financing from Ignition Partners. In addition, Accel Partners announced a $100M Big Data fund to finance emerging companies in the big data ecosystem.