Navigating the Big Data Vendor Landscape

The Big Data vendor landscape is developing rapidly. A number of vendors have developed their own Hadoop distributions, most based on the Apache open source distribution but with various levels of proprietary customization. The clear market leader in terms of distribution is Cloudera, a Silicon Valley start-up with an all-star line-up of Big Data experts including Hadoop creator Doug Cutting and former Facebook Data Scientist Jeff Hammerbacher. A new entrant to the market is Hortonworks, which was spun out of Yahoo in June 2011 and released a completely open source Hadoop distribution of its own in November 2011.

(Read the entire Big Data Manifesto here, which includes market analysis, technical primers on Hadoop and MPP Data Warehousing, and action items for enterprises and vendors.)

Likewise, a number of start-ups are working to deliver commercially supported versions of Hadoop’s myriad sub-projects. DataStax, for example, offers a commercial version of Cassandra that includes enterprise support and services. Proprietary data integration vendors, including Informatica and Syncsort, are making inroads into the Big Data market with Hadoop connectors and complementary tools aimed at making it easier for developers to move data around within Hadoop clusters. The analytics and data visualization layers of the Hadoop stack are also experiencing significant development. A start-up called Platfora, for example, is developing what it says is an “all-in-one” business intelligence platform for Hadoop. EMC Greenplum, meanwhile, has Chorus, a sort of playground for Data Scientists where they can mash up and experiment with large volumes of data.

Meanwhile, the Next Generation Data Warehouse Market has experienced significant consolidation since 2010. Four leading vendors in this space — Netezza, Greenplum, Vertica and Aster Data — were acquired by IBM, EMC, HP and Teradata, respectively. Just a handful of niche independent players remain, among them Kognitio and ParAccel.

The cloud is increasingly playing a role in the Big Data market as well. Amazon supports Hadoop deployments in its Amazon Elastic MapReduce cloud, enabling users to easily scale up and scale down clusters as needed. A start-up called Tresata offers Big-Data-as-a-Service for the financial services vertical market. Wikibon believes Big Data and the cloud are a natural fit and encourages Hadoop developers and Data Scientists to explore cloud-based Big Data deployments where they make economic and administrative sense.

The services side of the Big Data market is embryonic. Established services providers like Accenture and IBM are just starting to build out Big Data practices. Just a few smaller providers focus strictly on Big Data, among them Think Big Analytics, with other niche business analytics services providers like Digital Reasoning expanding their offerings to include Big Data.

(Don’t miss live coverage via #theCUBE and SiliconANGLE from Hadoop World 2011, November 8 and 9.)
