Posts Tagged Cloudera
Our friend Matt Asay, who oversees business development for streaming Big Data analytics player Nodeable (read here about Nodeable’s recent shift in business model), penned a column today sizing up the Hadoop distribution competition. Asay narrows the competitors to two – Hortonworks and Cloudera – and proceeds under the premise that only one of the two can and will survive.
Oracle added a twist to this morning’s announcement regarding the general availability of its Big Data Appliance and related Big Data connectors. Rather than shipping the appliance with its own Hadoop distribution or the vanilla Apache distribution, Oracle has partnered with Cloudera to include its Hadoop distribution and management software instead.
Originally announced at Oracle OpenWorld in October, the Oracle Big Data Appliance is a preconfigured hardware-software bundle running Oracle Linux. It is available in a full rack configuration of 18 Oracle Sun servers and includes the community edition of Oracle’s NoSQL database, an open source distribution of R, and the Oracle HotSpot Java Virtual Machine for running MapReduce jobs, in addition to CDH and Cloudera Manager.
Since early summer, the Big Data community has been waiting for Ben Werther’s start-up Platfora to come out of stealth mode and reveal its grand vision. Well, that day has come, and Werther’s vision for Platfora is indeed ambitious.
Platfora today announced it raised $5.7 million in Series A funding led by Andreessen Horowitz, with additional support from In-Q-Tel. In an accompanying blog post, Werther said Platfora has developed a platform to allow business users to interactively explore large data sets stored on Hadoop and create multidimensional, predictive dashboards and reports.
Yesterday my Wikibon colleagues and I had the pleasure of speaking with Charles Zedlewski, Vice President of Products at Cloudera, in anticipation of today’s Cloudera Enterprise 3.5 release. In addition to discussing the new features in today’s release, Zedlewski also talked about Cloudera’s position in the now three-member commercial Hadoop distribution market.
EMC joined the commercial Hadoop club in May with the release of its own enterprise distribution, which includes MapR’s distributed file system. This morning, Yahoo spun off its Hadoop engineering unit to form HortonWorks, a new company that will offer its own enterprise Hadoop product soon.
The new company, to be called HortonWorks (inspired by the Dr. Seuss character), will focus on developing its own enterprise-ready Hadoop distribution and support services based on the open source Apache Hadoop project. When HortonWorks debuts its commercial Hadoop distribution, it will be the third such product on the market, along with commercial distros from Cloudera and EMC Greenplum (See Table 1).
In a recent interview with InformationWeek, Microsoft CEO Steve Ballmer claimed that IBM and Oracle don’t understand Big Data. For Ballmer and Microsoft, Big Data depends not so much on the size of the data as on the type of data being processed and analyzed.
Specifically, for a data processing and analytics project to qualify as Big Data, it must encompass not just internal corporate data, but also third-party data that resides outside the firewall, according to Ballmer. He said IBM and Oracle limit their Big Data approaches to internal data, so by his definition their offerings are not in fact Big Data.
In 2010, key trends in infrastructure technology innovation included big data, cloud services, simplicity, virtualization, NAND flash, and data efficiency. We discuss these trends and core technology innovations in our Wikibon article, Best Enterprise Infrastructure Technology Innovations of 2010, and name our Wikibon 2010 CTO award winners here.