Oracle added a twist to this morning’s announcement regarding the general availability of its Big Data Appliance and related Big Data connectors. Rather than shipping the appliance with its own Hadoop distribution or the vanilla Apache distribution, Oracle has partnered with Cloudera to include its Hadoop distribution and management software instead.
Originally announced at Oracle OpenWorld in October, the Oracle Big Data Appliance is a preconfigured hardware-software bundle running Oracle Linux. It is available in a full-rack configuration of 18 Oracle Sun servers and includes the community edition of Oracle’s NoSQL database, an open source distribution of R, and the Oracle HotSpot Java Virtual Machine for running MapReduce jobs, in addition to CDH (Cloudera’s Distribution Including Apache Hadoop) and Cloudera Manager.
Oracle told Wikibon/SiliconAngle that it has fully tested and certified CDH to run in Oracle environments and that Oracle will provide maintenance and support.
Oracle also announced the availability of Oracle Loader for Hadoop, which, as the name suggests, aims to streamline the process of loading data from Hadoop into Oracle 11g. In addition, the company released Oracle Data Integrator Application Adapter for Hadoop, a GUI-style tool for generating MapReduce jobs; Oracle R Connector for Hadoop, which lets users of the open source R language work with HDFS data in Oracle environments; and Oracle Direct Connector for Hadoop Distributed File System, which enables direct data access between the Oracle Database SQL engine and HDFS.
Oracle’s Winding Road to Hadoop
Oracle was a curious presence at Hadoop World in November, considering it had only just come around to Hadoop, having previously claimed the Exadata appliance was the answer to all your Big Data needs. That premise was never a sound one, as demonstrated by this detailed analysis from Wikibon’s David Floyer. Due to its traditional scale-up architecture and inability to process unstructured data, Exadata is simply not a cost-effective way to manage and analyze truly Big Data.
When Oracle announced its intention to include Hadoop in its new Big Data Appliance at Oracle OpenWorld, just five weeks before Hadoop World, some (myself included) surmised that Oracle might be embracing Hadoop and NoSQL approaches only to stall their development and preserve the market for its Exadata machines. But with today’s Cloudera partnership, which isn’t unlike EMC’s relationship with MapR, I’m satisfied that Oracle is sincere in its support of Hadoop, at least for now.
The main draw for Oracle was likely Cloudera’s proprietary Hadoop management software. Recently upgraded and renamed (previously called Cloudera Service and Configuration Manager), Cloudera Manager includes an event manager tool called Global Time Control, which graphically visualizes all activity in a Hadoop cluster to aid downtime discovery and diagnostics. Cloudera Vice President of Products, Charles Zedlewski, likened it to TiVo for your Hadoop cluster when we spoke last month. Cloudera Manager is also tightly integrated with Cloudera’s support processes.
None of this means Oracle is backing away from Exadata, however. Like other data warehouse vendors, the company envisions customers using its Big Data Appliance and related connectors to process unstructured data and feed it into existing Oracle data warehouses. The goal is to enrich traditional, structured data with machine-generated data such as sensor, log-file and geo-location data, said Cetin Ozbutun, Oracle Vice President of Data Warehousing Technologies.
The move is also a good one for Cloudera. It gives the start-up an important new distribution channel and adds credibility to its claim that Hadoop is enterprise-ready. The company will presumably get a percentage of each Big Data Appliance sale by Oracle, rather than the monthly subscription revenue it collects from its direct customers.
Oracle Looks to Build its Big Data Street Cred
The new Big Data Appliance should prove attractive to some of Oracle’s large enterprise customers, particularly those in financial services and manufacturing that have been struggling (and paying mightily) to manage and analyze exploding data volumes with existing Oracle database technology.
The appliance’s “plug and play” approach will also appeal to Oracle customers that want to leverage Big Data but don’t want to spend months configuring a complex Hadoop deployment on their own, or that lack the Hadoop know-how to do so even if they wanted to. And it should get Oracle customers that have yet to bump up against the Big Data problem thinking about possible use cases for the first time, a good thing for Hadoop and the Big Data movement as a whole.
It is unlikely, however, that the Big Data Appliance will result in any significant number of net new customers for Oracle, since there are a number of less expensive, less risky options on the market, not least Cloudera itself. Nor is it likely to spur Exadata or Exalogic sales. Rather, Oracle is likely looking for long-term Big Data credibility, and, perhaps, even laying the groundwork for a Cloudera acquisition.
Most current Cloudera deployments are of the free, proof-of-concept variety, and the company’s challenge in 2012 is to turn many (if not most) of those into full-scale, production-level paying deployments. If it delivers, Cloudera could significantly extend its market lead over rivals Hortonworks and MapR, making itself an attractive acquisition target for Oracle, IBM or even EMC.
Another issue that needs more clarification is support. While the press release accompanying today’s announcement says the two vendors will collaborate in providing support services for the Big Data Appliance, Oracle’s Ozbutun told Wikibon/SiliconAngle that Oracle is responsible for maintenance and support. Hopefully, for customers at least, the former is accurate, as Cloudera’s technical support services and management software are its key differentiators. As far as I can tell, Oracle has no experience maintaining and supporting Hadoop clusters, large or small, PoC or in production.
I suggest current Oracle Exadata and 11g customers take a good long look at the new Big Data Appliance. It could prove applicable to those customers that are looking for an effective way to manage Big Data but are reluctant to throw their lot in with a start-up such as Cloudera or MapR. But be sure to ask for more details around maintenance and support, particularly whether Cloudera support is included in the license.