The timing was curious.
In late January, HP announced a partnership with Microsoft to release a series of analytic appliances, among them an enterprise data warehouse (EDW) appliance based on HP's hardware and Microsoft's SQL Server 2008 R2 database. Just a few days later, HP officially confirmed it was discontinuing HP NeoView, its proprietary EDW offering. Launched in 2007, NeoView failed to gain significant traction with customers and was criticized for its lack of scale-out capabilities.
Then, in mid-February, HP announced it was in the process of acquiring Vertica, a Massachusetts-based maker of columnar-storage, massively parallel processing (MPP)-based data warehouse appliances.
So why would HP partner with Microsoft to release a series of data warehouse appliances, then purchase a data warehouse appliance maker whose core product will presumably compete with the HP-Microsoft appliances?
The answer may be that Vertica's MPP data warehouse appliance and the Microsoft-HP appliances will complement rather than compete with one another. It's possible the two technologies could even be deployed in the same enterprise.
Consider: Vertica's core analytic platform is designed to scale out to handle large volumes (hundreds of terabytes or more) of data. It is among a handful of analytic appliances hitting the market in recent years that leverage MPP architectures to process massive analytic jobs in the emerging “era of big data” (others include appliances from Greenplum, Aster Data, and ParAccel). It is designed to quickly and continuously load large data volumes, and its data compression and in-memory capabilities mean it can process analytic queries and return results in near real time.
Its main limitation, as cited by users and others, is a lack of workload management features, which makes it difficult for the platform to handle mixed-query workloads from large numbers of concurrent users, both hallmarks of any EDW. Most Vertica customers have fewer than 100 users accessing the analytic platform, according to Gartner.
Microsoft's SQL Server 2008 R2 DBMS represents a more traditional EDW platform. (Yes, Microsoft launched its own MPP-based data warehouse, SQL Server 2008 R2 Parallel Data Warehouse, last fall, based on technology it acquired from DATAllegro, but that is still a young product lacking a large customer base.) An EDW based on the SQL Server 2008 R2 DBMS is adept at serving many users scattered throughout an organization who want to analyze multiple data domains.
So how could the two technologies, an HP-Vertica analytic appliance and an HP-Microsoft EDW appliance, coexist in the same environment? Vertica answered the question in a white paper released this time last year. In it, Vertica maintains, “by offloading certain analytic workloads from an existing EDW, based on popular solutions, like Oracle, DB2 and Teradata (and you could add Microsoft SQL Server to this list) to a Vertica Analytic Database, it is possible to increase the performance and longevity of the EDW and satisfy the ever-growing requests for information from the business.”
The idea is that an organization could maintain a traditional EDW, loading manageable amounts of data from multiple domains in batches (be it daily or weekly), that would allow large numbers of users to run standard queries and generate reports. The smaller number of users who want to run query-intensive analytic jobs on large volumes of data, which can significantly slow the performance of an already busy EDW, would tap data marts based on Vertica's platform with data offloaded from the EDW. Vertica's parallel data integration capabilities, which the vendor says are up to 10 times faster than those of traditional data warehouse platforms thanks to its hybrid architecture, along with its appliance model, mean a supplemental data mart could be up and running in an enterprise in a matter of days.
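The split described above can be sketched as a simple routing rule. This is a minimal illustration only, not anything taken from Vertica's or Microsoft's products: the `Query` fields, the `route` function, the system names, and the row-count threshold are all hypothetical and chosen purely to show the idea.

```python
from dataclasses import dataclass

# Hypothetical cutoff: a query expected to scan more rows than this is
# treated as a data-intensive analytic job.
HEAVY_SCAN_ROWS = 100_000_000


@dataclass
class Query:
    user: str
    estimated_rows_scanned: int
    is_standard_report: bool


def route(query: Query) -> str:
    """Pick the system that should run a query (illustrative names only)."""
    # Standard reports for the broad user base stay on the central EDW.
    if query.is_standard_report:
        return "edw"
    # Heavy ad hoc analytics go to the MPP data mart fed from the EDW.
    if query.estimated_rows_scanned > HEAVY_SCAN_ROWS:
        return "data_mart"
    # Light ad hoc queries are cheap enough to leave on the EDW.
    return "edw"
```

In practice the decision would likely rest on more than an estimated row count (user role, time of day, data freshness), but the principle is the same: keep the many routine queries on the EDW and move the few expensive ones off it.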
This sounds like a plausible strategy for organizations that encounter this scenario – the need for a large, stable EDW and smaller, more nimble and powerful data marts. It allows users with data-intensive analytic jobs the flexibility they need but allows the organization to maintain a certain level of data quality, since the data is loaded from a centrally managed EDW. EDW users, meanwhile, should encounter few if any performance issues, which can hamper user adoption.
The strategy could also be a good one for HP, at least in the short term. Its hardware would power both the Microsoft appliances and the Vertica platform, bringing in two steady streams of income. Its partnership with Microsoft also helps HP rebuild its credibility in the EDW market after the glaring failure of NeoView.
In the long term, however, expect HP to make another move into the EDW market on its own. Hardware, while important, is basically a commodity today. Most databases and data warehouses can run on commodity hardware from any number of vendors, including IBM, Oracle-Sun, Dell and, of course, HP. And partnerships with Microsoft and others mean HP is splitting data warehouse revenue with potential competitors. The real money, and differentiation from competitors, is in the software and analytics business.
If HP really wants to compete with mega-vendors like IBM, Oracle, and Microsoft, it's going to have to either buy or develop its own EDW technology. One option is to tap Vertica's talent and try to develop the Vertica platform into a true EDW. That would require Vertica to add workload management features and optimize the platform to handle larger volumes of mixed-workload queries. This is possible, of course, but it would require rejiggering a platform optimized for speed and scale.
Another tack would be to make an acquisition. The most obvious choice here would be SAP, for a number of reasons. Most importantly, SAP boasts a well-regarded (if not flashy) EDW platform in the form of Sybase IQ (SAP acquired Sybase in July of last year). HP and SAP already partner on a number of fronts, including data warehousing. And, of course, HP CEO Leo Apotheker was the top man at SAP until February 2010.
It’s hard to say which direction HP will go. Publicly, the company insists it is not interested in acquiring SAP and is content to partner with Microsoft and others to keep its toes in the EDW market. But maintaining this status quo in one of the fastest growing and most lucrative IT segments isn't likely to sit well with Apotheker. Expect HP to make a move towards its own EDW offering, one way or another, in the next year.
Action Item: For now, Vertica customers can probably rest easy. HP acquired Vertica for its technology and will likely let the company operate independently for the foreseeable future (just as IBM is doing with Netezza and EMC with Greenplum). Customers running any of the HP-Microsoft appliances (or any analytic appliance running on HP hardware, really) who are looking to offload data-intensive analytic jobs to an MPP-based data mart can and should add HP-Vertica to their list of potential vendors, if they haven't already. Long term, I suspect HP will go the acquisition route, considering Apotheker's close ties to SAP and HP's lack of success building its own data warehouse, NeoView, from within. Either way, the stakes are probably higher for HP than for customers. Even if HP develops or buys its own EDW technology, customers will still be able to run competing data warehouses on HP hardware, as it is highly unlikely HP would restrict the use of its hardware to its own database offerings. Hardware isn't a differentiator, but it is a steady revenue source for HP.