A crucial component of the Big Data value proposition is the ability to bring together structured and unstructured data in a single platform for business analytics and application development. That approach received further validation last week when HP announced it had “combined” Autonomy’s enterprise search platform with Vertica’s massively parallel analytic database into a single Big Data Analytics platform.
Yes, but … delivering a comprehensive, seamless Big Data Analytics platform requires a lot more than just “combining” two complementary technologies. Unfortunately, the Autonomy/Vertica announcement was short on specifics. Perhaps this shouldn’t come as a shock, since the Autonomy acquisition closed just over a month ago.
Regardless, HP must answer the following questions if it is to deliver on its promise of a fully integrated Big Data Analytics platform:
- How are the two technologies integrated from a data processing perspective? HP claims to have created a “single processing layer” accessible via a NoSQL interface for both unstructured and structured data, but left most of the details out. How exactly does the new platform blend Autonomy’s Bayesian approach and Adaptive Probabilistic Concept Modeling algorithms for understanding patterns in text-based content with Vertica’s real-time loading and query capabilities for relational data?
- How will application developers and end users interact with the new Big Data platform? The announcement made no mention of what the application development layer looks like or how it will help move new applications built in the system into production. Nor did HP discuss any visualization technology that would let non-expert business users explore Big Data on their own. It’s worth pointing out that HP got out of the business intelligence market when it shuttered Neoview in January, so it may have to turn to partnerships on the BI tools/visualization front.
- What is the upgrade path to the new Big Data platform for existing Autonomy and Vertica customers? Will they have to shell out full price for the new platform, or will HP offer discounts to existing customers?
- Is the new platform intended as a complement to existing EDW, BI and Hadoop installations, or as a replacement for legacy business analytic technologies (though you could hardly call Hadoop a legacy technology)? If it’s the former, what are the technical challenges to deploying and integrating the new platform within existing IT infrastructures? If the latter, will the platform really deliver enough value to justify a major rip-and-replace of multiple analytic databases and applications that enterprises have invested heavily in over the years?
In addition to answering these questions, HP also has to contend with competitors whose platforms were built from the ground up to process and analyze both unstructured and structured data and, thus, don’t have to tackle integration challenges.
The most mature of these is Newton, Mass.-based Attivio. Attivio’s Active Intelligence Engine processes both structured and unstructured data via a single inverted index approach, allowing developers to build end-user applications via the company’s AI-SQL extension. The platform can also easily leverage existing BI tools for front-end delivery, meaning customers don’t have to perform costly rip-and-replace projects.
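To make the single-index idea concrete, here is a minimal sketch of how one inverted index can cover both free text and structured fields. It illustrates the general technique only and is not Attivio's implementation; the `UnifiedIndex` class, its methods, and the sample records are all hypothetical.

```python
from collections import defaultdict


class UnifiedIndex:
    """Toy inverted index covering free text and structured fields (hypothetical)."""

    def __init__(self):
        # Maps a (field, value) term to the set of record ids containing it.
        self.postings = defaultdict(set)

    def add(self, record_id, text="", fields=None):
        # Unstructured side: index each lowercased word of the text.
        for word in text.lower().split():
            self.postings[("text", word)].add(record_id)
        # Structured side: index each field/value pair as its own term.
        for name, value in (fields or {}).items():
            self.postings[(name, str(value).lower())].add(record_id)

    def search(self, *terms):
        # Intersect posting sets: a record must match every term.
        sets = [self.postings.get(term, set()) for term in terms]
        return set.intersection(*sets) if sets else set()


index = UnifiedIndex()
index.add(1, text="Customer complained about late delivery", fields={"region": "east"})
index.add(2, text="Late payment reminder sent", fields={"region": "west"})
index.add(3, fields={"region": "east", "status": "late"})

# One query mixes a text term with a structured filter.
hits = index.search(("text", "late"), ("region", "east"))
```

Because text tokens and field values live in the same posting structure, a query can combine both kinds of criteria without joining two separate systems, which is the integration problem a bolted-together platform has to solve.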
Another competitor is Endeca, whose application development platform, Latitude Studio, leverages a hybrid search-analytical database to process both structured and unstructured data. Endeca was recently acquired by Oracle, however, so its future as a stand-alone product is unclear. Should Oracle try to merge Endeca with its Exadata database, for example, it will face issues similar to those HP is grappling with over Autonomy and Vertica.
HP’s new Big Data Analytics platform clearly validates the approach of processing and analyzing both structured and unstructured data from a single environment. Enterprises that fail to include unstructured data from emails, social media, documents and other sources in their analytic and application development processes will, without a doubt, miss key insights that could dramatically impact their business. Therefore, enterprises across industries should consider such unified information access platforms, weighing each platform’s level of integration between unstructured and structured data processing (built seamlessly from the ground up vs. cobbled together via acquisition), ease of deployment, maturity of the application development and visualization layers, and potential to disrupt existing technologies.
HP, for its part, needs to fill in the details around its new Autonomy/Vertica Big Data Analytics platform with an eye toward the questions raised above.