Hadoop and HBase have changed the economics of data, Cloudera CEO Mike Olson said on a SiliconAngle.tv live webcast from the O'Reilly Strata 2012 conference. “When we talk to customers, one of the measurements we talk about is return per byte of data,” he told SiliconAngle CEO John Furrier and Wikibon.org Chief Analyst David Vellante.
Admitting that he is “an old-school relational guy,” he said, “Relational technology evolved over 30 years to solve certain business problems, and today if you need to capture transactional data, for instance, you should look to relational.”
But the issue with the RDBMS is that it is expensive. Hadoop has dropped the cost of storage to a fraction of the cost per byte in an RDBMS, so companies can afford to keep, and profit from, much more data. At the same time, new kinds of data – machine-generated data from machine-to-machine interactions, social data from human interactions in the cloud, etc. – have appeared. And because these have never been available before, he says, “even very simple analytics pay enormous value.”
The result is that companies can collect data in a volume and variety not available before. They can save every detail of the data rather than just approximations and realize a net positive return on every byte.
This makes Hadoop and big data a complement to, rather than a replacement for, the traditional RDBMS data warehouse, allowing businesses to answer new questions that were not only unanswerable but often unaskable five years ago. Today, however, big data analytics often runs in total isolation from traditional applications, and its full value will be realized only when it is integrated into the IT architecture.
“Cloudera does have consulting today, but basically we are a product company that delivers integration with the infrastructure,” Olson said. And because HBase runs at full Web interactive speeds and is fully integrated with Hadoop, it provides the full power of MapReduce and gets around the Hadoop batch-only, high-latency restrictions without sacrificing the analytic power underneath.
And, he says, it is a nice development platform that developers like. That is particularly important at the moment because “the gating factor to wider adoption of big data is applications. Financial analysts, insurance adjusters and other potential business users need applications that know all that data is available and can use it. Those applications are starting to appear, but today we are at 1985 in RDBMS development terms.”
Even without that, HBase is being applied to an increasing number of high-value business problems that give an indication of the potential for big data analytics. Those include behavioral customer analytics and detailed analysis of individual investor portfolios, including the risk involved. It also is “turbocharging company security architectures” by allowing companies to keep all the detailed data on network activity, not just on threats they recognize immediately. That allows them to analyze all network events in detail to identify hidden attacks and use that knowledge to strengthen their security architecture. Those kinds of analysis can have huge positive impacts on corporate profitability that easily justify the investment.