#memeconnect #emc #cf #bigdata
Is your data warehouse a dinosaur headed for extinction? asks Wikibon CEO David Vellante in the latest Peer Incite newsletter. In his lead article, Big Data Update: Your Data Warehouse is Not a Dinosaur, he focuses on two major forces in the market that challenge the traditional data warehouse: data appliances such as Oracle's Exadata, and big data systems, a major new approach to data analysis that incorporates structured and unstructured data to answer new kinds of business questions.
These two trends challenge traditional data warehouses in separate ways. A newly released Wikibon study, based on interviews with 40 Wikibon community members, concludes that IT appliances can reduce the total cost of ownership (TCO) of a data warehouse by an impressive 15%-30%, with results tending toward the higher end of that range. Much of the savings comes from the reduced cost and complexity of maintenance and updates inherent in treating the entire stack as a single SKU: a single vendor, in this case Oracle, provides pre-tested updates for the entire system that can be installed automatically. However, Vellante warns, these savings presume that the appliance has sufficient internal storage to house the data warehouse. When the database outgrows the appliance's capacity, the user faces two expensive alternatives: purchase another complete appliance, which in most cases means paying for unneeded compute and other capabilities, or add external storage over an Ethernet connection. The second option breaks the single-SKU model, increasing the cost and complexity of maintenance and updates.
He also warns that the appliance approach is the ultimate vendor lock-in, a key part of Oracle's strategy to own the entire IT stack, and that it is inflexible in its operating-system support, which is limited to Unix. Thus, he concludes, some companies will choose to maintain, and build new, roll-your-own data warehouses despite the extra TCO involved.
Big Data vs the Data Warehouse
Meanwhile, big data analysis projects do not fit well into the data warehouse architecture or infrastructure, whether roll-your-own or appliance-based, writes Wikibon CTO David Floyer, the research lead on the study. In his article, Financial Comparison of Big Data MPP Solution and Data Warehouse Appliance, he presents a detailed cost study of a composite customer-experience analysis project, based on the experiences of 40 Wikibon community members. The result is eye-opening. Using the traditional data warehouse approach on the Oracle Exadata appliance, the cumulative three-year cash flow would be $53M, compared to $152M, nearly three times as much, for the MPP/big data approach. Net Present Value would be $46M using the DW approach, versus $138M, 3X, using MPP. The internal rate of return was 74% for the DW and a whopping 524% for MPP. The DW version would break even in 26 months, compared to four months for MPP. The choice was a no-brainer. And an increasing number of the advanced business analytics projects companies will undertake, and use as the basis for their business strategies going forward, will involve huge amounts of structured and unstructured data of multiple kinds from multiple internal and external sources, which is the definition of big data. This implies that over time big data projects and programs will come to dominate new data analysis work.
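For readers unfamiliar with the metrics in Floyer's comparison, the sketch below shows how cumulative cash flow, NPV, IRR, and break-even are computed. The cash-flow schedule here is a hypothetical placeholder, since the article does not publish the underlying annual figures; only the method is illustrated.

```python
def npv(rate, cash_flows):
    """Net Present Value: discount each period's cash flow back to today."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

def irr(cash_flows, lo=0.0, hi=10.0, tol=1e-6):
    """Internal Rate of Return: the discount rate at which NPV equals zero,
    found here by simple bisection (assumes one sign change in the flows)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if npv(mid, cash_flows) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def break_even_period(cash_flows):
    """First period at which cumulative undiscounted cash flow turns positive."""
    total = 0.0
    for t, cf in enumerate(cash_flows):
        total += cf
        if total >= 0:
            return t
    return None

# Hypothetical annual cash flows in $M: an up-front investment followed by
# three years of returns (NOT the article's actual schedules).
project = [-20, 30, 40, 50]

print(f"Cumulative cash flow: ${sum(project)}M")
print(f"NPV at a 10% discount rate: ${npv(0.10, project):.1f}M")
print(f"IRR: {irr(project):.0%}")
print(f"Break-even period: year {break_even_period(project)}")
```

The same mechanics apply to the article's comparison: a project whose returns arrive sooner and larger relative to its up-front spend, as claimed for the MPP approach, shows a shorter break-even and a much higher IRR.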
But, says Floyer, this does not imply that big data technologies such as MPP will replace traditional data warehouses. Instead, he says, they will exist in parallel. The traditional data warehouse will still play a vital role in the business. Financial analysis and other applications associated with the DW will remain important, and the DW itself will be a source of some of the data used in big data projects and will probably receive data back from the results of advanced analysis projects. So big data is an expansion of, rather than a replacement for, DW and RDBMS data engines.
From Big Iron to the Toaster Appliance
This is good news for the traditional vendors, says Stu Miniman in The Emerging Big Data Vendor Ecosystem, because most of the core technologies for big data are coming from startups such as Cloudera, ClickFox, Membase, Karmasphere, and DataStax, along with Internet companies such as Google, Yahoo, LinkedIn, and Bit.ly. This is normal and to be expected: expenditures on traditional DW total $10B+ annually, while the big data market, though growing rapidly, is less than 10% of that. The big vendors are already responding with a typical strategy: writing checks. EMC has acquired Greenplum and Isilon, and Teradata is apparently in the process of acquiring Aster Data.
However, this dependence on small startup vendors only adds to the risks of big data projects. This is still largely unexplored territory, argues Nick Allen in Leverage the Cloud for Big Data to Avoid Traditional Traps. Some big data projects will inevitably fail, and even the successes will often be one-offs rather than ongoing programs; the next big data project may require entirely different infrastructure from the last. As a result, he argues, companies would do well to limit their up-front financial expenditures to reduce that risk. To do that, he suggests, they should leverage public cloud service and infrastructure providers wherever possible rather than buying hardware and software they may use only once.
Appliances and big data complement each other in another way, writes Bert Latamore in IT Appliances, Big Data, Both Challenge the Traditional IT Organization. Early adopters of IT appliances have put them in the database management group, but that is a forced fit. These are really IT-in-a-box, with storage, compute, network, and management built into a single integrated package. They need a cross-functional team that cuts across the traditional IT silos and is focused on delivering data warehouse functionality to end users rather than on running a particular piece of the infrastructure. Big data projects also require special cross-functional teams, argues Mr. Vellante in CIOs Need to Organize Big Data Teams. These teams should include data scientists, programmers, and business professionals who can monetize the data.
“The bottom line,” says Mr. Vellante, “is traditional data warehouses are not dead, they are being complemented by new and emerging big data apps. These newer applications will take feeds from corporate data warehouses and feed back analytics to the main enterprise warehouse over time.” Meanwhile, he predicts the emergence of a data value group in enterprises to oversee and run both traditional DW/BI and new big data initiatives to ensure that they produce business value. “CIOs should plan accordingly and construct a five year plan to evolve this role and the skill sets needed to thrive in this new world.”