Posts Tagged Data warehousing
Despite the apparent contradiction, Hadoop and other emerging Big Data approaches are simultaneously complementary and disruptive to established data warehousing and business intelligence practices in the enterprise. I recently spoke with my colleague Stu Miniman about this and other findings from Wikibon’s Q2 2014 Big Data Analytics Survey in the Cube Conversation below. The survey, one of two major Big Data surveys Wikibon will undertake this year, is part of Wikibon’s new Big Data research service. The new service focuses on primary, data-driven research designed to uncover how Big Data is practically applied in today’s enterprise, explore its impact on existing modes of data management and analytics, and understand its implications for existing and start-up Big Data vendors. To find out more about Wikibon’s new Big Data research service, please email
Traditionally, data processing for analytic purposes follows a fairly static blueprint. Enterprises create mainly structured data with stable data models via enterprise applications like CRM, ERP and financial systems. Data integration tools extract, transform and load the data from enterprise applications and transactional databases to a staging area, where data quality checks and data normalization (hopefully) occur and the data is modeled into neat rows and tables. The modeled, cleansed data is then loaded into an enterprise data warehouse. This routine runs on a schedule – typically daily or weekly, sometimes more frequently.
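The traditional extract-transform-load routine described above can be sketched in a few lines of code. This is a minimal, hypothetical illustration only – SQLite stands in for both the operational system and the warehouse, and the table names, columns and quality rule are invented for the example:

```python
import sqlite3

def run_nightly_etl(src: sqlite3.Connection, wh: sqlite3.Connection) -> int:
    """One scheduled ETL cycle: extract, transform/cleanse, load."""
    # Extract: pull raw customer rows from the operational system.
    rows = src.execute("SELECT id, name, email FROM customers").fetchall()

    # Transform: basic normalization plus a simple data-quality rule.
    cleaned = [
        (cid, name.strip().title(), email.strip().lower())
        for cid, name, email in rows
        if email and "@" in email  # drop rows that fail the quality check
    ]

    # Load: refresh the warehouse dimension with the modeled, cleansed data.
    wh.execute(
        "CREATE TABLE IF NOT EXISTS dim_customer (id INTEGER, name TEXT, email TEXT)"
    )
    wh.execute("DELETE FROM dim_customer")
    wh.executemany("INSERT INTO dim_customer VALUES (?, ?, ?)", cleaned)
    wh.commit()
    return len(cleaned)
```

In practice this step runs as a scheduled batch job, which is exactly why the warehouse is always at least a day behind the source systems.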
The Big Data community has been waiting in anticipation for Ben Werther’s start-up Platfora to come out of stealth mode and reveal its grand vision since early summer. Well, that day has come and Werther’s vision for Platfora is indeed ambitious.
Platfora today announced it raised $5.7 million in Series A funding led by Andreessen Horowitz, with additional support from In-Q-Tel. In an accompanying blog post, Werther said Platfora has developed a platform to allow business users to interactively explore large data sets stored on Hadoop and create multidimensional, predictive dashboards and reports.
This week, SAS Institute unveiled a new analytics tool that it will offer in conjunction with data warehouse vendors Teradata and EMC-Greenplum. Called SAS High Performance Analytics, the tool will live inside the data warehouse, a technique known as in-database analytics that is becoming increasingly popular in the era of Big Data.
By embedding scoring and modeling capabilities inside the database, in-database analytics allows users to run complex analytics against large data sets without having to transfer the data to a separate analytics or business intelligence application. Loading large volumes of data into an analytics platform can take hours or even days, and in some cases isn’t even possible. As a result, users must often be content to analyze just sample sets of data, which can sometimes lead to inaccurate analysis.
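The core idea – ship the model to the data rather than the data to the model – can be shown with a toy example. This is not SAS’s implementation; it is a sketch in which SQLite stands in for the warehouse, and the table, columns and coefficients are all invented for illustration:

```python
import sqlite3

def score_in_database(db: sqlite3.Connection):
    """Evaluate a simple linear scoring model as SQL, where the data lives.

    Only the small scored result set leaves the warehouse; the raw rows
    are never exported to a separate analytics application.
    """
    return db.execute(
        """
        SELECT customer_id,
               0.4 * purchases + 0.6 * returns AS churn_score
        FROM customer_stats
        ORDER BY churn_score DESC
        """
    ).fetchall()
```

A real in-database deployment would push far more sophisticated models (and use the warehouse’s parallelism), but the data-movement argument is the same: the expensive part, scanning every row, happens inside the database engine.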
It’s statistics 101: the larger the sample size, the more accurate the results.
So if you want to analyze your customers’ behavior patterns – do they shop online or in stores, when do they make purchases, how often do they make returns – the more customer data can run through your analytics engine the better your results.
But what if you didn’t have to rely on sample data sets at all, and could analyze all your customer data? No sample-based analysis can give you a more accurate picture of customer behavior than that.