This week, SAS Institute unveiled a new analytics tool that it will offer in conjunction with data warehouse vendors Teradata and EMC-Greenplum. Called SAS High Performance Analytics, the tool will run inside the data warehouse, an approach known as in-database analytics that is becoming increasingly popular in the era of Big Data.
By embedding scoring and modeling capabilities inside the database, in-database analytics allows users to run complex analytics against large data sets without having to transfer the data to a separate analytics or business intelligence application. Loading large volumes of data into an analytics platform can take hours or even days, and in some cases isn’t possible at all. As a result, users must often be content to analyze just samples of the data, which can lead to inaccurate analysis.
By bringing the analytics to the data (rather than the other way around), in-database analytics can help an organization significantly increase the speed of analyzing large data sets and wring more value from its existing data warehouse investments.
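The difference between moving the data to the analytics and moving the analytics to the data can be sketched in a few lines. The example below is only an illustration, not SAS's or any warehouse vendor's actual interface: it uses Python's built-in sqlite3 module as a stand-in for a data warehouse, and the claims table, the linear scoring weights and the function names are all made-up assumptions.

```python
import sqlite3

# Hypothetical linear scoring model (illustrative weights, not from any vendor).
INTERCEPT = -4000.0
W_AMOUNT = 0.5
W_VISITS = 8.0
THRESHOLD = 0.0


def build_claims_db():
    """Create an in-memory table of made-up insurance claims."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE claims (id INTEGER PRIMARY KEY, amount REAL, visits INTEGER)"
    )
    rows = [(i, 10.0 * i, i % 5) for i in range(1, 1001)]
    conn.executemany("INSERT INTO claims VALUES (?, ?, ?)", rows)
    return conn


def flag_by_extracting(conn):
    """Traditional approach: pull every row out of the database, then score it."""
    flagged = []
    cursor = conn.execute("SELECT id, amount, visits FROM claims ORDER BY id")
    for claim_id, amount, visits in cursor:
        score = INTERCEPT + W_AMOUNT * amount + W_VISITS * visits
        if score > THRESHOLD:
            flagged.append(claim_id)
    return flagged


def flag_in_database(conn):
    """In-database approach: the scoring expression runs inside the SQL engine,
    so only the ids of flagged claims leave the database."""
    query = """
        SELECT id FROM claims
        WHERE ? + ? * amount + ? * visits > ?
        ORDER BY id
    """
    params = (INTERCEPT, W_AMOUNT, W_VISITS, THRESHOLD)
    return [row[0] for row in conn.execute(query, params)]


if __name__ == "__main__":
    conn = build_claims_db()
    print(len(flag_by_extracting(conn)), "claims flagged after extracting all rows")
    print(len(flag_in_database(conn)), "claims flagged inside the database")
```

Both functions flag the same claims, but the first transfers every row to the client before scoring, while the second returns only the flagged ids. On a toy table the difference is negligible; the argument for in-database analytics is that at warehouse scale the transfer step is the part that takes hours or days.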
In-database analytics is not a new phenomenon, but it has gained increasing attention over the last couple of years as data volumes have exploded. SAS Institute has been a big proponent of in-database analytics, having partnered with a number of data warehouse vendors over the last two years. The Cary, N.C.-based vendor already had an in-database analytics partnership with Teradata, which last month acquired MPP data warehouse vendor Aster Data. Before that acquisition, SAS had also partnered on in-database analytics with Aster Data, as well as with Netezza and IBM.
Last fall, for example, Italy’s social security agency began using a Teradata box with SAS in-database tools to analyze millions of health insurance claims for possible fraud.
Not surprisingly, Netezza was not a part of this latest announcement from SAS. IBM acquired Netezza in September 2010 and has its own data mining technology in the form of SPSS, which Big Blue purchased in 2009. IBM has since taken its own steps to incorporate analytics into the database, most recently partnering with Fuzzy Logix to integrate its analytics tools with IBM’s Informix database.
But SAS continues to make the most noise in the in-database analytics space. As an independent company, it has more flexibility to partner with various data warehouse vendors than does IBM-owned SPSS. In an interview with TechTarget last year, SAS co-founder and CEO Dr. Jim Goodnight explained his in-database partnership strategy:
“As for the data warehouse vendors we partner with, they’ve got to have enough buy-in to make it worth the cost because this is expensive for us to do. Also, there are some vendors whose architecture just is not very receptive to a foreign object down there in their database. A lot of these databases are tuned to the point that they can’t bear any extra cycles. It’s a little bit of a give and take, and we will partner with only those vendors that are truly interested in working closely with us and are willing to invest some R&D on their side to open up that node for us to force all of that stuff in. That’s the requirement.”
The move toward in-database analytics makes sense for SAS, as it opens up a new go-to-market opportunity. And as data volumes continue to climb, demand for the technology is likely to grow at the expense of the stand-alone analytics platforms that have been SAS’s bread and butter for years.