The emergence of flash storage technology has the potential to dramatically improve the performance of user-facing applications. The move to in-memory computing is already having an impact in big data environments, providing analytic applications access to large volumes of data in near real-time.
Traditionally, databases that support data analytics applications store data on disk in the form of complex multidimensional cubes and tables. Users perform queries against the tables and cubes on disk via front-end applications. Because mechanical seek and rotation times limit how quickly data can be read from spinning disk, large queries incur high latency.
In-memory databases, by contrast, load and store data in random access memory. Applications perform queries against the data in RAM, greatly reducing response time and lessening the amount of data modeling required.
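As a rough illustration of the load-then-query pattern described above, the following Python sketch copies a small on-disk SQLite database into RAM and runs an aggregate query against the in-memory copy. SQLite stands in here only because it ships with Python; it is not HANA, and the `sales` table, its columns, and the helper function names are hypothetical. The sketch shows the mechanics only and makes no claim about performance.

```python
import os
import sqlite3
import tempfile


def build_disk_db(path, rows):
    """Create a small on-disk SQLite database holding a hypothetical sales table."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()
    conn.close()


def load_into_memory(path):
    """Copy the whole on-disk database into RAM and return the in-memory connection."""
    disk = sqlite3.connect(path)
    mem = sqlite3.connect(":memory:")
    disk.backup(mem)  # Connection.backup is available in Python 3.7+
    disk.close()
    return mem


def totals_by_region(conn):
    """Run an aggregate query against whichever connection is passed in."""
    cur = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return cur.fetchall()


# Hypothetical sample data, used only for this sketch.
rows = [("east", 100.0), ("west", 250.0), ("east", 50.0)]

db_path = os.path.join(tempfile.mkdtemp(), "sales.db")
build_disk_db(db_path, rows)

mem_conn = load_into_memory(db_path)
os.remove(db_path)  # the on-disk file is gone; queries are now served from RAM

result = totals_by_region(mem_conn)
print(result)
```

Once the data has been copied, the query no longer touches the disk file, which is the property the paragraph above attributes to in-memory databases. A production system would also need persistence and recovery, which this sketch omits.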
While in-memory databases are not new, they are the focus of renewed attention thanks in part to HANA, SAP’s new in-memory database engine, which is designed to support analytic applications and, eventually, transactional systems. SAP plans to migrate its entire application portfolio onto HANA, giving power analysts and business users alike access to near real-time analytics.
In-memory databases such as HANA, however, are not a big data cure-all. While HANA is capable of storing multiple terabytes of data, it does not scale to accommodate truly big data scenarios – hundreds of terabytes or more. Nor is it optimized to process unstructured data.
Action Item: Databases such as HANA support real-time analytics on relatively large, structured data sets, while Hadoop facilitates deep processing and storage of huge volumes of unstructured data for historical analysis and predictive modeling. In scenarios that require both low-latency, real-time analytic queries and deep historical analysis on large volumes of unstructured data, CIOs should consider deploying in-memory database technology and Hadoop together for a comprehensive approach to big data processing, storage, and analytics.