On Monday, February 6, 2012, EMC formally announced two projects:
- Project Lightning, now named EMC VFCache, available in March 2012. Follow the link for a detailed assessment.
- Project Thunder, which will be in beta testing in 2Q 2012.
Project Thunder - Introducing Server Area Networks
One of the keys to EMC's success was its early embrace of SANs (storage area networks). SANs gave operations great flexibility in connecting servers to storage, and they have grown from simple networks into complex SANs spanning metro distances. Single arrays have grown into federated clusters of storage arrays. Operating systems, file systems, database systems, and hypervisors have developed to take advantage of this rich storage infrastructure.
This topology has two major constraints. The first is the relative slowness of IO to persistent storage, mainly disk drives. Well-tuned storage environments struggle to keep IO response times in the low milliseconds, while the power of processors has grown with Moore's Law. The problem of IO variance (the phantom 700ms IO response times that come and go without reason) has grown worse. The cost-per-gigabyte of storage has dropped with Moore's Law, while the cost of IO has remained static. Vendors and practitioners have struggled to keep storage from being the bottleneck. Storage is assumed to be the problem in the data center whenever an application does not perform.
Flash has changed the cost, speed, and IO density of persistent storage by orders of magnitude, and will continue to do so for several more generations. As with any technology, smart technologists have predicted that performance increases will cease because of some barrier, only to see that barrier removed by even smarter people. Signal processing algorithms from Anobit (now owned by Apple) are the latest miracle cure for increasing the life of flash bit cells.
EMC's Project Thunder aims to replicate its success in SANs in server area networks. The network throughput can be 10 times higher, the network latency can be 1,000 times lower, the IO rate can be 100,000 times higher, and the IO density (MIOP/TB) can be 200,000 times greater. The pure cost of storage can be 78 times less expensive than storage on SANs if only a trickle of data is required to be read or written. Cost/GB is no longer a useful metric for describing the value of storage.
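To illustrate why cost/GB alone can mislead, the following is a minimal Python sketch that computes IO density (MIOPS/TB), cost/GB, and cost per IOPS for two tiers. The `StorageTier` class and every capacity, IOPS, and price figure below are hypothetical values chosen only to show the arithmetic; they are not EMC or Wikibon data and do not reproduce the ratios quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageTier:
    """Hypothetical description of a storage tier (illustrative only)."""
    name: str
    capacity_tb: float   # usable capacity in terabytes
    iops: float          # sustained IO operations per second
    cost_usd: float      # acquisition cost in US dollars

    def io_density_miops_per_tb(self) -> float:
        # Millions of IOPS available per terabyte of capacity.
        return (self.iops / 1_000_000) / self.capacity_tb

    def cost_per_gb(self) -> float:
        # Uses decimal units: 1 TB = 1,000 GB.
        return self.cost_usd / (self.capacity_tb * 1000)

    def cost_per_iops(self) -> float:
        return self.cost_usd / self.iops


# Assumed, made-up figures for illustration; not vendor data.
disk = StorageTier("disk array (hypothetical)", 100.0, 20_000, 200_000)
flash = StorageTier("flash appliance (hypothetical)", 10.0, 1_000_000, 200_000)

for tier in (disk, flash):
    print(
        f"{tier.name}: "
        f"{tier.io_density_miops_per_tb():.4f} MIOPS/TB, "
        f"${tier.cost_per_gb():.2f}/GB, "
        f"${tier.cost_per_iops():.2f}/IOPS"
    )
```

With these assumed inputs, the flash tier looks ten times more expensive on cost/GB but far cheaper on cost per IOPS, which is the kind of reversal that makes cost/GB a poor measure of value on its own.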
Figure 1 shows EMC's vision of a Server Area Network, a set of appliances that sit between the blade servers and traditional storage arrays.
In a previous alert, Wikibon put forward a five-layer model for future computing. This is shown in Figure 2 below.
EMC's Project Thunder is mapped onto this model in Figure 3 below.
One key difference between the two visions is the use of flash as an extension of main memory: the ability of the processor to write atomically and directly to persistent storage. This can significantly improve the performance of file systems, databases, and many operating system and hypervisor functions, but software will need to change to take full advantage. These changes are likely to take several years to come into general use.
This technology is at a very early stage of development, and other vendors are presumably working on solutions. Fusion-io has been an important early developer of the technology and will continue to contribute. Cisco, EMC with VCE, HP, IBM, and Oracle will all be contributing, as well as software giants such as Microsoft and SAP, and database vendors.
Action Item: The most important potential of this announcement is for future systems design. As Wikibon has stated before, future systems should be designed top-down in an IO-centric era, assuming that IOPS are now very low cost compared to disk-based systems. Big data transaction systems and big data analytic systems are likely to be designed together, with the data flowing top-down from servers to layers of flash, picking up indexes and metadata, before finally ending up on disk for long-term storage. The most important capability will be the ability to manage active data across the SAN, levels of persistent flash storage, remote copies, and indexes and metadata. The choice of this management layer will be the most important long-term decision that CTOs and CIOs will need to assess.