EMC XtremIO is a Strong Ride
With its $430 million acquisition of XtremIO in May 2012, EMC joined a crowded field of flash-only array vendors that includes almost all the major storage vendors and a plethora of flash-only array startups. The Israeli team has brought a strong product to market, making EMC a strong contender with a good product and an excellent worldwide support organization. And, with NAND flash chips probably in short supply later in 2013 and in 2014, EMC has the buying clout to ensure supply from the major fabrication facilities owned by Samsung and Toshiba.
The EMC XtremIO has the following features:
- Support for FC & 10GbE (4 ports of each for each brick),
- 4KB Block architecture,
- In-line de-duplication initially across 1-8 bricks,
- Space Efficient Snapshots,
- In-line thin provisioning across 1-8 bricks,
- Commodity Intel servers for controllers,
- High-speed 40Gb InfiniBand connection between the bricks,
- Commodity Solid State Disks (SSDs),
- VAAI support for VMware provisioning,
- Up to 250,000 random read IOPS per brick,
- Up to 100,000 random write IOPS per brick.
Data and metadata are spread across all the SSDs and all the bricks for availability and performance (a similar architecture to the Israeli-developed IBM XIV).
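As a rough illustration of what the per-brick figures imply for a multi-brick cluster, the sketch below multiplies the quoted "up to" numbers by the brick count. Linear scaling is an assumption made here for illustration only; the function name `cluster_iops_ceiling` is hypothetical, and real aggregate performance will depend on workload and interconnect behavior.

```python
# Per-brick ceilings quoted in the feature list ("up to" figures).
READ_IOPS_PER_BRICK = 250_000
WRITE_IOPS_PER_BRICK = 100_000
MAX_BRICKS = 8  # the feature list describes clusters of 1-8 bricks


def cluster_iops_ceiling(bricks: int) -> dict:
    """Return an optimistic IOPS ceiling for a cluster of `bricks` bricks.

    ASSUMPTION: ideal linear scaling across bricks. This is not a vendor
    claim; it only gives an upper bound derived from the per-brick numbers.
    """
    if not 1 <= bricks <= MAX_BRICKS:
        raise ValueError(f"bricks must be between 1 and {MAX_BRICKS}")
    return {
        "read_iops": bricks * READ_IOPS_PER_BRICK,
        "write_iops": bricks * WRITE_IOPS_PER_BRICK,
    }


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        ceiling = cluster_iops_ceiling(n)
        print(f"{n} brick(s): {ceiling['read_iops']:,} read, "
              f"{ceiling['write_iops']:,} write IOPS (ceiling)")
```

These numbers are best read as a sizing ceiling, not a forecast.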
Data service features that are not available at the moment on the EMC XtremIO array include:
- Synchronous Replication,
- Asynchronous Replication,
- Encryption,
- Compression,
- Quality of Service minimum and maximum allocations,
- External Management via RESTful APIs,
- Other VMware API Integration.
The EMC XtremIO is not in general availability. EMC has coined the term "directed availability". In other words, EMC will be ensuring that revenues from VMAX are not affected by "directing" availability to non-VMAX environments. General availability is expected later in 2013.
EMC XtremIO Reference Customer
EMC has an excellent reference customer for XtremIO, which illustrates the value of consistent low-latency all-flash arrays. CMA is in the XtremIO beta program and is using XtremIO to engineer Oracle RAC systems to be faster and handle more concurrent users. XtremIO requires only 20% of the previous storage footprint in CMA's data center. CMA identified immediate storage cost savings of nearly $500K by deploying the EMC XtremIO. CMA also realized an additional $500K in savings by nearly eliminating server I/O wait time. This reduces the number of server cores required and the number of Oracle CPU core licenses required to run the data warehouse workload.
The key to this reduction in the number of server cores and Oracle licenses required is low and consistent latency from a flash-only array, which cannot be achieved with a caching approach. All, or a very high proportion, of the active data needs to be in flash to ensure the "consistent" part of consistent low latency.
Other Arrays from Traditional Vendors
All the traditional vendors have or will soon have solutions available:
Hitachi has a flash-only module using Hitachi's own SSDs as part of the Hitachi VSP array. Each module can scale up to 76TB of flash storage. Up to four flash enclosures can be housed in Hitachi's high-end Virtual Storage Platform (VSP) array, enabling about 300TB of flash per VSP array. The advantage of this approach is that the flash-only module can use the very high function data services, such as replication and virtualization, of other storage. The disadvantage of the Hitachi implementation is the lack of in-line de-duplication and other services. This is a strong offering with traditional Tier-1 data services.
IBM acquired Texas Memory in 2012. The IBM RamSan-820 rackmount Flash storage array offers 24 TB of MLC Flash storage, 4 GB/s of bandwidth accessible through InfiniBand, and 8 Gb Fibre Channel interfaces in a 1U rackmount form factor. It is a very high-performance offering with significant flash-controller IP to maximize flash management, but it does not integrate with existing Tier-1 storage arrays.
Dell has a collaboration project which allows the Dell Compellent storage array to see a Violin all-flash array as an integrated tier of storage. This enables the Compellent Data Progression software to be used very effectively to control the all-flash part of the storage array. It is a strong mid-market offering with good integration.
NetApp announced the availability of the EF540 all-flash array (built on the acquired LSI technology) in February 2013. Additionally, NetApp is building a new, purpose-built FlashRay family of flash-only arrays. These arrays will be in beta in 2013 and GA in 2014. They run a new operating system and do not integrate with NetApp's ONTAP architecture, which has emphasized flash-caching.
HP has not yet announced a true flash-only array. HP did announce an all-flash version of the 3PAR array, and EMC did the same with Symmetrix. However, filling a traditional array with flash drives is not a best-of-breed solution (it is more akin to putting a Tesla engine in a Fiat Uno); rather, it is a stopgap measure to be able to claim "participation in the game". HP previously had a relationship with Violin for integrating storage into server blades. Wikibon expects that HP will integrate a flash-only module into the 3PAR storage array in 2013, similar to the Hitachi approach.
All these approaches have merit. However, EMC has a strong offering, and Wikibon expects that EMC will deliver essential Tier-1 data service features such as replication later in 2013. EMC has a strong quality tradition, a strong salesforce, and a strong world-wide channel. With good execution and an effective pricing strategy, the XtremIO array is likely to become the leading all-flash storage array, and be ubiquitous in large and small installations.
EMC's greatest challenge will be managing the overlap with VMAX and avoiding the temptation to over-protect the very high margins on VMAX.
Other Arrays from Flash Startups
There are two main groups of flash storage vendors:
- Flash-only Arrays:
  - There is a long list of flash-only storage array vendors, including Astute Networks, Kaminario (using Fusion-io technology), Nimbus Data Systems, Pure Storage, Skyera, SolidFire, Virident (partnership with Seagate and qualified as supported by EMC XtremSF), Violin Memory (a partnership with Dell Compellent could create an effective Hybrid array), and Whiptail.
  - All these vendors have experienced very strong growth in 2012 and 2013. More vendors will be added to this list as products become available. These vendors will come under increasing pressure from the traditional storage vendors as the market matures. Vendors such as Nimbus have the potential for break-out (or acquisition) with very high volumes in 2012. SolidFire has a strong potential niche with service providers, in particular because of its strong QoS features and ability to deal with "noisy neighbors". Virident has an interesting OEM relationship with Seagate.
- Flash-first Hybrid Arrays:
  - Wikibon does not define traditional storage arrays with small amounts of flash-cache or SSD drives as hybrid arrays. The hybrid array vendors have architected flash as the first persistent landing point for all IO and have a large percentage of flash (~20% or more) compared with the back-end capacity disk storage. Good hybrid architectures can provide 1-2 millisecond response times with very high hit-rates on flash.
  - Traditional storage arrays with tiered storage software (e.g., EMC FastVP) can provide a high level of IO service for a select sub-set of high-performance volumes.
  - Hybrid vendors include NexGen Storage, Nimble, Starboard Storage, Tegile, and Tintri (all hybrid flash/HDD). Compellent with Violin could also qualify.
  - The hybrid arrays are well suited for single arrays in smaller SMB and mid-size organizations or departments/divisions of larger organizations, as performance and capacity are addressed within the same box. Tintri has a particularly ardent following from users.
Wikibon expects consolidation of the flash-only array vendors in late 2013 and 2014 and expects acquisition of the best of the hybrid storage vendors by the traditional storage vendors.
Conclusions and the Greatest Threat to Flash-only Storage Arrays
EMC has legitimized the flash-only storage array market, and 2013 and beyond will see the very rapid adoption of flash-only arrays and hybrid arrays. Flash-only arrays will primarily address high-value workloads, particularly those with IO-constrained databases. Within two-to-three years most active data will be handled in flash. Traditional storage arrays will morph into capacity data farms and their margins and storage software potential will decrease.
The greatest threat to flash-only arrays is server-side flash storage. There is a strong movement for flash to move to PCIe cards in the server. The reason is simple - databases work better, much better, with low and consistent latency. Four interrelated factors affect database scaling:
- The number and complexity of database calls from applications to the database - a high number is good for application functionality and end-user satisfaction;
- The locking rate (number of locks/second) on the database - the ultimate measure of potential scaling of a database. After this threshold is reached and IO cannot be improved, a very expensive re-architecting of the database and applications is required;
- The IO latency (including the OS calls, protocol overhead, network latency, etc.) - low latency is good, especially for log-files and metadata;
- The IO variance or IO latency consistency (e.g., the percentage of IOs that complete within (say) 4 times the average IO latency) - very low variance is good.
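The consistency measure mentioned in the last factor (the percentage of IOs that complete within a multiple of the average latency) is easy to compute from a latency trace. The sketch below is a minimal illustration; the function name and both sample traces are hypothetical and are not measurements from any product.

```python
from statistics import mean


def fraction_within_multiple_of_mean(latencies_us, multiple=4.0):
    """Fraction of IOs whose latency is <= `multiple` x the mean latency.

    `latencies_us` is a non-empty sequence of per-IO latencies in
    microseconds. A value near 1.0 indicates consistent latency.
    """
    if not latencies_us:
        raise ValueError("latency trace must not be empty")
    threshold = multiple * mean(latencies_us)
    within = sum(1 for latency in latencies_us if latency <= threshold)
    return within / len(latencies_us)


# HYPOTHETICAL traces, invented for illustration only.
steady_trace = [100] * 100                # every IO takes 100 us
spiky_trace = [100] * 95 + [10_000] * 5   # 5% of IOs stall for 10 ms

if __name__ == "__main__":
    print("steady:", fraction_within_multiple_of_mean(steady_trace))
    print("spiky: ", fraction_within_multiple_of_mean(spiky_trace))
```

Note that a handful of slow IOs both raises the mean and leaves those IOs outside the threshold, which is why the measure is sensitive to outliers.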
The bottom line is that the faster data can be written to persistent storage, the faster locks can be released and the greater the scalability and/or functionality of the application and database. NoSQL databases can help reduce locking, but at the price of much more complex applications having to manage the nuances of eventual consistency.
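A deliberately simplified model can make the link between write latency and locking concrete. Assume (as a simplification, not a claim about any particular database) that a single contended lock is held for exactly one persistent write and that nothing else adds delay. The number of times that lock can be taken and released per second is then bounded by the inverse of the write latency. The function name and the sample latencies below are illustrative assumptions.

```python
def max_serialized_lock_rate(write_latency_us: float) -> float:
    """Upper bound on acquire/release cycles per second for ONE hot lock.

    ASSUMPTION: the lock is held for exactly one persistent write of
    `write_latency_us` microseconds and there is no other overhead.
    """
    if write_latency_us <= 0:
        raise ValueError("write latency must be positive")
    return 1_000_000 / write_latency_us


if __name__ == "__main__":
    # Illustrative latencies only, chosen to span several magnitudes.
    for label, latency_us in [("10 ms write", 10_000),
                              ("1 ms write", 1_000),
                              ("1 us write", 1)]:
        rate = max_serialized_lock_rate(latency_us)
        print(f"{label}: at most {rate:,.0f} serialized lock cycles/second")
```

Real systems batch commits and hold locks for more than one IO, so this is only a bound on the shape of the relationship: each tenfold cut in write latency raises the ceiling tenfold.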
SAN flash-only arrays operate in the 1-2 ms range. This is much better than traditional disk-based storage arrays (one order of magnitude), and it will improve many existing applications. However, the improvement is not good enough for the emerging hyperscale and high-performance database markets that are driving new models of business. PCIe cards acting as an extension of memory can reduce this to less than 1 microsecond, an improvement of three orders of magnitude.
On Monday March 4 2013, Fusion-io announced the ability to use its Atomic Write capability together with VSL and DirectFS software to write non-contiguous small blocks (64 bytes) to flash in 100 nanoseconds. That is four orders of magnitude (10,000 times) faster than a SAN-based storage array. The conclusion - high-performance and high-value databases and metadata will migrate closer to the processor, using very low latency with very low protocol overheads. This, like most major changes, will take several years, but it will move the support of data towards the application and server teams.
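The orders-of-magnitude comparisons above can be checked with a few lines of arithmetic. The sketch below uses the latencies quoted in the text, taking 1 ms as the low end of the SAN range and 1 microsecond as the upper bound for PCIe flash; the dictionary keys and function names are labels invented for this illustration.

```python
import math

# Latencies in nanoseconds, taken from the figures quoted in the text.
LATENCY_NS = {
    "san_flash_array": 1_000_000,   # low end of the 1-2 ms range
    "pcie_memory_extension": 1_000, # "less than 1 microsecond" (upper bound)
    "atomic_write_64b": 100,        # announced 100 ns small-block write
}


def speedup(baseline_ns: float, faster_ns: float) -> float:
    """How many times faster `faster_ns` is than `baseline_ns`."""
    return baseline_ns / faster_ns


def orders_of_magnitude(baseline_ns: float, faster_ns: float) -> float:
    """Base-10 logarithm of the speedup."""
    return math.log10(speedup(baseline_ns, faster_ns))


if __name__ == "__main__":
    san = LATENCY_NS["san_flash_array"]
    for name in ("pcie_memory_extension", "atomic_write_64b"):
        ratio = speedup(san, LATENCY_NS[name])
        orders = orders_of_magnitude(san, LATENCY_NS[name])
        print(f"{name}: {ratio:,.0f}x faster, about {orders:.0f} orders of magnitude")
```

Using 2 ms instead of 1 ms as the SAN baseline doubles the ratios but does not change the order-of-magnitude conclusion.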
A One-way Ticket to Oblivion
CIOs and CTOs should and will start to architect systems in a profoundly different way from historic disk-bound systems. Active data will stay in flash close to processors until its usage and value decline, and the data is offered a one-way ticket to disk and oblivion. All that is left is metadata about the data, held in flash.
The technical point is that the master copy of data, the persistent copy, has to reside close to the servers. A caching architecture, where read-only data is held in a cache near the CPU, will not work for most hyper-scale scenarios. Caching is low cost, but will send the variance through the roof, as well as increasing average IO times. Writes need to be protected on persistent storage, and if the master copy is on the array, even a flash-only array, that is milliseconds away, instead of nanoseconds.
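The effect of caching on variance can be illustrated with a simple two-point model: each IO is either served near the server (fast) or has to go to the array (slow). The sketch below computes the mean and standard deviation of latency for such a mix. The hit rate and both latencies are hypothetical values chosen for illustration, and the function name is an assumption, not part of any product.

```python
import math


def two_point_latency(fast_fraction: float, fast_us: float, slow_us: float):
    """Mean and standard deviation of latency for a fast/slow IO mix.

    `fast_fraction` of IOs complete in `fast_us` microseconds; the rest
    complete in `slow_us`. Returns (mean_us, stddev_us).
    """
    if not 0.0 <= fast_fraction <= 1.0:
        raise ValueError("fast_fraction must be between 0 and 1")
    mean_us = fast_fraction * fast_us + (1.0 - fast_fraction) * slow_us
    variance = fast_fraction * (1.0 - fast_fraction) * (slow_us - fast_us) ** 2
    return mean_us, math.sqrt(variance)


if __name__ == "__main__":
    # HYPOTHETICAL: 95% of IOs hit a 1 us server-side cache, 5% take 1 ms
    # because they must be served or acknowledged by the array.
    cached = two_point_latency(0.95, fast_us=1.0, slow_us=1_000.0)
    # HYPOTHETICAL: master copy held on the server, so every IO is fast.
    local = two_point_latency(1.0, fast_us=1.0, slow_us=1_000.0)
    print(f"cache + array: mean {cached[0]:.2f} us, stddev {cached[1]:.1f} us")
    print(f"server-resident: mean {local[0]:.2f} us, stddev {local[1]:.1f} us")
```

In this model even a small fraction of slow IOs dominates both the mean and the spread, which is the point the paragraph above makes about caching architectures.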
The conclusion of this analysis is that the master copy of active data in hyper-scale computing will migrate to the server clusters. The management of data also has to be in the server clusters; it is not possible to manage a real-time fast system from a slow system. Software-led storage management will need to be top-down, with active data and metadata very close to the servers.
It was a surprise that EMC spent 75% of its XtremSF flash card announcement on Tuesday March 5 comparing it with Fusion-io PCIe cards, and then announced that it had abandoned its plans for Thunder, a clustered PCIe-card server-based storage management solution that was much more competitive with Fusion-io. Instead of Thunder, we heard EMC emphasize the benefits of XtremSW read caching software on the PCIe card (random-write data is written to the PCIe card, then stored-through to the storage array; an acknowledgement of the IO write is sent to the server from the storage array). Nothing was said about using the EMC XtremSF PCIe cards' capability to hold a complete read/write volume of (say) a database (up close and personal to the server).
To elaborate: the core reason for PCIe cards is to lower latency by placing data closer to the CPU. Stated another way, the best IO is no IO. Storage comprises reads and writes. Writes and locking rates are the bottlenecks for database apps. What EMC emphasized in its announcement was a method to acknowledge writes on the storage array, not the flash. While this approach has some benefits for hard-core database apps (where flash is gaining ground), namely better read performance (and sometimes better write performance), it will be an issue, particularly in database environments with high write activity. Notably, EMC's PCIe solution is capable of writing directly to the flash, but without atomic writes and without a means of protecting/sharing data close to the server, EMC's strategy can be characterized as an array-centric approach, which is understandable coming from EMC but not music to the ears of DBAs. EMC, like most PCIe flash players, is eagerly waiting for the NVM Express Standard to hit the market to deliver atomic write capabilities. But without a Thunder-like capability, we feel the highest-margin market will be limited for many players.
Fusion-io, on the other hand, is focusing like a laser on hyper-scale computing and on innovations that help solve the latency growth constraints listed in the section above. Fusion-io is server focused, is working deeply with most large-scale Internet vendors, and is betting that hyper-scale is a harbinger of how enterprises will design infrastructure a few years later. If flash-only arrays and caching on server PCIe cards are the long-term strategy for EMC, it is likely to be a profound and very costly mistake. At the very least, EMC should cover the Fusion-io bet on software-on-the-server-led storage (i.e., a Thunder-like solution). If not, there could indeed be a one-way ticket to oblivion for a traditional storage vendor.
Action Item: EMC has validated the market for flash-only arrays, and all senior executives responsible for storage should embrace and apply low-latency and consistent-latency arrays as best practice for appropriate workloads. The potential savings in reduced storage costs and reduced license costs can be very significant. More important in the long term, reduced IO latency can provide improved end-user productivity. EMC is a strong if reluctant supplier, and bringing in other flash-only array vendors tactically to an RFP may be required to motivate EMC to bid XtremIO or offer much better deals on VMAX.
The long-term strategy should be to position organizations to migrate highly active and big data to reside much closer to the servers.