Executive Summary
Wikibon has extended previous research into the cost case of flash-only arrays compared with traditional storage arrays. Wikibon looked at three scenarios:
- Case 1: Traditional storage arrays with flash-cache and automated tiered storage (ATS), with an expected economic life of five years.
- Case 2: Flash-only arrays without control over the write IOs consumed. A much higher number of IOs is written than in Case 1, with an expected economic life of four years.
- Case 3: Flash-only arrays with IO throttling and IO write control to limit the number of IO writes. The same number of IOs/TB is written as in Case 1, with an extended expected economic life of six years.
Figure 1 shows a summary of the results, normalizing the different economic lifespans to a single metric of Net Present Cost/TB/Year, using a discount rate of 5%.
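The normalization described above can be sketched in a few lines of Python. This is a minimal illustration, not Wikibon's actual model: the function name, the assumption that CAPEX is paid up front and OPEX at the end of each year, and all input figures are hypothetical. Only the 5% discount rate comes from the text.

```python
def net_present_cost_per_tb_year(capex, annual_opex, years, capacity_tb,
                                 discount_rate=0.05):
    """Return net present cost per TB per year of economic life.

    Assumptions (not from the source): CAPEX is paid up front in year 0,
    and OPEX is a constant amount paid at the end of each year.
    """
    discounted_opex = sum(
        annual_opex / (1 + discount_rate) ** year
        for year in range(1, years + 1)
    )
    net_present_cost = capex + discounted_opex
    # Dividing by the life in years makes arrays with different lifespans comparable.
    return net_present_cost / capacity_tb / years


# Hypothetical inputs, chosen only to show the mechanics.
five_year = net_present_cost_per_tb_year(500_000, 60_000, years=5, capacity_tb=100)
six_year = net_present_cost_per_tb_year(500_000, 60_000, years=6, capacity_tb=100)
```

With identical costs, the six-year case comes out lower per TB per year because the up-front purchase is spread over more years of service.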
The conclusions show that:
- Traditional arrays with flash-cache & ATS are 57% more expensive than flash-only arrays with IO throttling and IO write control, and 30% more expensive than flash-only arrays without IO write control.
- Flash-only arrays without IO throttling and IO write control are 30% more expensive than flash-only arrays with IO throttling and IO write control.
- The expected data center floor economic life for flash-only arrays is greater than for traditional arrays, and will increase to ten years or more as flash controller technologies mature.
This analysis supports Wikibon’s prediction that 2012 will be a watershed in the acceptance of flash-only arrays as the primary array for IO-intensive workloads. High-capacity storage arrays with SATA or high-capacity SAS drives will continue for the foreseeable future to be the most cost-effective storage for low-activity data.
Bottom Line: if write IOs are managed effectively, in 2012 and beyond flash-only arrays will deliver lower costs, better performance, less maintenance and longer economic life than the equivalent high-performance disk arrays, even with the full functionality of Automated Tiered Storage and Flash-cache. By 2013, the flash-only arrays will be as functional and as reliable as Tier-1 traditional arrays.
Traditional Storage Economics
Traditional storage chargeback or show-back is usually based on capacity within a tier. Each tier represents a combination of storage controller function and the maximum IO performance of the drives. The end-user is then charged or shown the cost of storage on a $/GB basis for that tier.
The write-down of the storage assets is mainly determined by the maintenance cost/GB for controllers and disk drives; the mechanical nature of disk drives means that maintenance costs remain high. The Wikibon analysis shows that after an average of 4.5 years it is no longer cost-effective to keep storage; the business case for replacing old storage arrays with new storage can be made on maintenance, space and power costs alone.
Recent improvements in traditional array storage include the introduction of advanced functionality:
- Automated Tiered Storage (ATS), which places very active blocks of a volume in a higher tier (e.g., SSD drives within the array) and migrates less active blocks to lower-performing, higher-capacity hard drives. Previous Wikibon analysis has shown that ATS can improve cost performance by about 20%.
- Flash-Cache, an extended (usually read-only) flash cache in the storage controller, which allows a larger proportion of IO reads to avoid going to disk.
The two techniques are useful for different workloads, and indeed could be implemented on the same storage array controller.
Although the disks keep spinning for the total productive life of an array, there is no cost associated with the amount of IO actually done (the amount of disk arm movement) over the life of an array. The cost is based on potential IO and capacity, not actual IO. The ratio between IO reads and writes makes no difference to costs.
Once data is written to a disk, it is time-consuming to move it. The bandwidth of data transfer is largely determined by the speed of rotation of the disk, which has not increased from a maximum of 15,000 RPM for two decades. The time taken to move data gets longer with every increase in disk capacity. It takes on average about 5 months to migrate the data of large arrays. Wikibon research has found that the costs are between $50,000 and $100,000 per array, depending on the storage tier and the total capacity. As a result of this high cost of moving data off arrays, the average life of arrays (based on Wikibon surveys) is 8.5 years, with a maximum of over 12 years.
Flash Storage Economics
Unlike traditional hard disks, flash memory wears out when written to, and can sustain much higher bandwidth and IOPS than disks. As a result, the economics of flash are very different.
Flash comes with a variable number of bits per cell, between one and three. The higher the number of bits/cell, the lower the cost. The lower the number of bits/cell, the higher the number of writes that can be made to the flash. Enterprise flash with one bit/cell (SLC) can sustain over 100,000 write operations. Consumer-grade flash with 3 bits/cell (TLC) might sustain only 3,000 write operations, suitable for iPods but not for most IT requirements.
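To see why write volume drives the economic life of flash, the endurance figures above can be turned into a rough lifetime estimate. The sketch below is a simplification under stated assumptions: it treats the quoted write operations as a per-cell budget, assumes writes are spread evenly across the array (perfect wear leveling), and uses a hypothetical `write_amplification` factor. The function name and example workload are illustrative only.

```python
def flash_life_years(capacity_tb, write_cycles, tb_written_per_day,
                     write_amplification=1.0):
    """Estimate years until the flash write budget is exhausted.

    Assumes perfectly even wear across all cells (an idealization).
    write_amplification is a hypothetical factor for extra internal writes.
    """
    total_writable_tb = capacity_tb * write_cycles / write_amplification
    return total_writable_tb / (tb_written_per_day * 365)


# Hypothetical 10 TB array absorbing 20 TB of writes per day.
one_bit_life = flash_life_years(10, 100_000, 20)   # 100,000 writes, from the text
three_bit_life = flash_life_years(10, 3_000, 20)   # 3,000 writes, from the text
```

Halving the daily write volume doubles the estimated life in this model, which is the intuition behind controlling write IOs to extend economic life.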
Many flash-only arrays now include enterprise MLC, with about 30,000 write operations. One of the major improvements is in error handling, with start-ups such as Anobit (reportedly just purchased by Apple) using signal processing techniques instead of traditional ECC techniques to improve reliability. The availability of different levels of flash allows IO tiering to be performed within the flash-only array with a minimum of data movement. In addition, the quality of service controls in a number of flash-only arrays allow the overall IO rate to be throttled. This stops runaway applications from reducing the working life of the flash-only array. A few arrays (e.g., SolidFire) add a further refinement: quality of service (QoS) IO control mechanisms that allow IO to be capped for individual volumes or sets of volumes. This allows the consumption of IOs to be managed at an application level, and the results to be part of chargeback or show-back processes. Part of that process should be to charge differently for reads and writes.
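One common way to cap an IO rate per volume is a token bucket. The sketch below is a generic illustration of that technique, not a description of how SolidFire or any other vendor implements QoS; the class name, parameters and limits are hypothetical. The clock is passed in explicitly so the behaviour is deterministic.

```python
class WriteThrottle:
    """Token-bucket cap on write IOs for one volume (illustrative only)."""

    def __init__(self, writes_per_second, burst):
        self.rate = float(writes_per_second)  # sustained write IOs per second
        self.capacity = float(burst)          # maximum burst of write IOs
        self.tokens = float(burst)            # bucket starts full
        self.last = 0.0                       # time of the last refill, in seconds

    def allow(self, now, n_writes=1):
        """Return True if n_writes may proceed at time `now` (seconds)."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if self.tokens >= n_writes:
            self.tokens -= n_writes
            return True
        return False  # over the cap: the caller would delay or queue the IO


# Hypothetical volume limited to 100 write IOs/s with a burst of 200.
volume = WriteThrottle(writes_per_second=100, burst=200)
first_burst = volume.allow(0.0, 200)   # burst fits in the full bucket
runaway = volume.allow(0.0, 1)         # bucket is now empty, so this is refused
```

A runaway application is held to the sustained rate once its burst allowance is spent, which bounds how quickly it can consume the flash write budget.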
One huge advantage of flash is that the transfer rate in and out of the array is orders of magnitude faster than that of traditional arrays. This allows data to be migrated from one array to another with minimal elapsed time, and makes recovery of data much faster than with traditional arrays. This, together with the better latency, bandwidth and total IOPS that flash-only arrays enjoy, means that the amount of storage administration and database administrator work required to balance the storage system is drastically reduced. The storage system is much more tolerant of change in both the short term and the long term. The variability of IO latency, the bane of DBAs trying to track database locking problems, is very unlikely to be an issue.
The bottom line is that the amount of effort required to manage storage is drastically reduced. In the calculations in Table 2 (see footnotes) it is assumed that the storage management requirements are cut in half; this is almost certainly a conservative estimate.
The optimum economic control of flash is very different from that of traditional storage arrays. To be in line with the actual costs of flash storage, the definition of IO tiers for chargeback/show-back should reflect the following metrics in priority order:
- IO write-rate/TB
- Type of Flash-storage/TB
- IO read-rate/TB
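A chargeback calculation built on these three metrics might look like the following sketch. All rates, tier names and field names are hypothetical placeholders chosen for illustration; the only points taken from the text are the three metrics and the idea that writes should be charged differently from (here, more heavily than) reads.

```python
from dataclasses import dataclass

# Hypothetical prices; real rates would come from the organization's cost model.
WRITE_PRICE_PER_MILLION = 0.50            # writes consume flash life, so cost more
READ_PRICE_PER_MILLION = 0.05             # reads do not wear out flash
CAPACITY_PRICE_PER_TB = {"enterprise": 300.0, "standard": 100.0}  # by flash type


@dataclass
class VolumeUsage:
    capacity_tb: float
    flash_type: str          # key into CAPACITY_PRICE_PER_TB
    writes_millions: float   # write IOs in the billing period
    reads_millions: float    # read IOs in the billing period


def period_charge(usage: VolumeUsage) -> float:
    """Charge for one billing period from writes, flash type and reads."""
    write_charge = usage.writes_millions * WRITE_PRICE_PER_MILLION
    capacity_charge = usage.capacity_tb * CAPACITY_PRICE_PER_TB[usage.flash_type]
    read_charge = usage.reads_millions * READ_PRICE_PER_MILLION
    return write_charge + capacity_charge + read_charge


# Hypothetical volume: 2 TB of standard flash, 10M writes, 100M reads.
example = period_charge(VolumeUsage(2.0, "standard", 10.0, 100.0))
```

Under these placeholder rates, a write costs ten times as much as a read, so application owners see a direct incentive to reduce unnecessary writes.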
Detailed Analysis
Figure 2 shows a more detailed breakdown of the results shown in Figure 1. The data comes from the last column of the detailed analysis in Table 1, found in the footnotes below. The data has been normalized to show the cost per terabyte per year. The number of years is five for the traditional storage arrays, four for the flash-only arrays without throttling and write controls, and six for flash-only arrays with throttling and write controls.
The chart and data table below the chart in Figure 2 show the major components of cost over the life of an array. The fixed costs such as the purchase of the array and migration costs are affected directly by the number of years the asset can be kept. Most of the OPEX costs are not affected by the life of the array.
The analysis shows that flash-only arrays with good write control should benefit significantly from extending the life of the asset to six years, as CAPEX is a much bigger proportion (>60%) of the overall cost, compared with 44% for traditional storage arrays.
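The effect of the CAPEX share can be illustrated with a simple, undiscounted calculation. This is a sketch under strong assumptions: only the fixed (CAPEX) portion of the annual cost shrinks when the life is extended, the OPEX portion is unchanged, and discounting is ignored. The 60% and 44% shares come from the text; applying the same four-to-six-year extension to both is purely illustrative, and the function name is hypothetical.

```python
def relative_cost_after_extension(capex_share, old_years, new_years):
    """Annual cost after a life extension, relative to the annual cost before.

    capex_share is the fraction of the original annual cost that is fixed
    and therefore spread over the asset's life. Discounting is ignored.
    """
    capex_part = capex_share * old_years / new_years  # fixed cost spread further
    opex_part = 1.0 - capex_share                     # assumed unaffected by life
    return capex_part + opex_part


high_capex = relative_cost_after_extension(0.60, old_years=4, new_years=6)
low_capex = relative_cost_after_extension(0.44, old_years=4, new_years=6)
```

In this simplified model the asset with the 60% CAPEX share sees its annual cost fall to about 80% of the original, while the one with the 44% share falls only to about 85%, so the same life extension is worth more where CAPEX dominates.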
Wikibon expects that flash-only arrays will extend their economic life to ten years and beyond as the technology matures.
Action Item: Flash-only arrays should be included in RFPs for all situations where a high percentage of high-performance disk is required. For effective control and accounting, flash-only arrays should include extensive throttling and write control features. With these features, the economic life of flash-only arrays should be extended to at least six years for current storage arrays. This will probably extend to ten years or more for future flash-only arrays. CIOs should implement the changes to chargeback/show-back procedures immediately, so that the design of applications and infrastructure can be aligned with true costs.
Footnotes: The detailed assumptions behind the data in Figure 1 and Figure 2 above are shown in Table 1 below.