One of the issues that Wikibon has focused on in analyzing future storage trends is the impact of slow disks on performance. Both putting cache (RAM or flash) in front of the slow disks and tiered storage solutions aim to reduce the impact of the slow storage while taking advantage of the excellent cost/GB characteristics of disk storage, particularly SATA disk. When high hit rates are achieved, caches can greatly improve average response time. However, a cache miss exposes the huge difference between the speed of the cache and the speed of the disk. For network-attached flash-only arrays, that difference is milliseconds (10^-3 seconds) versus hundreds of milliseconds (10^-1 seconds); for flash as an extension of memory, it is microseconds (10^-6 seconds) versus hundreds of milliseconds (10^-1 seconds). Even with a very high cache hit rate, this creates very high variability in response time.
Figure 1 illustrates the impact of flash-only arrays compared with a traditional cached storage device. The average response times are not too far apart (2ms vs. 5ms) in this particular simulation because of the excellent hit rate. The effect of the cache misses shows up in the variance.
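The relationship between hit rate, average response time and variance can be sketched with a deliberately simple model in which every IO takes one fixed time on a cache hit and another fixed time on a miss. The function name `latency_stats` and all of the numbers below are hypothetical and chosen only for illustration; they are not the inputs behind the simulation in the figure.

```python
import math

def latency_stats(hit_rate, hit_ms, miss_ms):
    """Return (mean, standard deviation) in ms for a two-level latency model.

    Assumption: every IO takes exactly hit_ms on a hit and miss_ms on a miss.
    """
    miss_rate = 1.0 - hit_rate
    mean = hit_rate * hit_ms + miss_rate * miss_ms
    variance = hit_rate * (hit_ms - mean) ** 2 + miss_rate * (miss_ms - mean) ** 2
    return mean, math.sqrt(variance)

# Hypothetical figures, not taken from the article:
# a 1 ms cache with a 97% hit rate in front of 100 ms disk,
# versus a flash-only device that always answers in 2 ms.
cached_mean, cached_std = latency_stats(hit_rate=0.97, hit_ms=1.0, miss_ms=100.0)
flash_mean, flash_std = latency_stats(hit_rate=1.0, hit_ms=2.0, miss_ms=2.0)

print(f"cached disk: mean {cached_mean:.2f} ms, std dev {cached_std:.2f} ms")
print(f"flash only:  mean {flash_mean:.2f} ms, std dev {flash_std:.2f} ms")
```

Under these assumed numbers the two means are within a couple of milliseconds of each other, yet the cached device has a standard deviation several times larger than its own mean, while the single-tier device has none. Real devices add their own queuing and service-time spread, which this sketch ignores.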
One of the major problems with benchmarks using disk-based storage is that everything is done to reduce and hide the impact of variability. If you take the standard industry benchmarks such as TPC, vendors spend hundreds of hours making sure that every volume and table is positioned to minimize variance.
In the real world, storage characteristics constantly shift, which means that data has to be moved around the disks to avoid contention and variance. Whether this is done manually or by software, it is an inexact and expensive process, one that has absorbed the best brains in the storage industry for many years.
The current set of benchmarks emphasizes IOPS and cost per IO. In the real world, these are rarely the most important metrics. In most production transactional situations, the most important performance metrics are IO response time and variance.
The major benefit of flash solutions is not IOPS but consistent performance in random IO environments. Flash-only approaches, at either the server or the array level, provide the best solutions. This is illustrated in Figure 1 by the blue line.
All caches, including flash caches, can provide temporary relief, but at a fundamental level they increase variance rather than decrease it. This is illustrated in Figure 1 by the pink/purple line.
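One way to see why a cache adds variance is a simplified two-point model. This is an assumption made for illustration, not the simulation behind the figure: each IO takes a fixed time t_c with probability h (a cache hit) and a fixed time t_d otherwise (a miss to disk). The symbols T, h, t_c and t_d are introduced here and do not come from the original analysis.

```latex
\begin{align*}
E[T] &= h\,t_c + (1-h)\,t_d \\
\operatorname{Var}[T] &= h\,t_c^2 + (1-h)\,t_d^2 - E[T]^2 \\
                      &= h\,(1-h)\,(t_d - t_c)^2
\end{align*}
```

In this model the variance is zero only when h = 0 or h = 1, that is, when every IO is served from a single tier. For any hit rate in between, the variance scales with the square of the gap between the two tiers, so a high hit rate shrinks the h(1-h) factor but cannot offset a gap of two or more orders of magnitude. Real disks and flash devices also have variance of their own, which this sketch leaves out.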
Action Item: CIOs and CTOs should ignore IOPS as a metric and focus on IO response time and variance. Putting active data on flash-only solutions (flash-only arrays or flash as an extension of main memory) will eliminate almost all performance-based issues for traditional transactional applications in the data center, in both virtualized environments and high-performance bare-metal deployments. Once these solutions are implemented, CIOs and CTOs should use the opportunity to flatten the support organization and reduce headcount.
Footnotes: