On September 10, 2013 Oracle announced a major upgrade to its storage appliances by introducing the ZS3-2 and ZS3-4 products. Table 1 shows the summary data highlighting the major attributes of the systems. Wikibon has written in depth about the Oracle Storage Appliance, and concludes that the ZFS appliance is a true "flash-first" hybrid, allowing high continuous read and write rates with sustained low latency. Oracle continues to add functionality to this platform and as a result is expanding its use cases. Specifically, "sweet spot" applications for the solution span a widening spectrum suitable for high performance streaming workloads such as database backups, latency-sensitive data warehouse loading, use as an analytics file system and as a high-performance filer.
Moreover, Oracle continues to pursue a strategy that tightly ties ZFS hardware to Oracle software function and, significantly, locks competitors out of certain capabilities such as hybrid columnar compression. Oracle's clear strategy is to entice customers that widely deploy Oracle software products with increased benefits relative to competitive storage offerings. It appears Oracle is intent on exploiting certain software benefits exclusively for Oracle hardware products, which sets up an interesting dynamic given the large installed bases of EMC and NetApp offerings.
The most interesting aspects of the new ZS3 products are the performance and price-performance of the ZS3-4. Figure 1 below shows the three highest-performance SPC-2 results. The performance of the ZS3 is slightly higher than that of the Tier-1 Hitachi VSP and IBM DS8870 (note: these systems were introduced more than one year before the ZS3). What is very impressive is the price-performance. Wikibon adjusted (i.e. improved) the published SPC-2 price-performance for the Hitachi & IBM Tier-1 storage arrays to account for the much earlier benchmarking and outdated pricing of the IBM and Hitachi products. Even after that adjustment, the price-performance of the ZS3-4 is $23/Megabyte. By comparison, the Hitachi VSP is 192% more expensive at $66/Megabyte, and the IBM DS8870 333% more expensive at $98/Megabyte, underscoring the possible imminent product refresh for these lines.
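The "percent more expensive" comparisons above are simple arithmetic on the price-performance figures. The following minimal sketch uses the rounded figures quoted in the text; the function name is illustrative. Because the inputs are rounded, the results land slightly below the published 192% and 333%, which were presumably computed from unrounded values (an assumption on our part).

```python
def percent_more_expensive(competitor: float, baseline: float) -> float:
    """Return how much more expensive `competitor` is than `baseline`, in percent."""
    return (competitor - baseline) / baseline * 100.0


# Rounded, Wikibon-adjusted SPC-2 price-performance figures quoted in the text.
zs3_4 = 23.0
hitachi_vsp = 66.0
ibm_ds8870 = 98.0

vsp_premium = percent_more_expensive(hitachi_vsp, zs3_4)
ds8870_premium = percent_more_expensive(ibm_ds8870, zs3_4)

print(f"Hitachi VSP: {vsp_premium:.0f}% more expensive than the ZS3-4")
print(f"IBM DS8870:  {ds8870_premium:.0f}% more expensive than the ZS3-4")
```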
Wikibon believes that the ZS3 is a well designed hybrid unified storage system that enables Oracle to significantly expand the workloads that the ZFS Storage appliances can address. For present or future users of Oracle's columnar compression and Oracle 12c, there is notable benefit from close integration of storage and database. The ZS3 is a good strategic storage platform for throughput-sensitive workloads such as RMAN backup, Oracle analytics, Oracle data warehousing, and large-scale, performance sensitive VMware workloads requiring NFS support in virtualized Oracle environments.
A major thrust of this announcement relates to the value of using high-performance storage systems such as ZS3 for Oracle analytic workflows that could take advantage of columnar compression, in-line de-duplication and the very fast ZS3 flash-first write system. Wikibon concluded that such an integrated system could load, run, and complete complex analytics in much less time than in-memory databases need just to load the data. In our opinion, when this technology integration is directed at time-sensitive parts of the business cycle (such as month-end/quarter-end/year-end and other peak periods), the improvement in business efficiency from compressed "time-to-value" can be realized much sooner than re-writing applications for pure in-memory databases. In our view, customers should strongly consider this less disruptive strategy (i.e. adding flash to the storage layer) and be cautious regarding the hype of in-memory approaches.
The performance improvements in the ZS3 can be attributed to two main components: 1) hardware improvements and 2) software improvements. Table 2 shows a detailed comparison between the hardware components of the previous Oracle storage appliances (7320 & 7420) and the new ZS3-2 & ZS3-4 storage appliances:
- The ZS3-2 is a major technology refresh compared with the 7320, improving almost every aspect of the architecture. In particular the tripling of the SLC write-cache will significantly improve high-write storage environments.
- The ZS3-4 has much faster SAS-2 SSD drives, a three-times increase in MLC read-flash, doubling of PCIe slots and a 1/3 increase in total storage capacity with the use of 4TB HDD drives.
The major software improvements have come from a complete rewrite of the L1 & L2 Adaptive Replacement Caches (L1ARC & L2ARC) to take advantage of the massive increase in cores available and increases in the overall parallelism of cache management. The in-memory de-duplication has been improved with the use of a higher-threaded SMP OS. Oracle claims that the cache miss rate of the overall 25TB cache (DRAM, Write-Flash, Read-Flash) on the ZS3-4 has been halved for the same workload, which effectively doubles the throughput of the caching subsystem. The result is much greater throughput, lower latencies and much lower variance. For database environments, variance is of great importance. Note: Customers should be advised that mileage will vary as typically the workload itself is the major contributor to cache locality of reference, not the storage system. Nonetheless, a well-designed storage system will exploit a cache-friendly application much more than a system with a poor strategic fit and less integration.
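The link between a halved miss rate and doubled throughput can be illustrated with a simple back-of-the-envelope model. This is a sketch under stated assumptions, not Oracle's or Wikibon's model: it assumes the back-end disks are the bottleneck, that every cache miss costs exactly one back-end I/O, and that cache hits are effectively free. The IOPS and miss-rate values are hypothetical.

```python
def sustainable_front_end_iops(backend_iops: float, miss_rate: float) -> float:
    """Front-end IOPS the system can sustain when only misses reach the back end.

    Assumes the back end is the bottleneck and each miss costs one back-end I/O.
    """
    if not 0.0 < miss_rate <= 1.0:
        raise ValueError("miss_rate must be in the range (0, 1]")
    return backend_iops / miss_rate


# Hypothetical numbers, chosen only to show the relationship.
backend_iops = 10_000.0
before = sustainable_front_end_iops(backend_iops, miss_rate=0.10)
after = sustainable_front_end_iops(backend_iops, miss_rate=0.05)

print(f"Miss rate 10%: {before:,.0f} front-end IOPS")
print(f"Miss rate  5%: {after:,.0f} front-end IOPS")
print(f"Throughput ratio: {after / before:.1f}x")
```

In this model the gain depends entirely on the workload actually achieving the lower miss rate, which is why the caveat about cache locality matters.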
The major software components are as follows:
- ZFS File System (128-bit addressability)
- File-level protocols - NFS v2/v3/v4, CIFS, HTTP, WebDAV, FTP/SFTP/FTPS
- Block-level protocols - iSCSI, Fibre Channel, iSER, SRP, IP over InfiniBand, RDMA over InfiniBand
- Data compression
- Hybrid Columnar Compression
- Inline, Block-level De-duplication
- DTrace Analytics
- Snapshots - read-only, restore, copy-on-write & Microsoft Volume Shadow Copy Support (VSS)
- Oracle Intelligent Storage Protocol - Oracle Database 12c sends metadata to the ZFS Storage Appliance about each I/O, enabling storage to dynamically tune itself
Wikibon believes that the increase in storage R&D spending by Oracle on the ZFS is showing up in a first-class ZS3 flash-first architecture that uses flash in an innovative way, rather than flash just being a faster component inside an existing architecture. The hybrid architectures of Oracle ZS3 and the upstarts Tegile and Tintri are all examples of innovative flash-based designs that are significantly increasing performance and reducing the cost of storage systems, compared to traditional storage architectures.
ZS3 Performance & Price-performance
Oracle's historical use of standard benchmarks has been less than stellar. Wikibon has criticized Oracle's misleading habit of comparing its "latest and greatest" with previous generations of competitive products, last tested years earlier. Is Oracle's storage group breaking this habit?
The SPC-2 benchmark is designed as a throughput comparison. It consists of three large-scale workloads:
- SPC-2 Large File Processing (LFP)
- SPC-2 Large Database Query (LDQ)
- SPC-2 Video On Demand (VOD)
The average throughput in Megabytes/second is the SPC-2 performance metric. The price-performance metric is calculated by dividing the total discounted list price by the performance metric, so it is expressed in dollars per Megabyte/second of throughput (abbreviated in this report as $/megabyte). The heavy hitters in this benchmark have been the multi-controller Tier-1 storage arrays from IBM and Hitachi. EMC does not typically participate in industry standard benchmarks.
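The two metrics described above can be sketched as follows. This is an illustration only: it assumes a simple arithmetic mean across the three workloads, and the throughput and price values are hypothetical, not taken from any published SPC-2 result. The function name is also illustrative.

```python
from statistics import mean


def spc2_style_metrics(
    workload_mbps: dict[str, float], total_price_usd: float
) -> tuple[float, float]:
    """Return (average throughput in MB/s, price-performance in $ per MB/s)."""
    throughput = mean(workload_mbps.values())
    return throughput, total_price_usd / throughput


# Hypothetical per-workload throughput (MB/s) and discounted list price.
example_workloads = {"LFP": 15_000.0, "LDQ": 18_000.0, "VOD": 15_000.0}
throughput, price_perf = spc2_style_metrics(example_workloads, 400_000.0)

print(f"Average throughput: {throughput:,.0f} MB/s")
print(f"Price-performance:  ${price_perf:.2f} per MB/s")
```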
In September 2013, Oracle released SPC-2 results for the ZS3-4 and compared them to IBM DS8870 and Hitachi VSP Tier-1 storage array products. As the products are still current, the performance comparisons are fair in our view, with the proviso that IBM and Hitachi are likely to significantly improve throughput performance at their next refresh cycle. It should also be noted that the EMC VMAX would perform well in this category if measured.
The raw price-performance comparisons in the SPC-2 database need to be adjusted to take into account Moore's Law, under which the cost of IT technology falls by about 35%/year. Wikibon adjusted the SPC-2 price-performance figures by -2.5% for each month of difference between the testing dates, to a common date of 11/5/2013. This date is approximately one month after the early October general availability of the ZS3 storage appliances.
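The adjustment can be sketched as below. The text does not say whether the -2.5% per month is compounded or applied linearly, nor how partial months are counted, so the sketch supports both forms and counts whole calendar months; those choices, the function names, and the sample inputs are assumptions for illustration rather than Wikibon's actual method.

```python
from datetime import date

MONTHLY_DECLINE = 0.025          # -2.5% per month, as stated in the text
COMMON_DATE = date(2013, 11, 5)  # common comparison date from the text


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months from `earlier` to `later` (day of month ignored)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def adjust_price_performance(
    raw: float, test_date: date, compound: bool = True
) -> float:
    """Scale a published price-performance figure forward to COMMON_DATE."""
    months = months_between(test_date, COMMON_DATE)
    if compound:
        factor = (1.0 - MONTHLY_DECLINE) ** months
    else:
        factor = max(0.0, 1.0 - MONTHLY_DECLINE * months)
    return raw * factor


# Hypothetical result: $100 per MB/s, benchmarked 12 months before the common date.
adjusted_compound = adjust_price_performance(100.0, date(2012, 11, 5))
adjusted_linear = adjust_price_performance(100.0, date(2012, 11, 5), compound=False)

print(f"Compounded: ${adjusted_compound:.2f}")
print(f"Linear:     ${adjusted_linear:.2f}")
```

Either form lowers the older results' price-performance figures, which is why the adjustment favors the earlier-tested IBM and Hitachi systems.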
Figure 1 below shows the results of the performance and Wikibon-adjusted price-performance SPC-2 most recent benchmarks.
The performance of the ZS3 is very impressive because it is achieved from a two-controller HA architecture, compared to the multi-node Tier-1 storage arrays from Hitachi & IBM. Also notable is the price-performance of the system relative to traditional Tier-1 mainframe storage architectures. Customers should bear in mind that the ZS3 is a file-based platform while the Hitachi and IBM systems are block-oriented solutions with significantly more mature high availability functions, including synchronous replication. Nonetheless, it is our belief that Oracle is providing the R&D muscle that Sun lacked and focusing its efforts on the ZFS appliance, which is a diamond in the rough of the Oracle storage portfolio. We expect Oracle to continue to enhance the platform and broaden its use cases. Moreover, we believe that Oracle, the company with a reputation for lock-in, will attempt to lock out competitors by increasingly leveraging integration up the stack while the competition searches for answers to the Oracle integration strategy.
Customers and observers of Oracle's storage strategy should look at the three following factors when evaluating the company's progress:
- The degree to which Oracle is able to gain share on-platform. It currently lags behind both EMC and NetApp and we believe has a goal of becoming #1 on its own base.
- The level of "lock-out" Oracle can achieve (and corresponding unique value creation) by exploiting its own software stack while closing out its platform to former partners such as EMC and NetApp. The key issue here will be how much pressure EMC and NetApp customers will place (or will want to place) on Oracle, the degree to which Oracle resists and the value proposition Oracle is able to convey with this strategy.
- Future feature expansion to enable the ZFS platform to expand its TAM and extend into Oracle's traditional database stronghold, while at the same time competing with NetApp's Clustered ONTAP and EMC's ever-expanding portfolio.
Performance Study Update
In previous Wikibon analysis, a performance study compared the "flash-first" ZFS architecture against traditional storage arrays. Wikibon updated the model as a result of the ZS3. The major impacts, as discussed earlier, come from:
- A complete rewrite of the L1 & L2 Adaptive Replacement Caches (L1ARC & L2ARC) to take advantage of the massive increase in cores available and increases in the overall parallelism of cache management.
- Significantly improved in-memory de-duplication with the use of a higher-threaded SMP OS.
The Wikibon detailed model shows that the cache miss rate on the ZS3-4 has been more than halved for the modeled workloads, which more than doubles the throughput of the caching subsystem. This, together with the impact of 4-terabyte disks, has reduced the overall theoretical cost/terabyte by between 220% and 270% between the ZFS appliances and the new ZS3 appliances.
The results of the model are shown in Figure 2 below. The ZS3 line is much lower than the ZFS line in both figures. The traditional array costs have also been reduced compared with the original comparison, as 4TB drives were included where IO rates/drive allowed.
Figure 3 below shows a subset of the data. In the original study, traditional arrays had significantly lower costs in the 500 - 5,000 IOPS per terabyte range. This is a significant portion of the marketplace. The potential space where the ZS3 can compete has been significantly extended. The only IOPS range where traditional arrays are now lower cost is the 500 - 1,000 IOPS per terabyte range. In this smaller range, the relative savings have been reduced to effectively very little.
Note: The model shows theoretical configurations, based on the assumptions shown in Table 4. These configurations are not necessarily available in the Oracle ZFS or ZS3 appliances ranges, or the traditional storage arrays, and the market prices will fluctuate for many reasons. IT professionals should seek specific quotes and specifications for the specific workloads the storage equipment is to support.
Conclusions & Recommendations
Table 3 shows the theoretical cost comparisons between ZFS and the new ZS3 storage appliances. It shows a very significant improvement in total cost-effectiveness from the new ZS3, which is expected to be more than twice as cost-effective as the preceding appliances.
Figure 4 is an update to Figure 1 in the original research. It shows that the gap between the theoretical costs has widened significantly. The theoretical cost of the traditional arrays is between 244% and 306% higher than that of the ZS3 appliances. Again, Wikibon would emphasize that these conclusions show the differences based on the assumptions in Table 4. The actual bid prices from vendors will of course vary greatly according to many factors. However, Oracle has much more room to be aggressive with pricing decisions, with a simpler all-inclusive storage software model.
Action Item: On balance we believe Oracle's storage R&D commitment is a positive for customers, and while there is always a risk of lock-in with the "Red Stack," many organizations will find that the value creation achieved through integration offsets such risks. The Oracle ZS3 storage appliance does not yet have the full functionality required to replace all types of storage in the data center. However, the ZS3 improvements have significantly increased the workloads that can be supported. Wikibon recommends that storage executives include the ZS3 in all RFPs for throughput-heavy workloads, including backup, data warehousing, data analytics, and high-performance filers. In particular, CIOs and CTOs should be looking to identify key business workflows where IO is inhibiting time-to-information and time-to-value of IT. Improving the IO performance with hybrid solutions such as the ZS3, especially when integrated with Oracle databases, will be much faster and lower cost to implement than fashionable in-memory databases and, in many cases, than alternatives that cannot achieve the same level of integration as Oracle is demonstrating.