Introduction
Storage architecture is always a trade-off between competing priorities. This is illustrated in Figure 1, where the green oval represents the design-point envelope for high-availability, secure cloud archive storage, in which the storage is separated from the processing. Performance is not the most critical component: the objects are large, and users will tolerate a minute of latency to retrieve an object. Local caching can provide better performance where repeated viewing is required.
Secure archive data systems require high data reliability for three reasons:
- Sensitive data must be encrypted, and the loss of a single bit can render an encrypted object unreadable;
- Compressed data is similarly fragile: the loss of a single bit may render an object unreadable;
- Keeping data for a long time increases the risk of corruption and decreases the ability to reconstruct the data from other sources.
As Figure 1 shows, fixing the requirement at high availability but accepting lower performance means that cost does not have to be as high.
Several storage architectures would fit the green envelope shown in Figure 1. The focus here is on comparing two: a three-data-center traditional array-based storage solution and an erasure coding-based cloud storage solution.
Traditional Array-based Storage Topology
Figure 2 illustrates a traditional solution with replication among three data centers. The storage array system acts as a single file system across the three data centers, with the storage held in any one data center copied to each of the other two. The high-availability and security boundary has to include all the components of the system. RAID 6 protection is assumed within the storage arrays as protection against the increasing problem of multiple disk failures, and the three distributed copies provide higher reliability, very high access availability, and protection against disastrous loss at up to two locations.
There are many potential providers of array storage, and the topology is well known and understood. Less well understood are the risks of data corruption; storage vendors offer some features to mitigate the risk, but in general do their best to avoid this subject.
The topology shown in Figure 2 would have a minimum storage requirement of 3.6 times the amount of actual archived data (three copies, each carrying RAID 6 overhead). In reality, systems of this nature carry significant additional overhead in keeping extra copies of storage, both for synchronization between the sites and for providing copies of data in case of disastrous software or operator failures. A figure of six times the amount of actual archived data is more realistic.
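As a rough check on the 3.6 figure, here is a minimal sketch assuming three full copies, each striped across a hypothetical 10+2 RAID 6 group (the group width is chosen to match the quoted number, not taken from any vendor specification):

```python
# Minimal sketch of the 3.6x minimum-storage figure, assuming three
# full copies, each striped across a hypothetical 10+2 RAID 6 group.
copies = 3
data_disks, parity_disks = 10, 2                            # assumed RAID 6 group
raid6_overhead = (data_disks + parity_disks) / data_disks   # = 1.2
print(copies * raid6_overhead)                              # 3.6x the archived data
```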
Erasure Coded Storage Topology
Wikibon has described erasure coding in a posting entitled Erasure Coding and Cloud Storage Eternity, which shows that very high levels of data availability can be achieved. A topology using erasure coded storage is shown in Figure 3; it achieves the same levels of availability and security as the traditional array-based topology shown in Figure 2. The topology uses multiple locations to store the data fragments, allowing data to be held in many places (and even with multiple cloud storage providers). The loss or theft of individual data stores would affect neither the availability nor the security of the storage system.
The storage overhead depends on the total number of fragments that the data is coded into (n) and the minimum number of fragments needed to read the data back (m). Typical values used in archive stores are n = 22 and m = 16. The total storage required in this environment is n/m = 22/16 = 1.375 times the actual amount of archived data. If greater levels of reliability and performance were required, raising n/m to 1.6 would provide a significant improvement.
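To illustrate why these (n, m) values give very high durability, the sketch below computes the probability that an object becomes unreadable, i.e., that more than n − m fragments are lost. The 1% fragment-failure probability is an assumed figure for illustration only, and failures are assumed independent:

```python
# Durability of an (n, m) erasure code: an object stays readable as
# long as at least m of its n fragments survive, so loss requires
# more than n - m fragment failures.
from math import comb

def loss_probability(n: int, m: int, p_frag: float) -> float:
    """Probability that more than n - m of n fragments fail,
    assuming independent failures with probability p_frag each."""
    return sum(comb(n, k) * p_frag**k * (1 - p_frag)**(n - k)
               for k in range(n - m + 1, n + 1))

n, m = 22, 16
print(f"storage overhead: {n / m:.3f}x")                    # 1.375x
print(f"object loss probability: {loss_probability(n, m, 0.01):.2e}")
```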
Comparing Storage Topologies
Table 1 shows a relative cost comparison between the two topologies shown in Figure 2 and Figure 3. The basis of comparison is the cost of a unit of storage (a SATA disk drive, for example) at the well-known Fry’s store.
Traditional arrays consist of software and hardware and come in at between 5 and 20 times the base storage cost; for high-availability systems, the range is 10 to 20 times. The number of copies of the data is in the range 3.6 to 6.
Erasure-coded storage would require the minimum amount of “Fry’s” overhead at each of the distributed nodes, as all of the protection is within the file system. Almost no storage software is required, so a value of 3 times the base storage cost is reasonable.
Table 1 shows a range of estimates comparing the two topologies. The bottom line is that erasure coded storage costs between 4% and 11% of a traditional array topology (an order of magnitude less). Put the other way round, a traditional array topology is between nine (9) and twenty-five (25) times more expensive than topologies based on erasure coding.
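A minimal sketch reconstructing that arithmetic from the ranges quoted above (all figures are multiples of the base “Fry’s” per-unit storage cost; pairing low-with-low and high-with-high is an assumption about how Table 1 was built):

```python
# Hypothetical reconstruction of the Table 1 cost arithmetic.
array_cost_factor = (10, 20)      # HA array software+hardware: 10-20x base
array_copies      = (3.6, 6)      # effective copies of the archived data
ec_cost_factor    = 3             # erasure coded nodes: ~3x base
ec_overhead       = (1.375, 1.6)  # n/m storage overhead

array_total = (array_cost_factor[0] * array_copies[0],    # 36x base
               array_cost_factor[1] * array_copies[1])    # 120x base
ec_total    = (ec_cost_factor * ec_overhead[0],           # 4.125x base
               ec_cost_factor * ec_overhead[1])           # 4.8x base

print(f"erasure coding cost: {ec_total[0] / array_total[0]:.0%} "
      f"to {ec_total[1] / array_total[1]:.0%} of the array cost")  # ~11% to 4%
print(f"array cost multiple: {array_total[0] / ec_total[0]:.0f}x "
      f"to {array_total[1] / ec_total[1]:.0f}x")                   # ~9x to 25x
```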
Erasure Coding Solutions
Erasure coding as a science has been around for a long time, and the technique has been used in traditional array storage for RAID 6. Its usage has historically been constrained by the high processing overhead of computing the Reed-Solomon algorithms. The recent introduction of multi-core processors has radically reduced the cost of the processing component of erasure coding-based storage, and projected growth in core counts will reduce it even further.
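To make the technique concrete, below is a toy sketch of Reed-Solomon-style encoding and recovery over the prime field GF(257). The (n, m) = (11, 8) values and the field choice are illustrative assumptions; production systems work in GF(2^8) with heavily optimized table-lookup arithmetic, not Python integers:

```python
# Toy Reed-Solomon-style erasure code over the prime field GF(257).
P = 257  # prime modulus; every byte value 0..255 is a field element

def encode(data, n):
    """Treat the m data symbols as polynomial coefficients and
    evaluate at the points 1..n to produce n tagged fragments."""
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(data)) % P)
            for x in range(1, n + 1)]

def decode(fragments, m):
    """Recover the m data symbols from any m surviving fragments
    by Lagrange interpolation."""
    pts = fragments[:m]
    coeffs = [0] * m
    for j, (xj, yj) in enumerate(pts):
        basis, denom = [1], 1  # coefficients of prod_{k != j} (x - xk)
        for k, (xk, _) in enumerate(pts):
            if k == j:
                continue
            denom = denom * (xj - xk) % P
            nxt = [0] * (len(basis) + 1)
            for i, b in enumerate(basis):
                nxt[i] = (nxt[i] - xk * b) % P      # constant-term part of (x - xk)*b
                nxt[i + 1] = (nxt[i + 1] + b) % P   # x-term part
            basis = nxt
        scale = yj * pow(denom, P - 2, P) % P       # yj / denom via Fermat inverse
        for i, b in enumerate(basis):
            coeffs[i] = (coeffs[i] + scale * b) % P
    return coeffs

data = [ord(c) for c in "ARCHIVE!"]    # m = 8 data symbols
fragments = encode(data, n=11)         # n = 11 fragments (overhead 11/8)
surviving = fragments[3:]              # lose any three fragments
assert decode(surviving, m=8) == data  # the object is still fully recoverable
```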
There are a number of storage projects and products using erasure coding for storage systems. These include:
- Tahoe-LAFS;
- Berkeley’s OceanStore technology, used in the Geoparity feature of EMC’s Atmos storage offering;
- Self-* systems project at Carnegie Mellon University (Self-* systems are self-organizing, self-configuring, self-healing, self-tuning and, in general, self-managing);
- Cleversafe, a winner of the Wikibon CTO Award for Best Storage Technology Innovation of 2009, and currently the leading vendor of erasure coded storage solutions.
Compared with traditional array topologies, topologies based on erasure coding are newer, and there is far less user experience in how to implement, manage, and code for them in a practical environment. Wikibon expects to see significant adoption of erasure coding solutions over the next few years. However, users should temper the theoretical calculations in Table 1 and set initial expectations at a lower level.
Comparisons and Observations
The comparison given in Table 1 assumes that the two topologies deliver the same availability and performance, and that the rest of the system around them is the same in cost, performance, and availability. That assumption is clearly not completely true. Differences include:
- Storage array topologies are more numerous and are likely to be more robust than the far newer solutions based on erasure coding; this is very likely to change over time;
- The perimeter for high-availability and security is much longer and more challenging for array-based solutions than for erasure coded solutions;
- The management and safeguarding of encryption and storage metadata become the most critical components in both topologies, as loss of this metadata would lead to access to all the data being lost (the data would be there but impossible to retrieve);
- A separate backup copy of data on a low-cost medium (e.g., tape) would be required periodically in case of metadata loss;
- The coding of application systems using traditional file systems will be easier for most ISVs and application developers than coding using storage objects; additional training and architectural review would be required for erasure coding solutions;
- SATA disk storage will remain the storage technology for the foreseeable future. Flash storage may play a part as a caching mechanism rather than as a tiered storage mechanism;
- The flash-caches will probably be part of a front-end user system, rather than the retrieval system;
- Erasure coding will be equally applicable to public clouds, private clouds and hybrid clouds;
- Neither system provides near-zero RPO (recovery point objective), i.e., data can be lost before it is secured;
- If near-zero RPO is required (most archive systems can simply resend data later to recover), an Axxana-like system would be appropriate for either topology;
- This example specifically analyses a secure high-availability archive cloud system - the business case and calculations would need to be analysed separately for other types of storage systems.
The bottom line is that erasure coding for cloud storage systems is a new technology. Early adopters are currently using it in very large, secure deployments, where the savings are highest. Wikibon believes that archive topologies based on erasure coding will be adopted as best practice within two to three years.
Action Item: CIOs and CTOs with large-scale archive requirements that need to be secure and highly available (for reliability and accessibility) should look long and hard at erasure coded storage vendors, and include them on any RFP.
Footnotes:
- A practical example of archiving medical records can be found at Erasure Coding Revolutionizes Cloud Storage;
- Additional technical details on erasure coding can be found at Erasure Coding and Cloud Storage Eternity.