DDN Announces the Biggest Big Data Object System
DataDirect Networks has announced WOS 2.0, which is positioned as the world's fastest object storage system. WOS 2.0 is a fully integrated system including DDN storage, erasure coded data protection mechanisms, a replication strategy for distributing object data, and extensions to the interface options to include S3 APIs and a NAS interface.
Big Data Growth and Challenges
The growth in data is coming from machines, not humans, and the biggest growth is coming from sensors. Data from video, acoustic, pressure, heat, chemical, proximity, speed, and many other sensors is flooding in, in addition to the computer generation of text, tables, and graphics. Organizations that are most affected by data growth operate in the fields of video, surveillance, high performance computing, life sciences, cloud & web content, environmental monitoring, rich media, and government intelligence.
The problems of storing this big data tsunami are the normal ones of writing, indexing, provenance, security, protection, and retrieval, only on a massive scale. In traditional IT, file systems have been built to handle this. The list of file systems has grown extensively, and traditional networked file systems (NAS) have improved dramatically, with global namespaces and better metadata management. However, there is a computer science consensus that these types of systems cannot scale to meet the performance and availability requirements of petabytes or exabytes of data with billions or trillions of records, at least not cost-effectively. The world's fastest POSIX file system in 2012 is a DARPA Lustre system, which achieves about 3 billion reads and writes per day.
The biggest big data systems are now object based rather than file based. One key advantage of object storage is that the data and metadata are stored together, which eliminates many of the locking, metadata traversal, directory crawling, and file allocation table issues of traditional file systems. For example, Google claims that the Google Megastore achieves 3 billion writes and 20 billion reads per day. As of Q2 2011, the Amazon S3 system stores about half a trillion objects and serves a peak of 290,000 objects per second (25 billion per day if that peak were sustained).
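The per-second and per-day figures above are easier to compare once they share a unit. The following is a minimal sketch of that conversion; the function and variable names are illustrative, and it assumes the quoted peak rate is held for a full 24 hours, which a real peak would not be.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def per_day(rate_per_second: float) -> float:
    """Scale a per-second rate to a full day, assuming the rate is sustained."""
    return rate_per_second * SECONDS_PER_DAY


def per_second(rate_per_day: float) -> float:
    """Average per-second rate implied by a per-day total."""
    return rate_per_day / SECONDS_PER_DAY


# Amazon S3 peak quoted in the article: 290,000 object reads per second.
s3_peak_reads_per_day = per_day(290_000)

# Google Megastore figures quoted in the article: 3 billion writes per day.
megastore_writes_per_second = per_second(3_000_000_000)

print(f"S3 peak, extrapolated to a day: {s3_peak_reads_per_day:,.0f}")
print(f"Megastore average writes per second: {megastore_writes_per_second:,.0f}")
```

The extrapolated S3 figure comes to just over 25 billion, which matches the rounded number in the text.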
DataDirect Networks WOS 2.0
DataDirect Networks (DDN) has introduced WOS 2.0 (Web Object Scaler), an object storage system that it claims delivers up to 55 billion small object reads and 25 billion writes per day. This is roughly twice the peak read rate of the Amazon S3 system and about twenty times the throughput of the DARPA system.
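The comparison can be checked with simple ratios. This sketch uses only the daily figures quoted in this article; the variable names are illustrative, and the numbers are vendor or operator claims rather than independent measurements. Note that the DARPA Lustre figure combines reads and writes, so the ratios bracket the "twenty times" claim rather than reproduce it exactly.

```python
# Operations per day, as quoted in the article (claims, not measurements).
wos_reads = 55_000_000_000
wos_writes = 25_000_000_000
s3_peak_reads = 25_000_000_000   # S3 peak rate extrapolated to a full day
lustre_reads_and_writes = 3_000_000_000

# WOS reads compared with the S3 peak read rate.
reads_vs_s3 = wos_reads / s3_peak_reads

# WOS reads alone compared with Lustre's combined reads and writes.
reads_vs_lustre = wos_reads / lustre_reads_and_writes

# WOS reads plus writes compared with Lustre's combined figure.
total_vs_lustre = (wos_reads + wos_writes) / lustre_reads_and_writes

print(f"WOS reads vs S3 peak reads: {reads_vs_s3:.1f}x")
print(f"WOS reads vs Lustre total:  {reads_vs_lustre:.1f}x")
print(f"WOS total vs Lustre total:  {total_vs_lustre:.1f}x")
```

Depending on whether writes are included, the advantage over the Lustre system works out to somewhere between about 18 and 27 times.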
The components of the DDN system include high density storage appliances, which deliver 2 petabytes per rack and 23 petabytes per cluster. Up to 25 billion objects can be stored in a rack.
The sustained performance and data protection are achieved by combining the traditional DDN 8+2 hardware-enhanced data striping with de-clustered erasure coding. In addition, DDN has introduced an asynchronous replication capability that writes a second copy locally before replicating to a remote site.
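To make the capacity trade-off concrete, here is a minimal sketch of the overhead arithmetic for a striped layout. It assumes "8+2" means eight data units plus two parity units per stripe, which is the conventional reading but is not spelled out in the announcement; the class and property names are illustrative and not part of any DDN interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StripeLayout:
    """A k+m layout: k data units protected by m redundant units."""
    data_units: int
    redundant_units: int

    @property
    def total_units(self) -> int:
        return self.data_units + self.redundant_units

    @property
    def usable_fraction(self) -> float:
        """Share of raw capacity that holds user data."""
        return self.data_units / self.total_units

    @property
    def raw_per_usable(self) -> float:
        """Raw capacity consumed per unit of user data."""
        return self.total_units / self.data_units

    @property
    def tolerated_failures(self) -> int:
        """Units that can be lost per stripe without losing data."""
        return self.redundant_units


# Assumed reading of "8+2": 8 data units, 2 parity units.
wos_stripe = StripeLayout(data_units=8, redundant_units=2)

# For comparison: plain two-copy replication modeled as 1 data unit + 1 copy.
two_copies = StripeLayout(data_units=1, redundant_units=1)

for name, layout in (("8+2 stripe", wos_stripe), ("two copies", two_copies)):
    print(
        f"{name}: {layout.usable_fraction:.0%} usable, "
        f"{layout.raw_per_usable:.2f}x raw per usable, "
        f"survives {layout.tolerated_failures} lost unit(s)"
    )
```

Under these assumptions an 8+2 stripe keeps 80% of raw capacity usable while surviving two lost units, whereas keeping two full copies leaves 50% usable and survives one.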
Access to WOS 2.0 has been extended to include:
- Amazon S3
- CDMI Interface
- WOS API
- iPhone, iPad Client Dropbox-Style Access
- NAS Interface
WOS 2.0 Proof Points
The most interesting proof point that DDN offers is the work being done with a Department of Defense multi-agency partnership to provide large-scale systems to analyze and distribute high-definition sensor data. Figure 1 shows some examples of the data challenges involved. The DDN solution offered was a geo-distributed WOS object storage system to address the requirements of high speed and low latency.
Key partners in this and other projects include YottaStor, Pixia Corp, Objectivity and iRODS.
Summary and Conclusions
DDN has introduced a fully integrated, end-to-end, geo-distributed, scale-out object storage system, with a single namespace and a single global cluster interface. Its claimed performance matches or exceeds that of the largest bespoke object systems currently deployed. In addition, the data protection mechanisms can allow recovery in place, and accommodate the introduction of very large disks.
The DDN system has the potential for wide-scale adoption by cloud providers, large organizations and government agencies.
Action Item: This announcement has integrated a number of critical technology components to provide a geo-distributed, scale-out object storage system with the potential to address high performance read/write applications. DDN claims that objects can be retrieved in 40 milliseconds. For the first time, this changes the positioning of object-based systems from archive-only to general purpose. CIOs and CTOs of large organizations and service providers should look long and hard at the WOS 2.0 system architecture as a potential route to much lower cost internal and external cloud deployments.