Storage Peer Incite: Notes from Wikibon’s May 1, 2007 Research Meeting
David Floyer presents The State of Big NAS. In the past few years, we've seen NAS evolve from primarily a platform for highly distributed, lower-performance, less mission-critical applications to a domain where the value and amount of data are growing dramatically. This has significant implications for both user and vendor organizations.
In the last few years, we've seen the emergence of a high end in the network attached storage (NAS) market as suppliers and users recognize the business benefits of both SAN (storage area network) and NAS, and realize the need for greater differentiation of NAS solutions. To put this in perspective, recall that most disk-oriented storage solutions have been differentiated along one of two dimensions: 1) is the storage subsystem direct or network attached? and 2) does the storage subsystem use a block (stateful) protocol or a routed, stateless protocol?
The emergence of NAS coincided with the introduction of IP as the fundamental routing protocol between mainly file-oriented network servers and the use of specialized NAS storage subsystems for storing mainly stateless files such as email archives, Web files or media files. As a consequence of this simple advance in utilizing IP-oriented protocols to route storage data from lower cost devices to applications that did not require the overhead of state management, the NAS market took off. This has led to increased competition between traditional NAS players like NetApp and traditional block-oriented or SAN companies like EMC about which technology is most suitable at the very high end of the market.
What has become clear is that for applications that are truly stateful and require a block-oriented protocol, like big database OLTP applications, there is no substitute for the block-oriented storage subsystems commonly referred to as SAN from companies like EMC, Hitachi and IBM. However, as the debates regarding the suitability of NAS as a substitute for SAN start to abate, there's an increased recognition that big NAS is a sector that deserves special investment, requires specific innovation and should be uniquely exploited by users.
Here we are talking about subsystems that can support over 200,000 I/Os per second, still less than the roughly 1M to 2M+ I/Os per second of big block-oriented storage subsystems, but much higher than has traditionally been associated with NAS.
The emergence of these high-end NAS systems means that users can, for the first time, cost-effectively begin pursuing three goals: 1) consolidating their NAS subsystems into a more manageable set of resources that better satisfies increasingly strict compliance laws for email archiving and other digital file archiving applications; 2) speeding the introduction of storage virtualization technologies into IT organizations so that the benefits of storage virtualization can be realized sooner; and 3) developing the skills and methods associated with storage-level metadata management, which high-end NAS subsystems support more readily, so that skills learned early can be applied across an array of storage technologies.
Action Item: The high end NAS marketplace has been identified for both specific innovation and specific application. Users should begin considering and exploiting these technologies for those stateless applications that nonetheless for either legal or competitive reasons are on a vector that requires increasingly high performance.
The emergence and adoption of specialized NAS appliances over the past decade has dramatically reduced storage costs and evolved the NAS market well beyond using mainly general purpose servers to store files, unstructured data and certain NAS-friendly databases. For years, this market remained highly distributed and often isolated from the management dictates of the data center. As a result, in these file-oriented storage environments, customers did not demand the same levels of consolidation, availability, performance and business continuance functionality as in block-based worlds dominated by SAN. This led to an attitude within storage administration groups that the value of data residing on block-based systems greatly exceeded the value of data stored on file-oriented infrastructure.
This is changing. With the rapid growth of unstructured data within organizations, initiatives like email archiving, e-discovery and compliance often feature NAS or file-oriented storage as a critical component of storage infrastructure. Combined with the phenomenal success of Network Appliance, fierce interest and competition from EMC and the advancement of more robust technologies, this growth is making NAS an imperative for both buyers and suppliers. Recently, we've seen a spate of moves by leading companies like IBM, Hitachi and HP to shore up NAS strategies and compete or partner with upstarts like BlueArc. Suddenly, NAS is where all the action is and unstructured data is getting more attention as organizations exploit big NAS for consolidation, higher performance and applications requiring higher availability and resiliency.
Action Item: Organizations should stop using artificial distinctions such as file-oriented versus block-based biases to define the value of data. Users should not assume that data stored on NAS is less important than data stored on block-based (e.g. SAN) systems. Rather, organizations should bring file-oriented storage under the umbrella of IT management and aggressively exploit big NAS technologies such as virtualization that enable consolidation and robust business continuance consistent with today's information storage requirements.
A few years ago, a flurry of TCO studies purportedly showed that NAS file-based systems were much cheaper to manage than SAN block-based systems. However, the research methodology was deeply flawed. When the type of application was added as a variable, the new research found what pragmatic storage practitioners have known all along: for applications with high write and high locking rates, block-based systems were more efficient, and for applications with high connectivity and undemanding I/O requirements, NAS was usually the better technology.
As discussed in the storage alert “The conundrum of NAS consolidation economics,” there is a strong business case to consolidate NAS, driven by the necessity to ensure compliance and protection of all data. However, as the workloads for NAS and SAN are very different, they should not be consolidated onto one platform.
Action Item: NAS consolidation should be tackled on different platforms and with different priorities than SAN. The simpler nature of NAS systems will sometimes lead to earlier implementation of new technologies (e.g., thin provisioning, automated classification) on NAS. IT departments should exploit these technologies on NAS early and learn from them where they can, so they can be implemented more effectively when they become mainstream on SAN-based systems.
NAS and SAN technical experts do not always agree, or have the same approach to solving problems. However, a common strategic framework should be developed for key compliance, business continuance and classification issues for both pools of data. As discussed in the storage alert “SAN and NAS technologies should be consolidated but kept apart,” the technologies should usually be kept separate.
To achieve this, both the user and IT stakeholders will need to be fully and directly involved, so that they can participate in and shape the business case, and select technologies that will support the applications. This may lead to some delay in reaching conclusions as the groups struggle to find a common storage language, but it is better to resolve these issues in the computer room than in the courtroom.
Action Item: Common methodologies and procedures should be used to evaluate the cross-functional compliance and other business issues for all storage pools, and in particular NAS and SAN. Imposing standards without full buy-in is likely to lead to non-conformance and to significantly increase the organization's exposure to business risk. Consolidation of NAS should be encouraged as a natural outcome of the business necessity to simplify the practical implementation of standards.
Over the past fifteen years, storage consolidation has been one of the primary drivers of storage economics. It began with the big-box approaches popularized by EMC's Symmetrix subsystem in the 1990s and continued with storage area network technologies in the first half of the 2000s. Consolidating storage has succeeded in increasing storage utilization, improving backup and recovery procedures (and consequently application availability) and providing more flexibility in IT infrastructure deployment.
We note that many of the better-publicized economic successes in storage consolidation, which demonstrated a 25% to 60% reduction in storage spending, have been in areas where storage was a major expense and an area of risk for application growth. These often tended to be workloads that were expensive to manage (e.g., required lots of people), such as large online transaction processing, where accurately predicting storage spending was a major source of IT budget contention.
Workloads targeted for NAS, by contrast, have been much simpler to manage, typified by file-oriented storage and unstructured data that are generally stateless. These less mission-critical applications historically required less rigor in terms of performance, backup and recovery attributes, and fewer people to manage. As a result, consolidating these workloads will not necessarily bring the same hard-dollar economic benefits that SAN consolidation demonstrated. Hence the dilemma: while consolidating NAS in these areas will provide better utilization and certainly simplify management, consolidating a less expensive problem will take out fewer IT dollars.
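The arithmetic behind this dilemma can be sketched in a few lines. The following is an illustration only: the budget figures and the `consolidation_savings` helper are hypothetical and do not come from Wikibon research. The one grounded number is the 25% reduction rate, the low end of the savings range cited above, applied here to both environments purely for comparison.

```python
def consolidation_savings(annual_spend, reduction_rate):
    """Return the dollars removed per year when spending falls by reduction_rate."""
    if not 0.0 <= reduction_rate <= 1.0:
        raise ValueError("reduction_rate must be between 0 and 1")
    return annual_spend * reduction_rate


# Hypothetical annual storage budgets, invented for illustration only.
san_spend = 10_000_000
nas_spend = 2_000_000

# Assumption: the same reduction rate applies to both environments.
# 0.25 is the low end of the 25% to 60% range cited in the text.
rate = 0.25

san_saved = consolidation_savings(san_spend, rate)
nas_saved = consolidation_savings(nas_spend, rate)

print(f"SAN consolidation removes ${san_saved:,.0f} per year")
print(f"NAS consolidation removes ${nas_saved:,.0f} per year")
```

Even if NAS consolidation matched the SAN reduction rate, the smaller starting budget would yield proportionally fewer hard dollars, which is why the business case has to rest on other drivers.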
The real drivers of NAS consolidation are the need to bring largely distributed NAS infrastructures under a single management discipline to help reduce corporate risk, to improve productivity for users who require access to a growing mountain of unstructured data, and to integrate stateless information into a broader information management strategy.
Action Item: The value proposition of NAS consolidations is not a carbon copy of SAN. Vendors and buyers must re-think the economic value justification of NAS consolidations and not simply try to mirror the successes of block-oriented storage consolidations. A greater emphasis on never-delete retention policies, provenance, data classification and the long-term value of information should drive vendor roadmaps in this growing and important market space.
"Shadow IT" came out of the shadows during the late 1990s and emerged as a major force in enterprise computing. Fueled by a combination of factors, including greater business oversight and accountability for applications, advances in application development technologies that made "user authoring" of applications possible, and hosted substitutes for traditional IT services, business lines have seized greater control over their computing resources and are loath to relinquish their new authority. However, regulations that legally consolidate corporate accountability for data and data processing (e.g., SOx), coupled with dramatic increases in the costs of infrastructure to support user-orientated applications (e.g., image-rich applications that consume huge volumes of storage), are forcing at least a rethinking of the apportionment of roles and responsibilities between central and business IT groups. The availability of high-end NAS technologies can, and will, facilitate whatever degree of consolidation in user-orientated data processing a business chooses. However, the major battle will not be technological, but organizational: Will business give back control of distributed, end-user applications to IT?
Action Item: Adoption of high-end NAS technology as a vehicle for file-orientated storage consolidation will hinge less on the intrinsic quality of these technologies, and more on emerging authority relationships between IT and business groups. IT organizations should help the business achieve control over data by supporting storage consolidation efforts in ways that don’t necessarily demand broader, and usually more complex, application consolidation programs.