Storage Peer Incite: Notes from Wikibon’s August 10, 2010 Research Meeting
Colonial Williamsburg has come a long way in the last few years in terms of its IT efficiency and in particular its storage. As a living history museum, it faces an unusual challenge: it needs to store large amounts of multimedia content documenting historic events as well as objects from the Colonial American period basically forever. And these files are large.
Colonial Williamsburg's IT storage migration from a fragmented, expensive, and very inefficient initial state to its present situation, with a virtualized environment and automated two-tier storage, and the savings and other benefits this migration has provided, are discussed in the articles below. However, at the end of the Peer Incite meeting that generated this newsletter, Sean Maisey, director of operations and engineering, admitted that a major challenge remains -- ensuring that the files containing that historic and archeological information are preserved and available to future generations.
The material in question includes multimedia records of historic events at the museum itself including a meeting between the President of the United States and the Premier of the People's Republic of China at Williamsburg, along with video documentation of archeological digs on the site and many other important historical objects, events, and research. Presently all of the museum's active IT is centralized on its site. This works day-to-day but obviously leaves the data vulnerable to disasters long term.
The only backup for this material at the moment is on tape. This is off site, but it leaves the files at risk of loss or deterioration of the medium over time, or of the obsolescence and replacement of the recording technology itself. And the files on those tapes are difficult to access when needed for historical research and other needs that the museum was created to serve. Maisey is investigating cloud service options for a better answer. What he needs is a solution that combines reasonable cost, an environment in which multiple copies of the files would be maintained at different physical locations, and a reasonable level of access. In theory the cloud should provide this combination of qualities.
Few midrange organizations have files that they need to preserve forever. Many, however, have files they need to maintain for five years or more -- beyond the operational life of a reel of data recording tape. Backup and disaster recovery (DR) are major issues for most midrange organizations. The cloud should be able to provide the answer. If Maisey's quest succeeds, it may well yield a better answer for DR not just for Colonial Williamsburg, as important as that is culturally, but for a large number of organizations with unfilled disaster recovery needs.
G. Berton Latamore
The rapid growth in file-based and unstructured content storage is forcing companies to find ways to reduce complexity and cut costs. By virtualizing existing filers and creating tiered storage pools, organizations can begin to allocate less frequently accessed data to more cost effective storage tiers (e.g. SATA-based storage), including (potentially) the cloud. This approach avoids capital expenses associated with higher cost tier 1 storage systems, which use expensive FC-based devices, and minimizes operating costs associated with file management. Organizations also can simplify migration headaches and reduce planned downtime.
This was the message from Sean Maisey, Director of Operations and Engineering at Colonial Williamsburg. The Colonial Williamsburg Foundation operates the world’s largest living history museum in Williamsburg, Virginia, the restored 18th-century capital of Britain’s largest and most important outpost in the New World. This college campus-like environment grows from a community of 2,500 to as many as 3,500 employees in the peak summer months. In addition to the museums, the IT organization at Colonial Williamsburg supports hotels, restaurants, and various retail shops.
Seven years ago, the organization had roughly 70 departments, each in charge of its own storage with no way to share resources. In the middle of the last decade, Colonial Williamsburg created what essentially amounted to a manual storage tiering strategy using two tiers: a tier 1 based on NetApp FAS 250 filers and a second, nearline tier using NetApp NearStore with SATA-based devices. The organization had essentially two types of unstructured data: standard user files (e.g., documents, spreadsheets) and rich media documenting the history of the museum, its visitors, and its legacy, which it needs to capture and preserve. Eighty to eighty-five percent of the unstructured data in the system was rich media. The organization's infrastructure was as follows:
- 140 Windows production servers – mostly virtualized;
- 100TB raw storage capacity on two NetApp filers (with a remote cluster for DR), and;
- IBM midrange systems running financial applications.
The problem with the organization’s tiering approach was that tier 1 storage filled up quickly, and users would frequently run out of storage space, causing disruption. As well, the IT organization realized it was spending too much on tier 1 storage and expensive migrations.
The Answer: File Virtualization
Colonial Williamsburg implemented the F5 ARX system and deployed the solution in front of its NAS filers. The F5 system virtualizes heterogeneous filers and creates a global name space so that all the storage behind the system can be shared across application servers. The system uses a policy-based life-cycle management approach so that after a certain period of time (90 days in the case of Colonial Williamsburg), data is migrated from expensive tier 1 storage to tier 2 SATA devices. Maisey indicated that this simple age-based classification and migration methodology allowed the organization to dramatically cut storage costs and minimize migration pain.
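The decision logic behind such an age-based policy is simple to sketch. The following Python fragment is purely illustrative (it is not the F5 ARX policy engine); the 90-day threshold mirrors the rule cited above:

```python
import time

# Illustrative sketch of an age-based tiering rule like the one described
# above: files untouched for 90 days become candidates for tier 2 (SATA).
# This is not the F5 ARX policy engine, only the decision logic it embodies.

TIER2_AGE_DAYS = 90
SECONDS_PER_DAY = 86_400

def target_tier(last_access_epoch, now=None):
    """Return 1 to keep a file on tier 1, or 2 to migrate it to tier 2."""
    if now is None:
        now = time.time()
    age_days = (now - last_access_epoch) / SECONDS_PER_DAY
    return 2 if age_days >= TIER2_AGE_DAYS else 1

# A file last touched 120 days ago belongs on tier 2; a 10-day-old file stays.
now = time.time()
assert target_tier(now - 120 * SECONDS_PER_DAY, now) == 2
assert target_tier(now - 10 * SECONDS_PER_DAY, now) == 1
```

The appeal of such a rule is its simplicity: last-access age is cheap to track and, as Maisey's experience suggests, is a good enough proxy for data value to drive migration automatically.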
In particular, Maisey pointed to two cost-saving benefits this approach delivered:
- Maisey estimated that the cost of an array migration runs in the tens of thousands of dollars;
- The cost of tier 2 disk was more than 50% lower than that of tier 1 storage.
The ROI of File Virtualization
At the Peer Incite call, we conducted a back-of-the-napkin ROI analysis for Colonial Williamsburg. The discussion resulted in the following rough breakdown:
Costs
- Cost of the F5 ARX solution,
- Project implementation costs (approximately equal to the cost of a new filer).
Benefits
- Value of freed-up tier 1 space (80% of Colonial Williamsburg’s tier 1 capacity was returned to the pool),
- Ongoing cost savings of tier 2 disk versus tier 1,
- Value of simplified management,
- Reduced migration costs, and,
- Overall improved quality of service and uptime (including reduced planned downtime).
The bottom line on ROI for Colonial Williamsburg is that the improved utilization from pooling storage, the automation of migration policies to ensure data resides on more cost-effective disk, the avoidance of future tier 1 storage purchases, and reduced migration costs easily offset the cost of the ARX solution and its implementation.
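The arithmetic can be sketched quickly. In the Python sketch below, only the ratios come from the call (tier 2 disk at roughly half the cost of tier 1; array migrations costing tens of thousands of dollars); every absolute dollar figure is a placeholder assumption, not a number Colonial Williamsburg disclosed:

```python
# Back-of-the-napkin ROI sketch. The ratios below were cited on the call;
# all absolute dollar figures are placeholder assumptions for illustration.

tier1_cost_per_tb = 10_000                    # assumed $/TB for tier 1 (FC-based)
tier2_cost_per_tb = tier1_cost_per_tb * 0.5   # cited: tier 2 is >50% cheaper
migrated_tb = 80                              # assumed TB moved off tier 1
avoided_migration_cost = 30_000               # cited: "tens of thousands" per migration
project_cost = 150_000                        # assumed ARX solution + implementation

savings = migrated_tb * (tier1_cost_per_tb - tier2_cost_per_tb) + avoided_migration_cost
net_benefit = savings - project_cost
print(f"Estimated savings: ${savings:,.0f}; net benefit: ${net_benefit:,.0f}")
```

Even with conservative placeholder inputs, the structure of the calculation shows why the project paid for itself: the per-terabyte delta between tiers dominates once a large fraction of capacity is migrated.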
Virtualizing Tier 3 in the Cloud
According to Maisey, one gap in the current infrastructure is data protection and disaster recovery, and the cloud may provide the answer. The organization is beginning to plan migrations beyond the on-premises data center, and the cloud is a logical migration tier. Maisey’s plan is to use the automated tiering capabilities of the F5 platform to front-end the cloud and treat it as another tier for deep archive and recovery purposes. While there is potential to reduce costs further, the real motivation for exploring the cloud for Colonial Williamsburg is the ability to provide cloud-based backup, replication, and disaster recovery.
Key advice for practitioners considering this type of strategy includes:
- Understand the service level agreements of cloud service providers,
- Gain clarity on security and information governance policies for the cloud, and ensure they comply with corporate edicts,
- Understand how to handle incident reporting and escalation with cloud service providers,
- Secure audit rights and conduct cloud audits at least annually,
- Understand the degree of difficulty in migrating data out of the cloud (i.e. how do you get your data back).
If something goes wrong, the cloud service provider’s CEO is unlikely to be held liable; involving risk management in discussions early can avoid problems down the road.
Also, push for integration between suppliers such as F5 and cloud service providers where increased automation, security, and recovery are fundamental focus areas.
Action item: The intelligent use of file virtualization to create a global resource pool can save significant money and time and reduce risks. Automation is the key to success, and data migration policies based on simple age and file activity metrics are an effective methodology for IT practitioners. IT organizations suffering from tier 1 cost pains and file sprawl must investigate creating a global resource pool and automating migration. Including the cloud in plans is timely and sensible, especially for mid-sized and small organizations that lack advanced recovery capabilities.
When considering approaches to sourcing IT infrastructure and business applications, CIOs need to balance acquisition and operating-cost efficiency with agility, scalability, availability, and resiliency. IT departments that are slow to respond to business-unit requirements may find those same business units circumventing corporate IT and sourcing their IT and applications from cloud-based offerings.
Whether as pure infrastructure-on-demand suppliers or providers of software-as-a-service (SaaS), many third-party cloud-services offerings are designed to simplify the procurement process, support on-demand scaling, provide high availability, and enable rapid recovery. Many also provide the option of a variable-cost pricing model, which is often more palatable to business-unit executives. CIOs who want to remain competitive with cloud-services offerings should move aggressively to adopt the core enabling technologies of Infrastructure 2.0 that underpin cloud-based offerings. These include:
- Virtual servers,
- Virtual networks,
- Virtual storage,
- Virtual file systems.
Rather than viewing cloud services as competitive, CIOs should plan for the development and deployment of an infrastructure that incorporates private-to-public-cloud extensions. Nowhere is this more obvious than in the fast-growing area of rich-media content.
The IT services that support the operations of Colonial Williamsburg provide an excellent case study. At Colonial Williamsburg, 80%-85% of file data is described as rich media, which includes photos and videos. Most of the file data is infrequently accessed, but as an institution that incorporates travel and entertainment, educational, museum, and research activities, all of the data is considered important. Colonial Williamsburg implemented ARX Series file virtualization technology from F5 to enable automated, policy-based migration of data through two tiers of storage. Files that are not accessed for 90 days are automatically migrated to tier 2 storage. Doing so not only reduced the overall cost of supporting rich-media file growth but also eliminated the out-of-capacity notifications that had been occurring with individually managed tier 1 filers from NetApp. The organization further reports that only about 3.5% of data migrated to tier 2 storage is migrated back to tier 1 in any given month and that the cost of tier 2 storage is approximately half the cost of tier 1.
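Those two reported figures (a roughly 3.5% monthly recall rate and tier 2 at about half the cost of tier 1) imply that the blended cost of migrated data stays close to the raw tier 2 price. A quick sanity check in Python, under the simplifying assumption that recalled data is charged at tier 1 rates for the month:

```python
# Sanity check on the tier 2 economics reported above. Costs are normalized
# to tier 1 = 1.0; the charging model is a simplifying assumption.

tier1_unit = 1.0        # normalized tier 1 cost per unit of data
tier2_unit = 0.5        # cited: roughly half the cost of tier 1
recall_rate = 0.035     # cited: ~3.5% of migrated data returns per month

# Recalled data is assumed to incur tier 1 cost; the rest stays at tier 2 cost.
blended = recall_rate * tier1_unit + (1 - recall_rate) * tier2_unit
print(f"Blended cost as a fraction of tier 1: {blended:.4f}")
```

The low recall rate is the key: because so little migrated data ever comes back, the effective cost of the migrated pool stays only marginally above the raw tier 2 price.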
As a next step, Colonial Williamsburg, which provides IT services out of a single, campus data center, is evaluating how the current tape-based backup and recovery process can be eliminated and how file services can be extended to a cloud-based offering that incorporates multi-site replication.
Action item: CIOs should develop an agile, flexible infrastructure strategy with a view towards future integration of private and public cloud offerings. Public cloud services should not be viewed as competitive, but rather as another tool in the IT tool box. The proliferation of rich-media files provides a relatively low-risk proving ground for a strategy that sources long-term archiving and disaster recovery from public-cloud offerings, while serving the needs of more-frequently accessed files and short-term archives from in-house private clouds.
Generally, the idea of creating a single namespace with an appliance front-ending storage brings the benefits of simplification, improved utilization, and the ability to automate data movement throughout the storage hierarchy. CTOs should be aware, however, that unstructured data is increasingly trending toward rich media. This means more space consumption, more issues with long-term retention, and higher costs. New solutions need to be considered to address this challenge, starting with virtualization.
CTOs and Infrastructure 2.0 architects should start to look at the integration of storage and network virtualization as an essential part of private and hybrid cloud computing. Achieving this will require a clear strategy for automating the movement of data through the storage hierarchy. A key aspect will be understanding which applications must remain on tier 1 storage and which data is appropriate for migration.
This is a classic case of matching data placement to how users actually access files. When providing archiving services, especially when rich media is involved, proximity and data location become vital to system performance and cost. Integrating network and storage virtualization is increasingly becoming a prerequisite for ensuring data is in the right place for the user at the right time. As well, placing inactive data on the proper tier can save substantial hard dollars.
Action item: CTOs should begin to consider the integration of storage and networking as a critical step in developing next generation infrastructure 2.0 strategies. Virtualizing these components, along with server virtualization will allow IT organizations to begin to offer IT as a service more cost effectively than today's stove-piped, ad hoc approaches allow. As well, organizations will be able to leverage this approach to provide economics and business models more aligned with cloud service providers.
Prior to deploying file virtualization, Colonial Williamsburg’s storage requirements were handled separately by each department. Budgets were separate, and stove-piped infrastructure was the modus operandi.
Business requirements to provide access to rich media assets and to use resources more efficiently drove the need to consolidate IT resources. Colonial Williamsburg could have pursued alternatives such as backing up data from laptops and desktops directly to the cloud, freeing up space. However, it was simpler and of greater business value to implement F5’s file virtualization technology. This change also gave all users full access to the vast stores of historical information (rich media). The bottom line is that IT organizations must remove the current stove-piped views of infrastructure architecture and management.
Action item: The IT organization must develop trust across lines of business so that a shared storage and network infrastructure model can be delivered. The virtualized infrastructure can deliver consistent access times, high availability, and adequate service levels, even for infrequently accessed data. In the absence of chargebacks in a virtualized world, and notwithstanding organizational constraints, storage budgets must also be shared among business units for optimal efficiency. Ultimately this will allow internal IT organizations to become more competitive with cloud service providers.
The rapid growth of unstructured data has dramatically out-paced the growth of structured data for more than a decade. It has brought huge challenges for users and made fortunes for several industry executives.
Technology providers have proven that file virtualization solves a problem, works well, and can save money. In particular, F5 (formerly Acopia) and others (e.g. Rainfinity and NetApp) have carved out nice niches consolidating bespoke filers, creating global namespaces and delivering automated, policy-based tiered storage solutions. The result has been lower costs, simplified management, and better overall IT operations.
The most significant trend to hit the market in a decade is cloud computing, and virtually all players need a cloud strategy. Whether selling to cloud service providers or internal IT departments, the consumerization of IT is hitting virtually every segment of the market and driving the need for simplification, automation, and agility. It's also increasing requirements for integration.
In the view of the Wikibon community, from a positioning standpoint, storage vendors must move beyond the notion of file virtualization into the realm of infrastructure 2.0. What does that mean? It means developing a consistent architecture that provides soup-to-nuts automation and can deliver so-called cloud services that are both private (i.e. internal behind the firewall) and public in that they interact with the external cloud in a manner that is deemed safe, fast, simple, and cost effective.
For suppliers, moving beyond file virtualization means taking the attributes that have been most popular (e.g., global namespace, policy-based migration, tiered storage) and extending them to the cloud. Players like F5 can become a critical component of cloud strategies for internal IT departments trying to become more competitive with the business models of cloud service providers, as well as by selling directly to or partnering with cloud service providers themselves. From a private cloud standpoint, this means aggressively investing to integrate with key virtualization platforms (especially VMware) and integrating or partnering with cloud service providers through open APIs and cloud standards.
Importantly, suppliers need to understand the use cases for extending their value proposition to the cloud. In the view of the Wikibon community, the best opportunity is to help users that have aging disaster recovery processes and are looking to the cloud to provide efficient redundancy that doesn’t exist within their own data centers today. We believe most organizations, especially mid-sized companies, need help in this regard.
Action item: Cloud computing represents a once-in-a-decade opportunity for traditional file virtualization suppliers to re-brand themselves. Marketing executives at these firms must identify pain points which can be addressed by cloud computing, particularly simplifying internal IT operations and providing data movement, protection and recovery services leveraging external locations. Product development executives at these companies must then aggressively pursue solutions that deliver clear business value to an emerging set of IT buyers under increasing pressure to truly deliver IT as a service.
As unstructured data and files such as documents, messages, MP3s, podcasts, and images continue to drive up storage demand within enterprises of all sizes, IT organizations are increasingly adopting cloud technologies to improve their ability to support internal clients, particularly when it comes to archiving and retaining information for ever-longer periods of time.
A case in point was provided by Sean Maisey, Director of Operations and Engineering for The Colonial Williamsburg Foundation, who during Wikibon’s Peer Incite call shared his organization’s struggles with:
- Provisioning storage for over 3,000 users in roughly 70 departments all with their own budgets and stove-piped infrastructure,
- Backing up almost 100 terabytes of unstructured data across multiple tiers and platforms,
- Dealing with degrading access times along with a spike in data access requests,
- Sharing information electronically across the enterprise,
- Soaring storage costs.
Maisey and his team determined that a file virtualization solution from F5, front-ending their tier 1 NetApp FAS 250 filers and tier 2 NetApp NearStore SATA drives, would essentially eliminate the disruption associated with storage administration and automate many storage management tasks, including the time and cost to migrate data.
Maisey stated that the F5 solution moved the organization closer to a true ILM (information lifecycle management) environment and provided the following benefits:
- Free up more than 80% of total useable tier 1 storage space,
- Dramatically improve response times,
- Allow users to share information across departments,
- Eliminate storage related help desk calls,
- Automatically migrate data from tier 1 to tier 2 storage.
GRS Trends for the Cloud
As organizations increasingly become more “cloud-like” and look to update outmoded processes and Get Rid of Stuff (GRS), including older equipment, at least three trends are emerging:
1) The elimination of non-x86 architectures for all but the most specialized applications,
2) The elimination of stove-piped storage and network infrastructure,
3) The breakdown of boundaries between storage tiers, including external cloud repositories.
Cloud-enabling solutions such as file virtualization hold tremendous possibilities, especially for mid-sized and smaller organizations that do not have the resources to proactively manage their storage environments, are cost constrained, and need to improve service levels.
Action item: Organizations must embrace this new IT reality and shed not-invented-here mentalities which have led to less efficient and more restrictive infrastructure.