Snapshot on a page
Highlights
CS2 is a financial organization with a three data center disaster recovery topology. Significant merger and acquisition activity has created a mixture of storage and storage management tools behind nine SANs, with poor storage utilization and stranded storage pools with no connectivity or migration tools connecting them to the main storage architecture. Storage commissioning takes up to nine months, which adds up to 59% additional cost.
The chargeback system has not reflected all the costs of the three data center topology, and has led to Symmetrix being used extensively for tier 2 storage. CS2 has just decided to make tier 2 storage based on CLARiiON and HP EVA arrays virtualized behind IBM SAN Volume Controllers (SVC) its default open systems storage solution. Only applications with aggressive RPO/RTO requirements will use tier 1.
Original Storage Snapshot
Some 600TB of EMC Symmetrix, EMC CLARiiON and HP EVA storage are spread over three sites and connected to 400 open servers by nine redundant SANs. Two sites (A and B) are within 20 miles of each other, and the third (C) is over 500 miles away. Some 800 servers have direct attached storage.
Pain Points
- The very high cost of EMC STAR licensing,
- The high cost of storage management software on all platforms (EMC CLARiiON and Symmetrix, HP EVA),
- The available storage is either directly attached or not accessible, leading to low storage utilization,
- Commissioning storage takes up to nine months,
- High overhead of host-based migration tools such as Veritas Volume Manager.
Solution Strategy
- CS2 is implementing a tiered strategy for open storage. The key determinant of tier is the RPO/RTO requirement for the application. For RTO requirements of two hours or less, or near-zero data loss, the Symmetrix STAR configuration is the tier 1 storage solution.
- For less rigorous RTO/RPO requirements, CS2 plans to virtualize nearly all storage using IBM SAN Volume Controller (SVC), using EVA and CLARiiON storage with FC drives for tier 2 storage with Metro or asynchronous remote replication. This will cover half of the storage.
- CS2 will use EVA and CLARiiON storage with SATA drives for tier 3 storage.
- The number of SANs will be reduced to three and will allow servers access to all tiers.
- CS2 will replace the majority of storage-based management software with SVC software, enabling extended use of arrays without extending the software contracts.
Adoption Issues
- Creating a central group to determine which tier data should reside on and be migrated to, relying less on user input.
- Giving this group sufficient authority.
- Improving chargeback mechanisms to incent migration from Symmetrix.
Benefits
CS2 plans to save budget by using much cheaper tier 2 storage as the default. CS2 plans to reduce its commissioning time for tier 2 from up to nine months to a few weeks, increasing storage acquisition efficiency. CS2 plans to reduce its storage software budget significantly and achieve much higher utilization of its storage assets.
Vendor Proposal | Advantages for CS2 | Drawbacks for CS2 | Overall CS2 Assessment |
---|---|---|---|
EMC Invista | Performance and availability of Symmetrix STAR | EMC not pushing Invista, cost of intelligent switches, software strategy | ** |
IBM SVC | References, software | Qualification of non-IBM storage | ***** |
Snapshot Detail
Executive Summary
CS2 is a financial organization with a sophisticated three-node disaster recovery topology. Significant merger and acquisition activity has led to a mixture of storage and storage management tools behind nine SANs, with poor storage utilization. A large percentage of the servers (66%) still have directly attached storage. Storage commissioning takes up to nine months, which adds up to 59% in additional cost.
A tiered storage strategy is in place but lacks connectivity or migration tools to use stranded storage. The charge back system does not reflect all the costs of the three data center topology, and has led to Symmetrix being used extensively for tier 2 storage.
To solve these issues, management wanted to create a virtualization strategy for both servers and storage to improve the cost, time to commission and utilization of the storage infrastructure. Two vendors, EMC and IBM, were short-listed for detailed consideration. Both proposed virtualization solutions. EMC proposed Invista in front of the existing CLARiiON and EVA storage as the second and third tiers. IBM proposed the SAN Volume Controller (SVC) in the same role.
Both solutions met CS2 requirements. The EMC Invista solution added significant cost to the planned SAN switches, and the storage management strategy was not clear. CS2 management was not confident that EMC had the experience necessary for a risk-free installation. IBM had more than 2,500 SVCs installed and presented detailed technical and implementation information, and CS2 liked the functionality of the SVC storage management software.
CS2 has decided to make the default for open systems storage tier 2 storage based on CLARiiON and HP EVA arrays virtualized behind IBM SAN Volume Controllers (SVC). Only applications with aggressive RPO/RTO requirements will use tier 1.
CS2 will:
- Implement an IBM SVC in each of the three locations;
- Retain Symmetrix as the tier 1 open systems platform when necessary to meet RTO and RPO requirements;
- Move array-based storage management software from CLARiiON and EVA arrays to the SVC to control software expenses and lengthen the useful life of the array hardware.
The biggest challenge facing CS2 is to reorganize storage administration around a function that owns responsibility for placing data on the appropriate tier, both initially and on an ongoing basis. This function will need the full support of both IT and business management if it is not to be outflanked by user and other IT groups used to deciding storage issues themselves. It may also be necessary to address the out-of-line chargeback that does not reflect the true costs of tier 1 storage.
Storage Equipment Installed before
CS2 has three data centers, two within 20 miles of each other, and the third more than 500 miles away. CS2 processes large volumes of high-value financial transactions, and has established a near-zero data loss multi-target three-node disaster recovery topology based on Symmetrix STAR technology. Dark fibre runs between the two local sites, and multiple OC48 lines connect both of these to the remote site. The replication between the local nodes is synchronous, and the replication to the remote site is asynchronous. This topology supports both mainframe and open systems applications. The primary financial applications are running on the mainframe with FICON connectivity to storage. Mainframe storage is not the subject of this case study.
The open systems have 1,200 servers supporting many thousands of internal and external users. About 800 of these servers were using directly attached storage. The remaining 400 are SAN attached and are running against three tiers of storage totaling 600 terabytes. There are many important applications running on the open systems with very high RPO/RTO requirements. The tier 1 storage is based on the Symmetrix STAR system as described above and gives the highest levels of RTO and RPO. Tier 1 storage may have up to eight copies of data throughout the EMC STAR storage infrastructure.
Tier 2 is based on EMC CLARiiON arrays and HP EVA arrays with Fibre Channel (FC) disk drives. Tier 3 storage is based on CLARiiON and EVA storage using high-density SATA and FATA disk drives.
Nine redundant SANs (eighteen SANs in all) connect the open system servers to the storage, based on different technologies from Brocade and McData. Nine people support the open systems storage and SANs.
A chargeback mechanism is in place, but not all the costs of the storage infrastructure have been apportioned to the open systems. The open systems have “piggybacked” on the mainframe infrastructure. The result is that end-user departments see little difference in storage costs between tier 1 and tier 2.
A project is well under way in the server support group to virtualize the UNIX (IBM AIX, Sun Solaris and HP OpenVMS) and Windows-based servers to reduce time for setup and full commissioning of new servers. The UNIX virtualization uses the native capabilities of the servers (e.g., LPARs on AIX servers). On Windows, the virtualization platform is VMware.
In summary, the storage infrastructure has grown like Topsy, with new SANs and storage types added as a result of acquisitions and mergers. IT management is well aware that an enhanced storage strategy is required.
Business Problems related to Storage
CS2's business focus for establishing the tiers was to save money. However, analysis showed that the storage tiers were stranded, with no connection to the servers that needed storage. In addition, no storage-based tools were available to enable migrations. The only tools available were server based, such as Symantec’s Veritas Volume Manager and AIX MPIO-based software. These required other groups to run the software and had high server and SAN overheads, which meant that they could only be run outside prime shift. They were rarely used. The result was that storage allocation was suboptimal and storage utilization low.
CS2 also found very large amounts of unused storage directly attached to servers, further contributing to low storage utilization and storage isolation. The chargeback mechanism created little incentive for user departments to request tier 2 storage, and no group inside IT management had clear responsibility for deciding where storage should be located. This led to endless discussion about storage placement, with “too much user involvement” resulting in over-allocation to Symmetrix.
The Symmetrix STAR system was delivering very high RTO and RPO, making its sophisticated software, especially the STAR remote replication software, expensive overkill for many of the applications allocated to it. EMC’s licensing agreement resulted in license fees being paid for all the storage attached to a controller. The software license costs paid to EMC were growing out of control and needed to be trimmed.
CS2 also found a major inefficiency in the way that storage was acquired. CS2 purchased all its storage, and the process to commission new storage and decommission the old storage was very long (up to nine months) and time consuming. This forced CS2 to buy arrays as much as nine months before they were actually needed. The impact is shown in Table 2 below.
Array Cost Element | Relative Cost |
---|---|
Tier 1 Array Cost | 100% |
Cost of migration | 17% |
Cost of buying early (32%/year x 9/12 = 24%) | 24% |
Depreciation costs (over 4 years) | 18% |
Total cost of array | 159% |
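The arithmetic behind the table above can be reproduced with a short sketch. The percentages are taken from the table; the function and constant names are hypothetical, chosen only for illustration, and the pro-rating by month simply follows the "32%/year x 9/12" expression in the table.

```python
# Relative cost elements, in percentage points of the array purchase price.
ARRAY_COST = 100            # tier 1 array purchase price (baseline)
MIGRATION_COST = 17         # cost of migration
DEPRECIATION_COST = 18      # depreciation costs (over 4 years), as listed in the table
ANNUAL_CARRYING_RATE = 32   # percent per year, from the "buying early" row


def early_purchase_cost(annual_rate: float, months_early: float) -> float:
    """Cost of owning an array before it is needed, pro-rated by month."""
    return annual_rate * months_early / 12


def total_array_cost(months_early: float) -> float:
    """Sum of all cost elements for a given commissioning lead time."""
    return (ARRAY_COST + MIGRATION_COST
            + early_purchase_cost(ANNUAL_CARRYING_RATE, months_early)
            + DEPRECIATION_COST)


print(early_purchase_cost(ANNUAL_CARRYING_RATE, 9))  # 24.0
print(total_array_cost(9))                           # 159.0
```

Passing a shorter lead time to `total_array_cost` shows how the "buying early" element shrinks as commissioning time falls; any such value is a what-if, not a figure from the case study.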
One of the major factors contributing to this long lead time and high cost was the need to take down applications before migrating the data, forcing either the allocation of maintenance time or special planning with users to schedule downtime.
CS2 had also initiated an aggressive server virtualization project driven in part by the need to provision servers quickly to improve response to business requirements. Storage was sometimes a bottleneck to the rapid deployment of servers.
Solutions Considered
CS2 put a team together to investigate how to improve tier 2 and tier 3 storage utilization and connectivity to make it the default storage for open systems applications. The team put together the following key strategic objectives:
- Clearly define the RPO and RTO requirements for all applications, which should be the major determinant for data placement.
- Ensure that the storage network design allows access to all tiers of storage from every server and plan to significantly reduce directly attached storage in the future.
- Deploy virtualization for tier 2 and tier 3 storage to provide a services layer outside the storage arrays: this would allow CS2 to have the option of purchasing “dumb” arrays and not be held hostage to storage vendor software pricing.
- Move the storage-based management software from the arrays to the SVC software: this would allow an additional 18 months of use from purchased arrays when the software license had run out but the hardware still held value.
- Put in place a storage management function that would decide initial and ongoing placement of storage based on RPO/RTO and the performance characteristics of the application.
An existing SAN project, designed to reduce the number of SANs from nine to three and so allow any server to connect to any tier of storage required, was accelerated.
The team investigated virtualization solutions from EMC and IBM to meet these objectives:
- EMC Invista
- EMC offered the Invista out-of-band virtualization approach together with the CLARiiON and HP EVA storage. Invista takes advantage of new intelligence being built into SAN switches from Brocade Communications Systems, Cisco Systems and McData to allow dynamic mapping of applications to their required data. Invista is an out-of-band virtualization controller that processes the control and metadata paths while relying on port-level processing in the intelligent switch to handle the data path without overhead. As data volumes are migrated to other locations, for instance during a technology refresh, that migration does not disrupt the applications.
- This Invista solution met the major requirements of CS2, including the ability to migrate data seamlessly between different arrays. However, it required additional changes in the SAN switches, as well as different software approaches for the EVA and CLARiiON arrays. EMC did not have a significant installed base of Invista, with installations only starting in the last few months. When compared with IBM’s SVC (see below), CS2 felt much more comfortable with IBM’s virtualization experience.
- IBM SAN Volume Controller (SVC) appliance.
- IBM offered the SAN Volume Controller (SVC) appliance, which is an in-band virtualization solution for heterogeneous storage. The SVC is deployed as a cluster of nodes. For CS2, each of the two clusters consisted of 1U high rack-mounted appliances based on IBM System x servers. Each node has at least four Fibre Channel ports and is protected by an uninterruptible power supply. The nodes are clustered so that surviving nodes can take over if one fails. The nodes run a Linux kernel and a specialized Virtualization Storage Software environment. Servers access the SVC as if it were a storage controller. The LUNs they see represent virtual disks, which are allocated in the SVC from a pool of storage made up of one or more managed disks (mdisks). A managed disk is a storage LUN provided by one of the storage controllers that the SVC is virtualizing. The SVC can migrate data seamlessly from mdisk to mdisk, whilst maintaining IO access to the data. Asynchronous remote copy (Global Mirror) would allow CS2 a remote disaster recovery site at the other data center. SVC also supports point-in-time copying of data (FlashCopy).
- IBM provided extensive technical information on how the SVC should be installed, including an IBM Redbook. IBM also emphasized that it had more than 2,500 installations of SVC and provided many customer references in the financial services sector.
The team recommended the IBM SVC as the best option for CS2. They have yet to recommend the SAN vendor.
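The relationship between virtual disks and managed disks described for the SVC can be pictured with a toy model. This is a minimal sketch for illustration only: it is not the SVC command set or API, and the class names, function names, extent counts and LUN names are all assumptions. It shows only the idea that a virtual disk keeps its identity while the extents behind it move from one mdisk to another.

```python
from dataclasses import dataclass, field


@dataclass
class MDisk:
    name: str           # LUN presented to the virtualization layer by a back-end array
    free_extents: int   # unallocated capacity, counted in extents


@dataclass
class VDisk:
    name: str                                     # what the server sees as its LUN
    extents: list = field(default_factory=list)   # name of the mdisk holding each extent


def allocate(vdisk_name: str, size: int, pool: list) -> VDisk:
    """Carve a virtual disk of `size` extents out of a pool of mdisks."""
    if sum(m.free_extents for m in pool) < size:
        raise ValueError("pool has insufficient free extents")
    extents = []
    for mdisk in pool:
        while mdisk.free_extents > 0 and len(extents) < size:
            mdisk.free_extents -= 1
            extents.append(mdisk.name)
    return VDisk(vdisk_name, extents)


def migrate(vdisk: VDisk, source: MDisk, target: MDisk) -> None:
    """Move every extent of `vdisk` held on `source` over to `target`."""
    if target.free_extents < vdisk.extents.count(source.name):
        raise ValueError("target mdisk has insufficient free extents")
    for i, holder in enumerate(vdisk.extents):
        if holder == source.name:
            vdisk.extents[i] = target.name
            source.free_extents += 1
            target.free_extents -= 1


old = MDisk("clariion_lun0", free_extents=4)   # hypothetical LUN names
new = MDisk("eva_lun0", free_extents=8)
vol = allocate("app_vol", 3, [old])
migrate(vol, old, new)
print(vol.name, vol.extents)  # app_vol ['eva_lun0', 'eva_lun0', 'eva_lun0']
```

The server-facing name `app_vol` never changes during the move, which is the property that lets an array be refreshed without taking the application down.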
Implementing the Solution Selected
Overall Storage Strategy
CS2 is planning to implement an upgrade to the SAN infrastructure that will reduce the nine SANs to three, one in each of its locations. This will ensure that data is not stranded.
CS2 will continue to use the Symmetrix arrays as its tier 1 storage for open systems, but only when the RTO/RPO requirements for the application dictate. This can have a major effect on reducing STAR software licensing charges. The Symmetrix arrays will not be virtualized. The tier 1 storage will continue to use the EMC storage management software and the STAR remote replication functionality.
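The placement rule can be sketched as a small function. The two-hour RTO threshold and the near-zero data loss criterion come from the tier 1 definition in the snapshot; the class, field and function names are hypothetical. The case study does not give a criterion for choosing tier 3 over tier 2, so this sketch deliberately stops at the tier 1 versus tier 2 decision.

```python
from dataclasses import dataclass

TIER1_MAX_RTO_HOURS = 2.0  # threshold taken from the tier 1 criterion in the snapshot


@dataclass
class Application:
    name: str                   # hypothetical field names, for illustration only
    rto_hours: float            # required recovery time objective
    near_zero_data_loss: bool   # True if the RPO is near zero


def select_tier(app: Application) -> int:
    """Return 1 only when RTO/RPO demand it; otherwise the tier 2 default."""
    if app.rto_hours <= TIER1_MAX_RTO_HOURS or app.near_zero_data_loss:
        return 1
    return 2


print(select_tier(Application("payments", rto_hours=1.0, near_zero_data_loss=True)))     # 1
print(select_tier(Application("reporting", rto_hours=24.0, near_zero_data_loss=False)))  # 2
```

Making tier 2 the fall-through case mirrors the strategy: an application has to justify tier 1, not the other way round.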
CS2 will virtualize all of the EVA and CLARiiON storage arrays behind an IBM SVC in each of the three locations. The array-based storage-management function will be discontinued and moved to the IBM SVC software. This will allow a single software platform across all tier 2 storage, and simplify and reduce the cost of storage management software. The order that the arrays are migrated will be determined to ensure that arrays are moved across before any software-based licenses expire.
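One simple way to derive such an order is to sort the arrays by license expiry date. The sketch below assumes that rule; the inventory is entirely hypothetical (array names and dates are invented for illustration), and a real plan would also weigh factors the case study does not detail.

```python
from datetime import date

# Hypothetical inventory: (array name, expiry date of its array-based software license).
arrays = [
    ("eva-02", date(2008, 9, 30)),
    ("clariion-01", date(2008, 3, 31)),
    ("eva-01", date(2009, 1, 31)),
]


def migration_order(inventory):
    """Earliest license expiry first, so each array moves behind the SVC in time."""
    return [name for name, expiry in sorted(inventory, key=lambda item: item[1])]


print(migration_order(arrays))  # ['clariion-01', 'eva-02', 'eva-01']
```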
Organizational Issues
CS2’s strategy is to make tier 2 the default storage. To effect this strategy, CS2 plans to create a storage function within storage administration responsible for ensuring that data is placed on the right tier with the right RTO/RPO, and for ensuring that performance requirements are met. It will also be responsible for using tier 2 storage as the default and for recommending that direct-attached storage be used only for exceptional applications.
The same group will be responsible for driving down the time to commission new storage from up to nine months to a few weeks. If implemented quickly, this will have a significant impact on the storage acquisition costs for the following year.
CS2 will need to provide this group with considerable senior IT and business backing for its decisions, as it will be very easy for business groups to revert to established ways of requesting and provisioning storage.
Conclusions
Wikibon draws the following conclusions from this case study:
- Organizations like CS2 have concluded that virtualization is a prerequisite for any effective implementation of tiered storage; without it storage cannot be moved dynamically without causing interruption to the application.
- Virtualization is now ready for general adoption, with IBM showing considerable momentum.
- One of the biggest benefits of virtualization for CS2 will be the ability to commission and decommission storage in weeks rather than months. This will reduce the costs of overlapping storage, delay the purchase of new storage, and significantly reduce overall storage acquisition costs.
- A tiered storage strategy means that responsibility for the initial and ongoing placement of storage should be centralized, probably as part of the storage administration.
- Storage and server virtualizations are independent projects, but they will gain significant benefits from each other if undertaken together.
Wikibon concludes CS2 is implementing a good strategy that, if well executed, will significantly reduce the cost of the storage infrastructure and improve storage utilization and flexibility. However, a potential weak point of the strategy remains the degree to which IT can enforce a tier 2 default strategy without significantly overhauling its chargeback strategy. CS2 will need to ensure that the storage function implementing this strategy is strong and has the support of senior IT and business managers.
Legal: © Wikibon 2007. This document is copyright protected by Wikibon and does not fall under the GNU general license terms for Wikibon.org. Links to this article from external sources are allowed, however any other re-distribution of this content for commercial purposes is strictly prohibited. Please contact Wikibon for more information.
The cases cited herein are real however the name of the customer is fictitious. Wikibon case studies are developed independently and their development is not initiated for or funded by any single company. Wikibon reports actual customer experiences and results with no attempt to emphasize any one vendor’s strengths or weaknesses. Read the full disclaimer.