
Understand what 'Non-disruptive' Means
When vendors talk about non-disruptive migration, you need to read between the lines and ask the right questions to determine whether they mean truly non-disruptive or only sometimes non-disruptive.
Wikibon recently defined federated storage as a collection of autonomous storage resources governed by a common management system. The best way of thinking about federated storage is as a collection of storage resource nodes which are loosely connected. The nodes can be storage arrays or appliances controlling multiple arrays. The management system provides rules, in particular about how data is migrated throughout the network.
One of the major business problems that federated storage tackles is the migration of data between the nodes of a storage network. This can be to allow upgrades in storage technology and to realign the allocation of storage resources to application and business needs. This capability requires non-disruptive migration of data between the nodes.
Non-disruptive migration is of growing value and importance. Most disruptions to an application require extensive planning and result in a narrow window during which migration of data can occur. Solutions have been available for mainframe applications, but open systems solutions have been limited to file-based storage, where in general it is easier to lock a file and move it dynamically. However, for the business-critical, large-scale, update-intensive applications that use block-based storage, the solutions are not as mature.
There are server-based techniques to achieve this (VMware VMotion being a recent addition), which are useful for small systems. However, the elapsed time and resilience of these techniques do not allow for large amounts of data to be transferred. Array-to-array solutions are the best technology foundation for large-scale migration of data.
Non-disruptive migrations within an array are now possible on most tier 1 and tier 1.5 arrays (e.g. 3PAR, XIV, etc.). This can help reduce the cost of storage and remove performance bottlenecks, but does not tackle major realignments or technology upgrades. Products that support externally attached heterogeneous storage, such as Hitachi’s USP V, and appliances like the IBM SVC allow migration between arrays within a node. If the node controller itself needs a technology upgrade, appliances in theory allow this by upgrading one side of the appliance and then the other. This is not for the faint of heart, however, as there is no fail-back during the process! I know of at least one large university that tried to perform such a tightrope walk and ended up taking out all their major applications for days during a migration. And don’t forget that the problem of moving data from one node to another still remains.
As far as I can determine, currently (as of 10/09) the only array-based solution for open systems block-based storage that allows rapid, truly non-disruptive migration of data between federated storage nodes is Hitachi’s High Availability Manager (HAM). This function (which really needs a new name) allows two USP V arrays to be dynamically connected, data to be moved non-disruptively between the two arrays, the application to be cut over to the new array, and all connections to the original array to be severed. The function uses a metadata quorum disk to arbitrate between the two arrays in the case of any failure during the data transfer process. This is unique in the block-based storage industry; my research suggests that XIV, SVC and other products fall short in this capability. EMC’s V-Max likewise represents another disruptive generational migration for EMC customers.
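To make the role of the quorum disk concrete, here is a minimal sketch of quorum arbitration in general. It is not Hitachi's implementation: the names `QuorumDisk`, `try_claim` and `arbitrate` are hypothetical, and the sketch assumes both arrays hold synchronized copies of the data at the moment they lose contact with each other. The idea is simply that whichever array reaches the shared disk first keeps serving I/O, and the other stops accepting writes so the two copies cannot diverge.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class QuorumDisk:
    """Hypothetical shared disk holding one ownership record."""
    owner: Optional[str] = None

    def try_claim(self, array_id: str) -> bool:
        # The first array to claim the disk becomes the owner.
        # A repeated claim by the current owner also succeeds.
        if self.owner is None:
            self.owner = array_id
        return self.owner == array_id


def arbitrate(quorum: QuorumDisk, arrival_order: List[str]) -> Dict[str, str]:
    """Decide what each array does after the inter-array link fails.

    arrival_order lists the arrays in the order they reach the quorum
    disk (an assumption of this sketch; real systems race for it).
    Returns 'serve' for the winner and 'fence' for every other array.
    """
    return {
        array_id: "serve" if quorum.try_claim(array_id) else "fence"
        for array_id in arrival_order
    }


if __name__ == "__main__":
    quorum = QuorumDisk()
    # Illustrative array names; the new array reaches the disk first.
    decisions = arbitrate(quorum, ["new_array", "old_array"])
    for array_id, action in decisions.items():
        print(array_id, action)
```

The point of the design is that the decision is made by a third party both arrays can see, so neither array has to guess whether its partner has failed or merely become unreachable.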
In my view, this capability is a fundamental building block to facilitate the adoption of federated storage networks. Without this capability, federated storage is marketing hype. All storage vendors will need to provide this function on their roadmaps. Users should be asking for details of how and when storage vendors will be delivering such capabilities, and asking the right questions to get to the truth, including:
- What do you mean by non-disruptive?
- Can I migrate data non-disruptively within a storage array?
- Can I migrate data non-disruptively within storage nodes?
- Can I migrate data non-disruptively across storage nodes?
- When I need to do a storage technology refresh, do my applications take any downtime?
- During a so-called non-disruptive migration or upgrade, if something goes wrong how do I recover?



