Storage Peer Incite: Notes from Wikibon’s May 15, 2012 Research Meeting
In mid-May the Wikibon community hosted John Meyers, Ph.D., Assistant Professor of Medicine, Immunologist, Hematology/Oncology Researcher, and Director of Technology at the Boston University School of Medicine. He spoke about how he implemented an advanced, centralized IT service in a forklift upgrade, and specifically about the key role that data protection startup Actifio plays in that architecture.
The specific situation and the lessons to be learned are explored in depth in the articles in this issue of the Peer Incite Newsletter. However, this Peer Incite also raises a larger issue. Many businesses, particularly large enterprises, reject products from small entrepreneurial vendors out of hand on the grounds that such vendors are too unstable to be trusted long term. While startups may not be as stable day to day as large vendors, this policy can often prevent those large companies from benefiting from disruptive technologies that in some cases can save the company large sums or allow it to enter entirely new, fast-growing markets.
And bleeding-edge startups have a way of being acquired by large vendors, which often works to the advantage of all concerned. Also, large vendors themselves are sometimes not as stable long term as they might appear. Any of us who have been in this industry for a while have seen some major companies come and go. Of the first-generation mainframe companies of the 1950s, only IBM and Unisys remain. Of the minicomputer pioneers of the 1970s, including DEC, Prime, and Data General, only HP is still here.
This does not mean that every startup should be hailed as the savior of IT, nor is it an argument against taking proper precautions when engaging a small vendor. But on the other hand, the multiple waves of disruptive technology that are now rocking IT -- mobile computing, virtualization, cloud services, big data, flash storage -- all started with small entrepreneurs with a vision of something new. Companies that grab these waves first can ride them to a more prosperous future.
Bert Latamore, Editor
Creating a Single Data Repository for Backup and Archive
On May 15, 2012, the Wikibon community held a Peer Incite to discuss the merits and potential risks of creating a single data repository for backup and archive in a shared-services IT environment. We were joined by John Meyers, Ph.D., Assistant Professor of Medicine and Director of Technology for the Department of Medicine at Boston University School of Medicine, a leader in medical education and research.
For most organizations, the path to a shared-services IT environment is both long and incomplete. However, when Dr. Meyers was asked to assume his current role as director of technology for the department, he accepted, with one pre-condition: that he be allowed to replace the entire existing IT infrastructure.
When the department agreed, Dr. Meyers was afforded the unusual opportunity to abandon a seemingly random collection of servers, storage, and data protection solutions, consolidate infrastructure, and serve up compute, networking, and storage requirements from a more limited number of solutions that fit well in a shared-services model. These included Cisco UCS for servers, VMware for server virtualization, HP 3PAR Utility Storage to support storage requirements for virtualized servers, and EMC’s Isilon scale-out NAS for file storage and automated migration and sharing of scientific data and images from research studies. As a result of his clean-slate approach, Dr. Meyers was able to create his shared-services environment in record time.
The consolidation of infrastructure also created the opportunity for Dr. Meyers to change the way the institution approached data backup and archiving. He needed technology that would enable the department to survive everything short of a region-wide physical disaster and would provide affordable and ready access to data in archives. He evaluated consolidated, automated tape solutions from leading tape-automation suppliers, as well as storage-controller-based snapshot and remote-replication technologies from his primary storage suppliers, HP 3PAR and EMC Isilon. However, he was concerned about the feasibility and time requirements of backing up and restoring 300TB of file data. He was equally concerned about the data growth and storage costs that would result from using controller-based snapshot and replication technology.
On the recommendation of a consultant, Dr. Meyers evaluated and chose a new approach, leveraging technology from Actifio, a young but well-funded technology company based in Waltham, Massachusetts, that is focused on reducing the number of copies of primary data.
By applying the Actifio deduplication algorithms across the full complement of data, the department achieved an 85% reduction in storage requirements between the primary storage from HP and EMC and the Actifio-enabled shared repository for backup and archive. Dr. Meyers reported that this has saved his organization more than $1 million. Through Actifio’s application integration, the Department of Medicine can now rapidly restore volumes and files that have been corrupted or lost and can use the same data store to provide access to archives for multiple users.
Action item: As data volumes continue to grow and as data analytics and data re-use become more pervasive, organizations should reconsider the merits of using data and file copies, as copy proliferation may crush budgets and impede efficient operation. As an alternative, companies should evaluate single-repository approaches to managing copy data. There are risks inherent in keeping data in a single, multi-purpose repository, so it is logical to want to replicate the repository. And while that runs counter to the copy-elimination thesis, the cost of replicating a fully deduplicated repository may be trivial compared to keeping a growing number of raw copies. Ultimately, given budget and time constraints and the continued growth in file data, the only real alternative for some companies may be no backup and no archive, which would be substantially less palatable and decidedly riskier.
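The cost argument above can be illustrated with rough arithmetic. The following is a minimal sketch, assuming the 300TB of file data and the 85% reduction cited in this issue apply to the same data set; the number of raw copies and the two-site replication are hypothetical inputs, not figures reported on the Peer Incite.

```python
def repository_tb(primary_tb: float, reduction: float, replicas: int = 2) -> float:
    """Capacity needed for a deduplicated repository kept at `replicas` sites."""
    return primary_tb * (1.0 - reduction) * replicas


def raw_copies_tb(primary_tb: float, copies: int) -> float:
    """Capacity needed for `copies` full, non-deduplicated copies of primary data."""
    return primary_tb * copies


primary = 300.0    # TB of file data, as cited in this issue
reduction = 0.85   # reduction reported by Dr. Meyers
copies = 3         # hypothetical: separate backup, archive, and DR copies

dedup_total = repository_tb(primary, reduction, replicas=2)
raw_total = raw_copies_tb(primary, copies)

print(f"Replicated deduplicated repository: {dedup_total:.0f} TB")
print(f"{copies} raw copies: {raw_total:.0f} TB")
```

Under these assumptions, even a fully replicated repository needs a small fraction of the capacity consumed by independent raw copies, and the gap widens as the number of copies grows.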
Footnotes: Disclosure: After selecting and implementing the Actifio solution, Dr. Meyers was retained as an external advisor to the company and has a financial interest in Actifio.
Getting in Deep with Your Archive and Backup Supplier
Application-consistent snapshots represent perhaps the most time-efficient way to create a virtual backup of applications, application data, and files. Deduplication algorithms applied to the snapshots can substantially reduce storage requirements and make the electronic transmission of data to an off-site disaster recovery facility economically feasible. Automated migration of data to lower-performance, lower-cost storage based upon access frequency can further reduce storage costs.
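As a rough illustration of migration based on access frequency, the sketch below assigns data to a storage tier by how long it has sat idle. The tier names and the 30-day and 180-day thresholds are hypothetical and would be set by each organization; nothing here reflects a specific vendor's policy engine.

```python
from datetime import date


def choose_tier(last_access: date, today: date,
                hot_days: int = 30, warm_days: int = 180) -> str:
    """Pick a storage tier from the number of days since last access.

    The thresholds and tier names are illustrative assumptions.
    """
    idle_days = (today - last_access).days
    if idle_days <= hot_days:
        return "performance"   # recently used: keep on fast storage
    if idle_days <= warm_days:
        return "capacity"      # cooling off: lower-cost disk
    return "archive"           # rarely used: lowest-cost storage


today = date(2012, 5, 15)
for last in (date(2012, 5, 1), date(2012, 1, 1), date(2011, 1, 1)):
    print(last, "->", choose_tier(last, today))
```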
Equally important, application-consistent snapshots provide rapid restoration of data. In the very dynamic world of virtualization, where virtual machines are added, moved, deleted, and sometimes corrupted on the fly, the ability to quickly hit the “undo” button can be critical. For many operating in the world of big data, there simply is no time to use traditional backup or restore processes, and therefore snapshots represent the only real option.
Many storage systems suppliers offer snapshot technology, and some combine snapshots with deduplication and automated data migration for added efficiency. In this manner, they can lower the cost of transmitting data changes to the remote site. But most storage suppliers require the same type of storage at both the production and recovery sites.
Actifio's technology, which was discussed on the May 15, 2012 Peer Incite, is in use at Boston University School of Medicine, where it provides backup functions for HP 3PAR storage, VM images, and EMC Isilon storage, covering both big data and file shares. With this approach, the school can apply deduplication algorithms to a much larger set of data and achieve greater reductions than might be achieved in more-siloed approaches. BU School of Medicine goes one step further by combining backup and archive functions into a single storage-vendor-independent repository. The savings are significant: the school reports achieving as much as an 85% reduction in storage requirements for backup and archive, when compared to the primary storage that is being protected.
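To see why deduplicating across one pool can beat deduplicating within silos, consider the minimal sketch below. It uses fixed-size blocks and SHA-256 digests purely for illustration; the Peer Incite did not describe Actifio's actual algorithm, and the source names and data are hypothetical.

```python
import hashlib


def unique_blocks(data: bytes, block_size: int = 4096) -> set[str]:
    """Return the set of SHA-256 digests of the fixed-size blocks in `data`."""
    return {
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    }


def stored_blocks_siloed(sources: dict[str, bytes]) -> int:
    """Blocks stored when each source deduplicates only against itself."""
    return sum(len(unique_blocks(d)) for d in sources.values())


def stored_blocks_pooled(sources: dict[str, bytes]) -> int:
    """Blocks stored when all sources deduplicate against one shared pool."""
    pool: set[str] = set()
    for d in sources.values():
        pool |= unique_blocks(d)
    return len(pool)


# Hypothetical data: three sources that share eight common blocks
# and each add one block of their own.
base = b"".join(bytes([i]) * 4096 for i in range(8))
sources = {
    "vm_images": base + bytes([100]) * 4096,
    "file_shares": base + bytes([101]) * 4096,
    "block_storage": base + bytes([102]) * 4096,
}

print("siloed:", stored_blocks_siloed(sources))   # 27
print("pooled:", stored_blocks_pooled(sources))   # 11
```

In the siloed case the shared blocks are stored once per source; in the pooled case they are stored once in total, which is the effect the school relies on when it protects all three storage types from one repository.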
Action item: Backup and archive preservation are critical functions for an organization. Combining the two magnifies the importance of supplier due diligence by the customer. CIOs should, of course, pay close attention not only to the robustness of the technology but also to the track record and financials of the company and the experience of its founders. Organizations can control risk by taking a more-measured approach, applying the technology first to less-risky applications and data. Many startups actively seek IT professionals to serve on customer or technology advisory boards, which may provide deeper insight into company strengths and potential risks and more opportunity for input on company priorities. Regardless, organizations need to know the supplier at a deep level if they are going to invest deeply in the technology.
Footnotes: Dr. John Meyers, who was a guest speaker on the May 15, 2012 Peer Incite, serves as an advisor to Actifio and has a financial interest in the company.
Technology - A Tool, a Weapon, or a Vital Resource
When considering any transformational technology, it’s important to first evaluate your organizational culture. Does your organization view information technology as a tool, a weapon, or a vital resource?
Large, well-established, risk-averse organizations naturally gravitate toward proven solutions from large, established suppliers. They are rarely first adopters of technology from startups. They source products from suppliers that have a test lab staffed by more employees than the entire staff of the typical startup. They may also, however, have their own lab to evaluate new technologies that can be transformative to the business two, three, or four years in the future, once an emerging technology matures.
Other organizations view transformational technologies as a weapon to enable them to compete against the Goliaths in their industry. For these organizations, the risk of using new, unproven technology is much less than the risk of remaining uncompetitive against established market leaders, and the advantage of adopting new technology years before their larger competitors is significant.
Finally, some organizations have jobs to do or problems to solve that simply aren’t being served by the established players in the industry. The problem is too large, the customer budget too small, or the disruption to suppliers’ current product strategy and profit model too great to attract the full focus of established market leaders. For these organizations, leading-edge technology from emerging startups becomes a vital resource. Such is the case at BU School of Medicine, which was featured in the May 15, 2012, Peer Incite.
Lest anyone be confused, it is important to understand that the world of academic research is highly competitive. Research institutions and faculty thrive or die based upon their ability to win competitive research grants. BU School of Medicine wanted to transform the way in which the school delivered IT services to its researchers and its clients. The school took a clean-slate approach and turned siloed IT into a cloud service. It broke the model of grant-based research funding by using server and storage virtualization combined with charge-back systems, rather than grant-specific capital equipment purchases, to allocate compute and storage resources.
Central to maximizing efficiency and flexibility was the creation of a shared infrastructure for backup, disaster recovery, and archiving. By being an early adopter of technology from Actifio, BU School of Medicine dramatically lowered the cost of providing these vital services. Just as importantly, Actifio enabled what the school previously considered impossible: the protection of massive repositories of large files from applications such as medical imaging and gene sequencing.
Action item: IT professionals must understand their corporate culture as they evaluate transformative technology from emerging suppliers. Established companies should evaluate early, but adopt carefully, by leveraging new technology in labs or lower-risk application areas. Organizations that are fighting for differentiation against larger competitors need to balance technology risk against the risk of being uncompetitive. The bias should be towards being early adopters. Finally, organizations should recognize that emerging startups may offer the only solution to some problems, particularly when the solution may disrupt the business model of established suppliers.
Footnotes: Dr. John Meyers, who was a guest speaker on the May 15, 2012 Peer Incite, serves as an advisor to Actifio and has a financial interest in the company.
Involve the Business in Writing Data Management Rules
One of the most complex issues in creating a strong automated storage tiering and data archiving system has little to do with technology and everything to do with meeting business needs. That issue is designing a data/document management strategy that protects valuable corporate data, maintains it at the storage tier that best meets business needs, meets compliance requirements including privacy, security, and longevity, and specifies when it should be destroyed.
This is a complex problem that must be driven by business, legal, records management, and security needs rather than technology. For instance, financial industry compliance requires preservation of large amounts of data for several years in a guaranteed unaltered state. This may require writing that data to write-once media for long-term archiving. And while these long-term archives may be off-line, the data cannot be lost. That means that it has to be in a format that can still be read years later and that its physical location must be known. Large financial companies have suffered major losses in court when they have been unable to produce the required data.
In healthcare, HIPAA and similar regulations include strict security and privacy requirements for all personally identifiable information, including special training for all individuals with access to this data. That includes dev and test staff if they work with regulated data.
Even in companies that are not specifically regulated, the loss of customer information damages the company reputation and brand and can cost it business. Addressing this exposure requires involvement of legal, records management, and security experts, either internal or external.
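One way to keep such decisions in the hands of the business is to express them as data that IT merely executes. The sketch below is illustrative only: the record classes, retention periods, and field names are hypothetical and would in practice be supplied by legal, records management, and security experts.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RetentionRule:
    record_class: str               # hypothetical label defined by the business
    retain_years: int               # minimum retention set by legal/compliance
    write_once: bool                # must be kept unaltered (e.g., write-once media)
    destroy_after_retention: bool   # whether to destroy once retention lapses


def disposition(rule: RetentionRule, created: date, today: date) -> str:
    """Return 'retain' or 'destroy' for a record governed by `rule`."""
    age_years = (today - created).days / 365.25
    if age_years < rule.retain_years:
        return "retain"
    return "destroy" if rule.destroy_after_retention else "retain"


# Hypothetical rules written by the business, not by IT.
trade_records = RetentionRule("trade_records", 7, True, True)
research_data = RetentionRule("research_data", 10, False, False)

today = date(2012, 5, 15)
print(disposition(trade_records, date(2005, 1, 1), today))  # destroy
print(disposition(trade_records, date(2010, 1, 1), today))  # retain
print(disposition(research_data, date(2000, 1, 1), today))  # retain
```

The point of the design is the separation of roles: the rules are authored and approved outside IT, while the code that applies them stays generic.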
This is obviously a complex issue in which IT must take the role of implementer rather than designer. It must call in the experts. And implementation will require its own set of experts: data security specialists and a vendor implementation team if the data is to be kept in house, or the service provider(s) if part or all of the system is to be outsourced to the cloud or to traditional service providers.
Action item: Business, legal, records management, security, and other experts must be involved early in the process of designing solutions for data management, including DR and archiving. IT cannot make those decisions on its own, and if the expertise is not available in-house, IT should look outside for help. IT should be the implementer, not the designer.
Storage Vendors Must Prepare for the Disruption from Copy Data Repositories
During the May 15, 2012, Peer Incite call, John Meyers, Ph.D., Assistant Professor of Medicine and Director of Technology for the Department of Medicine at Boston University School of Medicine, discussed the benefits of creating a consolidated backup and archive repository.
Rather than maintaining different solutions for backup and archive, the Department of Medicine serves both backup/restore requirements and archive access requirements from a common, de-duplicated copy-data pool. The Department of Medicine selected this approach, available from Actifio, because it eliminated the multiple, redundant or similar copies of data that would otherwise proliferate across the Department's two data centers.
Dr. Meyers estimates that he has saved $1 million in storage expense and achieved an 85% reduction in storage requirements for backup and archive by eliminating copies of data that would otherwise be maintained on his primary storage: Isilon from EMC and 3PAR from HP. Cost savings were not the only benefit, however, as he has demonstrated the ability to very rapidly restore volumes and files from the consolidated pool.
It is not uncommon for organizations to keep significant numbers of copies of production data to serve not only backup, disaster recovery, and archive requirements, but also application development, testing, and analytic application requirements. In the near term, performance considerations will act as one constraint on the ability of solutions like Actifio's to reduce all copy data to a single repository for all workloads, all applications, and all users. That said, disruption almost always begins from below, and over time such solutions will become a growing threat to the revenue that leading providers of primary storage derive from controller-based replication software and from storage systems that store copies of primary data.
Action item: It is sufficiently early in the technology adoption life cycle that, beyond the account-manager and territory-manager level, today's leading storage systems suppliers will feel little near-term impact from the adoption of solutions such as Actifio's. That said, the time is now for product managers to elevate the importance of product requirements for managing copy data and to give customers visibility into capabilities already in the product development roadmap that will reduce volume and file copies and the resulting storage hardware requirements. Pricing actions and a responsive deal desk will not be sufficient to compete against a starting point that delivers an 85% reduction in copy-data storage requirements.
Organizations Need to Rethink Data and Data Recovery in the Name of Simplicity
During the Wikibon community’s May 15, 2012 Peer Incite discussion, callers learned about the Boston University School of Medicine’s single data repository for backup and archive in a shared-services research IT environment. Our guest speaker was John Meyers, Ph.D., Assistant Professor of Medicine and Director of Technology for the Department of Medicine.
Dr. Meyers began his discussion by outlining a scenario that would be a dream for many CIOs — for all intents and purposes, he was able to begin his effort to transform the School’s research function through a greenfield deployment. He was able to fully replace older hardware, software, and services in favor of something brand new which, from the very beginning, would be sized to meet current research needs and easily scalable to encompass future needs as they arise.
Although not all organizations would be able to undertake an immediate massive infrastructure turnover such as the one described by Dr. Meyers, given the technology lifecycle, such infrastructure migrations can and do happen in a natural way through technological attrition. The end result for the School of Medicine has been the ability to significantly simplify the entire environment, from managing operational data to archiving data to maintaining data in a disaster recovery site. I’ve written previously about the need for a “Simplification Trend”. That is on its way and is necessary in order for IT departments of the future to meet business needs. The undertaking described by Dr. Meyers fits that trend perfectly.
In the spirit of this ongoing move toward simplicity, organizations must be open to the changes needed to make it happen. In the case of Dr. Meyers, that meant jettisoning a lot of older or newly obsolete technology and creating a new paradigm around the new technology. Dr. Meyers had the good fortune to be able to start from scratch and do it better.
For example, the school previously used a mix of traditional backup and recovery mechanisms. With its new combination of EMC Isilon and HP 3PAR, that traditional data protection technology has been replaced with a single-copy backup in a disaster recovery site. This has eliminated the need to offload to tape and to keep multiple copies of data at various locations for resiliency. Dr. Meyers believes that the school saved hundreds of thousands of dollars as a result and, thanks to the solution’s inherent deduplication features and intelligent selection of data to protect, has reduced disaster recovery capacity needs by 85%.
In addition to Isilon and 3PAR, the school also partnered with Actifio, whose product the school considered the only one that could deal with its heterogeneous environment. Bear in mind that Dr. Meyers supports the university's research function, so there is no end to the possible needs.
Thanks to the product’s simplicity, as delivered through its streamlined Adobe AIR interface, Dr. Meyers’ staff was using the tool within a day. Actifio sits in-band on the storage fabric and captures changed blocks on the wire for physical machines. It also has vCenter plug-ins and can use changed block tracking, which makes it very simple to track changes in virtual machines.
The captured information is then used to create deltas of physical and virtual machines. These deltas are combined to create an application-consistent backup. Now, when there is a need to restore either into production or for test/dev purposes, Actifio can simply present this series of deltas as a live LUN to a machine without having to fully restore it. It takes just a couple of seconds to set that up. From there, it’s just a matter of using Storage vMotion to place the system back into production.
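The idea of presenting a series of deltas as a readable volume, without first copying everything back, can be sketched as follows. This is a simplified illustration under assumed data structures (block numbers mapped to contents); it is not Actifio's implementation, and the class and method names are hypothetical.

```python
from typing import Optional


class DeltaChain:
    """A base image plus an ordered list of changed-block deltas."""

    def __init__(self, base: dict[int, bytes]) -> None:
        self.base = base
        self.deltas: list[dict[int, bytes]] = []

    def capture(self, changed: dict[int, bytes]) -> None:
        """Record one set of changed blocks (block number -> contents)."""
        self.deltas.append(dict(changed))

    def read(self, block: int, as_of: Optional[int] = None) -> bytes:
        """Read a block as it stood after the first `as_of` deltas.

        With as_of=None the latest state is returned. The newest matching
        delta wins; blocks never changed fall through to the base image.
        """
        upto = len(self.deltas) if as_of is None else as_of
        for delta in reversed(self.deltas[:upto]):
            if block in delta:
                return delta[block]
        return self.base[block]


chain = DeltaChain({0: b"A", 1: b"B"})
chain.capture({1: b"B1"})   # first capture: block 1 changed
chain.capture({0: b"A2"})   # second capture: block 0 changed

print(chain.read(0))            # b'A2' (latest state)
print(chain.read(1))            # b'B1'
print(chain.read(1, as_of=0))   # b'B'  (state before any delta)
```

Because reads are resolved on demand, a point-in-time view is available as soon as the chain is mounted, which is consistent with the near-instant setup described above.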
While the cost and space savings are impressive, there is more to the story. Now, Dr. Meyers and his staff can focus on a single solution and do it well. No longer do they need to worry about multiple data protection mechanisms and hope that everything works. No longer do they worry about losing years of research because someone lost the external hard drive on which it was stored. Instead, they can focus on making things better and extending their new service to even more clients.
Further, the organization can eliminate a number of software licenses, eliminate slow and inefficient tape drives, and eliminate the stale and outdated processes that accompanied the legacy infrastructure.
Action item: Any organization that wants to simplify IT needs to consider solutions such as Actifio. Dr. Meyers’ approach demonstrates the need for organizations to have a master plan for technology that outlines the eventual vision and the steps that it will take to get there. This plan should include the technologies that can then be eliminated in the name of simplicity and direct cost savings.