This is a community project initiated by Wikibon members and other members of the storage community. We welcome your participation. Please weigh in through the comments section or feel free to log in and hit 'edit.'
Collaboration is a basic philosophy of Wikibon and any content developed here is made freely available under an open source (GNU) license. The content is free of charge, and you are free to use it for either commercial or non-commercial purposes.
Collaborators will be invited to participate in customer meetings and other interactions to develop, vet and evolve the information. Your contributions will be cited in the Wiki and in other distributions of this material. Thanks for participating.
Research Objectives
- Set forth a premise as to how virtualization and cloud computing change the way data is protected
- Identify the key drivers and inhibitors of change
- Articulate the key planning issues users face in implementing data protection in the context of cloud computing
- Propose a five-year technology roadmap
- Provide recommendations and planning assumptions for users
- Provide a framework against which the vendor community can deliver value to customers
Working Premise
- Backup as we know it is broken (or at least inefficient) and will evolve over time to become something very different from what it is today (see footnote)
- True data assurance requires:
- Real time data fluidity where the state of data is preserved
- A new definition of "Data" where data becomes the machine state and persistent data (on disk) at a pre-defined point in time
- This new data moves seamlessly throughout the data center, the enterprise and ultimately in the cloud
- Continuous Data Protection (CDP) becomes the standard of measurement by which all Recovery Time Objectives (RTO) are measured
- As data moves throughout systems and networks, there are opportunities to:
- Reduce the data - compression and deduplication
- Protect data
- Archive data
- Perform analytics on data
- Create, extract, leverage metadata
- Data recovery becomes more about your data, where you want it, when you need it
- Recovery granularity drives the definition of RPO and RTO (transaction, item (file, email), application, system, data center)
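The "reduce the data" opportunity above (compression and deduplication) can be illustrated with a minimal sketch. It assumes fixed-size blocks, SHA-256 content hashes and zlib compression; the block size, the function names and the in-memory `store` dictionary are all hypothetical choices made for illustration, not a description of any vendor's implementation.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # hypothetical fixed block size, chosen only for illustration


def dedupe_and_compress(data: bytes, store: dict) -> list:
    """Split data into fixed-size blocks and keep each unique block once.

    Unique blocks are stored compressed, keyed by their SHA-256 digest.
    The returned list of digests (the "recipe") is enough to rebuild the data.
    """
    recipe = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:
            # First time this content is seen: compress and store it.
            store[digest] = zlib.compress(block)
        recipe.append(digest)
    return recipe


def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original data by decompressing blocks in recipe order."""
    return b"".join(zlib.decompress(store[digest]) for digest in recipe)


if __name__ == "__main__":
    store = {}
    data = b"A" * 8192 + b"B" * 4096 + b"A" * 4096
    recipe = dedupe_and_compress(data, store)
    print(len(recipe), "blocks referenced,", len(store), "blocks stored")
    assert restore(recipe, store) == data
```

Products often use variable-size chunking and persistent indexes rather than a fixed block size and a dictionary; the sketch only shows why repeated content costs one stored copy plus a reference.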
Key Issues
(i.e. member questions)
- What's driving users to disk-based data protection?
- How is data growth changing the way IT protects data?
- How will the role of snapshots evolve in the future of data protection?
- What is the impact of virtualization and cloud computing on the future of data protection?
- How are virtualization and cloud computing changing RPO and RTO?
- How will restore change in the future?
- Which snapshot technologies fit best with virtualization and the cloud - what are the main considerations for users?
- What is the future of data protection software?
- How will advancements such as VADP and CBT evolve and what impact will this have on traditional data protection?
- What is the future of data protection hardware?
- What is the future of global/unified data protection management?
- Is tape still viable? What are the use cases?
- How will technologies such as compression and data deduplication evolve?
- How will virtualization and cloud computing change disaster recovery?
- Where will SSD and Flash fit in the data protection cycle of life?
- What is the role of data dispersal?
The Market
Segments, players, growth, context
Pyramid - Data Protection and Management
Top of the pyramid - Snapshots, Clones, CDP, Synch recovery, asynch recovery
Middle of pyramid - "Active" data protection software (w/ and w/o deduplication), data protection HW w/deduplication, Virtual Tape Libraries (VTL)
Bottom of pyramid - "Long term" data preservation - Tape
Tangential to the pyramid - Archive (software and hardware) with segments for regulatory compliance, business governance, data preservation
Key Drivers and Inhibitors
Drivers:
- Shorter RTO and RPO requirements
- Virtualization capability to enable a cheaper RTO
- Increased drive for accountability, transparency, information governance, compliance and risk management (archive)
- Unrelenting data growth (IDC Digital Universe)
- Flattening of the globe
- The "new data availability reality" - machine state vs just persistent disk
- SSD/Flash
- Ubiquity of networks
- New economics of networks
- Economics and agility of cloud
- On demand
- Self-service/rapid provisioning
- XaaS
- Chargebacks/showbacks
- The ascendancy of cloud service providers
- Data mobility
Inhibitors:
- Virtualization brings costs and complexities
- The ability to virtualize tier 1&2 apps - ecosystem maturity
- How is the cloud protected? At what costs?
- External cloud performance
- Bottlenecks - people, network, storage, restore complexities, consistency of data
- Information risk and government chaos and stasis
- Lack of automated data classification and policy management
- People, process and technology barriers
- Lack of unified data protection management
- Old data reality (inertia)
- Cloud security, privacy, interoperability, network performance
Planning Assumptions
Address the key issues in depth
Example - What is the impact of virtualization and cloud computing?
- Live vs non-live 'state'
- Enables shorter RTOs
- Creates complexities - machine and data states are not containerized (today) - storage persistence
- Assumption - files become more prevalent - sharing - storage fails less frequently than servers and applications
- Assumption - I/O becomes the bottleneck requiring the leverage of tightly integrated storage optimization technologies (compression, data deduplication)
- VAAI and VADP and CBT become more important
- Source side deduplication and traditional data protection software technologies collide
- New data reality becomes the mandate - (Data Garden)
- Orchestration software required - autoprovisioning, chargeback, etc
- The role of snapshots
- Assumption - 70% of shops will re-architect backup by 2014
- Private v public cloud
- Data centers 'competing' with cloud service providers
- Facilitates elimination of the notion of a backup window for applications
- Multiple copies live in the cloud
- Technology makes it efficient to store
- Assumption: technologies like compression and dedupe will be embedded
- Virtual server and application awareness
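The idea behind changed block tracking (CBT) in the assumptions above can be sketched as follows: only blocks that differ from the previous recovery point are copied, which is what shrinks the backup window. This is a simplified approximation. A hypervisor tracks changes as writes occur, whereas this sketch compares block hashes after the fact; the block size and all function names are hypothetical and this is not the VADP API.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


def block_digests(image: bytes) -> list:
    """Return one SHA-256 digest per fixed-size block of a disk image."""
    return [
        hashlib.sha256(image[i:i + BLOCK_SIZE]).digest()
        for i in range(0, len(image), BLOCK_SIZE)
    ]


def changed_blocks(previous_digests: list, current_image: bytes) -> dict:
    """Return {block_index: block_bytes} for blocks that are new or changed."""
    changes = {}
    for index, digest in enumerate(block_digests(current_image)):
        if index >= len(previous_digests) or previous_digests[index] != digest:
            start = index * BLOCK_SIZE
            changes[index] = current_image[start:start + BLOCK_SIZE]
    return changes


def apply_changes(base_image: bytes, changes: dict) -> bytes:
    """Rebuild the current image from a base copy plus the changed blocks.

    Assumes the image only stays the same size or grows; shrinking images
    are not handled in this sketch.
    """
    blocks = [
        base_image[i:i + BLOCK_SIZE]
        for i in range(0, len(base_image), BLOCK_SIZE)
    ]
    for index in sorted(changes):
        if index < len(blocks):
            blocks[index] = changes[index]
        else:
            blocks.append(changes[index])
    return b"".join(blocks)


if __name__ == "__main__":
    base = b"a" * (BLOCK_SIZE * 3)
    current = base[:BLOCK_SIZE] + b"b" * BLOCK_SIZE + base[2 * BLOCK_SIZE:]
    delta = changed_blocks(block_digests(base), current)
    print("blocks to copy:", sorted(delta))
    assert apply_changes(base, delta) == current
```

In this example only one of three blocks is transferred, which is the same effect that makes incremental, block-level protection attractive when I/O is the bottleneck.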
Technology Roadmap
Today: Stovepipe technologies, processes, point products/feature products
Near-term: Unified hardware and software and simplified management
Mid-term: Data protection as a service - b/u, cdp, replication
Long-term: Time machine for the enterprise - data center 2015 - diagram - moving data and applications at distance and pointing to a virtual snapshot copy for recovery.
Recommendations
Action Item:
Footnotes: Why Storage is Broken and The Future of Tape! from Fadi Albatal