Introduction
The world is moving to the cloud quickly, and sometimes furiously. Organizations are discovering new ways to achieve their goals, and consumers connect to more cloud services every day.
The reasons are clear:
- Cloud services offer much higher levels of scalability than organizations can achieve on their own.
- The barrier to entry is low since cloud providers already have the infrastructure in place.
- Purchasing is a breeze and can often be done with just a credit card.
At the same time, we’re still in the hype cycle with regard to cloud. It’s not uncommon to see outlandish uptime claims and similar exaggerations bandied about. My focus here, however, is not on cloud providers but on customers of cloud services. As CIOs move deeper into cloud, I believe that some fundamental shifts need to take place. When Amazon or Microsoft goes down, the world notices. Their cloud services are extremely high profile, with large services running on them, and they’re pointed to as examples of the future of IT.
It’s still your problem
You may have heard vendors tell clients to “make it our (the vendor’s) problem!” In the case of cloud services, you may hear talk about making someone else responsible for a particular infrastructure element by outsourcing that element to a provider.
The only problem with that is that it’s still your problem even if you outsource it. No matter who handles a particular task or function, the CIO is still responsible to the business for the successful delivery of whatever service depends on that task or function. If the cloud provider fails in its mission and experiences a significant outage, the business might be able to use the SLA to recover some level of damages, but that doesn’t fix the problem.
The first time a major outage happens, the CIO may be able to point to the provider contract and be protected from the fallout. After all, the contract clearly stipulates five nines of uptime, right? A second outage, however, would probably end worse.
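To put that contract language in perspective, here is a rough sketch of how little downtime each "nines" level actually allows. The function name and the 365-day year are my own assumptions for illustration, not terms taken from any provider's SLA.

```python
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(uptime_percent, days=365):
    """Return the downtime budget, in minutes, for a given uptime percentage.

    `days` is the length of the measurement period (assumed: a 365-day year).
    """
    downtime_fraction = (100 - uptime_percent) / 100
    return downtime_fraction * days * MINUTES_PER_DAY


if __name__ == "__main__":
    for pct in (99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(pct)
        print(f"{pct}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

Five nines leaves a little over five minutes of downtime in a year, so a single outage of an hour or two already puts the provider well outside what the contract promised.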
What if the service was internal?
Imagine that you’re a CIO considering a major move into the cloud. Now, step back and imagine how you would build the same service if you were going to deploy it yourself. If you find that you’re going down a very different architectural road just because you’re going cloud, rethink your strategy.
Suppose you were going to build a new, internal, mission-critical service. You would probably include redundancy and other measures to ensure that the service was highly available, and you would base your availability decisions on the business impact of an outage. As the CIO, you know that you need to make sure that what you build is robust.
Why would that be any different in the cloud? Are you willing to bet your business that your cloud provider’s availability, tactics, and promises will actually work 100% of the time? If availability is truly required, a single cloud provider won’t cut it.
Two points here:
- Build for failure. Never assume that a provider will live up to its 99.999% uptime claims. If it doesn’t, it’s your business that suffers.
- Hope is not a strategy. Don’t hope that providers will live up to their claims. Build around their networks to make sure that your business survives the inevitable.
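The arithmetic behind building around a single provider can be sketched as follows. It assumes that provider outages are independent of one another and that failover between providers works every time, neither of which a real deployment can take for granted. The function name and the 99.9% figures are illustrative only.

```python
from math import prod


def combined_availability(availabilities):
    """Availability of a service that stays up while at least one provider is up.

    Each entry is a fraction between 0 and 1. Assumes independent failures
    and flawless failover (both are simplifying assumptions).
    """
    if not availabilities:
        raise ValueError("at least one provider is required")
    # The service is down only when every provider is down at the same time.
    all_down = prod(1 - a for a in availabilities)
    return 1 - all_down


if __name__ == "__main__":
    print(f"one provider:  {combined_availability([0.999]):.6f}")
    print(f"two providers: {combined_availability([0.999, 0.999]):.6f}")
```

Under those assumptions, two providers at 99.9% each combine to roughly 99.9999%. Any shared dependency between the two would pull the real figure lower, which is one more reason to review the architecture rather than trust the math alone.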
Risk management takes center stage
As companies outsource or move critical services to the cloud, IT’s role naturally changes from an engineering and tactical focus to an oversight and contract management emphasis. That said, organizations need to resist the urge to eliminate internal engineering from the equation, and they must ensure that the outsourced services are architected in a way that meets the company’s risk management needs.
I believe that risk management from a services perspective will be one of the CIO’s top jobs as cloud services continue to gain traction. CIOs who intend to acquire cloud services need to hone their contract negotiation, engineering, and risk management skills in order to:
- Ensure that the terms in contracts written by cloud providers are actually attainable and enforceable. If a provider absolutely, without a doubt guarantees 100% availability, I’d personally be wary and would conduct massive due diligence in an attempt to either confirm or debunk the claim. After all, even Amazon has had issues spanning regions in the past, so a failure of that scale is not unheard of.
- Review provider architecture to make sure that true high availability is achievable. If not, you need the technical ability to design a solution that spans providers.
- Match a solution architecture against the needs of the business with a deep understanding of the business’s pain points and critical functions.
Building services in the cloud in a way that ensures availability even during a provider outage will cost more money, so the decision to deploy services this way should come out of a business impact analysis (BIA). That said, if the BIA indicates that a service absolutely must function, an organization shouldn’t rely on promises alone.
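That trade-off comes down to comparing numbers like the ones below. This is a minimal sketch with made-up figures: the hourly outage cost, the availability levels, and the extra annual spend on redundancy are all hypothetical inputs that would have to come from your own analysis.

```python
HOURS_PER_YEAR = 365 * 24


def expected_outage_cost(availability, cost_per_hour):
    """Expected yearly cost of downtime at a given availability (0 to 1)."""
    return (1 - availability) * HOURS_PER_YEAR * cost_per_hour


def redundancy_pays_off(single, redundant, cost_per_hour, extra_annual_cost):
    """True when the outage cost avoided exceeds the extra yearly spend."""
    avoided = (expected_outage_cost(single, cost_per_hour)
               - expected_outage_cost(redundant, cost_per_hour))
    return avoided > extra_annual_cost


if __name__ == "__main__":
    # Hypothetical service: an outage costs 50,000 per hour, and spanning
    # providers adds 200,000 per year to the bill.
    print(redundancy_pays_off(0.999, 0.999999, 50_000, 200_000))
    # Hypothetical low-impact service: an outage costs 1,000 per hour.
    print(redundancy_pays_off(0.999, 0.999999, 1_000, 200_000))
```

The same extra spend is justified for the first service and not for the second, which is exactly why the business impact should drive the architecture and not the other way around.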
Action Item: Although a cloud provider may be able to do something better, faster, and cheaper than an organization can accomplish on its own, once that service is out of the company’s hands and in the hands of an outsourcer, the organization loses direct control. With that loss of control may come angst when there is an outage and you have no direct ability to stage a recovery. It is for this reason that I believe that organizations making significant cloud investments need to do so in ways that take risk management to levels that may not have been necessary in the past. CIOs need to lead this process to make sure that their companies are receiving the best possible service from providers and are taking the necessary steps to work around potential provider issues in order to protect the bottom line.