Cloud outages highlight need for organisations to start designing for failure

Organisations should ‘design for failure’ to prevent outages.

Tuesday, 28th March 2017, by Phil Alsop
The recent Amazon Web Services (AWS) outage, which impacted a number of high-profile websites and service providers, has highlighted the specialist skills needed to support public cloud services. This is according to Radek Dymacz, head of R&D at disaster recovery and AWS consulting partner Databarracks, who states that organisations should adopt a ‘design for failure’ approach to prevent outages.
 
Gartner has forecast that the worldwide public cloud services market will grow 18 per cent in 2017, while research from the Cloud Industry Forum (CIF) shows that the overall cloud adoption rate in the UK now stands at 88 per cent. For Radek, this growth has contributed to a shift in the cloud marketplace.
 
He explains: “The growth of hyperscale cloud services has led to an increase in managed services for these clouds. We have seen telecoms providers, data centre owners and managed service providers launch their own cloud services and, in many cases, pull out of the market. Many of these businesses are now focusing their efforts on providing managed services for the hyperscale public clouds of AWS, Azure and Google. However, platforms like AWS need a different approach to traditional hosting.
 
“The ability to design for failure is essential to the value proposition of public cloud platforms, yet organisations are still consuming AWS services as though they were building a traditional hosting environment. The great strength of platforms like AWS is that you can build in resilience in a way that scales with your budget. At the larger end of the spectrum, this might mean using object storage across multiple Availability Zones, and even Regions, to provide an extra layer of resilience. This is expensive but, for large organisations, so is downtime. We recommend that all organisations adopt a ‘design for failure’ approach: if any single element fails, there is an easily identifiable cause and a known resolution.
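To make the multi-Region point concrete, the Python (boto3) sketch below shows one common way to replicate object storage to a bucket in a second Region. The bucket names and IAM role are hypothetical, and this is an illustrative pattern rather than Databarracks’ own design.

```python
import boto3

# Hypothetical bucket names and IAM role - substitute your own resources.
SOURCE_BUCKET = "example-app-data-eu-west-1"
REPLICA_BUCKET = "example-app-data-eu-central-1"
REPLICATION_ROLE_ARN = "arn:aws:iam::123456789012:role/example-s3-replication-role"

s3 = boto3.client("s3", region_name="eu-west-1")

# Cross-region replication requires versioning on the source bucket
# (the replica bucket, not shown here, must also be versioned).
s3.put_bucket_versioning(
    Bucket=SOURCE_BUCKET,
    VersioningConfiguration={"Status": "Enabled"},
)

# Replicate every new object to a bucket in a second Region, so a
# Regional outage does not take the only copy of the data with it.
s3.put_bucket_replication(
    Bucket=SOURCE_BUCKET,
    ReplicationConfiguration={
        "Role": REPLICATION_ROLE_ARN,
        "Rules": [
            {
                "ID": "replicate-all-objects",
                "Status": "Enabled",
                "Prefix": "",
                "Destination": {"Bucket": f"arn:aws:s3:::{REPLICA_BUCKET}"},
            }
        ],
    },
)
```

Within a single Region, S3 already spreads objects across multiple Availability Zones; the cross-Region copy is the extra, budget-dependent layer of resilience Radek describes.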
 
“What we’re seeing in customer demand reflects this trend, as businesses are now more mature in their use of cloud services. They have moved beyond testing and are seeking help to increase resilience, optimise costs and provide round-the-clock support. When looking for that support, organisations must select a supplier with genuine expertise rather than a cowboy. One trick is to listen to the naming conventions a supplier uses; it is a surprisingly effective way to identify people who have not changed their approach to infrastructure. For example, consultants with little experience of the AWS ecosystem will talk about a ‘server’ rather than an ‘instance’.
 
“Also, don’t be fooled by brand champions; almost anyone can pay their way through certification with AWS or Azure, so you should always ask your provider how long they have been working with their chosen platform and ask to see multiple, specific case studies. This should help you find an experienced public cloud provider, but if in doubt, always opt for shorter contracts.
 
“Although launching services in AWS is simple, it’s maintaining them that requires a highly specialised skillset. Working with a demonstrably experienced AWS provider typically involves redesigning the way your applications work, specifically around decoupling services and resources. But there’s a huge grey area between infrastructure, resource provision, application functionality and service delivery. You should therefore always choose a provider who has the developers and a support team to occupy this grey area, and who can work collaboratively with you to keep things running,” concludes Radek.
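As a rough illustration of the decoupling Radek describes, the hypothetical Python (boto3) sketch below places a queue between a front-end producer and a back-end worker, so that a failure or slowdown on one side does not immediately take out the other. The queue name and message content are assumptions for the example, not Databarracks’ implementation.

```python
import boto3

# Hypothetical queue name; the queue sits between the web tier and the
# worker tier so that each side can fail or scale independently.
QUEUE_NAME = "example-order-processing"

sqs = boto3.client("sqs", region_name="eu-west-1")
queue_url = sqs.create_queue(QueueName=QUEUE_NAME)["QueueUrl"]

# Producer: the front end only has to enqueue work, not reach the worker.
sqs.send_message(QueueUrl=queue_url, MessageBody='{"order_id": 42}')

# Consumer: workers poll at their own pace; if they are down, messages
# simply wait on the queue instead of being lost.
response = sqs.receive_message(
    QueueUrl=queue_url,
    MaxNumberOfMessages=1,
    WaitTimeSeconds=10,
)
for message in response.get("Messages", []):
    print("processing", message["Body"])
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
```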