Tackling cloud complexity with the right tools

Securing the right observability tools will help enterprises get control over their software and spot problems before they become a more widespread issue. By Stela Udovivic, Director of Product Marketing, VMware Tanzu

  • Tuesday, 3rd August 2021. Posted by Phil Alsop

Cloud adoption has soared in recent years with 92 percent of enterprises deploying a multi-cloud strategy, and 82 percent a hybrid cloud strategy. Change has largely been driven by businesses seeking to cut costs, deliver and scale operations faster, and offer more competitive products and services. As IT teams have accelerated their use of modern cloud applications, managing and running them has become an increasingly complex task. New research we conducted shows 86 percent of software engineers believe modern cloud applications are significantly more complex than just five years ago. Another 84 percent report using hundreds or even thousands of compute instances across a single organisation, with many containers and microservices in play.

For IT teams to keep on top of their applications, they must be able to monitor and troubleshoot any issues within this complex software landscape. However, only 16 percent of the IT practitioners we questioned are using modern observability tools. The good news is that this proportion is set to rise, with a third (34 percent) planning to implement observability tools in the next six months.

With observability on the rise, what are the ‘red flags’ companies must consider for implementation?

Why monitoring tools need to evolve

Cloud complexity means application requests, in the context of distributed tracing, are now going through dozens of technologies including many third-party APIs, cloud databases, and digital queues.

Nearly half of IT practitioners state application requests touch over 25 different technologies, and at the top end, that number can reach over 100. Across this dense software application delivery, troubleshooting application issues has become even more demanding, and the risk of bottlenecks has increased.

More complex environments, with distributed applications that are updated frequently, mean teams must act fast when issues arise. With so many endpoints to monitor, tools must be able to scale to avoid incomplete coverage. However, many enterprises use too many tools for different elements of the cloud, creating multiple disparate systems. Half of the respondents in our survey stated they use more than five monitoring tools across their apps and infrastructure, from traditional-style logging and NPM tools to siloed container tools. And the larger the enterprise, the bigger this issue becomes.

Drawbacks include absent or siloed visibility, which prolongs troubleshooting and makes it difficult to maintain existing tooling. More than ever, enterprises need to secure the complete picture, and better observability holds the key to managing this effectively.

Red flags to look out for

Although different business models have different demands, there are several red flags teams can look for when considering observability.

Companies with ‘tech DNA’ – software companies – have already adopted DevOps processes and automated deployments in their operations, so should find these red flags easy to spot. For more traditional companies moving to the cloud, it’s important they build these factors into observability strategies from the get-go.

Firstly, teams must carefully assess the kinds of performance metrics they need to collect, and what they must do to measure these at high granularity. Some observability platforms collect lots of metrics and offer flexibility in fine-tuning collection, so only the details important to the business are shown.
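As a rough illustration of what fine-tuning collection can mean in practice, the Python sketch below keeps only metrics whose names match a chosen set of prefixes and thins each series to a minimum sampling interval. The names `MetricPoint` and `select_metrics`, and the example metric names, are hypothetical and not taken from any particular observability platform.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class MetricPoint:
    name: str         # e.g. "checkout.latency_ms" (illustrative name)
    value: float
    timestamp: float  # seconds since some reference point


def select_metrics(
    points: Iterable[MetricPoint],
    allowed_prefixes: List[str],
    min_interval_s: float,
) -> List[MetricPoint]:
    """Keep points whose name starts with an allowed prefix, and drop any
    point that arrives sooner than min_interval_s after the last kept point
    of the same series."""
    prefixes = tuple(allowed_prefixes)
    last_kept: Dict[str, float] = {}
    kept: List[MetricPoint] = []
    for point in sorted(points, key=lambda p: p.timestamp):
        if not point.name.startswith(prefixes):
            continue  # not a metric the business has chosen to track
        previous = last_kept.get(point.name)
        if previous is not None and point.timestamp - previous < min_interval_s:
            continue  # finer granularity than requested, so skip it
        last_kept[point.name] = point.timestamp
        kept.append(point)
    return kept
```

Lowering `min_interval_s` raises granularity at the price of more stored data, which is the trade-off teams would need to weigh for each metric.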

Secondly, teams should consider which observability platform will integrate well with existing tooling. Open-source monitoring is very popular and will require integration. Our research found easy integration with open-source and existing tooling is essential for half of respondents.

Last but by no means least, teams should consider the cost of the un-optimised infrastructure needed to run and maintain open-source observability platforms. It’s vital to factor in such costs so that companies aren’t left with any unexpected bills to pay.

What next?

With observability on the rise, it’s important to take stock of what comes next for IT practitioners and DevOps teams. Once observability is in place, linking it to business outcomes will follow.

For organisations with operating models based around applications, tracking data insights for better decisions provides the ability to spot operating difficulties and reduce the risk of revenue loss through poor cloud service availability. Sub-optimal performance or critical delays on an application can translate into a direct loss of business – if on a Friday evening you can’t order a taxi via a ride-sharing app, there’s a direct revenue impact for the sales team involved.

But with observability, a ride-sharing service can understand its business better by observing cloud service metrics, which make clear when its service is used the most, how many customers wait for rides, and how many drivers are on the streets during peak hours. In addition to observability being important to engineering teams, a company could provide its executives with access to the tools, to better showcase how the business is performing based on both real-time and historical insights.

Another impact of greater observability uptake will be a rise in self-healing. If you instrument your observability and CI/CD properly, by the time SRE teams react, the alert is probably already resolved. Similarly, implementations of automated technologies, such as AI/ML, will only increase. For example, ‘surfacing’ is powered by AI and helps automatically spot problems for DevOps teams to troubleshoot – particularly important when you look at highly distributed environments with millions of traces and metrics.
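One way to picture self-healing is an alert handler that tries a known automated fix before anyone is paged. The Python sketch below is a minimal illustration under that assumption. The function `handle_alert`, the alert names and the remediation callables are all hypothetical, and a real setup would wire these into whatever alerting and CI/CD tooling is already in place.

```python
from typing import Callable, Dict


def handle_alert(
    alert_name: str,
    remediations: Dict[str, Callable[[], bool]],
    page: Callable[[str], None],
) -> str:
    """Try an automated fix for the alert; page a human only if none
    exists or the fix does not succeed."""
    action = remediations.get(alert_name)
    if action is not None:
        try:
            if action():
                return "resolved"  # fixed before anyone had to react
        except Exception:
            pass  # a failing remediation falls through to paging
    page(alert_name)
    return "paged"


# Illustrative usage with stand-in remediation and paging functions.
paged_alerts = []
remediations = {"high_memory": lambda: True}  # pretend a restart succeeded
status_known = handle_alert("high_memory", remediations, paged_alerts.append)
status_unknown = handle_alert("disk_full", remediations, paged_alerts.append)
```

Here the known alert is resolved automatically, while the unrecognised one still reaches the on-call team.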

Finally, as with lots of software technology at the moment, we’ll see an intersection between observability and DevSecOps. As developers start adding observability into applications early, they’ll need to define which metrics to use, and code must be secure. The challenge for teams is to prevent vulnerabilities from being introduced, both during development and eventual deployment, and to validate that every container deployed to production is compliant and secure.

Final thoughts

It’s an interesting time for software businesses looking to scale and secure their application delivery chains in increasingly complex cloud environments. The more sophisticated applications become, the greater the risk that prolonged troubleshooting will mean delays for customers – and a bottom-line impact for sales.

Securing the right observability tools will help enterprises get control over their software and spot problems before they become a more widespread issue. This will only improve communication across the cloud network and boost user experience at the other end of the chain for customers – a win-win scenario for everyone involved.