Making observability the backbone of digital resilience

By Bob Wambach, VP, Portfolio & Strategy, Dynatrace.

Tuesday, 2nd September 2025. Posted by Phil Alsop.

Software failures happen. The difference between a brief downtime and a prolonged outage lies in how quickly and effectively organisations can detect, diagnose and recover. Traditional monitoring, often fragmented across tools, data and teams, falls short in this high-stakes environment.

 

To build and maintain resilient, high-performing software, organisations need deep, AI-powered end-to-end observability that provides a unified and consistent view across the entire digital ecosystem. As enterprise environments grow more complex with cloud-native architectures, multi-cloud infrastructure, APIs and agentic AI, visibility becomes harder to achieve. These layered dynamics introduce blind spots that make managing risk, performance and resilience at scale more complex than ever.

 

Uncovering weak links in modern software stacks 

 

Today’s enterprises rely on a vast ecosystem of interconnected technologies. A single misconfigured update or a vulnerability in a widely deployed third-party agent can cascade across systems at machine speed, impacting customer experience, operations and, ultimately, business continuity.

 

Research shows that 42% of organisations anticipate experiencing an incident caused by one of their suppliers. Too often, teams are left flying blind when something goes wrong, which is both frustrating and costly. To operate with confidence, businesses must see across their entire digital supply chain, something basic monitoring has proven unable to deliver. Unlike traditional monitoring, which often focuses on siloed metrics or alerts, modern observability provides a unified, real-time view across the entire technology stack, enabling faster, data-driven decisions at scale. Real-time, AI-powered observability covers every component, from infrastructure and services to applications and user experience.

 

Observability is a business imperative  

 

End-to-end observability is evolving beyond its current role in IT and DevOps to become a foundational element of modern business strategy. In doing so, observability plays a critical role in managing risk, maintaining uptime and safeguarding digital trust.  

 

Observability also enables organisations to proactively detect anomalies before they escalate into outages, quickly pinpoint root causes across complex, distributed systems and automate response actions to reduce mean time to resolution (MTTR).  The result is faster, smarter and more resilient operations, giving teams the confidence to innovate without compromising system stability, a critical advantage in a world where digital resilience and speed must go hand in hand. 
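
As a rough illustration of the MTTR metric mentioned above, the short Python sketch below averages the time between detection and resolution across a set of incidents. The incident timestamps and the function name are hypothetical, chosen only for illustration; a real platform would draw these values from its own incident records.

```python
from datetime import datetime, timedelta


def mean_time_to_resolution(incidents):
    """Return the average of (resolved - detected) over all incidents.

    `incidents` is a list of (detected_at, resolved_at) datetime pairs.
    This shape is an assumption made for the example.
    """
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum(
        (resolved - detected for detected, resolved in incidents),
        timedelta(),  # start value so that timedeltas can be summed
    )
    return total / len(incidents)


# Hypothetical incidents: one resolved in 30 minutes, one in 90 minutes.
incidents = [
    (datetime(2025, 9, 1, 10, 0), datetime(2025, 9, 1, 10, 30)),
    (datetime(2025, 9, 1, 14, 0), datetime(2025, 9, 1, 15, 30)),
]
mttr = mean_time_to_resolution(incidents)
print(mttr)
```

Tracking this figure over time is one simple way to check whether faster detection and automated responses are actually shortening recovery.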

 

Turning complexity into your greatest strength  

 

Resilient systems must absorb shocks without breaking. This requires both cultural and technical investment, from embracing shared accountability across teams to adopting modern deployment strategies like canary releases, blue/green rollouts and feature flagging.  

 

These strategies only work if teams have real-time feedback and clarity, so that organisations can understand what’s happening, why it is happening and what to do about it before customers ever notice a disruption.
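
To make the link between deployment strategy and real-time feedback concrete, here is a minimal Python sketch of a canary release gate. It assumes error and request counts are available for both the stable baseline and the canary; the function name, thresholds and return values are hypothetical and not taken from any particular product.

```python
def canary_decision(
    baseline_errors,
    baseline_requests,
    canary_errors,
    canary_requests,
    max_ratio=1.5,       # assumed tolerance: canary may be up to 1.5x worse
    min_requests=100,    # assumed minimum traffic before judging the canary
    floor_rate=0.001,    # assumed floor so a zero-error baseline is not unforgiving
):
    """Return 'wait', 'promote' or 'rollback' for a canary release."""
    if canary_requests < min_requests:
        return "wait"  # not enough traffic to judge yet

    baseline_rate = baseline_errors / baseline_requests if baseline_requests else 0.0
    canary_rate = canary_errors / canary_requests

    # Allow the canary a bounded degradation relative to the baseline.
    allowed_rate = max(baseline_rate * max_ratio, floor_rate)
    return "promote" if canary_rate <= allowed_rate else "rollback"


print(canary_decision(10, 10_000, 1, 1_000))   # canary matches baseline
print(canary_decision(10, 10_000, 5, 1_000))   # canary is five times worse
print(canary_decision(10, 10_000, 0, 50))      # too little canary traffic
```

The point of the sketch is that the gate is only as good as the telemetry feeding it: without timely, trustworthy error data, neither promotion nor rollback can be decided safely.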

 

 

A new layer of complexity: agentic AI  

 

As organisations increasingly adopt generative and agentic AI to accelerate innovation, they also expose themselves to new kinds of risks. Agentic AI can be configured to act independently, making changes, triggering workflows or even deploying code without direct human involvement. This level of autonomy can boost productivity, but it also introduces serious challenges beyond the obvious hallucinations associated with generative AI. 

 

For example, a misconfigured agent or a malicious prompt can have far-reaching downstream consequences. Small ripples can become waves that spread faster and more broadly and are harder to contain. Real-time, AI-driven observability platforms are essential, not just for monitoring what the agents do, but for understanding how they act, how they interact with other systems and when intervention is needed. Observability helps safely harness the potential of agentic AI and pave the way toward autonomous operations.

 

Building resilience for the next outage  

 

The leaders of tomorrow will be defined not merely by their adoption of technologies such as agentic AI, but by how effectively they manage the complexity and risks these innovations introduce. Thriving in this new era requires a shift from reactive operations to proactive and preventative strategies. 

 

Real-time, AI-driven observability enables this transformation by automating intelligent responses without the need for manual intervention. It does more than prepare organisations for the next disruption; it establishes a foundation of trust, agility and ongoing innovation. In a world where resilience, speed and transparency are critical to success, observability is no longer just a technical solution but a strategic advantage.
