Over half of outages are avoidable

LogicMonitor’s 2019 IT Outage Impact Study finds that 38% of IT decision makers in the UK expect to experience an outage so severe it will make national media headlines, with 35% expecting job losses as a result.

Wednesday, 25th September 2019, by Phil Alsop

LogicMonitor, a SaaS-based performance monitoring platform for enterprise IT and service providers, has released the results of a new study of 300 IT decision makers, including 100 based in the UK. The 2019 IT Outage Impact Study examines the impact that infrastructure and software brownouts and outages have on organisations, and whether such events are preventable. The survey found that although performance and availability (the state in which an organisation’s IT infrastructure is functioning properly) are the top two concerns of IT teams worldwide, organisations are still plagued by frequent brownouts (where infrastructure or software performs at a degraded level) and outright outages.

“IT availability has become one of the business world’s most valuable commodities, but also the most difficult to maintain. Organisations today are increasingly dependent on the availability of their IT infrastructure,” said Gadi Oren, Vice President of Technology Evangelism of LogicMonitor. “A single IT outage can have huge negative business impacts including lost revenue and compliance failure, as well as decreased customer satisfaction and a tarnished brand reputation. Comprehensively monitoring IT infrastructure is key in detecting the early warning signs of impending IT outages and acting in real-time to course-correct before it’s too late.”

IT Downtime Is Expensive

The cost of even an hour of downtime can be staggeringly high, depending on the organisation. Global companies that have frequent outages and brownouts experience up to 16 times higher costs when mitigating and recovering from downtime than companies that have fewer instances of downtime. The “big six” costs identified by respondents were:

Lost revenue
Lost productivity
Compliance costs
Mitigation costs
Damage to the brand
Lowered stock price

IT Availability Matters

80% of global survey respondents report that the performance and availability of their IT infrastructure tops their list of concerns. In fact, availability was considered more important than security and cost-effectiveness, which ranked third and fourth respectively. A DevOps engineer for a technology integration and management company said, “We support finance clients that deal with microtransactions against the open market, so an outage or even a loss of connectivity to the stock exchange can quickly equate to lost dollars, and they hold us accountable for that.”

IT Downtime Is Rampant

The typical global organisation surveyed experienced five outages and five brownouts within the past three years. In the UK, 30% of companies surveyed suffered 10 or more outages within the past three years, and 37% suffered 10 or more brownouts. Although unified monitoring technologies exist to help mitigate these issues, IT leaders are surprisingly pessimistic about their ability to avoid outages and brownouts. 38% of UK IT leaders say they worry about experiencing an IT brownout or outage so severe that it makes national media headlines. When such an event does happen, 35% fully expect someone to lose their job, possibly themselves.

Causes of IT Downtime

Survey participants report that the most common causes of disruptive downtime, which pose a threat to their key priorities of performance and availability, include:

Network failure
Software malfunction
Usage spikes/surges
Third-party provider outages
Human error
Configuration error

Survey respondents in the UK also reported that 53% of outages and 53% of brownouts are avoidable. The top two missed opportunities to avoid downtime globally are:

Failing to notice when usage is trending towards a danger level. For example, this might be more traffic than the network can efficiently handle, or it might be a primary storage share running out of space.
Failing to notice that critical hardware (or software) performance is trending steadily downward.
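
Both missed opportunities come down to spotting a trend before it crosses a limit. As a rough illustration only, and not something taken from the LogicMonitor study or product, the Python sketch below fits a straight line to recent usage samples and estimates how long remains before usage reaches a danger level. The function name hours_until_threshold, the 90% threshold and the sample figures are all hypothetical, and real usage rarely grows in a perfectly straight line.

```python
from typing import Optional, Sequence


def hours_until_threshold(
    hours: Sequence[float],
    usage_pct: Sequence[float],
    threshold_pct: float = 90.0,  # hypothetical danger level, not from the study
) -> Optional[float]:
    """Estimate hours remaining, after the last sample, until usage hits the threshold.

    Fits an ordinary least-squares line to (hours, usage_pct) samples.
    Returns None if usage is flat or falling, and 0.0 if the fitted line
    has already reached the threshold by the time of the last sample.
    """
    if len(hours) != len(usage_pct) or len(hours) < 2:
        raise ValueError("need at least two paired samples")

    n = len(hours)
    mean_t = sum(hours) / n
    mean_u = sum(usage_pct) / n

    # Spread of the timestamps; zero means every sample has the same time.
    var_t = sum((t - mean_t) ** 2 for t in hours)
    if var_t == 0:
        raise ValueError("timestamps must not all be identical")

    # Least-squares slope (percentage points per hour) and intercept.
    slope = sum(
        (t - mean_t) * (u - mean_u) for t, u in zip(hours, usage_pct)
    ) / var_t
    intercept = mean_u - slope * mean_t

    if slope <= 0:
        return None  # no upward trend, so no projected crossing

    # Time at which the fitted line reaches the threshold.
    crossing = (threshold_pct - intercept) / slope
    return max(0.0, crossing - hours[-1])
```

For example, a storage share sampled once a day at 70%, 72%, 74% and 76% full (hours 0, 24, 48 and 72) is projected to reach 90% in roughly 168 hours, or about a week, which leaves time to act before it becomes an outage. The same approach could be pointed at a performance figure that worsens as it rises, such as response time, to catch a steady downward trend in hardware or software performance.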