The importance of visibility in AIOps for digital transformation

The concept of digital transformation is a highly complex one. It means many things to many people, and involves a variety of complex environments and infrastructures, from hybrid to multi-cloud, containers and more. Doing this at scale requires automation. To have any hope of managing digital transformation properly, forward-thinking organisations are also looking to machine learning and AIOps. By Paul Barrett, Chief Technology Officer, Enterprise, NETSCOUT

  • Tuesday, 3rd August 2021. Posted by Phil Alsop

AIOps, however, is not a panacea for all of an organisation’s digital transformation ills, and what’s crucial to making it work as intended is the concept of observability, which I like to think of as the degree to which you can understand the internal state of a system. 

Good AIOps and bad AIOps

What separates good AIOps from bad AIOps is the quality of the data that you’re feeding into it. If you’ve got a system that has good visibility, you can generate high-quality data about what's happening, which can then be used to underpin machine learning and automation. But if high-quality data isn’t driving machine learning and automation, you can’t expect good results.

The old IT adage of ‘garbage in, garbage out’ is as relevant now as it has ever been. Ultimately, automation is defined by a human being, whether they put instructions into a user interface or write a configuration file or a script. If we give the wrong instructions, or there are errors in the instructions that we give our automation system, those errors will get replicated at scale. Without visibility, and the high-quality data that it enables, things can go wrong fairly quickly.

Loss of visibility increases the risk of unintended consequences 

One example I always like to go back to in my discussions on automation is the stock market crash of October 19th, 1987, one of the primary causes of which was the collective behaviour of automated trading algorithms at scale. The people who wrote those algorithms never expected to crash the stock market; they were trying to make money. However, no one had foreseen how these individual algorithms might interact with one another when taken out of their comfort zone, i.e. when the markets started to slide.

This is a very real consequence of not having that full visibility, and it’s entirely possible to create unintended control loops. If you start linking enough systems together, before you know it you’ve created a super-system, and one system may have an influence on another system far, far away that you really hadn't intended. 

General estimates suggest there are around 30 billion connected IoT devices now, with that number growing rapidly. Once again, that's a call to arms for visibility. Organisations need to make sure that they have independent visibility and clear oversight of how all of these systems are working and interacting with each other, so that if something does go wrong and an unexpected feedback loop has been created, they can stop it and get ahead of it before it becomes a real problem.

Loss of visibility through complexity 

Unfortunately, one of the trends we're seeing at the moment is actually a loss of visibility. This is caused by increasingly complex hybrid multi-cloud environments. As an organisation redistributes its IT infrastructure across a mixture of public cloud, colocation facilities and traditional data centres, an increasing amount of that IT infrastructure is no longer directly under the organisation’s control, and that means less visibility. 

Organisations therefore have to ask the question of how they can gain insight into what is happening inside these third-party infrastructures, and regain that visibility and control. 

Having end-to-end and end-through-end visibility is crucial. For example, the boundaries between different IT domains, where there's a change in ownership or responsibility, are a great place to try and establish visibility.

Although all data is useful for driving AIOps, I believe that packet flows are the most important data source. Packet flows encompass a wealth of information about the status, performance, and security of IT infrastructure. They also provide insight into the experience of end-users and the efficacy of machine-to-machine communications when people and devices interact with each other through apps and services. With that level of visibility into what’s going on in your IT infrastructure, AIOps can be a successful approach that truly revolutionises how businesses operate.