Yield big results with data lakes and automation

By Abdul Razack, SVP of Platforms, Big Data and Analytics at Infosys.

Posted Monday, 20th June 2016 by Phil Alsop
Think you had a tough day? Spare a thought for recruiters who compete with each other to hire one of the country’s most sought-after specialists: the data scientist. The demand for data experts is so high that there could still be a 60 per cent hiring gap for data scientists three years from now.
 
It’s all part of the huge effort to tap into the promise of big data. Many businesses are struggling to realise gains from their big data investments, and the reasons extend beyond a lack of qualified data scientists.
 
In my experience, the lack of skills to properly manage data is a very real problem, but another challenge to big data insight is the way enterprises organise data: many companies silo or fragment their data. It’s gotten to the point that some organisations are wondering what benefits big data held for them in the first place.
 
One immediate benefit of big data is automation – the ability to automatically identify and pre-emptively resolve symptoms before they become a problem, as well as to eliminate time-wasting processes. This one-two punch frees up time and resources, enabling organisations to focus on better understanding what the end user wants and needs.
 
To realise the benefits of automation, we must consider how data ought to be stored today. We need to discuss data lakes.
 
Data lakes are repositories for storing relevant data requiring analysis. The data stored in these lakes usually come in three forms: structured, unstructured, and semi-structured. These data are stored in their raw forms, which allows for deep and complex analysis without the loss of fidelity that comes with aggregation. The more data that organisations pool into their data lakes, the more opportunity they have to discover previously unseen correlations and insights.
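
To illustrate why raw storage matters, here is a minimal Python sketch. The records, field names, and the daily summary are hypothetical and not taken from any real system. It shows a question that retained raw records can still answer, but that a store holding only the daily summary could not.

```python
from collections import defaultdict

# Hypothetical raw events as they might land in a data lake (illustrative only).
raw_events = [
    {"day": "Mon", "store": "A", "item": "umbrella", "amount": 12.0},
    {"day": "Mon", "store": "B", "item": "sunscreen", "amount": 8.0},
    {"day": "Tue", "store": "A", "item": "umbrella", "amount": 12.0},
    {"day": "Tue", "store": "A", "item": "sunscreen", "amount": 8.0},
]


def total_by(events, key):
    """Sum the 'amount' field grouped by any field present in the raw records."""
    totals = defaultdict(float)
    for event in events:
        totals[event[key]] += event["amount"]
    return dict(totals)


# An aggregate-only store would keep just this summary and discard the rest.
daily_totals = total_by(raw_events, "day")

# Because the raw records are retained, a question nobody planned for
# (sales per item) can still be answered later.
item_totals = total_by(raw_events, "item")
```

From the daily totals alone there is no way to recover the per-item breakdown, which is the loss of fidelity the paragraph above describes.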
 
The ease and flexibility of using data contained in data lakes help to identify repeatable tasks and processes. In fact, data lakes, because they act as a central repository for automated systems, can be used in building a system capable of recognising trends, learning, and acting of its own accord.
 
Let’s use the process of resetting a password as an example. The system monitors the actions of an administrator helping an end user reset his or her password. It observes the steps involved in resetting the password and stores this information in its data lake. Then, the next time a user submits a password reset request, a software robot, as we call them, can walk the end user through the password reset process without the need for admin intervention. In this example, the system takes what it learned by observation and applies it in practice.
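
The observe-then-replay idea can be sketched in a few lines of Python. This is a deliberately simplified illustration, not a description of any actual product: the `RunbookStore` class, the request types, and the step names are all invented for the example, and it records and replays a single observation rather than learning from many.

```python
class RunbookStore:
    """Hypothetical store of observed procedures, keyed by request type."""

    def __init__(self):
        self._steps = {}

    def observe(self, request_type, steps):
        # Record the steps an administrator was seen performing.
        self._steps[request_type] = list(steps)

    def handle(self, request_type):
        # Replay known steps; otherwise hand the request to a human.
        steps = self._steps.get(request_type)
        if steps is None:
            return ["escalate to administrator"]
        return list(steps)


store = RunbookStore()
# Step names below are made up for illustration.
store.observe(
    "password_reset",
    ["verify identity", "send reset link", "confirm new password"],
)

known = store.handle("password_reset")   # replayed without admin involvement
unknown = store.handle("vpn_access")     # never observed, so it is escalated
```

The fallback to a human for unseen request types is a design choice made for this sketch: automation takes the repeatable cases and leaves the rest to people.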
 
Another example we can look at is in retail. Innovative retailers have leveraged their massive customer data collections to automatically identify customer behaviour, trends, inventory replenishment cycles, and more. This helps personalise a customer’s shopping experience and deliver consistency across a brand’s engagement points.
 
Banks have applied automation to event-ticket processing. Automated systems have reduced the number of events by 35 per cent and then helped to reduce the number of tickets that need to be processed by another 45 per cent. First they reduced the noise, and then they brought the number of actionable tickets down and routed those tickets to the proper employee’s attention. This helps to accelerate a bank’s ability to respond to customer concerns, thereby improving the customer experience.
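
To get a feel for how two such reductions could add up, here is a small worked calculation in Python. It rests on assumptions the figures above do not confirm: that the 45 per cent applies to what remains after the 35 per cent cut, that every remaining event would otherwise have become a ticket, and that the starting volume is 10,000 (an arbitrary number chosen for easy arithmetic).

```python
def remaining(count, reduction_percent):
    """Return how many items are left after cutting `count` by a whole-number percentage."""
    return count * (100 - reduction_percent) // 100


events_before = 10_000                          # hypothetical starting volume
events_after = remaining(events_before, 35)     # stage 1: noise reduction
tickets_after = remaining(events_after, 45)     # stage 2: assumes one ticket per remaining event

# Share of the original volume that never reaches an employee, in per cent.
overall_reduction = 100 * (events_before - tickets_after) / events_before
```

Under these assumptions, 10,000 events shrink to 6,500 and then to 3,575 tickets, an overall reduction of 64.25 per cent. If the two percentages were measured against different baselines, the combined figure would differ.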
 
This sort of identification, segmentation, and automation doesn’t happen overnight. Data must be accessible – automatically sent to the right spot at the right time – for companies to extract the most value from that data. Silos need to be eliminated and application interfaces managed before data can move about freely. Employees must be able to address contextual bias in data capture, among many other things. But the investments made in big data initiatives are worth it.
 
Big data automation does more than merely streamline and eliminate processes. It accentuates the uniquely human ability to take complex problems and deliver creative solutions to them. Automation, then, is a simple tool to enhance what we already have and to create new opportunities.