Big Data predictions for 2018

Innovations in analytics and big data management will make it possible for businesses not only to handle growth in data but also to make better sense of it than they do today. IDC predicts that the percentage of useful data will grow from 22% in 2013 to 37% in 2020, even as the volume of data available grows exponentially.

Thursday, 4th January 2018, by Phil Alsop

According to Ted Dunning, PhD, chief application architect, MapR Technologies, Inc., the following seven major trends guide his predictions for 2018:

Machine Learning will go from “in vogue” to “in production”

Increasingly, machine learning will be seen as a normal part of business. Artificial intelligence (AI) will continue to get buzz, but a much broader set of machine learning approaches will deliver valuable insights to enterprises across many industries.

We anticipate that the most successful systems will be those where people focus more on the problem than on the tool, framing questions correctly and setting realistic goals. They also need access to appropriate data at scale and a plan to convert machine learning results into action.

Organizations will recognize that 90% of Machine Learning success is in the logistics (rather than the algorithm or the model)

To run successful machine learning systems in the real world, it is essential to manage input data and multiple models across a complete life cycle including model development, evaluation and ongoing maintenance in production. With effective architecture and good planning, much of this can be handled at the platform level rather than the application level, across systems handled by different machine learning tools. A new project won’t require a new plan for logistics. The need for efficient machine learning logistics drives a trend toward stream-based architectures and a global data fabric.
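
As a rough illustration of handling logistics at the platform level, the sketch below feeds one shared input stream to several models and writes every result to a common output stream, so a new model can be added or compared without a new plan. It is a minimal sketch under stated assumptions: the stream, the model names and the scoring functions are all hypothetical and do not come from MapR or from any particular machine learning tool.

```python
from collections import deque
from statistics import mean

# Hypothetical shared input stream: every model reads the same records.
input_stream = deque([{"id": i, "x": float(i)} for i in range(5)])

# Hypothetical models registered by name (illustrative stand-ins only).
models = {
    "baseline": lambda rec: 2.0 * rec["x"],
    "candidate": lambda rec: 2.0 * rec["x"] + 0.5,
}


def run_models(stream, models):
    """Feed each record to every model and collect results as an output stream."""
    results = []
    for rec in stream:
        for name, model in models.items():
            results.append({"id": rec["id"], "model": name, "score": model(rec)})
    return results


def mean_score_by_model(results):
    """Summarise the output stream per model, e.g. for ongoing evaluation."""
    by_model = {}
    for r in results:
        by_model.setdefault(r["model"], []).append(r["score"])
    return {name: mean(scores) for name, scores in by_model.items()}


results = run_models(input_stream, models)
summary = mean_score_by_model(results)
```

Because every model sees identical input and reports to the same place, evaluating or retiring a model becomes a matter of editing the registry rather than rebuilding the pipeline.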

Rapid Kubernetes adoption forms the foundation for multi-cloud deployments

We predict runaway success for Kubernetes in 2018, but it is being adopted so quickly that this may soon be more of an observation than a prediction.

So far almost everybody thinks of Kubernetes as a way to organize and orchestrate computation in a cloud. Over the next year, Kubernetes will be the way leading-edge companies organize and orchestrate computation across multiple clouds, both public and private. On-premises computation is also moving quickly to containerised orchestration, and when you can interchangeably schedule work anywhere, you have a real revolution.

What about the data? The next two predictions speak to this:

Big data systems will be the centre of gravity

In the past, big data projects have been isolated, special projects or experiments that complemented traditional systems. Now, big data is becoming an essential asset. Enterprises are transforming into data-driven concerns with big data systems as the centre of gravity, in terms of data size, storage and access as well as operations and analytics in a truly multi-tenant system.

Leading organisations knit data flows into a data fabric

This coming year, we will see more and more businesses treat computation in terms of data flows rather than data that is just processed and landed in a database. These data flows can capture key business events and mirror business structure. A unified data fabric that breaks down silos to give comprehensive access to multiple kinds of computation and data from many sources creates a foundation for building these large-scale, flow-based systems. Databases will become the natural partner and complement of a dataflow. The emerging trend is to have a data fabric that provides the data-in-motion and data-at-rest needed for the multi-cloud computation that tools like Kubernetes make possible.
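
To make the relationship between a data flow and a database concrete, the sketch below treats a list of business events as the flow and derives a table from it by replaying those events. It is only an illustration under stated assumptions: the event types, field names and helper functions are hypothetical, and a real data fabric would use a durable stream and a real database rather than in-memory Python structures.

```python
from collections import defaultdict

# Hypothetical business events (data in motion); names are illustrative only.
events = [
    {"type": "order_placed", "customer": "a", "amount": 30},
    {"type": "order_placed", "customer": "b", "amount": 20},
    {"type": "order_refunded", "customer": "a", "amount": 10},
]


def apply_event(table, event):
    """Update the table (data at rest) from a single event in the flow."""
    sign = 1 if event["type"] == "order_placed" else -1
    table[event["customer"]] += sign * event["amount"]


def materialize(stream):
    """Replay the whole flow to rebuild the table from scratch."""
    table = defaultdict(int)
    for event in stream:
        apply_event(table, event)
    return dict(table)


balances = materialize(events)
```

The table here is a by-product of the flow rather than the place where data simply lands, which is the sense in which a database complements a dataflow.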

DataOps emerges as key organisational approach to drive agility

We see a trend toward embedding data scientists and data-focused developers into otherwise traditional DevOps teams to form a DataOps team. Better communication, focus and goal orientation by cross-skilled teams result, importantly, in faster time to value and better agility.

Organising work in a DataOps style improves a team's ability to respond in a timely and appropriate way to changing conditions. Better flexibility and efficiency at the human level help take full advantage of new technologies and architectures.

Processing extends to the IoT edge

In this upcoming year, we aren’t just going to see data fabrics and computation that span on-premises facilities into multiple clouds. We are also expecting to see full-scale data fabric extend right to the edge next to devices, and, in some cases, we will see threads of the fabric extend right into the devices themselves.

“Our expectations are driven by customer discussions and knowledge we’ve gained helping companies achieve measurable business value by combining advanced technology with pragmatic approaches to solving data challenges,” said Dunning. “There is a constant struggle among companies when it comes to embracing new technologies without exorbitant costs tied to it. MapR’s data fabric reduces the costs of running legacy systems while allowing customers to pursue innovation.”