DataOps: A crash course

Data is the lifeblood of software development, and its biggest hurdle. Yet far too many companies address that hurdle last, and usually not very effectively. The hard truth is that you can’t achieve the “nirvana state” of Continuous Integration and Continuous Delivery (CI/CD) without first automating data delivery. Figuring out how to keep your software pipeline constantly flowing with a fresh supply of high-quality, up-to-date data is crucial. By Matthew Yew, Senior Director of Product Marketing at Delphix.

  • Sunday, 12th January 2020, posted by Phil Alsop

So, if data is the “new oil” for innovation, then “DataOps” encompasses everything you need to transform that crude asset into digital fuel. Over the next 12 months, 86 per cent of enterprises plan to increase investment in DataOps strategies and platforms, according to 451 Research. So, it’s clear that companies today understand they should be investing in data management, but with the practice still so nascent, most are not sure how to go about it.

 

Here are some practical tips and proven methods for improving the flow of data. But first, we need to get on the same page about DataOps and its strategic values in enabling companies to reach their business objectives.

 

What is DataOps?

 

You may have heard of DataOps before. Last year, it burst onto the scene and entered the IT vernacular of forward-thinkers after being named an “Innovation Trigger” in Gartner’s Hype Cycle for Data Management. While there are many similar definitions floating around, here’s how Gartner defines it:

 

"DataOps is a collaborative data management practice focused on improving the communication, integration, and automation of data flows between data managers and consumers across an organisation. The goal of DataOps is to create predictable delivery and change management of data, data models and related artefacts. DataOps uses technology to automate data delivery with the appropriate levels of security, quality, and metadata to improve the use and value of data in a dynamic environment."

 

Core to DataOps is the need to align people, processes and technology around the flow of data in the enterprise. Through that organisational change, DataOps promises to accelerate innovation by providing everyone with the access they need to data, while introducing improved security and privacy controls. In fact, according to 451 Research, nearly three-quarters of enterprises cited security and compliance as a top perceived benefit of DataOps, and expect new DataOps technologies to help lessen the growing friction between data access and compliance.

 

The “1-2-3s” of getting started with DataOps 

 

For too many companies today, ensuring the right data is securely delivered to the right environment at the right time is an afterthought. That makes it difficult, if not impossible, to discover relevant datasets, manage data preparation and cleansing, share analytics queries, or track and version the machine learning data models needed to power intelligent applications.

 

To get started with DataOps and fix this, you need to address all three aspects of adopting it throughout an organisation: 1) culture; 2) process; and 3) technology.

 

Introducing a “data first” culture

 

DataOps requires you to first shift the organisation’s culture to embrace “data first” processes and incorporate the right solutions so data can fuel secure, agile innovation. Just like implementing any new IT strategy, the first step is to figure out why you need to adopt DataOps. Identify the business needs for changing processes or improving the organisation’s culture around data. What are those goals? Changing the overall culture to think “data first” is often the most difficult step for organisations hoping to adopt DataOps.

 

Put processes in place to eliminate data bottlenecks

 

For most organisations, data is scattered, messy and complex to use, so a new layer of classification and process across the data is needed. As my colleague Eric Schrock often says, you must “know thy data” before you can do anything. Then, you must determine who owns that data and who needs access to each dataset you’ve identified. Once that has been done, you can start creating a developer roadmap to transform the technology stack.

 

Add privacy and security controls

 

DataOps also means incorporating the right technology to support both culture and process, so that innovation with data can flourish, while also introducing privacy and security controls to deal with the realities of today’s headline-making data breaches. It requires a technology toolbox that supports self-service data provisioning (wherever and whenever data is needed) as well as automated “guardrails”, such as masking, to protect sensitive user information. Technology must also support regulatory compliance, govern access, and provide the ability to fully integrate data delivery into the CI/CD pipeline.
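As a rough illustration of what a masking “guardrail” might look like, here is a minimal Python sketch. It replaces sensitive fields with a deterministic one-way token before a record is handed to a test environment. The field names, the salt, and the functions `mask_value` and `mask_record` are all hypothetical, invented for this example; a real masking tool would be driven by a data classification policy and would use stronger, managed secrets.

```python
import hashlib

# Hypothetical set of fields treated as sensitive. In practice this would
# come from a data classification exercise, not a hard-coded constant.
SENSITIVE_FIELDS = {"name", "email"}


def mask_value(value: str, salt: str = "demo-salt") -> str:
    """Replace a value with a deterministic token derived from a one-way hash.

    The same input always yields the same token, so masked data stays
    consistent across tables. The salt here is a placeholder assumption.
    """
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return "masked_" + digest[:12]


def mask_record(record: dict) -> dict:
    """Return a copy of the record with sensitive fields masked."""
    return {
        key: mask_value(str(val)) if key in SENSITIVE_FIELDS else val
        for key, val in record.items()
    }


# Illustrative record only; not real data.
production_row = {"id": 42, "name": "Jane Doe", "email": "jane@example.com", "plan": "pro"}
test_row = mask_record(production_row)
```

Because the masking is deterministic, a step like this could in principle run automatically each time data is provisioned for a pipeline run, so developers never need to handle the raw values.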

 

Understand what tools are needed to securely automate data delivery and allow for self-service data processes that make working with data more agile and safer. The goal of DataOps is to make data more agile, accessible, and secure. To make that happen, it’s important for every stakeholder in the company to contribute to a “data first” mentality. In today’s most successful DataOps journeys, it is shared responsibility that makes adoption succeed.

 

Common missteps and pitfalls

 

There are a couple of common missteps that could derail a DataOps practice before it’s up and running. By far the most common one I see is that the majority of companies don’t have a culture that is amenable to DataOps or that ensures data is a core aspect of software development. DataOps is as much a cultural and organisational shift as it is a matter of process and technology. You need everyone to put “data first” or there will be kinks in the pipeline. Developers need to put data first because fresh, accurate and high-quality data is a necessity for building innovative digital services and experiences that lead to market leadership.

Another big one is not having access to high-quality data to test against, which means tests do not deliver an accurate prediction of the real world. It’s simple: relying on old, slow data leads to bad results. This happens because 1) the data being tested against is stale and needs to be refreshed more frequently; and 2) data is not classified and is all treated the same. Being able to segment data and have different policies for governing and managing each type is something that a lot of organisations are not doing, and it’s another roadblock that reduces agility.
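To make the idea of per-class policies a little more concrete, the following Python sketch attaches a different refresh window and masking rule to each data classification, then flags datasets whose test copies have gone stale. The class names, time windows, dataset names and the `is_stale` helper are assumptions chosen purely for illustration, not a recommended policy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Policy:
    max_age: timedelta   # how old a test copy may be before it needs a refresh
    mask_required: bool  # whether the data must be masked before delivery


# Hypothetical classifications and thresholds, for illustration only.
POLICIES = {
    "sensitive": Policy(max_age=timedelta(days=1), mask_required=True),
    "internal": Policy(max_age=timedelta(days=7), mask_required=False),
    "public": Policy(max_age=timedelta(days=30), mask_required=False),
}


def is_stale(classification: str, last_refreshed: datetime, now: datetime) -> bool:
    """Return True if the copy is older than its class policy allows."""
    return now - last_refreshed > POLICIES[classification].max_age


# Made-up inventory: (dataset name, classification, last refresh time).
now = datetime(2020, 1, 12, 12, 0)
datasets = [
    ("customers", "sensitive", datetime(2020, 1, 10, 12, 0)),
    ("product_catalogue", "public", datetime(2020, 1, 1, 12, 0)),
]

to_refresh = [name for name, cls, refreshed in datasets if is_stale(cls, refreshed, now)]
print(to_refresh)  # ['customers']
```

Here the two-day-old sensitive dataset is flagged while the eleven-day-old public one is not, which is the point of segmenting: each type of data gets the governance it needs rather than a single blanket rule.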

 

A mature DataOps organisation may look different from company to company given their security needs and business objectives, but there are a few shared qualities of a successful DataOps practice. The most innovative companies today are those that have embedded DataOps principles to feed data continuously into the application delivery pipeline. This includes automation that provisions data through self-service, and developers who are able to build an app by themselves by deploying data to run tests (provisioning and refreshing data).

 

DataOps adoption doesn’t happen overnight, but by getting your teams’ buy-in, building out the right processes for your business, and implementing the right technology, you’ll lay the foundation to transform how your company can leverage data to fuel innovation.