The pursuit of happiness

How Big Data use enables big benefits. By Gary LaFever, CEO and General Counsel, Anonos.

Monday, 5th October 2020. Posted by Phil Alsop.

Data-driven decision-making is a major part of economic and societal development. How can we go through a journey of improvement if we aren’t sure where we are going, how to get there, or how far along the pathway we are? We need data to make decisions about where to head, the path to follow, and to see how much progress we have made along the way.


The use of data for analysis, gaining insights and making good decisions is critical, not just in business or technological success, but in everything we do. For example, in the current COVID-19 pandemic, we cannot know whether or not the virus is under control if we don’t have adequate testing systems in place. Those testing systems show us how many cases there are and where those cases are, and with this information we can make better decisions about how to tackle the problem.

However, data collection and use is not so simple. Throw in the complications of increasingly prevalent AI processing and the ‘garbage in garbage out’ principle, and the value in collection, use, and analysis of good quality data becomes abundantly clear. So, what’s the problem with just going ahead and doing it?

This issue is multifaceted. Firstly, determining what counts as ‘good quality’ data is actually quite hard. With the proliferation of cloud computing, IoT, and social networking, as well as mass media, we are now working within a world of Big Data whether we like it or not.

This world of Big Data runs into big problems. Biases are perpetuated in data sets and spat out through machine learning tools; large data sets can take a long time to process; and undefined data sets can create results that are not useful for the task at hand.

When you add issues such as generalised data sets, masking techniques, and approaches that make data even less accurate, these problems are magnified. Data becomes lower quality, lower utility, and its decision-making power is diminished.
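To make the loss of utility concrete, here is a minimal sketch of one generalisation technique, assuming a list of made-up ages and a hypothetical 20-year banding scheme (neither comes from a real data set or a specific product). It replaces each exact age with a band and then measures how far an analyst's best guess, the midpoint of the band, sits from the true value.

```python
from statistics import mean

# Hypothetical ages, invented purely for illustration.
ages = [23, 27, 31, 38, 44, 45, 52, 59, 63, 71]


def generalise(age: int, width: int = 20) -> str:
    """Replace an exact age with a band such as '20-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def band_midpoint(band: str) -> float:
    """Best estimate recoverable from a band: its midpoint."""
    low, high = (int(part) for part in band.split("-"))
    return (low + high) / 2


bands = [generalise(age) for age in ages]
estimates = [band_midpoint(band) for band in bands]

# Average absolute error per record introduced by generalisation.
mean_abs_error = mean(abs(a - e) for a, e in zip(ages, estimates))

print(bands)
print(f"distinct values before: {len(set(ages))}, after: {len(set(bands))}")
print(f"mean absolute error per record: {mean_abs_error:.1f} years")
```

The wider the band, the fewer distinct values survive and the larger the per-record error, which is the trade-off between protection and accuracy described above.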

The second aspect of the issue is that we are not always just collecting, using, and analysing data from the environment. Often (in fact, usually), we want data from people. With this, another problem arises: we need to get them to give it to us, and to trust us to do the right thing with it.

Privacy laws and regulations are in place to create a framework around ethical and unethical approaches to data use. However, these laws are part of a slow and bureaucratic regulatory process, and the development of technology usually far outpaces regulations and what they capture. This leaves a big void in the middle between what technology really does, and what regulations actually cover.

A simple example is shown in the regulation of data privacy with regard to consent. When consenting to observed and inferred data collection through the internet, as well as to AI and machine learning processes being performed on data, on the whole people simply don’t know what they are agreeing to. Technology has moved so fast that regulatory protections simply fail to function adequately in today’s new environment.

Proponents of reform argue that a solution to this problem could involve shifting the burden of protection to the data controller, through an approach that many people know as ‘legitimate interests’ processing. Rather than making data subjects read long privacy policies and give often-uninformed consent to data collection and use, the data controller would have to show that they have a legitimate purpose in using the data.

However, this requires the introduction of effective risk-based controls to mitigate the risk to data subjects from a data controller’s use of their data, or else the term ‘legitimate interests’ is no more than a subterfuge. Figuring out these issues is critical for maximising the value of data while protecting people’s right to privacy.

The third issue is that even if you can collect good quality data, and people allow you to collect it from them, the real benefit of data comes from actually using it to deliver an outcome. In most cases, using data in isolation will not produce as good a result as using it in collaboration with others. The sharing, combining, and use of data in a broad sense is where the real value and innovation come in, but once again the data protection hurdle stands in the way.

While numerous techniques for protected data processing exist, many of them are slow, expensive, cumbersome, or simply do not work in higher-risk or distributed data processing environments. For example, protecting data through techniques such as anonymisation looks good on the surface, but it has numerous limitations that organisations keep running into.

Different tools are required to protect data while it is in distributed, shared use for maximum value-gain. For some purposes you need one tool, and for expanded uses you need another. The very real benefits of data are out there, but the key is harnessing them in a fundamentally different way that balances privacy and data enablement.

To move forward, a solution is needed that balances fundamental societal benefits and risks to individuals on all sides. It must be possible to collect and use data in a way that protects individual privacy, and to process it in a way that preserves its quality and allows it to be shared and combined with data from others.

Data sharing, combining and enriching is where data value, insight, privacy and security meet. Processing using traditional centralised data protection technologies may be too slow and inefficient to obtain digital insights as quickly as needed. For cutting-edge data use processes, a decentralised data protection solution is necessary.

A decentralised, risk-based data protection approach allows broader and more valuable data processing to occur, without diminishing the value of the data or sacrificing privacy rights and personal protections.

The tug-of-war between data utility and data protection can be resolved with the right approach. The solution is the use of technology that not only handles the issues with Big Data and complex processing in machine learning and AI environments, but also complies with the GDPR and other evolving global data privacy laws.