Identifying the cause of data anomalies

Fujitsu Limited and Inria, the French national research institute for digital science and technology, have developed a new AI technology that can identify factors contributing to anomalies in time series data.

  • Wednesday, 21st July 2021, by Phil Alsop

In recent years, AI has been applied to various kinds of time-series data collected in fields including healthcare, social infrastructure, and manufacturing to perform situational judgment and detect anomalies. In the case of time-series data, however, there is a wide range of factors that can contribute to AI decision-making. This means that even experts find it difficult to notice what kind of changes in the data contributed to an anomaly detection, which in turn makes it difficult to take appropriate measures to prevent anomalies from occurring.

Fujitsu and Inria, specifically Inria's DATASHAPE Project Team led by Frederic Chazal in France, have now developed a new technology based on Topological Data Analysis (TDA)(1). It can identify the factors contributing to AI anomaly detections in time-series data and visualize the differences in AI decisions between normal and anomalous circumstances.

Fujitsu and Inria anticipate that this will help analyze the causes of anomalies in time-series data for various phenomena, clarify the mechanisms behind the occurrence of anomalies, and support the discovery of new solutions to them.

This technology will be presented at the Thirty-eighth International Conference on Machine Learning (ICML), the leading international conference in the field of machine learning, which opens virtually on July 18th, 2021. It was selected for a "Long Talk" presentation, a format given to just 3% of submitted papers.

Newly Developed Technology

Fujitsu and Inria have developed an AI technology that can determine the cause of anomalies in time-series data. It has the following key features.

1) Using an analysis technology developed by Fujitsu that extracts the features affecting judgment from time-series data and detects anomalies (2), the technology takes data that the AI judged to be anomalous and maps both the characteristics that led to the anomalous judgment and the unrelated characteristics onto a plane (TDA space).

2) The technology moves the point data of the characteristic that caused the anomaly closer, on the plane, to the group of point data for the characteristics that did not.

3) The time-series data is deformed to match the converted point data, generating data that the AI judges to be normal.

This allows the waveform of normal and anomalous time-series data to be compared and enables the user to visually investigate the cause of the anomaly.
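As a rough illustration of the first step only, the sketch below shows one common way in TDA to turn a time series into points on a plane: 0-dimensional sublevel-set persistence, where each local dip in the signal becomes a (birth, death) point. This is a minimal sketch under stated assumptions, not the Fujitsu and Inria method, whose details are not given here; the function name `sublevel_persistence` and the sample values are hypothetical.

```python
def sublevel_persistence(series):
    """Return 0-dimensional persistence pairs (birth, death) of a 1-D series.

    Illustrative only: a generic TDA construction, not the method
    described in the article.
    """
    n = len(series)
    parent = [None] * n   # None means the sample has not been added yet
    birth = {}            # component root index -> value at which it appeared

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    # Add samples from the lowest value to the highest (sublevel filtration).
    for i in sorted(range(n), key=lambda k: series[k]):
        parent[i] = i
        birth[i] = series[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component born later dies at this value.
                young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                if series[i] > birth[young]:
                    pairs.append((birth[young], series[i]))
                parent[young] = old
    # The component of the global minimum never dies.
    pairs.append((min(series), float("inf")))
    return pairs


# Hypothetical toy signal with a shallow dip (1) and a deeper dip (0.5).
diagram = sublevel_persistence([0, 2, 1, 3, 0.5, 4])
print(diagram)
```

In this toy signal the shallow dip yields a point close to the diagonal and the deeper dip a point farther from it, which is the kind of planar representation on which points could then be moved and compared.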

The newly developed technology was applied to test the possibility of detecting symptoms of delirium (3) using actual electroencephalography (EEG) data (4) collected in strict accordance with ethical guidelines. The test confirmed that the brain wave characteristics identified in the time-series data coincided with the "slowing" phenomenon (5) that at times accompanies the state of delirium. These results could help medical professionals interpret the data and determine the cause of these symptoms. This may one day contribute to important medical developments, including the ability to discover possible precursors to diseases that have been difficult to identify with conventional techniques, as well as the discovery of preventive treatments. The technology could also be applied to shed light on the mechanisms of diseases that are not yet well understood.

Comments from Dr. Gen Shinozaki, Associate Professor of Psychiatry and Behavioral Sciences, Stanford University School of Medicine

Due to the nature of the random signals, it has proven difficult to use EEG data quantitatively and accurately to identify certain disorders. In recent years, advances in data processing technologies, such as AI, have made it possible to better understand the characteristic changes in subtle brain waves. These advances are important not only for diagnosing various disorders, but also for understanding the treatment response and the pathophysiological mechanism. The technology developed by Fujitsu and Inria has successfully captured the unique characteristics of brain waves in patients suffering from delirium. In addition to verifying this, we anticipate that further improvement and practical use of this technology will ultimately offer the potential to achieve accurate diagnosis, monitoring of treatment response and the elucidation of pathophysiology for other disorders.

Future Plans

Fujitsu and Inria plan to encourage the use of the jointly developed technology in field work and experiments at companies and research institutes, and proceed with verifying the technology.

(1) Topological Data Analysis (TDA):

A method for analyzing data in which data are arrayed in a cluster of points in space, and geometric data is extracted from the cluster.

(2) an analysis technology developed by Fujitsu that classifies time-series data by features and detects anomalies:

Fujitsu and France's Inria Jointly Develop Technology to Automatically Create Anomaly-Detecting AI Models (Press Release: 2020/3/16)

(3) delirium:

a syndrome, or group of symptoms, caused by a disturbance in the normal functioning of the brain.

(4) actual EEG data:

the newly developed technology was applied to the electroencephalographic data of approximately 600 patients who consented to participate in research with Dr. Shinozaki at the University of Iowa. Professor Shinozaki has been an Associate Professor at Stanford University since June 2021.

(5) slowing phenomenon:

a phenomenon that frequently occurs in the EEG data of patients suffering from delirium.