GridGain Professional Edition 2.4 introduces Integrated Machine Learning and Deep Learning

Wednesday, 28th March 2018. Posted by Phil Alsop.

GridGain Systems has announced the immediate availability of GridGain Professional Edition 2.4, a fully supported version of Apache Ignite 2.4. The release adds a Continuous Learning Framework that combines machine learning algorithms with a multilayer perceptron (MLP) neural network, enabling companies to run machine and deep learning algorithms against their petabyte-scale operational datasets in real time. Companies can now build and continuously update models at in-memory speeds and with massive horizontal scalability. GridGain Professional Edition 2.4 also enhances the performance of Apache® Spark™ by introducing an API for Apache Spark DataFrames, adding to the existing support for Spark RDDs.

GridGain Continuous Learning Framework

GridGain Professional Edition 2.4 now includes the first fully supported release of the Apache Ignite integrated machine learning and multilayer perceptron features, making continuous learning using machine learning and deep learning available directly in GridGain. By optimizing these libraries for massively parallel processing (MPP) against the data residing in the GridGain cluster, large-scale machine learning use cases can be greatly accelerated. Processing data directly in the GridGain cluster enables a continuous learning workflow by eliminating the need to move transactional data into a separate database before model training. The result is real-time model training or even continuous model training with less complexity and substantially lower cost than traditional approaches.

The new GridGain Continuous Learning Framework is a building block for in-process HTAP (hybrid transactional/analytical processing) applications in which a data model is continually trained based on incoming data. In-process HTAP offers next-generation applications the ability to react to and benefit from real-time model training, which can power better real-time decision making in a wide range of business applications, such as fraud prevention, ecommerce recommendation engines, credit approvals, logistics, and transportation system maintenance decisions.
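To illustrate what "continually trained based on incoming data" can mean in practice, here is a minimal sketch in plain Python. It is not the GridGain or Apache Ignite machine learning API. The class OnlineLogisticModel, the function transaction_stream, and the two synthetic features are hypothetical names invented for this illustration. The model is a single-layer logistic unit that is updated one record at a time with stochastic gradient descent, so it stays current without a separate batch export and retraining step.

```python
import math
import random


class OnlineLogisticModel:
    """Hypothetical single-layer logistic model updated one record at a time."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, features):
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features, label):
        # One stochastic gradient step on the log loss for a single record.
        error = self.predict_proba(features) - label
        self.weights = [
            w - self.learning_rate * error * x
            for w, x in zip(self.weights, features)
        ]
        self.bias -= self.learning_rate * error


def transaction_stream(n, seed=0):
    """Synthetic stand-in for incoming operational records (an assumption)."""
    rng = random.Random(seed)
    for _ in range(n):
        amount = rng.uniform(0.0, 1.0)
        velocity = rng.uniform(0.0, 1.0)
        # Made-up labeling rule: flag a record when the two signals sum above 1.
        label = 1 if amount + velocity > 1.0 else 0
        yield (amount, velocity), label


model = OnlineLogisticModel(n_features=2)
for features, label in transaction_stream(5000):
    model.update(features, label)  # the model is usable after every record

print(model.predict_proba((0.95, 0.95)))  # score for a clearly suspicious record
print(model.predict_proba((0.05, 0.05)))  # score for a clearly benign record
```

Because every update touches only the current record, the same loop can run indefinitely alongside transaction processing. That is the property the in-process HTAP description relies on, although a production system would distribute the work across the cluster rather than run it in a single process.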

Expanded Support for Spark DataFrames

GridGain can now be used to store and manage Spark DataFrames. DataFrame support expands what was already the broadest support for Spark by any in-memory computing platform. GridGain continues to include the GridGain RDD API for accessing data in GridGain as mutable Spark RDDs, as well as the Ignite File System (IGFS) for using GridGain as an in-memory implementation of the Hadoop Distributed File System (HDFS).

Spark can be used to process data in GridGain as DataFrames or RDDs and also save DataFrames or RDDs into GridGain for later use. These capabilities allow Spark developers to use GridGain as in-memory storage to access, save and share information between Spark jobs. GridGain provides ANSI SQL-99 support, including data indexing, so Apache Spark can leverage GridGain’s distributed SQL to improve ad hoc query performance by up to 1000x. Spark developers can also leverage the GridGain Continuous Learning Framework to automate decisions and continually update models to improve outcomes in real time.