Dataiku expands features for statisticians and data scientists

Dataiku has released Dataiku 7, which gives technical data professionals deeper integration for machine learning project development and adds row-level explainability for white-box AI. Other highlights of the release include Kubernetes-powered web apps, which build on the capabilities introduced in Dataiku 6, and a machine learning-assisted data labeling plugin.

Friday, 20th March 2020. Posted by Phil Alsop.
“Collaboration has been at the core of Dataiku since our founding in 2013, and with Dataiku 7, we’re continuing to add features that deepen our philosophy to effectively democratize AI in the enterprise,” said Dataiku CEO, Florian Douetteau. “With this launch, Dataiku 7 is our second consecutive product release that expands features for explainable AI, a critical component for organizations across industries to succeed and understand the impact of their AI model outcomes.” 

Organizations worldwide are committed to enterprise AI efforts from the top down, but struggle to democratize projects from the bottom up so that more individuals have access to actionable data insights. Dataiku 7 brings more people to the table through collaboration and gives individuals explainable AI, so that businesses can use data for day-to-day decisions and build impactful AI projects.

With the launch of Dataiku 7, new features include: 

Support for Advanced Statistical Analysis: Statisticians can now use Dataiku to perform advanced statistical analysis in the familiar worksheet-and-cards format while collaborating with the wider data or analytics team. In the past, advanced statisticians were relegated to siloed tools with no visibility for non-statisticians, creating bottlenecks in governance and AI project deployment. 

Advanced Prediction Explanations: Traditionally, machine learning models do not include insights into why or how they arrived at an outcome, making it difficult to objectively explain the decisions made and actions taken based on these models. Prediction explanations in Dataiku open the black box by describing which characteristics, or features, have the greatest impact on a model’s outcomes. Dataiku 7 includes both row-level prediction explanations in output datasets and interactive visualizations of individual prediction explanations.
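
To make the idea of a row-level explanation concrete, here is a minimal sketch in Python. It is not Dataiku's implementation or API: it assumes a simple linear model, and the function name `explain_row`, the feature names, and all numbers are hypothetical. For such a model, each feature's contribution to one prediction can be taken as its weight times the row's deviation from a baseline value.

```python
def explain_row(weights, baseline, row):
    """Rank features by their contribution to one row's prediction.

    Assumes a linear model: contribution = weight * (value - baseline value).
    All names and numbers here are illustrative, not taken from Dataiku.
    """
    contributions = {
        name: weights[name] * (row[name] - baseline[name]) for name in weights
    }
    # Largest absolute impact first, so the top entry "explains" the most.
    return sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical model weights, dataset averages, and a single row to explain.
weights = {"age": 0.5, "income": 0.25, "tenure": -1.0}
baseline = {"age": 40.0, "income": 50.0, "tenure": 5.0}
row = {"age": 44.0, "income": 70.0, "tenure": 2.0}

ranked = explain_row(weights, baseline, row)
for name, contribution in ranked:
    print(f"{name}: {contribution:+.2f}")
```

In this toy case the income feature contributes most to the row's prediction, followed by tenure and age. Non-linear models need other attribution techniques, which this sketch does not cover.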

Git for Better Coder Collaboration: With the enhanced Git integration in Dataiku 7, data scientists (or other code-first users) can now create, delete, push, and pull Git branches directly from Dataiku. This brings big efficiency gains, as coders can duplicate projects to easily sandbox changes, leaving the original project unaffected. Once the iteration on the duplicate project is complete, changes can be seamlessly merged back to the original project (with all changes tracked in Git). 

More Elasticity With Kubernetes: Dataiku 7 expands on the managed Kubernetes cluster capability from Dataiku 6 by allowing users to now run web apps on Kubernetes clusters. This enables more concurrent users and a fast, flexible execution backend for resource-heavy AI deployments.

A Labeling Plugin for Active Learning: Properly labeled data is a prerequisite for unlocking precise, quality insights from machine learning models, and the ability to label data quickly often speeds up the entire analytics lifecycle by easing the tedious and time-consuming data collection step. The new human-in-the-loop labeling and active learning plugin provides a suite of Dataiku web apps to ease the labeling process, whether the data is tabular, image, or even sound.
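
As a rough illustration of how active learning can decide what a human should label next, the Python sketch below uses uncertainty sampling, one common strategy. It is an assumption made for illustration and not a description of how the Dataiku plugin works; the function name `select_for_labeling`, the item identifiers, and the scores are hypothetical.

```python
def select_for_labeling(probabilities, batch_size):
    """Pick the unlabeled items the current model is least sure about.

    `probabilities` maps an item id to the model's predicted probability
    of the positive class. Scores closest to 0.5 are the most uncertain,
    so labeling those items first is expected to help the model most.
    """
    ranked = sorted(probabilities, key=lambda item_id: abs(probabilities[item_id] - 0.5))
    return ranked[:batch_size]


# Hypothetical model scores for four unlabeled images.
scores = {"img_001": 0.95, "img_002": 0.52, "img_003": 0.10, "img_004": 0.40}

queue = select_for_labeling(scores, batch_size=2)
print(queue)
```

In a human-in-the-loop setup, the selected items would go to annotators, the model would be retrained on the new labels, and the selection would repeat.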