Databricks introduces Delta Live Tables

ETL framework is the first to both automatically manage infrastructure and bring modern software engineering practices to data engineering, allowing data engineers and analysts to focus on transforming data, not managing pipelines.

Tuesday, 5th April 2022, by Phil Alsop

Databricks has launched Delta Live Tables (DLT), the first ETL framework to use a simple declarative approach to build reliable data pipelines and to automatically manage data infrastructure at scale. Turning SQL queries into production ETL pipelines often requires a lot of tedious, complicated operational work. Because DLT applies modern software engineering practices to automate the most time-consuming parts of data engineering, data engineers and analysts can concentrate on delivering data rather than on operating and maintaining pipelines.

 

As companies develop strategies to get the most value out of their data, many will hire expensive, highly skilled data engineers - a resource that is already hard to come by - to avoid delays and failed projects. What is often not well understood is that many of these delays and failures are driven by a core issue: it is hard to build reliable data pipelines that work automatically without a lot of operational rigour to keep them up and running. As a result, even at a small scale, the majority of a data practitioner's time is spent on tooling and managing infrastructure to make sure these data pipelines don't break.

 

Delta Live Tables is the first and only ETL framework to solve this problem by combining modern software engineering practices with automatic management of infrastructure, whereas past efforts in the market have only tackled one aspect or the other. It simplifies ETL development by letting engineers describe the outcomes of their data transformations rather than the steps needed to produce them. Delta Live Tables then tracks the dependencies of the full data pipeline live and automates away virtually all of the manual complexity. It also enables data engineers to treat their data as code and apply modern software engineering best practices like testing, error-handling, monitoring, and documentation to deploy reliable pipelines at scale more easily. Delta Live Tables fully supports both Python and SQL and is tailored to work with both streaming and batch workloads.
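
To make the declarative idea concrete, the following is a minimal, self-contained Python sketch of the general pattern described above: tables are declared as functions, dependencies are inferred rather than wired up by hand, and a data quality rule is attached to a table. It is a toy illustration only and is not the Delta Live Tables API; the `Pipeline` class, its `table` and `expect_or_drop` methods, and the sample order data are all hypothetical names invented for this example.

```python
import inspect
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Pipeline:
    """Toy declarative pipeline (hypothetical, NOT the Delta Live Tables API).

    Each table is a function. The names of its parameters are treated as the
    names of the upstream tables it reads from.
    """

    def __init__(self):
        self._tables = {}        # table name -> function producing its rows
        self._expectations = {}  # table name -> list of (rule name, predicate)

    def table(self, func):
        """Register a function as a table named after the function."""
        self._tables[func.__name__] = func
        return func

    def expect_or_drop(self, rule, predicate):
        """Attach a quality rule; rows failing the predicate are dropped."""
        def decorator(func):
            self._expectations.setdefault(func.__name__, []).append((rule, predicate))
            return func
        return decorator

    def run(self):
        # Infer the dependency graph from each function's parameter names.
        graph = {
            name: set(inspect.signature(func).parameters)
            for name, func in self._tables.items()
        }
        results, dropped = {}, {}
        # Execute tables in an order that respects the inferred dependencies.
        for name in TopologicalSorter(graph).static_order():
            inputs = {dep: results[dep] for dep in graph[name]}
            rows = list(self._tables[name](**inputs))
            for rule, predicate in self._expectations.get(name, []):
                kept = [row for row in rows if predicate(row)]
                dropped[(name, rule)] = len(rows) - len(kept)
                rows = kept
            results[name] = rows
        return results, dropped


pipeline = Pipeline()


# Declared first, even though it runs last: order of declaration does not matter.
@pipeline.table
def order_totals(clean_orders):
    return [{"total": sum(row["amount"] for row in clean_orders)}]


@pipeline.table
def raw_orders():
    # Made-up sample data for illustration.
    return [
        {"id": 1, "amount": 20.0},
        {"id": None, "amount": 5.0},
        {"id": 2, "amount": 30.0},
    ]


@pipeline.table
@pipeline.expect_or_drop("valid_id", lambda row: row["id"] is not None)
def clean_orders(raw_orders):
    return raw_orders


results, dropped = pipeline.run()
print(results["order_totals"])
print(dropped)
```

The author of the pipeline only states what each table contains and which rule its rows must satisfy; the execution order and the count of rejected rows are worked out by the framework. A production system such as DLT additionally handles the infrastructure, streaming and monitoring concerns that this sketch leaves out.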

 

Delta Live Tables is already powering production use cases at leading companies around the globe like JLL, Shell, Jumbo, Bread Finance, and ADP. "At ADP, we are migrating our human resource management data to an integrated data store on the lakehouse. Delta Live Tables has helped our team build in quality controls, and because of the declarative APIs, support for batch and real-time using only SQL, it has enabled our team to save time and effort in managing our data," said Jack Berkowitz, Chief Data Officer, ADP.

 

“The power of DLT comes from something no one else can do - combine modern software engineering practices and automatically manage infrastructure. It’s game-changing technology that will allow data engineers and analysts to be more productive than ever,” said Ali Ghodsi, CEO and Co-Founder at Databricks. “It also broadens Databricks’ reach; DLT supports any type of data workload with a single API, eliminating the need for advanced data engineering skills.”