Cloudera continues data fabric and data lakehouse innovation

Introduces unique new hybrid data capabilities that allow companies to move data at scale across clouds for optimal performance, cost and security.

  • Tuesday, 25th October 2022, by Phil Alsop

Cloudera has introduced new hybrid data capabilities that enable organisations to more efficiently move data, metadata, data workloads and data applications across clouds and on premises to optimise for performance, cost and security. Cloudera’s portable data services enable simple, low-risk data workload and data application movement for ultimate data lakehouse optionality. The company’s new secure data replication, the latest SDX enhancement in Cloudera’s unified data fabric, simplifies and secures the movement of data and metadata. And Cloudera universal data distribution delivers the first data ingestion solution built for hybrid data. These new capabilities are key to getting control of hybrid data through a data-first strategy. When companies do right by their data, the entire business can access and analyse it without limitations.

“As data continues to grow exponentially, enterprises must identify the critical tools that enable rapid business transformation in an increasingly hybrid and multi-cloud environment,” said Daniel Newman, founding partner and principal analyst, Futurum Research. “Cloudera has a long proven track record for handling large and complex data volumes in even the most highly regulated and compliance intensive industries. With these updates, Cloudera is further advancing its position as a leader for data-first enterprises seeking to leverage AI, ML, and hybrid architectures to drive their businesses forward.” 

The volume of data businesses collect and store from on-premises, cloud and streaming locations continues to soar. Statista projects the total amount of data generated globally to hit more than 180 zettabytes by 2025. This is the challenge of hybrid data. Economic and market forces add to the pressure on organisations to derive insights from their data at an ever-faster pace. Furthermore, industry experts agree that getting control of data at scale is the only way to drive continuous business transformation with ML and AI. Cloudera’s new data analytics and data management innovations for hybrid data are specifically designed to help organisations manage data at scale across data centers and public clouds, helping make ML and AI business transformation possible.

“Cost or performance is not a choice companies want to make, especially since - as enterprises move to a hybrid, multi-cloud world - these two things are tightly interlinked,” said Sudhir Menon, Chief Product Officer at Cloudera. “Organisations that choose a data-first strategy can focus on how they deliver value, not just how they spend money. A huge piece of this is the ability to move data and workloads whenever and wherever throughout a modern data architecture to meet evolving business requirements. Cloudera has always provided consistent data security and governance across hybrid cloud, and with these updates will do so between all data services across all infrastructures.”

The new Cloudera data analytics and data management innovations for hybrid data include: 

Portable Data Services enable data analytics, and the data applications built with them, to be moved quickly and efficiently between different infrastructures without costly redevelopment or rearchitecting of the data applications. CDP Data Services – Data Engineering, Data Warehousing and Machine Learning – are each built on a unified code base and offer identical functionality on AWS, Azure and on-prem Private Cloud. Using data services that run identically across different clouds – yes, the same bits – makes it easier for users, administrators and developers to turn data into value and insight. Users have the same data experience, irrespective of where the data is stored or where the data applications run – the same data analytics functions and the same Cloudera SDX security and governance, tailored to run seamlessly with the cloud-native storage on the preferred cloud. Only Cloudera delivers true hybrid data analytics that enables organisations to easily move data workloads and data applications across clouds to optimise for performance, cost and security.

Secure Data Replication enables data and metadata to be copied or moved quickly and securely between different Cloudera deployments in data centers and public clouds. Data is often created in different places from where it’s needed. Secure data replication is enabled by Replication Manager, the latest addition to Cloudera SDX. Only Cloudera’s Replication Manager moves the metadata that carries data security and governance policies with the data wherever it goes, eliminating the need to reimplement them. Replication Manager is a data movement service that moves data and metadata from on-premises to cloud or cloud to cloud in real time with an easy, policy-driven interface, enabling hybrid data flexibility.

Universal Data Distribution enables companies to take control of their data flows, from origination through all points of consumption both on-premises and in the cloud, in a universal way that’s simple, secure, scalable and cost-effective. Universal data distribution is enabled by Cloudera DataFlow, the first data ingestion solution built for a hybrid data world. Unlike dumbed-down, target-system-specific, wizard-based connector solutions, Cloudera DataFlow provides indiscriminate data distribution with 450+ connectors and processors across an ecosystem of hybrid cloud services including data lakes, lakehouses, cloud warehouses, on-premises and edge data sources. Cloudera DataFlow is a true hybrid data ingestion solution that addresses the entire diversity of data movement use cases: batch, event-driven, edge, microservices and continuous/streaming. With Cloudera DataFlow, streaming is treated as a first-class citizen, turning any data source into a data stream, supporting streaming scale, and unlocking hundreds of thousands of data-generating clients.