Improving programmability

At its annual user conference, Snowflake Summit 2022, Snowflake introduced new enhancements that improve programmability for data scientists, data engineers, and application developers.

  • Thursday, 16th June 2022, by Phil Alsop

Snowflake’s latest innovations bring Python to the forefront, with the launch of Snowpark for Python, now in public preview, and a native integration with Streamlit for rapid application development and iteration, currently in development. Additionally, Snowflake is streamlining access to more data with new enhancements for working with streaming data, alongside making data stored in open formats and on-premises available in the Data Cloud. These enhancements make it easier for data professionals and developers to build and collaborate with data quickly, while leveraging the speed, simplicity, and consistent governance and security of Snowflake’s platform.

Snowflake Doubles Down on Python for Machine Learning and Application Development

The introduction of Snowpark, Snowflake’s developer framework, opened up a rich programming environment for data scientists, data engineers, and application developers to build scalable pipelines, applications, and machine learning (ML) workflows directly in Snowflake using their preferred languages and libraries. Snowflake is further extending what users can build with Snowpark for Python, making Python’s rich ecosystem of open-source packages and libraries seamlessly accessible in the Data Cloud.

With a highly secure Python sandbox, Snowpark for Python runs on the same Snowflake compute infrastructure as Snowflake pipelines and applications written in other languages. This provides Snowpark for Python with the same scalability, elasticity, security, and compliance benefits developers have come to expect when building with Snowflake. Developers now have the unique opportunity to streamline and modernize their data processing architecture by consolidating their Python-based data processing in Snowflake using Snowpark.

Additional updates complementing Snowpark for Python include:

  • Snowflake Worksheets for Python, now in private preview, enables users to develop pipelines, ML models, and applications directly in Snowsight, Snowflake’s user interface, using Python and Snowpark’s DataFrame APIs for Python, streamlining development with code auto-complete and the ability to productize custom logic in seconds.

  • Snowflake’s Streamlit Integration, currently in development, brings Python-based application development directly into Snowflake, enabling users to build interactive applications, and securely share, iterate, and collaborate with business teams to increase the impact of development.

  • Large Memory Warehouses, currently in development, empower users to securely execute memory-intensive operations such as feature engineering and model training on large datasets using popular Python open-source libraries available through the Anaconda integration.

  • SQL Machine Learning, starting with time-series forecasting, now in private preview, empowers SQL users to embed ML-powered predictions into their everyday business intelligence and analytics to improve decision quality and speed.

Python’s robust syntax and rich ecosystem of open-source packages make it a popular choice for developers, and Snowflake’s continued partnership with Anaconda extends access to more Python packages seamlessly in Snowflake, with all code running in a highly secure sandboxed environment. The Snowpark Accelerated program has also seen continued growth in large part due to Snowflake’s Python advancements, with more partners building with Python to extend the power of the Data Cloud in their language of choice.

Allegis Group, a global talent solutions firm, relies on Snowpark to support ML and artificial intelligence (AI) solutions that leverage data in the Allegis Enterprise Data Platform on Snowflake.

“At its core, Snowpark is all about extensibility, and Snowpark for Python provides us with the tools we need to work with data effectively in our programming language of choice,” said Joe Nolte, AI & MDM Architect, Allegis Group. “Snowpark is becoming our preferred framework for data science and application development, providing our teams with a seamless experience to easily collaborate with data and bring everyone onto the same platform for accelerated time-to-value.”

Snowflake Increases Data Access for Faster, More Valuable Insights

Getting access to the right data quickly and efficiently is critical for improving developer productivity, building ML models with increased accuracy, and delivering more powerful applications. Snowflake’s enhancements enable teams to experiment faster, with more data at their fingertips, driving increased programming capabilities and deeper insights for users.

New innovations include:

  • Streaming Data Support to eliminate the boundaries between streaming and batch pipelines with Snowpipe Streaming, now in private preview, for serverless ingestion of streaming data, and Materialized Tables, currently in development, which make it simple to transform streaming data declaratively.

  • Iceberg Tables in Snowflake, currently in development, to enable users to work with Apache Iceberg, a popular open table format, in external storage while taking advantage of the ease-of-use, performance, and consistent governance of the Snowflake platform, simplifying overall data management and enabling architectural flexibility.

  • External Tables for On-Premises Storage, now in private preview, to allow users to access, from Snowflake, their data in on-premises storage systems from vendors such as Dell Technologies, Pure Storage, and more, so they can benefit from the elasticity of the Data Cloud without moving this data.

“We are heavily investing in Python to make it easier for data scientists, data engineers, and application developers to build even more in the Data Cloud, without governance trade-offs,” said Christian Kleinerman, Senior Vice President of Product, Snowflake. “Our latest innovations extend the value of our customers’ data-driven ecosystems, enabling them with more access to data and new ways to develop with it directly in Snowflake. These capabilities, paired with Snowflake’s best of class data security and privacy, are changing the way teams experiment, iterate, and collaborate with data to drive value.”