The announcement came at this week’s global Snowflake Summit in Las Vegas, and Snowflake co-founder and president of product Benoit Dageville said it is a highly strategic direction that “by far will transform Snowflake.”
"Our vision was always to democratise access to data to anyone. We realised quickly to do this would require full applications, so it must be super-simple to build, develop, and run applications, and they should scale and our platform should take care of them,” he said.
“We want to make Snowflake the iPhone of data applications,” Dageville said. By this he meant the iPhone’s ability to run all kinds of magnificent applications - most of which are not written by Apple. Apple, however, provides the platform, the ecosystem, and the marketplace that allow developers to distribute and monetise their apps. This is the future Dageville sees for Snowflake when it comes to data-driven business applications.
Part of this vision is bringing work to the data, instead of the traditional process of taking data to the work. This approach “provides maximum protection to your sensitive enterprise data, and enforces compliance and governance”, said Snowflake senior director of product management Torsten Grabs.
Let me go back a step: the Snowflake data cloud provides a high-performing and scalable database with vast capabilities around security and governance. The Snowflake many people know allows you to store data, query data, analyse data, and perform complex calculations. However, typically your applications were elsewhere, connecting to Snowflake via JDBC, ODBC, or other means.
If you used containers with Snowflake, they were part of your conventional software development and deployment: apps built in .NET, Java, Python, or other languages, which connected to Snowflake for their data needs.
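To make that traditional pattern concrete, here is a minimal sketch of an external application querying Snowflake through the Python connector (the `snowflake-connector-python` package). The table name, credentials, and connection parameters are illustrative placeholders, not real objects.

```python
# Sketch of the traditional "take data to the work" pattern: an application
# running OUTSIDE Snowflake connects over the wire to pull data out.
# All names and credentials below are placeholders.

def build_query(table: str, limit: int = 10) -> str:
    """Build the illustrative SELECT the external app would send."""
    return f"SELECT * FROM {table} LIMIT {limit}"

def fetch_rows(conn_params: dict, table: str):
    """Requires the snowflake-connector-python package and live credentials;
    shown only to illustrate where the app/data boundary sits."""
    import snowflake.connector
    conn = snowflake.connector.connect(**conn_params)  # e.g. account, user, password
    try:
        cur = conn.cursor()
        cur.execute(build_query(table))
        return cur.fetchall()  # data leaves the Snowflake account boundary here
    finally:
        conn.close()
```

The point of the sketch is the boundary: every row fetched travels out of Snowflake to wherever the application runs, which is exactly what the container-based model described below avoids.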
Well, Grabs says, from inception, Snowflake has had a clear focus on security and governance of data by bringing compute to the data as opposed to creating new copies and additional silos.
This focus shaped the company’s approach to Snowpark, the developer framework that brings native SQL, Python, Java, and Scala support for fast and collaborative development.
The news from Snowflake is that Snowpark now supports containerisation via the all-new Snowpark Container Services, which runs OCI-compliant containers on Snowflake compute. Many of your Docker containers will work out of the box.
Your apps don’t even need to use Snowflake; they could be any container running any code. Whether using Snowflake as a general-purpose web app host is cost-effective is up to you. If you’re using Snowflake data, however, there’s a huge ramification: because the compute is containerised inside Snowflake, all your data - and all the security and governance around it - remains completely within the Snowflake account boundary. That is, you bring the compute to the data, your sensitive enterprise information never leaves the Snowflake cloud, and your data governance compliance is greatly simplified.
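As a hedged illustration of what this could look like in practice, the sketch below builds the kind of SQL a developer might run to deploy a container as a service. The statement syntax, object names, and image path are my assumptions, not confirmed product syntax - the feature was in private preview at the time of writing, so details may well differ.

```python
# Hypothetical sketch of deploying a container as a Snowpark Container
# Services service. Statement syntax, names, and the image path are
# assumptions; consult current Snowflake documentation for the real syntax.

def create_service_sql(service: str, pool: str, image: str) -> str:
    """Build an illustrative CREATE SERVICE statement with an inline
    YAML-style specification naming the container image to run."""
    spec = (
        "spec:\n"
        "  containers:\n"
        "  - name: app\n"
        f"    image: {image}\n"
    )
    return (
        f"CREATE SERVICE {service}\n"
        f"  IN COMPUTE POOL {pool}\n"
        f"  FROM SPECIFICATION $$\n{spec}$$;"
    )

print(create_service_sql(
    "llm_service",                          # hypothetical service name
    "my_gpu_pool",                          # hypothetical compute pool
    "/my_db/my_schema/my_repo/llm:latest",  # hypothetical image path
))
```

Whatever the final syntax turns out to be, the shape is the point: the image runs on Snowflake-managed compute, so the queries it issues never cross the account boundary.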
That's exciting news in itself, but there is more to the story. Firstly, containerisation provides a path for large language model providers who wish to make their models available to others to use. For these providers, the model is their confidential intellectual property, along with its weights and parameters. Sharing a container typically would expose these things to others.
However, Snowpark Container Services provides a means for providers to package up their container and place it on the Snowflake app marketplace, and for consumers to spin it up in their own Snowflake compute environment, with protections on both sides. The consumers don’t need to share their data with anyone outside of their Snowflake account; the providers can distribute their container knowing Snowflake will allow consumers to work with its API endpoints without exposing any confidential internals of the code or its data.
So, right away, Snowpark Container Services brings the power of LLMs to the data, making enterprises smarter about their data and enhancing user productivity in secure and scalable ways.
You can begin immediately and it couldn’t be easier: “go to Hugging Face, download an open source container with an LLM like Dolly, register that container, and run it on Snowflake with confidence and comfort you can send all your sensitive enterprise data to it,” Grabs said.
Secondly, all of this - containerisation, LLMs on Snowflake - is made possible through accelerated compute from GPUs, which is another major announcement from Snowflake Summit, made in conjunction with Nvidia CEO Jensen Huang.
"This is very critical," Dageville said. "Snowpark Container services can run on a compute layer which is not a warehouse; you can create a compute pool and specify which type of compute you need to run the services. In particular, GPUs will be available which are critical for machine learning as they directly accelerate training and inference and make it cheaper. There’s a good cost/performance trade-off using GPUs.”
Snowpark Container Services is currently in private preview, and right now the company is gathering feedback from partners. “Often the time from private to public preview is dictated by that,” Dageville said.
"When people say Snowflake is a database I say no, it's much more," Dageville said. "You can run applications all the way from the UI to app services to the raw data access that apps need, and all inside the data cloud now.”