What the Heck Is Snowflake?

You are probably collecting an ever-increasing volume of data and need to analyze it to make business decisions. The name Snowflake may have come up in a meeting, but what the heck is Snowflake, and why should you consider it for your data warehouse?

Background on Snowflake

First, a little background. Snowflake was founded in 2012 by three data warehouse experts. They built an entirely new cloud data platform guided by three simple principles: break free from the past, design for the cloud, and support modern data and applications. The Snowflake ecosystem works with leading data management, data integration, and BI partners to bring together all your data and enable all your users to perform cutting-edge analytics. It can be deployed on all three major cloud platforms (AWS, Azure, and Google Cloud Platform).

Snowflake’s Key Feature

While Snowflake offers many robust features, the most powerful is its ability to spin up a practically unlimited number of virtual warehouses (independent compute clusters). This lets users run many workloads against the same data without contending for compute resources. But what about performance? I’m glad you asked. Since Snowflake is cloud-based, each warehouse can be resized on the fly, typically within seconds, from a single-node extra-small warehouse up to the largest size Snowflake offers. As the workload changes throughout the day, the warehouse can be adjusted to match it. It is also possible to automatically scale out, adding clusters as the user load grows and removing them as the user count decreases.
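To make the resizing and auto-scaling ideas concrete, here is a minimal Python sketch that builds the kind of SQL statements involved. The warehouse name analytics_wh and the helper functions are hypothetical, and the statements follow Snowflake's ALTER WAREHOUSE syntax as we understand it; multi-cluster scaling also depends on your Snowflake edition, so treat this as an illustration rather than a recipe.

```python
import re

# Size keywords for WAREHOUSE_SIZE, smallest to largest.
# Assumption: a subset of the sizes Snowflake documents; larger ones exist.
WAREHOUSE_SIZES = ("XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE", "XXLARGE")

# Accept only simple unquoted identifiers so the name is safe to interpolate.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")


def _check_name(warehouse: str) -> str:
    if not _IDENTIFIER.match(warehouse):
        raise ValueError(f"not a simple identifier: {warehouse!r}")
    return warehouse


def resize_statement(warehouse: str, size: str) -> str:
    """Build a statement that scales a warehouse up or down."""
    size = size.upper()
    if size not in WAREHOUSE_SIZES:
        raise ValueError(f"unknown warehouse size: {size!r}")
    return f"ALTER WAREHOUSE {_check_name(warehouse)} SET WAREHOUSE_SIZE = '{size}'"


def autoscale_statement(warehouse: str, min_clusters: int, max_clusters: int) -> str:
    """Build a statement that lets a warehouse scale out between two cluster counts."""
    if not 1 <= min_clusters <= max_clusters:
        raise ValueError("need 1 <= min_clusters <= max_clusters")
    return (
        f"ALTER WAREHOUSE {_check_name(warehouse)} SET "
        f"MIN_CLUSTER_COUNT = {min_clusters} "
        f"MAX_CLUSTER_COUNT = {max_clusters} "
        f"SCALING_POLICY = 'STANDARD'"
    )


if __name__ == "__main__":
    # Scale up for a heavy morning load, then allow scale-out for busy periods.
    print(resize_statement("analytics_wh", "large"))
    print(autoscale_statement("analytics_wh", 1, 4))
```

The generated strings could then be run through whatever SQL client you already use. Keeping the statement-building separate from execution makes it easy to review exactly what will change before anything touches a live warehouse.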

Not Just Another Data Warehouse

Traditional data warehouses require data to be transformed before it is loaded, which inevitably leads to selectivity and data loss. For storing massive data volumes in raw form, the data lake has become a popular alternative. This architecture was traditionally deployed on Hadoop platforms because it often includes semi-structured and unstructured data, which were challenging to handle on traditional relational platforms.

Unlike legacy data warehouses, Snowflake supports both structured and semi-structured data, including JSON, Avro, and Parquet, and these can be queried directly using SQL. Unlike Hadoop, Snowflake scales compute and storage resources independently and can therefore be a far more cost-effective platform for a data lake. The ability to seamlessly combine JSON and structured data in a single query is a compelling advantage of Snowflake and avoids operating separate platforms for the data lake and the data warehouse.
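As a rough illustration of what combining JSON and structured data in a single query means, the sketch below pairs an example Snowflake-style query with a small Python stand-in that performs the same join locally. The table names (customers, raw_events), the payload column, and the sample rows are all made up for this example, and the SQL is shown only as a string; it is not executed here.

```python
import json

# Illustrative Snowflake-style query: the colon path reads fields out of a
# JSON (VARIANT) column, and :: casts them to SQL types.
# Table and column names are hypothetical.
QUERY = """
SELECT c.name,
       e.payload:purchase.total::float AS purchase_total
FROM raw_events e
JOIN customers c
  ON c.id = e.payload:customer_id::number
"""

# Made-up sample data standing in for the two tables.
CUSTOMERS = {1: "Acme", 2: "Globex"}  # structured table: id -> name
RAW_EVENTS = [                        # raw JSON documents, one per row
    '{"customer_id": 1, "purchase": {"total": 40.0}}',
    '{"customer_id": 2, "purchase": {"total": 15.5}}',
    '{"customer_id": 3, "purchase": {"total": 99.0}}',
]


def join_events(customers, raw_events):
    """Local equivalent of QUERY: inner-join JSON events to customers."""
    rows = []
    for raw in raw_events:
        payload = json.loads(raw)
        customer_id = payload.get("customer_id")
        if customer_id in customers:  # inner join drops unmatched events
            rows.append((customers[customer_id], payload["purchase"]["total"]))
    return rows


if __name__ == "__main__":
    for name, total in join_events(CUSTOMERS, RAW_EVENTS):
        print(name, total)
```

The point of the comparison is that in Snowflake the parsing and joining happen inside one SQL statement, with no separate step to flatten the JSON into columns first.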

Pros and Cons

The advantages of cloud-based data warehousing have been extensively reviewed. The main advantages of Snowflake over traditional on-premises solutions are:

  • Machine Size: No longer an issue. Unlike traditional systems, which typically involve deploying a massive server with plans to upgrade a few years down the line, Snowflake can be deployed on a single extra-small cluster and scaled up and down as needed.
  • Disk Space: No longer an issue, as data storage from cloud providers is both inexpensive and practically unlimited in size.
  • Security: Baked into the system. Snowflake includes a wide array of security features, including IP whitelisting, multi-factor authentication, and end-to-end AES-256 encryption.
  • Disaster Recovery: No longer an issue, as data is automatically replicated across three availability zones and can withstand the loss of any two data centers.
  • Software Upgrades: No longer required. Because Snowflake is provided as a software service, both operating system and database upgrades are applied silently and transparently.
  • Performance: No longer an issue, as clusters can be resized on the fly to deal with unexpectedly high data volumes.
  • Concurrency: No longer an issue, as each cluster can also be configured to automatically scale out to serve massive numbers of users, then scale back when the extra capacity is no longer needed.
  • Tuning and Maintenance: No longer an issue, as Snowflake does not use indexes and, aside from a few well-documented best practices, there is no need to tune the database. Because the platform is built for simplicity, there is little need for DBA resources.

In terms of disadvantages, there is not much to write about. Some customers will need to migrate to Snowflake, and this should be considered part of an overall cloud strategy. Otherwise, there are no significant drawbacks to Snowflake as a data warehouse platform.

Are you considering Snowflake for your business? We’re here to help. My contact information is available below – let’s continue this conversation.

Written by Jeff Beier, Principal

Jeff brings over 20 years of business experience in helping companies grow while managing their bottom line. He is a senior technology leader with a proven success record of delivering excellent operating results. Jeff thinks cross-functionally while applying expertise gained from small businesses to large global enterprises in the areas of marketing, wholesale, retail, healthcare, product and e-Commerce environments. Jeff is proficient in leveraging cloud technologies to modernize applications while reducing cost. He is passionate about delivering business value and quality customer experiences and has a keen ability to create strong partnerships with global stakeholders at all levels.  

Reach out to Jeff to explore these ideas further!