
Choosing a Data Warehouse: Snowflake vs. Redshift

Growing companies need to optimize their data practices. As data volume increases and data questions become more complex, it becomes increasingly difficult to perform basic reporting, let alone move to automated dashboards and insight mining.


One of the most important steps in optimizing your data practices is centralizing all data from siloed sources in a data warehouse. Two of the most popular data warehousing options are Snowflake and Redshift.

When choosing between Snowflake and Redshift, make sure that:

  1. The solution you choose integrates well with your existing tech stack.

  2. Setup will be painless, so you don’t have to divert resources for longer than necessary.

  3. Your new data warehouse will be able to scale with your business.

With these factors in mind, let’s take a closer look at these two options, exploring their similarities, differences, and current pricing models.


Similarities and differences between Snowflake and Redshift

Snowflake and Redshift are both cloud data warehouses that store and process data in the cloud. This means you won’t have to worry about investing in physical space and hardware to house your data.


Similarities

  • Cloud warehouses that store and process data without on-premises hardware

  • Data encrypted at-rest and in-transit

  • SQL-based columnar data warehouse organization

  • Decoupled storage and compute architectures

Both warehouses encrypt data at rest and in transit, which helps keep it secure and supports compliance with privacy regulations.


Snowflake is an independent, publicly traded company, which gives it flexibility in where it runs. While some solutions can only be hosted on a single cloud computing service, Snowflake can be hosted on Google Cloud Platform (GCP), Microsoft Azure, and Amazon Web Services (AWS). This means Snowflake can be easily integrated with a wide range of tools.

Redshift is an Amazon product that can only be operated on AWS. That does not mean it can only integrate with Amazon products, but there are operating constraints that don’t exist for Snowflake customers.


Snowflake and Redshift are both SQL-based, columnar data warehouses with decoupled storage and compute architectures. Columnar warehouses organize data by field (column) rather than by record (row), so all values of the same field are stored next to each other. A report or query that needs only a few fields can read just those fields, which leads to faster computations, makes the data easier to use, and lowers your costs.
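To make the column-oriented layout concrete, here is a minimal Python sketch. It is an illustration only, not how either warehouse is implemented; the records, the field names, and both helper functions are hypothetical. It stores the same three records by row and by column, then counts how many stored values each layout touches to total a single field.

```python
# Hypothetical order records; field names are illustrative only.
rows = [
    {"order_id": 1, "region": "west", "amount": 120.0},
    {"order_id": 2, "region": "east", "amount": 75.5},
    {"order_id": 3, "region": "west", "amount": 30.0},
]

# Row layout: each record's fields sit together.
# Column layout: all values of one field sit together.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def total_from_rows(records, field):
    """Sum one field from a row layout, counting every value read."""
    values_read = 0
    total = 0.0
    for record in records:
        values_read += len(record)  # the whole record is read to reach one field
        total += record[field]
    return total, values_read


def total_from_columns(cols, field):
    """Sum one field from a column layout, counting every value read."""
    values = cols[field]  # only the requested column is read
    return sum(values), len(values)


row_total, row_reads = total_from_rows(rows, "amount")
col_total, col_reads = total_from_columns(columns, "amount")
```

In this toy example both layouts produce the same total, but the row layout touches nine values and the column layout touches three. On wide tables with many fields, that gap is what makes reporting queries cheaper to run.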


Differences

  • Hosting: Redshift can only be hosted on AWS. Snowflake can be hosted on AWS, Google Cloud Platform (GCP), and Microsoft Azure.

  • Speed: Both are competitive, but Snowflake is faster the first time a query is run.

  • Maintenance: Snowflake automates more maintenance processes.

  • Setup: Snowflake has an easier, more user-friendly setup process.

Snowflake and Redshift are widely regarded as the top two warehouses for performance. They both use massively parallel processing (MPP), which spreads the work of a query across many compute nodes. For practical purposes, this means they can run multiple analyses and queries simultaneously without a noticeable drop in performance. This is a great feature for companies that intend to have multiple team members working with data at once.
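The idea behind massively parallel processing can be sketched in a few lines of Python: split the data into slices, let each worker compute a partial result on its own slice, then combine the partial results. This is a simplified illustration under stated assumptions, not either vendor's engine. The function names, the worker count, and the sample data are hypothetical, and Python threads stand in for what would be separate compute nodes in a real warehouse.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_slices(values, slice_count):
    """Deal values across slices round-robin, like rows spread across nodes."""
    return [values[i::slice_count] for i in range(slice_count)]


def partial_sum(data_slice):
    """Work done independently by one worker on its own slice."""
    return sum(data_slice)


def parallel_total(values, worker_count=4):
    """Fan the work out to workers, then combine their partial results."""
    slices = split_into_slices(values, worker_count)
    with ThreadPoolExecutor(max_workers=worker_count) as pool:
        partials = list(pool.map(partial_sum, slices))
    return sum(partials)


amounts = list(range(1, 1001))  # hypothetical column of 1,000 values
total = parallel_total(amounts)
```

Because each slice is processed independently, adding workers shortens the time to finish a large aggregation, and the final combine step stays small.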


However, there are situations where Snowflake outperforms Redshift. Redshift is generally slower the first time a query is run, when the query still has to be optimized. As data is cached over repeated runs, these queries tend to become much faster. Snowflake starts off much faster on these first-time queries.


All data warehouses require some maintenance. Snowflake automates more of these processes than Redshift does, saving users substantial time in troubleshooting.


Both products have a reputation for being robust and capable of handling the needs of companies ranging from small start-ups to large enterprises. Snowflake is known for its simple, intuitive setup and interface, which can make it more user-friendly for teams with less experience in the space.


Pricing: Snowflake vs Redshift

For most companies, compute accounts for the majority of data warehouse costs. Storage has become increasingly efficient and therefore cheaper: it currently costs just $20 to $25 per month to store one terabyte (TB) of data. Compute costs typically run higher and depend on each customer’s needs.
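As a quick back-of-the-envelope check, the storage figure above can be turned into a monthly estimate. The sketch below is only an illustration: it uses the $20 to $25 per TB range quoted in this article, while the function name and the 4 TB example volume are hypothetical. Actual prices vary by vendor, region, and plan.

```python
# Per-terabyte monthly storage price range quoted in this article (USD).
STORAGE_USD_PER_TB_LOW = 20.0
STORAGE_USD_PER_TB_HIGH = 25.0


def monthly_storage_cost_range(terabytes):
    """Return the (low, high) monthly storage cost in USD for a data volume."""
    if terabytes < 0:
        raise ValueError("terabytes must be non-negative")
    return (
        terabytes * STORAGE_USD_PER_TB_LOW,
        terabytes * STORAGE_USD_PER_TB_HIGH,
    )


# Hypothetical example: a warehouse holding 4 TB of data.
low, high = monthly_storage_cost_range(4)
```

For the hypothetical 4 TB warehouse, the estimate comes to between $80 and $100 per month, which is why storage is rarely the line item that drives the bill.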


Pricing Basics

| Snowflake | Redshift |
| --- | --- |
| Storage costs are separate from compute costs. | Storage and compute costs are bundled. |
| There are three pricing levels: standard, enterprise, and business critical. | Complex pricing system with pay-as-you-go, on-demand, and flat-rate options for compute payment and active and long-term classifications for storage. |
| Compute charges based on query time, not amount of data. | Compute charges either based on amount of data or computation time, depending on your bundle. |

Redshift’s pricing structure is quite complex. At a high level, Redshift bundles storage and compute costs and provides discounts for long-term commitments. Its compute fees are offered as pay-as-you-go, on-demand, and flat-rate options, while storage is priced per gigabyte and split into active and long-term data classifications. Compute is billed either by the amount of data processed (so query time matters less than the number of bytes being worked on) or in “Redshift Processing Units” (RPUs), which are based on computation time. Which compute pricing model applies depends on the chosen bundle. This generally results in slightly higher compute costs compared to Snowflake.


Snowflake’s pricing model is simpler and separates storage costs from compute costs. There are just three pricing levels: standard, enterprise, and business critical. Customers can purchase storage up front or month-to-month. Snowflake’s compute charges are based on query time, not on the amount of data, so queries that take longer to run cost more. The upside is that customers have an incentive to write cleaner queries that aren’t unnecessarily complicated. In most business use cases, compute, and thus analysis, is cheaper on Snowflake.
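The difference between billing by query time and billing by amount of data can be sketched with a little arithmetic. The rates and query figures below are placeholders chosen for illustration, not real Snowflake or Redshift prices, and the function names are hypothetical. The point is only to show which input each model responds to.

```python
def time_based_charge(runtime_seconds, usd_per_hour):
    """Charge that grows with how long the compute runs."""
    return runtime_seconds / 3600 * usd_per_hour


def bytes_based_charge(terabytes_scanned, usd_per_tb):
    """Charge that grows with how much data the query reads."""
    return terabytes_scanned * usd_per_tb


# Placeholder rates, NOT real Snowflake or Redshift prices.
HYPOTHETICAL_USD_PER_HOUR = 4.0
HYPOTHETICAL_USD_PER_TB = 5.0

# Two hypothetical queries over the same 0.5 TB of data.
simple_query = {"runtime_seconds": 90, "terabytes_scanned": 0.5}
complex_query = {"runtime_seconds": 900, "terabytes_scanned": 0.5}

for name, q in (("simple", simple_query), ("complex", complex_query)):
    by_time = time_based_charge(q["runtime_seconds"], HYPOTHETICAL_USD_PER_HOUR)
    by_bytes = bytes_based_charge(q["terabytes_scanned"], HYPOTHETICAL_USD_PER_TB)
    print(f"{name}: time-based ${by_time:.2f}, bytes-based ${by_bytes:.2f}")
```

Under the time-based model, the complex query costs ten times as much as the simple one because it runs ten times longer. Under the bytes-based model, both cost the same because they read the same amount of data. That is the incentive to keep queries lean when compute is billed by time.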

Which is better: Snowflake or Redshift?

We believe Snowflake is the clear best-in-class data warehouse, particularly for start-ups and smaller agile companies. Not only is it easy to implement, but it also presents great opportunities for cost savings and has the edge in performance. Redshift offers many excellent features, but most businesses won’t be able to use them to their full potential, either because they lack the resources or because they don’t need them, and Redshift’s additional maintenance requirements compound the problem. Snowflake’s system is powerful enough to support small- and medium-sized businesses now and for years into the future as they scale.


Set up your Snowflake warehouse in minutes with Mozart Data

Mozart Data provides an out-of-the-box modern data stack that’s built with Snowflake under the hood. We chose best-in-class technology to serve our customers and make it easy for them to go from siloed data to analysis-ready data in an hour. Because of the scale of our partnership with Snowflake, we’re able to pass cost savings on to our customers. By getting your Snowflake data warehouse through Mozart, you’ll save 30% compared to independently purchasing Snowflake and the rest of your data stack. Read more about why we chose Snowflake.