Snowflake Cost Optimization: 12 Steps To Smarter Spending

Table Of Contents

How Snowflake Calculates Costs
Keep Your Bill Under Control With These Snowflake Cost Optimization Tips
Now What? Input Your Telemetry Data Into CloudZero Dimensions
FAQs

Snowflake has gained significant traction in data warehousing due to its unique architecture, flexibility, and ease of use. Heck, after using a legacy approach for some time, we at CloudZero went with Snowflake.

Along the way, we noticed that Snowflake customers raise concerns about costs more often than they would like. So, in this guide, we’ll share how Snowflake calculates costs and our best tips for getting the most out of Snowflake without overspending.

How Snowflake Calculates Costs

Snowflake’s pricing model is based on pay-per-second usage for compute resources, with a 60-second minimum each time a warehouse starts or resumes. This is different from hourly pricing (like Redshift), and it makes Snowflake more cost-effective for organizations that run sporadic queries.

Your Snowflake bill covers three things: the storage you use, the compute you consume (billed per second while queries are processed), and the data you transfer.

This pricing model opens the door to several interesting cost optimization opportunities if you’re willing to investigate why and how things are working behind the scenes.

Keep Your Bill Under Control With These Snowflake Cost Optimization Tips

The Snowflake cost optimization strategies below range from easy and simple changes to long-term, robust methods of understanding how every query affects your monthly usage bill.

For best results, start at the top and work your way down to keep your spending within a comfortable range.

Warehouse optimization

1. Use multiple warehouses

It is common to set up a Snowflake warehouse of whatever size you think you’ll need most frequently and then run every query within it. Since this tactic technically works, many people continue to do it indefinitely.

However, Snowflake breaks your costs down at the warehouse level. So, if you run every query in one warehouse, everything you do gets lumped into one big sum on your monthly bill.

Instead, break your queries out into multiple warehouses. You’ll start to see how the costs of those warehouses vary. This can reveal why each warehouse costs more or less than others.
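As a minimal sketch of this split, the statements below create one warehouse per workload and then compare credit usage across warehouses. The warehouse names (etl_wh, bi_wh), the sizes, and the 30-day window are hypothetical. The query assumes your role can read the SNOWFLAKE.ACCOUNT_USAGE views, which lag real time by up to a few hours.

```sql
-- Hypothetical warehouses: one per workload instead of one shared warehouse.
CREATE WAREHOUSE IF NOT EXISTS etl_wh
  WAREHOUSE_SIZE = 'SMALL'
  AUTO_SUSPEND = 60          -- seconds of inactivity before suspending
  AUTO_RESUME = TRUE
  INITIALLY_SUSPENDED = TRUE;

CREATE WAREHOUSE IF NOT EXISTS bi_wh
  WAREHOUSE_SIZE = 'XSMALL'
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE
  INITIALLY_SUSPENDED = TRUE;

-- Credits consumed per warehouse over the last 30 days, highest first.
SELECT
  warehouse_name,
  SUM(credits_used) AS credits_last_30_days
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
GROUP BY warehouse_name
ORDER BY credits_last_30_days DESC;
```

Once each workload runs in its own warehouse, each row of the result corresponds to one workload rather than to everything at once.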

2. Size your warehouses appropriately

Snowflake costs also vary by the size of the warehouse. Each step up in size doubles the credits a warehouse consumes per hour of run time. Thus, it is easy to assume that smaller warehouses will always be cheaper.

Interestingly, costs don’t always scale that way. You pay for the time each warehouse is running, so a larger, quicker warehouse can sometimes cost the same as or less than a smaller one simply because it finishes queries much faster. However, the results depend on the query, so nailing down which sizes are right for you may require some experimentation.

For instance, you might want to do some simple A/B testing. How much does Snowflake cost for a particular query on an extra-large versus an extra-small warehouse?

From there, fine-tune to find the option that best meets your needs with a balance of cost and efficiency.
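One way to run such a test, sketched under a few assumptions: sizing_test_wh is a hypothetical dedicated warehouse, the tag values are placeholders, and the statement under test is a SELECT. The result cache is switched off so the second run is not answered from the first run's cached result.

```sql
USE WAREHOUSE sizing_test_wh;   -- hypothetical test warehouse

-- Disable the result cache for this session so both runs do real work.
ALTER SESSION SET USE_CACHED_RESULT = FALSE;

-- Run A: extra-small.
ALTER WAREHOUSE sizing_test_wh SET WAREHOUSE_SIZE = 'XSMALL';
ALTER SESSION SET QUERY_TAG = 'sizing_test_xsmall';
-- <run the query you want to test here>
ALTER SESSION UNSET QUERY_TAG;

-- Run B: extra-large.
ALTER WAREHOUSE sizing_test_wh SET WAREHOUSE_SIZE = 'XLARGE';
ALTER SESSION SET QUERY_TAG = 'sizing_test_xlarge';
-- <run the same query again>
ALTER SESSION UNSET QUERY_TAG;

-- Compare elapsed time per run (this view can lag by up to about 45 minutes).
SELECT
  query_tag,
  warehouse_size,
  total_elapsed_time / 1000 AS elapsed_seconds
FROM snowflake.account_usage.query_history
WHERE query_tag LIKE 'sizing_test_%'
  AND query_type = 'SELECT'     -- assumes the tested statement is a SELECT
ORDER BY start_time;
```

An extra-large warehouse consumes 16 credits per hour against 1 for an extra-small, so the larger size breaks even only if it finishes roughly 16 times faster. Repeating each run a few times helps smooth out the effect of warm or cold warehouse caches.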

3. Use Snowflake’s auto-suspend feature

Auto-suspend is configurable in each warehouse’s settings, but again, it’s one thing most people never think to check.

When you run a query, the warehouse will wake up, execute your query, and then stay running for a while to see whether you need to do more work. If you don’t have your warehouses set to suspend automatically, they’ll keep running, and you’ll keep paying for the time even if you’re not actively using it.

You can play around to find the time frame that works best for you, but we at CloudZero have found that lowering the default from five minutes to one minute makes a big difference in our Snowflake costs.
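For reference, a one-minute auto-suspend can be set with a statement like the one below. The warehouse name my_warehouse is a placeholder, and 60 seconds is simply the value mentioned above, not a universal recommendation.

```sql
-- AUTO_SUSPEND is expressed in seconds of inactivity.
ALTER WAREHOUSE my_warehouse SET
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE;   -- wake up automatically when the next query arrives

-- Check the current settings (see the auto_suspend and auto_resume columns).
SHOW WAREHOUSES LIKE 'my_warehouse';
```

Because each resume is billed for at least 60 seconds, going much below one minute rarely saves more and can cause the warehouse to stop and start repeatedly.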

4. Ensure minimum clusters are set to one

For multi-cluster warehouses (available in Enterprise Edition and higher), this Snowflake best practice means the warehouse starts with a single cluster and adds more only when concurrent queries need them. A higher minimum keeps that many clusters running whenever the warehouse is active, whether or not the workload requires them, and you pay for each one.

Log in to Snowflake and change the count under Warehouse>select warehouse>Edit>Scaling policy>Minimum clusters. Or, you can use the ALTER WAREHOUSE command:

ALTER WAREHOUSE my_warehouse SET MIN_CLUSTER_COUNT = 1;

With the minimum at one and a higher maximum, the warehouse can still scale out to handle increased workloads while keeping its baseline capacity, and its baseline cost, low. This flexibility can help to optimize performance and resource consumption, particularly in environments with fluctuating query demands.

Query optimization

5. Reduce the frequency of queries

Snowflake’s pay-per-use model means each query consumes resources and incurs charges. Frequent queries, especially those that are repetitive and return similar results, can increase costs.

Further, excessive querying can strain your system, potentially causing performance bottlenecks and slower response times.

Instead, consider caching results, using materialized views, or optimizing query logic to minimize redundancy. This approach can help reduce costs while improving overall system performance and user experience.
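As an illustration of the materialized view option, the sketch below precomputes a daily aggregate so that dashboards read a small result instead of rescanning the base table. The orders table and its columns are hypothetical. Materialized views require Enterprise Edition or higher and carry their own maintenance cost, so they tend to pay off only for queries that run often against data that changes relatively slowly.

```sql
-- Hypothetical base table: orders(order_date DATE, amount NUMBER).
CREATE MATERIALIZED VIEW daily_order_totals AS
SELECT
  order_date,
  SUM(amount) AS total_amount
FROM orders
GROUP BY order_date;

-- Repeated dashboard queries now read the precomputed totals.
SELECT order_date, total_amount
FROM daily_order_totals
WHERE order_date >= DATEADD(day, -7, CURRENT_DATE());

-- Result caching is on by default; confirm it has not been disabled.
SHOW PARAMETERS LIKE 'USE_CACHED_RESULT' IN SESSION;
```

With result caching enabled, Snowflake can return the stored result of an identical query for up to 24 hours, as long as the underlying data has not changed, without using warehouse compute.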

6. Enable query timeouts

Set a maximum duration for queries. Query timeouts automatically terminate inefficient or runaway queries. This can prevent long-running processes from monopolizing excessive compute resources, particularly in multi-user environments.

When a query exceeds its allotted time, it signals a need for optimization. This prompts your team to review and improve the query logic.

Implementing query timeouts also improves your system’s reliability and user experience. It prevents unexpected delays and ensures more predictable query execution times, leading to better resource management and smoother operation.
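Timeouts can be set at the account, warehouse, user, or session level through the STATEMENT_TIMEOUT_IN_SECONDS parameter. The sketch below uses a placeholder warehouse name and illustrative limits; pick values that fit your longest legitimate queries.

```sql
-- Cancel any statement on this warehouse that runs longer than 15 minutes.
ALTER WAREHOUSE my_warehouse SET STATEMENT_TIMEOUT_IN_SECONDS = 900;

-- Cancel statements that wait in the queue for more than 2 minutes.
ALTER WAREHOUSE my_warehouse SET STATEMENT_QUEUED_TIMEOUT_IN_SECONDS = 120;

-- A tighter limit for the current session only, for example ad hoc exploration.
ALTER SESSION SET STATEMENT_TIMEOUT_IN_SECONDS = 300;
```

The default statement timeout is 172800 seconds (two days), so a runaway query on an untouched warehouse can run for a long time before Snowflake cancels it.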

7. Embed metadata into each query

This is another tip we use at CloudZero to gain visibility into our own Snowflake costs.

Snowflake breaks down costs by warehouse, which is helpful. But what happens when you (like most users) run several operations within each warehouse? You need a better way to see how each action contributes to your total cost.

We’ve solved this problem by embedding a long comment block with identifying information into each query we run. We’ve also built a program that parses that comment into JSON, which our CloudZero API can then consume. We cover that part later in this guide.

Don’t be so intimidated by embedding this data that you don’t even try. It takes some effort up front, but paired with CloudZero’s telemetry features, this strategy will catapult you much closer to your ultimate goal of high-definition cost visibility.
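CloudZero’s own comment format and parser are not shown in this guide, so the following is only a sketch of the general idea. The JSON keys (team, feature, customer_id), the orders table, and the seven-day window are all hypothetical. The comment is placed at the end of the statement on the assumption that leading comments may be stripped before the text reaches query history.

```sql
-- A query that carries its own identifying metadata as a trailing comment.
SELECT
  order_date,
  SUM(amount) AS total_amount
FROM orders
GROUP BY order_date
/* {"team": "analytics", "feature": "daily_report", "customer_id": "acme"} */;

-- Later: find every query that belongs to the same feature.
SELECT
  query_id,
  warehouse_name,
  total_elapsed_time / 1000 AS elapsed_seconds,
  query_text
FROM snowflake.account_usage.query_history
WHERE query_text LIKE '%"feature": "daily_report"%'
  AND start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP());
```

A different technique with a similar goal is Snowflake’s built-in QUERY_TAG session parameter (ALTER SESSION SET QUERY_TAG = '...'), which stores a string in its own QUERY_HISTORY column instead of inside the query text.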

Table optimization

8. Drop unused tables

Dropping unused Snowflake tables keeps your database clean and lean. A cleaner database structure reduces confusion and improves query efficiency. It also becomes easier to navigate and work with relevant data.

In addition, removing stale data minimizes the risk of data exposure. The smaller the attack surface, the safer sensitive information is from unauthorized access. According to an IBM study, the average data breach costs over $4.5 million to resolve, so this is a Snowflake security best practice you shouldn’t skimp on.

Before deleting tables, ensure they are truly unused, and consider backing up data you may need later. To identify unused tables, analyze your access history logs. Look for recent queries and determine which tables are no longer needed.
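One possible way to shortlist candidates is sketched below. It assumes Enterprise Edition or higher (required for the ACCESS_HISTORY view), a role that can read SNOWFLAKE.ACCOUNT_USAGE, and a hypothetical 90-day threshold. It only considers reads, so treat the output as a list to review, not a list to drop.

```sql
-- Most recent read of each table, taken from access history.
WITH last_access AS (
  SELECT
    obj.value:"objectName"::STRING AS full_table_name,
    MAX(ah.query_start_time)       AS last_accessed_at
  FROM snowflake.account_usage.access_history AS ah,
       LATERAL FLATTEN(input => ah.base_objects_accessed) AS obj
  WHERE obj.value:"objectDomain"::STRING = 'Table'
  GROUP BY 1
)
-- Existing tables with no recorded read in the last 90 days, largest first.
SELECT
  t.table_catalog || '.' || t.table_schema || '.' || t.table_name AS full_table_name,
  t.bytes,
  la.last_accessed_at
FROM snowflake.account_usage.tables AS t
LEFT JOIN last_access AS la
  ON la.full_table_name =
     t.table_catalog || '.' || t.table_schema || '.' || t.table_name
WHERE t.deleted IS NULL
  AND t.table_type = 'BASE TABLE'
  AND (la.last_accessed_at IS NULL
       OR la.last_accessed_at < DATEADD(day, -90, CURRENT_TIMESTAMP()))
ORDER BY t.bytes DESC;

-- After review (table name is a placeholder):
-- DROP TABLE my_db.my_schema.old_events;
-- UNDROP TABLE my_db.my_schema.old_events;  -- works only within the Time Travel window
```

Sorting by size puts the tables with the largest storage footprint at the top, which is usually where a cleanup saves the most.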

9. Use transient tables

Snowflake transient tables persist until you explicitly drop them, but they skip the Fail-safe protection that permanent tables carry. They are ideal for scenarios where data does not require long-term retention but needs to persist beyond a single session.

Transient tables are also useful for staging and intermediate data processing tasks, where data is frequently loaded, transformed, and then discarded.

Transient tables:

Have no Fail-safe period and at most one day of Time Travel, which keeps their storage costs lower than those of permanent tables.
Keep staging and intermediate data separate from permanent tables, which is particularly helpful during complex data processing tasks.
Can be accessed by multiple users and sessions, encouraging collaboration while remaining easy to manage.
Help organize complex transformations into manageable steps, simplifying data manipulation and analysis.

This option offers the same query performance as permanent tables without the added cost of extended data retention. The trade-off is that once the Time Travel window has passed, Snowflake cannot recover the data, so reserve transient tables for data you can rebuild.
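A minimal example of a transient staging table follows. The table name and columns are hypothetical, and setting the retention to zero days is an optional extra step that turns off Time Travel for this table entirely.

```sql
-- Hypothetical staging table for an intermediate processing step.
CREATE TRANSIENT TABLE staging_orders (
  order_id   NUMBER,
  order_date DATE,
  amount     NUMBER(12, 2)
)
DATA_RETENTION_TIME_IN_DAYS = 0;   -- transient tables allow 0 or 1

-- Drop it explicitly once the pipeline no longer needs it.
DROP TABLE IF EXISTS staging_orders;
```

Leaving the retention at one day instead keeps a short safety net for accidental deletes at a small storage cost.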

10. Avoid frequent DLM operations

Frequent Data Load Management (DLM) operations, such as data archiving, purging, and replication, can consume significant compute and storage resources, increasing Snowflake costs.

Intensive DLM activity can also cause data processing delays due to the overhead of managing numerous files in the internal load queue. This can result in longer load times and reduced efficiency, especially when dealing with many small files rather than fewer larger ones.

Streamlining your DLM processes can also simplify data governance and management, which can improve your Snowflake data quality and compliance without overburdening your system.

Better still, using efficient data loading practices, such as batching files and optimizing file sizes, can further improve overall system performance and cost-effectiveness.
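To illustrate the batching point, the sketch below loads a whole folder of staged files with a single COPY statement instead of one statement per file. The stage (@my_stage), path, target table, and file format are hypothetical. Snowflake’s loading guidance suggests compressed files of roughly 100 to 250 MB each.

```sql
-- Load every gzipped CSV under the orders/ prefix in one batch.
COPY INTO staging_orders
  FROM @my_stage/orders/
  PATTERN = '.*[.]csv[.]gz'
  FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);
```

COPY skips files it has already loaded into the table, so the same statement can run on a schedule without duplicating rows from earlier batches.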

Monitoring and visibility optimizations

11. Properly configure Snowflake resource monitors

To configure Snowflake resource monitors, go to Account>Resource Monitors>Create Resource Monitor, then follow these steps:

Assign warehouses: Connect the resource monitor to the specific warehouses you want to monitor. This ensures that warehouse credit usage is actually tracked.
Set a credit quota: Pick the maximum number of credits you want the monitored resource (account or warehouse) to consume within a given period. You can define monthly, weekly, or daily credit limits.
Choose the monitor level: Monitor the whole account or specific warehouses. Account-level monitors track usage across all warehouses, while warehouse-level monitors provide more granular control.
Define the monitoring schedule: Set the start and end times and the frequency at which the credit count resets (daily, weekly, monthly, etc.).
Configure triggers: Specify the actions to take when thresholds are reached, such as sending alerts, suspending warehouses, or preventing further usage.
Enable email notifications: Notify relevant users when thresholds are exceeded so they can take action to prevent overspending.
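The same setup can be scripted. The sketch below assumes the ACCOUNTADMIN role and uses a hypothetical monitor name, quota, and thresholds.

```sql
USE ROLE ACCOUNTADMIN;

-- Hypothetical monitor: 100 credits per month with three escalating triggers.
CREATE OR REPLACE RESOURCE MONITOR monthly_wh_limit
  WITH CREDIT_QUOTA = 100
       FREQUENCY = MONTHLY
       START_TIMESTAMP = IMMEDIATELY
       TRIGGERS
         ON 75 PERCENT DO NOTIFY              -- alert only
         ON 100 PERCENT DO SUSPEND            -- let running queries finish, then suspend
         ON 110 PERCENT DO SUSPEND_IMMEDIATE; -- cancel running queries and suspend

-- Attach the monitor to a specific warehouse (placeholder name).
ALTER WAREHOUSE my_warehouse SET RESOURCE_MONITOR = monthly_wh_limit;
```

For an account-level monitor, the last statement would instead be ALTER ACCOUNT SET RESOURCE_MONITOR = monthly_wh_limit.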

12. Understand your unit Snowflake costs with CloudZero

Snowflake’s unique architecture can help you manage your data cloud and analytics costs. The problem is that most Snowflake cost management tools don’t provide specific cost intelligence. Identifying the people, products, and processes that drive your Snowflake spend is challenging.

Without knowing exactly who and what is changing your Snowflake bill, and why, it becomes harder to tell where to cut usage without compromising performance or data quality.

Connecting your Snowflake data cloud to CloudZero changes that.

CloudZero ingests and displays your Snowflake storage and usage costs in a way you likely haven’t seen before. For instance, our dashboard breaks down your Snowflake costs per warehouse and displays them next to your AWS costs so you can easily compare them.

CloudZero offers more, too.

Now What? Input Your Telemetry Data Into CloudZero Dimensions

This is where the magic happens.

Let’s say you’ve tracked metadata for each query, and now you have a bunch of data sitting in your query history. You could look at the metadata manually, but there is a better way.

CloudZero Dimensions ingests all your gathered data, analyzes it, and displays it in clear, convenient charts and graphs.


Instead of seeing that a certain warehouse costs you $X per month, now you can break down that total cost even further.

You can track your Snowflake costs by query, customer, type of query within each customer category, or any other metric you choose. Comparing them will help you determine where to improve your efficiency or reduce costs.

Let’s say one of your warehouses costs $22,000 this month. If that’s higher than normal, you’re probably wondering what could have contributed to the increase. You’ve just begun using a new type of query this month, so this is the obvious culprit in your mind.

Thankfully, you tracked metadata for each query type and customer, so you can use Dimensions to break your total down into unit costs by category.

Looking closer, you’re surprised to learn that the new query cost $1,200 this month. At about 5.45% of $22,000, this query is responsible for some, but probably not all, of the jump in price.

Then, you switch to the graph that displays costs per customer and see that one customer is responsible for a huge chunk of this month’s database queries.

Because queries for this customer are typically low, you’ve assigned them to a small Snowflake warehouse, assuming that will be sufficient. However, the spike in demand has caused that warehouse to need to run almost constantly, rarely, if ever, reaching the point of automatically turning off.

This cost intelligence empowers you to investigate why the change in usage happened, helping you fine-tune your approach toward this customer.

Without this detailed breakdown of unit costs, you’d be stuck paying your enormous bill without knowing why it happened or how to prevent it from happening again.

We’ve used cost per query and cost per customer as examples. These metrics frequently help our customers. However, in CloudZero, the metrics you decide to track and the depth and granularity with which you break them down are customizable to your needs.

Yet reading about CloudZero is nothing like experiencing it for yourself and seeing how it can help you get the most out of Snowflake without compromising performance and data quality.

FAQs

What makes Snowflake’s architecture unique?

Snowflake differentiates itself by separating storage, compute, and cloud services. This design offers unmatched flexibility, scalability, and cost optimization compared to traditional data platforms.

How does Snowflake separate compute and storage?

Snowflake decouples storage and compute resources. This means you can scale storage independently of computational power and vice versa. Resources can be adjusted dynamically to match your performance and cost requirements.

What kind of storage does Snowflake use?

Snowflake uses Hybrid Columnar Storage for centralized database storage. This approach allows for:

Efficient data compression.
Faster retrieval compared to traditional row-based systems.

How does Snowflake handle query processing?

Snowflake employs virtual warehouses that independently scale compute resources. These virtual warehouses enable:

Parallel query execution without performance degradation.
Better query performance than competitors like Amazon Redshift, which ties compute resources to specific clusters.

What tasks are handled by Snowflake’s cloud services layer?

Snowflake’s cloud services layer manages:

Authentication and access control.
Multi-cloud support (AWS, GCP, Azure).
Automatic optimizations, reducing the need for manual intervention.

How does Snowflake optimize performance?

Snowflake offers several performance enhancements:

Automatic clustering and performance tuning, minimizing manual management overhead.
A multi-tiered caching system, including result caching, local disk caching, and remote disk storage.

What data formats does Snowflake support?

Snowflake supports various semi-structured data formats, including JSON, Avro, and Parquet, alongside structured data. This enables you to integrate and analyze data from diverse sources without extensive preprocessing.

How does Snowflake manage cloud flexibility and vendor lock-in?

Snowflake’s multi-cloud implementation works across AWS, GCP, and Azure. This ensures superior flexibility and avoids vendor lock-in.

How do Snowflake’s virtual warehouses work?

In Snowflake, you can set up multiple virtual warehouses, which are the compute clusters that run your queries. The data itself lives in Snowflake’s separate storage layer. Key details include:

Warehouses can resume on demand when you execute a query.
Larger warehouses run queries faster but cost more per second of run time.
While a warehouse is suspended, you are not charged for its compute.

How does Snowflake pricing work?

Snowflake charges are based on storage usage and compute resource consumption. You only pay for what you use: compute is billed while a warehouse is running and stops once it is suspended, while storage is billed for the data you keep.

Author: Cody Slingerland

Cody Slingerland, a FinOps certified practitioner, is an avid content creator with over 10 years of experience creating content for SaaS and technology companies. Cody collaborates with internal team members and subject matter experts to create expert-written content on the CloudZero blog.
