1 0
Read Time:2 Minute, 31 Second

Performance and scale are core areas and key factors to Snowflake. Materialized views (MV) brings an  exciting functionality that adds performance enhancement to Snowflake. Snowflake main aim is that Users should focus on analytical task and analyzing data rather than spending time managing and tuning. Clustering and Materialized View the two Snowflake Serverless capabilities helps to deliver the best performance and scalability without requiring administration or maintenance from customers.

Snowflake’s automatic clustering includes the following benefits:

  • No manual operation to perform the reclustering.
  • Incremental clustering automatically enabled based on the new data or data modification.
  • No impact to the production workloads.
  • Exponentially faster queries with lower maintenance and lower cost

Materialized Views comes with following advantages:

  • Ensure fast DML operations in the presence of MVs .
  • Automatically and transparently maintained by Snowflake. A background service update materialized views once changes are made to the base table.
  • Creating the materialized view with Snowflake allows you to specify the new clustering key. Enables Snowflake to reorganize the data during the initial creation of the materialized view
  • Deliver query results via MVs that are always current and consistent with the main data table.
  • Materialized views can speed up expensive aggregation, projection, and selection operations.

In scenarios when a user is accessing an MV, which has yet to be update, Snowflake’s query engine will perform a combine execution with the base table to always ensure consistent query results.

Therefore, During this post we will see how the CLUSTERD MV has improved the Query performance while fetching data from one of the large data set i.e.3 billion rows.

Dataset

Case 1 : Run Aggregation and Grouping on INVOICE table with no clustering/MV.

select cust_id,sum(amt),sum(part_payment),count(*) from invoice group by cust_id

alter warehouse compute_wh suspend

alter session set use_cached_result=false

Case1

Case 2 : For instance, Create the Materialized view and Run Aggregation and Grouping on INVOICE table and notice the timings.

create materialized view inv_mv (INVC_NO,INVC_CD,AMT,cust_id,part_payment) as select INVC_NO,INVC_CD,AMT,cust_id,part_payment from invoice

select cust_id,sum(amt),sum(part_payment),count(*) from inv_mv group by cust_id;

alter warehouse compute_wh suspend

alter session set use_cached_result=false

Clustered MV:

Case 3 : Develop the CLUSTERED Materialized view and Run Aggregation and Grouping on INVOICE table and notice the timings.

 create materialized view inv_mv (INVC_NO,INVC_CD,AMT,cust_id,part_payment) cluster by (cust_id) as select INVC_NO,INVC_CD,AMT,cust_id,part_payment from invoice;

select cust_id,sum(amt),sum(part_payment),count(*) from inv_mv group by cust_id

In conclusion, Combining the Clustering and materialized views produce higher performance improvement without manual intervention and maintenance which really makes snowflake a cloud data warehouse leader unlike the cost and maintenance associated with traditional data warehouse.

Average Rating

5 Star
0%
4 Star
0%
3 Star
0%
2 Star
0%
1 Star
0%

Leave a Reply

Your email address will not be published. Required fields are marked *