DMF Output
0 0
Read Time:1 Minute, 52 Second

In a data-driven world, maintaining data quality is paramount for organizations. Snowflake provides a powerful mechanism to assess and ensure data quality using Data Metric Functions (DMFs). These functions enable administrators to evaluate data in tables based on pre-defined or custom metrics. Large organizations often deal with vast datasets spread across multiple tables and schemas. Extracting DMF Details Across Schemas allows administrators to gain a comprehensive view of where and how DMFs are applied. Tracking this information ensures that data quality initiatives consistently implement and monitor across the entire organization.

  1. Which DMFs are applied?
  2. On which tables and columns?
  3. The results of these metric evaluations?

For a Snowflake AccountAdmin or Schema Owner, having a consolidated view of this information is crucial for auditing, reporting, and troubleshooting data quality issues.

This blog explores a real-time use case where we automate the extraction of detailed information about DMFs applied to tables and columns across a specified schema. This is achieved using a custom stored procedure in Snowflake, GET_DMF_DETAILS_FOR_SCHEMA.

We developed the GET_DMF_DETAILS_FOR_SCHEMA procedure that:

  1. Identifies all DMFs within a specified schema.
  2. Extracts the table and column details associated with each DMF using the DATA_METRIC_FUNCTION_REFERENCES table.
  3. Returns a consolidated result for analysis.

Technical Implementation:

DMF Procedure

Technical Implementation:

  • Automated DMF Discovery: Uses the SHOW DATA METRIC FUNCTIONS command to list all DMFs in the schema.
  • Detail Insights: For each DMF, retrieves the database, schema, and metric paths and identifies the specific columns evaluates.
  • Dynamic Execution: Builds queries dynamically to fetch results for multiple DMFs within the schema.

Call the Procedure:

call GET_DMF_DETAILS_FOR_SCHEMA(‘PUBLIC’);

Output:

DMF Output

If we observe the output we can see DMF: CHECK_HIGH_TRANSACTION applies to the multiple tables i.e. Customer and DATA_QUALITY_TEST. The above proc will give the list of all tables and DMF inside the schema passes as input argument.

Practical Benefits

  1. Auditing: Maintain detailed records of DMFs applied across schemas.
  2. Troubleshooting: Quickly locate data issues by identifying impacted tables and columns.
  3. Governance: Demonstrate compliance by showing proactive monitoring of data quality metrics.

Average Rating

5 Star
0%
4 Star
0%
3 Star
0%
2 Star
0%
1 Star
0%

Leave a Reply

Your email address will not be published. Required fields are marked *