Automating PII Governance in Snowflake

Read Time:3 Minute, 14 Second

The Challenge

Consider a typical enterprise scenario: your data team onboards new customer tables weekly, each containing various PII fields. Your compliance team requires all sensitive data to be properly tagged and masked based on user roles. Without automation, this becomes a bottleneck as data engineers manually identifying columns, applying tags, creating masking policies.

The Solution: Rule-Based Automated PII Governance

We’ve built a scalable, automated PII governance framework in Snowflake that transforms this manual process into rules-driven system. This solution demonstrates how to intelligently identify, tag, mask, and audit PII data with minimal human intervention.

Architecture Overview

Our solution consists of four key components:

Configuration Layer

PII_RULES: Central repository defining detection patterns, tags, and masking policies
PII_TARGET_COLUMNS: Explicit whitelist of columns requiring governance (prevents over-tagging)
PII_LEVEL Tag: Classification taxonomy (PUBLIC, INTERNAL, CONFIDENTIAL, RESTRICTED)

Enforcement Layer

MASK_EMAIL_PHONE: Dynamic masking policy showing real data only to privileged roles
SP_AUTO_TAG_PII: Automated procedure applying governance rules at scale

Audit Layer

PII_DETECTION_AUDIT: Complete lineage of all governance actions with timestamps

Testing Framework

Role-based verification ensuring masking works as expected

What We Achieved

Automated Detection & Protection

The stored procedure SP_AUTO_TAG_PII automatically:

Scans designated columns against pattern-matching rules
Applies classification tags (e.g., PII_LEVEL = ‘CONFIDENTIAL’)
Enforces masking policies in a single execution
Logs every action for compliance reporting

— One simple call protects your entire schema

CALL DEMO_GOV.PUBLIC.SP_AUTO_TAG_PII('DEMO_GOV', 'PUBLIC');

Result: In our demo, 4 PII columns across 2 tables were automatically tagged and masked in seconds—a process that would take hours manually.

Technical Implementation:

Technical Implementation:

3. Audit table

CREATE OR REPLACE TABLE DEMO_GOV.public.PII_DETECTION_AUDIT ( RUN_TS TIMESTAMP_NTZ, DATABASE_NAME STRING, SCHEMA_NAME STRING, TABLE_NAME STRING, COLUMN_NAME STRING, MATCHED_RULE STRING, ACTION_TAKEN STRING, ACTION_DETAILS STRING );

4. Governance objects: Tag + Masking Policy

CREATE OR REPLACE TAG DEMO_GOV.PUBLIC.PII_LEVEL ALLOWED_VALUES 'PUBLIC','INTERNAL','CONFIDENTIAL','RESTRICTED';

CREATE OR REPLACE MASKING POLICY DEMO_GOV.PUBLIC.MASK_EMAIL_PHONE AS (val STRING) RETURNS STRING -> CASE WHEN CURRENT_ROLE() IN ('ACCOUNTADMIN','SECURITYADMIN') THEN val WHEN val IS NULL THEN NULL ELSE '***MASKED***' END;

5. Run the procedure:

CALL SP_AUTO_TAG_PII('DEMO_GOV', 'PUBLIC');

No code changes required—the framework adapts to new requirements through configuration.

Stored Procedure Logic Flow

Stored Procedure Logic Flow

a): Load Active Governance Rules

Queries PII_RULES table to fetch all active rules (IS_ACTIVE = TRUE)
Each rule contains:
- Pattern to match (e.g., %EMAIL%, %PHONE%)
- Tag to apply (e.g., PII_LEVEL = ‘CONFIDENTIAL’)
- Masking policy to enforce (e.g., MASK_EMAIL_PHONE)
Stores rules in a JavaScript array for processing

b): Identify Target Columns (Whitelist Approach)

Joins INFORMATION_SCHEMA.COLUMNS with PII_TARGET_COLUMNS
Only processes explicitly whitelisted columns
Filters by target database and schema passed as parameters

c): Pattern Matching & Rule Application

Loops through each whitelisted column
Matches column name against rule patterns (case-insensitive)
For each match:
- Applies classification tag using dynamic ALTER TABLE statement
- Enforces masking policy using dynamic SQL

d): Audit Trail Logging