1 0
Read Time:3 Minute, 14 Second

The Challenge

Consider a typical enterprise scenario: your data team onboards new customer tables weekly, each containing various PII fields. Your compliance team requires all sensitive data to be properly tagged and masked based on user roles. Without automation, this becomes a bottleneck as data engineers manually identifying columns, applying tags, creating masking policies.

The Solution: Rule-Based Automated PII Governance

We’ve built a scalable, automated PII governance framework in Snowflake that transforms this manual process into rules-driven system. This solution demonstrates how to intelligently identify, tag, mask, and audit PII data with minimal human intervention.

Architecture Overview

Our solution consists of four key components:

  1. Configuration Layer
  • PII_RULES: Central repository defining detection patterns, tags, and masking policies
  • PII_TARGET_COLUMNS: Explicit whitelist of columns requiring governance (prevents over-tagging)
  • PII_LEVEL Tag: Classification taxonomy (PUBLIC, INTERNAL, CONFIDENTIAL, RESTRICTED)
  1. Enforcement Layer
  • MASK_EMAIL_PHONE: Dynamic masking policy showing real data only to privileged roles
  • SP_AUTO_TAG_PII: Automated procedure applying governance rules at scale
  1. Audit Layer
  • PII_DETECTION_AUDIT: Complete lineage of all governance actions with timestamps
  1. Testing Framework
  • Role-based verification ensuring masking works as expected

What We Achieved

 Automated Detection & Protection

The stored procedure SP_AUTO_TAG_PII automatically:

  • Scans designated columns against pattern-matching rules
  • Applies classification tags (e.g., PII_LEVEL = ‘CONFIDENTIAL’)
  • Enforces masking policies in a single execution
  • Logs every action for compliance reporting

— One simple call protects your entire schema

CALL DEMO_GOV.PUBLIC.SP_AUTO_TAG_PII('DEMO_GOV', 'PUBLIC');

Result: In our demo, 4 PII columns across 2 tables were automatically tagged and masked in seconds—a process that would take hours manually.

Technical Implementation:

Technical Implementation:

Rules and Whitelist

3. Audit table

CREATE OR REPLACE TABLE DEMO_GOV.public.PII_DETECTION_AUDIT (
RUN_TS          TIMESTAMP_NTZ,
DATABASE_NAME   STRING,
SCHEMA_NAME     STRING,
TABLE_NAME      STRING,
COLUMN_NAME     STRING,
MATCHED_RULE    STRING,
ACTION_TAKEN    STRING,
ACTION_DETAILS  STRING
);

4. Governance objects: Tag + Masking Policy

CREATE OR REPLACE TAG DEMO_GOV.PUBLIC.PII_LEVEL ALLOWED_VALUES 'PUBLIC','INTERNAL','CONFIDENTIAL','RESTRICTED';

CREATE OR REPLACE MASKING POLICY DEMO_GOV.PUBLIC.MASK_EMAIL_PHONE AS (val STRING)
RETURNS STRING ->
CASE
WHEN CURRENT_ROLE() IN ('ACCOUNTADMIN','SECURITYADMIN') THEN val
WHEN val IS NULL THEN NULL
ELSE '***MASKED***'
END;

5. Run the procedure:

CALL SP_AUTO_TAG_PII('DEMO_GOV', 'PUBLIC');

Proc Output

No code changes required—the framework adapts to new requirements through configuration.

Stored Procedure Logic Flow

Stored Procedure Logic Flow

 a): Load Active Governance Rules

  • Queries PII_RULES table to fetch all active rules (IS_ACTIVE = TRUE)
  • Each rule contains:
    • Pattern to match (e.g., %EMAIL%, %PHONE%)
    • Tag to apply (e.g., PII_LEVEL = ‘CONFIDENTIAL’)
    • Masking policy to enforce (e.g., MASK_EMAIL_PHONE)
  • Stores rules in a JavaScript array for processing

 b): Identify Target Columns (Whitelist Approach)

  • Joins INFORMATION_SCHEMA.COLUMNS with PII_TARGET_COLUMNS
  • Only processes explicitly whitelisted columns
  • Filters by target database and schema passed as parameters

 c): Pattern Matching & Rule Application

  • Loops through each whitelisted column
  • Matches column name against rule patterns (case-insensitive)
  • For each match:
    • Applies classification tag using dynamic ALTER TABLE statement
    • Enforces masking policy using dynamic SQL

 d): Audit Trail Logging

  • Records every governance action in PII_DETECTION_AUDIT table.

6. Verify Audit Log Table.

Audit Log

 Role-Based Access Control

The masking policy implements intelligent role-based data access:

  • ACCOUNTADMIN/SECURITYADMIN: See actual data (amit@example.com, 9999999999)
  • Standard roles (FLM): See masked values (***MASKED***)
RBAC Tag and Masking

Conclusion

This framework demonstrates that with thoughtful architecture, you can:

  • Protect sensitive data without slowing down data teams
  • Scale governance across hundreds of tables and schemas
  • Maintain complete auditability for regulatory requirements

Average Rating

5 Star
0%
4 Star
0%
3 Star
0%
2 Star
0%
1 Star
0%

Leave a Reply

Your email address will not be published. Required fields are marked *