UDF Masking
0 0
Read Time:3 Minute, 51 Second

In this post, we’ll explore the application of masking policies on external tables in Snowflake. While masking policies are commonly implemented on permanent tables, there are scenarios where sensitive data may reside in external sources, such as files in S3, which do not need to be fully ingested into Snowflake.

Scenario:

Imagine a situation where you’ve received a feed file from the business on your S3 location. The file contains sensitive information, and the requirement is to mask this data when querying the external table. Let’s walk through how to achieve this.

Step 1: Setting Up Storage Integration and File Formats

We’ll start by creating the storage integration and the necessary file formats for the external table.

CREATE OR REPLACE STORAGE INTEGRATION s3_int
TYPE = EXTERNAL_STAGE
STORAGE_PROVIDER = S3
ENABLED = TRUE
STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::913267004595:role/testsnowflake'
STORAGE_ALLOWED_LOCATIONS = ('s3://sharedbucket29/countrycode/');

CREATE OR REPLACE FILE FORMAT demo_db.public.csv_format
TYPE = CSV
SKIP_HEADER = 1
NULL_IF = ('NULL', 'null')
EMPTY_FIELD_AS_NULL = TRUE;

Step 2: Creating the External Stage and External Table and Masking Policy

Next, we define the external stage pointing to the S3 bucket and create the external table based on the file format.

CREATE OR REPLACE STAGE demo_db.public.ext_csv_stage
URL = 's3://sharedbucket29/countrycode/'
STORAGE_INTEGRATION = s3_int
FILE_FORMAT = demo_db.public.csv_format;

CREATE OR REPLACE MASKING POLICY country_masked AS (val STRING) RETURNS STRING ->
CASE
WHEN CURRENT_ROLE() IN ('CREATE_NOTEBOOKS') THEN val
ELSE 'na'
END;

CREATE OR REPLACE EXTERNAL TABLE Country_Code_Details
(
country_code VARCHAR(10) AS (value:c1::varchar),
country_number VARCHAR(20) AS (value:c2::varchar),
country_currency_name VARCHAR(10) AS (value:c3::varchar),
currency_code VARCHAR(20) AS (value:c4::varchar) WITH MASKING POLICY country_masked,
currency_number VARCHAR(10) AS (value:c5::varchar)
)
WITH LOCATION = @demo_db.public.ext_csv_stage
FILE_FORMAT = demo_db.public.csv_format;

use Role ACCOUNTADMIN;

select * from Country_Code_Details;

External Table Mask

To address the issue of the VALUE column not being masked, we can adopt a different approach. First, we create the external table without any masking. Then, we apply the masking policy by creating a view on top of the external table.

CREATE OR REPLACE EXTERNAL TABLE Country_Code_Details_nomask
(
country_code VARCHAR(10) AS (value:c1::varchar),
country_number VARCHAR(20) AS (value:c2::varchar),
country_currency_name VARCHAR(10) AS (value:c3::varchar),
currency_code VARCHAR(20) AS (value:c4::varchar),
currency_number VARCHAR(10) AS (value:c5::varchar)
)
WITH LOCATION = @demo_db.public.ext_csv_stage
FILE_FORMAT = demo_db.public.csv_format;

CREATE OR REPLACE VIEW vw_Country_Code_Details AS
SELECT
country_code,
country_number,
country_currency_name,
currency_code WITH MASKING POLICY country_masked,
currency_number
FROM Country_Code_Details_nomask;

SELECT * FROM vw_Country_Code_Details;

View on External TBL

This solution works well, but there’s one drawback. If you have a large number of external tables that require masking, you would need to create individual views for each table to apply the masking policy. To streamline this process, what if we could apply mask to the VALUE column and other derived columns directly within a single external table?

Implementing a UDF-Based Mask Policy.

CREATE OR REPLACE FUNCTION mask_udf(v VARIANT)
RETURNS VARIANT
LANGUAGE JAVASCRIPT
AS
$$
let obj = V;
if (obj.c5) {
obj.c5 = "masked";
}
return obj;
$$;

CREATE OR REPLACE MASKING POLICY country_masked_udf AS (val VARIANT) RETURNS VARIANT ->
CASE
WHEN CURRENT_ROLE() IN ('CREATE_NOTEBOOKS') THEN val
ELSE mask_udf(val)
END;

CREATE OR REPLACE EXTERNAL TABLE Country_Code_Details_udf
(
country_code VARCHAR(10) AS (value:c1::varchar),
country_number VARCHAR(20) AS (value:c2::varchar),
country_currency_name VARCHAR(10) AS (value:c3::varchar),
currency_code VARCHAR(20) AS (value:c4::varchar),
currency_number VARCHAR(10) AS (value:c5::varchar)
)
WITH LOCATION = @demo_db.public.ext_csv_stage
FILE_FORMAT = demo_db.public.csv_format;

ALTER TABLE Country_Code_Details_udf MODIFY COLUMN VALUE SET MASKING POLICY country_masked_udf;

select * from Country_Code_Details_udf;

In this post, we’ve explored various approaches to applying masking policies on external tables in Snowflake. Depending on your use case, you can either apply masking directly to the external table, create a view to simplify management, or implement advanced mask using a UDF for more complex scenarios. Each approach ensures that sensitive data is protected while working with external data sources like S3.

Average Rating

5 Star
0%
4 Star
0%
3 Star
0%
2 Star
0%
1 Star
0%

Leave a Reply

Your email address will not be published. Required fields are marked *