During this post we will discuss multiple FILE FORMAT parameters used in COPY Command. We have seen many times the data push by business varies in nature in a single file. Structure of the data is not in synch even for the Records present in single file. Few Records having Fields enclosed in double quotes. While another set of records are simple and not enclosed in any quotes. On the similar lines few records are having some special characters while others have backspace (\) character in their values.
In short, it can be the possibility we see different combinations with data in same file. As per the business these are the valid records and needs to be push into the system. It is the responsibility of developer to cater such scenario at Snowflake level. To avoid missing of any such records FILE FORMAT comes to the rescue.
We will see the below different FILE FORMAT parameters to avoid any data error.
So we have created the below FILE with known possible combinations in data. We will try to ingest the file into Snowflake and ensure all Records in file should consume. No Record should be dropped.
File has been uploaded to the AWS staging area and create the below FORMAT:
create or replace file format demo_db.public.csv_format
type = csv
skip_header = 1
field_delimiter = ','
date_format = 'YYYY-MM-DD' -- Defined the Date format getting in feed file
FIELD_OPTIONALLY_ENCLOSED_BY='"' -- Handle the fields having value enclosing in double quotes ""
TIMESTAMP_FORMAT='YYYY-MM-DD-HH24.MI.SS.FF' -- Defined the Timestamp format receiving in feed file
encoding = 'iso-8859-1' -- to handle special character ß
SKIP_BLANK_LINES = TRUE -- Handle the Blank line in your feed file
NULL_IF = ('NULL', 'null', '\N') -- to repace blank with sql NULL values
ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE
ESCAPE_UNENCLOSED_FIELD=NONE -- to keep the backslash in your data and avoid error
empty_field_as_null = true
TRIM_SPACE = TRUE;
create or replace stage demo_db.public.load_test
URL = 's3://ec2s3'
STORAGE_INTEGRATION = s3_int
file_format = demo_db.public.csv_format;
create or replace table FORMAT_TBL (
CUST_NUM varchar(8) null,
CUST_STAT varchar(255) null,
CUST_BAL number(20) null,
INV_NO varchar(10) null,
INV_AMT number(10) null,
Verify the Data into Table: