While uploading multiple files into table, question comes to our mind, how can we iterate files using Snowflake metadata parameter. Here, File Iteration using Meta object implies to identify the source file i.e. File Name for the Record or data.
Consider scenario where data is coming from the several files. Therefore how can we determine Filename through which data is coming and ingesting into the table.
The answer is pretty straightforward. Metadata objects comes to the rescue which helps to determine the filename along with the line number from where the data is coming .
File Iteration using meta object say, metadata$filename and metadata$file_row_number are snowflake parameter. These parameters comes into relevance to store the Filename and Record number along with the data information in a table.
Say 100 files to be upload and in order to track data/Record belongs to which file. We can use above parameters to identify the source of truth for filename.
- Also if run the copy command again to oad the data into table then snowflake does nt load the duplicate data into the table. As in below example let say you have already loaded the table with staging area and again you are running the same copy command then script will not throw any error but no duplicate records would be copied into the table
- But running the COPY command leads to the processing cost ,Also storing the files in Staging area leads to the Storage cost as well. So as best practice if you are no longer required the stage files after upload into the tables then it is advisable we should remove these files from Staging area by using rm command. This would save processing and Storage cost.
create or replace file format demo_db.public.csv_format
type = csv
skip_header = 1
null_if = ('NULL', 'null')
empty_field_as_null = true;
create or replace stage demo_db.public.ext_csv_stage
URL = 's3://testsnowflakepro/snowflake/csv'
STORAGE_INTEGRATION = s3_int
file_format = demo_db.public.csv_format;
create or replace table demo_db.public.Demographic (
file_name string,
fie_row_number string,
first_name string not null,
last_name string ,
email string ,
streetaddress string ,
city string
);
copy into demo_db.public.Demographic
from (select metadata$filename, metadata$file_row_number, t.$1 , t.$2 , t.$3 , t.$4 , t.$5
from @demo_db.public.ext_csv_stage t)
on_error = 'continue'
FORCE = TRUE;