In continuation of the previous post, here we are going to load selected column sequentially from CSV file to Snowflake table.
Sequential columns: selecting sequential columns from the .csv file starting with column1, column2,column3, etc. then you can use the COPY command. Again note, We are not extracting the complete fields from CSV file instead we Load selected column but this time columns are in sequential order not in disparate way like we implemented in previous post.
- Firstly, Create target table with desired business columns
- Secondly, Create file format ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE. This parameter has to be set to FALSE when selecting few columns out of all the columns contained in the .csv file.
- Finally, COPY the data into target table.
ERROR_ON_COLUMN_COUNT_MISMATCH : Specifies whether to generate an error when the columns in the source file do not match that of the destination table. When set to FALSE, an error not thrown and the load continues. Hence, The fields loaded in the order of the occurrence in the source file.
When using the INSERT statement to select the specific columns you want directly from the .csv file, the ERROR_ON_COLUMN_COUNT_MISMATCH parameter does not need to be set to FALSE.
For Instance, Create the File format:
create or replace file format demo_db.public.csv_format_few_cols
type = csv
FIELD_DELIMITER = ‘,’
skip_header = 1
ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE
null_if = (‘NULL’, ‘null’)
empty_field_as_null = true;
Similarly, Create the Stage:
create or replace stage demo_db.public.ext_csv_stage_fewcols
URL = ‘s3://testsnowflakepro/snowflake/csv’
STORAGE_INTEGRATION = s3_int file_format = demo_db.public.csv_format_few_cols
Create or replace table healthcare_target_seq( AVERAGE_COVERED_CHARGES NUMBER(38,6)
,AVERAGE_TOTAL_PAYMENTS NUMBER(38,6) ,TOTAL_DISCHARGES NUMBER(38,0)
,BACHELORORHIGHER NUMBER(38,1)
,HSGRADORHIGHER NUMBER(38,1)
,TOTALPAYMENTS VARCHAR(128)
,REIMBURSEMENT VARCHAR(128) )
Finally, Loaded selected data into table:
copy into healthcare_target_seq
from @demo_db.public.ext_csv_stage_fewcols
on_error = CONTINUE;
Another approach:
Here we are not using the COPY command ,however, we will select the columns from Stage itself and here we can load the data for sequential or non sequential columns.
Create the target table with desired number of columns (Columns can be sequential or non sequential order)
Create file format ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE
Insert into your table columns the cell names you want out of your .csv file to populate the TEST
Create or replace table healthcare_target_test( AVERAGE_COVERED_CHARGES NUMBER(38,6)
,AVERAGE_TOTAL_PAYMENTS NUMBER(38,6)
,TOTAL_DISCHARGES NUMBER(38,0)
,BACHELORORHIGHER NUMBER(38,1) //First 3 columns of file
,PROVIDER_STATE VARCHAR(128)
,PROVIDER_STREET_ADDRESS VARCHAR(256) //last 3 columns of file
,PROVIDER_ZIP_CODE NUMBER(38,0)
)
Insert into healthcare_target_test (AVERAGE_COVERED_CHARGES, AVERAGE_TOTAL_PAYMENTS, TOTAL_DISCHARGES, BACHELORORHIGHER,
PROVIDER_STATE, PROVIDER_STREET_ADDRESS ,PROVIDER_ZIP_CODE )
Select T.$1, T.$2, T.$3,T.$4, T.$24, T.$25,T.$26
FROM @demo_db.public.ext_csv_stage_fewcols as T;