Snowpipe Refresh: During this post we will be discussing Snowpipe use case. Consider the scenario where the files are already present in the S3 bucket. As per the ask, now we need to create Snowpipe on this bucket. Snowpipe enables loading data from files as soon as they’re available in a stage. So once the PIPE gets created it will trigger based on Notification received from cloud messaging service (SQS). Therefore, when we upload new files to bucket, PIPE will load data into tables continuously from an ingestion queue.
Now the question comes, what will happen to the files which are already into the Bucket. Will Snowpipe loads these files automatically?. The answer is No. Snowpipe will not process the already existed files in bucket.
In order to process the already available file ,either we have to use COPY command or we can use the REFRESH command with PIPE. For this use case we would be using Snowpipe Refresh to process the previous files present in bucket before configuring the SNOWPIPE.
As per the Screenshot ,below are the files present in Bucket.
Now we will be creating PIPE in snowflake and configure the notification in AWS.
create or replace pipe demo_db.public.prod_copy auto_ingest=true as
copy into prod_tbl
from @demo_db.public.ext_csv_stage
on_error = CONTINUE;
Now upload the file to Bucket and verify the count in prod table.
As we see there are 52 records loaded into the table. Other existed files did not load into the table.
In order to process the previous files use the below REFRESH command:
ALTER PIPE prod_copy REFRESH;
Verify the Count inside the table:
To verify the status of Snowpipe load history, run the below query and will get all the history.