Snowpipe Troubleshooting: As we know Snowpipe automates the process by enabling the data loading from files as soon as they’re available in a stage. This process makes data available to the users in micro batches within minutes, rather than manually executing COPY statements on a schedule to load larger batches.
But sometimes users comes into the situation where file/data is present in S3 bucket, but snowpipe is not getting triggered to load the data in staging tables. So it can be any issue with File format/pipe configuration settings or any data related issues, Now the challenge is how Snowpipe Troubleshooting can be done by the user for preventing the automatic loading of files.
In this post, we will be discussing about General Troubleshooting Steps:
Troubleshooting Snowpipe:
Step 2. View the COPY History for the Table
Step 3: Validate the Data Files
1: Check Pipe Status: Retrieve the current status of the pipe. For Instance, the results are available in JSON format.
select SYSTEM$PIPE_STATUS(<<‘pipe Name’>>);
lastReceivedMessageTimestamp:
Specifies the timestamp of the last event message received from the message queue. If the timestamp is earlier than expected, this likely indicates an issue with either the service configuration (i.e. Amazon SQS or Amazon SNS, or Azure Event Grid) or the service itself.
lastForwardedMessageTimestamp:
If event messages are getting received from the message queue but are not forwarded to the pipe, then there is likely a mismatch between the blob storage path where the new data files are created and the combined path specified in the Snowflake stage and pipe definitions.
Following values needs to be check:
Firstly, Executionstate : RUNNING
Secondly, LastReceivedMessageTimestamp: 21st May 2021 : 08:41
Finally, lastForwardedMessageTimestamp:21st May 2021 : 08:42
Receiving and Forwarding of Data is complete .
Step 2. View the COPY History for the Table:
Query : COPY_HISTORY table
If event messages are getting received and forwarded, then query the load activity history for the target table.
The STATUS column indicates whether a particular set of files load, partially loaded, or failed to load.
The FIRST_ERROR_MESSAGE column provides a reason when an attempt partially loaded or failed. Please note that if a set of files has multiple issues, the FIRST_ERROR_MESSAGE column only indicates the first error encountered.
In addition, To View all errors in the files, execute a COPY INTO <table> statement with the VALIDATION_MODE copy option set to RETURN_ALL_ERRORS. The VALIDATION_MODE copy option instructs a COPY statement to validate the data to be loaded and return results based on the validation option specified.
SELECT
"FILE_NAME",
"STAGE_LOCATION",
"LAST_LOAD_TIME",
"ROW_COUNT",
"FILE_SIZE",
"ERROR_COUNT",
"STATUS",
"TABLE_CATALOG_NAME",
"TABLE_SCHEMA_NAME",
"TABLE_NAME",
"PIPE_CATALOG_NAME",
"PIPE_SCHEMA_NAME",
"PIPE_NAME",
"PIPE_RECEIVED_TIME" ,
"FIRST_ERROR_MESSAGE"
FROM snowflake.account_usage.copy_history
WHERE
LAST_LOAD_TIME between to_timestamp_ltz('2021-05-20 00:00:00.000000+00:00') AND to_timestamp_ltz('2021-05-22 01:00:00.000000+00:00')
-- AND STATUS != 'load failed'
ORDER BY LAST_LOAD_TIME DESC;
In next part of this article we will discus about 3rd Approach as well as will talk about the negative testing if Snowpipe gets failed. Click here.