File Structure
5 0
Read Time:2 Minute, 5 Second

In the modern data landscape, the common challenge is efficiently handling and integrating new data from various files into data warehouses. This blog post explores a real-world use case for a Snowflake stored procedure designed to dynamically create tables based on CSV file headers, load data, and provide insightful feedback on the data loading process.

Consider a  company that receives daily CSV files containing customer transaction data. The files have varying structures, as different departments might add or remove columns based on their reporting needs. To streamline the ingestion process, the company needs a solution that can:

  1. Dynamically create tables based on the CSV file headers.
  2. Load data into these tables.
  3. Provide a summary of the loading process, including the number of records ingested.

File Structure:

File Structure

We implemented a Snowflake stored procedure, Dynamic_File_Pipe, to address these requirements. This procedure automates the table creation and data loading process, ensuring that the data ingests accurately and efficiently.

Here’s a breakdown of how the Dynamic_File_Pipe stored procedure works:

  1. Generate Column List and Create Table:
    • The procedure first examines the CSV file to dynamically generate a list of columns based on the headers.
    • It uses Snowflake’s FLATTEN function to parse the headers and create a CREATE TABLE statement.
    • The table is created in the DEMO_DB.PUBLIC schema with columns that match the CSV file structure.
  2. Load Data into the Table:
    • The COPY INTO command is used to load the CSV data into the newly created table.
    • The file format for CSV is specified to handle the data appropriately, including delimiters and header skipping.
  3. Count Records and Return Summary:
    • After loading the data, the procedure counts the number of records inserted into the table.
    • It returns a success message with the record count and the date of the load.

Code:

File_Format:

create or replace file format demo_db.public.csv_format_first
type = csv
null_if = ('NULL', 'null')
field_delimiter = ',';

Code:

Dynamic File

Call the Procedure:

call DYNAMIC_FILE_PIPE();

Proc Output

 

Snowflake Output

This dynamic file pipe demonstrates the power of Snowflake’s stored procedures and JavaScript capabilities. By leveraging automation and dynamic schema generation, we can streamline real-time data ingestion, empowering businesses to gain valuable insights from their ever-evolving data landscape.

Average Rating

5 Star
0%
4 Star
0%
3 Star
0%
2 Star
0%
1 Star
0%

Leave a Reply

Your email address will not be published. Required fields are marked *