2 0
Read Time:1 Minute, 52 Second

During this post we are going to discuss how we can read EXCEL file in Snowflake Python code. Snowflake offers seamless integration with a variety of file formats, such as CSV, JSON, Parquet, Avro, ORC, and XML, providing users with extensive options for data ingestion. However, what if your specific use case involves a file format that isn’t among Snowflake’s native offerings?

In such scenarios, SnowflakeFile.open comes to the rescue. This function serves as a versatile gateway, enabling users to work with a broader spectrum of file formats, even those not explicitly supported out of the box by Snowflake. By utilizing SnowflakeFile.open, users gain flexibility and adaptability, ensuring they can efficiently ingest and process data regardless of the file format’s native support status within the Snowflake ecosystem.

This feature enhances Snowflake’s capabilities and making it a powerful tool for handling a wide range of data formats in various real-world scenarios. We’ll explore practical use cases, demonstrating how Snowpark Python, in combination with SnowflakeFile.open, can be employed to seamlessly ingest data from an Excel file into a Snowflake table

Say we have below excel file present in my internal staging area:

Excel snapshot

Below is the Python procedure to read excel file from internal staging area i.e. @EXCEL_STAGE.

Python Excel Proc

Call the Procedure:

CALL parse_excel_sp(build_scoped_file_url(@EXCEL_STAGE, 'Inv.xlsx'));

Output:

Excel Proc Output

Now in extension to it, if we have multiple sheet in one excel file and we have to merge these sheets into one Dataframe. Finally this dataframe has to be imported into Snowflake table. Below is the procedure for the same.

MultiExcel Snapshot

Read Mult Sheet Proc

Call the Procedure:

CALL read_excel_multfile(build_scoped_file_url(@EXCEL_STAGE, 'Inv1.xlsx'));

Output:

MultiExcel Output

Final state of DB after two procs execution:

Final Output

Average Rating

5 Star
0%
4 Star
0%
3 Star
0%
2 Star
0%
1 Star
0%

One thought on “Read Excel using SnowflakeFile.open

  1. You can embed the build scoped file URL as a function to simplify the call signature of the procedure. With this function, you would call the procedure passing in the stage name and file path as parameters

    def get_scoped_url(stage, file_path):
    sql = f”CALL BUILD_SCOPED_FILE_URL(@{stage}, ‘{file_path}’)”
    row = session.sql(sql).collect()
    url = row[0][“BUILD_SCOPED_FILE_URL”]
    return url

Leave a Reply

Your email address will not be published. Required fields are marked *