We have come up with another better approach where processing of millions records completes with in a few second. Also this Error handing process is very flexible as it is able to handle invalid TIMESTAMP,BOOLEAN,NUMBER datatypes with very ease and in effective manner.
Following three tables created for this scenario:
S_INVOICE: To hold the Clean records:
CREATE OR REPLACE TABLE S_INVOICE(INVC_NO VARCHAR,CUST_NO VARCHAR NOT NULL,INVC_AMT NUMBER,INVC_DT timestamp, LATE_FEE BOOLEAN);
RAW_INVOICE: Staging table to hold the source data. Al columns we have created as VARCHAR.
CREATE OR REPLACE TABLE RAW_INVOICE(INVC_NO VARCHAR,CUST_NO VARCHAR ,INVC_AMT VARCHAR,INVC_DT VARCHAR,LATE_FEE VARCHAR);
INVOICE_ERROR: To hold the Error Records.
CREATE OR REPLACE TABLE INVOICE_ERROR(INVC_NO VARCHAR,CUST_NO VARCHAR ,INVC_AMT VARCHAR,INVC_DT VARCHAR,LATE_FEE VARCHAR,ERROR_MSG VARCHAR)
I have inserted 3.1 million records in our staging table i.e. RAW_INVOICE and out of these 100K records I marked them wrong intentionally. I tried to combine all the possible error combinations as below.
Following Multi-Insert query executes which will insert Error records to the Audit/Log table and clean records to the Target table.