Read Time:7 Minute, 36 Second

In my previous article, I built an AI-powered resume screening and job matching solution in Snowflake. That version followed a more structured and deterministic approach:

resumes were parsed from PDF
candidate details were extracted
candidate seniority was classified
resumes and job descriptions were embedded
candidates were ranked against jobs using vector similarity

That design worked well for questions like:

“Who are the top candidates for this job?”

But after publishing that solution, A few practical questions came up:

How do we handle candidate profile updates over time?
What happens when a candidate uploads a new resume and old skillsets become outdated?
Can recruiters interact with the system in natural language instead of depending on predefined ranking logic?
Can we make Cortex Search Service a meaningful part of the solution instead of just creating it and not using it?

Those questions led to this second version.

In this article, I’ll walk through how evolved the first design into a more advanced GenAI-powered hiring assistant built entirely in Snowflake using:

Streamlit for interaction
stage, stream, task, and stored procedure for ingestion and lifecycle management
Cortex Search Service for retrieval
AI_COMPLETE for recruiter-style reasoning

What is different in this version

The earlier solution was centered on deterministic candidate-job matching.

This solution introduces three major upgrades:

1. Candidate lifecycle management

A candidate can upload a new resume. The system should preserve history but use only the latest active version.

2. Cortex Search Service becomes meaningful

Instead of embedding all resumes and doing only pairwise ranking, we now use Cortex Search to retrieve the most relevant resumes for a recruiter’s question.

3. GenAI-based reasoning

Once relevant resumes are retrieved, AI_COMPLETE is used to generate a recruiter-style answer grounded in the retrieved resumes.

End-to-end architecture

The full design now looks like this:

Step 1 — Use the same database, schemas, and resume stage in previous blog

Step 2 — Introduce candidate lifecycle tables

CREATE OR REPLACE TABLE SMART_HIRE.RAW.CANDIDATE_MASTER ( CANDIDATE_KEY VARCHAR, EMAIL VARCHAR, FULL_NAME VARCHAR, CURRENT_VERSION NUMBER, CURRENT_FILE_NAME VARCHAR, IS_ACTIVE BOOLEAN, CREATED_AT TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP(), UPDATED_AT TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP() );

CREATE OR REPLACE TABLE SMART_HIRE.RAW.CANDIDATE_RESUME_HISTORY ( CANDIDATE_KEY VARCHAR, RESUME_VERSION NUMBER, FILE_NAME VARCHAR, FILE_SIZE_BYTES NUMBER, UPLOAD_TIMESTAMP TIMESTAMP_NTZ, IS_ACTIVE BOOLEAN, FULL_NAME VARCHAR, EMAIL VARCHAR, PHONE VARCHAR, LINKEDIN VARCHAR, CONTACT_INFO VARIANT, EXTRACTED_PROFILE VARIANT, SENIORITY_CLASSIFICATION VARIANT, RESUME_TEXT VARCHAR, SOURCE_STAGE_PATH VARCHAR, CREATED_AT TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP() );

Why this step matters:

CANDIDATE_RESUME_HISTORY preserves all uploaded resume versions.
IS_ACTIVE identifies the latest one.
CANDIDATE_MASTER stores the current snapshot at candidate level.

Step 3 — Create a stream on the resume stage and Dummy table to consume Stream

CREATE OR REPLACE TABLE SMART_HIRE.OPERATIONS.NEW_RESUME_FILES ( FILE_NAME STRING, ACTION STRING, LOAD_TS TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP() );

CREATE OR REPLACE STREAM SMART_HIRE.RAW.RESUMES_DIR_STREAM ON STAGE SMART_HIRE.RAW.RESUMES;

Why this step matters:

Whenever a new resume PDF lands in the stage, the stream captures that change. This avoids manual monitoring of files and allows downstream automation.

Step 4 — Build the stored procedure to process each resume

This procedure does the real work.

It:

identifies the uploaded file
parses the PDF
extracts contact details
extracts technical profile details
classifies seniority
resolves candidate identity
deactivates older active resume
inserts the new active version
updates the candidate master

CREATE OR REPLACE PROCEDURE SMART_HIRE.OPERATIONS.PROCESS_RESUME_BY_FILE(P_FILE_NAME STRING) RETURNS STRING LANGUAGE SQL AS $$ DECLARE V_PARSED_CONTENT VARIANT;V_RESUME_TEXT STRING;V_CONTACT_INFO VARIANT;V_EXTRACTED_PROFILE VARIANT; V_SENIORITY VARIANT;V_FULL_NAME STRING; V_EMAIL STRING; V_PHONE STRING; V_LINKEDIN STRING; V_CANDIDATE_KEY STRING; V_NEXT_VERSION NUMBER; V_FILE_SIZE_BYTES NUMBER; V_UPLOAD_TIMESTAMP TIMESTAMP_NTZ; BEGIN SELECT SIZE, LAST_MODIFIED INTO :V_FILE_SIZE_BYTES, :V_UPLOAD_TIMESTAMP FROM DIRECTORY(@SMART_HIRE.RAW.RESUMES) WHERE RELATIVE_PATH = :P_FILE_NAME;

SELECT AI_PARSE_DOCUMENT( TO_FILE('@SMART_HIRE.RAW.RESUMES/' || :P_FILE_NAME), {'mode': 'LAYOUT'} ) INTO :V_PARSED_CONTENT; V_RESUME_TEXT := V_PARSED_CONTENT:content::STRING; SELECT AI_EXTRACT( text => :V_RESUME_TEXT, responseFormat => { 'full_name': 'Full name of the candidate', 'email': 'Email address', 'phone': 'Phone number if available', 'linkedin': 'LinkedIn URL if available' } ) INTO :V_CONTACT_INFO; V_FULL_NAME := V_CONTACT_INFO:response:full_name::STRING; V_EMAIL := LOWER(V_CONTACT_INFO:response:email::STRING); V_PHONE := V_CONTACT_INFO:response:phone::STRING; V_LINKEDIN := V_CONTACT_INFO:response:linkedin::STRING; SELECT AI_EXTRACT( text => :V_RESUME_TEXT, responseFormat => { 'technical_skills': 'List all technical tools, languages, and frameworks mentioned', 'years_experience': 'Total years of professional experience as a number', 'highest_education': 'Highest degree and university name', 'most_recent_role': 'Most recent job title', 'most_recent_company': 'Most recent employer name', 'key_achievements': 'Top 3 quantifiable achievements with metrics' } )::VARIANT INTO :V_EXTRACTED_PROFILE; SELECT AI_CLASSIFY( :V_RESUME_TEXT, ['Entry Level (0-1 years)', 'Junior (1-3 years)', 'Mid-Level (3-5 years)', 'Senior (5-8 years)', 'Staff/Principal (8+ years)'] ) INTO :V_SENIORITY; V_CANDIDATE_KEY := COALESCE( V_EMAIL, MD5(COALESCE(LOWER(V_FULL_NAME), 'UNKNOWN') || '|' || COALESCE(V_PHONE, '')) ); SELECT COALESCE(MAX(RESUME_VERSION), 0) + 1 INTO :V_NEXT_VERSION FROM SMART_HIRE.RAW.CANDIDATE_RESUME_HISTORY WHERE CANDIDATE_KEY = :V_CANDIDATE_KEY; UPDATE SMART_HIRE.RAW.CANDIDATE_RESUME_HISTORY SET IS_ACTIVE = FALSE WHERE CANDIDATE_KEY = :V_CANDIDATE_KEY AND IS_ACTIVE = TRUE; INSERT INTO SMART_HIRE.RAW.CANDIDATE_RESUME_HISTORY ( CANDIDATE_KEY, RESUME_VERSION, FILE_NAME, FILE_SIZE_BYTES, UPLOAD_TIMESTAMP, IS_ACTIVE, FULL_NAME, EMAIL, PHONE, LINKEDIN, CONTACT_INFO, EXTRACTED_PROFILE, SENIORITY_CLASSIFICATION, RESUME_TEXT, SOURCE_STAGE_PATH ) SELECT :V_CANDIDATE_KEY, :V_NEXT_VERSION, :P_FILE_NAME, :V_FILE_SIZE_BYTES, :V_UPLOAD_TIMESTAMP, TRUE, :V_FULL_NAME, :V_EMAIL, :V_PHONE, :V_LINKEDIN, :V_CONTACT_INFO, :V_EXTRACTED_PROFILE, :V_SENIORITY, :V_RESUME_TEXT, '@SMART_HIRE.RAW.RESUMES/' || :P_FILE_NAME; MERGE INTO SMART_HIRE.RAW.CANDIDATE_MASTER T USING ( SELECT :V_CANDIDATE_KEY AS CANDIDATE_KEY, :V_EMAIL AS EMAIL, :V_FULL_NAME AS FULL_NAME, :V_NEXT_VERSION AS CURRENT_VERSION, :P_FILE_NAME AS CURRENT_FILE_NAME ) S ON T.CANDIDATE_KEY = S.CANDIDATE_KEY WHEN MATCHED THEN UPDATE SET EMAIL = S.EMAIL, FULL_NAME = S.FULL_NAME, CURRENT_VERSION = S.CURRENT_VERSION, CURRENT_FILE_NAME = S.CURRENT_FILE_NAME, IS_ACTIVE = TRUE, UPDATED_AT = CURRENT_TIMESTAMP() WHEN NOT MATCHED THEN INSERT ( CANDIDATE_KEY, EMAIL, FULL_NAME, CURRENT_VERSION, CURRENT_FILE_NAME, IS_ACTIVE ) VALUES ( S.CANDIDATE_KEY, S.EMAIL, S.FULL_NAME, S.CURRENT_VERSION, S.CURRENT_FILE_NAME, TRUE ); INSERT INTO SMART_HIRE.OPERATIONS.NEW_RESUME_FILES (FILE_NAME, ACTION) SELECT RELATIVE_PATH, METADATA$ACTION FROM SMART_HIRE.RAW.RESUMES_DIR_STREAM WHERE METADATA$ACTION = 'INSERT'; RETURN 'Processed file: ' || P_FILE_NAME || ' for candidate key: ' || V_CANDIDATE_KEY; END; $$;

Why this step matters

This is where the lifecycle logic becomes real.

If a candidate uploads a new resume:

the old version is not deleted
the old version becomes inactive
the new version becomes active
the latest profile is now available for retrieval and AI

Step 5 — Trigger processing automatically using a task

The stream detects file arrival. The task makes processing automatic.

Tasks

Step 6 — Build the active candidate view

We only want the latest valid profile used in search and reasoning.

CREATE OR REPLACE VIEW SMART_HIRE.ENRICHED.ACTIVE_CANDIDATES AS

SELECT *

FROM SMART_HIRE.RAW.CANDIDATE_RESUME_HISTORY

WHERE IS_ACTIVE = TRUE;

Why this step matters:

With this view:

search uses only the latest profile
AI uses only the latest profile
history remains available for audit

Step 7 — Create Cortex Search Service on active resumes only

Why this step matters:

The search service retrieves the most relevant active candidate profiles for a recruiter’s question. That means:

no outdated resume versions
no need to scan all resumes manually
a proper retrieval layer for RAG

Step 8 — Use Streamlit for resume upload and recruiter interaction

The Streamlit app now does two jobs:

upload/update resume
ask recruiter questions in natural language

Step 9 — Understand the GenAI layer properly

This is important.

Cortex Search retrieves the top relevant candidates, but it does not make the final hiring decision.

That job belongs to the LLM.

So the design is:

CSS retrieves
LLM reasons

This is why the prompt became extremely important during testing.

Final thoughts

In the previous article, I showed how to build a structured candidate-job matching engine in Snowflake.

In this article, I extended that design into a more advanced Gen AI-based assistant that can:

accept resume updates
preserve profile history
keep only the latest profile active
retrieve relevant candidates using Cortex Search
answer hiring questions in natural language using AI_COMPLETE

This makes the solution closer to how recruiters actually work.

About Post Author

Sachin Mittal

sachin.mittal04@gmail.com

https://cloudyard.in

Happy

0 %

Sad

0 %

Excited

0 %

Sleepy

0 %

Angry

0 %

Surprise

0 %

From PDF Resumes to AI-Powered Hiring Decisions – End-to-End in Snowflake

What is different in this version

1. Candidate lifecycle management

2. Cortex Search Service becomes meaningful

3. GenAI-based reasoning

End-to-end architecture

Step 1 — Use the same database, schemas, and resume stage in previous blog

Step 2 — Introduce candidate lifecycle tables

Step 3 — Create a stream on the resume stage and Dummy table to consume Stream

Step 4 — Build the stored procedure to process each resume

Why this step matters

Step 5 — Trigger processing automatically using a task

Step 6 — Build the active candidate view

Step 7 — Create Cortex Search Service on active resumes only

Step 8 — Use Streamlit for resume upload and recruiter interaction

Step 9 — Understand the GenAI layer properly

Final thoughts

About Post Author

Sachin Mittal

Average Rating

Leave a Reply Cancel reply

From PDF Resumes to AI-Powered Hiring Decisions – End-to-End in Snowflake

From PDF Resumes to AI-Powered Hiring Decisions – End-to-End in Snowflake

Cost Intelligence in Snowflake using Cortex Code Custom Skills

Building a Cortex Agent App with Snowflake REST API

Document AI to AI_PARSE_DOCUMENT

From PDF Resumes to AI-Powered Hiring Decisions – End-to-End in Snowflake

From PDF Resumes to AI-Powered Hiring Decisions – End-to-End in Snowflake

Cost Intelligence in Snowflake using Cortex Code Custom Skills

Building a Cortex Agent App with Snowflake REST API

Document AI to AI_PARSE_DOCUMENT

EXECUTE TASK … USING CONFIG

AI_REDACT-Powered PII Redaction in Snowflake

Integrate Snowflake with Microsoft Fabric Using Iceberg Tables