In this discussion, our focus lies in creating a User-Defined Function (UDF) designed to calculate the word count within a provided text. The UDF will receive an input string, tally the words within the text, and furnish an array detailing the word counts. To enhance the practicality, we will extend this functionality to work with Snowflake table columns as inputs, generating an array that represents the word counts within the specified column.
Furthermore, we will delve into an additional use case. Here, the UDF will not only take the input text but also an array of specific words. It will then search for occurrences of these words within the input text, providing an output indicating whether the words were found within the respective table column or not.
Use Case1: Word count. Below Python UDF will count the number of words to the input string passed to the UDF.
Call the UDF:
SELECT CC_CALL_CENTER_SK,CC_REC_START_DATE,CC_REC_END_DATE,CC_NAME,CC_MKT_DESC AS INPUT,
COUNT_WORDS(CC_MKT_DESC) as word_counts from SNOWFLAKE_SAMPLE_DATA.TPCDS_SF10TCL.call_center;
Use Case2: Python UDF to search the array of keywords in Input String.
First we will create a table to hold the keywords which will be Search against the Input String
CREATE or replace TABLE search_keyword_tbl (search_text VARCHAR(50));
INSERT INTO search_keyword_tbl (search_text)
VALUES ('Shared'), ('largely'), ('Rich'), ('Reduced'), ('Blue');
Call the UDF:
CC_CALL_CENTER_SK,CC_MKT_DESC AS INPUT,
(SELECT ARRAY_AGG(search_text) from search_keyword_tbl) as Keywords_to_find,
WORD_CNT_SELCETIVE_FIND(CC_MKT_DESC, (SELECT ARRAY_AGG(search_text) from search_keyword_tbl))