In this post we will discuss one argument of the Snowflake Python Connector. Recently, while looking into performance topics, I came across the SnowflakeNoConverterToPython class. According to the documentation, it can improve performance by bypassing data conversion: passing it to the connector stops the results of a query from being converted to native Python data types. In simpler terms, this means:
In a query result, all data is represented in string form, and the application is responsible for converting it to native Python data types. To implement the argument:
Use the SnowflakeNoConverterToPython class in the snowflake.connector.converter_null module. This class bypasses data conversion from the Snowflake internal data types to the native Python data types.
Technical Implementation:
We can use it in Python as follows:
from snowflake.connector.converter_null import SnowflakeNoConverterToPython
converter_class=SnowflakeNoConverterToPython
Here is my Python code to verify the behavior:
import snowflake.connector
# First, import SnowflakeNoConverterToPython from the converter_null module.
from snowflake.connector.converter_null import SnowflakeNoConverterToPython
# Create the connection with the no-converter class, so results are not converted to native Python data types.
con_noconvert = snowflake.connector.connect(
user="sachinsnowpro",
account="ak65063.us-east-2.aws",
password='xxxxxxxx',
warehouse="compute_wh",
database="demo_db",
schema="public",
converter_class=SnowflakeNoConverterToPython
)
# Create the default connection; results are converted to native Python data types.
con_convert = snowflake.connector.connect(
user="sachinsnowpro",
account="ak65063.us-east-2.aws",
password='xxxxxx',
warehouse="compute_wh",
database="demo_db",
schema="public"
)
for rec in con_noconvert.cursor().execute("SELECT * FROM performance_improve"):
    print(rec[0], rec[1], rec[2], rec[3], rec[4], rec[5])
    print(type(rec[0]))
    print(type(rec[1]))
    print(type(rec[2]))
    print(type(rec[3]))
    print(type(rec[4]))
    print(type(rec[5]))
for rec in con_convert.cursor().execute("SELECT * FROM performance_improve limit 1"):
    print(rec[0], rec[1], rec[2], rec[3], rec[4], rec[5])
    print(type(rec[0]))
    print(type(rec[1]))
    print(type(rec[2]))
    print(type(rec[3]))
    print(type(rec[4]))
    print(type(rec[5]))
We can see the difference in the output: with the no-converter connection, all data is returned as strings, irrespective of the column data type.
Question: Even though we have seen the difference in the output, I am not able to relate it to a real-world scenario. Skipping the data type conversion may improve performance, but in the end it changes how the data is represented. If you want to consume this data, it is the responsibility of the application (your Python code) to cast it to the proper types.
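To make the casting responsibility concrete, here is a minimal sketch of what the application side could look like. The column layout (an integer, a text value, a decimal) and the names CASTERS and cast_row are my assumptions for illustration, since the columns of performance_improve are not shown in this post. I have also left out dates and timestamps on purpose: the exact text form the connector returns for them should be checked before choosing a caster.

```python
from decimal import Decimal

# Hypothetical column layout (not taken from the real table):
# one caster per column, in column order.
CASTERS = (int, str, Decimal)


def cast_row(row, casters):
    """Apply one caster per column, leaving NULL values (None) untouched."""
    return tuple(
        None if value is None else caster(value)
        for value, caster in zip(row, casters)
    )


# A row as the no-converter connection might return it: every value is a string.
raw = ("42", "widget", "19.99")
typed = cast_row(raw, CASTERS)
print(typed)
```

The point of the sketch is that the conversion work does not disappear; it moves into your own code, so you can cast only the columns you actually need.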
So I wanted to understand whether it really helps in practice with Snowflake. Please do share suggestions and knowledge if you have used this argument in your code.
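One way to check whether the argument helps for a given workload is to time the same query on both connections. Below is a rough sketch; time_query is a hypothetical helper of mine, not part of the connector, and it only assumes a connection whose cursor().execute() returns a cursor with fetchall(), as in the loops above.

```python
import time


def time_query(conn, sql):
    """Run sql on conn, fetch all rows, and return (elapsed_seconds, row_count)."""
    start = time.perf_counter()
    rows = conn.cursor().execute(sql).fetchall()
    elapsed = time.perf_counter() - start
    return elapsed, len(rows)
```

Calling time_query(con_noconvert, "SELECT * FROM performance_improve") and then the same with con_convert should give a first impression. A single run can be skewed by warehouse caching, so it is worth repeating each call a few times before drawing conclusions.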