
I am loading JSON data with a large number of rows into Snowflake. Some values in the JSON rows contain single quotes, and I get an error when I try to insert the JSON into Snowflake using parse_json. I am using SQLAlchemy as the connector from Python. Here is my code:

connection.execute(
                f'insert into json_demo select parse_json( """{json.dumps(data)}""" )')

The json data sample is as follows:

[
  {
    "name": "Milka Luam",
    "ID": "124",
    "UnitAddressLine1": "1262 University Runa",
    "UnitCity": "Jas sti'n"
  },
  {
    "name": "Rahu Liran",
    "ID": "541",
    "UnitAddressLine1": "1262 University Blina",
    "UnitCity": "Rish 21"
  },
  ...
]

The single quote in the UnitCity value Jas sti'n causes an error. Here is the error:

sqlalchemy.exc.ProgrammingError: (snowflake.connector.errors.ProgrammingError) 001003 (42000): SQL compilation error:
syntax error line 1 at position 47 unexpected 'UnitCity'.
syntax error line 1 at position 47 unexpected 'UnitCity'.
parse error line 1 at position 90,841 near '<EOF>'.

I can't add escape characters manually because I am loading a large number of rows.


2 Answers

4

Using f-strings to interpolate values is error-prone and a security risk. Use SQLAlchemy's text function and bind parameters instead.

# Demonstrate the rendered query text (the PostgreSQL dialect is used here only to print the statement)
import json

import sqlalchemy as sa
from sqlalchemy.dialects import postgresql


json_ = json.dumps({'a': "Jas sti'n"})

stmt = sa.text("""insert into json_demo select parse_json(:json_)""")

stmt = stmt.bindparams(json_=json_)
print(
    stmt.compile(
        dialect=postgresql.dialect(), compile_kwargs={'literal_binds': True}
    )   
)

Output:

insert into json_demo select parse_json('{"a": "Jas sti''n"}')

To execute the statement, you would do:

stmt = sa.text("""insert into json_demo select parse_json(:json_)""")
with engine.begin() as conn:
    conn.execute(stmt, {'json_': json_})
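
For anyone who wants to see the bind-parameter idea work without a Snowflake account, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in. This is an assumption for illustration only: SQLite is not Snowflake, there is no parse_json call here, and the table layout (a single text column named v) is made up. It only shows that a value containing a single quote goes through untouched when it is passed separately from the SQL text.

```python
import json
import sqlite3

# Sample row modelled on the question; the value contains a single quote.
data = [{"name": "Milka Luam", "UnitCity": "Jas sti'n"}]
payload = json.dumps(data)

# In-memory SQLite database, used purely as a stand-in for Snowflake.
conn = sqlite3.connect(":memory:")
conn.execute("create table json_demo (v text)")

# The JSON text is passed as a named bind parameter, not pasted into the SQL,
# so the driver takes care of quoting.
conn.execute("insert into json_demo (v) values (:json_)", {"json_": payload})

# Read the value back and decode it to confirm nothing was mangled.
stored = conn.execute("select v from json_demo").fetchone()[0]
round_tripped = json.loads(stored)
conn.close()

print(round_tripped == data)
```

The same pattern is what the sa.text statement with :json_ above relies on: the SQL text stays constant and only the parameter dictionary changes per row.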

2 Comments

Thank you! This works, but I am getting a warning saying that snowflake "will not make use of SQL compilation caching as it does not set the 'supports_statement_cache' attribute to True. This can have significant performance implications including some performance degradations in comparison to prior SQLAlchemy versions. Dialect maintainers should seek to set this attribute to True after appropriate development and testing for SQLAlchemy 1.4 caching support. Alternatively, this attribute may be set to False which will disable this warning..." Would this be a problem? Thanks,
It's an issue with the snowflake dialect rather than your code - see this issue, which includes instructions for suppressing the warning.
-1

As a solution, you can either triple-quote "Jas sti'n":

"""Jas sti'n"""

[
  {
    "name": "Milka Luam",
    "ID": "124",
    "UnitAddressLine1": "1262 University Runa",
    "UnitCity": """Jas sti'n"""
  }
]

Or you can place a backslash before the quote to escape it:

"Jas sti\'n"

[
  {
    "name": "Milka Luam",
    "ID": "124",
    "UnitAddressLine1": "1262 University Runa",
    "UnitCity": "Jas sti\'n"
  }
]

Both of these solutions make Python read the ' as part of the string.
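
One caveat worth checking before relying on either form: they are Python source literals, not JSON text. The sketch below uses only the standard json module to test this. The helper is_valid_json is a hypothetical name introduced for this example. As far as I can tell, json.dumps leaves the single quote alone, while the backslash and triple-quoted forms are rejected when they appear inside the JSON text itself.

```python
import json

value = "Jas sti'n"

# json.dumps does not escape the single quote; it is an ordinary character in JSON.
encoded = json.dumps({"UnitCity": value})


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise (hypothetical helper)."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# JSON text containing a literal backslash before the quote (raw string keeps the backslash).
backslash_form = r'{"UnitCity": "Jas sti\'n"}'

# JSON text containing a triple-quoted value.
triple_quoted_form = '{"UnitCity": """Jas sti\'n"""}'

print(encoded)
print(is_valid_json(encoded))
print(is_valid_json(backslash_form))
print(is_valid_json(triple_quoted_form))
```

So if the data already comes from json.dumps, the JSON side needs no extra escaping; the quoting problem is in how the text is placed into the SQL statement.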
