2

I'm using Pyodbc in Python to run some SQL queries. What I'm working with is actually longer than this, but this example captures what I'm trying to do:

connection = pyodbc.connect(...)
cursor = connection.cursor(...)

dte = '2018-10-24'

#note the placeholders '{}'
query = """select invoice_id
    into #output 
    from table1 with (nolock) 
    where system_id = 'PrimaryColor' 
    and posting_date = '{}' 

    insert into #output
    select invoice_id
    from table2 with (nolock)
    where system_id = 'PrimaryColor'
    and posting_date = '{}'"""

#this is where I need help as explained below
cursor.execute(query.format(dte, dte))

output = pd.read_sql("""select *
                 from #output"""
                 , connection)

In the above, since there are only two '{}', I'm passing dte to query.format() twice. However, in the more complicated version I'm working with, I have 19 '{}', so I'd imagine this means I need to pass 'dte' to 'query.format{}' 19 times. I tried passing this as a list, but it didn't work. Do I really need to write out the variable 19 times when passing it to the function?

2 Answers 2

2

Consider using a UNION ALL query to avoid the temp table needs and parameterization where you set qmark placeholders and in a subsequent step bind values to them. And being the same value multiply the parameter list/tuple by needed number:

dte = '2018-10-24'

# NOTE THE QMARK PLACEHOLDERS
query = """select invoice_id    
           from table1 with (nolock) 
           where system_id = 'PrimaryColor' 
             and posting_date = ? 

           union all

           select invoice_id
           from table2 with (nolock)
           where system_id = 'PrimaryColor'
             and posting_date = ?"""

output = pd.read_sql(query, connection, params=(dte,)*2)
Sign up to request clarification or add additional context in comments.

Comments

1

I agree with the comments, pandas.read_sql has a params argument which prevent from sql injection.

See this post to understand how to use it depending on the database.

Pyodbc has the same parameter on the execute method.

# standard 
cursor.execute("select a from tbl where b=? and c=?", (x, y))

# pyodbc extension 
cursor.execute("select a from tbl where b=? and c=?", x, y)

To answer to the initial question, even if it is bad practice for building SQL queries :

Do I really need to write out the variable 19 times when passing it to the function?

Of course you don't :

query = """select invoice_id
into #output 
from table1 with (nolock) 
where system_id = 'PrimaryColor' 
and posting_date = '{dte}' 

insert into #output
select invoice_id
from table2 with (nolock)
where system_id = 'PrimaryColor'
and posting_date = '{dte}'""".format(**{'dte': dte})

or :

query = """select invoice_id
into #output 
from table1 with (nolock) 
where system_id = 'PrimaryColor' 
and posting_date = '{0}' 

insert into #output
select invoice_id
from table2 with (nolock)
where system_id = 'PrimaryColor'
and posting_date = '{0}'""".format(dte)

Python 3.6+ :

query = f"""select invoice_id
into #output 
from table1 with (nolock) 
where system_id = 'PrimaryColor' 
and posting_date = '{dte}' 

insert into #output
select invoice_id
from table2 with (nolock)
where system_id = 'PrimaryColor'
and posting_date = '{dte}'"""

Note the usage of f before """ ... """

2 Comments

@roganjosh thanks, i mixed up because he uses pandas as well. Edited
Thank you! This did it!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.