96

Are there any examples of how to pass parameters with an SQL query in Pandas?

In particular I'm using an SQLAlchemy engine to connect to a PostgreSQL database. So far I've found that the following works:

df = psql.read_sql(('select "Timestamp","Value" from "MyTable" '
                     'where "Timestamp" BETWEEN %s AND %s'),
                   db,params=[datetime(2014,6,24,16,0),datetime(2014,6,24,17,0)],
                   index_col=['Timestamp'])

The Pandas documentation says that params can also be passed as a dict, but I can't seem to get this to work having tried for instance:

df = psql.read_sql(('select "Timestamp","Value" from "MyTable" '
                     'where "Timestamp" BETWEEN :dstart AND :dfinish'),
                   db,params={"dstart":datetime(2014,6,24,16,0),"dfinish":datetime(2014,6,24,17,0)},
                   index_col=['Timestamp'])

What is the recommended way of running these types of queries from Pandas?

2 Answers 2

134

The read_sql docs say this params argument can be a list, tuple or dict (see docs).

To pass the values in the sql query, there are different syntaxes possible: ?, :1, :name, %s, %(name)s (see PEP249).
But not all of these possibilities are supported by all database drivers, which syntax is supported depends on the driver you are using (psycopg2 in your case I suppose).

In your second case, when using a dict, you are using 'named arguments', and according to the psycopg2 documentation, they support the %(name)s style (and so not the :name I suppose), see http://initd.org/psycopg/docs/usage.html#query-parameters.
So using that style should work:

df = psql.read_sql(('select "Timestamp","Value" from "MyTable" '
                     'where "Timestamp" BETWEEN %(dstart)s AND %(dfinish)s'),
                   db,params={"dstart":datetime(2014,6,24,16,0),"dfinish":datetime(2014,6,24,17,0)},
                   index_col=['Timestamp'])
Sign up to request clarification or add additional context in comments.

8 Comments

That's very helpful - I am using psycopg2 so the '%(name)s syntax works perfectly.
We should probably mention something about that in the docstring:github.com/pydata/pandas/issues/7573
This solution no longer works on Postgres - one needs to use the :notation, and then be sure to wrap the SQL string with sqlalchemy.text()
The correct characters for the parameter style can be looked up dynamically by the way in nearly every database driver via the paramstyle attribute. This is acutally part of the PEP 249 definition.
|
-1

I was having trouble passing a large number of parameters when reading from a SQLite Table. Then it turns out since you pass a string to read_sql, you can just use f-string. Tried the same with MSSQL pyodbc and it works as well.

For SQLite, it would look like this:

# write a sample table into memory
from sqlalchemy import create_engine
df = pd.DataFrame({'Timestamp': pd.date_range('2020-01-17', '2020-04-24', 10), 'Value1': range(10)})
engine = create_engine('sqlite://', echo=False)
df.to_sql('MyTable', engine);

# query the table using a query
tpl = (1, 3, 5, 8, 9)
query = f"""SELECT Timestamp, Value1 FROM MyTable WHERE Value1 IN {tpl}"""
df = pd.read_sql(query, engine)

If the parameters are datetimes, it's a bit more complicated but calling the datetime conversion function of the SQL dialect you're using should do the job.

start, end = '2020-01-01', '2020-04-01'
query = f"""SELECT Timestamp, Value1 FROM MyTable WHERE Timestamp BETWEEN STRFTIME("{start}") AND STRFTIME("{end}")"""
df = pd.read_sql(query, engine)

1 Comment

I would say f-strings for SQL parameters are best avoided owing to the risk of SQL injection attacks, e.g. see psycopg.org/psycopg3/docs/basic/params.html#sql-injection

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.