
Big thanks in advance, relatively new to psycopg2.

I'm trying to bulk insert data in the form of a pandas dataframe to my existing postgres database.

import psycopg2.extras
from psycopg2 import sql
from psycopg2.sql import SQL, Identifier

try:
    psycopg2.extras.execute_values(
        cur=cur,
        sql=sql.SQL("""
            INSERT INTO {table_name} ( {columns} )
            VALUES %s
            """).format(table_name=Identifier(entity),
                        columns=SQL(', ').join(map(Identifier, column_names))),
        argslist=dataframe,
        template=None,
        page_size=500)

except Exception as error:
    print('ERROR: ' + str(error))

I get the error below when I run this:

string index out of range

I tried changing the dataframe to a dict, using:

dataframe = dataframe.to_dict(orient='records')

The output that I am getting from the except clause is now as follows:

'dict' object does not support indexing

Any help hugely appreciated, I'm not sure what the issue is here.
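For context, `execute_values` expects `argslist` to be a sequence of tuples (or dicts with a matching template), not a DataFrame. A minimal sketch of the conversion, using a hypothetical frame standing in for the real one:

```python
import pandas as pd

# Hypothetical frame standing in for the real dataframe.
df = pd.DataFrame({"id": [1, 2], "name": ["alice", "bob"]})

# execute_values wants a sequence of row tuples; name=None makes
# itertuples yield plain tuples instead of namedtuples.
rows = list(df.itertuples(index=False, name=None))
# rows == [(1, 'alice'), (2, 'bob')]
```

`rows` could then be passed as `argslist` in place of the DataFrame.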

Thanks in advance

2 Comments

  • Does this stackoverflow.com/a/8666415/5666087 answer your question? Commented Apr 29, 2020 at 13:32
  • In case you are passing dicts, your placeholder is (%(dict_key_1)s, %(dict_key_2)s, ...) instead of %s Commented Apr 29, 2020 at 21:01
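The dict-placeholder suggestion in the comment above can be sketched in plain Python; `column_names` here is an assumed column list, and the `execute_values` call is shown only as a comment:

```python
# Build the per-row template execute_values needs when argslist holds
# dicts (e.g. from dataframe.to_dict(orient='records')).
column_names = ["id", "name"]  # assumed to match the DataFrame's columns
template = "(" + ", ".join("%({})s".format(c) for c in column_names) + ")"
# template == "(%(id)s, %(name)s)"

# Then, roughly:
# psycopg2.extras.execute_values(cur, query, records, template=template)
```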

2 Answers


This seems like a case of an unhelpful error message. Quoting from another SO answer:

In Python, % is used for string formatting, so a single % in a query string is assumed to mark a value placeholder.

To place a literal % in a query string, escape it by doubling it: %%.

Try the following code, which replaces %s with %%s.

try:
    psycopg2.extras.execute_values(
        cur=cur,
        sql=sql.SQL("""
            INSERT INTO {table_name} ( {columns} )
            VALUES %%s
            """).format(table_name=Identifier(entity),
                        columns=SQL(', ').join(map(Identifier, column_names))),
        argslist=dataframe,
        template=None,
        page_size=500)

except Exception as error:
    print('ERROR: ' + str(error))

1 Comment

I get the error message: the query doesn't contain any '%s' placeholder

Alternatively, you could use pandas.to_sql:

from sqlalchemy import create_engine
engine = create_engine('postgresql://postgres:postgres@localhost/postgres', echo=False)

df.to_sql('table_name', con=engine, if_exists='append')

Play around with this before you try this in production...

3 Comments

I've had a go at doing this and found it to be on the slow side. While reading about how to speed it up, I read elsewhere that to_sql is a less efficient way of doing this when dealing with larger volumes of data, and that I should output to a temp file and batch COPY it into the db. Not sure if you have any thoughts on this.
I know there is a multi option in pandas to_sql. Also, if you are using SQLAlchemy as your connection, psycopg2 provides an insert-many option, somewhere in the docs, that will speed up your insertions into the database table.
You could try method 4 from this article: medium.com/analytics-vidhya/…
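The temp-file-and-COPY approach mentioned in the first comment can be sketched without a temp file by streaming CSV through an in-memory buffer. The `cur.copy_expert` call is left commented out since it needs a live connection, and the table and column names are illustrative:

```python
import csv
import io

# Illustrative rows; in practice these would come from the DataFrame,
# e.g. list(df.itertuples(index=False, name=None)).
rows = [(1, "alice"), (2, "bob")]

# Serialize the rows as CSV into an in-memory buffer.
buf = io.StringIO()
csv.writer(buf).writerows(rows)
buf.seek(0)

# With a live psycopg2 cursor, COPY is typically the fastest bulk load:
# cur.copy_expert("COPY my_table (id, name) FROM STDIN WITH CSV", buf)
```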
