30

I want to insert multiple rows and get their IDs back with asyncpg. I found two ways. 1: generate SQL like this:

INSERT INTO films (code, title, did, date_prod, kind) VALUES
    ('B6717', 'Tampopo', 110, '1985-02-10', 'Comedy'),
    ('HG120', 'The Dinner Game', 140, DEFAULT, 'Comedy')
RETURNING id;

2: use a prepared statement in a for loop:

values = (('B6717', 'Tampopo', 110, '1985-02-10', 'Comedy'),
          ('HG120', 'The Dinner Game', 140, None, 'Comedy'))  # DEFAULT can't be passed as a parameter; None becomes NULL
stmt = await connection.prepare(
    "INSERT INTO films (code, title, did, date_prod, kind) "
    "VALUES ($1, $2, $3, $4, $5) RETURNING id")
for val in values:
    await stmt.fetchval(*val)

Which way should I prefer when this runs about 100 times with 700,000 rows each, or is there some way to combine these approaches? I'm totally green, so throw some tomatoes at me.

3
  • You could try COPY FROM. In my experience it's a lot faster than individual INSERT statements. Commented May 2, 2017 at 13:52
  • Possible duplicate of psycopg2: insert multiple rows with one query Commented May 2, 2017 at 20:34
  • 3
    @Udi this is not a valid duplicate Commented May 4, 2017 at 10:15

3 Answers

27

asyncpg provides the executemany method to insert many rows.

statement = """INSERT INTO films (code, title, did, date_prod, kind)
               VALUES ($1, $2, $3, $4, $5);"""
await connection.executemany(statement, values)
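For the scale mentioned in the question (around 700,000 rows per run), it can also help to feed `executemany` in batches rather than all at once. A minimal sketch, assuming `statement` and `values` are defined as above; the helper name and the batch size of 10,000 are my own choices, not part of asyncpg:

```python
def chunks(seq, size):
    # Yield successive slices of `seq` with at most `size` items each.
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

async def insert_in_batches(connection, statement, values, size=10_000):
    # Each batch runs in its own transaction, so a failure only
    # rolls back the batch that raised, not everything before it.
    for batch in chunks(values, size):
        async with connection.transaction():
            await connection.executemany(statement, batch)
```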

If you need RETURNING, as you later mentioned, to get the inserted ids back, then this answer is the way to go.


2 Comments

Thanks for the right tag. I edited the question because of your answer and added the RETURNING clause, so executemany no longer fits the requirements. Sorry for the misleading wording.
What if I have an array of objects, will this work?
22

If you need to use the RETURNING clause to obtain the ids back, then the following is the most efficient way of inserting multiple values:

res = await conn.fetch('''
    INSERT INTO films (code, title, did, date_prod, kind)
    (SELECT
        r.code, r.title, r.did, r.date_prod, r.kind
     FROM
        unnest($1::films[]) as r
    )
    RETURNING id
''', [
    (None, 'B6717', 'Tampopo', 110, '1985-02-10', 'Comedy'),
    (None, 'HG120', 'The Dinner Game', 140, None, 'Comedy')
])

Note that the records you pass as input must correspond to the shape of the table: PostgreSQL does not support arbitrary records as input, so you must use a known record type. Simply pass None for the columns you are not inserting and leave them out of the SELECT list. This method also doesn't let you rely on DEFAULT; you must specify every inserted value explicitly.
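Wrapped in a helper, the returned records unpack directly into a list of ids. A sketch: `insert_films` is a hypothetical name, and it assumes `conn` is an open asyncpg connection and that `films` has columns (id, code, title, did, date_prod, kind):

```python
async def insert_films(conn, rows):
    # The unnest-based insert from above, returning the generated ids.
    records = await conn.fetch('''
        INSERT INTO films (code, title, did, date_prod, kind)
        (SELECT r.code, r.title, r.did, r.date_prod, r.kind
         FROM unnest($1::films[]) AS r)
        RETURNING id
    ''', rows)
    # asyncpg Record objects support mapping-style access, so the
    # ids unpack with a plain comprehension.
    return [r['id'] for r in records]
```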

2 Comments

Just a minor note that "input must correspond to the shape of the table" is quite dangerous. The shape of the table, i.e. the order of its columns (verify with \d <table> in psql), can vary from one PG server to another, which makes your code heavily dependent on how the tables were created on each db instance.
A bit late to the party but looks like fetchmany would work magicstack.github.io/asyncpg/current/api/…
19

Another way to insert many rows at once (assuming you don't need the inserted IDs) is to use the copy_records_to_table method.

data = [
    ("row", 1, "some data"),
    ("row", 2, "more data"),
]
await conn.copy_records_to_table('mytable', records=data)
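If you'd rather not depend on the table's physical column order, `copy_records_to_table` also accepts a `columns` argument that names the target columns explicitly. A sketch, where `mytable` and the column names are placeholders for your own schema:

```python
data = [
    ("row", 1, "some data"),
    ("row", 2, "more data"),
]

async def copy_with_columns(conn, data):
    # Naming the columns decouples the call from the physical
    # column order of the table, so it keeps working even if the
    # table was created with a different column order elsewhere.
    await conn.copy_records_to_table(
        'mytable',
        records=data,
        columns=['label', 'seq', 'payload'],
    )
```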

5 Comments

Turns out this was the most convenient manner for me: I had batches of records that were grouped by their table name, and with their columns already in the correct order.
...although, I need "ON CONFLICT ... UPDATE ...", so I'm not sure I'll be able to use this mechanism.
Instead of a plain tuple, would an array of objects do? How would you convert the object into a tuple if it doesn't?
@PirateApp I don't see why an array wouldn't work, but if a tuple is strictly required, you can simply call tuple(my_list).
Does anyone know what the difference is in terms of performance or generated SQL between using executemany() vs copy_records_to_table()? Two scenarios: (1) PK is a serial (not populated explicitly) (2) PK is a UUID (must be populated by pre-generated UUIDs in the case of copy_records_to_table())
