Conditional on multiple insert

Question

I have a lot of records I need to insert into an already massive table (the table has several million rows and I'm inserting several thousand rows at a time).

The current syntax being used is a simple one:

INSERT INTO my_table (name, email) VALUES ('test','[email protected]'),('test2','[email protected]')

These requests are made in batches of 1 thousand rows at a time.

For reasons outside of my control I do not have the ability to declare email as a unique column (the table already has duplicates).

How can I add a conditional to each INSERT request where we skip the inserts of an email when that value is found in the database?

So let's say in the above request we find we already have a row with email [email protected]. I would want to skip that row but still insert the other row ([email protected]).

I'm struggling to find a syntax that works.

Using SQL Server and pydb.

Thanks in advance for any help! :D

Justin C · Accepted Answer · 2022-04-22 20:01:26Z

1

This query should work

INSERT INTO my_table (name, email) 
SELECT name, email
FROM (VALUES ('test','[email protected]'),('test2','[email protected]')) AS Vals(name, email)
WHERE NOT EXISTS (SELECT 1 FROM my_table x WHERE x.email = Vals.email)

An index on the email column will make this as fast as it can be, but it will still be slower than a normal insert.

answered Apr 22, 2022 at 20:01

Justin C

9549 silver badges17 bronze badges

Sign up to request clarification or add additional context in comments.

Collectives™ on Stack Overflow

Conditional on multiple insert

1 Answer 1

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Related