Suppose I have a table with a unique user name column, and I need to insert many users in a single query (possibly thousands of users). What is the best way to check, for each user, whether the username already exists? I am using PostgreSQL.
The only way to do the "insert" with one query is to use a staging table, and then join to see what the new users are in the single insert query. – Hogan, Mar 17, 2017 at 20:25
Thank you for the response. Can you be more specific? And what about performance? – SomethingElse, Mar 17, 2017 at 20:44
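The staging-table idea from the comment above could look roughly like the following. This is only a sketch under stated assumptions: the table names users and users_staging are hypothetical, and the sample values are made up. For thousands of rows the staging table would more likely be filled with COPY than with a values list.

```sql
-- Hypothetical target table with a unique user name.
create table users (id serial primary key, user_name text unique);

-- Hypothetical staging table; it has no constraints, so loading it never fails on duplicates.
create temporary table users_staging (user_name text);

-- Load the candidate names (illustrative values; a bulk load could use COPY instead).
insert into users_staging (user_name) values
('John'), ('Ben'), ('John');

-- Insert only names that are not in the target table yet.
-- "distinct" removes duplicates inside the batch itself.
insert into users (user_name)
select distinct s.user_name
from users_staging s
left join users t on t.user_name = s.user_name
where t.user_name is null;
```

Note that the join only sees rows that are visible when the statement runs, so a concurrent session inserting the same name could still trigger a unique violation.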
3 Answers
You can use the on conflict clause, e.g.:
create table my_table(id serial primary key, user_name text unique);
insert into my_table (user_name) values
('John'), ('Ben'), ('Alice'), ('John'), ('Ben'), ('Alice'), ('Sam')
on conflict do nothing;
select *
from my_table;
 id | user_name
----+-----------
  1 | John
  2 | Ben
  3 | Alice
  7 | Sam
(4 rows)
Note the id of the last row: the primary key sequence was advanced for each of the three rows that were skipped because of the conflict, so the values 4 to 6 were consumed and remain unused.
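A small variation on the statement above, as a sketch: naming the conflict target limits the skip to conflicts on user_name, whereas a bare on conflict do nothing skips a row on any unique or exclusion constraint violation. The inserted values are illustrative and not part of the original answer.

```sql
-- Assumes my_table from the example above, already holding John, Ben, Alice and Sam.
insert into my_table (user_name) values
('John'), ('Eve'), ('Sam')
on conflict (user_name) do nothing;
-- 'John' and 'Sam' already exist and are skipped; only 'Eve' is added.
```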
on conflict, as it has been added for automatically skipping (or updating) rows with duplicates. There is no means to notify about conflicts in this construction.
Set your username as a PRIMARY KEY; the engine will then do the check for you with no extra hassle:
alter table table_name add primary key (username);
If the table already has a primary key, you can add a unique constraint instead:
alter table table_name add constraint constraint_name UNIQUE (username);
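To illustrate what the constraint alone does for a bulk insert, here is a minimal sketch. The table demo_users and the constraint name demo_users_username_key are hypothetical. Without an on conflict clause, a single duplicate makes the whole multi-row statement fail, so none of its rows are inserted.

```sql
-- Hypothetical table that already has a primary key on id.
create table demo_users (id serial primary key, username text);

alter table demo_users
  add constraint demo_users_username_key unique (username);

insert into demo_users (username) values ('John');

-- This statement raises a unique violation because of 'John'.
-- The statement is atomic, so 'Ann' is not inserted either.
insert into demo_users (username) values ('Ann'), ('John');
```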
It's unclear to me whether you want to see the rows that have not been inserted or only those that have. Both requirements can be met using the returning clause and writable CTEs.
Assuming the following table:
create table some_table (id serial primary key, user_name text unique);
To see (only) those rows that have been inserted:
insert into some_table (user_name)
values
('Tom'), ('Dick'), ('Harry'), ('Arthur'), ('Mary'), ('Ford'), ('Tom'), ('Zaphod')
on conflict do nothing
returning user_name;
To see those rows that have not been inserted:
with new_data (name) as (
values
('Tom'), ('Dick'), ('Harry'), ('Arthur'), ('Mary'), ('Ford'), ('Tom'), ('Zaphod')
), inserted as (
insert into some_table (user_name)
select name
from new_data
on conflict do nothing
returning user_name
)
select *
from new_data
where name not in (select user_name from inserted);
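Both views can also be combined into one result, sketched here under the same assumptions as above (some_table as defined earlier, the same sample names). The column alias was_inserted and the subquery alias d are made up for this illustration. Each distinct input name is listed once with a flag telling whether this statement inserted it.

```sql
with new_data (name) as (
  values
    ('Tom'), ('Dick'), ('Harry'), ('Arthur'), ('Mary'), ('Ford'), ('Tom'), ('Zaphod')
), inserted as (
  insert into some_table (user_name)
  select name
  from new_data
  on conflict do nothing
  returning user_name
)
select d.name,
       (i.user_name is not null) as was_inserted  -- true only for rows added by this statement
from (select distinct name from new_data) d
  left join inserted i on i.user_name = d.name
order by d.name;
```

A name that occurs twice in the input, such as 'Tom', appears once in this result because of the distinct.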