if exists update else insert csv data MySQL

Question

I am populating a MySQL table with a CSV file pulled from a third-party source. Every day the CSV is updated, and I want to update a row in the MySQL table if a row with the same values in columns a, b and c already exists, and otherwise insert it. I used LOAD DATA INFILE for the initial load, but I now want to apply the daily CSV pull as an update. I am familiar with INSERT ... ON DUPLICATE KEY UPDATE, but not in the context of a CSV import. Any advice on how to combine LOAD DATA LOCAL INFILE with INSERT ... ON DUPLICATE KEY UPDATE on a, b, c, or on whether that is even the best approach, would be greatly appreciated.

LOAD DATA LOCAL INFILE 'C:\\Users\\nick\\Desktop\\folder\\file.csv' 
INTO TABLE db.tbl
FIELDS TERMINATED BY ',' 
ENCLOSED BY '"' 
LINES TERMINATED BY '\r\n' 
IGNORE 1 lines;

Max Yakimets · Accepted Answer · 2015-08-25 21:18:21Z

7

Since you use LOAD DATA LOCAL INFILE, the load behaves as if IGNORE were specified, i.e. rows that duplicate an existing primary key or unique index value are skipped. But, quoting the documentation:

If you specify REPLACE, input rows replace existing rows. In other words, rows that have the same value for a primary key or unique index as an existing row.

So your update-import could be:

LOAD DATA LOCAL INFILE 'C:\\Users\\nick\\Desktop\\folder\\file.csv' 
REPLACE
INTO TABLE db.tbl
FIELDS TERMINATED BY ',' 
ENCLOSED BY '"' 
LINES TERMINATED BY '\r\n' 
IGNORE 1 lines;

https://dev.mysql.com/doc/refman/5.6/en/load-data.html
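REPLACE only detects a duplicate through a primary key or unique index, so the columns that define "the same row" need one. A minimal sketch, assuming the three columns from the question are literally named a, b and c, and using uq_abc as a made-up index name:

```sql
-- Assumption: the columns are named a, b, c; uq_abc is an illustrative index name.
-- This statement fails if db.tbl already contains duplicate (a, b, c)
-- combinations, so those have to be cleaned up first.
ALTER TABLE db.tbl
  ADD UNIQUE KEY uq_abc (a, b, c);
```

Keep in mind that REPLACE removes the conflicting row and inserts the new one rather than updating it in place, so an auto-generated id on that row will generally change.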

If you need more complicated merge logic, you could import the CSV into a temporary table and then issue INSERT ... SELECT ... ON DUPLICATE KEY UPDATE.
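The temporary-table route could look roughly like the sketch below. It assumes db.tbl has a unique key on (a, b, c) and a single payload column named val; the column names and the staging table name tbl_staging are hypothetical and need to be adapted to the real schema.

```sql
-- Assumptions: unique key on (a, b, c) in db.tbl, one payload column val,
-- tbl_staging is a made-up name for the staging table.
CREATE TEMPORARY TABLE db.tbl_staging LIKE db.tbl;

-- Load the daily file into the staging table, same options as the question.
LOAD DATA LOCAL INFILE 'C:\\Users\\nick\\Desktop\\folder\\file.csv'
INTO TABLE db.tbl_staging
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES;

-- Insert new (a, b, c) combinations; for existing ones, overwrite val.
INSERT INTO db.tbl (a, b, c, val)
SELECT s.a, s.b, s.c, s.val
FROM db.tbl_staging AS s
ON DUPLICATE KEY UPDATE val = VALUES(val);

DROP TEMPORARY TABLE db.tbl_staging;
```

Unlike REPLACE, this updates the matching row in place, and the UPDATE clause can list only the columns that should change. VALUES() in this position matches the 5.6-era documentation linked above; it is deprecated from MySQL 8.0.20 onward.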

Nick · Accepted Answer · 2015-09-22 19:42:04Z

I found that the best way to do this is to insert the file with the standard LOAD DATA LOCAL INFILE:

LOAD DATA LOCAL INFILE 
INTO TABLE db.table
FIELDS TERMINATED BY ',' 
ENCLOSED BY '"' 
LINES TERMINATED BY '\r\n' 
IGNORE 1 lines;

Then use the following to delete duplicates. Note that the command below compares db.table to itself by aliasing it as both a and b.

delete a.* from db.table a, db.table b
where a.id > b.id
and a.field1 = b.field1
and a.field2 = b.field2
and a.field3 = b.field3;

To use this method it is essential that the id field is an auto-increment primary key. The above command then deletes rows that are duplicated on field1 AND field2 AND field3. As written, it deletes the row with the higher of the two auto-increment ids, so the older row is kept. Using < instead of > works the same way but deletes the row with the lower id, which keeps the most recently loaded row.
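If the goal is for the freshly imported data to win, the comparison can be flipped. A sketch of that variant, using the same table and field names as the command above and an explicit JOIN instead of the comma join:

```sql
-- Same self-join as above, written with an explicit JOIN.
-- Deletes the older row (lower id) of each duplicate pair, so the row
-- loaded most recently survives.
DELETE a
FROM db.table AS a
JOIN db.table AS b
  ON  a.field1 = b.field1
  AND a.field2 = b.field2
  AND a.field3 = b.field3
WHERE a.id < b.id;
```

Because = never matches NULL, rows with a NULL in field1, field2 or field3 are not treated as duplicates by either version of the command.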
