I am creating an application that inserts (or updates) values in MySQL daily. A simplified recordset with headers:
ItemName,ItemNumber,ItemQty,Date
test1,1,5,2016/01/01
test1,1,3,2016/01/02
test2,2,7,2016/01/01
test2,2,5,2016/01/02
Using a simple INSERT statement, the above recordset (the real one has 16 columns and 216,000 records, covering a week of values) takes about 4 minutes to import via PHP/MySQL. Of course, if I import the same recordset again, I get duplicates. I am trying to find a way to effectively disallow duplicate entries. The aim: when I import a recordset every day that contains dates for the current week, only the rows with new dates should end up added.
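For reference, the daily import currently amounts to plain inserts along these lines (items is just a placeholder table name):

    INSERT INTO items (ItemName, ItemNumber, ItemQty, `Date`)
    VALUES
        ('test1', 1, 5, '2016-01-01'),
        ('test1', 1, 3, '2016-01-02'),
        ('test2', 2, 7, '2016-01-01'),
        ('test2', 2, 5, '2016-01-02');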
The only thing that might change between consecutive imports is the ItemQty. In PHP I added logic that queries the DB for ItemName, ItemNumber and Date with the values I am about to insert. If the SELECT returns a result, I skip the row; if it doesn't, I insert a new one. The problem is that with this logic added, the import no longer takes 4 minutes but a couple of hours. (It works, though.)
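To make the bottleneck concrete, per row the logic boils down to two statements like these (same placeholder names as above), executed once for each of the 216,000 records:

    -- check whether the row already exists...
    SELECT 1 FROM items
    WHERE ItemName = 'test1' AND ItemNumber = 1 AND `Date` = '2016-01-01';

    -- ...and only insert if the SELECT came back empty
    INSERT INTO items (ItemName, ItemNumber, ItemQty, `Date`)
    VALUES ('test1', 1, 5, '2016-01-01');

I assume the extra round trip per row is what makes it so slow.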
Any ideas?
I was thinking that perhaps on insert I could also store a checksum column, for example md5(ItemName, ItemNumber, ItemQty, Date), and then check that checksum rather than the SELECT * FROM $table WHERE ItemName = value AND ItemNumber = value AND ItemQty = value AND Date = value that I currently run.
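Concretely, I was picturing something like this (row_checksum is just an example column name):

    ALTER TABLE items
        ADD COLUMN row_checksum CHAR(32),
        ADD UNIQUE KEY uq_row_checksum (row_checksum);

    -- compute the checksum in MySQL itself while inserting
    INSERT INTO items (ItemName, ItemNumber, ItemQty, `Date`, row_checksum)
    VALUES ('test1', 1, 5, '2016-01-01',
            MD5(CONCAT_WS('|', 'test1', 1, 5, '2016-01-01')));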
My problem is that the records I insert have nothing unique about them individually. Uniqueness only emerges from a group of fields, compared against the dataset being imported. If I manage to get uniqueness somehow, I'll also solve my other problem: deleting or updating a row when the ItemQty changes.
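For what it's worth, the group of fields that would be unique is (ItemName, ItemNumber, Date), since ItemQty is the only value that legitimately changes. If a composite unique key over those columns is a sane approach, I imagine MySQL's INSERT ... ON DUPLICATE KEY UPDATE could cover the update case too, roughly like this (items is still a placeholder name):

    ALTER TABLE items
        ADD UNIQUE KEY uq_item_day (ItemName, ItemNumber, `Date`);

    -- a new date inserts; an existing (ItemName, ItemNumber, Date) updates its qty
    INSERT INTO items (ItemName, ItemNumber, ItemQty, `Date`)
    VALUES ('test1', 1, 9, '2016-01-01')
    ON DUPLICATE KEY UPDATE ItemQty = VALUES(ItemQty);

But I don't know whether that performs acceptably at 216,000 rows per import, or whether the checksum idea above would be faster.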