mysql unique index from set of columns

Question

I am creating an application that inserts (or updates) values in MySQL daily. A simplified recordset, with headers, is:

ItemName,ItemNumber,ItemQty,Date
test1,1,5,2016/01/01
test1,1,3,2016/01/02
test2,2,7,2016/01/01
test2,2,5,2016/01/02

Using a simple INSERT statement for the above recordset, with 16 columns and 216,000 records, takes about 4 minutes (PHP/MySQL). This covers a week of values. Of course, if I import the same recordset again I get duplicates. I am trying to find an efficient way to disallow duplicate entries. The aim: if I import a recordset every day that contains dates for the current week, only the rows for new dates should be added.

The only thing that might change between consecutive imports is the ItemQty. In PHP I wrote logic that queries the db for ItemName, ItemNumber and Date with the values I am trying to insert. If the SELECT statement returns a result, I skip the row. If it doesn't, I insert a new row. The problem is that with this logic the import no longer takes 4 minutes but a couple of hours. (It works, though.)

Any ideas?

I was thinking of inserting something like a checksum column, for example md5(ItemName,ItemNumber,ItemQty,Date), and then checking this checksum rather than the SELECT * FROM $table WHERE ItemName = value AND ItemNumber = value AND ItemQty = value AND Date = value that I currently have.

My problem is that the records I insert basically have nothing unique. Uniqueness comes only from a group of fields, when compared against the dataset to be imported. If I somehow manage to get uniqueness, I'll solve my other problem too, which is deleting a row or updating a row when the ItemQty changes.

Ceeee · Accepted Answer · 2016-04-07 09:00:15Z

1

What you are looking for is a unique constraint. You can add all the columns that define a duplicate to a single unique constraint; if a row with the same values in all of those columns already exists, the insert is rejected.
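
As a minimal sketch of such a constraint: the table name `items` and the key name `uq_item_day` are hypothetical, and choosing (ItemName, ItemNumber, Date) as the identifying columns is an assumption based on the question's sample data. ItemQty is left out because it is the one value that may change between imports.

```sql
-- Hypothetical table name (items) and key name (uq_item_day).
-- Assumes no duplicate (ItemName, ItemNumber, Date) rows exist yet;
-- otherwise MySQL refuses to create the key until they are removed.
ALTER TABLE items
    ADD UNIQUE KEY uq_item_day (ItemName, ItemNumber, `Date`);

-- A plain INSERT of an existing combination now fails with a duplicate-key error.
-- INSERT IGNORE skips such rows instead (the error becomes a warning).
INSERT IGNORE INTO items (ItemName, ItemNumber, ItemQty, `Date`)
VALUES ('test1', 1, 5, '2016-01-01');
```

With the key in place, the per-row SELECT check in PHP should no longer be needed, since the index performs the lookup during the insert. Note that INSERT IGNORE leaves the ItemQty of an existing row unchanged.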


Nadir · Accepted Answer · 2016-04-07 09:07:01Z

1

A few options:

1) In PHP, iterate over the records, detecting the duplicates and keeping only the newest one

$itemsArray = []; // The array where you have stored your data

$uniqueItems = [];

foreach($itemsArray as $item)
{
    if(isset($uniqueItems[$item['ItemName']]))
    {
        $oldRecord = $uniqueItems[$item['ItemName']];

        $newTimeStamp = strtotime($item['Date']); // Might not work with your date format
        $currentTimeStamp = strtotime($oldRecord['Date']);

        if($newTimeStamp > $currentTimeStamp)
        {
            $uniqueItems[$item['ItemName']] = $item;
        }
    }
    else
    {
        $uniqueItems[$item['ItemName']] = $item;
    }
}

// $uniqueItems now holds only one record per ItemName (the newest one)

2) Sort the data in PHP by date in ascending order (before inserting it into the database). Then, in your INSERT statement, use ON DUPLICATE KEY UPDATE. This causes MySQL to update any existing record that has the same key. Because the older records are inserted first and the latest records last, the latest data overwrites the old records' data.
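
A sketch of the second option, assuming a table named `items` (hypothetical) with a unique key over ItemName, ItemNumber and Date, as the comments below suggest. The column types are guesses and are not taken from the question.

```sql
-- Hypothetical schema; column types are assumptions.
CREATE TABLE IF NOT EXISTS items (
    ItemName   VARCHAR(100) NOT NULL,
    ItemNumber INT          NOT NULL,
    ItemQty    INT          NOT NULL,
    `Date`     DATE         NOT NULL,
    UNIQUE KEY uq_item_day (ItemName, ItemNumber, `Date`)
);

-- Sending several rows per statement saves round trips
-- compared with issuing one INSERT per row.
INSERT INTO items (ItemName, ItemNumber, ItemQty, `Date`)
VALUES
    ('test1', 1, 5, '2016-01-01'),
    ('test1', 1, 3, '2016-01-02'),
    ('test2', 2, 7, '2016-01-01'),
    ('test2', 2, 5, '2016-01-02')
ON DUPLICATE KEY UPDATE
    ItemQty = VALUES(ItemQty);  -- take the quantity from the incoming row
```

Running the same statement again should add no rows; it only rewrites ItemQty where the key already exists. Newer MySQL 8.0 releases deprecate `VALUES()` in this position in favour of a row alias, so check the documentation for the server version in use.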


3 Comments

Barmar Over a year ago

In order to use ON DUPLICATE KEY he needs a unique key in the table.

Barmar Over a year ago

I think the unique key is (itemName, itemNumber, date)

Nadir Over a year ago

So he can use the first approach
