I am creating an application that inserts (or updates) values in MySQL daily. A simplified recordset, with headers, is:

ItemName,ItemNumber,ItemQty,Date
test1,1,5,2016/01/01
test1,1,3,2016/01/02
test2,2,7,2016/01/01
test2,2,5,2016/01/02

With a simple INSERT statement, importing the full recordset (16 columns, 216,000 records) takes about 4 minutes (PHP/MySQL). This covers a week of values. Of course, if I import the same recordset again, I get duplicates. I am trying to find a way to effectively disallow duplicate entries. The aim: if I import a recordset every day that contains the dates of the current week, only the rows for the new dates should be added.

The only thing that might change between consecutive imports is the ItemQty. In PHP I wrote logic that queries the db for ItemName, ItemNumber, Date using the values I am trying to insert. If the SELECT statement returns a result, I break. If it doesn't, I proceed to insert a new row. The problem is that with this logic added, the import no longer takes 4 minutes but a couple of hours. (It works, though.)

Any ideas?

I was thinking that when I insert, I could also insert something like a checksum column, for example md5(ItemName,ItemNumber,ItemQty,Date), and then check this checksum rather than running the SELECT * FROM $table WHERE ItemName = value AND ItemNumber = value AND ItemQty = value AND Date = value that I currently have.

My problem is that the records I insert basically have nothing unique. Uniqueness comes only from a group of fields, when compared against the dataset to be imported. If I somehow manage to get uniqueness, I'll solve my other problem too, which is deleting or updating a row when the ItemQty changes.

2 Answers

1

What you are looking for is a unique constraint. You can add all the relevant columns to the constraint; if an incoming row matches an existing row on all of those columns, the insert will not proceed.
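
As a rough sketch of what that could look like in MySQL: the table name `items` and the key name `uq_item_day` are made up for illustration, and the choice of (ItemName, ItemNumber, Date) as the key is an assumption based on the question, where ItemQty is the one value that may change between imports.

```sql
-- Hypothetical table name `items`; column names are taken from the question.
-- Note: this ALTER fails if the table already contains duplicate rows for
-- the key columns, so existing duplicates have to be cleaned up first.
ALTER TABLE items
    ADD UNIQUE KEY uq_item_day (ItemName, ItemNumber, `Date`);

-- A plain INSERT of an already-existing (ItemName, ItemNumber, Date)
-- combination now raises a duplicate-key error.
-- INSERT IGNORE turns that error into a warning and skips the row.
INSERT IGNORE INTO items (ItemName, ItemNumber, ItemQty, `Date`)
VALUES ('test1', 1, 5, '2016-01-01');
```

With the constraint in place, the per-row SELECT in PHP should no longer be needed, because the database does the duplicate check through the index.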

1

A few options:

1) In PHP, iterate over the records, detecting the duplicate ones and keeping the newest:

$itemsArray = []; // The array where you have stored your data

$uniqueItems = [];

foreach($itemsArray as $item)
{
    if(isset($uniqueItems[$item['ItemName']]))
    {
        $oldRecord = $uniqueItems[$item['ItemName']];

        $newTimeStamp = strtotime($item['Date']); // Might not work with your date format
        $currentTimeStamp = strtotime($oldRecord['Date']);

        if($newTimeStamp > $currentTimeStamp)
        {
            $uniqueItems[$item['ItemName']] = $item;
        }
    }
    else
    {
        $uniqueItems[$item['ItemName']] = $item;
    }
}

// $uniqueItems now holds only 1 record per ItemName (the newest one)

2) Sort the data in PHP by date in ascending order (before inserting into the database). Then, in your INSERT statement, use ON DUPLICATE KEY UPDATE. This causes MySQL to update any record that has a duplicate key. Because the older records are inserted first and the latest records last, the latest data overwrites the old records' data.
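
A minimal sketch of such a statement, assuming a hypothetical table named `items` that already has a unique key on (ItemName, ItemNumber, Date); without a unique or primary key, ON DUPLICATE KEY UPDATE never triggers and the rows are simply inserted:

```sql
-- Hypothetical table `items`; sample values are taken from the question.
-- Assumes a UNIQUE KEY on (ItemName, ItemNumber, `Date`) already exists.
INSERT INTO items (ItemName, ItemNumber, ItemQty, `Date`)
VALUES
    ('test1', 1, 5, '2016-01-01'),
    ('test1', 1, 3, '2016-01-02'),
    ('test2', 2, 7, '2016-01-01'),
    ('test2', 2, 5, '2016-01-02')
-- For rows whose key already exists, only the quantity is overwritten.
ON DUPLICATE KEY UPDATE ItemQty = VALUES(ItemQty);
```

Sending many rows per statement, as above, also tends to be much faster than one INSERT per row. On recent MySQL 8.0 releases the VALUES() function in this position is deprecated in favor of a row alias, so the exact syntax may need adjusting for your server version.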

3 Comments

In order to use ON DUPLICATE KEY he needs a unique key in the table.
I think the unique key is (itemName, itemNumber, date)
So he can use the first approach
