
I have a PHP function that does batch insertion into a MySQL table. The function takes an array as input, then loops through the array to build the insert query like this:

public function batchInsert($values){
    $nbValues = count($values);
    $sql = 'INSERT INTO vouchers (`code`,`pin`,`owner_id`,`value`,`description`,`expire_date`,`lifetime`) VALUES ';
    // Build one placeholder group per row: (:col1_0, ...), (:col1_1, ...), ...
    for ($i=0; $i < $nbValues; $i++) {
        $sql .= '(:col1_'.$i.', :col2_'.$i.', :col3_'.$i.', :col4_'.$i.', :col5_'.$i.', :col6_'.$i.', :col7_'.$i.')';
        if ($i !== ($nbValues-1))
            $sql .= ',';
    }
    $command = Yii::app()->db->createCommand($sql);
    for ($i=0; $i < $nbValues; $i++) {
        $command->bindParam(':col1_'.$i, $values[$i]['code'], PDO::PARAM_STR);
        // bindValue, not bindParam, because sha1() returns an expression rather than a variable
        $command->bindValue(':col2_'.$i, sha1($values[$i]['pin']), PDO::PARAM_STR);
        $command->bindParam(':col3_'.$i, $values[$i]['owner_id'], PDO::PARAM_INT);
        $command->bindParam(':col4_'.$i, $values[$i]['value'], PDO::PARAM_INT);
        $command->bindParam(':col5_'.$i, $values[$i]['description'], PDO::PARAM_STR);
        $command->bindParam(':col6_'.$i, $values[$i]['expire_date'], PDO::PARAM_STR);
        $command->bindParam(':col7_'.$i, $values[$i]['lifetime'], PDO::PARAM_INT);
    }
    return $command->execute();
}

If the input array has 1K elements, building this SQL query takes quite a long time. I believe this is caused by the way the $sql variable is reconstructed on every loop iteration. Is there a better way you could suggest to optimise this? Thank you!

P/S: At the end of this batch insertion, I need to export all generated vouchers to an Excel file. Hence, if I build one single query and the query succeeds, the export function is called. With many separate insertions, I cannot keep track of which rows were inserted and which were not (e.g. the voucher code is unique, randomly generated and may have a chance of colliding). That's why I need a single query (or am I wrong?).

3 Comments
  • I think in your first loop you need to build your SQL query, and in another bind the params, not in a nested for. Commented Apr 28, 2015 at 10:07
  • @lolka_bolka it's not a nested for, it's two sequential fors. The if statement is inline, so the for's closing bracket sits where the if's closing bracket would normally be. Commented Apr 28, 2015 at 10:15
  • Oh, I see, sorry, my mistake. Commented Apr 28, 2015 at 10:17

3 Answers


Rather than building one gigantic string, consider executing individual inserts, taking advantage of a prepared statement:

public function batchInsert($values){
    $nbValues = count($values);
    $sql = 'INSERT INTO vouchers (`code`,`pin`,`owner_id`,`value`,`description`,`expire_date`,`lifetime`)
        VALUES (:col1, :col2, :col3, :col4, :col5, :col6, :col7)';
    // Prepare the short single-row statement once...
    $command = Yii::app()->db->createCommand($sql);
    // ...then just rebind and execute it for every row
    for ($i=0; $i < $nbValues; $i++) {
        $command->bindParam(':col1', $values[$i]['code'], PDO::PARAM_STR);
        $command->bindValue(':col2', sha1($values[$i]['pin']), PDO::PARAM_STR);
        $command->bindParam(':col3', $values[$i]['owner_id'], PDO::PARAM_INT);
        $command->bindParam(':col4', $values[$i]['value'], PDO::PARAM_INT);
        $command->bindParam(':col5', $values[$i]['description'], PDO::PARAM_STR);
        $command->bindParam(':col6', $values[$i]['expire_date'], PDO::PARAM_STR);
        $command->bindParam(':col7', $values[$i]['lifetime'], PDO::PARAM_INT);
        $command->execute();
    }
}

So you only prepare the short insert statement once, and just bind/execute inside the loop.


3 Comments

I like getting rid of a for loop. This also removes a potential issue of the database not accepting the query because it is too long (input buffer overflow or something like that).
Thank you Mark and Richard, the reason why I did it that way is that at the end of this batch insertion I need to export all generated vouchers to an Excel file. Hence, if I built one single query and the query was successful, the export function is called. By doing many single insertions, I cannot keep track of which one is inserted and which one is not (the voucher code is unique, randomly generated and may have a chance of colliding). Any suggestion is much appreciated! Thank you!
If you have potential issues with a failure on individual inserts, then wrap the whole thing in a transaction and roll back on any failure.

Let's define your requirements first:

  • You need to insert 1000 records, each record being defined in an array
  • It should be fast
  • An insert could fail, so it has to be repeated

The first problem is that you are dealing with a database here. Modern MySQL uses InnoDB as its storage engine, which is a transactional engine. PDO, by default, uses something called auto-commit.

What does all this mean for you? Basically, a transactional engine will force the hard drive to really write the record before it tells you it's written. Engines such as MyISAM or NoSQL stores won't do this. They just let the OS worry about the writing, and the OS queues up the information that it should write to the disk. Disks are terribly slow, so the OS tries to compensate, and some disks even have caches where they store a lot of temporary data.

However, unless the information is really written to the disk, it isn't saved, since it could be lost. This is the D part of ACID: data is durable, ergo it's on a permanent storage device. This is why MySQL and other transactional databases are slow: hard drives are painfully slow devices. A mechanical hard drive can perform between 100 and 300 writes per second (we'll call these IOPS, or Input-Output Operations Per Second). That's snail-like slow.

So, what PDO does by default is force every query to be its own transaction. That means every single query you run costs one of those IOPS, and you only have a few hundred of them. So when you run 1000 inserts, even if everything goes well and you really do have 300 IOPS available, your inserts will take a while. If any of them fail and you have to retry, it's even worse, since it takes even longer.

So what can you do to make it quicker? You do two things.

1) Wrap several inserts into a single transaction using PDO's beginTransaction method and commit when you're done. That lets the hard drive write several records using one IOPS. If you wrap all 1000 inserts into a single transaction, they will most likely be written extremely fast. Even though disks have low IOPS, they have quite a lot of bandwidth, so they can eat up all 1000 inserts in a single go.

2) Make sure that all your inserts will be successful. That means you should probably generate your voucher code at a later stage, once everything is inserted. Remember, if a single query in a transaction fails, all of them fail (the A of ACID: Atomicity).

Basically, what I'm trying to highlight here is that Mark Baker posted a great answer and you should most likely modify your logic a bit. Prepare the statement once, execute it multiple times. However, do wrap the calls to execute in a transaction; that will make it go really fast.
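
For illustration, here is a minimal sketch of that combination: prepare once, execute per row, and wrap the whole batch in a single transaction using Yii 1.x's CDbTransaction. The all-or-nothing rollback on failure is an assumption about how you would want to handle a voucher-code collision:

public function batchInsert($values){
    $sql = 'INSERT INTO vouchers (`code`,`pin`,`owner_id`,`value`,`description`,`expire_date`,`lifetime`)
        VALUES (:col1, :col2, :col3, :col4, :col5, :col6, :col7)';
    // Prepare the single-row statement once, outside the loop
    $command = Yii::app()->db->createCommand($sql);
    // One transaction for the whole batch: one commit, roughly one disk sync
    $transaction = Yii::app()->db->beginTransaction();
    try {
        foreach ($values as $row) {
            $command->bindValue(':col1', $row['code'], PDO::PARAM_STR);
            $command->bindValue(':col2', sha1($row['pin']), PDO::PARAM_STR);
            $command->bindValue(':col3', $row['owner_id'], PDO::PARAM_INT);
            $command->bindValue(':col4', $row['value'], PDO::PARAM_INT);
            $command->bindValue(':col5', $row['description'], PDO::PARAM_STR);
            $command->bindValue(':col6', $row['expire_date'], PDO::PARAM_STR);
            $command->bindValue(':col7', $row['lifetime'], PDO::PARAM_INT);
            $command->execute();
        }
        $transaction->commit();
        return true; // every row is in, so it's safe to run the Excel export
    } catch (Exception $e) {
        // e.g. a duplicate voucher code: nothing was persisted
        $transaction->rollback();
        return false;
    }
}

Because the transaction is atomic, a true return means every voucher was inserted and the export can safely run; false means none were, so you can regenerate the colliding code and retry the whole batch.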

2 Comments

Much appreciated and respected, N.B! Thank you for sharing detailed and useful knowledge! You are a real asset to users on stackoverflow!
@osappuk - thanks :) and good luck with your future programming challenges!

What I did was change the string into an array, then implode it at the last step:

public function batchInsert($values){
    $nbValues = count($values);
    $sql = array();
    for ($i=0; $i < $nbValues; $i++) {
        $sql[] = '(:col1_'.$i.', :col2_'.$i.', :col3_'.$i.', :col4_'.$i.', :col5_'.$i.', :col6_'.$i.', :col7_'.$i.')';
    }
    // Each array element already includes its parentheses, so a plain comma join is enough
    $command = Yii::app()->db->createCommand('INSERT INTO vouchers (`code`,`pin`,`owner_id`,`value`,`description`,`expire_date`,`lifetime`) VALUES ' . implode(',', $sql));
    for ($i=0; $i < $nbValues; $i++) {
        $command->bindParam(':col1_'.$i, $values[$i]['code'], PDO::PARAM_STR);
        $command->bindValue(':col2_'.$i, sha1($values[$i]['pin']), PDO::PARAM_STR);
        $command->bindParam(':col3_'.$i, $values[$i]['owner_id'], PDO::PARAM_INT);
        $command->bindParam(':col4_'.$i, $values[$i]['value'], PDO::PARAM_INT);
        $command->bindParam(':col5_'.$i, $values[$i]['description'], PDO::PARAM_STR);
        $command->bindParam(':col6_'.$i, $values[$i]['expire_date'], PDO::PARAM_STR);
        $command->bindParam(':col7_'.$i, $values[$i]['lifetime'], PDO::PARAM_INT);
    }
    return $command->execute();
}

4 Comments

I ran a test: I concatenated a string 1024 times, the string being appended was 1024 characters. Time taken: 0.00081706047058105 sec. Memory taken: 1.25 MB. This test shows that what you concluded is basically wrong. String concatenation is quick, and it uses RAM like any other operation in computing. (A sketch reproducing this test follows the comments below.)
Thank you Richard, as said above, I need to build a big single query so I would like to try it your way. However, there seemed to be a syntax error in the line with the $command variable. Could you please check and correct it? Thank you!
So basically you're still claiming that string concatenation is slow and that sticking stuff into an array is so fast that it made your app run as if on steroids. It's just not true, and the sad part is that some naive googler will stumble upon this and take it as truth when the complete opposite is valid. Also, the code above won't make anything run faster; the database will still hang at the committing part. You're looking at entirely the wrong part of the system to optimise.
This was four or five years ago, pulling data out of a Magento API in order to build an XML shopping feed. It kept running out of memory until I did a search and found this to be the reason. I then followed its example of converting the string concatenation to array imploding.
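
For anyone who wants to reproduce the comparison from the first comment above, a throwaway micro-benchmark along these lines will do. The sizes mirror the 1024 × 1024-character test described there; exact timings will of course vary by machine:

$chunk = str_repeat('a', 1024);   // the 1024-character string being appended
$iterations = 1024;               // appended 1024 times, as in the comment above

// Variant 1: plain string concatenation
$start = microtime(true);
$str = '';
for ($i = 0; $i < $iterations; $i++) {
    $str .= $chunk;
}
$concatTime = microtime(true) - $start;

// Variant 2: collect into an array, implode at the end
$start = microtime(true);
$parts = array();
for ($i = 0; $i < $iterations; $i++) {
    $parts[] = $chunk;
}
$str = implode('', $parts);
$implodeTime = microtime(true) - $start;

printf("concat: %.6fs  implode: %.6fs  peak memory: %.2f MB\n",
    $concatTime, $implodeTime, memory_get_peak_usage(true) / 1048576);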
