
I am having an issue with bulk inserts into a staging table on MS SQL Server 2008 R2.

I am inserting a CSV with ~200,000 lines, and it takes roughly 5 minutes to complete.

I have tried both the PDO and sqlsrv drivers; both yield poor performance.

Here is the code to give an idea of what I am doing (it includes both the SQLSRV and PDO code):

...
    try {
        //create structure table record
        foreach ($mapped_data as $k => $v) {
            $insert .= $k . ",";
            $values .= $v . ",";
        }
        $insert = substr($insert, 0, -1); //remove last ,
        $values = substr($values, 0, -1);
        $tableName = $table;
        if ($stageData) {
            $tableName = "stage_$table";
        }
        if ( $query == "" )
            $query = "INSERT INTO $tableName ($insert) VALUES ";
        $query .= "($values),";

        // Insert in blocks of 1000 lines
        if ($line % 1000 == 0) {
            $log->logInfo("Executing @ line: $line");
            $query = substr($query, 0, -1); //remove last ,
            $query .= ";";

            // ======================
            // = SQLSRV DRIVER CODE =
            // ======================
            sqlsrv_begin_transaction($sqlsrvConn);
            $queryResult = sqlsrv_query($sqlsrvConn,$query);
            if ($queryResult) {
                sqlsrv_commit($sqlsrvConn);
            } else {
                sqlsrv_rollback($sqlsrvConn);
            }

            // ===================
            // = PDO DRIVER CODE =
            // ===================
            $conn->beginTransaction();
            $res = $conn->prepare($query);
            if($res->execute() === false) {
                $errInfo = $res->errorInfo();
                if ( $conn->inTransaction() ) {
                    $conn->rollback();
                }
                $log->logInfo('Data importing error on line: ' . $line . $errInfo[2]);
                $errors[] = 'Data importing error on line: ' . $line . $errInfo[2];
            } else {
                if ( $conn->inTransaction() ) {
                    $conn->commit();
                    $query = "";
                    $importedRows += ($line - 6) - $importedRows;
                }
            }
        }
    }
    catch (PDOException $e) {
        if ( $conn->inTransaction() ) {
            $conn->rollBack();
        }
        $log->logInfo('PDO Exception: ' . $e->getMessage());
        $errors[] = 'PDO Exception: ' . $e->getMessage();
    }
    $line++;
} // End of while loop through each CSV Line

fclose($handle);
$totalRows = $line - 6;
$importedRows += $totalRows - $importedRows;

// Insert remaining queries afterwards...
...
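For reference, here is a minimal sketch of the same batching done with a single reusable parameterized statement instead of concatenating values into the SQL text (table and column names here are hypothetical), so the server only has to parse the INSERT shape once:

```php
<?php
// Sketch: build a multi-row parameterized INSERT once and reuse it per
// batch, so values are bound instead of concatenated into the SQL string.
// Table and column names here are hypothetical.
function buildBatchInsert($table, array $columns, $batchSize)
{
    $row = '(' . implode(',', array_fill(0, count($columns), '?')) . ')';
    return "INSERT INTO $table (" . implode(',', $columns) . ') VALUES '
         . implode(',', array_fill(0, $batchSize, $row));
}

echo buildBatchInsert('stage_table', array('col1', 'col2'), 3), "\n";
// INSERT INTO stage_table (col1,col2) VALUES (?,?),(?,?),(?,?)

// Usage with PDO (untested sketch): prepare once, then execute per batch
// with the rows' values flattened into a single array:
//   $stmt = $conn->prepare(buildBatchInsert('stage_table', $cols, $batchSize));
//   $stmt->execute($flattenedValues); // count = $batchSize * count($cols)
```

Keep in mind SQL Server limits a single request to 2100 parameters and a single VALUES list to 1000 row constructors, so `$batchSize * count($columns)` has to stay under 2100 with this approach.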

I've been scouring the internet looking for possible solutions but haven't been able to find anything that worked.

I've found this post which basically says to batch rows together (which I've already done).

And I found another post that said, for PDO, to set ConnectionPooling=0 on the connection. I tried that and didn't see any increase in performance.
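For anyone trying the same thing, here is where that option goes for each driver (server name, database, and credentials are placeholders; whether disabling pooling takes effect can depend on platform and driver version):

```php
<?php
// Sketch: disabling connection pooling in each driver. All connection
// details below are placeholders.
$sqlsrvConn = sqlsrv_connect('localhost', array(
    'Database'          => 'mydb',
    'ConnectionPooling' => 0,   // pooling behavior may also be governed by
                                // the ODBC driver manager on Windows
));

$conn = new PDO(
    'sqlsrv:Server=localhost;Database=mydb;ConnectionPooling=0',
    $user,
    $password
);
```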

Has anyone else run into this issue with SQLSRV and PHP?

Cheers,

  • +1 for using PDO, but you have to use prepared statements to be more protected. Commented Oct 7, 2013 at 18:01
  • Is the database server running on the same network as your PHP server? Are you possibly having a bandwidth issue? Commented Oct 7, 2013 at 18:04
  • Have you tried benchmarking your script to see if the slow-down is in fact the INSERTs? Commented Oct 7, 2013 at 18:06
  • The DB and PHP are running on my local development machine (which is a fairly strong machine). I have not tried benchmarking the script. Will try that Commented Oct 7, 2013 at 18:25
  • How big is each record? How much other load is on your server? How good is your disk's write throughput? Are there a lot of large fields to index? Are you CPU-bound? Have you considered that inserting 200k records via any method might take 5 minutes? Commented Oct 7, 2013 at 18:26
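A cheap way to act on the benchmarking suggestion in the comments is to wrap each batch execute in `microtime()` calls (the `usleep()` here is only a stand-in for the real execute call):

```php
<?php
// Sketch of the benchmarking suggested above: time each batch execute to
// confirm the INSERTs are actually where the 5 minutes go.
$start = microtime(true);
usleep(1000);                 // stand-in for $stmt->execute() / sqlsrv_query()
$elapsed = microtime(true) - $start;
printf("batch took %.4f s\n", $elapsed);
```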

1 Answer


I had a somewhat similar issue. In my case the problem was a lack of available memory, so the server kept taking extra time to swap to virtual memory. If that's not your problem, my answer will not be useful to you.

You are using string concatenation followed by substr to remove the last comma. When you use substr, it makes another copy of the string, which is memory-intensive for long strings. See this question for an example of what happens when the strings are long. When I switched to array concatenation, my speed greatly increased due to the lower memory usage. However, if you don't have memory issues, array concatenation could actually be slower for you.
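To make the difference concrete, here is a small self-contained comparison of the two approaches (toy, pre-quoted values); both produce the same VALUES list, but the array version avoids the repeated copies of an ever-growing string:

```php
<?php
// Toy comparison: string concatenation + substr vs. array + implode.
$rows = array(array('1', "'a'"), array('2', "'b'"), array('3', "'c'"));

// substr approach from the question: each .= and the final substr copy
// a string that keeps growing with the batch.
$values1 = '';
foreach ($rows as $row) {
    $values1 .= '(' . implode(',', $row) . '),';
}
$values1 = substr($values1, 0, -1);  // trailing comma removed via a full copy

// array approach: collect pieces, join once at the end.
$parts = array();
foreach ($rows as $row) {
    $parts[] = '(' . implode(',', $row) . ')';
}
$values2 = implode(',', $parts);     // no trailing-comma trimming needed

var_dump($values1 === $values2);     // both yield (1,'a'),(2,'b'),(3,'c')
```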

A couple of other things I noticed: you only need to collect the $insert column list once, and you are not unsetting big variables as soon as you no longer need them. I don't know whether correcting for that kind of thing will make an appreciable difference for you. Here are the basic kinds of changes you could try:

    if(!isset($insert)) {
        $insert = array();
        $collect = true;
    }
    $values = array();
    foreach ($mapped_data as $k => $v) {
        if(isset($collect)) 
            $insert[] = $k;
        $values[] = $v;
    }
    unset($collect);

    .....
    if(!isset($queryend)) 
         $queryend = array();
    $queryend[] = "(".implode(",",$values).")";

    .....
    $query = "INSERT INTO $tableName ("
             .implode(",",$insert)
             .") VALUES "
             .implode(",", $queryend);
    unset($queryend);  //always unset big things as soon as possible

    .....  //after $res = $conn->prepare($query);
    unset($query);

1 Comment

Memory isn't an issue on my machine, but it doesn't hurt to make things more memory efficient either. I will try this and see if it helps. Thanks!
