Which one will give me better performance?

  1. Loop over the values in Java, append them to a single SQL string, and execute the statement once? Note that a PreparedStatement is also used here.

    INSERT INTO tbl ( c1 , c2 , c3 ) VALUES ('r1c1', 'r1c2', 'r1c3'), ('r2c1', 'r2c2', 'r2c3'), ('r3c1', 'r3c2', 'r3c3')

  2. Use batch execution, as below.

    String SQL_INSERT = "INSERT INTO tbl (c1, c2, c3) VALUES (?, ?, ?);";

    try (
            Connection connection = database.getConnection();
            PreparedStatement statement = connection.prepareStatement(SQL_INSERT);
        ) {
    
        int i = 0;
        for (Entity entity : entities) {
            statement.setString(1, entity.getSomeProperty());
            // ...
    
            statement.addBatch();
            i++;
    
            if (i % 1000 == 0 || i == entities.size()) {
                statement.executeBatch(); // Execute every 1000 items.
            }
        }
    }
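
For comparison, here is a minimal sketch of how option 1 could be written with bound parameters instead of concatenated literals. The class and method names (`MultiRowInsert`, `buildSql`, `insertAll`) are hypothetical, the table and columns are the ones from the example above, and each row is assumed to be a `String[3]`.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collections;
import java.util.List;

public class MultiRowInsert {

    // Builds "INSERT INTO tbl (c1, c2, c3) VALUES (?, ?, ?), (?, ?, ?), ..."
    // with one "(?, ?, ?)" group per row.
    public static String buildSql(int rowCount) {
        if (rowCount < 1) {
            throw new IllegalArgumentException("rowCount must be >= 1");
        }
        String groups = String.join(", ", Collections.nCopies(rowCount, "(?, ?, ?)"));
        return "INSERT INTO tbl (c1, c2, c3) VALUES " + groups;
    }

    // Inserts all rows with a single statement and returns the affected row count.
    // Assumption: every element of rows holds exactly three values (c1, c2, c3).
    public static int insertAll(Connection connection, List<String[]> rows) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(buildSql(rows.size()))) {
            int index = 1; // JDBC parameter indexes start at 1
            for (String[] row : rows) {
                for (String value : row) {
                    statement.setString(index++, value);
                }
            }
            return statement.executeUpdate();
        }
    }
}
```

With 10k to 60k rows, a single statement like this may run into server or driver limits (for example MySQL's `max_allowed_packet`), so in practice the rows would probably still have to be split into chunks.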
    
  • Benchmark and find out. You'll need to find an optimal batch size. Too large and you waste time preparing. Too small and you run too many queries. Commented Apr 16, 2019 at 22:36
  • Yeah, I agree with @tadman: you should test and find out empirically for yourself. I will say I think batch statements were made for this exact reason. Commented Apr 16, 2019 at 22:39
  • Sometimes the performance of a prepared statement is surprising. They're intended to be reused like this. Commented Apr 16, 2019 at 22:41
  • I might be misunderstanding your answers, guys, but I am not asking about the optimal batch size for the second approach. I am asking: if both approaches use a PreparedStatement, which one performs better? Assume there are about 10k to 60k rows to insert. Commented Apr 16, 2019 at 22:50
  • Also check whether there is a 'LOAD DATA LOCAL INFILE' API in Java to programmatically load data as if it were a CSV file. Commented Apr 16, 2019 at 22:53

1 Answer

A few years ago I gave a presentation called Load Data Fast!, in which I benchmarked many different methods of inserting data as fast as possible.

LOAD DATA INFILE was much faster than any other method I tested.

But other factors affect the speed too: the type of data, the hardware, and the load on the database from other concurrent clients. My results only describe the performance on a MacBook Pro.

Ultimately, you need to test your specific case on your server to get the most accurate answer.
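
As a starting point for that kind of test, here is a minimal timing sketch. The class and method names (`InsertBenchmark`, `bestMillis`) are hypothetical, and it only measures wall-clock time around whatever task you hand it; it is not a substitute for a proper benchmarking tool.

```java
public class InsertBenchmark {

    // Runs the task the given number of times and returns the fastest
    // elapsed wall-clock time in milliseconds.
    public static long bestMillis(int runs, Runnable task) {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be >= 1");
        }
        long bestNanos = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            long elapsed = System.nanoTime() - start;
            bestNanos = Math.min(bestNanos, elapsed);
        }
        return bestNanos / 1_000_000;
    }
}
```

You would wrap each insert method in a `Runnable` (catching `SQLException` inside it), empty the table between runs, and compare the numbers on the server you actually care about.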

This is what being a software engineer is about. You don't always get the answers spoon-fed to you. You have to do some testing to confirm them.
