
Which of the two methods below will be faster for inserting a large number of rows into a table?

Query method 1: execute the queries one by one.

INSERT INTO tbl_user(id, name, number) VALUES(NULL, 'A', '9999999999');
INSERT INTO tbl_user(id, name, number) VALUES(NULL, 'B', '9999999999');
INSERT INTO tbl_user(id, name, number) VALUES(NULL, 'C', '9999999999');

Query method 2: execute a single query.

INSERT INTO tbl_user(id, name, number) VALUES(NULL, 'A', '9999999999'),
                                             (NULL, 'B', '9999999999'), 
                                             (NULL, 'C', '9999999999');
  • Method 2, because it executes only once. Commented Sep 24, 2016 at 10:24
  • Method 2 is faster than the first. Commented Sep 24, 2016 at 10:26
  • What we have here is a display of how many people have no clue what happens. You want to execute them one by one. You never want to do option number 2. Why? Because if you wrap the first method in a transaction block and use prepared statements, you will never run into an error because of max_allowed_packet. Option number 2 is also a bit more resource-intensive to parse; you always want to avoid that, and it's harder to debug if anything bad happens. Therefore, option #3 is the fastest: add START TRANSACTION at the beginning and COMMIT at the end. Good luck. Commented Sep 24, 2016 at 15:22
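
For illustration, a minimal sketch of the approach that comment describes: per-row prepared statements inside one transaction. Python with the mysql-connector-python driver, the connection details, and the shortened number value are assumptions, not from the thread; any DB-API driver works the same way.

import mysql.connector  # assumed driver, not specified in the thread

conn = mysql.connector.connect(user='root', password='secret', database='test')
cur = conn.cursor(prepared=True)  # server-side prepared statement, parsed once

conn.start_transaction()
sql = "INSERT INTO tbl_user (id, name, number) VALUES (NULL, %s, %s)"
for name in ('A', 'B', 'C'):
    cur.execute(sql, (name, '9999999'))  # value shortened to fit an INT column
conn.commit()  # one flush to disk for the whole batch instead of one per row

cur.close()
conn.close()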

3 Answers


Since there are a few arguments, I thought I would try a benchmark. But first, the table:

 CREATE TABLE `tbl_user` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(20) DEFAULT NULL,
  `number` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB 

I then generated SQL queries of the form in the question with two lines of Python.
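
The answer doesn't show the script itself; a sketch of what those lines might look like (file names are assumptions):

# Scenario 1: a file with 1000 identical single-row INSERTs
with open('scenario1.sql', 'w') as f:
    f.writelines("INSERT INTO tbl_user VALUES(NULL,'A','9999999');\n" for _ in range(1000))

# Scenario 2: one INSERT with 1000 VALUES lists
with open('scenario2.sql', 'w') as f:
    f.write("INSERT INTO tbl_user VALUES" + ",\n".join("(NULL,'A','9999999')" for _ in range(1000)) + ";\n")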

Scenario 1:
Many single-row inserts, each query exactly the same.

INSERT INTO tbl_user VALUES(NULL,'A','9999999');
INSERT INTO tbl_user VALUES(NULL,'A','9999999');

1000 rows: average (mean) of three runs, 45.80 seconds
5000 rows: single run, 220 seconds

Scenario 2:
A single query inserting 1000 rows at a time; it looks like this:

INSERT INTO tbl_user VALUES(NULL,'A','9999999'),
(NULL,'A','9999999'),
(NULL,'A','9999999'),
(NULL,'A','9999999'),

1000 rows: average (mean) of three runs, 0.17 seconds
5000 rows: average (mean) of three runs, 0.48 seconds
10000 rows: average (mean) of three runs, 1.06 seconds

Scenario 3:
Similar to scenario 1, but with START TRANSACTION and COMMIT wrapped around the insert statements.

1000 rows: average (mean) of three runs, 0.16 seconds
5000 rows: average (mean) of three runs, 0.48 seconds
10000 rows: average (mean) of three runs, 0.91 seconds

Conclusion:
Scenario 2, which is what the two other answers propose, indeed outperforms scenario 1 in a big way. With this data it's hard to choose between scenarios 2 and 3, though; more rigorous testing with a larger number of inserts would be required. Without that information I would probably go with scenario 3, the reason being that parsing a very large string usually has its overheads, and so does preparing one! I suspect that if we tried to insert about 50,000 records at once in a single statement, it might actually be a lot slower.
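
Not benchmarked above, but a plausible middle ground under that reasoning: send multi-row inserts in modest chunks inside a single transaction, so no one statement grows too large. A sketch, again assuming Python with mysql-connector-python (chunk size and connection details are guesses; a comment further down suggests keeping each batch around 1 MB):

import mysql.connector  # assumed driver

CHUNK = 1000  # rows per statement; keep each statement well under max_allowed_packet

conn = mysql.connector.connect(user='root', password='secret', database='test')
cur = conn.cursor()

rows = [(None, 'A', '9999999')] * 50000  # sample data; None becomes NULL for the auto-increment id
sql = "INSERT INTO tbl_user (id, name, number) VALUES (%s, %s, %s)"

conn.start_transaction()
for i in range(0, len(rows), CHUNK):
    # mysql-connector-python folds executemany on a plain INSERT ... VALUES
    # into a single multi-row INSERT per chunk
    cur.executemany(sql, rows[i:i + CHUNK])
conn.commit()

cur.close()
conn.close()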


3 Comments

A perfect answer with a proper example.
Glad to have been of help.
My experience has been a 10x speed up when batching, but "diminishing returns" happens before 1000 rows.

The second method (query) is faster than the first one.

In the first method, three separate queries are executed against the table, whereas in the second method a single query inserts multiple records into the table.

You will see a major difference when you insert hundreds of rows at a time.

Comments


The second query is much faster than the first. Per the documentation, the factors contributing to the performance gain of inserting multiple rows in a single statement are:

9.2.2.1 Speed of INSERT Statements

To optimize insert speed, combine many small operations into a single large operation. Ideally, you make a single connection, send the data for many new rows at once, and delay all index updates and consistency checking until the very end.

The time required for inserting a row is determined by the following factors, where the numbers indicate approximate proportions:

Connecting: (3)

Sending query to server: (2)

Parsing query: (2)

Inserting row: (1 × size of row)

Inserting indexes: (1 × number of indexes)

Closing: (1)

If you are inserting many rows from the same client at the same time, use INSERT statements with multiple VALUES lists to insert several rows at a time. This is considerably faster (many times faster in some cases) than using separate single-row INSERT statements. If you are adding data to a nonempty table, you can tune the bulk_insert_buffer_size variable to make data insertion even faster.
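
To apply the quoted advice from client code, a single multi-row INSERT can be built with placeholders. A minimal sketch, assuming Python with mysql-connector-python (driver and connection details are not from the answer):

import mysql.connector  # assumed driver

conn = mysql.connector.connect(user='root', password='secret', database='test')
cur = conn.cursor()

rows = [('A', '9999999'), ('B', '9999999'), ('C', '9999999')]
placeholders = ", ".join(["(NULL, %s, %s)"] * len(rows))
sql = "INSERT INTO tbl_user (id, name, number) VALUES " + placeholders
params = [value for row in rows for value in row]  # flatten to match placeholders

cur.execute(sql, params)  # one statement, multiple VALUES lists
conn.commit()

cur.close()
conn.close()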

5 Comments

Method #2 isn't faster at all; you're providing only half the information. If anything, it's slower. The document you quoted has an interesting point - "To optimize insert speed, combine many small operations into a single large operation" - and we do that by using transactions, not by sending a humongous chunk of text for MySQL to parse. I won't vote on your answer, but it's completely misleading.
That was a general point about optimization. Read the segment below it, which I added in my answer not as a quote. It is again from the same document: "If you are inserting many rows from the same client at the same time, use INSERT statements with multiple VALUES lists to insert several rows at a time. This is considerably faster (many times faster in some cases) than using separate single-row INSERT statements."
So you think that sending a 512 MB string that MySQL has to parse is faster than sending the values you want to insert, wrapped in a transaction? Do you know what a transaction does for HDD I/O?
Non-Unique indexes are "delayed" (via the "change buffer"), so they probably have less impact than implied by the formula above. (Of course, the indexes do need to be updated eventually.)
@N.B. - There are settings that would cause 512 MB to croak. Limit the batching to about 1 MB.
