
I wanted to ask how best to proceed in the case I describe below:

I am inserting more than 200,000 rows into a MySQL database via PHP, taking the lines from a CSV file. My question is as follows:

Should I parse the CSV file and insert the rows as I read them, or should I parse the file, store everything in an array, and then insert the rows into the database using a foreach loop?

I ask this because inserting the data into the database is really slow. The table is InnoDB.
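For concreteness, the two approaches I am comparing look roughly like this (connection details, file name, and table/column names are placeholders, not my real schema):

<?php
$pdo = new PDO('mysql:host=localhost;dbname=test;charset=utf8mb4', 'user', 'pass');
$stmt = $pdo->prepare('INSERT INTO mytable (col1, col2, col3) VALUES (?, ?, ?)');

// Option 1: insert each row while reading the CSV file.
$handle = fopen('data.csv', 'r');
while (($row = fgetcsv($handle)) !== false) {
    $stmt->execute($row);
}
fclose($handle);

// Option 2: read the whole file into an array first, then insert with a foreach.
$rows = [];
$handle = fopen('data.csv', 'r');
while (($row = fgetcsv($handle)) !== false) {
    $rows[] = $row;
}
fclose($handle);
foreach ($rows as $row) {
    $stmt->execute($row);
}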

I hope I was clear.

N.B. I cannot use "LOAD DATA LOCAL INFILE" because the fields in the .csv file have been mapped previously.

Thank you

  • Maybe you could use PHP to remap the fields into something that will work with LOAD DATA LOCAL INFILE (see the sketch after these comments). Commented Mar 14, 2016 at 20:05
  • "inclusion of the data in the database is to be really slow": suppose you mean it is slower than you want, or do you really want slow inserts. Commented Mar 14, 2016 at 20:05
  • What do you mean by "mapped previously"? If it's just that the data already exists, you can use LOAD DATA with the column headers skipped. If the columns need processing/conversion, you can read into temp variables. Commented Mar 14, 2016 at 20:06
  • What do you mean by "fields in the .csv file have been mapped previously"? LOAD DATA INFILE will insert new rows into your table, not remove them. Commented Mar 14, 2016 at 20:08
  • @bitfiddler Asking the same question but you beat me! Commented Mar 14, 2016 at 20:09
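To illustrate the first comment's suggestion, a minimal sketch of what that remapping could look like (file names, the column mapping, and the table/column names are all assumptions):

<?php
// Rewrite the original CSV into the column order the table expects,
// then bulk-load the remapped file.
$in  = fopen('original.csv', 'r');
$out = fopen('remapped.csv', 'w');

while (($row = fgetcsv($in)) !== false) {
    // Example mapping: keep source columns 2, 0 and 5, in that order.
    fputcsv($out, [$row[2], $row[0], $row[5]]);
}
fclose($in);
fclose($out);

// LOCAL INFILE must be enabled on the connection.
$pdo = new PDO(
    'mysql:host=localhost;dbname=test;charset=utf8mb4',
    'user',
    'pass',
    [PDO::MYSQL_ATTR_LOCAL_INFILE => true]
);
$pdo->exec(
    "LOAD DATA LOCAL INFILE 'remapped.csv'
     INTO TABLE mytable
     FIELDS TERMINATED BY ','
     OPTIONALLY ENCLOSED BY '\"'
     LINES TERMINATED BY '\\n'
     (col1, col2, col3)"
);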

1 Answer


The two options you suggest will perform about the same; the one with the array may be slightly slower because of the additional memory usage.

But I suppose the bottleneck when executing a lot of insert statements is the database communication. So a third option that might be faster is to reduce the number of insert statements by combining the data for many rows into one statement, e.g.:

INSERT INTO mytable VALUES(1,2,3),(4,5,6),(7,8,9), ... (10000, 10001, 10002);

As far as I know, a MySQL statement has a maximum length (limited by max_allowed_packet), so it may be necessary to split the statement into multiple chunks.
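A rough PHP sketch of this chunking, assuming a fixed three-column layout and placeholder names for the connection, CSV file, table and columns (the chunk size is also arbitrary):

<?php
$pdo = new PDO('mysql:host=localhost;dbname=test;charset=utf8mb4', 'user', 'pass');
$chunkSize = 1000; // rows per INSERT statement; tune to stay under max_allowed_packet
$buffer = [];

$handle = fopen('data.csv', 'r');
while (($row = fgetcsv($handle)) !== false) {
    $buffer[] = $row;
    if (count($buffer) >= $chunkSize) {
        insertChunk($pdo, $buffer);
        $buffer = [];
    }
}
if ($buffer) {
    insertChunk($pdo, $buffer); // insert the remaining rows
}
fclose($handle);

// Insert one chunk with a single multi-row statement using bound parameters.
function insertChunk(PDO $pdo, array $rows): void
{
    $placeholders = rtrim(str_repeat('(?,?,?),', count($rows)), ',');
    $stmt = $pdo->prepare("INSERT INTO mytable (col1, col2, col3) VALUES $placeholders");
    $stmt->execute(array_merge(...$rows));
}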

This answer makes the performance benefit of consolidating insert statements quite clear: https://stackoverflow.com/a/1793209/1696795


1 Comment

I had thought of this solution, in fact; now I will try to implement it. Thank you all
