PHP
Alexander, 2016-10-29 14:17:25

How long can an INSERT be? How to import 1 million rows?

Hi!
I need to regularly parse a large XML file: 1.5 million rows (2 GB). Parsing works and takes 20 minutes, after which I have all the rows in an array. Now I need to load them into a database table.
At the moment I insert them in a loop, 2000 rows at a time (I picked the number at random):

INSERT INTO tbl (field1, field2) VALUES ('1', '2'),('3', '4'),('5', '6') ...

That comes to 750 loop iterations, and the whole process takes 5 hours. I need to get it down to 1-2 hours at most.
Where should I look? Is it possible to insert all 1.5 million rows in a single INSERT?
Which is better: more loop iterations with shorter inserts, or the other way round?
P.S. Server: 4 GB RAM, 2 cores.
Thank you!

4 answer(s)
romy4, 2016-10-29
@romy4

Turn off indexes before inserting. 1 million rows is not that many; you are looking in the wrong place.
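
A minimal sketch of this advice, assuming the table from the question uses InnoDB and has one secondary index; the index name idx_field1 is hypothetical. ALTER TABLE ... DISABLE KEYS only affects non-unique indexes on MyISAM tables, so for InnoDB the usual substitute is to drop secondary indexes and recreate them after the load.

```sql
-- Assumption: tbl has a secondary index named idx_field1 (hypothetical name).
ALTER TABLE tbl DROP INDEX idx_field1;

-- ... run the batched INSERT statements here ...

-- Rebuild the index once, after all rows are in place.
ALTER TABLE tbl ADD INDEX idx_field1 (field1);
```

Rebuilding the index once at the end is generally cheaper than updating it on every inserted row, though the gain depends on how many indexes the table has.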

Sumor, 2016-10-29
@Sumor

Read this section of the MySQL documentation and follow its recommendations:
optimizing-innodb-bulk-data-loading
Switch off as much extra database work as possible: auto-increments, indexes, triggers, constraints, and, if possible, load without transactions (bulk load).
It might be better to load the file into an in-memory temporary table with LOAD XML and then copy it into your table with INSERT INTO ... SELECT ...
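
The session settings that documentation section describes can be sketched as follows. This assumes an InnoDB table and input data that is already free of duplicate keys and broken references, because the checks are switched off during the load. Note that the documentation suggests wrapping the load in one transaction rather than committing after every statement.

```sql
-- Commit once at the end instead of after every INSERT.
SET autocommit = 0;
-- Only safe if the data is known to contain no duplicate keys.
SET unique_checks = 0;
-- Only safe if the data is known to satisfy all foreign keys.
SET foreign_key_checks = 0;

-- ... run the batched INSERT statements here ...

COMMIT;

-- Restore the defaults for the rest of the session.
SET unique_checks = 1;
SET foreign_key_checks = 1;
SET autocommit = 1;
```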

Max, 2016-10-29
@MaxDukov

I recently had to load 45 million rows from a file of INSERT statements into AWS RDS. Running the file directly would have taken an estimated 2 days. After I converted it to CSV with sed, LOAD DATA INFILE finished in 20 minutes. It turns out to be MUCH faster.
My point: if the structure does not match, you can produce the required structure at the parsing stage. Or load the data as is into a temporary table and then do INSERT ... SELECT from it.
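
A rough sketch of that approach, assuming the parser writes its rows to a CSV file first. The file path, the staging table tbl_staging and its columns raw1 and raw2 are hypothetical, and the LOCAL variant requires local_infile to be enabled on both client and server.

```sql
-- Assumption: /tmp/rows.csv (hypothetical path) holds one row per line,
-- comma-separated, values optionally wrapped in double quotes.
CREATE TABLE tbl_staging (
  raw1 VARCHAR(255),
  raw2 VARCHAR(255)
);

LOAD DATA LOCAL INFILE '/tmp/rows.csv'
INTO TABLE tbl_staging
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(raw1, raw2);

-- Reshape the data while copying it into the real table.
INSERT INTO tbl (field1, field2)
SELECT TRIM(raw1), TRIM(raw2)
FROM tbl_staging;

DROP TABLE tbl_staging;
```

If the CSV already matches the target table, the staging step can be skipped and LOAD DATA can write into tbl directly.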

Rsa97, 2016-10-29
@Rsa97

If the XML structure matches the table structure, use LOAD XML. If not, you can try disabling all indexes before the insert and re-enabling them afterwards.
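
For the matching case, a minimal LOAD XML sketch might look like this. The file path and the <row> element name are assumptions about the input; LOAD XML maps child elements (or attributes) to columns of the same name.

```sql
-- Assumption: the file (hypothetical path) is shaped like
--   <rows>
--     <row><field1>1</field1><field2>2</field2></row>
--     ...
--   </rows>
LOAD XML LOCAL INFILE '/tmp/data.xml'
INTO TABLE tbl
ROWS IDENTIFIED BY '<row>';
```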
