There is a huge CSV file with data (over 100k lines). How do I load all of it into MySQL with PHP, without SSH?
Hello. The task: I have a huge CSV file with data (over 100k lines).
I need to load all of it into a MySQL database with PHP, without SSH access... The execution time limit is 30 seconds.
I tried reading the file 1000 lines at a time (re-opening it each time), and of course it does not fit into 30 seconds.
How do the gurus handle tasks like this?
Read about LOAD DATA INFILE ( dev.mysql.com/doc/refman/5.1/en/load-data.html )
For example, it was used to load 40 million 4 KB rows in 40 minutes (and the bottleneck was the PHP script that generated that data). In your situation it will be much faster, I think.
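For reference, here is a minimal sketch of calling it from PHP via PDO. The table name items, the file path, and the credentials are made-up placeholders, and LOCAL INFILE must be allowed on both the client and the server for this to work:

<?php
// A minimal sketch, not a drop-in solution: `items`, the path, and the
// credentials below are placeholders.
$pdo = new PDO(
    'mysql:host=localhost;dbname=test;charset=utf8',
    'user',
    'password',
    [PDO::MYSQL_ATTR_LOCAL_INFILE => true]  // required for LOAD DATA LOCAL INFILE
);

$sql = "LOAD DATA LOCAL INFILE '/path/to/data.csv'
        INTO TABLE items
        FIELDS TERMINATED BY ',' ENCLOSED BY '\"'
        LINES TERMINATED BY '\\n'
        IGNORE 1 LINES";        // skip the header row, if the CSV has one

$rows = $pdo->exec($sql);       // returns the number of rows loaded
echo "Loaded $rows rows\n";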
The algorithm in your case would be as follows:
1. Drop all indexes from the table the data will be written to.
2. Open the file (fopen).
3. Read m lines (fgets), until the end of the file.
4. Compile a single batch query: INSERT INTO ... VALUES (%row1%), (%row2%), ..., (%rowm%);
5. Execute the query.
6. Go to step 3.
7. At the end of the file, close it and rebuild the indexes dropped in step 1.
If steps 3 and 4 are combined (building the query string while reading), you can save memory.
As for the limit: the complexity of the algorithm is O(n), i.e. it depends linearly on the number of lines in the file. If that is not fast enough, you can either optimize (use low-level utilities for inserting data, though the data has to be prepared for them in advance) or use more capable hardware (client, network, server). A sketch of the loop is shown below.
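A minimal sketch of steps 2-7, assuming a table items (a, b, c) that matches three CSV columns; the names, credentials, and batch size are illustrative only:

<?php
// Minimal sketch of the batched INSERT loop; `items (a, b, c)`, the path,
// and the credentials are made up for illustration.
$pdo = new PDO('mysql:host=localhost;dbname=test;charset=utf8', 'user', 'password');
$batchSize = 1000;

$fh = fopen('/path/to/data.csv', 'r');
$values = [];

while (($row = fgetcsv($fh)) !== false) {
    // Quote each value to keep the example self-contained;
    // in real code consider prepared statements or stricter validation.
    $values[] = '(' . implode(',', array_map([$pdo, 'quote'], $row)) . ')';

    if (count($values) >= $batchSize) {
        $pdo->exec('INSERT INTO items (a, b, c) VALUES ' . implode(',', $values));
        $values = [];
    }
}

// Flush the last, possibly incomplete batch.
if ($values) {
    $pdo->exec('INSERT INTO items (a, b, c) VALUES ' . implode(',', $values));
}

fclose($fh);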
I had to parse files with a huge number of e-mail addresses under similar conditions. I used AJAX as an intermediary: one PHP script handed the data to the client, which then sent it in batches to another script that inserted it all into the database.
I can send you this script by mail, although it is rather clumsy, since it was done in a hurry. To use it, it is better to split the file into several parts and run the parsing in several browser windows to make it faster. You can tweak it as you need.
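Purely for illustration (this is not the script mentioned above), the receiving side of such a relay could look roughly like this; the endpoint name, the emails table, and the JSON format are assumptions:

<?php
// insert_batch.php — hypothetical endpoint that receives a JSON array of values
// posted by the browser and inserts them in one batch (table `emails` is made up).
$rows = json_decode(file_get_contents('php://input'), true);
if (!is_array($rows) || !$rows) {
    http_response_code(400);
    exit('empty batch');
}

$pdo = new PDO('mysql:host=localhost;dbname=test;charset=utf8', 'user', 'password');

$placeholders = implode(',', array_fill(0, count($rows), '(?)'));
$stmt = $pdo->prepare("INSERT INTO emails (address) VALUES $placeholders");
$stmt->execute($rows);   // one value per row in this simplified example

echo 'ok';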
I tried reading the file 1000 lines at a time (re-opening it each time), and of course it does not fit into 30 seconds...
I usually split the file into parts on the client side and copy them to FTP. Then a dedicated script with a GET parameter holding a counter reads the first part, redirects to itself for the second part, the second to the third, and so on until the whole import is done.
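A minimal sketch of that redirect chain, assuming the file was pre-split into part_0.csv, part_1.csv, ... (the file names, table, and credentials are made up):

<?php
// import.php — each request imports one part, then hands off to the next one,
// so every request stays within the 30-second limit.
$part = isset($_GET['part']) ? (int)$_GET['part'] : 0;
$file = "/path/to/parts/part_{$part}.csv";

if (!is_file($file)) {
    exit('All parts imported.');
}

$pdo = new PDO('mysql:host=localhost;dbname=test;charset=utf8', 'user', 'password');

$fh = fopen($file, 'r');
$values = [];
while (($row = fgetcsv($fh)) !== false) {
    $values[] = '(' . implode(',', array_map([$pdo, 'quote'], $row)) . ')';
}
fclose($fh);

if ($values) {
    $pdo->exec('INSERT INTO items (a, b, c) VALUES ' . implode(',', $values));
}

// Redirect to the next part.
header('Location: import.php?part=' . ($part + 1));
exit;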
Do not read the whole file at once, but in chunks of N bytes. For example, using cURL.
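For instance, with cURL one request can fetch a single N-byte chunk of a remote CSV via an HTTP Range request (the URL and chunk size below are placeholders, and the remote server must support ranges):

<?php
// Fetch one chunk of the file per request; $offset would come from the
// previous request (e.g. a GET parameter), this is just an illustration.
$offset    = 0;
$chunkSize = 1048576;   // 1 MB per request

$ch = curl_init('http://example.com/data.csv');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_RANGE, $offset . '-' . ($offset + $chunkSize - 1));
$chunk = curl_exec($ch);
curl_close($ch);

// $chunk now holds up to $chunkSize bytes starting at $offset;
// split it into lines, taking care of the possibly incomplete last line.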
And what prevents the PHP script from not reading the file from the beginning each time, and instead remembering which line (which byte offset in the stream) it stopped at?
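That can be done with fseek()/ftell(). A minimal sketch, with the offset passed in the URL purely for illustration and the table and credentials made up:

<?php
// Resume the import from a saved byte offset instead of re-reading the file.
$offset   = isset($_GET['offset']) ? (int)$_GET['offset'] : 0;
$deadline = time() + 25;   // stay safely under the 30-second limit

$pdo = new PDO('mysql:host=localhost;dbname=test;charset=utf8', 'user', 'password');

$fh = fopen('/path/to/data.csv', 'r');
fseek($fh, $offset);       // jump straight to where the previous run stopped

while (($row = fgetcsv($fh)) !== false) {
    $pdo->exec('INSERT INTO items (a, b, c) VALUES ('
        . implode(',', array_map([$pdo, 'quote'], $row)) . ')');

    if (time() >= $deadline) {
        $next = ftell($fh);            // byte position of the next unread line
        fclose($fh);
        header('Location: ?offset=' . $next);
        exit;
    }
}

fclose($fh);
echo 'Import finished.';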
100k? Gently split it into pieces with Excel (Excel handles up to 999,999 rows), then carefully import the pieces into the database with phpMyAdmin.