How to iterate over a large JSON file and update all the related records in MySQL, staying less than 2 seconds behind the file?
Hello. There is a parser script (parse.php) that saves the parsing result to a JSON file on the server. This parser refreshes the data every 1.5-3 seconds. Cron runs another PHP script (work.php), which fetches the current JSON every 2 seconds, decodes it with json_decode, and loops through the resulting array.
The array structure is like this:
events : {
    1 : {
        id, data, title, etc.
    },
    2 : {
        id, data, title, etc.
    },
}
ignore_user_abort(true);
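For context, a minimal sketch of what such a work.php polling loop might look like, wrapped around the ignore_user_abort(true); line shown above. The file name events.json and the processEvent() handler are assumptions, not from the question:

```php
<?php
// Minimal sketch of the work.php loop described above.
// Assumed: the parser writes its output to events.json next to this script,
// and processEvent() is a hypothetical handler for one event.
ignore_user_abort(true);
set_time_limit(0);

while (true) {
    $raw  = file_get_contents(__DIR__ . '/events.json');
    $data = json_decode($raw, true);

    foreach ($data['events'] ?? [] as $event) {
        processEvent($event); // each $event holds id, data, title, etc.
    }

    sleep(2); // re-read the file every 2 seconds, as in the question
}
```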
First, if it is possible to change the parser, get rid of the JSON. json_decode reads the whole file into memory; a JSON file cannot be "read line by line" with an iterator that walks through it one line at a time.

It looks like you have already implemented a queue, just a home-grown one. Maybe take a ready-made queue instead (bolt on Redis, whose built-in lists work as simple queues, or look at the other queue systems out there; there are a dozen of them, in my opinion), so that tasks land in a list and the second script works through that list, starting on schedule or whenever a task arrives, instead of hanging in memory in a while (true) loop waiting to be told "work"... A sketch of the Redis variant follows below.
Yandex solves its "who clicked on what" analytics problem with exactly such queues, relaxing the "lag" from "2 seconds" to "whenever a processor frees up, we'll get to it". Yes, tasks are then processed sequentially rather than in parallel.
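To make the queue suggestion concrete, here is a minimal sketch using Redis lists via the phpredis extension; the queue name event_queue and the payload shape are assumptions:

```php
<?php
// Producer side (e.g. in parse.php): push each parsed event onto a Redis list.
// Assumes the phpredis extension and a Redis server on localhost.
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);
foreach ($events as $event) {
    $redis->lPush('event_queue', json_encode($event)); // 'event_queue' is an assumed name
}

// Consumer side (a separate worker script): block until a task arrives
// instead of spinning in while (true) and re-reading the whole file.
$worker = new Redis();
$worker->connect('127.0.0.1', 6379);
while (true) {
    $item = $worker->brPop(['event_queue'], 0); // 0 = block until something arrives
    if ($item) {
        $event = json_decode($item[1], true); // $item is [queueName, payload]
        // ... update MySQL for this single event
    }
}
```

With brPop the worker wakes only when there is actually work to do, so the fixed 2-second polling lag disappears.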
You can also write your script in Node.js, where asynchronous I/O can help. Or do it with threads: new class extends \Threaded, with the pthreads extension (pthreads.so/.dll) enabled.
The principle is to have several threads inside one script, each unaware of the others, doing your task.
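A rough sketch of that pthreads approach, under the assumption of a thread-safe (ZTS) PHP 7.x build with the extension loaded:

```php
<?php
// Sketch only: requires the pthreads extension (ZTS build of PHP 7.x).
class EventWorker extends Thread
{
    private $payload;

    public function __construct(string $payload)
    {
        // Pass data in as a plain string; complex objects do not share well.
        $this->payload = $payload;
    }

    public function run()
    {
        $event = json_decode($this->payload, true);
        // Each thread must open its own PDO/mysqli connection;
        // connections cannot be shared between threads.
        // ... update MySQL for $event here
    }
}

$threads = [];
foreach ($events as $event) {
    $t = new EventWorker(json_encode($event));
    $t->start();
    $threads[] = $t;
}
foreach ($threads as $t) {
    $t->join(); // wait for all workers to finish
}
```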
But keep in mind that writes and updates in SQL are still queued up one after another, so there is a speed limit there as well.
The correct solution is for the parser to write straight to the database.
If that is not possible, you should at least eliminate as much of the overhead and duplicate processing as possible:
For each iteration, work.php sends an ID to the getEvent.php script using fsockopen.
... In getEvent.php the current JSON is fetched yet again and decoded, the script looks up in "events" the ID that work.php sent it, then processes that data and updates it in the MySQL database.
With a setup like that, there is really nothing to be done; it simply is not designed for operations like this. But at the very least, the data must be pulled into the database the first time it is read, and from then on stored and processed there.
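As a sketch of "the parser writes straight to the database", here is what a PDO upsert could look like; the events table, its columns, and the credentials are assumptions based on the structure in the question:

```php
<?php
// Sketch: parse.php writing each event straight to MySQL instead of a JSON file.
// Assumed table: events(id PRIMARY KEY, data, title); credentials are placeholders.
$pdo = new PDO('mysql:host=127.0.0.1;dbname=app;charset=utf8mb4', 'user', 'pass');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$stmt = $pdo->prepare(
    'INSERT INTO events (id, data, title) VALUES (:id, :data, :title)
     ON DUPLICATE KEY UPDATE data = VALUES(data), title = VALUES(title)'
);

// One transaction around the loop batches the writes, which softens
// the sequential-write speed limit mentioned in the other answer.
$pdo->beginTransaction();
foreach ($events as $event) {
    $stmt->execute([
        ':id'    => $event['id'],
        ':data'  => $event['data'],
        ':title' => $event['title'],
    ]);
}
$pdo->commit();
```

ON DUPLICATE KEY UPDATE makes the write idempotent, so re-parsing the same events updates the existing rows instead of creating duplicates.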