How do I write an algorithm for parallel processing of a large data set?
There is a database table containing a list of tasks. A script reads this table, fetches a batch of tasks, and marks each task as completed once it has been processed.
I need an algorithm that lets any number of copies of this script run at the same time and process the task queue in parallel. Of course, no task may be processed by more than one copy of the script.
It is not clear why you would not use a standard solution for this, such as RabbitMQ.
If you do want to build it yourself: each task has a status (pending, processing, done, failed) and, if needed, a count of attempts for restarting failed tasks. The main snag is that a worker has to lock the record when it picks a task up, so that the other workers see the current status and do not take the same task.
Try it like this:
SELECT `Файл`, GROUP_CONCAT(`Автор`), GROUP_CONCAT(`Издательство`) FROM MY_TABLE GROUP BY `Файл`