How do I write an algorithm for parallel processing of a large data set?

Algorithms

sindrom, 2014-09-30 15:51:36

A database table holds a list of tasks. A script reads this table, takes a batch of tasks, and marks each task as completed once it has been processed.
I need an algorithm that lets me run any number of copies of this script so that they work through the task queue in parallel. Of course, no task may be processed by more than one copy of the script.

3 answer(s)

Sergey, 2014-09-30
Protko @Fesor

It is not clear why you would not use a standard solution for this, such as RabbitMQ.
If you do think it through yourself: each task has a status (pending, processing, done, failed) and, if needed, a count of restart attempts for failed tasks. The main catch is that you need to lock the record when a worker takes it, so that the workers always get up-to-date records.
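
A minimal sketch of the status-based approach described above, written in Python against SQLite only so that it runs on its own. The table layout, the column names (status, attempts, worker) and the helper names claim_tasks and finish_task are assumptions made for illustration, not something from the question. The idea is that a single UPDATE statement both selects and marks a batch, so two workers cannot claim the same row; on MySQL the same idea is usually expressed with UPDATE ... ORDER BY ... LIMIT or with SELECT ... FOR UPDATE inside a transaction.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical task table: status is one of pending/processing/done/failed.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS tasks (
            id       INTEGER PRIMARY KEY,
            status   TEXT    NOT NULL DEFAULT 'pending',
            attempts INTEGER NOT NULL DEFAULT 0,
            worker   TEXT
        )
        """
    )


def claim_tasks(conn, worker_id, limit):
    """Atomically mark up to `limit` pending tasks as owned by `worker_id`.

    Returns the ids of every task this worker currently holds in
    'processing', which includes any earlier unfinished claims.
    """
    with conn:  # one transaction: commit on success, rollback on error
        conn.execute(
            """
            UPDATE tasks
               SET status = 'processing',
                   worker = ?,
                   attempts = attempts + 1
             WHERE id IN (SELECT id FROM tasks
                           WHERE status = 'pending'
                           ORDER BY id
                           LIMIT ?)
            """,
            (worker_id, limit),
        )
        rows = conn.execute(
            "SELECT id FROM tasks"
            " WHERE worker = ? AND status = 'processing' ORDER BY id",
            (worker_id,),
        ).fetchall()
    return [row[0] for row in rows]


def finish_task(conn, task_id, ok, max_attempts=3):
    """Mark a task done, or return it to the queue until attempts run out."""
    with conn:
        if ok:
            conn.execute("UPDATE tasks SET status = 'done' WHERE id = ?", (task_id,))
        else:
            conn.execute(
                """
                UPDATE tasks
                   SET status = CASE WHEN attempts < ? THEN 'pending'
                                     ELSE 'failed' END,
                       worker = NULL
                 WHERE id = ?
                """,
                (max_attempts, task_id),
            )


# Usage: two workers claim from the same queue and get disjoint batches.
conn = sqlite3.connect(":memory:")
create_schema(conn)
with conn:
    conn.executemany("INSERT INTO tasks (id) VALUES (?)", [(i,) for i in range(1, 7)])

batch_a = claim_tasks(conn, "worker-a", 3)
batch_b = claim_tasks(conn, "worker-b", 3)
finish_task(conn, batch_a[0], ok=True)    # first task succeeded
finish_task(conn, batch_a[1], ok=False)   # second task failed, goes back to pending
```

A task stuck in 'processing' because its worker crashed is not handled here; a real queue would also need a timeout or heartbeat column to return such tasks to 'pending'.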

idShura, 2016-10-13
@hydra_13

Try it like this:

SELECT `Файл`, GROUP_CONCAT(`Автор`), GROUP_CONCAT(`Издательство`) FROM MY_TABLE GROUP BY `Файл`

romy4, 2016-10-13
@romy4

All you need is to learn GROUP BY and HAVING.