Implementing a Long Job Queue in MySQL + PHP

Message Queues

Aleksey Kuzmin, 2011-10-05 13:01:47

There is a large queue of tasks (more than 1M). The plan is to store the job queue in a MySQL table.

New tasks are marked with the NEW flag.
Each task in the queue is processed by a handler script and takes about 3 minutes.
After execution, the task is marked in the queue with the READY flag.
In case of an error or abnormal termination of the script, the task must be repeated.

Several handler scripts need to run in parallel.

Question 1: How can I show the other script instances (which run in parallel) that a task is in progress, so that if the script terminates abnormally, the record in the database is rolled back to its original state, i.e. NEW?

I have this solution:
a) The handler script starts -> it marks the task as PROCESSING and records the time
then
-> if everything goes fine, the task becomes READY
-> if the script crashes, the task remains PROCESSING

b) Create another script that looks for tasks that have been in the PROCESSING status for more than 10 minutes and resets them to NEW.

This way hung tasks are launched again, and you can also record the number of attempts.
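
Steps a) and b) can be sketched roughly as follows. This is only an illustration under assumptions: it uses Python with an in-memory SQLite database as a stand-in for MySQL + PHP, and the table name `jobs`, the columns `started_at` and `attempts`, and all function names are hypothetical. The point it shows is the conditional `UPDATE ... WHERE status = 'NEW'`: only one worker's update can match the row, so two parallel scripts cannot claim the same task.

```python
import sqlite3

STALE_AFTER = 600  # seconds; the "more than 10 minutes" threshold from step b)

def make_db():
    """Create the hypothetical jobs table in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs ("
        " id INTEGER PRIMARY KEY,"
        " status TEXT NOT NULL DEFAULT 'NEW',"
        " started_at INTEGER,"
        " attempts INTEGER NOT NULL DEFAULT 0)"
    )
    return conn

def claim_job(conn, now):
    """Step a): mark one NEW task as PROCESSING and record the time."""
    while True:
        row = conn.execute(
            "SELECT id FROM jobs WHERE status = 'NEW' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None  # nothing left to do
        cur = conn.execute(
            "UPDATE jobs SET status = 'PROCESSING', started_at = ?,"
            " attempts = attempts + 1 WHERE id = ? AND status = 'NEW'",
            (now, row[0]),
        )
        conn.commit()
        if cur.rowcount == 1:  # this worker won the row
            return row[0]
        # another worker claimed it first; try the next NEW task

def mark_ready(conn, job_id):
    """Normal completion: PROCESSING -> READY."""
    conn.execute("UPDATE jobs SET status = 'READY' WHERE id = ?", (job_id,))
    conn.commit()

def reset_stale(conn, now):
    """Step b): return tasks stuck in PROCESSING for too long to NEW."""
    cur = conn.execute(
        "UPDATE jobs SET status = 'NEW', started_at = NULL"
        " WHERE status = 'PROCESSING' AND started_at < ?",
        (now - STALE_AFTER,),
    )
    conn.commit()
    return cur.rowcount

conn = make_db()
conn.executemany("INSERT INTO jobs (id) VALUES (?)", [(1,), (2,)])
first = claim_job(conn, now=0)           # worker A takes task 1
second = claim_job(conn, now=0)          # worker B takes task 2, then "crashes"
mark_ready(conn, first)
recovered = reset_stale(conn, now=601)   # the checker script runs later
```

In the demo, times are plain integers in seconds so the run is deterministic; a real checker would use the current time. The `attempts` column is where the number of attempts gets recorded.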

Question 2: Is there a more elegant solution?

7 answer(s)

Ivan, 2011-10-05
@iSage

Why not use a message queue for this, for example RabbitMQ? It is fast and supports several parallel workers.

cat_crash, 2011-10-05
@cat_crash

Yes, you need to use multithreading. PHP does not have native multithreading support, but there is a workaround described here: phplens.com/phpeverywhere/?q=node/view/254

Halfi, 2011-10-21
@Halfi

Item b) is an ugly kludge. You can wrap the query execution in a transaction and an exception handler. In the exception handler you change the status to some fatal state, and then you can set up a daemon that looks for fatal tasks and processes them. If no special processing is required, you can simply change the status back to NEW and add one to a failure counter.
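
A minimal sketch of this idea, with assumptions stated up front: Python and in-memory SQLite stand in for PHP and MySQL, and the `failures` column, the `FATAL` status name, the `MAX_FAILURES` limit and the function names are all hypothetical. It combines both variants from the answer: a failed task goes back to NEW with its counter increased, and becomes FATAL once the counter reaches the limit.

```python
import sqlite3

MAX_FAILURES = 3  # hypothetical limit, not taken from the thread

def run_job(conn, job_id, handler):
    """Run handler(job_id); on any exception requeue the task or mark it FATAL."""
    try:
        handler(job_id)
    except Exception:
        failures = conn.execute(
            "SELECT failures FROM jobs WHERE id = ?", (job_id,)
        ).fetchone()[0] + 1
        status = "FATAL" if failures >= MAX_FAILURES else "NEW"
        conn.execute(
            "UPDATE jobs SET failures = ?, status = ? WHERE id = ?",
            (failures, status, job_id),
        )
        conn.commit()
        return False
    conn.execute("UPDATE jobs SET status = 'READY' WHERE id = ?", (job_id,))
    conn.commit()
    return True

def status_of(conn, job_id):
    return conn.execute(
        "SELECT status, failures FROM jobs WHERE id = ?", (job_id,)
    ).fetchone()

def always_fails(job_id):
    raise RuntimeError("simulated handler error")

def succeeds(job_id):
    pass

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (id INTEGER PRIMARY KEY,"
    " status TEXT NOT NULL DEFAULT 'PROCESSING',"
    " failures INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany("INSERT INTO jobs (id) VALUES (?)", [(1,), (2,)])

history = []
for _ in range(MAX_FAILURES):
    run_job(conn, 1, always_fails)
    history.append(status_of(conn, 1))
run_job(conn, 2, succeeds)
```

An exception handler does not run if the process is killed outright, so this covers only the errors the script itself can catch.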

Mikhail Dudek, 2011-10-05
@Nikius

If you run the tasks with cron, you get parallel execution automatically.
On the question itself: it may be worth reworking the script so that it correctly handles at least some of the errors (all of them, if possible) and sets the NEW status on its own. Then use your solution as a backup plan :)

Vitaly Peretyatko, 2011-10-06
@viperet

A job queue can be implemented well with Pub/Sub in Redis.

Andrey Sergeev, 2011-10-06
@elfin

We use Gearman in a similar situation.

karellen, 2011-10-21
@karellen

Redis is generally good for this. For lightweight options, I personally prefer BLPOP + RPUSH: a blocking pop on the left, new tasks pushed on the right. Or vice versa. In its purest form this is a FIFO queue, and workers can easily return a task to the queue through the same RPUSH in case of a temporary error. BLPOP returns both the key it popped from and the value itself, so it is convenient for telling different kinds of tasks apart.
Question 1: after the BLPOP, store a key with the task identifier somewhere, in a hash or just in the keyspace. Rollback is done through transactions, of course, with the entire body of the worker wrapped in a catch-all exception handler.
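
The worker loop described above might look roughly like this. It is a sketch under assumptions: it is written in Python, and `FakeRedis` is a hypothetical in-memory stand-in so the example runs without a server. Its `blpop(keys, timeout)` and `rpush(name, *values)` methods are meant to mirror the shape of the corresponding client calls, with `blpop` returning a `(key, value)` pair or `None`. The queue name `jobs:resize`, `TemporaryError` and `work_once` are made up for illustration, and the in-progress marker from Question 1 is left out.

```python
from collections import deque

class FakeRedis:
    """Hypothetical in-memory stand-in for a Redis client (lists only)."""
    def __init__(self):
        self.lists = {}

    def rpush(self, name, *values):
        q = self.lists.setdefault(name, deque())
        q.extend(values)  # append on the right
        return len(q)

    def blpop(self, keys, timeout=0):
        for key in keys:
            q = self.lists.get(key)
            if q:
                return (key, q.popleft())  # pop from the left
        return None  # a real BLPOP would block here until the timeout

class TemporaryError(Exception):
    """Raised by a handler when the task should be retried later."""

def work_once(client, queues, handler):
    """Pop one task, run it, and push it back on a temporary error."""
    popped = client.blpop(queues, timeout=1)
    if popped is None:
        return None  # no task arrived
    key, task = popped  # the key tells us which queue the task came from
    try:
        handler(key, task)
    except TemporaryError:
        client.rpush(key, task)  # back to the end of the same queue
        return "requeued"
    return "done"

client = FakeRedis()
client.rpush("jobs:resize", "task-1", "task-2")

seen = []
def flaky(key, task):
    seen.append(task)
    if task == "task-1" and seen.count("task-1") == 1:
        raise TemporaryError("try again later")

results = [work_once(client, ["jobs:resize"], flaky) for _ in range(4)]
```

Here `task-1` fails once, goes to the back of the queue behind `task-2`, and succeeds on its second run.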