Stanislav Gamayunov, 2012-11-20 16:24:46
PHP

How to parallelize part of a script's work?

Greetings.
Please help me with the following problem. I have a task that takes a long time to run. It talks to a third-party API, and that takes up most of the time. The algorithm is:

  1. Read the initial data from the database, prepare it, and split it into portions. This is fairly fast.
  2. Iterate over the array of portions: send each one to the API, get the response, and write it to the database. This is very slow.
  3. Process all the received data and write the result of the processing as well. This is also fairly fast.

To speed this task up, the portions obviously need to be sent to the API in parallel rather than sequentially. The options that first came to mind:
  1. Gearman. But it seems this is not quite the right tool, or I don't understand how to work with it. I don't just need to hand the work off to workers; I need to run them in parallel, wait for the results from all of them, and then perform the final step. Maybe the tasks should be added as background jobs, and then I check the task list on the job server?
  2. pcntl_fork. Here I don't fully understand how it works: at some point I have an array of fairly large objects, and I need to send parts of this array with different parameters (different API credentials, for example). Do I need to fork the required number of times, get a "hello" from each clone, distribute the data to them, and wait for each one's response? Or am I overcomplicating things?
  3. I also thought about tracking the completion of the child processes through some .pid files.

But something tells me there is another solution, or a better way to look at the problem.
Thanks in advance for the tips.

5 answer(s)
sajgak, 2012-11-20
@happyproff

Gearman is exactly the tool for this. It supports running tasks in parallel and receiving the response on completion (via a callback). Note that doNormal, doHigh, and doLow each run a single job and block until it returns; to run several jobs in parallel, queue them with addTask and then call runTasks, which returns once all of them have finished.
Working with forks is harder, but also possible. On each iteration you fork a new process and write the result somewhere; the only question is how to read the combined result afterwards. In general, I would advise you to take a closer look at Gearman.
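
A minimal sketch of the fork-per-portion idea, assuming the pcntl extension is available on the CLI (it falls back to a sequential loop otherwise). After pcntl_fork each child already holds a copy of the parent's variables, so nothing has to be sent to it; only the result needs to travel back, and here that happens through one temporary file per portion. The function name processInParallel and the $worker callback (a stand-in for the API call) are hypothetical, not something from the question.

```php
<?php
// Hypothetical helper: runs $worker on every portion in its own child
// process and returns the results keyed like $portions.
function processInParallel(array $portions, callable $worker): array
{
    if (!function_exists('pcntl_fork')) {
        // Assumption: without pcntl we simply process sequentially.
        return array_map($worker, $portions);
    }

    $files = []; // portion key => temp file that will hold the child's result
    $pids  = []; // PIDs of the children the parent has to wait for
    foreach ($portions as $key => $portion) {
        $file = tempnam(sys_get_temp_dir(), 'portion_');
        $pid  = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('pcntl_fork failed');
        }
        if ($pid === 0) {
            // Child: it has its own copy of $portion, so no data transfer is needed.
            file_put_contents($file, serialize($worker($portion)));
            exit(0);
        }
        // Parent: remember where this portion's result will appear.
        $files[$key] = $file;
        $pids[]      = $pid;
    }

    // Block until every child has exited.
    foreach ($pids as $pid) {
        pcntl_waitpid($pid, $status);
    }

    // Collect the results in the original portion order.
    $results = [];
    foreach ($files as $key => $file) {
        $results[$key] = unserialize(file_get_contents($file));
        unlink($file);
    }
    return $results;
}
```

Usage might look like `$results = processInParallel($portions, fn ($p) => callApi($p));`, where callApi is whatever sends one portion to the API. It is generally safer to open database connections inside the worker than to share one that was opened before the fork.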

Urvin, 2012-11-20
@Urvin

2. A handler installed with pcntl_signal will tell you when a child process has finished.
When I first worked with these functions, it took me a long time to fully get my head around them. The work goes roughly like this:
a) Install a handler with pcntl_signal for the signal that reports the end of a child process.
b) Fork in a loop, first storing the index of the data packet in a variable. As the manual describes, the return value of pcntl_fork tells us whether we are in the copy or not.
c) In the copy, call posix_setsid and run the necessary code, or replace the process with the packet handler via pcntl_exec.
At this point the packet handler takes the data returned by the third-party service and processes it in some way.
d) When a child process ends, we know its PID and therefore the index of the packet it processed.
3. Run ps x and read the output. We know the PID and the other process attributes. But that is not a clean approach.
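
The PID bookkeeping in steps b) and d) could look roughly like this. It is a sketch under assumptions: instead of a pcntl_signal handler it simply blocks in pcntl_wait until each child exits, which is a simpler way to get the same PID-to-packet mapping, and it leaves out posix_setsid and pcntl_exec. runPackets and $handler are hypothetical names; the handler stands in for the code that sends one packet to the API and returns true on success. Without the pcntl extension it falls back to a sequential loop.

```php
<?php
// Hypothetical helper: forks one child per packet and returns
// [packet index => true/false] depending on how each child exited.
function runPackets(array $packets, callable $handler): array
{
    $ok = [];
    if (!function_exists('pcntl_fork')) {
        // Assumption: without pcntl, handle the packets one by one.
        foreach ($packets as $index => $packet) {
            $ok[$index] = (bool) $handler($packet);
        }
        return $ok;
    }

    $indexByPid = []; // child PID => index of the packet it is handling
    foreach ($packets as $index => $packet) {
        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('pcntl_fork failed');
        }
        if ($pid === 0) {
            // Child: report success or failure through the exit code.
            exit($handler($packet) ? 0 : 1);
        }
        $indexByPid[$pid] = $index;
    }

    // Reap children in whatever order they finish.
    while ($indexByPid !== []) {
        $pid = pcntl_wait($status);
        if ($pid <= 0) {
            break; // no more children to wait for
        }
        if (!isset($indexByPid[$pid])) {
            continue; // not one of ours
        }
        $index = $indexByPid[$pid];
        unset($indexByPid[$pid]);
        $ok[$index] = pcntl_wifexited($status) && pcntl_wexitstatus($status) === 0;
    }

    // Anything never reaped counts as failed.
    foreach ($indexByPid as $index) {
        $ok[$index] = false;
    }
    ksort($ok);
    return $ok;
}
```

Packets reported as failed can then be retried or logged before the final processing step runs.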

javax, 2012-11-20
@javax

I don't know how relevant this is for you, but I write scripts of this type in Groovy. There you can also run threads for parallel work.

strib, 2012-11-20
@strib

It depends on what the data looks like.
If the database supports transactions, I would do it like this:
1) Fork the required number of times.
2) In each process, take a set of records from the database inside a transaction and process them. The records stay locked in the meantime. Or else mark them as "in progress" in an autonomous transaction (no idea whether that is possible in MySQL).
3) After closing the transaction, take the next portion.

Vyacheslav Slinko, 2012-11-21
@KeepYourMind

A very cool framework for async applications in PHP: reactphp.org
