PHP
Andrey, 2016-08-24 02:17:05

How to synchronize the site with the worker server?

I have a site built on PHP + MySQL where users create tasks and then run them.
A task performs a series of actions and takes 5-6 hours on average to complete.
When a task is started, a Gearman worker is launched in the background and a client then submits a job to it.
It looks like this:

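// Start a worker process in the background; output is discarded so exec() does not wait for it.
exec("php task/run/test.php > /dev/null &");

// Submit the job to the Gearman server (addServer() with no arguments uses 127.0.0.1:4730).
$client = new GearmanClient();
$client->addServer();
// doBackground() returns at once; the worker handles the 'test' job asynchronously.
$client->doBackground('test', json_encode($data));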

There can be a large number of tasks running at the same time (1,000 or more).
While a task runs, it constantly writes data to the database.
So the question is: what happens if about 1,000 workers really do run on the same server as the site? As I understand it, that means 1,000 PHP processes and 1,000 open database connections. The server will crash!
I want to separate the site server from the workers, like this:
the site on one server, Gearman and its workers on a second server.
But I don't know how the workers should properly interact with the site's database, because the task status (started / stopped) has to be monitored constantly and the data the worker generates has to be saved.
One option I see is for the site and the worker server to talk through an API. For example, when a user starts a task, the site sends a GET/POST request to the server running Gearman.
A worker is launched on that server, and a Gearman client is created on the site and connects to the worker launched above.
The worker starts running and, at certain points, sends a POST/GET request to the site with the necessary data, which is then written to the database and displayed on the site.
But I think this is a hack!
First, the worker servers could send ~1,000 requests to the site at the same time, and I think the site could go down under that (maybe I'm wrong).
Second, the task status has to be tracked on the workers in real time. For example, when the user clicks stop, the worker should stop immediately. Right now I store the task status in a cache, and the worker checks that cache every 1-2 seconds.
And third, responses from the site or the worker server may be delayed if one of the servers suddenly goes down.
I would be grateful for a hint. I don't even know how to phrase the question for Google.
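
The cache polling described in the second point can be sketched roughly as follows. This is only an illustration, not the asker's actual code: `runTask()`, `$isStopped`, and `$saveProgress` are hypothetical names, and the cache lookup is passed in as a callable so the sketch does not depend on any particular cache.

```php
<?php
// Hypothetical sketch: runTask(), $isStopped and $saveProgress are illustrative names.

/**
 * Runs the steps one by one and checks the stop flag at most once
 * per $pollInterval seconds. Returns the number of completed steps.
 */
function runTask(iterable $steps, callable $isStopped, callable $saveProgress, float $pollInterval = 1.0): int
{
    $done = 0;
    $lastCheck = 0.0; // forces a check before the very first step

    foreach ($steps as $step) {
        $now = microtime(true);
        if ($now - $lastCheck >= $pollInterval) {
            $lastCheck = $now;
            if ($isStopped()) {
                break; // the user pressed stop: finish cleanly
            }
        }
        // Run one unit of work and hand its result to the storage callback.
        $saveProgress($step());
        $done++;
    }

    return $done;
}
```

With this shape the worker stops within one step plus one poll interval after the flag is set, so the steps need to be short relative to the 1-2 second target.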

2 answers
Vladimir, 2016-08-24
@MechanID

I am by no means a developer, but perhaps these tools will help.
1. If the database gets many writes and an order of magnitude fewer reads, look at the TokuDB engine from Percona.
2. You can also build the architecture around a message broker such as RabbitMQ, with separate queues for different things. New tasks from the site go into one queue; a script picks them up from there and starts the workers that perform them. Task statuses and worker results can go into other queues, and from there they are written to the site's database or wherever else they are needed. Why is this needed at all? The queue of tasks and data is a buffer that lets you spread the load over time. For example, results are not written directly to the database but first to a queue; a worker on the database server reads the queue and inserts the data while monitoring the database load and preventing overload.
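
The buffering idea in point 2 can be sketched as a consumer that drains a queue and writes results in batches. This is a minimal sketch under stated assumptions: an in-memory `SplQueue` stands in for the broker queue, and `drainQueue()` and `$writeBatch` are hypothetical names. In a real setup the callback would perform something like a multi-row INSERT.

```php
<?php
// Hypothetical sketch: drainQueue() and $writeBatch are illustrative names.
// SplQueue is only a stand-in for a real broker queue such as RabbitMQ.

/**
 * Drains the queue and hands results to $writeBatch in groups of
 * at most $batchSize. Returns the number of batches written.
 */
function drainQueue(SplQueue $queue, callable $writeBatch, int $batchSize = 100): int
{
    $batches = 0;
    $batch = [];

    while (!$queue->isEmpty()) {
        $batch[] = $queue->dequeue();
        if (count($batch) >= $batchSize) {
            $writeBatch($batch); // one database round trip per full batch
            $batches++;
            $batch = [];
        }
    }

    // Write whatever is left over as a final, smaller batch.
    if ($batch !== []) {
        $writeBatch($batch);
        $batches++;
    }

    return $batches;
}
```

Because only this consumer talks to the database, the number of connections and the write rate depend on the consumer, not on how many workers are producing results.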

Artemy, 2016-08-24
@MetaAbstract

Use a message queue between the client and the server. That decouples the client from the workers architecturally and lets you regulate the load on the system. For the queue it is better to take a ready-made solution, although you could write your own.
There are also more advanced systems for processing data flows, such as Hadoop.
