Laravel
nioterzor, 2018-11-02 22:08:19

PHP process hangs?

Environment: Laravel 5.2, PHP 7.0 or PHP 7.1 (tested on both versions).
The gist: there is a worker that takes tasks from Redis and processes them. After it completes N tasks, it exits (there are unknown memory leaks that are very hard to track down; auto-restarting the workers via supervisor works fine and performance does not suffer).
There are several thousand such workers, and 99% of them work correctly (several servers, each running 1000-1500 processes).
After 5-40 minutes of work a worker exits because it has processed N tasks. The lifetime of each individual process is not deterministic, because the 1st task may take 10 seconds, the 2nd a minute and a half, ..., the Nth 15 seconds.
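For readers unfamiliar with the pattern, the "process N tasks, then exit" worker described above can be sketched roughly as follows. This is only an illustration, not the actual worker from the question: runWorker, $fetchTask and $handleTask are hypothetical names standing in for the real Redis pop and the real job logic.

```php
<?php
// Hypothetical sketch of a worker that stops after $maxTasks tasks.
// $fetchTask stands in for "pop the next task from Redis" and returns null
// when nothing is available; $handleTask stands in for the real job logic.
function runWorker(callable $fetchTask, callable $handleTask, int $maxTasks): int
{
    $processed = 0;
    while ($processed < $maxTasks) {
        $task = $fetchTask();
        if ($task === null) {
            // A real worker would block or sleep here; the sketch just stops.
            break;
        }
        $handleTask($task);
        $processed++;
    }
    // The caller is expected to exit after this returns, so that
    // supervisor starts a fresh process with clean memory.
    return $processed;
}
```

In a real script the call would be followed by exit(0). Note that the loop only re-checks the counter between tasks, so a task that never returns keeps the process alive indefinitely.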
But over time, processes that never exit appear (per day, on one server, 5-10 out of 1000 processes end up "hung" like this).
They keep performing tasks, but each one loads its own core at 90-100%.
An important(?) point: strace -p <PID of a hung process> stalls after printing:
strace -p 22679
strace: Process 22679 attached
strace: [ Process PID=22679 runs in x32 mode. ]
Moreover, if you guess which process will hang and attach strace in advance, strace does not stall. That is, when strace -p 22679 is attached before the hang, the system call log keeps scrolling. If we attach to the same process again after the "freeze", strace gives only the output shown above.
How I determined that the process is doing useful work:
netstat -tulpna | grep 22679
I took the ports shown for the process and watched them with tcpdump.
You can see queries to MySQL, communication with Redis, and so on. Some parts of the tcpdump logs let us, for our particular application, identify the start and end of tasks taken from the queue (by specific queries).
Below is a piece of the strace output of a process that hung; strace was attached to it before it hung, so strace did not stall.
https://pastebin.com/KHBQ5vsM
There is a very remotely similar bug: https://bugs.xdebug.org/view.php?id=731
P.S. The Laravel tag may have nothing to do with the problem, but it is not ruled out that the problem is somewhere inside the framework.

3 answer(s)
Boris Korobkov, 2018-11-02
@BorisKorobkov

The worker "doesn't die after N tasks" because it has not yet completed the last, Nth task. As for why it has not completed it, you need to look into your own application's source code, not PHP itself. It is looping somewhere.
First, update Laravel to the latest version (5.7.2).
You can set set_time_limit, but that addresses the consequences, not the cause.
To find the cause, write unit tests, hire a tester, enable logging on every function call, etc.
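As a rough illustration of the set_time_limit suggestion, a per-task limit could be wrapped like this. runTaskWithLimit is a hypothetical helper, not part of Laravel. It relies on two documented properties of set_time_limit: each call restarts the timer from zero, and on non-Windows systems only the script's own execution time is counted, not time spent waiting in system calls such as database queries. Since the hung workers burn CPU, the limit would presumably fire for them, but that is an assumption about this particular hang.

```php
<?php
// Hypothetical helper: run one task under its own execution-time limit.
// If the limit is exceeded, PHP aborts the script with a fatal error,
// and supervisor would then restart the worker.
function runTaskWithLimit(callable $task, int $seconds)
{
    set_time_limit($seconds); // restarts the timeout counter from zero
    try {
        return $task();
    } finally {
        set_time_limit(0); // remove the limit between tasks
    }
}
```

As the answer says, this only kills the stuck worker; it does not explain why the task is looping.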

Stalker_RED, 2018-11-03
@Stalker_RED

Add a log (it can be stored in the database):
Task #456456 was taken into work by worker #1234, start time 1970-01-01 12:45:33, end time 1970-01-01 12:45:56
This log can be used to understand which tasks took a very long time and which never finished at all.
You can also record the start and stop time for each worker. Then it is easy to see which of them have been hanging for a long time without finishing.
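A minimal sketch of such a log, kept in a plain PHP array for brevity (a database table would have the same columns). The function names taskStarted, taskFinished and findStuckTasks are made up for this illustration, and timestamps are plain Unix seconds.

```php
<?php
// Record that a worker has taken a task; 'end' stays null until it finishes.
function taskStarted(array &$log, int $taskId, int $workerId, int $now)
{
    $log[$taskId] = ['worker' => $workerId, 'start' => $now, 'end' => null];
}

// Record the end time of a task.
function taskFinished(array &$log, int $taskId, int $now)
{
    $log[$taskId]['end'] = $now;
}

// Return the IDs of tasks that started more than $maxSeconds ago
// and still have no end time.
function findStuckTasks(array $log, int $now, int $maxSeconds): array
{
    $stuck = [];
    foreach ($log as $taskId => $entry) {
        if ($entry['end'] === null && $now - $entry['start'] > $maxSeconds) {
            $stuck[] = $taskId;
        }
    }
    return $stuck;
}
```

Calling findStuckTasks periodically with a threshold well above the longest normal task (about a minute and a half in the question) would list the tasks, and through the 'worker' field the processes, that are worth inspecting.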

Alexey Sundukov, 2018-11-05
@alekciy

> several servers, each with 1000-1500 processes
The servers probably have at least 32 GB of RAM? Are you sure there is enough memory at such moments and nothing is being pushed out to swap? Are the servers bare metal or virtual machines?
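One way to check the swap question for a specific process on Linux is to read the VmRSS and VmSwap fields from /proc/<pid>/status. The helper below is a hypothetical sketch that only parses the text of that file; the name parseProcStatusKb and the choice to return -1 for a missing field are assumptions made for this example.

```php
<?php
// Extract a "<Field>:   <number> kB" value from the contents of
// /proc/<pid>/status. Returns -1 if the field is not present.
function parseProcStatusKb(string $statusText, string $field): int
{
    $pattern = '/^' . preg_quote($field, '/') . ':\s+(\d+)\s+kB/m';
    if (preg_match($pattern, $statusText, $matches)) {
        return (int) $matches[1];
    }
    return -1;
}
```

For the process from the question it could be used as parseProcStatusKb(file_get_contents('/proc/22679/status'), 'VmSwap'). A non-zero value means part of that worker's memory has been pushed out to swap.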
