PHP
Tutucu, 2020-10-28 20:43:20

What causes jumps in request processing time?

Hello! The setup:

  1. server: 2 x 2.8 GHz, 4 GB RAM, 50 GB NVMe (13 GB full), 200 Mbps
  2. ubuntu, nginx, php-fpm, php 7.3, laravel, mysql
  3. there is an API server on it, which must return a JSON string to Yandex within 3 seconds (Yandex.Dialog API); if Yandex does not receive a response in time, it closes the connection and counts an error.
  4. There can be a lot of requests, especially in the evening


Problem: 0.5% to 1% of requests take more than 3 seconds to complete, resulting in an error. The spikes are sharp and random, but they happen constantly. All other requests complete in 0.3 seconds.

What I did:
- Increased the RAM to 6 GB and the CPU to 3 cores (didn't help at all).
- Added all the needed indexes in the database (didn't help at all).
- Reduced database queries from 10-16 to 2-6 per API request (didn't help).
- Switched from Apache to NGINX and php-fpm (didn't help).
- Disabled all sites except the problematic one (didn't help).
- Enabled slow query logging in the database: the log is empty.
- Enabled slow script logging in php-fpm: the log is empty.
- Tried monitoring with innotop, atop and iotop (the numbers everywhere are negligible, tenths of a percent; sometimes only the database reaches the top, and even then it is around 5%). And it's not very clear how to monitor when I don't know when the next spike will come.
Where else to dig and what to do? I’ve been sitting for a week now and don’t know what else to try :( Help, please. Here are screenshots from Yandex:
Today (long requests are the yellow bumps) against the number of normal requests:
5f99ab53474e4728507264.png
Request rate for today:
5f99ac01e97b3622187556.png
Percentage of long requests over the entire month:
5f99ac7a4593a555497095.png
At night (yellow bumps) depending on the number of normal requests (night, single requests):
5f99acdc617ea131849753.png
Moreover, in the last picture, long requests appeared when there were single requests, and when constant requests began (a big green mountain), they disappeared.

UPD:
I enabled request-time logging in nginx and found one such slow request; for some reason its size is 0 bytes:
5f9a91008224f930448616.png


3 answer(s)
Viktor Taran, 2020-10-29
@shambler81

1. Run iotop -oka during such freezes.
2. Switch the processor mode from power saving to performance. Check with cat /proc/cpuinfo | grep MHz: every core should run at or near its maximum frequency.
When the processor is "cold", it needs time to raise its frequency, so it sometimes works faster under load than when completely idle at 800 MHz.
3. Do not forget that PHP + SQL can execute the same query at different speeds; moreover, the difference is not 1% but sometimes reaches 300%, and it is aggravated by queueing, both in SQL and at any other stage.
4. By the way, here are the most common causes:
a) Disk I/O, especially on an HDD (with NVMe you need not even test it).
b) SQL runs different queries in parallel but executes each single query on 1 core; as a result, a 128-core CPU at 2 GHz can work slower than your office Core i3, because the i3 has a higher clock speed per core.
c) PHP caching: cache everything possible, and do it competently. As a rule, this can speed things up 10-30 times without even optimizing the database queries.
d) Find the heaviest queries in the database and optimize them.
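The frequency check from point 2 can also be scripted, so that it can be run at the moment of a freeze. This is only a rough sketch: the function names and the sample text are made up for illustration, and it assumes the "cpu MHz" lines that x86 Linux prints in /proc/cpuinfo.

```python
import re

def parse_cpu_mhz(cpuinfo_text):
    """Return the current frequency (MHz) of every core found in /proc/cpuinfo text."""
    found = re.findall(r"^cpu MHz\s*:\s*([\d.]+)", cpuinfo_text, flags=re.MULTILINE)
    return [float(value) for value in found]

def cold_cores(cpuinfo_text, min_mhz):
    """Return (core_index, mhz) for every core running below min_mhz."""
    return [
        (index, mhz)
        for index, mhz in enumerate(parse_cpu_mhz(cpuinfo_text))
        if mhz < min_mhz
    ]

# Made-up sample; on a real server pass open("/proc/cpuinfo").read() instead.
SAMPLE = (
    "processor\t: 0\n"
    "cpu MHz\t\t: 800.000\n"
    "processor\t: 1\n"
    "cpu MHz\t\t: 2800.000\n"
)

for index, mhz in cold_cores(SAMPLE, min_mhz=2500):
    print(f"core {index} is running at only {mhz:.0f} MHz")
```

If cores regularly show up here while the server is idle, that would be consistent with the "cold processor" explanation, but it does not prove it.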
What is most likely happening now is that you have a queue of requests to the database. For example, there is a heavy hit, say a catalog page with 5 filters; meanwhile the remaining requests are queued, and even small ones execute slowly because there is a heavy one in front of them.
So, for example, while 1 heavy request is being processed, 300 more pile up, and then they all start executing together.
The effect is the same as selecting 10,000 files in Windows on an HDD and copying them in parallel rather than sequentially:
I/O throughput drops many times over, sometimes by a factor of tens of thousands.
The example is exaggerated, but still.
As a result, you get a traffic jam out of nowhere, with system LA 5, IO 10%, and SQL at 100% on 1 core.
As a rule, the situation then gets worse according to the following scheme:
all the cores become busy with heavy hits, and each time a jam forms more easily because the resources of the other cores are already occupied; as a result, once a day the database starts to slow down and gets restarted by cron.
;)))
But every case is individual.
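To make the queueing effect concrete, here is a toy model, not a measurement of this server. It assumes a single worker that serves requests strictly in arrival order, and all the timings are invented.

```python
def fifo_latencies(requests):
    """Latency of each request when one worker serves them in arrival order.

    requests: list of (arrival_time, service_time) tuples sorted by arrival_time.
    Latency is the time from arrival until the request has been fully served.
    """
    latencies = []
    worker_free_at = 0.0
    for arrival, service in requests:
        start = max(arrival, worker_free_at)  # wait if the worker is still busy
        worker_free_at = start + service
        latencies.append(worker_free_at - arrival)
    return latencies

# Invented numbers: five light requests (0.3 s each) arriving 0.5 s apart.
light = [(0.5 * i, 0.3) for i in range(1, 6)]
# The same traffic, but with one heavy 3-second request arriving first.
with_heavy = [(0.0, 3.0)] + light

print("light only :", [round(x, 2) for x in fifo_latencies(light)])
print("with heavy :", [round(x, 2) for x in fifo_latencies(with_heavy)])
```

In this model every light request takes 0.3 s on its own, but behind the single heavy request each of them waits 2 seconds or more, even though nothing about the light requests changed. A real database has several workers and locks, so the actual picture is messier.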

Anton Shamanov, 2020-11-03
@SilenceOfWinter

In addition to all of the above: the load may come not from the current task but from parallel ones (for example, some resource-intensive process such as a backup or indexing), so you need to take into account the server load at the moment the task starts.
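One way to capture the server load at task start is to record the load average next to each timing. The sketch below is in Python purely for illustration (the application in question is PHP), and the helper names are hypothetical.

```python
import os
import time

def load_at_start():
    """Return the 1-minute load average, or None where the OS cannot report it."""
    try:
        one_minute, _five, _fifteen = os.getloadavg()
    except (OSError, AttributeError):
        return None
    return one_minute

def timed_with_load(task):
    """Run task() and return (result, elapsed_seconds, load_average_at_start)."""
    load = load_at_start()
    started = time.perf_counter()
    result = task()
    elapsed = time.perf_counter() - started
    return result, elapsed, load

result, elapsed, load = timed_with_load(lambda: sum(range(100_000)))
print(f"elapsed={elapsed:.4f}s load_at_start={load}")
```

If slow requests consistently start at moments of high load average, a parallel job such as a backup becomes a more plausible suspect.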

Vitaly Karasik, 2020-10-28
@vitaly_il1

server: 2 x 2.8 GHz, 4 GB RAM, 50 GB NVMe (13 GB filled)

What is the size of the base itself?
And how much does MySQL use?
Does it hold many records?
Logging slow queries in the database is empty.
- Logging slow scripts in php-fpm - empty.

Are you sure that mysql slow query log really works?
You can enable time logging for each request in nginx.
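Once nginx writes the request time, the slow requests can be pulled out of the access log with a few lines of script. This sketch assumes a custom log_format whose last field is rt=$request_time; that field name, the function name and the sample lines are my own invention, so adjust the pattern to your actual format.

```python
import re

# Assumed format: each access-log line ends with "rt=<seconds>".
RT_PATTERN = re.compile(r"\brt=(\d+(?:\.\d+)?)\s*$")

def slow_requests(lines, threshold=3.0):
    """Return (request_time, line) for every line at or above threshold seconds."""
    slow = []
    for line in lines:
        match = RT_PATTERN.search(line)
        if match is None:
            continue  # line does not follow the assumed format
        request_time = float(match.group(1))
        if request_time >= threshold:
            slow.append((request_time, line.rstrip("\n")))
    return slow

# Made-up sample lines; on a real server iterate over the access log file instead.
SAMPLE_LOG = [
    '203.0.113.5 - - [28/Oct/2020:20:00:01 +0300] "POST /api HTTP/1.1" 200 512 rt=0.301\n',
    '203.0.113.5 - - [28/Oct/2020:20:00:02 +0300] "POST /api HTTP/1.1" 200 0 rt=3.002\n',
]

for request_time, line in slow_requests(SAMPLE_LOG):
    print(f"{request_time:.3f}s  {line}")
```

Comparing the timestamps of the slow lines with cron schedules and with the MySQL and php-fpm logs is then a matter of looking at the same minute in each.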
