Why do requests lag between frontend (separate proxy) and backend (nginx everywhere)?

Nginx

Hint, 2020-01-25 13:56:03

There is a proxy server at OVH (Roubaix, RBX7, France) and an application server in Munich (in another company's data center). Both servers run nginx on CentOS. The servers are fairly powerful, the links are gigabit, and the CPUs on the proxy are not loaded. At peak they handle about 1000 requests per second.
In the evenings problems appear: some requests hang for 1-5 seconds. If a request does not go to the backend, the proxy answers it instantly and there is no problem. If it does go to the backend, a certain share of requests (roughly 1 in 10) start to hang, although the bulk still complete quickly (the response reaches the browser in about 50 ms).
If I log in to the proxy over the console, curl requests to the backend lag as well. If I log in to the backend, curl requests to itself do not lag. So it is not nginx. Moreover, curl requests to the backend from other servers do not lag either. So the problem is either in the system settings of the proxy server or in the link between OVH and the backend. Tell me where to dig further.
On the proxy, outgoing plus incoming traffic totals about 80 Mbit/s (on a gigabit link). The nginx status at the moment (off-peak) is below.
Proxy:

Active connections: 11605 
server accepts handled requests
 3016792 3016792 36761410 
Reading: 0 Writing: 41 Waiting: 11560

At peak, "Active connections" reaches about 18 thousand (nginx has 8 worker_processes and 2048 worker_connections, but nginx is not the issue, since curl lags too).
Backend:

Active connections: 14 
server accepts handled requests
 118781210 118781210 118781116 
Reading: 0 Writing: 13 Waiting: 1
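
One thing the two status dumps allow is a rough estimate of connection reuse. In nginx's stub_status output the three counters are accepts, handled and requests, so requests divided by handled gives the average number of requests served per connection. The sketch below only does that arithmetic on the numbers quoted above; the function name is made up for illustration.

```python
def requests_per_connection(handled: int, requests: int) -> float:
    """Average number of requests served per handled connection."""
    return requests / handled

# Counters copied from the stub_status output in the question.
proxy = requests_per_connection(handled=3016792, requests=36761410)
backend = requests_per_connection(handled=118781210, requests=118781116)

print(f"proxy:   {proxy:.2f} requests per connection")    # proxy:   12.19 ...
print(f"backend: {backend:.2f} requests per connection")  # backend: 1.00 ...
```

A ratio of about 1 on the backend suggests that nearly every request reaches it on a fresh TCP connection, which would be consistent with connections between the proxy and the backend not being reused.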

5 answer(s)

Badbuka, 2020-01-25
@Badbuka

Check the MTU between the networks: ping with a full-size packet and see whether fragmentation occurs somewhere along the path.

alfss, 2020-01-25
@alfss

What is the latency between the servers during the period you mentioned? Check with ping, traceroute, etc. The problem itself is a common one: the backend and the frontend should be in the same DC.

Vitaly Karasik, 2020-01-25
@vitaly_il1

"They process about 1000 requests per second at the peak. In the evenings, problems arise."

Is the load higher in the evenings?
What does monitoring show for traffic and for the number of packets? With many small packets you can hit a limit on the packet rate.
1) system logs and application logs
2) NIC statistics
3) on the proxy server, look at
sysctl fs.file-nr ( https://access.redhat.com/documentation/en-us/red_...
However, if that were the cause, there would be errors in the logs.

Oleg Kleshchuk, 2020-01-26
@xenozauros

Use mtr/ping to check for packet loss. It is quite possible that throughput is fine while packets between the balancer and the application are being lost.
In general, the more network "legs" like this you have, the higher the chance of a problem on one of them.
It is simpler and more correct to place the balancer (your proxy) and the application as close together as possible, to avoid this kind of hidden problem.
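
Besides mtr/ping, it can help to time the TCP handshake itself from the proxy, because a lost SYN is typically retransmitted by Linux after about 1 second, with the interval doubling after that, so loss tends to show up as connect times near 1 s or 3 s. Below is a minimal sketch using only the Python standard library; the function names and the 0.5 s threshold are assumptions for illustration, not anything from the thread.

```python
import socket
import time

def connect_times(host, port, attempts=20, timeout=10.0):
    """Time the TCP connect for each attempt; None marks a failed attempt."""
    results = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            # create_connection returns once the TCP handshake has completed
            with socket.create_connection((host, port), timeout=timeout):
                results.append(time.monotonic() - start)
        except OSError:
            results.append(None)
    return results

def summarize(times, slow_threshold=0.5):
    """Count failed and slow connects (slow_threshold in seconds is arbitrary)."""
    ok = [t for t in times if t is not None]
    return {
        "attempts": len(times),
        "failed": len(times) - len(ok),
        "slow": sum(1 for t in ok if t >= slow_threshold),
        "max": max(ok) if ok else None,
    }
```

Run on the proxy as, for example, `summarize(connect_times("backend.example.com", 80))`, where the host name is a placeholder for the real backend. If the slow connects cluster around whole seconds while normal ones take a few tens of milliseconds, the delay is probably in connection setup rather than in request processing.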

Hint, 2020-01-26
@Hint

I never figured out what exactly we were hitting. But the problem was solved by enabling keepalive between the servers (there had been about 1000 new connections per second from front to back; now there are about 50). Traffic did not drop much and CPU load did not decrease, but the delay when opening some new connections between the servers disappeared. Why it arose in the first place, I do not know. Maybe OVH has some limits configured on its equipment...
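
The actual configuration was not posted, so the following is only a sketch of how keepalive to an upstream is usually enabled in nginx. The upstream name, the backend address and the pool size of 64 are placeholders, not values from this setup.

```nginx
upstream app_backend {               # hypothetical upstream name
    server 203.0.113.10:80;          # placeholder address of the backend
    keepalive 64;                    # idle upstream connections kept open per worker (assumed value)
}

server {
    listen 80;

    location / {
        proxy_pass http://app_backend;
        proxy_http_version 1.1;          # upstream keepalive needs HTTP/1.1
        proxy_set_header Connection "";  # do not send "Connection: close" to the backend
    }
}
```

Without the last two directives nginx talks HTTP/1.0 to the upstream and closes the connection after each request, so the `keepalive` pool alone has no effect.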