Computer networks
DmitryKoterov, 2012-03-04 23:44:08

Catching a client disconnect on the server side?

For example, a client sends an SQL query to PostgreSQL, the server starts sending a (possibly long) response, and midway through the client process suddenly dies (say, from signal 9, SIGKILL, to keep things simple).
Questions:
1. How soon will the server find out and stop waiting (i.e., how soon will PostgreSQL return the process handling this connection to the pool, ready for new work)?
2. What is this mechanism called, i.e. where can I read about it?
3. Where is this delay configured in Linux (if it is configurable at all)?
I suspect that the server sometimes learns about a sudden client disconnect only after 2-3 minutes. As a result, the server can run out of free processes if a large number of clients die at once.

3 answer(s)
DmZ, 2012-03-06
@DmZ

There are several situations:

  • If the network is fine, the client host's kernel closes the dead process's socket, and the server receives an RST in reply to any further packets it sends, i.e. the server “sees” the connection drop almost immediately.
  • If there are network problems and no response arrives from the client, the timeouts are governed by these parameters (set via sysctl):
    tcp_retries1
    An integer that specifies the number of failed retransmission attempts after which the problem is reported to the network layer. According to the RFC, the minimum value is 3 (which is also the default), corresponding to roughly 3 seconds to 8 minutes depending on the RTO (retransmission timeout).
    tcp_retries2
    An integer that specifies the number of failed retransmission attempts after which an existing connection is terminated. RFC 1122 requires this timeout to be greater than 100 seconds, which is too low in practice, so the default is 15 retries, corresponding to roughly 13 to 30 minutes depending on the RTO.
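
To see where such figures come from, here is a rough sketch of the arithmetic, not the kernel's actual code. It assumes the retransmission timeout starts at 200 ms and doubles on every attempt up to a 120-second cap (the usual Linux minimum and maximum RTO; a real connection starts from an RTO derived from its measured round-trip time, so real numbers are higher). The helper names `hypothetical_timeout` and `read_sysctl` are made up for this illustration.

```python
from pathlib import Path

RTO_MIN = 0.2    # seconds; assumed starting RTO (Linux minimum)
RTO_MAX = 120.0  # seconds; assumed cap on the backed-off RTO


def hypothetical_timeout(retries, rto=RTO_MIN, rto_max=RTO_MAX):
    """Sum the waits for the first transmission plus `retries` retransmissions,
    doubling the RTO after each one and capping it at rto_max."""
    total = 0.0
    for _ in range(retries + 1):
        total += min(rto, rto_max)
        rto *= 2
    return total


def read_sysctl(name):
    """Return the integer value of net.ipv4.<name>, or None if it cannot be read."""
    path = Path("/proc/sys/net/ipv4") / name
    try:
        return int(path.read_text())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    for name, default in (("tcp_retries1", 3), ("tcp_retries2", 15)):
        value = read_sysctl(name)
        if value is None:
            value = default  # fall back to the defaults quoted above
        print(f"{name}={value}: at least {hypothetical_timeout(value):.1f} s")
```

Under these assumptions the default tcp_retries2 of 15 works out to a lower bound of about 15 minutes, which is consistent with the range quoted above.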

All of these parameters are described in the kernel source documentation (Documentation/networking/ip-sysctl.txt).
Note that these parameters affect every TCP connection on the server. To tune a specific application, you can set the TCP_USER_TIMEOUT option on its TCP socket. It specifies the maximum time, in milliseconds, that transmitted data may remain unacknowledged before the connection is closed. The option appeared in kernel 2.6.37.
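
A minimal sketch of setting this option on one socket, assuming Linux. The helper name `set_user_timeout` and the 30-second value are only illustrative, and the numeric fallback 18 is the Linux value of the constant, used in case the Python build does not export it.

```python
import socket

# Linux value of TCP_USER_TIMEOUT; used only if this Python build lacks the constant.
TCP_USER_TIMEOUT = getattr(socket, "TCP_USER_TIMEOUT", 18)


def set_user_timeout(sock, timeout_ms):
    """Limit how long sent data may stay unacknowledged on this one socket.

    Returns the value the kernel reports back, in milliseconds.
    """
    sock.setsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT, timeout_ms)
    return sock.getsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print(set_user_timeout(s, 30_000))  # illustrative: give up after 30 s without an ACK
    s.close()
```

Unlike the sysctl parameters, this affects only the socket it is set on, so other connections on the host keep the system-wide behaviour.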
On earlier kernels, you can monitor the socket's outgoing queue: if it does not shrink for a while, something has gone wrong.
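
One possible way to watch the outgoing queue, sketched under the assumption of Linux, where the SIOCOUTQ ioctl reports how many bytes are still sitting in a socket's send queue (it has the same number as `termios.TIOCOUTQ`, which is what the code uses). The function names, the polling interval and the number of checks are arbitrary choices for illustration.

```python
import fcntl
import struct
import termios
import time


def unsent_bytes(sock):
    """Bytes still in the socket's send queue (SIOCOUTQ, Linux only)."""
    buf = fcntl.ioctl(sock.fileno(), termios.TIOCOUTQ, struct.pack("i", 0))
    return struct.unpack("i", buf)[0]


def queue_is_stalled(sock, interval=1.0, checks=3):
    """True if the send queue is non-empty and never shrinks over `checks` polls."""
    last = unsent_bytes(sock)
    if last == 0:
        return False  # nothing is waiting to be acknowledged
    for _ in range(checks):
        time.sleep(interval)
        now = unsent_bytes(sock)
        if now < last:
            return False  # the peer is still acknowledging data
        last = now
    return True
```

A server could call `queue_is_stalled(conn)` between chunks of a long response and drop the connection when it returns True.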

TheHorse, 2012-03-05
@TheHorse

As far as I understand, the server will get a “Connection reset by peer” error (WSAECONNRESET on Windows) almost instantly.
PS This is not an answer, but I hope it helps in finding it.

Alexey Sundukov, 2012-03-06
@alekciy

1.
a) At the TCP level:
The client process is dead but the client host is alive, so the client's kernel closes the socket and its TCP stack sends a FIN (any data that arrives from the server afterwards is answered with an RST). The server's TCP stack receives the FIN, replies with an ACK, and moves from ESTABLISHED to CLOSE_WAIT. Once the server worker process closes the socket, the server's TCP stack sends its own FIN and moves to LAST_ACK, waiting for an ACK from the client's TCP stack. Having received it, it moves to CLOSED.
The duration of this exchange depends only on how fast packets travel between the hosts. Given how small they are, it usually takes less than a second.
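
From the server application's side, the arrival of the client's FIN can be noticed without writing anything: the socket becomes readable and a read returns zero bytes. A minimal sketch, assuming a Unix-like system (MSG_DONTWAIT is not available on Windows); `peer_has_closed` is a made-up helper name, and this is not a description of how PostgreSQL itself checks.

```python
import select
import socket


def peer_has_closed(conn):
    """Check, without blocking or consuming data, whether the peer sent FIN or RST."""
    readable, _, _ = select.select([conn], [], [], 0)
    if not readable:
        return False  # nothing arrived: no data, no FIN
    try:
        data = conn.recv(1, socket.MSG_PEEK | socket.MSG_DONTWAIT)
    except BlockingIOError:
        return False
    except ConnectionResetError:
        return True  # the connection was reset (RST)
    return data == b""  # empty read means the peer closed its side (FIN)
```

A worker busy with a long query could call this periodically and abandon the query when it returns True. Note that it only detects a close that the client host actually announced, not a host that vanished from the network.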
b) At the application level:
I don’t know what exactly PostgreSQL writes to the socket, but in any case, when it keeps writing to a socket that has moved to CLOSE_WAIT, the worker process will get an error: the first write may still be accepted, but once the client's stack answers with an RST, write should fail with EPIPE. The worker process should then call close, although it can also exit without it, in which case the kernel closes the socket itself in the background. To a first approximation, everything should happen quickly and without delay.
PS The details added in the clarification of the question contradict its original wording. A client that stops responding altogether is a problem with the network or the client host; when only the client process dies, the client's TCP stack keeps working and takes over notifying the server host of the error, so everything happens quite quickly.
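
The write-side behaviour can be tried on the loopback interface. The sketch below is only an experiment, not PostgreSQL's code: the "client" closes its socket (standing in for the kernel cleaning up after a killed process) and the "server" keeps writing until the kernel reports an error. The function name, the payload and the retry limits are invented for the illustration.

```python
import socket
import time


def write_until_error(max_attempts=50, pause=0.05):
    """Return (successful_writes, error_name) for a server writing to a closed peer."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    conn, _ = listener.accept()
    listener.close()

    client.close()  # stands in for the kernel closing a dead process's socket
    ok_writes = 0
    try:
        for _ in range(max_attempts):
            try:
                conn.send(b"row data\n")
                ok_writes += 1
            except (BrokenPipeError, ConnectionResetError) as exc:
                # EPIPE or ECONNRESET, depending on the platform
                return ok_writes, type(exc).__name__
            time.sleep(pause)  # give the peer's RST time to arrive
        return ok_writes, None
    finally:
        conn.close()


if __name__ == "__main__":
    print(write_until_error())
```

Python ignores SIGPIPE by default, so EPIPE shows up as an exception; a C program would need to ignore the signal or pass MSG_NOSIGNAL to see the error code instead of being killed.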
