Django
kazmiruk, 2015-05-05 17:53:07

How to catch nginx+uwsgi+django response slowdown?

Hello
The problem is the following. There are two releases. With the first release, 90% of requests have a response time of about 60 ms. Rolling out the second (newer) release raises this to 250 ms. The first thing I did was go through all the commits that make up the second release; I found nothing suspicious (about 95% of it is styles and layout). I also tried rolling out several builds with some of the changes disabled and found nothing that way either (with a caveat: doing this in production is not great, and the problem does not show up right away, so I only tried it this way with the most suspicious commits). I tried to reproduce it locally and saw no difference in response time. PgBouncer says the response time from the database did not change, so the database can be ruled out. I'm out of ideas. So the questions are:
- where should I dig, what should I watch in production?
- the slowdown does not appear immediately but only after about 30 minutes, even though all processes are restarted when the release is rolled out. What could that be connected with?
On the graphs, rolling out a release also shows a slight change in the number of requests, which is even more confusing: release 1 means fast responses and many requests, release 2 means slow responses and fewer requests.
The environment is: django 1.6, django-jinja + jinja2, postgres + pgbouncer, redis for sessions and cache, all of it behind nginx.
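
One way to see where the time goes on the Django side is a per-view timing middleware that logs how long each request takes, so per-endpoint timings can be compared between the two releases. A rough sketch for Django 1.6's MIDDLEWARE_CLASSES style (the module and logger names are illustrative):

# timing_middleware.py -- log per-request processing time
# (Django 1.6 old-style middleware; "request_timing" logger name is illustrative)
import time
import logging

logger = logging.getLogger("request_timing")

class RequestTimingMiddleware(object):
    def process_request(self, request):
        request._timing_start = time.time()

    def process_response(self, request, response):
        start = getattr(request, "_timing_start", None)
        if start is not None:
            elapsed_ms = (time.time() - start) * 1000
            # grep/aggregate by path later to compare releases
            logger.info("%s %s %d %.1fms", request.method, request.path,
                        response.status_code, elapsed_ms)
        return response

Added as the first entry in MIDDLEWARE_CLASSES, it covers almost the whole Django part of each request.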

3 answers
kazmiruk, 2015-05-22
@kazmiruk

It all turned out to be quite interesting. The metric was the average response time. After the release, an asynchronous request that was handled very quickly and very frequently disappeared. Accordingly, that request stopped masking the real problem on the server: a response time of about 350 ms.
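
To illustrate with made-up numbers: if 80% of requests are a 10 ms asynchronous call and 20% are the real pages at 350 ms, the average is 0.8 * 10 + 0.2 * 350 = 78 ms; once the asynchronous request disappears, the average jumps to 350 ms even though the slow pages themselves did not change.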

He11ion, 2015-05-05
@He11ion

If it doesn't show up right away but a little later, I would take a very close look at memory usage; it may well be leaking. In general, wiring in metrics and doing load testing is the natural way forward, in my opinion.
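
A simple way to watch that in production is to log the total RSS of the uwsgi workers over time and see whether it keeps growing after the deploy. A rough sketch, assuming psutil is installed and the workers show up with "uwsgi" in the process name:

# memwatch.py -- print the summed RSS of uwsgi workers once a minute
# (sketch; assumes psutil is installed and workers are named "uwsgi")
import time
import psutil

def uwsgi_rss_mb():
    total = 0
    for p in psutil.process_iter():
        try:
            if "uwsgi" in p.name():
                total += p.memory_info().rss
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            continue
    return total / (1024.0 * 1024.0)

if __name__ == "__main__":
    while True:
        print("%s uwsgi RSS: %.1f MB" % (time.strftime("%H:%M:%S"), uwsgi_rss_mb()))
        time.sleep(60)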

sim3x, 2015-05-05
@sim3x

Configure nginx to log response times; that alone may already show the picture:
stackoverflow.com/questions/14592773/nginx-request...

log_format  main  '$remote_addr|$time_local|$request|$request_time|$upstream_response_time|'
                  '$status|$body_bytes_sent|$http_referer|'
                  '$http_user_agent';
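
With a format like that in place, the $upstream_response_time field can be aggregated per URL to compare the two releases. A rough sketch of a parser for the pipe-separated fields above ("access.log" is just an example path):

# access_log_stats.py -- p50/p90 of $upstream_response_time per URL path,
# for the pipe-separated log_format above (sketch)
import sys
from collections import defaultdict

def percentile(values, pct):
    values = sorted(values)
    return values[min(int(len(values) * pct / 100.0), len(values) - 1)]

timings = defaultdict(list)
with open(sys.argv[1] if len(sys.argv) > 1 else "access.log") as f:
    for line in f:
        parts = line.rstrip("\n").split("|")
        if len(parts) < 5 or parts[4] == "-":
            continue
        request, upstream = parts[2], parts[4]
        path = request.split(" ")[1].split("?")[0] if " " in request else request
        # $upstream_response_time can be a comma-separated list; take the first value
        timings[path].append(float(upstream.split(",")[0].strip()))

for path, values in sorted(timings.items(), key=lambda kv: -len(kv[1])):
    print("%6d reqs  p50=%.3fs  p90=%.3fs  %s" % (
        len(values), percentile(values, 50), percentile(values, 90), path))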

uwsgi-docs.readthedocs.org/en/latest/Changelog-1.9...
lists.unbit.it/pipermail/uwsgi/2012-July/004474.html
projects.unbit.it/uwsgi/wiki/TipsAndTricks
I include the --profiler option too:
--profiler pycall
--profiler pyline
They will write function timings to the logs. You can analyze the logs to find the bottleneck (not an easy task, but it should be doable by investing a bit of time in a log parser).
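
A rough sketch of the kind of log parser meant here. The regex is a hypothetical placeholder: the exact line format written by the uWSGI --profiler options differs between versions, so it has to be adjusted to whatever the logs actually contain.

# profiler_log_totals.py -- sum up per-function time from profiler log lines
# (sketch; LINE_RE is a hypothetical placeholder, adjust it to the real format)
import re
import sys
from collections import defaultdict

LINE_RE = re.compile(r"(?P<func>[\w\.]+)\D+(?P<secs>\d+\.\d+)\s*$")

totals = defaultdict(float)
calls = defaultdict(int)
with open(sys.argv[1]) as f:
    for line in f:
        m = LINE_RE.search(line)
        if not m:
            continue
        totals[m.group("func")] += float(m.group("secs"))
        calls[m.group("func")] += 1

# top 30 functions by total time
for func, total in sorted(totals.items(), key=lambda kv: -kv[1])[:30]:
    print("%10.3fs  %6d calls  %s" % (total, calls[func], func))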
