How do I find the cause of load spikes on the processor?

linux

Andrey, 2014-10-28 03:02:31

Good afternoon, everyone.
The server shows very strange load peaks: during a peak, the load on the CPU jumps from 2-3 to 100-150. This can happen even at 7 am, when the normal load on the server is at its lowest.
The server runs nginx + php-fpm, redis and rabbitmq. There is no database on it; the database is on another server.
During a peak, the number of requests going through nginx drops, and network traffic also falls almost to zero. We can't figure out how to get to the source of this.
The peaks themselves are short, lasting 2 to 5 minutes.

I would be grateful for help with determining the source of this load.

7 answers

Power, 2014-10-28
@Power

As far as I understand, the CPU has nothing to do with it. The last graph clearly shows a peak in iowait, which means you need to look at disk activity. Set up I/O monitoring and the cause should become clear.
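As a rough illustration of what I/O monitoring measures here, the sketch below computes the share of CPU time spent in iowait between two samples of /proc/stat. The function names and the 5-second interval are assumptions made for this example; they are not part of any monitoring tool mentioned in this thread.

```python
import time

# Order of the first eight counters on the aggregate "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(stat_text):
    """Return the aggregate CPU counters from /proc/stat text as a dict."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")


def cpu_shares(before, after):
    """Percentage of CPU time spent in each state between two samples."""
    deltas = {key: after[key] - before[key] for key in before}
    total = sum(deltas.values())
    if total <= 0:
        return {key: 0.0 for key in deltas}
    return {key: 100.0 * value / total for key, value in deltas.items()}


def sample_shares(interval=5.0, path="/proc/stat"):
    """Hypothetical helper: read /proc/stat twice and return the shares."""
    with open(path) as fh:
        before = parse_cpu_line(fh.read())
    time.sleep(interval)
    with open(path) as fh:
        after = parse_cpu_line(fh.read())
    return cpu_shares(before, after)
```

Calling `sample_shares()` in a loop during a peak and logging the `iowait` value would show whether the CPU is mostly waiting for the disk rather than doing work. The `steal` value from the same sample indicates time taken by the hypervisor on a virtual server.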

Vladimir, 2014-10-28
@rostel

Run atop in the mode where it collects statistics into a log.
After the peak, open that log in analysis mode and see what happened.
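If atop is not at hand, the following is a minimal sketch of one thing worth capturing during a peak, not a replacement for atop. On Linux the load average counts tasks that are runnable (state R) or in uninterruptible sleep (state D, typically waiting for I/O), so listing those tasks shows what the load figure is made of. The function names are assumptions for this example, and only the main thread of each process is inspected.

```python
import os
from collections import Counter


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_idx = stat_text.index("(")
    close_idx = stat_text.rindex(")")
    pid = int(stat_text[:open_idx])
    comm = stat_text[open_idx + 1:close_idx]
    state = stat_text[close_idx + 1:].split()[0]
    return pid, comm, state


def busy_processes(proc_root="/proc"):
    """List (pid, comm, state) for processes in state R or D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited or its stat file is unreadable
        if state in ("R", "D"):
            found.append((pid, comm, state))
    return found


def summarize(processes):
    """Count processes per (command name, state) pair."""
    return Counter((comm, state) for _pid, comm, state in processes)
```

Logging `summarize(busy_processes())` every few seconds would show, for example, whether the peak consists of many php-fpm workers stuck in state D, which points at the disk rather than the CPU.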

Sergey, 2014-10-28
@butteff

1. Look at the logs of all the services on the server: what happened at that time, and what were they doing?
2. If the peaks always occur at a certain time, it is worth looking for the cause in cron, for example.
3. Maybe the hosting provider is using the threads and resources for its own tasks, such as backups? You have a virtual dedicated server, so you use only part of the processor's capacity, and the provider can take your resources if necessary.

Andrey, 2014-10-28
@andreyvlru

It seems that the problem is a plain lack of memory.
The server has 16 GB of memory:
redis uses 7 GB,
php-fpm uses 5-8 GB.
I think even the slightest memory overload led to swapping or failures. I haven't seen any confirmation yet. I have taken load off the server by moving part of it to the spare one; let's see how it behaves today at peak load.
P.S. Thanks for the tip about newrelic. It is an awesome thing, but damn expensive.
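One way to look for confirmation of swapping is to compare the `pswpin` and `pswpout` counters in /proc/vmstat before and during a peak. The sketch below assumes those two counters are present on the kernel in use; the function names are made up for this example.

```python
def parse_counters(text):
    """Parse 'name value' lines, as found in /proc/vmstat, into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_activity(before, after):
    """Pages swapped in and out between two /proc/vmstat samples."""
    return {
        key: after.get(key, 0) - before.get(key, 0)
        for key in ("pswpin", "pswpout")
    }


def read_vmstat(path="/proc/vmstat"):
    """Hypothetical helper: read and parse the current counters."""
    with open(path) as fh:
        return parse_counters(fh.read())
```

If both deltas stay at zero across a peak, swapping is unlikely to be the cause; steadily growing values would support the memory theory.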

Puma Thailand, 2014-10-28
@opium

You can see that the bottleneck is the disk: either the system went into swap, or something started using the disk heavily.

Eternalko, 2014-10-28
@Eternalko

You can also use monitoring tools like newrelic / appDynamics. They help.

Sergey Petrikov, 2014-10-28
@RicoX

If I had to guess, the disk subsystem is choking at times: the symptoms fit, and the red graph on the bottom screenshot confirms it. Check the status of the RAID; if it is OK, look at what is loading the disk at that moment. It may be some kind of log rotation, something dumping its cache to disk, or something else.
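To see what is loading the disk at that moment, one option is to compare the per-process counters in /proc/<pid>/io between two snapshots. This is a minimal sketch under a few assumptions: the kernel has per-task I/O accounting enabled, the script runs as root (other users' io files are otherwise unreadable), and the function names are invented for this example.

```python
import os


def parse_io(io_text):
    """Parse 'key: value' lines from /proc/<pid>/io into a dict."""
    stats = {}
    for line in io_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            stats[key.strip()] = int(value)
    return stats


def snapshot(proc_root="/proc"):
    """Return {pid: io counters} for every readable process."""
    result = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "io")) as fh:
                result[int(entry)] = parse_io(fh.read())
        except OSError:
            continue  # the process exited or permission was denied
    return result


def top_io(before, after, limit=5):
    """Rank processes by bytes read plus written between two snapshots."""
    rows = []
    for pid, now in after.items():
        prev = before.get(pid)
        if prev is None:
            continue  # process started after the first snapshot
        read = now.get("read_bytes", 0) - prev.get("read_bytes", 0)
        written = now.get("write_bytes", 0) - prev.get("write_bytes", 0)
        rows.append((read + written, pid, read, written))
    rows.sort(reverse=True)
    return [(pid, read, written) for _total, pid, read, written in rows[:limit]]
```

Taking one `snapshot()` at the start of a peak and another a few seconds later, then passing both to `top_io`, gives a short list of process IDs to check against log rotation, cache dumps and similar jobs.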