System administration
DVoropaev, 2018-08-24 17:37:15

How to interpret load average?

The server has 8 processors with 4 cores each.
uptime reports 4.42, 4.80, 4.71
Zabbix reports 0.68, 0.68, 0.67
How should this data be interpreted, and what range is considered normal?

3 answer(s)
hx510b, 2018-08-25
@hx510b

A complicated but apparently methodologically sound explanation is in the article https://habr.com/company/mailru/blog/335326/
In practice, LA reflects not only the computational load on the CPU; it also depends on I/O and other aspects of the system's state.
Under certain circumstances you can observe an LA of several thousand while the processors are actually idle and the number and state of processes look normal.
For myself, I treat LA as a composite indicator of the load on the system.
Put simply, it can be thought of as a rough indicator of the length of the queue of processes waiting to run. This interpretation is deliberately inexact, but quite usable in real work.
Interpretation of LA values:
Values from 0 to 1 indicate an unloaded system, close to idle.
Values from 1 to 10 indicate a moderately loaded system. Everything is fine.
Values from 10 to 30 indicate a highly loaded system. No more load should be added, and optimization is recommended.
Values from 30 to 100 indicate an excessively loaded system. The cause may be, for example, a large share of iowait from overload (too many I/O threads per block device), abnormally slow operation of a block device due to a malfunction, or a similar bottleneck in a system that needs to be expanded. At these LA values the system works inefficiently. Optimization is needed.
Values above 100 should be treated as a system failure in terms of performance. Action must be taken urgently.
Values above 1000: further growth of LA leads to a kernel crash; as a rule, the system goes down within the next few hours. An emergency response is required to avoid system failure and data loss.
These boundaries are approximate and based on experience.
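
As a rough illustration, the bands above can be written as a small lookup. This is only a sketch: the function name and labels are made up here, and treating each band as a half-open interval (so that exactly 10 counts as "high") is an assumption, since the answer itself calls the boundaries approximate.

```python
from bisect import bisect_right

# Upper bounds of the bands listed in the answer above.
# Assumption of this sketch: each band is half-open, [low, high).
BOUNDS = [1, 10, 30, 100, 1000]

# Hypothetical short labels, one per band, in the same order.
LABELS = [
    "idle",                 # 0 to 1
    "moderate",             # 1 to 10
    "high",                 # 10 to 30
    "excessive",            # 30 to 100
    "performance failure",  # above 100
    "crash likely",         # above 1000
]


def classify_load_average(la: float) -> str:
    """Map a load average value to one of the rough bands."""
    if la < 0:
        raise ValueError("load average cannot be negative")
    # bisect_right counts how many upper bounds are <= la,
    # which is exactly the index of the band la falls into.
    return LABELS[bisect_right(BOUNDS, la)]


if __name__ == "__main__":
    # The uptime values from the question all land in the same band.
    for value in (4.42, 4.80, 4.71):
        print(value, classify_load_average(value))
```

By this scale the values from the question (between 4 and 5) count as a moderately loaded system.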

Melkij, 2018-08-24
@melkij

And what is the normal range?

Look at the graph. If the value does not stand out from the usual background and the system is working normally, then that is the norm for your system.
There is no abstract "normal" value for LA.
https://www.zabbix.com/forum/zabbix-troubleshootin...
Do you really have an 8-socket machine? 32 cores in total? The arithmetic doesn't quite add up.
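
The mismatch can be checked with a little arithmetic. The sketch below assumes that the Zabbix item is collected in per-CPU mode, i.e. that it reports the load average divided by the number of CPUs; the thread does not confirm this. It also assumes both sets of numbers were sampled at about the same time. The function name is hypothetical.

```python
# Values quoted in the question.
uptime_la = (4.42, 4.80, 4.71)   # 1, 5 and 15 minute averages from uptime
zabbix_la = (0.68, 0.68, 0.67)   # the corresponding values shown by Zabbix


def implied_cpu_count(total, per_cpu):
    """Divide each total load by the per-CPU load to get the implied CPU count."""
    return [t / p for t, p in zip(total, per_cpu)]


ratios = implied_cpu_count(uptime_la, zabbix_la)
print([round(r, 2) for r in ratios])  # [6.5, 7.06, 7.03]
```

Under those assumptions the implied divisor is around 7, which matches neither 8 sockets nor 32 cores.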

lega, 2018-08-24
@lega

uptime shows the load averaged over 1, 5 and 15 minutes. The number is the count of cores "consumed" per unit of time.
I.e. if you have 8 working cores and the value is 8, the processor is running at 100% (there is just enough CPU for the tasks, but no reserve). If the value is 4, the load is 50%. If the value is 16, the processor is running at 100% and the same amount of work again (another 100%) is sitting idle, waiting for the processor. In other words, the CPU falls short by a factor of 2, and with more cores the tasks would finish faster.
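
The same rule of thumb can be sketched in Python. It follows the simplified model of this answer, where load divided by core count is read as CPU utilization, and so ignores the I/O wait that also contributes to LA on Linux. The helper name `describe_load` is made up for illustration; `os.getloadavg()` is only available on Unix-like systems.

```python
import os


def describe_load(load: float, cores: int) -> tuple[float, float]:
    """Return (busy %, waiting %) under the simplified per-core model.

    busy    - share of CPU capacity in use, capped at 100
    waiting - extra work queued beyond capacity, as a % of capacity
    """
    if cores <= 0:
        raise ValueError("cores must be positive")
    ratio = load / cores
    busy = min(ratio, 1.0) * 100
    waiting = max(ratio - 1.0, 0.0) * 100
    return busy, waiting


if __name__ == "__main__":
    # The three cases from the answer, for 8 cores.
    for la in (4, 8, 16):
        print(la, describe_load(la, 8))

    # Live values for the current machine (Unix only).
    one_min, five_min, fifteen_min = os.getloadavg()
    cores = os.cpu_count() or 1
    print(describe_load(one_min, cores))
```

For 8 cores this gives 50% busy at LA 4, 100% busy at LA 8, and 100% busy plus another 100% waiting at LA 16.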
