linux
Anton, 2018-06-16 18:52:17

How to correctly calculate vm.dirty_background_bytes and vm.dirty_bytes?

Good afternoon!
I am turning to you as experts in Linux and beyond.
I am writing a script for tuning PostgreSQL (the specifics don't really matter here):
https://github.com/patsevanton/postgrestun
While working on it I got to checking vm.dirty_background_ratio and vm.dirty_ratio.
Everywhere I read that they should be replaced with vm.dirty_background_bytes and vm.dirty_bytes, set to 2, 4, or 8 MB.
Everyone gives different numbers.
As I understand it, for a Linux server to run smoothly and flush dirty pages evenly, vm.dirty_background_bytes and vm.dirty_bytes should be set in proportion to the disk write speed.
Is that so? Or can these values be calculated in some other way?
If so, what is the best way to measure disk write speed?
Like this? dd if=/dev/zero of=ddfile2 bs=1M count=1953 oflag=direct
That shows the sequential write speed.
In most cases it is random write speed that matters, right?
Or should I use fio (https://habr.com/post/154235/)?
fio seems to test the entire disk.
How do you calculate vm.dirty_background_bytes and vm.dirty_bytes (on an already installed Linux system, without reinstalling)?
This is probably a really difficult question, given that there are no comments and no answers.

1 answer(s)
Yuri, 2018-06-19
@TaHKucT

My answer is longer than 10,000 characters and Toster will not accept it in one piece, so I'll split it across several answers:
No. vm.dirty_background_bytes is how many bytes of dirty pages may accumulate before the kernel starts writing them to disk in the background. vm.dirty_bytes is the maximum number of bytes that dirty pages may occupy.
For example, say you have vm.dirty_background_bytes = 10, vm.dirty_bytes = 30, a disk write speed of 2 bytes per second, and a RAM write speed of 5 bytes per second (the numbers are deliberately tiny so that it is clear what is happening and we don't get lost converting bytes to megabytes, bytes per second to megabits per second, and so on).
(A caveat: dirty pages work in a much more complicated way than described below. The kernel works with pages, not bytes, the pages are tied to specific files, and there are many other nuances. What follows is a heavily simplified explanation meant to give a rough idea of what happens.)
At second 0 (the start of the example) vm.dirty_background_bytes = 0 (here and below the parameter names double as counters of the dirty data currently held against each limit). The disk is completely idle and nothing is happening (leaving aside that Linux is multitasking and such a situation is almost impossible in real life). Suddenly some process starts writing to disk. In fact it changes data in RAM and marks the pages as dirty; no actual write to disk occurs (unless fsync is called, but more on that below).
One second later, at second 1, the picture is: vm.dirty_background_bytes = 5 (the write speed to RAM is 5; from here on we assume the process keeps writing at this speed, for example because it has a lot to write and all the data is already prepared), vm.dirty_bytes = 5 (vm.dirty_bytes is the amount of memory available for dirty pages, and vm.dirty_background_bytes is a part of it), disk write speed = 0.
Second 2: vm.dirty_background_bytes = 10, vm.dirty_bytes = 10, disk write speed = 0.
Second 3: vm.dirty_background_bytes = 15, vm.dirty_bytes = 15, disk write speed = 2 (the amount has exceeded the 10 bytes we set, so the synchronization process starts; it takes data at the agreed 2 bytes per second and writes it to disk). One could argue about exactly when the synchronization of dirty pages starts, at 9, 10, or 11 bytes, but for this example it does not matter.
Second 4: vm.dirty_bytes = 18 (the process wrote +5, the synchronizer flushed -2, so 15+5-2=18). vm.dirty_background_bytes no longer interests us, because the synchronizer is already running.
Second 5: vm.dirty_bytes = 21
Second 6: vm.dirty_bytes = 24
Second 7: vm.dirty_bytes = 27
Second 8: vm.dirty_bytes = 30
Second 9: vm.dirty_bytes = 30 (the limit is reached)
Second 10: vm.dirty_bytes = 30 (the process can no longer write at 5 bytes/s; from here on it is held to the disk's 2 bytes/s until it has written all of its data).
...
Second N (the process has written all the data it needs): vm.dirty_bytes = 28
Second N+1: vm.dirty_bytes = 26 (and so on).
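The toy timeline above can be replayed with a few lines of Python. This is only a sketch of the simplified model from this answer, not of how the kernel really works; the function name simulate and its parameters are made up for illustration, and the rule "the flusher starts draining the second after the background threshold is exceeded" is an assumption chosen to match the walkthrough.

```python
def simulate(background_limit=10, hard_limit=30, ram_rate=5, disk_rate=2, seconds=10):
    """Replay the simplified dirty-page model, one step per second.

    Returns a list of (second, dirty_bytes, bytes_accepted_from_writer).
    """
    dirty = 0
    flushing = False
    history = []
    for second in range(1, seconds + 1):
        # Once started, the background flusher drains at disk speed.
        drained = min(disk_rate, dirty) if flushing else 0
        # The writer is throttled so that dirty data never exceeds the hard limit.
        written = min(ram_rate, hard_limit - (dirty - drained))
        dirty = dirty - drained + written
        # Assumption: flushing kicks in once the background threshold is exceeded.
        if dirty > background_limit:
            flushing = True
        history.append((second, dirty, written))
    return history


if __name__ == "__main__":
    for second, dirty, written in simulate():
        print(f"second {second}: dirty = {dirty}, writer accepted {written} bytes")
```

Once the counter reaches the hard limit, the writer in this model is accepted only at the disk's speed, which is the point where the burst stops being smoothed.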
As you can see, dirty pages let us "smooth out" a small write burst, because data that has to be written to disk is held temporarily in memory. It has to be a short burst: during a long one you fill up all the memory allowed for dirty pages, and after that the write speed drops to the speed of the disk. The mechanism is obviously useful and sometimes necessary, but everything depends on your load patterns (that is why everyone recommends different numbers: one option works better for some, another for others). By increasing vm.dirty_bytes you, on the one hand, increase the size of the peak you can smooth out and, on the other hand, increase the amount of RAM that peak will occupy and the time it will take to flush that RAM.
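The trade-off in the last sentence comes down to simple arithmetic: the time needed to flush a full dirty buffer is roughly its size divided by the disk write speed. A minimal sketch, where drain_seconds is a made-up helper and the 100 MiB/s disk speed is purely an assumed figure for illustration:

```python
def drain_seconds(dirty_bytes, disk_bytes_per_second):
    """Rough time to flush dirty_bytes at a steady disk write speed."""
    if disk_bytes_per_second <= 0:
        raise ValueError("disk speed must be positive")
    return dirty_bytes / disk_bytes_per_second


if __name__ == "__main__":
    MIB = 1024 * 1024
    assumed_disk_speed = 100 * MIB  # assumption: 100 MiB/s, measure your own disk
    for limit_mib in (8, 256, 2048):
        seconds = drain_seconds(limit_mib * MIB, assumed_disk_speed)
        print(f"{limit_mib} MiB of dirty data drains in about {seconds:.2f} s")
```

With the toy numbers from the example, drain_seconds(30, 2) gives 15 seconds to empty a completely full buffer once the writer stops.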
At the same time, I see that by default CentOS 7 sets vm.dirty_background_ratio = 10 and vm.dirty_ratio = 30 (that is, up to 30% of all available memory can be used for dirty pages, and synchronization starts once more than 10% is occupied), vm.dirty_expire_centisecs = 3000 (flush to disk any data that has been dirty for more than 30 seconds, regardless of how much memory dirty pages occupy) and vm.dirty_writeback_centisecs = 500 (the synchronizer wakes up every 5 seconds to process data that falls under vm.dirty_expire_centisecs).
On Ubuntu (16.04 and 18.04) vm.dirty_ratio = 20; all other parameters are exactly the same as on CentOS.
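To see which values a particular machine actually uses, the settings can be read from /proc/sys/vm. The sketch below assumes a Linux system where these files exist; read_dirty_settings and describe are illustrative names, not part of any library. Note that the kernel accepts either the *_ratio or the *_bytes form of each limit, and the one not in use reads as 0.

```python
from pathlib import Path

DIRTY_KEYS = (
    "dirty_background_ratio",
    "dirty_ratio",
    "dirty_background_bytes",
    "dirty_bytes",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
)


def read_dirty_settings(root="/proc/sys/vm"):
    """Return {name: int} for every vm.dirty_* file found under root."""
    settings = {}
    for key in DIRTY_KEYS:
        path = Path(root) / key
        if path.exists():
            settings[key] = int(path.read_text().strip())
    return settings


def describe(settings):
    """Format the settings, converting centiseconds to seconds."""
    lines = []
    for key, value in settings.items():
        if key.endswith("_centisecs"):
            lines.append(f"vm.{key} = {value} ({value / 100:g} s)")
        else:
            lines.append(f"vm.{key} = {value}")
    return lines


if __name__ == "__main__":
    print("\n".join(describe(read_dirty_settings())))
```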
The logic here is very simple: no memory is reserved for dirty pages. If it is needed and there is free memory, it is used (up to 30% or 20% of all available memory). Note that "available" here does not mean "total installed"; it is the available memory shown in the output of free.
In other words, we get a kind of automatic balancing of the system through this parameter, driven by "total RAM minus used RAM": the more memory is available, the more dirty pages there can be. If you are going to limit dirty pages to 2, 4, or 8 megabytes, then at least run a set of tests under your production load (not a synthetic one) and make sure that this at least does not make things worse (does not reduce performance). In my experience, with my workloads, most servers do fine with the defaults.
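To get a feel for what the percentage-based limits mean in bytes on a given machine, the arithmetic can be sketched as below. This follows the simplified description in this answer (a percentage of the available memory reported by free); the kernel's own accounting of "dirtyable" memory differs in detail, so treat the result as an estimate. The helper names are made up for illustration; MemAvailable is the field of /proc/meminfo that free reports as "available".

```python
def parse_mem_available(meminfo_text):
    """Extract MemAvailable from /proc/meminfo contents and return it in bytes."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            # The value is reported in kB, e.g. "MemAvailable:  8000000 kB".
            return int(line.split()[1]) * 1024
    raise ValueError("MemAvailable not found")


def ratio_to_bytes(available_bytes, ratio_percent):
    """Approximate byte limit implied by a vm.dirty_*_ratio percentage."""
    return available_bytes * ratio_percent // 100


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        available = parse_mem_available(f.read())
    # 10 and 30 are the CentOS 7 values quoted above; substitute your own.
    for name, ratio in (("dirty_background_ratio", 10), ("dirty_ratio", 30)):
        mib = ratio_to_bytes(available, ratio) / (1024 * 1024)
        print(f"vm.{name} = {ratio}% is roughly {mib:.0f} MiB right now")
```

Running it at different times of day shows how the effective limit moves with the amount of available memory, which is the "automatic balancing" described above.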
