linux
Vladimir, 2019-05-16 18:11:53

pg_basebackup: does the copy speed depend on the kind of data that dominates the database, and can the copy be sped up using PostgreSQL's own tools?

There is a database of about 1 TB, created artificially. It contains a bunch of tables with text fields holding hashes of random values (about 10 GB each).
Copying the entire cluster to a neighboring machine takes about 14 hours. During that time neither the network nor the disks appear to be under load.
Is there any way to increase the speed, ideally cutting the time to 5 hours? (Maybe there is some multithreading option?)
Or is there a way to copy the cluster directly from the file system without pg_basebackup? (I tried enabling tar compression in pg_basebackup, but it did not affect the speed.)
I know that pg_basebackup ideally does not affect the cluster's performance, and that you can take a full copy once a week and keep WAL archives alongside it. But I am still interested in daily full backups, if that is possible, of course.


3 answer(s)
Saboteur, 2019-05-16
@idskill

There is a database of about 1 TB, created artificially. It contains a bunch of tables with text fields holding hashes of random values (about 10 GB each).
Copying the entire cluster to a neighboring machine takes about 14 hours.

1 TB = 1000 GB.
100 Mbps ≈ 10 MB/s = 600 MB/min, so 1000 GB / 0.6 GB/min comes to roughly 27 hours.
1000 Mbps ≈ 100 MB/s, but disk throughput often tops out at around 50 MB/s = 3 GB/min, so 1000 GB / 3 GB/min ≈ 5.5 hours.
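The same back-of-the-envelope arithmetic can be written as a small Python sketch. It assumes decimal units (1 GB = 1000 MB) and a constant sustained rate; the function names are made up for illustration.

```python
def transfer_hours(size_gb, throughput_mb_s):
    """Hours needed to move size_gb at a constant throughput (1 GB = 1000 MB)."""
    seconds = size_gb * 1000 / throughput_mb_s
    return seconds / 3600


def required_mb_s(size_gb, hours):
    """Sustained throughput in MB/s needed to move size_gb within the given hours."""
    return size_gb * 1000 / (hours * 3600)


SIZE_GB = 1000  # 1 TB, as in the question

# Throughput figures taken from the estimates above.
for label, rate in [
    ("100 Mbps link, ~10 MB/s", 10),
    ("disk-bound, ~50 MB/s", 50),
    ("1 Gbps link, ~100 MB/s", 100),
]:
    print(f"{label}: {transfer_hours(SIZE_GB, rate):.1f} h")

# What the observed 14 hours and the 5-hour target imply.
print(f"14 h corresponds to {required_mb_s(SIZE_GB, 14):.1f} MB/s")
print(f"5 h requires {required_mb_s(SIZE_GB, 5):.1f} MB/s")
```

Read the other way round, the observed 14 hours corresponds to roughly 20 MB/s sustained, and the 5-hour target needs roughly 56 MB/s end to end.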
I suspect the bottleneck is either the network (gzip compresses the data, so instead of 27 hours you get about 14),
or the gzip compression itself, which runs in a single stream (only one core is loaded, so the CPU as a whole looks idle) and takes too long.
In the first case, make sure -z is enabled, try -Z 9, and move to a gigabit network.
In the second case, do the opposite and try -Z 1 to reduce the CPU load.
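To compare the two cases, here is a minimal Python sketch that only builds the two pg_basebackup command lines and prints them; it does not run anything. The host name, user, and target directory are placeholders, and the helper name is made up. The flags used (-h, -U, -D, -Ft, -P, -Z) are regular pg_basebackup options; gzip compression applies to the tar format only.

```python
import shlex


def basebackup_cmd(host, target_dir, gzip_level=None, user="postgres"):
    """Build a pg_basebackup command line in tar format.

    gzip_level: None for no compression, or 1..9 to pass as -Z.
    """
    # -Ft selects tar output, -P prints progress while the copy runs.
    cmd = ["pg_basebackup", "-h", host, "-U", user, "-D", target_dir, "-Ft", "-P"]
    if gzip_level is not None:
        if not 1 <= gzip_level <= 9:
            raise ValueError("gzip level must be between 1 and 9")
        cmd += ["-Z", str(gzip_level)]
    return cmd


# Placeholder host and directory; replace with real values before running.
# First case (network suspected): strongest gzip level.
print(shlex.join(basebackup_cmd("db1.example", "/backup/base", gzip_level=9)))
# Second case (single gzip stream suspected): cheapest gzip level.
print(shlex.join(basebackup_cmd("db1.example", "/backup/base", gzip_level=1)))
```

Timing each variant on the same data shows which way the bottleneck moves. One caveat, as far as I know: in the PostgreSQL versions current in 2019 the gzip step runs inside the pg_basebackup client, so whether it reduces network traffic depends on which machine runs the command.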

Dmitry Shitskov, 2019-05-16
@Zarom

Take a look at Barman. It fits your situation well.
It supports two backup modes, as well as combinations of them:
docs.pgbarman.org/release/2.7/#two-typical-scenari...

Artem @Jump, 2019-05-16
Tag

Run a backup and measure the load while it runs: disk queue, network, CPU. Do the same on the machine you are copying to.
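For the CPU part, a rough Python sketch is shown here, assuming a Linux host where /proc/stat is available. It computes per-core busy fractions between two snapshots, which is what reveals a single saturated core while the overall CPU figure looks low. The function names are made up for illustration, and disk and network would need to be sampled separately.

```python
import time


def parse_cpu_times(stat_text):
    """Return {core: (busy_ticks, total_ticks)} from the text of /proc/stat."""
    times = {}
    for line in stat_text.splitlines():
        parts = line.split()
        # Keep per-core lines ("cpu0", "cpu1", ...); skip the aggregate "cpu" line.
        if not parts or not parts[0].startswith("cpu") or parts[0] == "cpu":
            continue
        ticks = [int(x) for x in parts[1:9]]  # user, nice, system, idle, iowait, ...
        idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)  # idle + iowait
        total = sum(ticks)
        times[parts[0]] = (total - idle, total)
    return times


def per_core_busy(before_text, after_text):
    """Fraction of time each core was busy between two /proc/stat snapshots."""
    before = parse_cpu_times(before_text)
    after = parse_cpu_times(after_text)
    busy = {}
    for core, (busy_after, total_after) in after.items():
        busy_before, total_before = before.get(core, (0, 0))
        elapsed = total_after - total_before
        busy[core] = (busy_after - busy_before) / elapsed if elapsed > 0 else 0.0
    return busy


def sample_live(interval_s=5.0):
    """Take two snapshots of /proc/stat interval_s apart (Linux only)."""
    with open("/proc/stat") as f:
        first = f.read()
    time.sleep(interval_s)
    with open("/proc/stat") as f:
        second = f.read()
    return per_core_busy(first, second)
```

Calling sample_live() on both machines while the backup is running gives a dictionary of busy fractions per core. One core near 1.0 with the rest near 0 points at single-threaded compression rather than at the disks or the network.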
