MySQL
grabbee, 2016-11-07 11:15:33

Why is the MySQL replica so slow?

The replica cannot keep up with the master. The backlog grows every second and has now reached almost a day; it never shrinks, only grows. Even overnight, when the master was under minimal load, the backlog did not decrease. iowait is at 100% on one core. As I understand it, the replica applies changes in a single thread, so it saturates that core completely. According to the statistics, the replica manages only about 30 queries per second, while the master handles up to 1500 queries per second. The slave obviously cannot keep up. How can I find out why the replica is running so slowly?
Setup: MySQL 5.7 master-slave replication on Ubuntu 16.04.
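
To make the "only growth" observation concrete, here is a rough sketch of the arithmetic using the rates quoted in the question (30 per second on the replica, up to 1500 per second on the master). It assumes every statement counted on the master is a write that must be replicated, which is almost certainly an overestimate, since reads are never replicated. The function name `lag_growth_per_second` is made up for this illustration.

```python
def lag_growth_per_second(apply_rate: float, incoming_rate: float) -> float:
    """Seconds of extra lag accumulated per wall-clock second.

    Positive: the replica falls further behind.
    Negative: the replica is working off its backlog.
    """
    if incoming_rate <= 0:
        raise ValueError("incoming_rate must be positive")
    # In one second the replica replays apply_rate / incoming_rate seconds
    # of master activity; the rest is added to the lag.
    return 1.0 - apply_rate / incoming_rate


# Rates from the question. ASSUMPTION: all 1500 statements per second on
# the master are replicated writes (an upper bound).
growth = lag_growth_per_second(apply_rate=30, incoming_rate=1500)
print(f"lag grows by about {growth:.2f} s per second")
```

Under that assumption the lag grows by nearly a full second every second. The backlog can shrink only while the master produces fewer replicated writes per second than the replica can apply.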

SHOW SLAVE STATUS\G
                  Master_User: slave
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: db-ssd-bin.000261
          Read_Master_Log_Pos: 428260237
               Relay_Log_File: db-slave2-relay-bin.000007
                Relay_Log_Pos: 87052675
        Relay_Master_Log_File: db-ssd-bin.000258
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 611339749
              Relay_Log_Space: 2847712884
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 78349
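
The status output above can be summarised with a few lines of Python. This is only a sketch: it parses a pasted excerpt of the text rather than querying the server, and the helper names `parse_status` and `binlog_index` are made up for the example.

```python
# Excerpt of the replica status quoted above.
STATUS = """
              Master_Log_File: db-ssd-bin.000261
        Relay_Master_Log_File: db-ssd-bin.000258
        Seconds_Behind_Master: 78349
"""


def parse_status(text: str) -> dict:
    """Turn 'Key: value' lines into a dict; empty values become ''."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def binlog_index(name: str) -> int:
    """Numeric suffix of a binary log name, e.g. 'db-ssd-bin.000261' -> 261."""
    return int(name.rsplit(".", 1)[1])


status = parse_status(STATUS)
lag_hours = int(status["Seconds_Behind_Master"]) / 3600
# Master_Log_File: the master binlog the I/O thread is currently reading.
# Relay_Master_Log_File: the master binlog the SQL thread is executing.
files_behind = (binlog_index(status["Master_Log_File"])
                - binlog_index(status["Relay_Master_Log_File"]))
print(f"lag: {lag_hours:.1f} h; SQL thread is {files_behind} binlog files behind")
```

With the values shown, the lag is roughly 21.8 hours, and the SQL thread is three binary log files behind the I/O thread. That gap suggests the events are being downloaded in time but applied slowly.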

SHOW FULL PROCESSLIST;
|  1 | system user |           | NULL | Connect |  2225 | Waiting for master to send event | NULL                  |
|  2 | system user |           | NULL | Connect | 78999 | System lock                      | NULL                  |

iotop
Total DISK READ :      50.26 K/s | Total DISK WRITE :     299.20 K/s
Actual DISK READ:      50.26 K/s | Actual DISK WRITE:    1116.20 K/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                                                           
  180 be/3 root        0.00 B/s 1634.01 B/s  0.00 % 99.99 % [jbd2/vda1-8]
17723 be/4 mysql      40.69 K/s  144.41 K/s  0.00 %  5.89 % mysqld

atop
PRC | sys    0.30s |  user   0.18s |              |  #proc    125 | #trun      1 | #tslpi   166  | #tslpu     2 | #zombie    0  | clones     0 |               | #exit      0 |
CPU | sys       3% |  user      2% | irq       0% |               | idle    191% | wait    105%  |              | steal     0%  | guest     0% | curf 3.29GHz  | curscal   ?% |
cpu | sys       2% |  user      1% | irq       0% |               | idle     97% | cpu001 w  1%  |              | steal     0%  | guest     0% | curf 3.29GHz  | curscal   ?% |
cpu | sys       1% |  user      1% | irq       0% |               | idle     93% | cpu000 w  5%  |              | steal     0%  | guest     0% | curf 3.29GHz  | curscal   ?% |
cpu | sys       0% |  user      0% | irq       0% |               | idle      1% | cpu002 w 98%  |              | steal     0%  | guest     0% | curf 3.29GHz  | curscal   ?% |
CPL | avg1    2.03 |  avg5    2.32 |              |  avg15   2.30 |              |               | csw    14825 | intr    6458  |              |               | numcpu     3 |
MEM | tot     3.9G |  free  948.3M | cache   1.7G |  dirty   0.4M | buff  135.9M |               | slab   94.4M |               |              |               |              |
SWP | tot     0.0M |  free    0.0M |              |               |              |               |              |               |              | vmcom   3.3G  | vmlim   1.9G |
DSK |          vda |  busy     99% | read      30 |  write   1040 | KiB/r     11 |               | KiB/w     10 | MBr/s   0.03  | MBw/s   1.02 | avq     1.16  | avio 9.25 ms |
NET | transport    |  tcpi     707 | tcpo     707 |  udpi       0 | udpo       0 | tcpao      0  | tcppo      0 | tcprs      0  | tcpie      0 | tcpor      0  | udpip      0 |
NET | network      |  ipi      717 | ipo      707 |  ipfrw      0 | deliv    707 |               |              |               |              | icmpi      0  | icmpo      0 |
NET | ens3    ---- |  pcki     803 | pcko     707 |  si  347 Kbps | so   37 Kbps | coll       0  | mlti       0 | erri       0  | erro       0 | drpi       0  | drpo       0 |

  PID     RUID          EUID          THR       SYSCPU      USRCPU       VGROW      RGROW       RDDSK      WRDSK      ST     EXC      S     CPUNR       CPU     CMD         1/1
17688     mysql         mysql          35        0.27s       0.18s          0K         0K        336K      2880K      --       -      S         0        5%     mysqld
  180     root          root            1        0.02s       0.00s          0K         0K          0K        88K      --       -      D         2        0%     jbd2/vda1-8
  632     root          root            1        0.01s       0.00s          0K         0K          0K         0K      --       -      S         2        0%     kworker/2:1H
23416     root          root            1        0.00s       0.00s          0K         0K          0K         0K      --       -      R         0        0%     atop
  225     root          root            1        0.00s       0.00s          0K         0K          0K         0K      --       -      S         0        0%     systemd-journa
  481     syslog        syslog          4        0.00s       0.00s          0K         0K          0K         8K      --       -      S         2        0%     rsyslogd
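
As a sanity check on the DSK line of the atop output above, the figures can be cross-checked against each other. This sketch assumes the sample covers atop's default 10-second interval, which the output itself does not state.

```python
# Values copied from the DSK line of the atop output.
reads, writes = 30, 1040      # requests completed during the sample
kib_per_write = 10            # KiB/w column
avio_ms = 9.25                # avio: average time per request, in ms
interval_s = 10               # ASSUMPTION: atop's default sampling interval

requests = reads + writes
iops = requests / interval_s
# Share of the interval the disk spent servicing requests.
busy_fraction = requests * avio_ms / 1000 / interval_s
write_mib_per_s = writes * kib_per_write / 1024 / interval_s

print(f"{iops:.0f} requests/s, disk busy {busy_fraction:.0%}, "
      f"writes {write_mib_per_s:.2f} MiB/s")
```

Under that assumption the numbers agree with the reported `busy 99%` and `MBw/s 1.02`: the disk is saturated by about a hundred small (about 10 KiB) writes per second, not by throughput. That pattern is consistent with many small synchronous writes, although the output alone does not prove where they come from.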


2 answers
Max, 2016-11-07
@grabbee


180 be/3 root 0.00 B/s 1634.01 B/s 0.00% 99.99% [jbd2/vda1-8]
The filesystem journal thread (jbd2) is at 99.99% I/O.
Something is writing to disk intensively. A few options:
- disable journaling (a bad option)
- change the fsync interval (also not great; only if the server is reliable and on a UPS)
- look at the process list again; something is misbehaving. Can you post the output of ps axf?
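
The answer does not say which settings control the "fsync interval". One plausible reading, and it is only an assumption, is the MySQL durability variables `innodb_flush_log_at_trx_commit`, `sync_binlog` and `sync_relay_log`. The sketch below takes a dict of variable values (for example, copied from `SHOW VARIABLES`) and lists the ones that force a flush on every commit. The function name `fsync_notes` and the sample values are hypothetical.

```python
def fsync_notes(variables: dict) -> list:
    """List settings that force a disk flush on every transaction or event.

    Missing keys fall back to the MySQL 5.7 defaults.
    """
    notes = []
    if str(variables.get("innodb_flush_log_at_trx_commit", "1")) == "1":
        notes.append("innodb_flush_log_at_trx_commit=1: "
                     "redo log is flushed to disk at every commit")
    # sync_binlog only matters when binary logging is enabled.
    log_bin_on = str(variables.get("log_bin", "OFF")).upper() in ("ON", "1")
    if log_bin_on and str(variables.get("sync_binlog", "1")) == "1":
        notes.append("sync_binlog=1: binary log is synced at every commit")
    if str(variables.get("sync_relay_log", "10000")) == "1":
        notes.append("sync_relay_log=1: relay log is synced after every event")
    return notes


# Hypothetical values, NOT taken from the asker's server.
example = {"innodb_flush_log_at_trx_commit": "1", "log_bin": "OFF"}
for note in fsync_notes(example):
    print(note)
```

Relaxing any of these trades durability for fewer flushes: after a crash or power loss the replica can lose the most recent transactions. That is why the answer limits this option to a reliable server with a UPS.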

Fixid, 2016-11-07
@Fixid

Please post the output of atop and iotop.
