【MySQL】Mysql5.7.21 传统复制切换到gtid复制遇到的一个现象

2019-06-02 15:29:53 浏览数 (1)

说明:

系统:centos7 主库 M:192.168.16.12:3306 从库 S:192.168.16.15:3306 主从复制:传统复制

一、场景

M、S目前基于log、pos点的复制,想要将其在线切换为基于GTID的复制, 操作步骤(M、S都需执行,哪个先不影响):

代码语言:javascript复制
1、M、S:mysql> set @@global.enforce_gtid_consistency=warn;
2、M、S:mysql> set @@global.enforce_gtid_consistency=on;
3、M、S:mysql> SET @@GLOBAL.GTID_MODE = OFF_PERMISSIVE;
4、M、S:mysql> SET @@GLOBAL.GTID_MODE = on_permissive;
5、M、S:Mysql> SHOW STATUS LIKE 'ONGOING_ANONYMOUS_TRANSACTION_COUNT'; //是否=0
6、M、S:mysql> SET @@GLOBAL.GTID_MODE = ON;
7、M、S:修改配置文件

配置完成后查看slave状态:

代码语言:javascript复制
mysql> show slave statusG
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.16.12
                  Master_User: slave
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000079
          Read_Master_Log_Pos: 584132
               Relay_Log_File: relay-log.000001
                Relay_Log_Pos: 4
        Relay_Master_Log_File: mysql-bin.000079
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 584132
              Relay_Log_Space: 4
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1
                  Master_UUID: 58343797-e7f4-11e8-9c78-246e968f00b0
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: System lock
           Master_Retry_Count: 86400
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 58343797-e7f4-11e8-9c78-246e968f00b0:1-442       
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

二、现象

1、连续查看好多次slave的状态,这几个点都一直不变(此时master的这几个点一直在变大,gtid增加到了800多)

代码语言:javascript复制
Master_Log_File: mysql-bin.000079
Read_Master_Log_Pos: 584132
Executed_Gtid_Set: 58343797-e7f4-11e8-9c78-246e968f00b0:1-442 

2、连续查看好多次slave的状态,发现

Slave_IO_Running: 由Yes变为 Connecting

3、slave的error日志信息:

代码语言:javascript复制
2018-11-21T10:00:46.981201 08:00 2803 [Note] Error reading relay log event for channel '': slave SQL thread was killed
2018-11-21T10:00:46.981503 08:00 2802 [Note] Slave I/O thread killed while waiting to reconnect after a failed read for channel ''
2018-11-21T10:00:46.981522 08:00 2802 [Note] Slave I/O thread exiting for channel '', read up to log 'mysql-bin.000079', position 584132
2018-11-21T10:00:58.950843 08:00 2808 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2018-11-21T10:00:58.951239 08:00 2809 [Note] Slave SQL thread for channel '' initialized, starting replication in log 'mysql-bin.000079' at position 584132, relay log '/data/mysql/relay-log.000001' position: 4
2018-11-21T10:00:58.951370 08:00 2808 [Note] Slave I/O thread for channel '': connected to master 'slave@192.168.16.12:3306',replication started in log 'mysql-bin.000079' at position 584132

4、查看master的进程:有多个slave复制账户的连接

master errror log:

代码语言:javascript复制
[Note] While initializing dump thread for slave with UUID <30525ae8-e7f4-11e8-9af5-246e968f0048>, found a zombie dump thread with the same UUID. Master is killing the zombie dump thread(9549).
2018-11-21T09:59:20.506354 08:00 2168119 [Note] While initializing dump thread for slave with UUID <30525ae8-e7f4-11e8-9af5-246e968f0048>, found a zombie dump thread with the same UUID. Master is killing the zombie dump thread(2167205).
2018-11-21T10:03:42.429940 08:00 2168119 [Note] Start binlog_dump to master_thread_id(2168119) slave_server(2), pos(, 4)

三、原因

此时master可能执行了一个大的insert or update,超过了30s master没有推送binlog给slave。导致slave重连。 slave 过了 slave-net-timeout(30)秒还没有收到主库来的数据,它就会开始第一次重试。重试的过程中,连上了主库,那么它认为当前主库是好的 然后再过 30s 则再一次重连主库。 所以此时才看到master上有多个slave的进程。

四、解决方法

将slave-net-timeout调大,设置为1小时 mysql> set global slave-net-time=3600;

五、错误的做法

看到二 出现的现象后,做了如下操作:

代码语言:javascript复制
stop slave; 
change master to master_user='192.168.16.12',master_port=3306,master_user='slave',master_password='',master_auto_position=1; 
start slave;

此时查看slave的状态如下:

代码语言:javascript复制
mysql> show slave statusG
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.16.12
                  Master_User: slave
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: 
          Read_Master_Log_Pos: 
               Relay_Log_File: relay-log.000001
                Relay_Log_Pos: 4
        Relay_Master_Log_File: mysql-bin.000079
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
             Master_Server_Id: 1
                  Master_UUID: 58343797-e7f4-11e8-9c78-246e968f00b0
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: System lock
           Master_Retry_Count: 86400
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 58343797-e7f4-11e8-9c78-246e968f00b0:1-442       
                Auto_Position: 1
1 row in set (0.00 sec)

slave无法找到master的file、pos位置点,会从master最开始的binlog进行读取,slave的日志如下:

代码语言:javascript复制
 [Note] Slave SQL thread for channel '' initialized, starting replication in log 'FIRST' at position 0, relay log '/data/mysql/relay-log.000001' position: 4
2018-11-21T10:05:47.912953 08:00 2820 [Note] Slave I/O thread for channel '': connected to master 'slave@192.168.16.12:3306',replication started in log 'FIRST' at position 4
2018-11-21T10:57:52.722093 08:00 31 [ERROR] Error reading packet from server for channel '': Cannot replicate anonymous transaction when AUTO_POSITION = 1, at file
 /data/mysql/mysql-bin.000003, position 194.; the first event '' at 4, the last event read from '/data/mysql/mysql-bin.000003' at 259, the last b
yte read from '/data/mysql/mysql-bin.000003' at 259. (server_errno=1236)
2018-11-21T10:57:52.722133 08:00 31 [ERROR] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Cannot replicate anonymo
us transaction when AUTO_POSITION = 1, at file /data/mysql/mysql-bin.000003, position 194.; the first event '' at 4, the last event read from '/data/mysql
/mysql-bin.000003' at 259, the last byte read from '/data/mysql/mysql-bin.000003' at 259.', Error_code: 1236

此时需要先关闭auto_position,再设置file、pos点,最后再开启auto_position=1,具体如下:

代码语言:javascript复制
set global slave-net-time=3600;
stop slave;
change master  to master_auto_position=0;
change master to master_user='192.168.16.12',master_port=3306,master_user='slave',master_password='',master_log_file=mysql-bin.000079,master_log_pos=584132;
change master  to master_auto_position=1;
start slave;

再次查看slave的状态,可以看到状态正常:

代码语言:javascript复制
mysql> show slave statusG
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.16.12
                  Master_User: slave
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000079
          Read_Master_Log_Pos: 175299608
               Relay_Log_File: relay-log.000002
                Relay_Log_Pos: 34778333
        Relay_Master_Log_File: mysql-bin.000079
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 35362051
              Relay_Log_Space: 174716091
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 3317
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1
                  Master_UUID: 58343797-e7f4-11e8-9c78-246e968f00b0
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: System lock
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 58343797-e7f4-11e8-9c78-246e968f00b0:443-20481
            Executed_Gtid_Set: 58343797-e7f4-11e8-9c78-246e968f00b0:1-4431
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

0 人点赞