Skip to main content

Mysql半同步和数据一致性修复Percona-Toolkit

Mysql 半同步、一致性检查、数据库复制过滤

半同步配置

安装插件

在主从上都进行如下操作

检查动态加载是否可用:
mysql> select @@have_dynamic_loading;  
+------------------------+
| @@have_dynamic_loading |
+------------------------+
| YES                    |
+------------------------+
1 row in set (0.00 sec)

mysql>  
主服务器安装插件:
mysql> install plugin rpl_semi_sync_master soname 'semisync_master.so';  
Query OK, 0 rows affected (0.02 sec)  
从服务器安装插件:
mysql> install plugin rpl_semi_sync_slave soname 'semisync_slave.so';  
Query OK, 0 rows affected (0.00 sec)  
检查插件:
mysql> show plugins;  
+----------------------------+----------+--------------------+--------------------+---------+
| Name                       | Status   | Type               | Library            | License |
+----------------------------+----------+--------------------+--------------------+---------+
| binlog                     | ACTIVE   | STORAGE ENGINE     | NULL               | GPL     |
...
| rpl_semi_sync_master       | ACTIVE   | REPLICATION        | semisync_master.so | GPL     |
+----------------------------+----------+--------------------+--------------------+---------+
45 rows in set (0.01 sec)  

开启功能

配置功能

Master:

开启功能
mysql> set global rpl_semi_sync_master_enabled=1;  
Query OK, 0 rows affected (0.00 sec)  
设置超时时间
mysql> set global rpl_semi_sync_master_timeout=1000;  
Query OK, 0 rows affected (0.00 sec)

mysql> show global variables like '%rpl%';  
+-------------------------------------------+------------+
| Variable_name                             | Value      |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled              | ON         |
| rpl_semi_sync_master_timeout              | 1000       |
| rpl_semi_sync_master_trace_level          | 32         |
| rpl_semi_sync_master_wait_for_slave_count | 1          |
| rpl_semi_sync_master_wait_no_slave        | ON         |
| rpl_semi_sync_master_wait_point           | AFTER_SYNC |
| rpl_stop_slave_timeout                    | 31536000   |
+-------------------------------------------+------------+
7 rows in set (0.01 sec)  
mysql>  

Slave:

mysql> set global rpl_semi_sync_slave_enabled=1;  
Query OK, 0 rows affected (0.00 sec)

mysql> stop slave;  
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> start slave;  
Query OK, 0 rows affected (0.01 sec)

mysql>

rpl_semi_sync_master_enabled和rpl_semi_sync_slave_enabled同样可以在配置文件中设置

卸载插件

uninstall plugin rpl_semi_sync_master;  
uninstall plugin rpl_semi_sync_slave;  

状态检查

mysql> Show status like '%semi%';  
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 1     |
| Rpl_semi_sync_master_net_avg_wait_time     | 0     |
| Rpl_semi_sync_master_net_wait_time         | 0     |
| Rpl_semi_sync_master_net_waits             | 1     |
| Rpl_semi_sync_master_no_times              | 0     |
| Rpl_semi_sync_master_no_tx                 | 0     |
| Rpl_semi_sync_master_status                | ON    |
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 400   |
| Rpl_semi_sync_master_tx_wait_time          | 400   |
| Rpl_semi_sync_master_tx_waits              | 1     |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 1     |
| Rpl_semi_sync_slave_status                 | ON    |
+-------------------------

Rpl_semi_sync_master_clients

记录支持半同步的slave的个数

Rpl_semi_sync_master_status

标记master现在是否是半同步复制状态

Rpl_semi_sync_master_no_tx

Master上多少提交的请求没收到slave的响应

Rpl_semi_sync_master_yes_tx

Master上多少提交的请求收到slave的响应

Rpl_semi_sync_slave_status

标记slave现在是否是半同步复制状态

Rpl_semi_sync_master_net_waits

Master等待slave回复总的等待次数

mysql> Show variables like '%semi%';  
+-------------------------------------------+------------+
| Variable_name                             | Value      |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled              | ON         |
| rpl_semi_sync_master_timeout              | 1000       |
| rpl_semi_sync_master_trace_level          | 32         |
| rpl_semi_sync_master_wait_for_slave_count | 1          |
| rpl_semi_sync_master_wait_no_slave        | ON         |
| rpl_semi_sync_master_wait_point           | AFTER_SYNC |
+-------------------------------------------+------------+
6 rows in set (0.00 sec)  

rpl_semi_sync_master_timeout

Master等待Slave返回的超时时间

半同步的问题

  1. 半同步在等待同步返回结果超时的时候会切换成为异步模式,而如果频繁的在同步和异步中切换,会导致数据丢失。
  2. 半同步不适合于高并发环境下的操作。

利用pt工具检查数据一致性并修正一致性问题

当数据库出现大量不一致导致主从无法同步的时候,我们可以使用pt-table-checksum和pt-table-sync工具来进行错误的修复工作。

安装

Yum

yum install -y percona-toolkit  

Brew

brew install percona-toolkit  

Compile

wget https://www.percona.com/downloads/percona-toolkit/2.2.19/tarball/percona-toolkit-2.2.19.tar.gz  
tar xvzf percona-toolkit-2.2.19.tar.gz  
cd percona-toolkit-2.2.19  
perl Makefile.PL  
make  
make test  
make install  

制造一个错误

主服务器:23.83.250.18
从服务器:104.224.132.213

在主服务器上创建数据库插入数据

mysql> create database test;  
Query OK, 1 row affected (0.00 sec)

mysql> use test;  
Database changed  
mysql> create table log(id int(11) auto_increment, val int(11) not null, primary key(id));  
Query OK, 0 rows affected (0.01 sec)

mysql> show tables;  
+----------------+
| Tables_in_test |
+----------------+
| log            |
+----------------+
1 row in set (0.00 sec)

mysql>  
mysql> insert into log(val) values(1),(2),(3),(4),(5),(6);  
Query OK, 6 rows affected (0.04 sec)  
Records: 6  Duplicates: 0  Warnings: 0

mysql> select * from log;  
+----+-----+
| id | val |
+----+-----+
|  1 |   1 |
|  2 |   2 |
|  3 |   3 |
|  4 |   4 |
|  5 |   5 |
|  6 |   6 |
+----+-----+
6 rows in set (0.00 sec)

mysql>  

在从服务器上插入脏数据并删除数据

mysql> select * from log;  
+----+-----+
| id | val |
+----+-----+
|  1 |   1 |
|  2 |   2 |
|  3 |   3 |
|  4 |   4 |
|  5 |   5 |
|  6 |   6 |
+----+-----+
6 rows in set (0.01 sec)

mysql> insert into log(val) values(7),(8),(9);  
Query OK, 3 rows affected (0.00 sec)  
Records: 3  Duplicates: 0  Warnings: 0

mysql> delete from log where id=5;  
Query OK, 1 row affected (0.00 sec)

mysql> select * from log;  
+----+-----+
| id | val |
+----+-----+
|  1 |   1 |
|  2 |   2 |
|  3 |   3 |
|  4 |   4 |
|  6 |   6 |
|  7 |   7 |
|  8 |   8 |
|  9 |   9 |
+----+-----+
8 rows in set (0.00 sec)  

此时我们对Master进行数据的插入和删除

mysql> insert into log(val) values(11),(12),(13);  
Query OK, 3 rows affected (0.00 sec)  
Records: 3  Duplicates: 0  Warnings: 0

mysql> delete from log where id=5;  
Query OK, 1 row affected (0.00 sec)

mysql>  

再查看Slave的状态

*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 23.83.250.18
                  Master_User: fgy58963
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: 3306-mysql-bin.000002
          Read_Master_Log_Pos: 2209
               Relay_Log_File: 3306-relay-bin.000002
                Relay_Log_Pos: 1896
        Relay_Master_Log_File: 3306-mysql-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 1062
                   Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 5 failed executing transaction 'a60f0f4b-73dd-11e6-b50f-126ade1ae73b:8' at master log 3306-mysql-bin.000002, end_log_pos 1919. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 1673
              Relay_Log_Space: 2638
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No  
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 1062
               Last_SQL_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 5 failed executing transaction 'a60f0f4b-73dd-11e6-b50f-126ade1ae73b:8' at master log 3306-mysql-bin.000002, end_log_pos 1919. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 330601
                  Master_UUID: a60f0f4b-73dd-11e6-b50f-126ade1ae73b
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State:
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp: 160905 23:17:20
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: a60f0f4b-73dd-11e6-b50f-126ade1ae73b:1-9
            Executed_Gtid_Set: a4084a2f-73dd-11e6-b452-094953c3daf9:1-6,
a60f0f4b-73dd-11e6-b50f-126ade1ae73b:1-7  
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.00 sec)  

我们可以看到出现错误1062,同步停止,同时Slave无法继续启动。
假设当前情况是服务器存在大量不一致的问题,我们使用pt-table系列工具进行解决。

修复错误

首先进行检查

一般情况下,因为权限限制问题,检查用的用户没有创建表的权限,所以最好我们先创建好表和数据库。这一步最好在程序完成之后就做好,方便同步到从库,避免二次创建。

CREATE TABLE `test` (  
  `db` char(64) NOT NULL,
  `tbl` char(64) NOT NULL,
  `chunk` int(11) NOT NULL,
  `chunk_time` float DEFAULT NULL,
  `chunk_index` varchar(200) DEFAULT NULL,
  `lower_boundary` text,
  `upper_boundary` text,
  `this_crc` char(40) NOT NULL,
  `this_cnt` int(11) NOT NULL,
  `master_crc` char(40) DEFAULT NULL,
  `master_cnt` int(11) DEFAULT NULL,
  `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`db`,`tbl`,`chunk`),
  KEY `ts_db_tbl` (`ts`,`db`,`tbl`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

然后我们尝试执行:

pt-table-checksum --nocheck-replication-filters --nocheck-binlog-format --replicate=ptcheck.test --databases=test --recursion-method="processlist" -h 23.83.250.18 -u fgy58963 -P 3306 -p xiaocan  

这时候如果看到卡在下面:

Waiting for the --replicate table to replicate to xxx.xxx  

或者其他类似的提示,一般是因为主从无法同步,导致无法统计数据。此时我们需要先将错误全部跳过,保证主从可以同步。
修改配置文件,忽略所有错误:

---->my.cnf<----
slave_skip_errors=all  

然后重新启动mysql并启动从库,检查状态是否是正常:

mysql> start slave;  
Query OK, 0 rows affected (0.00 sec)

mysql> show slave status\G  
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 23.83.250.18
                  Master_User: fgy58963
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: 3306-mysql-bin.000004
          Read_Master_Log_Pos: 5374
               Relay_Log_File: 3306-relay-bin.000007
                Relay_Log_Pos: 469
        Relay_Master_Log_File: 3306-mysql-bin.000004
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 5374
              Relay_Log_Space: 6335
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No  
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 330601
                  Master_UUID: a60f0f4b-73dd-11e6-b50f-126ade1ae73b
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: a60f0f4b-73dd-11e6-b50f-126ade1ae73b:1-24
            Executed_Gtid_Set: a4084a2f-73dd-11e6-b452-094953c3daf9:1-10,
a60f0f4b-73dd-11e6-b50f-126ade1ae73b:1-24  
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.00 sec)

mysql>  

确认主从同步正常后,我们再次执行命令检查:

[root@localhost ~]# pt-table-checksum --nocheck-replication-filters --nocheck-binlog-format --replicate=ptcheck.test --databases=test --recursion-method="processlist" -h 主库ip -u fgy58963 -P 3306 -p xiaocan
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
09-06T10:58:01      0      1        8       1       0   0.127 test.log  
[root@localhost ~]#

会发现提示有DIFFS,即数据不一致。此时,我们就需要利用sync工具来修复错误:

pt-table-sync --charset=utf8 --replicate=ptcheck.test --databases=test --recursion-method=processlist h=23.83.250.18,u=fgy58963,P=3306,p='xiaocan' --execute  

当然,可以打印先查看错误有哪些,并且sync是如何纠正的,以免出现更严重的问题:

[root@localhost ~]# pt-table-sync --charset=utf8 --replicate=ptcheck.test --databases=test --recursion-method=processlist h=主库IP,u=fgy58963,P=3306,p='xiaocan' --print
REPLACE INTO `test`.`log`(`id`, `val`) VALUES ('7', '11') /*percona-toolkit src_db:test src_tbl:log src_dsn:A=utf8,P=3306,h=23.83.250.18,p=...,u=fgy58963 dst_db:test dst_tbl:log dst_dsn:A=utf8,P=3306,h=104.224.132.213,p=...,u=fgy58963 lock:1 transaction:1 changing_src:ptcheck.test replicate:ptcheck.test bidirectional:0 pid:17363 user:root host:localhost.localdomain*/;  
REPLACE INTO `test`.`log`(`id`, `val`) VALUES ('8', '12') /*percona-toolkit src_db:test src_tbl:log src_dsn:A=utf8,P=3306,h=23.83.250.18,p=...,u=fgy58963 dst_db:test dst_tbl:log dst_dsn:A=utf8,P=3306,h=104.224.132.213,p=...,u=fgy58963 lock:1 transaction:1 changing_src:ptcheck.test replicate:ptcheck.test bidirectional:0 pid:17363 user:root host:localhost.localdomain*/;  
REPLACE INTO `test`.`log`(`id`, `val`) VALUES ('9', '13') /*percona-toolkit src_db:test src_tbl:log src_dsn:A=utf8,P=3306,h=23.83.250.18,p=...,u=fgy58963 dst_db:test dst_tbl:log dst_dsn:A=utf8,P=3306,h=104.224.132.213,p=...,u=fgy58963 lock:1 transaction:1 changing_src:ptcheck.test replicate:ptcheck.test bidirectional:0 pid:17363 user:root host:localhost.localdomain*/;  
[root@localhost ~]#

此时再手动检查数据的一致性:
Master:

mysql> select * from test.log;  
+----+-----+
| id | val |
+----+-----+
|  1 |   1 |
|  2 |   2 |
|  3 |   3 |
|  4 |   4 |
|  6 |   6 |
|  7 |  11 |
|  8 |  12 |
|  9 |  13 |
| 10 | 111 |
+----+-----+
9 rows in set (0.00 sec)  

Slave:

mysql> select * from test.log;  
+----+-----+
| id | val |
+----+-----+
|  1 |   1 |
|  2 |   2 |
|  3 |   3 |
|  4 |   4 |
|  6 |   6 |
|  7 |  11 |
|  8 |  12 |
|  9 |  13 |
| 10 | 111 |
+----+-----+
9 rows in set (0.00 sec)  

然后去除Slave配置文件中的slaveskiperrors=all,并启动检查slave状态(最好插入数据验证):

start slave;  
show slave status\G  

pt-table-checksum常用参数

--nocheck-replication-filters

不检查复制过滤器,检测中忽略mysql配置参数binlogignoredb等,建议启用。后面可以用--databases来指定需要检查的数据库。

--no-check-binlog-format

不检查复制的binlog模式,要是binlog模式是ROW,则会报错。

--replicate-check-only

只显示不同步的信息。

--replicate=

把checksum的信息写入到指定的库中的表中,该选项会默认自动创建 percona.checksums 表,默认存储到percona库的checksums表中。

--databases=

指定需要被检查的数据库,多个则用逗号隔开

--tables=

指定需要被检查的表, 多个则用逗号隔开

--host=127.0.0.1 或-h127.0.0.1 或 h=127.0.0.1

Master的地址

--user=root 或-uroot 或 u=root

用户名

--password=123456 或-p123456 或 p=123456

密码,如有特殊字符时需要使用引号

--port=3306 或-P3306 或 P=3306

端口

--socket=

指定socket接口路径

采用最右边格式时,中间需要加逗号隔开h=127.0.0.1,u=root,p='123456',P=3306 ,左边两种则不用。

--ask-pass

不在命令行指定密码,手动输入
--chunk-size, --chunk-size-limit
用于指定检测块的大小。 可控性更强
--ignore-databases/tables/column
跳出指定元素的过滤
--lock-wait-timeout
innodb 锁的超时设定, 默认为1
--max-load
设置最大并发连接数
--replicate-check-only
只输出数据不一致的信息。

pt-table-sync常用参数

参数基本上与checksum相同,没有--nocheck-replication-filters和--no-check-binlog-format

--print

将修复的SQL语句打印输出

--execute

直接执行SQL语句进行修复

pt-table-checksum表结构

如果不给创建表和数据库的权限,需要先自己准备好表和数据库:

CREATE TABLE `test` (  
  `db` char(64) NOT NULL,
  `tbl` char(64) NOT NULL,
  `chunk` int(11) NOT NULL,
  `chunk_time` float DEFAULT NULL,
  `chunk_index` varchar(200) DEFAULT NULL,
  `lower_boundary` text,
  `upper_boundary` text,
  `this_crc` char(40) NOT NULL,
  `this_cnt` int(11) NOT NULL,
  `master_crc` char(40) DEFAULT NULL,
  `master_cnt` int(11) DEFAULT NULL,
  `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`db`,`tbl`,`chunk`),
  KEY `ts_db_tbl` (`ts`,`db`,`tbl`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

主从同步对数据库进行过滤

主从同步的时候,有时候会有指定同步数据库的需求。这时候需要对数据库进行过滤。
如果要对数据库进行过滤,可以在配置文件中添加如下字段以达到效果:
replicate-do-db

配置slave只复制某个db

replicate-ignore-db

配置slave只忽略某个db

replicate-do-table

配置slave只复制某个table

replicate-ignore-table

配置slave只复制某个table

replicate-wild-do-table

支持通配符,配置slave只复制table (replicate-wild-do-table=db1.t*)

replicate-wild-ignore-table

支持通配符,配置slave只忽略table (replicate-wild-ignore-table=db1.s.*)

repicate-rewrite-db

如果master和slave有同名的A库了,那么可以采用rewrite规则,用master的A库复制到slave的B库(语法:master replicate-rewrite-db=a->b)

replicate-do-db

配置slave只复制某个db

replicate-ignore-db

配置slave只忽略某个db

replicate-do-table

配置slave只复制某个table

replicate-ignore-table

配置slave只复制某个table

replicate-wild-do-table

支持通配符,配置slave只复制table (replicate-wild-do-table=db1.t*)

replicate-wild-ignore-table

支持通配符,配置slave只忽略table (replicate-wild-ignore-table=db1.s.*)

repicate-rewrite-db

如果master和slave有同名的A库了,那么可以采用rewrite规则,用master的A库复制到slave的B库(语法:master replicate-rewrite-db=a->b)

用法

首先,考虑到Master以后的同步数据问题和不可停机的问题,对于同步数据库的过滤应当在从库进行配置。
其次,配置从库不应使用do系列配置来只复制某个数据库,而应该使用ignore来忽略某个数据库。
然后,建议只使用replicate-wild-ignore-table和replicate-ignore-table来进行数据库的过滤。

在从库配置忽略数据库

Slave配置文件中加上:

忽略多个数据库要分开写
replicate-wild-ignore-table=ig.*  
replicate-wild-ignore-table=is.a.*  

然后启动数据库,观察是否主库的ig库和is库下a开头的table都被忽略了
Master:

insert into ig.test(val) values(1),(2),(3);select * from ig.test;

Query OK, 3 rows affected (0.00 sec)  
Records: 3  Duplicates: 0  Warnings: 0

+----+------+
| id | val  |
+----+------+
|  1 |    1 |
|  2 |    2 |
|  3 |    3 |
|  4 |    1 |
|  5 |    2 |
|  6 |    3 |
|  7 |    1 |
|  8 |    2 |
|  9 |    3 |
+----+------+
9 rows in set (0.00 sec)

mysql> insert into `is`.test(val) values(1),(2),(3);insert into `is`.aaatest(val) values(1),(2),(3);select * from `is`.test;select * from `is`.aaatest;

Query OK, 3 rows affected (0.01 sec)  
Records: 3  Duplicates: 0  Warnings: 0

Query OK, 3 rows affected (0.00 sec)  
Records: 3  Duplicates: 0  Warnings: 0

+----+------+
| id | val  |
+----+------+
|  1 |    1 |
|  2 |    2 |
|  3 |    3 |
+----+------+
3 rows in set (0.00 sec)

+----+------+
| id | val  |
+----+------+
|  1 |    1 |
|  2 |    2 |
|  3 |    3 |
+----+------+
3 rows in set (0.00 sec)

mysql>  

Slave:

mysql> select * from ig.test;  
+----+------+
| id | val  |
+----+------+
|  1 |    1 |
|  2 |    2 |
|  3 |    3 |
|  4 |    1 |
|  5 |    2 |
|  6 |    3 |
+----+------+
6 rows in set (0.00 sec)

mysql>  
mysql>  
mysql> select * from `is`.test;select * from `is`.aaatest;  
+----+------+
| id | val  |
+----+------+
|  1 |    1 |
|  2 |    2 |
|  3 |    3 |
+----+------+
3 rows in set (0.00 sec)

Empty set (0.00 sec)

mysql>  

主从同步顺序切换

现在有一台服务器M1->S1+S2,要切换成M1->S1->S2。操作顺序如下:
1. 在Master上创建一张空表
2. 在Slave上删除空表,制造同步冲突
3. 在Master上删除空表,此时冲突出现。Slave都停止在同一个位置
4. 更改S2为S1的从库,跳过冲突,恢复Slave

Binlog server

MySQL Binlog Server:是利用某个工具,把线上活跃的库的日志拉取到本地进行备份。在MySQL 5.6以后,可以利用mysqlbinlog这个命令去把远程机器的日志备份到本地目录,从而达到增量或是日志安全方面的备份。
首先确认Master上binlog日志的id,然后在备份服务器上启动mysqlbinlog:

mysqlbinlog -h 23.83.250.18 -u fgy58963 --password='xiaocan' -P 3306 --stop-never --raw --read-from-remote-master=BINLOG-DUMP-GTIDS 3306-mysql-bin.000001  

然后binlog就被同步过来了:

[root@localhost ~]# cd backup
[root@localhost backup]# ls
3306-mysql-bin.000001  3306-mysql-bin.000002  3306-mysql-bin.000003  3306-mysql-bin.000004  3306-mysql-bin.000005  
[root@localhost backup]#

可以通过nohup的方式使其在后台运行:

nohup mysqlbinlog -h 23.83.250.18 -u fgy58963 --password='xiaocan' -P 3306 --stop-never --raw --read-from-remote-master=BINLOG-DUMP-GTIDS 3306-mysql-bin.000001 &  

数据一致性保证

配置文件

Master:

每次提交都将数据刷新到磁盘
innodb_flush_log_at_trx_commit = 1  
每次事务的变更都将数据刷新到磁盘
sync_binlog = 1  

Slave:

将master.info保存在表中,使其成为事务
master_info_repository = "TABLE"  
将relay_log.info保存在表中,使其成为事务
relay_log_info_repository = "TABLE"  
使用relay log info的自动修复功能,避免主从停止
relay_log_recovery = 1  

人为防范

使用crond或者python或者任何后台可运行工具,定期对数据库执行一致性检查,发现异常立刻报警通知相关人员进行修复操作。

主从复制中需要注意的

文件

relay-log-info

记录SQL线程读取Master binlog的位置,用于Slave down机之后根据文件中记录的pos点恢复SQL线程

master-info

记录IO线程读取已经读取到的Master binlog位置,用于Slave down机之后IO线程根据文件中记录的pos点重新拉取binlog日志

参数

relay-log-info-repository

在使用这个参数后,syncrelaylog_info默认为1,且不能修改

master-info-repository

syncmasterinfo可以不用再进行设置,转而设置relaylogrecovery。
这个参数的含义是,当Shalve 重启之后会根据slaverelaylog_info表重新创建一个文件,SQL线程会根据这个文件进行恢复复制,IO线程会读取SQL线程的pos点,根据这个pos点向主库申请拉取数据。

relay_log_recovery = ON

开启relay log自动恢复的功能。