Mysql 半同步、一致性检查、数据库复制过滤
半同步配置
安装插件
在主从上都进行如下操作
检查动态加载是否可用:
mysql> select @@have_dynamic_loading;
+------------------------+
| @@have_dynamic_loading |
+------------------------+
| YES |
+------------------------+
1 row in set (0.00 sec)
mysql>
主服务器安装插件:
mysql> install plugin rpl_semi_sync_master soname 'semisync_master.so';
Query OK, 0 rows affected (0.02 sec)
从服务器安装插件:
mysql> install plugin rpl_semi_sync_slave soname 'semisync_slave.so';
Query OK, 0 rows affected (0.00 sec)
检查插件:
mysql> show plugins;
+----------------------------+----------+--------------------+--------------------+---------+
| Name | Status | Type | Library | License |
+----------------------------+----------+--------------------+--------------------+---------+
| binlog | ACTIVE | STORAGE ENGINE | NULL | GPL |
...
| rpl_semi_sync_master | ACTIVE | REPLICATION | semisync_master.so | GPL |
+----------------------------+----------+--------------------+--------------------+---------+
45 rows in set (0.01 sec)
开启功能
配置功能
Master:
开启功能
mysql> set global rpl_semi_sync_master_enabled=1;
Query OK, 0 rows affected (0.00 sec)
设置超时时间
mysql> set global rpl_semi_sync_master_timeout=1000;
Query OK, 0 rows affected (0.00 sec)
mysql> show global variables like '%rpl%';
+-------------------------------------------+------------+
| Variable_name | Value |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled | ON |
| rpl_semi_sync_master_timeout | 1000 |
| rpl_semi_sync_master_trace_level | 32 |
| rpl_semi_sync_master_wait_for_slave_count | 1 |
| rpl_semi_sync_master_wait_no_slave | ON |
| rpl_semi_sync_master_wait_point | AFTER_SYNC |
| rpl_stop_slave_timeout | 31536000 |
+-------------------------------------------+------------+
7 rows in set (0.01 sec)
mysql>
Slave:
mysql> set global rpl_semi_sync_slave_enabled=1;
Query OK, 0 rows affected (0.00 sec)
mysql> stop slave;
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> start slave;
Query OK, 0 rows affected (0.01 sec)
mysql>
rpl_semi_sync_master_enabled和rpl_semi_sync_slave_enabled同样可以在配置文件中设置
卸载插件
uninstall plugin rpl_semi_sync_master;
uninstall plugin rpl_semi_sync_slave;
状态检查
mysql> Show status like '%semi%';
+--------------------------------------------+-------+
| Variable_name | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients | 1 |
| Rpl_semi_sync_master_net_avg_wait_time | 0 |
| Rpl_semi_sync_master_net_wait_time | 0 |
| Rpl_semi_sync_master_net_waits | 1 |
| Rpl_semi_sync_master_no_times | 0 |
| Rpl_semi_sync_master_no_tx | 0 |
| Rpl_semi_sync_master_status | ON |
| Rpl_semi_sync_master_timefunc_failures | 0 |
| Rpl_semi_sync_master_tx_avg_wait_time | 400 |
| Rpl_semi_sync_master_tx_wait_time | 400 |
| Rpl_semi_sync_master_tx_waits | 1 |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0 |
| Rpl_semi_sync_master_wait_sessions | 0 |
| Rpl_semi_sync_master_yes_tx | 1 |
| Rpl_semi_sync_slave_status | ON |
+-------------------------
Rpl_semi_sync_master_clients
记录支持半同步的slave的个数
Rpl_semi_sync_master_status
标记master现在是否是半同步复制状态
Rpl_semi_sync_master_no_tx
Master上多少提交的请求没收到slave的响应
Rpl_semi_sync_master_yes_tx
Master上多少提交的请求收到slave的响应
Rpl_semi_sync_slave_status
标记slave现在是否是半同步复制状态
Rpl_semi_sync_master_net_waits
Master等待slave回复总的等待次数
mysql> Show variables like '%semi%';
+-------------------------------------------+------------+
| Variable_name | Value |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled | ON |
| rpl_semi_sync_master_timeout | 1000 |
| rpl_semi_sync_master_trace_level | 32 |
| rpl_semi_sync_master_wait_for_slave_count | 1 |
| rpl_semi_sync_master_wait_no_slave | ON |
| rpl_semi_sync_master_wait_point | AFTER_SYNC |
+-------------------------------------------+------------+
6 rows in set (0.00 sec)
rpl_semi_sync_master_timeout
Master等待Slave返回的超时时间
半同步的问题
- 半同步在等待同步返回结果超时的时候会切换成为异步模式,而如果频繁的在同步和异步中切换,会导致数据丢失。
- 半同步不适合于高并发环境下的操作。
利用pt工具检查数据一致性并修正一致性问题
当数据库出现大量不一致导致主从无法同步的时候,我们可以使用pt-table-checksum和pt-table-sync工具来进行错误的修复工作。
安装
Yum
yum install -y percona-toolkit
Brew
brew install percona-toolkit
Compile
wget https://www.percona.com/downloads/percona-toolkit/2.2.19/tarball/percona-toolkit-2.2.19.tar.gz
tar xvzf percona-toolkit-2.2.19.tar.gz
cd percona-toolkit-2.2.19
perl Makefile.PL
make
make test
make install
制造一个错误
主服务器:23.83.250.18
从服务器:104.224.132.213
在主服务器上创建数据库插入数据
mysql> create database test;
Query OK, 1 row affected (0.00 sec)
mysql> use test;
Database changed
mysql> create table log(id int(11) auto_increment, val int(11) not null, primary key(id));
Query OK, 0 rows affected (0.01 sec)
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| log |
+----------------+
1 row in set (0.00 sec)
mysql>
mysql> insert into log(val) values(1),(2),(3),(4),(5),(6);
Query OK, 6 rows affected (0.04 sec)
Records: 6 Duplicates: 0 Warnings: 0
mysql> select * from log;
+----+-----+
| id | val |
+----+-----+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
| 5 | 5 |
| 6 | 6 |
+----+-----+
6 rows in set (0.00 sec)
mysql>
在从服务器上插入脏数据并删除数据
mysql> select * from log;
+----+-----+
| id | val |
+----+-----+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
| 5 | 5 |
| 6 | 6 |
+----+-----+
6 rows in set (0.01 sec)
mysql> insert into log(val) values(7),(8),(9);
Query OK, 3 rows affected (0.00 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> delete from log where id=5;
Query OK, 1 row affected (0.00 sec)
mysql> select * from log;
+----+-----+
| id | val |
+----+-----+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
| 6 | 6 |
| 7 | 7 |
| 8 | 8 |
| 9 | 9 |
+----+-----+
8 rows in set (0.00 sec)
此时我们对Master进行数据的插入和删除
mysql> insert into log(val) values(11),(12),(13);
Query OK, 3 rows affected (0.00 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> delete from log where id=5;
Query OK, 1 row affected (0.00 sec)
mysql>
再查看Slave的状态
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 23.83.250.18
Master_User: fgy58963
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: 3306-mysql-bin.000002
Read_Master_Log_Pos: 2209
Relay_Log_File: 3306-relay-bin.000002
Relay_Log_Pos: 1896
Relay_Master_Log_File: 3306-mysql-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1062
Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 5 failed executing transaction 'a60f0f4b-73dd-11e6-b50f-126ade1ae73b:8' at master log 3306-mysql-bin.000002, end_log_pos 1919. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
Skip_Counter: 0
Exec_Master_Log_Pos: 1673
Relay_Log_Space: 2638
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1062
Last_SQL_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 5 failed executing transaction 'a60f0f4b-73dd-11e6-b50f-126ade1ae73b:8' at master log 3306-mysql-bin.000002, end_log_pos 1919. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
Replicate_Ignore_Server_Ids:
Master_Server_Id: 330601
Master_UUID: a60f0f4b-73dd-11e6-b50f-126ade1ae73b
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 160905 23:17:20
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: a60f0f4b-73dd-11e6-b50f-126ade1ae73b:1-9
Executed_Gtid_Set: a4084a2f-73dd-11e6-b452-094953c3daf9:1-6,
a60f0f4b-73dd-11e6-b50f-126ade1ae73b:1-7
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
我们可以看到出现错误1062,同步停止,同时Slave无法继续启动。
假设当前情况是服务器存在大量不一致的问题,我们使用pt-table系列工具进行解决。
修复错误
首先进行检查
一般情况下,因为权限限制问题,检查用的用户没有创建表的权限,所以最好我们先创建好表和数据库。这一步最好在程序完成之后就做好,方便同步到从库,避免二次创建。
CREATE TABLE `test` (
`db` char(64) NOT NULL,
`tbl` char(64) NOT NULL,
`chunk` int(11) NOT NULL,
`chunk_time` float DEFAULT NULL,
`chunk_index` varchar(200) DEFAULT NULL,
`lower_boundary` text,
`upper_boundary` text,
`this_crc` char(40) NOT NULL,
`this_cnt` int(11) NOT NULL,
`master_crc` char(40) DEFAULT NULL,
`master_cnt` int(11) DEFAULT NULL,
`ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`db`,`tbl`,`chunk`),
KEY `ts_db_tbl` (`ts`,`db`,`tbl`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
然后我们尝试执行:
pt-table-checksum --nocheck-replication-filters --nocheck-binlog-format --replicate=ptcheck.test --databases=test --recursion-method="processlist" -h 23.83.250.18 -u fgy58963 -P 3306 -p xiaocan
这时候如果看到卡在下面:
Waiting for the --replicate table to replicate to xxx.xxx
或者其他类似的提示,一般是因为主从无法同步,导致无法统计数据。此时我们需要先将错误全部跳过,保证主从可以同步。
修改配置文件,忽略所有错误:
---->my.cnf<----
slave_skip_errors=all
然后重新启动mysql并启动从库,检查状态是否是正常:
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 23.83.250.18
Master_User: fgy58963
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: 3306-mysql-bin.000004
Read_Master_Log_Pos: 5374
Relay_Log_File: 3306-relay-bin.000007
Relay_Log_Pos: 469
Relay_Master_Log_File: 3306-mysql-bin.000004
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 5374
Relay_Log_Space: 6335
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 330601
Master_UUID: a60f0f4b-73dd-11e6-b50f-126ade1ae73b
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: a60f0f4b-73dd-11e6-b50f-126ade1ae73b:1-24
Executed_Gtid_Set: a4084a2f-73dd-11e6-b452-094953c3daf9:1-10,
a60f0f4b-73dd-11e6-b50f-126ade1ae73b:1-24
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
mysql>
确认主从同步正常后,我们再次执行命令检查:
[root@localhost ~]# pt-table-checksum --nocheck-replication-filters --nocheck-binlog-format --replicate=ptcheck.test --databases=test --recursion-method="processlist" -h 主库ip -u fgy58963 -P 3306 -p xiaocan
TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE
09-06T10:58:01 0 1 8 1 0 0.127 test.log
[root@localhost ~]#
会发现提示有DIFFS,即数据不一致。此时,我们就需要利用sync工具来修复错误:
pt-table-sync --charset=utf8 --replicate=ptcheck.test --databases=test --recursion-method=processlist h=23.83.250.18,u=fgy58963,P=3306,p='xiaocan' --execute
当然,可以打印先查看错误有哪些,并且sync是如何纠正的,以免出现更严重的问题:
[root@localhost ~]# pt-table-sync --charset=utf8 --replicate=ptcheck.test --databases=test --recursion-method=processlist h=主库IP,u=fgy58963,P=3306,p='xiaocan' --print
REPLACE INTO `test`.`log`(`id`, `val`) VALUES ('7', '11') /*percona-toolkit src_db:test src_tbl:log src_dsn:A=utf8,P=3306,h=23.83.250.18,p=...,u=fgy58963 dst_db:test dst_tbl:log dst_dsn:A=utf8,P=3306,h=104.224.132.213,p=...,u=fgy58963 lock:1 transaction:1 changing_src:ptcheck.test replicate:ptcheck.test bidirectional:0 pid:17363 user:root host:localhost.localdomain*/;
REPLACE INTO `test`.`log`(`id`, `val`) VALUES ('8', '12') /*percona-toolkit src_db:test src_tbl:log src_dsn:A=utf8,P=3306,h=23.83.250.18,p=...,u=fgy58963 dst_db:test dst_tbl:log dst_dsn:A=utf8,P=3306,h=104.224.132.213,p=...,u=fgy58963 lock:1 transaction:1 changing_src:ptcheck.test replicate:ptcheck.test bidirectional:0 pid:17363 user:root host:localhost.localdomain*/;
REPLACE INTO `test`.`log`(`id`, `val`) VALUES ('9', '13') /*percona-toolkit src_db:test src_tbl:log src_dsn:A=utf8,P=3306,h=23.83.250.18,p=...,u=fgy58963 dst_db:test dst_tbl:log dst_dsn:A=utf8,P=3306,h=104.224.132.213,p=...,u=fgy58963 lock:1 transaction:1 changing_src:ptcheck.test replicate:ptcheck.test bidirectional:0 pid:17363 user:root host:localhost.localdomain*/;
[root@localhost ~]#
此时再手动检查数据的一致性:
Master:
mysql> select * from test.log;
+----+-----+
| id | val |
+----+-----+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
| 6 | 6 |
| 7 | 11 |
| 8 | 12 |
| 9 | 13 |
| 10 | 111 |
+----+-----+
9 rows in set (0.00 sec)
Slave:
mysql> select * from test.log;
+----+-----+
| id | val |
+----+-----+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
| 6 | 6 |
| 7 | 11 |
| 8 | 12 |
| 9 | 13 |
| 10 | 111 |
+----+-----+
9 rows in set (0.00 sec)
然后去除Slave配置文件中的slaveskiperrors=all,并启动检查slave状态(最好插入数据验证):
start slave;
show slave status\G
pt-table-checksum常用参数
--nocheck-replication-filters
不检查复制过滤器,检测中忽略mysql配置参数binlogignoredb等,建议启用。后面可以用--databases来指定需要检查的数据库。
--no-check-binlog-format
不检查复制的binlog模式,要是binlog模式是ROW,则会报错。
--replicate-check-only
只显示不同步的信息。
--replicate=
把checksum的信息写入到指定的库中的表中,该选项会默认自动创建 percona.checksums 表,默认存储到percona库的checksums表中。
--databases=
指定需要被检查的数据库,多个则用逗号隔开
--tables=
指定需要被检查的表, 多个则用逗号隔开
--host=127.0.0.1 或-h127.0.0.1 或 h=127.0.0.1
Master的地址
--user=root 或-uroot 或 u=root
用户名
--password=123456 或-p123456 或 p=123456
密码,如有特殊字符时需要使用引号
--port=3306 或-P3306 或 P=3306
端口
--socket=
指定socket接口路径
采用最右边格式时,中间需要加逗号隔开h=127.0.0.1,u=root,p='123456',P=3306 ,左边两种则不用。
--ask-pass
不在命令行指定密码,手动输入
--chunk-size, --chunk-size-limit
用于指定检测块的大小。 可控性更强
--ignore-databases/tables/column
跳出指定元素的过滤
--lock-wait-timeout
innodb 锁的超时设定, 默认为1
--max-load
设置最大并发连接数
--replicate-check-only
只输出数据不一致的信息。
pt-table-sync常用参数
参数基本上与checksum相同,没有--nocheck-replication-filters和--no-check-binlog-format
将修复的SQL语句打印输出
--execute
直接执行SQL语句进行修复
pt-table-checksum表结构
如果不给创建表和数据库的权限,需要先自己准备好表和数据库:
CREATE TABLE `test` (
`db` char(64) NOT NULL,
`tbl` char(64) NOT NULL,
`chunk` int(11) NOT NULL,
`chunk_time` float DEFAULT NULL,
`chunk_index` varchar(200) DEFAULT NULL,
`lower_boundary` text,
`upper_boundary` text,
`this_crc` char(40) NOT NULL,
`this_cnt` int(11) NOT NULL,
`master_crc` char(40) DEFAULT NULL,
`master_cnt` int(11) DEFAULT NULL,
`ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`db`,`tbl`,`chunk`),
KEY `ts_db_tbl` (`ts`,`db`,`tbl`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
主从同步对数据库进行过滤
主从同步的时候,有时候会有指定同步数据库的需求。这时候需要对数据库进行过滤。
如果要对数据库进行过滤,可以在配置文件中添加如下字段以达到效果:
replicate-do-db
配置slave只复制某个db
replicate-ignore-db
配置slave只忽略某个db
replicate-do-table
配置slave只复制某个table
replicate-ignore-table
配置slave只复制某个table
replicate-wild-do-table
支持通配符,配置slave只复制table (replicate-wild-do-table=db1.t*)
replicate-wild-ignore-table
支持通配符,配置slave只忽略table (replicate-wild-ignore-table=db1.s.*)
repicate-rewrite-db
如果master和slave有同名的A库了,那么可以采用rewrite规则,用master的A库复制到slave的B库(语法:master replicate-rewrite-db=a->b)
replicate-do-db
配置slave只复制某个db
replicate-ignore-db
配置slave只忽略某个db
replicate-do-table
配置slave只复制某个table
replicate-ignore-table
配置slave只复制某个table
replicate-wild-do-table
支持通配符,配置slave只复制table (replicate-wild-do-table=db1.t*)
replicate-wild-ignore-table
支持通配符,配置slave只忽略table (replicate-wild-ignore-table=db1.s.*)
repicate-rewrite-db
如果master和slave有同名的A库了,那么可以采用rewrite规则,用master的A库复制到slave的B库(语法:master replicate-rewrite-db=a->b)
用法
首先,考虑到Master以后的同步数据问题和不可停机的问题,对于同步数据库的过滤应当在从库进行配置。
其次,配置从库不应使用do系列配置来只复制某个数据库,而应该使用ignore来忽略某个数据库。
然后,建议只使用replicate-wild-ignore-table和replicate-ignore-table来进行数据库的过滤。
在从库配置忽略数据库
Slave配置文件中加上:
忽略多个数据库要分开写
replicate-wild-ignore-table=ig.*
replicate-wild-ignore-table=is.a.*
然后启动数据库,观察是否主库的ig库和is库下a开头的table都被忽略了
Master:
insert into ig.test(val) values(1),(2),(3);select * from ig.test;
Query OK, 3 rows affected (0.00 sec)
Records: 3 Duplicates: 0 Warnings: 0
+----+------+
| id | val |
+----+------+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 1 |
| 5 | 2 |
| 6 | 3 |
| 7 | 1 |
| 8 | 2 |
| 9 | 3 |
+----+------+
9 rows in set (0.00 sec)
mysql> insert into `is`.test(val) values(1),(2),(3);insert into `is`.aaatest(val) values(1),(2),(3);select * from `is`.test;select * from `is`.aaatest;
Query OK, 3 rows affected (0.01 sec)
Records: 3 Duplicates: 0 Warnings: 0
Query OK, 3 rows affected (0.00 sec)
Records: 3 Duplicates: 0 Warnings: 0
+----+------+
| id | val |
+----+------+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
+----+------+
3 rows in set (0.00 sec)
+----+------+
| id | val |
+----+------+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
+----+------+
3 rows in set (0.00 sec)
mysql>
Slave:
mysql> select * from ig.test;
+----+------+
| id | val |
+----+------+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 1 |
| 5 | 2 |
| 6 | 3 |
+----+------+
6 rows in set (0.00 sec)
mysql>
mysql>
mysql> select * from `is`.test;select * from `is`.aaatest;
+----+------+
| id | val |
+----+------+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
+----+------+
3 rows in set (0.00 sec)
Empty set (0.00 sec)
mysql>
主从同步顺序切换
现在有一台服务器M1->S1+S2,要切换成M1->S1->S2。操作顺序如下:
1. 在Master上创建一张空表
2. 在Slave上删除空表,制造同步冲突
3. 在Master上删除空表,此时冲突出现。Slave都停止在同一个位置
4. 更改S2为S1的从库,跳过冲突,恢复Slave
Binlog server
MySQL Binlog Server:是利用某个工具,把线上活跃的库的日志拉取到本地进行备份。在MySQL 5.6以后,可以利用mysqlbinlog这个命令去把远程机器的日志备份到本地目录,从而达到增量或是日志安全方面的备份。
首先确认Master上binlog日志的id,然后在备份服务器上启动mysqlbinlog:
mysqlbinlog -h 23.83.250.18 -u fgy58963 --password='xiaocan' -P 3306 --stop-never --raw --read-from-remote-master=BINLOG-DUMP-GTIDS 3306-mysql-bin.000001
然后binlog就被同步过来了:
[root@localhost ~]# cd backup
[root@localhost backup]# ls
3306-mysql-bin.000001 3306-mysql-bin.000002 3306-mysql-bin.000003 3306-mysql-bin.000004 3306-mysql-bin.000005
[root@localhost backup]#
可以通过nohup的方式使其在后台运行:
nohup mysqlbinlog -h 23.83.250.18 -u fgy58963 --password='xiaocan' -P 3306 --stop-never --raw --read-from-remote-master=BINLOG-DUMP-GTIDS 3306-mysql-bin.000001 &
数据一致性保证
配置文件
Master:
每次提交都将数据刷新到磁盘
innodb_flush_log_at_trx_commit = 1
每次事务的变更都将数据刷新到磁盘
sync_binlog = 1
Slave:
将master.info保存在表中,使其成为事务
master_info_repository = "TABLE"
将relay_log.info保存在表中,使其成为事务
relay_log_info_repository = "TABLE"
使用relay log info的自动修复功能,避免主从停止
relay_log_recovery = 1
人为防范
使用crond或者python或者任何后台可运行工具,定期对数据库执行一致性检查,发现异常立刻报警通知相关人员进行修复操作。
主从复制中需要注意的
文件
relay-log-info
记录SQL线程读取Master binlog的位置,用于Slave down机之后根据文件中记录的pos点恢复SQL线程
master-info
记录IO线程读取已经读取到的Master binlog位置,用于Slave down机之后IO线程根据文件中记录的pos点重新拉取binlog日志
参数
relay-log-info-repository
在使用这个参数后,syncrelaylog_info默认为1,且不能修改
master-info-repository
syncmasterinfo可以不用再进行设置,转而设置relaylogrecovery。
这个参数的含义是,当Shalve 重启之后会根据slaverelaylog_info表重新创建一个文件,SQL线程会根据这个文件进行恢复复制,IO线程会读取SQL线程的pos点,根据这个pos点向主库申请拉取数据。
relay_log_recovery = ON
开启relay log自动恢复的功能。