MGR的gtid_executed不连续的问题分析
1、问题描述
在做MGR测试的时候偶尔遇到gtid_executed事务ID不连续的问题,但是并不影响数据库的正常运行。现象如下
GreatDB Cluster[sysbench]> select @@gtid_executed;
+-----------------------------------------------------------------+
| @@gtid_executed |
+-----------------------------------------------------------------+
| 5cd1a64d-7358-11ec-b349-080027fa2d35:1-1286:2052-2290:3052-3277 |
+-----------------------------------------------------------------+
1 row in set (0.00 sec)
2、确认原因
查看官方文档发现以下参数:group_replication_gtid_assignment_block_size
以下是对官方文档的翻译和理解
group_replication_gtid_assignment_block_size为每个成员保留的连续GTID数。每个成员从中进行消耗,并在其需要的时候获取更多的GTID数(类似于分布式事务中的全局序列,该系统变量设置的值表示每个成员每一次从全局序列中获取多大范围的连续数字范围来作为自身写事务的GTID号)。
该系统变量是组范围的配置设置,它必须在所有组成员上设置相同的值,在组复制运行时不允许修改,要使得修改值生效,需要完全重新引导组(使用系统变量group_replication_bootstrap_group= on来重新引导组)。
全局变量,动态变量,整型类型,默认值为1000000,取值范围:32位平台为1~4294967295,64位平台为1~9223372036854775807,MySQL 5.7.17版本引入。
官方文档地址:https://dev.mysql.com/doc/refman/8.0/en/group-replication-options.html
通过对文档了解到MGR会为每个实例节点分配一段连续的GTID值,所以怀疑是MGR发生了主从切换,从而导致GTID的事务ID不连续。
2.1、复现问题
为方便测试修改group_replication_gtid_assignment_block_size为1000,并重启MGR
GreatDB Cluster[(none)]> set persist group_replication_gtid_assignment_block_size=1000;
Query OK, 0 rows affected (0.00 sec)
GreatDB Cluster[(none)]> set global group_replication_bootstrap_group=on;
Query OK, 0 rows affected (0.00 sec)
GreatDB Cluster[(none)]> start group_replication;
Query OK, 0 rows affected (2.45 sec)
GreatDB Cluster[(none)]> set global group_replication_bootstrap_group=off;
Query OK, 0 rows affected (0.00 sec)
把82切换为主节点然后通过sysbench模拟业务
GreatDB Cluster[sysbench]> select @@gtid_executed;
+---------------------------------------------+
| @@gtid_executed |
+---------------------------------------------+
| 5cd1a64d-7358-11ec-b349-080027fa2d35:1-5445 |
+---------------------------------------------+
1 row in set (0.00 sec)
GreatDB Cluster[sysbench]> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
| group_replication_applier | cf43d5d7-7354-11ec-af9a-080027fa2d35 | 172.16.50.81 | 4444 | ONLINE | PRIMARY | 8.0.25 |
| group_replication_applier | cf520b3b-7354-11ec-b785-08002792d155 | 172.16.50.82 | 4444 | ONLINE | SECONDARY | 8.0.25 |
| group_replication_applier | cf85763c-7354-11ec-898d-0800276e4bea | 172.16.50.83 | 4444 | ONLINE | SECONDARY | 8.0.25 |
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)
GreatDB Cluster[sysbench]> SELECT group_replication_set_as_primary('cf520b3b-7354-11ec-b785-08002792d155');
+--------------------------------------------------------------------------+
| group_replication_set_as_primary('cf520b3b-7354-11ec-b785-08002792d155') |
+--------------------------------------------------------------------------+
| Primary server switched to: cf520b3b-7354-11ec-b785-08002792d155 |
+--------------------------------------------------------------------------+
1 row in set (1.38 sec)
GreatDB Cluster[sysbench]> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
| group_replication_applier | cf43d5d7-7354-11ec-af9a-080027fa2d35 | 172.16.50.81 | 4444 | ONLINE | SECONDARY | 8.0.25 |
| group_replication_applier | cf520b3b-7354-11ec-b785-08002792d155 | 172.16.50.82 | 4444 | ONLINE | PRIMARY | 8.0.25 |
| group_replication_applier | cf85763c-7354-11ec-898d-0800276e4bea | 172.16.50.83 | 4444 | ONLINE | SECONDARY | 8.0.25 |
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)
模拟业务在新的主节点写业务
# sysbench ./oltp_read_write.lua --mysql-db=sysbench --mysql-host=172.16.50.82 --mysql-port=4444 --mysql-user=greatdb --mysql-password=greatdb --tables=3 --table_size=1000000 --report-interval=2 --threads=3 --db-driver=mysql --skip-trx=off --db-ps-mode=disable --create-secondary=off --time=10 --mysql-ignore-errors=9001,9002,9000,1062,8530,8532 run
检查当前的gtid_executed
GreatDB Cluster[sysbench]> select @@gtid_executed;
+-------------------------------------------------------+
| @@gtid_executed |
+-------------------------------------------------------+
| 5cd1a64d-7358-11ec-b349-080027fa2d35:1-5445:6053-6268 |
+-------------------------------------------------------+
1 row in set (0.00 sec)
2.2、空洞对实际使用的影响
模拟业务持续在82节点写业务
sysbench ./oltp_read_write.lua --mysql-db=sysbench --mysql-host=172.16.50.82 --mysql-port=4444 --mysql-user=greatdb --mysql-password=greatdb --tables=3 --table_size=1000000 --report-interval=2 --threads=3 --db-driver=mysql --skip-trx=off --db-ps-mode=disable --create-secondary=off --time=100 --mysql-ignore-errors=9001,9002,9000,1062,8530,8532 run
检查当前的gtid_executed,一段时间后gtid_executed恢复连续
GreatDB Cluster[sysbench]> select @@gtid_executed;
+-------------------------------------------------------+
| @@gtid_executed |
+-------------------------------------------------------+
| 5cd1a64d-7358-11ec-b349-080027fa2d35:1-5451:6053-6659 |
+-------------------------------------------------------+
1 row in set (0.00 sec)
GreatDB Cluster[sysbench]> select @@gtid_executed;
+---------------------------------------------+
| @@gtid_executed |
+---------------------------------------------+
| 5cd1a64d-7358-11ec-b349-080027fa2d35:1-6665 |
+---------------------------------------------+
1 row in set (0.01 sec)
解析binlog,分析gtid生成时间
SET @@SESSION.GTID_NEXT= '5cd1a64d-7358-11ec-b349-080027fa2d35:6268'/*!*/;
# at 173167217
#220112 15:12:29 server id 1 end_log_pos 173167287 Query thread_id=19 exec_time=0 error_code=0
SET @@SESSION.GTID_NEXT= '5cd1a64d-7358-11ec-b349-080027fa2d35:6269'/*!*/;
# at 173169472
#220112 15:14:29 server id 1 end_log_pos 173169542 Query thread_id=19 exec_time=1 error_code=0
SET @@SESSION.GTID_NEXT= '5cd1a64d-7358-11ec-b349-080027fa2d35:6659'/*!*/;
# at 174048922
#220112 15:14:50 server id 1 end_log_pos 174048992 Query thread_id=19 exec_time=0 error_code=0
SET @@SESSION.GTID_NEXT= '5cd1a64d-7358-11ec-b349-080027fa2d35:6660'/*!*/;
# at 175419962
#220112 15:15:04 server id 1 end_log_pos 175420032 Query thread_id=19 exec_time=0 error_code=0
SET @@SESSION.GTID_NEXT= '5cd1a64d-7358-11ec-b349-080027fa2d35:5445'/*!*/;
# at 172680137
#220112 15:06:15 server id 1 end_log_pos 172680207 Query thread_id=19
SET @@SESSION.GTID_NEXT= '5cd1a64d-7358-11ec-b349-080027fa2d35:5446'/*!*/;
# at 174051177
#220112 15:14:50 server id 1 end_log_pos 174051247 Query thread_id=19
分析发现事务id先从6268增长到6659然后开始补空洞补完后继续从6659开始增长
3、总结
group_replication_gtid_assignment_block_size为每个成员保留的连续GTID数。每个成员从中进行消耗,并在其需要的时候获取更多的GTID数(类似于分布式事务中的全局序列,该系统变量设置的值表示每个成员每一次从全局序列中获取多大范围的连续数字范围来作为自身写事务的GTID号)。
举个例子,集群中有2个节点,group_replication_gtid_assignment_block_size为1000,那么为节点A分配的Gtid_set为group_name:1-1000,节点B分配的Gtid_set为group_name:1001-2000。则group_name:1-1000和group_name:1001-2000分别作为Gtid_set保存在member_gtids上。A节点的事务T1认证通过后,分配gtid为group_name:1,接着A节点事务T2分配group_name:2,然后B节点事务进入认证模块,认证通过后,为其分配group_name:1001,每分配一次gtid则gtids_assigned_in_blocks_counter增一。当发生主从切换时候,节点B会从1001开始记录gtid,所以会造成MGR的gtid_executed有时是不连续的多段,如aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-2:1001-1005
若分配次数gtids_assigned_in_blocks_counter已达到gtid_assignment_block_size,则需要compute_group_available_gtid_intervals()重新计算。基于member_uuid找到该成员可用的gtid区间,若还没为该成员分配gtid,则调用reserve_gtid_block()进行分配。需要注意的是,reserve_gtid_block()是最多分配而不是一定分配block_size大小的gtid序列,是否等于block_size依赖于group_available_gtid_intervals的第一个可用的连续gtid序列大小是否等于或大于block_size。
本文转载自公众号Greatsql