pt-archiver使用姿势问题导致的数据错误

2024-08-23 16:39:56 浏览数 (1)

先说结论: 使用者的姿势问题导致,但是pt-archiver也缺少相关的校验机制,最终导致数据错误没有被及时发现。

初始化数据

代码语言:txt复制
use test;

CREATE TABLE `t1` (
  `id` int NOT NULL AUTO_INCREMENT,
  `batch_id` int DEFAULT NULL,
  `res_status` int NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3;

CREATE TABLE `t2` (
  `id` int NOT NULL AUTO_INCREMENT,
  `batch_id` int DEFAULT NULL,
  `res_status` int NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3;
CREATE TABLE `t3` (
  `id` int NOT NULL AUTO_INCREMENT,
  `batch_id` int DEFAULT NULL,
  `res_status` int NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3;


insert into t1(batch_id,res_status) select 112,1;
insert into t1(batch_id,res_status) select batch_id,res_status from t1;
insert into t1(batch_id,res_status) select batch_id,res_status from t1;
insert into t1(batch_id,res_status) select batch_id,res_status from t1;

[test]> select * from t1 order by id asc limit 5;
 ---- ---------- ------------ 
| id | batch_id | res_status |
 ---- ---------- ------------ 
|  1 |      112 |          1 |
|  2 |      112 |          1 |
|  3 |      112 |          1 |
|  4 |      112 |          1 |
|  5 |      112 |          1 |
 ---- ---------- ------------ 
5 rows in set (0.00 sec)

案例

1 有问题的写法

代码语言:txt复制
# 注意下面的字段 res_status是大写字母

pt-archiver 
--source h=192.168.31.181,u=dts,p='123456',P=3306,D=test,t=t1 
--dest h=192.168.31.181,u=dts,p='123456',P=3306,D=test,t=t2 
--progress 100 --where '1=1' 
--no-delete --charset=UTF8 --limit=100 
--no-check-columns --columns batch_id,RES_STATUS 
--statistics --txn-size 100 --no-version-check


同步完成后,查询t2表,可以看到res_status字段的值都错了
[test]> select * from t2 limit 5;
 ---- ---------- ------------ 
| id | batch_id | res_status |
 ---- ---------- ------------ 
|  1 |      112 |          0 |
|  2 |      112 |          0 |
|  3 |      112 |          0 |
|  4 |      112 |          0 |
|  5 |      112 |          0 |
 ---- ---------- ------------ 
5 rows in set (0.00 sec)


general_log部分内容如下:
2024-08-23T12:24:05.327275 08:00	   42 Query	USE `test`
2024-08-23T12:24:05.327542 08:00	   42 Query	SHOW CREATE TABLE `test`.`t1`
2024-08-23T12:24:05.339690 08:00	   43 Query	USE `test`
2024-08-23T12:24:05.339920 08:00	   43 Query	SHOW CREATE TABLE `test`.`t2`
2024-08-23T12:24:05.348830 08:00	   42 Query	SELECT MAX(`id`) FROM `test`.`t1`
2024-08-23T12:24:05.349553 08:00	   42 Query	SELECT /*!40001 SQL_NO_CACHE */ `batch_id`,`RES_STATUS`,`id` FROM `test`.`t1` FORCE INDEX(`PRIMARY`) WHERE (1=1) AND (`id` < '20') ORDER BY `id` LIMIT 100
2024-08-23T12:24:05.380780 08:00	   43 Query	INSERT INTO `test`.`t2`(`batch_id`) VALUES ('112')
2024-08-23T12:24:05.381702 08:00	   43 Query	INSERT INTO `test`.`t2`(`batch_id`) VALUES ('112')
2024-08-23T12:24:05.382146 08:00	   43 Query	INSERT INTO `test`.`t2`(`batch_id`) VALUES ('112')
2024-08-23T12:24:05.382448 08:00	   43 Query	INSERT INTO `test`.`t2`(`batch_id`) VALUES ('112')
2024-08-23T12:24:05.382752 08:00	   43 Query	INSERT INTO `test`.`t2`(`batch_id`) VALUES ('112')
2024-08-23T12:24:05.383147 08:00	   43 Query	INSERT INTO `test`.`t2`(`batch_id`) VALUES ('112')
2024-08-23T12:24:05.383461 08:00	   43 Query	INSERT INTO `test`.`t2`(`batch_id`) VALUES ('112')
2024-08-23T12:24:05.383759 08:00	   43 Query	INSERT INTO `test`.`t2`(`batch_id`) VALUES ('112')
2024-08-23T12:24:05.384169 08:00	   43 Query	INSERT INTO `test`.`t2`(`batch_id`) VALUES ('112')
2024-08-23T12:24:05.384503 08:00	   43 Query	INSERT INTO `test`.`t2`(`batch_id`) VALUES ('112')
2024-08-23T12:24:05.384831 08:00	   43 Query	INSERT INTO `test`.`t2`(`batch_id`) VALUES ('112')
2024-08-23T12:24:05.385259 08:00	   43 Query	INSERT INTO `test`.`t2`(`batch_id`) VALUES ('112')
2024-08-23T12:24:05.385578 08:00	   43 Query	INSERT INTO `test`.`t2`(`batch_id`) VALUES ('112')
2024-08-23T12:24:05.385855 08:00	   43 Query	INSERT INTO `test`.`t2`(`batch_id`) VALUES ('112')
2024-08-23T12:24:05.386262 08:00	   43 Query	INSERT INTO `test`.`t2`(`batch_id`) VALUES ('112')
2024-08-23T12:24:05.386618 08:00	   43 Query	INSERT INTO `test`.`t2`(`batch_id`) VALUES ('112')
2024-08-23T12:24:05.387013 08:00	   43 Query	INSERT INTO `test`.`t2`(`batch_id`) VALUES ('112')
2024-08-23T12:24:05.387331 08:00	   43 Query	INSERT INTO `test`.`t2`(`batch_id`) VALUES ('112')
2024-08-23T12:24:05.387627 08:00	   43 Query	INSERT INTO `test`.`t2`(`batch_id`) VALUES ('112')
2024-08-23T12:24:05.387991 08:00	   42 Query	SELECT /*!40001 SQL_NO_CACHE */ `batch_id`,`RES_STATUS`,`id` FROM `test`.`t1` FORCE INDEX(`PRIMARY`) WHERE (1=1) AND (`id` < '20') AND ((`id` > '19')) ORDER BY `id` LIMIT 100

2 没问题的写法

代码语言:txt复制
pt-archiver 
--source h=192.168.31.181,u=dts,p='123456',P=3306,D=test,t=t1 
--dest h=192.168.31.181,u=dts,p='123456',P=3306,D=test,t=t3 
--progress 100 --where '1=1' 
--no-delete --charset=UTF8 --limit=100 
--no-check-columns --columns batch_id,res_status 
--statistics --txn-size 100 --no-version-check

[test]> select * from t3 limit 5;
 ---- ---------- ------------ 
| id | batch_id | res_status |
 ---- ---------- ------------ 
|  1 |      112 |          1 |
|  2 |      112 |          1 |
|  3 |      112 |          1 |
|  4 |      112 |          1 |
|  5 |      112 |          1 |
 ---- ---------- ------------ 
5 rows in set (0.00 sec)

可以看到 t3 表 的 res_status的是没问题的。


general_log部分内容如下:
2024-08-23T12:19:10.000235 08:00	   38 Query	USE `test`
2024-08-23T12:19:10.000482 08:00	   38 Query	SHOW CREATE TABLE `test`.`t1`
2024-08-23T12:19:10.015465 08:00	   39 Query	USE `test`
2024-08-23T12:19:10.015677 08:00	   39 Query	SHOW CREATE TABLE `test`.`t3`
2024-08-23T12:19:10.025237 08:00	   38 Query	SELECT MAX(`id`) FROM `test`.`t1`
2024-08-23T12:19:10.025925 08:00	   38 Query	SELECT /*!40001 SQL_NO_CACHE */ `batch_id`,`res_status`,`id` FROM `test`.`t1` FORCE INDEX(`PRIMARY`) WHERE (1=1) AND (`id` < '20') ORDER BY `id` LIMIT 100
2024-08-23T12:19:10.026402 08:00	   39 Query	INSERT INTO `test`.`t3`(`batch_id`,`res_status`) VALUES ('112','1')
2024-08-23T12:19:10.027093 08:00	   39 Query	INSERT INTO `test`.`t3`(`batch_id`,`res_status`) VALUES ('112','1')
2024-08-23T12:19:10.027416 08:00	   39 Query	INSERT INTO `test`.`t3`(`batch_id`,`res_status`) VALUES ('112','1')
2024-08-23T12:19:10.027720 08:00	   39 Query	INSERT INTO `test`.`t3`(`batch_id`,`res_status`) VALUES ('112','1')
2024-08-23T12:19:10.028154 08:00	   39 Query	INSERT INTO `test`.`t3`(`batch_id`,`res_status`) VALUES ('112','1')
2024-08-23T12:19:10.028552 08:00	   39 Query	INSERT INTO `test`.`t3`(`batch_id`,`res_status`) VALUES ('112','1')
2024-08-23T12:19:10.028938 08:00	   39 Query	INSERT INTO `test`.`t3`(`batch_id`,`res_status`) VALUES ('112','1')
2024-08-23T12:19:10.029324 08:00	   39 Query	INSERT INTO `test`.`t3`(`batch_id`,`res_status`) VALUES ('112','1')
2024-08-23T12:19:10.029695 08:00	   39 Query	INSERT INTO `test`.`t3`(`batch_id`,`res_status`) VALUES ('112','1')
2024-08-23T12:19:10.030029 08:00	   39 Query	INSERT INTO `test`.`t3`(`batch_id`,`res_status`) VALUES ('112','1')
2024-08-23T12:19:10.030351 08:00	   39 Query	INSERT INTO `test`.`t3`(`batch_id`,`res_status`) VALUES ('112','1')
2024-08-23T12:19:10.030679 08:00	   39 Query	INSERT INTO `test`.`t3`(`batch_id`,`res_status`) VALUES ('112','1')
2024-08-23T12:19:10.031042 08:00	   39 Query	INSERT INTO `test`.`t3`(`batch_id`,`res_status`) VALUES ('112','1')
2024-08-23T12:19:10.031480 08:00	   39 Query	INSERT INTO `test`.`t3`(`batch_id`,`res_status`) VALUES ('112','1')
2024-08-23T12:19:10.031830 08:00	   39 Query	INSERT INTO `test`.`t3`(`batch_id`,`res_status`) VALUES ('112','1')
2024-08-23T12:19:10.032214 08:00	   39 Query	INSERT INTO `test`.`t3`(`batch_id`,`res_status`) VALUES ('112','1')
2024-08-23T12:19:10.032545 08:00	   39 Query	INSERT INTO `test`.`t3`(`batch_id`,`res_status`) VALUES ('112','1')
2024-08-23T12:19:10.032839 08:00	   39 Query	INSERT INTO `test`.`t3`(`batch_id`,`res_status`) VALUES ('112','1')
2024-08-23T12:19:10.033294 08:00	   39 Query	INSERT INTO `test`.`t3`(`batch_id`,`res_status`) VALUES ('112','1')
2024-08-23T12:19:10.033736 08:00	   38 Query	SELECT /*!40001 SQL_NO_CACHE */ `batch_id`,`res_status`,`id` FROM `test`.`t1` FORCE INDEX(`PRIMARY`) WHERE (1=1) AND (`id` < '20') AND ((`id` > '19')) ORDER BY `id` LIMIT 100

3 也可以使用pt-archiver的debug功能

代码语言:txt复制
此外,也可以启用pt-archiver的debug选项

export PTDEBUG=1  
然后再执行pt-arhiver命令,即可在控制台输出明细日志


正常的
# pt_archiver:6721 47664 inst stmt:  $VAR1 = {
#   cols => [
#     'batch_id',
#     'res_status'
#   ],
#   slice => [
#     0,
#     1
#   ]
# };
# 

# pt_archiver:6721 47664 inst stmt:  $VAR1 = {
#   cols => [
#     'batch_id',
#     'res_status'
#   ],
#   slice => [
#     0,
#     1
#   ]
# };
# 


异常的
# pt_archiver:6721 38702 inst stmt:  $VAR1 = {
#   cols => [
#     'batch_id'
#   ],
#   slice => [
#     0
#   ]
# };
#

# pt_archiver:6721 38702 inst stmt:  $VAR1 = {
#   cols => [
#     'batch_id'
#   ],
#   slice => [
#     0
#   ]
# };
# 

0 人点赞