问题描述
最近遇到用户反馈同样的 SQL 在自建的 MySQL 测试实例和腾讯云的 CDB 实例上查询结果不一致的现象。
解决方案
自建库建议升级版本,至少高于 5.7.21,腾讯云数据库的话,建议临时改写 SQL,绕过这个 bug,后续版本更新之后,可以通过升级修复这个问题。
场景还原
构造一下复现的场景,并根据实际的案例,改写一下 SQL:
代码语言:txt复制CREATE TABLE `stu` (
`id` int(11) NOT NULL,
`sname` varchar(16) NOT NULL,
`cname` varchar(8) DEFAULT NULL,
`math` int(11) NOT NULL,
`eng` int(11) DEFAULT NULL,
`his` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
CREATE TABLE `stu1` LIKE stu;
INSERT INTO `stu` VALUES (100,'Dave','1',100,80,77),(101,'Lily','1',87,91,99),(102,'Wolf','3',82,99,89),(103,'Tom','2',69,74,75),(104,'Adam','2',99,94,95),(105,'Jay','3',77,64,55),(106,'adam','2',95,94,95),(107,'jay','2',95,94,95),(109,'jayy','2',95,94,95),(110,'jayyy','2',95,94,95);
INSERT INTO `stu1` VALUES (100,'Dave','1',100,80,77),(101,'Lily','1',87,91,99),(102,'Wolf','3',82,99,89),(103,'Tom','2',69,74,75),(104,'Adam','2',99,94,95),(105,'Jay','3',77,64,55),(110,'jayyy','2',95,94,95);
有问题的 SQL:
代码语言:txt复制select * from stu t1
left join(select id,sname as sname_t1,1 as cname from stu1) t2 on t1.id=t2.id
where t2.cname is not null;
在腾讯云 CDB 上的执行结果:
代码语言:txt复制mysql> select * from stu t1 left join(select id,sname as sname_t1,1 as cname from stu1) t2 on t1.id=t2.id where t2.cname is not null;
----- ------- ------- ------ ------ ------ ------ ---------- -------
| id | sname | cname | math | eng | his | id | sname_t1 | cname |
----- ------- ------- ------ ------ ------ ------ ---------- -------
| 100 | Dave | 1 | 100 | 80 | 77 | 100 | Dave | 1 |
| 101 | Lily | 1 | 87 | 91 | 99 | 101 | Lily | 1 |
| 102 | Wolf | 3 | 82 | 99 | 89 | 102 | Wolf | 1 |
| 103 | Tom | 2 | 69 | 74 | 75 | 103 | Tom | 1 |
| 104 | Adam | 2 | 99 | 94 | 95 | 104 | Adam | 1 |
| 105 | Jay | 3 | 77 | 64 | 55 | 105 | Jay | 1 |
| 106 | adam | 2 | 95 | 94 | 95 | NULL | NULL | NULL |
| 107 | jay | 2 | 95 | 94 | 95 | NULL | NULL | NULL |
| 109 | jayy | 2 | 95 | 94 | 95 | NULL | NULL | NULL |
| 110 | jayyy | 2 | 95 | 94 | 95 | 110 | jayyy | 1 |
----- ------- ------- ------ ------ ------ ------ ---------- -------
10 rows in set (0.00 sec)
在 Oracle 官方的执行结果:
代码语言:txt复制mysql> select * from stu t1 left join(select id,sname as sname_t1,1 as cname from stu1) t2 on t1.id=t2.id where t2.cname is not null;
----- ------- ------- ------ ------ ------ ------ ---------- -------
| id | sname | cname | math | eng | his | id | sname_t1 | cname |
----- ------- ------- ------ ------ ------ ------ ---------- -------
| 100 | Dave | 1 | 100 | 80 | 77 | 100 | Dave | 1 |
| 101 | Lily | 1 | 87 | 91 | 99 | 101 | Lily | 1 |
| 102 | Wolf | 3 | 82 | 99 | 89 | 102 | Wolf | 1 |
| 103 | Tom | 2 | 69 | 74 | 75 | 103 | Tom | 1 |
| 104 | Adam | 2 | 99 | 94 | 95 | 104 | Adam | 1 |
| 105 | Jay | 3 | 77 | 64 | 55 | 105 | Jay | 1 |
| 110 | jayyy | 2 | 95 | 94 | 95 | 110 | jayyy | 1 |
----- ------- ------- ------ ------ ------ ------ ---------- -------
7 rows in set (0.00 sec)
可以看到,两者的结果存在差异,NULL 的行在 Oracle 官方的版本中并未出现,而且根据 SQL 的语义来看,结果中没有 NULL 才是正确的。
原理简析
实际上,引发这个问题的原因是 MySQL bug,而在确认这个是 bug 之前,无论是 show warnings,还是检查 optimizer_trace,都没发现什么异常,最初分析问题的时候,看起来不像是 MySQL 的执行优化器出现了误判。
明确这个 bug 的情况之后,解决问题的方法自然也明确了:where 条件中不要使用这个被定义为常量的 cname 就行,比如:
代码语言:txt复制select * from stu t1
left join(select id,sname as sname_t1,1 as cname from stu1) t2 on t1.id=t2.id
where t2.sname_t1 is not null;
不过在改写之后,发现了一个有意思的现象,MySQL 执行优化器把这个 left join 改写成了 join:
代码语言:txt复制mysql> explain select * from stu t1 left join(select id,sname as sname_t1,1 as cname from stu1) t2 on t1.id=t2.id where t2.sname_t1 is not null;
---- ------------- ------- ------------ -------- --------------- --------- --------- -------------- ------ ---------- -------------
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
---- ------------- ------- ------------ -------- --------------- --------- --------- -------------- ------ ---------- -------------
| 1 | SIMPLE | stu1 | NULL | ALL | PRIMARY | NULL | NULL | NULL | 7 | 85.71 | Using where |
| 1 | SIMPLE | t1 | NULL | eq_ref | PRIMARY | PRIMARY | 4 | test.stu1.id | 1 | 100.00 | NULL |
---- ------------- ------- ------------ -------- --------------- --------- --------- -------------- ------ ---------- -------------
2 rows in set, 1 warning (0.00 sec)
mysql> show warnings;
------- ------ -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Level | Code | Message |
------- ------ -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Note | 1003 | /* select#1 */ select `test`.`t1`.`id` AS `id`,...(省略)...,1 AS `cname` from `test`.`stu` `t1` join `test`.`stu1` where ((`test`.`t1`.`id` = `test`.`stu1`.`id`) and (`test`.`stu1`.`sname` is not null)) |
------- ------ -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 row in set (0.00 sec)
实际上,这个和 MySQL 的空值拒绝优化策略有一定的关系,简单来说,如果 where 条件中出现了 is not null 或者是效果等同的条件,那么 MySQL 可以把外连接改写成内连接,此处引用官方文档的说明:
Instead, MySQL converts the query to a query with no outer join operation if the WHERE condition is null-rejected. (That is, it converts the outer join to an inner join.) A condition is said to be null-rejected for an outer join operation if it evaluates to FALSE or UNKNOWN for any NULL-complemented row generated for the operation.
后续有机会在详细解析这个优化策略,从结果上来看,替换掉 where 条件的内容之后,查询结果就恢复正常了:
代码语言:txt复制mysql> select * from stu t1 left join(select id,sname as sname_t1,1 as cname from stu1) t2 on t1.id=t2.id where t2.sname_t1 is not null;
----- ------- ------- ------ ------ ------ ------ ---------- -------
| id | sname | cname | math | eng | his | id | sname_t1 | cname |
----- ------- ------- ------ ------ ------ ------ ---------- -------
| 100 | Dave | 1 | 100 | 80 | 77 | 100 | Dave | 1 |
| 101 | Lily | 1 | 87 | 91 | 99 | 101 | Lily | 1 |
| 102 | Wolf | 3 | 82 | 99 | 89 | 102 | Wolf | 1 |
| 103 | Tom | 2 | 69 | 74 | 75 | 103 | Tom | 1 |
| 104 | Adam | 2 | 99 | 94 | 95 | 104 | Adam | 1 |
| 105 | Jay | 3 | 77 | 64 | 55 | 105 | Jay | 1 |
| 110 | jayyy | 2 | 95 | 94 | 95 | 110 | jayyy | 1 |
----- ------- ------- ------ ------ ------ ------ ---------- -------
7 rows in set (0.00 sec)