MySQL 案例:摸不准的查询优化器与索引

2022-02-28 16:14:15 浏览数 (2)

背景

近期有用户在咨询查询的问题,发现一个比较典型的案例,SQL 语句无法选择正确的索引,导致查询效率偏低,正好借这个案例来学习一下查询优化器的知识。所使用的的 SQL 语句、数据以及表结构均已脱敏,

问题描述

使用 MariaDB 10.3,在一张业务表中,包含了 col1 和 col2 的唯一索引,使用如下语句查询时,只用了唯一索引的第一列,而没有用到第二列。

select col2 from t1 where col1 = 1 and col2 >= 4000000 order by col2 limit 50;

从常理来看,这个查询用上 col1 和 col2 的联合索引应该是最优解。而且衍生的相关查询均无法利用联合索引的两个列。

代码语言:txt复制
select col2 from t1 where col1 = 1 and col2 >= 4000000 order by col2;

select col2 from t1 where col1 = 1 and col2 >= 4000000 limit 50;

select col2 from t1 where col1 = 1 and col2 >= 4000000;

模拟验证

部署 MariaDB 10.3.34 官方版本,构造测试环境用的语句如下:

代码语言:txt复制
CREATE TABLE `t1` (

  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,

  `col1` int(11) NOT NULL,

  `col2` bigint(20) NOT NULL,

  `col3` varchar(20) NOT NULL DEFAULT '',

  `col4` varchar(20) DEFAULT NULL,

  `col5` varchar(100) DEFAULT NULL,

  `col6` varchar(20) DEFAULT NULL,

  `col7` varchar(100) DEFAULT NULL,

  `col8` varchar(100) DEFAULT NULL,

  `col9` varchar(100) DEFAULT NULL,

  `col10` tinyint(1) NOT NULL DEFAULT 0,

  `col11` timestamp NOT NULL DEFAULT current_timestamp(),

  `col12` timestamp NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp(),

  PRIMARY KEY (`id`,`col1`),

  UNIQUE KEY `unq_col1_col2` (`col1`,`col2`),

  KEY `idx_1` (`col2`),

  KEY `idx_2` (`col5`),

  KEY `idx_3` (`col1`,`col4`),

  KEY `idx_4` (`col1`,`col11`),

  KEY `idx_5` (`col11`),

  KEY `idx_6` (`col12`)

) ENGINE=InnoDB;

insert into t1(col1,col2) values(1,1);

insert into t1 (col2, col1) select col2   1, col1 from t1 where col1 = 1;

insert into t1 (col2, col1) select col2   2, col1 from t1 where col1 = 1;

insert into t1 (col2, col1) select col2   4, col1 from t1 where col1 = 1;

insert into t1 (col2, col1) select col2   8, col1 from t1 where col1 = 1;

insert into t1 (col2, col1) select col2   16, col1 from t1 where col1 = 1;

insert into t1 (col2, col1) select col2   32, col1 from t1 where col1 = 1;

insert into t1 (col2, col1) select col2   64, col1 from t1 where col1 = 1;

insert into t1 (col2, col1) select col2   128, col1 from t1 where col1 = 1;

insert into t1 (col2, col1) select col2   256, col1 from t1 where col1 = 1;

insert into t1 (col2, col1) select col2   512, col1 from t1 where col1 = 1;

insert into t1 (col2, col1) select col2   1024, col1 from t1 where col1 = 1;

insert into t1 (col2, col1) select col2   2048, col1 from t1 where col1 = 1;

insert into t1 (col2, col1) select col2   4096, col1 from t1 where col1 = 1;

insert into t1 (col2, col1) select col2   8192, col1 from t1 where col1 = 1;

insert into t1 (col2, col1) select col2   16384, col1 from t1 where col1 = 1;

insert into t1 (col2, col1) select col2   32768, col1 from t1 where col1 = 1;

insert into t1 (col2, col1) select col2   65536, col1 from t1 where col1 = 1;

insert into t1 (col2, col1) select col2   131072, col1 from t1 where col1 = 1;

insert into t1 (col2, col1) select col2   262144, col1 from t1 where col1 = 1;

insert into t1 (col2, col1) select col2   524288, col1 from t1 where col1 = 1;

insert into t1 (col2, col1) select col2   1048576, col1 from t1 where col1 = 1;

insert into t1 (col2, col1) select col2   2097152, col1 from t1 where col1 = 1;

insert into t1(col1,col2) values(10,10),(10,11),(10,12);

那么先看看执行计划:

代码语言:txt复制
MariaDB [test]> explain select col2 from t1 where col1 = 1 and col2 >= 4000000 order by col2 limit 50;
 ------ ------------- ------- ------ --------------------------------- --------------- --------- ------- -------- -------------------------- 
| id   | select_type | table | type | possible_keys                   | key           | key_len | ref   | rows   | Extra                    |
 ------ ------------- ------- ------ --------------------------------- --------------- --------- ------- -------- -------------------------- 
|    1 | SIMPLE      | t1    | ref  | unq_col1_col2,idx_1,idx_3,idx_4 | unq_col1_col2 | 4       | const | 388450 | Using where; Using index |
 ------ ------------- ------- ------ --------------------------------- --------------- --------- ------- -------- -------------------------- 
1 row in set (0.000 sec)

MariaDB [test]> explain select col2 from t1 where col1 = 1 and col2 >= 4000000 order by col2;
 ------ ------------- ------- ------ --------------------------------- --------------- --------- ------- -------- -------------------------- 
| id   | select_type | table | type | possible_keys                   | key           | key_len | ref   | rows   | Extra                    |
 ------ ------------- ------- ------ --------------------------------- --------------- --------- ------- -------- -------------------------- 
|    1 | SIMPLE      | t1    | ref  | unq_col1_col2,idx_1,idx_3,idx_4 | unq_col1_col2 | 4       | const | 388450 | Using where; Using index |
 ------ ------------- ------- ------ --------------------------------- --------------- --------- ------- -------- -------------------------- 
1 row in set (0.000 sec)

MariaDB [test]> explain select col2 from t1 where col1 = 1 and col2 >= 4000000 limit 50;
 ------ ------------- ------- ------ --------------------------------- --------------- --------- ------- -------- -------------------------- 
| id   | select_type | table | type | possible_keys                   | key           | key_len | ref   | rows   | Extra                    |
 ------ ------------- ------- ------ --------------------------------- --------------- --------- ------- -------- -------------------------- 
|    1 | SIMPLE      | t1    | ref  | unq_col1_col2,idx_1,idx_3,idx_4 | unq_col1_col2 | 4       | const | 388450 | Using where; Using index |
 ------ ------------- ------- ------ --------------------------------- --------------- --------- ------- -------- -------------------------- 
1 row in set (0.000 sec)

MariaDB [test]> explain select col2 from t1 where col1 = 1 and col2 >= 4000000;
 ------ ------------- ------- ------ --------------------------------- --------------- --------- ------- -------- -------------------------- 
| id   | select_type | table | type | possible_keys                   | key           | key_len | ref   | rows   | Extra                    |
 ------ ------------- ------- ------ --------------------------------- --------------- --------- ------- -------- -------------------------- 
|    1 | SIMPLE      | t1    | ref  | unq_col1_col2,idx_1,idx_3,idx_4 | unq_col1_col2 | 4       | const | 388450 | Using where; Using index |
 ------ ------------- ------- ------ --------------------------------- --------------- --------- ------- -------- -------------------------- 
1 row in set (0.000 sec)

MariaDB [test]> explain select col2 from t1 where col1 = 1 and col2 >= 1 order by col2 limit 50;
 ------ ------------- ------- ------ --------------------------------- --------------- --------- ------- --------- -------------------------- 
| id   | select_type | table | type | possible_keys                   | key           | key_len | ref   | rows    | Extra                    |
 ------ ------------- ------- ------ --------------------------------- --------------- --------- ------- --------- -------------------------- 
|    1 | SIMPLE      | t1    | ref  | unq_col1_col2,idx_1,idx_3,idx_4 | unq_col1_col2 | 4       | const | 2090440 | Using where; Using index |
 ------ ------------- ------- ------ --------------------------------- --------------- --------- ------- --------- -------------------------- 
1 row in set (0.001 sec)

当变更查询条件,直到实际上结果集为空的时候,查询计划才会符合预期:

代码语言:txt复制
MariaDB [test]> explain select col2 from t1 where col1 = 1 and col2 >= 5000000 order by col2 limit 50;
 ------ ------------- ------- ------- --------------------------------- --------------- --------- ------ ------ -------------------------- 
| id   | select_type | table | type  | possible_keys                   | key           | key_len | ref  | rows | Extra                    |
 ------ ------------- ------- ------- --------------------------------- --------------- --------- ------ ------ -------------------------- 
|    1 | SIMPLE      | t1    | range | unq_col1_col2,idx_1,idx_3,idx_4 | unq_col1_col2 | 12      | NULL |    1 | Using where; Using index |
 ------ ------------- ------- ------- --------------------------------- --------------- --------- ------ ------ -------------------------- 
1 row in set (0.000 sec)

PS:随机生成的数据中,col2 的最大值为 4194304。

原因简析

由于 MariaDB 10.3 并没有 optimizer_trace,因此很难去准确判断查询优化器因为什么原因没有选择联合索引,那么采用通常的人为干预手段,去试试看联合索引的效果,看看是否会有较好的查询效率:

代码语言:txt复制
MariaDB [test]> explain select col2 from t1 force index(unq_col1_col2) where col1 = 1 and col2 >= 4000000 order by col2 limit 50;
 ------ ------------- ------- ------- --------------- --------------- --------- ------ -------- -------------------------- 
| id   | select_type | table | type  | possible_keys | key           | key_len | ref  | rows   | Extra                    |
 ------ ------------- ------- ------- --------------- --------------- --------- ------ -------- -------------------------- 
|    1 | SIMPLE      | t1    | range | unq_col1_col2 | unq_col1_col2 | 12      | NULL | 388450 | Using where; Using index |
 ------ ------------- ------- ------- --------------- --------------- --------- ------ -------- -------------------------- 
1 row in set (0.000 sec)

MariaDB [test]> select col2 from t1 force index(unq_col1_col2) where col1 = 1 and col2 >= 4000000 order by col2 limit 50;
 --------- 
| col2    |
 --------- 
| 4000000 |
| 4000001 |
......
......
| 4000049 |
 --------- 
50 rows in set (0.000 sec)

MariaDB [test]> select col2 from t1 where col1 = 1 and col2 >= 4000000 order by col2 limit 50;
 --------- 
| col2    |
 --------- 
| 4000000 |
| 4000001 |
......
......
| 4000047 |
| 4000048 |
| 4000049 |
 --------- 
50 rows in set (0.796 sec)

可以发现查询时间上有非常明显的差距,确实是查询优化器选错了索引。

换到 MySQL 8.0 之后(官方版本和腾讯云数据库),查询计划选择了正确的索引,可以faxian 执行计划完全没有问题,且随着查询条件的变化,选择的索引都是合理且效率很高的。

代码语言:txt复制
mysql> explain select col2 from t1 where col1 = 1 and col2 >= 4000000 order by col2 limit 50;
 ---- ------------- ------- ------------ ------- --------------------------------- ------- --------- ------ -------- ---------- -------------------------- 
| id | select_type | table | partitions | type  | possible_keys                   | key   | key_len | ref  | rows   | filtered | Extra                    |
 ---- ------------- ------- ------------ ------- --------------------------------- ------- --------- ------ -------- ---------- -------------------------- 
|  1 | SIMPLE      | t1    | NULL       | range | unq_col1_col2,idx_1,idx_3,idx_4 | idx_1 | 8       | NULL | 388450 |    50.00 | Using where; Using index |
 ---- ------------- ------- ------------ ------- --------------------------------- ------- --------- ------ -------- ---------- -------------------------- 
1 row in set, 1 warning (0.00 sec)

mysql> explain select col2 from t1 where col1 = 1 and col2 >= 1000000 order by col2 limit 50;
 ---- ------------- ------- ------------ ------- --------------------------------- --------------- --------- ------ --------- ---------- -------------------------- 
| id | select_type | table | partitions | type  | possible_keys                   | key           | key_len | ref  | rows    | filtered | Extra                    |
 ---- ------------- ------- ------------ ------- --------------------------------- --------------- --------- ------ --------- ---------- -------------------------- 
|  1 | SIMPLE      | t1    | NULL       | range | unq_col1_col2,idx_1,idx_3,idx_4 | unq_col1_col2 | 12      | NULL | 2090440 |   100.00 | Using where; Using index |
 ---- ------------- ------- ------------ ------- --------------------------------- --------------- --------- ------ --------- ---------- -------------------------- 
1 row in set, 1 warning (0.00 sec)

如上示例中可以看到随着条件的变化,执行计划是发生了变化的,恰好 8.0 的版本有 optimizer_trace,那也正好借此机会看一下查询计划的变更与索引,查询条件的关系。

截取 trace 的部分信息如下当查询条件为 4000000 < mid 的时候:

代码语言:txt复制
......
                      {
                        "index": "unq_col1_col2",
                        "ranges": [
                          "col1 = 1 AND 4000000 <= col2"
                        ],
                        "index_dives_for_eq_ranges": true,
                        "rowid_ordered": false,
                        "using_mrr": false,
                        "index_only": true,
                        "in_memory": 0.999848,
                        "rows": 388450,
                        "cost": 39129.3,  // 判断查询计划好坏的最终指标
                        "chosen": true
                      },
                      {
                        "index": "idx_1",
                        "ranges": [
                          "4000000 <= col2"
                        ],
                        "index_dives_for_eq_ranges": true,
                        "rowid_ordered": false,
                        "using_mrr": false,
                        "index_only": true,
                        "in_memory": 1,
                        "rows": 388450,
                        "cost": 39082.1,  // 判断查询计划好坏的最终指标
                        "chosen": true
                      },
......

可以看到 mid 单列索引以微弱的 cost 优势胜过了唯一索引。而 1000000 > mid 的时候,再看一下 trace 的信息:

代码语言:txt复制
......
                      {
                        "index": "unq_col1_col2",
                        "ranges": [
                          "col1 = 1 AND 1000000 <= col2"
                        ],
                        "index_dives_for_eq_ranges": true,
                        "rowid_ordered": false,
                        "using_mrr": false,
                        "index_only": true,
                        "in_memory": 0.999848,
                        "rows": 2090440,
                        "cost": 210573,
                        "chosen": true
                      },
                      {
                        "index": "idx_1",
                        "ranges": [
                          "1000000 <= col2"
                        ],
                        "index_dives_for_eq_ranges": true,
                        "rowid_ordered": false,
                        "using_mrr": false,
                        "index_only": true,
                        "in_memory": 1,
                        "rows": 2090440,
                        "cost": 210319,
                        "chosen": true
                      },
......
                    {
                      "rows_to_scan": 2090440,
                      "filtering_effect": [
                      ],
                      "final_filtering_effect": 0.5,
                      "access_type": "range",
                      "range_details": {
                        "used_index": "idx_1"
                      },
                      "resulting_rows": 1.04522e 06,
                      "cost": 419363, //execution 阶段估算出来的 idx_1 最终 cost
                      "chosen": false
                    }
......
                        "reconsidering_access_paths_for_index_ordering":{
                            "clause":"ORDER BY",
                            "steps":[
......
                        "chosen_range_access_summary": {
                          "range_access_plan": {
                            "type": "range_scan",
                            "index": "unq_col1_col2",
                            "rows": 2.09044e 06,
                            "ranges": [
                              "col1 = 1 AND 1000000 <= col2"
                            ]
                          },
                          "rows_for_plan": 2.09044e 06,
                          "cost_for_plan": 210573,
                          "chosen": true
......
                            ],
                            "index_order_summary":{
                                "table":"`t1`",
                                "index_provides_order":true,
                                "order_direction":"asc",
                                "index":"unq_col1_col2",
                                "plan_changed":true,
                                "access_type":"range"
                            }
                        }
......

可以看到,在执行计划的选择中,其实还是用 mid 单列索引比较好,但是执行优化器在考虑到排序的时候,变更了一次索引,最终选择了唯一索引 unq_col1_col2。

总结

在大多数时候,MySQL 选择索引的唯一依据其实还是 cost,即通过统计信息来估算不同索引的执行代价,因此同一个语句出现执行计划变更的时候,不一定是遇到了 bug 或者是问题,可能只是在当前版本的代价计算模型中,数据库选择了一个“它认为更好的索引”。

诚然算法不是万能的,总会有一些处理不好的 case,相对而言高版本的代价计算总是相对准确的,可以考虑尽量使用大版本较新的数据库来支撑业务。

0 人点赞