一、题目
有一张表t_id记录了id,id不重复,但是会存在间断,求出连续段的起始位置和结束位置。
代码语言:javascript复制 -----
| id |
-----
| 1 |
| 2 |
| 3 |
| 5 |
| 6 |
| 8 |
| 10 |
| 12 |
| 13 |
| 14 |
| 15 |
-----
二、分析
- 本题对重新分组的考察,此类题目真的是常见呀!
- 使用累积求和方式对数据进行重新分组;
- 根据重新分组标签进行分组,使用聚合函数min(),max()计算出每组的起始位置和结束位置;
维度 | 评分 |
---|---|
题目难度 | ⭐️⭐️⭐️⭐️ |
题目清晰度 | ⭐️⭐️⭐️⭐️⭐️ |
业务常见度 | ⭐️⭐️⭐️ |
三、SQL
1.lag()函数进行开窗计算与上一行的差值;
执行SQL
代码语言:javascript复制select id,
id - lag(id) over (order by id) as diff
from t_id
查询结果
代码语言:javascript复制 ----- -------
| id | diff |
----- -------
| 1 | NULL |
| 2 | 1 |
| 3 | 1 |
| 5 | 2 |
| 6 | 1 |
| 8 | 2 |
| 10 | 2 |
| 12 | 2 |
| 13 | 1 |
| 14 | 1 |
| 15 | 1 |
----- -------
2.获得分组字段
根据diff进行判断,如果差值为1代表连续赋值为0,否则代表不连续赋值为1,然后使用sum()进行累积计算,获得分组依据字段。
执行SQL
代码语言:javascript复制select id,
sum(if(diff = 1, 0, 1)) over (order by id) as group_type
from (select id,
id - lag(id) over (order by id) as diff
from t_id) t
查询结果
代码语言:javascript复制 ----- -------------
| id | group_type |
----- -------------
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 5 | 2 |
| 6 | 2 |
| 8 | 3 |
| 10 | 4 |
| 12 | 5 |
| 13 | 5 |
| 14 | 5 |
| 15 | 5 |
----- -------------
3.得出结果
执行SQL
代码语言:javascript复制select group_type,
min(id) as start_pos,
max(id) as end_pos
from (select id,
sum(if(diff = 1, 0, 1)) over (order by id) as group_type
from (select id,
id - lag(id) over (order by id) as diff
from t_id) t) tt
group by group_type
查询结果
代码语言:javascript复制 ------------- ------------ ----------
| group_type | start_pos | end_pos |
------------- ------------ ----------
| 1 | 1 | 3 |
| 2 | 5 | 6 |
| 3 | 8 | 8 |
| 4 | 10 | 10 |
| 5 | 12 | 15 |
------------- ------------ ----------
四、建表语句和数据插入
代码语言:javascript复制--建表语句
CREATE TABLE t_id (
id bigint COMMENT 'ID'
) COMMENT 'ID记录表'
ROW FORMAT DELIMITED FIELDS TERMINATED BY 't'
;
-- 插入数据
insert into t_id(id)
values
(1),
(2),
(3),
(5),
(6),
(8),
(10),
(12),
(13),
(14),
(15)