一、题目
有temp表包含A,B两列,请使用SQL对该B列进行处理,形成C列,按A列顺序,B列值不变,则C列累计计数,C列值变化,则C列重新开始计数,期望结果如下
样例数据
代码语言:javascript复制 ------- ----
| a | b |
------- ----
| 2010 | 1 |
| 2011 | 1 |
| 2012 | 1 |
| 2013 | 0 |
| 2014 | 0 |
| 2015 | 1 |
| 2016 | 1 |
| 2017 | 1 |
| 2018 | 0 |
| 2019 | 0 |
------- ----
期望结果
代码语言:javascript复制 ------- ---- ----
| a | b | c |
------- ---- ----
| 2010 | 1 | 1 |
| 2011 | 1 | 2 |
| 2012 | 1 | 3 |
| 2013 | 0 | 1 |
| 2014 | 0 | 2 |
| 2015 | 1 | 1 |
| 2016 | 1 | 2 |
| 2017 | 1 | 3 |
| 2018 | 0 | 1 |
| 2019 | 0 | 2 |
------- ---- ----
二、分析
- 本题是连续问题的变种,在判断连续的基础上进行累积计数。连续问题解决方案参考一文搞懂连续问题
- 本题考点相对较多,连续问题本身已经较难,会涉及到lag函数、sum()over(order by) 进行累积求和、连续数据处理的技巧,本题在连续的基础上又考察了count(*)over(order by )的累积计数。
维度 | 评分 |
---|---|
题目难度 | ⭐️⭐️⭐️⭐️ |
题目清晰度 | ⭐️⭐️⭐️⭐️⭐️ |
业务常见度 | ⭐️⭐️⭐️ |
三、SQL
1.lag()判断是否连续
使用lag判断是否连续,并对连续打标为0,不连续打标为1
执行SQL
代码语言:javascript复制select a,
b,
if(b = lag(b,1) over (order by a asc), 0, 1) as is_conn
from t19_temp
SQL结果
代码语言:javascript复制 ------- ---- ----------
| a | b | is_conn |
------- ---- ----------
| 2010 | 1 | 1 |
| 2011 | 1 | 0 |
| 2012 | 1 | 0 |
| 2013 | 0 | 1 |
| 2014 | 0 | 0 |
| 2015 | 1 | 1 |
| 2016 | 1 | 0 |
| 2017 | 1 | 0 |
| 2018 | 0 | 1 |
| 2019 | 0 | 0 |
------- ---- ----------
2.计算分组id
使sum()over(order by )方式计算出连续的分组id:conn_group_id
执行SQL
代码语言:javascript复制select a,
b,
sum(is_conn) over (order by a asc) as conn_group_id
from (select a,
b,
if(b = lag(b, 1) over (order by a asc), 0, 1) as is_conn
from t19_temp
) t
SQL结果
代码语言:javascript复制 ------- ---- ----------------
| a | b | conn_group_id |
------- ---- ----------------
| 2010 | 1 | 1 |
| 2011 | 1 | 1 |
| 2012 | 1 | 1 |
| 2013 | 0 | 2 |
| 2014 | 0 | 2 |
| 2015 | 1 | 3 |
| 2016 | 1 | 3 |
| 2017 | 1 | 3 |
| 2018 | 0 | 4 |
| 2019 | 0 | 4 |
------- ---- ----------------
3.按照分组id分组,count(*)over(order by) 计数,得出最后结果
执行SQL
代码语言:javascript复制select a,
b,
count(1) over (partition by conn_group_id order by a asc) as c
from (select a,
b,
sum(is_conn) over (order by a asc) as conn_group_id
from (select
a,
b,
if(b = lag(b, 1) over (order by a asc), 0, 1) as is_conn
from db_interview_cj.t19_temp
) t
) tt
SQL结果
代码语言:javascript复制 ------- ---- ----
| a | b | c |
------- ---- ----
| 2010 | 1 | 1 |
| 2011 | 1 | 2 |
| 2012 | 1 | 3 |
| 2013 | 0 | 1 |
| 2014 | 0 | 2 |
| 2015 | 1 | 1 |
| 2016 | 1 | 2 |
| 2017 | 1 | 3 |
| 2018 | 0 | 1 |
| 2019 | 0 | 2 |
------- ---- ----
四、建表语句和数据插入
代码语言:javascript复制--建表语句
CREATE TABLE IF NOT EXISTS t19_temp
(
a string, -- 用户id
b bigint -- 登陆日期
)
COMMENT 'temp';
--插入数据
INSERT INTO t19_temp VALUES
('2010',1),
('2011',1),
('2012',1),
('2013',0),
('2014',0),
('2015',1),
('2016',1),
('2017',1),
('2018',0),
('2019',0);