一、题目
有用户账户表,包含年份,用户id和值,请按照年份分组,取出值前两小和前两大对应的用户id,需要保持值最小和最大的用户id排首位。
样例数据
代码语言:javascript复制 ------- ---------- --------
| year | user_id | value |
------- ---------- --------
| 2022 | A | 30 |
| 2022 | B | 10 |
| 2022 | C | 20 |
| 2023 | A | 40 |
| 2023 | B | 50 |
| 2023 | C | 20 |
| 2023 | D | 30 |
------- ---------- --------
期望结果
代码语言:javascript复制 ------- ----------------- -----------------
| year | max2_user_list | min2_user_list |
------- ----------------- -----------------
| 2022 | A,C | B,C |
| 2023 | B,A | C,D |
------- ----------------- -----------------
二、分析
属于取最大最小记录的升级版,最大难点在于拼用户要保证有序。
维度 | 评分 |
---|---|
题目难度 | ⭐️⭐️⭐️⭐️ |
题目清晰度 | ⭐️⭐️⭐️⭐️⭐️ |
业务常见度 | ⭐️⭐️⭐️⭐️ |
三、SQL
1.row_number函数根据年份分组,value正排和倒排得到两个序列
使用row_number函数根据年份分组,根据value正排得到 asc_rn用于取出value最小两行记录,根据value倒叙得到desc_rn用于取出最大两行记录
执行SQL
代码语言:javascript复制select user_id
, year
, value
, row_number() over (partition by year order by value desc) as desc_rn
, row_number() over (partition by year order by value) as rn
from t_amount
查询结果
代码语言:javascript复制 ---------- ------- -------- ---------- -----
| user_id | year | value | desc_rn | rn |
---------- ------- -------- ---------- -----
| B | 2022 | 10 | 3 | 1 |
| C | 2022 | 20 | 2 | 2 |
| A | 2022 | 30 | 1 | 3 |
| C | 2023 | 20 | 4 | 1 |
| D | 2023 | 30 | 3 | 2 |
| A | 2023 | 40 | 2 | 3 |
| B | 2023 | 50 | 1 | 4 |
---------- ------- -------- ---------- -----
2.根据年份分组,取出value最大user_id,第二大user_id,最小user_id,第二小user_id
根据年份分组,取出每年最大、第二大,最小、第二小用户ID。使用if对desc_rn,rn进行判断,对符合条件的数据取出user_id,其他去null,然后使用聚合函数取出结果。
执行SQL
代码语言:javascript复制select year,
max(if(desc_rn = 1, user_id, null)) as max1_user_id,
max(if(desc_rn = 2, user_id, null)) as max2_user_id,
max(if(rn = 1, user_id, null)) as min1_user_id,
max(if(rn = 2, user_id, null)) as min2_user_id
from (select user_id
, year
, value
, row_number() over (partition by year order by value desc) as desc_rn
, row_number() over (partition by year order by value) as rn
from t_amount) t1
group by year
查询结果
代码语言:javascript复制 ------- --------------- --------------- --------------- ---------------
| year | max1_user_id | max2_user_id | min1_user_id | min2_user_id |
------- --------------- --------------- --------------- ---------------
| 2022 | A | C | B | C |
| 2023 | B | A | C | D |
------- --------------- --------------- --------------- ---------------
3.按照顺序拼接,得到最终结果
按照题目要求,进行字符拼接
- 拼接max1_user_id、max2_user_id为max2_list;
- 拼接min1_user_id、min2_user_id为min2_list;
执行SQL
代码语言:javascript复制select year,
concat(max(if(desc_rn = 1, user_id, null)), ',',
max(if(desc_rn = 2, user_id, null))) as max2_user_list,
concat(max(if(rn = 1, user_id, null)), ',',
max(if(rn = 2, user_id, null))) as min2_user_list
from (select user_id
, year
, value
, row_number() over (partition by year order by value desc) as desc_rn
, row_number() over (partition by year order by value) as rn
from t_amount) t1
group by year
查询结果
代码语言:javascript复制 ------- ----------------- -----------------
| year | max2_user_list | min2_user_list |
------- ----------------- -----------------
| 2022 | A,C | B,C |
| 2023 | B,A | C,D |
------- ----------------- -----------------
四、建表语句和数据插入
代码语言:javascript复制--建表语句
create table if not exists t_amount
(
year string,
user_id string,
value bigint
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS orc;
--插入数据
insert into t_amount(year, user_id, value)
values ('2022', 'A', 30),
('2022', 'B', 10),
('2022', 'C', 20),
('2023', 'A', 40),
('2023', 'B', 50),
('2023', 'C', 20),
('2023', 'D', 30)