1、基本概念
- Base表(基表):也就是通过
create table
命令创建出来的表,包括明细表(DUPLICATE )、聚合表(Aggregate )、更新表(UNIQUE ) - 物化视图(Materialized Views):简称 MVs,将预先计算(根据定义好的 SELECT 语句)好的数据集,存储在 Doris 中的一个特殊的表。
- ROLLUP是物化视图的一个子集。
ALTER TABLE ADD ROLLUP
语法功能可以通过CREATE MATERIALIZED VIEW
语法实现。 - ROLLUP本意是上卷,属于多维分析中的概念,将数据按照某种指定粒度进行聚合。
- 在 Base 表之上,可以创建多个 ROLLUP表,在物理上是独立存储。
- Base 表中的更新表属于特殊的聚合表,可以建立ROLLUP; Duplicate表不能聚合,建立ROLLUP仅仅是作为调整列顺序,以命中前缀索引的作用。
2、创建ROLLUP
(1)准备数据
代码语言:javascript复制MySQL [test]> insert into test.user_visit values("2021-09-01",1,"2021-08-31 11:30:22",12,1,2);
Query OK, 1 row affected (0.14 sec)
{'label':'insert_842e8f6967df4032-925ba0c7db659b4c', 'status':'VISIBLE', 'txnId':'240'}
MySQL [test]> insert into test.user_visit values("2021-09-01",2,"2021-08-31 10:31:32",9,3,3);
Query OK, 1 row affected (0.05 sec)
{'label':'insert_d671f0dc68144843-bb6f94cc499c1b0e', 'status':'VISIBLE', 'txnId':'241'}
MySQL [test]> insert into test.user_visit values("2021-08-31",2,"2021-08-30 11:32:32",11,2,3);
Query OK, 1 row affected (0.06 sec)
{'label':'insert_d2937852f1804868-8ba323356c9f772c', 'status':'VISIBLE', 'txnId':'242'}
MySQL [test]> select * from test.user_visit;
------------ --------- --------------------- ---------------- ---------------- ------
| dt | user_id | last_visit | max_dwell_time | min_dwell_time | pv |
------------ --------- --------------------- ---------------- ---------------- ------
| 2021-08-31 | 1 | 2021-08-30 11:30:22 | 10 | 0 | 5 |
| 2021-09-01 | 1 | 2021-08-31 11:30:22 | 12 | 1 | 2 |
| 2021-08-31 | 2 | 2021-08-30 11:32:32 | 11 | 2 | 3 |
| 2021-09-01 | 2 | 2021-08-31 10:31:32 | 9 | 3 | 3 |
------------ --------- --------------------- ---------------- ---------------- ------
4 rows in set (0.02 sec)
MySQL [test]>
(2)创建ROLLUP
代码语言:javascript复制MySQL [test]> alter table test.user_visit add rollup rollup_userid(user_id,pv);
Query OK, 0 rows affected (0.06 sec)
MySQL [test]>
说明:rollup_userid是我们创建的ROLLUP名称
(3)查看查询计划
代码语言:javascript复制MySQL [test]> select user_id,sum(pv) from test.user_visit group by user_id;
--------- -----------
| user_id | sum(`pv`) |
--------- -----------
| 2 | 6 |
| 1 | 7 |
--------- -----------
2 rows in set (0.04 sec)
MySQL [test]> explain select user_id,sum(pv) from test.user_visit group by user_id;
------------------------------------------------------------------------------
| Explain String |
------------------------------------------------------------------------------
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS:<slot 2> `user_id` | <slot 3> sum(`pv`) |
| PARTITION: UNPARTITIONED |
| |
| RESULT SINK |
| |
| 2:EXCHANGE |
| |
| PLAN FRAGMENT 1 |
| OUTPUT EXPRS: |
| PARTITION: HASH_PARTITIONED: `default_cluster:test`.`user_visit`.`user_id` |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 02 |
| UNPARTITIONED |
| |
| 1:AGGREGATE (update finalize) |
| | output: sum(`pv`) |
| | group by: `user_id` |
| | cardinality=-1 |
| | |
| 0:OlapScanNode |
| TABLE: user_visit |
| PREAGGREGATION: ON |
| partitions=1/1 |
| rollup: rollup_userid |
| tabletRatio=8/8 |
| tabletList=10455,10457,10459,10461,10463,10465,10467,10469 |
| cardinality=2 |
| avgRowSize=12.0 |
| numNodes=1 |
------------------------------------------------------------------------------
31 rows in set (0.01 sec)
MySQL [test]>
可以看到rollup: rollup_userid
,表示查询我们刚才创建的ROLLUP
(4)未命中ROLLUP的例子
代码语言:javascript复制MySQL [test]> explain select dt,sum(pv) from test.user_visit group by dt;
------------------------------------------------------------------------------
| Explain String |
------------------------------------------------------------------------------
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS:<slot 2> `dt` | <slot 3> sum(`pv`) |
| PARTITION: UNPARTITIONED |
| |
| RESULT SINK |
| |
| 4:EXCHANGE |
| |
| PLAN FRAGMENT 1 |
| OUTPUT EXPRS: |
| PARTITION: HASH_PARTITIONED: <slot 2> `dt` |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 04 |
| UNPARTITIONED |
| |
| 3:AGGREGATE (merge finalize) |
| | output: sum(<slot 3> sum(`pv`)) |
| | group by: <slot 2> `dt` |
| | cardinality=-1 |
| | |
| 2:EXCHANGE |
| |
| PLAN FRAGMENT 2 |
| OUTPUT EXPRS: |
| PARTITION: HASH_PARTITIONED: `default_cluster:test`.`user_visit`.`user_id` |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 02 |
| HASH_PARTITIONED: <slot 2> `dt` |
| |
| 1:AGGREGATE (update serialize) |
| | STREAMING |
| | output: sum(`pv`) |
| | group by: `dt` |
| | cardinality=-1 |
| | |
| 0:OlapScanNode |
| TABLE: user_visit |
| PREAGGREGATION: ON |
| partitions=1/1 |
| rollup: user_visit |
| tabletRatio=8/8 |
| tabletList=10308,10310,10312,10314,10316,10318,10320,10322 |
| cardinality=3 |
| avgRowSize=24.0 |
| numNodes=1 |
------------------------------------------------------------------------------
47 rows in set (0.01 sec)
MySQL [test]>
(5)查询已创建的ROLLUP
代码语言:javascript复制MySQL [test]> SHOW ALTER TABLE ROLLUP FROM test G
*************************** 1. row ***************************
JobId: 10453
TableName: user_visit
CreateTime: 2021-09-01 15:38:16
FinishTime: 2021-09-01 15:38:36
BaseIndexName: user_visit
RollupIndexName: rollup_userid
RollupId: 10454
TransactionId: 243
State: FINISHED
Msg:
Progress: NULL
Timeout: 86400
1 row in set (0.01 sec)
MySQL [test]>