Doris动态分区

2021-09-08 15:55:54 浏览数 (2)

1、说明

本文主要是测试Doris动态分区相关功能; 关于动态分区相关理论部分请参考官方文档: http://doris.incubator.apache.org/master/zh-CN/administrator-guide/dynamic-partition.html

2、创建一个张按天调度的、不删除历史分区的动态分区表

(1)建表语句

代码语言:javascript复制
CREATE TABLE IF NOT EXISTS test.dynamic_partition_tbl
(
	id INT,
	dt DATE,
	pv BIGINT
)
DUPLICATE KEY(id,dt)
PARTITION BY RANGE(dt)()
DISTRIBUTED BY HASH(id)
PROPERTIES
(
    "dynamic_partition.enable" = "true",
    "dynamic_partition.time_unit" = "DAY",
    "dynamic_partition.end" = "3",
    "dynamic_partition.prefix" = "p",
    "dynamic_partition.buckets" = "32",
	"replication_num" = "1"
);

代码说明:

  • PARTITION BY RANGE(dt)()最后有个括号,看起来有些疑惑。该括号里面可以通过PARTITION pxxxxxx VALUES LESS THAN指定普通分区,见前面的例子https://blog.csdn.net/chengyuqiang/article/details/120044191;
  • 动态分区的规则参数都以dynamic_partition.为前缀
  • dynamic_partition.enable表示是否开启动态分区
  • dynamic_partition.time_unit表示动态分区调度的单位,可以按小时、天、星期、月等进行分区创建或删除。本例中是按天进行动态分区管理。
  • dynamic_partition.end,动态分区的结束偏移量(正值)。根据 time_unit 属性的不同,以当前为基准,提前创建对应范围的分区。本例的值为3,表示以当天(20210902)为基准,提前创建未来3天的分区(20210903、20210904、20210905)
  • dynamic_partition.start,动态分区的起始偏移量(负值)。以当前为基准,分区范围在此偏移之前的分区将会被删除。如果不填写,则默认为 -2147483648,即不删除历史分区。本例没有设置该参数,所以不删除历史分区。
  • dynamic_partition.prefix,分区名前缀,本例中分区名以“p”为前缀

(2)创建过程

代码语言:javascript复制
mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> CREATE TABLE IF NOT EXISTS test.dynamic_partition_tbl
    -> (
    -> id INT,
    -> dt DATE,
    -> pv BIGINT
    -> )
    -> DUPLICATE KEY(id,dt)
    -> PARTITION BY RANGE(dt)()
    -> DISTRIBUTED BY HASH(id)
    -> PROPERTIES
    -> (
    ->     "dynamic_partition.enable" = "true",
    ->     "dynamic_partition.time_unit" = "DAY",
    ->     "dynamic_partition.end" = "3",
    ->     "dynamic_partition.prefix" = "p",
    ->     "dynamic_partition.buckets" = "32",
    -> "replication_num" = "1"
    -> );
Query OK, 0 rows affected (0.11 sec)

mysql> 

(3)查看动态分区表

代码语言:javascript复制
mysql> SHOW DYNAMIC PARTITION TABLES G
*************************** 1. row ***************************
             TableName: dynamic_partition_tbl
                Enable: true
              TimeUnit: DAY
                 Start: -2147483648
                   End: 3
                Prefix: p
               Buckets: 32
        ReplicationNum: 1
               StartOf: NULL
        LastUpdateTime: 2021-09-02 09:50:32
     LastSchedulerTime: 2021-09-02 09:51:23
                 State: NORMAL
LastCreatePartitionMsg: NULL
  LastDropPartitionMsg: NULL
1 row in set (0.00 sec)

mysql> 

(4)查看表的分区信息

代码语言:javascript复制
mysql> show partitions from test.dynamic_partition_tbl G
*************************** 1. row ***************************
             PartitionId: 10473
           PartitionName: p20210902
          VisibleVersion: 1
      VisibleVersionTime: 2021-09-02 09:50:32
      VisibleVersionHash: 0
                   State: NORMAL
            PartitionKey: dt
                   Range: [types: [DATE]; keys: [2021-09-02]; ..types: [DATE]; keys: [2021-09-03]; )
         DistributionKey: id
                 Buckets: 32
          ReplicationNum: 1
           StorageMedium: HDD
            CooldownTime: 9999-12-31 23:59:59
LastConsistencyCheckTime: NULL
                DataSize: .000 
              IsInMemory: false
*************************** 2. row ***************************
             PartitionId: 10538
           PartitionName: p20210903
          VisibleVersion: 1
      VisibleVersionTime: 2021-09-02 09:50:32
      VisibleVersionHash: 0
                   State: NORMAL
            PartitionKey: dt
                   Range: [types: [DATE]; keys: [2021-09-03]; ..types: [DATE]; keys: [2021-09-04]; )
         DistributionKey: id
                 Buckets: 32
          ReplicationNum: 1
           StorageMedium: HDD
            CooldownTime: 9999-12-31 23:59:59
LastConsistencyCheckTime: NULL
                DataSize: .000 
              IsInMemory: false
*************************** 3. row ***************************
             PartitionId: 10603
           PartitionName: p20210904
          VisibleVersion: 1
      VisibleVersionTime: 2021-09-02 09:50:32
      VisibleVersionHash: 0
                   State: NORMAL
            PartitionKey: dt
                   Range: [types: [DATE]; keys: [2021-09-04]; ..types: [DATE]; keys: [2021-09-05]; )
         DistributionKey: id
                 Buckets: 32
          ReplicationNum: 1
           StorageMedium: HDD
            CooldownTime: 9999-12-31 23:59:59
LastConsistencyCheckTime: NULL
                DataSize: .000 
              IsInMemory: false
*************************** 4. row ***************************
             PartitionId: 10668
           PartitionName: p20210905
          VisibleVersion: 1
      VisibleVersionTime: 2021-09-02 09:50:32
      VisibleVersionHash: 0
                   State: NORMAL
            PartitionKey: dt
                   Range: [types: [DATE]; keys: [2021-09-05]; ..types: [DATE]; keys: [2021-09-06]; )
         DistributionKey: id
                 Buckets: 32
          ReplicationNum: 1
           StorageMedium: HDD
            CooldownTime: 9999-12-31 23:59:59
LastConsistencyCheckTime: NULL
                DataSize: .000 
              IsInMemory: false
4 rows in set (0.02 sec)

mysql> 

发现已经创建了4个分区,当前分区p20210903和提前创建的3个未来分区(p20210903、p20210904、p20210905)

(5)插入测试数据

代码语言:javascript复制
mysql> insert into dynamic_partition_tbl values(1,'2021-09-02',100);
Query OK, 1 row affected (0.08 sec)
{'label':'insert_bbfb17eb456846ee-890ca4a8f45ba046', 'status':'VISIBLE', 'txnId':'244'}

mysql> insert into dynamic_partition_tbl values(1,'2021-09-03',100);
Query OK, 1 row affected (0.05 sec)
{'label':'insert_d96d0f4f6f5e4b2d-9a0dfd3c55ef5d53', 'status':'VISIBLE', 'txnId':'245'}

mysql> insert into dynamic_partition_tbl values(1,'2021-09-04',100);
Query OK, 1 row affected (0.06 sec)
{'label':'insert_b13dfce59ec04ab2-b43a201c35b37412', 'status':'VISIBLE', 'txnId':'246'}

mysql> insert into dynamic_partition_tbl values(1,'2021-09-05',100);
Query OK, 1 row affected (0.05 sec)
{'label':'insert_fdc31cf0b3a4a9a-9df5c527ca692003', 'status':'VISIBLE', 'txnId':'247'}

mysql> insert into dynamic_partition_tbl values(1,'2021-09-06',100);
ERROR 1064 (HY000): errCode = 2, detailMessage = all partitions have no load data. url: http://10.17.12.160:8040/api/_load_error_log?file=__shard_0/error_log_insert_stmt_84a9868b27494d1e-8e040826450f55a9_84a9868b27494d1e_8e040826450f55a9
mysql> 

4个分区数据可以正常插入,当前没有’2021-09-06’的分区,所以第5条数据插入失败。

3、创建一个只保留最近7天并且预先创建未来3天的分区表

(1)建表语句

代码语言:javascript复制
CREATE TABLE IF NOT EXISTS test.dynamic_tbl2
(
    id INT,
	dt DATE,
	pv BIGINT
)
PARTITION BY RANGE(dt) ()
DISTRIBUTED BY HASH(id)
PROPERTIES
(
    "dynamic_partition.enable" = "true",
    "dynamic_partition.time_unit" = "DAY",
    "dynamic_partition.start" = "-7",
    "dynamic_partition.end" = "3",
    "dynamic_partition.prefix" = "p",
    "dynamic_partition.buckets" = "10"
);

代码说明:

  • “dynamic_partition.start” = “-7”,表示以当前为基准7天前的分区
  • 当前日期是20210902,“dynamic_partition.end” = "3"则提前创建未来3天的分区20210903、20210904、20210905
  • 则到了明天20210903,则会创建新的分区20210905;依次类推,到了20210905这天,提前创建新分区20210908;
  • 假设到了20210910这天,会删除以这天为基准7天的分区,也就是说这天前保留最近7天的分区(20210909、20210908、20210907、20210906、20210905、20210904、20210903),删除20210902分区。

(2)创建过程

代码语言:javascript复制
mysql> CREATE TABLE IF NOT EXISTS test.dynamic_tbl2
    -> (
    ->     id INT,
    ->     dt DATE,
    ->     pv BIGINT
    -> )
    -> PARTITION BY RANGE(dt) ()
    -> DISTRIBUTED BY HASH(id)
    -> PROPERTIES
    -> (
    ->     "dynamic_partition.enable" = "true",
    ->     "dynamic_partition.time_unit" = "DAY",
    ->     "dynamic_partition.start" = "-7",
    ->     "dynamic_partition.end" = "3",
    ->     "dynamic_partition.prefix" = "p",
    ->     "dynamic_partition.buckets" = "10"
    -> );
Query OK, 0 rows affected (0.10 sec)

mysql> 

(3)查看分区表

代码语言:javascript复制
mysql>  SHOW DYNAMIC PARTITION TABLES G
*************************** 1. row ***************************
             TableName: dynamic_tbl2
                Enable: true
              TimeUnit: DAY
                 Start: -7
                   End: 3
                Prefix: p
               Buckets: 10
        ReplicationNum: 3
               StartOf: NULL
        LastUpdateTime: 2021-09-02 10:59:15
     LastSchedulerTime: 2021-09-02 11:01:23
                 State: NORMAL
LastCreatePartitionMsg: NULL
  LastDropPartitionMsg: NULL
*************************** 2. row ***************************
             TableName: dynamic_partition_tbl
                Enable: true
              TimeUnit: DAY
                 Start: -2147483648
                   End: 3
                Prefix: p
               Buckets: 32
        ReplicationNum: 1
               StartOf: NULL
        LastUpdateTime: 2021-09-02 09:50:32
     LastSchedulerTime: 2021-09-02 11:01:23
                 State: NORMAL
LastCreatePartitionMsg: NULL
  LastDropPartitionMsg: NULL
2 rows in set (0.00 sec)

mysql>

(4)查看分区信息

代码语言:javascript复制
mysql> show partitions from test.dynamic_tbl2 G                                                                                                               *************************** 1. row ***************************
             PartitionId: 10739
           PartitionName: p20210902
          VisibleVersion: 1
      VisibleVersionTime: 2021-09-02 10:59:15
      VisibleVersionHash: 0
                   State: NORMAL
            PartitionKey: dt
                   Range: [types: [DATE]; keys: [2021-09-02]; ..types: [DATE]; keys: [2021-09-03]; )
         DistributionKey: id
                 Buckets: 10
          ReplicationNum: 3
           StorageMedium: HDD
            CooldownTime: 9999-12-31 23:59:59
LastConsistencyCheckTime: NULL
                DataSize: .000 
              IsInMemory: false
*************************** 2. row ***************************
             PartitionId: 10780
           PartitionName: p20210903
          VisibleVersion: 1
      VisibleVersionTime: 2021-09-02 10:59:15
      VisibleVersionHash: 0
                   State: NORMAL
            PartitionKey: dt
                   Range: [types: [DATE]; keys: [2021-09-03]; ..types: [DATE]; keys: [2021-09-04]; )
         DistributionKey: id
                 Buckets: 10
          ReplicationNum: 3
           StorageMedium: HDD
            CooldownTime: 9999-12-31 23:59:59
LastConsistencyCheckTime: NULL
                DataSize: .000 
              IsInMemory: false
*************************** 3. row ***************************
             PartitionId: 10821
           PartitionName: p20210904
          VisibleVersion: 1
      VisibleVersionTime: 2021-09-02 10:59:15
      VisibleVersionHash: 0
                   State: NORMAL
            PartitionKey: dt
                   Range: [types: [DATE]; keys: [2021-09-04]; ..types: [DATE]; keys: [2021-09-05]; )
         DistributionKey: id
                 Buckets: 10
          ReplicationNum: 3
           StorageMedium: HDD
            CooldownTime: 9999-12-31 23:59:59
LastConsistencyCheckTime: NULL
                DataSize: .000 
              IsInMemory: false
*************************** 4. row ***************************
             PartitionId: 10862
           PartitionName: p20210905
          VisibleVersion: 1
      VisibleVersionTime: 2021-09-02 10:59:15
      VisibleVersionHash: 0
                   State: NORMAL
            PartitionKey: dt
                   Range: [types: [DATE]; keys: [2021-09-05]; ..types: [DATE]; keys: [2021-09-06]; )
         DistributionKey: id
                 Buckets: 10
          ReplicationNum: 3
           StorageMedium: HDD
            CooldownTime: 9999-12-31 23:59:59
LastConsistencyCheckTime: NULL
                DataSize: .000 
              IsInMemory: false
4 rows in set (0.00 sec)

mysql> 

0 人点赞