Online Statistics Gathering for Bulk Loads
Oracle12c后,在大批量灌数后,提供了自动收集统计信息的方式。
大批量灌数后立即查询是很多数据库的痛点,通常都会因为统计信息不准导致计划出问题,下面总结下Oracle的解决方法:
什么场景会触发自动收集统计信息?(已验证)
- create table as。
- insert into select必须带APPEND hint,普通的insert into不行。
- delete清空表,做bulk insert不会收集统计信息,因为表有segment,必须是空表才行。
分区表:遵循上述规则的前提下
- 对父表的insert只会收集父表统计信息。
- 对子表的insert只会收集子表统计信息。
收集哪些统计信息?
- 行数
- 页面数
- 平均行长度
SELECT table_name, num_rows, BLOCKS, AVG_SPACE, CHAIN_CNT, AVG_ROW_LEN FROM user_tables WHERE table_name = 'TAB1';
TABLE_NAME NUM_ROWS BLOCKS AVG_SPACE CHAIN_CNT AVG_ROW_LEN
-------------------- ---------- ---------- ---------- ---------- -----------
TAB1 1000 7 0 0 23
什么时候不会触发自动收集统计信息?
- 非空segments(delete ×,truncate ok)
- 用户自定义的schema。
- 嵌套表、索引组织表、external。
- 带虚列的表。
- 表的统计信息已经锁定。
- PUBLISH为false。
- multitable
INSERT
statement,即insert all into a … into b … into c …。
文档:
- Bulk Loads后在线收集统计信息
- “Bulk Loads统计信息收集” 具体触发场景
INSERT INTO ... SELECT
CREATE TABLE AS SELECT
- “Bulk Loads统计信息收集” 目标:更准确的统计信息,提升性能
- 分析性业务往往需要灌入大量数据,老版本中是会建议用户灌完数后主动收集统计信息。
- 但在实践中,由于疏忽或等维护窗口没有收集统计信息,是生成错误计划的主要原因。
- “Bulk Loads统计信息收集” 分区表
- 如果插入父表,则收集父表统计信息,不会收集子表。
- 如果插入子表,则收集子表统计信息,不会手机父表。
- “Bulk Loads统计信息收集” 后的直方图?
- 数据库不会自动生成直方图,需要手动执行GATHER_TABLE_STATS。
- 注意在bulk loads统计信息收集中,Oracle不会收集列统计信息。
- “Bulk Loads统计信息收集” 限制,下面场景不会自动收集统计信息:
- 目标对象已经有数据了。
- oracle自己的schema。
- 表类型限制:nested table, index-organized table (IOT), external table, or global temporary table。
- 表的PUBLISH为false。
- 表的统计信息已经锁定了。
- multitable
INSERT
statement,即insert all into a … into b … into c …
- “Bulk Loads统计信息收集” 的hint开关:
CREATE TABLE employees2 AS SELECT /* NO_GATHER_OPTIMIZER_STATISTICS*/ * FROM employees
CREATE TABLE employees2 AS SELECT /* GATHER_OPTIMIZER_STATISTICS*/ * FROM employees
- “Bulk Loads统计信息收集” 具体触发场景
一些实验和SQL
场景一:create table as
代码语言:javascript复制DROP TABLE tab1 PURGE;
CREATE TABLE tab1 AS SELECT level AS id, 'Description of '||level AS description FROM dual CONNECT BY level <= 1000;
COLUMN table_name FORMAT A20
SELECT table_name, num_rows FROM user_tables WHERE table_name = 'TAB1';
TABLE_NAME NUM_ROWS
-------------------- ----------
TAB1 1000
SELECT NUM_ROWS, AVG_ROW_LEN, BLOCKS, EMPTY_BLOCKS, LAST_ANALYZED FROM DBA_TAB_STATISTICS WHERE TABLE_NAME='TAB1';
NUM_ROWS AVG_ROW_LEN BLOCKS EMPTY_BLOCKS LAST_ANAL
---------- ----------- ---------- ------------ ---------
1000 23 7 0 25-JUN-23
删除统计信息
代码语言:javascript复制SELECT table_name, num_rows, BLOCKS, AVG_SPACE, CHAIN_CNT, AVG_ROW_LEN FROM user_tables WHERE table_name = 'TAB1';
TABLE_NAME NUM_ROWS BLOCKS AVG_SPACE CHAIN_CNT AVG_ROW_LEN
-------------------- ---------- ---------- ---------- ---------- -----------
TAB1 1000 7 0 0 23
EXEC DBMS_STATS.delete_table_stats(USER, 'TAB1');
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'TAB1', options => 'GATHER AUTO');
SELECT table_name, num_rows, BLOCKS, AVG_SPACE, CHAIN_CNT, AVG_ROW_LEN FROM user_tables WHERE table_name = 'TAB1';
TABLE_NAME NUM_ROWS BLOCKS AVG_SPACE CHAIN_CNT AVG_ROW_LEN
-------------------- ---------- ---------- ---------- ---------- -----------
TAB1 1000 7 0 0 23
场景二:insert into select(必须带APPEND才能收集)
带APPEND直插:统计信息会更新
代码语言:javascript复制TRUNCATE TABLE tab1;
INSERT /* APPEND */ INTO tab1 SELECT level AS id,'Description of ' || level AS description FROM dual CONNECT BY level <= 500;
COMMIT;
COLUMN table_name FORMAT A20
SELECT table_name, num_rows FROM user_tables WHERE table_name = 'TAB1';
TABLE_NAME NUM_ROWS
-------------------- ----------
TAB1 500
SELECT NUM_ROWS, AVG_ROW_LEN, BLOCKS, EMPTY_BLOCKS, LAST_ANALYZED FROM DBA_TAB_STATISTICS WHERE TABLE_NAME='TAB1';
NUM_ROWS AVG_ROW_LEN BLOCKS EMPTY_BLOCKS LAST_ANAL
---------- ----------- ---------- ------------ ---------
500 23 5 0 25-JUN-23
不带APPEND:统计信息未更新
代码语言:javascript复制TRUNCATE TABLE tab1;
INSERT INTO tab1 SELECT level AS id,'Description of ' || level AS description FROM dual CONNECT BY level <= 800;
COMMIT;
COLUMN table_name FORMAT A20
SELECT table_name, num_rows FROM user_tables WHERE table_name = 'TAB1';
TABLE_NAME NUM_ROWS
-------------------- ----------
TAB1 500
SELECT NUM_ROWS, AVG_ROW_LEN, BLOCKS, EMPTY_BLOCKS, LAST_ANALYZED FROM DBA_TAB_STATISTICS WHERE TABLE_NAME='TAB1';
NUM_ROWS AVG_ROW_LEN BLOCKS EMPTY_BLOCKS LAST_ANAL
---------- ----------- ---------- ------------ ---------
500 23 5 0 25-JUN-23
select count(*) from TAB1;
COUNT(*)
----------
800
场景三:delete清空表不会收集统计信息,表必须是空的(no segments)
使用DELET清空表,表中其实还是有数据的。表示空的(没有段)是触发收集的必要条件。
代码语言:javascript复制DELETE FROM tab1;
COMMIT;
INSERT /* APPEND */ INTO tab1 SELECT level AS id,'Description of ' || level AS description FROM dual CONNECT BY level <= 900;
COMMIT;
COLUMN table_name FORMAT A20
SELECT table_name, num_rows FROM user_tables WHERE table_name = 'TAB1';
TABLE_NAME NUM_ROWS
-------------------- ----------
TAB1 500
SELECT NUM_ROWS, AVG_ROW_LEN, BLOCKS, EMPTY_BLOCKS, LAST_ANALYZED FROM DBA_TAB_STATISTICS WHERE TABLE_NAME='TAB1';
NUM_ROWS AVG_ROW_LEN BLOCKS EMPTY_BLOCKS LAST_ANAL
---------- ----------- ---------- ------------ ---------
500 23 5 0 25-JUN-23
使用truncate清空,会收集统计信息。
代码语言:javascript复制TRUNCATE TABLE tab1;
INSERT /* APPEND */ INTO tab1 SELECT level AS id,'Description of ' || level AS description FROM dual CONNECT BY level <= 1200;
COMMIT;
COLUMN table_name FORMAT A20
SELECT table_name, num_rows FROM user_tables WHERE table_name = 'TAB1';
TABLE_NAME NUM_ROWS
-------------------- ----------
TAB1 1200
SELECT NUM_ROWS, AVG_ROW_LEN, BLOCKS, EMPTY_BLOCKS, LAST_ANALYZED FROM DBA_TAB_STATISTICS WHERE TABLE_NAME='TAB1';
NUM_ROWS AVG_ROW_LEN BLOCKS EMPTY_BLOCKS LAST_ANAL
---------- ----------- ---------- ------------ ---------
1200 23 8 0 25-JUN-23
场景四:使用hint显示关闭该功能
NO_GATHER_OPTIMIZER_STATISTICS可以关闭该功能。
代码语言:javascript复制TRUNCATE TABLE tab1;
INSERT /* APPEND */ INTO tab1 SELECT /* NO_GATHER_OPTIMIZER_STATISTICS */ level AS id,'Description of ' || level AS description FROM dual CONNECT BY level <= 200;
COMMIT;
COLUMN table_name FORMAT A20
SELECT table_name, num_rows FROM user_tables WHERE table_name = 'TAB1';
TABLE_NAME NUM_ROWS
-------------------- ----------
TAB1 1200
SELECT NUM_ROWS, AVG_ROW_LEN, BLOCKS, EMPTY_BLOCKS, LAST_ANALYZED FROM DBA_TAB_STATISTICS WHERE TABLE_NAME='TAB1';
NUM_ROWS AVG_ROW_LEN BLOCKS EMPTY_BLOCKS LAST_ANAL
---------- ----------- ---------- ------------ ---------
1200 23 8 0 25-JUN-23
分区表场景一:父表insert select append只收集父表统计信息
代码语言:javascript复制DROP TABLE tab1 PURGE;
CREATE TABLE tab1(id NUMBER, created_date DATE)
PARTITION BY RANGE (created_date)
(PARTITION tab1_2015 VALUES LESS THAN (TO_DATE('01/01/2016', 'DD/MM/YYYY')) TABLESPACE users,
PARTITION tab1_2016 VALUES LESS THAN (TO_DATE('01/01/2017', 'DD/MM/YYYY')) TABLESPACE users);
INSERT /* APPEND */ INTO tab1
SELECT level, TO_DATE('01/01/2016', 'DD/MM/YYYY')
FROM dual CONNECT BY level <= 100;
COMMIT;
COLUMN table_name FORMAT A20
SELECT table_name, num_rows FROM user_tables WHERE table_name = 'TAB1';
TABLE_NAME NUM_ROWS
-------------------- ----------
TAB1 100
COLUMN table_name FORMAT A20
COLUMN partition_name FORMAT A20
SELECT table_name, partition_name, num_rows FROM user_tab_partitions WHERE table_name = 'TAB1' ORDER BY partition_name;
TABLE_NAME PARTITION_NAME NUM_ROWS
-------------------- -------------------- ----------
TAB1 TAB1_2015
TAB1 TAB1_2016
删除统计信息
代码语言:javascript复制EXEC DBMS_STATS.delete_table_stats(USER, 'TAB1');
SELECT table_name, num_rows FROM user_tables WHERE table_name = 'TAB1';
TABLE_NAME NUM_ROWS
-------------------- ----------
TAB1
分区表场景二:直接插入子分区表只收集子表统计信息
代码语言:javascript复制TRUNCATE TABLE tab1;
INSERT /* APPEND */ INTO tab1 PARTITION (TAB1_2016)
SELECT level, TO_DATE('01/01/2016', 'DD/MM/YYYY') FROM dual CONNECT BY level <= 100;
COMMIT;
COLUMN table_name FORMAT A20
SELECT table_name, num_rows FROM user_tables WHERE table_name = 'TAB1';
TABLE_NAME NUM_ROWS
-------------------- ----------
TAB1
COLUMN table_name FORMAT A20
COLUMN partition_name FORMAT A20
SELECT table_name, partition_name, num_rows FROM user_tab_partitions WHERE table_name = 'TAB1' ORDER BY partition_name;
TABLE_NAME PARTITION_NAME NUM_ROWS
-------------------- -------------------- ----------
TAB1 TAB1_2015
TAB1 TAB1_2016 100