在SQL tuning中,不良写法导致SQL执行效率比比皆是。最近的SQL tuning中一个外部连接写法不当导致过SQL执行时间超过15分钟左右此即 为一例。通过修正该SQL的写法之后,过高的逻辑读呈数量级下降以及SQL语句执行时间也大幅下降。下面给出一个列子来演示该情形。
一、创建演示环境
代码语言:javascript复制-->当前数据库版本
SQL> select * from v$version where rownum<2;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
-->创建演示表并插入记录
SQL> create table t as select empno,ename,job,sal,deptno
2 from emp where 1=2;
SQL> insert into t select empno,ename,job,sal,deptno
2 from emp e where empno=(select max(empno) from emp where deptno=e.deptno);
SQL> insert into t(empno,ename,job,sal) values(8888,'ROBINSON','DBA',2000);
SQL> insert into t(empno,ename,job,sal) values(9999,'JACKSON','CLERK',2500);
SQL> commit;
SQL> analyze table t compute statistics;
SQL> select * from t;
EMPNO ENAME JOB SAL DEPTNO
---------- ---------- --------- ---------- ----------
7934 MILLER CLERK 1300 10
7902 FORD ANALYST 3000 20
7900 JAMES CLERK 950 30
8888 ROBINSON DBA 2000
9999 JACKSON CLERK 2500
-->使用left join连接查看数据,此时表t中所有记录被返回
SQL> select empno,ename,sal,dname from t left join dept d on t.deptno=d.deptno;
EMPNO ENAME SAL DNAME
---------- ---------- ---------- --------------
7934 MILLER 1300 ACCOUNTING
7902 FORD 3000 RESEARCH
7900 JAMES 950 SALES
8888 ROBINSON 2000
9999 JACKSON 2500
-->下面同样是使用left join连接,但在on子句中增加了过滤条件t.sal>=2000
-->从下面的返回结果可知,t.sal>=2000子句并没有过滤掉sal小于2000的记录
SQL> select empno,ename,sal,dname from t left join dept d -->简称语句A
2 on(t.deptno=d.deptno and t.sal>=2000);
EMPNO ENAME SAL DNAME
---------- ---------- ---------- --------------
7934 MILLER 1300
7902 FORD 3000 RESEARCH
7900 JAMES 950
8888 ROBINSON 2000
9999 JACKSON 2500
-->使用left join连接,将过滤条件放到where 子句中
-->此时仅仅t.sal>=2000且符合t.deptno=d.deptno的记录被返回(结果与所期望一致)
SQL> select empno,ename,sal,dname from t left join dept d -->简称语句B
2 on t.deptno=d.deptno where t.sal>=2000;
EMPNO ENAME SAL DNAME
---------- ---------- ---------- --------------
7902 FORD 3000 RESEARCH
8888 ROBINSON 2000
9999 JACKSON 2500
-->查看执行计划
SQL> set autotrace traceonly exp;
-->语句A(过滤条件位于on 子句中的情形)的执行计划
SQL> select empno,ename,sal,dname from t left join dept d
2 (on t.deptno=d.deptno and t.sal>=2000);
Execution Plan
----------------------------------------------------------
Plan hash value: 2195752858
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 120 | 6 (0)| 00:00:01 |
| 1 | NESTED LOOPS OUTER | | 5 | 120 | 6 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL | T | 5 | 70 | 3 (0)| 00:00:01 |
| 3 | VIEW | | 1 | 10 | 1 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 13 | 1 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("T"."DEPTNO"="D"."DEPTNO") -->重点关注这里的谓词信息,两个过滤条件合在一起
filter("T"."DEPTNO" IS NOT NULL AND "T"."SAL">=2000) -->从执行计划来看位于第5步为INDEX UNIQUE SCAN
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
11 consistent gets -->此时的逻辑读为11
0 physical reads
0 redo size
696 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
5 rows processed
-->语句B(将谓词信息置于到where子句中的情形)的执行计划
SQL> select empno,ename,sal,dname from t left join dept d
2 on t.deptno=d.deptno where t.sal>=2000;
Execution Plan
----------------------------------------------------------
Plan hash value: 832694258
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 81 | 4 (0)| 00:00:01 |
| 1 | NESTED LOOPS OUTER | | 3 | 81 | 4 (0)| 00:00:01 |
|* 2 | TABLE ACCESS FULL | T | 3 | 42 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 13 | 1 (0)| 00:00:01 |
|* 4 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("T"."SAL">=2000) -->此时的谓词信息分为两部分,"T"."SAL">=2000位于第二步
4 - access("T"."DEPTNO"="D"."DEPTNO"( )) -->此条谓词信息用于实现表连接
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
10 consistent gets -->此时的逻辑读为10,由于2 - filter("T"."SAL">=2000)过滤后,内部循环少执行了一次
0 physical reads
0 redo size
658 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed
/**************************************************/
/* Author: Robinson Cheng */
/* Blog: http://blog.csdn.net/robinson_0612 */
/* MSN: robinson_0612@hotmail.com */
/* QQ: 645746311 */
/**************************************************/
-->从上面的观察中发现上述两条SQL语句执行计划并非最佳,存在改良的余地
-->由于是nested loops outer,因此考虑在表t的谓词列增加索引以快速过滤记录
SQL> create index i_t_sal on t(sal);
SQL> exec dbms_stats.gather_table_stats('SCOTT','T',cascade=>true);
-->增加索引后两个语句的执行情况
-->语句A的执行计划以及统计信息没有发生任何变化
SQL> select empno,ename,sal,dname from t left join dept d
2 on (t.deptno=d.deptno and t.sal>=2000);
Execution Plan
----------------------------------------------------------
Plan hash value: 2195752858
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 140 | 6 (0)| 00:00:01 |
| 1 | NESTED LOOPS OUTER | | 5 | 140 | 6 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL | T | 5 | 90 | 3 (0)| 00:00:01 |
| 3 | VIEW | | 1 | 10 | 1 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 13 | 1 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("T"."DEPTNO"="D"."DEPTNO")
filter("T"."DEPTNO" IS NOT NULL AND "T"."SAL">=2000)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
11 consistent gets
0 physical reads
0 redo size
696 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
5 rows processed
-->语句B的执行计划发生变化,原来的全表扫描变为索引扫描
SQL> select empno,ename,sal,dname from t left join dept d
2 on t.deptno=d.deptno where t.sal>=2000;
Execution Plan
----------------------------------------------------------
Plan hash value: 2452308905
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 93 | 3 (0)| 00:00:01 |
| 1 | NESTED LOOPS OUTER | | 3 | 93 | 3 (0)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID| T | 3 | 54 | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | I_T_SAL | 3 | | 1 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 13 | 1 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("T"."SAL">=2000)
5 - access("T"."DEPTNO"="D"."DEPTNO"( ))
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
6 consistent gets -->逻辑读也由10下降到6
0 physical reads
0 redo size
658 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed
二、总结 1、尽可能避免SQL不良写法导致的不良后果 2、此例中由于将谓词信息放到ON子句中,在数据量庞大的表(百万行)连接中,则该写法导致过多的物理和逻辑I/O,使得中间结果集庞大 3、谓词信息放到ON子句中同时也导致索引失效 4、尽可能的在满足需求的情况下减小中间结果集