记一条sql优化2
Oracle里的统计信息是非常重要的,统计信息的不正确会使CBO走错误的执行计划,下面就是个例子
Plan hash value: 209305330
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 492 |00:04:44.30 | 693K| 11795 | 2970 | | | |
| 1 | NESTED LOOPS OUTER | | 1 | 76 | 492 |00:04:44.30 | 693K| 11795 | 2970 | | | |
|* 2 | FILTER | | 1 | | 491 |00:04:44.14 | 691K| 11795 | 2970 | | | |
| 3 | NESTED LOOPS OUTER | | 1 | 76 | 301K|00:04:47.24 | 701K| 11911 | 2970 | | | |
| 4 | VIEW | | 1 | 7633 | 168K|00:00:07.29 | 15363 | 3117 | 2970 | | | |
| 5 | HASH GROUP BY | | 1 | 7633 | 168K|00:00:07.19 | 15363 | 3117 | 2970 | 1484K| 1032K| |
| 6 | TABLE ACCESS BY INDEX ROWID| SDK_START | 1 | 7633 | 347K|00:00:03.79 | 15363 | 147 | 0 | | | |
|* 7 | INDEX RANGE SCAN | IDX_SDK_START_DT | 1 | 7732 | 347K|00:00:00.76 | 1736 | 21 | 0 | | | |
|* 8 | TABLE ACCESS BY INDEX ROWID | H1_ACTIVE_DEV | 168K| 4889K| 300K|00:04:39.77 | 685K| 8794 | 0 | | | |
|* 9 | INDEX RANGE SCAN | IDX_H1_ACTIVE_DEV_DEVICEID | 168K| 1 | 316K|00:04:37.67 | 371K| 8794 | 0 | | | |
|* 10 | TABLE ACCESS BY INDEX ROWID | SDK_START | 491 | 1 | 489 |00:00:00.16 | 2454 | 0 | 0 | | | |
|* 11 | INDEX RANGE SCAN | IDX_SDK_START_SDA | 491 | 1 | 489 |00:00:00.15 | 2007 | 0 | 0 | | | |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("DEVICEID" IS NULL)
7 - access("DT"='2016-01-04')
8 - filter("R1"."APPKEY"="APPKEY")
9 - access("R1"."DEVICEID"="DEVICEID")
10 - filter("DT"='2016-01-04')
11 - access("R1"."SERVERTIME"="SERVERTIME" AND "R1"."DEVICEID"="DEVICEID" AND "R1"."APPKEY"="APPKEY")
以上我们可以看到id为3的操作使用NESTED LOOPS OUTER方式与id为9的表进行了168K次的循环,这里其实是导致这个sql慢的地方。为什么这条sql选择走NESTED LOOPS OUTER?是因为错误的统计信息。
收集统计信息后,CBO选择走以下的执行计划,执行时间一下变为6s了
Plan hash value: 3152507629
----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.56 | 11343 | | | |
|* 1 | HASH JOIN OUTER | | 1 | 341 | 0 |00:00:00.56 | 11343 | 702K| 702K| 158K (0)|
|* 2 | FILTER | | 1 | | 0 |00:00:00.56 | 11343 | | | |
|* 3 | HASH JOIN OUTER | | 1 | 341 | 42490 |00:00:00.56 | 11343 | 3527K| 1086K| 4845K (0)|
| 4 | VIEW | | 1 | 34053 | 34111 |00:00:00.15 | 1893 | | | |
| 5 | HASH GROUP BY | | 1 | 34053 | 34111 |00:00:00.14 | 1893 | 3526K| 1046K| 5000K (0)|
| 6 | PARTITION RANGE SINGLE| | 1 | 88721 | 88729 |00:00:00.05 | 1893 | | | |
|* 7 | TABLE ACCESS FULL | SDK_START | 1 | 88721 | 88729 |00:00:00.03 | 1893 | | | |
| 8 | TABLE ACCESS FULL | H1_ACTIVE_DEV | 1 | 552K| 552K|00:00:00.10 | 9450 | | | |
| 9 | PARTITION RANGE SINGLE | | 0 | 88721 | 0 |00:00:00.01 | 0 | | | |
|* 10 | TABLE ACCESS FULL | SDK_START | 0 | 88721 | 0 |00:00:00.01 | 0 | | | |
----------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("R1"."SERVERTIME"="SERVERTIME" AND "R1"."DEVICEID"="DEVICEID" AND "R1"."APPKEY"="APPKEY")
2 - filter("DEVICEID" IS NULL)
3 - access("R1"."APPKEY"="APPKEY" AND "R1"."DEVICEID"="DEVICEID")
7 - filter("DT"='2015-12-30')
10 - filter("DT"='2015-12-30')
Note
-----
- cardinality feedback used for this statement
其实解决这个问题有两个方法
1.收集统计信息
2.使用sql profile 固定走 HASH JOIN OUTER的执行计划
使用哪一种方法,需要来平衡,大表收集统计信息可能会比较慢,浪费系统资源,使用sql profile可能导致后面执行环境改变,出问题了。这些需要熟悉业务的人来平衡。
正文到此结束