客户的一个数据库mytest出现非常缓慢,cpu使用相当高,数据库wait event显示大量read by other session、latch: cache buffers chains和read by other session事件
查询等待事件,有几个比较主要的等待事件如下:
SQL> select sql_id,event,count(1) counts from v$session group by sql_id,event order by 3 desc;
SQL_ID EVENT COUNTS
------------- -------------------------- ------
353205q5fk492 read by other session 25
353205q5fk492 latch: cache buffers chains 16
353205q5fk492 db file sequential read 14
问题出在353205q5fk492语句上。仔细查看这条语句,有两个执行计划。
查看数据库sql使用情况,发现采用了一条效率并不高的执行计划,如下:
Plan hash value: 3282783069
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%C
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 1154 (1
| 1 | COUNT STOPKEY | | | |
| 2 | INLIST ITERATOR | | | |
| 3 | TABLE ACCESS BY INDEX ROWID| xxxxxxxxx | 1 | 537 | 1153
| 4 | INDEX RANGE SCAN | xxxxxxxxx~id1 | 27479 | | 14
--------------------------------------------------------------------------------
查看语句的情况如下,当前采用的是这一条执行计划。
SQL> select address,hash_value from v$sqlarea where sql_id='353205q5fk492';
ADDRESS HASH_VALUE
---------------- ----------
000000235FFBE9B0 3282783069
所以为了紧急处理这个问题,我们采用如下方法进行处理
1、首先把这个执行计划踢出shared_pool
SQL> exec dbms_shared_pool.purge('000000235FFBE9B0,3282783069','C')
2、然后征求客户意见后,采用这个执行计划的语句进行查杀。
3、针对这条语句中where条件进行了重新统计分析
经过上面三步处理后,迫使该sql重新解析,解析为好的执行计划,然后使用spm成功固定其执行计划,固定的方法如下:
SQL> var plans_loaded number
SQL> exec :plans_loaded:=dbms_spm.load_plans_from_cursor_cache(sql_id=>'353205q5fk492',plan_hash_value=>3566569303)
最后后检查新最后采取的执行计划
SQL> select sql_handle,plan_name,origin,enabled,accepted,sql_text from dba_sql_plan_baselines
SQL_HANDLE PLAN_NAME ORIGIN ENA ACC SQL_TEXT
------------------------------ ------------------------------ -------------- --- --- ------------------------------------
SQL_e96f1f0834698c71 SQL_PLAN_fkvsz10u6m33j21b3b9d1 MANUAL-LOAD YES YES SELECT xxxxxxxxx
Plan hash value: 3566569303
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 1 (100)| |
| 1 | COUNT STOPKEY | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| xxxxxxxxx | 1 | 537 | 1 (0)| 00:00:01 |
| 3 | INDEX RANGE SCAN | xxxxxxxxx~id2 | 1 | | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
最后检查,发现已经使用绑定的执行计了
SQL> select hash_value, sql_id,child_number,plan_hash_value,executions,round(buffer_gets/executions,0) buffergets_per,SQL_PLAN_BASELINE ,last_load_time,last_active_time from v$sql where sql_id='353205q5fk492';
HASH_VALUE SQL_ID CHILD_NUMBER PLAN_HASH_VALUE EXECUTIONS BUFFERGETS_PER SQL_PLAN_BASELINE LAST_LOAD_TIME LAST_ACTIVE_TIME
---------- ------------- ------------ --------------- ---------- -------------- ------------------------------ -------------------------------------- -------------------
3566569303 353205q5fk492 12 1263503027 284 573 SQL_PLAN_fkvsz10u6m33j21b3b9d1 2015-03-06/17:21:20 2015-03-06/17:21:20 --表明用上spm固定了好的执行计划。
通过这样处理后,发现效率提升非常明显,cpu空闲率直接上升到80%以上了。