今天qq上一好友发过来一个sql让我帮他看看,说这个sql加上一个条件查询时间在3~4秒左右,而不加上这个条件则非常快,正好这段时间也在学习优化,就看看问题所在。
sql语句大致如下:
SQL> SELECT a.mc_id AS company_id,
a.mc_name AS company_name,
b.area_name AS district_name,
c.code_name AS operating_quality,
a.agent_tel AS mobile_phone,
a.od_lng,
a.od_lat
FROM tp_company a
LEFT JOIN ads_area b
ON a.district = b.area_id
LEFT JOIN (SELECT b.code, b.code_name
FROM md_cat a, md_dict b
WHERE a.cat_id = b.cat_id
AND a.cat_code = 'BUSINESS_GRADE') c
ON a.operating_quality = c.code
WHERE EXISTS (SELECT 1
FROM tp_company_role m
WHERE a.mc_id = m.mc_id
AND m.com_role_type = 4)
AND a.district IN (330903, 330900, 330921, 330902, 330922)
ORDER BY a.mc_id DESC;
其中m.com_role_type=4根据好友说加上这个语句就变得慢了,由于他的环境上没有sqlplus,就直接使用下面方法查看执行计划(为了保密,SQL已处理过)
SQL> explain plan for
SELECT a.mc_id AS company_id,
a.mc_name AS company_name,
b.area_name AS district_name,
c.code_name AS operating_quality,
a.agent_tel AS mobile_phone,
a.od_lng,
a.od_lat
FROM tp_company a
LEFT JOIN ads_area b
ON a.district = b.area_id
LEFT JOIN (SELECT b.code, b.code_name
FROM md_cat a, md_dict b
WHERE a.cat_id = b.cat_id
AND a.cat_code = 'BUSINESS_GRADE') c
ON a.operating_quality = c.code
WHERE EXISTS (SELECT 1
FROM tp_company_role m
WHERE a.mc_id = m.mc_id
AND m.com_role_type = 4)
AND a.district IN (330903, 330900, 330921, 330902, 330922)
ORDER BY a.mc_id DESC;
Explained
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 2353373994
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempS
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 40749 | 5690K|
| 1 | SORT ORDER BY | | 40749 | 5690K| 616
|* 2 | HASH JOIN RIGHT OUTER | | 40749 | 5690K|
| 3 | VIEW | | 42 | 2352 |
| 4 | NESTED LOOPS | | 42 | 1890 |
| 5 | TABLE ACCESS BY INDEX ROWID| MD_CAT | 1 | 21 |
|* 6 | INDEX UNIQUE SCAN | IDX_MD_CAT | 1 | |
|* 7 | TABLE ACCESS FULL | MD_DICT | 42 | 1008 |
|* 8 | HASH JOIN RIGHT OUTER | | 40749 | 3462K|
| 9 | TABLE ACCESS FULL | ADS_AREA | 8 | 120 |
| 10 | NESTED LOOPS | | 40749 | 2865K|
| 11 | NESTED LOOPS | | 40749 | 2865K|
| 12 | SORT UNIQUE | | 1995 | 13965 |
|* 13 | TABLE ACCESS FULL | TP_COMPANY_ROLE | 1995 | 13965 |
|* 14 | INDEX UNIQUE SCAN | PK_TP_COMPANY | 1 | |
|* 15 | TABLE ACCESS BY INDEX ROWID| TP_COMPANY | 20 | 1300 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("A"."OPERATING_QUALITY"="C"."CODE"(+))
6 - access("A"."CAT_CODE"='BUSINESS_GRADE')
7 - filter("A"."CAT_ID"="B"."CAT_ID")
8 - access("B"."AREA_ID"(+)=TO_NUMBER("A"."DISTRICT"))
13 - filter(TO_NUMBER("M"."COM_ROLE_TYPE")=4)
14 - access("A"."MC_ID"="M"."MC_ID")
15 - filter(TO_NUMBER("A"."DISTRICT")=330900 OR TO_NUMBER("A"."DISTRICT")=3309
TO_NUMBER("A"."DISTRICT")=330903 OR TO_NUMBER("A"."DISTRICT")=3309
TO_NUMBER("A"."DISTRICT")=330922)
从执行计划中的id=13可以看出,TP_COMPANY_ROLE根据COM_ROLE_TYPE=4这个条件返回1995行,然后与TP_COMPANY返回的20行数据进行NESTED LOOPS。凡是这种慢的SQL遇上NESTED LOOPS就要非常小心了,这个时候我让好友去查了下这个TP_COMPANY_ROLE根据COM_ROLE_TYPE=4这个条件实际返回了多少行,得到的结果是返回大概49万行,到此为止可以断定是由于统计信息过旧导致CBO计算返回的行数少从而本该进行HASH JOIN的却进行了大量的NESTED LOOPS。
重新收集统计信息
BEGIN
dbms_stats.gather_table_stats(ownname => 'TPSM',
tabname => 'TP_COMPANY_ROLE',
estimate_percent => 100,
method_opt => 'for all columns size auto',
no_invalidate => FALSE,
degree => 1,
cascade => TRUE);
END;
/
并根据SQL创建了,MC_ID和COM_ROLE_TYPE的组合索引,再次执行语句的时候,语句从原来的3~4秒已经提升到了0.3秒左右,查看修改过后的执行计划
SQL> explain plan for
SELECT a.mc_id AS company_id,
a.mc_name AS company_name,
b.area_name AS district_name,
c.code_name AS operating_quality,
a.agent_tel AS mobile_phone,
a.od_lng,
a.od_lat
FROM tp_company a
LEFT JOIN ads_area b
ON a.district = b.area_id
LEFT JOIN (SELECT b.code, b.code_name
FROM md_cat a, md_dict b
WHERE a.cat_id = b.cat_id
AND a.cat_code = 'BUSINESS_GRADE') c
ON a.operating_quality = c.code
WHERE EXISTS (SELECT 1
FROM tp_company_role m
WHERE a.mc_id = m.mc_id
AND m.com_role_type = 4)
AND a.district IN (330903, 330900, 330921, 330902, 330922)
ORDER BY a.mc_id DESC;
Explained
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 37531519
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Byte
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 290K| 4
|* 1 | HASH JOIN RIGHT OUTER | | 290K| 4
| 2 | VIEW | | 42 | 289
| 3 | NESTED LOOPS | | 42 | 189
| 4 | TABLE ACCESS BY INDEX ROWID| MD_CAT | 1 | 2
|* 5 | INDEX UNIQUE SCAN | IDX_MD_CAT | 1 |
|* 6 | TABLE ACCESS FULL | MD_DICT | 42 | 100
|* 7 | HASH JOIN RIGHT OUTER | | 290K| 2
| 8 | TABLE ACCESS FULL | ADS_AREA | 8 | 12
|* 9 | HASH JOIN RIGHT SEMI | | 290K| 2
|* 10 | INDEX FAST FULL SCAN | IDX_MC_ID_COM_ROLE_TYPE | 490K| 622
|* 11 | TABLE ACCESS FULL | TP_COMPANY | 290K| 1
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
1 - access("A"."OPERATING_QUALITY"="C"."CODE"(+))
5 - access("A"."CAT_CODE"='BUSINESS_GRADE')
6 - filter("A"."CAT_ID"="B"."CAT_ID")
7 - access("B"."AREA_ID"(+)=TO_NUMBER("A"."DISTRICT"))
9 - access("A"."MC_ID"="M"."MC_ID")
10 - filter(TO_NUMBER("M"."COM_ROLE_TYPE")=4)
11 - filter(TO_NUMBER("A"."DISTRICT")=330900 OR TO_NUMBER("A"."DISTRICT")=3309
TO_NUMBER("A"."DISTRICT")=330903 OR TO_NUMBER("A"."DISTRICT")=3309
30 rows selected
Id=9这里已经可以看出执行计划已经在走HASH JOIN而不是原来错误的NESTED LOOPS了。由于手头上还有其他事情,时间已经达到优化目的,就没有再去检查其他可以优化的地方了