基于oradebug poke分析不同redo相关latch获取的先后次序
结论
1,测试环境10.2.0.5
2,可见写redo时,先要获取redo copy latch,然后才是redo writing latch
3,如果是REDO相关的3个LATCH,先是获取redo copy latch,接着才是redo allocation latch 最后是redo writing latch,
4, 上述3个redo latch到底是在server process---->log buffer--->redo log file,哪个阶段获取,将在下文进行分析
5,掌握了重要的分析思路,即:
其实通过上述的可以把所有的REDO相关的LATCH全部用ORADEBUG POKE HANG住,然后基于v$latch以及v$latch_children的immediate_misses进行对比分析,进而依次释放首个获取的LATCH,接着重复前面的动作,即可以把所有相关的REDO LATCH获取的先后次序
a,先查询所有要测试LATCH的immediate_gets相关信息
SQL> select latch#,name,gets,misses,sleeps,immediate_gets,immediate_misses from v$latch where latch#=148;
LATCH# NAME GETS MISSES SLEEPS IMMEDIATE_GETS IMMEDIATE_MISSES
---------- ------------------------------ ---------- ---------- ---------- -------------- ----------------
148 redo writing 35809 0 0 0 0
SQL> select latch#,name,child#,gets,misses,sleeps,immediate_gets,immediate_misses from v$latch_children where latch#=149 and child#=1;
LATCH# NAME CHILD# GETS MISSES SLEEPS IMMEDIATE_GETS IMMEDIATE_MISSES
---------- ------------------------------ ---------- ---------- ---------- ---------- -------------- ----------------
149 redo copy 1 5 0 0 10690 10
B,ORADEBUG POKE,HANG 测试相关REDO LATCH
C,再次查询a步的脚本,进行对比,看哪些LATCH的immediate_misses变化,首先获取的latch就是哪个
D,释放由C步获取的首个LATCH
E,再次重复上述A到D的过程,即可以把获取相关LATCH的先后次序分析出来
扩展问题
1,上述的3个redo latch, 到底具体的含义是什么,如何反证它们的业务含义,这是最有价值的地方
2,发现自己对于oracle核心后台进程ckpt,lgwr,dbwr相关交互机制理解还是不到位,还要学习官方手册
强化测试与总结
测试
SQL> select * from v$version where rownum=1;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bi
SQL> select latch#,name from v$latch where lower(name) like '%redo%';
LATCH# NAME
---------- --------------------------------------------------
116 ping redo on-disk SCN
148 redo writing
149 redo copy
372 KFR redo allocation latch
115 redo on-disk SCN
150 redo allocation
6 rows selected.
SQL> select latch#,name,child#,addr from v$latch_children where latch#=149;
LATCH# NAME CHILD# ADDR
---------- ------------------------------ ---------- ----------------
149 redo copy 1 0000000069094BC0
149 redo copy 2 0000000069094C88
SQL> select latch#,name,child#,addr from v$latch_children where latch#=150;
LATCH# NAME CHILD# ADDR
---------- ------------------------------ ---------- ----------------
150 redo allocation 1 00000000693C70F0
150 redo allocation 2 00000000693C7190
150 redo allocation 3 00000000693C7230
150 redo allocation 4 00000000693C72D0
150 redo allocation 5 00000000693C7370
150 redo allocation 6 00000000693C7410
150 redo allocation 7 00000000693C74B0
150 redo allocation 8 00000000693C7550
150 redo allocation 9 00000000693C75F0
150 redo allocation 10 00000000693C7690
150 redo allocation 11 00000000693C7730
11 rows selected.
SQL> select latch#,name,child#,addr from v$latch_children where latch#=148;
no rows selected
SQL> select latch#,name,addr from v$latch where latch#=148;
LATCH# NAME ADDR
---------- ------------------------------ ----------------
148 redo writing 0000000060017DB0
SQL> show user
USER is "SCOTT"
SQL> create table t_redo(a int);
Table created.
SQL> insert into t_redo values(1);
1 row created.
SQL> commit;
Commit complete.
SQL> select * from t_redo;
A
----------
1
--hang redo writing latch
SQL> oradebug setmypid
Statement processed.
SQL> oradebug poke 0x0000000060017DB0 4 1
BEFORE: [060017DB0, 060017DB4) = 00000000
AFTER: [060017DB0, 060017DB4) = 00000001
SQL>
--hang redo copy child latch 1, 0000000069094BC0
SQL> oradebug poke 0x0000000069094BC0 4 1
BEFORE: [069094BC0, 069094BC4) = 00000000
AFTER: [069094BC0, 069094BC4) = 00000001
---可见SQL查询不用持redo相关的latch,当然,我这里只是一种分析思路,请大家不要局限于此
SQL> select * from t_redo;
A
----------
1
SQL> select latch#,name,child#,gets,misses,sleeps,immediate_gets,immediate_misses from v$latch_children where latch#=149 and child#=1;
LATCH# NAME CHILD# GETS MISSES SLEEPS IMMEDIATE_GETS IMMEDIATE_MISSES
---------- ------------------------------ ---------- ---------- ---------- ---------- -------------- ----------------
149 redo copy 1 5 0 0 10690 10
SQL> select latch#,name,gets,misses,sleeps,immediate_gets,immediate_misses from v$latch where latch#=148;
LATCH# NAME GETS MISSES SLEEPS IMMEDIATE_GETS IMMEDIATE_MISSES
---------- ------------------------------ ---------- ---------- ---------- -------------- ----------------
148 redo writing 35809 0 0 0 0
SQL> insert into t_redo values(1);
1 row created.
SQL> select latch#,name,gets,misses,sleeps,immediate_gets,immediate_misses from v$latch where latch#=148;
LATCH# NAME GETS MISSES SLEEPS IMMEDIATE_GETS IMMEDIATE_MISSES
---------- ------------------------------ ---------- ---------- ---------- -------------- ----------------
148 redo writing 35809 0 0 0 0
SQL> select latch#,name,child#,gets,misses,sleeps,immediate_gets,immediate_misses from v$latch_children where latch#=149 and child#=1;
LATCH# NAME CHILD# GETS MISSES SLEEPS IMMEDIATE_GETS IMMEDIATE_MISSES
---------- ------------------------------ ---------- ---------- ---------- ---------- -------------- ----------------
149 redo copy 1 5 0 0 10690 10
---提交上述的DML HANG住
SQL> commit;
---可见写redo时,先要获取redo copy latch,然后才是redo writing latch
SQL> select latch#,name,child#,gets,misses,sleeps,immediate_gets,immediate_misses from v$latch_children where latch#=149 and child#=1;
LATCH# NAME CHILD# GETS MISSES SLEEPS IMMEDIATE_GETS IMMEDIATE_MISSES
---------- ------------------------------ ---------- ---------- ---------- ---------- -------------- ----------------
149 redo copy 1 5 0 0 10690 11
SQL> select latch#,name,gets,misses,sleeps,immediate_gets,immediate_misses from v$latch where latch#=148;
LATCH# NAME GETS MISSES SLEEPS IMMEDIATE_GETS IMMEDIATE_MISSES
---------- ------------------------------ ---------- ---------- ---------- -------------- ----------------
148 redo writing 35809 0 0 0 0
SQL>
---释放redo copy latch latch 1,上述的commit仍hang住
SQL> oradebug poke 0x0000000069094BC0 4 0
BEFORE: [069094BC0, 069094BC4) = 000000FF
AFTER: [069094BC0, 069094BC4) = 00000000
--hang redo allocation latch 1
SQL> oradebug poke 0x00000000693C70F0 4 1
BEFORE: [0693C70F0, 0693C70F4) = 00000000
AFTER: [0693C70F0, 0693C70F4) = 00000001
---释放redo writing latch
SQL> oradebug poke 0x0000000060017DB0 4 0
BEFORE: [060017DB0, 060017DB4) = 000000FF
AFTER: [060017DB0, 060017DB4) = 00000000
--commit还是hang住,可见先是获取redo copy latch,然后是redo writing latch,最后才是redo allocation latch
SQL> commit;
SQL> select latch#,name,child#,gets,misses,sleeps,immediate_gets,immediate_misses from v$latch_children where latch#=150 and child#=1;
LATCH# NAME CHILD# GETS MISSES SLEEPS IMMEDIATE_GETS IMMEDIATE_MISSES
---------- ------------------------------ ---------- ---------- ---------- ---------- -------------- ----------------
150 redo allocation 1 11867 4 3 10823 1
----释放 redo allocation latch 1,上述COMMIT提成完成
SQL> oradebug poke 0x00000000693C70F0 4 0
BEFORE: [0693C70F0, 0693C70F4) = 000000FF
AFTER: [0693C70F0, 0693C70F4) = 00000000
SQL> commit;
Commit complete.
---其实通过上述的可以把所有的REDO相关的LATCH全部用ORADEBUG POKE HANG住,然后基于v$latch以及v$latch_children进行对比分析,进而依次释放首个获取的LATCH,接着重复前面的动作,即可以把所有相关的REDO LATCH获取的先后次序
分析出来
---有个问题,上述的哪个LATCH是用于LGWR FLUSH LOG BUFFER到REDO LOG FILE的呢?
SQL> alter system checkpoint;
System altered.
--hang redo writing latch
SQL> oradebug poke 0x0000000060017DB0 4 1
BEFORE: [060017DB0, 060017DB4) = 00000000
AFTER: [060017DB0, 060017DB4) = 00000001
---checkpoint hang,可见是redo writing latch是用于LGWR FLUSH LOG BUFFER到REDO LOG FILE的
SQL> alter system checkpoint;
SQL> oradebug poke 0x0000000060017DB0 4 0
BEFORE: [060017DB0, 060017DB4) = 000000FF
AFTER: [060017DB0, 060017DB4) = 00000000
SQL> alter system checkpoint;
System altered.
为了全面,再看下redo copy及redo allocation latch
---可见redo copy不是用于LGWR FLUSH LOG BUFFER到REDO LOG FILE的,反过来说它用于服务器进程到log buffer之间
SQL> oradebug poke 0x0000000069094BC0 4 1
BEFORE: [069094BC0, 069094BC4) = 00000000
AFTER: [069094BC0, 069094BC4) = 00000001
SQL> alter system checkpoint;
System altered.
SQL> oradebug poke 0x0000000069094BC0 4 0
BEFORE: [069094BC0, 069094BC4) = 000000FF
AFTER: [069094BC0, 069094BC4) = 00000000
---可见redo allocation latch用于LGWR FLUSH LOG BUFFER到REDO LOG FILE
SQL> oradebug poke 0x00000000693C70F0 4 1
BEFORE: [0693C70F0, 0693C70F4) = 00000000
AFTER: [0693C70F0, 0693C70F4) = 00000001
--检查点HANG
SQL> alter system checkpoint;
SQL> oradebug poke 0x00000000693C70F0 4 0
BEFORE: [0693C70F0, 0693C70F4) = 000000FF
AFTER: [0693C70F0, 0693C70F4) = 00000000
SQL> alter system checkpoint;
System altered.
--上述发现redo allocation latch及redo writing latch自己的理解还是有些问题,到底它们的先后次序是什么呢,重复用上述思路分析下
---POKE前
SQL> select latch#,name,gets,misses,sleeps,immediate_gets,immediate_misses from v$latch where latch#=148;
LATCH# NAME GETS MISSES SLEEPS IMMEDIATE_GETS IMMEDIATE_MISSES
---------- ------------------------------ ---------- ---------- ---------- -------------- ----------------
148 redo writing 36771 8 8 0 0
SQL> select latch#,name,child#,gets,misses,sleeps,immediate_gets,immediate_misses from v$latch_children where latch#=150 and child#=1;
LATCH# NAME CHILD# GETS MISSES SLEEPS IMMEDIATE_GETS IMMEDIATE_MISSES
---------- ------------------------------ ---------- ---------- ---------- ---------- -------------- ----------------
150 redo allocation 1 12198 9 11 11105 5
---POKE
---REDO ALLOCATION LATCH
SQL> oradebug poke 0x00000000693C70F0 4 1
BEFORE: [0693C70F0, 0693C70F4) = 00000000
AFTER: [0693C70F0, 0693C70F4) = 00000001
--REDO WRITING LATCH
SQL> oradebug poke 0x0000000060017DB0 4 1
BEFORE: [060017DB0, 060017DB4) = 00000000
AFTER: [060017DB0, 060017DB4) = 00000001
SQL> insert into t_redo values(3);
1 row created.
--hang commit
SQL> commit;
--poke后
----可见先是获取redo allocation ,然后是redo writing(注:我们主要看immediate_misses列的变化,另外:发现gets也有变化,这个列的含义还要进一步测试与理解)
SQL> select latch#,name,gets,misses,sleeps,immediate_gets,immediate_misses from v$latch where latch#=148;
LATCH# NAME GETS MISSES SLEEPS IMMEDIATE_GETS IMMEDIATE_MISSES
---------- ------------------------------ ---------- ---------- ---------- -------------- ----------------
148 redo writing 36876 8 8 0 0
SQL>
SQL>
SQL> select latch#,name,child#,gets,misses,sleeps,immediate_gets,immediate_misses from v$latch_children where latch#=150 and child#=1;
LATCH# NAME CHILD# GETS MISSES SLEEPS IMMEDIATE_GETS IMMEDIATE_MISSES
---------- ------------------------------ ---------- ---------- ---------- ---------- -------------- ----------------
150 redo allocation 1 12221 9 11 11105 7
---释放REDO ALLOCATION LATCH
SQL> oradebug poke 0x00000000693C70F0 4 0
BEFORE: [0693C70F0, 0693C70F4) = 000000FF
AFTER: [0693C70F0, 0693C70F4) = 00000000
---可见释放REDO ALLOCATION LATCH,其v$latch_children的列gets有增加,且misses有增加,immediate_gets也有增加(这里变化之间的含义是什么,还要好好思考)
SQL> select latch#,name,child#,gets,misses,sleeps,immediate_gets,immediate_misses from v$latch_children where latch#=150 and child#=1;
LATCH# NAME CHILD# GETS MISSES SLEEPS IMMEDIATE_GETS IMMEDIATE_MISSES
---------- ------------------------------ ---------- ---------- ---------- ---------- -------------- ----------------
150 redo allocation 1 12224 11 13 11114 7
---而后者redo writing latch没有变化,也只是说只要自己的LATCH由获取不到变化为可以获取的信息,它的信息才会变化
SQL> select latch#,name,gets,misses,sleeps,immediate_gets,immediate_misses from v$latch where latch#=148;
LATCH# NAME GETS MISSES SLEEPS IMMEDIATE_GETS IMMEDIATE_MISSES
---------- ------------------------------ ---------- ---------- ---------- -------------- ----------------
148 redo writing 36876 8 8 0 0
---释放REDO writing latch,同上我的分析,它对应的列也发生了变化
SQL> oradebug poke 0x0000000060017DB0 4 0
BEFORE: [060017DB0, 060017DB4) = 000000FF
AFTER: [060017DB0, 060017DB4) = 00000000
SQL>
SQL>
SQL> select latch#,name,gets,misses,sleeps,immediate_gets,immediate_misses from v$latch where latch#=148;
LATCH# NAME GETS MISSES SLEEPS IMMEDIATE_GETS IMMEDIATE_MISSES
---------- ------------------------------ ---------- ---------- ---------- -------------- ----------------
148 redo writing 36888 11 11 0 0
正文到此结束