Reposted

Understanding the lmd process in Oracle 10.2.0.1 RAC through the v$sysstat statistic ges messages sent (Part 3)

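Conclusions

1. Frequent DML on the nodes generates a large volume of GES messages between them.
2. ges messages sent can be used to gauge how busy a RAC node's DML activity or inter-node communication is; a low value suggests the RAC database is not busy.
3. The lmd process is responsible for sending GES messages to the remote RAC node.
4. Suspending lmd with oradebug suspend blocks insert operations on the remote RAC node, which shows that lmd is the process that manages global lock resources.
5. ges messages sent still increases slightly while lmd is hung.
6. By extension, the statistics in v$sysstat may essentially be measures of how the various RAC background processes are performing. They can be used to check whether a background process is healthy
   and, from there, to diagnose the performance of the RAC database.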


Test

SQL> select * from v$version where rownum=1;


BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bi


SQL> select statistic#,name,class from v$statname where lower(name) like '%ges%';


STATISTIC# NAME                                                                  CLASS
---------- ---------------------------------------------------------------- ----------
        22 messages sent                                                           128
        23 messages received                                                       128
        44 gcs messages sent                                                        32
        45 ges messages sent                                                        32  ---the statistic studied here






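Class 32 shows that ges messages sent is a cluster-level (Real Application Clusters) statistic. The Oracle reference describes the CLASS column as follows:
CLASS (NUMBER): A number representing one or more statistics classes. The following class numbers are additive: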
1 - User
2 - Redo
4 - Enqueue
8 - Cache
16 - OS
32 - Real Application Clusters
64 - SQL
128 - Debug
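
Because the class numbers are additive, a CLASS value behaves like a bitmask and can be decoded with a bitwise AND. The following is a minimal Python sketch of that decoding, using only the class list quoted above. The name decode_stat_class is illustrative and is not part of any Oracle tool.

```python
# Class bits as listed in the CLASS column description quoted above.
STAT_CLASSES = {
    1: "User",
    2: "Redo",
    4: "Enqueue",
    8: "Cache",
    16: "OS",
    32: "Real Application Clusters",
    64: "SQL",
    128: "Debug",
}


def decode_stat_class(value):
    """Return the name of every class bit that is set in a CLASS value."""
    return [name for bit, name in sorted(STAT_CLASSES.items()) if value & bit]


if __name__ == "__main__":
    print(decode_stat_class(32))   # CLASS of "ges messages sent"
    print(decode_stat_class(128))  # CLASS of "messages sent"
    print(decode_stat_class(40))   # a combined value: 8 + 32
```

A statistic that belongs to more than one class carries the sum of the class numbers, which is why the decoder tests each bit separately and does not compare the whole value.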


--node1


SQL> select v$statname.name,v$sysstat.value from v$sysstat,v$statname where v$sysstat.statistic#=v$statname.statistic# and v$statname.statistic# in (45);


NAME                                                                  VALUE
---------------------------------------------------------------- ----------
ges messages sent                                                     44374




SQL> create table t_ges(a int,b int);


Table created.


SQL> insert into t_ges select level,level from dual connect by level<=1000000;


1000000 rows created.


SQL> commit;


Commit complete.


After the large DML operation the statistic has risen noticeably (44374 to 44925, +551):
SQL> select v$statname.name,v$sysstat.value from v$sysstat,v$statname where v$sysstat.statistic#=v$statname.statistic# and v$statname.statistic# in (45);


NAME                                                                  VALUE
---------------------------------------------------------------- ----------
ges messages sent                                                     44925




Truncating the large table also raises the statistic, though by less than the INSERT did (44925 to 45104, +179):
SQL> truncate table t_ges;


Table truncated.


SQL> select v$statname.name,v$sysstat.value from v$sysstat,v$statname where v$sysstat.statistic#=v$statname.statistic# and v$statname.statistic# in (45);


NAME                                                                  VALUE
---------------------------------------------------------------- ----------
ges messages sent                                                     45104






--node2
SQL> create table t_ges2(a int,b int);


Table created.


DML run on the other node also raises the statistic as read on node 1 (45104 to 45373, +269):
SQL> insert into t_ges2 select level,level from dual connect by level<=1000000;


1000000 rows created.


SQL> commit;


Commit complete.


--node1


SQL> select v$statname.name,v$sysstat.value from v$sysstat,v$statname where v$sysstat.statistic#=v$statname.statistic# and v$statname.statistic# in (45);


NAME                                                                  VALUE
---------------------------------------------------------------- ----------
ges messages sent                                                     45373




Does the statistic still change if lmd is suspended, and if so, how does it behave?
SQL> select addr,program,username,pid,spid from v$process where username='oracle' and pid=6;


ADDR             PROGRAM                                          USERNAME               PID SPID
---------------- ------------------------------------------------ --------------- ---------- ------------
0000000083A585C8 oracle@jingfa1 (LMD0)                            oracle                   6 15271


SQL> select v$statname.name,v$sysstat.value from v$sysstat,v$statname where v$sysstat.statistic#=v$statname.statistic# and v$statname.statistic# in (45);


NAME                                                                  VALUE
---------------------------------------------------------------- ----------
ges messages sent                                                     45430


SQL> oradebug setospid 15271
Oracle pid: 6, Unix process pid: 15271, image: oracle@jingfa1 (LMD0)
SQL> oradebug suspend
Statement processed.


With LMD hung, the statistic still increases, but only modestly (45430 to 45678, +248):
SQL> select v$statname.name,v$sysstat.value from v$sysstat,v$statname where v$sysstat.statistic#=v$statname.statistic# and v$statname.statistic# in (45);


NAME                                                                  VALUE
---------------------------------------------------------------- ----------
ges messages sent                                                     45678


SQL> oradebug resume
Statement processed.


SQL> select v$statname.name,v$sysstat.value from v$sysstat,v$statname where v$sysstat.statistic#=v$statname.statistic# and v$statname.statistic# in (45);


NAME                                                                  VALUE
---------------------------------------------------------------- ----------
ges messages sent                                                     45711
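
The values in v$sysstat are cumulative, so the readings above only become meaningful as differences between consecutive samples. The sketch below computes those differences from the numbers in the query output above. The labels are descriptive only, and the transcript does not record how much time passed between samples, so the results are raw increases, not rates.

```python
# Cumulative "ges messages sent" values read on node 1, copied from the
# query output above. The labels are descriptive only.
READINGS = [
    ("baseline", 44374),
    ("after 1,000,000-row insert on node 1", 44925),
    ("after truncate on node 1", 45104),
    ("after 1,000,000-row insert on node 2", 45373),
    ("before suspending LMD0", 45430),
    ("while LMD0 is suspended", 45678),
    ("after resuming LMD0", 45711),
]


def deltas(readings):
    """Return (label, increase since the previous reading) pairs."""
    return [
        (label, value - prev_value)
        for (_, prev_value), (label, value) in zip(readings, readings[1:])
    ]


if __name__ == "__main__":
    for label, delta in deltas(READINGS):
        print(f"{label}: +{delta}")
```

To compare activity levels fairly, each sample would also need a timestamp so that the increase can be divided by the elapsed seconds.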


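--If LMD does not recover within a time limit, IPC send timeouts occur and the surviving node then evicts the instance whose LMD is hung (instance 1) from the cluster. Alert log on node 2:
[oracle@jingfa2 bdump]$ tail -f alert_jingfa2.log 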
IPC Send timeout to 0.0 inc 8 for msg type 29 from opid 22
Wed Nov 11 03:29:07 2015
Communications reconfiguration: instance_number 1
Wed Nov 11 03:29:07 2015
Trace dumping is performing id=[cdmp_20151111032907]
Wed Nov 11 03:29:11 2015
IPC Send timeout detected.Sender: ospid 18114
Receiver: inst 1 binc 433078410 ospid 15271
Wed Nov 11 03:29:13 2015
IPC Send timeout to 0.0 inc 8 for msg type 12 from opid 18




Wed Nov 11 03:30:59 2015
Evicting instance 1 from cluster
Wed Nov 11 03:31:01 2015
Trace dumping is performing id=[cdmp_20151111033042]
Wed Nov 11 03:31:06 2015
Reconfiguration started (old inc 8, new inc 12)
List of nodes:
 1
 Global Resource Directory frozen
 * dead instance detected - domain 0 invalid = TRUE 
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
Wed Nov 11 03:31:06 2015
 LMS 0: 0 GCS shadows cancelled, 0 closed
 Set master node info 
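
To see how long node 2 waited before evicting instance 1, the timestamps in the alert log can be paired with the messages that follow them. The sketch below does this for a few lines copied from the excerpt above. The function names and the marker strings passed to them are illustrative choices. The result describes only this one log and should not be read as a documented Oracle timeout value.

```python
from datetime import datetime

# Lines copied from the node 2 alert log excerpt above: each timestamp
# line is followed by the message lines logged at that time.
ALERT_LOG = """\
Wed Nov 11 03:29:07 2015
Communications reconfiguration: instance_number 1
Wed Nov 11 03:29:11 2015
IPC Send timeout detected.Sender: ospid 18114
Receiver: inst 1 binc 433078410 ospid 15271
Wed Nov 11 03:30:59 2015
Evicting instance 1 from cluster
Wed Nov 11 03:31:06 2015
Reconfiguration started (old inc 8, new inc 12)
"""

TIMESTAMP_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_events(text):
    """Pair each message line with the most recent timestamp line."""
    events = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            current = datetime.strptime(line, TIMESTAMP_FORMAT)
        except ValueError:
            # Not a timestamp, so it is a message logged at `current`.
            if current is not None:
                events.append((current, line))
    return events


def seconds_between(events, first_marker, second_marker):
    """Seconds from the first event containing first_marker to the
    first event containing second_marker."""
    start = next(ts for ts, msg in events if first_marker in msg)
    end = next(ts for ts, msg in events if second_marker in msg)
    return (end - start).total_seconds()


if __name__ == "__main__":
    events = parse_events(ALERT_LOG)
    print(seconds_between(events, "Communications reconfiguration",
                          "Evicting instance"))
```

For this excerpt the gap between the communications reconfiguration message (03:29:07) and the eviction message (03:30:59) is 112 seconds, at the one-second resolution of the alert log.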
