sitemap

RSS地图

收藏本站

设为首页

Oracle研究中心

当前位置:Oracle研究中心 > 运维DBA >

【学习笔记】Oracle lgwr 在写redo 时需要获得哪些latch

时间:2016-12-08 10:05   来源:Oracle研究中心   作者:网络   点击:

天萃荷净 Oracle研究中心学习笔记:分享一篇关于Oracle数据库lgwr 在写redo 时需要获得哪些latch的文章。

本站文章除注明转载外,均为本站原创: 转载自love wife & love life —Roger 的Oracle技术博客
本文链接地址: 关于lgwr 进程的一点思考

今天参加ACOUG活动,eygle大师分享了一个案例,其中讲到进程的strace时,问我某个字符是神马意思?

研究Oracle的lgwr 在写redo 时需要获得哪些latch?

具体这些latch的持有和释放的顺序是怎么样的? 做了一会儿实验,没有搞明白,突然trace lgwr进程,发现了另外一点内容,彻底颠覆了我以前的认知。

SQL> !ps -ef|grep lgwr
ora10g   23584     1  0 08:36          00:00:00 ora_lgwr_roger
ora10g   23637 23572  0 08:37 pts/1    00:00:00 /bin/bash -c ps -ef|grep lgwr

使用strace -ftr -o a.log -p 23584 跟踪lgwr进程,然后另外开启一个窗口进行如下操作:
1000 rows deleted.

SQL> commit;
Commit complete.

SQL>  alter system switch logfile;
System altered.

SQL> !ps -ef|grep lgwr|grep -v grep
ora10g   23584     1  0 08:36          00:00:00 ora_lgwr_roger

然后我们再回头去看看strace 打出来的trace是什么样的。

23584      0.000056 gettimeofday({1345909238, 375806}, NULL) = 0
23584      0.000053 times(NULL)         = 432511470
23584      0.000071 gettimeofday({1345909238, 375933}, NULL) = 0
23584      0.000060 gettimeofday({1345909238, 375990}, NULL) = 0
23584      0.000059 semtimedop(622592, 0x1, 0, {4294901770, 0}) = -1 EAGAIN (Resource temporarily unavailable)
23584      3.001243 gettimeofday({1345909241, 377292}, NULL) = 0
.....省略部分信息
23584      0.000166 times(NULL)         = 432511770
23584      0.000032 gettimeofday({1345909241, 377821}, NULL) = 0
23584      0.000043 times(NULL)         = 432511770
23584      0.000031 pwrite64(20, "\1\"\0\0\16\0\0\0\226\0\0\0\20\200\303\265h\0\0\0\4\0\0"..., 921088, 7168) = 921088
23584      0.008318 times(NULL)         = 432511771
23584      0.000040 gettimeofday({1345909241, 386276}, NULL) = 0
23584      0.000072 times(NULL)         = 432511771
23584      0.000040 gettimeofday({1345909241, 386367}, NULL) = 0
23584      0.000036 getrusage(RUSAGE_SELF, {ru_utime={0, 13997}, ru_stime={0, 66989}, ...}) = 0
23584      0.000048 getrusage(RUSAGE_SELF, {ru_utime={0, 13997}, ru_stime={0, 66989}, ...}) = 0
23584      0.000046 gettimeofday({1345909241, 386496}, NULL) = 0
.....省略部分信息
23584      0.000126 gettimeofday({1345909242, 708436}, NULL) = 0
23584      0.000071 times(NULL)         = 432511903
23584      0.000074 times(NULL)         = 432511903
23584      0.000054 gettimeofday({1345909242, 708690}, NULL) = 0
23584      0.000134 times(NULL)         = 432511903
23584      0.000110 pwrite64(20, "\1\"\0\0\25\7\0\0\226\0\0\0\20\200j\351\214\0\0\0\5\235"..., 512, 928256) = 512
23584      0.000944 times(NULL)         = 432511904
23584      0.000062 gettimeofday({1345909242, 709896}, NULL) = 0
23584      0.000089 times(NULL)         = 432511904
23584      0.000121 gettimeofday({1345909242, 710096}, NULL) = 0
23584      0.000171 semctl(622592, 23, IPC_64|SETVAL, 0xbfffd4fc) = 0
23584      0.001596 gettimeofday({1345909242, 711864}, NULL) = 0
23584      0.000064 gettimeofday({1345909242, 711935}, NULL) = 0
23584      0.000068 gettimeofday({1345909242, 711992}, NULL) = 0
23584      0.000060 times(NULL)         = 432511904
23584      0.000068 gettimeofday({1345909242, 712122}, NULL) = 0
23584      0.000120 gettimeofday({1345909242, 712247}, NULL) = 0
23584      0.000072 semtimedop(622592, 0x1, 670000, {4294901770, 0}) = -1 EAGAIN (Resource temporarily unavailable)
23584      1.671260 gettimeofday({1345909244, 383580}, NULL) = 0
23584      0.000072 gettimeofday({1345909244, 383647}, NULL) = 0
......省略部分信息
23584      0.000462 gettimeofday({1345909247, 141720}, NULL) = 0
23584      0.000106 pread64(15, "\25\302\0\0\1\0\0\0\0\0\0\0\0\0\1\4\225\377\0\0\0\0\0\0"..., 16384, 16384) = 16384
23584      0.000135 gettimeofday({1345909247, 142041}, NULL) = 0
23584      0.000193 gettimeofday({1345909247, 142133}, NULL) = 0
23584      0.000139 pread64(16, "\25\302\0\0\1\0\0\0\0\0\0\0\0\0\1\4\225\377\0\0\0\0\0\0"..., 16384, 16384) = 16384
23584      0.000105 gettimeofday({1345909247, 142405}, NULL) = 0
23584      0.000122 gettimeofday({1345909247, 142503}, NULL) = 0
23584      0.000071 pread64(17, "\25\302\0\0\1\0\0\0\0\0\0\0\0\0\1\4\225\377\0\0\0\0\0\0"..., 16384, 16384) = 16384
23584      0.000093 gettimeofday({1345909247, 143127}, NULL) = 0
23584      0.000576 gettimeofday({1345909247, 143255}, NULL) = 0
23584      0.000087 pread64(15, "\25\302\0\0\20\0\0\0\227\31\0\0\377\377\1\4P\31\0\0\200"..., 16384, 262144) = 16384
23584      0.000237 gettimeofday({1345909247, 143563}, NULL) = 0
23584      0.000156 gettimeofday({1345909247, 143722}, NULL) = 0
23584      0.000464 pread64(15, "\25\302\0\0\22\0\0\0\227\31\0\0\377\377\1\4\vL\0\0\0\0"..., 16384, 294912) = 16384
23584      0.000408 gettimeofday({1345909247, 144677}, NULL) = 0
23584      0.000472 gettimeofday({1345909247, 145071}, NULL) = 0
23584      0.000095 pread64(15, "\25\302\0\0\24\0\0\0\226\31\0\0\377\377\1\4\243R\0\0\17"..., 16384, 327680) = 16384
23584      0.000112 gettimeofday({1345909247, 145282}, NULL) = 0
23584      0.000090 gettimeofday({1345909247, 145358}, NULL) = 0
23584      0.000082 pread64(15, "\25\302\0\0\25\0\0\0\227\31\0\0\377\377\1\4B<\0\0\0\220"..., 16384, 344064) = 16384
23584      0.000181 gettimeofday({1345909247, 145623}, NULL) = 0
23584      0.00008Oracleoracleplus.net3 gettimeofday({1345909247, 145705}, NULL) = 0
23584      0.000230 semctl(622592, 9, IPC_64|SETVAL, 0xbfffe10c) = 0
23584      0.000809 gettimeofday({1345909247, 146745}, NULL) = 0
23584      0.000069 times(NULL)         = 432512347
23584      0.000055 pwrite64(15, "\25\302\0\0\26\0\0\0\230\31\0\0\377\377\1\4\362\230\0\0"..., 16384, 360448) = 16384
23584      0.000852 times(NULL)         = 432512347
23584      0.000052 times(NULL)         = 432512347
23584      0.000053 pwrite64(16, "\25\302\0\0\26\0\0\0\230\31\0\0\377\377\1\4\362\230\0\0"..., 16384, 360448) = 16384
23584      0.000689 times(NULL)         = 432512348
23584      0.000057 times(NULL)         = 432512348
23584      0.000052 pwrite64(17, "\25\302\0\0\26\0\0\0\230\31\0\0\377\377\1\4\362\230\0\0"..., 16384, 360448) = 16384
23584      0.000612 times(NULL)         = 432512348
23584      0.000067 gettimeofday({1345909247, 149302}, NULL) = 0
23584      0.000077 gettimeofday({1345909247, 149402}, NULL) = 0
23584      0.000089 times(NULL)         = 432512348
23584      0.000088 pwrite64(15, "\25\302\0\0\21\0\0\0\230\31\0\0\377\377\1\4\10L\0\0\0\0"..., 16384, 278528) = 16384
23584      0.000565 times(NULL)         = 432512348
23584      0.000050 times(NULL)         = 432512348
23584      0.000052 pwrite64(16, "\25\302\0\0\21\0\0\0\230\31\0\0\377\377\1\4\10L\0\0\0\0"..., 16384, 278528) = 16384
23584      0.000562 times(NULL)         = 432512348
23584      0.000051 times(NULL)         = 432512348
23584      0.000051 pwrite64(17, "\25\302\0\0\21\0\0\0\230\31\0\0\377\377\1\4\10L\0\0\0\0"..., 16384, 278528) = 16384
23584      0.000616 times(NULL)         = 432512348
23584      0.000067 gettimeofday({1345909247, 151570}, NULL) = 0
23584      0.000073 gettimeofday({1345909247, 151643}, NULL) = 0
23584      0.000063 times(NULL)         = 432512348
23584      0.000052 pwrite64(15, "\25\302\0\0\17\0\0\0\230\31\0\0\377\377\1\4\317\33\0\0"..., 16384, 245760) = 16384
23584      0.000622 times(NULL)         = 432512348
23584      0.000067 times(NULL)         = 432512348
23584      0.000091 pwrite64(16, "\25\302\0\0\17\0\0\0\230\31\0\0\377\377\1\4\317\33\0\0"..., 16384, 245760) = 16384
23584      0.000641 times(NULL)         = 432512348
23584      0.000051 times(NULL)         = 432512348
23584      0.000062 pwrite64(17, "\25\302\0\0\17\0\0\0\230\31\0\0\377\377\1\4\317\33\0\0"..., 16384, 245760) = 16384
23584      0.000649 times(NULL)         = 432512348
23584      0.000127 gettimeofday({1345909247, 154107}, NULL) = 0
23584      0.000119 gettimeofday({1345909247, 154198}, NULL) = 0
23584      0.000085 gettimeofday({1345909247, 154271}, NULL) = 0
23584      0.000063 times(NULL)         = 432512348
23584      0.000054 pwrite64(15, "\25\302\0\0\1\0\0\0\0\0\0\0\0\0\1\4\300\377\0\0\0\0\0\0"..., 16384, 16384) = 16384
23584      0.001031 times(NULL)         = 432512348
23584      0.000083 times(NULL)         = 432512348
23584      0.000056 pwrite64(16, "\25\302\0\0\1\0\0\0\0\0\0\0\0\0\1\4\300\377\0\0\0\0\0\0"..., 16384, 16384) = 16384
23584      0.000639 times(NULL)         = 432512348
23584      0.000060 times(NULL)         = 432512348
23584      0.000051 pwrite64(17, "\25\302\0\0\1\0\0\0\0\0\0\0\0\0\1\4\300\377\0\0\0\0\0\0"..., 16384, 16384) = 16384
23584      0.000624 times(NULL)         = 432512348
23584      0.000088 gettimeofday({1345909247, 157021}, NULL) = 0
23584      0.000065 gettimeofday({1345909247, 157085}, NULL) = 0
23584      0.000065 pread64(15, "\25\302\0\0\1\0\0\0\0\0\0\0\0\0\1\4\300\377\0\0\0\0\0\0"..., 16384, 16384) = 16384
23584      0.000090 gettimeofday({1345909247, 157250}, NULL) = 0
23584      0.000077 gettimeofday({1345909247, 157317}, NULL) = 0
23584      0.000124 times(NULL)         = 432512348
23584      0.000092 times(NULL)         = 432512348
23584      0.000061 gettimeofday({1345909247, 157594}, NULL) = 0
23584      0.000119 semctl(622592, 23, IPC_64|SETVAL, 0xbfffcfb4) = 0
23584      0.000143 gettimeofday({1345909247, 157857}, NULL) = 0
23584      0.000065 gettimeofday({1345909247, 158008}, NULL) = 0
23584      0.000170 pread64(20, "\1\"\0\0\1\0\0\0\226\0\0\0\0\200\246\270\0\0\0\0\0\5 \n"..., 512, 512) = 512
23584      0.000067 gettimeofday({1345909247, 158151}, NULL) = 0
23584      0.000042 gettimeofday({1345909247, 158200}, NULL) = 0
23584      0.000046 pread64(15, "\25\302\0\0\36\0\0\0\223\31\0\0\377\377\1\4\236\177\0\0"..., 16384, 491520) = 16384
23584      0.000054 gettimeofday({1345909247, 158293}, NULL) = 0
23584      0.000047 gettimeofday({1345909247, 158340}, NULL) = 0
23584      0.000039 pwrite64(18, "\1\"\0\0\1\0\0\0\227\0\0\0\0\200\244\272\0\0\0\0\0\5 \n"..., 512, 512) = 512
23584      0.000504 gettimeofday({1345909247, 158885}, NULL) = 0
23584      0.000078 gettimeofday({1345909247, 158994}, NULL) = 0
23584      0.000084 pread64(20, "\1\"\0\0\1\0\0\0\226\0\0\0\0\200\246\270\0\0\0\0\0\5 \n"..., 512, 512) = 512
23584      0.000050 gettimeofday({1345909247, 159095}, NULL) = 0
23584      0.000041 gettimeofday({1345909247, 159137}, NULL) = 0
23584      0.000039 pwrite64(20, "\1\"\0\0\1\0\0\0\226\0\0\0\0\200u\210\0\0\0\0\0\5 \n)M"..., 512, 512) = 512
23584      0.000874 gettimeofday({1345909247, 160050}, NULL) = 0
23584      0.000052 gettimeofday({1345909247, 160103}, NULL) = 0
23584      0.000041 times(NULL)         = 432512349
23584      0.000032 pwrite64(15, "\25\302\0\0\25\0\0\0\231\31\0\0\377\377\1\4\217\234\0\0"..., 16384, 344064) = 16384
23584      0.000816 times(NULL)         = 432512349
23584      0.000031 times(NULL)         = 432512349
23584      0.000032 pwrite64(16, "\25\302\0\0\25\0\0\0\231\31\0\0\377\377\1\4\217\234\0\0"..., 16384, 344064) = 16384
23584      0.000589 times(NULL)         = 432512349
23584      0.000030 times(NULL)         = 432512349
23584      0.000031 pwrite64(17, "\25\302\0\0\25\0\0\0\231\31\0\0\377\377\1\4\217\234\0\0"..., 16384, 344064) = 16384
23584      0.000571 times(NULL)         = 432512349
23584      0.000036 gettimeofday({1345909247, 162310}, NULL) = 0
23584      0.000044 gettimeofday({1345909247, 162354}, NULL) = 0
23584      0.000054 pread64(15, "\25\302\0\0(\1\0\0\220\31\0\0\377\377\1\4\200L\0\0\n\0"..., 16384, 4849664) = 16384
23584      0.000077 gettimeofday({1345909247, 162488}, NULL) = 0
23584      0.000047 gettimeofday({1345909247, 162532}, NULL) = 0
23584      0.000043 gettimeofday({1345909247, 162575}, NULL) = 0
23584      0.000038 pread64(15, "\25\302\0\0\275\0\0\0\226\31\0\0\377\377\1\4$\177\0\0Y"..., 16384, 3096576) = 16384
23584      0.000052 gettimeofday({1345909247, 162665}, NULL) = 0
23584      0.000042 gettimeofday({1345909247, 162707}, NULL) = 0
23584      0.000038 pread64(15, "\25\302\0\0\25\0\0\0\231\31\0\0\377\377\1\4\217\234\0\0"..., 16384, 344064) = 16384
23584      0.000051 gettimeofday({1345909247, 162796}, NULL) = 0
23584      0.000106 gettimeofday({1345909247, 162904}, NULL) = 0
23584      0.000039 times(NULL)         = 432512349
23584      0.000051 pwrite64(15, "\25\302\0\0\276\0\0\0\231\31\0\0\377\377\1\4}\177\0\0Y"..., 16384, 3112960) = 16384
23584      0.000520 times(NULL)         = 432512349
23584      0.000030 times(NULL)         = 432512349
23584      0.000032 pwrite64(16, "\25\302\0\0\276\0\0\0\231\31\0\0\377\377\1\4}\177\0\0Y"..., 16384, 3112960) = 16384
23584      0.000598 times(NULL)         = 432512349
23584      0.000037 times(NULL)         = 432512349
23584      0.000031 pwrite64(17, "\25\302\0\0\276\0\0\0\231\31\0\0\377\377\1\4}\177\0\0Y"..., 16384, 3112960) = 16384
23584      0.000507 times(NULL)         = 432512349
23584      0.000035 gettimeofday({1345909247, 164782}, NULL) = 0
23584      0.000042 gettimeofday({1345909247, 164825}, NULL) = 0
23584      0.000039 pread64(15, "\25\302\0\0\24\0\0\0\226\31\0\0\377\377\1\4\243R\0\0\17"..., 16384, 327680) = 16384
23584      0.000054 gettimeofday({1345909247, 164917}, NULL) = 0
23584      0.000078 gettimeofday({1345909247, 165000}, NULL) = 0
23584      0.000043 times(NULL)         = 432512349
23584      0.000031 pwrite64(15, "\25\302\0\0\23\0\0\0\231\31\0\0\377\377\1\4\246R\0\0\17"..., 16384, 311296) = 16384
23584      0.000568 times(NULL)         = 432512349
23584      0.000043 times(NULL)         = 432512349
23584      0.000032 pwrite64(16, "\25\302\0\0\23\0\0\0\231\31\0\0\377\377\1\4\246R\0\0\17"..., 16384, 311296) = 16384
23584      0.000458 times(NULL)         = 432512349
23584      0.000039 times(NULL)         = 432512349
23584      0.000031 pwrite64(17, "\25\302\0\0\23\0\0\0\231\31\0\0\377\377\1\4\246R\0\0\17"..., 16384, 311296) = 16384
23584      0.000543 times(NULL)         = 432512349
23584      0.000034 gettimeofday({1345909247, 166819}, NULL) = 0
23584      0.000045 gettimeofday({1345909247, 166863}, NULL) = 0
23584      0.000038 times(NULL)         = 432512349
23584      0.000031 pwrite64(15, "\25\302\0\0\22\0\0\0\231\31\0\0\377\377\1\4\3332\0\0\0"..., 16384, 294912) = 16384
23584      0.000697 times(NULL)         = 432512349
23584      0.000030 times(NULL)         = 432512349
23584      0.000031 pwrite64(16, "\25\302\0\0\22\0\0\0\231\31\0\0\377\377\1\4\3332\0\0\0"..., 16384, 294912) = 16384
23584      0.000509 times(NULL)         = 432512349
23584      0.000037 times(NULL)         = 432512349
23584      0.000032 pwrite64(17, "\25\302\0\0\22\0\0\0\231\31\0\0\377\377\1\4\3332\0\0\0"..., 16384, 294912) = 16384
23584      0.000565 times(NULL)         = 432512350
23584      0.000035 gettimeofday({1345909247, 168867}, NULL) = 0
23584      0.000060 gettimeofday({1345909247, 168933}, NULL) = 0
23584      0.000289 times(NULL)         = 432512350
23584      0.000037 pwrite64(15, "\25\302\0\0\20\0\0\0\231\31\0\0\377\377\1\4P8\0\0\200$"..., 16384, 262144) = 16384
23584      0.000525 times(NULL)         = 432512350
23584      0.000030 times(NULL)         = 432512350
23584      0.000032 pwrite64(16, "\25\302\0\0\20\0\0\0\231\31\0\0\377\377\1\4P8\0\0\200$"..., 16384, 262144) = 16384
23584      0.000576 times(NULL)         = 432512350
23584      0.000046 times(NULL)         = 432512350
23584      0.000092 pwrite64(17, "\25\302\0\0\20\0\0\0\231\31\0\0\377\377\1\4P8\0\0\200$"..., 16384, 262144) = 16384
23584      0.000630 times(NULL)         = 432512350
23584      0.000043 gettimeofday({1345909247, 171228}, NULL) = 0
23584      0.000045 gettimeofday({1345909247, 171272}, NULL) = 0
23584      0.000044 gettimeofday({1345909247, 171316}, NULL) = 0
23584      0.000037 times(NULL)         = 432512350
23584      0.000054 pwrite64(15, "\25\302\0\0\1\0\0\0\0\0\0\0\0\0\1\4\300\377\0\0\0\0\0\0"..., 16384, 16384) = 16384
23584      0.000804 times(NULL)         = 432512350
23584      0.000038 times(NULL)         = 432512350
23584      0.000032 pwrite64(16, "\25\302\0\0\1\0\0\0\0\0\0\0\0\0\1\4\300\377\0\0\0\0\0\0"..., 16384, 16384) = 16384
23584      0.000589 times(NULL)         = 432512350
23584      0.000030 times(NULL)         = 432512350
23584      0.000031 pwrite64(17, "\25\302\0\0\1\0\0\0\0\0\0\0\0\0\1\4\300\377\0\0\0\0\0\0"..., 16384, 16384) = 16384
23584      0.000606 times(NULL)         = 432512350
23584      0.000034 gettimeofday({1345909247, 173571}, NULL) = 0
23584      0.000038 gettimeofday({1345909247, 173609}, NULL) = 0
23584      0.000039 pread64(15, "\25\302\0\0\1\0\0\0\0\0\0\0\0\0\1\4\300\377\0\0\0\0\0\0"..., 16384, 16384) = 16384
23584      0.000054 gettimeofday({1345909247, 173702}, NULL) = 0
23584      0.000040 gettimeofday({1345909247, 173742}, NULL) = 0
23584      0.000040 times(NULL)         = 432512350
23584      0.000751 times(NULL)         = 432512350
23584      0.000043 gettimeofday({1345909247, 174576}, NULL) = 0
23584      0.000041 semctl(622592, 23, IPC_64|SETVAL, 0xbfffd09c) = 0
23584      0.001190 gettimeofday({1345909247, 175992}, NULL) = 0
23584      0.000242 pread64(15, "\25\302\0\0\36\0\0\0\223\31\0\0\377\377\1\4\236\177\0\0"..., 16384, 491520) = 16384
23584      0.000061 gettimeofday({1345909247, 176111}, NULL) = 0
23584      0.000057 gettimeofday({1345909247, 176175}, NULL) = 0
23584      0.000111 getrusage(RUSAGE_SELF, {ru_utime={0, 15997}, ru_stime={0, 72988}, ...}) = 0
23584      0.000051 getrusage(RUSAGE_SELF, {ru_utime={0, 15997}, ru_stime={0, 72988}, ...}) = 0
23584      0.000045 gettimeofday({1345909247, 176374}, NULL) = 0
23584      0.000078 gettimeofday({1345909247, 176545}, NULL) = 0
23584      0.000155 semctl(622592, 9, IPC_64|SETVAL, 0xbfffe0f4) = 0
23584      0.000165 open("/proc/23604/stat", O_RDONLY) = 32
23584      0.000058 read(32, "23604 (oracle) S 1 23604 23604 0"..., 999) = 190
23584      0.000085 close(32)           = 0
23584      0.000115 gettimeofday({1345909247, 177038}, NULL) = 0
23584      0.000049 semctl(622592, 20, IPC_64|SETVAL, 0xbfffe0a4) = 0
23584      0.000624 semctl(622592, 20, IPC_64|SETVAL, 0xbfffe14c) = 0
23584      0.000042 close(8)            = 0
23584      0.000035 open("/home/ora10g/admin/roger/bdump/alert_roger.log", O_WRONLY|O_APPEND|O_CREAT|O_LARGEFILE, 0660) = 8
23584      0.000062 time(NULL)          = 1345909247
23584      0.000040 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1017, ...}) = 0
23584      0.000146 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1017, ...}) = 0
23584      0.000071 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1017, ...}) = 0
23584      0.000064 writev(8, [{"Sat Aug 25 08:40:47 PDT 2012\n", 29}, {"Thread 1 advanced to log sequenc"..., 51}, {"\n", 1}], 3) = 81
23584      0.000074 times(NULL)         = 432512350
23584      0.000051 times(NULL)         = 432512350
23584      0.000031 close(8)            = 0
23584      0.000031 open("/home/ora10g/admin/roger/bdump/alert_roger.log", O_WRONLY|O_APPEND|O_CREAT|O_LARGEFILE, 0660) = 8
23584      0.000063 writev(8, [{"  Current log# 1 seq# 151 mem# 0"..., 71}, {"\n", 1}], 2) = 72
23584      0.000072 gettimeofday({1345909247, 178488}, NULL) = 0
.....省略部分信息
23584      0.000057 gettimeofday({1345909247, 179580}, NULL) = 0
23584      0.000197 semtimedop(622592, 0x1, 200000, {4294901770, 0}) = -1 EAGAIN (Resource temporarily unavailable)
23584      0.201062 gettimeofday({1345909247, 380703}, NULL) = 0
.....省略部分信息
23584      0.000068 gettimeofday({1345909247, 382030}, NULL) = 0
23584      0.000047 semtimedop(622592, 0x1, 0, {4294901770, 0}) = -1 EAGAIN (Resource temporarily unavailable)
23584      3.000146 gettimeofday({1345909250, 382231}, NULL) = 0
23584      0.000068 gettimeofday({1345909250, 382295}, NULL) = 0
23584      0.000058 gettimeofday({1345909250, 382353}, NULL) = 0
23584      0.000062 gettimeofday({1345909250, 382415}, NULL) = 0
23584      0.000055 gettimeofday({1345909250, 382548}, NULL) = 0
23584      0.000170 gettimeofday({1345909250, 382644}, NULL) = 0
23584      0.000063 times(NULL)         = 432512671
23584      0.000225 gettimeofday({1345909250, 382930}, NULL) = 0
23584      0.000058 gettimeofday({1345909250, 382984}, NULL) = 0
23584      0.000055 times(NULL)         = 432512671
23584      0.000065 times(NULL)         = 432512671
23584      0.000107 gettimeofday({1345909250, 383219}, NULL) = 0
23584      0.000073 times(NULL)         = 432512671
23584      0.000050 pwrite64(18, "\1\"\0\0\2\0\0\0\227\0\0\0\20\200\200oh\0\0\0\4\0\0\0\223"..., 512, 1024) = 512
23584      0.000886 times(NULL)         = 432512671
23584      0.000078 gettimeofday({1345909250, 384300}, NULL) = 0
.....省略部分信息
23584      0.000092 gettimeofday({1345909250, 385232}, NULL) = 0
23584      0.000061 semtimedop(622592, 0x1, 0, {4294901770, 0}) = -1 EAGAIN (Resource temporarily unavailable)
23584      3.000481 gettimeofday({1345909253, 385780}, NULL) = 0
.....省略部分信息
23584      0.001228 times(NULL)         = 432512971
23584      0.000520 gettimeofday({1345909253, 388094}, NULL) = 0
23584      0.000184 getrusage(RUSAGE_SELF, {ru_utime={0, 15997}, ru_stime={0, 74988}, ...}) = 0
23584      0.000081 getrusage(RUSAGE_SELF, {ru_utime={0, 15997}, ru_stime={0, 74988}, ...}) = 0
23584      0.000183 gettimeofday({1345909253, 388540}, NULL) = 0
.....省略部分信息
23584      0.000141 gettimeofday({1345909253, 389471}, NULL) = 0
23584      0.000079 semtimedop(622592, 0x1, 0, {4294901770, 0}

我们来简单分析下上面的trace内容,首先pread64 这样的,是一个参数,可以看到时一些读的操作。

这个操作后面跟的第一个数字是表示文件描述符,你可以进行系统的/proc/fd下面去查看所有的的文件描述符信息,如下:

[ora10g@killdb 23584]$ cd fd

[ora10g@killdb fd]$ ls -ltr
total 32
lrwx------  1 ora10g oinstall 64 Aug 25 08:47 9 -> /home/ora10g/product/10.2/dbs/hc_roger.dat
l-wx------  1 ora10g oinstall 64 Aug 25 08:47 8 -> /home/ora10g/admin/roger/bdump/alert_roger.log
lrwx------  1 ora10g oinstall 64 Aug 25 08:47 7 -> /home/ora10g/product/10.2/dbs/lkinstroger (deleted)
l-wx------  1 ora10g oinstall 64 Aug 25 08:47 6 -> /home/ora10g/admin/roger/bdump/alert_roger.log
l-wx------  1 ora10g oinstall 64 Aug 25 08:47 5 -> /home/ora10g/admin/roger/udump/roger_ora_23573.trc
lr-x------  1 ora10g oinstall 64 Aug 25 08:47 4 -> /dev/null
lr-x------  1 ora10g oinstall 64 Aug 25 08:47 3 -> /dev/null
lr-x------  1 ora10g oinstall 64 Aug 25 08:47 2 -> /dev/null
lr-x------  1 ora10g oinstall 64 Aug 25 08:47 1 -> /dev/null
lr-x------  1 ora10g oinstall 64 Aug 25 08:47 0 -> /dev/null
lr-x------  1 ora10g oinstall 64 Aug 25 08:47 31 -> /home/ora10g/product/10.2/rdbms/mesg/oraus.msb
lrwx------  1 ora10g oinstall 64 Aug 25 08:47 30 -> /home/ora10g/oradata/roger/temp02
lrwx------  1 ora10g oinstall 64 Aug 25 08:47 29 -> /home/ora10g/oradata/roger/undotbs03.dbf
lrwx------  1 ora10g oinstall 64 Aug 25 08:47 28 -> /home/ora10g/oradata/roger/sqlt_01.dbf
lrwx------  1 ora10g oinstall 64 Aug 25 08:47 27 -> /home/ora10g/oradata/roger/test1.dbf
lrwx------  1 ora10g oinstall 64 Aug 25 08:47 26 -> /home/ora10g/oradata/roger/undotb2_01.dbf
lrwx------  1 ora10g oinstall 64 Aug 25 08:47 25 -> /home/ora10g/oradata/roger/roger01.dbf
lrwx------  1 ora10g oinstall 64 Aug 25 08:47 24 -> /home/ora10g/oradata/roger/users01.dbf
lrwx------  1 ora10g oinstall 64 Aug 25 08:47 23 -> /home/ora10g/oradata/roger/sysaux01.dbf
lrwx------  1 ora10g oinstall 64 Aug 25 08:47 22 -> /home/ora10g/oradata/roger/undotbs01.dbf
lrwx------  1 ora10g oinstall 64 Aug 25 08:47 21 -> /home/ora10g/oradata/roger/system01.dbf
lrwx------  1 ora10g oinstall 64 Aug 25 08:47 20 -> /home/ora10g/oradata/roger/redo03.log
lrwx------  1 ora10g oinstall 64 Aug 25 08:47 19 -> /home/ora10g/oradata/roger/redo02.log
lrwx------  1 ora10g oinstall 64 Aug 25 08:47 18 -> /home/ora10g/oradata/roger/redo01.log
lrwx------  1 ora10g oinstall 64 Aug 25 08:47 17 -> /home/ora10g/oradata/roger/control03.ctl
lrwx------  1 ora10g oinstall 64 Aug 25 08:47 16 -> /home/ora10g/oradata/roger/control02.ctl
lrwx------  1 ora10g oinstall 64 Aug 25 08:47 15 -> /home/ora10g/oradata/roger/control01.ctl
lrwx------  1 ora10g oinstall 64 Aug 25 08:47 14 -> /home/ora10g/product/10.2/dbs/lkROGER
lrwx------  1 ora10g oinstall 64 Aug 25 08:47 13 -> /home/ora10g/product/10.2/dbs/hc_roger.dat
lr-x------  1 ora10g oinstall 64 Aug 25 08:47 12 -> /home/ora10g/product/10.2/rdbms/mesg/oraus.msb
lr-x------  1 ora10g oinstall 64 Aug 25 08:47 11 -> /dev/zero
lr-x------  1 ora10g oinstall 64 Aug 25 08:47 10 -> /dev/zero

从上面我们可以看到,18~20 是我们的redo log,15~17 是controlfile。

下面是关于pread和pwrite 函数的描述:

#include 

ssize_t pread(int fd, void *buf, size_t count, off_t offset);
ssize_t pwrite(int fd, const void *buf, size_t count, off_t offset);
Feature Test Macro Requirements for glibc (see feature_test_macros(7)):
pread(), pwrite():
_XOPEN_SOURCE >= 500
|| /* Since glibc 2.12: */ _POSIX_C_SOURCE >= 200809L

pread() reads up to count bytes from file descriptor fd at offset offset (from the start of the file) into
the buffer starting at buf. The file offset is not changed.

pwrite() writes up to count bytes from the buffer starting at buf to the file descriptor fd at offset offset.
The file offset is not changed. The file referenced by fd must be capable of seeking.

oracle 10g concepets 关于lgwr进程有如下的一段描述:

LGWR writes synchronously to the active mirrored group of redo log files. If one of the files in the group is
damaged or unavailable, LGWR continues writing to other files in the group and logs an error in the LGWR trace
file and in the system alert log. If all files in a group are damaged, or the group is unavailable because it
has not been archived, LGWR cannot continue to function.

从上面描述,提到了lgwr会写lgwr trace和alert log,但是并没有提到controlfile。
从上面的strace信息我们可以得出如下的结论:

1.  lgwr进程会读和写controlfile;
2.  lgwr进程读redo log的读写单位是512 byte,这其实就是redo的block 大小。
3.  lgwr进程会去写alert log,具体些什么信息呢?就是你redo logfile切换或增加时的信息会儿写到
    alert文件中,且会记录具体的时间。
4.  当日志切换时,lgwr进程回去通知arc0归档进程进行归档操作。当然,如果你是归档环境的话。
5.  gettimeofday 是linux获取当前时间的一个函数,可以精确到微妙级别。
6.  stat64 是linux的一个函数,用于判断相关文件是否存在,跟stat函数类似,下面是关于stat64函数的描述:


The stat64 structure is similar to the stat structure, except that it is capable of returning information
about files that are larger than 2 gigabytes. The stat64 structure is returned by the and functions, which
are a part of the large file extensions.

7.  关于函数semctl,起结构如下:
    int semctl(int semid,int semnum,int cmd,union semunarg);
    该参数的主要作用是系统调用semctl用来执行在信号量集上的控制操作,这里我们不关注这个函数。

8.  关于linux 函数getrusage,该函数主要是用于获取进程的资源使用情况,主要是如下几种情况:

    int getrusage(int who, struct rusage *usage);

    例如我们这里strace 的部分信息如下:

    getrusage(RUSAGE_SELF, {ru_utime={0, 15997}, ru_stime={0, 74988}, ...}) = 0

    这表示获取当前进程的资源使用情况。
    rusage_self:获取当前进程的资源使用信息。
    rusage_children:获取子进程的资源使用信息。

最后我们再回到问题上来,lgwr在写redo的过程中对latch的持有和释放顺序是怎么样的呢 ?

我们来看看10g中 redo 相关的latch有哪些,如下(10g不同版本有一些差异,主要是latch#有所变化,redo相关latch都是一样的):
—for oracle 10205

SQL> l
  1* SELECT adde,name,latch#,level#,hash FROM v$latch WHERE name LIKE '%redo%' ORDER BY 3
SQL> /

NAME                                              LATCH#     LEVEL#       HASH
--------------------------------------------- ---------- ---------- ----------
redo writing                                         148          1 3245733566
redo copy                                            149          4 4092072627
KFR redo allocation latch                            372          4  575052579
redo allocation                                      150          5  999804931
redo on-disk SCN                                     115          8 3153648710
ping redo on-disk SCN                                116          8 2268508676

6 ROWS selected.

SQL> SELECT addr,name,latch#,level#,hash FROM v$latch WHERE name LIKE '%lgwr%' ORDER BY 3;

NAME                                              LATCH#     LEVEL#       HASH
--------------------------------------------- ---------- ---------- ----------
lgwr LWN SCN                                         114          7  863803026

SELECT addr,name,latch#,level#,child#,hash FROM v$latch_children WHERE
latch# IN(SELECT latch# FROM v$latch WHERE name LIKE '%redo%');

NAME                                              LATCH#     LEVEL#     CHILD#       HASH
--------------------------------------------- ---------- ---------- ---------- ----------
redo copy                                            149          4          1 4092072627
redo copy                                            149          4          2 4092072627
redo allocation                                      150          5          1  999804931
redo allocation                                      150          5          2  999804931
redo allocation                                      150          5          3  999804931
redo allocation                                      150          5          4  999804931
redo allocation                                      150          5          5  999804931
redo allocation                                      150          5          6  999804931
redo allocation                                      150          5          7  999804931
redo allocation                                      150          5          8  999804931
redo allocation                                      150          5          9  999804931

NAME                                              LATCH#     LEVEL#     CHILD#       HASH
--------------------------------------------- ---------- ---------- ---------- ----------
redo allocation                                      150          5         10  999804931
redo allocation                                      150          5         11  999804931
redo allocation                                      150          5         12  999804931
redo allocation                                      150          5         13  999804931
redo allocation                                      150          5         14  999804931
redo allocation                                      150          5         15  999804931
redo allocation                                      150          5         16  999804931
redo allocation                                      150          5         17  999804931
redo allocation                                      150          5         18  999804931
redo allocation                                      150          5         19  999804931

21 ROWS selected.

我们可以看到,只有redo copy和redo redo allocation 这2个latch是存在子latch的,其中redo copy存在2个child latch,redo allocation 存在19个child latch。

在jonathan lewis大师的oracle core一书中第137页,有提到,据他的描述是这样的一个顺序:

1.   Get a redo copy latch. Since the number of latches is 2 × cpu_count, we can try
     each in turn in immediate mode, only going to willing-to wait-mode if they are
     all busy. The latch we pick first is randomized so that different sessions won’t
     all end up waiting on the same latch.

    ---申请获得redo copy latch。该latch的数量是cpu_count的2倍(我这里cpu_count为1,所以看到redo copy 子latch有2个).
     lgwr进程会尽可能的使用immediate 模式去获得latch,如果获取不到才会去转入willing-to-wait模式,     直到获得redo copy latch为止。latch的获得是随机性的,所以不同的session并不总是会wait同一个latch。

     事实上,lgwr进程是必须获得latch的,不然实例会crash,经过实验证明。

2.   Get the redo allocation latch. (With multiple public redo threads, the copy  latch will dictate which buffer, which dictates which allocation latch; I’ll make  a few more comments on the effects of multiple log buffers a bit later in the  chapter.)

     ---获得redo allocation latch.

3.   Move the start of free space pointer.

     ---移动log buffer中free space pointer指针。

4.   Drop the redo allocation latch.

     ----删除redo allocation latch。

5.   Make the copy.

     ---开始copy操作。

6.   Drop the redo copy latch.

     ----删除redo copy latch。

7.   If the allocation has taken the used space over one-third of the log buffer or1MB, or if the redo record was a commit record, then post lgwr to write (but get the redo writing latch first to check if lgwr is already writing).

     ----如果已经分配的buffer中超过1/3的log buffer或超过1MB,再或者redo 记录的事务进行commit后,这都会触发lgwr进程去写。(在获得redo writing latch之前会去检查lgwr进程是否正在写)。

8.   If the redo record was a commit record, increment redo synch writes, set up a
     log file sync wait, set up a 1-second (10g) or 10-centisecond alarm (11.2),
     and take self off run queue.

     ----如果redo record是已经提交的记录,会增加到redo 同步写入记录,并设置一个log file sync等待, 其时间阈值是1s(10g)或0.1s(11.2),采取自我关闭或开启的运行队列。
     这里翻译有些绕口,简单的讲,就是已经提交的redo记录会追加到redo 同步队列中,并等待lgwr去写,此时的等待事件就是log file sync,lgwr进程是自我触发,超过1s(10g)或者0.1(11.2)lgwr就会开始
     redo sync writes操作。

下面还有一段:

So a session allocates space in the log buffer and then copies into that space while holding the redocopy latch. In theory this means that lgwr need only get (and then release) all the redo copy latches as a    check for holes before it starts to write, because once it is holding all of them, any holes in the log buffermust have been filled. According to a note by Steve Adams, though (see ww.ixora.com.au/tips/tuning/          redo_latches.htm), Oracle stopped doing this after 8.0; instead, it has a mechanism for seeing who is         currently holding the redo copy latches without obtaining the latch (using the internal equivalent of         v$latchholder), and if there are any holders, it starts waiting on LGWR wait for redo copy until posted       that the latch has been released.                                                                             

By watching the latch activity, we can see that something like this must be happening, but I don’tknow for certain what process posts lgwr (presumably it has to be the session holding the latch) and I
don’t know how it knows it has to post lgwr. However, since parameter1 of the LGWR wait for redo copywait is the copy latch #, and since each session will know which redo copy child latch it is holding, it’s
not hard to imagine that there is some code a session always calls just before it drops the latch to check ifit has been holding the latch that lgwr has been waiting for, and posting lgwr if it was.
另外在Steve Adams 的http://www.ixora.com.au/tips/tuning/redo_latches.htm 也提到了,如下:

However, before taking the redo allocation latch, the process first takes the redo copy latch that will be neededfor the copy into the log buffer. The redo copy latches are used to indicate that a process is copying redo intothe log buffer, and that LGWR should wait until the copy has finished, before writing the target log buffer blocksto disk.
No-wait mode is used for most gets against the redo copy latches, because the process can use any one of them toprotect its copy into the log buffer. It first attempts to get the copy latch that it last held. Failing that, theprocess attempts to get each other copy latch in turn, in no-wait mode. Willing-to-wait mode is only used to getthe last copy latch if no-wait gets against all the other copy latches have failed.

Once a redo copy latch and the redo allocation latch have been acquired, and space has been allocated in the redolog buffer, the redo allocation latch is then released and the change vectors are copied into the log buffer fromtemporary buffers in the PGA of the process.

Once the copy is complete, the change vectors are applied to the affected database blocks, the redo entry is markedas valid, and the process then releases its redo copy latch. At this point the process may need to post LGWR to signalthat it should begin to flush the log buffer. This applies if the allocation raised the number of used blocks in thelog buffer above the threshold set by the _log_io_size parameter, or if a commit marker has been copied into the redostream as part of a commit. However, to ensure that LGWR is not posted needlessly, the process takes the redo writinglatch to check whether LGWR is already active or has already been posted. The redo writing latch is then released, and appropriate the LGWR process is posted.

Steve Adams的描述其实跟jonathan lewis大师的描述差不多,这里就不多说了。最后一段Steve Adams大师还提到:一旦进程获得了redo copy和redo allocation patch以后,也就意味着其对应的log buffer中的空间已经分配出来了,此时redo allocation latch会被释放,与此同时change vectors会从PGA 的temporary buffers中copy到log buffer中。当然,最后再由lgwr进程将redo buffer的redo record写入到redo logfile中去。

到这里自然有一个疑问产生了,是什么进程从pga 中去copy change vectors到log buffer中的? 难得是lgwr进程? 显示不是。

这其实还跟你的数据库是shared 还是专用模式有关系,显然我这里是通过sqlplus killdb/killdb 本地操作,那么就是专用模式了。

[ora10g@killdb udump]$ ps -ef|grep 32511
ora10g 32511 32510 0 00:09 00:00:00 oracleroger (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
ora10g 2988 32177 0 01:41 pts/3 00:00:00 grep 32511

如果是专用模式的话,就是我们的user 进程来完成从pga temporary buffer 拷贝change vectors到log buffer的过程的。
上面 Steve Adams大师的这段话也说明了。

从上面的理解,我们可以得出如下结论:

1. user process (proc1) 进行操作,例如dml操作某个表;
2. buffer cache中对应的脏块在被dbwN进程写入到disk之前,这些block对应change vector会被copy到redo buffer中。
3. proc1 进程获得redo copy latch;表明该进程正在copy 内容到log buffer中。
4. proc1 进程获得redo allocation latch;因为进程需要copy 内容到log buffer中,那么必然要在log buffer中
   分配相关的空间,所以必须获得该latch;
5. proc1 进程从pga temporary buffer中copy  change vectors到log buffer中。 
6. proc1 进程完成copy以后,释放reod allocation latch;
7. proc1 进程释放redo copy latch;
8. proc1 进程post lgwr进程去flush log buffer.另外如果相关的事务已经提交或者达到log buffer写的要求时,lgwr进程会
   自己唤醒然后去把log buffer的内容写到redo logfile。
9. 当然,lgwr进程在写log buffer之前会先获得redo writing latch,每次写完以后会释放该latch。
说明:如果你是shard 模式,那么说完成change vectors copy的工作是用Snnn进程来完成的。


关于latch,可以通过oradebug 来进行相关的测试,下面是我的一个基于10.2.0.2的测试:

—-for 10.2.0.2

SQL> SELECT addr,name,latch#,level#,hash FROM v$latch WHERE name LIKE '%redo%' ORDER BY 4;

ADDR     NAME                                                   LATCH#     LEVEL#       HASH
-------- -------------------------------------------------- ---------- ---------- ----------
5000D8F4 redo writing                                              146          1 3245733566
5000D95C redo copy                                                 147          4 4092072627
5001B5D0 KFR redo allocation latch                                 362          4  575052579
5000D9C4 redo allocation                                           148          5  999804931
5000A45C ping redo on-disk SCN                                     115          8 2268508676
5000A3F8 redo on-disk SCN                                          114          8 3153648710

6 ROWS selected.

SQL> SELECT addr,name,latch#,level#,hash FROM v$latch WHERE name LIKE '%lgwr%' ORDER BY 3;

ADDR     NAME                                                   LATCH#     LEVEL#       HASH
-------- -------------------------------------------------- ---------- ---------- ----------
5000A394 lgwr LWN SCN                                              113          7  863803026

SQL> SELECT addr,name,latch#,level#,child#,hash FROM v$latch_children WHERE
  2  latch# IN(SELECT latch# FROM v$latch WHERE name LIKE '%redo%');

ADDR     NAME                        LATCH#     LEVEL#     CHILD#       HASH
-------- ----------------------- ---------- ---------- ---------- ----------
592CD2A4 redo copy                      147          4          1 4092072627
592CD320 redo copy                      147          4          2 4092072627
589A0FD0 redo allocation                148          5          1  999804931
589A1034 redo allocation                148          5          2  999804931
589A1098 redo allocation                148          5          3  999804931
589A10FC redo allocation                148          5          4  999804931
589A1160 redo allocation                148          5          5  999804931
589A11C4 redo allocation                148          5          6  999804931
589A1228 redo allocation                148          5          7  999804931
589A128C redo allocation                148          5          8  999804931
589A12F0 redo allocation                148          5          9  999804931

ADDR     NAME                        LATCH#     LEVEL#     CHILD#       HASH
-------- ----------------------- ---------- ---------- ---------- ----------
589A1354 redo allocation                148          5         10  999804931
589A13B8 redo allocation                148          5         11  999804931
589A141C redo allocation                148          5         12  999804931
589A1480 redo allocation                148          5         13  999804931
589A14E4 redo allocation                148          5         14  999804931
589A1548 redo allocation                148          5         15  999804931
589A15AC redo allocation                148          5         16  999804931
589A1610 redo allocation                148          5         17  999804931
589A1674 redo allocation                148          5         18  999804931
589A16D8 redo allocation                148          5         19  999804931

21 ROWS selected.

SQL>
下面用oradebug 离开手工持有redo copy 进行验证。

SQL> show user
USER is "SYS"
SQL> oradebug call kslgetl 1342232924 2
Function returned 1
SQL> oradebug call kslgetl 1496109732 2
ORA-03113: end-of-file on communication channel
ORA-24323: value not allowed
SQL>
SQL>
SQL> conn /as sysdba
Connected to an idle instance.

此时alert 如下:
Tue Jul 10 09:34:15 2012
Errors in file /export/home/oracle/ora10g/product/10.2/admin/killdb/udump/killdb_ora_5950.trc:
ORA-00600: internal error code, arguments: [504], [0x592CD2A4], [16], [4], [redo copy], [1], [0], [0x5000D95C]
Tue Jul 10 09:34:15 2012
Errors in file /export/home/oracle/ora10g/product/10.2/admin/killdb/udump/killdb_ora_5950.trc:
ORA-07445: exception encountered: core dump [SIGSEGV] [Address not mapped to object] [1849245988] [] [] []
ORA-00600: internal error code, arguments: [504], [0x592CD2A4], [16], [4], [redo copy], [1], [0], [0x5000D95C]
Tue Jul 10 09:34:15 2012
Errors in file /export/home/oracle/ora10g/product/10.2/admin/killdb/udump/killdb_ora_5950.trc:
ORA-07445: exception encountered: core dump [SIGSEGV] [Address not mapped to object] [1849245988] [] [] []
ORA-07445: exception encountered: core dump [SIGSEGV] [Address not mapped to object] [1849245988] [] [] []
ORA-00600: internal error code, arguments: [504], [0x592CD2A4], [16], [4], [redo copy], [1], [0], [0x5000D95C]
Tue Jul 10 09:34:15 2012
Errors in file /export/home/oracle/ora10g/product/10.2/admin/killdb/udump/killdb_ora_5950.trc:
ORA-07445: exception encountered: core dump [SIGSEGV] [Address not mapped to object] [1849245988] [] [] []
ORA-07445: exception encountered: core dump [SIGSEGV] [Address not mapped to object] [1849245988] [] [] []
ORA-07445: exception encountered: core dump [SIGSEGV] [Address not mapped to object] [1849245988] [] [] []
ORA-00600: internal error code, arguments: [504], [0x592CD2A4], [16], [4], [redo copy], [1], [0], [0x5000D95C]
Tue Jul 10 09:34:30 2012
Errors in file /export/home/oracle/ora10g/product/10.2/admin/killdb/bdump/killdb_pmon_5923.trc:
ORA-07445: 出现异常错误: 核心转储 [SIGSEGV] [Address not mapped to object] [152] [] [] []
Tue Jul 10 09:34:39 2012
MMON: terminating instance due to error 472
Instance terminated by MMON, pid = 5941


下面我们反过来,想持有子latch,再去持有父latch看看情况如何:
SQL> startup
ORACLE instance started.

Total System Global Area  167772160 bytes
Fixed Size                  1279120 bytes
Variable Size              62917488 bytes
Database Buffers          100663296 bytes
Redo Buffers                2912256 bytes
Database mounted.
Database opened.
SQL> oradebug setmypid
Statement processed.
SQL> oradebug call kslgetl 1496109732 2
Function returned 1
SQL> oradebug call kslgetl 1342232924 2
ORA-03113: end-of-file on communication channel
ORA-24323: value not allowed


SQL> conn /as sysdba
Connected.
SQL> oradebug setmypid
Statement processed.
SQL> oradebug call kslgetl 1496109856 2
Function returned 1
SQL> oradebug call kslgetl 1496109732 2
ORA-03113: end-of-file on communication channel
ORA-24323: value not allowed
SQL>
此时数据库实例可能会crash掉,oracle这里是不允许你即获得redo copy 父latch,同时又获得子latch的,
反过来也是一样的,同时该子latch也不能同时获得2个,那样是不允许的。但是这个时候实例不会crash。
只是会抛出相关错误,数据库仍然可以正常工作。错误如下:

Tue Jul 10 09:39:49 2012
Thread 1 advanced to log sequence 18
  Current log# 2 seq# 18 mem# 0: /export/home/oracle/ora10g/product/10.2/oradata/killdb/redo02.log
Tue Jul 10 09:40:32 2012
Errors in file /export/home/oracle/ora10g/product/10.2/admin/killdb/udump/killdb_ora_6036.trc:
ORA-00600: internal error code, arguments: [504], [0x592CD2A4], [16], [4], [redo copy], [1], [0], [0x592CD320]
Tue Jul 10 09:40:32 2012
Errors in file /export/home/oracle/ora10g/product/10.2/admin/killdb/udump/killdb_ora_6036.trc:
ORA-07445: exception encountered: core dump [SIGSEGV] [Address not mapped to object] [1849245988] [] [] []
ORA-00600: internal error code, arguments: [504], [0x592CD2A4], [16], [4], [redo copy], [1], [0], [0x592CD320]
Tue Jul 10 09:40:32 2012
Errors in file /export/home/oracle/ora10g/product/10.2/admin/killdb/udump/killdb_ora_6036.trc:
ORA-07445: exception encountered: core dump [SIGSEGV] [Address not mapped to object] [1849245988] [] [] []
ORA-07445: exception encountered: core dump [SIGSEGV] [Address not mapped to object] [1849245988] [] [] []
ORA-00600: internal error code, arguments: [504], [0x592CD2A4], [16], [4], [redo copy], [1], [0], [0x592CD320]
Tue Jul 10 09:40:32 2012
Errors in file /export/home/oracle/ora10g/product/10.2/admin/killdb/udump/killdb_ora_6036.trc:
ORA-07445: exception encountered: core dump [SIGSEGV] [Address not mapped to object] [1849245988] [] [] []
ORA-07445: exception encountered: core dump [SIGSEGV] [Address not mapped to object] [1849245988] [] [] []
ORA-07445: exception encountered: core dump [SIGSEGV] [Address not mapped to object] [1849245988] [] [] []
ORA-00600: internal error code, arguments: [504], [0x592CD2A4], [16], [4], [redo copy], [1], [0], [0x592CD320]
Tue Jul 10 09:42:00 2012
Thread 1 advanced to log sequence 19
  Current log# 3 seq# 19 mem# 0: /export/home/oracle/ora10g/product/10.2/oradata/killdb/redo03.log

所以,关于redo copy latch可以简单的总结如下几点:
1. redo copy 父latch和子latch 不能同时持有;
2. 在获得父latch后再去申请获得子latch或获得子latch后再去获得父latch都是不允许的;
3. 2个子latch也是不能同时持有的。


最后关于redo allocation 的子latch为什么那么多,我想是出于并发考虑。

备注:其中有部分关于其他进程的trace我去掉了,内容太多了,还有关于latch的持有和释放,可以通过event 10005去观察,如下:

ora10g@killdb ~]$ oerr ora 10005
10005, 00000, “trace latch operations for debugging”
// *Document: NO
// *Cause:
// *Action: Enable tracing for various latch operations
// *Comment:
// level 1 – trace latch gets and frees
// level 4 – trace multiple posts by processes when latch is freed

We can set events 10005 (trace latch gets and frees) and 10073 (have PMON
dump info before latch cleanup). Level does not matter for these events.
We can also set event=”600 trace name LATCHES level 1″ to dump latch info
when PMON hits the error (I’m not so sure this one is going to work).

event = “600 trace name latches level 10″
event = “10005 trace name context forever, level 1″

10005 event has been removed above 9i , try dump KSTDUMPCURPROC 1

SQL> oradebug setmypid;
SQL> oradebug unlimit;
SQL> oradebug dump KSTDUMPCURPROC 1
SQL> oradebug tracefile_name;

--------------------------------------ORACLE-DBA----------------------------------------

最权威、专业的Oracle案例资源汇总之【学习笔记】Oracle lgwr 在写redo 时需要获得哪些latch

本文由大师惜分飞原创分享,网址:http://www.oracleplus.net/arch/1377.html

Oracle研究中心

关键词:

about oracle lgwr

Oracle 写redo时需要获得哪些latch