[Case Study] Oracle RAC Reports ORA-00469: Resolving a Forced Node Restart

Date: 2016-11-28 22:14   Source: Oracle研究中心 (Oracle Research Center)   Author: Internet

天萃荷净 Oracle研究中心 case analysis: an operations DBA reported that a database in an Oracle RAC environment raised ORA-00469. Analysis attributed the RAC node restart to BUG 10008092.

Unless marked as a repost, articles on this site are original. Reposted from love wife & love life - Roger's Oracle technical blog
Article link: BUG 10008092 caused instance crash

In a two-node RAC, one of the nodes was restarted. The alert log of that node shows the following:

###### Node 1, 02:23:55 2011 ######

Sat Dec  3 02:23:55 2011
Errors in file /oracle/admin/crmdb/bdump/crmdb1_pmon_12765.trc:
ORA-00469: CKPT process terminated with error
Sat Dec  3 02:23:55 2011
ORA-469 encountered when generating server alert SMG-3503
Sat Dec  3 02:23:55 2011
Errors in file /oracle/admin/crmdb/bdump/crmdb1_j000_8539.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00469: CKPT process terminated with error
Sat Dec  3 02:23:56 2011
###### Node 1 crash ######

Sat Dec  3 02:23:57 2011
Errors in file /oracle/admin/crmdb/bdump/crmdb1_smon_12876.trc:
ORA-00469: CKPT process terminated with error
Sat Dec  3 02:23:58 2011
Shutting down instance (abort)
License high water mark = 55
Sat Dec  3 02:24:00 2011

From the above, a problem with the checkpoint process (CKPT) caused the instance to crash.
###### PMON process trace on node 1: ######
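
The alert-log excerpt above can be scanned mechanically to list every place ORA-00469 was raised and which trace file it points to. The following is a minimal sketch, assuming the alert log uses the timestamp and "Errors in file" layout seen in this excerpt. The names `find_ora_errors`, `AlertHit`, and `SAMPLE` are hypothetical helpers introduced for illustration and are not part of any Oracle tool.

```python
import re
from typing import Iterable, Iterator, NamedTuple

# Timestamp lines in the excerpt look like "Sat Dec  3 02:23:55 2011".
TS_RE = re.compile(r"^[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}$")
# Full five-digit error codes only; the short form "ORA-469" is not matched.
ORA_RE = re.compile(r"\bORA-(\d{5})\b")
TRC_RE = re.compile(r"^Errors in file (\S+?):?$")


class AlertHit(NamedTuple):
    timestamp: str   # most recent timestamp line seen before the error
    trace_file: str  # trace file named in the same entry, "" if none
    code: str        # five-digit ORA code


def find_ora_errors(lines: Iterable[str], code: str = "00469") -> Iterator[AlertHit]:
    """Yield one AlertHit for each ORA-<code> line found in an alert log."""
    timestamp = ""
    trace_file = ""
    for raw in lines:
        line = raw.rstrip()
        if TS_RE.match(line):
            # A new timestamp starts a new alert-log entry.
            timestamp = line
            trace_file = ""
            continue
        m = TRC_RE.match(line)
        if m:
            trace_file = m.group(1)
            continue
        for found in ORA_RE.findall(line):
            if found == code:
                yield AlertHit(timestamp, trace_file, found)


# Lines copied from the alert-log excerpt in this article.
SAMPLE = """\
Sat Dec  3 02:23:55 2011
Errors in file /oracle/admin/crmdb/bdump/crmdb1_pmon_12765.trc:
ORA-00469: CKPT process terminated with error
Sat Dec  3 02:23:55 2011
ORA-469 encountered when generating server alert SMG-3503
Sat Dec  3 02:23:55 2011
Errors in file /oracle/admin/crmdb/bdump/crmdb1_j000_8539.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00469: CKPT process terminated with error
"""

if __name__ == "__main__":
    for hit in find_ora_errors(SAMPLE.splitlines()):
        print(hit.timestamp, hit.trace_file, "ORA-" + hit.code)
```

Running the same function over the full alert log of each node would give a quick timeline of which processes noticed the CKPT failure first.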

*** 2011-12-03 02:23:55.731
Background process CKPT found dead
Oracle pid = 24
OS pid (from detached process) = 12869
OS pid (from process state) = 12869
dtp = c000000040016e40, proc = c0000004950057c8
Dump of memory from 0xC000000040016E40 to 0xC000000040016E88
C000000040016E40 00000076 00000000 C0000004 950057C8  [...v..........W.]
C000000040016E50 00000000 00000000 00000000 434B5054  [............CKPT]
C000000040016E60 00020000 00000000 00003245 00000000  [..........2E....]
....................
....................
....................
....................
        Repeat 13 times
C000000495005CF0 6F726163 6C650000 00000000 00000000  [oracle..........]
C000000495005D00 00000000 00000000 00000000 00000000  [................]
C000000495005D10 00000000 00000006 6A6C6372 6D310000  [........jlcrm1..]
C000000495005D20 00000000 00000000 00000000 00000000  [................]
        Repeat 2 times
C000000495005D50 00000000 00000000 00000000 00000006  [................]
C000000495005D60 554E4B4E 4F574E00 00000000 00000000  [UNKNOWN.........]
C000000495005D70 00000000 00000000 00000000 00000000  [................]
C000000495005D80 00000000 00000008 31323836 39000000  [........12869...]
C000000495005D90 00000000 00000000 00000000 00000000  [................]
C000000495005DA0 00000000 00000005 6F726163 6C65406A  [........oracle@j]
C000000495005DB0 6C63726D 31202843 4B505429 00000000  [lcrm1 (CKPT)....]
C000000495005DC0 00000000 00000000 00000000 00000000  [................]
....................
....................
....................
....................
C000000495005FA0 00000000 00000000 00000000 00001308  [................]
C000000495005FB0 00000006 00000000                    [........]       

error 469 detected in background process
ORA-00469: CKPT process terminated with error
*** 2011-12-03 02:24:07.798
ksuitm: waiting up to [5] seconds before killing DIAG

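A colleague confirmed that no DIAG trace was generated, and not even a CKPT trace. This closely resembles the description of bug 10008092: the version and the diagnostic analysis both match. The rough sequence of events is: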

CKPT process dies (possibly a hang) --> PMON cleanup --> background-process protection: PMON crashes the instance
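
The first link in this chain, PMON finding CKPT dead, can be read straight out of the PMON trace. The sketch below assumes the trace uses the line formats shown in the excerpt above. `parse_pmon_trace` and `decode_dump_line` are hypothetical helper names, and the hex words are decoded in the order they are printed, which reproduces the ASCII column of this particular dump.

```python
import re
from typing import Dict, Iterable

DEAD_RE = re.compile(r"^Background process (\w+) found dead$")
ORA_PID_RE = re.compile(r"^Oracle pid = (\d+)$")
OS_PID_RE = re.compile(r"^OS pid \(from process state\) = (\d+)$")
# One dump line: 16-hex-digit address, up to four 8-hex-digit words, ASCII column.
DUMP_RE = re.compile(r"^([0-9A-F]{16})((?: [0-9A-F]{8}){1,4})\s+\[")


def parse_pmon_trace(lines: Iterable[str]) -> Dict[str, str]:
    """Extract the dead process name and its pids from a PMON trace."""
    patterns = (("process", DEAD_RE), ("oracle_pid", ORA_PID_RE), ("os_pid", OS_PID_RE))
    info: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        for key, pattern in patterns:
            m = pattern.match(line)
            if m and key not in info:
                # Keep only the first occurrence of each field.
                info[key] = m.group(1)
    return info


def decode_dump_line(line: str) -> str:
    """Return the hex words of one memory-dump line as printable ASCII."""
    m = DUMP_RE.match(line.strip())
    if not m:
        return ""
    data = bytes.fromhex(m.group(2).replace(" ", ""))
    # Non-printable bytes are shown as "." like in the trace itself.
    return "".join(chr(b) if 32 <= b < 127 else "." for b in data)


# Lines copied from the PMON trace excerpt in this article.
SAMPLE_TRACE = """\
*** 2011-12-03 02:23:55.731
Background process CKPT found dead
Oracle pid = 24
OS pid (from detached process) = 12869
OS pid (from process state) = 12869
"""
SAMPLE_DUMP = "C000000040016E50 00000000 00000000 00000000 434B5054  [............CKPT]"

if __name__ == "__main__":
    print(parse_pmon_trace(SAMPLE_TRACE.splitlines()))
    print(decode_dump_line(SAMPLE_DUMP))
```

The OS pid recovered here (12869) is the value to look for when checking whether a matching CKPT trace file exists in the dump directory.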

With that in mind, the following information from the alert log is easy to explain:
*** SESSION ID:(1089.34028) 2011-12-03 02:23:56.172
kgefec: fatal error 0
*** 2011-12-03 02:23:56.172
ksedmp: internal or fatal error
ORA-00603: ORACLE server session terminated by fatal error
ORA-00449: background process 'LCK0' unexpectedly terminated with error 469
ORA-00469: CKPT process terminated with error
ORA-00469: CKPT process terminated with error
Current SQL statement for this session:
TRUNCATE TABLE DINF.TEMP1_IN_PDT_CM_USER
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
c00000006e89ab60       200  procedure DINF.P_IN_PDT_CM_USER
c00000009e142850         1  anonymous block
----- Call Stack Trace -----
Why? Because TRUNCATE TABLE has to trigger an object checkpoint, so a session running it depends on CKPT and fails once CKPT is gone.
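
To connect the error stack with the statement that was running, the session trace can be summarized into its ORA codes and its current SQL. This is a minimal sketch, assuming the trace layout shown above, where the statement follows the "Current SQL statement for this session:" marker on a single line. `summarize_session_trace` and `is_truncate` are hypothetical names.

```python
import re
from typing import Iterable, List, Optional, Tuple

ORA_LINE_RE = re.compile(r"^ORA-(\d{5}): (.*)$")
SQL_MARKER = "Current SQL statement for this session:"


def summarize_session_trace(
    lines: Iterable[str],
) -> Tuple[List[Tuple[str, str]], Optional[str]]:
    """Return ([(ora_code, message), ...], current_sql) from a session trace."""
    errors: List[Tuple[str, str]] = []
    sql: Optional[str] = None
    expect_sql = False
    for raw in lines:
        line = raw.strip()
        if expect_sql and line:
            # Assumption: the statement fits on the first non-empty line
            # after the marker, as it does in this excerpt.
            sql = line
            expect_sql = False
            continue
        if line == SQL_MARKER:
            expect_sql = True
            continue
        m = ORA_LINE_RE.match(line)
        if m:
            errors.append((m.group(1), m.group(2)))
    return errors, sql


def is_truncate(sql: Optional[str]) -> bool:
    """True if the statement is a TRUNCATE TABLE (the case discussed above)."""
    return bool(sql) and sql.upper().startswith("TRUNCATE TABLE")


# Lines copied from the trace excerpt in this article.
SAMPLE = """\
ksedmp: internal or fatal error
ORA-00603: ORACLE server session terminated by fatal error
ORA-00449: background process 'LCK0' unexpectedly terminated with error 469
ORA-00469: CKPT process terminated with error
ORA-00469: CKPT process terminated with error
Current SQL statement for this session:
TRUNCATE TABLE DINF.TEMP1_IN_PDT_CM_USER
----- PL/SQL Call Stack -----
"""

if __name__ == "__main__":
    errors, sql = summarize_session_trace(SAMPLE.splitlines())
    print([code for code, _ in errors], sql, is_truncate(sql))
```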

The bug in question is:

Bug 10008092: INSTANCE CRASH WITH ORA-00469: CKPT PROCESS TERMINATED WITH ERROR

This article was originally written and shared by 惜分飞. URL: http://www.oracleplus.net/arch/1340.html


Keywords: ORA-00469, Oracle BUG 10008092, Oracle RAC node restarted