sitemap

RSS地图

收藏本站

设为首页

Oracle研究中心

当前位置:Oracle研究中心 > 故障案例 >

【案例】Oracle RAC报错ORA-00600 [kgeade_is_0]产生原因和解决办法

时间:2016-11-16 21:10   来源:Oracle研究中心   作者:网络   点击:

天萃荷净 Oracle研究中心案例分析:运维DBA反映Oracle数据库alert日志报错ORA-00600 [kgeade_is_0],结合MOS分析原因为parallel_execution_message_size参数设置异常。

本站文章除注明转载外,均为本站原创: 转载自love wife & love life —Roger 的Oracle技术博客
本文链接地址: ora-00600 [kgeade_is_0]

今天一同事遇到一个600错误,比较怪异,发过来看一下,如下:
SQL> SELECT inst_id,COUNT(1) FROM gv$session GROUP BY inst_id;
SELECT inst_id,COUNT(1) FROM gv$session GROUP BY inst_id
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [kgeade_is_0], [], [], [], [], [],[], []

SQL> SELECT COUNT(*) FROM gv$session GROUP BY inst_id;

SELECT COUNT(*) FROM gv$session GROUP BY inst_id
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [kgeade_is_0], [], [], [], [], [],[], []
对于的trace call stack信息如下:

*** 2011-07-03 21:24:47.324
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kgeade_is_0], [], [], [], [], [], [], []
Current SQL statement for this session:
select inst_id,count(:"SYS_B_0") from gv$session group by inst_id
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (  means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst()+64 call ksedst1() 000000000   000000001 
ksedmp()+2176 call ksedst() 000000000 
C000000000000C9F 
4000000003FEDB60 
000000000   000000000 
000000000 
ksfdmp()+48 call ksedmp() 000000003 
kgerinv()+304 call ksfdmp() C000000000000612 
000000003 
4000000009441D90 
000030223   000000000 
000000000 
kgeasnmierr()+144 call kgerinv() 6000000000031340 
40000000019EF530 
60000000000323F8 
40000000019EF530 
9FFFFFFFFFFF35C0 
$cold_kgeade()+64 call kgeasnmierr() 6000000000031340 
9FFFFFFFBF3A6310 
9FFFFFFFBF3A6320 
6000000000032770 
9FFFFFFFBF45E568 
C0000002EED97058 
000000000   000000002 
kgerev()+96 call $cold_kgeade() 6000000000031340 
60000000000314C0 
9FFFFFFFBF3A6310 
000000000   000000000 
000000000   000000000 
000000000 
kserec0()+160 call kgerev() 6000000000031340 
9FFFFFFFBF3A6310 
000000000 
6000000000032760 
9FFFFFFFFFFF3688 
$cold_kxfpg1sg()+35 call kserec0() 000000000   000000002 
52 9FFFFFFFFFFF35A0 
60000000000B5C78 
C0000000000017B7 
40000000044D3F70 
00003C9A3 
60000000000A4390 
kxfpgsg()+4320 call $cold_kxfpg1sg() C0000002EED97058 
000000000   000000002 
C0000000000026D5 
60000000000A59F8 
C0000002E61BA758 
9FFFFFFFBF3D9990 
9FFFFFFFBF3D3D68 
kxfrAllocSlahttp://www.oracleplus.netves()+6 call kxfpgsg() 9FFFFFFFFFFF3FE0 
40 4000000003DAF660 
000000002   000030021 
9FFFFFFFFFFF3A50 
000000000 
C000000000000B9F 
000000000 
kxfrialo()+3680 call kxfrAllocSlaves() 0001A58D9 
9FFFFFFFBF3F68F0 
000000001 
9FFFFFFFFFFF3FF8 
000000002   000000002 
9FFFFFFFBF39F7F0 
9FFFFFFFBF39F7F8 
kxfralo()+704 call kxfrialo() 9FFFFFFFFFFF4740 
4000000003E2D480 
00003832D 
C0000000000017B7 
9FFFFFFFFFFF41C0 
9FFFFFFFFFFF4088 
0000010E4 
60000000000C2630 
qerpx_rowsrc_start( call kxfralo() 0001A58D9   000000001 
)+2256 000000001 
C0000002ACADB728 
9FFFFFFFBF3F6998 
000000000 
C0000002ACADB6E8 
000000000 
qerpxStart()+1184 call qerpx_rowsrc_start( 9FFFFFFFBF3D5CA0 
) 9FFFFFFFFFFF4DD0 
60000000000B5C78 
9FFFFFFFFFFF53A0 
C000000000000CA1 
4000000003D39A40 
000038327   000000000 
qergsStart()+1312 call qerpxStart() C0000002ACADBA70 
000000001   000000000 
C000000000000DA3 
4000000002F7AC20 
000000000 
9FFFFFFFBF3D3F90 
9FFFFFFFBF3D3F8C 
selexe()+1920 call qergsStart() 4000000001B008C0 
000000001 
60000000000B5C78 
opiexe()+7984 call selexe() C0000002ACB48C38 
9FFFFFFFFFFF5720 
000004678 
60000000000B5C78 
C0000000000026D5 
4000000002E626B0 
000038269   000000000 
kpoal8()+3872 call opiexe() 9FFFFFFFFFFF70C0 
4000000002A89D80 
00001E915 
9FFFFFFFFFFF5440 
60000000000B5C78 
C0000000000012AD 
60000000000C2638 
60000000000314C0 
opiodr()+2128 call kpoal8() 9FFFFFFFFFFF77F0 
C000000000001530 
9FFFFFFFFFFF9ED0 
9FFFFFFFFFFF7110 
60000000000B5C78 
9FFFFFFFBF3D3D90 
ttcpip()+1680 call opiodr() 00000005E   000000017 
4000000001B03730 
0000046B0 
9FFFFFFFFFFF7800 
opitsk()+2336 call ttcpip() 600000000003D0C0 
000000001 
9FFFFFFFFFFF9ED0 
000000001 
9FFFFFFFFFFFA040 
9FFFFFFFFFFF9E34 
4000000001BE9730 
000000000 
opiino()+1840 call opitsk() 000000000   000000000 
60000000000B5C78 
40000000027FBFB0 
00001804D 
4000000001B03748 
opiodr()+2128 call opiino() 00000003C 
9FFFFFFFFFFFC890 
9FFFFFFFFFFFF030 
9FFFFFFFFFFFBD50 
60000000000B5C78 
C000000000001530 
opidrv()+1088 call opiodr() 00000003C   000000004 
4000000001B031E0 
0000046B0 
9FFFFFFFFFFFC8A0 
60000000000B5C78 
sou2o()+336 call opidrv() 00000003C 
9FFFFFFFFFFFF030 
60000000000C2630 
opimai_real()+224 call sou2o() 9FFFFFFFFFFFF050 
00000003C   000000004 
9FFFFFFFFFFFF030 
main()+368 call opimai_real() 000000000 
9FFFFFFFFFFFF080 
main_opd_entry()+80 call main() 000000002 
9FFFFFFFFFFFF538 
60000000000B5C78 
C000000000000004 
在trace中搜索Plan Table,找到如下信息:
============
Plan Table
============
--------------------------------------------+-----------------------------------+-------------------------+
| Id | Operation | Name | Rows | Bytes | Cost | Time | TQ |IN-OUT|PQ Distrib |
--------------------------------------------+-----------------------------------+-------------------------+
| 0 | SELECT STATEMENT | | | | | | | | |
| 1 | SORT GROUP BY | | | | | | | | |
| 2 | PX COORDINATOR | | | | | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10000 | | | | |:Q1000| P->S |QC (RANDOM)|
| 4 | VIEW | GV$SESSION| | | | |:Q1000| PCWP | |
| 5 | MERGE JOIN | | | | | |:Q1000| PCWP | |
| 6 | FIXED TABLE FULL | X$KSLED | | | | |:Q1000| PCWP | |
| 7 | SORT JOIN | | | | | |:Q1000| PCWP | |
| 8 | FIXED TABLE FULL | X$KSUSE | | | | |:Q1000| PCWP | |
--------------------------------------------+-----------------------------------+-------------------------+

Content of other_xml column
===========================
db_version : 10.2.0.4
parse_schema : SYS
plan_hash : 303308595
Outline Data:
/*+
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('10.2.0.4')
OPT_PARAM('optimizer_index_cost_adj' 50)
RBO_OUTLINE
OUTLINE_LEAF(@"SEL$2")
OUTLINE_LEAF(@"SEL$1")
FULL(@"SEL$2" "E"@"SEL$2")
FULL(@"SEL$2" "S"@"SEL$2")
LEADING(@"SEL$2" "E"@"SEL$2" "S"@"SEL$2")
USE_MERGE(@"SEL$2" "S"@"SEL$2")
END_OUTLINE_DATA
*/

从执行计划来看,上面报错的sql语句走了并行。执行如下的语句也会报错,

*** 2011-07-03 21:33:13.485
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kgeade_is_0], [], [], [], [], [], [], []
Current SQL statement for this session:
select sid from gv$session_wait where event like :"SYS_B_0"
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (  means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst()+64 call ksedst1() 000000000   000000001 
ksedmp()+2176 call ksedst() 000000000 
C000000000000C9F 
4000000003FEDB60 
000000000   000000000 
000000000 
ksfdmp()+48 call ksedmp() 000000003 
kgerinv()+304 call ksfdmp() C000000000000612 
000000003 
4000000009441D90 
000030223   000000000 
000000000 
kgeasnmierr()+144 call kgerinv() 6000000000031340 
40000000019EF530 
60000000000323F8 
40000000019EF530 
9FFFFFFFFFFF3690 
$cold_kgeade()+64 call kgeasnmierr() 6000000000031340 
9FFFFFFFBF3B0040 
9FFFFFFFBF3B0050 
6000000000032770 
9FFFFFFFBF45E568 
C0000002E61C0BE0 
000000000   000000002 
kgerev()+96 call $cold_kgeade() 6000000000031340 
60000000000314C0 
9FFFFFFFBF3B0040 
000000000   000000000 
000000000   000000000 
000000000 
kserec0()+160 call kgerev() 6000000000031340 
9FFFFFFFBF3B0040 
000000000 
6000000000032760 
9FFFFFFFFFFF3758 
$cold_kxfpg1sg()+35 call kserec0() 000000000   000000002 
52 9FFFFFFFFFFF3670 
60000000000B5C78 
C0000000000017B7 
40000000044D3F70 
00003C9A3 
60000000000A4390 
kxfpgsg()+4320 call $cold_kxfpg1sg() C0000002E61C0BE0 
000000000   000000002 
C0000000000026D5 
60000000000A59F8 
C0000002E61CCCF0 
9FFFFFFFBF3D9990 
9FFFFFFFBF3D4A48 
kxfrAllocSlaves()+6 call kxfpgsg() 9FFFFFFFFFFF40B0 
40 4000000003DAF660 
000000002   000030021 
9FFFFFFFFFFF3B20 
000000000 
C000000000000B9F 
000000000 
kxfrialo()+3680 call kxfrAllocSlaves() 0001A50DC 
9FFFFFFFBF3F3A68 
000000001 
9FFFFFFFFFFF40C8 
000000002   000000002 
9FFFFFFFBF39F7F0 
9FFFFFFFBF39F7F8 
kxfralo()+704 call kxfrialo() 9FFFFFFFFFFF4810 
4000000003E2D480 
00003832D 
C0000000000017B7 
9FFFFFFFFFFF4290 
9FFFFFFFFFFF4158 
0000010E4 
60000000000C2630 
qerpx_rowsrc_start( call kxfralo() 0001A50DC   000000001 
)+2256 000000001 
C0000002CFE9A518 
9FFFFFFFBF3F3B10 
000000000 
C0000002CFE9A4D8 
000000000 
qerpxStart()+1184 call qerpx_rowsrc_start( 9FFFFFFFBF3D7CE0 
) 9FFFFFFFFFFF4EA0 
60000000000B5C78 
9FFFFFFFFFFF5470 
C000000000000CA1 
4000000003D39A40 
000038329   000000000 
selexe()+1920 call qerpxStart() C0000002CFE9AB20 
000000001 
9FFFFFFFBF3D7A14 
opiexe()+7984 call selexe() C0000002CFEC3380 
9FFFFFFFFFFF57D0 
000004678 
60000000000B5C78 
C0000000000026D5 
4000000002E626B0 
000038269   000000000 
opiall0()+2992 call opiexe() 9FFFFFFFFFFF7170 
4000000003021E60 
00001F201 
9FFFFFFFFFFF54F0 
60000000000B5C78 
C000000000001736 
60000000000C2638 
60000000000314C0 
opial7()+928 call opiall0() 9FFFFFFFFFFF7930 
4000000002419500 
000010207 
60000000000A3CE0 
000000040 
9FFFFFFFFFFF7A50 
000000000 
C000000000000593 
opiodr()+2128 call opial7() C000000000001530 
4000000002E34A00 
9FFFFFFFFFFF7970 
60000000000B5C78 
000018287 
9FFFFFFFFFFFA1C0 
ttcpip()+1680 call opiodr() 000000047   00000000F 
4000000001B03398 
0000046B0 
9FFFFFFFFFFF7AB0 
opitsk()+2336 call ttcpip() 600000000003D0C0 
000000001 
9FFFFFFFFFFFA180 
000000001 
9FFFFFFFFFFFA2F0 
9FFFFFFFFFFFA0E4 
4000000001BE9730 
000000000 
opiino()+1840 call opitsk() 000000000   000000000 
60000000000B5C78 
40000000027FBFB0 
0000180CD 
4000000001B033B0 
opiodr()+2128 call opiino() 00000003C 
9FFFFFFFFFFFCB40 
9FFFFFFFFFFFF2E0 
9FFFFFFFFFFFC000 
60000000000B5C78 
C000000000001530 
opidrv()+1088 call opiodr() 00000003C   000000004 
4000000001B031E0 
0000046B0 
9FFFFFFFFFFFCB50 
60000000000B5C78 
sou2o()+336 call opidrv() 00000003C 
9FFFFFFFFFFFF2E0 
60000000000C2630 
opimai_real()+224 call sou2o() 9FFFFFFFFFFFF300 
00000003C   000000004 
9FFFFFFFFFFFF2E0 
main()+368 call opimai_real() 000000000 
9FFFFFFFFFFFF330 
main_opd_entry()+80 call main() 000000002 
9FFFFFFFFFFFF7E0 
60000000000B5C78 
C000000000000004 

对于的执行计划如下:
============
Plan Table
============
------------------------------------------------+-----------------------------------+-------------------------+
| Id | Operation | Name | Rows | Bytes | Cost | Time | TQ |IN-OUT|PQ Distrib |
------------------------------------------------+-----------------------------------+-------------------------+
| 0 | SELECT STATEMENT | | | | | | | | |
| 1 | PX COORDINATOR | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10000 | | | | |:Q1000| P->S |QC (RANDOM)|
| 3 | VIEW | GV$SESSION_WAIT| | | | |:Q1000| PCWP | |
| 4 | MERGE JOIN | | | | | |:Q1000| PCWP | |
| 5 | FIXED TABLE FULL | X$KSLED | | | | |:Q1000| PCWP | |
| 6 | SORT JOIN | | | | | |:Q1000| PCWP | |
| 7 | FIXED TABLE FULL | X$KSUSECST | | | | |:Q1000| PCWP | |
------------------------------------------------+-----------------------------------+-------------------------+

Content of other_xml column
===========================
db_version : 10.2.0.4
parse_schema : ITNMAGT
plan_hash : 3419475036
Outline Data:
/*+
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('10.2.0.4')
OPT_PARAM('optimizer_index_cost_adj' 50)
RBO_OUTLINE
OUTLINE_LEAF(@"SEL$3")
OUTLINE_LEAF(@"SEL$F5BB74E1")
MERGE(@"SEL$2")
OUTLINE(@"SEL$1")
OUTLINE(@"SEL$2")
FULL(@"SEL$3" "E"@"SEL$3")
FULL(@"SEL$3" "S"@"SEL$3")
LEADING(@"SEL$3" "E"@"SEL$3" "S"@"SEL$3")
USE_MERGE(@"SEL$3" "S"@"SEL$3")
END_OUTLINE_DATA
*/

2次的call stakc完全一样,这样来看,似乎是查询gv$视图就好出现这个600错误,怪哉。
查metalink,看了差不多10多篇文章,都分别对比了一下,发现如下2个比较靠谱:

Bug 8339221: ORA-600 [KGEADE_IS_0], [], [],
Bug 7504296: ORA-600 [KGEADE_IS_0] ERRORS IF BOTH INSTANCES ARE UP


特别的下面这个bug,基本上call stack都完全符合。
另外在一篇文章中发现了可能还跟参数parallel_execution_message_size 有关系,解释如下:
Symptoms
Query against gv$ views on RAC system like
select INST_ID, NAME, VALUE
from gv$parameter
where NAME ='parallel_threads_per_cpu';
.
Would fail with
.
ERROR at line 1:
ORA-12850: Could not allocate slaves on all specified instances: 2 needed, 1
allocated
ORA-12801: error signaled in parallel query server PZ99, instance xxxx
ORA-00000: normal, successful completion
or instead of the ORA-0 with a ORA-600 [Kgeade_is_0]

There was a internal bug fix that a ORA-0 should be reported as ORA-600 [Kgeade_is_0].
And this was merged into the 10.2.0.4 patchset.

Cause
Often the cause is a different size of parallel_execution_message_size. If parallel_execution_message_size is not the same
on all nodes in a cluster, a parallel query can not spawn on all nodes parallel execution servers what is need for a query
against a gv$ view.
A different size of parallel_execution_message_size as example can be caused by different setting of parallel_automatic_tuning.
This parameter implicit change the default of parallel_execution_message_size.

Solution
To fix the you can explicit set parallel_execution_message_size to the same value cluster-wide

恰恰这套rac的2个节点目前该参数的值不一样,节点1是2152,节点2是16384,(也就是默认值)同事最后调整这个参数以后,测试ok了

--------------------------------------ORACLE-DBA----------------------------------------

最权威、专业的Oracle案例资源汇总之【案例】Oracle RAC报错ORA-00600 [kgeade_is_0]产生原因和解决办法

本文由大师惜分飞原创分享,网址:http://www.oracleplus.net/arch/1284.html

Oracle研究中心

关键词:

ORA-00600

ORA-00600 [kgeade_is_0]

Oracle RAC报错ORA-00600 [kgeade_is_0]