天萃荷净
Oracle研究中心案例分析:运维DBA反映Oracle数据库alert日志报错ORA-00600 [kgeade_is_0],结合MOS分析原因为parallel_execution_message_size参数设置异常。
本站文章除注明转载外,均为本站原创: 转载自love wife & love life —Roger 的Oracle技术博客
本文链接地址: ora-00600 [kgeade_is_0]
今天一同事遇到一个600错误,比较怪异,发过来看一下,如下:
SQL> SELECT inst_id,COUNT(1) FROM gv$session GROUP BY inst_id;
SELECT inst_id,COUNT(1) FROM gv$session GROUP BY inst_id
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [kgeade_is_0], [], [], [], [], [],[], []
SQL> SELECT COUNT(*) FROM gv$session GROUP BY inst_id;
SELECT COUNT(*) FROM gv$session GROUP BY inst_id
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [kgeade_is_0], [], [], [], [], [],[], []
对于的trace call stack信息如下:
*** 2011-07-03 21:24:47.324
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kgeade_is_0], [], [], [], [], [], [], []
Current SQL statement for this session:
select inst_id,count(:"SYS_B_0") from gv$session group by inst_id
----- Call Stack Trace -----
calling call entry argument values in hex
location type point ( means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst()+64 call ksedst1() 000000000 000000001
ksedmp()+2176 call ksedst() 000000000
C000000000000C9F
4000000003FEDB60
000000000 000000000
000000000
ksfdmp()+48 call ksedmp() 000000003
kgerinv()+304 call ksfdmp() C000000000000612
000000003
4000000009441D90
000030223 000000000
000000000
kgeasnmierr()+144 call kgerinv() 6000000000031340
40000000019EF530
60000000000323F8
40000000019EF530
9FFFFFFFFFFF35C0
$cold_kgeade()+64 call kgeasnmierr() 6000000000031340
9FFFFFFFBF3A6310
9FFFFFFFBF3A6320
6000000000032770
9FFFFFFFBF45E568
C0000002EED97058
000000000 000000002
kgerev()+96 call $cold_kgeade() 6000000000031340
60000000000314C0
9FFFFFFFBF3A6310
000000000 000000000
000000000 000000000
000000000
kserec0()+160 call kgerev() 6000000000031340
9FFFFFFFBF3A6310
000000000
6000000000032760
9FFFFFFFFFFF3688
$cold_kxfpg1sg()+35 call kserec0() 000000000 000000002
52 9FFFFFFFFFFF35A0
60000000000B5C78
C0000000000017B7
40000000044D3F70
00003C9A3
60000000000A4390
kxfpgsg()+4320 call $cold_kxfpg1sg() C0000002EED97058
000000000 000000002
C0000000000026D5
60000000000A59F8
C0000002E61BA758
9FFFFFFFBF3D9990
9FFFFFFFBF3D3D68
kxfrAllocSlahttp://www.oracleplus.netves()+6 call kxfpgsg() 9FFFFFFFFFFF3FE0
40 4000000003DAF660
000000002 000030021
9FFFFFFFFFFF3A50
000000000
C000000000000B9F
000000000
kxfrialo()+3680 call kxfrAllocSlaves() 0001A58D9
9FFFFFFFBF3F68F0
000000001
9FFFFFFFFFFF3FF8
000000002 000000002
9FFFFFFFBF39F7F0
9FFFFFFFBF39F7F8
kxfralo()+704 call kxfrialo() 9FFFFFFFFFFF4740
4000000003E2D480
00003832D
C0000000000017B7
9FFFFFFFFFFF41C0
9FFFFFFFFFFF4088
0000010E4
60000000000C2630
qerpx_rowsrc_start( call kxfralo() 0001A58D9 000000001
)+2256 000000001
C0000002ACADB728
9FFFFFFFBF3F6998
000000000
C0000002ACADB6E8
000000000
qerpxStart()+1184 call qerpx_rowsrc_start( 9FFFFFFFBF3D5CA0
) 9FFFFFFFFFFF4DD0
60000000000B5C78
9FFFFFFFFFFF53A0
C000000000000CA1
4000000003D39A40
000038327 000000000
qergsStart()+1312 call qerpxStart() C0000002ACADBA70
000000001 000000000
C000000000000DA3
4000000002F7AC20
000000000
9FFFFFFFBF3D3F90
9FFFFFFFBF3D3F8C
selexe()+1920 call qergsStart() 4000000001B008C0
000000001
60000000000B5C78
opiexe()+7984 call selexe() C0000002ACB48C38
9FFFFFFFFFFF5720
000004678
60000000000B5C78
C0000000000026D5
4000000002E626B0
000038269 000000000
kpoal8()+3872 call opiexe() 9FFFFFFFFFFF70C0
4000000002A89D80
00001E915
9FFFFFFFFFFF5440
60000000000B5C78
C0000000000012AD
60000000000C2638
60000000000314C0
opiodr()+2128 call kpoal8() 9FFFFFFFFFFF77F0
C000000000001530
9FFFFFFFFFFF9ED0
9FFFFFFFFFFF7110
60000000000B5C78
9FFFFFFFBF3D3D90
ttcpip()+1680 call opiodr() 00000005E 000000017
4000000001B03730
0000046B0
9FFFFFFFFFFF7800
opitsk()+2336 call ttcpip() 600000000003D0C0
000000001
9FFFFFFFFFFF9ED0
000000001
9FFFFFFFFFFFA040
9FFFFFFFFFFF9E34
4000000001BE9730
000000000
opiino()+1840 call opitsk() 000000000 000000000
60000000000B5C78
40000000027FBFB0
00001804D
4000000001B03748
opiodr()+2128 call opiino() 00000003C
9FFFFFFFFFFFC890
9FFFFFFFFFFFF030
9FFFFFFFFFFFBD50
60000000000B5C78
C000000000001530
opidrv()+1088 call opiodr() 00000003C 000000004
4000000001B031E0
0000046B0
9FFFFFFFFFFFC8A0
60000000000B5C78
sou2o()+336 call opidrv() 00000003C
9FFFFFFFFFFFF030
60000000000C2630
opimai_real()+224 call sou2o() 9FFFFFFFFFFFF050
00000003C 000000004
9FFFFFFFFFFFF030
main()+368 call opimai_real() 000000000
9FFFFFFFFFFFF080
main_opd_entry()+80 call main() 000000002
9FFFFFFFFFFFF538
60000000000B5C78
C000000000000004
在trace中搜索Plan Table,找到如下信息:
============
Plan Table
============
--------------------------------------------+-----------------------------------+-------------------------+
| Id | Operation | Name | Rows | Bytes | Cost | Time | TQ |IN-OUT|PQ Distrib |
--------------------------------------------+-----------------------------------+-------------------------+
| 0 | SELECT STATEMENT | | | | | | | | |
| 1 | SORT GROUP BY | | | | | | | | |
| 2 | PX COORDINATOR | | | | | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10000 | | | | |:Q1000| P->S |QC (RANDOM)|
| 4 | VIEW | GV$SESSION| | | | |:Q1000| PCWP | |
| 5 | MERGE JOIN | | | | | |:Q1000| PCWP | |
| 6 | FIXED TABLE FULL | X$KSLED | | | | |:Q1000| PCWP | |
| 7 | SORT JOIN | | | | | |:Q1000| PCWP | |
| 8 | FIXED TABLE FULL | X$KSUSE | | | | |:Q1000| PCWP | |
--------------------------------------------+-----------------------------------+-------------------------+
Content of other_xml column
===========================
db_version : 10.2.0.4
parse_schema : SYS
plan_hash : 303308595
Outline Data:
/*+
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('10.2.0.4')
OPT_PARAM('optimizer_index_cost_adj' 50)
RBO_OUTLINE
OUTLINE_LEAF(@"SEL$2")
OUTLINE_LEAF(@"SEL$1")
FULL(@"SEL$2" "E"@"SEL$2")
FULL(@"SEL$2" "S"@"SEL$2")
LEADING(@"SEL$2" "E"@"SEL$2" "S"@"SEL$2")
USE_MERGE(@"SEL$2" "S"@"SEL$2")
END_OUTLINE_DATA
*/
从执行计划来看,上面报错的sql语句走了并行。执行如下的语句也会报错,
*** 2011-07-03 21:33:13.485
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kgeade_is_0], [], [], [], [], [], [], []
Current SQL statement for this session:
select sid from gv$session_wait where event like :"SYS_B_0"
----- Call Stack Trace -----
calling call entry argument values in hex
location type point ( means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst()+64 call ksedst1() 000000000 000000001
ksedmp()+2176 call ksedst() 000000000
C000000000000C9F
4000000003FEDB60
000000000 000000000
000000000
ksfdmp()+48 call ksedmp() 000000003
kgerinv()+304 call ksfdmp() C000000000000612
000000003
4000000009441D90
000030223 000000000
000000000
kgeasnmierr()+144 call kgerinv() 6000000000031340
40000000019EF530
60000000000323F8
40000000019EF530
9FFFFFFFFFFF3690
$cold_kgeade()+64 call kgeasnmierr() 6000000000031340
9FFFFFFFBF3B0040
9FFFFFFFBF3B0050
6000000000032770
9FFFFFFFBF45E568
C0000002E61C0BE0
000000000 000000002
kgerev()+96 call $cold_kgeade() 6000000000031340
60000000000314C0
9FFFFFFFBF3B0040
000000000 000000000
000000000 000000000
000000000
kserec0()+160 call kgerev() 6000000000031340
9FFFFFFFBF3B0040
000000000
6000000000032760
9FFFFFFFFFFF3758
$cold_kxfpg1sg()+35 call kserec0() 000000000 000000002
52 9FFFFFFFFFFF3670
60000000000B5C78
C0000000000017B7
40000000044D3F70
00003C9A3
60000000000A4390
kxfpgsg()+4320 call $cold_kxfpg1sg() C0000002E61C0BE0
000000000 000000002
C0000000000026D5
60000000000A59F8
C0000002E61CCCF0
9FFFFFFFBF3D9990
9FFFFFFFBF3D4A48
kxfrAllocSlaves()+6 call kxfpgsg() 9FFFFFFFFFFF40B0
40 4000000003DAF660
000000002 000030021
9FFFFFFFFFFF3B20
000000000
C000000000000B9F
000000000
kxfrialo()+3680 call kxfrAllocSlaves() 0001A50DC
9FFFFFFFBF3F3A68
000000001
9FFFFFFFFFFF40C8
000000002 000000002
9FFFFFFFBF39F7F0
9FFFFFFFBF39F7F8
kxfralo()+704 call kxfrialo() 9FFFFFFFFFFF4810
4000000003E2D480
00003832D
C0000000000017B7
9FFFFFFFFFFF4290
9FFFFFFFFFFF4158
0000010E4
60000000000C2630
qerpx_rowsrc_start( call kxfralo() 0001A50DC 000000001
)+2256 000000001
C0000002CFE9A518
9FFFFFFFBF3F3B10
000000000
C0000002CFE9A4D8
000000000
qerpxStart()+1184 call qerpx_rowsrc_start( 9FFFFFFFBF3D7CE0
) 9FFFFFFFFFFF4EA0
60000000000B5C78
9FFFFFFFFFFF5470
C000000000000CA1
4000000003D39A40
000038329 000000000
selexe()+1920 call qerpxStart() C0000002CFE9AB20
000000001
9FFFFFFFBF3D7A14
opiexe()+7984 call selexe() C0000002CFEC3380
9FFFFFFFFFFF57D0
000004678
60000000000B5C78
C0000000000026D5
4000000002E626B0
000038269 000000000
opiall0()+2992 call opiexe() 9FFFFFFFFFFF7170
4000000003021E60
00001F201
9FFFFFFFFFFF54F0
60000000000B5C78
C000000000001736
60000000000C2638
60000000000314C0
opial7()+928 call opiall0() 9FFFFFFFFFFF7930
4000000002419500
000010207
60000000000A3CE0
000000040
9FFFFFFFFFFF7A50
000000000
C000000000000593
opiodr()+2128 call opial7() C000000000001530
4000000002E34A00
9FFFFFFFFFFF7970
60000000000B5C78
000018287
9FFFFFFFFFFFA1C0
ttcpip()+1680 call opiodr() 000000047 00000000F
4000000001B03398
0000046B0
9FFFFFFFFFFF7AB0
opitsk()+2336 call ttcpip() 600000000003D0C0
000000001
9FFFFFFFFFFFA180
000000001
9FFFFFFFFFFFA2F0
9FFFFFFFFFFFA0E4
4000000001BE9730
000000000
opiino()+1840 call opitsk() 000000000 000000000
60000000000B5C78
40000000027FBFB0
0000180CD
4000000001B033B0
opiodr()+2128 call opiino() 00000003C
9FFFFFFFFFFFCB40
9FFFFFFFFFFFF2E0
9FFFFFFFFFFFC000
60000000000B5C78
C000000000001530
opidrv()+1088 call opiodr() 00000003C 000000004
4000000001B031E0
0000046B0
9FFFFFFFFFFFCB50
60000000000B5C78
sou2o()+336 call opidrv() 00000003C
9FFFFFFFFFFFF2E0
60000000000C2630
opimai_real()+224 call sou2o() 9FFFFFFFFFFFF300
00000003C 000000004
9FFFFFFFFFFFF2E0
main()+368 call opimai_real() 000000000
9FFFFFFFFFFFF330
main_opd_entry()+80 call main() 000000002
9FFFFFFFFFFFF7E0
60000000000B5C78
C000000000000004
对于的执行计划如下:
============
Plan Table
============
------------------------------------------------+-----------------------------------+-------------------------+
| Id | Operation | Name | Rows | Bytes | Cost | Time | TQ |IN-OUT|PQ Distrib |
------------------------------------------------+-----------------------------------+-------------------------+
| 0 | SELECT STATEMENT | | | | | | | | |
| 1 | PX COORDINATOR | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10000 | | | | |:Q1000| P->S |QC (RANDOM)|
| 3 | VIEW | GV$SESSION_WAIT| | | | |:Q1000| PCWP | |
| 4 | MERGE JOIN | | | | | |:Q1000| PCWP | |
| 5 | FIXED TABLE FULL | X$KSLED | | | | |:Q1000| PCWP | |
| 6 | SORT JOIN | | | | | |:Q1000| PCWP | |
| 7 | FIXED TABLE FULL | X$KSUSECST | | | | |:Q1000| PCWP | |
------------------------------------------------+-----------------------------------+-------------------------+
Content of other_xml column
===========================
db_version : 10.2.0.4
parse_schema : ITNMAGT
plan_hash : 3419475036
Outline Data:
/*+
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('10.2.0.4')
OPT_PARAM('optimizer_index_cost_adj' 50)
RBO_OUTLINE
OUTLINE_LEAF(@"SEL$3")
OUTLINE_LEAF(@"SEL$F5BB74E1")
MERGE(@"SEL$2")
OUTLINE(@"SEL$1")
OUTLINE(@"SEL$2")
FULL(@"SEL$3" "E"@"SEL$3")
FULL(@"SEL$3" "S"@"SEL$3")
LEADING(@"SEL$3" "E"@"SEL$3" "S"@"SEL$3")
USE_MERGE(@"SEL$3" "S"@"SEL$3")
END_OUTLINE_DATA
*/
2次的call stakc完全一样,这样来看,似乎是查询gv$视图就好出现这个600错误,怪哉。
查metalink,看了差不多10多篇文章,都分别对比了一下,发现如下2个比较靠谱:
Bug 8339221: ORA-600 [KGEADE_IS_0], [], [],
Bug 7504296: ORA-600 [KGEADE_IS_0] ERRORS IF BOTH INSTANCES ARE UP
特别的下面这个bug,基本上call stack都完全符合。
另外在一篇文章中发现了可能还跟参数parallel_execution_message_size 有关系,解释如下:
Symptoms
Query against gv$ views on RAC system like
select INST_ID, NAME, VALUE
from gv$parameter
where NAME ='parallel_threads_per_cpu';
.
Would fail with
.
ERROR at line 1:
ORA-12850: Could not allocate slaves on all specified instances: 2 needed, 1
allocated
ORA-12801: error signaled in parallel query server PZ99, instance xxxx
ORA-00000: normal, successful completion
or instead of the ORA-0 with a ORA-600 [Kgeade_is_0]
There was a internal bug fix that a ORA-0 should be reported as ORA-600 [Kgeade_is_0].
And this was merged into the 10.2.0.4 patchset.
Cause
Often the cause is a different size of parallel_execution_message_size. If parallel_execution_message_size is not the same
on all nodes in a cluster, a parallel query can not spawn on all nodes parallel execution servers what is need for a query
against a gv$ view.
A different size of parallel_execution_message_size as example can be caused by different setting of parallel_automatic_tuning.
This parameter implicit change the default of parallel_execution_message_size.
Solution
To fix the you can explicit set parallel_execution_message_size to the same value cluster-wide
恰恰这套rac的2个节点目前该参数的值不一样,节点1是2152,节点2是16384,(也就是默认值)同事最后调整这个参数以后,测试ok了
--------------------------------------ORACLE-DBA----------------------------------------
最权威、专业的Oracle案例资源汇总之【案例】Oracle RAC报错ORA-00600 [kgeade_is_0]产生原因和解决办法
本文由大师惜分飞原创分享,网址:http://www.oracleplus.net/arch/1284.html