sitemap

RSS地图

收藏本站

设为首页

Oracle研究中心

当前位置:Oracle研究中心 > 故障案例 >

【案例】Oracle GES:Potential blocker on resource FU解决思路

时间:2016-11-14 21:37   来源:Oracle研究中心   作者:网络   点击:

天萃荷净 Oracle研究中心案例分析:运维DBA反映Oracle数据库出现大量等待事件异常,详细信息如GES:Potential blocker on resource FU,结合案例分析产生原因为MMON子进程异常导致。

本站文章除注明转载外,均为本站原创: 转载自love wife & love life —Roger 的Oracle技术博客
本文链接地址: GES: Potential blocker (pid=13839) on resource FU

GES: Potential blocker (pid=13839) on resource FU

进行了数据库故障分析,简约过程如下:

###### 在5月25号出现了一个ges tm锁 ######
Wed May 25 10:16:13 2011
GES: Potential blocker (pid=16272) on resource TM-000023AA-00000000;
enqueue info in file /export/home/oracle/admin/sqyydb/bdump/sqyydb1_lmd0_1771.trc and DIAG trace file
Wed May 25 10:36:41 2011
Thread 1 advanced to log sequence 58737

###### 直到5月31号才出现ges fu锁,且一直不断产生,如下:######
Tue May 31 00:45:28 2011
GES: Potential blocker (pid=13839) on resource FU-00000000-00000000;
enqueue info in file /export/home/oracle/admin/sqyydb/bdump/sqyydb1_lmd0_1771.trc and DIAG trace file
Tue May 31 01:45:29 2011
GES: Potential blocker (pid=13839) on resource FU-00000000-00000000;
enqueue info in file /export/home/oracle/admin/sqyydb/bdump/sqyydb1_lmd0_1771.trc and DIAG trace file
Tue May 31 02:45:30 2011
GES: Potential blocker (pid=13839) on resource FU-00000000-00000000;
enqueue info in file /export/home/oracle/admin/sqyydb/bdump/sqyydb1_lmd0_1771.trc and DIAG trace file
Tue May 31 02:49:08 2011
Thread 1 advanced to log sequence 58946
  Current log# 2 seq# 58946 mem# 0: +DG1/sqyydb/onlinelog/group_2.260.622838565
  Current log# 2 seq# 58946 mem# 1: +DG1/sqyydb/onlinelog/group_2.261.622838565
Tue May 31 03:45:32 2011
GES: Potential blocker (pid=13839) on resource FU-00000000-00000000;
enqueue info in file /export/home/oracle/admin/sqyydb/bdump/sqyydb1_lmd0_1771.trc and DIAG trace file
Tue May 31 04:45:34 2011
GES: Potential blocker (pid=13839) on resource FU-00000000-00000000;
enqueue info in file /export/home/oracle/admin/sqyydb/bdump/sqyydb1_lmd0_1771.trc and DIAG trace file
Tue May 31 05:45:36 2011
GES: Potential blocker (pid=13839) on resource FU-00000000-00000000;
enqueue info in file /export/home/oracle/admin/sqyydb/bdump/sqyydb1_lmd0_1771.trc and DIAG trace file
Tue May 31 06:11:51 2011

奇怪的是13号故障之前的快照居然没有生成,怀疑可能mmon进程出现异常。

SQL> SELECT SNAP_ID,to_char(BEGIN_INTERVAL_TIME,'yyyymmdd hh24:mi:ss') BEGIN_INTERVAL_TIME
  2  FROM WRM$_SNAPSHOT
  3  WHERE BEGIN_INTERVAL_TIME BETWEEN to_date('20110610 00','yyyymmdd hh24')
  4    AND to_date('20110613 12','yyyymmdd hh24')
  5    AND INSTANCE_NUMBER=1 ORDER BY 2;

   SNAP_ID BEGIN_INTERVAL_TI
---------- -----------------
     30681 20110613 10:58:15
     30682 20110613 11:04:16

从部分m00x trace中发现了如下类似信息:

* kjdrpkey2hv: called with pkey 203395, options x8
* kjdrpkey2hv: called with pkey 203393, options x8


经查该信息由于oracle Bug 6155791 导致,请见文档

Kjdrpkey2hv:Called With Pkey" Messages In Trace Files Under Udump [ID 764063.1]

该错误信息完全可以忽略,metalink 764063.1解释如下:

1. You can  ignore these messages as they are harmless
OR
2. Apply patch 6155791 on the RDBMS_HOME

由于这些信息是可以完全忽略的,虽然是个bug, 但跟这次故障应该完全没有关系;
下面继续分析, 根据alert log信息,我们可以看到如下信息:

GES: Potential blocker (pid=13839) on resource FU-00000000-00000000

从这个GES 锁,我断定这个event应该保护FU关键字,通过查询v$event_name发现还真有这样一样等待事件,如下:

enq: FU - contention
Serializes the capture of the DB feature, usage, and high watermark statistics

单从alert log来看还无法确认这个ges锁产生的event是不是 enq: FU - contention

通过部署过一个脚本,发现了如下信息:

*** 2011-05-31 01:35:00.875
==============

HANG ANALYSIS:

==============
Open chains found:
Chain 1 :  :
    <0/390/33355/0x59006f28/13839/No Wait>
-- <0/645/52388/0x98007ef8/11172/enq: FU - contention>

[644]/0/645/52388/0x98166400/11172/REMOTE_WT/207/208/[389]/none
[655]/0/656/18755/0x98174790/15027/SINGLE_NODE_NW/217/218//none

*** 2011-06-03 09:40:00.696

=============
HANG ANALYSIS:
=============

Open chains found:
Chain 1 :  :
    <0/390/33355/0x59006f28/13839/No Wait>
-- <0/656/56890/0x98002018/5565/enq: FU - contention>
[389]/0/390/33355/0x5913ac88/13839/LEAF_NW/125/126//655
[655]/0/656/56890/0x98174790/5565/REMOTE_WT/213/214/[389]/none

*** 2011-06-03 09:45:01.241

==============
HANG ANALYSIS:
==============
Open chains found:
Chain 1 :  :
    <0/390/33355/0x59006f28/13839/No Wait>
-- <0/656/56890/0x98002018/5565/enq: FU - contention>
[389]/0/390/33355/0x5913ac88/13839/LEAF_NW/127/128//655
[655]/0/656/56890/0x98174790/5565/REMOTE_WT/217/218/[389]/none

*** 2011-06-03 10:40:00.697

==============
HANG ANALYSIS:
==============

Open chains found:

Chain 1 :  :
    <0/390/33355/0x59006f28/13839/No Wait>
-- <0/654/11173/0x9800b650/18215/enq: FU - contention>

*** 2011-06-03 10:45:01.281

==============
HANG ANALYSIS:
==============

Open chains found:
Chain 1 :  :
    <0/390/33355/0x59006f28/13839/No Wait>
-- <0/654/11173/0x9800b650/18215/enq: FU - contention>

[389]/0/390/33355/0x5913ac88/13839/LEAF_NW/127/128//653
[653]/0/654/11173/0x98171e30/18215/REMOTE_WT/219/220/[389]/none

从10:40到10:45的2次自动hanganalyze分析都能看到fu锁,说明在该5分钟内,
应该一直有等待事件enq: FU - contention 产生。

到这里为止,我们就可以定位alert log 中锁报的ges fu 锁,实际就是指的enq: FU - contention(rac环境而已)
另外在部分trace中,没有发现FU锁,如下:

Chain 9 :  :
    <0/390/33355/0x59006f28/13839/No Wait>
[389]/0/390/33355/0x5913ac88/13839/SINGLE_NODE_NW/127/128//none

最近一段时间内,故障节点的信息如下:

*** 2011-06-13 07:45:01.294

==============
HANG ANALYSIS:
==============

Open chains found:

Chain 1 :  :
    <0/390/33355/0x59006f28/13839/No Wait>
-- <0/658/65185/0x98002018/3250/enq: FU - contention>

[389]/0/390/33355/0x5913ac88/13839/LEAF_NW/173/174//657
[657]/0/658/65185/0x981770f0/3250/REMOTE_WT/355/356/[389]/none

*** 2011-06-13 07:50:00.876

==============
HANG ANALYSIS:
==============

Open chains found:

Chain 1 :  :

    <0/390/33355/0x59006f28/13839/No Wait>
-- <0/658/65185/0x98002018/3250/enq: FU - contention>
*** 2011-06-13 07:55:01.450

==============
HANG ANALYSIS:
==============

Open chains found:

Chain 1 :  :
    <0/173/39622/0x18016c28/16272/No Wait>
-- <0/876/38244/0xd7001048/28651/enq: WF - contention>
Chain 2 :  :
    <0/390/33355/0x59006f28/13839/No Wait>
-- <0/658/65185/0x98002018/3250/enq: FU - contention>

从上看,FU锁,持续了10分钟左右。
从上面我们可以看出问题的根源就是13839 进程,该进程阻塞别的进程如3250,导致eventenq: FU - contention产生。

到这里我们清楚了故障的原因了,是13839进程故障异常所导致,但是这个进程是什么进程呢?做什么的?为什么出现异常?
要回答这个问题,我们还得继续往下分析。

另外,从lmd0 trace,我们也可以看出,13839 是问题的根源。

*** 2011-06-13 01:55:04.120
DUMP LOCAL BLOCKER/HOLDER: block level 5 res [0x0][0x0],[FU]

----------resource 0x3e31ce9d0----------------------
resname       : [0x0][0x0],[FU]
Local node    : 0
dir_node      : 0
master_node   : 0
hv idx        : 95
hv last r.inc : 0
current inc   : 18
hv status     : 0
hv master     : 0
open options  : dd
grant_bits    : KJUSERNL KJUSEREX
grant mode    : KJUSERNL  KJUSERCR  KJUSERCW  KJUSERPR  KJUSERPW  KJUSEREX
count         : 1         0         0         0         0         1
val_state     : KJUSERVS_NOVALUE
valblk        : 0x00000000000000000000000000000000 .
access_node   : 0
vbreq_state   : 0
state         : x0
resp          : 3e31ce9d0
On Scan_q     : N
Total accesses: 5032835
Imm.  accesses: 7235
Granted_locks : 1
Cvting_locks  : 1
value_block:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
GRANTED_Q :
lp 6592b8d10 gl KJUSEREX rp 3e31ce9d0 [0x0][0x0],[FU]
  master 0 gl owner 65913ac88 possible pid 13839 xid 1003-003B-0001AAA7 bast 0 rseq 200 mseq 0 history 0x95495495
  open opt KJUSERDEADLOCK  KJUSERIDLK
CONVERT_Q:
lp 5983261d8 gl KJUSERNL rl KJUSEREX rp 3e31ce9d0 [0x0][0x0],[FU]
  master 0 gl owner 5981732e0 possible pid 9121 xid 1001-0014-00038B7C bast 0 rseq 200 mseq 0 history 0x1495149a
  convert opt KJUSERGETVALUE KJUSERNODEADLOCKWAIT KJUSERTRCCANCEL
----------enqueue 0x6592b8d10------------------------
lock version     : 15017751
Owner node       : 0
grant_level      : KJUSEREX
req_level        : KJUSEREX
bast_level       : KJUSERNL
notify_func      : 0
resp             : 3e31ce9d0
procp            : 4174ae968
pid              : 13839
proc version     : 9797
oprocp           : 0
opid             : 0
group lock owner : 65913ac88
possible pid     : 13839
xid              : 1003-003B-0001AAA7
dd_time          : 0.0 secs
dd_count         : 0
timeout          : 0.0 secs
On_timer_q       : N
On_dd_q          : N
lock_state       : GRANTED
Open Options     : KJUSERDEADLOCK  KJUSERIDLK
Convert options  : KJUSERGETVALUE KJUSERNODEADLOCKWAIT KJUSERTRCCANCEL
History          : 0x95495495
Msg_Seq          : 0x0
res_seq          : 200
valblk           : 0x00000000000000000000000000000000 .
Potential blocker (pid=13839) on resource FU-00000000-00000000
DUMP LOCAL BLOCKER: initiate state dump for TIMEOUT
  possible owner[59.13839]
Submitting asynchronized dump request [28]

在开始,曾提到awr快照为何也没有产生,原因是mmon进程也出现异常了,mmon trace如下:

*** 2011-04-26 17:00:58.830
*** SERVICE NAME:(SYS$BACKGROUND) 2011-04-26 17:00:58.830
*** SESSION ID:(210.1) 2011-04-26 17:00:58.830
*** KEWRAFC: Flush slave failed, AWR Enqueue Timeout
*** 2011-04-26 17:01:59.160
*** KEWRAFC: Flush slave failed, AWR Enqueue Timeout
###### 省略部分内容 ######
*** 2011-06-13 09:55:57.050
***KEWFDBFUS: Auto-DBFUS slave failed, return Code: 2*** KEWRAFC: Flush slave failed, AWR Enqueue Timeout
*** 2011-06-13 09:56:58.670
*** KEWRAFC: Flush slave failed, AWR Enqueue Timeout
*** 2011-06-13 09:57:59.730
###### 省略部分内容 ######
*** 2011-06-13 10:28:12.600
*** KEWRAFC: Flush slave failed, AWR Enqueue Timeout
*** 2011-06-13 10:31:10.240
*** KEWA: Error in scheduling slave for emergency-flush ****
*** KEWRAFC: Flush slave failed, AWR Enqueue Timeout
Unable to schedule a MMON slave at: Auto Flush Main 1
  Attempt to create slave process failed.
  Can happen for several reasons:
    - No process state objects
    - Reached OS set limits
    - A shutdown was going on
  Check alert log for more details.

从上看,mmon进程从2011年4月26号就出现异常了,以至于awr快照一直未产生(以致于我无法通过查询hist视图来定位问题)。
我们还是回到正题上来,那就是13839进程,为啥出现异常? 这个问题就需要用diag trace分析了。
从diag trace中,我们发现了如下关键性内容:

*** 2011-05-31 01:45:29.910
Dumping diagnostic information for ospid 13839:
OS pid = 13839
loadavg : 0.31 0.33 0.34
swap info: free_mem = 14418.00M rsv = 14742.30M
           alloc = 14374.51M avail = 43892.05 swap_free = 44259.84M
F S      UID   PID  PPID   C PRI NI     ADDR     SZ    WCHAN    STIME TTY         TIME CMD
0 S   oracle 13839     1   0  40 20          1598655          23:32:42             0:02 ora_m001_sqyydb1

从上面的信息可以知道13839这个进程是什么东西,原来是mmon的附属进程。
下面是进程13839 的进程信息:

*** 2011-06-13 07:55:16.170
  ----------------------------------------
  SO: 659006f28, type: 2, owner: 0, flag: INIT/-/-/0x00
  (process) Oracle pid=59, calls cur/top: 65add3cb0/65add3198, flag: (2) SYSTEM
            int error: 0, call error: 0, sess error: 0, txn error 0
  (post info) last post received: 0 0 55
              last post received-location: kjata: wake up enqueue owner
              last process to post me: 718002018 1 6
              last post sent: 0 0 24
              last post sent-location: ksasnd
              last process posted by me: 718007ef8 1 6
    (latch info) wait_event=0 bits=0
    Process Group: DEFAULT, pseudo proc: 4153719b0
    O/S info: user: oracle, term: UNKNOWN, ospid: 13839
    OSD pid info: Unix process pid: 13839, image: oracle@xhsqsqyydb1 (m001)

*** 2011-06-13 08:55:15.168
  ----------------------------------------
  SO: 659006f28, type: 2, owner: 0, flag: INIT/-/-/0x00
  (process) Oracle pid=59, calls cur/top: 65add3cb0/65add3198, flag: (2) SYSTEM
            int error: 0, call error: 0, sess error: 0, txn error 0
  (post info) last post received: 0 0 55
              last post received-location: kjata: wake up enqueue owner
              last process to post me: 718002018 1 6
              last post sent: 0 0 24
              last post sent-location: ksasnd
              last process posted by me: 718007ef8 1 6
    (latch info) wait_event=0 bits=0
    Process Group: DEFAULT, pseudo proc: 4153719b0
    O/S info: user: oracle, term: UNKNOWN, ospid: 13839
    OSD pid info: Unix process pid: 13839, image: oracle@xhsqsqyydb1 (m001)

*** 2011-06-13 09:55:17.061
  ----------------------------------------
  SO: 659006f28, type: 2, owner: 0, flag: INIT/-/-/0x00
  (process) Oracle pid=59, calls cur/top: 65add3cb0/65add3198, flag: (2) SYSTEM
            int error: 0, call error: 0, sess error: 0, txn error 0
  (post info) last post received: 0 0 55
              last post received-location: kjata: wake up enqueue owner
              last process to post me: 718002018 1 6
              last post sent: 0 0 24
              last post sent-location: ksasnd
              last process posted by me: 718007ef8 1 6
    (latch info) wait_event=0 bits=0
    Process Group: DEFAULT, pseudo proc: 4153719b0
    O/S info: user: oracle, term: UNKNOWN, ospid: 13839
    OSD pid info: Unix process pid: 13839, image: oracle@xhsqsqyydb1 (m001)

单从上面6月13号的信息,可以看出在进程是在间隔一个小时工作一次,那么这个进程具体在干什么呢?

由于该节点在6月13号上午10点半左右重启主机了,所以9点55分的diag进程dump信息,就是故障最近的信息.

*** 2011-06-13 09:55:15.210
Dumping diagnostic information for ospid 13839:
OS pid = 13839
loadavg : 1.32 1.39 1.34
swap info: free_mem = 12945.98M rsv = 15510.91M
           alloc = 14991.01M avail = 43040.05 swap_free = 43559.95M
F S      UID   PID  PPID   C PRI NI     ADDR     SZ    WCHAN    STIME TTY         TIME CMD
0 S   oracle 13839     1   0  40 20          1598655            May 30             0:26 ora_m001_sqyydb1
13839:  ora_m001_sqyydb1
    SO: 65913ac88, type: 4, owner: 659006f28, flag: INIT/-/-/0x00
    (session) sid: 390 trans: 65ad33310, creator: 659006f28, flag: (8000051) USR/- BSY/-/-/-/-/-
              DID: 0001-003B-0001AAA7, short-term DID: 0001-003B-0001AAA8
              txn branch: 0
              oct: 0, prv: 0, sql: 0, psql: 0, user: 0/SYS
    last wait for 'db file sequential read' blocking sess=0x0 seq=1165 wait_time=329 seconds since wait started=1160552
                file#=3, block#=9ba, blocks=1
    Dumping Session Wait History
     for 'db file sequential read' count=1 wait_time=329
                file#=3, block#=9ba, blocks=1
     for 'db file sequential read' count=1 wait_time=1032
                file#=3, block#=9bf, blocks=1
     for 'db file sequential read' count=1 wait_time=1261
                file#=3, block#=9c4, blocks=1
     for 'db file sequential read' count=1 wait_time=1441
                file#=3, block#=9b9, blocks=1
     for 'db file sequential read' count=1 wait_time=1207
                file#=3, block#=9bb, blocks=1
     for 'gc current block 2-way' count=1 wait_time=655
                =1, =f248, =1
     for 'gc current block 2-way' count=1 wait_time=584
                =1, =f242, =1
     for 'gc current block 2-way' count=1 wait_time=577
                =1, =f237, =1
     for 'gc current block 2-way' count=1 wait_time=511
               =1, =f22a, =1
     for 'gc current block 2-way' count=1 wait_time=629
                =1, =f228, =1

    temporary object counter: 1
      SO: 3f2c90d28, type: 53, owner: 65913ac88, flag: INIT/-/-/0x00
      LIBRARY OBJECT LOCK: lock=3f2c90d28 handle=41470bac0 mode=N
      call pin=0 session pin=0 hpc=0000 hlc=0000
      htl=3f2c90da8[3f28e5600,3f2d871a8] htb=3f28e5600 ssga=3f28e4d08
      user=65913ac88 session=65913d5e8 count=1 flags=[0000] savepoint=0x4de3b89d
      LIBRARY OBJECT HANDLE: handle=41470bac0 mtx=41470bbf0(1) cdp=1
      name=

UPDATE WRI$_DBU_FEATURE_USAGE SET LAST_USAGE_DATE = :B4 , DETECTED_USAGES = DETECTED_USAGES + 1, AUX_COUNT = :B6 , FEATURE_INFO = :B5 , FIRST_USAGE_DATE = DECODE(FIRST_USAGE_DATE, NULL, :B4 , FIRST_USAGE_DATE) WHERE NAME = :B3 AND DBID = :B2 AND VERSION = :B1
      hash=8e08450cee5e6e6a1949b4138f06f27e timestamp=05-30-2011 23:32:41
SO: 3dedfe1d0, type: 53, owner: 65913ac88, flag: INIT/-/-/0x00
      LIBRARY OBJECT LOCK: lock=3dedfe1d0 handle=3a4deef28 mode=N
      call pin=0 session pin=0 hpc=0000 hlc=0000
      htl=3dedfe250[3f291a5f8,3ded23da0] htb=3f28e5160 ssga=3f28e4d08
      user=65913ac88 session=65913d5e8 count=1 flags=[0000] savepoint=0x4de3b899
      LIBRARY OBJECT HANDLE: handle=3a4deef28 mtx=3a4def058(1) cdp=1
      name=SELECT MT.NAME, INST_CHK_METHOD, INST_CHK_LOGIC, USG_DET_METHOD, USG_DET_LOGIC FROM WRI$_DBU_FEATURE_METADATA MT, WRI$_DBU_FEATURE_USAGE FU WHERE MT.NAME = FU.NAME AND FU.DBID = :B2 AND FU.VERSION = :B1
      hash=29b4705a89e57f52f7c5420931506834 timestamp=05-30-2011 23:32:41
   SELECT TYPE, MESSAGE, IMPACT, CNT, TOT_CNT FROM (SELECT DECODE (F.TYPE, 1, 'PROBLEM', 2, 'SYMPTOM', 3, 'ERROR', 4, 'INFORMATION', 5, 'WARNING', 'UNKNOWN') AS TYPE, DBMS_ADVISOR.FORMAT_MESSAGE_GROUP(F.MSG_ID) AS MESSAGE, COUNT(*) AS CNT, SUM(F.IMPACT_VAL) AS IMPACT, RANK() OVER (ORDER BY COUNT(*) DESC) AS RNK, SUM(COUNT(*)) OVER () AS TOT_CNT FROM SYS.WRI$_ADV_TASKS T, SYS.WRI$_ADV_FINDINGS F WHERE T.ADVISOR_ID = 1 AND T.EXEC_START > SYSDATE - 7 AND T.ID = F.TASK_ID AND F.TYPE = 1 GROUP BY F.TYPE, DBMS_ADVIS
      hash=8b9721e0166252ee02eb2a20bf246f98 timestamp=05-30-2011 23:32:43
SO: 65ad33bb0, type: 40, owner: 65ad7ba40, flag: INIT/-/-/0x00
        (trans) flg = 0x04, flg2 = 0x00, prx = 0x65ad33310, ros = 2147483647 bsn = 0x2b9fd04 bndsn = 0x2b9fd06 spn = 0x2b9fd08
        efd = 26
        parent xid: 0x0000.000.00000000
        env: (scn: 0x0002.653cd124  xid: 0x000a.019.0160eab2  uba: 0x0080026d.abdf.0e  statement num=0  parent xid: xid: 0x0000.000.00000000  scn: 0x0002.653cd123 0sch: scn: 0x0000.00000000)
        cev: (spc = 5382  arsp = 0  ubk tsn: 1 rdba: 0x0080026d  useg tsn: 2147483647 rdba: 0x00000000
              hwm uba: 0x0080026d.abdf.0e  col uba: 0x00000000.0000.00
              num bl: 0 bk list: 0x0)
              cr opc: 0x0 spc: 5382 uba: 0x0080026d.abdf.0e
        (enqueue) TX-00010010-00223889  DID: 0001-003B-0001AAA7
        lv: 68 c8 d5 f1 00 02 00 00 00 00 00 00 00 00 00 00  res_flag: 0x45
        res: 0x41542e390, lock_flag: 0x0
        own: 0x65913ac88, sess: 0x65913ac88, prv: 0x65ad33bf8
        slk: 0x0
         xga: 0x0, heap: UGA
        Trans IMU st: 2 Pool index 65535, Redo pool 0x65ad34288, Undo pool 0x65ad34370
        Redo pool range [0x0 0xffffffff7949d800 0xffffffff794a0000]
        Undo pool range [0x0 0xffffffff7949b000 0xffffffff7949d800]
     ------------process 0x4174ae968--------------------
    proc version      : 9797
    Local node        : 0
    pid               : 13839
    lkp_node          : 0
    svr_mode          : 0
    proc state        : KJP_UNFREEZE
    Last drm hb acked : 0
    Total accesses    : 1789545
    Imm.  accesses    : 1788901
    Locks on ASTQ     : 0
    Locks Pending AST : 0
    Granted locks     : 0
    AST_Q:
    PENDING_Q:
    GRANTED_Q:
    SO: 65add3198, type: 3, owner: 659006f28, flag: INIT/-/-/0x00
    (call) sess: cur 65913ac88, rec 65913d5e8, usr 65913ac88; depth: 0
      ----------------------------------------
      SO: 65adc9f60, type: 3, owner: 65add3198, flag: INIT/-/-/0x00
      (call) sess: cur 65913d5e8, rec 65913d5e8, usr 65913ac88; depth: 1
        ----------------------------------------
        SO: 65adcf3e0, type: 3, owner: 65adc9f60, flag: INIT/-/-/0x00
        (call) sess: cur 65913d5e8, rec 65913d5e8, usr 65913ac88; depth: 2
          ----------------------------------------
          SO: 65add3cb0, type: 3, owner: 65adcf3e0, flag: INIT/-/-/0x00
          (call) sess: cur 65913d5e8, rec 65913d5e8, usr 65913ac88; depth: 3
          ----------------------------------------
          SO: 3a6557890, type: 54, owner: 65adcf3e0, flag: INIT/-/-/0x00
         LIBRARY OBJECT PIN: pin=3a6557890 handle=414d18f40 mode=S lock=3dedd0f58
          user=65913ac88 session=65913d5e8 count=565 mask=0011 savepoint=0x2b9a99a flags=[00]
          ----------------------------------------
          SO: 3a6556a38, type: 54, owner: 65adcf3e0, flag: INIT/-/-/0x00
          LIBRARY OBJECT PIN: pin=3a6556a38 handle=411a41fe0 mode=S lock=3f2bfb990
          user=65913ac88 session=65913d5e8 count=2 mask=0011 savepoint=0x2b9a965 flags=[00]
          ----------------------------------------
          SO: 3e3436338, type: 54, owner: 65adcf3e0, flag: INIT/-/-/0x00
          LIBRARY OBJECT PIN: pin=3e3436338 handle=4127ac9d0 mode=S lock=3dec90760
          user=65913ac88 session=65913d5e8 count=565 mask=0011 savepoint=0x2b9a930 flags=[00]
          ----------------------------------------
          SO: 3a6531520, type: 54, owner: 65adcf3e0, flag: INIT/-/-/0x00
          LIBRARY OBJECT PIN: pin=3a6531520 handle=412c736b0 mode=S lock=3f2baeb08
          user=65913ac88 session=65913d5e8 count=2 mask=0011 savepoint=0x2b9a8fb flags=[00]
          ----------------------------------------
          SO: 3a6531da0, type: 54, owner: 65adcf3e0, flag: INIT/-/-/0x00
          LIBRARY OBJECT PIN: pin=3a6531da0 handle=3e27e3f70 mode=S lock=3f2df8618
          user=65913ac88 session=65913d5e8 count=2 mask=0011 savepoint=0x2b9a548 flags=[00]
          ----------------------------------------
          SO: 3a6669780, type: 54, owner: 65adcf3e0, flag: INIT/-/-/0x00
          LIBRARY OBJECT PIN: pin=3a6669780 handle=411a5d810 mode=S lock=3dedeb9b0
          user=65913ac88 session=65913d5e8 count=5 mask=001d savepoint=0x2b9a548 flags=[00]
          ----------------------------------------
          SO: 3dedeb9b0, type: 53, owner: 65adcf3e0, flag: INIT/-/-/0x00
          LIBRARY OBJECT LOCK: lock=3dedeb9b0 handle=411a5d810 mode=S
          call pin=3a6669780 session pin=0 hpc=0000 hlc=0000
          htl=3dedeba30[3f28e4fb0,3dee04b40] htb=3f28e4fb0 ssga=3f28e4d08
          user=65913ac88 session=65913d5e8 count=5 flags=PNC/[0400] savepoint=0x2b9a548
          LIBRARY OBJECT HANDLE: handle=411a5d810 mtx=411a5d940(0) cdp=0
          name=SYS.STANDARD
          hash=efbc27b094c2fbc2eb1b8be457d32119 timestamp=04-18-2003 00:00:00
          namespace=TABL flags=KGHP/TIM/XLR/[00000020]
          kkkk-dddd-llll=0000-001d-001d lock=S pin=S latch#=9 hpc=5ed4 hlc=5ed4
          lwt=411a5d8b8[411a5d8b8,411a5d8b8] ltm=411a5d8c8[411a5d8c8,411a5d8c8]
          pwt=411a5d880[411a5d880,411a5d880] ptm=411a5d890[411a5d890,411a5d890]
          ref=411a5d8e8[411a5d8e8,411a5d8e8] lnd=411a5d900[411a539e8,411b43828]
            LOCK INSTANCE LOCK: id=LBefbc27b094c2fbc2
            PIN INSTANCE LOCK: id=NBefbc27b094c2fbc2 mode=S release=F flags=[00]
            INVALIDATION INSTANCE LOCK: id=IV0000037712010101 mode=S
            LIBRARY OBJECT: object=3f0fb34b8
            type=PCKG flags=EXS/LOC[0005] pflags=NST/IVR[0201] status=VALD load=0

SO: 65ad33310, type: 40, owner: 65adc9f60, flag: INIT/-/-/0x00
        (trans) flg = 0x4001e23, flg2 = 0x80000, prx = 0x0, ros = 2147483647 bsn = 0x2b9a4a0 bndsn = 0x2b9aa9e spn = 0x2b9fd0a
        efd = 4
        parent xid: 0x0000.000.00000000
        env: (scn: 0x0002.653cd067  xid: 0x000a.00c.0160eb63  uba: 0x0080026c.abdf.38  statement num=0  parent xid: xid: 0x0000.000.00000000  scn: 0x0002.653cd124 0sch: scn: 0x0000.00000000)
        cev: (spc = 2474  arsp = 416d0c960  ubk tsn: 1 rdba: 0x0080026c  useg tsn: 1 rdba: 0x00800099
              hwm uba: 0x0080026c.abdf.38  col uba: 0x00000000.0000.00
              num bl: 1 bk list: 0x406d3a8d0)
              cr opc: 0x0 spc: 2474 uba: 0x0080026c.abdf.38
        (enqueue) TX-000A000C-0160EB63  DID: 0001-003B-0001AAA7
        lv: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  res_flag: 0x47
        res: 0x4153eb0d0, mode: X, lock_flag: 0x0
        own: 0x65913ac88, sess: 0x65913ac88, proc: 0x659006f28, prv: 0x4153eb0e0
        slk: 0x0
         xga: 0x0, heap: UGA
        Trans IMU st: 2 Pool index 65535, Redo pool 0x65ad339e8, Undo pool 0x65ad33ad0
        Redo pool range [0xffffffff7981d3a8 0xffffffff7981d3a8 0xffffffff7981fba8]
        Undo pool range [0xffffffff7981aba8 0xffffffff7981aba8 0xffffffff7981d3a8]
         ----------------------------------------
          SO: 406d3a8d0, type: 39, owner: 65ad33310, flag: -/-/-/0x00
          (List of Blocks) next index = 1
          index   itli   buffer hint   rdba       savepoint
          -----------------------------------------------------------
              0      1   0x3c7094708    0xc009b7     0x2b9aa9e
          ----------------------------------------
          SO: 65acb28c0, type: 36, owner: 65ad33310, flag: INIT/-/-/0x00
          DML LOCK: tab=8785 flg=11 chi=0
                    his[0]: mod=3 spn=45722267
          (enqueue) TM-00002251-00000000    DID: 0001-003B-0001AAA7
          lv: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  res_flag: 0x7
          res: 0x4153e9be8, mode: SX, lock_flag: 0x0
          own: 0x65913ac88, sess: 0x65913ac88, proc: 0x659006f28, prv: 0x4153e9bf8
          slk: 0x6592b8fb0
        ----------------------------------------
        SO: 659183f40, type: 5, owner: 65adc9f60, flag: INIT/-/-/0x00
        (enqueue) FU-00000000-00000000  DID: 0001-003B-0001AAA7
        lv: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  res_flag: 0x3
        res: 0x4153e2420, mode: X, lock_flag: 0x0
        own: 0x65913ac88, sess: 0x65913ac88, proc: 0x659006f28, prv: 0x4153e2430
        slk: 0x6592b8d10
      ----------------------------------------
      SO: 65913d5e8, type: 4, owner: 65add3198, flag: INIT/-/-/0x00
      (session) sid: 392 trans: 65ad33310, creator: 0, flag: (2) -/REC -/-/-/-/-/-
                DID: 0000-0000-00000000, short-term DID: 0000-0000-00000000
                txn branch: 0
                oct: 0, prv: 0, sql: 0, psql: 0, user: 0/SYS
      temporary object counter: 0

从上面的几个sql语句,我们可以看出,m001 进程是在处理awr快照,换句话说,也就是m001 在清理快照信息的时候出现异常了。
要验证这个观点,我们可以来看看m001 m002 进程的trace,随便选取故障期间的几个的trace,如下:

*** ACTION NAME:(Auto-Purge Slave Action) 2011-05-30 23:14:46.720
*** MODULE NAME:(MMON_SLAVE) 2011-05-30 23:14:46.720
*** SERVICE NAME:(SYS$BACKGROUND) 2011-05-30 23:14:46.720
*** SESSION ID:(392.14821) 2011-05-30 23:14:46.720
*** KEWROCISTMTEXEC - encountered error: (ORA-00054: resource busy and acquire with NOWAIT specified

)
  *** SQLSTR: total-len=82, dump-len=82,
      STR={alter table WRH$_ACTIVE_SESSION_HISTORY drop partition WRH$_ACTIVE_436568992_30483}
KEWRAPAD: Error=13509 in auto_purge_all_dbids for dbid=436568992.
*** ACTION NAME:(Auto-Purge Slave Action) 2011-06-10 23:27:33.774
*** MODULE NAME:(MMON_SLAVE) 2011-06-10 23:27:33.774
*** SERVICE NAME:(SYS$BACKGROUND) 2011-06-10 23:27:33.774
*** SESSION ID:(876.26029) 2011-06-10 23:27:33.774
*** KEWROCISTMTEXEC - encountered error: (ORA-00054: resource busy and acquire with NOWAIT specified

)
  *** SQLSTR: total-len=82, dump-len=82,
      STR={alter table WRH$_ACTIVE_SESSION_HISTORY drop partition WRH$_ACTIVE_436568992_30483}
KEWRAPAD: Error=13509 in auto_purge_all_dbids for dbid=436568992.

从上面我们可以看出,m00x进程均为mmon的附属进程,用来清除awr快照用的,
很明显,从上面的信息,我们可以发现m001和m002在清理快照时出现异常

由于mmon的附属进程m001,间隔性的启动,其作用是产生awr快照以及清除过期快照信息等,
m001进程出现异常,进程到时相关session,产生FU等待。
m001进程为轻量级进程,其实在故障期间,我完全可以将这个进程kill,进而解决问题。

--------------------------------------ORACLE-DBA----------------------------------------

最权威、专业的Oracle案例资源汇总之【案例】Oracle GES:Potential blocker on resource FU解决思路

本文由大师惜分飞原创分享,网址:http://www.oracleplus.net/arch/1267.html

Oracle研究中心

关键词:

GES:Potential blocker

Oracle AWR报告生成失败的原因