sitemap

RSS地图

收藏本站

设为首页

Oracle研究中心

当前位置:Oracle研究中心 > 运维DBA > Oracle AWR >

学习笔记:Oracle awr报告案例 oracle进程占系统资源大 通过awr一

时间:2016-05-15 23:27   来源:Oracle研究中心   作者:惜分飞   点击:

天萃荷净

深入学习Oracle awr报告,一步步分析导致系统负载较高的进程

一个客户反馈说有一系统经常性负载比较高,让我帮忙分析原因
系统负载情况

[oracle@zwq-kfdialdb ~]$ top -c
top - 17:11:06 up 78 days,  1:12,  5 users,  load average: 124.83, 125.90, 112.13
Tasks: 836 total, 152 running, 684 sleeping,   0 stopped,   0 zombie
Cpu(s): 98.1%us,  0.1%sy,  0.0%ni,  1.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  132070908k total, 90494280k used, 41576628k free,  1147384k buffers
Swap: 67108856k total,        0k used, 67108856k free, 79109904k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                
12179 oracle    15   0 32.5g 142m 139m R 39.0  0.1   6:09.20 oracleahdial (LOCAL=NO)
11530 oracle    16   0 32.5g 469m 466m R 36.1  0.4  11:14.23 oracleahdial (LOCAL=NO)
11816 oracle    15   0 32.5g 467m 463m R 36.1  0.4   6:33.86 oracleahdial (LOCAL=NO)
11577 oracle    15   0 32.5g 480m 477m R 34.7  0.4   7:15.98 oracleahdial (LOCAL=NO)
12136 oracle    16   0 32.5g 455m 452m R 31.9  0.4   9:07.88 oracleahdial (LOCAL=NO)
11237 oracle    16   0 32.5g 997m 992m R 31.2  0.8  20:53.50 oracleahdial (LOCAL=NO)
11427 oracle    16   0 32.5g 137m 135m R 31.2  0.1  11:50.16 oracleahdial (LOCAL=NO)
12051 oracle    16   0 32.5g 459m 456m R 31.2  0.4   6:12.67 oracleahdial (LOCAL=NO)

[oracle@zwq-kfdialdb ~]$ vmstat 3 10
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
137  0      0 41566600 1147384 79109888    0    0     0     3    0    0  0  0 99  0  0
146  0      0 41567548 1147384 79109888    0    0     0    47 1058 32190 98  0  2  0  0
150  0      0 41568056 1147384 79109888    0    0     0    39 1081 31543 98  0  2  0  0
12  0      0 41568464 1147384 79109888    0    0     0    23 1056 32111 98  0  2  0  0
62  0      0 41568776 1147384 79109888    0    0     0    11 1067 31474 98  0  2  0  0
108  0      0 41568304 1147384 79109888    0    0     0    39 1059 31193 98  0  2  0  0
140  0      0 41569280 1147384 79109888    0    0     0    48 1063 31171 98  0  2  0  0
140  0      0 41569444 1147384 79109888    0    0     0    40 1075 30508 98  0  2  0  0

通过top和vmstat看出系统现在负载很高,主要都是用户进程导致.

查询等待事件

SQL> select event from v$session where wait_class#<>6;

EVENT
----------------------------------------------------------------
cursor: mutex S

SQL> /

EVENT
----------------------------------------------------------------
cursor: mutex S

SQL> /

EVENT
----------------------------------------------------------------
cursor: mutex S

SQL> /

EVENT
----------------------------------------------------------------
cursor: mutex S

SQL> SELECT a.*, s.sql_text
  2    FROM v$sql s,
  3         (SELECT sid,
  4                 event,
  5                 wait_class,
  6                 p1 cursor_hash_value,
  7                 p2raw Mutex_value,
  8                 TO_NUMBER (SUBSTR (p2raw, 1, 8), 'xxxxxxxx') hold_mutex_x_sid
  9            FROM v$session_wait
 10           WHERE event LIKE 'cursor%') a
 11   WHERE s.HASH_VALUE = a.cursor_hash_value
 12  /

no rows selected

SQL>  select event from v$session where wait_class#<>6;

EVENT
----------------------------------------------------------------
SQL*Net message to client

数据库开始的等待事件只有cursor: mutex S,等该等待事件消失后系统负载也恢复正常

再次查看系统负载

[oracle@zwq-kfdialdb ~]$ top -c -i10
top - 17:13:51 up 78 days,  1:15,  6 users,  load average: 12.57, 78.21, 96.45
Tasks: 702 total,   2 running, 700 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.5%us,  0.2%sy,  0.0%ni, 99.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  132070908k total, 86477808k used, 45593100k free,  1147500k buffers
Swap: 67108856k total,        0k used, 67108856k free, 79116036k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND        
15045 oracle    16   0 13136 1476  724 R 10.6  0.0   0:00.12 top -c -i10    
10010 root      10  -5     0    0    0 D  0.0  0.0   0:00.68 [kondemand/4]  
10019 root      10  -5     0    0    0 D  0.0  0.0   1:41.58 [kondemand/13] 
10020 root      10  -5     0    0    0 D  0.0  0.0   1:52.28 [kondemand/14] 
10021 root      10  -5     0    0    0 R  0.0  0.0   2:01.54 [kondemand/15] 
12166 root      24   0 10084  300  216 D  0.0  0.0   0:00.00 /opt/VRTSgab/gablogd

[oracle@zwq-kfdialdb ~]$  vmstat 3 10
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 45484264 1147528 79117360    0    0     0     3    0    0  0  0 99  0  0
 2  0      0 45468452 1147528 79117456    0    0     4   161 1372 5369  2  2 97  0  0
 3  0      0 45463712 1147528 79117584    0    0     4   187 1602 7253  6  0 93  0  0
 1  0      0 45458220 1147528 79117648    0    0     1    99 1358 5821  2  0 98  0  0
 0  0      0 45475168 1147528 79117712    0    0     0    41 1321 5321  2  0 98  0  0
 0  0      0 45473624 1147528 79117744    0    0     3   104 1378 5455  2  0 98  0  0
 2  0      0 45474656 1147528 79117776    0    0     0    55 1196 4872  1  0 99  0  0
 0  0      0 45474376 1147532 79117824    0    0     8   113 1170 4990  2  0 98  0  0
 1  0      0 45475440 1147532 79117872    0    0     1    56 1187 5301  3  0 97  0  0
 1  0      0 45475824 1147532 79117888    0    0     3    99 1083 4643  3  0 97  0  0

结合上面的等待事件查询,我们可以大概评估出来,当cursor: mutex S等待消失后,系统负载也恢复正常,现在已经不存在环境,如果要找出问题只能够是借助AWR和ASH

分析ASH
Top User Events

Top SQL with Top Events

Activity Over Time

通过对ASH分析,可以大概确定,在这段时间内,引起系统负载高主要是cursor: mutex S导致

分析AWR
awr整体信息(从这里看数据库是相当的繁忙)

Load Profile(从这里看数据库业务比较小)

Top 5 Timed Foreground Events(主要等待事件cursor: mutex S,和前面分析相符)

OS LOAD(虽然和系统看到有一定出入,但是整体还是展示系统负载较高)

SQL ordered by Version Count(出现cursor: mutex S,因为load profile中解析不多,所以想到高版本问题,这里确实非常高)

通过这里的一些列分析,我们已经基本上可以确定,该数据库因为高版本问题导致cursor: mutex S以及library cache 相关等待严重,从而出现系统负载过高.

找出高版本原因
高版本相关信息和查询请见:关于High Versions Count总结

SQL> select * from table(version_rpt('f8b9tba7sfsb5')); 

COLUMN_VALUE
--------------------------------------------------------------------------------
Version Count Report Version 3.2.1 -- Today's Date 19-jul-12 18:13
RDBMS Version :11.2.0.1.0 Host: zwq-kfdialdb Instance 1 : ahdial
==================================================================
Addr: 000000080FA4CEA0  Hash_Value: 2408014181  SQL_ID f8b9tba7sfsb5
Sharable_Mem: 206315729 bytes   Parses: 48689
Stmt:
0 select count(*) as col_0_0_ from TBL_SP_SALES_RECORDS tblspsales
1 0_ where tblspsales0_.SALES_RECORDS_STATUS=:"SYS_B_0" and tblsps
2 ales0_.MOBILE_TELE_NO=:"SYS_B_1"
3


COLUMN_VALUE
--------------------------------------------------------------------------------
Versions Summary
----------------
AUTH_CHECK_MISMATCH :4
TRANSLATION_MISMATCH :4
ROLL_INVALID_MISMATCH :10219
PURGED_CURSOR :9

Total Versions:10219

Plan Hash Value Summary
-----------------------

COLUMN_VALUE
--------------------------------------------------------------------------------
Plan Hash Value Count
=============== =====
      791727930 920
     2820478500 9300

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Details for AUTH_CHECK_MISMATCH :

  # of Ver PARSING_USER_ID PARSING_SCHEMA_ID PARSING_SCHEMA_NAME
========== =============== ================= ===================
     10218               75                75 HOLLYSP

COLUMN_VALUE
--------------------------------------------------------------------------------
         2              107               107 HOLLYSP_TEST
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Details for TRANSLATION_MISMATCH :

Summary of objects probably causing TRANSLATION_MISMATCH

  Object# Owner.Object_Name
========= =================
    76737 HOLLYSP.TBL_SP_SALES_RECORDS
 HOLLYSP.TBL_SP_SALES_RECORDS
 HOLLYSP_TEST.TBL_SP_SALES_RECORDS

COLUMN_VALUE
--------------------------------------------------------------------------------
   107043 HOLLYSP_TEST.TBL_SP_SALES_RECORDS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Details for ROLL_INVALID_MISMATCH :

No details available
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Details for PURGED_CURSOR :

No details available
####
To further debug Ask Oracle Support for the appropiate level LLL.

COLUMN_VALUE
--------------------------------------------------------------------------------
alter session set events
 'immediate trace name cursortrace address 2408014181, level LLL';
To turn it off do use address 1, level 2147483648
================================================================

59 rows selected.

SQL> select * from table(version_rpt('6zhjf3qh8gyp9'));   

COLUMN_VALUE
--------------------------------------------------------------------------------
Version Count Report Version 3.2.1 -- Today's Date 19-jul-12 18:15
RDBMS Version :11.2.0.1.0 Host: zwq-kfdialdb Instance 1 : ahdial
==================================================================
Addr: 00000007FD46D2A8  Hash_Value: 2693266089  SQL_ID 6zhjf3qh8gyp9
Sharable_Mem: 111904227 bytes   Parses: 4880
Stmt:
0 select count(*) as col_0_0_ from TBL_SP_SALES_RECORDS tblspsales
1 0_ where tblspsales0_.SALES_RECORDS_STATUS=:"SYS_B_0" and tblsps
2 ales0_.CAMPAIGN_ID=:"SYS_B_1" and tblspsales0_.IS_CONN=:"SYS_B_2
3 " and tblspsales0_.SALES_TIME>=:"SYS_B_3" and tblspsales0_.SALES
4 _PERSON=:"SYS_B_4"

COLUMN_VALUE
--------------------------------------------------------------------------------
5

Versions Summary
----------------
BIND_MISMATCH :4804
INCOMP_LTRL_MISMATCH :372
HASH_MATCH_FAILED :4936

Total Versions:4935

Plan Hash Value Summary

COLUMN_VALUE
--------------------------------------------------------------------------------
-----------------------
Plan Hash Value Count
=============== =====
     1645985080 3
     2040125427 4933

~~~~~~~~~~~~~~~~~~~~~~~~~~~
Details for BIND_MISMATCH :

Consolidated details for :
BIND_MISMATCH,USER_BIND_PEEK_MISMATCH,BIND_UACS_DIFF and

COLUMN_VALUE
--------------------------------------------------------------------------------
BIND_EQUIV_FAILURE (Mislabled as ROW_LEVEL_SEC_MISMATCH BY bug 6964441 in 11gR1)

from v$sql_bind_capture
COUNT(*) POSITION MIN(MAX_LENGTH) MAX(MAX_LENGTH) DATATYPE (PRECISION,SCALE)
======== ======== =============== =============== ======== ================
    4936        1              32              32        1 (,)
    4936        2              32              32        1 (,)
    4936        3              32              32        1 (,)
    4936        4              32              32        1 (,)
    4936        5              32              32        1 (,)


COLUMN_VALUE
--------------------------------------------------------------------------------
SUM(DECODE(column,Y, 1, 0) FROM V$SQL
IS_OBSOLETE IS_BIND_SENSITIVE IS_BIND_AWARE IS_SHAREABLE
=========== ================= ============= ============
          0              4309             0         4309
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Details for INCOMP_LTRL_MISMATCH :

No details available
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Details for HASH_MATCH_FAILED :


COLUMN_VALUE
--------------------------------------------------------------------------------
No details available
####
To further debug Ask Oracle Support for the appropiate level LLL.
alter session set events
 'immediate trace name cursortrace address 2693266089, level LLL';
To turn it off do use address 1, level 2147483648
================================================================

62 rows selected.

根据经验,出现这么多的高版本情况,很可能是cursor_sharing参数设置问题

--对应sql语句
select count(*) as col_0_0_ from TBL_SP_SALES_RECORDS tblspsales0_ 
where tblspsales0_.SALES_RECORDS_STATUS=:"SYS_B_0" and 
tblspsales0_.CAMPAIGN_ID=:"SYS_B_1" and tblspsales0_.IS_CONN=:"SYS_B_2" 
and tblspsales0_.SALES_TIME>=:"SYS_B_3" and tblspsales0_.SALES_PERSON=:"SYS_B_4"

--cursor_sharing参数
SQL> show parameter cursor_sharing;

NAME                                 TYPE                             VALUE
------------------------------------ -------------------------------- ----------------
cursor_sharing                       string                           SIMILAR

根据oracle官方建议在11g中不推荐使用cursor_sharing=SIMILAR,其实在所有版本中都不推荐,设置为该值很容易导致高版本问题.而且该值会出现莫名其妙的,无法解释的高版本问题.而且根据oracle相关文档,在即将发布的12c版本中,将除掉SIMILAR值.对于客户库的该问题,因为很多sql未绑定参数,为了减少硬解析,建议在业务低谷时设置cursor_sharing=FORCE,并刷新shared pool.

 

-----------------温馨提示--------------------
操作有风险,动手需谨慎
Oracle研究中心
http://www.oracleplus.net
本文由大师惜分飞原创分享,转载请尽量保留本站网址

--------------------------------------ORACLE-DBA----------------------------------------

最权威、专业的Oracle案例资源汇总之学习笔记:Oracle awr报告案例 oracle进程占系统资源大 通过awr一

本文由大师惜分飞原创分享,网址:http://www.oracleplus.net/arch/186.html

Oracle研究中心

关键词:

Oracle awr报告案例

oracle进程占操作系统资源非常大

通过awr报告一步步分析数据库等待事件