sitemap

RSS地图

收藏本站

设为首页

Oracle研究中心

当前位置:Oracle研究中心 > 故障案例 >

【案例】分析ORACLE数据库多块读很慢 单块读很快的案例分析

时间:2016-11-06 10:37   来源:Oracle研究中心   作者:HTZ   点击:

天萃荷净 Oracle研究中心案例分析:分析ORACLE数据库多块读很慢、单块读很快的案例分析

下面是整个案例的分析过程,分析了很久了,今天终于有时间整理一下,记录在ONENOTE里面。

1 环境介绍与现象描述


数据库多块读很慢,单块读、写都很正常,多块读的平均响应时间是单块都的10倍。

本系统数据文件存放在VXFS上面,数据库是从9i通过DG方式升级到11G,9i使用的是raw.

2 分析与测试过程


2.1 AWR中收集单块读与多块读的平均响应时间

下面是从AWR中统计每个快照的单块读与多块读的响应时间

SQL> set ver off pages 50000 lines 200 tab off
SQL> col EVENT_NAME for a30
SQL> col total_waits for 99999999999999
SQL> col total_time_s for 999999999.999
SQL> col avg_time_ms for 999999999.999
SQL> BREAK ON inst SKIP 1
SQL> select instance_number inst,
  2         to_char(time, 'YYYY-MM-DD HH24:MI:SS') time,
  3         event_name,
  4         sum(delta_total_waits) total_waits,
  5         round(sum(delta_time_waited / 1000000), 3) total_time_s,
  6         round(sum(delta_time_waited) /
  7               decode(sum(delta_total_waits), 0, null, sum(delta_total_waits)) / 1000,
  8               3) avg_time_ms
  9    from (select hse.instance_number,
10                 hse.snap_id,
11                 trunc(sysdate - 30 + 1) +
12                 trunc((cast(hs.begin_interval_time as date) -
13                       (trunc(sysdate - 10 + 1))) * 24 /
14                       (1)) * (1) / 24 time,
15                 EVENT_NAME,
16                 (lead(TOTAL_WAITS, 1)
17                  over(partition by hse.instance_number,
18                       hs.STARTUP_TIME,
19                       EVENT_NAME order by hse.snap_id)) - TOTAL_WAITS delta_total_waits,
20                 (lead(TIME_WAITED_MICRO, 1)
21                  over(partition by hse.instance_number,
22                       hs.STARTUP_TIME,
23                       EVENT_NAME order by hse.snap_id)) - TIME_WAITED_MICRO delta_time_waited
24            from DBA_HIST_SYSTEM_EVENT hse, DBA_HIST_SNAPSHOT hs
25           where hse.snap_id = hs.snap_id
26             and hs.begin_interval_time >= trunc(sysdate) - 30 + 1
27             and hse.EVENT_NAME in ('db file sequential read','db file scattered read')) a
28   group by instance_number, time, event_name
29   order by 1, 2, 3;

      INST TIME                EVENT_NAME                         TOTAL_WAITS   TOTAL_TIME_S    AVG_TIME_MS
---------- ------------------- ------------------------------ --------------- -------------- --------------
         1 2016-01-21 22:00:00 db file scattered read                  116532       8003.375         68.680
           2016-01-21 22:00:00 db file sequential read                 451345       2617.070          5.798
           2016-01-21 23:00:00 db file scattered read                  509375      60848.652        119.457
           2016-01-21 23:00:00 db file sequential read                1032694       9415.338          9.117
           2016-01-22 00:00:00 db file scattered read                  511267      55300.596        108.164
           2016-01-22 00:00:00 db file sequential read                 665804       5953.620          8.942
           2016-01-22 01:00:00 db file scattered read                  212010      13539.024         63.860
           2016-01-22 01:00:00 db file sequential read                 368793       3063.742          8.307
           2016-01-22 02:00:00 db file scattered read                  161036      19300.935        119.855
           2016-01-22 02:00:00 db file sequential read                 144915       1313.259          9.062
           2016-01-22 03:00:00 db file scattered read                  646679      84395.856        130.507
           2016-01-22 03:00:00 db file sequential read                 335820       3152.971          9.389
           2016-01-22 04:00:00 db file scattered read                  617742      67478.919        109.235
           2016-01-22 04:00:00 db file sequential read                 288044       2563.102          8.898
           2016-01-22 05:00:00 db file scattered read                  493513      59254.417        120.067
           2016-01-22 05:00:00 db file sequential read                 991351       9743.101          9.828
           2016-01-22 06:00:00 db file scattered read                  467135      47366.993        101.399
           2016-01-22 06:00:00 db file sequential read                1511804      13600.312          8.996
           2016-01-22 07:00:00 db file scattered read                  305532      32673.273        106.939
           2016-01-22 07:00:00 db file sequential read                1451034      13513.703          9.313
           2016-01-22 08:00:00 db file scattered read                  469331      33309.734         70.973
           2016-01-22 08:00:00 db file sequential read                1586451      12038.511          7.588
           2016-01-22 09:00:00 db file scattered read                  409206      26532.131         64.838
           2016-01-22 09:00:00 db file sequential read                1876402      11685.365          6.228
           2016-01-22 10:00:00 db file scattered read                  218379      15000.749         68.691
           2016-01-22 10:00:00 db file sequential read                1012738       7195.984          7.105
           2016-01-22 11:00:00 db file scattered read                  280045      27214.605         97.179
           2016-01-22 11:00:00 db file sequential read                 775662       5751.906          7.415
           2016-01-22 12:00:00 db file scattered read                  271739      21270.681         78.276
           2016-01-22 12:00:00 db file sequential read                1013600       6862.144          6.770
           2016-01-22 13:00:00 db file scattered read                   91427       8502.492         92.998
           2016-01-22 13:00:00 db file sequential read                 994440       6992.188          7.031
           2016-01-22 14:00:00 db file scattered read                  143148       6822.649         47.662
           2016-01-22 14:00:00 db file sequential read                 897827       4978.166          5.545
           2016-01-22 15:00:00 db file scattered read                   80698       4790.735         59.366
           2016-01-22 15:00:00 db file sequential read                 507671       2940.978          5.793
          
.................
2.2 手动采集多块读与物理读

以为是AWR出来的结果不正确,于是自己写个脚本来收集数据

      --1,创建表
drop table system.htz_system_event ;
drop sequence system.htz_system_event_seq;
create sequence system.htz_system_event_seq;
create table system.htz_system_event(start_time date,id number,name varchar2(64),TOTAL_WAITS number,TOTAL_TIMEOUTS number,TIME_WAITED number,AVERAGE_WAIT number,TIME_WAITED_MICRO number,TOTAL_WAITS_FG number,TOTAL_TIMEOUTS_FG number,TIME_WAITED_FG number,AVERAGE_WAIT_FG number,TIME_WAITED_MICRO_FG number) tablespace ODS_MODEL_SETT;
--2,自动作业插入数据,每10分钟插入一次
VARIABLE jobno NUMBER;
BEGIN
   DBMS_JOB.SUBMIT (
      :jobno,
      'DECLARE
      v_sequence   NUMBER;
      BEGIN
          SELECT system.htz_system_event_seq.nextval INTO v_sequence FROM DUAL;
          insert into system.htz_system_event select sysdate,v_sequence,event,TOTAL_WAITS,TOTAL_TIMEOUTS,TIME_WAITED,AVERAGE_WAIT,TIME_WAITED_MICRO,TOTAL_WAITS_FG,TOTAL_TIMEOUTS_FG,TIME_WAITED_FG,AVERAGE_WAIT_FG,TIME_WAITED_MICRO_FG from v$system_event where wait_class in (''User I/O'',''System I/O'');
      END;',
      SYSDATE,
      'SYSDATE + 10/1440');
   COMMIT;
END;
/
PL/SQL procedure successfully completed.

PRINT jobno

SQL> PRINT jobno

               JOBNO
--------------------
                 703
SQL> set echo off
SQL> set lines 200 pages 3000 heading on
SQL> col name for a30
SQL> col start_time for a11
SQL> col TOTAL_WAITS for 9999999999
SQL> col lag_total_waits for 9999999999
SQL> col TIME_WAITED_MICRO for 9999999999
SQL> col lag_TIME_WAITED_MICRO for 9999999999
SQL> col id for 999999
SQL> col AVERAGE_WAIT for 9999.99
SQL> col AVG_WAIT for 9999.99
SQL>   SELECT TO_CHAR (start_time, 'dd hh24:mi:ss') start_time,
  2           id,
  3           NAME,
  4           TRUNC (a.TOTAL_WAITS / 10000) TOTAL_WAITS,
  5           TRUNC (LAG (total_waits) OVER (PARTITION BY NAME ORDER BY id) / 1000)
  6              lag_total_waits,
  7           TRUNC (a.TIME_WAITED_MICRO / 1000000) TIME_WAITED_MICRO,
  8           TRUNC (
  9                LAG (TIME_WAITED_MICRO) OVER (PARTITION BY name ORDER BY id)
10              / 1000000)
11              lag_TIME_WAITED_MICRO,
12           a.AVERAGE_WAIT,
13           ROUND (
14              DECODE (
15                   total_waits
16                 - LAG (total_waits) OVER (PARTITION BY NAME ORDER BY id),
17                 0, NULL,
18                 (  (  TIME_WAITED_MICRO
19                     - LAG (TIME_WAITED_MICRO)
20                          OVER (PARTITION BY name ORDER BY id))
21                  / (  total_waits
22                     - LAG (total_waits) OVER (PARTITION BY NAME ORDER BY id))
23                  / 1000)),
24              2)
25              avg_wait
26      FROM SYSTEM.htz_system_event a
27     WHERE name IN ('db file scattered read')
28  ORDER BY name, start_time
29  /

START_TIME       ID NAME                           TOTAL_WAITS LAG_TOTAL_WAITS TIME_WAITED_MICRO LAG_TIME_WAITED_MICRO AVERAGE_WAIT AVG_WAIT
----------- ------- ------------------------------ ----------- --------------- ----------------- --------------------- ------------ --------
19 11:18:55       1 db file scattered read              160921                         124946964                               7.76
19 11:28:56       2 db file scattered read              160925         1609219         124950029             124946964         7.76    93.00
19 11:38:57       3 db file scattered read              160927         1609252         124952805             124950029         7.76   107.95
19 11:48:59       4 db file scattered read              160932         1609278         124955488             124952805         7.76    53.90
19 11:59:00       5 db file scattered read              160935         1609328         124957105             124955488         7.76    60.11
19 12:09:02       6 db file scattered read              160939         1609355         124961852             124957105         7.76   108.29
19 12:19:04       7 db file scattered read              160945         1609398         124968591             124961852         7.76   127.37
19 12:29:05       8 db file scattered read              160949         1609451         124974549             124968591         7.76   124.38
19 12:39:07       9 db file scattered read              160953         1609499         124979139             124974549         7.76   131.63
19 12:49:08      10 db file scattered read              160957         1609534         124983884             124979139         7.77   125.27
19 12:59:10      11 db file scattered read              160959         1609572         124986769             124983884         7.77   116.43
19 13:09:12      12 db file scattered read              160969         1609597         124995455             124986769         7.77    90.32
19 13:19:13      13 db file scattered read              160974         1609693         125001595             124995455         7.77   115.52
19 13:29:14      14 db file scattered read              160980         1609746         125008342             125001595         7.77   115.31
19 13:39:16      15 db file scattered read              160984         1609805         125012003             125008342         7.77    92.16
19 13:49:17      16 db file scattered read              160988         1609844         125016167             125012003         7.77   105.47
19 13:59:18      17 db file scattered read              160990         1609884         125018127             125016167         7.77    96.30
19 14:09:20      18 db file scattered read              160994         1609904         125023065             125018127         7.77   109.13
19 14:19:21      19 db file scattered read              160997         1609949         125025060             125023065         7.77    80.29
19 14:29:23      20 db file scattered read              161003         1609974         125028340             125025060         7.77    58.06
19 14:39:24      21 db file scattered read              161008         1610031         125030265             125028340         7.77    37.58
19 14:49:25      22 db file scattered read              161008         1610082         125030809             125030265         7.77    89.22
19 14:59:27      23 db file scattered read              161009         1610088         125031182             125030809         7.77    86.79
这里手动收集出来跟从AWR里面查询出来的数据差不多.



2.3 查询文件的碎片

环境中使用的VXFS,所以我们查询一下文件的碎片

    [htz.pw:oracle]/opt/VRTS/bin>./fsmap -aH /data12/oradata/dblv32g_0225.dbf|more
                    Volume  Extent Type     File Offset     Extent Size     File
                vol_data12         Data         0 Bytes        16.00 KB     /data12/oradata/dblv32g_0225.dbf
                vol_data12         Data        16.00 KB       128.00 KB     /data12/oradata/dblv32g_0225.dbf
                vol_data12         Data       144.00 KB        32.00 KB     /data12/oradata/dblv32g_0225.dbf
                vol_data12         Data       176.00 KB        32.00 KB     /data12/oradata/dblv32g_0225.dbf
                vol_data12         Data       208.00 KB        32.00 KB     /data12/oradata/dblv32g_0225.dbf
                vol_data12         Data       240.00 KB        32.00 KB     /data12/oradata/dblv32g_0225.dbf
                vol_data12         Data       272.00 KB        32.00 KB     /data12/oradata/dblv32g_0225.dbf
                vol_data12         Data       304.00 KB        32.00 KB     /data12/oradata/dblv32g_0225.dbf
                vol_data12         Data       336.00 KB        32.00 KB     /data12/oradata/dblv32g_0225.dbf
                vol_data12         Data       368.00 KB        24.00 KB     /data12/oradata/dblv32g_0225.dbf
                vol_data12         Data       392.00 KB        32.00 KB     /data12/oradata/dblv32g_0225.dbf
                vol_data12         Data       424.00 KB        32.00 KB     /data12/oradata/dblv32g_0225.dbf
                vol_data12         Data       456.00 KB        32.00 KB     /data12/oradata/dblv32g_0225.dbf
                vol_data12         Data       488.00 KB        32.00 KB     /data12/oradata/dblv32g_0225.dbf
                vol_data12         Data       520.00 KB       128.00 KB     /data12/oradata/dblv32g_0225.dbf
                vol_data12         Data       648.00 KB       128.00 KB     /data12/oradata/dblv32g_0225.dbf
                ............
这里看到文件系统上面每个区的大小都是32KB,文件存在碎片,并且很严重。在VXFS里面文件的空间分配其实跟ORACLE一样的,连续的块构成区,但是区于区之间有可能不连续的。ORACLE一次IO最大是1M,如果一次多块读1M,需要读32个VXFS里面的区,那么一次ORACLE的IO在VXFS里面间接的变成了32次IO。



2.4 手动创建表空间,查看数据文件的碎片

下面手动创建一个10G的数据文件,看看文件的碎片如何。

SQL> create tablespace htz_test datafile '/data14/oradata/htz_test01.dbf' size 10G autoextend off;

Tablespace created.

[htz.pw:oracle]/oracle/app/oracle/admin/htz/htz>/opt/VRTS/bin/fsmap -aH /data14/oradata/htz_test01.dbf
                    Volume  Extent Type     File Offset     Extent Size     File
                vol_data14         Data         0 Bytes        10.00 GB     /data14/oradata/htz_test01.dbf
[htz.pw:oracle]/oracle/app/oracle/admin/htz/htz>ls -l  /data14/oradata/htz_test01.dbf
-rw-r-----    1 oracle   dba      10737426432 Feb 19 14:36 /data14/oradata/htz_test01.dbf
可以发现10G的数据文件,只生成了一个区,跟我们之前的数据文件相差很大。



2.5 创建表用于测试IO

这里创建两张表,htz_test_table存放在之前我们创建的表空间中(新表空间),htz_test_table1存放在9i环境中就有的表空间(旧表空间)



  SQL> create table system.htz_test_table nologging tablespace htz_test  parallel 5 as select /*+ parallel(a 5)*/* from PU_DM.ACCT_CREDIT_ALL_D a where rownum <1000000;
  SQL> create table system.htz_test_table1 tablespace ODS_MODEL as select * from system.htz_test_table;
2.5.1 通过TRUSS,10046来对上面2张表进行分析



旧表空间的表

SQL> alter session set "_serial_direct_read"='NEVER';

Session altered.

Elapsed: 00:00:00.00
SQL> @trace_10046_my.sql
Statement processed.
Statement processed.
/oracle/app/oracle/diag/rdbms/puods/puhtz.pw/trace/puhtz.pw_ora_63111800.trc
'oradebug event 10046 trace name context off'


select count(*) from system.htz_test_table1;


truss -Ddfo /expdata/63111800.log -p 63111800


SQL> select count(*) from system.htz_test_table1;

  COUNT(*)
----------
  63999936

Elapsed: 00:11:18.68

这里看到SQL执行SQL是11分钟
SQL> select * from v$session_event where sid in (select sid from v$mystat) ;

       SID EVENT                                    TOTAL_WAITS TOTAL_TIMEOUTS TIME_WAITED AVERAGE_WAIT   MAX_WAIT TIME_WAITED_MICRO   EVENT_ID WAIT_CLASS_ID WAIT_CLASS# WAIT_CLASS
---------- ---------------------------------------- ----------- -------------- ----------- ------------ ---------- ----------------- ---------- ------------- ----------- --------------------
      1659 Disk file operations I/O                         528              0           1            0          0              7621  166678035    1740759767           8 User I/O
      1659 log file sync                                      1              0           0          .06          0               596 1328744198    3386400367           5 Commit
      1659 db file sequential read                            4              0           3          .66          1             26371 2652584166    1740759767           8 User I/O
      1659 db file scattered read                          4543              0       65910        14.51         51         659102331  506183215    1740759767           8 User I/O
      1659 SQL*Net message to client                         18              0           0            0          0                98 2067390145    2000153315           7 Network
      1659 SQL*Net message from client                       17              0       24352       1432.5       7812         243524787 1421975091    2723168908           6 Idle
      1659 events in waitclass Other                          7              0           0   Oracle о        .05          0              3262 1736664284    1893977003           0 Other

7 rows selected.
     平均的多块读的时间是14.51   ,单块读的时间是0.66
新表空间的表



truss -Ddfo /expdata/45089246.log -p 45089246


SQL> @myspid

SPID
------------------------
45089246

SQL> @mysid.sql

USERENV('SID')
--------------
          1659

SQL> alter session set "_serial_direct_read"='NEVER';

Session altered.

SQL> set lines 200                                                            
SQL> col wait_class for a20                                                   
SQL> col event for a40                                                        
SQL> select * from v$session_event where sid in (select sid from v$mystat) ;  
                                                                         

       SID EVENT                                    TOTAL_WAITS TOTAL_TIMEOUTS TIME_WAITED AVERAGE_WAIT   MAX_WAIT TIME_WAITED_MICRO   EVENT_ID WAIT_CLASS_ID WAIT_CLASS# WAIT_CLASS
---------- ---------------------------------------- ----------- -------------- ----------- ------------ ---------- ----------------- ---------- ------------- ----------- --------------------
      1659 Disk file operations I/O                           1              0           0            0          0                12  166678035    1740759767           8 User I/O
      1659 log file sync                                      1              0           0           .1          0              1031 1328744198    3386400367           5 Commit
      1659 SQL*Net message to client                         11              0           0            0          0                24 2067390145    2000153315           7 Network
      1659 SQL*Net message from client                       10              0        2594       259.36        935          25935800 1421975091    2723168908           6 Idle

SQL> SQL> set timing on
SQL> @trace_10046_my.sql
Statement processed.
Statement processed.
/oracle/app/oracle/diag/rdbms/puods/puhtz.pw/trace/puhtz.pw_ora_45089246.trc
'oradebug event 10046 trace name context off'
SQL> select count(*) from system.htz_test_table;

  COUNT(*)
----------
  63999936

Elapsed: 00:00:37.47

这里看到时间是0.37S
SQL> select * from v$session_event where sid in (select sid from v$mystat) ;

       SID EVENT                                    TOTAL_WAITS TOTAL_TIMEOUTS TIME_WAITED AVERAGE_WAIT   MAX_WAIT TIME_WAITED_MICRO   EVENT_ID WAIT_CLASS_ID WAIT_CLASS# WAIT_CLASS
---------- ---------------------------------------- ----------- -------------- ----------- ------------ ---------- ----------------- ---------- ------------- ----------- --------------------
      1659 Disk file operations I/O                           4              0           0            0          0                69  166678035    1740759767           8 User I/O
      1659 log file sync                                      1              0           0           .1          0              1031 1328744198    3386400367           5 Commit
      1659 db file sequential read                            9              0           5          .57          1             51512 2652584166    1740759767           8 User I/O
      1659 db file scattered read                          4540              0        1696          .37         34          16957162  506183215    1740759767           8 User I/O
      1659 db file parallel read                              5              0          19         3.78          5            188920  834992820    1740759767           8 User I/O
      1659 SQL*Net message to client                         18              0           0            0          0                69 2067390145    2000153315           7 Network
      1659 SQL*Net message from client                       17              0       13426       789.77       5328         134261271 1421975091    2723168908           6 Idle

7 rows selected.

Elapsed: 00:00:03.51
    多块读的平均时间是0.37,单块读是0.57
2.5.2 TRUSS文件分析

D:\temp>cat 45089246.log|grep lseek|awk  "{print $3}"|awk -F : "{sum+=$1} END {print sum}"
13.9234
D:\temp>cat 63111800.log|grep lseek|awk  "{print $3}"|awk -F : "{sum+=$1} END {print sum}"
531.661

D:\temp>cat 63111800.log|grep lseek|wc -l
5044

D:\temp>cat 45089246.log|grep lseek|wc -l
3332
这里没有贴出详细的内容,只给了有问题的函数

通过TRUSS文件,可以看到整个函数中,主要是由于LSEEK函数消耗的时间差异很大,才导致整个SQL运行很慢。

2.5.3 通过10046时间分析

新表空间的10046统计信息
1
旧表空间的10046统计信息
2

3 分析总结

由于9i使用DG方式升级到11G,在9I中还没有odm,所有在还原数据文件到vxfs时,vxfs会尽量将文件分散,这就导致后来的区不连续,多块读很慢。

4,解决方案

解决方案很简单,其实可以通过sf自己的工具在线做,但是sf工程师说目前没有成功的案例,不做,所以就只能通过ORACLE数据的方法了,使用rman backup as copy将数据文件从一个目录copy到另外一个目录,通过调用odm功能向vxfs里面写数据就可以了。


本文固定链接: http://www.htz.pw/2016/03/13/%e5%88%86%e6%9e%90oracle%e6%95%b0%e6%8d%ae%e5%ba%93%e5%a4%9a%e5%9d%97%e8%af%bb%e5%be%88%e6%85%a2%e3%80%81%e5%8d%95%e5%9d%97%e8%af%bb%e5%be%88%e5%bf%ab%e7%9a%84%e6%a1%88%e4%be%8b%e5%88%86%e6%9e%90.html | 认真就输

--------------------------------------ORACLE-DBA----------------------------------------

最权威、专业的Oracle案例资源汇总之【案例】分析ORACLE数据库多块读很慢 单块读很快的案例分析

本文由大师惜分飞原创分享,网址:http://www.oracleplus.net/arch/1185.html

Oracle研究中心

关键词:

Oracle优化笔记

Oracle多块读很慢解决办法

Oracle 多块读与单块读优化方法