sitemap

RSS地图

收藏本站

设为首页

Oracle研究中心

当前位置:Oracle研究中心 > 运维DBA > Oracle性能优化 >

【学习笔记】Oracle优化 记录一次数据库优化的完整过程及思路

时间:2016-07-31 18:07   来源:Oracle研究中心   作者:惜分飞   点击:

天萃荷净 分享一篇关于Oracle数据库优化的完整过程,配置Hugepage,减小SGA、优化因为主键表频繁插入引起的user$,con$,cdef$递归查询sql。

最近对客户的一个数据库进行了优化,在本次优化过程中,主要涉及以下方面:
1. 确保系统有足够的内存,处理方法配置Hugepage,减小SGA
2. 优化因为主键表频繁插入引起的user$,con$,cdef$递归查询sql

SQL> select c.name, u.name from con$ c, cdef$ cd, user$ u where 
   2 c.con# = cd.con# and cd.enabled = :1 and c.owner# = u.user#;


Execution Plan
----------------------------------------------------------
Plan hash value: 2409458995

-----------------------------------------------------------------------------
| Id  | Operation           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |       |  3820 |   164K|    38   (6)| 00:00:01 |
|*  1 |  HASH JOIN          |       |  3820 |   164K|    38   (6)| 00:00:01 |
|   2 |   TABLE ACCESS FULL | USER$ |    64 |   896 |     3   (0)| 00:00:01 |
|*  3 |   HASH JOIN         |       |  3820 |   111K|    34   (3)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| CDEF$ |  3820 | 34380 |    25   (0)| 00:00:01 |
|   5 |    TABLE ACCESS FULL| CON$  |  6368 |   130K|     8   (0)| 00:00:01 |
-----------------------------------------------------------------------------

====================================================================================================================
一次数据库优化全过程分析 硬件配置
 
Name Value Name Value
Hostname localhost OS Type Linux
Memory Size 32949016kB Swap Size 18481144kB
CPU Count 8 CPU MHz 2128.052
注:该系统除了跑oracle外,还跑了 10个java程序,每个程序大概占用800M内存
 
数据库信息
 
DB Name DB Id Instance Inst num Release |rac| Host
ORCL 1346827858 orcl 1 10.2.0.1.0 NO localhost.localdomain
pga_aggregate_target 5368709120
   
processes 6000
sessions 6605
sga—max—size 21474836480
   
sga_target 21474836480
优化前awr报告信息
 
  Snap Id Snap Time Sessions Cursors/Session
Begin Snap: 3863 06-Dec-13 08:00:57 1896 2.1
End Snap: 3864 06-Dec-13 09:01:00 1855 2.3
Elapsed:   60.06 (mins)    
DB Time:   108,269.34 (mins)    
这里可以发现1) dbtime>>Elapsed,2) session数较多
  Per Second Per Transaction
Redo size: 211,446.31 5,999.57
Logical reads: 11,590.99 328.88
Block changes: 1,345.06 38.16
Physical reads: 5.82 0.17
Physical writes: 113.75 3.23
User calls: 737.58 20.93
Parses: 388.42 11.02
Hard parses: 0.04 0.00
Sorts: 15.15 0.43
Logons: 0.20 0.01
Executes: 392.82 11.15
Transactions: 35.24  
这里显示系统各个指标都不是非常大,证明系统不是真正忙 Top 5等待事件
 
Event |waits| Time(s) Avg Wait(ms) % Total Call Time Wait Class
row cache lock 254,389 493,942 1,942 75.9 Concurrency
enq: TX - row lock contention 5,091 11,582 2,275 1.8 Application
enq: SQ - contention 11,497 11,110 966 1.7 Configuration
log file sync 426,449 9,757 23 1.5 Commit
CPU time   3,419   .5  
这里大量 row cache lock,TX,enq: SQ,log file sync 等待
 
分析 enq: SQ - contention
 
Statistic Name Time (s)  
sql execute elapsed time 638,308.40 98.04
sequence load elapsed time 617,895.85 94.91
Enqueue Type (Request Reason) Requests Succ Gets Failed Gets Waits Wt Time (s) Av Wt Time(ms)
TX-Transaction (row lock contention) 1,683 1,646 0 1,646 11,721 7,120.83
SQ-Sequence Cache 17,474 17,474 0 9,987 11,378 1,139.2
通过分析,证明seq调用确实非常频繁,需要检查cache,并考虑增加cache Top sql执行情况
 
Buffer Gets Executions Gets per Exec %Total CPU Time (s) Elapsed Time (s) SQL Id SQL Module SQL Text
65,613,997 463,555 141.55 76.49 1985.48 2388.82 7gtztzv329wg0   select c.name, u.name from co...
25,403,304 159,619 159.15 29.61 872.32 242310.28 07kx6r40n3fmu JDBC Thin Client insert into zyzhfee.np一mobile...
23,697,081 145,389 162.99 27.63 721.50 212920.96 dg54j6dm04vp0 JDBC Thin Client insert into zyzhfee.np一client...
23,551,863 144,006 163.55 27.46 762.06 212638.86 chpbbd3k1gdm1 JDBC Thin Client insert into zyzhfee.np一client...
2,399,608 15,101 158.90 2.80 74.49 25416.02 8wq8m7jgk49yy JDBC Thin Client insert into zyzhfee.np一client...
检查系统状态
 
top - 11:22:08 up 15:26^     4 users, load average: 9.04j 7.29, 6.67 Tasks: 2113 total, 10 ruimingj 2103 sleepingj        0 stoppedj 0 zombie Cpu(s): 14. IJSus, 66. TSKsy, 0. OMni, ll.BJSid, 7.4Kwa,  0. OShi, 0. 3%si,    0. OSKst Mem: 32949016k -total, 32863840k used, 85176k free, 1376k buffers Svap: 18481144k total, 6716484k used, 11764660k free, 3114184k cached
 
PID USER PR NI VIRT RES SHR S «CPU MEM TIME+ COMAHD
571 root 20 -5 0 0 □ E 99.9 0. □ 51:21.78 [kswapdD]
572 root 20 -5 0 0 OR 86.0 0. 0 49:58. 28 [kswapdl]
16589 oracle 18 0 76144 1124 920 E 43. 9 0. 0 0:03.90 tar czvf dataCollector_loca!
16500 oracle 18 0 4044 624 252 D 34.7 0. □ □:11.80 gzip
4994 oracle 17 0 20. lg 97m 91m E 34.0 0. 3 0:02.06 oracleorcl (L0CAL=N0)
16597 oracle 19 0 98. 9m 148 108 D 23.2 0. 0 0:00.75 ora_m000_orcl
3532 oracle 15 0 20. lg 99m 95m S 22.3 0. 3 0:02.69 oracleorcl (L0CAL=H0)
14866 luogb 25 0 954m 3D2m 6180 S 21.0 0. 9 1: 22. 28 /usr/java/jdkl. 6. 0_32/biri/j:
5438 oracle 15 0 20. 2g 56m 56m R 19.2 0. 2 11:42.84 ora_lgwr_orcl
5184 oracle 16 0 20. lg 98m 93m S 18.9 0. 3 0:04.21 oracleorcl (L0CAL=N0)
l -Djava.util.logging, config. file=/home/lu
 
[oracle@localhost ■total 〜]$ free -m used free
Mem: 32176 32082 93
-/+ buffers/cache: 29035 3141
Swap: 18047 7554 10493
shared   buffers 0 1
 
cached 3046
 
[oracle@localhost ^]$ more /proc/meminf〇
 
MemT ot al: MeniFree: Buffers: Cached: SwapCached: Act ive: Inact ive: HighTotal: HighFree: LowTotal: LowFree: Swap Total: SwapFree: Dirty: Writeback: AnonPages: Mapped: Slab: PageTables:
 
32949016 92444 2592 2926344 590172 17450272 127408 32949016 92444 18481144 10757972 20016 14577344 2818160 221504 14740844
 
kB kB kB kB kB kB kB kB kB kB kB kB kB kB kB kB kB kB kB
 
WFS_Unstable: Bounce: Commit Limit: Committed一AS: VmallocTotal: VmallocUsed: VmallocChunk: Hug eP ag e s_ T ot al: HugePages_Free:
 
0 kB 0 kB 34955652 kB 60634280 kB 34359738367 kB 272032 kB 34359466091 kB 0 0
 
Hug eP ag e s_Rsvd:  0 Hugepagesize:     2048 kB 通过检查系统发现系统有大量的swap使用,内存明显不足
 
第一次优化分析 系统未配置大页,pagetables占用了 14G以上内存,内存明显不足使用了大量的swap,而 评估系统的物理读非常小,决定配置HugePage,减小sga,增加seq cache到200
 
第一次优化后结果
 
  Snap Id Snap Time Sessions Cursors/Session
Begin Snap: 3890 07-Dec-13 11:00:31 386 2.3
End Snap: 3891 07-Dec-13 12:00:33 387 2.3
Elapsed:   60.03 (mins)    
DB Time:   62.34 (mins)    
整体db time己经下降了很多,session数也己经下降的很厉害,初步证明第一次优化比较成 功
 
Per Second                                 Per Transaction
 
Redo size: 390,991.20 3,325.74
Logical reads: 29,522.02 251.11
Block changes: 2,539.28 21.60
Physical reads: 0.09 0.00
Physical writes: 274.86 2.34
User calls: 2,774.90 23.60
Parses: 957.68 8.15
Hard parses: 0.04 0.00
Sorts: 50.30 0.43
Logons: 0.03 0.00
Executes: 977.81 8.32
Transactions: 117.57  
Event Waits Time(s) Avg Wait(ms) % Total Call Time Wait Class
CPU time   3,312   88.5  
log file sync 451,977 540 1 14.4 Commit
log file parallel write 789,519 188 0 5.0 System I/O
SQL*Net break/reset to client 1,152,660 186 0 5.0 Application
log file switch completion 265 28 106 .8 Configuration
等待事件主要变成了和log file相关
 
Buffer Gets Executions Gets per Exec %Total CPU Time (s) Elapsed Time (s) SQL Id SQL Module SQL Text
81,547,858 576,202 141.53 76.69 2174.15 2174.83 7gtztzv329wg0   select c.name, u.name from co...
30,682,401 198,544 154.54 28.85 837.10 842.80 07kx6r40n3fmu JDBC Thin Client insert into zyzhfee.np一mobile...
28,685,157 181,020 158.46 26.98 743.53 744.41 dg54j6dm04vp0 JDBC Thin Client insert into zyzhfee.np一client...
28,424,227 178,618 159.13 26.73 741.08 743.07 chpbbd3k1gdm1 JDBC Thin Client insert into zyzhfee.np一client...
3,087,083 20,021 154.19 2.90 79.60 79.70 8wa8m7lak49vv JDBC insert into
              Thin zyzhfee.np一client...
              Client  
Sql里面,系统负载下降,sql执行时间变短,因为seq cache増加,所以每次的逻辑读稍微 减小,但是依然比较大, 而且select c.name, u.name from con$ c, cdef$ cd, user$ u where c.con# = cd.con# and cd.enabled = :1 and c.owner# = u.user# S〇L语句占据太大资源, 对于该语句进行第二次优化
 
第二次优化分析 Top sql中几条插入频繁的语句都有主键,而且是通过主键来控制数据一致性,而在验证数据一致性的时候, 就会执行如下sql语句: select c.name, u.name from con$ c, cdef$ cd, user$ u where c.con# = cd.con# and cd.enabled = :1 and c.owner# = u.user# 要想降低该语句和insert语句的逻辑读,就需要降低该sql的逻辑读,分析该sql执行计划 SQL> select NAME,POSITION, DATATYPE,DATATYPE_STRING,VALUE_STRING, LAST_CAPTURED 2   from V$SQL_BIND_CAPTURE where SQL_ID=,&sql_id, 3   / Enter value for sql_id: 7gtztzv329wg0 old 2: from V$SQL—BIND—CAPTURE where SQL_ID=’&sql—id’ new 2: from V$SQL—BIND—CAPTURE where SQL_ID=’7gtztzv329wg0’
 
NAME
 
DATATYPE DATATYPE_STRING
 
VALUE_STRING
 
LAST_CAPT
 
2 NUMBER 53681 10-DEC-13
 
POSITION
 
1
 
SQL> set autot trace exp stat SQL> set lines 120 SQL> select c.name, u. name from con$ c, cdef$ cd, user$ u where c. con# = cd. con# and cd. enabled = 53681 and c.owner# = u.user#;
 
Execution Plan
 
Plan hash value: 2409458995
 
| Id Operation | Name | Rows | Bytes Cost (%CPU) Time |
| 0 SELECT STATEMENT   3820 | 164K| 38 00:00:01 |
|* 1 HASH JOIN   3820 | 164K| 38 00:00:01 |
| 2 TABLE ACCESS FULL | USER$ 64 | 896 | 3 (0) 00:00:01 |
|* 3 HASH JOIN   3820 | 111K| 34 00:00:01 |
|* 4 TABLE ACCESS FULL | CDEF$ 3820 | 34380 | 25 (0) 00:00:01 |
| 5 TABLE ACCESS FULL | CON$ | 6368 | 130K| 8 (0) 00:00:01 |
Predicate Information (identified by operation id):
 
1 - access(〃C〃.〃OWNER#〃=〃U〃.〃USER#〃) 3 - access("C"."CON#"="CD"."CON#") 4 - filter(〃CD〃.〃ENABLED〃=53681) Statistics
  1 recursive calls
  0 db block gets
  150 consistent gets
  0 physical reads
  0 redo size
  598 bytes sent via SQL*Net to client
  469 bytes received via SQL*Net from client
  2 SQL*Net roundtrips to/from client
  0 sorts (memory)
  0 sorts (disk)
  1 rows processed
该sql走全表扫描,而在CDEF$* C〇N$表中的相关index未被正常使用
 
Table Tablespace                  Status TyPe
 
Degree Cache
 
Len
 
Analyzed
 
SYS USER$ SYSTEM Cluster VALID 64 4 83 0 1 N 2013-10-07
      Table               22:00:50
SYS CDEF$ SYSTEM Cluster VALID 6367 108 41 0 1 N 2013-12-04
      Table               22:04:50
SYS CON$ SYSTEM Normal VALID 6368 28 21 0 1 N 2013-12-04
      Table               22:04:46
因为这个查询本身是insert有主键的表的一个递归sql语句,出现如此效率低的情况,应该 属于异常情况,查询mos发现 Bug 5103126 - Insert of duplicate rows with unique constraint slow (Doc ID 5103126.8) ug 9290526 Poor plan for recursive SQL used for DML involving a UNIQUE constraint Bug 5103126解决方法是升级到10.2.0.2并打上patch Bug 9290526描述版本不符合,但是也是同样类型,提供workaround使用rbo 因为不想给数据库升级,而且基表不能加hint,以前测试sql profile控制基表也可能不行, 因此我无法使用rbo
 
第二次优化后结果
 
尝试使用analyze重新收集信息 analyze table cdef$ compute statistics; analyze table con$ compute statistics;
 
执行计划变为 SQL> select c. name, u. name from con$ c, cdef$ cd, user$ u where c. con# = cd. con# and cd. enabled = 53681 and c.owner# = u.user#;
 
Execution Plan
 
Plan hash value: 4014731293
 
 
1 Id Operation | Name | Rows Bytes | Cost (%CPU) Time |
| 0 SELECT STATEMENT | | 5 210 11 (10) 00:00:01 |
|* 1 HASH JOIN | | 5 210 11 (10) 00:00:01 |
| 2 NESTED LOOPS | | 5 140 | 7 (0) 00:00:01 |
| 3 TABLE ACCESS BY INDEX ROWID| CDEF$ | 5 35 | 2 (0) 00:00:01 |
|* 4 INDEX RANGE SCAN | I_CDEF4 | 5   1 (0) 00:00:01 |
I 5 | TABLE ACCESS BY INDEX ROWID| CON$ | 1 | 21 | 1 (0)| 00:00:01
I* 6 | INDEX UNIQUE SCAN | I_CON2 | 1 | | 0 (0)| 00:00:01
| 7 | TABLE ACCESS FULL | USER$ | 64 | 896 | 3 (0)| 00:00:01
Predicate Information (identified by operation id):
 
1 - access(〃C〃.〃OWNER#〃=〃 U〃. 〃USER#〃)
4 - access("CD"."ENABLED" =53681)
6 - access(〃C〃.〃CON#〃=〃CD 〃.〃CON#〃)
Statistics    
0 recursive calls  
0 db block gets  
13 consistent gets
0 physical reads  
0 redo size  
598 bytes sent via SQL*Net to client
469 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)  
0 sorts (disk)  
rows processed  
该sql的逻辑读从150下降到了 13,优化效果明显 再次收集awr信息
 
  Snap Id Snap Time Sessions Cursors/Session
Begin Snap: 3970 10-Dec-13 18:10:19 423 2.0
End Snap: 3971 10-Dec-13 18:18:49 446 2.0
Elapsed:   8.50 (mins)    
DB Time:   3.58 (mins)    
Db time下降到Elapsed —半不到
 
  Per Second Per Transaction
Redo size: 343,234.65 3,069.30
Logical reads: 7,447.03 66.59
Block changes: 2,009.48 17.97
Physical reads: 0.03 0.00
Physical writes: 288.62 2.58
User calls: 2,907.96 26.00
     
Parses: 1,026.52 9.18
Hard parses: 0.04 0.00
Sorts: 46.61 0.42
Logons: 0.08 0.00
Executes: 1,033.69 9.24
Transactions: 111.83  
逻辑读从以前的每秒3w左右下降到7千多
 
Buffer Gets Executions Gets per Exec %Total CPU Time (s) Elapsed Time (s) SQL Id SQL Module SQL Text
605,127 77,057 7.85 15.94 13.54 13.54 7gtztzv329wg0   select c.name, u.name from co...
569,914 23,390 24.37 15.01 11.80 11.80 chpbbd3k1gdm1 JDBC Thin Client insert into zyzhfee.np一client...
549,291 23,776 23.10 14.47 10.59 10.59 dg54j6dm04vp0 JDBC Thin Client insert into zyzhfee.np一client...
548,928 26,770 20.51 14.46 15.61 15.75 07kx6r40n3fmu JDBC Thin Client insert into zyzhfee.np一mobile...
第一条sql逻辑读从整体的70%以上下降到16%左右,相应的insert语句的逻辑读从原先的 150以上下降到25以下,优化效果非常明显
 
至此该系统绝大部分算优化完工
 
 
 

--------------------------------------ORACLE-DBA----------------------------------------

最权威、专业的Oracle案例资源汇总之【学习笔记】Oracle优化 记录一次数据库优化的完整过程及思路

本文由大师惜分飞原创分享,网址:http://www.oracleplus.net/arch/922.html

Oracle研究中心

关键词:

Oracle优化

主键表频繁插入引起的user$

con$

cdef$递归查询sql