Performance Analysis of a Slow UPDATE Statement (2)

Date: 2020-06-04 Category: Programmer's Life

However, the execution statistics for the SQL statement looked a little odd.

Stat Name                  Statement Total  Per Execution  % Snap Total
Elapsed Time (ms)                8,659,180      13,921.51         54.34
CPU Time (ms)                       69,346         111.49         11.63
Executions                             622
Buffer Gets                      3,146,068       5,057.99         35.91
Disk Reads                         645,229       1,037.35         70.31
Parse Calls                            622           1.00          0.04
Rows                               621,827         999.72
User I/O Wait Time (ms)          8,608,075

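The statement ran for about 8,659 s in total, and roughly 8,608 s of that was spent waiting on user I/O. Setting that wait aside, the 622 executions themselves took very little time.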
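The arithmetic behind that observation can be checked quickly. The sketch below is only an illustration: the figures are copied from the statistics above, and `io_breakdown` is a hypothetical helper name, not part of any Oracle tooling.

```shell
#!/bin/sh
# Figures copied from the AWR SQL statistics above (all times in ms).
elapsed_ms=8659180
io_wait_ms=8608075
executions=622

# io_breakdown (hypothetical helper): prints the share of elapsed time spent
# in user I/O wait, then the per-execution I/O wait and non-I/O time.
io_breakdown() {
    awk -v e="$1" -v w="$2" -v n="$3" 'BEGIN {
        printf "io_wait_pct=%.2f\n", 100 * w / e
        printf "io_wait_per_exec_ms=%.2f\n", w / n
        printf "other_per_exec_ms=%.2f\n", (e - w) / n
    }'
}

io_breakdown "$elapsed_ms" "$io_wait_ms" "$executions"
```

With these inputs, more than 99% of the elapsed time is user I/O wait, and each execution spends under 100 ms on everything else.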
I was puzzled by this myself and began to suspect a problem with disk I/O.
But a check with MegaCli reported no media errors, so there were no bad blocks.
# MegaCli -CfgDsply -a0|grep Error
Media Error Count: 0
Other Error Count: 0
Media Error Count: 0
Other Error Count: 0
Media Error Count: 0
Other Error Count: 0
Media Error Count: 0
Other Error Count: 0
Media Error Count: 0
Other Error Count: 0
Media Error Count: 0
Other Error Count: 0
Media Error Count: 0
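Reading a long list of counters by eye is error-prone, so the check can be reduced to a single number. This is a minimal sketch, assuming the output format shown above; `count_nonzero_errors` is a hypothetical helper, and the sample input stands in for real MegaCli output.

```shell
#!/bin/sh
# count_nonzero_errors (hypothetical helper): reads MegaCli-style output on
# stdin and prints how many "Error Count" lines carry a value above zero.
count_nonzero_errors() {
    awk -F': *' '/Error Count/ && $2 + 0 > 0 { n++ } END { print n + 0 }'
}

# Sample input in the same format as the output above. On a real host the
# input would instead come from: MegaCli -CfgDsply -a0
printf 'Media Error Count: 0\nOther Error Count: 0\n' | count_nonzero_errors
```

A result of 0 means every counter in the input was zero.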
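At this point, one guess was that a mismatch in bind variable data types was causing the SQL performance problem. After some investigation, though, that did not turn out to be the case.
The bind values were all VARCHAR2, as expected, so the full table scan I had suspected (caused by an implicit type conversion) could not be occurring during execution.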
select name,datatype_string,value_string,datatype from DBA_HIST_SQLBIND where sql_id='94p345yuqh3zd' and snap_id between 58711 and 58712
NAME DATATYPE_STRING VALUE_STRING DATATYPE
------------------------------ --------------- ------------------------------ ----------
:1 VARCHAR2(128) xxxxxx9@test.com 1
:1 VARCHAR2(128) 23234324324234 1
As for the I/O bottleneck, the ADDM report gave me what I needed.
This is how the ADDM report describes disk throughput:

FINDING 6: 39% impact (6136 seconds)
------------------------------------
The throughput of the I/O subsystem was significantly lower than expected.

RECOMMENDATION 1: Host Configuration, 39% benefit (6136 seconds)
ACTION: Consider increasing the throughput of the I/O subsystem.
Oracle's recommended solution is to stripe all data file using the
SAME methodology. You might also need to increase the number of disks
for better performance. Alternatively, consider using Oracle's
Automatic Storage Management solution.
RATIONALE: During the analysis period, the average data files' I/O
throughput was 3.9 M per second for reads and 2.7 M per second for
writes. The average response time for single block reads was 16
milliseconds.
This section does explain the problem: there is a significant I/O bottleneck, and these wait delays are the main cause of the sharp rise in DB time.
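As a rough cross-check, the statement's own numbers can be compared with the 16 ms single-block read time that ADDM reports. The sketch below divides the statement's user I/O wait by its disk reads; it assumes most of that wait comes from block reads, which the statistics do not confirm, and `avg_ms_per_read` is a hypothetical helper.

```shell
#!/bin/sh
# Figures copied from the SQL statistics earlier in this article.
io_wait_ms=8608075
disk_reads=645229

# avg_ms_per_read (hypothetical helper): average user I/O wait per block read,
# assuming the wait is dominated by those reads.
avg_ms_per_read() {
    awk -v w="$1" -v r="$2" 'BEGIN { printf "%.2f\n", w / r }'
}

avg_ms_per_read "$io_wait_ms" "$disk_reads"
```

The result is about 13 ms per block read, the same order of magnitude as the ADDM figure, and with roughly 1,037 reads per execution it accounts for most of the nearly 14 s each execution takes.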
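Of course, we cannot simply follow ADDM's advice and switch to ASM; that is not something that can be done right away.
The AWR report did offer a few clues, though, pointing to some supplementary tuning measures.
The first is the size of the shared pool. This database has about 1,000 sessions, but because automatic SGA management is in use, the shared pool had been squeezed down to about 1.4 GB out of 30 GB. For a database supporting more than 1,000 sessions, that is clearly too little. Setting an explicit, somewhat larger value would keep the shared pool from being squeezed dry.
The second is that, besides user I/O, the UPDATE may also be slow because a very frequently executed SQL statement scans the same table, which causes contention on hot blocks. Supporting consistent reads also inevitably consumes a good deal of undo, and the undo setting on this database is relatively small, so it could be increased.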
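For the shared pool suggestion, one possible approach is to write the statements to a script that a DBA reviews before running. The sketch below only prints SQL and does not connect to a database. The helper name `emit_shared_pool_floor_sql` and the 4G value are assumptions for illustration, not figures from the analysis above.

```shell
#!/bin/sh
# emit_shared_pool_floor_sql (hypothetical helper): prints SQL for review.
# $1 is the desired minimum shared pool size, e.g. 4G (an assumed value).
emit_shared_pool_floor_sql() {
    floor_size="$1"
    cat <<EOF
-- Current shared pool size and any user-specified minimum, in MB.
SELECT component,
       current_size / 1024 / 1024 AS current_mb,
       user_specified_size / 1024 / 1024 AS floor_mb
  FROM v\$sga_dynamic_components
 WHERE component = 'shared pool';

-- With sga_target set, an explicit shared_pool_size acts as a minimum.
-- SCOPE = BOTH assumes the instance was started from an spfile.
ALTER SYSTEM SET shared_pool_size = ${floor_size} SCOPE = BOTH;
EOF
}

emit_shared_pool_floor_sql 4G
```

Under automatic SGA management the explicit value is a lower bound rather than a fixed size, so the instance can still grow the shared pool beyond it when memory allows.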

When reposting, please credit the source: https://www.heiqu.com/f49f704cc65117d7bbf6bfb8b1b1b607.html
