在DB2应用使用过程中,我们经常会碰到应用响应时间很慢,甚至没有响应,但是应用服务器可能并不是很繁忙,cpu利用率也非常低,引起这种状况的原因有很多种,比如环境问题,应用资源泄漏,数据库原因等等,本文主要是从一次应用性能诊断过程来谈谈如何通过数据库诊断应用性能问题。
问题:
测试过程中发现应用中某个跳转页面执行时间比较长,系统压力不大,cpu利用很低,该页面需要从cache中取数据,第一次的时候加载cache(从数据库中查询回数据并cache)。
诊断:
页面逻辑比较简单,我们先用loadrunner模拟并发测试一下这个页面,然后再数据库端捕获sql执行情况。
1、打开db2监控开关
#db2 connect to eos
#db2 update monitor switches using statement on
#db2 reset monitor all
2、几分钟之后,我们收集sql统计快照
#db2 get snapshot for dynamic sql on eos > dysqlstatus.out
现在统计信息已经存放在dysqlstatus.out中,你可以使用任意方便的文本处理工具查看,我一般用windows上的gvim来处理,打开dysqlstatus.out
Number of executions = 1
Number of compilations = 1
Worst preparation time (ms) = 2
Best preparation time (ms) = 2
Internal rows deleted = 0
Internal rows inserted = 0
Rows read = 2
Internal rows updated = 0
Rows written = 0
Statement sorts = 0
Statement sort overflows = 0
Total sort time = 0
Buffer pool data logical reads = Not Collected
Buffer pool data physical reads = Not Collected
Buffer pool temporary data logical reads = Not Collected
Buffer pool temporary data physical reads = Not Collected
Buffer pool index logical reads = Not Collected
Buffer pool index physical reads = Not Collected
Buffer pool temporary index logical reads = Not Collected
Buffer pool temporary index physical reads = Not Collected
Total execution time (sec.ms) = 0.000377
Total user cpu time (sec.ms) = 0.010000
Total system cpu time (sec.ms) = 0.000000
Statement text = select ACTIVITYDEFID,ACTIVITYINSTID from wfworkitem where PROCESSINSTID=104199 and CURRENTSTATE = 4
......
简单说一下vi中的处理
:g!/Total execution time/d
只保留文本中的sql执行时间,我们要按照执行时间来排序
通过vim的visual功能选择执行时间块(等号后面的数字),然后排序
Total execution time (sec.ms) = 0.050590
Total execution time (sec.ms) = 0.000170
Total execution time (sec.ms) = 0.000247
Total execution time (sec.ms) = 0.000292
Total execution time (sec.ms) = 0.000474
Total execution time (sec.ms) = 0.000330
Total execution time (sec.ms) = 0.000348
Total execution time (sec.ms) = 0.000279
Total execution time (sec.ms) = 0.000385
Total execution time (sec.ms) = 0.000296
Total execution time (sec.ms) = 0.000261
Total execution time (sec.ms) = 0.000195
Total execution time (sec.ms) = 0.000226
Total execution time (sec.ms) = 0.000227
Total execution time (sec.ms) = 0.000193
......
:'<,'>!sort
排序后的结果(部分)
Total execution time (sec.ms) = 2.027776
Total execution time (sec.ms) = 2.203624
Total execution time (sec.ms) = 2.504677
Total execution time (sec.ms) = 2.951256
Total execution time (sec.ms) = 3.119875
Total execution time (sec.ms) = 3.303277
Total execution time (sec.ms) = 3.303517
Total execution time (sec.ms) = 4.017133
Total execution time (sec.ms) = 4.043329
Total execution time (sec.ms) = 4.252125
Total execution time (sec.ms) = 4.400952
Total execution time (sec.ms) = 4.606765
Total execution time (sec.ms) = 5.208087
Total execution time (sec.ms) = 5.778598
Total execution time (sec.ms) = 8.117470
Total execution time (sec.ms) = 9797.905136
可以看到最长时间的sql total执行时间耗费了3797.905123s.
现在我们到dysqlstatus.out中去找这条语句
Number of executions = 4602
Number of compilations = 4294967295
Worst preparation time (ms) = 2
Best preparation time (ms) = 2
Internal rows deleted = 0
Internal rows inserted = 0
Rows read = 2963688
Internal rows updated = 0
Rows written = 0
Statement sorts = 0
Statement sort overflows = 0
Total sort time = 0
Buffer pool data logical reads = Not Collected
Buffer pool data physical reads = Not Collected
Buffer pool temporary data logical reads = Not Collected
Buffer pool temporary data physical reads = Not Collected
Buffer pool index logical reads = Not Collected
Buffer pool index physical reads = Not Collected
Buffer pool temporary index logical reads = Not Collected
Buffer pool temporary index physical reads = Not Collected
Total execution time (sec.ms) = 9797.905136
Total user cpu time (sec.ms) = 9.290000
Total system cpu time (sec.ms) = 1.230000
Statement text = select * from XXXX_T_CNFACTIVITYDEF
这条语句总共执行了4602次,平均每次的执行时间2S,而且这些数据应该是被cache起来的 ;)
总结:
上面的方法简单总结了从数据库层面对应用的性能问题诊断,希望对大家有所帮助,对于数据库快照诊断问题的思路对于任意数据库通用