resmgr:cpu quantum导致的性能问题

本站文章除注明转载外，均为本站原创： 转载自love wife love life —Roger的Oracle/MySQL/PostgreSQL数据恢复博客

昨天给某客户升级的系统，经过TDE加密以后，今天上午出现严重的性能问题，表现的现象如下：
从top可以看到，cpu消耗非常之高，基本上在90~95%左右，奇怪的是user消耗只有30~40%左右，
大部分是sys消耗，断定db出问题了，通过QQ远程客户，进行了如下处理：
通过
SQL> select event,count(*) from v$session group by event;
之后, 发现有70个左右的 resmgr:cpu quantum 等待，另外还有2~5个 asynch descriptor resize。

昨天给某客户升级的系统，经过TDE加密以后，今天上午出现严重的性能问题，表现的现象如下：

从top可以看到，cpu消耗非常之高，基本上在90~95%左右，奇怪的是user消耗只有30~40%左右，

大部分是sys消耗，断定db出问题了，通过QQ远程客户，进行了如下处理：

通过

SQL> select event,count(*) from v$session group by event;

之后, 发现有70个左右的 resmgr:cpu quantum 等待，另外还有2~5个 asynch descriptor resize。

但从等待事件来看，都没遇到过，查询mos发现了相关的资料：
High "Resmgr:Cpu Quantum" Wait Events In 11g Even When Resource Manager Is Disabled [ID 949033.1]
11g: Scheduler Maintenance Tasks or Autotasks [ID 756734.1]
Large Waits With The Wait Event "Resmgr:Cpu Quantum" [ID 806893.1]
于是做出了如下调整：

但从等待事件来看，都没遇到过，查询mos发现了相关的资料：

High "Resmgr:Cpu Quantum" Wait Events In 11g Even When Resource Manager Is Disabled [ID 949033.1]

11g: Scheduler Maintenance Tasks or Autotasks [ID 756734.1]

Large Waits With The Wait Event "Resmgr:Cpu Quantum" [ID 806893.1]

于是做出了如下调整：

alter system set resource_manager_plan='';
execute dbms_scheduler.set_attribute('WEEKNIGHT_WINDOW','RESOURCE_PLAN','');
execute dbms_scheduler.set_attribute('WEEKEND_WINDOW','RESOURCE_PLAN','');
execute dbms_scheduler.set_attribute('MONDAY_WINDOW','RESOURCE_PLAN','');
execute dbms_scheduler.set_attribute('TUESDAY_WINDOW','RESOURCE_PLAN','');
execute dbms_scheduler.set_attribute('WEDNESDAY_WINDOW','RESOURCE_PLAN','');
execute dbms_scheduler.set_attribute('THURSDAY_WINDOW','RESOURCE_PLAN','');
execute dbms_scheduler.set_attribute('FRIDAY_WINDOW','RESOURCE_PLAN','');
execute dbms_scheduler.set_attribute('SATURDAY_WINDOW','RESOURCE_PLAN','');
execute dbms_scheduler.set_attribute('SUNDAY_WINDOW','RESOURCE_PLAN','');
BEGIN
DBMS_AUTO_TASK_ADMIN.DISABLE(client_name => 'sql tuning advisor',
operation => NULL,
window_name => NULL);
END;
/

alter system set resource_manager_plan='';

execute dbms_scheduler.set_attribute('WEEKNIGHT_WINDOW','RESOURCE_PLAN','');

execute dbms_scheduler.set_attribute('WEEKEND_WINDOW','RESOURCE_PLAN','');

execute dbms_scheduler.set_attribute('MONDAY_WINDOW','RESOURCE_PLAN','');

execute dbms_scheduler.set_attribute('TUESDAY_WINDOW','RESOURCE_PLAN','');

execute dbms_scheduler.set_attribute('WEDNESDAY_WINDOW','RESOURCE_PLAN','');

execute dbms_scheduler.set_attribute('THURSDAY_WINDOW','RESOURCE_PLAN','');

execute dbms_scheduler.set_attribute('FRIDAY_WINDOW','RESOURCE_PLAN','');

execute dbms_scheduler.set_attribute('SATURDAY_WINDOW','RESOURCE_PLAN','');

execute dbms_scheduler.set_attribute('SUNDAY_WINDOW','RESOURCE_PLAN','');

BEGIN

DBMS_AUTO_TASK_ADMIN.DISABLE(client_name => 'sql tuning advisor',

operation => NULL,

window_name => NULL);

END;

另外关于event asynch descriptor resize，从字面上理解就知道跟异步IO有关，当时通过top中cpu消耗较高的
几个进程，我关联v$process, v$session查询发现有几个进程所持有的event竟然是asynch descriptor resize，
于是做了如下调整：

另外关于event asynch descriptor resize，从字面上理解就知道跟异步IO有关，当时通过top中cpu消耗较高的

几个进程，我关联v$process, v$session查询发现有几个进程所持有的event竟然是asynch descriptor resize，

于是做了如下调整：

alter system set disk_async_io=false scope=spfile;
alter system filesystemio_options=none scope=spfile;

1 2	alter system set disk_async_io=false scope=spfile; alter system filesystemio_options=none scope=spfile;

然后跟开发商沟通了一下，重启了一下db，重启后观察主机资源情况，发现基本上正常了。
为了安抚客户，还是打车去了趟现场，到现场以后，还做了如下调整：
1. OPTIMIZER_INDEX_COST_ADJ 调整为默认值100。
2. 隐含参数_ASH_SIZE调整为16m(默认是8m), 当时查看alert日志发现有如下告警：

然后跟开发商沟通了一下，重启了一下db，重启后观察主机资源情况，发现基本上正常了。

为了安抚客户，还是打车去了趟现场，到现场以后，还做了如下调整：

1. OPTIMIZER_INDEX_COST_ADJ 调整为默认值100。

2. 隐含参数_ASH_SIZE调整为16m(默认是8m), 当时查看alert日志发现有如下告警：

Mon Nov 21 14:42:24 2011
Archived Log entry 38536 added for thread 1 sequence 38650 ID 0x18c941ba dest 1:
Mon Nov 21 14:46:51 2011
Active Session History (ASH) performed an emergency flush. This may mean that ASH is undersized. 
If emergency flushes are a recurring issue, you may consider increasing ASH size by setting the 
value of _ASH_SIZE to a sufficiently large value. Currently, ASH size is 8388608 bytes. Both ASH
size and the total number of emergency flushes since instance startup can be monitored by running
the following query:
select total_size,awr_flush_emergency_count from v$ash_info;

Mon Nov 21 14:42:24 2011

Archived Log entry 38536 added for thread 1 sequence 38650 ID 0x18c941ba dest 1:

Mon Nov 21 14:46:51 2011

Active Session History (ASH) performed an emergency flush. This may mean that ASH is undersized.

If emergency flushes are a recurring issue, you may consider increasing ASH size by setting the

value of _ASH_SIZE to a sufficiently large value. Currently, ASH size is 8388608 bytes. Both ASH

size and the total number of emergency flushes since instance startup can be monitored by running

the following query:

select total_size,awr_flush_emergency_count from v$ash_info;

3. 对于消耗逻辑度最为严重的3个sql语句，通过查询v$sql发现其有多个child number，看其执行计划也存在
多个执行计划，其中几个执行计划明显有问题，一会儿是index full scan一会儿是index skip scan。
于是为该几个sql创建了outline，如下：

3. 对于消耗逻辑度最为严重的3个sql语句，通过查询v$sql发现其有多个child number，看其执行计划也存在

多个执行计划，其中几个执行计划明显有问题，一会儿是index full scan一会儿是index skip scan。

于是为该几个sql创建了outline，如下：

SQL> alter session set create_stored_outlines=true;
Session altered.
SQL> exec dbms_outln.create_outline(3881163839,0);
PL/SQL procedure successfully completed.
SQL>  -- 其他两个省略。

SQL> alter session set create_stored_outlines=true;

Session altered.

SQL> exec dbms_outln.create_outline(3881163839,0);

PL/SQL procedure successfully completed.

SQL> -- 其他两个省略。

晚上回家，查询mos发现，关于 asynch descriptor resize 果然存在bug，而且处理方式就是调整disk_type_io参数，如下：

1	晚上回家，查询mos发现，关于 asynch descriptor resize 果然存在bug，而且处理方式就是调整disk_type_io参数，如下：

Bug 9829397 – Excessive CPU and many “asynch descriptor resize” waits for SQL using Async IO [ID 9829397.8]

		修改时间 07-SEP-2011 类型 PATCH 状态 PUBLISHED

Bug 9829397 Excessive CPU and many “asynch descriptor resize” waits for SQL using Async IO

This note gives a brief overview of bug 9829397.

The content was last updated on: 07-SEP-2011
Click here for
details of each of the sections below.

Affects:

Product (Component) Oracle Server (Rdbms)

Range of versions believed to be affected Versions >= 11.2 but BELOW 12.1

Versions confirmed as being affected

11.2.0.2

11.2.0.1

Platforms affected Generic (all / most platforms affected)

It is
believed to be a regression
in default behaviour thus:
Regression introduced in
11.2.0.2

Fixed:

This issue is fixed in

12.1
(Future Release)

11.2.0.3

Symptoms:	Related To:
Excessive CPU Usage Performance Of Query/ies Affected Waits for “asynch descriptor resize”	DISK_ASYNCH_IO

Description

Some queries in 11.2 may exhibit higher CPU usage than earlier releases with many "asynch descriptor resize" waits occurring compared to the same SQL in earlier releases. Rediscovery Notes: Async IO is in use. The total time waiting for "asynch descriptor resize" is typically very small but with very high counts. The high wait count indicates many resizes of the number of AIO descriptors unnecessarily wasting CPU. Workaround Disable async IO. eg: Set DISK_ASYNCH_IO = false

1
2
3
4
5
6
7
8
9
10
11
12
13
14

Some queries in 11.2 may exhibit higher CPU usage than earlier
releases with many "asynch descriptor resize" waits occurring
compared to the same SQL in earlier releases.

Rediscovery Notes:
Async IO is in use.
The total time waiting for "asynch descriptor resize" is
typically very small but with very high counts. The high
wait count indicates many resizes of the number of AIO
descriptors unnecessarily wasting CPU.

Workaround
Disable async IO.
eg: Set DISK_ASYNCH_IO = false

最后调整以后系统基本正常了，单纯就系统资源来说，cpu idle维持在50~80%。

1	最后调整以后系统基本正常了，单纯就系统资源来说，cpu idle维持在50~80%。

Tags: resmgr:cpu quantum
Posted in troubleshooting on 11月 21, 2011

One Response to “resmgr:cpu quantum导致的性能问题”

roger Says:

1月 30, 2012 at 上午 2:28

今天看mos文档，偶然发现居然还有这个bug •Patch:12596494 GENERALLY HIGHER CPU USAGE IN 11.2.0.2 THAN 10.2.0.4

You must be logged in to post a comment.

love wife love life —Roger的Oracle/MySQL/PostgreSQL数据恢复博客

Categories

Archives

最新评论

国内圈子

oracle security

resmgr:cpu quantum导致的性能问题

Bug 9829397 Excessive CPU and many “asynch descriptor resize” waits for SQL using Async IO

Affects:

Fixed:

Symptoms:

Related To:

Description

One Response to “resmgr:cpu quantum导致的性能问题”

Leave a Reply

*Product (Component)*	Oracle Server (Rdbms)
*Range of versions believed* to be affected**	Versions >= 11.2 but BELOW 12.1
*Versions confirmed* as being affected**	11.2.0.2 11.2.0.1
Platforms affected	Generic (all / most platforms affected)