One case:Latch free of oracle 9208 ?
本站文章除注明转载外,均为本站原创: 转载自love wife love life —Roger的Oracle/MySQL/PostgreSQL数据恢复博客
最近某客户9208环境,遭遇latch free问题,而在oracle 9i版本中,latch free 只是一个汇总类型的event。
换句话讲,有多种event的表现其实都是通过latch free来呈现。 客户这里的实际情况查询是主要体现在:
row cache objects和cache buffer chains. 对于cache buffer chains,相对比较简单,通常也就是热块或
SQL效率问题。而 row cache objects相对麻烦一些,客户的环境经过分析发现的row cache中的dc_rollback_segs
等待严重导致,如下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 |
LATCH# CACHE# NAME TYPE SUBORDINATE# GETS ------- ------ --------------------------------- ----------- ------------ ---------- 57 3 dc_rollback_segments PARENT -1.591E+09 ....... 57 3 dc_rollback_segments PARENT -71015419 58 3 dc_rollback_segments PARENT -1.591E+09 ....... 58 3 dc_rollback_segments PARENT -71015419 59 3 dc_rollback_segments PARENT -1.591E+09 ....... 59 3 dc_rollback_segments PARENT -71015419 60 3 dc_rollback_segments PARENT -1.591E+09 ....... 60 3 dc_rollback_segments PARENT -71015419 61 3 dc_rollback_segments PARENT -1.591E+09 ....... 61 3 dc_rollback_segments PARENT -71015419 62 3 dc_rollback_segments PARENT -1.591E+09 ....... 62 3 dc_rollback_segments PARENT -71015419 63 3 dc_rollback_segments PARENT -1.591E+09 ....... 63 3 dc_rollback_segments PARENT -71015419 64 3 dc_rollback_segments PARENT -1.591E+09 ....... 64 3 dc_rollback_segments PARENT -71015419 |
针对这个问题,似乎不太好处理,mos提供了一个通过调整_rollback_segment_count参数来避免该问题,而且还不一定有效。
我们先不管是否有效,那么这个参数有何作用 ? 通过10gR2 环境来研究一下。
++++++Session 1
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 |
SQL> select * from v$version where rownum < 3; BANNER ---------------------------------------------------------------- Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - Prod PL/SQL Release 10.2.0.5.0 - Production SQL> select count(1) from v$latch_children where name='row cache objects'; COUNT(1) ---------- 41 SQL> select count(1) from dba_rollback_segs; COUNT(1) ---------- 11 SQL> select count(1) from dba_rollback_segs where status='ONLINE'; COUNT(1) ---------- 11 SQL> set pages 1000 SQL> column cache# format 99999 SQL> column name format a33 SQL> column latch# format 999999 SQL> select distinct s.kqrstcln latch#, 2 r.cache#, 3 r.parameter name, 4 r.type, 5 r.subordinate#, 6 r.gets 7 from v$rowcache r, x$kqrst s 8 where r.cache# = s.kqrstcid 9 order by 1,4,5; LATCH# CACHE# NAME TYPE SUBORDINATE# GETS ------- ------ --------------------------------- ----------- ------------ ---------- 1 1 dc_free_extents PARENT 0 2 4 dc_used_extents PARENT 0 3 2 dc_segments PARENT 49990 4 0 dc_tablespaces PARENT 203697 5 5 dc_tablespace_quotas PARENT 176 6 6 dc_files PARENT 3 7 7 dc_users PARENT 173742 7 7 dc_users SUBORDINATE 0 254 7 7 dc_users SUBORDINATE 1 2031 7 7 dc_users SUBORDINATE 2 0 8 3 dc_rollback_segments PARENT 466657 9 8 dc_objects PARENT 46315 9 8 dc_object_grants SUBORDINATE 0 3947 10 17 dc_global_oids PARENT 104704 11 12 dc_constraints PARENT 88 12 11 dc_object_ids PARENT 179941 13 13 dc_sequences PARENT 10722 14 10 dc_usernames PARENT 9010 15 15 dc_database_links PARENT 245 16 16 dc_histogram_defs PARENT 45492 16 16 dc_histogram_data SUBORDINATE 0 14794 16 16 dc_histogram_data SUBORDINATE 1 2401 17 33 kqlsubheap_object PARENT 0 18 19 dc_table_scns PARENT 103 18 19 dc_partition_scns SUBORDINATE 0 0 19 18 dc_outlines PARENT 0 20 14 dc_profiles PARENT 197 21 34 realm cache PARENT 0 21 34 realm auth SUBORDINATE 0 0 22 35 Command rule cache PARENT 0 23 36 Realm Object cache PARENT 0 23 36 Realm Subordinate Cache SUBORDINATE 0 0 24 40 Rule Set Cache PARENT 0 25 37 event map PARENT 0 26 38 format PARENT 0 27 39 audit collector PARENT 0 28 26 global database name PARENT 36 29 20 rule_info PARENT 0 30 21 rule_or_piece PARENT 0 30 21 rule_fast_operators SUBORDINATE 0 0 31 9 dc_qmc_cache_entries PARENT 0 32 23 dc_qmc_ldap_cache_entries PARENT 0 33 27 qmtmrcin_cache_entries PARENT 0 34 28 qmtmrctn_cache_entries PARENT 0 35 29 qmtmrcip_cache_entries PARENT 0 36 30 qmtmrctp_cache_entries PARENT 0 37 31 qmtmrciq_cache_entries PARENT 0 38 32 qmtmrctq_cache_entries PARENT 0 39 24 outstanding_alerts PARENT 676 40 22 dc_awr_control PARENT 1065 41 25 dc_hintsets PARENT 0 51 rows selected. SQL> oradebug setmypid Statement processed. SQL> ALTER SESSION SET EVENTS 'immediate trace name row_cache level 2'; Session altered. SQL> oradebug tracefile_name /home/ora10g/admin/roger/udump/roger_ora_18777.trc SQL> oradebug close_trace Statement processed. |
我们先来看下level 2的dump,看下row cache objects的dc是不是41个,如下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 |
[root@killdb ~]# cat cat /home/ora10g/admin/roger/udump/roger_ora_18777.trc |grep cid cat: cat: No such file or directory ROW CACHE HASH TABLE: cid=0 ht=0x28b9e660 size=32 ROW CACHE HASH TABLE: cid=1 ht=0x28b1d534 size=256 ROW CACHE HASH TABLE: cid=2 ht=0x28b1e5fc size=65536 ROW CACHE HASH TABLE: cid=3 ht=0x284b4074 size=128 ROW CACHE HASH TABLE: cid=4 ht=0x28b1dd98 size=256 ROW CACHE HASH TABLE: cid=5 ht=0x28b9e7c4 size=256 ROW CACHE HASH TABLE: cid=6 ht=0x28b9f028 size=64 ROW CACHE HASH TABLE: cid=7 ht=0x28434010 size=65536 ROW CACHE HASH TABLE: cid=8 ht=0x284b44d8 size=65536 ROW CACHE HASH TABLE: cid=9 ht=0x2873cbf8 size=512 ROW CACHE HASH TABLE: cid=10 ht=0x285b62cc size=256 ROW CACHE HASH TABLE: cid=11 ht=0x28535e04 size=65536 ROW CACHE HASH TABLE: cid=12 ht=0x285355a0 size=256 ROW CACHE HASH TABLE: cid=13 ht=0x285b5e68 size=128 ROW CACHE HASH TABLE: cid=14 ht=0x28638f24 size=16 ROW CACHE HASH TABLE: cid=15 ht=0x285b6b30 size=32 ROW CACHE HASH TABLE: cid=16 ht=0x285b6c94 size=65536 ROW CACHE HASH TABLE: cid=17 ht=0x2853453c size=512 ROW CACHE HASH TABLE: cid=18 ht=0x28637ec0 size=512 ROW CACHE HASH TABLE: cid=19 ht=0x28636e5c size=512 ROW CACHE HASH TABLE: cid=20 ht=0x2863cb30 size=65536 ROW CACHE HASH TABLE: cid=21 ht=0x286bcb94 size=65536 ROW CACHE HASH TABLE: cid=22 ht=0x2874577c size=1 ROW CACHE HASH TABLE: cid=23 ht=0x2873dc5c size=512 ROW CACHE HASH TABLE: cid=24 ht=0x28744f18 size=256 ROW CACHE HASH TABLE: cid=25 ht=0x287457e8 size=512 ROW CACHE HASH TABLE: cid=26 ht=0x2863cac4 size=1 ROW CACHE HASH TABLE: cid=27 ht=0x2873ecc0 size=512 ROW CACHE HASH TABLE: cid=28 ht=0x2873fd24 size=512 ROW CACHE HASH TABLE: cid=29 ht=0x28740d88 size=512 ROW CACHE HASH TABLE: cid=30 ht=0x28741dec size=512 ROW CACHE HASH TABLE: cid=31 ht=0x28742e50 size=512 ROW CACHE HASH TABLE: cid=32 ht=0x28743eb4 size=512 ROW CACHE HASH TABLE: cid=33 ht=0x28636cf8 size=32 ROW CACHE HASH TABLE: cid=34 ht=0x28639008 size=256 ROW CACHE HASH TABLE: cid=35 ht=0x2863986c size=256 ROW CACHE HASH TABLE: cid=36 ht=0x2863a0d0 size=256 ROW CACHE HASH TABLE: cid=37 ht=0x2863b198 size=256 ROW CACHE HASH TABLE: cid=38 ht=0x2863b9fc size=256 ROW CACHE HASH TABLE: cid=39 ht=0x2863c260 size=256 ROW CACHE HASH TABLE: cid=40 ht=0x2863a934 size=256 |
我们可以看到,确认是41个,跟我们查询v$latch_children是符合的.
+++++Session 2
1 2 3 4 5 6 7 8 9 10 11 12 |
SQL> alter session set tracefile_identifier='001'; Session altered. SQL> oradebug setmypid Statement processed. SQL> oradebug dump row_cache 8; Statement processed. SQL> oradebug close_trace Statement processed. SQL> oradebug tracefile_name /home/ora10g/admin/roger/udump/roger_ora_18777_001.trc |
从上面的trace我们可以发现,保护dc_rollback_segments row cache的bucket有30个,如下:
1 2 3 |
[root@killdb ~]# cat /home/ora10g/admin/roger/udump/roger_ora_18777_001.trc |grep dc_rollback_segments|wc -l 30 [root@killdb ~]# |
从该trace我们可以看到buckets的个数以及hash chain的情况,如下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 |
ROW CACHE HASH TABLE: cid=8 ht=0x284b44d8 size=65536 Buckets with more than 20 objects: NONE Hash Chain Size Number of Buckets --------------- ----------------- 0 64900 1 636 2 0 3 0 4 0 5 0 6 0 7 0 8 0 9 0 10 0 11 0 12 0 13 0 14 0 15 0 16 0 17 0 18 0 19 0 20 0 >20 0 |
我们看到,此时有2种hash chain size,分别是0和1. 其中1 表示当前存放内容的hash chain,保护了636个buckets。 另外
hash chain size 为0的,表示这部分的64900 个bucket是空闲的,也就是说没有其中没有任何内容.
下面我们将参数_rollback_segment_count调整为6,然后再来观察下保护dc_rollback_segments的bucket有多少个。
注意:_rollback_segment_count参数是指数据库中保持online回滚段的个数(非SYSTEM回滚段,sysem回滚段总是online的)
+++++Session 3
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 |
SQL> alter system set "_rollback_segment_count"=6; System altered. SQL> select count(1) from dba_rollback_segs where status='ONLINE'; COUNT(1) ---------- 11 SQL> shutdown abort; ORACLE instance shut down. SQL> SQL> startup ORACLE instance started. Total System Global Area 167772160 bytes Fixed Size 1272600 bytes Variable Size 146801896 bytes Database Buffers 16777216 bytes Redo Buffers 2920448 bytes Database mounted. Database opened. SQL> SQL> select count(1) from dba_rollback_segs where status='ONLINE'; COUNT(1) ---------- 7 SQL> oradebug setmypid Statement processed. SQL> oradebug dump row_cache 2 Statement processed. SQL> oradebug tracefile_name /home/ora10g/admin/roger/udump/roger_ora_20510.trc SQL> SQL> alter session set tracefile_identifier='001'; Session altered. SQL> oradebug setmypid Statement processed. SQL> oradebug dump row_cache 8; Statement processed. SQL> oradebug close_trace Statement processed. SQL> oradebug tracefile_name /home/ora10g/admin/roger/udump/roger_ora_20510_001.trc |
我们来看下调整参数之后保护dc_rollback_segments 的bucket有没有变化:
[root@killdb ~]# cat /home/ora10g/admin/roger/udump/roger_ora_20510_001.trc|grep dc_rollback_segments|wc -l
30
我们这里来看似乎仍然是30,真的吗 ?我们先来看下此时一共有多个bucket:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 |
ROW CACHE HASH TABLE: cid=8 ht=0x284b44d8 size=65536 Buckets with more than 20 objects: NONE Hash Chain Size Number of Buckets --------------- ----------------- 0 65316 1 220 2 0 3 0 4 0 5 0 6 0 7 0 8 0 9 0 10 0 11 0 12 0 13 0 14 0 15 0 16 0 17 0 18 0 19 0 20 0 >20 0 |
当前库中此时使用的bucket有220个,因为我将db重启了,所以这个比之前要少,是正常的。 我们重点是来看
保护dc_rollback_segments的bucket是不是有所变化 ?
打开trace文件,搜索发现确认有所降低,如下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 |
BUCKET 48: row cache parent object: address=0x29ae4dc0 cid=3(dc_rollback_segments) hash=35e74caf typ=5 transaction=(nil) flags=00000002 own=0x29ae4e2c[0x29ae4e2c,0x29ae4e2c] wat=0x29ae4e34[0x29ae4e34,0x29ae4e34] mode=N status=VALID/-/-/-/-/-/-/-/- data= 00000001 00000001 00000002 00000009 535f0009 4d535359 00243155 00000000 00000000 00000000 00000000 00000000 00000001 00000001 0000041d 000003f2 005594c4 00000000 00000000 00000001 BUCKET 48 total object count=3 BUCKET 86: row cache parent object: address=0x29ae4224 cid=3(dc_rollback_segments) hash=75647f55 typ=5 transaction=(nil) flags=00000002 own=0x29ae4290[0x29ae4290,0x29ae4290] wat=0x29ae4298[0x29ae4298,0x29ae4298] mode=N status=VALID/-/-/-/-/-/-/-/- data= 00000008 00000001 00000002 00000079 535f0009 4d535359 00243855 00000000 00000000 00000000 00000000 00000000 00000001 00000001 00000473 00000447 005595da 00000000 00000000 00000001 BUCKET 86 total object count=2 BUCKET 91: row cache parent object: address=0x29ae3ed0 cid=3(dc_rollback_segments) hash=3664ff5a typ=5 transaction=(nil) flags=00000002 own=0x29ae3f3c[0x29ae3f3c,0x29ae3f3c] wat=0x29ae3f44[0x29ae3f44,0x29ae3f44] mode=N status=VALID/-/-/-/-/-/-/-/- data= 0000000b 00000001 00000002 00000529 42520007 30305f53 00000031 00000000 00000000 00000000 00000000 00000000 00000001 00000001 00000002 00000004 002e6f68 00000000 00000000 00000000 BUCKET 91 total object count=2 |
我们可以看到,此时虽然count显示是有30个bucket在保护dc_rollback_segments,然而实际上
bucket 48/86/91分别保护了3、2、2个object. 换句话讲,实际上用到的bucket就应该是30-(3+2+2)+3=26.
从上面的实验来看,_rollback_segment_count参数确认影响row cache中保护dc_*对象的bucket数量,
总体原则就是参数越大,bucket就越多,反之亦然.
简单的总结一下:
1. _rollback_segment_count参数是控制非SYSTEM回滚段处于online状态的个数;
2. _rollback_segment_count参数越大,保持online状态的回滚段就越多。9i默认情况下oracle会自己去进行判断,
有自己的算法,随着不停的inactive和active操作,smon去offline和online回滚段,会加剧系统资源的消耗;
3. hash bucket数量越多,那么也就越不容易导致竞争。
Leave a Reply
You must be logged in to post a comment.