ORA-00600: internal error code, arguments: [4187]
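Unless marked as a repost, all articles on this site are original. Reposted from love wife love life, Roger's Oracle/MySQL/PostgreSQL data recovery blog
Not long ago, one node of a customer's Oracle RAC suffered a host crash. After the host rebooted, the database on that node kept crashing, with the following messages: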
Block recovery completed at rba 149666.7.16, scn 3562.2643076291
Non-fatal internal error happenned while SMON was doing flushing of monitored table stats.
SMON exceeded the maximum limit of 100 internal error(s).
Errors in file /oracle/app/oracle/diag/rdbms/abm/abm2/trace/abm2_smon_10879294.trc:
ORA-00600: internal error code, arguments: [4187], [],
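Judging from these errors, the SMON process is clearly hitting an exception during transaction recovery. Once the error count reaches 100, the instance is forced to crash and restart.
First, let's look at the contents of the trace file named above: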
ORA-00600: internal error code, arguments: [4187], [], [
Error 600 in redo application callback
Dump of change vector:
TYP:0 CLS:55 AFN:4 DBA:0x01000110 OBJ:4294967295 SCN:0x0dea.957d672f SEQ:1 OP:5.2 ENC:0 RBL:0
ktudh redo: slt: 0x0021 sqn: 0x00000001 flg: 0x0411 siz: 80 fbi: 0
            uba: 0x0107691f.9399.17    pxid:  0x0000.000.00000000
Block after image is corrupt:
buffer rdba: 0x01000110
 scn: 0x0dea.957d672f seq: 0x01 flg: 0x04 tail: 0x672f2601
 frmt: 0x02 chkval: 0xb7ac type: 0x26=KTU SMU HEADER BLOCK
Hex dump of corrupt header 3 = CHKVAL
Dump of memory from 0x07000102A717A000 to 0x07000102A717A014
......
7000102A717BFF0 00000000 00000000 00000000           [............]
kcra_dump_redo_internal: skipped for critical process
Doing block recovery for file 4 block 272
Block header before block recovery:
buffer tsn: 4 rdba: 0x01000110 (4/272)
 scn: 0x0dea.957d672f seq: 0x01 flg: 0x04 tail: 0x672f2601
 frmt: 0x02 chkval: 0xb7ac type: 0x26=KTU SMU HEADER BLOCK
Resuming block recovery (PMON) for file 4 block 272
Block recovery from logseq 149687, block 51 to scn 15301400723442
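The trace prints the block address both as rdba: 0x01000110 and as (4/272), and it prints SCNs both in wrap.base hex form and as a single decimal number. The sketch below shows how those forms relate. It assumes the usual smallfile layout (top 10 bits for the relative file number, low 22 bits for the block number) and a 32-bit SCN base. The function names are made up for illustration and are not Oracle APIs.

```python
def decode_rdba(rdba):
    """Split a 32-bit relative data block address into (relative file#, block#).

    Assumes a smallfile tablespace: 10 bits of file number, 22 bits of block number.
    """
    file_no = rdba >> 22
    block_no = rdba & 0x3FFFFF
    return file_no, block_no


def scn_to_number(wrap, base):
    """Combine an SCN printed as wrap.base into a single integer (base is 32 bits)."""
    return (wrap << 32) | base


# Values copied from the trace above.
print(decode_rdba(0x01000110))            # file 4, block 272
print(scn_to_number(0x0dea, 0x957d672f))  # block SCN as one number
print(hex(15301400723442 >> 32))          # wrap part of the recovery target SCN
```

The decoded address matches the "file 4 block 272" that the trace reports, and the wrap value 0x0dea is the 3562 seen in the alert log's scn 3562.2643076291.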
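From the information above, Oracle is suggesting that an undo block may be damaged, since the message refers to the block after image. The buffer's block address is clearly file 4 block 272. So I first ran dbv against that file (which is in fact an undo datafile) to see whether anything was wrong with it. The dbv result is as follows: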
ABM-DB2:oracle:/oracle/app/oracle/diag/rdbms/abm/abm2/trace$dbv userid=system/oracle file='+DG_DATA/abm/datafile/undotbs2.263.819160701' blocksize=8192

DBVERIFY: Release 11.2.0.4.0 - Production on Fri Mar 17 21:26:44 2017

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = +DG_DATA/abm/datafile/undotbs2.263.819160701

DBVERIFY - Verification complete

Total Pages Examined         : 1310720
Total Pages Processed (Data) : 0
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 1310719
Total Pages Processed (Seg)  : 17
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 1
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 0 (0.0)
The check shows that the undo file is actually fine. So where is the problem? Since this is an undo segment header block, let's dump it and take a look.
Extent Control Header
-----------------------------------------------------------------
Extent Header:: spare1: 0      spare2: 0      #extents: 88     #blocks: 569743
                last map  0x00000000  #maps: 0      offset: 4080
    Highwater::  0x01076920  ext#: 52     blk#: 2208   ext size: 8192
#blocks in seg. hdr's freelists: 0
#blocks below: 0
mapblk  0x00000000  offset: 52
                 Unlocked
   Map Header:: next  0x00000000  #extents: 88   obj#: 0      flag: 0x40000000
Extent Map
-----------------------------------------------------------------
 0x01000111  length: 7
 0x1844b980  length: 8
 0x010ced80  length: 8192
......
TRN CTL:: seq: 0x9399 chd: 0x0021 ctl: 0x0010 inc: 0x00000000 nfb: 0x0003
          mgc: 0xb000 xts: 0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)
          uba: 0x0107691f.9399.15 scn: 0x0dea.957d64f7
Version: 0x01
FREE BLOCK POOL::
  uba: 0x0107691f.9399.16 ext: 0x34 spc: 0x1486
  uba: 0x01076920.9399.03 ext: 0x34 spc: 0x11a4
  uba: 0x0107691c.9399.14 ext: 0x34 spc: 0x1656
  uba: 0x00000000.9344.19 ext: 0x37 spc: 0x1688
  uba: 0x00000000.bd47.04 ext: 0x5  spc: 0x1d3a
TRN TBL::

index  state cflags  wrap#       uel     scn              dba         parent-xid           nub         stmt_num    cmt
------------------------------------------------------------------------------------------------
0x00   9     0x00    0xffffdd12  0x0003  0x0dea.957d661f  0x0107691c  0x0000.000.00000000  0x00000001  0x00000000  1489714986
0x01   9     0x00    0xffffea71  0x001e  0x0dea.957d65ac  0x0107691b  0x0000.000.00000000  0x00000001  0x00000000  1489714986
0x02   9     0x00    0xffffecf0  0x0010  0x0dea.957d671d  0x0107691f  0x0000.000.00000000  0x00000001  0x00000000  1489714986
......
0x14   9     0x00    0xffffd6de  0x0015  0x0dea.957d65c4  0x0107691c  0x0000.000.00000000  0x00000001  0x00000000  1489714986
0x15   9     0x00    0xfffff0dd  0x0013  0x0dea.957d65d8  0x0107691c  0x0000.000.00000000  0x00000001  0x00000000  1489714986
0x16   9     0x00    0xffffd57c  0x0014  0x0dea.957d65c2  0x0107691c  0x0000.000.00000000  0x00000001  0x00000000  1489714986
0x17   9     0x00    0xffffe5db  0x0002  0x0dea.957d6716  0x0107691f  0x0000.000.00000000  0x00000001  0x00000000  1489714986
0x18   9     0x00    0xffffee0a  0x0017  0x0dea.957d6710  0x0107691f  0x0000.000.00000000  0x00000001  0x00000000  1489714986
0x19   9     0x00    0xffffe239  0x0009  0x0dea.957d658d  0x0107691b  0x0000.000.00000000  0x00000001  0x00000000  1489714986
0x1a   9     0x00    0xffffe4d8  0x000a  0x0dea.957d6709  0x0107691f  0x0000.000.00000000  0x00000001  0x00000000  1489714986
0x1b   9     0x00    0xffff5827  0x001a  0x0dea.957d6705  0x0107691f  0x0000.000.00000000  0x00000001  0x00000000  1489714986
0x1c   9     0x00    0xfffff836  0x000e  0x0dea.957d66fb  0x0107691f  0x0000.000.00000000  0x00000001  0x00000000  1489714986
0x1d   9     0x00    0xffffc955  0x000b  0x0dea.957d65b7  0x0107691b  0x0000.000.00000000  0x00000001  0x00000000  1489714986
0x1e   9     0x00    0xffffec64  0x000d  0x0dea.957d65b0  0x0107691b  0x0000.000.00000000  0x00000001  0x00000000  1489714986
0x1f   9     0x00    0xffffdcd3  0x000f  0x0dea.957d66dc  0x0107691f  0x0000.000.00000000  0x00000001  0x00000000  1489714986
0x20   9     0x00    0xffffe4e2  0x001c  0x0dea.957d66e8  0x0107691c  0x0000.000.00000000  0x00000001  0x00000000  1489714986
0x21   9     0x00    0xfffffff1  0x000c  0x0dea.957d6513  0x0107691b  0x0000.000.00000000  0x00000001  0x00000000  1489714986
EXT TRN CTL::
usn: 20
sp1:0x00000000 sp2:0x00000000 sp3:0x00000000 sp4:0x00000000
sp5:0x00000000 sp6:0x00000000 sp7:0x00000000 sp8:0x00000000
EXT TRN TBL::
index  extflag     extHash     extSpare1   extSpare2
---------------------------------------------------
0x00   0x00000000  0x00000000  0x00000000  0x00000000
0x01   0x00000000  0x00000000  0x00000000  0x00000000
......
0x1f   0x00000000  0x00000000  0x00000000  0x00000000
0x20   0x00000000  0x00000000  0x00000000  0x00000000
0x21   0x00000000  0x00000000  0x00000000  0x00000000
GLOBAL CACHE ELEMENT DUMP (address: 0x700010001ed9618):
  id1: 0x110 id2: 0x4 pkey: INVALID block: (4/272)
  lock: X rls: 0x0 acq: 0x0 latch: 16
  flags: 0x20 fair: 0 recovery: 0 fpin: 'ktuwh72: ktugus:ktuswr1'
  bscn: 0xdea.957d672f bctx: 0x0 write: 0 scan: 0x0
  lcp: 0x0 lnk: [NULL] lch: [0x700010267e80150,0x700010267e80150]
  seq: 686 hist: 145:0 28 340 225 212 72 257 59 334 43 158:0 38
  LIST OF BUFFERS LINKED TO THIS GLOBAL CACHE ELEMENT:
    flg: 0x08200001 state: XCURRENT tsn: 4 tsh: 3
      addr: 0x700010267e80018 obj: INVALID cls: UNDO HEAD bscn: 0xdea.957d672f
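Looking at the whole undo segment header dump above, its contents indeed do not match the redo. For example, the change vector carries sqn: 0x00000001 for slt: 0x0021, while the header shows wrap# 0xfffffff1 for slot 0x21. No wonder the dump ends up reporting INVALID for the block.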
A colleague suggested this might be Bug 19700135 : ORA-600 [4187] WHEN WRAP# IS CLOSE TO KSQNMAXVAL.
However, after analyzing it myself, I found that the symptoms do not fully match.
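To put numbers on how close the wrap# values in the dump are to the top of a 32-bit range, here is a small sketch. It assumes 0xFFFFFFFF as the upper bound, standing in for the KSQNMAXVAL named in the bug title. The post does not state that value, so treat it as an assumption. The sketch only measures distance and does not decide whether the bug applies.

```python
# Assumption: 0xFFFFFFFF stands in for KSQNMAXVAL; the post does not state it.
ASSUMED_MAX_WRAP = 0xFFFFFFFF

# (slot index -> wrap#) pairs copied from a few TRN TBL rows in the dump above.
wraps = {
    0x00: 0xffffdd12,
    0x1b: 0xffff5827,
    0x1c: 0xfffff836,
    0x21: 0xfffffff1,
}


def headroom(wrap):
    """Number of increments left before wrap# reaches the assumed maximum."""
    return ASSUMED_MAX_WRAP - wrap


# List slots from least to most remaining headroom.
for slot, wrap in sorted(wraps.items(), key=lambda kv: headroom(kv[1])):
    print(f"slot 0x{slot:02x}: wrap# 0x{wrap:08x}, headroom {headroom(wrap)}")
```

Under that assumption, slot 0x21 (the slot named by slt: 0x0021 in the redo change vector) has the least headroom of the rows sampled here.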
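In any case, the dump shows that this undo segment has no active transactions, so the problem can be handled either by rebuilding the undo tablespace or by dropping the undo segment.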
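The "no active transactions" reading comes from the state column of TRN TBL. A minimal sketch of that check follows, assuming the commonly described convention that state 10 marks an active transaction slot and state 9 an inactive one. The parser and its field positions are illustrative and only fit the row layout shown in this dump.

```python
# Three rows copied verbatim from the TRN TBL section of the dump above.
SAMPLE_ROWS = """\
0x00 9 0x00 0xffffdd12 0x0003 0x0dea.957d661f 0x0107691c 0x0000.000.00000000 0x00000001 0x00000000 1489714986
0x14 9 0x00 0xffffd6de 0x0015 0x0dea.957d65c4 0x0107691c 0x0000.000.00000000 0x00000001 0x00000000 1489714986
0x21 9 0x00 0xfffffff1 0x000c 0x0dea.957d6513 0x0107691b 0x0000.000.00000000 0x00000001 0x00000000 1489714986
"""

ACTIVE_STATE = 10  # assumption: 10 = active, 9 = inactive


def parse_trn_tbl(text):
    """Parse TRN TBL rows into dicts; lines that are not 11-field rows are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 11 or not fields[0].startswith("0x"):
            continue
        rows.append({
            "index": int(fields[0], 16),
            "state": int(fields[1]),
            "wrap": int(fields[3], 16),
            "dba": int(fields[6], 16),
        })
    return rows


rows = parse_trn_tbl(SAMPLE_ROWS)
active = [r for r in rows if r["state"] == ACTIVE_STATE]
print(f"{len(rows)} slots parsed, {len(active)} active")
```

Every row visible in the dump has state 9, which is what makes rebuilding or dropping the segment a reasonable option here.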
In the end I rebuilt the undo tablespace and then watched for 10 minutes; the alert log reported no further ORA-00600 errors. That wraps up this small problem.
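The post does not list the exact commands used for the rebuild. As a rough sketch of what rebuilding an instance's undo tablespace usually involves, the helper below only assembles the SQL text and does not connect to a database. The tablespace names UNDOTBS2 and UNDOTBS2_NEW and the 10G size are assumptions for illustration. The disk group +DG_DATA and the instance name abm2 are taken from the output shown earlier.

```python
def undo_rebuild_statements(old_ts, new_ts, datafile, size, sid):
    """Return the SQL statements for a typical undo tablespace swap, in order."""
    return [
        # 1. Create a fresh undo tablespace.
        f"CREATE UNDO TABLESPACE {new_ts} DATAFILE '{datafile}' SIZE {size}",
        # 2. Point the affected instance at the new tablespace.
        f"ALTER SYSTEM SET undo_tablespace = '{new_ts}' SCOPE = BOTH SID = '{sid}'",
        # 3. Drop the old tablespace once nothing in it is still needed.
        f"DROP TABLESPACE {old_ts} INCLUDING CONTENTS AND DATAFILES",
    ]


# Hypothetical names and size; adjust to the real environment.
for stmt in undo_rebuild_statements(
    old_ts="UNDOTBS2",
    new_ts="UNDOTBS2_NEW",
    datafile="+DG_DATA",
    size="10G",
    sid="abm2",
):
    print(stmt + ";")
```

The drop in step 3 is only safe because the dump showed no active transactions in the affected segment. With pending transactions, the old undo would still be needed for rollback.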