某金融客户Oracle 19c RAC logical corruption
本站文章除注明转载外,均为本站原创: 转载自love wife love life —Roger的Oracle/MySQL/PostgreSQL数据恢复博客
这几天实在太忙,刚弄完文档。业务线的同事就找到我,说一个银行客户的核心系统昨晚出了故障,还没找到原因,希望能帮忙分析下。
从提供的信息来看是业务跑任务报错,遇到了Oracle-00600和ora-07445 错误。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
Doing block recovery <span class="hljs-keyword">for</span> file 408 block 1405699 Resuming block recovery (PMON) <span class="hljs-keyword">for</span> file 408 block 1405699 Block recovery from logseq 127486, block 19033 to scn 0x00000040ee407ee9 Tue Jun 25 02:32:39 2024 Recovery of Online Redo Log: Thread 1 Group 46 Seq 127486 Reading mem 0 Mem<span class="hljs-comment"># 0: +DATA1/xxxxx/ONLINELOG/group_46.01.log</span> Block recovery completed at rba 127486.19061.16, scn 0x00000040ee407ef6 Exception [<span class="hljs-built_in">type</span>: SIGSEGV, Address not mapped to object] [ADDR:0x30044C4E535002C9] [PC:0x100222454, kghrst()+500] [flags: 0x0, count: 1] Errors <span class="hljs-keyword">in</span> file /u01/oracle/app/diag/rdbms/xxxxx/xxxxx1/trace/xxxxx1_ora_16384202.trc (incident=4925384): ORA-07445: exception encountered: core dump [kghrst()+500] [SIGSEGV] [ADDR:0x30044C4E535002C9] [PC:0x100222454] [Address not mapped to object] [] ORA-00607: Internal error occurred <span class="hljs-keyword">while</span> making a change to a data block ORA-00600: internal error code, arguments: [17182], [0x121242CE0], [], [], [], [], [], [], [], [], [], [] Incident details <span class="hljs-keyword">in</span>: /u01/oracle/app/diag/rdbms/xxxxx/xxxxx1/incident/incdir_4925384/xxxxx1_ora_16384202_i4925384.trc |
这里完全不用去看这个600错误和7445错误是什么含义,从前文的日志来看,似乎在做一个block recovery出了异常。通过提供的alert 继续往前翻发现,在当天晚上22点过就开始报错了。
1 2 3 4 5 6 7 8 9 |
Mon Jun 24 22:29:29 2024 Errors <span class="hljs-keyword">in</span> file /u01/oracle/app/diag/rdbms/xxxxx/xxxxx1/trace/xxxxx1_ora_21038638.trc (incident=4926767): ORA-00600: internal error code, arguments: [kdsgrp1-qetlbrcb], [7], [1712681731], [95], [], [], [], [], [], [], [], [] Mon Jun 24 22:29:29 2024 Errors <span class="hljs-keyword">in</span> file /u01/oracle/app/diag/rdbms/xxxxx/xxxxx1/trace/xxxxx1_ora_35193336.trc (incident=4924391): ORA-00600: internal error code, arguments: [4512], [128], [23], [0.0.0], [85557], [527736], [7], [1712681731], [], [], [], [] Incident details <span class="hljs-keyword">in</span>: /u01/oracle/app/diag/rdbms/xxxxx/xxxxx1/incident/incdir_4924391/xxxxx1_ora_35193336_i4924391.trc Incident details <span class="hljs-keyword">in</span>: /u01/oracle/app/diag/rdbms/xxxxx/xxxxx1/incident/incdir_4926767/xxxxx1_ora_21038638_i4926767.trc |
老实说这个报错其实很眼熟,但是用户google都搜不到这个错误。其实也搜了下mos发现根本搜不到。
但是从字面意思来看,kdsgrp1-qetlbrcb 和kdsgrp1 应该是差不多的含义和处理思路。
这里简单解释一下这个关键错误: ORA-00600: internal error code, arguments: [kdsgrp1-qetlbrcb], [7], [1712681731], [95], [], [], [], [], [], [], [], []
这表示读取数据时发现第7号表空间的第1712681731号block(这是10进制,需要转换成file/block)的第95行数据 读取报错。
既然如此我们就干脆来看看trace文件得了,更加直观一些。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
kdsDumpState: cdb: 0 dspdb: 0 <span class="hljs-built_in">type</span>: 1 info: 0x5 flag: 0x0 kdsDumpState: sample <span class="hljs-built_in">type</span>: 0 layer: 0 * kdsgrp1-1: <span class="hljs-built_in">where</span> 4 ************************************* row 0x66157303.5f no continuation RID row likely from index rdba: 0x602ba953, tsn: 8 rfile<span class="hljs-comment"># 408 block# 1405699 slot 95 not found (dscnt: 0)</span> Objd 527736 tsn 7 Bigfile TS : NO Next off 0 last piece 0 curlvl 0 flag 0x0 slotn 95 Head RID 0x00000000 KDSTABN_GET: 0 ..... ntab: 1 curSlot: 95 ..... nrows: 113 Dumping kcb descriptor: kcbds 0x12129caa8: pdb 0, tsn 7, rdba 0x66157303, afn 408, objd 527736, cls 1, tidflg 0x8 0x80 0x0 |
实际上这里就描述的相对清晰了。 看到这里,我是多么希望国产数据库在这方面也能加强加强。
上面的信息我简单翻译一下就是,该业务SQL通过索引回表发现该表所在的408号文件的1405699号block的95行数据找不到。 那么这里可以根据trace来看看上面提到的索引block的dump,看看索引中的数据是否存在:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 |
Block header dump: 0x602ba953 Object id on Block? Y seg/obj: 0x80ccd csc: 0x00000040eaf1e443 itc: 29 flg: E typ: 2 - INDEX brn: 0 bdba: 0x602ba900 ver: 0x01 opc: 0 inc: 0 exflg: 0 Itl Xid Uba Flag Lck Scn/Fsc 0x01 0x0687.01a.005827a7 0x69441431.81ca.01 CB-- 0 scn 0x00000040ea5d1092 0x02 0x0578.017.0058304a 0x13854664.81bd.05 C--- 0 scn 0x00000040eaf091ca 0x03 0x0691.000.00527481 0x13481b1e.7a65.02 --U- 1 fsc 0x0000.eaf1e447 0x04 0x0ac7.01c.0057faf6 0x014694b8.8239.34 --U- 1 fsc 0x0000.eaf452de 0x05 0x0893.01a.005513e8 0x13c63a6d.801b.1f --U- 1 fsc 0x0000.eaf50ce1 0x06 0x0e66.00e.0056e394 0x0147e001.7ebf.18 --U- 1 fsc 0x0000.eaf759d4 0x07 0x09de.01a.00550cd8 0x0146f9b1.7d9d.0a --U- 1 fsc 0x0000.eafc97cf 0x08 0x0e71.014.0057f64e 0x0145946a.803d.2b C--- 0 scn 0x00000040eaa00c93 0x09 0x0634.01b.00554046 0x1348c984.7f41.10 --U- 1 fsc 0x0000.eb00fbf7 0x0a 0x0c22.011.00524991 0x14035b22.7b9a.37 C--- 0 scn 0x00000040ea7dc8fb 。。。。。。 0x1d 0x0397.00b.0058ff56 0x1385a287.8186.10 C--- 0 scn 0x00000040ead9fe73 Leaf block dump =============== header address 504404435098002156=0x70001294925e2ec kdxcolev 0 KDXCOLEV Flags = - - - kdxcolok 0 kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y kdxconco 2 kdxcosdc 1 kdxconro 112 kdxcofbo 260=0x104 kdxcofeo 2456=0x998 kdxcoavs 2196 kdxlespl 0 kdxlende 0 kdxlenxt 1879049743=0x7000060f kdxleprv 1692400010=0x64dff98a kdxledsz 6 kdxlebksz 7384 row<span class="hljs-comment">#0[3952] flag: -------, lock: 0, len=44, data:(6): 65 dc 06 09 00 5c</span> col 0; len 24; (24): ...... row<span class="hljs-comment">#95[3248] flag: -------, lock: 0, len=44, data:(6): 55 fb bf dd 00 55</span> col 0; len 24; (24): 30 39 36 35 34 32 34 31 32 32 30 36 33 30 32 47 39 32 30 36 35 35 32 36 col 1; len 10; (10): 32 30 32 34 2d 30 36 2d 32 34 ...... |
可见索引数据是存在的,且索引block的数据是完整的,但rows只有112条。
接下来我们继续看对应的数据块dump内容:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 |
kdsgrp - dump CR block dba=0x66157303 buffer tsn: 7 rdba: 0x66157303 (408/1405699) scn: 0x40eb045a4d seq: 0x01 flg: 0x02 tail: 0x5a4d0601 frmt: 0x02 chkval: 0x0000 <span class="hljs-built_in">type</span>: 0x06=trans data Block header dump: 0x66157303 Object id on Block? Y seg/obj: 0x80d78 csc: 0x00000040eb0456d2 itc: 23 flg: E typ: 1 - DATA brn: 0 bdba: 0x66157300 ver: 0x01 opc: 0 inc: 0 exflg: 0 ...... Itl Xid Uba Flag Lck Scn/Fsc 0x01 0x0741.01e.00582a93 0x6947578f.8183.1f C--- 0 scn 0x00000040eb044b19 ...... 0x06 0x0be6.002.004fc3b0 0x13c6fde1.79e5.06 C--- 0 scn 0x00000040eb044f1c 0x07 0x022e.017.00553cca 0x1347f24f.7f8e.1e --U- 1 fsc 0x0000.eb045744 0x08 0x0966.002.0057fef0 0x6985968d.8291.04 --U- 1 fsc 0x0000.eb045a4d ...... 0x0d 0x0987.012.00524580 0x13c71e54.7b57.32 C--- 0 scn 0x00000040eb044e04 0x0e 0x01a7.00c.00559065 0x1385fe1e.80e6.11 --U- 1 fsc 0x0000.eb0456e8 ..... 0x14 0x036b.005.00527dfc 0x1385dc2e.7cfd.34 --U- 1 fsc 0x0000.eb0456d4 0x15 0x03ce.013.0055663e 0x0108a195.7ff5.03 C--- 0 scn 0x00000040eb0420d8 0x16 0x0728.017.0059a11a 0x134743fa.82a9.19 C--- 0 scn 0x00000040eb0424c6 0x17 0x02ff.01e.00553ed3 0x1385cbcc.8005.1f C--- 0 scn 0x00000040eb042df6 bdba: 0x66157303 data_block_dump,data header at 0x700012ec8f9825c =============== tsiz: 0x1da0 hsiz: 0xf4 pbl: 0x700012ec8f9825c 76543210 flag=-------- ntab=1 nrow=113 frre=-1 fsbo=0xf4 fseo=0xf9 avsp=0x5 tosp=0x5 0xe:pti[0] nrow=113 offs=0 0x12:pri[0] offs=0x1d50 0x14:pri[1] offs=0x1d06 。。。。。。 0xcc:pri[93] offs=0x5a2 0xce:pri[94] offs=0x559 0xd0:pri[95] offs=0x525 0xd2:pri[96] offs=0x4f0 。。。。。。 0xf0:pri[111] offs=0x18c 0xf2:pri[112] offs=0x142 block_row_dump: 。。。。。。。 tab 0, row 94, @0x559 tl: 2 fb: --HDF--- lb: 0x31 tab 0, row 95, @0x525 tl: 1716 fb: KC-----N lb: 0x56 cc: 43 cki: 2 nrid: 0xc10b0831.3133 col 0: [50] 30 36 30 31 01 80 0a 32 30 32 32 2d 31 31 2d 30 31 01 80 03 c1 56 2b 03 c1 56 2b 01 80 01 80 2c 00 10 18 30 38 35 35 38 31 38 31 31 39 30 36 32 35 32 col 1: [88] 30 30 30 31 38 39 36 31 03 49 30 31 06 35 35 38 31 38 31 03 43 4e 59 10 35 35 38 31 38 31 30 31 36 30 31 31 30 31 30 31 03 c1 56 2b 01 80 02 c1 2b 08 36 30 31 31 30 31 30 31 03 c1 56 2b 0a 32 30 32 32 2d 31 30 2d 33 31 01 80 01 80 03 c1 56 2b 01 80 01 80 2c 00 10 col 2: [24] 30 38 35 35 38 31 38 31 31 39 30 36 32 35 32 58 30 30 30 31 38 39 36 31 col 3: [ 3] 48 30 31 col 4: [ 6] 35 35 38 31 38 31 col 5: [ 3] 43 4e 59 col 6: [16] 35 35 38 31 38 31 30 31 31 33 30 31 30 31 30 33 col 7: [ 1] 80 col 8: [ 3] c1 56 2b col 9: [ 2] c1 32 col 10: [ 8] 31 33 30 31 30 31 30 33 col 11: [ 1] 80 col 12: [10] 32 30 32 32 2d 31 31 2d 30 31 col 13: [ 1] 80 col 14: [ 3] c1 56 2b col 15: [ 3] c1 56 2b col 16: [ 1] 80 col 17: [ 1] 80 col 18: [44] 00 10 18 30 38 35 35 37 31 34 31 32 31 31 31 31 36 32 43 35 37 34 38 31 35 35 33 03 41 30 31 06 35 35 37 31 34 31 03 43 4e 59 10 35 col 19: [53] 37 31 34 31 30 31 31 31 33 32 30 36 30 31 01 80 03 c1 05 49 02 c1 04 08 31 31 33 32 30 36 30 31 01 80 0a 32 30 32 32 2d 31 30 2d 30 31 01 80 03 c1 05 49 03 c1 col 20: [ 5] 49 01 80 01 80 col 21: [44] 00 10 18 30 39 36 35 35 32 38 31 32 32 30 37 30 35 32 43 35 39 37 31 38 32 34 37 03 41 30 31 06 36 35 35 32 38 31 03 43 4e 59 10 36 col 22: [53] 35 32 38 31 30 31 31 31 33 32 30 36 30 31 01 80 04 c2 03 08 33 02 c1 0e 08 31 31 33 32 30 36 30 31 01 80 0a 32 30 32 33 2d 30 31 2d 30 39 01 80 04 c2 03 08 33 col 23: [ 4] c2 03 08 33 col 24: [ 1] 80 col 25: [ 1] 80 col 26: [44] 00 10 18 30 37 36 30 32 32 35 31 32 32 30 34 31 31 32 47 37 32 30 33 36 37 35 37 03 41 30 31 06 36 30 32 32 35 31 03 43 4e 59 10 36 col 27: [48] 32 32 35 31 30 31 31 31 33 32 30 36 30 31 01 80 03 c1 1e 1b 02 c1 10 08 31 31 33 32 30 36 30 31 01 80 0a 32 30 32 33 2d 30 32 2d 30 31 01 80 03 col 28: [193] 1e 1b 03 c1 1e 1b 01 80 01 80 2c 00 10 18 30 37 36 30 32 32 35 31 32 32 30 34 31 31 32 47 37 32 30 33 36 37 35 37 03 49 30 31 06 36 30 32 32 35 31 03 43 4e 59 10 36 30 32 32 35 31 30 31 36 30 31 31 30 31 30 31 03 c1 1e 1b 01 80 03 c2 02 26 08 36 30 31 31 30 31 30 31 03 c1 1e 1b 0a 32 30 32 33 2d 30 31 2d 33 31 01 80 01 80 03 c1 1e 1b 01 80 01 80 2c 00 10 18 30 37 36 30 32 32 35 31 32 32 30 34 31 31 32 47 37 32 30 33 36 37 35 37 03 48 30 31 06 36 30 32 32 35 31 03 43 4e 59 10 36 30 32 32 35 31 30 31 31 33 30 31 30 31 30 33 01 80 03 c1 1e 1b 03 c2 02 2c 08 31 33 30 31 41 30 col 29: [49] 31 35 35 35 30 31 0a 32 30 32 34 2d 30 36 2d 32 34 02 32 30 04 4c 4e 53 50 01 80 01 53 01 80 08 b4 a6 c0 ed b3 c9 b9 a6 0a 32 30 32 34 2d 30 36 2d col 30: [50] 35 2c 00 09 18 30 32 30 32 30 30 38 30 31 30 30 30 36 32 34 41 30 33 35 38 32 37 30 31 0a 32 30 32 34 2d 30 36 2d 32 34 02 32 30 04 4c 4e 53 50 02 c1 col 31: [19] 01 53 01 80 08 b4 a6 c0 ed b3 c9 b9 a6 0a 32 30 32 34 2d col 32: [48] 36 2d 32 35 0c 00 09 68 15 98 53 00 3a 18 30 34 33 36 30 31 33 33 32 33 30 32 32 37 32 43 33 36 30 39 37 30 39 32 0a 32 30 32 34 2d 30 36 2d 32 col 33: [52] 02 32 30 04 4c 4e 53 50 02 c1 14 01 53 01 80 08 b4 a6 c0 ed b3 c9 b9 a6 0a 32 30 32 34 2d 30 36 2d 32 35 2c 00 07 18 31 30 31 31 30 30 36 30 31 38 39 30 33 col 34: [49] 36 41 30 33 35 37 35 37 30 31 0a 32 30 32 34 2d 30 36 2d 32 34 01 30 04 4c 4f 41 4e 02 c1 1a 01 53 01 80 2c 00 07 18 30 34 33 35 38 31 34 31 32 32 col 35: [48] 31 31 39 32 47 33 34 37 33 37 35 30 33 0a 32 30 32 34 2d 30 36 2d 32 34 01 30 04 4c 4f 41 4e 02 c1 1a 01 53 01 80 2c 00 07 18 30 37 36 30 38 31 col 36: [48] 33 32 34 30 31 31 31 32 47 38 38 31 38 39 33 33 34 0a 32 30 32 34 2d 30 36 2d 32 34 01 30 04 4c 4f 41 4e 02 c1 1a 01 53 01 80 2c 00 07 18 30 38 col 37: [53] 35 37 33 39 31 32 31 31 31 31 31 32 43 37 39 37 30 36 32 36 33 0a 32 30 32 34 2d 30 36 2d 32 34 01 30 04 4c 4f 41 4e 02 c1 1c 01 53 01 80 2c 00 07 18 30 39 36 col 38: [53] 37 31 31 31 32 32 31 32 31 35 32 43 30 31 39 33 30 33 36 35 0a 32 30 32 34 2d 30 36 2d 32 34 01 30 04 4c 4f 41 4e 02 c1 18 01 53 01 80 2c 00 07 18 30 35 34 37 col 39: [57] 34 34 31 32 33 30 38 31 37 32 43 30 39 33 38 37 32 31 37 0a 32 30 32 34 2d 30 36 2d 32 34 01 30 04 4c 4f 41 4e 02 c1 18 01 53 01 80 0c 00 09 67 15 90 81 00 3b 18 30 38 35 col 40: [53] 37 32 38 31 32 32 31 31 31 34 32 43 30 30 32 39 39 37 38 35 0a 32 30 32 34 2d 30 36 2d 32 34 02 32 30 04 4c 4e 53 50 02 c1 1a 01 53 01 80 08 b4 a6 c0 ed b3 c9 col 41: [185] a6 0a 32 30 32 34 2d 30 36 2d 32 35 2c 00 07 18 30 32 31 35 36 30 33 31 32 33 30 37 32 35 32 43 37 37 33 34 33 32 31 31 0a 32 30 32 34 2d 30 36 2d 32 34 01 30 04 4c 4f 41 4e 02 c1 16 01 53 01 80 2c 00 07 18 30 34 33 35 38 32 37 31 32 31 30 39 31 30 32 47 35 33 34 36 36 35 30 33 0a 32 30 32 34 2d 30 36 2d 32 34 01 30 04 4c 4f 41 4e 02 c1 16 01 53 01 80 0c 00 09 68 15 98 d4 00 17 18 31 30 33 31 34 30 39 31 32 32 30 36 31 33 32 43 30 33 34 37 31 31 37 37 0a 32 30 32 34 2d 30 36 2d 32 34 02 32 30 04 4c 4e 53 50 02 c1 1a 01 53 01 80 08 b4 a6 c0 ed b3 c9 col 42: [185] a6 0a 32 30 32 34 2d 30 36 2d 32 35 20 00 00 64 1d 81 02 00 55 2c 00 09 18 30 39 36 36 32 32 34 31 32 32 30 39 30 35 32 43 36 34 37 31 38 35 34 35 0a 32 30 32 34 2d 30 36 2d 32 34 02 32 30 04 4c 4e 53 50 01 80 01 53 01 80 08 b4 a6 c0 ed b3 c9 b9 a6 0a 32 30 32 34 2d 30 36 2d 32 35 2c 00 09 18 30 32 30 32 30 30 39 30 31 39 33 30 31 30 31 41 30 30 37 36 36 36 30 31 0a 32 30 32 34 2d 30 36 2d 32 34 02 32 30 04 4c 4e 53 50 01 80 01 53 01 80 08 b4 a6 c0 ed b3 c9 b9 a6 0a 32 30 32 34 2d 30 36 2d 32 35 2c 00 09 09 31 32 37 36 30 31 31 37 31 0a 32 30 32 34 tab 0, row 96, @0x4f0 tl: 2 fb: --HDF--- lb: 0x31 |
Oh my God!这个数据看起来是有问题的。首先看dump 该block显示有113行数据,似乎跟index的数据是不匹配的(索引只有112条);其次该block对应的第95行数据。。。这看起来完全是乱码呀。
先不说是不是真的乱码,首先这行数据的行指针指向了另外一个block 0xc10b0831的3133行数据。而且看上去这行数据显示该block时一个cluster table。
很明显,我们这个数据块所在的表是一个普通表,不是cluster table。最起码的,列都对不上呀。
实际上看该trace的上下文,发现这个数据块有3行数据都有问题。实际上这个数据块的结构是完整的,只是block的数据异常(包括row directory),因此这是一个逻辑坏块。
比如看该block的itl中比较新的一个事务对应的redo record记录,发现是有行迁移的(实际上看block dump也能看到)。
1 2 3 4 5 6 7 8 9 10 11 12 13 |
CHANGE <span class="hljs-comment">#2 CON_ID:0 TYP:0 CLS:1 AFN:408 DBA:0x66157303 OBJ:527736 SCN:0x00000040eb042db8 SEQ:1 OP:11.6 ENC:0 RBL:0 FLG:0x0000</span> KTB Redo op: 0x02 ver: 0x01 compat bit: 4 (post-11) padding: 1 op: C uba: 0x1385cbcc.8005.1f KDO Op code: ORP row dependencies Disabled xtype: XA flags: 0x00000000 bdba: 0x66157303 hdba: 0x04402305 itli: 23 ispac: 0 maxfr: 4858 tabn: 0 slot: 68(0x44) size/delt: 9 fb: --H----- lb: 0x17 cc: 0 nrid: 0x641d8102.55 null: |
通过分析trace发现该数据块之前的redo dump记录都是正常的,而且只有update和insert操作,并无delete记录(看redo opcode即可)。
最终客户通过重建表暂时解决了问题。
那么问题来了,这个问题是怎么产生的呢?
老实说我比较怀疑写丢失导致的bug,毕竟redo都是正常的。
而看前面的dump我们可以看到:kdsgrp – dump CR block dba
这表示oracle报错的时候dump 内存块时,发现这个block是一个cr block。我们知道Oracle中的cr是利用 落盘后的数据块+undo 进行重构的。
难道说之前写数据落盘的时候,这个block就有问题了,所以后面构造的cr block也是有问题的。
这里我在测试环境验证一下这个问题。
破坏一下行头 是否可以恢复
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 |
+++Session 1 SQL> show user USER is <span class="hljs-string">"ROGER"</span> SQL> create table test0626(a number,b varchar2(100)); Table created. SQL> insert into test0626 values(10,<span class="hljs-string">'enmotech.com'</span>); 1 row created. SQL> insert into test0626 values(20,<span class="hljs-string">'mogdb.io'</span>); 1 row created. SQL> insert into test0626 values(30,<span class="hljs-string">'killdb.com'</span>); 1 row created. SQL> insert into test0626 values(100,<span class="hljs-string">'china chengdu'</span>); 1 row created. SQL> commit; Commit complete. SQL> alter system checkpoint; System altered. SQL> SQL> select a.*, 2 dbms_rowid.rowid_object(rowid) object_id, dbms_rowid.rowid_relative_fno(rowid) file_id, 3 4 dbms_rowid.rowid_block_number(rowid) block_id, 5 dbms_rowid.rowid_row_number(rowid) num, 6 rowidtochar(rowid) 7 from test0626 a; A B OBJECT_ID FILE_ID BLOCK_ID NUM ROWIDTOCHAR(ROWID) ---------- ------------------------------ ---------- ---------- ---------- ---------- ------------------------------------ 10 enmotech.com 113932 4 374414 0 AAAb0MAAEAABbaOAAA 20 mogdb.io 113932 4 374414 1 AAAb0MAAEAABbaOAAB 30 killdb.com 113932 4 374414 2 AAAb0MAAEAABbaOAAC 100 china chengdu 113932 4 374414 3 AAAb0MAAEAABbaOAAD SQL> alter system flush buffer_cache; System altered. ++++Session 2 BBED> <span class="hljs-built_in">set</span> file 4 block 374414 FILE<span class="hljs-comment"># 4</span> BLOCK<span class="hljs-comment"># 374414</span> BBED> p kdbr sb2 kdbr[0] @118 8069 sb2 kdbr[1] @120 8054 sb2 kdbr[2] @122 8037 sb2 kdbr[3] @124 8017 BBED> p *kdbr[2] rowdata[20] ----------- ub1 rowdata[20] @8137 0x2c BBED> x /rnnnnnnnnccccccc rowdata[20] @8137 ----------- flag@8137: 0x2c (KDRHFL, KDRHFF, KDRHFH) lock@8138: 0x01 cols@8139: 2 col 0[2] @8140: 30 col 1[10] @8143: <span class="hljs-comment">######################################### </span> BBED> x /rnnnnnnnnccccccc rowdata[20] @8137 ----------- flag@8137: 0x2c (KDRHFL, KDRHFF, KDRHFH) lock@8138: 0x01 cols@8139: 2 col 0[2] @8140: 30 col 1[10] @8143: <span class="hljs-comment">######################################### </span> BBED> modify /x 6c0103 Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y File: /u01/app/oracle/oradata/data/OGGDB/datafile/users01.dbf (4) Block: 374414 Offsets: 8137 to 8140 Dba:0x0105b68e ------------------------------------------------------------------------ 6c010302 <32 bytes per line> BBED> sum apply Check value <span class="hljs-keyword">for</span> File 4, Block 374414: current = 0xd734, required = 0xd734 BBED> +++Session 2 SQL> update test0626 <span class="hljs-built_in">set</span> b=<span class="hljs-string">'www.baidu.com'</span> <span class="hljs-built_in">where</span> a=100; 1 row updated. SQL> 不提交 +++session 3 SQL> shutdown abort; ORACLE instance shut down. SQL> startup ORACLE instance started. Total System Global Area 3423965184 bytes Fixed Size 2258040 bytes Variable Size 855640968 bytes Database Buffers 2550136832 bytes Redo Buffers 15929344 bytes Database mounted. Database opened. SQL> <span class="hljs-built_in">set</span> lines 120 SQL> select * from roger.test0626; A ---------- B ------------------------------------------------------------------------------------------------------------------------ 10 enmotech.com 20 mogdb.io db.io, ? enmotech.comG?? E? T ? ? ? ? A ---------- B ------------------------------------------------------------------------------------------------------------------------ 100 china chengdu SQL> ++++session 4 oracle@Ora11g-ogg-Target:/home/oracle <span class="hljs-variable">$sqlplus</span> roger/roger SQL*Plus: Release 11.2.0.4.0 Production on Wed Jun 26 21:18:51 2024 Copyright (c) 1982, 2013, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production With the Partitioning, OLAP, Data Mining and Real Application Testing options SQL> select * from test0626; A ---------- B -------------------------------------------------------------------------------- 10 enmotech.com 20 mogdb.io db.io, ? ??? ~ ?? @ ?C ? A ---------- B -------------------------------------------------------------------------------- 100 china chengdu SQL> ++++Session 5 oracle@Ora11g-ogg-Target:/home/oracle <span class="hljs-variable">$sqlplus</span> roger/roger SQL*Plus: Release 11.2.0.4.0 Production on Wed Jun 26 21:18:59 2024 Copyright (c) 1982, 2013, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production With the Partitioning, OLAP, Data Mining and Real Application Testing options SQL> conn /as sysdba Connected. SQL> select dbms_utility.make_data_block_address(4,374414) from dual; DBMS_UTILITY.MAKE_DATA_BLOCK_ADDRESS(4,374414) ---------------------------------------------- 17151630 SQL> oradebug setmypid Statement processed. SQL> oradebug tracefile_name /u01/app/oracle/diag/rdbms/oggdb/oggdb/trace/oggdb_ora_5736.trc SQL> alter session <span class="hljs-built_in">set</span> events <span class="hljs-string">'immediate trace name set_tsn_p1 level 5'</span>; Session altered. SQL> alter session <span class="hljs-built_in">set</span> events <span class="hljs-string">'immediate trace name buffer level 17151630'</span>; Session altered. SQL> SELECT lru_flag, 2 tch, BA, decode(STATE,0,<span class="hljs-string">'free'</span>,1,<span class="hljs-string">'xcur'</span>,2,<span class="hljs-string">'scur'</span>,3,<span class="hljs-string">'cr'</span>, 4,<span class="hljs-string">'read'</span>,5,<span class="hljs-string">'mrec'</span>,6,<span class="hljs-string">'irec'</span>,7,<span class="hljs-string">'write'</span>,8,<span class="hljs-string">'pi'</span>, 9,<span class="hljs-string">'memory'</span>,10,<span class="hljs-string">'mwrite'</span>,11,<span class="hljs-string">'donated'</span>, 12,<span class="hljs-string">'protected'</span>, 13,<span class="hljs-string">'securefile'</span>, 14,<span class="hljs-string">'siop'</span>,15,<span class="hljs-string">'recckpt'</span>, 16, <span class="hljs-string">'flashfree'</span>, 17, <span class="hljs-string">'flashcur'</span>, 18, <span class="hljs-string">'flashna'</span>) status FROM x<span class="hljs-variable">$bh</span> 3 4 5 6 WHERE FILE<span class="hljs-comment">#=4</span> 7 AND DBABLK=374414; LRU_FLAG TCH BA STATUS ---------- ---------- ---------------- -------------------- 0 1 00000000B0A0C000 cr 0 1 000000009139C000 cr 0 1 0000000092EE2000 xcur 2 1 0000000092EE4000 cr 4 0 00000000994F0000 free SQL> |
可以看到,居然能恢复这个逻辑坏块。这个逻辑坏块可查,只不过有问题这行数据显示为乱码而已。
如果去dump数据块看,确实是乱码:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 |
oracle@Ora11g-ogg-Target:/home/oracle <span class="hljs-variable">$cat</span> /u01/app/oracle/diag/rdbms/oggdb/oggdb/trace/oggdb_ora_5736.trc|grep <span class="hljs-string">"Block header dump: 0x0105b68e"</span> Block header dump: 0x0105b68e Block header dump: 0x0105b68e Block header dump: 0x0105b68e Block header dump: 0x0105b68e Block header dump: 0x0105b68e Object id on Block? Y seg/obj: 0x1bd0c csc: 0x00.2af3579 itc: 2 flg: E typ: 1 - DATA brn: 0 bdba: 0x105b688 ver: 0x01 opc: 0 inc: 0 exflg: 0 Itl Xid Uba Flag Lck Scn/Fsc 0x01 0x0008.00d.000067f5 0x00c0010b.1a54.0c C--- 0 scn 0x0000.02af32fd 0x02 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000 bdba: 0x0105b68e data_block_dump,data header at 0xb0a0c064 =============== tsiz: 0x1f98 hsiz: 0x1a pbl: 0xb0a0c064 76543210 flag=-------- ntab=1 nrow=4 frre=-1 fsbo=0x1a fseo=0x1f51 avsp=0x1f37 tosp=0x1f37 0xe:pti[0] nrow=4 offs=0 0x12:pri[0] offs=0x1f85 0x14:pri[1] offs=0x1f76 0x16:pri[2] offs=0x1f65 0x18:pri[3] offs=0x1f51 block_row_dump: tab 0, row 0, @0x1f85 tl: 19 fb: --H-FL-- lb: 0x0 cc: 2 col 0: [ 2] c1 0b col 1: [12] 65 6e 6d 6f 74 65 63 68 2e 63 6f 6d tab 0, row 1, @0x1f76 tl: 15 fb: --H-FL-- lb: 0x0 cc: 2 col 0: [ 2] c1 15 col 1: [ 8] 6d 6f 67 64 62 2e 69 6f tab 0, row 2, @0x1f65 tl: 200 fb: -CH-FL-- lb: 0x0 cc: 3 cki: 2 col 0: [193] 1f 0a 6b 69 6c 6c 64 62 2e 63 6f 6d 2c 00 02 02 c1 15 08 6d 6f 67 64 62 2e 69 6f 2c 00 02 02 c1 0b 0c 65 6e 6d 6f 74 65 63 68 2e 63 6f 6d 00 dmp lmt after 47 col 1: [0] col 2: [0] tab 0, row 3, @0x1f51 tl: 20 fb: --H-FL-- lb: 0x0 cc: 2 col 0: [ 2] c2 02 col 1: [13] 63 68 69 6e 61 20 63 68 65 6e 67 64 75 end_of_block_dump |
因为这里模拟构造了cr block,因此可以看到实际上有多个cr block,因为dump 该dba buffer会发现trace中有多行记录。
不过这不重要,我们发现多个dump 记录中的第三行数据都是乱码。
那么这说明什么问题,说明实际上默认情况下对于逻辑坏块在读取时并没有进行校验。
那么如果预防这样的问题,实际上oracle提供了一些方法,比如打开block和索引的读写校验,实际上就可以避免上面测试遇到的问题。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 |
SQL> alter system <span class="hljs-built_in">set</span> db_ultra_safe=DATA_AND_INDEX scope=spfile; System altered. SQL> shutdown abort; ORACLE instance shut down. SQL> startup ORACLE instance started. Total System Global Area 3423965184 bytes Fixed Size 2258040 bytes Variable Size 855640968 bytes Database Buffers 2550136832 bytes Redo Buffers 15929344 bytes Database mounted. Database opened. SQL> conn roger/roger Connected. SQL> <span class="hljs-built_in">set</span> lines 120 SQL> col b <span class="hljs-keyword">for</span> a40 SQL> select * from test0626; A B ---------- ---------------------------------------- 10 enmotech.com 20 mogdb.io db.io, ? enmotech.com?? ) ? ?? china chengdu SQL> update test0626 <span class="hljs-built_in">set</span> b=<span class="hljs-string">'MogDB 5.2'</span> <span class="hljs-built_in">where</span> a=20; update test0626 <span class="hljs-built_in">set</span> b=<span class="hljs-string">'MogDB 5.2'</span> <span class="hljs-built_in">where</span> a=20 * ERROR at line 1: ORA-00607: Internal error occurred <span class="hljs-keyword">while</span> making a change to a data block ORA-00600: internal error code, arguments: [kdBlkCheckError], [4], [374414], [6103], [], [], [], [], [], [], [], [] SQL> SQL> show parameter db_block NAME TYPE VALUE ------------------------------------ ---------------------- ------------------------------ db_block_buffers <span class="hljs-built_in">integer</span> 0 db_block_checking string FULL db_block_checksum string FULL db_block_size <span class="hljs-built_in">integer</span> 8192 SQL> |
此时报错的信息,也会在alert log详细打印,明确告你是什么对象的什么block有问题。
写在最后。所以你认为B表的数据被写到了A表,你认为这是Oracle Bug吗。
想了解更多内容,可以参考这2个课程系列文章。
Leave a Reply
You must be logged in to post a comment.