未格式化的坏块导致数据库备份失败
本站文章除注明转载外,均为本站原创: 转载自love wife love life —Roger的Oracle/MySQL/PostgreSQL数据恢复博客
本文链接地址: 未格式化的坏块导致数据库备份失败
今天一位同事咨询说某客户一套核心数据库出现坏块,导致NBU备份失败;他通过rman进行blockrecover 报错,如下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 |
---rman validate检查报错坏块 SQL> SELECT * FROM v$database_block_corruption; FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTIO ---------- ---------- ---------- ------------------ --------- 15 1185624 8 0 CORRUPT 15 1185652 4 0 CORRUPT 15 1187096 4 0 CORRUPT ---通过blockrecover恢复报错 RMAN> run{ 2> ALLOCATE CHANNEL ch00 TYPE 'SBT_TAPE'; 3> SEND 'NB_ORA_SERV=nbu_master, NB_ORA_CLIENT=ybbred12'; 4> blockrecover datafile 15 block 1187096; 5> RELEASE CHANNEL ch00; 6> } using target database control file instead of recovery catalog allocated channel: ch00 channel ch00: SID=1618 instance=billog2 device type=SBT_TAPE channel ch00: Veritas NetBackup for Oracle - Release 7.7.3 (2016051920) sent command to channel: ch00 Starting recover at 2020-03-19 11:07:12 channel ch00: restoring block(s) channel ch00: specifying block(s) to restore from backup set restoring blocks of datafile 00015 channel ch00: reading from backup piece bk_115329_1_1035085787 channel ch00: ORA-19870: error while restoring backup piece bk_115329_1_1035085787 ORA-19507: failed to retrieve sequential file, handle="bk_115329_1_1035085787", parms="" ORA-27029: skgfrtrv: sbtrestore returned error ORA-19511: Error received from media manager layer, error text: Backup file <bk_115329_1_1035085787> not found in NetBackup catalog ....... |
从上述报错来看,rman提示找不到备份集合。对NBU有一点点经验;通过bplist查看最近几天对备份确实没有。
1 2 3 4 |
ybbred12[/]#bplist -l -C byyred12 -t 4 -s 03/15/2020 00:00:00 -e 03/15/2020 23:00:00 -R / EXIT STATUS 227: no entity was found ybbred12[/]#bplist -l -C byyred12 -t 4 -s 03/15/2020 00:00:00 -e 03/18/2020 23:00:00 -R / EXIT STATUS 227: no entity was found |
然后我对数据块进行了dump发现内容完全不对。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
*** 2020-03-19 12:17:33.639 Start dump data block from file +DATA/billog/datafile/tbs_cen_ilog.279.907377523 minblk 1185625 maxblk 1185625 V10 STYLE FILE HEADER: Compatibility Vsn = 186646528=0xb200000 Db ID=3752648572=0xdfacdf7c, Db Name='BILLOG' Activation ID=0=0x0 Control Seq=24409607=0x1747607, File size=4188672=0x3fea00 File Number=15, Blksiz=8192, File Type=3 DATA Dump all the blocks in range: Encrypted block <11, 3135232634> content will not be dumped. Dumping header only. buffer tsn: 11 rdba: 0xbadfda7a (747/2087546) scn: 0xbadf.badfda7a seq: 0xda flg: 0x7a tail: 0xbadfda7a frmt: 0x07 chkval: 0xbadf type: 0xba=unknown Hex dump of corrupt header 4 = CORRUPT Dump of memory from 0x0000000110860E00 to 0x0000000110860E14 110860E00 BADFDA7A BADFDA7A BADFDA7A BADFDA7A [...z...z...z...z] 110860E10 BADFDA7A [...z] End dump data block from file +DATA/billog/datafile/tbs_cen_ilog.279.907377523 minblk 1185625 maxblk 1185625 |
因此我怀疑这几个坏块可能是未格式化的坏块。进一步通过查询dba_extents进行了确认:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 |
SQL> set lines 200 SQL> col segment_name for a30 SQL> col segment_type for a20 SQL> col owner for a20 SQL> select segment_name,segment_type,owner 2 from dba_Extents 3 where file_id=15 4 and 1185624 between block_id and block_id + blocks -1; no rows selected SQL> c/5624/5662 4* and 1185662 between block_id and block_id + blocks -1 SQL> / no rows selected SQL> c/5662/7100 4* and 118877100 between block_id and block_id + blocks -1 SQL> / no rows selected SQL> select * from (select owner,segment_name,file_id,extent_id,block_id,blocks from 2 dba_extents where file_id=15 and block_id > 1185000) where rownum < 20; no rows selected SQL> c/1185000/1170000 2* dba_extents where file_id=15 and block_id > 1170000) where rownum < 20 SQL> SQL> l 1 select * from (select owner,segment_name,file_id,extent_id,block_id,blocks from 2* dba_extents where file_id=15 and block_id > 1170000) where rownum < 20 SQL> SQL> c/20/100 2* dba_extents where file_id=15 and block_id > 1170000) where rownum < 100 SQL> l 1 select * from (select owner,segment_name,file_id,extent_id,block_id,blocks from 2* dba_extents where file_id=15 and block_id > 1170000) where rownum < 100 SQL> |
看来这个数据文件可能很大一部分根本没被使用;至少目前没有被使用。我们可以查看该数据文件的高水位信息来验证这一点:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 |
SQL> select * 2 from (select /*+ ordered use_hash(a,b,c) */ 3 a.file_id, 4 a.file_name, 5 a.filesize, 6 b.freesize, 7 (a.filesize - b.freesize) usedsize, 8 c.hwmsize, 9 c.hwmsize - (a.filesize - b.freesize) unsedsize_belowhwm, 10 a.filesize - c.hwmsize canshrinksize 11 from (select file_id, 12 file_name, 13 round(bytes / 1024 / 1024) filesize 14 from dba_data_files) a, 15 (select file_id, round(sum(dfs.bytes) / 1024 / 1024) freesize 16 from dba_free_space dfs 17 group by file_id) b, 18 (select file_id, round(max(block_id) * 8 / 1024) HWMsize 19 from dba_extents 20 group by file_id) c 21 where a.file_id = b.file_id 22 and a.file_id = c.file_id 23 order by unsedsize_belowhwm desc) 24 where file_id in (select file_id 25 from dba_data_files 26 where file_id=15) 27 order by file_id; FILE_ID FILE_NAME FILESIZE FREESIZE USEDSIZE HWMSIZE UNSEDSIZE_BELOWHWM CANSHRINKSIZE ---------- ------------------------------------------------------------ ---------- ---------- ---------- ---------- ------------------ ------------- 15 +DATA/billog/datafile/tbs_cen_ilog.279.907377523 32724 31512 1212 1470 258 31254 |
不难看出目前该数据文件的HWM大概在1400M左右;而前面报错的坏块编号1185624*8192/1024/1024 大概是9400M左右;显然这是有问题的。
当然对于这个问题,我们可以其实不做任何处理,等业务表的数据逐渐增加;会自动去格式化处理掉这部分坏块,不过这样会影响备份。也就是说在坏块处理好之前备份可能都无法正常进行,除非指定allow corrption参数。
那么针对这种简单问题其实有3种比较简单的处理方法:
1、通过resize文件来完成
2、通过创建测试表,allocate extent到指定文件,并往表中插入大量数据,格式化坏块后,再drop table
3、通过bbed copy 空块直接覆盖即可(copy 覆盖需要修改rdba地址)。
很显然,针对这次的情况,resize文件是最简单的了。我们来看看效果:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
SQL> alter database datafile 15 resize 9000m; Database altered. RMAN> validate datafile 15; Starting validate at 2020-03-19 14:36:28 using channel ORA_DISK_1 channel ORA_DISK_1: starting validation of datafile channel ORA_DISK_1: specifying datafile(s) for validation input datafile file number=00015 name=+DATA/billog/datafile/tbs_cen_ilog.279.907377523 channel ORA_DISK_1: validation complete, elapsed time: 00:00:45 List of Datafiles ================= File Status Marked Corrupt Empty Blocks Blocks Examined High SCN ---- ------ -------------- ------------ --------------- ---------- 15 OK 0 0 1152000 14080921213487 File Name: +DATA/billog/datafile/tbs_cen_ilog.279.907377523 Block Type Blocks Failing Blocks Processed ---------- -------------- ---------------- Data 0 0 Index 0 1149254 Other 0 2746 |
非常简单的小问题。跟大家分享一下,就当是流水账吧!
Leave a Reply
You must be logged in to post a comment.