Diskgroup Fails to Mount Because the Disk Was Partitioned
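Unless marked as a repost, all articles on this site are original. Reposted from love wife love life, Roger's Oracle/MySQL/PostgreSQL data recovery blog.
Link to this article: Diskgroup Fails to Mount Because the Disk Was Partitioned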
Recently a customer's CRS would not start and reported a corrupted disk header. The ASM alert log looked like this:
WARNING: cache read a corrupt block: group=4(JTGL1) dsk=0 blk=94 disk=0 (JTGL1_0000) incarn=3915949017 au=0 blk=94 count=1
Sun Dec 01 14:04:18 2019
Errors in file /u01/app/grid/diag/asm/+asm/+ASM3/trace/+ASM3_ora_19679.trc:
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648] [94] [0 != 1]
NOTE: a corrupted block from group JTGL1 was dumped to /u01/app/grid/diag/asm/+asm/+ASM3/trace/+ASM3_ora_19679.trc
WARNING: cache read (retry) a corrupt block: group=4(JTGL1) dsk=0 blk=94 disk=0 (JTGL1_0000) incarn=3915949017 au=0 blk=94 count=1
Sun Dec 01 14:04:18 2019
Errors in file /u01/app/grid/diag/asm/+asm/+ASM3/trace/+ASM3_ora_19679.trc:
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648] [94] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648] [94] [0 != 1]
WARNING: Failed to verify disk 0 (JTGL1_0000) of group 4 (JTGL1) path /dev/mpathn reason: endian_kfbh 0 != 1
NOTE: corrupt disk header dumped to /u01/app/grid/diag/asm/+asm/+ASM3/trace/+ASM3_ora_19679.trc
NOTE: cache repaired a corrupt block: group=4(JTGL1) dsk=0 blk=94 on disk 0 from disk=0 (JTGL1_0000) incarn=3915949017 au=11 blk=94 count=1
Sun Dec 01 14:07:16 2019
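A quick analysis showed that the disk header was abnormal. More precisely, the whole first 1 MB of the disk was affected, although not all of it was destroyed.
The disk header had indeed been overwritten, and the key detail is that it contained the string EFI PART, which is the signature of a GPT header. That makes it very likely that the disk was partitioned.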
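As an illustration of the check described above, here is a minimal sketch that looks for the GPT signature in a dd image of the disk. The function name `has_gpt_signature` is hypothetical, and the default of 512-byte logical sectors (which puts the GPT header at byte offset 512) is an assumption; on a 4K-native device the second argument would be 4096.

```shell
#!/usr/bin/env bash
# Sketch: succeed if the image carries the GPT header signature "EFI PART".
# Assumption: the GPT header sits at LBA 1, i.e. at byte offset <sector_size>.
has_gpt_signature() {
    local image=$1
    local sector_size=${2:-512}
    local sig
    # Read the 8 signature bytes; strip NULs so the shell can hold the value.
    sig=$(dd if="$image" bs=1 skip="$sector_size" count=8 2>/dev/null | tr -d '\0')
    [ "$sig" = "EFI PART" ]
}
```

It could be run against a copy of the disk, for example `has_gpt_signature /tmp/mpathaa.dd && echo "GPT header found"`.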
Reading the blocks of the first AU in a loop shows that not all of them are damaged:
[oracle@enmodb3 tmp]$ let n=256
[oracle@enmodb3 tmp]$ for (( i=2; i<$n; i++ )); do kfed read /tmp/mpathaa.dd blkn=$i | grep kfbh.type; done
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
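A long run of per-block lines like this is easier to judge as a tally. The following is only a sketch: `summarize_block_types` is a hypothetical helper, not part of kfed, and it assumes the block type name is the last field of each `kfbh.type` line, as in the listing above.

```shell
#!/usr/bin/env bash
# Sketch: count how many blocks of each type appear in kfed output on stdin.
# Assumption: the type name (e.g. KFBTYP_INVALID) is the last field of the line.
summarize_block_types() {
    awk '/kfbh\.type/ { count[$NF]++ }
         END { for (t in count) print t, count[t] }' | sort
}

# Possible use with the loop from the article (requires kfed):
#   for (( i=2; i<256; i++ )); do
#       kfed read /tmp/mpathaa.dd blkn=$i | grep kfbh.type
#   done | summarize_block_types
```

The result is one line per block type with its count, which gives a rough measure of how much of the AU is still readable.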
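Fixing this is not difficult. Starting with version 12.1, Oracle keeps a full copy of AU 0, and that copy is stored at AU 11; this is a new feature. The recovery can be done with dd.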
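The article does not show the dd commands, so the following is only a sketch of what such a copy might look like. The function name `restore_au0_from_au11` is hypothetical, the 1 MiB AU size is an assumption based on the "first 1 MB" mentioned earlier, and the sketch is meant to be run against a dd image of the disk rather than the live device.

```shell
#!/usr/bin/env bash
# Sketch: overwrite AU 0 of a disk image with the copy kept at AU 11.
# Assumptions: AU size defaults to 1 MiB; $1 is an image file, not the live disk.
restore_au0_from_au11() {
    local image=$1
    local au_size=${2:-1048576}
    # Save the damaged AU 0 first so the change can be undone.
    dd if="$image" of="$image.au0.bak" bs="$au_size" count=1 2>/dev/null || return 1
    # Extract the copy stored at AU 11.
    dd if="$image" of="$image.au11" bs="$au_size" skip=11 count=1 2>/dev/null || return 1
    # Write it over AU 0 in place; conv=notrunc keeps the rest of the image.
    dd if="$image.au11" of="$image" bs="$au_size" count=1 conv=notrunc 2>/dev/null
}
```

After a copy like this, re-reading the blocks of AU 0 with kfed would be a reasonable way to confirm that the block types are valid again before touching the real disk.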
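So the question is: what was the root cause?
There is no doubt that partitioning was involved. However, when a raw disk is partitioned, only the 4 KB at the disk header is overwritten, which is unlikely to damage this many blocks at the front of the disk (this has been verified).
Just a brief note for the record.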