3TB 非归档Oracle数据库恢复小case(windows)
本站文章除注明转载外,均为本站原创: 转载自love wife love life —Roger的Oracle/MySQL/PostgreSQL数据恢复博客
本文链接地址: 3TB 非归档Oracle数据库恢复小case(windows)
这是友情支持一个朋友的数据库恢复case,昨天圣诞节的时候来个求助,只能速度帮忙解决了好过节去了。首先我们来看下的alert log都有哪些信息:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 |
Wed Dec 23 15:19:21 2015 SMON: enabling tx recovery Wed Dec 23 15:19:21 2015 Database Characterset is ZHS16GBK Opening with internal Resource Manager plan where NUMA PG = 2, CPUs = 8 Wed Dec 23 15:19:22 2015 Errors in file d:\oracle\product\10.2.0\admin\ajzq\udump\ajzq_ora_1712.trc: ORA-00600: internal error code, arguments: [4194], [45], [45], [], [], [], [], [] Doing block recovery for file 2 block 1618943 Block recovery from logseq 4, block 63 to scn 7662329350 Wed Dec 23 15:19:23 2015 Recovery of Online Redo Log: Thread 1 Group 1 Seq 4 Reading mem 0 Mem# 0: D:\ORACLE\PRODUCT\10.2.0\ORADATA\xxxx\REDO01.LOG Block recovery completed at rba 4.4749.16, scn 1.3367362055 Doing block recovery for file 2 block 121 Block recovery from logseq 4, block 63 to scn 7662329346 Wed Dec 23 15:19:23 2015 Recovery of Online Redo Log: Thread 1 Group 1 Seq 4 Reading mem 0 Mem# 0: D:\ORACLE\PRODUCT\10.2.0\ORADATA\xxxx\REDO01.LOG Block recovery completed at rba 4.3977.16, scn 1.3367362051 Wed Dec 23 15:19:24 2015 SMON: Restarting fast_start parallel rollback Wed Dec 23 15:19:24 2015 Errors in file d:\oracle\product\10.2.0\admin\xxxx\udump\xxxx_ora_1712.trc: ORA-00600: internal error code, arguments: [4194], [45], [45], [], [], [], [], [] Wed Dec 23 15:19:24 2015 DEBUG: Replaying xcb 0xf0adfbe0, pmd 0x20588f48 for failed op 8 Doing block recovery for file 2 block 1618943 Block recovery from logseq 4, block 63 to scn 7662329350 Wed Dec 23 15:19:24 2015 Recovery of Online Redo Log: Thread 1 Group 1 Seq 4 Reading mem 0 Mem# 0: D:\ORACLE\PRODUCT\10.2.0\ORADATA\xxxx\REDO01.LOG Block recovery completed at rba 4.4749.16, scn 1.3367362055 Wed Dec 23 15:21:28 2015 Errors in file d:\oracle\product\10.2.0\admin\xxxx\udump\xxxx_ora_1712.trc: ORA-00600: internal error code, arguments: [4194], [45], [45], [], [], [], [], [] ORA-00600: internal error code, arguments: [4194], [45], [45], [], ..... Wed Dec 23 16:47:26 2015 Errors in file d:\oracle\product\10.2.0\admin\xxxx\bdump\xxxx_p005_8980.trc: ORA-00600: internal error code, arguments: [2037], [141460651], [141460651], [162], [6], [1], [3367403517], [3141222167] Wed Dec 23 16:47:26 2015 Hex dump of (file 92, block 3926982) in trace file d:\oracle\product\10.2.0\admin\xxxx\bdump\xxxx_dbw0_8864.trc Corrupt block relative dba: 0x173bebc6 (file 92, block 3926982) Bad header found during preparing block for write Data in bad block: type: 23 format: 7 rdba: 0xbd3b3f17 last change scn: 0x0001.c8b6be51 seq: 0x5 flg: 0x00 spare1: 0x3b spare2: 0xbc spare3: 0xc consistency value in tail: 0xbe511705 check value in block header: 0x0 block checksum disabled Wed Dec 23 16:47:26 2015 Errors in file d:\oracle\product\10.2.0\admin\xxxx\bdump\xxxx_p005_8980.trc: ORA-00600: internal error code, arguments: [2037], [141460651], [141460651], [162], [6], [1], [3367403517], [3141222167] Wed Dec 23 16:47:27 2015 Errors in file d:\oracle\product\10.2.0\admin\xxxx\bdump\xxxx_dbw0_8864.trc: ORA-00600: internal error code, arguments: [kcbzpbuf_1], [4], [1], [], [], [], [], [] Wed Dec 23 16:47:27 2015 Errors in file d:\oracle\product\10.2.0\admin\xxxx\bdump\xxxx_dbw0_8864.trc: ORA-00600: internal error code, arguments: [kcbzpbuf_1], [4], [1], [], [], [], [], [] DBW0: terminating instance due to error 471 |
我们可以看到,这个数据库已经宕机好几天了,在23号的时候就已经无法正常open了。上述的结果错误也很常见很简单。朋友经过处理后顺利把数据库open了,可是打开之后创建新的undo 表空间之后尝试重启后就再也打开不开了。遇到了如下的错误:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 |
Fri Dec 25 11:44:17 2015 Recovery of Online Redo Log: Thread 1 Group 3 Seq 6 Reading mem 0 Mem# 0: D:\ORACLE\PRODUCT\10.2.0\ORADATA\xxxx\REDO03.LOG Fri Dec 25 11:44:17 2015 Completed redo application Fri Dec 25 11:44:17 2015 Hex dump of (file 44, block 518322) in trace file d:\oracle\product\10.2.0\admin\xxxx\bdump\xxxx_dbw2_21232.trc Corrupt block relative dba: 0x0b07e8b2 (file 44, block 518322) Bad header found during preparing block for write Data in bad block: type: 82 format: 2 rdba: 0x1c4ee895 last change scn: 0x0001.c8bbad54 seq: 0x21 flg: 0xf0 spare1: 0x52 spare2: 0x4d spare3: 0x895b consistency value in tail: 0xad545221 check value in block header: 0x0 block checksum disabled Fri Dec 25 11:44:18 2015 Errors in file d:\oracle\product\10.2.0\admin\xxxx\bdump\xxxx_dbw2_21232.trc: ORA-00600: internal error code, arguments: [kcbzpbuf_1], [4], [1], [], [], [], [], [] Fri Dec 25 11:44:18 2015 Errors in file d:\oracle\product\10.2.0\admin\xxxx\bdump\xxxx_dbw2_21232.trc: ORA-00600: internal error code, arguments: [kcbzpbuf_1], [4], [1], [], [], [], [], [] DBW2: terminating instance due to error 471 Fri Dec 25 11:44:18 2015 Errors in file d:\oracle\product\10.2.0\admin\xxxx\bdump\xxxx_p014_20704.trc: ORA-00471: DBWR process terminated with error |
这个错误我也是第一次遇见。从kcbzpbuf函数来看,是dbwr进出写脏块时发现了坏块。很明显,从上面的坏块信息来看,dba地址0x0b07e8b2 和 rdba地址0x1c4ee895是不匹配,这肯定没法正常写入的。
针对这个错误我事后在metalink 搜索了一下,还是有不少相关的文档甚至是bug,这里不多说了。处理方式很简单,首先加入如下2个隐含参数尝试打开数据库:
*._allow_resetlogs_corruption=TRUE
*._allow_error_simulation=TRUE
做了不完全恢复后,尝试open resetlogs时发现遇到了如下错误:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
ri Dec 25 12:20:03 2015 SMON: enabling cache recovery Fri Dec 25 12:20:04 2015 Errors in file d:\oracle\product\10.2.0\admin\xxxx\udump\xxxx_ora_20176.trc: ORA-00600: internal error code, arguments: [2662], [1], [3367742663], [1], [3367742798], [4194313], [], [] Fri Dec 25 12:20:05 2015 Errors in file d:\oracle\product\10.2.0\admin\xxxx\udump\xxxx_ora_20176.trc: ORA-00600: internal error code, arguments: [2662], [1], [3367742663], [1], [3367742798], [4194313], [], [] Fri Dec 25 12:20:05 2015 Error 600 happened during db open, shutting down database USER: terminating instance due to error 600 Fri Dec 25 12:20:05 2015 Errors in file d:\oracle\product\10.2.0\admin\xxxx\bdump\xxxx_p015_21472.trc: ORA-00600: internal error code, arguments: [15784], [600], [], [], [], [], [], [] ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], [] Fri Dec 25 12:20:05 2015 Errors in file d:\oracle\product\10.2.0\admin\xxxx\bdump\xxxx_p014_9072.trc: ORA-00600: internal error code, arguments: [15784], [600], [], [], [], [], [], [] ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], [] |
看到这里2个错误,很多人可能会很迷惑,到底数据库无法open是因为哪个错误导致的呢? 这里稍微注意一下前后报错的顺序就知道了,很明显是前面的ora-00600 [2662]错误导致无法open resetlogs完成。
对于ora-00600 [2662]错误这里不再解释了,SCN的问题。这里直接推进scn即可。
将数据库启动到mount状态,执行如下命令:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
SQL> alter system set job_queue_processes=0; Session altered. SQL> startup mount ORACLE instance started. Total System Global Area 7516192768 bytes Fixed Size 2175728 bytes Variable Size 1232850192 bytes Database Buffers 6274678784 bytes Redo Buffers 6488064 bytes Database mounted. SQL> alter session set events 'IMMEDIATE trace name adjust_scn level 10'; Session altered. SQL> alter database open; Database altered. |
很顺利就打开了数据库。由于之前朋友已经创建了新的undo表空间,我只需要帮忙把undo重建就完事了,结束支持任务。尝试进行drop tablespace undotbs1 including contents and datafiles时,遇到ora-01548错误。
在alter session set “_smu_debug_mode” = 4; 后都无法进行drop,那只能继续使用下隐含参数了:
1 2 |
_offline_rollback_segments _corrupted_rollback_segments |
通过这2个隐含参数,可以顺利将旧的undo 表空间drop掉,然后修改pfile文件去掉不需要的一些隐含参数,顺利打开数据库。
到这里结束我的半小时支持友情支持任务!悲剧的下午又开始优化某客户的SQL。。。。。
备注:
1) 对于这种异常强制打开的数据库,建议进行全库检查,包括是否有坏块等,甚至检查数据字典是否一致等等(实际上检查之前的alert log发现, 之前数据库在恢复过程中就报了不少的坏块)
2) 在这个年代,还有3TB的数据库,而且是windows,更悲剧的还是非归档,没有备份,这是不应该的。
3) 我们应该多考虑下系统的容灾建设,哪怕是有备份。
Leave a Reply
You must be logged in to post a comment.