15 TB 3节点RAC 的恢复记录
本站文章除注明转载外,均为本站原创: 转载自love wife love life —Roger的Oracle/MySQL/PostgreSQL数据恢复博客
本文链接地址: 15 TB 3节点RAC 的恢复记录
某客户的业务系统(15TB,3节点RAC) 由于abort关闭之后,导致数据库无法正常open. 在Open时报错如下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
Tue Oct 15 14:06:36 2013 Trace dumping is performing id=[cdmp_20131015140636] Errors in file /oracle/app/product/diag/rdbms/xxxx/xxxx/trace/xxxx03_ora_49348828.trc: ORA-00308: cannot open archived log '/arch/xxxx/2_30992_766859529.dbf' ORA-17503: ksfdopn:4 Failed to open file /arch/xxxx/2_30992_766859529.dbf ORA-17500: ODM err:File does not exist ORA-00600: internal error code, arguments: [4097], [3909], [23], [121212], [], [], [], [], [], [], [], [] Errors in file /oracle/app/product/diag/rdbms/xxxx/xxxx03/trace/xxxx03_ora_49348828.trc: ORA-00600: internal error code, arguments: [4097], [3909], [23], [121212], [], [], [], [], [], [], [], [] Errors in file /oracle/app/product/diag/rdbms/xxxx/xxxx03/trace/xxxx03_ora_49348828.trc: ORA-00600: internal error code, arguments: [4097], [3909], [23], [121212], [], [], [], [], [], [], [], [] Error 600 happened during db open, shutting down database USER (ospid: 49348828): terminating the instance due to error 600 Tue Oct 15 14:06:37 2013 |
但看上面的错误,或许有人以为这是回档有问题,或者以为是AIX操作系统有问题。 其实不然,why ?
我想问一下,实例恢复需要archivelog吗?
因此我们可以确认,数据库无法正常open的原因是ora-00600 [4097]错误在作祟。 关于这个错误,不想描述的太多。
网上也有很多文章,教你通过屏蔽回滚段来强制拉库。
但是这里,我想知道是,为什么会抛出这个错误。
我们来看Metalink 文档关于该错误的一个标准解释:
1 2 3 |
We are accessing a rollback segment header to see if a transaction has been committed. However, the xid given is in the future of the transaction table. This could be due to a rollback segment corruption issue OR you might be hitting the following known problem |
根据文档解释,产生该错误的原因是oracle在open时会去读取回滚段头中的事务表信息,以此来判断是否已经提交.
当发现某个事务(XID)的warp#比当前数据库的最大值都还要大时,将出现该错误。 下面我们来看下trace文件:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
Dump continued from file: /oracle/app/product/diag/rdbms/xxxx/xxxx03/trace/xxxx03_ora_17629274.trc ORA-00600: internal error code, arguments: [4097], [3909], [23], [121212], [], [], [], [], [], [], [], [] ========= Dump for incident 526533 (ORA 600 [4097]) ======== ----- Beginning of Customized Incident Dump(s) ----- xid: 0x0f45.017.0001d97c GLOBAL CACHE ELEMENT DUMP (address: 0x7000000ddef7278): id1: 0x26d7d0 id2: 0x5 pkey: TS#5 block: (5/2545616) lock: S rls: 0x0 acq: 0x0 latch: 5 flags: 0x20 fair: 0 recovery: 0 fpin: 'ktuwh02: ktugus' bscn: 0xc54.80f5a25a bctx: 0x0 write: 0 scan: 0x0 lcp: 0x0 lnk: [NULL] lch: [0x700000223e151e8,0x700000223e151e8] seq: 7 hist: 239 143:0 16 143:5 208 352 32 LIST OF BUFFERS LINKED TO THIS GLOBAL CACHE ELEMENT: flg: 0x00000000 state: SCURRENT tsn: 5 tsh: 0 mode: SHR foq: 0 pin: 'ktuwh02: ktugus' addr: 0x700000223e150b8 obj: INVALID cls: UNDO HEAD bscn: 0xc54.80f5a25a |
oracle internal错误如下:
ORA-00600: internal error code, arguments: [4097], [3909], [23], [121212]
从trace 内容下面的信息,可以判断出oracle在open的时候,是在对事务XID 0x0f45.017.0001d97c
进行操作时无法正常进行,进而抛出ora-00600 内部错误。
关于XID的结构如下:
XID= 0x0f45.017.0001d97c
0f45: 表示回滚段编号,转换为10进制后为3909
017:表示slot编号,转换为10进制后为23
0001d97c:表示wrap#号,转换为10进制后为121212
而从trace下面的内容可以看出,该事务操作的数据块是(5/2545616),即datafile 5,block 254516.
同时,在对数据库open之前,进行10046 trace跟踪时,从跟踪内容也可以确认,数据库在open时在
访问数据块file 5 block 2545616时出现异常,进而导致数据库无法打开,如下:
1 2 3 4 5 6 7 8 |
EXEC #2:c=0,e=68,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=3,plh=4258302260,tim=32037576906172 FETCH #2:c=0,e=11,p=0,cr=3,cu=0,mis=0,r=1,dep=1,og=3,plh=4258302260,tim=32037576906202 CLOSE #2:c=0,e=4,dep=1,type=3,tim=32037576906229 WAIT #1: nam='db file sequential read' ela= 398 file#=4 block#=27168 blocks=1 obj#=0 tim=32037576906669 WAIT #1: nam='db file sequential read' ela= 255 file#=4 block#=94014 blocks=1 obj#=0 tim=32037576906991 WAIT #1: nam='db file sequential read' ela= 267 file#=5 block#=2545616 blocks=1 obj#=0 tim=32037576907293 Incident 598501 created, dump file: /oracle/app/product/diag/rdbms/xxxx/xxxx03/incident/incdir_598501/xxxx03_ora_15925410_i598501.trc ORA-00600: internal error code, arguments: [4097], [3909], [23], [121212], [], [], [], [], [], [], [], [] |
到这里,我们可以确认问题出在回滚段上。按理说这里应该存在活动事务,然而我并没有发现:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 |
index state cflags wrap# uel scn dba parent-xid nub stmt_num cmt ------------------------------------------------------------------------------------------------ 0x00 9 0x00 0x1d963 0x0015 0x0c52.0abb4e53 0x014197c8 0x0000.000.00000000 0x00000001 0x00000000 1381038712 0x01 9 0x00 0x1d952 0x001f 0x0c52.0ab7658c 0x01411a9f 0x0000.000.00000000 0x00000001 0x00000000 1381032874 0x02 9 0x17 0x1d981 0x0016 0x0c54.80ef57d6 0x014197ca 0x0000.000.00000000 0x00000001 0x014197ca 1381802983 0x03 9 0x00 0x1d970 0x0014 0x0c52.0ac0927b 0x014197ca 0x0000.000.00000000 0x00000001 0x00000000 1381045593 0x04 9 0x00 0x1d95f 0x001b 0x0c52.0abd8064 0x014197c9 0x0000.000.00000000 0x00000001 0x00000000 1381040912 0x05 9 0x00 0x1d95e 0x0006 0x0c52.0ab83478 0x01411a9f 0x0000.000.00000000 0x00000001 0x00000000 1381034672 0x06 9 0x00 0x1d96d 0x0008 0x0c52.0ab87c76 0x01411a9f 0x0000.000.00000000 0x00000001 0x00000000 1381034803 0x07 9 0x03 0x1d90c 0x001c 0x0c54.80f454d4 0x014197cd 0x0000.000.00000000 0x00000001 0x014197cd 1381803275 0x08 9 0x00 0x1d96b 0x000a 0x0c52.0ab8d83e 0x014197c8 0x0000.000.00000000 0x00000001 0x00000000 1381035069 0x09 9 0x00 0x1d93a 0x001a 0x0c52.0abe36ce 0x014197ca 0x0000.000.00000000 0x00000001 0x00000000 1381042341 0x0a 9 0x03 0x1d969 0x000e 0x0c52.0ab97338 0x014197c8 0x0000.000.00000000 0x00000001 0x014197c8 1381035606 0x0b 9 0x00 0x1d978 0x0007 0x0c54.80f3d624 0x014197cd 0x0000.000.00000000 0x00000001 0x00000000 1381803242 0x0c 9 0x17 0x1d967 0x001e 0x0c54.80f2d421 0x014197cc 0x0000.000.00000000 0x00000001 0x014197cc 1381803174 0x0d 9 0x17 0x1d966 0x0019 0x0c54.80f1250d 0x014197cb 0x0000.000.00000000 0x00000001 0x014197cb 1381803094 0x0e 9 0x00 0x1d965 0x0020 0x0c52.0ab99ed0 0x014197c8 0x0000.000.00000000 0x00000001 0x00000000 1381035752 0x0f 9 0x00 0x1d964 0xffff 0x0c54.80f5a25a 0x014197ce 0x0000.000.00000000 0x00000002 0x00000000 1381803359 0x10 9 0x00 0x1d963 0x0000 0x0c52.0abb2bf4 0x014197c8 0x0000.000.00000000 0x00000001 0x00000000 1381038687 0x11 9 0x00 0x1d952 0x0010 0x0c52.0abaefc0 0x014197c8 0x0000.000.00000000 0x00000001 0x00000000 1381038651 0x12 9 0x00 0x1d8f1 0x0003 0x0c52.0abff18d 0x014197ca 0x0000.000.00000000 0x00000001 0x00000000 1381044622 0x13 9 0x00 0x1d960 0x000d 0x0c54.80f0697d 0x014197cb 0x0000.000.00000000 0x00000001 0x00000000 1381803039 0x14 9 0x00 0x1d96f 0x001d 0x0c52.0ac0bd84 0x014197ca 0x0000.000.00000000 0x00000001 0x00000000 1381045638 0x15 9 0x00 0x1d96e 0x0021 0x0c52.0abbe7e3 0x014197c8 0x0000.000.00000000 0x00000001 0x00000000 1381039204 0x16 9 0x00 0x1d97d 0x0013 0x0c54.80f013a2 0x014197cb 0x0000.000.00000000 0x00000001 0x00000000 1381803019 0x17 9 0x00 0x1d96c 0x0001 0x0c52.0ab6d367 0x01411a9f 0x0000.000.00000000 0x00000001 0x00000000 1381032050 0x18 9 0x00 0x1d94b 0x0012 0x0c52.0abf8955 0x014197ca 0x0000.000.00000000 0x00000001 0x00000000 1381044066 0x19 9 0x03 0x1d94a 0x000c 0x0c54.80f22b8e 0x014197cc 0x0000.000.00000000 0x00000001 0x014197cc 1381803133 0x1a 9 0x00 0x1d969 0x0018 0x0c52.0abeabbd 0x014197ca 0x0000.000.00000000 0x00000001 0x00000000 1381042812 0x1b 9 0x00 0x1d968 0x0009 0x0c52.0abde1cb 0x014197ca 0x0000.000.00000000 0x00000002 0x00000000 1381041868 0x1c 9 0x03 0x1d967 0x000f 0x0c54.80f4e901 0x014197cd 0x0000.000.00000000 0x00000001 0x014197cd 1381803313 0x1d 9 0x00 0x1d966 0x0002 0x0c52.0ac116f6 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1381045850 0x1e 9 0x00 0x1d965 0x000b 0x0c54.80f34151 0x014197cd 0x0000.000.00000000 0x00000001 0x00000000 1381803201 0x1f 9 0x00 0x1d954 0x0005 0x0c52.0ab7b73c 0x01411a9f 0x0000.000.00000000 0x00000001 0x00000000 1381033709 0x20 9 0x00 0x1d943 0x0011 0x0c52.0aba5120 0x014197c8 0x0000.000.00000000 0x00000001 0x00000000 1381037404 0x21 9 0x00 0x1d962 0x0004 0x0c52.0abc21a6 0x014197c8 0x0000.000.00000000 0x00000001 0x00000000 1381039267 |
所以,只能说该数据库在被强制关闭时oracle并没有来得及去更新warp#,最终出现了这个情况。 当然解决方案比较简单,屏蔽掉回滚段即可。话说客户这里,我strings一下发现有超过10000个回滚段,哈哈~~~。 还好,顺利打开数据库,且从我们的分析来看,没有数据丢失.
3 Responses to “15 TB 3节点RAC 的恢复记录”
是否需要重建UNDO表空间?
这里其实不需要,open之后,把问题回滚段drop即可。
安全起见可以重建undo!
[…] 详见原本博客链接地址:15 TB 3节点RAC 的恢复记录 本条目发布于 2013 年 10 月 25 日。属于 数据库教程 分类。作者是 admin。 0 次浏览 document.getElementById("bdshell_js").src = "http://bdimg.share.baidu.com/static/js/shell_v2.js?cdnversion=" + Math.ceil(new Date()/3600000) […]
Leave a Reply
You must be logged in to post a comment.