非归档恢复的一个模拟例子
本站文章除注明转载外,均为本站原创: 转载自love wife love life —Roger的Oracle/MySQL/PostgreSQL数据恢复博客
本文链接地址: 非归档恢复的一个模拟例子
培训班一个学生模拟非归档数据恢复,始终无法打开数据库,我在自己的vm 将其打包的文件进行了恢复,如下是完整的恢复和分析过程,供大家参考!
++++ 编辑pfile并创建相关目录
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
*.audit_file_dest='/oracle/admin/cos11g/adump' *.audit_trail='db' *.compatible='11.2.0.0.0' *.control_files='/oracle/database/cos11g/control01.ctl' *.db_block_size=8192 *.db_domain='' *.db_name='cos11g' #*.db_recovery_file_dest='/u01/app/oracle/fast_recovery_area' #*.db_recovery_file_dest_size=4322230272 *.diagnostic_dest='/oracle' *.dispatchers='(PROTOCOL=TCP) (SERVICE=cos11gXDB)' *.memory_target=383286272 *.open_cursors=300 *.processes=150 *.remote_login_passwordfile='EXCLUSIVE' *.undo_tablespace='UNDOTBS1' *._allow_resetlogs_corruption=TRUE *._allow_error_simulation=TRUE #*.undo_management='manual' [oracle@11gR2texing cos11g]$ mkdir -p /oracle/admin/cos11g/adump [oracle@11gR2texing cos11g]$ |
++++mount启动数据库
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 |
[oracle@11gR2texing database]$ export ORACLE_SID=cos11g [oracle@11gR2texing database]$ sqlplus "/as sysdba" SQL*Plus: Release 11.2.0.3.0 Production on Fri Oct 18 08:21:48 2013 Copyright (c) 1982, 2011, Oracle. All rights reserved. Connected to an idle instance. SQL> startup mount pfile='/oracle/database/init.ora'; ORACLE instance started. Total System Global Area 384143360 bytes Fixed Size 2228624 bytes Variable Size 255856240 bytes Database Buffers 121634816 bytes Redo Buffers 4423680 bytes Database mounted. SQL> set lines 200 pagesize 200 SQL> col name for a65 SQL> select file#,name,checkpoint_change# from v$datafile order by 1; FILE# NAME CHECKPOINT_CHANGE# ---------- ----------------------------------------------------------------- ------------------ 1 /u01/app/oracle/oradata/cos11g/system01.dbf 2223501 2 /u01/app/oracle/oradata/cos11g/sysaux01.dbf 2223501 3 /u01/app/oracle/oradata/cos11g/undotbs01.dbf 2223501 4 /u01/app/oracle/oradata/cos11g/users01.dbf 2223501 5 /u01/app/oracle/oradata/cos11g/example01.dbf 2223501 SQL> col member for a65 SQL> select member from v$Logfile; MEMBER ----------------------------------------------------------------- /u01/app/oracle/oradata/cos11g/redo03.log /u01/app/oracle/oradata/cos11g/redo02.log /u01/app/oracle/oradata/cos11g/redo01.log |
+++++ rename datafile & logfile
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 |
SQL> alter database rename file '/u01/app/oracle/oradata/cos11g/system01.dbf' to '/oracle/database/cos11g/system01.dbf' ; Database altered. SQL> alter database rename file '/u01/app/oracle/oradata/cos11g/sysaux01.dbf' to '/oracle/database/cos11g/sysaux01.dbf' ; alter database rename file '/u01/app/oracle/oradata/cos11g/undotbs01.dbf' to '/oracle/database/cos11g/undotbs01.dbf'; Database altered. SQL> Database altered. SQL> alter database rename file '/u01/app/oracle/oradata/cos11g/users01.dbf' to '/oracle/database/cos11g/users01.dbf' ; Database altered. SQL> alter database rename file '/u01/app/oracle/oradata/cos11g/example01.dbf' to '/oracle/database/cos11g/example01.dbf'; alter database rename file '/u01/app/oracle/oradata/cos11g/redo01.log' to '/oracle/database/cos11g/redo01.log'; Database altered. SQL> SQL> SQL> select * from v$log; GROUP# THREAD# SEQUENCE# BYTES BLOCKSIZE MEMBERS ARC STATUS FIRST_CHANGE# FIRST_TIM NEXT_CHANGE# NEXT_TIME ---------- ---------- ---------- ---------- ---------- ---------- --- ---------------- ------------- --------- ------------ --------- 1 1 1 52428800 512 1 NO CURRENT 1223086 28-AUG-14 2.8147E+14 3 1 0 52428800 512 1 YES UNUSED 0 0 2 1 0 52428800 512 1 YES UNUSED 0 0 SQL> alter database rename file '/u01/app/oracle/oradata/cos11g/redo01.log' to '/oracle/database/cos11g/back/redo01.log'; alter database rename file '/u01/app/oracle/oradata/cos11g/redo01.log' to '/oracle/database/cos11g/back/redo01.log' * ERROR at line 1: ORA-01511: error in renaming log/data files ORA-01512: error renaming log file /u01/app/oracle/oradata/cos11g/redo01.log - new file /oracle/database/cos11g/back/redo01.log not found ORA-27046: file size is not a multiple of logical block size Additional information: 1 SQL> SQL> oradebug setmypid Statement processed. SQL> alter database backup controlfile to trace; alter database backup controlfile to trace * ERROR at line 1: ORA-16433: The database must be opened in read/write mode. |
由于他这里redo有问题,所以只能当redo全丢失来进行处理了。这是为什么rename redo 无法成功的原因。
对于ORA-16433错误,可以通过重建controlfile来解决。
++++++ recreate controlfile
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 |
[oracle@11gR2texing cos11g]$ dd if=system01.dbf of=aa.log bs=8192 skip=801 count=1 1+0 records in 1+0 records out 8192 bytes (8.2 kB) copied, 0.0257745 s, 318 kB/s [oracle@11gR2texing cos11g]$ strings aa.log GLOBAL_DB_NAME COS11G Global database name, DBTIMEZONE 00:00 DB time zone, NO_USERID_VERIFIER_SALT 75282D7EC73B5C40AFF387B2BF5692A0, WORKLOAD_REPLAY_MODE bPREPARE implies external replay clients can connect; REPLAY implies workload replay is in progress, WORKLOAD_CAPTURE_MODE /CAPTURE implies workload capture is in progress, EXPORT_VIEWS_VERSION Export views revision #, DEFAULT_PERMANENT_TABLESPACE USERS$Name of default permanent tablespace, GLOBAL_DB_NAME SEEDDATA Global database name, NLS_RDBMS_VERSION 11.2.0.3.0 RDBMS version for NLS parameters, NLS_NCHAR_CHARACTERSET AL16UTF16 NCHAR Character set, NLS_NCHAR_CONV_EXCP FALSE NLS conversion exception, NLS_LENGTH_SEMANTICS BYTE NLS length semantics, NLS_COMP BINARY NLS comparison, NLS_DUAL_CURRENCY Dual currency symbol, NLS_TIMESTAMP_TZ_FORMAT DD-MON-RR HH.MI.SSXFF AM TZR Timestamp with timezone format, NLS_TIME_TZ_FORMAT HH.MI.SSXFF AM TZR Time with timezone format, NLS_TIMESTAMP_FORMAT DD-MON-RR HH.MI.SSXFF AM Time stamp format, NLS_TIME_FORMAT HH.MI.SSXFF AM Time format, NLS_SORT BINARY Linguistic definition, NLS_DATE_LANGUAGE AMERICAN Date language, NLS_DATE_FORMAT DD-MON-RR Date format, NLS_CALENDAR GREGORIAN Calendar system, NLS_CHARACTERSET AL32UTF8 Character set, NLS_NUMERIC_CHARACTERS Numeric characters, NLS_ISO_CURRENCY AMERICA ISO currency, NLS_CURRENCY Local currency, NLS_TERRITORY AMERICA Territory, NLS_LANGUAGE AMERICAN Language, DEFAULT_TBS_TYPE SMALLFILE Default tablespace type, DST_SECONDARY_TT_VERSION 0'Version of secondary timezone data file, DST_PRIMARY_TT_VERSION 14%Version of primary timezone data file, DST_UPGRADE_STATE NONE&State of Day Light Saving Time Upgrade< DBTIMEZONE -07:00 DB time zone, TDE_MASTER_KEY_ID, Flashback Timestamp TimeZone GMT"Flashback timestamp created in GMT, DEFAULT_TEMP_TABLESPACE TEMP$Name of default temporary tablespace, DEFAULT_EDITION ORA$BASE$Name of the database default edition, DEFAULT_PERMANENT_TABLESPACE SYSTEM$Name of default permanent tablespace, DEFAULT_TEMP_TABLESPACE SYSTEM$Name of default temporary tablespace, DICT.BASE 2 dictionary base tables version # SQL> startup nomount pfile='/oracle/database/init.ora'; ORACLE instance started. Total System Global Area 384143360 bytes Fixed Size 2228624 bytes Variable Size 255856240 bytes Database Buffers 121634816 bytes Redo Buffers 4423680 bytes SQL> CREATE CONTROLFILE REUSE DATABASE "COS11G" RESETLOGS NOARCHIVELOG 2 MAXLOGFILES 20 3 MAXLOGMEMBERS 3 4 MAXDATAFILES 1024 5 MAXINSTANCES 16 6 MAXLOGHISTORY 292 7 LOGFILE 8 GROUP 1 '/oracle/database/cos11g/redo01.log' SIZE 50M, 9 GROUP 2 '/oracle/database/cos11g/redo02.log' SIZE 50M, 10 GROUP 3 '/oracle/database/cos11g/redo03.log' SIZE 50M 11 -- STANDBY LOGFILE 12 DATAFILE 13 '/oracle/database/cos11g/system01.dbf', 14 '/oracle/database/cos11g/sysaux01.dbf', 15 '/oracle/database/cos11g/users01.dbf', 16 '/oracle/database/cos11g/undotbs01.dbf', 17 '/oracle/database/cos11g/example01.dbf' 18 CHARACTER SET AL32UTF8 19 ; Control file created. SQL> |
++++RECOVER
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 |
SQL> recover database using backup controlfile until cancel; ORA-00279: change 2223501 generated at 08/28/2014 11:22:42 needed for thread 1 ORA-00289: suggestion : /oracle/product/11.2.0/db_1/dbs/arch1_1_856783361.dbf ORA-00280: change 2223501 for thread 1 is in sequence #1 Specify log: {<RET>=suggested | filename | AUTO | CANCEL} cancel ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below ORA-01194: file 1 needs more recovery to be consistent ORA-01110: data file 1: '/oracle/database/cos11g/system01.dbf' ORA-01112: media recovery not started SQL> select file#,checkpoint_change# from v$datafile order by 1; FILE# CHECKPOINT_CHANGE# ---------- ------------------ 1 2223501 2 2223501 3 2223501 4 2223501 5 2223501 SQL> c/datafile/datafile_header 1* select file#,checkpoint_change# from v$datafile_header order by 1 SQL> / FILE# CHECKPOINT_CHANGE# ---------- ------------------ 1 2223501 2 2223501 3 2223501 4 2223501 5 2223501 SQL> alter database open resetlogs; alter database open resetlogs * ERROR at line 1: ORA-01092: ORACLE instance terminated. Disconnection forced ORA-00704: bootstrap process failure ORA-00604: error occurred at recursive SQL level 1 ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6_3654194381$" too small Process ID: 19707 Session ID: 1 Serial number: 5 SQL> |
尝试使用隐含参数,以及屏蔽回滚段,发现错误依旧。对于报错,通常的思路是跟踪,去识别Oracle
为什么会在这个地方报错,下面我们在open resetlogs之前打开10046 event的跟踪:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 |
SQL> startup mount pfile='/oracle/database/init.ora'; ORACLE instance started. Total System Global Area 384143360 bytes Fixed Size 2228624 bytes Variable Size 255856240 bytes Database Buffers 121634816 bytes Redo Buffers 4423680 bytes Database mounted. SQL> recover database using backup controlfile until cancel; ORA-00279: change 2223505 generated at 10/18/2013 08:51:28 needed for thread 1 ORA-00289: suggestion : /oracle/product/11.2.0/db_1/dbs/arch1_1_829126281.dbf ORA-00280: change 2223505 for thread 1 is in sequence #1 Specify log: {<RET>=suggested | filename | AUTO | CANCEL} cancel ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below ORA-01194: file 1 needs more recovery to be consistent ORA-01110: data file 1: '/oracle/database/cos11g/system01.dbf' ORA-01112: media recovery not started SQL> oradebug setmypid Statement processed. SQL> oradebug event 10046 trace name context forever,level 12 Statement processed. SQL> oradebug tracefile_name /oracle/diag/rdbms/cos11g/cos11g/trace/cos11g_ora_20178.trc SQL> alter database open resetlogs; alter database open resetlogs * ERROR at line 1: ORA-01092: ORACLE instance terminated. Disconnection forced ORA-00704: bootstrap process failure ORA-00604: error occurred at recursive SQL level 1 ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6_3654194381$" too small Process ID: 20178 Session ID: 1 Serial number: 7 |
下面分析10046 trace是重点。通常我们的思路是直接vi 编辑定位到trace的最后,会看到如下的信息:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 |
===================== PARSING IN CURSOR #140237859830552 len=142 dep=2 uid=0 oct=3 lid=0 tim=1382101211544386 hv=361892850 ad='768c34f8' sqlid='7bd391hat42zk' select /*+ rule */ name,file#,block#,status$,user#,undosqn,xactsqn,scnbas,scnwrp,DECODE(inst#,0,NULL,inst#),ts#,spare1 from undo$ where us#=:1 END OF STMT PARSE #140237859830552:c=1000,e=846,p=0,cr=0,cu=0,mis=1,r=0,dep=2,og=3,plh=0,tim=1382101211544383 BINDS #140237859830552: Bind#0 oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00 oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0 kxsbbbfp=7f8babd1b6d0 bln=22 avl=02 flg=05 value=6 EXEC #140237859830552:c=1000,e=1144,p=0,cr=0,cu=0,mis=1,r=0,dep=2,og=3,plh=906473769,tim=1382101211545671 WAIT #140237859830552: nam='db file sequential read' ela= 25 file#=1 block#=321 blocks=1 obj#=34 tim=1382101211545793 WAIT #140237859830552: nam='db file sequential read' ela= 16 file#=1 block#=225 blocks=1 obj#=15 tim=1382101211545892 FETCH #140237859830552:c=0,e=225,p=2,cr=2,cu=0,mis=0,r=1,dep=2,og=3,plh=906473769,tim=1382101211545944 STAT #140237859830552 id=1 cnt=1 pid=0 pos=1 obj=15 op='TABLE ACCESS BY INDEX ROWID UNDO$ (cr=2 pr=2 pw=0 time=217 us)' STAT #140237859830552 id=2 cnt=1 pid=1 pos=1 obj=34 op='INDEX UNIQUE SCAN I_UNDO1 (cr=1 pr=1 pw=0 time=127 us)' CLOSE #140237859830552:c=0,e=10,dep=2,type=0,tim=1382101211546060 WAIT #140237859528568: nam='db file sequential read' ela= 26 file#=3 block#=208 blocks=1 obj#=0 tim=1382101211546288 WAIT #140237859528568: nam='db file sequential read' ela= 16 file#=3 block#=440 blocks=1 obj#=0 tim=1382101211546574 FETCH #140237859528568:c=4000,e=4057,p=5,cr=7,cu=0,mis=0,r=0,dep=1,og=4,plh=2853959010,tim=1382101211547438 ORA-00704: bootstrap process failure ORA-00604: error occurred at recursive SQL level 1 ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6_3654194381$" too small ORA-00704: bootstrap process failure ORA-00604: error occurred at recursive SQL level 1 ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6_3654194381$" too small |
从10046的跟踪来看,我们发现该SQL访问了如下的几个block:
1 2 3 4 |
WAIT #140237859830552: nam='db file sequential read' ela= 25 file#=1 block#=321 blocks=1 obj#=34 tim=1382101211545793 WAIT #140237859830552: nam='db file sequential read' ela= 16 file#=1 block#=225 blocks=1 obj#=15 tim=1382101211545892 WAIT #140237859528568: nam='db file sequential read' ela= 26 file#=3 block#=208 blocks=1 obj#=0 tim=1382101211546288 WAIT #140237859528568: nam='db file sequential read' ela= 16 file#=3 block#=440 blocks=1 obj#=0 tim=1382101211546574 |
很明显,前面2个block是system的,后面2个file 3 的block是undo的block。 其中block 208是undo段头block。
我的直观感觉是既然访问这2个block有问题,肯定是有事务,何不直接bbed修改一下呢? 下面进行了尝试:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 |
SQL> oradebug setmypid Statement processed. SQL> alter system dump datafile '/oracle/database/cos11g/undotbs01.dbf' block 208; System altered. SQL> oradebug tracefile_name /oracle/diag/rdbms/cos11g/cos11g/trace/cos11g_ora_20967.trc SQL> SQL> alter system dump datafile '/oracle/database/cos11g/system01.dbf' block 321; System altered. SQL> Start dump data block from file /oracle/database/cos11g/system01.dbf minblk 321 maxblk 321 V10 STYLE FILE HEADER: Compatibility Vsn = 186646528=0xb200000 Db ID=2845990550=0xa9a25e96, Db Name='COS11G' Activation ID=0=0x0 Control Seq=1388=0x56c, File size=90880=0x16300 File Number=1, Blksiz=8192, File Type=3 DATA Dump all the blocks in range: buffer tsn: 0 rdba: 0x00400141 (1/321) scn: 0x0000.000f24ae seq: 0x01 flg: 0x06 tail: 0x24ae0601 frmt: 0x02 chkval: 0x5aa6 type: 0x06=trans data Hex dump of block: st=0, typ_found=1 Dump of memory from 0x00007FF407277E00 to 0x00007FF407279E00 7FF407277E00 0000A206 00400141 000F24AE 06010000 [....A.@..$......] 。。。。。 7FF407279DF0 00000000 00000000 00000000 24AE0601 [...............$] Block header dump: 0x00400141 Object id on Block? Y seg/obj: 0x22 csc: 0x00.f24ad itc: 2 flg: - typ: 2 - INDEX fsl: 0 fnx: 0x0 ver: 0x01 Itl Xid Uba Flag Lck Scn/Fsc 0x01 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000 0x02 0x0000.019.00000012 0x00400217.001a.06 --U- 1 fsc 0x0000.000f24ae Leaf block dump =============== header address 140686068776540=0x7ff407277e5c kdxcolev 0 KDXCOLEV Flags = - - - kdxcolok 0 kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y kdxconco 1 kdxcosdc 0 kdxconro 21 kdxcofbo 78=0x4e kdxcofeo 7802=0x1e7a kdxcoavs 7724 kdxlespl 0 kdxlende 0 kdxlenxt 0=0x0 kdxleprv 0=0x0 kdxledsz 6 kdxlebksz 8032 row#0[8022] flag: ------, lock: 0, len=10, data:(6): 00 40 00 e1 00 00 col 0; len 1; (1): 80 ....... col 0; len 2; (2): c1 10 row#16[7846] flag: ------, lock: 0, len=11, data:(6): 00 40 00 e1 00 10 col 0; len 2; (2): c1 11 row#17[7835] flag: ------, lock: 0, len=11, data:(6): 00 40 00 e1 00 11 col 0; len 2; (2): c1 12 row#18[7824] flag: ------, lock: 0, len=11, data:(6): 00 40 00 e1 00 12 col 0; len 2; (2): c1 13 row#19[7813] flag: ------, lock: 0, len=11, data:(6): 00 40 00 e1 00 13 col 0; len 2; (2): c1 14 row#20[7802] flag: ------, lock: 2, len=11, data:(6): 00 40 00 e1 00 14 col 0; len 2; (2): c1 15 ----- end of leaf block dump ----- |
注意通过看blockdump,我们知道最后一行记录的lock byte为2. 这里我因此进行了修改(实际上这个事务状态是U,是不需要修改的,这一步骤是错误的思路)
下面先修改block 321:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 |
BBED> set dba 0x00400141 DBA 0x00400141 (4194625 1,321) BBED> map File: /oracle/database/cos11g/system01.dbf (1) Block: 321 Dba:0x00400141 ------------------------------------------------------------ KTB Data Block (Index Leaf) struct kcbh, 20 bytes @0 struct ktbbh, 72 bytes @20 struct kdxle, 32 bytes @92 sb2 kd_off[21] @124 ub1 freespace[7724] @166 ub1 rowdata[230] @7890 ub4 tailchk @8188 BBED> p *kd_off[20] rowdata[26] ----------- ub1 rowdata[26] @7916 0x00 BBED> d /v offset 7890 count 40 File: /oracle/database/cos11g/system01.dbf (1) Block: 321 Offsets: 7890 to 7929 Dba:0x00400141 ------------------------------------------------------- 00000000 00020040 00e10014 02c11500 l .......@.... 00004000 e1001302 c1140000 004000e1 l ..@......@. 001202c1 13000000 l ...... <16 bytes per line> BBED> modify /x 00 offset 7895 File: /oracle/database/cos11g/system01.dbf (1) Block: 321 Offsets: 7895 to 7934 Dba:0x00400141 ------------------------------------------------------------------------ 00004000 e1001402 c1150000 004000e1 001302c1 14000000 4000e100 1202c113 00000040 00e10011 <32 bytes per line> BBED> sum apply Check value for File 1, Block 321: current = 0xf8a6, required = 0xf8a6 BBED> verify DBVERIFY - Verification starting FILE = /oracle/database/cos11g/system01.dbf BLOCK = 321 DBVERIFY - Verification complete Total Blocks Examined : 1 Total Blocks Processed (Data) : 0 Total Blocks Failing (Data) : 0 Total Blocks Processed (Index): 1 Total Blocks Failing (Index): 0 Total Blocks Empty : 0 Total Blocks Marked Corrupt : 0 Total Blocks Influx : 0 Message 531 not found; product=RDBMS; facility=BBED BBED> |
下面修改block 225:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 |
BBED> set file 1 block 225 FILE# 1 BLOCK# 225 BBED> map File: /oracle/database/cos11g/system01.dbf (1) Block: 225 Dba:0x004000e1 ------------------------------------------------------------ KTB Data Block (Table/Cluster) struct kcbh, 20 bytes @0 struct ktbbh, 48 bytes @20 struct kdbh, 14 bytes @68 struct kdbt[1], 4 bytes @82 sb2 kdbr[21] @86 ub1 freespace[4287] @128 ub1 rowdata[3773] @4415 ub4 tailchk @8188 BBED> p ktbbh struct ktbbh, 48 bytes @20 ub1 ktbbhtyp @20 0x01 (KDDBTDATA) union ktbbhsid, 4 bytes @24 ub4 ktbbhsg1 @24 0x0000000f ub4 ktbbhod1 @24 0x0000000f struct ktbbhcsc, 8 bytes @28 ub4 kscnbas @28 0x00105ea6 ub2 kscnwrp @32 0x0000 sb2 ktbbhict @36 -2047 ub1 ktbbhflg @38 0x03 (KTBFONFL) ub1 ktbbhfsl @39 0x00 ub4 ktbbhfnx @40 0x00000000 struct ktbbhitl[0], 24 bytes @44 struct ktbitxid, 8 bytes @44 ub2 kxidusn @44 0x0000 ub2 kxidslt @46 0x0018 ub4 kxidsqn @48 0x00000014 struct ktbituba, 8 bytes @52 ub4 kubadba @52 0x00400220 ub2 kubaseq @56 0x001c ub1 kubarec @58 0x01 ub2 ktbitflg @60 0x2001 (KTBFUPB) union _ktbitun, 2 bytes @62 sb2 _ktbitfsc @62 0 ub2 _ktbitwrp @62 0x0000 ub4 ktbitbas @64 0x00105ea7 BBED> modify /x 80 offset 61 Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y File: /oracle/database/cos11g/system01.dbf (1) Block: 225 Offsets: 61 to 62 Dba:0x004000e1 ------------------------------------------------------------------------ 8000 <32 bytes per line> BBED> sum apply Check value for File 1, Block 225: current = 0x80b8, required = 0x80b8 BBED> verify DBVERIFY - Verification starting FILE = /oracle/database/cos11g/system01.dbf BLOCK = 225 Block Checking: DBA = 4194529, Block Type = KTB-managed data block data header at 0x7f717a7c7244 kdbchk: row locked by non-existent transaction table=0 slot=10 lockid=1 ktbbhitc=1 Block 225 failed with check code 6101 DBVERIFY - Verification complete Total Blocks Examined : 1 Total Blocks Processed (Data) : 1 Total Blocks Failing (Data) : 1 Total Blocks Processed (Index): 0 Total Blocks Failing (Index): 0 Total Blocks Empty : 0 Total Blocks Marked Corrupt : 0 Total Blocks Influx : 0 Message 531 not found; product=RDBMS; facility=BBED BBED> p *kdbr[10] rowdata[611] ------------ ub1 rowdata[611] @5026 0x2c BBED> x /r1nnnnnnnnnnnnnnn rowdata[611] @5026 ------------ flag@5026: 0x2c (KDRHFL, KDRHFF, KDRHFH) lock@5027: 0x01 cols@5028: 17 col 0[2] @5029: 10 col 1[21] @5032: -0 col 2[2] @5054: 1 col 3[2] @5057: 3 col 4[3] @5060: 272 col 5[5] @5064: 1072765 col 6[1] @5070: 0 col 7[3] @5072: 643 col 8[3] @5076: 124 col 9[1] @5080: 0 col 10[2] @5082: 3 col 11[2] @5085: 2 col 12[0] @5088: *NULL* col 13[0] @5089: *NULL* col 14[0] @5090: *NULL* col 15[0] @5091: *NULL* col 16[2] @5092: 2 BBED> modify /x 00 offset 5027 File: /oracle/database/cos11g/system01.dbf (1) Block: 225 Offsets: 5027 to 5028 Dba:0x004000e1 ------------------------------------------------------------------------ 0011 <32 bytes per line> BBED> sum apply Check value for File 1, Block 225: current = 0x81b8, required = 0x81b8 BBED> verify DBVERIFY - Verification starting FILE = /oracle/database/cos11g/system01.dbf BLOCK = 225 DBVERIFY - Verification complete Total Blocks Examined : 1 Total Blocks Processed (Data) : 1 Total Blocks Failing (Data) : 0 Total Blocks Processed (Index): 0 Total Blocks Failing (Index): 0 Total Blocks Empty : 0 Total Blocks Marked Corrupt : 0 Total Blocks Influx : 0 Message 531 not found; product=RDBMS; facility=BBED |
修改完成之后,尝试,发现仍然报错,如下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
SMON: enabling cache recovery ORA-01555 caused by SQL statement below (SQL ID: 3nkd3g3ju5ph1, Query Duration=0 sec, SCN: 0x0000.0021edd4): select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null Errors in file /oracle/diag/rdbms/cos11g/cos11g/trace/cos11g_ora_23058.trc: ORA-00704: bootstrap process failure ORA-00604: error occurred at recursive SQL level 1 ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6_3654194381$" too small Errors in file /oracle/diag/rdbms/cos11g/cos11g/trace/cos11g_ora_23058.trc: ORA-00704: bootstrap process failure ORA-00604: error occurred at recursive SQL level 1 ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6_3654194381$" too small Error 704 happened during db open, shutting down database USER (ospid: 23058): terminating the instance due to error 704 Instance terminated by USER, pid = 23058 ORA-1092 signalled during: alter database open resetlogs... opiodr aborting process unknown ospid (23058) as a result of ORA-1092 |
其中还遇到了让我觉得比较怪异的事情是,如果通过bbed修改undo$.status,强制将回滚段修改offline状态的话,那么会遇到如下的问题:
1 2 3 4 5 |
SMON: enabling cache recovery Errors in file /oracle/diag/rdbms/cos11g/cos11g/trace/cos11g_ora_22743.trc: ORA-00704: bootstrap process failure ORA-00604: error occurred at recursive SQL level 1 ORA-01173: data dictionary indicates missing data file from system tablespace |
最后再回过头来分析,发现其实犯了一个致命的错误。alert log的ora-01555报错SQL其实和我们10046 最后看到的sql并不是同一个,
换句话讲,直接分析10046 trace的最后部分,是错误的,我们应该找ora-01555报错的这个SQL。
由于10046 trace内容比较多,我们直接切换到最后去查看分析,这样是有问题的。 于是尝试换个思路:
搜索报错的SQLID:3nkd3g3ju5ph1,然后定位到parse ID,然后进行过滤:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 |
[oracle@11gR2texing database]$ cat /oracle/diag/rdbms/cos11g/cos11g/trace/cos11g_ora_20178.trc |grep '140237859528568' PARSING IN CURSOR #140237859528568 len=202 dep=2 uid=0 oct=3 lid=0 tim=1382101211454266 hv=3819099649 ad='768c9448' sqlid='3nkd3g3ju5ph1' PARSE #140237859528568:c=999,e=489,p=0,cr=0,cu=0,mis=1,r=0,dep=2,og=4,plh=0,tim=1382101211454264 BINDS #140237859528568: EXEC #140237859528568:c=0,e=727,p=0,cr=0,cu=0,mis=1,r=0,dep=2,og=4,plh=2853959010,tim=1382101211455068 WAIT #140237859528568: nam='db file sequential read' ela= 92 file#=1 block#=345 blocks=1 obj#=37 tim=1382101211455209 WAIT #140237859528568: nam='db file sequential read' ela= 27 file#=1 block#=45803 blocks=1 obj#=37 tim=1382101211455295 WAIT #140237859528568: nam='db file sequential read' ela= 9 file#=1 block#=5457 blocks=1 obj#=37 tim=1382101211455377 FETCH #140237859528568:c=1000,e=335,p=3,cr=4,cu=0,mis=0,r=1,dep=2,og=4,plh=2853959010,tim=1382101211455419 STAT #140237859528568 id=1 cnt=1 pid=0 pos=1 obj=18 op='TABLE ACCESS BY INDEX ROWID OBJ$ (cr=4 pr=3 pw=0 time=338 us)' STAT #140237859528568 id=2 cnt=1 pid=1 pos=1 obj=37 op='INDEX RANGE SCAN I_OBJ2 (cr=3 pr=3 pw=0 time=324 us)' CLOSE #140237859528568:c=0,e=92,dep=2,type=3,tim=1382101211455548 BINDS #140237859528568: EXEC #140237859528568:c=0,e=108,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,plh=2853959010,tim=1382101211464071 WAIT #140237859528568: nam='db file sequential read' ela= 14 file#=1 block#=9615 blocks=1 obj#=37 tim=1382101211464119 WAIT #140237859528568: nam='db file sequential read' ela= 14 file#=1 block#=4615 blocks=1 obj#=18 tim=1382101211464228 FETCH #140237859528568:c=1000,e=166,p=2,cr=4,cu=0,mis=0,r=1,dep=2,og=4,plh=2853959010,tim=1382101211464251 CLOSE #140237859528568:c=0,e=8,dep=2,type=3,tim=1382101211464280 BINDS #140237859528568: EXEC #140237859528568:c=0,e=351,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,plh=2853959010,tim=1382101211524675 FETCH #140237859528568:c=0,e=64,p=0,cr=4,cu=0,mis=0,r=1,dep=2,og=4,plh=2853959010,tim=1382101211524786 CLOSE #140237859528568:c=0,e=26,dep=2,type=3,tim=1382101211524878 BINDS #140237859528568: EXEC #140237859528568:c=1000,e=461,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=2853959010,tim=1382101211528545 WAIT #140237859528568: nam='db file sequential read' ela= 31 file#=1 block#=24412 blocks=1 obj#=37 tim=1382101211528688 WAIT #140237859528568: nam='db file sequential read' ela= 29 file#=1 block#=245 blocks=1 obj#=18 tim=1382101211528853 FETCH #140237859528568:c=0,e=333,p=2,cr=4,cu=0,mis=0,r=1,dep=1,og=4,plh=2853959010,tim=1382101211528929 CLOSE #140237859528568:c=0,e=28,dep=1,type=3,tim=1382101211529025 BINDS #140237859528568: EXEC #140237859528568:c=0,e=214,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=2853959010,tim=1382101211537602 FETCH #140237859528568:c=0,e=45,p=0,cr=4,cu=0,mis=0,r=1,dep=1,og=4,plh=2853959010,tim=1382101211537675 CLOSE #140237859528568:c=0,e=14,dep=1,type=3,tim=1382101211537728 BINDS #140237859528568: EXEC #140237859528568:c=0,e=201,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=2853959010,tim=1382101211540521 WAIT #140237859528568: nam='db file sequential read' ela= 28 file#=1 block#=9612 blocks=1 obj#=37 tim=1382101211540628 FETCH #140237859528568:c=0,e=147,p=1,cr=4,cu=0,mis=0,r=1,dep=1,og=4,plh=2853959010,tim=1382101211540695 CLOSE #140237859528568:c=0,e=15,dep=1,type=3,tim=1382101211540750 BINDS #140237859528568: EXEC #140237859528568:c=1000,e=287,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=2853959010,tim=1382101211543353 WAIT #140237859528568: nam='db file sequential read' ela= 18 file#=1 block#=20804 blocks=1 obj#=37 tim=1382101211543439 WAIT #140237859528568: nam='db file sequential read' ela= 26 file#=3 block#=208 blocks=1 obj#=0 tim=1382101211546288 WAIT #140237859528568: nam='db file sequential read' ela= 16 file#=3 block#=440 blocks=1 obj#=0 tim=1382101211546574 FETCH #140237859528568:c=4000,e=4057,p=5,cr=7,cu=0,mis=0,r=0,dep=1,og=4,plh=2853959010,tim=1382101211547438 |
大家注意看最后4行,通过过滤我们可以发现其实还有一个file 1 block 20804 也被递归SQL访问了,因此我们之前多次尝试都报错,
是因为直接vi 查看10046 trace的最后部分,没有发现而已,这是非常隐蔽的。
检查发现20804这个block 确实存在一个事务,如下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 |
BBED> set file 1 block 20804 FILE# 1 BLOCK# 20804 BBED> p ktbbh struct ktbbh, 72 bytes @20 ub1 ktbbhtyp @20 0x02 (KDDBTINDEX) union ktbbhsid, 4 bytes @24 ub4 ktbbhsg1 @24 0x00000025 ub4 ktbbhod1 @24 0x00000025 struct ktbbhcsc, 8 bytes @28 ub4 kscnbas @28 0x0012a91c ub2 kscnwrp @32 0x0000 sb2 ktbbhict @36 7938 ub1 ktbbhflg @38 0x02 (NONE) ub1 ktbbhfsl @39 0x00 ub4 ktbbhfnx @40 0x00000000 struct ktbbhitl[0], 24 bytes @44 struct ktbitxid, 8 bytes @44 ub2 kxidusn @44 0x0002 ub2 kxidslt @46 0x0015 ub4 kxidsqn @48 0x0000009f struct ktbituba, 8 bytes @52 ub4 kubadba @52 0x00c03a8d ub2 kubaseq @56 0x001c ub1 kubarec @58 0x01 ub2 ktbitflg @60 0xc000 (KTBFIBI, KTBFCOM) union _ktbitun, 2 bytes @62 sb2 _ktbitfsc @62 0 ub2 _ktbitwrp @62 0x0000 ub4 ktbitbas @64 0x0002a810 struct ktbbhitl[1], 24 bytes @68 struct ktbitxid, 8 bytes @68 ub2 kxidusn @68 0x0006 ub2 kxidslt @70 0x0014 ub4 kxidsqn @72 0x00000442 struct ktbituba, 8 bytes @76 ub4 kubadba @76 0x00c001b8 ub2 kubaseq @80 0x010e ub1 kubarec @82 0x11 ub2 ktbitflg @84 0x0001 (NONE) +++这里00表示存在未提交的事务 union _ktbitun, 2 bytes @86 sb2 _ktbitfsc @86 0 ub2 _ktbitwrp @86 0x0000 ub4 ktbitbas @88 0x00000000 BBED> modify /x 0080 File: /oracle/database/cos11g/system01.dbf (1) Block: 20804 Offsets: 84 to 87 Dba:0x00405144 ------------------------------------------------------------------------ 00800000 <32 bytes per line> BBED> sum apply Check value for File 1, Block 20804: current = 0xf4aa, required = 0xf4aa BBED> verify DBVERIFY - Verification starting FILE = /oracle/database/cos11g/system01.dbf BLOCK = 20804 Block Checking: DBA = 4215108, Block Type = KTB-managed data block **** actual rows locked by itl 2 = 1 != # in trans. header = 0 ---- end index block validation Block 20804 failed with check code 6401 DBVERIFY - Verification complete Total Blocks Examined : 1 Total Blocks Processed (Data) : 0 Total Blocks Failing (Data) : 0 Total Blocks Processed (Index): 1 Total Blocks Failing (Index): 1 Total Blocks Empty : 0 Total Blocks Marked Corrupt : 0 Total Blocks Influx : 0 Message 531 not found; product=RDBMS; facility=BBED |
修改的同时,我们还需要修改锁定行的lock byte位置,如果找到这一行呢? 很简单,我们通过dump block,然后直接
查看lock:2就行了,定位到是如下的行:
1 2 3 4 5 6 7 8 9 10 |
row#102[1300] flag: ------, lock: 2, len=32, data:(6): 00 41 42 1e 00 3d col 0; len 1; (1): 80 col 1; len 5; (5): 54 45 53 54 32 col 2; len 2; (2): c1 02 col 3; NULL col 4; NULL col 5; NULL col 6; len 2; (2): c1 03 col 7; len 1; (1): 80 col 8; len 4; (4): c3 08 45 2e |
下面我们讲该行的lock type修改为00即可,这样block的校验不再报错。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 |
BBED> p *kd_off[104] rowdata[4] ---------- ub1 rowdata[4] @1392 0x00 BBED> x /rcccccnnnnnn rowdata[4] @1392 ---------- flag@1392: 0x00 (NONE) lock@1393: 0x02 keydata[6]: 0x00 0x41 0x42 0x1e 0x00 0x3d data key: col 0[1] @1401: . col 1[5] @1403: TEST2 col 2[2] @1409: col 3[0] @1412: *NULL* col 4[0] @1413: *NULL* col 5[0] @1414: *NULL* col 6[2] @1415: 2 col 7[1] @1418: 0 col 8[4] @1420: 76845 BBED> modify /x 00 offset 1393 File: /oracle/database/cos11g/system01.dbf (1) Block: 20804 Offsets: 1393 to 1394 Dba:0x00405144 ------------------------------------------------------------------------ 0000 <32 bytes per line> BBED> sum apply Check value for File 1, Block 20804: current = 0xf6aa, required = 0xf6aa BBED> verify DBVERIFY - Verification starting FILE = /oracle/database/cos11g/system01.dbf BLOCK = 20804 DBVERIFY - Verification complete Total Blocks Examined : 1 Total Blocks Processed (Data) : 0 Total Blocks Failing (Data) : 0 Total Blocks Processed (Index): 1 Total Blocks Failing (Index): 0 Total Blocks Empty : 0 Total Blocks Marked Corrupt : 0 Total Blocks Influx : 0 Message 531 not found; product=RDBMS; facility=BBED |
避免ora-016433错误,我再次重建了controlfile,最后顺利打开了该数据库,如下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 |
SQL> shutdown immediate ORA-01109: database not open Database dismounted. ORACLE instance shut down. SQL> startup nomount pfile='/oracle/database/init.ora'; ORACLE instance started. Total System Global Area 384143360 bytes Fixed Size 2228624 bytes Variable Size 255856240 bytes Database Buffers 121634816 bytes Redo Buffers 4423680 bytes SQL> CREATE CONTROLFILE REUSE DATABASE "COS11G" RESETLOGS NOARCHIVELOG 2 MAXLOGFILES 20 3 MAXLOGMEMBERS 3 4 MAXDATAFILES 1024 5 MAXINSTANCES 16 6 MAXLOGHISTORY 292 7 LOGFILE 8 GROUP 1 '/oracle/database/cos11g/redo01.log' SIZE 50M, 9 GROUP 2 '/oracle/database/cos11g/redo02.log' SIZE 50M, 10 GROUP 3 '/oracle/database/cos11g/redo03.log' SIZE 50M 11 -- STANDBY LOGFILE 12 DATAFILE 13 '/oracle/database/cos11g/system01.dbf', 14 '/oracle/database/cos11g/sysaux01.dbf', 15 '/oracle/database/cos11g/users01.dbf', 16 '/oracle/database/cos11g/undotbs01.dbf', 17 '/oracle/database/cos11g/example01.dbf' 18 CHARACTER SET AL32UTF8 19 ; Control file created. SQL> recover database using backup controlfile until cancel; ORA-00279: change 2223513 generated at 10/18/2013 09:08:11 needed for thread 1 ORA-00289: suggestion : /oracle/product/11.2.0/db_1/dbs/arch1_1_829127289.dbf ORA-00280: change 2223513 for thread 1 is in sequence #1 Specify log: {<RET>=suggested | filename | AUTO | CANCEL} cancel ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below ORA-01194: file 1 needs more recovery to be consistent ORA-01110: data file 1: '/oracle/database/cos11g/system01.dbf' ORA-01112: media recovery not started SQL> alter database open resetlogs; Database altered. |
虽然是一个测试库,但是恢复过程还是有意思,也发现了自己之前分析10046 trace的一个致命问题,通常是直接vi 查看trace的最后位置去看,这真是致命的,我想这一点,很多人都是这样,希望大家能有所启发!
2 Responses to “非归档恢复的一个模拟例子”
没太看懂,数据库在什么状况下崩溃,那些东西损坏都不知道
身为菜鸟,希望能说的更详细
Leave a Reply
You must be logged in to post a comment.