今天所做的两件事
本站文章除注明转载外,均为本站原创: 转载自love wife love life —Roger的Oracle/MySQL/PostgreSQL数据恢复博客
本文链接地址: 今天所做的两件事
event 1:对10g R2 ocr恢复到不同路径的测试
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
由于最近有3套10gR2 rac的迁移,迁移后路径会发现改变,故进行了一次测试。 ----集群原始状态: [oracle@rac1 ~]$ crs_stat -t Name Type Target State Host ------------------------------------------------------------ ora....SM1.asm application ONLINE ONLINE rac1 ora....C1.lsnr application ONLINE ONLINE rac1 ora.rac1.gsd application ONLINE ONLINE rac1 ora.rac1.ons application ONLINE ONLINE rac1 ora.rac1.vip application ONLINE ONLINE rac1 ora....SM2.asm application ONLINE ONLINE rac2 ora....C2.lsnr application ONLINE ONLINE rac2 ora.rac2.gsd application ONLINE ONLINE rac2 ora.rac2.ons application ONLINE ONLINE rac2 ora.rac2.vip application ONLINE ONLINE rac2 ora.roger.db application ONLINE ONLINE rac1 ora....lldb.cs application ONLINE ONLINE rac1 ora....er1.srv application ONLINE ONLINE rac1 ora....er2.srv application ONLINE ONLINE rac2 ora....r1.inst application ONLINE ONLINE rac1 ora....r2.inst application ONLINE ONLINE rac2 |
在cluster所有资源都正常的情况下,进行export导出备份:
1 2 3 4 |
[root@rac1 bin]# ./ocrconfig -export /tmp/ocr_bak.ocr -s online [root@rac1 bin]# ls -ltr /tmp/ocr_bak.ocr -rw-r--r-- 1 root root 104371 Nov 13 04:53 /tmp/ocr_bak.ocr [root@rac1 bin]# |
注意:一定要保证该备份的可用性。
—–停止所有节点的crs 进程
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 |
[root@rac1 bin]# ./crsctl stop crs Stopping resources. This could take several minutes. Successfully stopped CRS resources. Stopping CSSD. Shutting down CSS daemon. Shutdown request successfully issued. [root@rac1 bin]# [root@rac1 bin]# ./ocrcheck Status of Oracle Cluster Registry is as follows : Version : 2 Total space (kbytes) : 521836 Used space (kbytes) : 4584 Available space (kbytes) : 517252 ID : 1672426254 Device/File Name : /dev/raw/raw1 Device/File integrity check succeeded Device/File not configured Cluster registry integrity check succeeded [root@rac1 bin]# ps -ef|grep d.bin root 1977 4588 0 05:00 pts/2 00:00:00 grep d.bin [root@rac1 bin]# [root@rac2 bin]# ./crsctl stop crs Stopping resources. This could take several minutes. Successfully stopped CRS resources. Stopping CSSD. Shutting down CSS daemon. Shutdown request successfully issued. [root@rac2 bin]# ps -ef|grep d.bin root 17790 9278 0 05:01 pts/2 00:00:00 grep d.bin [root@rac2 bin]# |
—–修改/etc/oracle/ocr.loc
1 2 3 4 5 6 7 8 9 10 11 |
--rac1 [root@rac1 oracle]# cat ocr.loc ocrconfig_loc=/dev/raw/raw3 local_only=FALSE [root@rac1 oracle]# --rac2 [root@rac2 oracle]# cat ocr.loc ocrconfig_loc=/dev/raw/raw3 local_only=FALSE [root@rac2 oracle]# |
—–导入备份
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 |
[root@rac1 bin]# [root@rac1 bin]# ./ocrconfig -import /tmp/ocr_bak.ocr [root@rac1 bin]# ./ocrcheck Status of Oracle Cluster Registry is as follows : Version : 2 Total space (kbytes) : 529900 Used space (kbytes) : 4576 Available space (kbytes) : 525324 ID : 1450286276 Device/File Name : /dev/raw/raw3 Device/File integrity check succeeded Device/File not configured Cluster registry integrity check succeeded [root@rac1 bin]# 在rac2节点也进行check: [root@rac2 oracle]# cd /home/oracle/app/oracle/product/10.2.0/crs/bin [root@rac2 bin]# pwd /home/oracle/app/oracle/product/10.2.0/crs/bin [root@rac2 bin]# ./ocrcheck Status of Oracle Cluster Registry is as follows : Version : 2 Total space (kbytes) : 529900 Used space (kbytes) : 4576 Available space (kbytes) : 525324 ID : 1450286276 Device/File Name : /dev/raw/raw3 Device/File integrity check succeeded Device/File not configured Cluster registry integrity check succeeded |
—-启动crs,检查是否能够正常启动
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 |
rac1: [root@rac1 bin]# ./crsctl start crs Attempting to start CRS stack The CRS stack will be started shortly [root@rac1 bin]# rac2: [root@rac2 bin]# ./crsctl start crs Attempting to start CRS stack The CRS stack will be started shortly [root@rac2 bin]# 经过测试,可以正常启动: [root@rac1 bin]# ./crs_stat -t Name Type Target State Host ------------------------------------------------------------ ora....SM1.asm application ONLINE ONLINE rac1 ora....C1.lsnr application ONLINE ONLINE rac1 ora.rac1.gsd application ONLINE ONLINE rac1 ora.rac1.ons application ONLINE ONLINE rac1 ora.rac1.vip application ONLINE ONLINE rac1 ora....SM2.asm application ONLINE ONLINE rac2 ora....C2.lsnr application ONLINE ONLINE rac2 ora.rac2.gsd application ONLINE ONLINE rac2 ora.rac2.ons application ONLINE ONLINE rac2 ora.rac2.vip application ONLINE ONLINE rac2 ora.roger.db application ONLINE ONLINE rac1 ora....lldb.cs application ONLINE ONLINE rac1 ora....er1.srv application ONLINE ONLINE rac1 ora....er2.srv application ONLINE ONLINE rac2 ora....r1.inst application ONLINE ONLINE rac1 ora....r2.inst application ONLINE ONLINE rac2 [root@rac1 bin]# |
备注:
1. 很久没搭建10gR2 rac了,翻了下以前的笔记,记得要给vip指定默认网关,不然vip资源会
经常自动down掉,只是我的vmware环境是这样的,可以通过如下方式进行修复:
修改racgvip脚本中的如下内容:
FAIL_WHEN_DEFAULTGW_NOT_FOUND=1 参数修改为0或 指定 DEFAULTGW 即可。
2. linux 安装10gR2 rac时,升级到10205 rac时,执行root102.sh显示timeout, 其中evmd log如下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 |
2012-11-13 01:33:44.210: [ EVMD][1885888]0ENV Logging level for Module: OCRCLI 0 2012-11-13 01:33:44.226: [ EVMD][1885888]0ENV Logging level for Module: CSSCLNT 0 2012-11-13 01:33:44.227: [ EVMD][1885888]0Creating pidfile /home/oracle/app/oracle/product/10.2.0/crs/evm/init/rac1.pid 2012-11-13 01:33:44.228: [ COMMCRS][60271504]clsclisten: Permission denied for (ADDRESS=(PROTOCOL=ipc)(KEY=rac1DBG_EVMD)) [ clsdmt][116939664]Fail to listen to (ADDRESS=(PROTOCOL=ipc)(KEY=rac1DBG_EVMD)) [ clsdmt][116939664]Terminating clsdm listening thread 2012-11-13 01:33:44.304: [ EVMD][1885888]0Authorization database built successfully. 2012-11-13 01:33:44.768: [ COMMCRS][60271504]clsclisten: Permission denied for (ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth)) 2012-11-13 01:33:44.769: [ EVMAPP][1885888][PANIC]0Unable to open local accept socket(CLSA) - errno: 18 2012-11-13 01:33:44.769: [ EVMD][1885888][PANIC]0EVMD exiting 2012-11-13 01:33:44.769: [ EVMD][1885888]0Done. 2012-11-13 01:33:45.688: [ EVMD][3995328]0EVMD Starting 2012-11-13 01:33:45.688: [ EVMD][3995328]0 Oracle Database 10g CRS Release 10.2.0.5.0 Production Copyright 1996, 2007, Oracle. All rights reserved 2012-11-13 01:33:45.688: [ EVMD][3995328]0Initializing OCR 2012-11-13 01:33:46.200: [ EVMD][3995328]0Active Version from OCR:10.2.0.5.0 2012-11-13 01:33:46.200: [ EVMD][3995328]0Active Version and Software Version are same 2012-11-13 01:33:46.200: [ EVMD][3995328]0Initializing Diagnostics Settings 2012-11-13 01:33:46.230: [ EVMD][3995328]0ENV Logging level for Module: allcomp 0 2012-11-13 01:33:46.243: [ EVMD][3995328]0ENV Logging level for Module: default 0 2012-11-13 01:33:46.258: [ EVMD][3995328]0ENV Logging level for Module: COMMCRS 0 2012-11-13 01:33:46.270: [ EVMD][3995328]0ENV Logging level for Module: COMMNS 0 2012-11-13 01:33:46.283: [ EVMD][3995328]0ENV Logging level for Module: EVMD 0 2012-11-13 01:33:46.297: [ EVMD][3995328]0ENV Logging level for Module: EVMDMAIN 0 2012-11-13 01:33:46.312: [ EVMD][3995328]0ENV Logging level for Module: EVMCOMM 0 2012-11-13 01:33:46.326: [ EVMD][3995328]0ENV Logging level for Module: EVMEVT 0 2012-11-13 01:33:46.338: [ EVMD][3995328]0ENV Logging level for Module: EVMAPP 0 2012-11-13 01:33:46.353: [ EVMD][3995328]0ENV Logging level for Module: EVMAGENT 0 2012-11-13 01:33:46.377: [ EVMD][3995328]0ENV Logging level for Module: CRSOCR 0 2012-11-13 01:33:46.390: [ EVMD][3995328]0ENV Logging level for Module: CLUCLS 0 2012-11-13 01:33:46.403: [ EVMD][3995328]0ENV Logging level for Module: OCRRAW 0 2012-11-13 01:33:46.419: [ EVMD][3995328]0ENV Logging level for Module: OCROSD 0 2012-11-13 01:33:46.433: [ EVMD][3995328]0ENV Logging level for Module: OCRAPI 0 2012-11-13 01:33:46.447: [ EVMD][3995328]0ENV Logging level for Module: OCRUTL 0 2012-11-13 01:33:46.461: [ EVMD][3995328]0ENV Logging level for Module: OCRMSG 0 2012-11-13 01:33:46.474: [ EVMD][3995328]0ENV Logging level for Module: OCRCLI 0 2012-11-13 01:33:46.489: [ EVMD][3995328]0ENV Logging level for Module: CSSCLNT 0 2012-11-13 01:33:46.489: [ EVMD][3995328]0Creating pidfile /home/oracle/app/oracle/product/10.2.0/crs/evm/init/rac1.pid 2012-11-13 01:33:46.490: [ COMMCRS][49093520]clsclisten: Permission denied for (ADDRESS=(PROTOCOL=ipc)(KEY=rac1DBG_EVMD)) [ clsdmt][38603664]Fail to listen to (ADDRESS=(PROTOCOL=ipc)(KEY=rac1DBG_EVMD)) [ clsdmt][38603664]Terminating clsdm listening thread 2012-11-13 01:33:46.565: [ EVMD][3995328]0Authorization database built successfully. 2012-11-13 01:33:47.041: [ COMMCRS][49093520]clsclisten: Permission denied for (ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth)) 2012-11-13 01:33:47.041: [ EVMAPP][3995328][PANIC]0Unable to open local accept socket(CLSA) - errno: 18 2012-11-13 01:33:47.041: [ EVMD][3995328][PANIC]0EVMD exiting 2012-11-13 01:33:47.041: [ EVMD][3995328]0Done. 最近检查发现/var/tmp权限不对,执行如下操作即可: chown -R oracle:dba /var/tmp |
Event 2: 关于standby_file_management参数为manual时添加datafile丢失的问题
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
----主库 SQL> select name,OPEN_MODE,DATABASE_ROLE from v$database; NAME OPEN_MODE DATABASE_ROLE --------- ---------- ---------------- TEST READ WRITE PRIMARY SQL> ---备库 SQL> select name,OPEN_MODE,DATABASE_ROLE from v$database; NAME OPEN_MODE DATABASE_ROLE --------- ---------- ---------------- TEST MOUNTED PHYSICAL STANDBY SQL> |
—主库添加数据文件进行测试
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 |
SQL> show parameter standby NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ standby_archive_dest string /arch2 standby_file_management string auto SQL> alter tablespace test add datafile '/oracle/product/oradata/test/test04.dbf' size 10m; Tablespace altered. SQL> alter system switch logfile; System altered. SQL> / System altered. SQL> / System altered. SQL> ---备库 SQL> l 1* select file#,name,bytes/1024/1024 from v$datafile order by 1 SQL> / FILE# NAME BYTES/1024/1024 ---------- --------------------------------------------------------------------------- --------------- 1 /oracle/product/oradata/test/system01.dbf 225 2 /oracle/product/oradata/test/undo01.dbf 40 3 /oracle/product/oradata/test/rman.dbf 100 4 /oracle/product/oradata/test/test01.dbf 500 5 /oracle/product/oradata/test/perfstat.dbf 200 6 /oracle/product/oradata/test/undo02.dbf 360 7 /oracle/product/oradata/test/sysaux01.dbf 500 8 /oracle/product/oradata/test/test02.dbf 10 9 /oracle/product/oradata/test/test03.dbf 500 10 /oracle/product/oradata/test/test04.dbf 10 10 rows selected. SQL> SQL> SQL> show parameter standby NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ standby_archive_dest string /arch2 standby_file_management string AUTO SQL> alter system set standby_file_management=manual; System altered. |
——-主库添加数据文件(修改参数后)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 |
SQL> select name,OPEN_MODE,DATABASE_ROLE from v$database; NAME OPEN_MODE DATABASE_ROLE --------- ---------- ---------------- TEST READ WRITE PRIMARY SQL> alter tablespace test add datafile '/oracle/product/oradata/test/test05.dbf' size 10m; Tablespace altered. SQL> alter system switch logfile; System altered. SQL> 此时备库,报错了,如下: Tue Nov 13 13:46:44 2012 ALTER SYSTEM SET standby_file_management='MANUAL' SCOPE=BOTH; Tue Nov 13 13:47:22 2012 Primary database is in MAXIMUM PERFORMANCE mode RFS[2]: Successfully opened standby log 3: '/oracle/product/oradata/test/redo03.log' Tue Nov 13 13:47:23 2012 Media Recovery Log /arch2/1_378_726529113.dbf File #11 added to control file as 'UNNAMED00011' because the parameter STANDBY_FILE_MANAGEMENT is set to MANUAL The file should be manually created to continue. Errors with log /arch2/1_378_726529113.dbf MRP0: Background Media Recovery terminated with error 1274 Tue Nov 13 13:47:23 2012 Errors in file /oracle/product/admin/test/bdump/test_mrp0_7117.trc: ORA-01274: cannot add datafile '/oracle/product/oradata/test/test05.dbf' - file could not be created Some recovered datafiles maybe left media fuzzy Media recovery may continue but open resetlogs may fail Tue Nov 13 13:47:24 2012 Errors in file /oracle/product/admin/test/bdump/test_mrp0_7117.trc: ORA-01274: cannot add datafile '/oracle/product/oradata/test/test05.dbf' - file could not be created Tue Nov 13 13:47:24 2012 MRP0: Background Media Recovery process shutdown (test) |
下面我们手工再备库create datafile:
1 2 3 4 5 6 7 8 9 |
SQL> alter database create datafile 11 as '/oracle/product/oradata/test/test05.dbf' size 10m; Database altered. SQL> alter database recover managed standby database disconnect from session; Database altered. SQL> |
然后来观察一下alert log是否正常:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
Tue Nov 13 13:48:51 2012 alter database create datafile 11 as '/oracle/product/oradata/test/test05.dbf' size 10m Tue Nov 13 13:48:51 2012 Completed: alter database create datafile 11 as '/oracle/product/oradata/test/test05.dbf' size 10m Tue Nov 13 13:49:03 2012 alter database recover managed standby database disconnect from session Tue Nov 13 13:49:03 2012 Attempt to start background Managed Standby Recovery process (test) MRP0 started with pid=18, OS id=8681 Tue Nov 13 13:49:03 2012 MRP0: Background Managed Standby Recovery process started (test) Managed Standby Recovery not using Real Time Apply parallel recovery started with 2 processes Tue Nov 13 13:49:08 2012 Waiting for all non-current ORLs to be archived... Media Recovery Log /arch2/1_378_726529113.dbf Media Recovery Waiting for thread 1 sequence 379 (in transit) Tue Nov 13 13:49:09 2012 Completed: alter database recover managed standby database disconnect from session |
可以发现,完全正常,不知道为什么昨晚操作客户的3节点rac+standby(11gR2)时,通过create datafile发现居然报错,最后
无奈之下出现了很多其他的错误,本来想通过bbed直接去调scn,想想作罢,反正有备份,重新restore一次即可。
3 Responses to “今天所做的两件事”
event1中从日志中可以看出是/var/tmp的权限问题吗?
从日志无法直接看出,需要一点常识判断。
日志里面有一段是这样的:
Unable to open local accept socket …..
oracle 的socket信息是记录在/var/tmp下面的。
从这篇文档的lsof可以看出,occsd进程的scoke信息就在/var/tmp/.oracle下面。
http://www.killdb.com/2012/11/15/10g-rac%e5%a6%82%e4%bd%95%e9%80%9a%e8%bf%87votedisk%e6%9d%a5%e5%88%a4%e6%96%addisk%e5%bf%83%e8%b7%b3%ef%bc%9f.html
Leave a Reply
You must be logged in to post a comment.