11.2.0.4 RAC CRS Diskgroup Auto Dismount Issue
Except where marked as reposted, all articles on this site are original. Source: love wife love life — Roger's Oracle/MySQL/PostgreSQL data recovery blog
I was on a business trip to Gansu these past few days, where a customer's cluster had a minor problem. CRS checks on one node reported anomalies, yet, oddly, the business workload was not affected at all. Here is what my checks showed:
[grid@tysy1 bin]$ ocrcheck
PROT-602: Failed to retrieve data from the cluster registry
PROC-26: Error while accessing the physical storage
[grid@tysy1 bin]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   6fe50eed90fe4f44bf68c2da8459bfc1 (ORCL:OCR_VOTE01) [CRS]
 2. ONLINE   7afb21a9c8c94f27bf6aecb55d4e6e58 (ORCL:OCR_VOTE02) [CRS]
 3. ONLINE   77c7c00e799a4f47bf41316f27a1d95c (ORCL:OCR_VOTE03) [CRS]
Located 3 voting disk(s).
I found that `crsctl status res -t -init` ran fine, while `crsctl status res -t` returned errors. Since the customer uses ASMLib, I checked the ASMLib disks and all of them looked normal:
[grid@tysy1 bin]$ /etc/init.d/oracleasm listdisks
ARCHVOL
DATAVOL01
DATAVOL02
DATAVOL03
DATAVOL04
FRAVOL
OCR_VOTE01
OCR_VOTE02
OCR_VOTE03
[root@tysy1 disks]# /etc/init.d/oracleasm querydisk -d OCR_VOTE01
Disk "OCR_VOTE01" is a valid ASM disk on device [8,21]
[root@tysy1 disks]# /etc/init.d/oracleasm querydisk -d OCR_VOTE02
Disk "OCR_VOTE02" is a valid ASM disk on device [8,22]
[root@tysy1 disks]# /etc/init.d/oracleasm querydisk -d OCR_VOTE03
Disk "OCR_VOTE03" is a valid ASM disk on device [8,23]
[root@tysy1 disks]# ls -ltr /dev/sdb*
brw-rw---- 1 root disk 8, 16 Sep  1 20:23 /dev/sdb
brw-rw---- 1 root disk 8, 17 Sep  1 20:23 /dev/sdb1
brw-rw---- 1 root disk 8, 21 Sep  1 20:23 /dev/sdb5
brw-rw---- 1 root disk 8, 22 Sep  1 20:23 /dev/sdb6
brw-rw---- 1 root disk 8, 23 Sep  1 20:23 /dev/sdb7
brw-rw---- 1 root disk 8, 24 Sep  1 20:23 /dev/sdb8
brw-rw---- 1 root disk 8, 25 Sep  1 20:23 /dev/sdb9
brw-rw---- 1 root disk 8, 26 Sep  1 20:23 /dev/sdb10
brw-rw---- 1 root disk 8, 27 Sep  1 20:23 /dev/sdb11
brw-rw---- 1 root disk 8, 28 Sep  1 20:23 /dev/sdb12
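As an aside, the mapping done by eye above, matching the `[8,21]` pair from `querydisk` against the `ls -l /dev/sdb*` listing, can be scripted. This is my own sketch, not part of the original post; the sample line is copied from the session above:

```shell
# Extract the major:minor pair from an oracleasm querydisk output line
# (sample copied from the session above) so it can be matched against
# block devices programmatically.
line='Disk "OCR_VOTE01" is a valid ASM disk on device [8,21]'
majmin=$(printf '%s\n' "$line" | sed -n 's/.*\[\([0-9]*\),\([0-9]*\)\].*/\1:\2/p')
printf '%s\n' "$majmin"    # 8:21
# On a live Linux box the pair resolves to a device name via sysfs, e.g.:
#   readlink -f "/sys/dev/block/$majmin"   # would point at .../sdb5 here
```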
The ASMLib status was fine. However, querying the ASM disks on node 1 showed that the disks of the CRS diskgroup could no longer be recognized (group number 0, mount status CLOSED):
[grid@tysy1 bin]$
NAME       PATH              STATE    GROUP_NUMBER DISK_NUMBER MOUNT_S HEADER_STATU MODE_ST LABEL
---------- ----------------- -------- ------------ ----------- ------- ------------ ------- ------------
           ORCL:OCR_VOTE01   NORMAL              0           0 CLOSED  MEMBER       ONLINE  OCR_VOTE01
           ORCL:OCR_VOTE02   NORMAL              0           1 CLOSED  MEMBER       ONLINE  OCR_VOTE02
           ORCL:OCR_VOTE03   NORMAL              0           2 CLOSED  MEMBER       ONLINE  OCR_VOTE03
ARCHVOL    ORCL:ARCHVOL      NORMAL              1           0 CACHED  MEMBER       ONLINE  ARCHVOL
DATAVOL01  ORCL:DATAVOL01    NORMAL              5           0 CACHED  MEMBER       ONLINE  DATAVOL01
DATAVOL02  ORCL:DATAVOL02    NORMAL              5           1 CACHED  MEMBER       ONLINE  DATAVOL02
DATAVOL03  ORCL:DATAVOL03    NORMAL              6           0 CACHED  MEMBER       ONLINE  DATAVOL03
DATAVOL04  ORCL:DATAVOL04    NORMAL              6           1 CACHED  MEMBER       ONLINE  DATAVOL04
FRAVOL     ORCL:FRAVOL       NORMAL              7           0 CACHED  MEMBER       ONLINE  FRAVOL
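The statement that produced this listing was not captured in the session above. It is typical of a `v$asm_disk` query along these lines (a hypothetical reconstruction on my part; the column names follow the output headers):

```sql
-- Hypothetical reconstruction of the elided query against v$asm_disk;
-- run from the grid user's sqlplus session on the ASM instance.
col name  format a12
col path  format a17
col label format a12
select name, path, state, group_number, disk_number,
       mount_status, header_status, mode_status, label
  from v$asm_disk
 order by group_number, disk_number;
```

The telling detail is that the three OCR_VOTE disks show GROUP_NUMBER 0 and MOUNT_S CLOSED: ASM still sees the ASMLib devices, but they no longer belong to a mounted diskgroup.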
Likewise, the CRS logs show that Oracle regards the CRS diskgroup as inaccessible:
2015-09-02 14:06:19.302: [OCRCHECK][1069037312]ocrcheck starts...
2015-09-02 14:06:19.369: [  OCRASM][1069037312]proprasmo: kgfoCheckMount return [6]. Cannot proceed with dirty open.
2015-09-02 14:06:19.369: [  OCRASM][1069037312]proprasmo: Error in open/create file in dg [CRS]
[  OCRASM][1069037312]SLOS : SLOS: cat=6, opn=kgfo, dep=0, loc=kgfoCkMt03
2015-09-02 14:06:19.369: [  OCRASM][1069037312]ASM Error Stack :
2015-09-02 14:06:19.391: [  OCRASM][1069037312]proprasmo: kgfoCheckMount returned [6]
2015-09-02 14:06:19.391: [  OCRASM][1069037312]proprasmo: The ASM disk group CRS is not found or not mounted
2015-09-02 14:06:19.392: [  OCRRAW][1069037312]proprioo: Failed to open [+CRS]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.
2015-09-02 14:06:19.392: [  OCRRAW][1069037312]proprioo: No OCR/OLR devices are usable
2015-09-02 14:06:19.392: [  OCRASM][1069037312]proprasmcl: asmhandle is NULL
2015-09-02 14:06:19.392: [  OCRRAW][1069037312]proprinit: Could not open raw device
2015-09-02 14:06:19.392: [  OCRASM][1069037312]proprasmcl: asmhandle is NULL
2015-09-02 14:06:19.392: [ default][1069037312]a_init:7!: Backend init unsuccessful : [26]
2015-09-02 14:06:19.392: [OCRCHECK][1069037312]initreboot: Failed to initialize OCR in REBOOT level. Retval:[26] Error:[PROC-26: Error while accessing the physical storage ]
[OCRCHECK][1069037312]initreboot: Attempting to initialize OCR in DEFAULT level and update a key so that vote information is updated.
2015-09-02 14:06:19.393: [ OCRMSG][1069037312]prom_waitconnect: CONN NOT ESTABLISHED (0,29,1,2)
2015-09-02 14:06:19.393: [ OCRMSG][1069037312]GIPC error [29] msg [gipcretConnectionRefused]
2015-09-02 14:06:19.393: [ OCRMSG][1069037312]prom_connect: error while waiting for connection complete [24]
2015-09-02 14:06:19.394: [OCRCHECK][1069037312]initreboot: Failed to initialize OCR in DEFAULT level. Retval:[32] Error:[PROC-32: Cluster Ready Services on the local node is not running Messaging error [gipcretConnectionRefused] [29]]
2015-09-02 14:06:19.394: [OCRCHECK][1069037312]Failed to access OCR repository. Retval [26] Error [PROC-26: Error while accessing the physical storage ]
2015-09-02 14:06:19.394: [OCRCHECK][1069037312]Failed to initialize ocrchek2
2015-09-02 14:06:19.394: [OCRCHECK][1069037312]Exiting [status=failed]...
At this point I had no clear lead, so let's carefully analyze the ASM alert log, shown below:
Tue Sep 01 23:07:03 2015
NOTE: process _b000_+asm1 (19088) initiating offline of disk 0.3915955003 (OCR_VOTE01) with mask 0x7e in group 4
NOTE: process _b000_+asm1 (19088) initiating offline of disk 1.3915955004 (OCR_VOTE02) with mask 0x7e in group 4
NOTE: process _b000_+asm1 (19088) initiating offline of disk 2.3915955005 (OCR_VOTE03) with mask 0x7e in group 4
NOTE: checking PST: grp = 4
GMON checking disk modes for group 4 at 25 for pid 30, osid 19088
ERROR: no read quorum in group: required 2, found 0 disks
NOTE: checking PST for grp 4 done.
NOTE: initiating PST update: grp = 4, dsk = 0/0xe968bb3b, mask = 0x6a, op = clear
NOTE: initiating PST update: grp = 4, dsk = 1/0xe968bb3c, mask = 0x6a, op = clear
NOTE: initiating PST update: grp = 4, dsk = 2/0xe968bb3d, mask = 0x6a, op = clear
GMON updating disk modes for group 4 at 26 for pid 30, osid 19088
ERROR: no read quorum in group: required 2, found 0 disks
Tue Sep 01 23:07:03 2015
NOTE: cache dismounting (not clean) group 4/0xD7184BC4 (CRS)
WARNING: Offline for disk OCR_VOTE01 in mode 0x7f failed.
WARNING: Offline for disk OCR_VOTE02 in mode 0x7f failed.
WARNING: Offline for disk OCR_VOTE03 in mode 0x7f failed.
NOTE: messaging CKPT to quiesce pins Unix process pid: 19090, image: oracle@tysy1 (B001)
Tue Sep 01 23:07:03 2015
NOTE: halting all I/Os to diskgroup 4 (CRS)
Tue Sep 01 23:07:03 2015
NOTE: LGWR doing non-clean dismount of group 4 (CRS)
NOTE: LGWR sync ABA=15.189 last written ABA 15.189
Tue Sep 01 23:07:03 2015
kjbdomdet send to inst 2
detach from dom 4, sending detach message to inst 2
Tue Sep 01 23:07:03 2015
List of instances:
 1 2
Dirty detach reconfiguration started (new ddet inc 1, cluster inc 4)
 Global Resource Directory partially frozen for dirty detach
 * dirty detach - domain 4 invalid = TRUE
 0 GCS resources traversed, 0 cancelled
Dirty Detach Reconfiguration complete
Tue Sep 01 23:07:03 2015
WARNING: dirty detached from domain 4
NOTE: cache dismounted group 4/0xD7184BC4 (CRS)
SQL> alter diskgroup CRS dismount force /* ASM SERVER:3608693700 */
Tue Sep 01 23:07:03 2015
NOTE: cache deleting context for group CRS 4/0xd7184bc4
GMON dismounting group 4 at 27 for pid 35, osid 19090
NOTE: Disk OCR_VOTE01 in mode 0x7f marked for de-assignment
NOTE: Disk OCR_VOTE02 in mode 0x7f marked for de-assignment
NOTE: Disk OCR_VOTE03 in mode 0x7f marked for de-assignment
NOTE:Waiting for all pending writes to complete before de-registering: grpnum 4
Tue Sep 01 23:07:18 2015
ASM Health Checker found 1 new failures
Tue Sep 01 23:07:33 2015
SUCCESS: diskgroup CRS was dismounted
SUCCESS: alter diskgroup CRS dismount force /* ASM SERVER:3608693700 */
Tue Sep 01 23:07:33 2015
NOTE: diskgroup resource ora.CRS.dg is offline
SUCCESS: ASM-initiated MANDATORY DISMOUNT of group CRS
Tue Sep 01 23:07:33 2015
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_4208.trc:
ORA-15078: ASM diskgroup was forcibly dismounted
......
......
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_4208.trc:
ORA-15078: ASM diskgroup was forcibly dismounted
Wed Sep 02 00:40:16 2015
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_4208.trc:
ORA-15078: ASM diskgroup was forcibly dismounted
WARNING: requested mirror side 1 of virtual extent 4 logical extent 0 offset 294912 is not allocated; I/O request failed
WARNING: requested mirror side 2 of virtual extent 4 logical extent 1 offset 294912 is not allocated; I/O request failed
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_4208.trc:
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-15078: ASM diskgroup was forcibly dismounted
Wed Sep 02 00:40:16 2015
SQL> alter diskgroup CRS check /* proxy */
ORA-15032: not all alterations performed
ORA-15001: diskgroup "CRS" does not exist or is not mounted
ERROR: alter diskgroup CRS check /* proxy */
Wed Sep 02 00:41:33 2015
NOTE: client exited [4197]
Wed Sep 02 00:41:34 2015
NOTE: [crsd.bin@tysy1 (TNS V1-V3) 13821] opening OCR file
Wed Sep 02 00:41:36 2015
NOTE: [crsd.bin@tysy1 (TNS V1-V3) 13834] opening OCR file
......
......
Wed Sep 02 14:06:08 2015
NOTE: [ocrcheck.bin@tysy1 (TNS V1-V3) 16749] opening OCR file
Wed Sep 02 14:06:19 2015
NOTE: [ocrcheck.bin@tysy1 (TNS V1-V3) 16815] opening OCR file
Wed Sep 02 14:13:58 2015
NOTE: [ocrcheck.bin@tysy1 (TNS V1-V3) 18667] opening OCR file
Wed Sep 02 14:17:52 2015
NOTE: [ocrcheck.bin@tysy1 (TNS V1-V3) 19633] opening OCR file
We can see that ASM ran a check that failed (ASM Health Checker found 1 new failures): GMON could not get a read quorum on the CRS diskgroup's PST disks, so ASM forcibly dismounted the CRS diskgroup. Let's dig further into the trace file contents:
[grid@tysy1 client]$ cat /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_4208.trc
Trace file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_4208.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
ORACLE_HOME = /u01/app/11.2.0/grid
System name:    Linux
Node name:      tysy1
Release:        2.6.32-358.el6.x86_64
Version:        #1 SMP Tue Jan 29 11:47:41 EST 2013
Machine:        x86_64
VM name:        VMWare Version: 6
Instance name: +ASM1
Redo thread mounted by this instance: 0 <none>
Oracle process number: 26
Unix process pid: 4208, image: oracle@tysy1 (TNS V1-V3)

*** 2015-09-01 23:07:33.936
*** SESSION ID:(289.3) 2015-09-01 23:07:33.936
*** CLIENT ID:() 2015-09-01 23:07:33.936
*** SERVICE NAME:() 2015-09-01 23:07:33.936
*** MODULE NAME:(crsd.bin@tysy1 (TNS V1-V3)) 2015-09-01 23:07:33.936
*** ACTION NAME:() 2015-09-01 23:07:33.936

WARNING:failed xlate 1
ORA-15078: ASM diskgroup was forcibly dismounted
......
......
WARNING:failed xlate 1
ORA-15078: ASM diskgroup was forcibly dismounted
*** 2015-09-02 00:40:16.217
WARNING:failed xlate 1
ORA-15078: ASM diskgroup was forcibly dismounted
ksfdrfms:Mirror Read file=+CRS.255.4294967295 fob=0x9f403518 bufp=0x7f48c1381a00 blkno=1096 nbytes=4096
WARNING:failed xlate 1
WARNING: requested mirror side 1 of virtual extent 4 logical extent 0 offset 294912 is not allocated; I/O request failed
ksfdrfms:Read failed from mirror side=1 logical extent number=0 dskno=65535
WARNING:failed xlate 1
WARNING: requested mirror side 2 of virtual extent 4 logical extent 1 offset 294912 is not allocated; I/O request failed
ksfdrfms:Read failed from mirror side=2 logical extent number=1 dskno=65535
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-15078: ASM diskgroup was forcibly dismounted
Going further, I analyzed the GMON process trace file and found call stacks in it, as follows:
---+ASM1_gmon_4091.trc
*** 2015-09-01 22:55:11.686
ERROR: GMON could not validate any PST Hbeat (grp 6) for Validate
2015-09-01 22:55:11.686182 : kfdpHbeatblk_transfer: Found pending Hbeat I/O in grp=6 @ HBeatCblk=0x7f2bf4f337f8 iop=0x7f2bf4f33808 iostate=3.
kfdpHbeatCB_dump: at 0x7f2bf4f337f8 with ts=09/01/2015 22:55:11
 iop=0x7f2bf4f33808, grp=6, disk=0/3915955008, isWrite=0
 Hbeat #0 state=3 iostate=3
----- Abridged Call Stack Trace -----
ksedsts()+465<-kfdpHbeatblk_transfer()+953<-kfdpHbeat_doCheck()+662<-kfdpHbeat_timeout()+854<-kfdp_timeoutBg()+1642<-ksbcti()+513<-ksbabs()+1735<-ksbrdp()+1045<-opirip()+623<-opidrv()+603<-sou2o()+103<-opimai_real()+250<-ssthrdmain()+265<-main()+201<-__libc_start_main()+253
----- End of Abridged Call Stack Trace -----
WARNING: Waited 15 secs for write IO to PST disk 0 in group 1.
*** 2015-09-01 23:07:03.571
WARNING: Waited 15 secs for write IO to PST disk 0 in group 1.
WARNING: Waited 15 secs for write IO to PST disk 0 in group 4.
WARNING: Waited 15 secs for write IO to PST disk 1 in group 4.
WARNING: Waited 15 secs for write IO to PST disk 2 in group 4.
NOTE: Set to be offline flag for disk OCR_VOTE01 only locally: flag 0x3211
NOTE: Set to be offline flag for disk OCR_VOTE02 only locally: flag 0x3211
NOTE: Set to be offline flag for disk OCR_VOTE03 only locally: flag 0x3211
WARNING: Waited 15 secs for write IO to PST disk 0 in group 4.
WARNING: Waited 15 secs for write IO to PST disk 1 in group 4.
WARNING: Waited 15 secs for write IO to PST disk 2 in group 4.
WARNING: Waited 15 secs for write IO to PST disk 0 in group 5.
WARNING: Waited 15 secs for write IO to PST disk 0 in group 5.
WARNING: Waited 15 secs for write IO to PST disk 0 in group 6.
WARNING: Waited 15 secs for write IO to PST disk 0 in group 6.
WARNING: Waited 15 secs for write IO to PST disk 0 in group 7.
WARNING: Waited 15 secs for write IO to PST disk 0 in group 7.
----- Abridged Call Stack Trace -----
ksedsts()+465<-kfdpGc_doTobeoflnAsync()+37<-kfdpGc_checkTobeofln()+623<-kfdpGc_timeout()+17<-kfdp_timeoutBg()+1657<-ksbcti()+513<-ksbabs()+1735<-ksbrdp()+1045<-opirip()+623<-opidrv()+603<-sou2o()+103<-opimai_real()+250<-ssthrdmain()+265<-main()+201<-__libc_start_main()+253
......
......
--------------- HBEAT ------------------
kfdpHbeat_dump: state=1, inst=1, ts=33023031.450354176, rnd=1036706912.2853146286.3238582348.2977351792
kfk io-queue: (nil)
kfdpHbeatCB_dump: at 0x7f2bf4d28248 with ts=09/01/2015 23:06:47
 iop=0x7f2bf4d28258, grp=4, disk=0/3915955003, isWrite=1
 Hbeat #17 state=4 iostate=3
kfdpHbeatCB_dump: at 0x7f2bf4d28080 with ts=09/01/2015 23:06:47
 iop=0x7f2bf4d28090, grp=4, disk=1/3915955004, isWrite=1
 Hbeat #17 state=4 iostate=3
kfdpHbeatCB_dump: at 0x7f2bf4d27eb8 with ts=09/01/2015 23:06:47
 iop=0x7f2bf4d27ec8, grp=4, disk=2/3915955005, isWrite=1
 Hbeat #17 state=4 iostate=3
NOTE: kfdp_updateInt: forceDismount grp 4
NOTE: GMON: failed to update modes: triggering force dismount of group 4
2015-09-01 23:07:03.625783 : kfdpHbeatblk_transfer: Found pending Hbeat I/O in grp=4 @ HBeatCblk=0x7f2bf4d28248 iop=0x7f2bf4d28258 iostate=3.
kfdpHbeatCB_dump: at 0x7f2bf4d28248 with ts=09/01/2015 23:06:47
 iop=0x7f2bf4d28258, grp=4, disk=0/3915955003, isWrite=1
 Hbeat #17 state=4 iostate=3
2015-09-01 23:07:03.625823 : kfdpHbeatblk_transfer: Found pending Hbeat I/O in grp=4 @ HBeatCblk=0x7f2bf4d28080 iop=0x7f2bf4d28090 iostate=3.
kfdpHbeatCB_dump: at 0x7f2bf4d28080 with ts=09/01/2015 23:06:47
 iop=0x7f2bf4d28090, grp=4, disk=1/3915955004, isWrite=1
 Hbeat #17 state=4 iostate=3
2015-09-01 23:07:03.625844 : kfdpHbeatblk_transfer: Found pending Hbeat I/O in grp=4 @ HBeatCblk=0x7f2bf4d27eb8 iop=0x7f2bf4d27ec8 iostate=3.
kfdpHbeatCB_dump: at 0x7f2bf4d27eb8 with ts=09/01/2015 23:06:47
 iop=0x7f2bf4d27ec8, grp=4, disk=2/3915955005, isWrite=1
 Hbeat #17 state=4 iostate=3
----- Abridged Call Stack Trace -----
ksedsts()+465<-kfdpHbeatblk_transfer()+953<-kfdp_forceDismountAsyncGmon()+619<-kfdp_updateInt()+664<-kfdp_updateDskBg()+814<-ksbabs()+771<-ksbrdp()+1045<-opirip()+623<-opidrv()+603<-sou2o()+103<-opimai_real()+250<-ssthrdmain()+265<-main()+201<-__libc_start_main()+253
----- End of Abridged Call Stack Trace -----
*** 2015-09-01 23:07:03.906
GMON dismounting group 4 at 27 for pid 35, osid 19090
NOTE: kfdp_doDismount: dismount grp 4
*** 2015-09-01 23:07:06.609
NOTE: kfdpUtil_freeSlMsg: ksvtst for grp: 4 KSV status 0x10
*** 2015-09-01 23:07:36.732
NOTE: kfdpUtil_freeSlMsg: ksvtst for grp: 4 KSV status 0x10
ERROR: GMON could not validate any PST Hbeat (grp 1) for Validate
*** 2015-09-01 23:11:03.981
2015-09-01 23:11:03.981362 : kfdpHbeatblk_transfer: Found pending Hbeat I/O in grp=1 @ HBeatCblk=0x7f2bf4f32f10 iop=0x7f2bf4f32f20 iostate=3.
kfdpHbeatCB_dump: at 0x7f2bf4f32f10 with ts=09/01/2015 23:11:03
 iop=0x7f2bf4f32f20, grp=1, disk=0/3915955002, isWrite=0
 Hbeat #0 state=3 iostate=3
----- Abridged Call Stack Trace -----
ksedsts()+465<-kfdpHbeatblk_transfer()+953<-kfdpHbeat_doCheck()+662<-kfdpHbeat_timeout()+854<-kfdp_timeoutBg()+1642<-ksbcti()+513<-ksbabs()+1735<-ksbrdp()+1045<-opirip()+623<-opidrv()+603<-sou2o()+103<-opimai_real()+250<-ssthrdmain()+265<-main()+201<-__libc_start_main()+253
----- End of Abridged Call Stack Trace -----
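The repeated "Waited 15 secs for write IO to PST" warnings are the key signature here. A rough helper (my own sketch, not part of the original diagnosis) to count those warnings per diskgroup in a GMON trace; a tiny sample fragment is embedded so the script is self-contained, but on a real system you would point TRACE at the actual `+ASM1_gmon_*.trc` file:

```shell
# Count "Waited ... secs for write IO to PST" warnings per group in a
# GMON trace file. Sample lines below are copied from the trace above.
TRACE=$(mktemp)
cat > "$TRACE" <<'EOF'
WARNING: Waited 15 secs for write IO to PST disk 0 in group 4.
WARNING: Waited 15 secs for write IO to PST disk 1 in group 4.
WARNING: Waited 15 secs for write IO to PST disk 0 in group 5.
EOF
grep 'Waited .* secs for write IO to PST' "$TRACE" \
  | awk '{gsub(/\./, "", $NF); print "group", $NF}' \
  | sort | uniq -c
rm -f "$TRACE"
```

For the sample above this prints a count of 2 for group 4 and 1 for group 5; a high count concentrated on one group points at the disks whose PST writes were stalling.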
Based on the above, the analysis matches Oracle Bug 18694414 exactly. Here is the call stack described in that bug:
ksedsts()+465<-kfdpHbeatblk_transfer()+953<-kfdpHbeat_timeout()+854<-kfdp_timeoutBg()+1642<-ksbcti()+513<-ksbabs()+1735<-ksbrdp()+1045<-opirip()+623<-opidrv()+603<-sou2o()+103<-opimai_real()+250<-ssthrdmain()+265<-main()+201<-__libc_start_main()+253
Per the Oracle MOS note, the recommendation is to raise the `_asm_hbeatiowait` parameter to 185, because on Linux the default disk I/O timeout is 180 seconds, as shown here:
[root@tysy1 ~]# cat /sys/block/sdb/device/timeout
180
[root@tysy1 ~]#
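The change itself can be sketched as follows. Note that `_asm_hbeatiowait` is a hidden parameter, so this should only be applied under Oracle Support guidance; the idea is simply that ASM's PST heartbeat wait must exceed the OS disk timeout by a small margin:

```shell
# Derive the recommended _asm_hbeatiowait value from the OS disk timeout.
# On the real system, read /sys/block/sdb/device/timeout; a stand-in value
# is used here so the sketch is self-contained.
os_timeout=180                       # i.e. cat /sys/block/sdb/device/timeout
hbeatiowait=$((os_timeout + 5))
echo "_asm_hbeatiowait should be set to ${hbeatiowait}"
# Then, on the ASM instance (not run here), something like:
#   SQL> alter system set "_asm_hbeatiowait"=185 scope=spfile sid='*';
# followed by a rolling restart of the clusterware on each node.
```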