Resolving CRS-4535 and CRS-4000 Errors on 11g RAC

2019-05-24 20:13:16

Environment: AIX 6.1, Oracle 11.2.0.4 RAC (2 nodes)

  • 1. Symptoms
  • 2. Locating the problem
  • 3. Fixing the problem

1. Symptoms

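Querying the cluster resource status with crsctl fails immediately on either node with CRS-4535 and CRS-4000, yet the database itself is still accessible. The symptoms in detail: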

# Query on node 1
grid@bjdb1:/home/grid>crsctl stat res -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.

# Query on node 2
root@bjdb2:/>crsctl stat res -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.

Likewise, crs_stat -t fails, this time with error code CRS-0184:

root@bjdb1:/>crs_stat -t
CRS-0184: Cannot communicate with the CRS daemon.

Node 2 behaves the same way.
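
For routine health checks, the error codes shown above can be recognized mechanically. The following is a minimal Python sketch and not part of the original troubleshooting; the function names `crs_error_codes` and `crsd_unreachable` are hypothetical, and the code only scans text that has already been captured from `crsctl` or `crs_stat`.

```python
import re

# Matches Oracle Clusterware message codes such as "CRS-4535" at the start of a line.
_CRS_CODE = re.compile(r"^(CRS-\d+):", re.MULTILINE)


def crs_error_codes(output: str) -> list[str]:
    """Return the CRS-nnnn codes found at the start of lines, in order of appearance."""
    return _CRS_CODE.findall(output)


def crsd_unreachable(output: str) -> bool:
    """True if the captured output says the CRS daemon could not be contacted.

    CRS-4535 is what crsctl printed in this incident; CRS-0184 is what crs_stat printed.
    """
    codes = crs_error_codes(output)
    return "CRS-4535" in codes or "CRS-0184" in codes
```

CRS-4000 is deliberately not used as the trigger, because it only states that the command failed and accompanies other errors as well (it also follows CRS-4640 later in this article).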

Confirm that the database is still accessible at this point:

# From node 2, simulate a client login to the RAC cluster through the SCAN IP; the database is reachable
oracle@bjdb2:/home/oracle>sqlplus jingyu/jingyu@192.168.103.31/bjdb

SQL*Plus: Release 11.2.0.4.0 Production on Mon Oct 10 14:24:47 2016

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL>

Relevant entries in /etc/hosts for this RAC environment:

#scan
192.168.103.31  scan-ip

2. Locating the problem

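First, check the cluster logs on node 1. Clusterware (GI) logs are stored under $GRID_HOME/log/nodename. Clusterware (GI) has several key background daemons (css, crs and evm), and their logs live in the cssd, crsd and evmd subdirectories respectively.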
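
As a small illustration of that layout, the sketch below builds the three daemon log directories for a given node. It is only a convenience helper written for this article: the name `gi_log_dirs` is hypothetical, and the Grid home is passed in explicitly rather than read from the environment.

```python
from pathlib import PurePosixPath

# Subdirectories named in the text above: one per core daemon.
GI_DAEMON_DIRS = ("cssd", "crsd", "evmd")


def gi_log_dirs(grid_home: str, nodename: str) -> dict[str, PurePosixPath]:
    """Map each daemon directory name to $GRID_HOME/log/<nodename>/<dir>."""
    base = PurePosixPath(grid_home) / "log" / nodename
    return {name: base / name for name in GI_DAEMON_DIRS}


if __name__ == "__main__":
    # Values taken from the shell prompts in this article.
    for name, path in gi_log_dirs("/opt/u01/app/11.2.0/grid", "bjdb1").items():
        print(f"{name}: {path}")
```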

Logs on node 1:

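# Check the GI alert log. The latest entries only warn that the storage used by GI is nearly full; it can be cleaned up later, and some space still remains, so this is clearly not the cause of the failure.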
root@bjdb1:/opt/u01/app/11.2.0/grid/log/bjdb1>tail -f alert*.log
2016-10-10 14:18:26.125:
[crflogd(39190674)]CRS-9520:The storage of Grid Infrastructure Management Repository is 93% full. The storage location is '/opt/u01/app/11.2.0/grid/crf/db/bjdb1'.
2016-10-10 14:23:31.125:
[crflogd(39190674)]CRS-9520:The storage of Grid Infrastructure Management Repository is 93% full. The storage location is '/opt/u01/app/11.2.0/grid/crf/db/bjdb1'.
2016-10-10 14:28:36.125:
[crflogd(39190674)]CRS-9520:The storage of Grid Infrastructure Management Repository is 93% full. The storage location is '/opt/u01/app/11.2.0/grid/crf/db/bjdb1'.
2016-10-10 14:33:41.125:
[crflogd(39190674)]CRS-9520:The storage of Grid Infrastructure Management Repository is 93% full. The storage location is '/opt/u01/app/11.2.0/grid/crf/db/bjdb1'.
2016-10-10 14:38:46.125:
[crflogd(39190674)]CRS-9520:The storage of Grid Infrastructure Management Repository is 93% full. The storage location is '/opt/u01/app/11.2.0/grid/crf/db/bjdb1'.

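# Since crsctl is unusable, check the crsd log next. Errors already appear on October 3: the raw device could not be opened, so OCR could not be initialized. The messages that follow show that access to the shared storage failed at that moment. The storage is suspected to have had a problem at that time; this needs to be confirmed with the on-site staff, in particular whether any storage-related work was under way then.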
root@bjdb1:/opt/u01/app/11.2.0/grid/log/bjdb1/crsd>tail -f crsd.log
2016-10-03 18:04:40.248: [  OCRRAW][1]proprinit: Could not open raw device
2016-10-03 18:04:40.248: [  OCRASM][1]proprasmcl: asmhandle is NULL
2016-10-03 18:04:40.252: [  OCRAPI][1]a_init:16!: Backend init unsuccessful : [26]
2016-10-03 18:04:40.253: [  CRSOCR][1] OCR context init failure.  Error: PROC-26: Error while accessing the physical storage

2016-10-03 18:04:40.253: [    CRSD][1] Created alert : (:CRSD00111:) :  Could not init OCR, error: PROC-26: Error while accessing the physical storage

2016-10-03 18:04:40.253: [    CRSD][1][PANIC] CRSD exiting: Could not init OCR, code: 26
2016-10-03 18:04:40.253: [    CRSD][1] Done.

Logs on node 2:

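# Check the GI alert log. ctssd on node 2 reports CRS-2409. MOS Doc ID 1135337.1 explains: "This is not an error. ctssd is reporting that there is a time difference and it is not doing anything about it as it is running in observer mode." According to that note, one only needs to check whether the clocks on the two nodes agree; when checked here, they did agree: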
root@bjdb2:/opt/u01/app/11.2.0/grid/log/bjdb2>tail -f alert*.log
2016-10-10 12:29:22.145:
[ctssd(5243030)]CRS-2409:The clock on host bjdb2 is not synchronous with the mean cluster time. No action has been taken as the Cluster Time Synchronization Service is running in observer mode.
2016-10-10 12:59:38.799:
[ctssd(5243030)]CRS-2409:The clock on host bjdb2 is not synchronous with the mean cluster time. No action has been taken as the Cluster Time Synchronization Service is running in observer mode.
2016-10-10 13:34:11.402:
[ctssd(5243030)]CRS-2409:The clock on host bjdb2 is not synchronous with the mean cluster time. No action has been taken as the Cluster Time Synchronization Service is running in observer mode.
2016-10-10 14:12:44.168:
[ctssd(5243030)]CRS-2409:The clock on host bjdb2 is not synchronous with the mean cluster time. No action has been taken as the Cluster Time Synchronization Service is running in observer mode.
2016-10-10 14:44:04.824:
[ctssd(5243030)]CRS-2409:The clock on host bjdb2 is not synchronous with the mean cluster time. No action has been taken as the Cluster Time Synchronization Service is running in observer mode.

# Check the crsd log on node 2: at nearly the same time as on node 1, access to the shared storage also failed, so OCR could not be initialized
root@bjdb2:/opt/u01/app/11.2.0/grid/log/bjdb2/crsd>tail -f crsd.log
2016-10-03 18:04:31.077: [  OCRRAW][1]proprinit: Could not open raw device
2016-10-03 18:04:31.077: [  OCRASM][1]proprasmcl: asmhandle is NULL
2016-10-03 18:04:31.081: [  OCRAPI][1]a_init:16!: Backend init unsuccessful : [26]
2016-10-03 18:04:31.081: [  CRSOCR][1] OCR context init failure.  Error: PROC-26: Error while accessing the physical storage

2016-10-03 18:04:31.082: [    CRSD][1] Created alert : (:CRSD00111:) :  Could not init OCR, error: PROC-26: Error while accessing the physical storage

2016-10-03 18:04:31.082: [    CRSD][1][PANIC] CRSD exiting: Could not init OCR, code: 26
2016-10-03 18:04:31.082: [    CRSD][1] Done.
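
To compare the failure times on the two nodes without reading the logs by eye, the PANIC lines can be extracted and their timestamps subtracted. This is a minimal sketch under the assumption that every crsd.log line of interest starts with a `YYYY-MM-DD HH:MM:SS.mmm` stamp, as in the excerpts above; the function name `panic_times` is hypothetical.

```python
from datetime import datetime


def panic_times(log_text: str) -> list[datetime]:
    """Return the timestamps of crsd.log lines that record a [PANIC] exit."""
    times = []
    for line in log_text.splitlines():
        if "[PANIC]" not in line:
            continue
        # The stamp "2016-10-03 18:04:40.253" occupies the first 23 characters.
        times.append(datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S.%f"))
    return times


if __name__ == "__main__":
    node1 = "2016-10-03 18:04:40.253: [    CRSD][1][PANIC] CRSD exiting: Could not init OCR, code: 26"
    node2 = "2016-10-03 18:04:31.082: [    CRSD][1][PANIC] CRSD exiting: Could not init OCR, code: 26"
    gap = panic_times(node1)[0] - panic_times(node2)[0]
    print(f"crsd exited {gap.total_seconds():.3f} s apart on the two nodes")
```

For the two excerpts above, the PANIC lines are about 9 seconds apart (18:04:31.082 on node 2 and 18:04:40.253 on node 1), which is consistent with both nodes losing the shared storage in the same event.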

Now log in as the grid user and check the state of the ASM disk groups: connect with sqlplus / as sysasm and query v$asm_diskgroup directly:

select name, state, total_mb, free_mb from v$asm_diskgroup;

The OCR_VOTE1 disk group turns out not to be mounted on either ASM instance:

SQL> select instance_name from v$instance;

INSTANCE_NAME
------------------------------------------------
+ASM2

SQL> select name, state, total_mb, free_mb from v$asm_diskgroup;

NAME                           STATE                               TOTAL_MB    FREE_MB
------------------------------ --------------------------------- ---------- ----------
DATA                           MOUNTED                               737280      88152
FRA_ARCHIVE                    MOUNTED                                10240       9287
OCR_VOTE1                      DISMOUNTED                                 0          0

The other node shows the same:

# Node 1, before mounting the OCR disk group
SQL> select name, state from v$asm_diskgroup;

NAME                           STATE
------------------------------ ---------------------------------
DATA                           MOUNTED
FRA_ARCHIVE                    MOUNTED
OCR_VOTE1                      DISMOUNTED

Next, check the core GI background processes:

# The crsd process is not running; the query returns no output
root@bjdb1:/>ps -ef|grep crsd.bin|grep -v grep

Likewise, the crsd process is not running on node 2.
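
The `ps -ef | grep ... | grep -v grep` idiom can also be expressed as a small parser over captured `ps -ef` output, which makes it easy to check several daemons at once. This is a sketch with hypothetical names; the default daemon list uses the binary names that appear in the transcripts of this article (`crsd.bin`, `ocssd.bin`, `evmd.bin`).

```python
def running_daemons(ps_output: str,
                    names: tuple[str, ...] = ("crsd.bin", "ocssd.bin", "evmd.bin")) -> set[str]:
    """Return the subset of `names` that appear as a command in `ps -ef` output."""
    found = set()
    for line in ps_output.splitlines():
        fields = line.split()
        if "grep" in fields:
            continue  # ignore the grep process itself, like `grep -v grep`
        for name in names:
            # Match either the bare binary name or a full path ending in it.
            if any(f == name or f.endswith("/" + name) for f in fields):
                found.add(name)
    return found


if __name__ == "__main__":
    sample = (
        "    root  8126466 25428158   0 15:52:50  pts/0  0:00 grep crsd.bin\n"
        "    grid 10747994 26542192   0 15:55:03      -  0:00 /opt/u01/app/11.2.0/grid/bin/ocssd.bin\n"
    )
    print(running_daemons(sample))  # only ocssd.bin is really running here
```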

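Summary of the current state: the failure occurred around 18:04 on October 3, when every node lost access to the shared storage and OCR initialization failed as a result. The crsd process is currently not running.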

3. Fixing the problem

3.1 Try mounting the OCR disk group manually

SQL> alter diskgroup ocr_vote1 mount;

Diskgroup altered.

SQL> select name, state from v$asm_diskgroup;

NAME                           STATE
------------------------------ ---------------------------------
DATA                           MOUNTED
FRA_ARCHIVE                    MOUNTED
OCR_VOTE1                      MOUNTED

3.2 Start CRS on node 1

At this point the crsd process is still not running:

# Confirm that crsd.bin is not currently running
root@bjdb1:/>ps -ef|grep crsd.bin|grep -v grep

A normal start of CRS on node 1 fails:

root@bjdb1:/>crsctl start crs
CRS-4640: Oracle High Availability Services is already active
CRS-4000: Command Start failed, or completed with errors.

A normal stop of CRS on node 1 fails as well:

root@bjdb1:/>crsctl stop crs
CRS-2796: The command may not proceed when Cluster Ready Services is not running
CRS-4687: Shutdown command has completed with errors.
CRS-4000: Command Stop failed, or completed with errors.

So what next? In the end, force-stopping CRS on node 1 and then starting it again succeeded:

# Force-stop CRS on node 1
root@bjdb1:/>crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'bjdb1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'bjdb1'
CRS-2673: Attempting to stop 'ora.crf' on 'bjdb1'
CRS-2673: Attempting to stop 'ora.ctssd' on 'bjdb1'
CRS-2673: Attempting to stop 'ora.evmd' on 'bjdb1'
CRS-2673: Attempting to stop 'ora.asm' on 'bjdb1'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'bjdb1'
CRS-2677: Stop of 'ora.evmd' on 'bjdb1' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'bjdb1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'bjdb1' succeeded

CRS-5017: The resource action "ora.crf stop" encountered the following error:
action for daemon aborted. For details refer to "(:CLSN00108:)" in "/opt/u01/app/11.2.0/grid/log/bjdb1/agent/ohasd/orarootagent_root/orarootagent_root.log".

CRS-2675: Stop of 'ora.crf' on 'bjdb1' failed
CRS-2679: Attempting to clean 'ora.crf' on 'bjdb1'
CRS-2681: Clean of 'ora.crf' on 'bjdb1' succeeded

CRS-2675: Stop of 'ora.asm' on 'bjdb1' failed
CRS-2679: Attempting to clean 'ora.asm' on 'bjdb1'
CRS-2681: Clean of 'ora.asm' on 'bjdb1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'bjdb1'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'bjdb1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'bjdb1'
CRS-2677: Stop of 'ora.cssd' on 'bjdb1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'bjdb1'
CRS-2677: Stop of 'ora.gipcd' on 'bjdb1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'bjdb1'
CRS-2677: Stop of 'ora.gpnpd' on 'bjdb1' succeeded
CRS-2677: Stop of 'ora.drivers.acfs' on 'bjdb1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'bjdb1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
root@bjdb1:/>

# Check resource status with crsctl; as expected, nothing is available now
root@bjdb1:/>crsctl stat res -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.

# Check crsd.bin, cssd.bin and evmd.bin: none of these processes remain
root@bjdb1:/>ps -ef|grep crsd.bin
    root  8126466 25428158   0 15:52:50  pts/0  0:00 grep crsd.bin
root@bjdb1:/>ps -ef|grep cssd.bin
    root  8126470 25428158   0 15:53:01  pts/0  0:00 grep cssd.bin
root@bjdb1:/>ps -ef|grep evmd.bin
    root 35520600 25428158   0 15:53:13  pts/0  0:00 grep evmd.bin
    
# Check for pmon processes: none remain either
root@bjdb1:/>ps -ef|grep pmon|grep -v grep
root@bjdb1:/>

# Try starting CRS again: success!
root@bjdb1:/>crsctl start crs
CRS-4123: Oracle High Availability Services has been started.

# Check resources with crsctl: still failing, so the stack is not fully up yet
root@bjdb1:/>crsctl stat res -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.

# After waiting a while, check the core GI background processes
root@bjdb1:/>ps -ef|grep crsd.bin|grep -v grep
root@bjdb1:/>ps -ef|grep cssd.bin|grep -v grep
    grid 10747994 26542192   0 15:55:03      -  0:00 /opt/u01/app/11.2.0/grid/bin/ocssd.bin
root@bjdb1:/>ps -ef|grep pmon
    root 39387390 25428158   0 15:57:23  pts/0  0:00 grep pmon
root@bjdb1:/>ps -ef|grep pmon|grep -v grep
    grid 39911466        1   0 15:58:47      -  0:00 asm_pmon_+ASM2
root@bjdb1:/>ps -ef|grep pmon|grep -v grep
    root 37814470 25428158   0 15:59:27  pts/0  0:00 grep pmon
    grid 39911466        1   0 15:58:47      -  0:00 asm_pmon_+ASM2
root@bjdb1:/>
root@bjdb1:/>ps -ef|grep crsd.bin
root@bjdb1:/>ps -ef|grep cssd.bin
    grid 10747994 26542192   0 15:55:03      -  0:00 /opt/u01/app/11.2.0/grid/bin/ocssd.bin

root@bjdb1:/>ps -ef|grep evmd.bin
    grid 40173760        1   0 15:57:10      -  0:00 /opt/u01/app/11.2.0/grid/bin/evmd.bin
root@bjdb1:/>ps -ef|grep crsd.bin
    root 37683238        1   0 15:59:54      -  0:01 /opt/u01/app/11.2.0/grid/bin/crsd.bin reboot
root@bjdb1:/>

# Once all core processes are up, check resources with crsctl again: the query now works and the resources are starting
root@bjdb1:/>crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       bjdb1
ora.LISTENER.lsnr
               ONLINE  ONLINE       bjdb1
ora.OCR_VOTE1.dg
               ONLINE  ONLINE       bjdb1
ora.asm
               ONLINE  ONLINE       bjdb1                    Started
ora.gsd
               OFFLINE OFFLINE      bjdb1
ora.net1.network
               ONLINE  ONLINE       bjdb1
ora.ons
               ONLINE  ONLINE       bjdb1
ora.registry.acfs
               ONLINE  ONLINE       bjdb1
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  OFFLINE
ora.bjdb.db
      1        ONLINE  OFFLINE
      2        ONLINE  OFFLINE
ora.bjdb1.vip
      1        ONLINE  ONLINE       bjdb1
ora.bjdb2.vip
      1        ONLINE  OFFLINE                               STARTING
ora.cvu
      1        ONLINE  ONLINE       bjdb1
ora.oc4j
      1        ONLINE  ONLINE       bjdb1
ora.scan1.vip
      1        ONLINE  OFFLINE                               STARTING

After waiting a while longer and querying again, all resources on node 1 are back to normal.

root@bjdb1:/>crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       bjdb1
ora.LISTENER.lsnr
               ONLINE  ONLINE       bjdb1
ora.OCR_VOTE1.dg
               ONLINE  ONLINE       bjdb1
ora.asm
               ONLINE  ONLINE       bjdb1                    Started
ora.gsd
               OFFLINE OFFLINE      bjdb1
ora.net1.network
               ONLINE  ONLINE       bjdb1
ora.ons
               ONLINE  ONLINE       bjdb1
ora.registry.acfs
               ONLINE  ONLINE       bjdb1
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  OFFLINE
ora.bjdb.db
      1        ONLINE  ONLINE       bjdb1                    Open
      2        ONLINE  OFFLINE
ora.bjdb1.vip
      1        ONLINE  ONLINE       bjdb1
ora.bjdb2.vip
      1        ONLINE  OFFLINE
ora.cvu
      1        ONLINE  ONLINE       bjdb1
ora.oc4j
      1        ONLINE  ONLINE       bjdb1
ora.scan1.vip
      1        ONLINE  OFFLINE

3.3 Resolve the high storage usage for GI on node 1

Continue watching the log on node 1:

grid@bjdb1:/opt/u01/app/11.2.0/grid/log/bjdb1>tail -f alert*.log
2016-10-10 16:03:25.373:
[crflogd(39780590)]CRS-9520:The storage of Grid Infrastructure Management Repository is 93% full. The storage location is '/opt/u01/app/11.2.0/grid/crf/db/bjdb1'.
2016-10-10 16:08:30.373:
[crflogd(39780590)]CRS-9520:The storage of Grid Infrastructure Management Repository is 93% full. The storage location is '/opt/u01/app/11.2.0/grid/crf/db/bjdb1'.
2016-10-10 16:09:50.796:
[ctssd(5046446)]CRS-2407:The new Cluster Time Synchronization Service reference node is host bjdb1.
2016-10-10 16:10:20.373:
[crflogd(39780590)]CRS-9520:The storage of Grid Infrastructure Management Repository is 93% full. The storage location is '/opt/u01/app/11.2.0/grid/crf/db/bjdb1'.
2016-10-10 16:15:25.373:
[crflogd(39780590)]CRS-9520:The storage of Grid Infrastructure Management Repository is 93% full. The storage location is '/opt/u01/app/11.2.0/grid/crf/db/bjdb1'.

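This warning was already seen earlier: space under /opt/u01 needs to be cleaned up. Find historical logs that can be deleted; once space has been freed, the warning will no longer appear.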
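
The original write-up does not say which files were removed. As a cautious starting point, the sketch below only lists files that have not been modified for a given number of days, largest first, and deletes nothing; which of them are actually safe to remove has to be decided per site. The function name `old_files` and the 30-day threshold in the example are assumptions made for illustration.

```python
import os
import time


def old_files(root: str, min_age_days: float) -> list[tuple[str, int]]:
    """List (path, size_in_bytes) of files under `root` not modified for `min_age_days`."""
    cutoff = time.time() - min_age_days * 86400
    result = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if st.st_mtime < cutoff:
                result.append((path, st.st_size))
    # Largest first, so the biggest potential savings are at the top.
    result.sort(key=lambda item: item[1], reverse=True)
    return result


if __name__ == "__main__":
    # Example: report candidates under the GI log directory used in this article.
    for path, size in old_files("/opt/u01/app/11.2.0/grid/log", 30)[:20]:
        print(f"{size:>12}  {path}")
```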

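3.4 Manually mount OCR on node 2 and restart CRS

With node 1 fixed, mount the OCR disk group on node 2 in the same way and then restart CRS. The procedure is identical, only run on node 2, so it is not repeated in detail.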

# Force-stop CRS on node 2
root@bjdb2:/>crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'bjdb2'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'bjdb2'
CRS-2673: Attempting to stop 'ora.crf' on 'bjdb2'
CRS-2673: Attempting to stop 'ora.ctssd' on 'bjdb2'
CRS-2673: Attempting to stop 'ora.evmd' on 'bjdb2'
CRS-2673: Attempting to stop 'ora.asm' on 'bjdb2'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'bjdb2'
CRS-2677: Stop of 'ora.crf' on 'bjdb2' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'bjdb2' succeeded
CRS-2677: Stop of 'ora.evmd' on 'bjdb2' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'bjdb2' succeeded
CRS-2675: Stop of 'ora.asm' on 'bjdb2' failed
CRS-2679: Attempting to clean 'ora.asm' on 'bjdb2'
CRS-2681: Clean of 'ora.asm' on 'bjdb2' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'bjdb2'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'bjdb2' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'bjdb2'
CRS-2677: Stop of 'ora.cssd' on 'bjdb2' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'bjdb2'
CRS-2677: Stop of 'ora.gipcd' on 'bjdb2' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'bjdb2'
CRS-2677: Stop of 'ora.gpnpd' on 'bjdb2' succeeded
CRS-2677: Stop of 'ora.drivers.acfs' on 'bjdb2' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'bjdb2' has completed
CRS-4133: Oracle High Availability Services has been stopped.

Then start it again: crsctl start crs

After waiting a while, check:

# The crsd process is now running
root@bjdb2:/>ps -ef|grep crsd.bin|grep -v grep
    root 22610148        1   0 16:24:15      -  0:01 /opt/u01/app/11.2.0/grid/bin/crsd.bin reboot

# Finally, crsctl shows that the resources are back to normal
root@bjdb2:/>crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       bjdb1
               ONLINE  ONLINE       bjdb2
ora.LISTENER.lsnr
               ONLINE  ONLINE       bjdb1
               ONLINE  ONLINE       bjdb2
ora.OCR_VOTE1.dg
               ONLINE  ONLINE       bjdb1
               ONLINE  ONLINE       bjdb2
ora.asm
               ONLINE  ONLINE       bjdb1                    Started
               ONLINE  ONLINE       bjdb2                    Started
ora.gsd
               OFFLINE OFFLINE      bjdb1
               OFFLINE OFFLINE      bjdb2
ora.net1.network
               ONLINE  ONLINE       bjdb1
               ONLINE  ONLINE       bjdb2
ora.ons
               ONLINE  ONLINE       bjdb1
               ONLINE  ONLINE       bjdb2
ora.registry.acfs
               ONLINE  ONLINE       bjdb1
               ONLINE  ONLINE       bjdb2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       bjdb2
ora.bjdb.db
      1        ONLINE  ONLINE       bjdb1                    Open
      2        ONLINE  ONLINE       bjdb2                    Open
ora.bjdb1.vip
      1        ONLINE  ONLINE       bjdb1
ora.bjdb2.vip
      1        ONLINE  ONLINE       bjdb2
ora.cvu
      1        ONLINE  ONLINE       bjdb1
ora.oc4j
      1        ONLINE  ONLINE       bjdb1
ora.scan1.vip
      1        ONLINE  ONLINE       bjdb2
      
# Check the listeners running on node 2; during the failure the SCAN listener was already on node 2
root@bjdb2:/>ps -ef|grep tns
    grid  5308430        1   0   Aug 17      -  5:05 /opt/u01/app/11.2.0/grid/bin/tnslsnr LISTENER_SCAN1 -inherit
    grid  5505240        1   1   Aug 17      - 27:23 /opt/u01/app/11.2.0/grid/bin/tnslsnr LISTENER -inherit
root@bjdb2:/>
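
The "start, wait, check again" pattern used on both nodes can be captured in a generic polling helper. This is a sketch only: `wait_until` and its parameters are hypothetical names, and the probe is whatever check is appropriate, for example a function that runs `ps -ef` and looks for `crsd.bin`.

```python
import time
from typing import Callable


def wait_until(probe: Callable[[], bool],
               timeout_s: float,
               interval_s: float = 10.0,
               sleep: Callable[[float], None] = time.sleep,
               clock: Callable[[], float] = time.monotonic) -> bool:
    """Call `probe` repeatedly until it returns True or `timeout_s` seconds have passed.

    Returns True as soon as the probe succeeds, False if the deadline is reached first.
    `sleep` and `clock` are injectable so the helper can be tested without real waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

In the node 1 transcript, ocssd.bin shows a start time of 15:55:03, evmd.bin 15:57:10 and crsd.bin 15:59:54, so a timeout of several minutes rather than seconds is a reasonable choice for this kind of check.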

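This completes the handling of the CRS-4535 / CRS-4000 failure on this RAC cluster. It is worth noting that from the moment the routine inspection found the fault and throughout the troubleshooting, the RAC database remained available to clients, which also speaks to the stability of RAC.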
