Analysis of CRS-1714: Unable to discover any voting files on 11.2.0.3 RAC

Published: 2023/12/8

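Conclusions:

   1. The process dependency mechanism of RAC in 11.2.0.3, and across Oracle versions in general, keeps evolving. Understanding the dependencies among the RAC processes is critical.
   2. CRS-1714: Unable to discover any voting files is only a symptom. It does not necessarily mean the voting disk is damaged; analyze the corresponding logs to find the real cause.
   3. If the profile.xml (OLR) configuration used by the GPNPD process on a RAC node is damaged or inconsistent, the affected node may have to be rebuilt.
   4. Before deleting or adding a RAC node, read the official manual carefully, because it distinguishes many different scenarios.
   5. Most important: if you get stuck while analyzing the logs, or have never seen a similar problem, search MOS by keyword, for example "GPnP profile" in this case.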

Analysis:

1. On a two-node 11.2.0.4 RAC running on Red Hat 6.4, the CRSD process did not start. The cluster alert log shows that the voting disk cannot be found:
2015-09-16 16:53:36.138
[cssd(25059)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/grid/11.2.0.4/log/jingfa1/cssd/ocssd.log
2015-09-16 16:53:51.176
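The alert entry already says where to look next. As a minimal sketch (the function name `ocssd_log_from_alert` is my own, and it assumes the alert line keeps the `Details at (...) in <path>` shape shown above), the detail log path can be pulled out of a CRS-1714 line like this:

```shell
#!/bin/sh
# Hypothetical helper: read alert log lines on stdin and print the detail
# log path named in any CRS-1714 message.
ocssd_log_from_alert() {
    sed -n 's|.*CRS-1714:.* Details at ([^)]*) in \(/[^ ]*\).*|\1|p'
}

# Sample line copied from the alert log excerpt above.
alert_line='[cssd(25059)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/grid/11.2.0.4/log/jingfa1/cssd/ocssd.log'

ocssd_log=$(printf '%s\n' "$alert_line" | ocssd_log_from_alert)
echo "$ocssd_log"
```

The printed path is the file to search for the `(:CSSNM00070:)` tag, which is where the real cause shows up later in this analysis.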




2. Run the following command on both nodes to stop all Oracle-related processes:
/u01/grid/11.2.0.4/bin/crsctl stop crs






3. Confirm that all Oracle processes are stopped on both nodes:


ps -ef|grep d.bin
root      1077 24425  0 09:00 pts/1    00:00:00 grep d.bin


4. Start CRS in exclusive mode on node 1:
/u01/grid/11.2.0.4/bin/crsctl start crs -excl -nocrs




5. On node 1, check whether the ASM processes have started




6. On node 1, check whether the cluster processes started in exclusive mode




7. On node 1, check whether the OCR is healthy:
/u01/grid/11.2.0.4/bin/ocrcheck






8. If the OCR is not healthy and a backup exists, restore the OCR from the backup:
/u01/grid/11.2.0.4/bin/ocrconfig -showbackup




/u01/grid/11.2.0.4/bin/ocrconfig -restore <ocr_backup_file>




9. On node 1, as the grid user, check whether the disk group for the OCR and voting disk exists. It does:
  1* select disk_number,path from v$asm_disk
SQL> /


DISK_NUMBER PATH
----------- --------------------------------------------------
          0 /dev/ocr_vote
          0 /dev/data


SQL> 
SQL> 
SQL> show parameter disk_


NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_diskgroups                       string      DATA
asm_diskstring                       string      /dev/*
SQL> select name,sector_size,block_size,allocation_unit_size/1024/1024 as au_mb from v$asm_diskgroup;


NAME                           SECTOR_SIZE BLOCK_SIZE      AU_MB
------------------------------ ----------- ---------- ----------
DATA                                   512       4096          2
OCRVOTE                                512       4096          2




10. On node 1, check whether the voting disk is working. It indeed cannot be found:
/u01/grid/11.2.0.4/bin/crsctl query css votedisk


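11. The asm_diskgroups value in step 9 shows that only the DATA disk group is mounted automatically, not OCRVOTE. So I changed the parameter to make the ASM instance mount both OCRVOTE and DATA at startup,
    expecting the voting disk disk group to be mounted automatically whenever the ASM instance starts.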




alter system set asm_diskgroups=data,ocrvote sid='*';






show parameter disk_
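The comparison behind this step can be sketched in a few lines of shell. This is only an illustration under my own assumptions: the function name `missing_from_asm_diskgroups` is hypothetical, and the inputs are typed in by hand from the query output in step 9 rather than fetched from ASM.

```shell
#!/bin/sh
# Hypothetical helper.
# $1: the comma-separated asm_diskgroups value.
# Remaining arguments: disk group names listed in v$asm_diskgroup.
# Prints every disk group that is not named in asm_diskgroups.
missing_from_asm_diskgroups() {
    configured=$(printf '%s' "$1" | tr 'a-z' 'A-Z' | tr -d ' ')
    shift
    for dg in "$@"; do
        name=$(printf '%s' "$dg" | tr 'a-z' 'A-Z')
        case ",$configured," in
            *",$name,"*) ;;                  # already listed
            *) printf '%s\n' "$name" ;;      # not listed
        esac
    done
}

# Values taken from the step 9 output: asm_diskgroups=DATA, groups DATA and OCRVOTE.
missing=$(missing_from_asm_diskgroups "DATA" DATA OCRVOTE)
echo "$missing"
```

With the values from step 9 this reports OCRVOTE, which is the disk group added to the parameter above. As the following steps show, fixing the parameter alone did not resolve the problem.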


12. Stop the CRS stack on node 1:
/u01/grid/11.2.0.4/bin/crsctl stop crs


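13. Restart the cluster stack on both nodes and check whether the crsd process is healthy. The problem persists: the voting disk still cannot be found.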
/u01/grid/11.2.0.4/bin/crsctl start crs


14. Stop the cluster stack on both nodes, then start it in exclusive mode on node 1:


/u01/grid/11.2.0.4/bin/crsctl stop crs


/u01/grid/11.2.0.4/bin/crsctl start crs -excl -nocrs


15. On node 1, repair the voting disk by replacing it onto the ocrvote disk group:
/u01/grid/11.2.0.4/bin/crsctl replace votedisk +ocrvote


16. On node 1, check whether the voting disk is healthy:
/u01/grid/11.2.0.4/bin/crsctl query css votedisk


17. Stop the cluster stack on the node, then restart the cluster stack on both nodes:
/u01/grid/11.2.0.4/bin/crsctl stop crs


/u01/grid/11.2.0.4/bin/crsctl start crs






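 18. On both nodes, confirm that the voting disk works (the command below returns a result only once the CRSD process is up, otherwise the output is empty; CRSD is the last of the cluster processes to start). Node 1 is now healthy, but CRSD on node 2 still does not start.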
 /u01/grid/11.2.0.4/bin/crsctl query css votedisk




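19. The grid user's trace file on node 2 shows that the cluster GUID recorded in the voting disk does not match the one in the GPnP profile, so node 2 ultimately cannot discover the voting disk: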
2015-09-16 17:58:51.847: [    CSSD][1851041536]clssnmvDiskVerify: discovered a potential voting file
2015-09-16 17:58:51.847: [   SKGFD][1851041536]Handle 0x7fd95808f980 from lib :UFS:: for disk :/dev/ocr_vote:




 --- Here the cluster GUID found in the voting disk differs from the cluster GUID obtained from the GPnP profile
2015-09-16 17:58:51.965: [    CSSD][1851041536]clssnmvDiskCreate: Cluster guid 0acef774f25dcfb0bf3d0c7b3db02abe found in voting disk /dev/ocr_vote does not match with the 
cluster guid 7d8026436ade6fe0ff597a0f6df497e1 obtained from the GPnP profile
-- The voting disk is removed
2015-09-16 17:58:51.965: [    CSSD][1851041536]clssnmvDiskDestroy: removing the voting disk /dev/ocr_vote
2015-09-16 17:58:51.965: [   SKGFD][1851041536]Lib :UFS:: closing handle 0x7fd95808f980 for disk :/dev/ocr_vote:
-- No voting disk is found
2015-09-16 17:58:51.965: [    CSSD][1851041536]clssnmvDiskVerify: Successful discovery of 0 disks
2015-09-16 17:58:51.965: [    CSSD][1851041536]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
2015-09-16 17:58:51.965: [    CSSD][1851041536]clssnmvFindInitialConfigs: No voting files found
2015-09-16 17:58:51.965: [    CSSD][1851041536](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds
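To make the mismatch easier to spot in a long trace, the two GUIDs can be extracted from the `clssnmvDiskCreate` message. The following is a minimal sketch, not an Oracle tool: the function name `extract_guid_mismatch` is my own, and it assumes the message sits on a single line in ocssd.log (it is wrapped in the excerpt above).

```shell
#!/bin/sh
# Hypothetical helper: for each mismatch message on stdin, print
# "<guid in voting disk> <guid in GPnP profile>".
extract_guid_mismatch() {
    sed -n 's/.*Cluster guid \([0-9a-f]\{32\}\) found in voting disk .* cluster guid \([0-9a-f]\{32\}\) obtained from the GPnP profile.*/\1 \2/p'
}

# The message from the trace above, joined onto one line.
sample='2015-09-16 17:58:51.965: [    CSSD][1851041536]clssnmvDiskCreate: Cluster guid 0acef774f25dcfb0bf3d0c7b3db02abe found in voting disk /dev/ocr_vote does not match with the cluster guid 7d8026436ade6fe0ff597a0f6df497e1 obtained from the GPnP profile'

guids=$(printf '%s\n' "$sample" | extract_guid_mismatch)
echo "$guids"
```

The first value is what the voting disk carries and the second is what node 2's GPnP profile claims. These are the two strings to search for in the files examined in the next steps.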


21. On node 2, let's see what the GPnP process is:
[grid@jingfa2 jingfa2]$ ps -ef|grep -i gpnp
grid      5238 32255  0 10:02 pts/1    00:00:00 grep -i gpnp
grid     18060     1  0 09:45 ?        00:00:01 /u01/grid/11.2.0.4/bin/gpnpd.bin


22. On node 2, find out where the GPnP profile file lives:
[grid@jingfa2 gpnpd]$ locate gpnp|grep -i --color profile
/u01/grid/11.2.0.4/gpnp/profiles
/u01/grid/11.2.0.4/gpnp/jingfa2/profiles
/u01/grid/11.2.0.4/gpnp/jingfa2/profiles/peer
/u01/grid/11.2.0.4/gpnp/jingfa2/profiles/peer/pending.xml
/u01/grid/11.2.0.4/gpnp/jingfa2/profiles/peer/profile.old
/u01/grid/11.2.0.4/gpnp/jingfa2/profiles/peer/profile.xml  -- probably this file
/u01/grid/11.2.0.4/gpnp/jingfa2/profiles/peer/profile_orig.xml
/u01/grid/11.2.0.4/gpnp/profiles/peer
/u01/grid/11.2.0.4/gpnp/profiles/peer/profile.xml
/u01/grid/11.2.0.4/gpnp/profiles/peer/profile_orig.xml




23. Inspect the GPnP profile on node 2. The file /u01/grid/11.2.0.4/gpnp/jingfa2/profiles/peer/profile.xml contains the GUID 7d8026436ade6fe0ff597a0f6df497e1, so this is indeed the file.
    I also compared it with the same file on node 1, which contains 0acef774f25dcfb0bf3d0c7b3db02abe. So I tried updating the GUID by hand, replacing 7d8026436ade6fe0ff597a0f6df497e1 with 0acef774f25dcfb0bf3d0c7b3db02abe.


0acef774f25dcfb0bf3d0c7b3db02abe


[grid@jingfa2 gpnpd]$ more /u01/grid/11.2.0.4/gpnp/jingfa2/profiles/peer/profile.xml|grep -i --color 7d8026436ade6fe0ff597a0f6df497e1
<?xml version="1.0" encoding="UTF-8"?><gpnp:GPnP-Profile Version="1.0" xmlns="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:gpnp="http://www.grid-pnp.org/2005/11/gpnp-profile" 
xmlns:orcl="http://www.oracle.com/gpnp/2005/11/gpnp-profile" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.grid-pnp.org/2005/11/gpnp-profile 
gpnp-profile.xsd" ProfileSequence="7" ClusterUId="7d8026436ade6fe0ff597a0f6df497e1" ClusterName="jingfa-scan" PALocation=""><gpnp:Network-Profile><gpnp:HostNetwork id="gen" 
HostName="*"><gpnp:Network id="net1" IP="192.168.0.0" Adapter="eth0" Use="public"/><gpnp:Network id="net2" IP="10.0.0.0" Adapter="eth1" 
Use="cluster_interconnect"/></gpnp:HostNetwork></gpnp:Network-Profile><orcl:CSS-Profile id="css" DiscoveryString="+asm" 
LeaseDuration="400"/><orcl:ASM-Profile id="asm" DiscoveryString="/dev/ocr*" 
SPFile="+OCRVOTE/jingfa-scan/asmparameterfile/registry.253.849167179"/><ds:Signature 
xmlns:ds="http://www.w3.org/2000/09/xmldsig#"><ds:SignedInfo><ds:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/>
<ds:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/><ds:Reference URI=""><ds:Transforms>
<ds:Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/><ds:Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"> 
<InclusiveNamespaces xmlns="http://www.w3.org/2001/10/xml-exc-c14n#" PrefixList="gpnp orcl xsi"/></ds:Transform></ds:Transforms>
<ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/><ds:DigestValue>cPtosOiD17nSId/92MTAPaQ+dLU=</ds:DigestValue></ds:Reference>
</ds:SignedInfo><ds:SignatureValue>Ca56sx6DgsCSxrRqPz2ReOzhkf9eYiqVYuj2XLadwuBURX2PL+nYD7LhLFFj27EpuSIx0SfGVhOPm/i016ws7tWATeSKBJDVyTAELgBEYPsMumW4vKm7rVXs
SbVJolycA3pFHtGqZ7FZjzSXxdj5Xq4LlBLGVWR3gYKnqxuRGv0=</ds:SignatureValue>
</ds:Signature></gpnp:GPnP-Profile>
[grid@jingfa2 gpnpd]$ 
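Rather than reading the whole XML by eye, the `ClusterUId` attribute can be extracted and compared with the GUID reported for the voting disk. This is a minimal sketch with a hypothetical function name (`cluster_uid_from_profile`); it uses `grep -o`, which GNU and BSD grep support, and a shortened copy of the profile's root element as sample input.

```shell
#!/bin/sh
# Hypothetical helper: read a GPnP profile on stdin and print the value of
# its ClusterUId attribute.
cluster_uid_from_profile() {
    grep -o 'ClusterUId="[^"]*"' | head -n 1 | cut -d'"' -f2
}

# Shortened root element taken from the profile.xml output above.
sample_profile='<gpnp:GPnP-Profile Version="1.0" ProfileSequence="7" ClusterUId="7d8026436ade6fe0ff597a0f6df497e1" ClusterName="jingfa-scan" PALocation="">'

# GUID that the trace in step 19 reports for the voting disk.
voting_disk_guid=0acef774f25dcfb0bf3d0c7b3db02abe

profile_guid=$(printf '%s\n' "$sample_profile" | cluster_uid_from_profile)
if [ "$profile_guid" = "$voting_disk_guid" ]; then
    status=match
else
    status=mismatch
fi
echo "$profile_guid $status"
```

On a real node the function would read the file directly, for example `cluster_uid_from_profile < /u01/grid/11.2.0.4/gpnp/jingfa2/profiles/peer/profile.xml`.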


24. Back up the file on node 2 before changing it:
cp /u01/grid/11.2.0.4/gpnp/jingfa2/profiles/peer/profile.xml  /u01/grid/11.2.0.4/gpnp/jingfa2/profiles/peer/profile.xml.20150917bak


vi /u01/grid/11.2.0.4/gpnp/jingfa2/profiles/peer/profile.xml


:s/7d8026436ade6fe0ff597a0f6df497e1/0acef774f25dcfb0bf3d0c7b3db02abe/g


Then save the file.




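25. Restart the cluster stack on node 2. The cluster stack on node 1 restarted as well, and oddly the change made in step 24 had been reverted. I forced the change again and restarted the cluster stack on node 2.
     Repeated attempts show that the gpnp process restores this file, so editing it by hand has no effect.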


26. Since the approach above does not work, try another one: compare the agent processes on the two nodes:


[root@jingfa1 ~]# ps -ef|grep agent|grep grid|grep -v grep
grid      3647     1  0 09:44 ?        00:00:10 /u01/grid/11.2.0.4/bin/oraagent.bin
root      3660     1  0 09:44 ?        00:00:36 /u01/grid/11.2.0.4/bin/orarootagent.bin
grid      5793     1  0 09:45 ?        00:00:01 /u01/grid/11.2.0.4/bin/scriptagent.bin
oracle    5938     1  0 09:45 ?        00:00:20 /u01/grid/11.2.0.4/bin/oraagent.bin
grid     23427     1  0 09:43 ?        00:00:16 /u01/grid/11.2.0.4/bin/oraagent.bin
root     23656     1  0 09:43 ?        00:00:39 /u01/grid/11.2.0.4/bin/orarootagent.bin
root     23818     1  0 09:43 ?        00:00:19 /u01/grid/11.2.0.4/bin/cssdagent     


[grid@jingfa2 ctssd]$  ps -ef|grep agent|grep grid|grep -v grep
root     17274     1  0 11:31 ?        00:00:01 /u01/grid/11.2.0.4/bin/cssdagent
grid     31975     1  0 11:21 ?        00:00:01 /u01/grid/11.2.0.4/bin/oraagent.bin
root     32064     1  0 11:21 ?        00:00:00 /u01/grid/11.2.0.4/bin/orarootagent.bin
[grid@jingfa2 ctssd]$ 
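The two listings are easier to compare once reduced to "user binary" pairs. The sketch below does that with `awk`, `sort`, and `comm` on the output copied from above; the function name `agents_of` is my own, and it assumes the standard `ps -ef` column layout in which the command is the eighth field.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -ef` lines on stdin and print one sorted
# "user binary" pair per process.
agents_of() {
    awk '{ n = split($8, parts, "/"); print $1, parts[n] }' | LC_ALL=C sort
}

# Process listings copied from the two nodes above.
node1_ps='grid      3647     1  0 09:44 ?        00:00:10 /u01/grid/11.2.0.4/bin/oraagent.bin
root      3660     1  0 09:44 ?        00:00:36 /u01/grid/11.2.0.4/bin/orarootagent.bin
grid      5793     1  0 09:45 ?        00:00:01 /u01/grid/11.2.0.4/bin/scriptagent.bin
oracle    5938     1  0 09:45 ?        00:00:20 /u01/grid/11.2.0.4/bin/oraagent.bin
grid     23427     1  0 09:43 ?        00:00:16 /u01/grid/11.2.0.4/bin/oraagent.bin
root     23656     1  0 09:43 ?        00:00:39 /u01/grid/11.2.0.4/bin/orarootagent.bin
root     23818     1  0 09:43 ?        00:00:19 /u01/grid/11.2.0.4/bin/cssdagent'
node2_ps='root     17274     1  0 11:31 ?        00:00:01 /u01/grid/11.2.0.4/bin/cssdagent
grid     31975     1  0 11:21 ?        00:00:01 /u01/grid/11.2.0.4/bin/oraagent.bin
root     32064     1  0 11:21 ?        00:00:00 /u01/grid/11.2.0.4/bin/orarootagent.bin'

tmp1=$(mktemp)
tmp2=$(mktemp)
printf '%s\n' "$node1_ps" | agents_of > "$tmp1"
printf '%s\n' "$node2_ps" | agents_of > "$tmp2"

# Lines present on node 1 with no counterpart on node 2.
only_on_node1=$(LC_ALL=C comm -23 "$tmp1" "$tmp2")
rm -f "$tmp1" "$tmp2"
printf '%s\n' "$only_on_node1"
```

For the listings above this leaves four agents that run only on node 1: a second `oraagent.bin` for grid, `scriptagent.bin`, the oracle user's `oraagent.bin`, and a second `orarootagent.bin`. That is in line with CRSD being down on node 2.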


27. I found an article via Baidu and decided to copy profile.xml directly from node 1 to node 2 as a replacement:


-- Stop the cluster stack on both nodes
/u01/grid/11.2.0.4/bin/crsctl stop crs


-- Back up profile.xml on node 2
cd /u01/grid/11.2.0.4/gpnp/jingfa2/profiles/peer/
cp profile.xml profile.xml.20150918bak




-- Copy profile.xml from node 1 to node 2 as a replacement
rm profile.xml


cd /u01/grid/11.2.0.4/gpnp/jingfa1/profiles/peer
scp profile.xml grid@192.168.0.31:/u01/grid/11.2.0.4/gpnp/jingfa2/profiles/peer


28. Start the cluster stack on node 2. The error persists:
/u01/grid/11.2.0.4/bin/crsctl start crs


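29. gpnp.log on node 2 shows that profile.xml gets its information from the local OLR. I tried deleting node 2's local OLR, hoping the GPnP process would then fetch profile.xml from node 1:
/u01/grid/11.2.0.4/bin/ocrcheck -local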
Status of Oracle Local Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       2508
         Available space (kbytes) :     259612
         ID                       : 1774858304
         Device/File Name         : /u01/grid/11.2.0.4/cdata/jingfa2.olr
                                    Device/File integrity check succeeded


         Local registry integrity check succeeded


         Logical corruption check succeeded


-- Back up the OLR file on node 2
cp  /u01/grid/11.2.0.4/cdata/jingfa2.olr /u01/grid/11.2.0.4/cdata/jingfa2.olr.20150918bak




-- Delete the OLR file on node 2
rm -rf  /u01/grid/11.2.0.4/cdata/jingfa2.olr


/u01/grid/11.2.0.4/bin/crsctl stop crs


-- Start the cluster stack on node 2
/u01/grid/11.2.0.4/bin/crsctl start crs


--- With node 2's OLR file removed, the entire cluster stack on node 2 fails to start
[ohasd(6836)]CRS-0704:Oracle High Availability Service aborted due to Oracle Local Registry error [PROCL-26: Error while accessing the
 physical storage Operating System error [No such file or directory] [2]]. Details at (:OHAS00106:) in /u01/grid/11.2.0.4/log/jingfa2/ohasd/ohasd.log.


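30. The official manual explains the OLR concept: each RAC node stores only the cluster resource information that concerns that node, so the OLR file differs from node to node.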
Clusterware Administration and Deployment Guide
3 Managing Oracle Cluster Registry and Voting Disks
In Oracle Clusterware 11g release 2 (11.2), each node in a cluster has a local registry for node-specific resources, called an Oracle Local Registry (OLR), 
that is installed and configured when Oracle Clusterware installs OCR


31. According to MOS, the only option appears to be rebuilding node 2:
Cluster guid found in voting disk does not match with the cluster guid obtained from the GPnP profile (Doc ID 1281791.1)


32. Delete node 2 from node 1. The command reports that node 2 cannot be found:
[root@jingfa1 ~]# /u01/grid/11.2.0.4/bin/crsctl delete node -n jingfa2
CRS-4660: Could not find node jingfa2 to delete.
CRS-4000: Command Delete failed, or completed with errors.


33. On node 2, update the node list for the node being deleted:
[grid@jingfa2 ~]$ /u01/grid/11.2.0.4/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES=jingfa2" crs=true -silent -local
Starting Oracle Universal Installer...


Checking swap space: must be greater than 500 MB.   Actual 4094 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /home/grid/oraInventory
'UpdateNodeList' was successful.


34. On node 2, deinstall the cluster software and its configuration:
[grid@jingfa2 ~]$ /u01/grid/11.2.0.4/deinstall/deinstall -local
Checking for required files and bootstrapping ...
Please wait ...
Location of logs /home/grid/oraInventory/logs/


############ ORACLE DEINSTALL & DECONFIG TOOL START ############




######################### CHECK OPERATION START #########################
## [START] Install check configuration ##




Checking for existence of the Oracle home location /u01/grid/11.2.0.4
Oracle Home type selected for deinstall is: Oracle Grid Infrastructure for a Cluster
Oracle Base selected for deinstall is: /u01/app/grid
Checking for existence of central inventory location /home/grid/oraInventory
Checking for existence of the Oracle Grid Infrastructure home /u01/grid/11.2.0.4
The following nodes are part of this cluster: jingfa2
Checking for sufficient temp space availability on node(s) : 'jingfa2'


## [END] Install check configuration ##


Traces log file: /home/grid/oraInventory/logs//crsdc.log
Enter an address or the name of the virtual IP used on node "jingfa2"[jingfa2-vip]
 > 


The following information can be collected by running "/sbin/ifconfig -a" on node "jingfa2"
Enter the IP netmask of Virtual IP "192.168.0.23" on node "jingfa2"[255.255.255.0]
 > 


Enter the network interface name on which the virtual IP address "192.168.0.23" is active
 > 


Enter an address or the name of the virtual IP[]
 > 




Network Configuration check config START


Network de-configuration trace file location: /home/grid/oraInventory/logs/netdc_check2015-09-19_08-30-44-AM.log


Specify all RAC listeners (do not include SCAN listener) that are to be de-configured [LISTENER,LISTENER_SCAN1]:


Network Configuration check config END


Asm Check Configuration START


ASM de-configuration trace file location: /home/grid/oraInventory/logs/asmcadc_check2015-09-19_08-31-13-AM.log




######################### CHECK OPERATION END #########################




####################### CHECK OPERATION SUMMARY #######################
Oracle Grid Infrastructure Home is: /u01/grid/11.2.0.4
The cluster node(s) on which the Oracle home deinstallation will be performed are:jingfa2
Since -local option has been specified, the Oracle home will be deinstalled only on the local node, 'jingfa2', and the global configuration will be removed.
Oracle Home selected for deinstall is: /u01/grid/11.2.0.4
Inventory Location where the Oracle home registered is: /home/grid/oraInventory
Following RAC listener(s) will be de-configured: LISTENER,LISTENER_SCAN1
Option -local will not modify any ASM configuration.
Do you want to continue (y - yes, n - no)? [n]: y
A log of this session will be written to: '/home/grid/oraInventory/logs/deinstall_deconfig2015-09-19_08-26-31-AM.out'
Any error messages from this session will be written to: '/home/grid/oraInventory/logs/deinstall_deconfig2015-09-19_08-26-31-AM.err'


######################## CLEAN OPERATION START ########################
ASM de-configuration trace file location: /home/grid/oraInventory/logs/asmcadc_clean2015-09-19_08-31-36-AM.log
ASM Clean Configuration END


Network Configuration clean config START


Network de-configuration trace file location: /home/grid/oraInventory/logs/netdc_clean2015-09-19_08-31-36-AM.log


De-configuring RAC listener(s): LISTENER,LISTENER_SCAN1


De-configuring listener: LISTENER
    Stopping listener on node "jingfa2": LISTENER
    Warning: Failed to stop listener. Listener may not be running.
Listener de-configured successfully.


De-configuring listener: LISTENER_SCAN1
    Stopping listener on node "jingfa2": LISTENER_SCAN1
    Warning: Failed to stop listener. Listener may not be running.
Listener de-configured successfully.


De-configuring backup files...
Backup files de-configured successfully.


The network configuration has been cleaned up successfully.


Network Configuration clean config END




---------------------------------------->
At this point the tool asks you to run the following script as root on node 2
The deconfig command below can be executed in parallel on all the remote nodes. Execute the command on  the local node after the execution completes on all the remote nodes.


Run the following command as the root user or the administrator on node "jingfa2".


/tmp/deinstall2015-09-19_08-24-22AM/perl/bin/perl -I/tmp/deinstall2015-09-19_08-24-22AM/perl/lib -I/tmp/deinstall2015-09-19_08-24-22AM/crs/install /tmp/deinstall2015-09-19_08-24-22AM/crs/install/rootcrs.pl -force  -deconfig -paramfile "/tmp/deinstall2015-09-19_08-24-22AM/response/deinstall_Ora11g_gridinfrahome1.rsp"


Press Enter after you finish running the above commands


<----------------------------------------


35. On node 2, run the script shown above as root:
[root@jingfa2 ~]# /tmp/deinstall2015-09-19_08-24-22AM/perl/bin/perl -I/tmp/deinstall2015-09-19_08-24-22AM/perl/lib -I/tmp/deinstall2015-09-19_08-24-22AM/crs/install /tmp/deinstall2015-09-19_08-24-22AM/crs/install/rootcrs.pl -force  -deconfig -paramfile "/tmp/deinstall2015-09-19_08-24-22AM/response/deinstall_Ora11g_gridinfrahome1.rsp"
Using configuration parameter file: /tmp/deinstall2015-09-19_08-24-22AM/response/deinstall_Ora11g_gridinfrahome1.rsp


PRCR-1119 : Failed to look up CRS resources of ora.cluster_vip_net1.type type
PRCR-1068 : Failed to query resources
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.gsd is registered
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.ons is registered
Cannot communicate with crsd


CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'jingfa2'
CRS-2673: Attempting to stop 'ora.crf' on 'jingfa2'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'jingfa2'
CRS-2677: Stop of 'ora.crf' on 'jingfa2' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'jingfa2'
CRS-2677: Stop of 'ora.mdnsd' on 'jingfa2' succeeded
CRS-2677: Stop of 'ora.gipcd' on 'jingfa2' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'jingfa2'
CRS-2677: Stop of 'ora.gpnpd' on 'jingfa2' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'jingfa2' has completed
CRS-4133: Oracle High Availability Services has been stopped.
Successfully deconfigured Oracle clusterware stack on this node




36. Back on node 2, let the deinstall session from step 34 run to completion (press Enter):
Remove the directory: /tmp/deinstall2015-09-19_08-24-22AM on node: 
Setting the force flag to false
Setting the force flag to cleanup the Oracle Base
Oracle Universal Installer clean START


Detach Oracle home '/u01/grid/11.2.0.4' from the central inventory on the local node : Done


Delete directory '/u01/grid/11.2.0.4' on the local node : Done


Failed to delete the directory '/u01/app/grid'. The directory is in use.
Delete directory '/u01/app/grid' on the local node : Failed <<<<


Oracle Universal Installer cleanup completed with errors.


Oracle Universal Installer clean END




## [START] Oracle install clean ##


Clean install operation removing temporary directory '/tmp/deinstall2015-09-19_08-24-22AM' on node 'jingfa2'


## [END] Oracle install clean ##




######################### CLEAN OPERATION END #########################




####################### CLEAN OPERATION SUMMARY #######################
Following RAC listener(s) were de-configured successfully: LISTENER,LISTENER_SCAN1
Oracle Clusterware is stopped and successfully de-configured on node "jingfa2"
Oracle Clusterware is stopped and de-configured successfully.
Successfully detached Oracle home '/u01/grid/11.2.0.4' from the central inventory on the local node.
Successfully deleted directory '/u01/grid/11.2.0.4' on the local node.
Failed to delete directory '/u01/app/grid' on the local node.
Oracle Universal Installer cleanup completed with errors.


Oracle deinstall tool successfully cleaned up temporary directories.
#######################################################################




############# ORACLE DEINSTALL & DECONFIG TOOL END #############


[grid@jingfa2 ~]$ 




37. On node 1, update the cluster node list:
[grid@jingfa1 ~]$ /u01/grid/11.2.0.4/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES=jingfa1" crs=true -silent
Starting Oracle Universal Installer...


Checking swap space: must be greater than 500 MB.   Actual 4094 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /home/grid/oraInventory
'UpdateNodeList' was successful.


38. On node 1, update the cluster node list as the oracle user:
[oracle@jingfa1 ~]$ /u01/app/oracle/product/11.2.0.4/db_1/oui/bin/runInstaller  -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES=jingfa1"
Starting Oracle Universal Installer...


Checking swap space: must be greater than 500 MB.   Actual 4094 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /home/grid/oraInventory
'UpdateNodeList' was successful.






39. Confirm that node 2 has been removed from the cluster:
[grid@jingfa1 ~]$ /u01/grid/11.2.0.4/bin/cluvfy stage -post nodedel -n jingfa2


Performing post-checks for node removal 


Checking CRS integrity...


Clusterware version consistency passed


CRS integrity check passed


Node removal check passed


Post-check for node removal was successful. 




40. Prepare to add node 2 back to the cluster from node 1. First verify that node 2 can be added by running the following command:
/u01/grid/11.2.0.4/bin/cluvfy stage -pre nodeadd -n jingfa2


 If the following error appears, re-establish SSH user equivalence between the nodes:
ERROR: 
PRVF-7610 : Cannot verify user equivalence/reachability on existing cluster nodes
Verification cannot proceed




ERROR: 
PRVF-7617 : Node connectivity between "jingfa1 : 192.168.0.21" and "jingfa1 : 192.168.0.22" failed
TCP connectivity check failed for subnet "192.168.0.0"




Cause: the firewall was not turned off on Linux


Stopping the firewall fixes it:
service iptables save
service iptables stop
chkconfig iptables off


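41. On node 1, verify again whether node 2 can be added to the cluster. In effect this checks whether node 2's hardware and software meet the requirements for running the cluster.
The output covers node equivalence, installed packages, NTP, DNS, multicast, and more: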
[grid@jingfa1 ~]$ /u01/grid/11.2.0.4/bin/cluvfy stage -pre nodeadd -n jingfa2


Performing pre-checks for node addition 


Checking node reachability...
Node reachability check passed from node "jingfa1"




Checking user equivalence...
User equivalence check passed for user "grid"


Checking node connectivity...


Checking hosts config file...


Verification of the hosts config file successful


Check: Node connectivity for interface "eth0"
Node connectivity passed for interface "eth0"
TCP connectivity check passed for subnet "192.168.0.0"


Checking subnet mask consistency...
Subnet mask consistency check passed for subnet "192.168.0.0".
Subnet mask consistency check passed.


Node connectivity check passed


Checking multicast communication...


Checking subnet "192.168.0.0" for multicast communication with multicast group "230.0.1.0"...
Check of subnet "192.168.0.0" for multicast communication with multicast group "230.0.1.0" passed.


Check of multicast communication passed.


Checking CRS integrity...


Clusterware version consistency passed


CRS integrity check passed


Checking shared resources...


Checking CRS home location...
"/u01/grid/11.2.0.4" is shared
Shared resources check for node addition passed




Checking node connectivity...


Checking hosts config file...


Verification of the hosts config file successful


Check: Node connectivity for interface "eth0"
Node connectivity passed for interface "eth0"
TCP connectivity check passed for subnet "192.168.0.0"




Check: Node connectivity for interface "eth1"
Node connectivity passed for interface "eth1"
TCP connectivity check passed for subnet "10.0.0.0"


Checking subnet mask consistency...
Subnet mask consistency check passed for subnet "192.168.0.0".
Subnet mask consistency check passed for subnet "10.0.0.0".
Subnet mask consistency check passed.


Node connectivity check passed


Checking multicast communication...


Checking subnet "192.168.0.0" for multicast communication with multicast group "230.0.1.0"...
Check of subnet "192.168.0.0" for multicast communication with multicast group "230.0.1.0" passed.


Checking subnet "10.0.0.0" for multicast communication with multicast group "230.0.1.0"...
Check of subnet "10.0.0.0" for multicast communication with multicast group "230.0.1.0" passed.


Check of multicast communication passed.
Total memory check passed
Available memory check passed
Swap space check passed
Free disk space check passed for "jingfa2:/u01/grid/11.2.0.4,jingfa2:/tmp"
Free disk space check failed for "jingfa1:/u01/grid/11.2.0.4,jingfa1:/tmp"
Check failed on nodes: 
        jingfa1
Check for multiple users with UID value 1101 passed 
User existence check passed for "grid"
Run level check passed
Hard limits check passed for "maximum open file descriptors"
Soft limits check passed for "maximum open file descriptors"
Hard limits check passed for "maximum user processes"
Soft limits check passed for "maximum user processes"
System architecture check passed
Kernel version check passed
Kernel parameter check passed for "semmsl"
Kernel parameter check passed for "semmns"
Kernel parameter check passed for "semopm"
Kernel parameter check passed for "semmni"
Kernel parameter check passed for "shmmax"
Kernel parameter check passed for "shmmni"
Kernel parameter check passed for "shmall"
Kernel parameter check passed for "file-max"
Kernel parameter check passed for "ip_local_port_range"
Kernel parameter check passed for "rmem_default"
Kernel parameter check passed for "rmem_max"
Kernel parameter check passed for "wmem_default"
Kernel parameter check passed for "wmem_max"
Kernel parameter check passed for "aio-max-nr"
Package existence check passed for "make"
Package existence check passed for "binutils"
Package existence check passed for "gcc(x86_64)"
Package existence check passed for "libaio(x86_64)"
Package existence check passed for "glibc(x86_64)"
Package existence check passed for "compat-libstdc++-33(x86_64)"
Package existence check passed for "elfutils-libelf(x86_64)"
Package existence check passed for "elfutils-libelf-devel"
Package existence check passed for "glibc-common"
Package existence check passed for "glibc-devel(x86_64)"
Package existence check passed for "glibc-headers"
Package existence check passed for "gcc-c++(x86_64)"
Package existence check passed for "libaio-devel(x86_64)"
Package existence check passed for "libgcc(x86_64)"
Package existence check passed for "libstdc++(x86_64)"
Package existence check passed for "libstdc++-devel(x86_64)"
Package existence check passed for "sysstat"
Package existence check failed for "pdksh"
Check failed on nodes: 
        jingfa2,jingfa1
Package existence check passed for "expat(x86_64)"
Check for multiple users with UID value 0 passed 
Current group ID check passed


Starting check for consistency of primary group of root user


Check for consistency of root user's primary group passed


Checking OCR integrity...


OCR integrity check passed


Checking Oracle Cluster Voting Disk configuration...


Oracle Cluster Voting Disk configuration check passed
Time zone consistency check passed


Starting Clock synchronization checks using Network Time Protocol(NTP)...


NTP Configuration file check started...
NTP Configuration file check passed
No NTP Daemons or Services were found to be running
PRVF-5507 : NTP daemon or service is not running on any node but NTP configuration file exists on the following node(s):
jingfa2,jingfa1
Clock synchronization check using Network Time Protocol(NTP) failed




User "grid" is not part of "root" group. Check passed
Checking consistency of file "/etc/resolv.conf" across nodes


File "/etc/resolv.conf" does not have both domain and search entries defined
domain entry in file "/etc/resolv.conf" is consistent across nodes
search entry in file "/etc/resolv.conf" is consistent across nodes
PRVF-5636 : The DNS response time for an unreachable node exceeded "15000" ms on following nodes: jingfa1,jingfa2


File "/etc/resolv.conf" is not consistent across nodes




Pre-check for node addition was unsuccessful on all the nodes. 
[grid@jingfa1 ~]$ 




42. Add node 2 from node 1:


[grid@jingfa1 ~]$ /u01/grid/11.2.0.4/oui/bin/addNode.sh "CLUSTER_NEW_NODES={jingfa2}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={jingfa2-vip}"


(similar output omitted)
ERROR: 
PRVF-7617 : Node connectivity between "jingfa1 : 192.168.0.21" and "jingfa1 : 192.168.0.24" failed


ERROR: 
PRVF-7617 : Node connectivity between "jingfa1 : 192.168.0.21" and "jingfa1 : 192.168.0.23" failed
TCP connectivity check failed for subnet "192.168.0.0"


Checking VIP configuration.
Checking VIP Subnet configuration.
Check for VIP Subnet configuration passed.
Checking VIP reachability
PRVF-10209 : VIPs "jingfa2-vip" are active before Clusterware installation




43. As the output in step 42 indicates, stop node 2's VIP resource and delete it:
[root@jingfa1 ~]# /u01/grid/11.2.0.4/bin/crsctl stop res  ora.jingfa2.vip
CRS-2673: Attempting to stop 'ora.jingfa2.vip' on 'jingfa1'
CRS-2677: Stop of 'ora.jingfa2.vip' on 'jingfa1' succeeded


[root@jingfa1 ~]# /u01/grid/11.2.0.4/bin/crsctl delete res  ora.jingfa2.vip


Confirm that node 2's VIP resource has been cleaned up:
[root@jingfa1 ~]# /u01/grid/11.2.0.4/bin/crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       jingfa1                                      
(unrelated output omitted)
ora.jingfa1.vip
      1        ONLINE  ONLINE       jingfa1                                      
ora.oc4j
      1        ONLINE  ONLINE       jingfa1                                      
ora.scan1.vip
      1        ONLINE  ONLINE       jingfa1




44. Skip the pre-add-node checks and add node 2 from node 1:
[grid@jingfa1 ~]$ export IGNORE_PREADDNODE_CHECKS=Y
[grid@jingfa1 ~]$ /u01/grid/11.2.0.4/oui/bin/addNode.sh "CLUSTER_NEW_NODES={jingfa2}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={jingfa2-vip}"
Starting Oracle Universal Installer...


Checking swap space: must be greater than 500 MB.   Actual 4067 MB    Passed
Oracle Universal Installer, Version 11.2.0.3.0 Production
Copyright (C) 1999, 2011, Oracle. All rights reserved.




Performing tests to see whether nodes jingfa2 are available
............................................................... 100% Done.


.
-----------------------------------------------------------------------------
Cluster Node Addition Summary
Global Settings
   Source: /u01/grid/11.2.0.4
   New Nodes
Space Requirements
   New Nodes
      jingfa2
         /: Required 5.01GB : Available 8.51GB
Installed Products
   Product Names
      Oracle Grid Infrastructure 11.2.0.3.0 
      Sun JDK 1.5.0.30.03 
      Installer SDK Component 11.2.0.3.0 
      (similar output omitted)


      SQL*Plus 11.2.0.3.0 
      Oracle Netca Client 11.2.0.3.0 
      Oracle Net 11.2.0.3.0 
      Oracle JVM 11.2.0.3.0 
      Oracle Internet Directory Client 11.2.0.3.0 
      Oracle Net Listener 11.2.0.3.0 
      Cluster Ready Services Files 11.2.0.3.0 
      Oracle Database 11g 11.2.0.3.0 
-----------------------------------------------------------------------------




Instantiating scripts for add node (Saturday, September 19, 2015 10:53:28 AM GMT+08:00)
.                                                                 1% Done.
Instantiation of add node scripts complete


Copying to remote nodes (Saturday, September 19, 2015 10:53:33 AM GMT+08:00)
...............................................................................................                                 96% Done.
Home copied to new nodes


Saving inventory on nodes (Saturday, September 19, 2015 11:05:15 AM GMT+08:00)
.                                                               100% Done.
Save inventory complete
WARNING:A new inventory has been created on one or more nodes in this session. However, it has not yet been registered as the central inventory of this system. 
To register the new inventory please run the script at '/home/grid/oraInventory/orainstRoot.sh' with root privileges on nodes 'jingfa2'.
If you do not register the inventory, you may not be able to update or patch the products you installed.
The following configuration scripts need to be executed as the "root" user in each new cluster node. Each script in the list below is followed by a list of nodes.
/home/grid/oraInventory/orainstRoot.sh #On nodes jingfa2
/u01/grid/11.2.0.4/root.sh #On nodes jingfa2
To execute the configuration scripts:
    1. Open a terminal window
    2. Log in as "root"
    3. Run the scripts in each cluster node
    
The Cluster Node Addition of /u01/grid/11.2.0.4 was successful.
Please check '/tmp/silentInstall.log' for more details.
[grid@jingfa1 ~]$ 


45. On node 2, run the shell scripts listed in step 44 as root:
[root@jingfa2 ~]# /home/grid/oraInventory/orainstRoot.sh
Changing permissions of /home/grid/oraInventory.
Adding read,write permissions for group.
Removing read,write,execute permissions for world.


Changing groupname of /home/grid/oraInventory to oinstall.
The execution of the script is complete.
[root@jingfa2 ~]# /u01/grid/11.2.0.4/root.sh
Performing root user operation for Oracle 11g 


The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME=  /u01/grid/11.2.0.4


Enter the full pathname of the local bin directory: [/usr/local/bin]: 
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.




Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/grid/11.2.0.4/crs/install/crsconfig_params
Creating trace directory
User ignored Prerequisites during installation
OLR initialization - successful
Adding Clusterware entries to upstart
CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node jingfa1, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
clscfg: EXISTING configuration version 5 detected.
clscfg: version 5 is 11g Release 2.
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Preparing packages for installation...
cvuqdisk-1.0.9-1
Configure Oracle Grid Infrastructure for a Cluster ... succeeded
[root@jingfa2 ~]# 




Source: ITPUB blog, http://blog.itpub.net/9240380/viewspace-1804149/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
