ORACLE/RAC  2014. 2. 22. 12:10

Steps to Remove Node from Cluster When the Node Crashes Due to OS/Hardware Failure and cannot boot up (Doc ID 466975.1)

Modified: 16-Nov-2012    Type: HOWTO



In this Document

Goal

Fix

  Summary

  Example Configuration

  Initial Stage

  Step 1 Remove oifcfg information for the failed node

  Step 2 Remove ONS information

  Step 3 Remove resources

  Step 4 Execute rootdeletenode.sh

  Step 5 Update the Inventory

References

APPLIES TO:


Oracle Server - Enterprise Edition - Version 10.2.0.1 to 11.1.0.6 [Release 10.2 to 11.1]

Oracle Server - Standard Edition - Version 10.2.0.1 to 11.1.0.6 [Release 10.2 to 11.1]

Information in this document applies to any platform.


Oracle Clusterware



GOAL


This document is intended to provide the steps to remove a node from an Oracle cluster when the node itself is unavailable due to an OS or hardware issue that prevents it from starting up. It provides the steps to remove such a node so that it can be added back after the node is fixed.


The steps to remove a node from a cluster are already documented in the Oracle documentation at


Version Documentation Link

10gR2 http://download.oracle.com/docs/cd/B19306_01/rac.102/b14197/adddelunix.htm#BEIFDCAF

11gR1 http://download.oracle.com/docs/cd/B28359_01/rac.111/b28255/adddelclusterware.htm#BEIFDCAF

This note is different because the documentation covers the scenario where the node is accessible and the removal is a planned procedure. This note covers the scenario where the Node is unable to boot up and therefore it is not possible to run the clusterware commands from this node.


For 11gR2, refer to note 1262925.1


 


FIX


Summary


Basically, all the steps documented in the Oracle Clusterware Administration and Deployment Guide must be followed. The difference here is that we skip the steps that would be executed on the node which is not available, and we run some extra commands on a node that is going to remain in the cluster to remove the resources belonging to the node being removed.
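
For orientation, the commands executed in the steps below (all run from a surviving node, using the example node names of this note, with lc2n3 as the failed node) boil down to the following; see the individual steps for prerequisites, prompts and expected output.

$CRS_HOME/bin/oifcfg delif -node lc2n3                        # Step 1 - only if 'oifcfg getif' does not show global
$CRS_HOME/bin/racgons remove_config lc2n3:6200                # Step 2 - remote port taken from ons.config
srvctl remove instance -d LC2DB1 -i LC2DB13                   # Step 3 - database instance, from the database home
srvctl remove asm -n lc2n3                                    # Step 3 - ASM instance, from the ASM home
$CRS_HOME/bin/crs_unregister ora.lc2n3.LISTENER_LC2N3.lsnr    # Step 3 - listener, 10.2 only (11.1: srvctl remove listener -n lc2n3)
srvctl stop nodeapps -n lc2n3                                 # Step 3 - as root
srvctl remove nodeapps -n lc2n3                               # Step 3 - as root
$CRS_HOME/install/rootdeletenode.sh lc2n3,3                   # Step 4 - as root, node number from 'olsnodes -n'
$CRS_HOME/oui/bin/runInstaller -updateNodeList "CLUSTER_NODES={lc2n1,lc2n2}"   # Step 5 - once per home, with the matching ORACLE_HOME (add CRS=TRUE for the clusterware home)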


Example Configuration


 All steps outlined in this document were executed on a cluster with the following configuration:


Item Value

Node Names lc2n1, lc2n2, lc2n3

Operating System Oracle Enterprise Linux 5 Update 4

Oracle Clusterware Release 10.2.0.5.0

ASM & Database Release 10.2.0.5.0

Clusterware Home /u01/app/oracle/product/10.2.0/crs ($CRS_HOME)

ASM Home /u01/app/oracle/product/10.2.0/asm

Database Home /u01/app/oracle/product/10.2.0/db_1

 Cluster Name lc2

 


Assume that node lc2n3 is down due to a hardware failure and cannot even boot up. The plan is to remove it from the clusterware, fix the issue and then add it back to the clusterware. In this document, we will cover the steps to remove the node from the clusterware.


Please note that, for better readability, the sample script 'crsstat' from

  Doc ID 259301.1 CRS and 10g/11.1 Real Application Clusters

was used instead of 'crs_stat -t' to query the state of the CRS resources. This script is not part of a standard CRS installation.
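
The actual script is attached to that note. Purely as an illustration, and not the supported script itself, a minimal wrapper producing similar Name/Target/State/Host columns could look roughly like this, assuming the record-style output of plain 'crs_stat':

#!/bin/bash
# Hypothetical sketch only - use the script from Doc ID 259301.1 in practice.
# crs_stat (without -t) prints NAME=/TARGET=/STATE= records for each resource;
# this reformats them into one line per resource, splitting the host off the STATE field.
$CRS_HOME/bin/crs_stat | awk -F= '
  /^NAME=/   { name = $2 }
  /^TARGET=/ { target = $2 }
  /^STATE=/  { state = $2; host = "";
               if (split(state, parts, " on ") == 2) { state = parts[1]; host = parts[2] }
               printf "%-40s %-10s %-10s %s\n", name, target, state, host }'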

 


Initial Stage


At this stage, the Oracle Clusterware is up and running on nodes lc2n1 & lc2n2 (the good nodes). Node lc2n3 is down and cannot be accessed. Note that the virtual IP of lc2n3 is running on node 1. The rest of the lc2n3 resources are OFFLINE:


[oracle@lc2n1 ~]$ crsstat

Name                                     Target     State      Host      

-------------------------------------------------------------------------------

ora.LC2DB1.LC2DB11.inst                  ONLINE     ONLINE     lc2n1     

ora.LC2DB1.LC2DB12.inst                  ONLINE     ONLINE     lc2n2     

ora.LC2DB1.LC2DB13.inst                  ONLINE     OFFLINE              

ora.LC2DB1.LC2DB1_SRV1.LC2DB11.srv       ONLINE     ONLINE     lc2n1     

ora.LC2DB1.LC2DB1_SRV1.LC2DB12.srv       ONLINE     ONLINE     lc2n2     

ora.LC2DB1.LC2DB1_SRV1.LC2DB13.srv       ONLINE     OFFLINE              

ora.LC2DB1.LC2DB1_SRV1.cs                ONLINE     ONLINE     lc2n1     

ora.LC2DB1.db                            ONLINE     ONLINE     lc2n2     

ora.lc2n1.ASM1.asm                       ONLINE     ONLINE     lc2n1     

ora.lc2n1.LISTENER_LC2N1.lsnr            ONLINE     ONLINE     lc2n1     

ora.lc2n1.gsd                            ONLINE     ONLINE     lc2n1     

ora.lc2n1.ons                            ONLINE     ONLINE     lc2n1     

ora.lc2n1.vip                            ONLINE     ONLINE     lc2n1     

ora.lc2n2.ASM2.asm                       ONLINE     ONLINE     lc2n2     

ora.lc2n2.LISTENER_LC2N2.lsnr            ONLINE     ONLINE     lc2n2     

ora.lc2n2.gsd                            ONLINE     ONLINE     lc2n2     

ora.lc2n2.ons                            ONLINE     ONLINE     lc2n2     

ora.lc2n2.vip                            ONLINE     ONLINE     lc2n2     

ora.lc2n3.ASM3.asm                       ONLINE     OFFLINE              

ora.lc2n3.LISTENER_LC2N3.lsnr            ONLINE     OFFLINE              

ora.lc2n3.gsd                            ONLINE     OFFLINE              

ora.lc2n3.ons                            ONLINE     OFFLINE              

ora.lc2n3.vip                            ONLINE     ONLINE     lc2n1     

[oracle@lc2n1 ~]$

 


Step 1 Remove oifcfg information for the failed node


Generally, most installations use the global flag of the oifcfg command and can therefore skip this step. This can be confirmed using:


[oracle@lc2n1 bin]$ $CRS_HOME/bin/oifcfg getif

eth0  192.168.56.0  global  public

eth1  192.168.57.0  global  cluster_interconnect

If the output of the command returns global as shown above, then you can skip the following step (executing the command below on a global definition will return an error, as shown below).


If the output of the oifcfg getif command does not return global then use the following command


[oracle@lc2n1 bin]$ $CRS_HOME/bin/oifcfg delif -node lc2n3 

PROC-4: The cluster registry key to be operated on does not exist.

PRIF-11: cluster registry error

 


Step 2 Remove ONS information


Execute the following command to find out the remote port number to be used:


[oracle@lc2n1 bin]$ cat $CRS_HOME/opmn/conf/ons.config

localport=6113 

remoteport=6200 

loglevel=3

useocr=on

and remove the information pertaining to the node to be deleted using:


[oracle@lc2n1 bin]$ $CRS_HOME/bin/racgons remove_config lc2n3:6200

 


Step 3 Remove resources


In this step, the resources that were defined on this node have to be removed. These resources include the database instance, ASM, listener and nodeapps resources. A list of these can be acquired by running the crsstat (crs_stat -t) command from any node:


[oracle@lc2n1 ~]$ crsstat |grep OFFLINE

ora.LC2DB1.LC2DB13.inst                  ONLINE     OFFLINE              

ora.LC2DB1.LC2DB1_SRV1.LC2DB13.srv       ONLINE     OFFLINE              

ora.lc2n3.ASM3.asm                       ONLINE     OFFLINE              

ora.lc2n3.LISTENER_LC2N3.lsnr            ONLINE     OFFLINE              

ora.lc2n3.gsd                            ONLINE     OFFLINE              

ora.lc2n3.ons                            ONLINE     OFFLINE             

 Before removing any resource it is recommended to take a backup of the OCR:


[root@lc2n1 ~]# cd $CRS_HOME/cdata/lc2

[root@lc2n1 lc2]# $CRS_HOME/bin/ocrconfig -export ocr_before_node_removal.exp

[root@lc2n1 lc2]# ls -l ocr_before_node_removal.exp

-rw-r--r-- 1 root root 151946 Nov 15 15:24 ocr_before_node_removal.exp
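
If the OCR ever needs to be restored from this logical export (for example, to roll back the resource removal), keep in mind that for a 10.2/11.1 logical export the clusterware has to be stopped on all remaining nodes before importing; roughly:

[root@lc2n1 lc2]# $CRS_HOME/bin/crsctl stop crs                                 # as root, on every remaining node
[root@lc2n1 lc2]# $CRS_HOME/bin/ocrconfig -import ocr_before_node_removal.exp   # as root, on one node
[root@lc2n1 lc2]# $CRS_HOME/bin/crsctl start crs                                # as root, on every remaining node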

 Use 'srvctl' from the database home to delete the database instance on node 3:


[oracle@lc2n1 ~]$ . oraenv

ORACLE_SID = [oracle] ? LC2DB1

[oracle@lc2n1 ~]$ $ORACLE_HOME/bin/srvctl remove instance -d LC2DB1 -i LC2DB13

Remove instance LC2DB13 from the database LC2DB1? (y/[n]) y

 Use 'srvctl' from the ASM home to delete the ASM instance on node 3:


[oracle@lc2n1 ~]$ . oraenv

ORACLE_SID = [oracle] ? +ASM1

[oracle@lc2n1 ~]$ $ORACLE_HOME/bin/srvctl remove asm -n lc2n3

Next remove the listener resource.


Please note that there is no 'srvctl remove listener' subcommand prior to 11.1, so this command will not work in 10.2. Using 'netca' to delete the listener from a down node is also not an option, as netca needs to remove the listener configuration from the listener.ora on that node.

10.2 only:


The only way to remove the listener resource is to use the command 'crs_unregister'; please use this command only in this particular scenario:


[oracle@lc2n1 lc2]$ $CRS_HOME/bin/crs_unregister ora.lc2n3.LISTENER_LC2N3.lsnr

 11.1 only:


 Set the environment to the home from which the listener runs (ASM or database):


[oracle@lc2n1 ~]$ . oraenv

ORACLE_SID = [oracle] ? +ASM1

[oracle@lc2n1 lc2]$ $ORACLE_HOME/bin/srvctl remove listener -n lc2n3 

  As user root stop the nodeapps resources:


[root@lc2n1 oracle]# $CRS_HOME/bin/srvctl stop nodeapps -n lc2n3

[root@lc2n1 oracle]# crsstat |grep OFFLINE

ora.lc2n3.LISTENER_LC2N3.lsnr            OFFLINE    OFFLINE              

ora.lc2n3.gsd                            OFFLINE    OFFLINE              

ora.lc2n3.ons                            OFFLINE    OFFLINE              

ora.lc2n3.vip                            OFFLINE    OFFLINE        

 Now remove them:


[root@lc2n1 oracle]#  $CRS_HOME/bin/srvctl remove nodeapps -n lc2n3

Please confirm that you intend to remove the node-level applications on node lc2n3 (y/[n]) y

 At this point all resources from the bad node should be gone:


[oracle@lc2n1 ~]$ crsstat 

Name                                     Target     State      Host      

-------------------------------------------------------------------------------

ora.LC2DB1.LC2DB11.inst                  ONLINE     ONLINE     lc2n1     

ora.LC2DB1.LC2DB12.inst                  ONLINE     ONLINE     lc2n2     

ora.LC2DB1.LC2DB1_SRV1.LC2DB11.srv       ONLINE     ONLINE     lc2n1     

ora.LC2DB1.LC2DB1_SRV1.LC2DB12.srv       ONLINE     ONLINE     lc2n2     

ora.LC2DB1.LC2DB1_SRV1.cs                ONLINE     ONLINE     lc2n1     

ora.LC2DB1.db                            ONLINE     ONLINE     lc2n2     

ora.lc2n1.ASM1.asm                       ONLINE     ONLINE     lc2n1     

ora.lc2n1.LISTENER_LC2N1.lsnr            ONLINE     ONLINE     lc2n1     

ora.lc2n1.gsd                            ONLINE     ONLINE     lc2n1     

ora.lc2n1.ons                            ONLINE     ONLINE     lc2n1     

ora.lc2n1.vip                            ONLINE     ONLINE     lc2n1     

ora.lc2n2.ASM2.asm                       ONLINE     ONLINE     lc2n2     

ora.lc2n2.LISTENER_LC2N2.lsnr            ONLINE     ONLINE     lc2n2     

ora.lc2n2.gsd                            ONLINE     ONLINE     lc2n2     

ora.lc2n2.ons                            ONLINE     ONLINE     lc2n2     

ora.lc2n2.vip                            ONLINE     ONLINE     lc2n2  

 


Step 4 Execute rootdeletenode.sh


From a node that you are not deleting, execute the following command, which will help find out the node number of the node that you want to delete:


[oracle@lc2n1 ~]$ $CRS_HOME/bin/olsnodes -n

lc2n1   1

lc2n2   2

lc2n3   3

This number can be passed to the rootdeletenode.sh command, which is to be executed as root from any node that is going to remain in the cluster.


[root@lc2n1 ~]# cd $CRS_HOME/install

[root@lc2n1 install]# ./rootdeletenode.sh lc2n3,3

CRS-0210: Could not find resource 'ora.lc2n3.ons'.

CRS-0210: Could not find resource 'ora.lc2n3.vip'.

CRS-0210: Could not find resource 'ora.lc2n3.gsd'.

CRS-0210: Could not find resource ora.lc2n3.vip.

CRS nodeapps are deleted successfully

clscfg: EXISTING configuration version 3 detected.

clscfg: version 3 is 10G Release 2.

Successfully deleted 14 values from OCR.

Key SYSTEM.css.interfaces.nodelc2n3 marked for deletion is not there. Ignoring.

Successfully deleted 5 keys from OCR.

Node deletion operation successful.

'lc2n3,3' deleted successfully

[root@lc2n1 install]# $CRS_HOME/bin/olsnodes -n

lc2n1   1

lc2n2   2

 


Step 5 Update the Inventory


From a node which is going to remain in the cluster, run the following command as the owner of the CRS_HOME. The argument to be passed to CLUSTER_NODES is a comma-separated list of the node names that are going to remain in the cluster. This step needs to be performed once per home (Clusterware, ASM and RDBMS homes).


[oracle@lc2n1 install]$ $CRS_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=/u01/app/oracle/product/10.2.0/crs "CLUSTER_NODES={lc2n1,lc2n2}" CRS=TRUE  

Starting Oracle Universal Installer...


No pre-requisite checks found in oraparam.ini, no system pre-requisite checks will be executed.

The inventory pointer is located at /etc/oraInst.loc

The inventory is located at /u01/app/oracle/oraInventory

'UpdateNodeList' was successful.


[oracle@lc2n1 install]$ $CRS_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=/u01/app/oracle/product/10.2.0/asm "CLUSTER_NODES={lc2n1,lc2n2}"

Starting Oracle Universal Installer...


No pre-requisite checks found in oraparam.ini, no system pre-requisite checks will be executed.

The inventory pointer is located at /etc/oraInst.loc

The inventory is located at /u01/app/oracle/oraInventory

'UpdateNodeList' was successful.

[oracle@lc2n1 install]$ $CRS_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=/u01/app/oracle/product/10.2.0/db_1 "CLUSTER_NODES={lc2n1,lc2n2}"

Starting Oracle Universal Installer...


No pre-requisite checks found in oraparam.ini, no system pre-requisite checks will be executed.

The inventory pointer is located at /etc/oraInst.loc

The inventory is located at /u01/app/oracle/oraInventory

'UpdateNodeList' was successful.

Posted by [PineTree]
ORACLE/RAC  2013. 5. 30. 11:27


Source: http://leejehong.tistory.com/170


[root@RAC1 ~]# srvctl config database -d devdb -v
Database unique name: devdb
Database name: devdb
Oracle home: /u01/app/11.2.0/db
Oracle user: oracle

Spfile: /dev/raw/raw6
Domain: 
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: devdb
Database instances: devdb1,devdb2
Disk Groups: 
Mount point paths: 
Services: 
Type: RAC
Database is administrator managed
[root@RAC1 ~]#


oracle@RAC1:/>crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora....ER.lsnr ora....er.type ONLINE    ONLINE    rac1        
ora....N1.lsnr ora....er.type ONLINE    ONLINE    rac2        
ora.cvu        ora.cvu.type   ONLINE    ONLINE    rac1        
ora.devdb.db   ora....se.type OFFLINE   OFFLINE               
ora.gsd        ora.gsd.type   ONLINE    ONLINE    rac1        
ora....network ora....rk.type ONLINE    ONLINE    rac1        
ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    rac1        
ora.ons        ora.ons.type   ONLINE    ONLINE    rac1        
ora....C1.lsnr application    ONLINE    ONLINE    rac1        
ora.rac1.gsd   application    ONLINE    ONLINE    rac1        
ora.rac1.ons   application    ONLINE    ONLINE    rac1        
ora.rac1.vip   ora....t1.type ONLINE    ONLINE    rac1        
ora....C2.lsnr application    ONLINE    ONLINE    rac2        
ora.rac2.gsd   application    ONLINE    ONLINE    rac2        
ora.rac2.ons   application    ONLINE    ONLINE    rac2        
ora.rac2.vip   ora....t1.type ONLINE    ONLINE    rac2        
ora....ry.acfs ora....fs.type OFFLINE   OFFLINE               
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    rac2        
oracle@RAC1:/>
oracle@RAC1:/>

The database does not start automatically when the server is rebooted.


oracle@RAC1:/>crs_stat -p ora.devdb.db |grep -i start
AUTO_START=restore
GEN_START_OPTIONS@SERVERNAME(rac1)=open
GEN_START_OPTIONS@SERVERNAME(rac2)=open
RESTART_ATTEMPTS=2
START_TIMEOUT=600
oracle@RAC1:/>

Because AUTO_START is set to restore, the database is not started automatically.

# Note
* AUTO_START=1 (always)  -> when CRS restarts, the resource is brought online regardless of its previous state, as long as the hardware is healthy.
* AUTO_START=2 (never)   -> all resources must be started manually.
* AUTO_START=0 (restore) -> each resource is returned to the state it was in before it was brought down.
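
To check the AUTO_START setting of all registered resources at once rather than one resource at a time, the profile output can simply be filtered, for example:

oracle@RAC1:/>crs_stat -p | egrep '^NAME=|^AUTO_START='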

[root@RAC1 profile]# crs_stat -p ora.devdb.db |grep -i start
AUTO_START=restore
GEN_START_OPTIONS@SERVERNAME(rac1)=open
GEN_START_OPTIONS@SERVERNAME(rac2)=open
RESTART_ATTEMPTS=2
START_TIMEOUT=600

[root@RAC1 profile]# 
[root@RAC1 profile]# crsctl modify resource "ora.devdb.db" -attr "AUTO_START=always"
[root@RAC1 profile]# crs_stat -p ora.devdb.db |grep -i start
AUTO_START=always
GEN_START_OPTIONS@SERVERNAME(rac1)=open
GEN_START_OPTIONS@SERVERNAME(rac2)=open
RESTART_ATTEMPTS=2
START_TIMEOUT=600
[root@RAC1 profile]#

Changing Resource Attributes in 11gR2 Grid Infrastructure 
In 11gR2 grid infrastructure installations, certain resources may have AUTO_START set to never or restore.
This has been observed both in environments where the clusterware was upgraded to 11.2 and in newly installed environments.
Depending on the situation these settings may not be desirable. The AUTO_START attribute can be changed as follows.

1. Check the current auto start values

# crsctl stat res -p
NAME=ora.FLASH.dg
TYPE=ora.diskgroup.type
ACL=owner:oracle:rwx,pgrp:oinstall:rwx,other::r--
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX%
ALIAS_NAME=
AUTO_START=never     

NAME=ora.DATA.dg
TYPE=ora.diskgroup.type
ACL=owner:oracle:rwx,pgrp:oinstall:rwx,other::r--
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX%
ALIAS_NAME=
AUTO_START=never    

NAME=ora.clusdb.db
TYPE=ora.database.type
ACL=owner:oracle:rwx,pgrp:oinstall:rwx,other::r--
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
ACTIVE_PLACEMENT=1
AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX%
AUTO_START=restore

2. Since the ASM diskgroups that the database depends on will never auto start, the database will also be unavailable (the dependency can be seen in the database resource profile, as shown below).
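
For example, with the resource names used above, the dependency can be listed with:

# crsctl stat res ora.clusdb.db -p | grep -i dependencies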

3. Change the resource start attribute with

# crsctl modify resource "ora.FLASH.dg" -attr "AUTO_START=always"
# crsctl modify resource "ora.DATA.dg" -attr "AUTO_START=always"
# crsctl modify resource ora.clusdb.db -attr "AUTO_START=always"
The attribute name must be in upper case; if not, the command will fail:
crsctl modify resource ora.clusdb.db -attr "auto_start=always"
CRS-0160: The attribute 'auto_start' is not supported in this resource type.
CRS-4000: Command Modify failed, or completed with errors.


4. Verify the status change with 
# crsctl stat res -p
NAME=ora.clusdb.db
TYPE=ora.database.type
ACL=owner:oracle:rwx,pgrp:oinstall:rwx,other::r--
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
ACTIVE_PLACEMENT=1
AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX%
AUTO_START=always

Posted by [PineTree]