
Steps to Remove Node from Cluster When the Node Crashes Due to OS/Hardware Failure and Cannot Boot Up (Doc ID 466975.1)

Modified: 16-Nov-2012    Type: HOWTO



In this Document

Goal

Fix

  Summary

  Example Configuration

  Initial Stage

  Step 1 Remove oifcfg information for the failed node

  Step 2 Remove ONS information

  Step 3 Remove resources

  Step 4 Execute rootdeletenode.sh

  Step 5 Update the Inventory

References

APPLIES TO:


Oracle Server - Enterprise Edition - Version 10.2.0.1 to 11.1.0.6 [Release 10.2 to 11.1]

Oracle Server - Standard Edition - Version 10.2.0.1 to 11.1.0.6 [Release 10.2 to 11.1]

Information in this document applies to any platform.


Oracle Clusterware



GOAL


This document is intended to provide the steps to remove a node from an Oracle cluster when the node itself is unavailable due to an OS or hardware issue that prevents it from starting up. It describes how to remove such a node so that it can be added back after the node is fixed.


The steps to remove a node from a cluster are already documented in the Oracle documentation at:


Version   Documentation Link

10gR2     http://download.oracle.com/docs/cd/B19306_01/rac.102/b14197/adddelunix.htm#BEIFDCAF

11gR1     http://download.oracle.com/docs/cd/B28359_01/rac.111/b28255/adddelclusterware.htm#BEIFDCAF

This note is different because the documentation covers the scenario where the node is accessible and the removal is a planned procedure. This note covers the scenario where the Node is unable to boot up and therefore it is not possible to run the clusterware commands from this node.


For 11gR2, refer to note 1262925.1


 


FIX


Summary


Essentially, all the steps documented in the Oracle Clusterware Administration and Deployment Guide must be followed. The difference here is that the steps that would be executed on the unavailable node are skipped, and some extra commands are run on a node that is going to remain in the cluster in order to remove the resources of the node being removed.
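
In condensed form, the sequence of commands used in the remainder of this note looks as follows. This is only a sketch using the example names from this note (node lc2n3, database LC2DB1, instance LC2DB13, ONS port 6200); each command is explained in detail in Steps 1-5 below.

# Step 1: only needed if the network interfaces are not defined globally
$CRS_HOME/bin/oifcfg delif -node lc2n3

# Step 2: remove the ONS configuration (remote port taken from ons.config)
$CRS_HOME/bin/racgons remove_config lc2n3:6200

# Step 3: remove the resources of the failed node
srvctl remove instance -d LC2DB1 -i LC2DB13                   # from the database home
srvctl remove asm -n lc2n3                                    # from the ASM home
$CRS_HOME/bin/crs_unregister ora.lc2n3.LISTENER_LC2N3.lsnr    # 10.2 only (11.1: srvctl remove listener -n lc2n3)
srvctl stop nodeapps -n lc2n3                                 # as root
srvctl remove nodeapps -n lc2n3                               # as root

# Step 4: delete the node from the OCR (as root)
$CRS_HOME/install/rootdeletenode.sh lc2n3,3

# Step 5: update the inventory of every home (Clusterware, ASM, RDBMS; add CRS=TRUE for the Clusterware home)
$CRS_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=<home> "CLUSTER_NODES={lc2n1,lc2n2}"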


Example Configuration


 All steps outlined in this document were executed on a cluster with the following configuration:


Item                  Value

Node Names            lc2n1, lc2n2, lc2n3

Operating System      Oracle Enterprise Linux 5 Update 4

Oracle Clusterware    Release 10.2.0.5.0

ASM & Database        Release 10.2.0.5.0

Clusterware Home      /u01/app/oracle/product/10.2.0/crs ($CRS_HOME)

ASM Home              /u01/app/oracle/product/10.2.0/asm

Database Home         /u01/app/oracle/product/10.2.0/db_1

Cluster Name          lc2

 


Assume that node lc2n3 is down due to a hardware failure and cannot even boot up. The plan is to remove it from the Clusterware, fix the issue, and then add it back to the Clusterware. This document covers the steps to remove the node from the Clusterware.


Please note that for better readability the sample script 'crsstat' from Doc ID 259301.1 (CRS and 10g/11.1 Real Application Clusters) was used instead of 'crs_stat -t' to query the state of the CRS resources. This script is not part of a standard CRS installation.

 


Initial Stage


At this stage, the Oracle Clusterware is up and running on nodes lc2n1 and lc2n2 (the good nodes). Node lc2n3 is down and cannot be accessed. Note that the virtual IP of lc2n3 has failed over to node lc2n1; the rest of the lc2n3 resources are OFFLINE:


[oracle@lc2n1 ~]$ crsstat

Name                                     Target     State      Host      

-------------------------------------------------------------------------------

ora.LC2DB1.LC2DB11.inst                  ONLINE     ONLINE     lc2n1     

ora.LC2DB1.LC2DB12.inst                  ONLINE     ONLINE     lc2n2     

ora.LC2DB1.LC2DB13.inst                  ONLINE     OFFLINE              

ora.LC2DB1.LC2DB1_SRV1.LC2DB11.srv       ONLINE     ONLINE     lc2n1     

ora.LC2DB1.LC2DB1_SRV1.LC2DB12.srv       ONLINE     ONLINE     lc2n2     

ora.LC2DB1.LC2DB1_SRV1.LC2DB13.srv       ONLINE     OFFLINE              

ora.LC2DB1.LC2DB1_SRV1.cs                ONLINE     ONLINE     lc2n1     

ora.LC2DB1.db                            ONLINE     ONLINE     lc2n2     

ora.lc2n1.ASM1.asm                       ONLINE     ONLINE     lc2n1     

ora.lc2n1.LISTENER_LC2N1.lsnr            ONLINE     ONLINE     lc2n1     

ora.lc2n1.gsd                            ONLINE     ONLINE     lc2n1     

ora.lc2n1.ons                            ONLINE     ONLINE     lc2n1     

ora.lc2n1.vip                            ONLINE     ONLINE     lc2n1     

ora.lc2n2.ASM2.asm                       ONLINE     ONLINE     lc2n2     

ora.lc2n2.LISTENER_LC2N2.lsnr            ONLINE     ONLINE     lc2n2     

ora.lc2n2.gsd                            ONLINE     ONLINE     lc2n2     

ora.lc2n2.ons                            ONLINE     ONLINE     lc2n2     

ora.lc2n2.vip                            ONLINE     ONLINE     lc2n2     

ora.lc2n3.ASM3.asm                       ONLINE     OFFLINE              

ora.lc2n3.LISTENER_LC2N3.lsnr            ONLINE     OFFLINE              

ora.lc2n3.gsd                            ONLINE     OFFLINE              

ora.lc2n3.ons                            ONLINE     OFFLINE              

ora.lc2n3.vip                            ONLINE     ONLINE     lc2n1     

[oracle@lc2n1 ~]$

 


Step 1 Remove oifcfg information for the failed node


Most installations use the global flag of the oifcfg command and can therefore skip this step. This can be confirmed using:


[oracle@lc2n1 bin]$ $CRS_HOME/bin/oifcfg getif

eth0  192.168.56.0  global  public

eth1  192.168.57.0  global  cluster_interconnect

If the command returns global as shown above, you can skip the following step (executing the command below against a global definition will return an error, as shown below).


If the output of the oifcfg getif command does not return global, use the following command:


[oracle@lc2n1 bin]$ $CRS_HOME/bin/oifcfg delif -node lc2n3 

PROC-4: The cluster registry key to be operated on does not exist.

PRIF-11: cluster registry error
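
If the interfaces had in fact been defined per node rather than globally, the definitions for the failed node would be listed and removed roughly as follows. This is only a sketch; the -node form of getif and delif is assumed here, and the interface names and subnets depend on your configuration.

[oracle@lc2n1 bin]$ $CRS_HOME/bin/oifcfg getif -node lc2n3
[oracle@lc2n1 bin]$ $CRS_HOME/bin/oifcfg delif -node lc2n3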

 


Step 2 Remove ONS information


Execute the following command to find out the remote port number to be used


[oracle@lc2n1 bin]$ cat $CRS_HOME/opmn/conf/ons.config

localport=6113 

remoteport=6200 

loglevel=3

useocr=on

and remove the information pertaining to the node to be deleted using:


[oracle@lc2n1 bin]$ $CRS_HOME/bin/racgons remove_config lc2n3:6200

 


Step 3 Remove resources


In this step, the resources that were defined on this node have to be removed. These resources include the database instance, ASM, listener, and nodeapps resources. A list of these can be obtained by running the crsstat (crs_stat -t) command from any node:


[oracle@lc2n1 ~]$ crsstat |grep OFFLINE

ora.LC2DB1.LC2DB13.inst                  ONLINE     OFFLINE              

ora.LC2DB1.LC2DB1_SRV1.LC2DB13.srv       ONLINE     OFFLINE              

ora.lc2n3.ASM3.asm                       ONLINE     OFFLINE              

ora.lc2n3.LISTENER_LC2N3.lsnr            ONLINE     OFFLINE              

ora.lc2n3.gsd                            ONLINE     OFFLINE              

ora.lc2n3.ons                            ONLINE     OFFLINE             

Before removing any resources it is recommended to take a backup of the OCR:


[root@lc2n1 ~]# cd $CRS_HOME/cdata/lc2

[root@lc2n1 lc2]# $CRS_HOME/bin/ocrconfig -export ocr_before_node_removal.exp

[root@lc2n1 lc2]# ls -l ocr_before_node_removal.exp

-rw-r--r-- 1 root root 151946 Nov 15 15:24 ocr_before_node_removal.exp
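
Should anything go wrong during the following steps, this export can be used to restore the OCR. A minimal sketch, assuming the Clusterware is stopped on all remaining nodes before the import and restarted afterwards:

[root@lc2n1 lc2]# $CRS_HOME/bin/crsctl stop crs
[root@lc2n1 lc2]# $CRS_HOME/bin/ocrconfig -import ocr_before_node_removal.exp
[root@lc2n1 lc2]# $CRS_HOME/bin/crsctl start crs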

 Use 'srvctl' from the database home to delete the database instance on node 3:


[oracle@lc2n1 ~]$ . oraenv

ORACLE_SID = [oracle] ? LC2DB1

[oracle@lc2n1 ~]$ $ORACLE_HOME/bin/srvctl remove instance -d LC2DB1 -i LC2DB13

Remove instance LC2DB13 from the database LC2DB1? (y/[n]) y

 Use 'srvctl' from the ASM home to delete the ASM instance on node 3:


[oracle@lc2n1 ~]$ . oraenv

ORACLE_SID = [oracle] ? +ASM1

[oracle@lc2n1 ~]$ $ORACLE_HOME/bin/srvctl remove asm -n lc2n3

Next remove the listener resource.


Please note that there is no 'srvctl remove listener' subcommand prior to 11.1, so that command will not work in 10.2. Using 'netca' to delete the listener is also not an option, because netca would need to remove the listener configuration from the listener.ora on the node that is down.

10.2 only:


The only way to remove the listener resource is to use the 'crs_unregister' command; please use this command only in this particular scenario:


[oracle@lc2n1 lc2]$ $CRS_HOME/bin/crs_unregister ora.lc2n3.LISTENER_LC2N3.lsnr
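
To confirm that the listener resource is gone, it can be queried directly. The following is only a sketch of the expected result; the CRS-0210 message indicates that the resource no longer exists:

[oracle@lc2n1 lc2]$ $CRS_HOME/bin/crs_stat ora.lc2n3.LISTENER_LC2N3.lsnr
CRS-0210: Could not find resource 'ora.lc2n3.LISTENER_LC2N3.lsnr'.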

 11.1 only:


 Set the environment to the home from which the listener runs (ASM or database):


[oracle@lc2n1 ~]$ . oraenv

ORACLE_SID = [oracle] ? +ASM1

[oracle@lc2n1 lc2]$ $ORACLE_HOME/bin/srvctl remove listener -n lc2n3 

  As user root stop the nodeapps resources:


[root@lc2n1 oracle]# $CRS_HOME/bin/srvctl stop nodeapps -n lc2n3

[root@lc2n1 oracle]# crsstat |grep OFFLINE

ora.lc2n3.LISTENER_LC2N3.lsnr            OFFLINE    OFFLINE              

ora.lc2n3.gsd                            OFFLINE    OFFLINE              

ora.lc2n3.ons                            OFFLINE    OFFLINE              

ora.lc2n3.vip                            OFFLINE    OFFLINE        

 Now remove them:


[root@lc2n1 oracle]#  $CRS_HOME/bin/srvctl remove nodeapps -n lc2n3

Please confirm that you intend to remove the node-level applications on node lc2n3 (y/[n]) y

 At this point all resources from the bad node should be gone:


[oracle@lc2n1 ~]$ crsstat 

Name                                     Target     State      Host      

-------------------------------------------------------------------------------

ora.LC2DB1.LC2DB11.inst                  ONLINE     ONLINE     lc2n1     

ora.LC2DB1.LC2DB12.inst                  ONLINE     ONLINE     lc2n2     

ora.LC2DB1.LC2DB1_SRV1.LC2DB11.srv       ONLINE     ONLINE     lc2n1     

ora.LC2DB1.LC2DB1_SRV1.LC2DB12.srv       ONLINE     ONLINE     lc2n2     

ora.LC2DB1.LC2DB1_SRV1.cs                ONLINE     ONLINE     lc2n1     

ora.LC2DB1.db                            ONLINE     ONLINE     lc2n2     

ora.lc2n1.ASM1.asm                       ONLINE     ONLINE     lc2n1     

ora.lc2n1.LISTENER_LC2N1.lsnr            ONLINE     ONLINE     lc2n1     

ora.lc2n1.gsd                            ONLINE     ONLINE     lc2n1     

ora.lc2n1.ons                            ONLINE     ONLINE     lc2n1     

ora.lc2n1.vip                            ONLINE     ONLINE     lc2n1     

ora.lc2n2.ASM2.asm                       ONLINE     ONLINE     lc2n2     

ora.lc2n2.LISTENER_LC2N2.lsnr            ONLINE     ONLINE     lc2n2     

ora.lc2n2.gsd                            ONLINE     ONLINE     lc2n2     

ora.lc2n2.ons                            ONLINE     ONLINE     lc2n2     

ora.lc2n2.vip                            ONLINE     ONLINE     lc2n2  

 


Step 4 Execute rootdeletenode.sh


From a node that is going to remain in the cluster, execute the following command to find the node number of the node you want to delete:


[oracle@lc2n1 ~]$ $CRS_HOME/bin/olsnodes -n

lc2n1   1

lc2n2   2

lc2n3   3

This number is then passed to the rootdeletenode.sh script, which must be executed as root from any node that is going to remain in the cluster:


[root@lc2n1 ~]# cd $CRS_HOME/install

[root@lc2n1 install]# ./rootdeletenode.sh lc2n3,3

CRS-0210: Could not find resource 'ora.lc2n3.ons'.

CRS-0210: Could not find resource 'ora.lc2n3.vip'.

CRS-0210: Could not find resource 'ora.lc2n3.gsd'.

CRS-0210: Could not find resource ora.lc2n3.vip.

CRS nodeapps are deleted successfully

clscfg: EXISTING configuration version 3 detected.

clscfg: version 3 is 10G Release 2.

Successfully deleted 14 values from OCR.

Key SYSTEM.css.interfaces.nodelc2n3 marked for deletion is not there. Ignoring.

Successfully deleted 5 keys from OCR.

Node deletion operation successful.

'lc2n3,3' deleted successfully

[root@lc2n1 install]# $CRS_HOME/bin/olsnodes -n

lc2n1   1

lc2n2   2

 


Step 5 Update the Inventory


From the node which is going to remain in the cluster, run the following command as the owner of the CRS_HOME. The argument passed to CLUSTER_NODES is a comma-separated list of the node names that are going to remain in the cluster. This step needs to be performed once per home (Clusterware, ASM, and RDBMS homes).


[oracle@lc2n1 install]$ $CRS_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=/u01/app/oracle/product/10.2.0/crs "CLUSTER_NODES={lc2n1,lc2n2}" CRS=TRUE  

Starting Oracle Universal Installer...


No pre-requisite checks found in oraparam.ini, no system pre-requisite checks will be executed.

The inventory pointer is located at /etc/oraInst.loc

The inventory is located at /u01/app/oracle/oraInventory

'UpdateNodeList' was successful.


[oracle@lc2n1 install]$ $CRS_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=/u01/app/oracle/product/10.2.0/asm "CLUSTER_NODES={lc2n1,lc2n2}"

Starting Oracle Universal Installer...


No pre-requisite checks found in oraparam.ini, no system pre-requisite checks will be executed.

The inventory pointer is located at /etc/oraInst.loc

The inventory is located at /u01/app/oracle/oraInventory

'UpdateNodeList' was successful.

[oracle@lc2n1 install]$ $CRS_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=/u01/app/oracle/product/10.2.0/db_1 "CLUSTER_NODES={lc2n1,lc2n2}"

Starting Oracle Universal Installer...


No pre-requisite checks found in oraparam.ini, no system pre-requisite checks will be executed.

The inventory pointer is located at /etc/oraInst.loc

The inventory is located at /u01/app/oracle/oraInventory

'UpdateNodeList' was successful.
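
To double-check the result, the central inventory can be searched for the remaining node names. This is only a sketch; the inventory location is taken from the runInstaller output above, and the exact XML layout may differ between releases:

[oracle@lc2n1 install]$ grep "NODE NAME" /u01/app/oracle/oraInventory/ContentsXML/inventory.xml

The failed node lc2n3 should no longer be listed for any of the three homes.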

Changes to the Privileges Granted to the Built-in CONNECT and RESOURCE Roles in Oracle 10g

In Oracle 10g, the privileges granted to the built-in CONNECT and RESOURCE roles have changed slightly.


The following are the system privileges granted to the CONNECT and RESOURCE roles in Oracle 9i:


SQL> select * from role_sys_privs where role='CONNECT';

ROLE                           PRIVILEGE                                ADM
------------------------------ ---------------------------------------- ---
CONNECT                        CREATE VIEW                              NO
CONNECT                        CREATE TABLE                             NO
CONNECT                        ALTER SESSION                            NO
CONNECT                        CREATE CLUSTER                           NO
CONNECT                        CREATE SESSION                           NO
CONNECT                        CREATE SYNONYM                           NO
CONNECT                        CREATE SEQUENCE                          NO
CONNECT                        CREATE DATABASE LINK                     NO

8 rows selected.

 

SQL> select * from role_sys_privs where role='RESOURCE';

ROLE                           PRIVILEGE                                ADM
------------------------------ ---------------------------------------- ---
RESOURCE                       CREATE TYPE                              NO
RESOURCE                       CREATE TABLE                             NO
RESOURCE                       CREATE CLUSTER                           NO
RESOURCE                       CREATE TRIGGER                           NO
RESOURCE                       CREATE OPERATOR                          NO
RESOURCE                       CREATE SEQUENCE                          NO
RESOURCE                       CREATE INDEXTYPE                         NO
RESOURCE                       CREATE PROCEDURE                         NO

8 rows selected.


The following are the system privileges granted to the CONNECT and RESOURCE roles in Oracle 10g.

Compared with Oracle 9i, in 10g the CONNECT role is granted only the CREATE SESSION system privilege, while the output below shows that the set of system privileges granted to the RESOURCE role is unchanged.


SQL> select * from role_sys_privs where role='CONNECT';

ROLE                           PRIVILEGE                                ADM
------------------------------ ---------------------------------------- ---
CONNECT                        CREATE SESSION                           NO

 

SQL> select * from role_sys_privs where role='RESOURCE';

ROLE                           PRIVILEGE                                ADM
------------------------------ ---------------------------------------- ---
RESOURCE                       CREATE SEQUENCE                          NO
RESOURCE                       CREATE TRIGGER                           NO
RESOURCE                       CREATE CLUSTER                           NO
RESOURCE                       CREATE PROCEDURE                         NO
RESOURCE                       CREATE TYPE                              NO
RESOURCE                       CREATE OPERATOR                          NO
RESOURCE                       CREATE TABLE                             NO
RESOURCE                       CREATE INDEXTYPE                         NO

 

The point to note is that the CREATE VIEW and CREATE SYNONYM system privileges have been revoked from the CONNECT role, so if those privileges are needed they must now be granted directly, as follows:

 

grant create view, create synonym to scott;
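
As a quick check, the following is a minimal sketch of verifying the effect on a test account (the user name test_usr and its password are assumptions used for illustration only):

SQL> create user test_usr identified by test_usr;
SQL> grant connect, resource to test_usr;
SQL> select privilege from role_sys_privs where role in ('CONNECT','RESOURCE');

CREATE VIEW and CREATE SYNONYM will not appear in this list in 10g and later, so grant them explicitly when the schema needs them:

SQL> grant create view, create synonym to test_usr;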


In 11g R2 there is no further change:


SQL*Plus: Release 11.2.0.1.0 Production on Tue Sep 8 16:56:41 2009

Copyright (c) 1982, 2009, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options


SQL> select * from role_sys_privs where role='CONNECT';

ROLE                           PRIVILEGE                                ADM
------------------------------ ---------------------------------------- ---
CONNECT                        CREATE SESSION                           NO

SQL> select * from role_sys_privs where role='RESOURCE';

ROLE                           PRIVILEGE                                ADM
------------------------------ ---------------------------------------- ---
RESOURCE                       CREATE SEQUENCE                          NO
RESOURCE                       CREATE TRIGGER                           NO
RESOURCE                       CREATE CLUSTER                           NO
RESOURCE                       CREATE PROCEDURE                         NO
RESOURCE                       CREATE TYPE                              NO
RESOURCE                       CREATE OPERATOR                          NO
RESOURCE                       CREATE TABLE                             NO
RESOURCE                       CREATE INDEXTYPE                         NO

8 rows selected.
