ORACLE/RAC 2014. 2. 22. 12:10

Steps to Remove Node from Cluster When the Node Crashes Due to OS/Hardware Failure and cannot boot up (Doc ID 466975.1)

Modified: 16-Nov-2012   Type: HOWTO



In this Document

Goal

Fix

  Summary

  Example Configuration

  Initial Stage

  Step 1 Remove oifcfg information for the failed node

  Step 2 Remove ONS information

  Step 3 Remove resources

  Step 4 Execute rootdeletenode.sh

  Step 5 Update the Inventory

References

APPLIES TO:


Oracle Server - Enterprise Edition - Version 10.2.0.1 to 11.1.0.6 [Release 10.2 to 11.1]

Oracle Server - Standard Edition - Version 10.2.0.1 to 11.1.0.6 [Release 10.2 to 11.1]

Information in this document applies to any platform.

Oracle Server Enterprise Edition - Version: 10.2.0.1 to 11.1.0.6

Oracle Clusterware



GOAL


This document is intended to provide the steps required to remove a node from an Oracle cluster when the node itself is unavailable because of an OS or hardware issue that prevents it from starting up. It describes how to remove such a node so that it can be added back once the node is fixed.


The steps to remove a node from a cluster are already documented in the Oracle documentation:


Version Documentation Link

10gR2 http://download.oracle.com/docs/cd/B19306_01/rac.102/b14197/adddelunix.htm#BEIFDCAF

11gR1 http://download.oracle.com/docs/cd/B28359_01/rac.111/b28255/adddelclusterware.htm#BEIFDCAF

This note is different because the documentation covers the scenario where the node is accessible and the removal is a planned procedure, whereas this note covers the scenario where the node is unable to boot up and it is therefore not possible to run the clusterware commands from that node.


For 11gR2, refer to note 1262925.1


 


FIX


Summary


Essentially, all the steps documented in the Oracle Clusterware Administration and Deployment Guide must be followed. The difference here is that the steps that would be executed on the unavailable node are skipped, and some extra commands are run on a node that will remain in the cluster to remove the resources belonging to the node being removed.


Example Configuration


 All steps outlined in this document were executed on a cluster with the following configuration:


Item Value

Node Names lc2n1, lc2n2, lc2n3

Operating System Oracle Enterprise Linux 5 Update 4

Oracle Clusterware Release 10.2.0.5.0

ASM & Database Release 10.2.0.5.0

Clusterware Home /u01/app/oracle/product/10.2.0/crs ($CRS_HOME)

ASM Home /u01/app/oracle/product/10.2.0/asm

Database Home /u01/app/oracle/product/10.2.0/db_1

 Cluster Name lc2

 


 Assume that node lc2n3 is down due to a hardware failure and cannot even boot up. The plan is to remove it from the clusterware, fix the issue, and then add it back to the clusterware. In this document, we will cover the steps to remove the node from the clusterware.


Please note that, for better readability, the sample script 'crsstat' from

  Doc ID 259301.1 CRS and 10g/11.1 Real Application Clusters

was used instead of 'crs_stat -t' to query the state of the CRS resources. This script is not part of a standard CRS installation.

 


Initial Stage


At this stage, the Oracle Clusterware is up and running on nodes lc2n1 & lc2n2 (the good nodes). Node lc2n3 is down and cannot be accessed. Note that the virtual IP of lc2n3 is running on node lc2n1, while the rest of the lc2n3 resources are OFFLINE:


[oracle@lc2n1 ~]$ crsstat

Name                                     Target     State      Host      

-------------------------------------------------------------------------------

ora.LC2DB1.LC2DB11.inst                  ONLINE     ONLINE     lc2n1     

ora.LC2DB1.LC2DB12.inst                  ONLINE     ONLINE     lc2n2     

ora.LC2DB1.LC2DB13.inst                  ONLINE     OFFLINE              

ora.LC2DB1.LC2DB1_SRV1.LC2DB11.srv       ONLINE     ONLINE     lc2n1     

ora.LC2DB1.LC2DB1_SRV1.LC2DB12.srv       ONLINE     ONLINE     lc2n2     

ora.LC2DB1.LC2DB1_SRV1.LC2DB13.srv       ONLINE     OFFLINE              

ora.LC2DB1.LC2DB1_SRV1.cs                ONLINE     ONLINE     lc2n1     

ora.LC2DB1.db                            ONLINE     ONLINE     lc2n2     

ora.lc2n1.ASM1.asm                       ONLINE     ONLINE     lc2n1     

ora.lc2n1.LISTENER_LC2N1.lsnr            ONLINE     ONLINE     lc2n1     

ora.lc2n1.gsd                            ONLINE     ONLINE     lc2n1     

ora.lc2n1.ons                            ONLINE     ONLINE     lc2n1     

ora.lc2n1.vip                            ONLINE     ONLINE     lc2n1     

ora.lc2n2.ASM2.asm                       ONLINE     ONLINE     lc2n2     

ora.lc2n2.LISTENER_LC2N2.lsnr            ONLINE     ONLINE     lc2n2     

ora.lc2n2.gsd                            ONLINE     ONLINE     lc2n2     

ora.lc2n2.ons                            ONLINE     ONLINE     lc2n2     

ora.lc2n2.vip                            ONLINE     ONLINE     lc2n2     

ora.lc2n3.ASM3.asm                       ONLINE     OFFLINE              

ora.lc2n3.LISTENER_LC2N3.lsnr            ONLINE     OFFLINE              

ora.lc2n3.gsd                            ONLINE     OFFLINE              

ora.lc2n3.ons                            ONLINE     OFFLINE              

ora.lc2n3.vip                            ONLINE     ONLINE     lc2n1     

[oracle@lc2n1 ~]$

 


Step 1 Remove oifcfg information for the failed node


Most installations use the global flag of the oifcfg command and can therefore skip this step. This can be confirmed using:


[oracle@lc2n1 bin]$ $CRS_HOME/bin/oifcfg getif

eth0  192.168.56.0  global  public

eth1  192.168.57.0  global  cluster_interconnect

If the output of the command returns global, as shown above, then you can skip the following step (executing the command below against a global definition will return the error shown below).


If the output of the oifcfg getif command does not return global, then use the following command:


[oracle@lc2n1 bin]$ $CRS_HOME/bin/oifcfg delif -node lc2n3 

PROC-4: The cluster registry key to be operated on does not exist.

PRIF-11: cluster registry error

 


Step 2 Remove ONS information


Execute the following command to find out the remote port number to be used


[oracle@lc2n1 bin]$ cat $CRS_HOME/opmn/conf/ons.config

localport=6113 

remoteport=6200 

loglevel=3

useocr=on

and remove the information pertaining to the node to be deleted using:


[oracle@lc2n1 bin]$ $CRS_HOME/bin/racgons remove_config lc2n3:6200

 


Step 3 Remove resources


In this step, the resources that were defined on this node have to be removed. These resources include the database instance, ASM, listener, and nodeapps resources. A list of these can be acquired by running the crsstat (crs_stat -t) command from any node:


[oracle@lc2n1 ~]$ crsstat |grep OFFLINE

ora.LC2DB1.LC2DB13.inst                  ONLINE     OFFLINE              

ora.LC2DB1.LC2DB1_SRV1.LC2DB13.srv       ONLINE     OFFLINE              

ora.lc2n3.ASM3.asm                       ONLINE     OFFLINE              

ora.lc2n3.LISTENER_LC2N3.lsnr            ONLINE     OFFLINE              

ora.lc2n3.gsd                            ONLINE     OFFLINE              

ora.lc2n3.ons                            ONLINE     OFFLINE             

 Before removing any resource it is recommended to take a backup of the OCR:


[root@lc2n1 ~]# cd $CRS_HOME/cdata/lc2

[root@lc2n1 lc2]# $CRS_HOME/bin/ocrconfig -export ocr_before_node_removal.exp

[root@lc2n1 lc2]# ls -l ocr_before_node_removal.exp

-rw-r--r-- 1 root root 151946 Nov 15 15:24 ocr_before_node_removal.exp
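 Optionally, you can also list the automatically created physical OCR backups as an extra sanity check (backup location and retention vary by installation):


[root@lc2n1 lc2]# $CRS_HOME/bin/ocrconfig -showbackup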

 Use 'srvctl' from the database home to delete the database instance on node 3:


[oracle@lc2n1 ~]$ . oraenv

ORACLE_SID = [oracle] ? LC2DB1

[oracle@lc2n1 ~]$ $ORACLE_HOME/bin/srvctl remove instance -d LC2DB1 -i LC2DB13

Remove instance LC2DB13 from the database LC2DB1? (y/[n]) y

 Use 'srvctl' from the ASM home to delete the ASM instance on node 3:


[oracle@lc2n1 ~]$ . oraenv

ORACLE_SID = [oracle] ? +ASM1

[oracle@lc2n1 ~]$ $ORACLE_HOME/bin/srvctl remove asm -n lc2n3

Next remove the listener resource.


Please note that there is no 'srvctl remove listener' subcommand prior to 11.1, so that command will not work in 10.2. Using 'netca' to delete the listener is also not an option for a down node, because netca needs to remove the listener configuration from that node's listener.ora.

10.2 only:


The only way to remove the listener resource is to use the 'crs_unregister' command; please use this command only in this particular scenario:


[oracle@lc2n1 lc2]$ $CRS_HOME/bin/crs_unregister ora.lc2n3.LISTENER_LC2N3.lsnr

 11.1 only:


 Set the environment to the home from which the listener runs (ASM or database):


[oracle@lc2n1 ~]$ . oraenv

ORACLE_SID = [oracle] ? +ASM1

[oracle@lc2n1 lc2]$ $ORACLE_HOME/bin/srvctl remove listener -n lc2n3 

  As user root stop the nodeapps resources:


[root@lc2n1 oracle]# $CRS_HOME/bin/srvctl stop nodeapps -n lc2n3

[root@lc2n1 oracle]# crsstat |grep OFFLINE

ora.lc2n3.LISTENER_LC2N3.lsnr            OFFLINE    OFFLINE              

ora.lc2n3.gsd                            OFFLINE    OFFLINE              

ora.lc2n3.ons                            OFFLINE    OFFLINE              

ora.lc2n3.vip                            OFFLINE    OFFLINE        

 Now remove them:


[root@lc2n1 oracle]#  $CRS_HOME/bin/srvctl remove nodeapps -n lc2n3

Please confirm that you intend to remove the node-level applications on node lc2n3 (y/[n]) y

 At this point all resources from the bad node should be gone:


[oracle@lc2n1 ~]$ crsstat 

Name                                     Target     State      Host      

-------------------------------------------------------------------------------

ora.LC2DB1.LC2DB11.inst                  ONLINE     ONLINE     lc2n1     

ora.LC2DB1.LC2DB12.inst                  ONLINE     ONLINE     lc2n2     

ora.LC2DB1.LC2DB1_SRV1.LC2DB11.srv       ONLINE     ONLINE     lc2n1     

ora.LC2DB1.LC2DB1_SRV1.LC2DB12.srv       ONLINE     ONLINE     lc2n2     

ora.LC2DB1.LC2DB1_SRV1.cs                ONLINE     ONLINE     lc2n1     

ora.LC2DB1.db                            ONLINE     ONLINE     lc2n2     

ora.lc2n1.ASM1.asm                       ONLINE     ONLINE     lc2n1     

ora.lc2n1.LISTENER_LC2N1.lsnr            ONLINE     ONLINE     lc2n1     

ora.lc2n1.gsd                            ONLINE     ONLINE     lc2n1     

ora.lc2n1.ons                            ONLINE     ONLINE     lc2n1     

ora.lc2n1.vip                            ONLINE     ONLINE     lc2n1     

ora.lc2n2.ASM2.asm                       ONLINE     ONLINE     lc2n2     

ora.lc2n2.LISTENER_LC2N2.lsnr            ONLINE     ONLINE     lc2n2     

ora.lc2n2.gsd                            ONLINE     ONLINE     lc2n2     

ora.lc2n2.ons                            ONLINE     ONLINE     lc2n2     

ora.lc2n2.vip                            ONLINE     ONLINE     lc2n2  

 


Step 4 Execute rootdeletenode.sh


From a node that you are not deleting, execute the following command to find the node number of the node that you want to delete:


[oracle@lc2n1 ~]$ $CRS_HOME/bin/olsnodes -n

lc2n1   1

lc2n2   2

lc2n3   3

This number is passed to the rootdeletenode.sh script, which is to be executed as root from any node that is going to remain in the cluster:


[root@lc2n1 ~]# cd $CRS_HOME/install

[root@lc2n1 install]# ./rootdeletenode.sh lc2n3,3

CRS-0210: Could not find resource 'ora.lc2n3.ons'.

CRS-0210: Could not find resource 'ora.lc2n3.vip'.

CRS-0210: Could not find resource 'ora.lc2n3.gsd'.

CRS-0210: Could not find resource ora.lc2n3.vip.

CRS nodeapps are deleted successfully

clscfg: EXISTING configuration version 3 detected.

clscfg: version 3 is 10G Release 2.

Successfully deleted 14 values from OCR.

Key SYSTEM.css.interfaces.nodelc2n3 marked for deletion is not there. Ignoring.

Successfully deleted 5 keys from OCR.

Node deletion operation successful.

'lc2n3,3' deleted successfully

[root@lc2n1 install]# $CRS_HOME/bin/olsnodes -n

lc2n1   1

lc2n2   2

 


Step 5 Update the Inventory


From a node that is going to remain in the cluster, run the following command as the owner of each Oracle home. The argument passed to CLUSTER_NODES is a comma-separated list of the node names that are going to remain in the cluster. This step needs to be performed once per home (Clusterware, ASM, and RDBMS homes).


[oracle@lc2n1 install]$ $CRS_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=/u01/app/oracle/product/10.2.0/crs "CLUSTER_NODES={lc2n1,lc2n2}" CRS=TRUE  

Starting Oracle Universal Installer...


No pre-requisite checks found in oraparam.ini, no system pre-requisite checks will be executed.

The inventory pointer is located at /etc/oraInst.loc

The inventory is located at /u01/app/oracle/oraInventory

'UpdateNodeList' was successful.


[oracle@lc2n1 install]$ $CRS_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=/u01/app/oracle/product/10.2.0/asm "CLUSTER_NODES={lc2n1,lc2n2}"

Starting Oracle Universal Installer...


No pre-requisite checks found in oraparam.ini, no system pre-requisite checks will be executed.

The inventory pointer is located at /etc/oraInst.loc

The inventory is located at /u01/app/oracle/oraInventory

'UpdateNodeList' was successful.

[oracle@lc2n1 install]$ $CRS_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=/u01/app/oracle/product/10.2.0/db_1 "CLUSTER_NODES={lc2n1,lc2n2}"

Starting Oracle Universal Installer...


No pre-requisite checks found in oraparam.ini, no system pre-requisite checks will be executed.

The inventory pointer is located at /etc/oraInst.loc

The inventory is located at /u01/app/oracle/oraInventory

'UpdateNodeList' was successful.

Posted by [PineTree]
ORACLE/RAC 2013. 6. 14. 21:09

What is TAF (Transparent Application Failover)?

TAF is an application failover mechanism used with RAC and is a client-side feature.
Assuming a two-node cluster, if a failure occurs on one node,
sessions fail over to the remaining live node.


Why is TAF needed?

If you understand the concept of TAF, there is little room for doubt.
If you use RAC to build a high-availability system,
why build RAC at all if you are not going to use failover when a failure occurs?


What is required to set up TAF?

1. The Oracle client software must be installed on the client side.

2. RAC must be configured on the server side.

3. The tnsnames.ora file under $ORACLE_HOME/network/admin on the machine where the Oracle client
   is installed needs an entry like the following.

ORCLTEST =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = vip-linux1)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = vip-linux2)(PORT = 1521))
    (LOAD_BALANCE = yes)
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = orcltest.idevelopment.info)
      (FAILOVER_MODE =
        (TYPE = SELECT)
        (METHOD = BASIC)
        (RETRIES = 180)
        (DELAY = 5)

      )
    )
  )

Description of the parameters supported by TAF

TYPE:
One of NONE, SESSION, or SELECT. Set TYPE=SESSION to fail over the session only, TYPE=SELECT to fail over both the session and open cursors, and TYPE=NONE to disable TAF.

METHOD:
Either BASIC or PRECONNECT. With BASIC, TAF does not attempt to re-establish the connection until the existing connection fails. With PRECONNECT, TAF pre-establishes the memory structures needed for the backup connection, but the backup connection is not activated until the existing connection fails.

BACKUP:
This sub-parameter specifies the net service name used to establish the backup connection. BACKUP is required with the PRECONNECT method and strongly recommended with BASIC; otherwise the client first tries to reconnect to the instance that just failed, adding extra delay before it reconnects. Note that BACKUP cannot be specified while LOAD_BALANCING=ON.

DELAY:
The number of seconds TAF waits between attempts to connect to the BACKUP after a failure.

RETRIES:
The number of times TAF tries to connect to the BACKUP after a failure before giving up. RETRIES and DELAY give a cold failover time to complete before TAF gives up on the backup connection.
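For illustration, a PRECONNECT variant of the entry above would add a BACKUP net service name so that the standby connection is created in advance. The sketch below reuses the hosts from the example above; ORCLTEST2 is a hypothetical alias that would need to be defined separately to point at the backup instance.

ORCLTEST_PRECONNECT =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = vip-linux1)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = vip-linux2)(PORT = 1521))
    (LOAD_BALANCE = yes)
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = orcltest.idevelopment.info)
      (FAILOVER_MODE =
        (TYPE = SELECT)
        (METHOD = PRECONNECT)
        (BACKUP = ORCLTEST2)
        (RETRIES = 180)
        (DELAY = 5)
      )
    )
  )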


4. The hostnames used in step 3 must also be defined in the /etc/hosts file.

os] cat /etc/hosts

10.10.100.101 vip-linux1
10.10.100.102 vip-linux2


Now let's actually test TAF.

From a Windows machine (or a "non-RAC" client machine on another platform), log in to the cluster database as the SYSTEM user using the orcltest service:

C:\> sqlplus system/manager@orcltest

COLUMN instance_name    FORMAT a13
COLUMN host_name        FORMAT a9
COLUMN failover_method  FORMAT a15
COLUMN failed_over      FORMAT a11

SELECT
    instance_name
  , host_name
  , NULL AS failover_type
  , NULL AS failover_method
  , NULL AS failed_over
FROM v$instance
UNION
SELECT
    NULL
  , NULL
  , failover_type
  , failover_method
  , failed_over
FROM v$session
WHERE username = 'SYSTEM';


INSTANCE_NAME HOST_NAME FAILOVER_TYPE FAILOVER_METHOD FAILED_OVER
------------- --------- ------------- --------------- -----------
orcl1         linux1
                        SELECT        BASIC           NO

Do not log out of the SQL*Plus session established above!

After running the query above, shut down the orcl1 instance on node linux1 using the abort option. Use the srvctl command-line utility as shown below:

# su - oracle
$ srvctl status database -d orcl
Instance orcl1 is running on node linux1
Instance orcl2 is running on node linux2

$ srvctl stop instance -d orcl -i orcl1 -o abort

$ srvctl status database -d orcl
Instance orcl1 is not running on node linux1
Instance orcl2 is running on node linux2

Return to the earlier SQL session and re-run the SQL statement stored in the buffer:
COLUMN instance_name    FORMAT a13
COLUMN host_name        FORMAT a9
COLUMN failover_method  FORMAT a15
COLUMN failed_over      FORMAT a11

SELECT
    instance_name
  , host_name
  , NULL AS failover_type
  , NULL AS failover_method
  , NULL AS failed_over
FROM v$instance
UNION
SELECT
    NULL
  , NULL
  , failover_type
  , failover_method
  , failed_over
FROM v$session
WHERE username = 'SYSTEM';

INSTANCE_NAME HOST_NAME FAILOVER_TYPE FAILOVER_METHOD FAILED_OVER
------------- --------- ------------- --------------- -----------
orcl2         linux2
                        SELECT        BASIC           YES

SQL> exit

The output above shows that the session has failed over to the orcl2 instance on node linux2.


References: http://www.oracle.com/technology/global/kr/pub/articles/hunter_rac10gr2_3.html
         http://www.oracle.com/technology/global/kr/deploy/availability/htdocs/taf.html
         http://publib.boulder.ibm.com/infocenter/pim/v6r0m0/index.jsp?topic=/com.ibm.wpc.ins.doc/wpc_tsk_setting_up_oracle_to_use_taf_support.html

Posted by [PineTree]
ORACLE/RAC 2013. 5. 30. 23:04


Run the following as the root user.


crs_stat -p ora.LISTENER_SCAN1.lsnr |grep -i start
crs_stat -p ora.scan1.vip  |grep -i start

[root@rac1 ~]# crsctl modify resource "ora.rac1.vip" -attr "AUTO_START=always"
[root@rac1 ~]#
[root@rac1 ~]# crsctl modify resource "ora.rac2.vip" -attr "AUTO_START=always"
[root@rac1 ~]# crsctl modify resource "ora.LISTENER_SCAN1.lsnr" -attr "AUTO_START=always"
[root@rac1 ~]# crsctl modify resource "ora.racdb.db" -attr "AUTO_START=always"
[root@rac1 ~]# crsctl modify resource "ora.scan1.vip" -attr "AUTO_START=always"
[root@rac1 ~]#
[root@rac1 ~]# crsctl modify resource "ora.net1.network" -attr "AUTO_START=always"
[root@rac1 ~]#
[root@rac1 ~]# crsctl modify resource "ora.LISTENER.lsnr" -attr "AUTO_START=always"
[root@rac1 ~]# crsctl modify resource "ora.oc4j" -attr "AUTO_START=always"
[root@rac1 ~]# crs_stat -p|grep restore

Posted by [PineTree]
ORACLE/RAC 2013. 5. 30. 11:27


Source: http://leejehong.tistory.com/170


[root@RAC1 ~]# srvctl config database -d devdb -v
Database unique name: devdb
Database name: devdb
Oracle home: /u01/app/11.2.0/db
Oracle user: oracle

Spfile: /dev/raw/raw6
Domain: 
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: devdb
Database instances: devdb1,devdb2
Disk Groups: 
Mount point paths: 
Services: 
Type: RAC
Database is administrator managed
[root@RAC1 ~]#


oracle@RAC1:/>crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora....ER.lsnr ora....er.type ONLINE    ONLINE    rac1        
ora....N1.lsnr ora....er.type ONLINE    ONLINE    rac2        
ora.cvu        ora.cvu.type   ONLINE    ONLINE    rac1        
ora.devdb.db   ora....se.type OFFLINE   OFFLINE               
ora.gsd        ora.gsd.type   ONLINE    ONLINE    rac1        
ora....network ora....rk.type ONLINE    ONLINE    rac1        
ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    rac1        
ora.ons        ora.ons.type   ONLINE    ONLINE    rac1        
ora....C1.lsnr application    ONLINE    ONLINE    rac1        
ora.rac1.gsd   application    ONLINE    ONLINE    rac1        
ora.rac1.ons   application    ONLINE    ONLINE    rac1        
ora.rac1.vip   ora....t1.type ONLINE    ONLINE    rac1        
ora....C2.lsnr application    ONLINE    ONLINE    rac2        
ora.rac2.gsd   application    ONLINE    ONLINE    rac2        
ora.rac2.ons   application    ONLINE    ONLINE    rac2        
ora.rac2.vip   ora....t1.type ONLINE    ONLINE    rac2        
ora....ry.acfs ora....fs.type OFFLINE   OFFLINE               
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    rac2        
oracle@RAC1:/>
oracle@RAC1:/>

After a reboot, the database does not start automatically.


oracle@RAC1:/>crs_stat -p ora.devdb.db |grep -i start
AUTO_START=restore
GEN_START_OPTIONS@SERVERNAME(rac1)=open
GEN_START_OPTIONS@SERVERNAME(rac2)=open
RESTART_ATTEMPTS=2
START_TIMEOUT=600
oracle@RAC1:/>

Because AUTO_START is set to restore, the database does not start up.

# Note
* AUTO_START=1 (always)  -> the resource is brought online when CRS restarts, regardless of its previous state, as long as the hardware is healthy.
* AUTO_START=2 (never)   -> all resources must be started manually.
* AUTO_START=0 (restore) -> all resources return to the state they were in before they were brought down.

[root@RAC1 profile]# crs_stat -p ora.devdb.db |grep -i start
AUTO_START=restore
GEN_START_OPTIONS@SERVERNAME(rac1)=open
GEN_START_OPTIONS@SERVERNAME(rac2)=open
RESTART_ATTEMPTS=2
START_TIMEOUT=600

[root@RAC1 profile]# 
[root@RAC1 profile]# crsctl modify resource "ora.devdb.db" -attr "AUTO_START=always"
[root@RAC1 profile]# crs_stat -p ora.devdb.db |grep -i start
AUTO_START=always
GEN_START_OPTIONS@SERVERNAME(rac1)=open
GEN_START_OPTIONS@SERVERNAME(rac2)=open
RESTART_ATTEMPTS=2
START_TIMEOUT=600
[root@RAC1 profile]#

Changing Resource Attributes in 11gR2 Grid Infrastructure
In 11gR2 grid infrastructure installations, certain resources may have auto start set to never or restore.
This was observed both in environments where the clusterware was upgraded to 11.2 and in newly installed environments.
Depending on the situation this may not be desirable. The auto start attribute can be changed as follows.

1. Check the current auto start values

# crsctl stat res -p
NAME=ora.FLASH.dg
TYPE=ora.diskgroup.type
ACL=owner:oracle:rwx,pgrp:oinstall:rwx,other::r--
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX%
ALIAS_NAME=
AUTO_START=never     

NAME=ora.DATA.dg
TYPE=ora.diskgroup.type
ACL=owner:oracle:rwx,pgrp:oinstall:rwx,other::r--
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX%
ALIAS_NAME=
AUTO_START=never    

NAME=ora.clusdb.db
TYPE=ora.database.type
ACL=owner:oracle:rwx,pgrp:oinstall:rwx,other::r--
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
ACTIVE_PLACEMENT=1
AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX%
AUTO_START=restore

2. Since the ASM diskgroups that the database depends on will never auto-start, the database will also be unavailable.

3. Change the resource start attribute with

# crsctl modify resource "ora.FLASH.dg" -attr "AUTO_START=always"
# crsctl modify resource "ora.DATA.dg" -attr "AUTO_START=always"
# crsctl modify resource ora.clusdb.db -attr "AUTO_START=always"
AUTO_START must be in upper case; if not, the command will fail:
crsctl modify resource ora.clusdb.db -attr "auto_start=always"
CRS-0160: The attribute 'auto_start' is not supported in this resource type.
CRS-4000: Command Modify failed, or completed with errors.


4. Verify the status change with 
# crsctl stat res -p
NAME=ora.clusdb.db
TYPE=ora.database.type
ACL=owner:oracle:rwx,pgrp:oinstall:rwx,other::r--
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
ACTIVE_PLACEMENT=1
AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX%
AUTO_START=always

Posted by [PineTree]
ORACLE/RAC 2012. 5. 18. 16:38

VIP Failover Take Long Time After Network Cable Pulled [ID 403743.1]
--------------------------------------------------------------------------------
 
  Modified 05-JAN-2011     Type PROBLEM     Status PUBLISHED

In this Document
  Symptoms
  Changes
  Cause
  Solution
  References

--------------------------------------------------------------------------------

Applies to:
Oracle Server - Enterprise Edition - Version: 10.2.0.1 to 11.1.0.7 - Release: 10.2 to 11.1
Information in this document applies to any platform.
***Checked for relevance on 05-Jan-2011***
Symptoms
This example is based on the Sun Solaris platform, with IPMP configured for the public network. In this case, VIP failover takes almost 4 minutes to complete when both network cables of the public network are pulled from one node.

crsd.log shows:

2006-12-07 13:14:05.401: [ CRSAPP][4588] CheckResource error for ora.node1.vip error code = 1
2006-12-07 13:14:05.408: [ CRSRES][4588] In stateChanged, ora.node1.vip target is ONLINE
2006-12-07 13:14:05.409: [ CRSRES][4588] ora.node1.vip on node1 went OFFLINE unexpectedly
<<< detect network cable failure and VIP OFFLINE immediately

2006-12-07 13:14:05.410: [ CRSRES][4588] StopResource: setting CLI values
2006-12-07 13:14:05.420: [ CRSRES][4588] Attempting to stop `ora.node1.vip` on member `node1`
2006-12-07 13:14:06.651: [ CRSRES][4588] Stop of `ora.node1.vip` on member `node1` succeeded.
2006-12-07 13:14:06.652: [ CRSRES][4588] ora.node1.vip RESTART_COUNT=0 RESTART_ATTEMPTS=0
2006-12-07 13:14:06.667: [ CRSRES][4588] ora.node1.vip failed on node1 relocating.
2006-12-07 13:14:06.758: [ CRSRES][4588] StopResource: setting CLI values
2006-12-07 13:14:06.766: [ CRSRES][4588] Attempting to stop `ora.node1.LISTENER_NODE1.lsnr` on member `node1`
2006-12-07 13:17:41.399: [ CRSRES][4588] Stop of `ora.node1.LISTENER_NODE1.lsnr` on member `node1` succeeded.
<<< takes 3.5 minutes to stop listener

2006-12-07 13:17:41.402: Attempting to stop `ora.node1.ASM1.asm` on member `node1`
<<< stop dependant inst and ASM
2006-12-07 13:17:55.610: [ CRSRES][4588] Stop of `ora.node1.ASM1.asm` on member `node1` succeeded.

2006-12-07 13:17:55.661: [ CRSRES][4588] Attempting to start `ora.node1.vip` on member `node2`
2006-12-07 13:18:00.260: [ CRSRES][4588] Start of `ora.node1.vip` on member `node2` succeeded.
<<< now VIP failover complete after almost 4 mins


ora.node1.LISTENER_NODE1.lsnr.log shows:

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=node1vip)(PORT=1521)(IP=FIRST)))
TNS-12535: TNS:operation timed
2006-12-07 13:17:41.329: [ RACG][1] [23916][1][ora.node1.LISTENER_NODE1.lsnr]: out
   TNS-12560: TNS:protocol adapter error
     TNS-00505: Operation timed out
     Solaris Error: 145: Connection timed out
     Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=10.1.10.100)(PORT=1521)(IP=FIRST)))
The command completed successfully


Client connection hang during this failover time.


Changes
This may be a new setup, or a setup that was migrated from an earlier release.
Cause
This problem is caused by the first address in the listener.ora configuration being an address that uses the TCP protocol.

In this circumstance, when a network cable is pulled, "lsnrctl stop" has to wait for the TCP timeout before it can check the next address. On the Solaris platform, the TCP timeout is defined by tcp_ip_abort_cinterval, with a default value of 180000 ms (3 minutes). That is why shutting down the listener took almost 3.5 minutes (the TCP timeout on other platforms may vary). The error message "Solaris Error: 145: Connection timed out" in ora.node1.LISTENER_NODE1.lsnr.log also indicates that it is waiting for the TCP timeout.
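For reference, the current Solaris TCP connect-abort interval can be checked with ndd and lowered if needed; the 60000 ms value below is only an illustration, not a general recommendation:

# ndd -get /dev/tcp tcp_ip_abort_cinterval
180000
# ndd -set /dev/tcp tcp_ip_abort_cinterval 60000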

The listener.ora in this scenario is defined as:

 

LISTENER_NODE1 =
 (DESCRIPTION_LIST =
   (DESCRIPTION =
     (ADDRESS_LIST =
       (ADDRESS = (PROTOCOL = TCP)(HOST = node1vip)(PORT = 1521)(IP = FIRST))
     )
     (ADDRESS_LIST =
       (ADDRESS = (PROTOCOL = TCP)(HOST = 10.1.10.100)(PORT = 1521)(IP = FIRST))
     )
     (ADDRESS_LIST =
       (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC))
     )
   )
 )

Solution
To prevent this, move the IPC address to be the first address for the listener in the listener.ora, eg:


LISTENER_NODE1 =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
       (ADDRESS_LIST =
          (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC))
       )
       (ADDRESS_LIST =
          (ADDRESS = (PROTOCOL = TCP)(HOST = node1vip)(PORT = 1521)(IP = FIRST))
        )
       (ADDRESS_LIST =
           (ADDRESS = (PROTOCOL = TCP)(HOST = 10.1.10.100)(PORT = 1521)(IP = FIRST))
        )
     )
  )

When lsnrctl tries to stop the listener, it will now connect to the IPC address first, which is available during that time. It will not have to wait for tcp timeout.

After the above change, the VIP failover only takes 48 to 50 seconds to complete regardless of the tcp_ip_abort_cinterval setting.

Please note, listener.ora files newly created from 10.2.0.3 to 11.1.0.7 should have the IPC protocol as the first address in listener.ora in most cases.  However, if you have upgraded from a previous release, or manually modified/copied over a listener.ora from a previous install, you may not have the IPC protocol as the first address, regardless of your version. Manual modification is required to move IPC protocol to be the first address to avoid the problem described in this note.
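
After editing listener.ora, restart the listener for the change to take effect; a minimal sketch using lsnrctl with the listener name from this example:

$ lsnrctl stop LISTENER_NODE1
$ lsnrctl start LISTENER_NODE1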

References


 

Posted by [PineTree]
ORACLE/RAC 2012. 5. 18. 16:24

How to Configure Solaris Link-based IPMP for Oracle VIP [ID 730732.1]

  Modified 23-NOV-2011     Type REFERENCE     Status PUBLISHED

In this Document
  Purpose
  Scope
  How to Configure Solaris Link-based IPMP for Oracle VIP
  References


Applies to:

Oracle Server - Enterprise Edition - Version: 10.2.0.1 to 11.2.0.2 - Release: 10.2 to 11.2
Oracle Solaris on SPARC (64-bit)
Oracle Solaris on x86-64 (64-bit)

Purpose

This note gives a sample configuration for the link-based failure detection mode of IPMP, which was introduced on the Sun Solaris 10 platform.

Before Sun Solaris 10, only the probe-based failure detection IPMP configuration was available; an example can be found in note 283107.1.

The main differences between probe-based IPMP and link-based IPMP:
- In probe-based IPMP, besides the host's physical IP address you also need to assign a test IP address to each NIC, plus one target system (normally the default gateway) to which the multipathing daemon sends ICMP probe messages.

- In link-based IPMP, only the host's physical IP address is required.

Scope

By default, link-based failure detection is always enabled in Solaris 10, provided that the driver for the interface supports this type of failure detection. The following Sun network drivers are supported in the current release


hme
eri
ce
ge
bge
qfe
dmfe
e1000g
ixgb
nge
nxge
rge
xge



Network Requirement
--------------------------------
There is no difference between probe-based and link-based IPMP in terms of hardware requirements.

Only one physical IP address is required per cluster node. The following NICs and IP addresses are used in the example below:
- Public Interface: ce0 and ce1
- Physical IP: 130.35.100.123
- Oracle RAC VIP: 130.35.100.124

How to Configure Solaris Link-based IPMP for Oracle VIP

IPMP Configuration
-----------------------------
1. ifconfig ce0 group racpub
2. ifconfig ce0 addif 130.35.100.123 netmask + broadcast + up
3. ifconfig ce1 group racpub

To preserve the IPMP configuration across reboots, you need to update the /etc/hostname.* files as follows:
1. The entry of /etc/hostname.ce0 file
130.35.100.123 netmask + broadcast + group racpub up

2. The entry of /etc/hostname.ce1 file
group racpub up

Before the CRS installation, the 'ifconfig -a' output will be:

lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
hme0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 192.168.1.1 netmask ffffff00 broadcast 192.168.1.255
ether 0:19:b9:3f:87:11
ce0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 130.35.100.123 netmask ffffff00 broadcast 130.35.100.255
groupname racpub
ether 0:14:d1:13:7b:7e
ce1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
groupname racpub
ether 0:18:e7:8:c5:8b


Since no test IP is assigned to the public interfaces, the IP address of ce0 is the physical IP address and ce1 shows 0.0.0.0.

CRS / VIPCA configuration
----------------------------------------
Upon successful completion of root.sh during the CRS installation, vipca will only configure the primary interface as the public interface. If you start the vipca application manually, the second screen (VIP Configuration Assistant, 1 of 2) will only list ce0 as the available public interface.

After that, you need to update CRS with the second NIC (ce1) information using the srvctl command:

# srvctl modify nodeapps -n tsrac1 -A 130.35.100.124/255.255.255.0/ce0\|ce1
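
To confirm that both interfaces are now registered for the VIP, the nodeapps configuration can be queried; a quick check (output format varies by version):

# srvctl config nodeapps -n tsrac1 -a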

After CRS is installed and Oracle VIP is running, the 'ifconfig -a' output will be

lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
hme0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 192.168.1.1 netmask ffffff00 broadcast 192.168.1.255
ether 0:19:b9:3f:87:11
ce0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 130.35.100.123 netmask ffffff00 broadcast 130.35.100.255
groupname racpub
ether 0:14:d1:13:7b:7e
ce0:1: flags=1040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4> mtu 1500 index 3
inet 130.35.100.124 netmask ffffff00 broadcast 130.35.100.255
ce1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
groupname racpub
ether 0:18:e7:8:c5:8b


When the primary interface on the public network fails, whether because the NIC is faulty or the LAN cable is broken, the Oracle VIP follows the physical IP and fails over to the standby interface, as the following 'ifconfig -a' output shows:

lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
hme0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 192.168.1.1 netmask ffffff00 broadcast 192.168.1.255
ether 0:19:b9:3f:87:11
ce0: flags=19000802<BROADCAST,MULTICAST,IPv4,NOFAILOVER,FAILED> mtu 0 index 3
inet 0.0.0.0 netmask 0
groupname racpub
ether 0:14:d1:13:7b:7e
ce1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
groupname racpub
ether 0:18:e7:8:c5:8b
ce1:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
inet 130.35.100.123 netmask ffffff00 broadcast 130.35.100.255
ce1:2: flags=1040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4> mtu 1500 index 4
inet 130.35.100.124 netmask ffffff00 broadcast 130.35.100.255

References

NOTE:283107.1 - Configuring Solaris IP Multipathing (IPMP) for the Oracle 10g VIP
NOTE:368464.1 - How to Setup IPMP as Cluster Interconnect
docs.oracle.com/cd/E19253-01/816-4554/mpoverview/index.html


Posted by [PineTree]
ORACLE/RAC 2012. 5. 11. 11:00

 

RAC 10g R2 patch.pdf

Posted by [PineTree]
ORACLE/RAC 2012. 5. 11. 10:49

The following is excerpted from the Fall 2007 issue of the Korean edition of Oracle Magazine.


In general, the factor most closely related to RAC interconnect performance is the UDP buffer size. The UDP buffer size officially recommended by Oracle is 256K, which provides adequate performance on most systems. When a large amount of data is exchanged over the interconnect, a UDP buffer size of about 1M to 2M is sometimes used; if the UDP buffer size is too small for the transaction volume, packet loss can occur. When packet loss occurs frequently, the Oracle wait events "gc cr block lost" and "gc current block lost" appear. Monitor the UDP packet receive error and dropped rates with OS monitoring tools such as nmon, topas, and glance, with network commands such as ifconfig and netstat, or with other network diagnostic tools.
[oracle@rac1]$ netstat -s
Ip: ………
Tcp: …….

 


The network settings above can be changed while the system is running, but they are reset to their default values when the system reboots, so set them in a system startup script or at the kernel level so that they are applied automatically (see the Linux sketch below).
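
As a Linux-only illustration (the values are examples, not a recommendation), the UDP socket buffer settings can be made persistent in /etc/sysctl.conf and applied without a reboot:

# /etc/sysctl.conf (example values)
net.core.rmem_default = 262144
net.core.rmem_max = 2097152
net.core.wmem_default = 262144
net.core.wmem_max = 2097152

# apply the settings immediately
sysctl -p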

 

A customer failure case

 

Even with the hidden parameter (_imr_max_reconfig_max) that minimizes instance recovery and reconfiguration time,
and even when the open was attempted without instance recovery (_imr_active=false),
the database on the second node still opened only after about 20 minutes.

 

With the database in MOUNT state, a 10046 level 12 event was set and the database was opened. The trace showed that, during the recursive calls used for inter-instance synchronization before the open, the "global cache cr request" event and its elapsed time occurred very frequently, so a problem with the network transfer speed of the interconnect was suspected. Checking the OS settings of the Gigabit NIC used for the interconnect showed that it was set to 100 Mb full duplex; after changing it to 1000 Mb full duplex, the database opened normally and normal operation resumed.
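
For reference, on Linux the negotiated speed and duplex of a NIC can be checked, and forced if necessary, with ethtool; eth1 below is just a placeholder for the interconnect interface:

# check the current speed/duplex
ethtool eth1

# force 1000 Mb/s full duplex (example only; correct switch configuration/autonegotiation is usually preferred)
ethtool -s eth1 speed 1000 duplex full autoneg off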

 

 



 

Default UDP buffer size per OS and how to adjust it (maximum 8M)

OS       Kernel value          Default    Command
Linux    net.core.rmem_max     131071     sysctl -w net.core.rmem_max=8388608
Solaris  udp_max_buf           262144     ndd -set /dev/udp udp_max_buf 8388608
AIX      sb_max                1048576    no -o sb_max=8388608
                                          (only one of 1048576, 4194304, 8388608)

 

 

Tuning Inter-Instance Performance in RAC and OPS [ID 181489.1]  

  Modified 29-MAR-2009     Type BULLETIN     Status PUBLISHED

PURPOSE
-------

This note was written to help DBAs and Support Analysts Understand Inter-Instance
Performance and Tuning in RAC.


SCOPE & APPLICATION
-------------------

Real Application Clusters uses the interconnect to transfer blocks and messages 
between instances.  If inter-instance performance is bad, almost all database 
operations can be delayed.  This note describes methods of identifying and 
resolving inter-instance performance issues.


TUNING INTER-INSTANCE PERFORMANCE IN RAC AND OPS
------------------------------------------------




SYMPTOMS OF INTER-INSTANCE PERFORMANCE PROBLEMS
-----------------------------------------------

The best way to monitor inter-instance performance is to take AWR or statspack 
snaps on each instance (at the same time) at regular intervals.  

If there are severe inter-instance performance issues or hung sessions, you 
may also want to run the racdiag.sql script from the following note 
to collect additional RAC specific data:

  Note 135714.1 
  Script to Collect RAC Diagnostic Information (racdiag.sql) 

The output of the script has tips for how to read the output.  

Within the AWR, statspack report, or racdiag.sql output, you can use the wait 
events and global cache statistics to monitor inter-instance performance.  It 
will be important to look for symptoms of inter-instance performance issues.  
These symptoms include:

1. The average cr block receive time will be high.  This value is calculated by
dividing the 'global cache cr block receive time' statistic by the 
'global cache cr blocks received' statistic:

	global cache cr block receive time
	----------------------------------
     	 global cache cr blocks received

Multiply this calculation by 10 to find the average number of milliseconds.  In a 
9.2 statspack report you can also use the following Global Cache Service Workload 
characteristics:

Ave receive time for CR block (ms):                        4.1

The following query can also be run to monitor the average cr block receive time 
since the last startup:

set numwidth 20
column "AVG CR BLOCK RECEIVE TIME (ms)" format 9999999.9
select b1.inst_id, b2.value "GCS CR BLOCKS RECEIVED", 
b1.value "GCS CR BLOCK RECEIVE TIME",
((b1.value / b2.value) * 10) "AVG CR BLOCK RECEIVE TIME (ms)"
from gv$sysstat b1, gv$sysstat b2
where b1.name = 'global cache cr block receive time' and
b2.name = 'global cache cr blocks received' and b1.inst_id = b2.inst_id ;

The average cr block receive time (or current block receive time) should typically be less than 15 milliseconds, 
depending on your system configuration and volume. It is the average latency of a consistent-read request 
round trip from the requesting instance to the holding instance and back to the requesting instance. 

Please note that if you are on 9i and the global cache current block receive 
time is abnormally high and the average wait time for the 'global cache null 
to x' wait event is low (under 15ms) then you are likely hitting bug 2130923 
(statistics bug).  This is a problem in the way statstics are reported and does 
not impact performance.

More about that issue is documented in the following note:

  Note 243593.1 
  RAC: Ave Receive Time for Current Block is Abnormally High in Statspack 

2. "Global cache" or "gc" events will be the top wait event.  Some of these wait
events show the amount of time that an instance has requested a data block for a 
consistent read or current block via the global cache.  



When a consistent read buffer cannot be found in the local cache, an attempt is 
made to find a usable version in another instance. There are 3 possible outcomes, 
depending on whether any instance in the cluster has the requested data block 
cached or not: 

a) A cr block is received (i.e. another instance found or managed to produce the 
   wanted version).  The "global cache cr blocks received" statistic is incremented. 
b) No other instance has the block cached and therefore the requesting instance 
   needs to read from disk, but a shared lock will be granted to the requestor. 
   The "global cache gets" statistic is incremented. 
c) 9i RAC+ only --> A current block is received (the current block is good enough for 
   the query).  The "global cache current blocks received" statistic is 
   incremented.

In all three cases, the requesting process may wait for global cache cr request.
The view X$KCLCRST (CR Statistics) may be helpful in debugging 'global cache cr 
request' wait issues.  It will return the number of requests that were handled for 
data or undo header blocks, the number of requests resulting in the shipment of a 
block (cr or current),  and the number of times a read from disk status is returned.

It should be noted that having 'global cache' or 'gc' waits does not always
indicate an inter-instance performance issue.  Many times this wait is 
completely normal if data is read and modified concurrently on multiple
instances.  Global cache statistics should also be examined to determine if 
there is an inter-instance performance problem.

3. The GES may run out of tickets.  When viewing the racdiag.sql output 
(Note 135714.1) or querying the gv$ges_traffic_controller or 
gv$dlm_traffic_controller views (see the query sketch below), you may find that TCKT_AVAIL shows '0'.  To 
describe the available network buffer space we introduce the concept of tickets.  
The maximum number of tickets available is a function of the network send buffer 
size.  In the case of lmd and lmon, they always buffer their messages in case of 
ticket unavailability.  A node relies on messages coming back from the remote 
node to release tickets for reuse.
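
A quick way to check ticket availability from SQL is sketched below (run as a privileged user; the column names are those referenced above, and the exact column list can vary by release):

set linesize 120
select inst_id, tckt_avail, tckt_limit, tckt_rcvd, tckt_wait
from gv$dlm_traffic_controller
order by inst_id;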

4. The above information should be enough to identify an inter-instance performance
problem; additional calculations that can be made to monitor inter-instance 
performance can be found in the documentation.


IDENTIFYING AND RESOLVING INTER-INSTANCE PERFORMANCE PROBLEMS
-------------------------------------------------------------

Inter-Instance performance issues can be caused by:

1. Under-configured network settings at the OS level.  Check UDP or other network protocol 
settings and tune them.  See your OS-specific documentation for instructions on how 
to do this.  If using UDP, make sure the parameters relating to send buffer space, 
receive buffer space, send highwater, and receive highwater are set well above the 
OS default.  The alert.log will indicate which protocol is being used.  Example:

	cluster interconnect IPC version:Oracle RDG
	IPC Vendor 1 proto 2 Version 1.0

Changing network parameters to optimal values:

 Sun (UDP Protocol) 
	UDP related OS parameters can be queried with the following command:
		ndd /dev/udp udp_xmit_hiwat
		ndd /dev/udp udp_recv_hiwat 
                ndd /dev/udp udp_max_buf 
	Set the udp_xmit_hiwat and udp_recv_hiwat to the OS maximum with:
		ndd -set /dev/udp udp_xmit_hiwat <value>
		ndd -set /dev/udp udp_recv_hiwat <value> 
                ndd -set /dev/udp udp_max_buf <1M or higher>
 IBM AIX (UDP Protocol)
	UDP related OS parameters can be queried with the following command:
		no -a
	Set the udp_sendspace and udp_recvspace to the OS maximum with:
		no -o <parameter>
 Linux (edit files)
	/proc/sys/net/core/rmem_default 
	/proc/sys/net/core/rmem_max
	/proc/sys/net/core/wmem_default
	/proc/sys/net/core/wmem_max 
 HP-UX (HMP Protocol):
	The file /opt/clic/lib/skgxp/skclic.conf contains the Hyper Messaging Protocol (HMP)
        configuration parameters that are relevant for Oracle:
	- CLIC_ATTR_APPL_MAX_PROCS Maximum number of Oracle processes. This includes
	  the background and shadow processes. It does not
	  include non-IPC processes like SQL client processes.
	- CLIC_ATTR_APPL_MAX_NQS This is a derivative of the first parameter; it will 
          be removed in the next release. For the time being, this should be set to 
          the value of CLIC_ATTR_APPL_MAX_PROCS.
	- CLIC_ATTR_APPL_MAX_MEM_EPTS Maximum number of Buffer descriptors. Oracle 
	  seems to require about 1500-5000 of them depending on the block size (8K or 
	  2K). You can choose the maximum value indicated above.
	- CLIC_ATTR_APPL_MAX_RECV_EPTS Maximum number of Oracle Ports. Typically, 
	  Oracle requires as many ports as there are processes. Thus it should be 
	  identical to CLIC_ATTR_APPL_MAX_PROCS.
	- CLIC_ATTR_APPL_DEFLT_PROC_SENDS Maximum number of outstanding sends. You 
	  can leave it at the default value of 1024.
	- CLIC_ATTR_APPL_DEFLT_NQ_RECVS Maximum number of outstanding receives on a 
	  port or buffer. You can leave it at the default value of 1024.
 HP-UX (UDP Protocol):
	Not tunable before HP-UX 11i Version 1.6.
      For HP-UX 11i Version 1.6 or later, the following commands can be used to set socket_udp_rcvbuf_default and socket_udp_sndbuf_default:
      ndd -set /dev/udp socket_udp_rcvbuf_default 1048576
      echo $?
      ndd -set /dev/udp socket_udp_sndbuf_default 1048576
      echo $? 
 HP Tru64 (RDG Protocol):
	RDG related OS parameters are queried with the following command:
		/sbin/sysconfig -q rdg 
	The most important parameters and settings are:
	- rdg_max_auto_msg_wires - MUST be set to zero.
	- max_objs - Should be set to at least <# of Oracle processes * 5> and up to 
	  the larger of 10240 or <# of Oracle processes * 70>. Example: 5120
	- msg_size - Needs to set to at least <db_block_size>, but we recommend 
	  setting it to 32768, since Oracle9i supports different block sizes for each 
	  tablespace.
	- max_async_req - Should be set to at least 100 but 256+ may provide better 
	  performance.
	- max_sessions - Should be set to at least <# of Oracle processes + 20>, 
	  example: 500	
 HP TRU64 (UDP Protocol):
	UDP related OS parameters are queried with the following command:
		/sbin/sysconfig -q udp 
	udp_recvspace 
	udp_sendspace 


2. If the interconnect is slow, busy, or faulty, you can look for dropped packets,
retransmits, or cyclic redundancy check errors (CRC).  You can use netstat commands
to check the networks.  On Unix, check the man page for netstat for a list of options.  
Also check the OS logs for any errors and make sure that inter-instance traffic is 
not routed through a public network.  


With most network protocols, you can use 'oradebug ipc' to see which interconnects 
the database is using:

  SQL> oradebug setmypid
  SQL> oradebug ipc

This will dump a trace file to the user_dump_dest.  The output will look something 
like this:

SSKGXPT 0x1a2932c flags SSKGXPT_READPENDING     info for network 0
        socket no 10    IP 172.16.193.1         UDP 43749
        sflags SSKGXPT_WRITESSKGXPT_UP  info for network 1
        socket no 0     IP 0.0.0.0      UDP 0...

So you can see that we are using IP 172.16.193.1 with a UDP protocol.


3. A large number of processes in the run queue waiting for CPU or scheduling
delays.  If your CPU has limited idle time and your system typically processes 
long-running queries, then latency may be higher.  Ensure that LMSx processes get 
enough CPU.

4. Latency can be influenced by a high value for the DB_FILE_MULTIBLOCK_READ_COUNT 
parameter. This is because a requesting process can issue more than one request 
for a block depending on the setting of this parameter.  


ADDITIONAL RAC AND OPS PERFORMANCE TIPS
---------------------------------------

1. Poor SQL or bad optimization paths can cause additional block gets via the
interconnect just as it would via I/O.  

2. Tuning normal single instance wait events and statistics is also very 
important.

3. A poor gc_files_to_locks setting can cause problems.  In almost all cases 
in RAC, gc_files_to_locks does not need to be set at all.  


4. The use of locally managed tablespaces (instead of dictionary managed) with 
the 'SEGMENT SPACE MANAGEMENT AUTO' option can reduce dictionary and freelist 
block contention.  Symptoms of this could include 'buffer busy' waits.  See the 
following notes for more information:

  Note 105120.1
  Advantages of Using Locally Managed vs Dictionary Managed Tablespaces 

  Note 103020.1 
  Migration from Dictionary Managed to Locally Managed Tablespaces 

  Note 180608.1
  Automatic Space Segment Management in RAC Environments


Following these recommendations can help you achieve maximum performance in
your clustered environment.


RELATED DOCUMENTS
-----------------
Oracle Documentation
Note 188135.1 - Documentation Index for Real Application Clusters and Parallel Server 
Note 94224.1 FAQ- STATSPACK COMPLETE REFERENCE 
Note 135714.1 - Script to Collect RAC or OPS Diagnostic Information 
Note 157766.1 - Sessions Wait Forever for 'global cache cr request' Wait Event...
Note 151051.1 - PARAMETER:CLUSTER_INTERCONNECTS
Note 120650.1 - Init.ora Parameter "OPS_INTERCONNECTS" Reference Note

Posted by [PineTree]
ORACLE/RAC 2010. 4. 14. 10:00

Source: http://scidb.tistory.com/


In the earlier post on the true meaning of maxvalue in range partitioning, we looked at the points to watch when building a range partition on multiple columns. To follow this post easily, please read that post first. The test scripts used here are also the same ones used in that post.

 

In an environment configured as a 4-node RAC, when the same SQL runs evenly on all instances but instance 1 alone is noticeably slow, what should you suspect? It could be a network or similar problem, but the first thing to check is whether the gc current grant busy event is occurring.
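
One simple way to check is to compare the cumulative waits for this event across instances; a sketch (requires SELECT privilege on the GV$ views):

select inst_id, event, total_waits, time_waited
from gv$system_event
where event like 'gc current grant%'
order by inst_id;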

Let's build a test environment.

 

CREATE TABLE t (

  id NUMBER,

  d1 DATE,

  day_num VARCHAR2(2), 

  inst_id NUMBER(1),

  pad VARCHAR2(4000),

  CONSTRAINT t_pk PRIMARY KEY (id)

)

PARTITION BY RANGE (day_num,inst_id) (

  PARTITION pt_1_1 VALUES LESS THAN ('1', 2),

  PARTITION pt_1_2 VALUES LESS THAN ('1', 3),

  PARTITION pt_1_3 VALUES LESS THAN ('1', 4),

  PARTITION pt_1_4 VALUES LESS THAN ('1', 5),

  PARTITION pt_2_1 VALUES LESS THAN ('2', 2),

  PARTITION pt_2_2 VALUES LESS THAN ('2', 3),

PARTITION pt_2_3 VALUES LESS THAN ('2', 4),

PARTITION pt_2_3 VALUES LESS THAN ('2', 5),

  ...중간생략

  PARTITION pt_7_1 VALUES LESS THAN ('7', 2),

  PARTITION pt_7_2 VALUES LESS THAN ('7', 3),

  PARTITION pt_7_3 VALUES LESS THAN ('7', 4),

  PARTITION pt_7_4 VALUES LESS THAN ('7', 5)

);

 

Table created.

 

---> At this point, run the INSERT statements and the dbms_stats.gather_table_stats call used in the previous post.

 

 

Situation: the two SQL statements below are executed concurrently, many times, on every instance.

 

SELECT COUNT(*)

  FROM T

 WHERE DAY_NUM = '3';           --> DAY_NUM '3' partitions

 

UPDATE T

   SET pad = LPAD('A', 4000, 'B')

 WHERE DAY_NUM = '4'          --> DAY_NUM '4' partitions

   AND INST_ID = :V_INST_ID;      --> bind the number of the instance currently executing

 

In this situation, only the UPDATE statement on instance 1 runs noticeably slowly. Below is a conversation between the developer and the author.

 

Developer: Is the UPDATE slow only when 1 is supplied for the bind variable?


Author: I suspect it is because, to run the UPDATE on instance 1, the exclusive-mode lock grant has to be obtained back from the other instance.


Developer: What do you mean by "grant"?


Author: Because the SELECT accessed the partition corresponding to DAY_NUM '4' five or more times, the grant appears to have been handed over to the other instance. This phenomenon is called FDC (Fairness Down Convert). Once FDC has occurred, running an UPDATE against the affected blocks of the first DAY_NUM '4' partition (pt_4_1) requires obtaining the grant again (the gc current grant busy event).


Developer: That can't be right. The UPDATE uses the condition DAY_NUM = '4' and the SELECT uses DAY_NUM = '3', so they touch different partitions. There is no reason for the SELECT and the UPDATE to access the same partition.


Author: The SELECT actually always accesses the first partition for DAY_NUM = '4'. That is because MAXVALUE was not specified.


Developer: I see. No wonder gc current grant busy showed up so often in the trace.

 

Below is an excerpt of the wait-event portion of the trace provided by the developer.

 

core1_ora_13638.trc

-----------------------------

WAIT #11: nam='gc current grant busy' ela= 947 p1=28 p2=1672046 p3=33619969 obj#=12043270 tim=12372374088207

WAIT #11: nam='gc current grant busy' ela= 992 p1=29 p2=2310876 p3=33619969 obj#=12070599 tim=12372374089432

...중간생략

WAIT #11: nam='gc current grant busy' ela= 767 p1=28 p2=1673090 p3=33619969 obj#=12043272 tim=12372374096882

 


What is Fairness Down Convert?
It means that an exclusive-mode lock is down-converted to shared-lock mode. Shipping a block held in exclusive mode to another instance on request is an expensive operation, so if the block is requested more than a certain number of times the lock is converted to shared mode. After FDC occurs, the block no longer needs to be shipped to the requesting instance, so performance improves. Conversely, however, updating that block from the original instance
now requires obtaining the grant back, which is why performance slows down.

The parameter that controls FDC is _FAIRNESS_THRESHOLD. Its default value is 4; that is, when a particular block is accessed five or more times from another instance, FDC occurs and the grant is handed over to the requesting instance.
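
For reference, the current value of this hidden parameter can be checked as SYS with a query against the X$ parameter tables (a commonly used sketch; change hidden parameters only under Oracle Support's guidance):

select a.ksppinm parameter_name, b.ksppstvl current_value, b.ksppstdf is_default
from x$ksppi a, x$ksppcv b
where a.indx = b.indx
  and a.ksppinm = '_fairness_threshold';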

 

Conclusion:

The FDC feature was created to improve performance, but as in the case above it can sometimes make things slower instead; the trade-off is clear. The root cause here was using partitioning without fully understanding its behavior, and since not only the SELECT but also the DML can slow down as a result, care must be taken.
Posted by [PineTree]
ORACLE/RAC 2009. 2. 23. 11:22
Source: http://kr.blog.yahoo.com/dbacool/1157

How to back up the archived log files of both nodes simultaneously with RMAN in a RAC (OPS) environment
======================================================================================

Versions before Oracle 9i
-------------------

Up to Oracle 8i, the backup could be taken with a script like the following.

1) Script Name: arch_backup.rcv

run{
allocate channel node_1 type disk connect 'system/manager@v92hp1';
allocate channel node_2 type disk connect 'system/manager@v92hp2';

backup filesperset 1
(archivelog until time 'SYSDATE' thread 1 channel node_1)
(archivelog until time 'SYSDATE' thread 2 channel node_2);

release channel node_1;
release channel node_2;

}

2) How to run it

$ rman target=system/manager catalog=rman_user/rmanpw cmdfile='arch_backup.rcv' log='arch_backup.log'



Oracle 9i and later
-------------------

From Oracle9i onward, however, the following configuration must be done
before backing up the archived files.

1) Configuration

$ rman target=system/manager catalog=rman_user/rmanpw
RMAN> Show all;
RMAN> CONFIGURE DEVICE TYPE DISK PARALLELISM 2;
RMAN> configure default device type to disk;
RMAN> configure channel 1 device type disk connect 'system/manager@v92hp1';
RMAN> configure channel 2 device type disk connect 'system/manager@v92hp2';

The configuration above assumes the backup is written to disk, so every device type is set to disk.
If you back up to tape using a backup solution, change the device type to 'sbt_tape'.

The PARALLELISM value must match the number of channels you configure.
If they do not match, errors like the following are raised and the archived files of the other node
are not recognized (even though the archived files actually exist).

RMAN-06059: expected archived log not found, lost of archived log compromises recoverability
ORA-19625: error identifying file /u01/64bit/app/oracle/product/9.2.0/admin/V92HP/arch/arch1_146.dbf
ORA-27037: unable to obtain file status
HP-UX Error: 2: No such file or directory
Additional information: 3

The configuration above only needs to be done once.
If a channel was configured incorrectly, it can be cleared with the following command.

RMAN> configure channel 1 device type disk clear;


2) Back up the archived files.

RMAN> run { backup
format='/u01/64bit/app/oracle/product/9.2.0/admin/V92HP/arch/%U'
archivelog all delete input;
}



ADDITIONAL INFORMATION (1)
-------------------------
If some archived files have been deleted at the OS level in a RAC environment, run a validation check
with the following commands before performing the backup.

RMAN> allocate channel for maintenance type disk connect 'system/manager@v92hp1';
RMAN> allocate channel for maintenance type disk connect 'system/manager@v92hp2';
RMAN> crosscheck archivelog all;

If the channels have already been set up in the configuration,
the crosscheck command can be run directly without allocating channels.
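
If the crosscheck marks missing files as EXPIRED, they can then be removed from the RMAN repository; review the list first, because this permanently removes the expired records:

RMAN> delete expired archivelog all;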


ADDITIONAL INFORMATION (2)
-------------------------
To set the backup FORMAT together with the channel configuration, use the following form.

RMAN> CONFIGURE DEVICE TYPE DISK PARALLELISM 2;
RMAN> configure default device type to disk;
RMAN> configure channel 1 device type disk connect 'system/manager@v92hp1' FORMAT '/arch/bkup%t_s%s_s%p';
RMAN> configure channel 2 device type disk connect 'system/manager@v92hp2' FORMAT '/arch/bkup%t_s%s_s%p';


ADDITIONAL INFORMATION (3)
-------------------------
When using a tape device, use device type 'sbt_tape'.

RMAN> CONFIGURE DEVICE TYPE 'SBT_TAPE' PARALLELISM 2;
RMAN> configure default device type to 'sbt_tape';
RMAN> configure channel 1 device type 'sbt_tape' connect 'system/manager@v92hp1' FORMAT 'bkup%t_s%s_s%p';
RMAN> configure channel 2 device type 'sbt_tape' connect 'system/manager@v92hp2' FORMAT 'bkup%t_s%s_s%p';
Posted by [PineTree]