'RAC' 태그의 글 목록 (3 Page)

'RAC'에 해당되는 글 23건

2012.05.18 VIP Failover Take Long Time After Network Cable Pulled [ID 403743.1]
2012.05.18 How to Configure Solaris Link-based IPMP for Oracle VIP [ID 730732.1]
2012.05.11 UDP Buffer Tuning 기법

VIP Failover Take Long Time After Network Cable Pulled [ID 403743.1]

VIP Failover Take Long Time After Network Cable Pulled [ID 403743.1]
--------------------------------------------------------------------------------

수정 날짜 05-JAN-2011 유형 PROBLEM 상태 PUBLISHED

In this Document
Symptoms
Changes
Cause
Solution
References

--------------------------------------------------------------------------------

Applies to:
Oracle Server - Enterprise Edition - Version: 10.2.0.1 to 11.1.0.7 - Release: 10.2 to 11.1
Information in this document applies to any platform.
***Checked for relevance on 05-Jan-2011***
Symptoms
This example is based on SUN Solaris platform, with IPMP configured for the public network. In this case, VIP failover takes almost 4 minutes to complete when both network cables of the public network are pulled from one node.

crsd.log shows:

2006-12-07 13:14:05.401: [ CRSAPP][4588] CheckResource error for ora.node1.vip error code = 1
2006-12-07 13:14:05.408: [ CRSRES][4588] In stateChanged, ora.node1.vip target is ONLINE
2006-12-07 13:14:05.409: [ CRSRES][4588] ora.node1.vip on node1 went OFFLINE unexpectedly
<<< detect network cable failure and VIP OFFLINE immediately

2006-12-07 13:14:05.410: [ CRSRES][4588] StopResource: setting CLI values
2006-12-07 13:14:05.420: [ CRSRES][4588] Attempting to stop `ora.node1.vip` on member `node1`
2006-12-07 13:14:06.651: [ CRSRES][4588] Stop of `ora.node1.vip` on member `node1` succeeded.
2006-12-07 13:14:06.652: [ CRSRES][4588] ora.node1.vip RESTART_COUNT=0 RESTART_ATTEMPTS=0
2006-12-07 13:14:06.667: [ CRSRES][4588] ora.node1.vip failed on node1 relocating.
2006-12-07 13:14:06.758: [ CRSRES][4588] StopResource: setting CLI values
2006-12-07 13:14:06.766: [ CRSRES][4588] Attempting to stop `ora.node1.LISTENER_NODE1.lsnr` on member `node1`
2006-12-07 13:17:41.399: [ CRSRES][4588] Stop of `ora.node1.LISTENER_NODE1.lsnr` on member `node1` succeeded.
<<< takes 3.5 minutes to stop listener

2006-12-07 13:17:41.402: Attempting to stop `ora.node1.ASM1.asm` on member `node1`
<<< stop dependant inst and ASM
2006-12-07 13:17:55.610: [ CRSRES][4588] Stop of `ora.node1.ASM1.asm` on member `node1` succeeded.

2006-12-07 13:17:55.661: [ CRSRES][4588] Attempting to start `ora.node1.vip` on member `node2`
2006-12-07 13:18:00.260: [ CRSRES][4588] Start of `ora.node1.vip` on member `node2` succeeded.
<<< now VIP failover complete after almost 4 mins

ora.node1.LISTENER_NODE1.lsnr.log shows:

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=node1vip)(PORT=1521)(IP=FIRST)))
TNS-12535: TNS:operation timed
2006-12-07 13:17:41.329: [ RACG][1] [23916][1][ora.node1.LISTENER_NODE1.lsnr]: out
   TNS-12560: TNS:protocol adapter error
     TNS-00505: Operation timed out
     Solaris Error: 145: Connection timed out
     Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=10.1.10.100)(PORT=1521)(IP=FIRST)))
The command completed successfully

Client connection hang during this failover time.

Changes
This may be a new setup, or a setup that was migrated from an earlier release.
Cause
This problem is caused by the first address in the listener.ora configuration being an address that uses the TCP protocol.

In this circumstance, when a network cable is pulled, "lsnrctl stop" listener has to wait for TCP timeout before it can check next address. On the Solaris platform, TCP timeout is defined by tcp_ip_abort_cinterval with a default value of 180000 (3 minutes). That is why shutting down listener almost took 3.5 minutes. (TCP timeout on other platforms may vary). The error message "Solaris Error: 145: Connection timed out" in ora.node1.LISTENER_NODE1.lsnr.log also indicates it is waiting for tcp timeout.

The listener.ora in this scenario is defined as:

[LISTENER_NODE1 =
(DESCRIPTION_LIST =
   (DESCRIPTION =
     (ADDRESS_LIST =
       (ADDRESS = (PROTOCOL = TCP)(HOST = node1vip)(PORT = 1521)(IP = FIRST))
     )
     (ADDRESS_LIST =
       (ADDRESS = (PROTOCOL = TCP)(HOST = 10.1.10.100)(PORT = 1521)(IP = FIRST))
     )
     (ADDRESS_LIST =
       (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC))
     )
   )
)Solution
To prevent this, move the IPC address to be the first address for the listener in the listener.ora, eg:

LISTENER_NODE1 =
(DESCRIPTION_LIST =
    (DESCRIPTION =
       (ADDRESS_LIST =
          (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC))
       )
       (ADDRESS_LIST =
          (ADDRESS = (PROTOCOL = TCP)(HOST = node1vip)(PORT = 1521)(IP = FIRST))
        )
       (ADDRESS_LIST =
           (ADDRESS = (PROTOCOL = TCP)(HOST = 10.1.10.100)(PORT = 1521)(IP = FIRST))
        )
     )
)

When lsnrctl tries to stop the listener, it will now connect to the IPC address first, which is available during that time. It will not have to wait for tcp timeout.

After the above change, the VIP failover only takes 48 to 50 seconds to complete regardless of the tcp_ip_abort_cinterval setting.

Please note, listener.ora files newly created from 10.2.0.3 to 11.1.0.7 should have the IPC protocol as the first address in listener.ora in most cases. However, if you have upgraded from a previous release, or manually modified/copied over a listener.ora from a previous install, you may not have the IPC protocol as the first address, regardless of your version. Manual modification is required to move IPC protocol to be the first address to avoid the problem described in this note.

References

저작자표시 비영리

'ORACLE > RAC' 카테고리의 다른 글

11g rac설치후 resource 자동으로 올라오게 설정 (0)	2013.05.30
Oracle 11g RAC startup policy 변경(crs start시 instance자동 start 설정) (0)	2013.05.30
How to Configure Solaris Link-based IPMP for Oracle VIP [ID 730732.1] (0)	2012.05.18
oracle RAC 10.2.0.1 -> 10.2.0.5 패치 (0)	2012.05.11
UDP Buffer Tuning 기법 (0)	2012.05.11

Posted by [PineTree]

How to Configure Solaris Link-based IPMP for Oracle VIP [ID 730732.1]

How to Configure Solaris Link-based IPMP for Oracle VIP [ID 730732.1]

		수정 날짜 23-NOV-2011 유형 REFERENCE 상태 PUBLISHED

In this Document
  Purpose
  Scope
  How to Configure Solaris Link-based IPMP for Oracle VIP
  References

Applies to:

Oracle Server - Enterprise Edition - Version: 10.2.0.1 to 11.2.0.2 - Release: 10.2 to 11.2
Oracle Solaris on SPARC (64-bit)
Oracle Solaris on x86-64 (64-bit)

Purpose

This note will give a sample configuration for Link-based failure detection mode for IPMP which is introduced in Sun Solaris 10 platform.

Before Sun Solaris 10, there is only Probe-based failure detection IPMP configuration that you can find the example in note 283107.1

The main different between Probe-based IPMP and Link-based IPMP
- In Probe-based IPMP, beside the host's Physical IP address you also need to assign a test IP address for each NIC card. And one target system, normally default gateway, where multipathing daemon used for ICMP probe message.

- In Link-based IPMP, only the host's Physical IP address is required.

Scope

By default, link-based failure detection is always enabled in Solaris 10, provided that the driver for the interface supports this type of failure detection. The following Sun network drivers are supported in the current release

hme
eri
ce
ge
bge
qfe
dmfe
e1000g
ixgb
nge
nxge
rge
xge

Network Requirement
--------------------------------
There is no different for Probe-based and Link-based in term of hardware requirement.

It is only one Physical IP address required per cluster node. The following is list of NIC Card and IP addresses that will be used in the following example.
- Public Interface: ce0 and ce1
- Physical IP: 130.35.100.123
- Oracle RAC VIP: 130.35.100.124

How to Configure Solaris Link-based IPMP for Oracle VIP

IPMP Configuration
-----------------------------
1. ifconfig ce0 group racpub
2. ifconfig ce0 addif 130.35.100.123 netmask + broadcast + up
3. ifconfig ce1 group racpub

To preserve the IPMP configuration across reboot, you need to update the /etc/hostname.* files as following
1. The entry of /etc/hostname.ce0 file
130.35.100.123 netmask + broadcast + group racpub up

2. The entry of /etc/hostname.ce1 file
group racpub up

Before CRS installation , the 'ifconfig -a' output will be

lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
hme0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 192.168.1.1 netmask ffffff00 broadcast 192.168.1.255
ether 0:19:b9:3f:87:11
ce0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 130.35.100.123 netmask ffffff00 broadcast 130.35.100.255
groupname racpub
ether 0:14:d1:13:7b:7e
ce1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
groupname racpub
ether 0:18:e7:8:c5:8b

Since no test IP assigned to public interfaces, the IP address of ce0 is the physical IP address and ce1 is 0.0.0.0.

CRS / VIPCA configuration
----------------------------------------
Upon successful of root.sh on the CRS installation, vipca will only make the primary interface as the public interface. If you start vipca application manually, the second screen (VIP Configuration Assistant, 1 of 2) will only list ce0 as the available public interface.

After that, you need to update CRS for the second NIC (ce1) information with srvctl command

# srvctl modify nodeapps -n tsrac1 -A 130.35.100.124/255.255.255.0/ce0\|ce1

After CRS is installed and Oracle VIP is running, the 'ifconfig -a' output will be

lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
hme0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 192.168.1.1 netmask ffffff00 broadcast 192.168.1.255
ether 0:19:b9:3f:87:11
ce0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 130.35.100.123 netmask ffffff00 broadcast 130.35.100.255
groupname racpub
ether 0:14:d1:13:7b:7e
ce0:1: flags=1040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4> mtu 1500 index 3
inet 130.35.100.124 netmask ffffff00 broadcast 130.35.100.255
ce1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
groupname racpub
ether 0:18:e7:8:c5:8b

When the primary interface on the public network failed, either NIC faulty or the LAN cable broken, Oracle VIP will follow the physical IP failover to standby interface. As the following output of 'ifconfig -a' shows

lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
hme0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 192.168.1.1 netmask ffffff00 broadcast 192.168.1.255
ether 0:19:b9:3f:87:11
ce0: flags=19000802<BROADCAST,MULTICAST,IPv4,NOFAILOVER,FAILED> mtu 0 index 3
inet 0.0.0.0 netmask 0
groupname racpub
ether 0:14:d1:13:7b:7e
ce1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
groupname racpub
ether 0:18:e7:8:c5:8b
ce1:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
inet 130.35.100.123 netmask ffffff00 broadcast 130.35.100.255
ce1:2: flags=1040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4> mtu 1500 index 4
inet 130.35.100.124 netmask ffffff00 broadcast 130.35.100.255

References

NOTE:283107.1 - Configuring Solaris IP Multipathing (IPMP) for the Oracle 10g VIP
NOTE:368464.1 - How to Setup IPMP as Cluster Interconnect
docs.oracle.com/cd/E19253-01/816-4554/mpoverview/index.html

'ORACLE > RAC' 카테고리의 다른 글

Oracle 11g RAC startup policy 변경(crs start시 instance자동 start 설정) (0)	2013.05.30
VIP Failover Take Long Time After Network Cable Pulled [ID 403743.1] (0)	2012.05.18
oracle RAC 10.2.0.1 -> 10.2.0.5 패치 (0)	2012.05.11
UDP Buffer Tuning 기법 (0)	2012.05.11
RAC 에서 특정 Instance만 성능이 느린 이유 (0)	2010.04.14

Posted by [PineTree]

UDP Buffer Tuning 기법

오라클 매거진 2007년 가을호 발췌자료입니다.

일반적으로 RAC의 Inter-Connects의 성능과 가장 연관이 깊은것이 UDP Buffer의 크기이다. 오라클이 공식적으로 권고하는 UDP
Buffer 크기는 256K이고 대부분의 시스템에서 적절한 성능을 보장하고있다. Inter-Connect을 통해 주고 받는 Data량이 많은 경우 UDP Buffer Size를 1M~2M 정도 사용하기도 하며 Transaction에 비해 UDP Buffer Size 가 작을 경우 Packet Loss 현상이 발생할 수 있다. Packet Loss 현상이 자주 발생할 경우“gc cr block lost”,“gc current block lost”의 Oracle Wait Event가 나타난다. 이는 각 OS 별 Monitoring Tool인 nmon, topas, glance이나 Network 명령인 ifconfig, netstat 등과
Network 진단 Tool을 통해 UDP Packet Receive Error나 Dropped 된비율을 Monitoring 하도록 한다.
[oracle@rac1]$ netstat -s
Ip: ………
Tcp: …….

앞의 Network Setting 값들은 System 운영 중에 변경이 가능하며 System Rebooting 시에는 Default 값으로 초기화되므로 System Starting Script나 Kernel Level에서 값을 Settting하여 자동 적용이 되도록 한다.

고객사 장애사례

Instance Recovery 및 Reconfigure 시간을 최대 단축하는

Hidden Parameter(_imr_max_reconfig_max)와 Instance Recovery 없이 (_imr_active=false) Open 시도를 해도 두 번째 Node의 DB는 약20분 후에 Open됨.

DB Mount 상태에서 10046-level 12로 event를 걸어 DB Open하여 trace를 살펴보니 Inter-connects 간 DB Open전에 동기화를 위한 Recursive Call 시 ”global cache cr request”event와 해당 elaspsed time이 굉장히 자주 많이 발생하여 Inter-Connects 간 Network 전송속도에 문제가 있을 것으로 판단하여 OS의 Inter-Connects용 Gigabit-NIC 설
정 값을 확인해보니 100 M Full 설정이 되어 있어 1000 M Full로 변경하여 정상적으로 DB Open 및 업무가 진행이 되었다.

일반적으로 RAC의 Inter-Connects의 성능과 가장 연관이 깊은것이 UDP Buffer의 크기이다. 오라클이 공식적으로 권고하는 UDP
Buffer 크기는 256K이고 대부분의 시스템에서 적절한 성능을 보장하고있다. Inter-Connect을 통해 주고 받는 Data량이 많은 경우 UDP Buffer Size를 1M~2M 정도 사용하기도 하며 Transaction에 비해 UDP Buffer Size 가 작을 경우 Packet Loss 현상이 발생할 수 있다. Packet Loss 현상이 자주 발생할 경우“gc cr block lost”,“gc current block lost”의 Oracle Wait Event가 나타난다. 이는 각 OS 별 Monitoring Tool인 nmon, topas, glance이나 Network 명령인 ifconfig, netstat 등과
Network 진단 Tool을 통해 UDP Packet Receive Error나 Dropped 된비율을 Monitoring 하도록 한다.
[oracle@rac1]$ netstat -s
Ip: ………
Tcp: …….

OS별 Default UDP Buffer Size와 조절방법(MAX:8M)
kernel	Value	Default	Command
Linex	net.core.rmem_max	131071	sysctl -w net.core.rmem_max = 8388608
Solaris	udp_max_buf	262144	ndd -set /dev/udp udp_max_buf 8388608
AIX	sb_max	1048576	no -o sb_max=8388608 (1048576, 4194304, 83388608 값 중에서만)

OS별 Default UDP Buffer Size와 조절방법(MAX:8M)
kernel	Value	Default	Command
Linex	net.core.rmem_max	131071	sysctl -w net.core.rmem_max = 8388608
Solaris	udp_max_buf	262144	ndd -set /dev/udp udp_max_buf 8388608
AIX	sb_max	1048576	no -o sb_max=8388608 (1048576, 4194304, 83388608 값 중에서만)

Tuning Inter-Instance Performance in RAC and OPS [ID 181489.1]

		수정 날짜 29-MAR-2009 유형 BULLETIN 상태 PUBLISHED

PURPOSE
-------

This note was written to help DBAs and Support Analysts Understand Inter-Instance
Performance and Tuning in RAC.

SCOPE & APPLICATION
-------------------

Real Application Clusters uses the interconnect to transfer blocks and messages
between instances. If inter-instance performance is bad, almost all database
operations can be delayed. This note describes methods of identifying and
resolving inter-instance performance issues.

TUNING INTER-INSTANCE PERFORMANCE IN RAC AND OPS
------------------------------------------------

SYMPTOMS OF INTER-INSTANCE PERFORMANCE PROBLEMS
-----------------------------------------------

The best way to monitor inter-instance performance is to take AWR or statspack
snaps on each instance (at the same time) at regular intervals.

If there are severe inter-instance performance issues or hung sessions, you
may also want to run the racdiag.sql script from the following note
to collect additional RAC specific data:

Note 135714.1
Script to Collect RAC Diagnostic Information (racdiag.sql)

The output of the script has tips for how to read the output.

Within the AWR, statspack report, or racdiag.sql output, you can use the wait
events and global cache statistics to monitor inter-instance performance. It
will be important to look for symptoms of inter-instance performance issues.
These symptoms include:

1. The average cr block receive time will be high. This value is calculated by
dividing the 'global cache cr block receive time' statistic by the
'global cache cr blocks received' statistic:

global cache cr block receive time
----------------------------------
global cache cr blocks received

Multiply this calculation by 10 to find the average number of milliseconds. In a
9.2 statspack report you can also use the following Global Cache Service Workload
characteristics:

Ave receive time for CR block (ms): 4.1

The following query can also be run to monitor the average cr block receive time
since the last startup:

set numwidth 20
column "AVG CR BLOCK RECEIVE TIME (ms)" format 9999999.9
select b1.inst_id, b2.value "GCS CR BLOCKS RECEIVED",
b1.value "GCS CR BLOCK RECEIVE TIME",
((b1.value / b2.value) * 10) "AVG CR BLOCK RECEIVE TIME (ms)"
from gv$sysstat b1, gv$sysstat b2
where b1.name = 'global cache cr block receive time' and
b2.name = 'global cache cr blocks received' and b1.inst_id = b2.inst_id ;

The average cr block receive time or current block receive time should typically be
less than 15 milliseconds depending on your system configuration and volume, is the
average latency of a consistent-read request round-trip from the requesting instance
to the holding instance and back to the requesting instance.

Please note that if you are on 9i and the global cache current block receive
time is abnormally high and the average wait time for the 'global cache null
to x' wait event is low (under 15ms) then you are likely hitting bug 2130923
(statistics bug). This is a problem in the way statstics are reported and does
not impact performance.

More about that issue is documented in the following note:

Note 243593.1
RAC: Ave Receive Time for Current Block is Abnormally High in Statspack

2. "Global cache" or "gc" events will be the top wait event. Some of these wait
events show the amount of time that an instance has requested a data block for a
consistent read or current block via the global cache.

When a consistent read buffer cannot be found in the local cache, an attempt is
made to find a usable version in another instance. There are 3 possible outcomes,
depending on whether any instance in the cluster has the requested data block
cached or not:

a) A cr block is received (i.e. another instance found or managed to produce the
wanted version). The "global cache cr blocks received" statistic is incremented.
b) No other instance has the block cached and therefor the requesting instance
needs to read from disk, but a shared lock will be granted to the requestor
The "global cache gets" statistic is incremented
c) 9i RAC+ Only --> A current block is received (the current block is good enough for
the query ). The " global cache current blocks received" statistic is
incremented.

In all three cases, the requesting process may wait for global cache cr request.
The view X$KCLCRST (CR Statistics) may be helpful in debugging 'global cache cr
request' wait issues. It will return the number of requests that were handled for
data or undo header blocks, the number of requests resulting in the shipment of a
block (cr or current), and the number of times a read from disk status is returned.

It should be noted that having 'global cache' or 'gc' waits does not always
indicate an inter-instance performance issue. Many times this wait is
completely normal if data is read and modified concurrently on multiple
instances. Global cache statistics should also be examined to determine if
there is an inter-instance performance problem.

3. The GES may run out of tickets. When viewing the racdiag.sql output
(Note 135714.1) or querying the gv$ges_traffic_controller or
gv$dlm_traffic_controller views, you may find that the TCKT_AVAIL shows '0'. To
find out the available network buffer space we introduce the concepts of tickets.
The maximum number of tickets available is a function of the network send buffer
size. In the case of lmd and lmon, they always buffer their messages in case of
ticket unavailability. A node relies on messages to come back from the remote
node to release tickets for reuse.

4. The above information should be enough to identify an inter-instance performance
problem but additional calculations can be made to monitor inter-instance
performance can be found in the documentation.

IDENTIFYING AND RESOLVING INTER-INSTANCE PERFORMANCE PROBLEMS
-------------------------------------------------------------

Inter-Instance performance issues can be caused by:

1. Under configured network settings at the OS. Check UDP, or other network protocol
settings and tune them. See your OS specific documentation for instructions on how
to do this. If using UDP, make sure the parameters relating to send buffer space,
receive buffer space, send highwater, and receive highwater are set well above the
OS default. The alert.log will indicate what protocol is being used. Example:

cluster interconnect IPC version:Oracle RDG
IPC Vendor 1 proto 2 Version 1.0

Changing network parameters to optimal values:

Sun (UDP Protcol)
UDP related OS parameters can be queried with the following command:
ndd /dev/udp udp_xmit_hiwat
ndd /dev/udp udp_recv_hiwat
ndd /dev/udp udp_max_buf
Set the udp_xmit_hiwat and udp_recv_hiwat to the OS maximum with:
ndd -set /dev/udp udp_xmit_hiwat <value>
ndd -set /dev/udp udp_recv_hiwat <value>
ndd -set /dev/udp udp_max_buf <1M or higher>
IBM AIX (UDP Protocol)
UDP related OS parameters can be queried with the following command:
no -a
Set the udp_sendspace and udp_recvspace to the OS maximum with:
no -o <parameter>
Linux (edit files)
/proc/sys/net/core/rmem_default
/proc/sys/net/core/rmem_max
/proc/sys/net/core/wmem_default
/proc/sys/net/core/wmem_max
HP-UX (HMP Protocol):
The file /opt/clic/lib/skgxp/skclic.conf contains the Hyper Messaging Protocol (HMP)
configuration parameters that are relevant for Oracle:
- CLIC_ATTR_APPL_MAX_PROCS Maximum number of Oracle processes. This includes
the background and shadow processes. It does not
include non-IPC processes like SQL client processes.
- CLIC_ATTR_APPL_MAX_NQS This is a derivative of the first parameter; it will
be removed in the next release. For the time being, this should be set to
the value of CLIC_ATTR_APPL_MAX_PROCS.
- CLIC_ATTR_APPL_MAX_MEM_EPTS Maximum number of Buffer descriptors. Oracle
seems to require about 1500-5000 of them depending on the block size (8K or
2K). You can choose the maximum value indicated above.
- CLIC_ATTR_APPL_MAX_RECV_EPTS Maximum number of Oracle Ports. Typically,
Oracle requires as many ports as there are processes. Thus it should be
identical to CLIC_ATTR_APPL_MAX_PROCS.
- CLIC_ATTR_APPL_DEFLT_PROC_SENDS Maximum number of outstanding sends. You
can leave it at the default value of 1024.
- CLIC_ATTR_APPL_DEFLT_NQ_RECVS Maximum number of outstanding receives on a
port or buffer. You can leave it at the default value of 1024.
HP-UX (UDP Protcol):
Not tunable before HP-UX 11i Version 1.6
For HP-UX 11i Version 1.6 or later be able to use below command to set socket_udp_rcvbuf_default & socket_udp_sndbuf_default
ndd -set /dev/udp socket_udp_rcvbuf_default 1048576
echo $?
ndd -set /dev/udp socket_udp_sndbuf_default 1048576
echo $?
HP Tru64 (RDG Protocol):
RDG related OS parameters are queried with the following command:
/sbin/sysconfig -q rdg
The most important parameters and settings are:
- rdg_max_auto_msg_wires - MUST be set to zero.
- max_objs - Should be set to at least <# of Oracle processes * 5> and up to
the larger of 10240 or <# of Oracle processes * 70>. Example: 5120
- msg_size - Needs to set to at least <db_block_size>, but we recommend
setting it to 32768, since Oracle9i supports different block sizes for each
tablespace.
- max_async_req - Should be set to at least 100 but 256+ may provide better
performance.
- max_sessions - Should be set to at least <# of Oracle processes + 20>,
example: 500
HP TRU64 (UDP Protocol):
UDP related OS parameters are queried with the following command:
/sbin/sysconfig -q udp
udp_recvspace
udp_sendspace

2. If the interconnect is slow, busy, or faulty, you can look for dropped packets,
retransmits, or cyclic redundancy check errors (CRC). You can use netstat commands
to check the networks. On Unix, check the man page for netstat for a list of options.
Also check the OS logs for any errors and make sure that inter-instance traffic is
not routed through a public network.

With most network protcols, you can use 'oradebug ipc' to see which interconnects
the database is using:

SQL> oradebug setmypid
SQL> oradebug ipc

This will dump a trace file to the user_dump_dest. The output will look something
like this:

SSKGXPT 0x1a2932c flags SSKGXPT_READPENDING info for network 0
socket no 10 IP 172.16.193.1 UDP 43749
sflags SSKGXPT_WRITESSKGXPT_UP info for network 1
socket no 0 IP 0.0.0.0 UDP 0...

So you can see that we are using IP 172.16.193.1 with a UDP protocol.

3. A large number of processes in the run queue waiting for CPU or scheduling
delays. If your CPU has limited idle time and your system typically processes
long-running queries, then latency may be higher. Ensure that LMSx processes get
enough CPU.

4. Latency can be influenced by a high value for the DB_FILE_MULTIBLOCK_READ_COUNT
parameter. This is because a requesting process can issue more than one request
for a block depending on the setting of this parameter.

ADDITIONAL RAC AND OPS PERFORMANCE TIPS
---------------------------------------

1. Poor SQL or bad optimization paths can cause additional block gets via the
interconnect just as it would via I/O.

2. Tuning normal single instance wait events and statistics is also very
important.

3. A poor gc_files_to_locks setting can cause problems. In almost all cases
in RAC, gc_files_to_locks does not need to set at all.

4. The use of locally managed tablespaces (instead of dictionary managed) with
the 'SEGMENT SPACE MANAGEMENT AUTO' option can reduce dictionary and freelist
block contention. Symptoms of this could include 'buffer busy' waits. See the
following notes for more information:

Note 105120.1
Advantages of Using Locally Managed vs Dictionary Managed Tablespaces

Note 103020.1
Migration from Dictionary Managed to Locally Managed Tablespaces

Note 180608.1
Automatic Space Segment Management in RAC Environments

Following these recommendations can help you achieve maximum performance in
your clustered environment.

RELATED DOCUMENTS
-----------------
Oracle Documentation
Note 188135.1 - Documentation Index for Real Application Clusters and Parallel Server
Note 94224.1 FAQ- STATSPACK COMPLETE REFERENCE
Note 135714.1 - Script to Collect RAC or OPS Diagnostic Information
Note 157766.1 - Sessions Wait Forever for 'global cache cr request' Wait Event...
Note 151051.1 - PARAMETER:CLUSTER_INTERCONNECTS
Note 120650.1 - Init.ora Parameter "OPS_INTERCONNECTS" Reference Note

저작자표시 비영리

'ORACLE > RAC' 카테고리의 다른 글

How to Configure Solaris Link-based IPMP for Oracle VIP [ID 730732.1] (0)	2012.05.18
oracle RAC 10.2.0.1 -> 10.2.0.5 패치 (0)	2012.05.11
RAC 에서 특정 Instance만 성능이 느린 이유 (0)	2010.04.14
RAC(OPS) 환경하에서 양쪽 Node의 archived log file을 RMAN을 사용하여 동시에 BACKUP 받는 방법 (0)	2009.02.23
Linux / FireWire 환경에 Oracle RAC 10g Release 2 Cluster 설치하기 (0)	2008.12.23

Posted by [PineTree]

About DATABASE

'RAC'에 해당되는 글 23건

VIP Failover Take Long Time After Network Cable Pulled [ID 403743.1]

'ORACLE > RAC' 카테고리의 다른 글

How to Configure Solaris Link-based IPMP for Oracle VIP [ID 730732.1]

Applies to:

Purpose

Scope

How to Configure Solaris Link-based IPMP for Oracle VIP

References

'ORACLE > RAC' 카테고리의 다른 글

UDP Buffer Tuning 기법

'ORACLE > RAC' 카테고리의 다른 글

카테고리

공지사항

태그목록

최근에 올라온 글

최근에 달린 댓글

최근에 받은 트랙백

글 보관함

달력

링크

티스토리툴바