Assessing and Optimizing Network Performance in the Oracle Data Guard Environment
Written by :Alireza Kamrani
Oracle Data Guard relies on the underlying network to send redo from the primary database to standby databases. Ensuring that the network is healthy and capable of supporting peak redo generation rates helps avoid future transport lags.
A transport lag forms when the primary database cannot ship redo to the standby faster than primary instance's redo generation rate. A transport lag can lead to potential data loss if a primary database failure occurs.
Network assessment consists of evaluating
- Network reliability
- Network bandwidth to accommodate peak redo generation rates
Note:
Each instance of the primary database instance generates its own redo and ships redo to the standby database in a single network stream. Therefore, maximizing single process network throughput for each node is critical for redo transport.
Historically there are areas that can reduce network and redo transport throughput resulting in potential transport lags:
- Network firewalls or network encryption
Network firewalls and network (not Oracle Net) encryption can reduce overall throughput significantly. Verify throughput with the oratcp tool (described below), with and without encryption, and tune accordingly.
At times reducing the encryption level can increase throughput significantly. A balance is required to meet security needs with your performance and data loss requirements. - Redo transport compression
When database initialization parameter has LOG_ARCHIVE_DEST_N attribute COMPRESSION=ENABLE, Oracle background processes have to compress the redo before sending network message, and uncompress the redo before processing the redo. This reduces the overall redo and network throughput. Compression is only recommended if network bandwidth is insufficient between the primary and standby destinations. - Oracle Net encryption
Depending on the Oracle Net encryption level, this will have varying redo throughput impact, because Oracle Net messages containing redo have to be encrypted before sending and then unencrypted before redo processing.
Note that if database encryption is already enabled with Transparent Data Encryption (TDE), redo is already encrypted, although Oracle Net encryption can also encrypt the message headers. - Untuned network for redo transport
- Increasing maximum operating system socket buffer size can increase single process throughput by 2-8 times. Test with different socket buffer sizes to see what value yields positive results, and ensure throughput is greater than the peak redo throughput.
- Compare performance with various MTU settings.
If average redo write size is less than 1500 bytes, then try various MTU settings including MTU=9000 (For example, Jumbo Frames) for network interface that sends or receives redo on your system. This may reduce some unnecessary network round trips which will increase overall throughput.
Also note that for SYNC transport, Oracle's average redo write size (for example, Oracle message send) increases significantly as determined by v$sysstats or AWR statistics "redo size / redo writes".
When sending redo across geographical regions, experiments have shown that using MTU=9000 can also benefit in some network topologies. Conduct performance tests with oratcp and compare the results with default MTU and MTU=9000 settings.
Gather Topology Information
Understanding the topology of the Oracle Data Guard configuration, and its relevance to Data Guard performance, helps eliminate infrastructure weaknesses that are often incorrectly attributed to the Data Guard architecture.
Oracle recommends that you outline the following high-level architecture information.
- Describe the primary and standby database system (number of nodes in Oracle RAC cluster, CPUs and memory per database node, storage I/O system)
- Describe network topology connecting the primary and standby systems
- Network switches and firewalls in between primary and standby
- Network bandwidth and latency
For standby databases with symmetric hardware and configuration, and with a well tuned network configuration that can support peak redo generation rates, the transport lag should be less than 1 second.
Understanding Network Usage of Data Guard
The phases of the Data Guard life cycle which use the network most heavily are:
- Instantiation - During this phase of standby database creation, files can be copied using parallelism from any host. Determining the degree of parallelism which maximizes throughput between nodes helps to optimize the standby instantiation process.
- Redo Transport (Steady State)- Under normal Data Guard operation the primary database ships redo to the standby which is then applied. Each RAC instance of a primary database ships redo in a single stream from the host on which it is running. Understanding the requirements of each primary database instance and ensuring a single process can achieve the throughput requirements is critical to a standby database staying current with the primary database.
Understanding Targets and Goals for Instantiation
Instantiation for large databases can take hours, or in extreme cases days. To allow for planning of an instantiation and also maximize throughput between the primary and standby system to complete the instantiation is as timely a manner as possible, first determine the goal for instantiation time. Then follow the process defined below to maximize per process throughput and identify the optimal degree of parallelism between the primary and standby nodes.
Understanding Throughput Requirements and Average Redo Write Size for Redo Transport
Required network bandwidth of a given Data Guard configuration is determined by the redo generate rate of the primary database.
Note:
In cases where the primary database is pre-existing, a baseline for the required network bandwidth can be established. If there is no existing primary database, skip this step and future references to the data further in the process.
While the Automatic Workload Repository (AWR) tool can be used to determine the redo generation rate, the snapshots are often 30 or 60 minutes apart which can dilute the peak rate. Since peak rates often occur for shorter periods of time, it is more accurate to use the following query which calculates the redo generation rate for each log when run on an existing database. (change the timestamps as appropriate)
SQL> SELECT THREAD#, SEQUENCE#, BLOCKS*BLOCK_SIZE/1024/1024 MB,
(NEXT_TIME-FIRST_TIME)*86400 SEC,
(BLOCKS*BLOCK_SIZE/1024/1024)/((NEXT_TIME-FIRST_TIME)*86400) "MB/S"
FROM V$ARCHIVED_LOG
WHERE ((NEXT_TIME-FIRST_TIME)*86400<>0)
AND FIRST_TIME BETWEEN TO_DATE('2022/01/15 08:00:00','YYYY/MM/DD HH24:MI:SS')
AND TO_DATE('2022/01/15 11:00:00','YYYY/MM/DD HH24:MI:SS')
AND DEST_ID=1 ORDER BY FIRST_TIME;
Example output:
THREAD# SEQUENCE# MB SEC MB/s
------- --------- ---------- ---------- ----------
2 2291 29366.1963 831 35.338383
1 2565 29365.6553 781 37.6000708
2 2292 29359.3403 537 54.672887
1 2566 29407.8296 813 36.1719921
2 2293 29389.7012 678 43.3476418
2 2294 29325.2217 1236 23.7259075
1 2567 11407.3379 2658 4.29169973
2 2295 24682.4648 477 51.7452093
2 2296 29359.4458 954 30.7751004
2 2297 29311.3638 586 50.0193921
1 2568 3867.44092 5510 .701894903
Note
To find the peak redo rate, choose times during the highest level of processing, such as peak OLTP periods, End of Quarter batch processing or End of Year batch processing.
♦️In this short example the highest rate was about 52MB/s. Ideally the network will support the maximum rate plus 30% or 68MB/s for this application.
Verify Average Redo Write Size
Using v$sysstats or looking at your AWR reports for various workload and peak intervals, record the average redo write size based on
Average Redo Write Size = "REDO SIZE" / "REDO WRITES"
♦️Use this average redo write size in your oratcp experiments. If the average redo write size > 1500 bytes, experiment with various MTU settings.
Understand Current Network Throughput
The Oracle utility oratcptest is a general-purpose tool for measuring network bandwidth and latency similar to iperf/qperf which can be run by any OS user.
The oratcptest utility provides options for controlling the network load such as:
- Network message size
- Delay time between messages
- Parallel streams
- Whether or not the oratcptest server should write messages on disk.
- Simulating Data Guard SYNC transport by waiting for acknowledgment (ACK) of a packet or ASYNC transport by not waiting for the ACK.
Note
This tool, like any Oracle network streaming transport, can simulate efficient network packet transfers from the source host to target host similar to Data Guard transport. Throughput can saturate the available network bandwidth between source and target servers.
♦️Therefore, Oracle recommends that short duration tests are performed and that consideration is given for any other critical applications sharing the same network.
Measure the Existing Throughput of One and Many Processes
Do the following tasks to measure the existing throughput.
Task 1: Install oratcptest
- Download the oratcptest.jar file from MOS note 2064368.1
- Copy the JAR file onto both client (primary) and server (standby)
Note:
oratcptest requires JRE 6 or later - Verify that the host has JRE 6 or later
- On all primary and standby hosts, verify that the JVM can run the JAR file by displaying the help
# java -jar oratcptest.jar -help
Task 2: Determine the Existing Throughput for a Single Process
Data Guard asynchronous redo transport (ASYNC) uses a streaming protocol which does not wait for packet acknowledgment and therefore achieves higher rates than SYNC transport.
- Start the test server on the receiving (standby) side.
java -jar oratcptest.jar -server [IP of standby host or VIP in RAC - configurations] -port=<any available port number>
- Run the test client. (Change the server address and port number to match that of your server started in step 4.)
- $ java -jar oratcptest.jar [IP of standby host or VIP in RAC configurations] -port=<port number> -mode=async -duration=120 -interval=20s [Requesting a test] Message payload = 1 Mbyte Payload content type= RANDOM Delay between messages = NO Number of connections = 1 Socket send buffer = (system default) Transport mode = ASYNC Disk write = NO Statistics interval = 20 seconds Test duration= 2 minutes Test frequency = NO Network Timeout = NO (1 Mbyte = 1024x1024 bytes) (17:54:44) The server is ready. Throughput (17:55:04) 20.919 Mbytes/s (17:55:24) 12.883 Mbytes/s (17:55:44) 10.457 Mbytes/s (17:56:04) 10.408 Mbytes/s (17:56:24) 12.423 Mbytes/s (17:56:44) 13.701 Mbytes/s (17:56:44) Test finished. Socket send buffer = 2 Mbytes Avg. throughput = 13.466 Mbytes/s
♦️In this example the average throughput between these two nodes was about 13 MB/s which does not meet the requirements of 68 MB/s from the query.
Note
This process can be scheduled to run at a given frequency using the -freq option to determine if the bandwidth varies at different times of the day. For instance setting -freq=1h/24h will repeat the test every hour for 24 hours.
Task 3: Determine Existing Throughput for Multiple Processes
- Repeat the previous test with two (2) connections (using num_conn parameter).
$ java -jar oratcptest.jar <target IP address> -port=<port number> -duration=60s -interval=10s -mode=async [-output=<results file>] -num_conn=2 [Requesting a test] Message payload = 1 Mbyte Payload content type = RANDOM Delay between messages = NO Number of connections = 2 Socket send buffer = (system default) Transport mode = ASYNC Disk write = NO Statistics interval = 20 seconds Test duration = 2 minutes Test frequency = NO Network Timeout = NO (1 Mbyte = 1024x1024 bytes) (18:08:02) The server is ready Throughput (18:08:22) 44.894 Mbytes/s (18:08:42) 23.949 Mbytes/s (18:09:02) 25.206 Mbytes/s (18:09:22) 23.051 Mbytes/s (18:09:42) 24.978 Mbytes/s (18:10:02) 22.647 Mbytes/s (18:10:02) Test finished. Avg. socket send buffer = 2097152 Avg. aggregate throughput = 27.454 Mbytes/s - Re-run step 1 Iteratively and increase the value of num_conn by two each time until the aggregate throughput does not increase for three consecutive values. For example if the aggregate throughput is approximately the same for 10, 12 and 14 connections, stop.
Note
RMAN can utilize all nodes in the cluster for instantiation. To find the total aggregate throughput, see My Oracle Support Creating a Physical Standby database using RMAN restore database from service (Doc ID 2283978.1) - Run the same test with all nodes in all clusters to find the current total aggregate throughput. Node 1 of primary to node 1 of standby, node 2 to node 2, etc. Sum the throughput found for all nodes.
- Reverse the roles and repeat the tests.
- Note the number of connections which achieved the best aggregate throughput.
Use the total size of the database and total aggregate throughput to estimate the amount of time it will take to complete the copy of the database.
A full instantiation also needs to apply the redo generated during the copy. Some additional percentage (0%-50%) should be added to this estimated time based on how active the database is.
If the estimated time meets the goal, no additional tuning is required for instantiation.
No comments:
Post a Comment