How to evaluate and analyze the Network Optimization in Oracle Standby environment
♠️ Alireza Kamrani ♠️
03/April/2025
One of the points that is required in designing the network architecture between the standby and the main database is to pay attention to the network bandwidth and ensure that there are settings that minimize the delay in sending packets. Knowing these things is important and significant for network tuning in the standby environment, and the right information can ensure that the standbys are in sync even during peak hours.
Oracle Data Guard redo transport performance is directly dependent on the performance of the primary and standby systems, the network that connects them, and the I/O subsystem.
For most Oracle Data Guard configurations, you should be able to achieve zero or minimal data loss by troubleshooting and tuning redo transport.
To calculate for Network Bandwidth Transfer for REDO LOG - Data Guard Environment in Primary database
Formula :
Required bandwidth = ((Redo rate bytes per sec. / 0.7) * 8) / 1,000,000 = bandwidth in Mbps
Note : Assuming TCP/IP network overhead of 30%.
Calculation :
1. RUN Statspack during peak intervals to measure REDO rate.
2. If it is RAC instances, we have to calculate for all the RAC instances.
3. Check the following SQL Statement
SQL> select * from v$sysmetric_history
where metric_name = 'Redo Generated Per Sec';
4. RDA-Output:
Performance - AWR Report - Statistic: "redo size"
Example :
Let us assume the redo rate is a 600 KB/sec.
Required bandwidth =
((Redo rate bytes per sec. / 0.7) * 8) / 1,000,000 = bandwidth in Mbps
= ((614400/0.7) * 8) /1,000,000
= 7.02 Mbps
In this topic, I will introduce a preferred tools that can help DBA to make better decisions with a accurate evaluations.
oratcptest is an Oracle-provided tool to evaluate network bandwidth and latency for Data Guard and Oracle RAC environments. It helps determine whether the network can handle redo transport between primary and standby databases.
1. Running oratcptest to Measure Bandwidth
The tool is available in Oracle 12c and later. It runs in client-server mode to simulate redo transport.
Step 1: Start the Listener on the Standby Server
On the standby database (or target system), start oratcptest in server mode:
oratcptest -server -host <STANDBY_HOST>
This opens a listener to receive test traffic.
Step 2: Run the Test from the Primary Database
On the primary database, run oratcptest in client mode:
oratcptest -client -host <STANDBY_HOST> -dir /tmp -time 60 -speed 100
Where:
• -client: Runs the test as a client.
• -host <STANDBY_HOST>: IP/hostname of the standby server.
• -dir /tmp: Location for temporary test files.
• -time 60: Duration of the test in seconds.
• -speed 100: Maximum speed in Mbps.
2. Analyzing the Output
After running the test, you get:
• Throughput (Mbps): Maximum sustainable bandwidth.
• Latency (ms): Round-trip delay.
• Packet loss: If network congestion occurs.
A sample output might look like:
Average Throughput: 200 Mbps
Latency: 5 ms
Packet Loss: 0.01%
This means your network can handle 200 Mbps, which should be sufficient if your redo rate is below this threshold.
3. Adjusting for Real-World Conditions
• Enable Compression: If redo transport compression is on, effective bandwidth may increase.
• Add Overhead: TCP/IP overhead (~10%) should be considered.
• Test During Peak Load: Network congestion can affect results.
Detailed guide on interpreting the output for Data Guard performance tuning:
Interpreting oratcptest Output for Oracle Data Guard Performance Tuning
Once you run oratcptest, the results help assess whether your network can sustain redo transport between the primary and standby databases. Here’s how to interpret the output and optimize performance.
1. Key Metrics from oratcptest Output
The output provides the following crucial performance indicators:
a) Throughput (Mbps)
• Definition: The maximum network bandwidth available for redo transport.
• Interpretation:
• If the reported throughput is higher than your redo generation rate, the network is sufficient.
• If the throughput is lower, you may experience redo lag, requiring optimizations.
✔ Example:
Average Throughput: 200 Mbps
👉 If your redo rate is 50 MB/sec (~400 Mbps), this bandwidth is insufficient, and you may experience standby lag.
b) Latency (ms)
• Definition: Time taken for data packets to travel between primary and standby.
• Ideal Value:
• For SYNC mode, latency should be < 5ms.
• For ASYNC mode, higher latency is tolerable but may impact failover times.
✔ Example:
Latency: 4.8 ms
👉 If using SYNC mode, this latency is acceptable. If it were >10 ms, you might need ASYNC mode or optimize network routing.
c) Packet Loss (%)
• Definition: Percentage of data packets lost during transmission.
• Ideal Value: 0% (or very close to zero).
• Impact:
• High packet loss (>0.1%) causes redo transport delays and Data Guard lag.
• If packet loss is high, check for network congestion or unstable links.
✔ Example:
Packet Loss: 0.01%
👉 This is within an acceptable range. However, >0.1% packet loss needs troubleshooting.
2. Performance Tuning Based on Results
Depending on the results, take the following actions:
a) Low Bandwidth Issues
❌ Problem: Reported bandwidth is lower than redo rate.
✅ Solution:
• Enable Redo Transport Compression: ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='SERVICE=standby_db COMPRESSION=ENABLE';
• Upgrade network speed (e.g., from 1Gbps to 10Gbps).
• Use a dedicated network interface for redo transport.
• Check hardware device such as Network Switches to hace correct configuration and speeds.
b) High Latency Issues
❌ Problem: Latency > 10ms affecting SYNC mode.
✅ Solution:
• Switch to ASYNC mode: ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='SERVICE=standby_db ASYNC';
• Reduce distance between primary and standby.
• Use a private/dedicated network instead of shared internet.
c) Packet Loss Issues
❌ Problem: Packet loss >0.1%, causing redo transport delays.
✅ Solution:
• Check network congestion (QoS settings, firewall, VPN issues).
• Switch to ASYNC mode if using SYNC and experiencing high packet loss.
• Use a higher-quality network provider or optimize routing.
3. Validating Optimizations
After applying fixes, rerun oratcptest to validate improvements:
oratcptest -client -host <STANDBY_HOST> -dir /tmp -time 60 -speed 500
If throughput increases, latency decreases, and packet loss is near 0%, the network is tuned correctly.
Automating oratcptest for Continuous Monitoring
To ensure continuous monitoring of network performance for Oracle Data Guard, you can set up automated scripts that:
• Run oratcptest periodically (e.g., every hour).
• Log the results for trend analysis.
• Send alerts if bandwidth, latency, or packet loss exceed thresholds.
1. Creating an Automation Script
You can use a Bash script to automate the test on the primary database.
Script: oratcptest_monitor.sh
#!/bin/bash
# Set standby server hostname or IP STANDBY_HOST="<standby-host>"
# Log file path LOG_FILE="/var/log/oratcptest.log"
# Run the oratcptest command (adjust speed & duration as needed)
oratcptest -client -host $STANDBY_HOST -dir /tmp -time 60 -speed 500 > /tmp/oratcptest_result.txt
# Extract key metrics from the output
THROUGHPUT=$(grep "Throughput" /tmp/oratcptest_result.txt | awk '{print $3}')
LATENCY=$(grep "Latency" /tmp/oratcptest_result.txt | awk '{print $2}')
PACKET_LOSS=$(grep "Packet Loss" /tmp/oratcptest_result.txt | awk '{print $3}' | tr -d '%')
# Log results
echo "$(date) | Throughput: ${THROUGHPUT} Mbps | Latency: ${LATENCY} ms | Packet Loss: ${PACKET_LOSS}%" >> $LOG_FILE
# Define threshold values THRESHOLD_BANDWIDTH=200
# Minimum acceptable bandwidth (Mbps) THRESHOLD_LATENCY=10
# Maximum acceptable latency (ms) THRESHOLD_PACKET_LOSS=0.1
# Maximum acceptable packet loss (%)
# Check if thresholds are exceeded and trigger alerts
if (( $(echo "$THROUGHPUT < $THRESHOLD_BANDWIDTH" | bc -l) )); then
echo "ALERT: Low Network Bandwidth ($THROUGHPUT Mbps)!" | mail -s "Oracle Network Alert" admin@example.com
fi
if (( $(echo "$LATENCY > $THRESHOLD_LATENCY" | bc -l) )); then
echo "ALERT: High Latency ($LATENCY ms)!" | mail -s "Oracle Network Alert" admin@example.com
fi
if (( $(echo "$PACKET_LOSS > $THRESHOLD_PACKET_LOSS" | bc -l) )); then
echo "ALERT: High Packet Loss ($PACKET_LOSS%)!" | mail -s "Oracle Network Alert" admin@example.com fi
# Clean up
rm -f /tmp/oratcptest_result.txt
2. Scheduling the Script Using Cron
To run this script every hour, add it to the crontab:
crontab -e
Add the following line:
0 * * * * /path/to/oratcptest_monitor.sh
This ensures the script runs at the start of every hour.
3. Analyzing Results
• The script logs all results in /var/log/oratcptest.log.
• Alerts are sent via email if thresholds are exceeded.
• You can visualize trends using tools like Grafana or ELK Stack.
♠️ Alireza Kamrani ♠️