Friday, February 14, 2025

Tuning Oracle Database by Changing Internal Layers of the OS

Modifying the Default CFS Scheduler in Oracle Linux


Alireza Kamrani
14- Feb- 2025

In the previous post, I reviewed the capabilities Oracle Linux offers for scheduling processes. In this post, I make changes in that area in order to tune the Oracle database and the operating system itself.

Note:
Nothing in this post is a step-by-step recipe for increasing the performance of a production server. Use it to learn what can be changed in the different layers of the operating system, then adapt the parts that apply to you and test them in your own environment.

Normally you won't need these changes, but they can deepen your understanding of how each layer is tuned, and in some cases a customized version of the scripts can improve performance.

Linux Scheduling Mechanisms:

Oracle Linux, like most modern Linux distributions, uses the Completely Fair Scheduler (CFS) as the default CPU scheduler.
You can tune and modify CFS behavior using kernel parameters and system utilities.

1. Check the Current Scheduler Configuration

Before making changes, check the existing CFS settings:

cat /proc/sys/kernel/sched_latency_ns
cat /proc/sys/kernel/sched_min_granularity_ns
cat /proc/sys/kernel/sched_wakeup_granularity_ns

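These parameters control how the scheduler trades latency against fairness: sched_latency_ns is the target period in which every runnable task should run once, sched_min_granularity_ns is the minimum time a task runs before it can be preempted, and sched_wakeup_granularity_ns controls how easily a newly woken task preempts the running one. On kernels 5.13 and later these files are no longer under /proc/sys/kernel; the tunables moved to debugfs under /sys/kernel/debug/sched/.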
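The three reads above can be wrapped in a small helper that also reports a tunable that is missing on the running kernel. This is only a sketch: the function name show_cfs_tunables and its optional directory argument are illustrative, not part of any Oracle Linux tool.

```shell
# Print each CFS tunable found in a directory as "name = value".
# $1 (optional) = directory to read from; defaults to /proc/sys/kernel.
show_cfs_tunables() {
    local dir="${1:-/proc/sys/kernel}" name
    for name in sched_latency_ns sched_min_granularity_ns sched_wakeup_granularity_ns; do
        if [ -r "$dir/$name" ]; then
            printf '%s = %s\n' "$name" "$(cat "$dir/$name")"
        else
            # The file is absent on kernels that do not expose this tunable here
            printf '%s = (not present)\n' "$name"
        fi
    done
}
```

Calling show_cfs_tunables with no argument reads the live values, which makes it easy to record them before changing anything.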

2. Modify CFS Scheduling Parameters

a) Change CFS Latency and Granularity

The default CFS behavior can be adjusted using sysctl or directly modifying /proc/sys/kernel values.

Example: Reduce Latency for Faster Response

echo 5000000 | sudo tee /proc/sys/kernel/sched_latency_ns
# Default: 24ms, change to 5ms

echo 1000000 | sudo tee /proc/sys/kernel/sched_min_granularity_ns
# Default: 6ms, change to 1ms

echo 2000000 | sudo tee /proc/sys/kernel/sched_wakeup_granularity_ns
# Default: 4ms, change to 2ms

# The defaults quoted here differ between kernel versions and scale with the CPU count, so check your own current values first.

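What This Does:

• Shortens the scheduling period, so each runnable task waits less before it runs again and the system feels more responsive.

• Shortens the time slice of each task, which lets interactive processes run sooner but increases context switching; throughput-oriented workloads may become slower.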
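To see how the latency and granularity values interact, the slice each task receives can be estimated with a little arithmetic. The sketch below is a simplified model that assumes equal-weight tasks on a single CPU; the function name cfs_slice_ns is made up for illustration, and the real kernel also accounts for task weights and per-CPU run queues.

```shell
# Estimate the time slice (ns) of each task under CFS.
# $1 = sched_latency_ns, $2 = sched_min_granularity_ns, $3 = runnable tasks
cfs_slice_ns() {
    local latency="$1" min_gran="$2" nr="$3" period
    # Number of tasks that fit into one latency period at the minimum slice
    local nr_latency=$((latency / min_gran))
    if [ "$nr" -gt "$nr_latency" ]; then
        # Too many tasks: the period is stretched so nobody drops below the minimum
        period=$((nr * min_gran))
    else
        period=$latency
    fi
    echo $((period / nr))
}
```

With the 5ms/1ms values from the example, cfs_slice_ns 5000000 1000000 4 gives 1250000 (1.25ms per task), and with 10 runnable tasks the slice bottoms out at the 1ms minimum.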

Persistent Configuration (After Reboot)

To make these changes permanent, add them to /etc/sysctl.conf:

echo "kernel.sched_latency_ns = 5000000" | sudo tee -a /etc/sysctl.conf

echo "kernel.sched_min_granularity_ns = 1000000" | sudo tee -a /etc/sysctl.conf

echo "kernel.sched_wakeup_granularity_ns = 2000000" | sudo tee -a /etc/sysctl.conf

Then apply the changes:

sudo sysctl -p
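Appending with tee -a adds a new line every time the commands are repeated. A minimal sketch of an idempotent alternative follows; the helper name set_sysctl_key is hypothetical, and it assumes the file uses the usual "key = value" layout.

```shell
# Set "key = value" in a sysctl-style file, replacing any existing line
# for the same key instead of appending a duplicate.
# $1 = file, $2 = key, $3 = value
set_sysctl_key() {
    local file="$1" key="$2" value="$3" tmp
    touch "$file"
    tmp=$(mktemp)
    # Copy every line whose key (text before "=", blanks removed) differs
    awk -v k="$key" -F= '{ name = $1; gsub(/[ \t]/, "", name); if (name != k) print }' "$file" > "$tmp"
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    # Write back through cat so the original file keeps its owner and mode
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

Run as root, set_sysctl_key /etc/sysctl.conf kernel.sched_latency_ns 5000000 can be repeated safely; sysctl -p still has to be run afterwards to load the file.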


3. Set CPU Affinity for CFS Tasks

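Tasks can be bound to specific CPU cores, which can improve cache locality. Use taskset to set the affinity manually:

taskset -c 0,1 ./my_app
# Binds process to CPU cores 0 and 1

To apply the same restriction to a group of tasks, use a cpuset cgroup (cgroup v1 layout shown; my_group is an example name, and the group does not survive a reboot by itself):

sudo mkdir /sys/fs/cgroup/cpuset/my_group
echo "0-1" | sudo tee /sys/fs/cgroup/cpuset/my_group/cpuset.cpus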
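To confirm which cores a process is actually allowed to run on, the kernel exposes the effective list in /proc/<pid>/status. The helper below is a small sketch; the name allowed_cpus_of and the optional second argument (an alternative /proc root, useful for testing) are assumptions of this example.

```shell
# Print the CPU list a process may run on, e.g. "0-1".
# $1 = PID, $2 (optional) = proc root, defaults to /proc
allowed_cpus_of() {
    local pid="$1" root="${2:-/proc}"
    # Cpus_allowed_list is the human-readable form of the affinity mask
    awk '/^Cpus_allowed_list:/ { print $2 }' "$root/$pid/status"
}
```

For example, allowed_cpus_of $$ prints the affinity of the current shell.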

4. Modify Per-Process CFS Nice Levels

CFS uses "nice values" (priority levels from -20 to 19) to determine CPU allocation. Lower values give higher priority.

nice -n -10 ./my_task
# Run task with higher priority

renice -n -5 -p 1234
# Change priority of running process 1234
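How much CPU a nice value buys can be estimated from the weight table the kernel uses for CFS (each nice step changes the weight by roughly 25%). The sketch below copies that table; the function names nice_weight and cpu_share_pct are illustrative, and the share calculation assumes exactly two CPU-bound tasks competing for one core.

```shell
# Print the CFS load weight for a nice value (-20..19).
nice_weight() {
    local n="$1"
    # Weights for nice -20 .. 19, as in the kernel's sched_prio_to_weight table
    set -- 88761 71755 56483 46273 36291 \
           29154 23254 18705 14949 11916 \
            9548  7620  6100  4904  3906 \
            3121  2501  1991  1586  1277 \
            1024   820   655   526   423 \
             335   272   215   172   137 \
             110    87    70    56    45 \
              36    29    23    18    15
    # Drop n+20 entries so that $1 is the weight for the requested level
    shift $((n + 20))
    echo "$1"
}

# Print the integer CPU share (%) of task A when it competes with task B.
# $1 = nice of A, $2 = nice of B
cpu_share_pct() {
    local wa wb
    wa=$(nice_weight "$1")
    wb=$(nice_weight "$2")
    echo $((wa * 100 / (wa + wb)))
}
```

Under these assumptions, a task reniced to -5 competing with a nice 0 task gets about 75% of the core (cpu_share_pct -5 0).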

5. Use CGroups to Control CPU Allocation

You can limit CPU usage for specific processes using cgroups (the paths below are the cgroup v1 layout):

Create a CGroup

sudo mkdir /sys/fs/cgroup/cpu/my_group

echo 50000 | sudo tee /sys/fs/cgroup/cpu/my_group/cpu.cfs_quota_us

# Allow 50ms of CPU time per period; with the default 100ms period (cpu.cfs_period_us) this is 50% of one CPU

echo $$ | sudo tee /sys/fs/cgroup/cpu/my_group/tasks
# Add current process to group
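The quota is easier to reason about as a percentage of one CPU. A minimal sketch of the conversion, assuming the default 100000us period unless another one is passed (the function name cfs_quota_for_percent is hypothetical):

```shell
# Convert a CPU limit in percent of one core into a cpu.cfs_quota_us value.
# $1 = percent (e.g. 50, or 250 for two and a half cores)
# $2 (optional) = period in microseconds, defaults to 100000
cfs_quota_for_percent() {
    local percent="$1" period_us="${2:-100000}"
    echo $((period_us * percent / 100))
}
```

For instance, cfs_quota_for_percent 50 yields the 50000 used above, and values over 100 allow the group to use more than one core.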

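6. Use Other Scheduling Policies (Per Task)

FIFO, RR, and DEADLINE are not alternative system-wide schedulers that can be selected at boot. They are scheduling policies assigned to individual tasks, and every task without one stays under CFS. The mainline kernel has no sched_policy boot parameter, so set the policy per process with chrt:

sudo chrt -r -p 10 1234
# Move process 1234 to the round-robin (RR) policy with priority 10

To confirm the change:

chrt -p 1234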

Conclusion

Oracle Linux allows fine-tuning of the CFS scheduler for different workloads. By adjusting latency, granularity, CPU affinity, and priority, you can optimize performance for desktop, real-time, or server environments.

Optimizing CFS Scheduler for Oracle Database on Oracle Linux

Oracle databases are I/O-intensive and sensitive to how CPU and memory are scheduled.
Since Oracle Linux uses the Completely Fair Scheduler (CFS) by default, tuning it can improve database responsiveness for both OLTP (Online Transaction Processing) and OLAP (Online Analytical Processing) workloads, provided the effect is measured on your own system.

1. Key Performance Factors for Oracle Database

When tuning CFS for Oracle DB, consider:
CPU Scheduling – Ensure Oracle DB processes get priority.
I/O Scheduling – Optimize disk access for high throughput.
NUMA Optimization – Improve memory locality for large databases.
HugePages – Reduce memory fragmentation and TLB misses.

2. Modify CFS Scheduler for Oracle DB Performance

a) Set Oracle Processes to Higher CPU Priority

Oracle DB runs multiple background processes, named ora_<process>_<SID> (PMON, DBWn, LGWR, etc.), alongside the server processes. The critical ones should get more CPU time.

Check Oracle DB Process IDs:

pgrep -u oracle

Increase CPU Priority Using chrt

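sudo chrt -f -p 99 $(pgrep -f ora_pmon_ | head -n 1)
# Give highest FIFO priority to PMON

sudo chrt -f -p 98 $(pgrep -f ora_dbw | head -n 1)
# Higher priority for the first DBWR process (it is named ora_dbw0_<SID>, so searching for "dbwr" finds nothing)

sudo chrt -f -p 90 $(pgrep -f ora_lgwr_ | head -n 1)
# High priority for log writer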

Why?

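• PMON (Process Monitor) cleans up after failed processes and releases their resources, so it should respond quickly.

• DBWR (Database Writer) should not be delayed, as it writes dirty pages to disk.

• LGWR (Log Writer) must flush redo logs with minimal delay.

• A FIFO task is always chosen ahead of CFS tasks, so a busy process at priority 99 can crowd out everything else on its core; try lower priorities first.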
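When more than one instance (or ASM) runs on the host, a loose pgrep pattern can match the wrong process. The sketch below builds a stricter pattern from the ora_<process>_<SID> naming convention; the function name ora_bg_regex and the SID ORCL used in the usage note are placeholders, and the pattern assumes a plain alphanumeric SID.

```shell
# Build an anchored regex for an Oracle background process of one instance.
# $1 = process name prefix (pmon, dbw, lgwr, ...), $2 = instance SID
ora_bg_regex() {
    local name="$1" sid="$2"
    # [0-9a-z]* covers numbered workers such as dbw0, dbw1, ...
    printf '^ora_%s[0-9a-z]*_%s$\n' "$name" "$sid"
}
```

It could then be used as pgrep -f "$(ora_bg_regex dbw ORCL)", which lists the database writer PIDs of that instance only.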

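Make it Persistent Using systemd
# Only if the database is started by a systemd unit (oracle-db.service is an example name):
sudo systemctl edit oracle-db.service

Add:

[Service]
CPUSchedulingPolicy=fifo
CPUSchedulingPriority=99

Then reload the unit files and restart the service so the new policy takes effect:

sudo systemctl daemon-reload
sudo systemctl restart oracle-db.service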

b) Optimize I/O Scheduler for Oracle Database

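Set the Deadline Scheduler (Recommended)

echo mq-deadline | sudo tee /sys/block/sdX/queue/scheduler

# Replace sdX with the database disk. Kernels that use the multi-queue block layer (as Oracle Linux 8 and 9 do) call this scheduler mq-deadline; on older kernels it is deadline.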

Why?

• Favors low-latency writes, critical for Oracle redo logs.

• Minimizes starvation of small transactions in OLTP workloads.

• Provides predictable performance for large queries in OLAP.

For SSDs, Use none (noop on Older Kernels) Instead

echo none | sudo tee /sys/block/nvme0n1/queue/scheduler

Why?

• SSDs do not have seek time, so passing requests through without reordering (none, formerly noop) is usually best.
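The choice between a deadline-type scheduler for spinning disks and a pass-through scheduler for SSDs can be derived from sysfs, which also shows which names the running kernel offers. The following is a sketch only; the function names are made up, and the argument is the device's queue directory (for example /sys/block/sda/queue).

```shell
# Print the scheduler currently active for a device (the bracketed entry).
# $1 = queue directory, e.g. /sys/block/sda/queue
active_scheduler() {
    sed 's/.*\[\([^]]*\)].*/\1/' "$1/scheduler"
}

# Suggest a scheduler: deadline-type for rotational disks, pass-through for SSDs.
# Only names that the kernel lists as available are suggested.
suggest_scheduler() {
    local queue="$1" candidates avail want s
    if [ "$(cat "$queue/rotational")" = "1" ]; then
        candidates="mq-deadline deadline"
    else
        candidates="none noop"
    fi
    # Strip the brackets so the active scheduler is compared like the others
    avail=$(tr -d '[]' < "$queue/scheduler")
    for want in $candidates; do
        for s in $avail; do
            if [ "$s" = "$want" ]; then
                echo "$want"
                return 0
            fi
        done
    done
    return 1
}
```

Something like suggest_scheduler /sys/block/sda/queue prints a name that can be written to the scheduler file, and it works with both the old and the multi-queue scheduler names.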

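Make it Persistent

# The elevator= boot parameter belongs to the legacy block layer; on multi-queue kernels a udev rule is the usual way (the rule file name is an example)
echo 'ACTION=="add|change", KERNEL=="sdX", ATTR{queue/scheduler}="mq-deadline"' | sudo tee /etc/udev/rules.d/60-ioscheduler.rules
sudo udevadm control --reload-rules
sudo udevadm trigger --type=devices --action=change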

c) Enable CPU Affinity for Oracle DB

To prevent CPU contention, bind Oracle DB to dedicated CPU cores.

Find Available CPUs:

lscpu | grep "CPU(s):"

Bind Oracle to Specific Cores (e.g., 2-5)

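for pid in $(pgrep -u oracle); do sudo taskset -cp 2-5 "$pid"; done
# taskset needs -p and one PID at a time to change a running process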

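Persistent Configuration with cgroups

sudo mkdir /sys/fs/cgroup/cpuset/oracle

echo 2-5 | sudo tee /sys/fs/cgroup/cpuset/oracle/cpuset.cpus

echo 0 | sudo tee /sys/fs/cgroup/cpuset/oracle/cpuset.mems
# A cpuset needs a memory node before tasks can join it; 0 assumes NUMA node 0

echo $$ | sudo tee /sys/fs/cgroup/cpuset/oracle/tasks
# Adds the current shell; processes started from it (such as the database) inherit the cpuset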

d) NUMA Optimization for Oracle Database

On NUMA systems, Oracle DB can suffer from cross-node memory latency.


Check NUMA Nodes:

numactl --hardware

Force Oracle to Use Local Memory Node:

numactl --membind=0 --cpunodebind=0 ./oracle

• This binds Oracle DB to NUMA node 0 for better memory locality.
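Binding memory to node 0 only helps if the CPU cores chosen for affinity belong to the same node. The check below is a sketch under that assumption; expand_cpulist and cpus_within_node are hypothetical helper names, and the node's CPU list would normally come from /sys/devices/system/node/node0/cpulist.

```shell
# Expand a CPU list such as "0-3,8" into "0 1 2 3 8".
expand_cpulist() {
    local part lo hi out="" old_ifs="$IFS"
    IFS=','
    for part in $1; do
        lo=${part%-*}   # start of the range (or the single CPU)
        hi=${part#*-}   # end of the range (same as lo for a single CPU)
        while [ "$lo" -le "$hi" ]; do
            out="$out $lo"
            lo=$((lo + 1))
        done
    done
    IFS="$old_ifs"
    echo "${out# }"
}

# Succeed only if every CPU in list $1 is part of the node CPU list $2.
cpus_within_node() {
    local node_cpus c
    node_cpus=" $(expand_cpulist "$2") "
    for c in $(expand_cpulist "$1"); do
        case "$node_cpus" in
            *" $c "*) ;;
            *) return 1 ;;
        esac
    done
    return 0
}
```

For example, cpus_within_node 2-5 "$(cat /sys/devices/system/node/node0/cpulist)" fails when any of cores 2 to 5 sits on another node, which would mean remote memory access despite the binding.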

Persistent NUMA Binding

sudo systemctl edit oracle-db

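Add (the empty ExecStart= line clears the unit's original start command before the new one is set; the binary path is an example):

[Service]
ExecStart=
ExecStart=/usr/bin/numactl --membind=0 --cpunodebind=0 /u01/app/oracle/product/19c/bin/oracle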

e) Enable HugePages for Better Memory Management

HugePages reduce memory fragmentation and improve TLB efficiency for Oracle DB.

Check Current HugePages:

grep Huge /proc/meminfo

Calculate HugePages for Oracle (Assume 64GB SGA)

echo $((64 * 1024 / 2))
# 64GB / 2MB HugePage size = 32768
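The 2MB page size is the common x86-64 default, but it can be read from /proc/meminfo instead of being assumed. A small sketch follows; the function name hugepages_needed is illustrative, the SGA size is given in MB, and the result is rounded up so the whole SGA fits.

```shell
# Print the number of HugePages needed for an SGA of the given size.
# $1 = SGA size in MB, $2 (optional) = meminfo file, defaults to /proc/meminfo
hugepages_needed() {
    local sga_mb="$1" meminfo="${2:-/proc/meminfo}" page_kb
    page_kb=$(awk '/^Hugepagesize:/ { print $2 }' "$meminfo")
    # Stop if the kernel does not report a HugePage size
    [ -n "$page_kb" ] || return 1
    # Convert the SGA to kB and round up to whole pages
    echo $(( (sga_mb * 1024 + page_kb - 1) / page_kb ))
}
```

On a system with 2048 kB pages, hugepages_needed 65536 reproduces the 32768 computed above for a 64GB SGA.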

Set HugePages Count

echo 32768 | sudo tee /proc/sys/vm/nr_hugepages

Make It Persistent

echo "vm.nr_hugepages=32768" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p

3. Monitor Performance After Changes

After applying these optimizations, monitor Oracle DB performance.

Check CPU and Scheduling Statistics

ps -eo pid,comm,policy,pri,nice | grep -E 'ora_|oracle'
# Background processes are named ora_*, server processes oracle<SID>

Monitor I/O Performance

iostat -x 1

Check HugePages Usage

grep Huge /proc/meminfo
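The grep output is easier to judge as a single number. As a rough sketch (the function name hugepages_in_use is hypothetical), the pages in use are the total minus the free pages; a value near zero after the instance has started would suggest the SGA is not using HugePages.

```shell
# Print how many HugePages are currently in use (total minus free).
# $1 (optional) = meminfo file, defaults to /proc/meminfo
hugepages_in_use() {
    awk '/^HugePages_Total:/ { total = $2 }
         /^HugePages_Free:/  { free = $2 }
         END { print total - free }' "${1:-/proc/meminfo}"
}
```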

Verify NUMA Binding

numastat -p $(pgrep -u oracle | head -n 1)


Summary: Oracle DB Tuning on Oracle Linux

By tuning the CFS scheduler, I/O scheduler, CPU affinity, NUMA settings, and HugePages, Oracle Database performance on Oracle Linux can be improved, with lower query latency and higher transaction throughput as the goals to measure against.

Finally, here is a script that applies these tunings to the Oracle Database processes:

Oracle Database Performance Optimization Script for Oracle Linux

This script automates CPU, I/O, NUMA, and memory tuning for Oracle Database on Oracle Linux.

Features:

Sets Oracle DB processes to real-time priority (FIFO)
Configures I/O scheduler to deadline (or noop for SSDs)
Binds Oracle DB to specific CPU cores for optimal performance
Enables NUMA awareness for memory locality
Configures HugePages for efficient memory management

Script: oracle_tune.sh

#!/bin/bash

# Oracle Database Tuning Script for Oracle Linux
# Oracle Linux 8/9

echo "Starting Oracle Database Performance Optimization..."

# Variables
ORACLE_USER="oracle"
ORACLE_DISK="sda"            # Device name only, as shown by lsblk; change to the database disk
IO_SCHEDULER="mq-deadline"   # Use "deadline" on older, non multi-queue kernels
CPU_CORES="2-5"              # Adjust CPU affinity as needed
NUMA_NODE="0"                # Adjust based on `numactl --hardware`
HUGEPAGES_COUNT=32768        # Adjust based on Oracle SGA size

echo "Setting Real-Time Priority for Oracle Processes..."
for pid in $(pgrep -u "$ORACLE_USER"); do
    sudo chrt -f -p 99 "$pid"
done

echo "Optimizing I/O Scheduler..."
if [[ -e "/sys/block/$ORACLE_DISK/queue/scheduler" ]]; then
    echo "$IO_SCHEDULER" | sudo tee "/sys/block/$ORACLE_DISK/queue/scheduler"
fi

echo "Binding Oracle to CPUs: $CPU_CORES..."
for pid in $(pgrep -u "$ORACLE_USER"); do
    sudo taskset -cp "$CPU_CORES" "$pid"
done

echo "Checking NUMA Placement..."
# numactl only sets the policy of a command it launches; it cannot re-bind a
# running PID. CPU placement is handled by taskset above, so only report here.
first_pid=$(pgrep -u "$ORACLE_USER" | head -n 1)
[[ -n "$first_pid" ]] && numastat -p "$first_pid"
echo "For node-local memory, start the instance with: numactl --membind=$NUMA_NODE --cpunodebind=$NUMA_NODE"

echo "Configuring HugePages..."
echo "$HUGEPAGES_COUNT" | sudo tee /proc/sys/vm/nr_hugepages
# Append the setting only once so repeated runs do not duplicate the line
grep -q '^vm.nr_hugepages' /etc/sysctl.conf ||
    echo "vm.nr_hugepages=$HUGEPAGES_COUNT" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p

echo "Oracle Database Optimization Completed!"

How to Use the Script

Make the Script Executable

chmod +x oracle_tune.sh

Run as Root (or with sudo)

sudo ./oracle_tune.sh

Verify the Optimizations

• Check CPU Priority: ps -eo pid,comm,policy,pri | grep oracle

• Check I/O Scheduler: cat /sys/block/sda/queue/scheduler

• Check NUMA Binding: numastat -p $(pgrep -u oracle | head -n 1)

• Check HugePages: grep Huge /proc/meminfo

Notes

• Adjust ORACLE_DISK, CPU_CORES, and HUGEPAGES_COUNT based on your system.

• Run lsblk to find the correct disk.

• Use numactl --hardware to determine NUMA nodes.

• Set SGA size properly when configuring HugePages.

Bonus: Run at Startup (Optional)

To apply these optimizations at every reboot:

sudo cp oracle_tune.sh /etc/init.d/
sudo chmod +x /etc/init.d/oracle_tune.sh
sudo ln -s /etc/init.d/oracle_tune.sh /etc/rc.d/



Conclusion

This script automates CPU scheduling, I/O tuning, NUMA optimization, and HugePages configuration for Oracle Database, improving performance on Oracle Linux.

Automatically Running Script:

In Oracle Linux 8 (OEL8), simply placing the script in /etc/rc.d/ will not automatically execute it at boot. Since OEL8 uses systemd, you need to create a systemd service to ensure the script runs at startup.

Steps to Automatically Run the Script at Boot

Move the Script to a System Directory

First, place the script in a proper location:

sudo mv oracle_tune.sh /usr/local/bin/
sudo chmod +x /usr/local/bin/oracle_tune.sh

Create a systemd Service File

Now, create a service file for systemd:

sudo nano /etc/systemd/system/oracle_tune.service

Paste the Following Configuration:

[Unit]
Description=Oracle Database Performance Optimization
After=network.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/oracle_tune.sh
RemainAfterExit=yes
User=root

[Install]
WantedBy=multi-user.target

Reload systemd and Enable the Service

systemctl daemon-reload
systemctl enable oracle_tune.service
systemctl start oracle_tune.service

Verify That the Service Runs at Boot

Check the status:

systemctl status oracle_tune.service

Test a reboot:

sudo reboot

After the system restarts, verify:

systemctl status oracle_tune.service

Summary

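Now, Oracle Linux will run the tuning script automatically at every boot. Because the script tunes processes that are already running, it only has an effect if the database instance has started before the service runs; otherwise the pgrep loops find nothing to change.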
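One way to deal with start-up ordering is to let the script wait until the instance is visible before it tunes anything. The helper below is a minimal sketch, not part of the original script; the name wait_until and the process name in the usage note (ora_pmon_ORCL, with ORCL as a placeholder SID) are assumptions.

```shell
# Retry a command once per second until it succeeds or the timeout expires.
# $1 = timeout in seconds, remaining arguments = command to run
wait_until() {
    local timeout="$1" waited=0
    shift
    until "$@" > /dev/null 2>&1; do
        # Give up once the timeout has been reached
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

Placed near the top of oracle_tune.sh, a line such as wait_until 120 pgrep -x ora_pmon_ORCL || exit 1 would make the service fail visibly instead of silently tuning nothing.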

Alireza Kamrani. 
DATABASE Consultant,  ACE.
