Sunday, May 24, 2026

Oracle Direct Path Read wait event Solution

High "Direct Path Read" waits Caused by Full Table Scan on Large Table 


A complete overview & solutions to handeling issues 


Problem:
Specific SQL was found to be responsible for lot of "direct path reads" causing slow I/O.
For example: 
SELECT C1, C2, C3,.... FROM table_name WHERE 
C1 = :1 AND C3 = :5;

📍In the older AWR reports, this same query would work well, was not using "direct path reads" and was also not the top I/O consumer. 
📍We also see from the above, the query is using "TABLE ACCESS - FULL" (FTS), which was also the same case earlier.

CHANGES
The table size has grown from earlier and no longer "fits" into the definition of a small table.
If there are no indexes on the columns used in the predicates and the table is not a "small" table, then full table scans will cause high direct I/O reads.

SOLUTION
To prevent "direct path reads" on large tables, we have below options:

1. Tune the SQL by creating indexes or modifying the access paths, to avoid Full table scan, for the query.

2. If there is sufficient space in the buffer cache, and your table is large, then "Cache" the table using:

ALTER TABLE <table name> CACHE;

You can use the KEEP buffer pool to keep frequently accessed tables persistent in the buffer pool. 
Please note, it does not mean storing all blocks from the table in the buffer cache, but just tell Oracle to handle the blocks differently when they get into the cache. In other words, the table would be ALWAYStreated as small table regardless of the actual size of the table.

3. Increase SGA size - this will increase buffer cache accordingly and set a larger value of the "_small_table_threshold" parameter.

CACHE READS applies only to LOB storage. It’s useful when you want to bring LOB values into the buffer cache only during read, not during write operations. By the way, when creating table with LOB column, you could specify CACHE option in the LOB_storage_clause to places LOB data values in the buffer cache for faster access. By the default, both BasicFiles and SecureFiles LOBs is doing NOCACHE LOGGING. The downside of caching LOBs is when not using carefully, the flooding of cache of LOB object would force rest of database data out of buffer cache and have more Disk Read rather than cache hits for non-LOB data. This is uncommon behavior and SGA increase could improve the performance from this issue.

4. Change "_small_table_threshold" parameter value manually, to avoid direct path reads for the large table.

How to force 'Direct Path Read' for specific SQL Query?

There are ways to disable direct path read for SQL statements as follows: 

1. event 10949 level 1
2. _serial_direct_read = NEVER

However, there are no direct methods to force the direct path read operations which are faster for some SQL statements.

Sometimes, the same SQL that used to run in direct path read suddenly changed to conventional cache reads causing slow performance.

Goal of this document is to provide a method to force direct path read for such SQL statements.
 
SOLUTION:
There are 2 methods to force direct path reads for SQL statements.

1. Use PARALLEL hint to the SQL statements like, /*+ parallel(4) */ so that parallelism uses direct path read.

2. Setting the statistics of the tables involved in the SQL such that the no.of blocks of tables > _small_table_threshold to enable the serial direct path read.

(i) Check the value of _small_table_threshold parameter in your DB.

SQL> select nam.ksppinm NAME,val.KSPPSTVL VALUE from x$ksppi nam, x$ksppsv val where nam.indx = val.indx and nam.ksppinm = '_small_table_threshold';

(ii) Check the no.of blocks statistics for the table.

SQL> SELECT blocks FROM user_tables WHERE table_name = 'TABLE_NAME';

Example:
If the blocks from user_tables for the object show 100 and _small_table_threshold is set to 480 then set the blocks statistics manually to 1000 so that it would go for direct path read.

(iii) Set the no.of blocks statistics for the tables involved in the SQL manually greater than the "_small_table_threshold" value.

SQL> EXEC DBMS_STATS.SET_TABLE_STATS('username','tabname',numblks=>n);

Example:
SQL> EXEC DBMS_STATS.SET_TABLE_STATS(user,'TEST',numblks=>1000);

There is a parameter 

_direct_read_decision_statistics_driven that controls this:

NAME                       VALUE    DESCRIPTION
------------------  -------  -----------------
_direct_read_decision_statistics_driven            TRUE     
enable direct read decision based on optimizer statistics
 
When the above parameter is FALSE, the direct path read decision is done based on the actual block count of segment header.

When the above parameter is TRUE , the direct path read decision is done based on the optimizer statistics.
 
NOTE: Though the block counts are taken from the optimizer statistics, it is not the optimizer alone that does the direct path read decision, as there are other factors like buffer cache size, cached block counts, PX execution etc affecting the direct path reads.



Direct Path Read VS Scattered Read
For direct path read, SGA is bypassed and the blocks are read directly into PGA. 
Direct path read is faster than scattered reads because it can avoid latches. 
But on the other hand, oracle needs to count the total number of dirty blocks in the memory for each table/segment or go though the checkpoint queue to count the buffers📍

db file scattered read wait event is common for applications with a high amount of large reads (such as full or range scans).
This is true for BI, DWH and DSS workload environments.
The db file scattered read wait event represents a wait for a physical read of multiple Oracle blocks from the disk (unlike db file sequential read  which represents reading a single block). 
It is usually caused by scanning the entire or a sub-range of a table, index, table partition, or index partition.

If you had a medium table with Scattered reads and now mentioned table has a big size and have performance degradation, maybe one of solutions would be using Direct Path Read and if you will going to use Direct path reads instead of Sacttered read for some queries, one of the beginning step can be finding all Scattered reads wating events. Then see Plans, Number of table blocks , Number of dirty blocks , and reviews hidden params related to size of tables in database to make a better decision.

At the time when we look at a query’s execution plan, if seeing TABLE ACCESS FULL, there are only two ways to go to perform full table scan:
1. Conventional path read or buffer cache read
2. Direct path read

For direct path read, SGA is bypassed and the blocks are read directly into PGA. Direct path read is faster than scattered reads because it can avoid latches. But on the other hand, oracle needs to count the total number of dirty blocks in the memory for each table/segment or go though the checkpoint queue to count the buffers. 
The checkpoint is done for the segment to make all the blocks of the segment up-to-date. You could see enq: KO – fast object checkpoint event just before the direct path read.

Generally, having more direct path reads is not an issue, Are you facing any problem related to any SQL ? 

Average wait time for example 5 ms maybe not bad. 
You need to also check TOP SQL section in the order of elapsed time. 
Dont forget when any process tries to read a block in the PGA that has not yet been read from disk, it submits a wait call and updates the statistics for this particular event. 
So, the number of waits is not always the same as the number of read requests.

Direct reads are largely controlled by the _serial_direct_read parameter, and need to check per database version:

SQL> alter session set "_serial_direct_read"=FALSE;
Session altered.

SQL> show parameter "_serial_direct_read"
NAME                             TYPE      VALUE
--------------------   -------    --------------
_serial_direct_read     string       FALSE

Direct Path Read and size of Table
db_big_table_cache_percent_target is new in 12c and it appears to be used as a broad-brush approach for caching direct path reads that result from large-table full-table scans.

There are two parameters related to, db_big_table_cache_percent_target, namely the v$bt_scan_cache and v$bt_scan_obj_temps.

This appears (and the name suggests) that db_big_table_cache_percent_target is used like the KEEP pool, expect that is does not apply to individual tables and indexes, like assignment to the KEEP pool.  

Instead, setting db_big_table_cache_percent_target determines whether to cache all large tables.  

The default in Oracle for large tables is where any table is larger than 2% of the SGA size, it is considered a large table.


For small table, full table scan always read into buffer cache. If size of the table or segment > 5*_small_table_threshold or 10% of buffer cache, then the table/segment is considered Large tableand always uses Direct Path Read. For table between small and large table, it is Medium size table, i.e, between 2% and 10% of buffer cache. For Medium size table, sometimes it uses Direct Path Read and sometime it uses Buffer Cache Read. 

Automatic Big Table Caching
The Oracle documentation notes that the db_big_table_cache_percent_target parameter is knows as "automatic big table caching", and that it specifies the cache section target size for automatic big table caching, expressed as a percentage of the buffer cache size.

Automatic big table caching enables large-table full-table scans (using parallel query) as well as serial queries to use the buffer cache.  
This features is only for databases with very large database buffer cache sizes because caching a very large table will consume RAM buffers that might be used by smaller, popular tables.

Starting in Oracle 12.1.0.2, table scans can use a different algorithm in the following scenarios:

• Serial queries:  In single-instance configurations (not RAC), serialized (non-parallel) queries can use the automatic big table cache when the db_big_table_cache_percent_target initialization parameter is set to a non-zero value.

• Parallel queries:  In single-instance and RAC databases, parallel queries can use the automatic big table cache when the db_big_table_cache_percent_target initialization parameter is set to a non-zero value, and the parallel_degree_policy initialization parameter is set to AUTO or ADAPTIVE.

When you set db_big_table_cache_percent_target, the value indicates the percentage of the buffer cache to reserve for the big table cache. 
For example, this will reserve 25% of the data buffer (50 gigabytes, where db_cache_size=200):  

db_big_table_cache_percent_target=50;

The largest value that can be specified is 90, which will only allow 10% of the buffer cache for index access usage (db file sequential reads).  
It is important to note that the default value is db_big_table_cache_percent_target=0. Therefore, automatic big table caching is not enabled by default.

If you are considering setting db_big_table_cache_percent_target, consider using the "traditional" method instead, adjusting the large table threshold size and allowing the data buffers to work as-is.

You can also easily assign large tables to the KEEP Pool. 

📍So beware that
db_big_table_cache_percent_target does only appear to work with the default data buffer cache, so multiple block size buffers will not benefit from this feature.  
Also, remember that
db_big_table_cache_percent_target only cache full-table scan data access (db file scattered reads) and this parameter will not be effective for tuning OLTP systems that access data via an index (e.g. high db file sequential reads).

This parameter can be dynamically changed if the workload changes, but there will be a delay in this taking effect on a busy database.

alter system set db_big_table_cache_percent_target=50, scope=both;

In sum, the db_big_table_cache_percent_target parameter is exclusively for data warehouse (decision support systems) that perform exclusive large-table-full-table scans, and systems where you have hundreds of gigabytes of RAM for data caching.

Oracle offers, in releases 12 and later, an enhancement that may improve performance of ‘large’ tables, Automatic Big Table Caching. 

The feature automatically caches such tables into an area of the SGA reserved specifically for this purpose. 

The big table cache won’t ‘work’ for direct path reads; 

in the example that follows those have been ‘turned off’ for the session to ensure that the table will be loaded into the configured cache.
Let’s look at how it’s configured and what it can do for you when it’s active.

Unlike the In-Memory database option, Automatic Big Table Caching is a no additional cost feature; no additional licensing is required to use it.

It does require a change to memory settings, which may increase the shared pool/sga/memory settings and could require that additional memory be installed (if the database is heavily used and memory can’t be reallocated to the big table cache from the existing SGA/memory settings).

That being said, configuring Automatic Big Table Caching is fairly simple as only one parameter, db_big_table_cache_percent_target, needs to be set.
Depending upon the percentage configured this could create a situation where the current sga or memory settings need to be increased; that will depend upon the size of the big table or tables Oracle will be caching.

[If the table is too large for the cache then it could be ‘cached’ to disk, and the usage query will reflect that.]
Let’s look at a possible configuration and see how it performs.

In this example the cache percent is set to 80 (a rather large number, true, but this is for a small, personal database and the table will be quite large in comparison to the available SGA.

Let’s create and populate a table to run an example with:

Sql  > create table bigtable(  id number, mame varchar2(800), pdate date); 
Table created. 
Sql>begin 
for i in 1..1000000 loop 
  insert into bigtable 
values(i, 'Test'||lpad(i, 773,'0'), sysdate+mod(i,991)); 
end loop;

commit;  end; 

PL/SQL procedure successfully completed.  

Sql > exec dbms_stats.gather_table_stats(user, 'BIGTABLE', cascade=>true) 
PL/SQL procedure successfully completed.  

Let’s check on what the database considers a ‘small’ table:

Sql > connect sys/pass as sysdba 

Sql > SELECT a.ksppinm "Parameter", c.ksppstvl "Value" FROM x$ksppi a, x$ksppcv b, x$ksppsv c, v$parameter p  WHERE a.indx = b.indx AND a.indx = c.indx  AND p.name(+) = a.ksppinm AND UPPER(a.ksppinm) IN ( UPPER('_db_block_buffers'), UPPER('_small_table_threshold')); 
Paramete                              Value 
_db_block_buffers              208502 _small_table_threshold      4170 

The value is in blocks, so any table larger than 4170 blocks will be considered a big table. 
Let’s now check on the number of blocks our example table occupies:

SYS > connect pop/pass
Connected. 

Sql> set echo on linesize 150 pagesize 100 sql > column table_name format a35  

Sql > alter session set parallel_degree_policy=auto; 
Session altered. 

Aql > alter session set "_serial_direct_read"=never;
Session altered. 

Sql > select blocks from dba_tables where table_name = 'BIGTABLE'; 
BLOCKS 
----------
 112097 
Sql > 

We verify the percent target that was set earlier:

Sql > show parameter DB_BIG_TABLE_CACHE_PERCENT_TARGET NAME                     BbbTYPE          VALUE 
---------------- ----------- ------------- db_big_table_cache_percent_target string 80
Sql > 

Now we check to see what the cache shows as being used; 
since this example has been run more than once the object count is greater than 1:

SQL> select BT_CACHE_TARGET, OBJECT_COUNT, MEMORY_BUF_ALLOC, MIN_CACHED_TEMP from v$bt_scan_cache;
BT_CACHE_TARGET OBJECT_COUNT MEMORY_BUF_ALLOC MIN_CACHED_TEMP -------- ------------ ------- ---------------
80             2                  170037       1000 
Sql > 

Let’s run a count against the table (there are no indexes on the table) and exercise that cache a bit:

Sql > select count(*) from BIGTABLE;
COUNT(*)
----------
1000000

Sql > select BT_CACHE_TARGET, OBJECT_COUNT, MEMORY_BUF_ALLOC, MIN_CACHED_TEMP from v$bt_scan_cache; 

BT_CACHE_TARGET OBJECT_COUNT MEMORY_BUF_ALLOC MIN_CACHED_TEMP
------- ------------ ---------- --------- 
80          2          170037              1000 

That the object count did not increase indicates that the object is cached, and that Oracle is using it to provide the results we requested. 
Repeated queries against the table (using full table scans and NOT using direct reads) will provide the same results.
Checking the query statistics, we see nothing but consistent gets, which is what is to be expected when using the big table cache for cached objects:

Statistics 
--------------------------------- 
1 recursive calls 0 db block gets 111360 consistent gets 1 physical reads 0 redo size 542 bytes sent via SQL*Net to client 552 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed 

Running the same query again produces the same results (since the statistics didn’t change those numbers will not be repeated):

Sql> select count(*) from BIGTABLE; COUNT(*)
 ---------- 
1000000

sql > select BT_CACHE_TARGET, OBJECT_COUNT, MEMORY_BUF_ALLOC, MIN_CACHED_TEMP from v$bt_scan_cache;
BT_CACHE_TARGET OBJECT_COUNT MEMORY_BUF_ALLOC MIN_CACHED_TEMP
----- ------------ ------ ----
80      2      170037       1000 
Sql > 

Should another ‘big’ table be accessed (by conventional reads) it would also go into the cache, provided there is enough space in the cache.
Again, this is automatically done by Oracle once the big table cache has been configured.

Automatic Big Table Caching can be a performance improvement when large 
tables are the norm rather than the exception.
That it’s also a feature that doesn’t require additional licensing makes it all the more desirable to configure and used.

########################

Oracle 26ai new chapter in memory architecture (MGA)

Understanding Oracle’s Managed Global Area (MGA):

 A New Chapter in Memory Architecture


While exploring Oracle’s latest AI Database 26ai documentation (Oracle Docs → Memory Architecture Diagram), one feature that stands out is the Managed Global Area (MGA) ;  a modern evolution in Oracle’s memory design that adds a new layer of flexibility and intelligence.

What Is the Managed Global Area (MGA)?


The MGA (Managed Global Area) is a semi-shared memory region that bridges the gap between the System Global Area (SGA) and the Program Global Area (PGA).


Unlike the SGA (which is fully shared across all processes) or the PGA (which is private to a

single process), the MGA is shared selectively , only among a trusted set of Oracle processes.
Its goal is to let certain background or foreground processes share data and structures dynamically, without the rigidity of static SGA allocations.


The Power of “Namespaces” in MGA:

The key innovation behind MGA is its namespace-based architecture that inherent from Linux namespace concepts.

• Each namespace represents a logical memory domain within the MGA ; a kind of “sandbox” or shared memory context that related processes can attach to.
• These namespaces are modular and dynamic ; Oracle components can create, attach, or drop them as needed.
• For example, a namespace could be used for:
• A parallel query execution team sharing intermediate results
• A metadata caching group
• A vector or AI operator needing temporary shared state
• Because each namespace is isolated, Oracle can manage access, size, and cleanup independently ; keeping the system both flexible and safe.

This namespace-based sharing allows Oracle to optimize memory for on-demand collaboration among processes, without the overhead or risk of global sharing.

Why Oracle Added MGA:

MGA fills an architectural gap between static and private memory management:
• Dynamic Sharing – Enables processes to share temporary memory safely using namespaces.
• Elastic & Modular – Memory segments can be created, resized, or destroyed dynamically — no instance restart required.
• Controlled Scope – Only approved processes can attach to a namespace; it’s not globally visible like the SGA.
• Governed by PGA Limits – MGA usage counts toward the PGA aggregate limit, ensuring unified memory control.
• Recoverable & Flexible – Enables better resilience and cleaner memory lifecycle management, especially for transient workloads.

How MGA Fits in Oracle’s Memory Hierarchy:

Here’s how Oracle’s three main memory areas relate:

System Global Area (SGA)
Fully shared across the entire database instance.
Used for caches, buffer cache, and shared SQL or library cache.
Mostly static ; memory is allocated at instance startup.


Program Global Area (PGA)
Private to each server or background process.
Used for sorts, session state, and runtime working memory.
Dynamically allocated and freed per session or process.

Managed Global Area (MGA)
Semi-shared ,shared only among a trusted set of Oracle processes.
Organized into namespaces, each acting as an isolated memory domain.
Enables dynamic, temporary sharing of data structures between related processes.
Elastic and modular , namespaces can be created, resized, and dropped on demand.
Memory consumption is governed under PGA aggregate limits for unified control.

Why This Matters for DBAs & Architects:

As Oracle continues to evolve toward AI-driven workloads and adaptive memory management, understanding MGA becomes essential:
• Monitor MGA memory usage within overall PGA limits.
• Expect more internal Oracle components (e.g., vector search, AI operators, and parallel processing) to leverage MGA namespaces for optimized data sharing.
• When tuning or troubleshooting, remember that some shared structures may now live in MGA namespaces rather than the SGA.


What The Hell Is MGA Anyway?

Think of the MGA as shared memory for a small group of processes that need to work together. Unlike the SGA where everyone attaches, MGA is picky - only trusted processes get in, and the memory comes and goes as needed.

Right now, three things use it heavily:

· Parallel queries - The coordinator and slave processes pass intermediate results back and forth
· Vector search (26ai) - That fancy AI similarity search? Yep, lives in MGA
· Some multitenant metadata ops - If you're running PDBs, MGA is there

Here's what nobody tells you: MGA counts against PGA_AGGREGATE_LIMIT. Not SGA. PGA.

Why Your 19c Upgrade Went Sideways

You had a system running on 12c. PGA_AGGREGATE_LIMIT set to whatever. Life was good.

You upgrade to 19c. Same workload. Same settings. Suddenly sessions start dying with "ORA-04036: PGA memory used by the instance exceeds PGA_AGGREGATE_LIMIT."

Sound familiar?

Your PGA limit didn't change. Your workload didn't change. But now parallel queries are chewing up MGA memory that never existed before, and that 12c limit you set five years ago just became insufficient overnight.

Let Me Show You What I Mean

Run this on a live system:


SELECT name, value/1024/1024/1024 AS gb 
FROM v$pgastat 
WHERE name LIKE 'MGA%';


I've seen this return 6, 8, even 12 GB on systems with heavy parallel execution. That's 12 GB of memory you didn't account for, now eating into your PGA limit.

How I Actually Size This Stuff

The Lazy Method (Works 80% of the time)

Take your old PGA limit and add this:


New Limit = Old Limit + (max_processes × 5 MB)


Find your max processes with:


SELECT max_utilization FROM v$resource_limit WHERE resource_name = 'processes';


For a system with 500 concurrent processes, that's an extra 2.5 GB. Easy math.

The Solaris Headache

If you're on Solaris (and honestly, who still is?), you need to care about shared memory segments because Oracle uses something called Optimized Shared Memory for MGA there.

Each MGA segment eats a System V shared mory identifier. The formula is:


Extra segments needed = MGA_Memory_Size / Granule_Size


Granule size depends on your SGA:

SGA Size     Granule
Under 1 GB      4 MB
1-8 GB               16 MB
8-16 GB             32 MB
16-32 GB          64 MB
32-64 GB          128 MB
64-128 GB        256 MB
Over 128 GB    512 MB

So if your SGA is 40 GB (64 MB granules) and MGA is using 6 GB, you need about 96 additional shared memory segments. Check your project.max-shm-ids before you run out.

What About The SGA?

Don't touch your SGA sizing just because MGA showed up. The SGA still does what it always did - buffer cache, shared pool, redo log buffer. MGA doesn't touch it.

But here's a mistake I see constantly: people set SGA_TARGET to 80% of physical RAM, leaving nothing for PGA + MGA + OS. That worked when PGA was the only variable. Now MGA is another wildcard.

I leave at least 20-30% of RAM for non-SGA memory. On a 64 GB box, SGA gets 40 GB tops. The rest handles PGA, MGA, and the OS so your server doesn't start swapping.

Quick Checklist Before Your Next Upgrade

Before you move anything to 19c or 23ai:

1. Check your current PGA limit
   
   SHOW PARAMETER pga_aggregate_limit;
   
2. See what MGA is actually using (if you're already on 18c+)
  
   SELECT * FROM v$pgastat WHERE name LIKE '%MGA%';
  
3. Bump your limit by at least 20-30% - seriously, just do it. You can always lower it later.

4. If you're on Solaris, check your shared memory limits
   
   prctl -n project.max-shm-ids $$
   

The Bottom Line

Oracle didn't screw this up - MGA is actually a clever solution for coordinated process memory. But the documentation around sizing is scattered across three different manuals, and most DBAs don't find out about it until something breaks.

Add the headroom now. Test your parallel workloads. Watch V$PGASTAT like a hawk for a few weeks after any upgrade.

And for the love of all that is holy, don't assume your 12c memory settings will work on 19c. They won't. I learned that one the hard way so you don't have to.


Finally, the Managed Global Area (MGA) brings namespace-based intelligence and modularity to Oracle’s memory system.
It’s not just shared memory; it’s structured, scoped, and smart memory, a foundation for Oracle’s next-generation database performance.

Wednesday, May 20, 2026

Real-time query on standby server requirements and musts - PART3_Final part

Ensuring Real-Time Query Accuracy on Oracle Standby Servers: Requirements and Best Practices

PART#3

Summary: Troubleshooting and Tuning SQL on Oracle Active Data Guard (Read-Only Standby)

This article explains how to troubleshoot, tune, and monitor slow SQL queries on an Oracle Active Data Guard (ADG) physical standby database, which operates in read-only mode. Because standby databases cannot perform write operations, many Oracle tuning and diagnostic tools require special handling using database links, primary database interactions, and UMF/RMF configurations.

 

Part 1 — Running SQL Tuning Advisor on a Read-Only Standby Database

Problem                              

Running SQL Tuning Advisor directly on a standby database fails with:

ORA-13792: This operation requires a database link

This happens because:

  • SQL Tuning Advisor needs to write tuning data
  • Standby databases are read-only
  • Oracle requires tuning metadata to be stored on the primary database

 

Solution

1. Create a Database Link from Standby to Primary

Use the SYS$UMF user:

create database link lnk_to_pri
connect to "SYS$UMF"
identified by "oracle90"
using 'DXTPRI';

Verify connectivity:

select db_unique_name from v$database@lnk_to_pri;

2. Create a SQL Tuning Task on the Standby

DBMS_SQLTUNE.CREATE_TUNING_TASK(
   sql_text => my_sqltext,
   task_name => 'TEST_sql_tuning_task1',
   database_link_to => 'lnk_to_pri');

3. Execute the Tuning TaskEXEC DBMS_SQLTUNE.execute_tuning_task(
   task_name => 'TEST_sql_tuning_task1',
   database_link_to => 'lnk_to_pri');

4. Generate the Tuning Report

select dbms_sqltune.report_tuning_task(
   'TEST_sql_tuning_task1',
   database_link_to => 'lnk_to_pri')
from dual;

Result

Oracle successfully generates SQL tuning recommendations for queries running on the standby database by storing advisory data on the primary database.

Typical recommendations include:

  • Accepting a SQL Profile
  • Improved execution plans
  • Reduced full table scans

Part 2 — Creating SQL Profiles for Slow SQL on Standby

Problem

Custom SQL Profiles generated using coe_xfr_sql_profile.sql fail on standby databases with:

ORA-00600

because SQL Profiles require write access.

Scenario

A slow SQL query had multiple execution plans:

  • Some plans were efficient
  • One plan caused major slowdowns due to plan flipping

Goal:

  • Force the optimizer to use the good plan

Solution

1. Generate the SQL Profile Script on Standby

Using:

coe_xfr_sql_profile.sql

 

2. Move the Generated Script to the Primary Database

Using SCP/FTP/etc.

3. Execute the Script on the Primary Database

The script calls:

DBMS_SQLTUNE.IMPORT_SQL_PROFILE

This creates the SQL Profile on the primary.

4. Allow Data Guard to Replicate the Profile

The SQL Profile automatically propagates to the standby database through redo apply.

Result

  • The standby database receives the SQL Profile
  • The optimizer consistently chooses the better execution plan
  • Query performance stabilizes

Part 3 — Using SQLT (SQLTXPLAIN) with Standby Databases

Problem

SQLT tools like:

  • SQLTXTRACT
  • SQLTXECUTE

require write access to store repository data.

Standby databases are read-only.

Solution — Use SQLTXTRSBY

Oracle provides:

sqltxtrsby.sql

which allows SQLT analysis against standby databases via database links.

Steps

1. Install SQLT on the Primary Database

Run:

sqcreate.sql

DDL changes replicate to standby.

2. Create Database Link to Standby

create public database link dblink_tostandby
connect to sqltxplain
identified by oracle90
using 'DIXITSTAN';

3. Run the Problematic SQL on Standby

Capture the SQL ID:

select sql_id from v$sqlarea;

4. Run SQLTXTRSBY from the Primary

@sqltxtrsby <sql_id> DBLINK_TOSTANDBY

Result

SQLT generates a ZIP report containing:

  • SQL execution details
  • Explain plans
  • Performance data

Limitations:

  • No 10053 trace
  • No SQL Profile scripts
  • No SQL Tuning Advisor reports

because standby remains read-only.

Part 4 — Generating AWR Reports for a Standby Database

Problem

AWR snapshots normally require write access, so AWR scripts fail on standby databases.

Solution — Configure RMF/UMF Remote AWR

Oracle supports remote AWR collection using:

  • RMF (Remote Management Framework)
  • UMF (Unified Management Framework)

 

 

Main Steps

1. Unlock and Configure SYS$UMF

alter system set "_umf_remote_enabled"=TRUE;

Required on both primary and standby.

2. Create Bidirectional Database Links

  • Primary → Standby
  • Standby → Primary

using SYS$UMF.

3. Configure UMF Nodes

Primary:

dbms_umf.configure_node('primary_site');

Standby:

dbms_umf.configure_node(
   'standby_site',
   'STANDBY_TO_PRIMARY_DBLINK');

4. Create UMF Topology

DBMS_UMF.create_topology('Topology_1');

5. Register the Standby Node

DBMS_UMF.register_node(...)

6. Register the Standby for AWR

DBMS_WORKLOAD_REPOSITORY.register_remote_database(
   node_name=>'standby_site');

7. Create Remote AWR Snapshots

From primary:

dbms_workload_repository.create_remote_snapshot(
   'standby_site');

8. Generate AWR Report

Run:

@?/rdbms/admin/awrrpti.sql

and choose:

  • Standby DBID
  • Instance number
  • Snapshot range

Result

You can successfully generate AWR reports for a physical standby database, including:

  • DB Time
  • CPU usage
  • Logical/physical reads
  • IO statistics
  • SQL activity

even though the standby is read-only.

Overall Key Takeaways

Oracle ADG Read-Only Limitation

Many tuning tools require write access and therefore cannot run directly on standby databases.

Database Links Are Essential

Most solutions rely on:

  • Standby → Primary DB links
  • SYS$UMF user
  • Remote execution/storage

SQL Profiles Must Be Created on Primary

SQL Profiles cannot be imported directly into standby databases.

SQLT Supports Standby Through SQLTXTRSBY

This enables deep SQL diagnostics for standby workloads.

AWR Reporting Requires RMF/UMF Setup

Remote AWR snapshots enable performance analysis on standby systems.

Core Oracle Features Used

  • DBMS_SQLTUNE
  • SQL Profiles
  • SQLT / SQLTXPLAIN
  • Active Data Guard
  • Database Links
  • RMF / UMF
  • AWR Remote Snapshots
  • SYS$UMF
  • DBMS_WORKLOAD_REPOSITORY

 

                                                                                                                     

More:

https://fatdba.com/2022/01/15/part-1-running-sql-tuning-advisor-for-a-slow-sql-in-a-read-only-standby-database/

https://docs.oracle.com/en/database/oracle/oracle-database/21/tgsql/sql-tuning-advisor.html#GUID-B383AE33-81F0-4D6A-A48F-A97C06448A2F

 

Sunday, May 10, 2026

Real-time query on standby server requirements and musts - Part2

Ensuring Real-Time Query Accuracy on Oracle Standby Servers: Requirements and Best Practices

--Part 2

In the part1, I described some solutions to ensuring Oracle standby database is synched with primary for critical reporting services. I showing that STANDBY_MAX_DATA_DELAY parameter how helps to control GAP, and review 'alter session sync with primary' affection to force a synchronization between standby and primary database.

One of another techniques is creating a specific service using tuning options:

Define your read service on standby with:

srvctl add service -d <DB_UNIQUE_NAME> -s ro_service \

  -role PHYSICAL_STANDBY \

  -preferred <standby_instance> \

  -pdb <your_pdb> \

  -commit_outcome TRUE \

  -failovertype SELECT \

  -replay_init_time 1800 \

  -rlbgoal SERVICE_TIME \

  -clbgoal SHORT \

  -maxlag 5

 

Explanation

  • -role PHYSICAL_STANDBY → ensures this service runs only on the standby (Active Data Guard).
  • -rlbgoal SERVICE_TIME → enables runtime load balancing based on actual response times.
  • -clbgoal SHORT → optimizes for short-lived connections, such as web requests.
  • -maxlag 5 → enforces a maximum 5-second apply lag; if the standby falls behind, the service is automatically taken offline.
    This parameter behaves similarly to STANDBY_MAX_DATA_DELAY=5, but at the service level instead of the instance level.
  • -commit_outcome TRUE and -failovertype TRANSACTION → enable Application Continuity for reliable transaction replay.
  • -replay_init_time 1800 → sets 30-minute window for session replay initialization.

 

 

 

 

 

Parameter

Meaning

failovertype TRANSACTION

During a failover (or replay), in-flight transactions are fully replayed. Ensures transactional integrity. Common for read-write / primary services.

failovertype SELECT

During a failover, only SELECT statements are replayed. No transactional replay. Common for read-only / standby services.

 

-clbgoal Overview

The -clbgoal parameter in srvctl add service controls how Oracle distributes client connections at connect time across available instances.

  • CLB = Connect-Time Load Balancing
  • Affects which instance a new client session is sent to when multiple instances are available for a service.
  • Typical values: SHORT or LONG

 

🔹 Value Definitions

Value

Meaning

When to Use

SHORT

Distribute connections to the instance with the fewest active sessions or shortest estimated workload. Optimized for short-lived connections (OLTP queries, quick selects, small jobs).

Web apps, dashboards, and services with many small queries.

LONG

Distribute connections considering long-running queries and current service time. Optimized for heavy, long-duration queries to avoid overloading an instance.

BI / reporting queries, large analytics queries on ADG standby.

 

🔹 How They Behave on Standby

a) TRANSACTION

  • Guarantees full transactional continuity if a session fails over from one node to another or from standby to primary.
  • All in-flight statements (including DML) are replayed.
  • Introduces slightly more overhead on session initialization and replay logic.
  • Mostly used for OLTP / primary read-write workloads with Application Continuity.

b) SELECT

  • Only read queries (SELECTs) are replayed if a failover happens.
  • No DML replay (no transactions).
  • Minimal overhead, optimized for read-only reporting.
  • Perfect for ADG standby read services where clients do not issue DML.

-commit_outcome : Enable Transaction Guard; when set to TRUE, the commit outcome for a transaction is accessible after the transaction's session fails due to a recoverable outage.

-replay_init_time:

For Application Continuity, this parameter specifies the difference between the time, in seconds, of original processing of the first operation of a request and the time that the replay is ready to start after a successful reconnect. Application Continuity will not replay after the specified amount of time has passed. This parameter is intended to avoid the unintentional processing of a transaction when a system is recovered after a long period. The default is 5 minutes (300). The maximum value is 24 hours (86400). If the -failover_type parameter is not set to TRANSACTION, then you cannot use this parameter.

 

-rlbgoal {NONE | SMART_CONN | SERVICE_TIME | THROUGHPUT}         

Runtime Load Balancing Goal (for the Load Balancing Advisory).

Set this parameter to SMART_CONN to enable Smart Connection Rebalance.

Set this parameter to SERVICE_TIME to balance connections by response time.

Set this parameter to THROUGHPUT to balance connections by throughput.

 

-maxlag 5 here works similar to STANDBY_MAX_DATA_DELAY, controlling read service availability based on lag.

 

-maxlag(maximum_lag_time ):

The default value is ANY. You must also specify the -global option.

Maximum replication lag time in seconds for a global service.

Must be a non-negative integer.

 

What is Session Draining

Session draining is a feature in Oracle that allows existing client sessions to complete gracefully before a service is stopped, relocated, or taken offline.

  • Purpose: Avoid killing active user sessions or long-running queries when:
    • Shutting down a service
    • Moving a service to another node
    • Performing maintenance on an instance

 

  • How it works:

1.                   Oracle marks the service or instance as unavailable for new connections.

2.                   Existing sessions continue running until:

      • They finish naturally, or
      • A timeout expires (-drain_timeout), after which sessions are forcibly disconnected.

This ensures zero disruption for users while allowing controlled maintenance.

 

 

When is Session Draining Used?

  • RAC (Real Application Clusters):
    • Moving a service between nodes (srvctl relocate service)
    • Stopping a node or instance (srvctl stop instance)
  • Active Data Guard:
    • Switching standby services or taking them offline
  • General maintenance:
    • Applying patches, restarting nodes, or scaling resources

 

3️ Key SRVCTL Options for Session Draining

Option

Description

Default / Notes

-drain_timeout <seconds>

Maximum time Oracle will wait for existing sessions to complete before forcefully terminating them.

Default = 300 seconds (5 min). Can be adjusted based on workload.

-drain

Tells Oracle to drain sessions instead of immediately stopping the service. Often used with srvctl stop service or relocate service.

Must be combined with -drain_timeout.

-graceful

Similar purpose; ensures in-flight sessions complete naturally (sometimes used in older SRVCTL versions).

Less commonly used in 12c+; -drain is preferred.

-force

Immediately kills sessions without waiting. Overrides session draining.

Use only if you need immediate shutdown.

 

Example Scenarios

a) Draining sessions before stopping a service

srvctl stop service -d ORCL -s app_ro_svc -drain -drain_timeout 600

  • Marks app_ro_svc as unavailable for new connections.
  • Existing sessions wait up to 600 seconds to complete.
  • After timeout, any remaining sessions are disconnected.

 

 

b) Relocating a service with session draining

srvctl relocate service -d ORCL -s app_rw_svc -oldinst orcl1 -newinst orcl2 -drain -drain_timeout 300

  • Stops routing new connections to orcl1.
  • Existing sessions finish up to 5 minutes.
  • After 300 seconds, remaining sessions are moved or terminated on orcl1.

 

c) Forcing a service stop (skip draining)

srvctl stop service -d ORCL -s app_ro_svc -force

  • Immediately terminates all sessions.
  • Useful for emergency shutdowns, but disrupts clients.

 

Best Practices

  1. Use session draining for long-running queries or critical workloads to avoid breaking users.
  2. Set -drain_timeout based on expected query duration:
    • OLTP: 30–60 seconds usually enough
    • Analytics / reporting: 300–900 seconds depending on query length
  3. Combine with service relocation for zero-downtime maintenance.
  4. Monitor sessions via:

SELECT sid, serial#, username, status FROM v$session WHERE service_name='APP_RO_SVC';

  1. For Active Data Guard reporting services, session draining is especially useful when you need to:
    • Take standby service offline
    • Apply patches or upgrade without disrupting analytics users

 

 Summary

  • Session draining = letting active sessions finish gracefully before stopping or relocating a service.
  • Key options:
    • -drain → enable session draining
    • -drain_timeout <seconds> → maximum wait for sessions to finish
    • -force → bypass draining, kill sessions immediately
  • Use draining for long-running queries, maintenance, or zero-downtime service moves.
  • Particularly important for ADG standby services and long-running reporting workloads.

 

 

In the last part (Part3) I continue another’s techniques.

 

 

 

*************** Alireza Kamrani ****************

 

 

 

 

 

Oracle Direct Path Read wait event Solution

High "Direct Path Read" waits Caused by Full Table Scan on Large Table  A complete overview & solutions to handeling issues  P...