♦️High "Direct Path Read" waits Caused by Full Table Scan on Large Table
A complete overview & solutions for handling these issues
Alireza Kamrani
12/18/2024
Problem:
Specific SQL was found to be responsible for a lot of "direct path reads" causing slow I/O.
For example:
SELECT C1, C2, C3, ... FROM table_name
WHERE C1 = :1 AND C3 = :5;
📍In the older AWR reports, this same query performed well: it was not using "direct path reads" and was not the top I/O consumer.
📍We also see from the above that the query uses "TABLE ACCESS - FULL" (a full table scan, FTS), which was also the case earlier.
CHANGES
The table has grown since then and no longer "fits" the definition of a small table.
If there are no indexes on the columns used in the predicates and the table is not a "small" table, then full table scans will cause high direct I/O reads.
SOLUTION
To prevent "direct path reads" on large tables, we have the following options:
1. Tune the SQL by creating indexes or modifying the access path, so the query avoids a full table scan.
2. If there is sufficient space in the buffer cache, and your table is large, then "Cache" the table using:
ALTER TABLE <table name> CACHE;
You can use the KEEP buffer pool to keep frequently accessed tables persistent in the buffer pool.
Please note, this does not mean storing all blocks of the table in the buffer cache; it just tells Oracle to handle the blocks differently when they get into the cache. In other words, the table would ALWAYS be treated as a small table regardless of its actual size.
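For example, a minimal sketch assuming a hypothetical table BIG_TAB (for the KEEP pool variant, db_keep_cache_size must be sized first):
SQL> ALTER TABLE big_tab CACHE;
SQL> -- Alternatively, assign the table to the KEEP buffer pool:
SQL> ALTER TABLE big_tab STORAGE (BUFFER_POOL KEEP);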
3. Increase SGA size - this will increase buffer cache accordingly and set a larger value of the "_small_table_threshold" parameter.
CACHE READS applies only to LOB storage. It is useful when you want to bring LOB values into the buffer cache only during reads, not during write operations. By the way, when creating a table with a LOB column, you can specify the CACHE option in the LOB_storage_clause to place LOB data values in the buffer cache for faster access. By default, both BasicFiles and SecureFiles LOBs use NOCACHE LOGGING. The downside of caching LOBs, when not used carefully, is that flooding the cache with LOB data can force the rest of the database data out of the buffer cache, producing more disk reads rather than cache hits for non-LOB data. This behavior is uncommon, and an SGA increase can mitigate the issue.
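A hedged sketch of the LOB_storage_clause, assuming a hypothetical table DOCS with a CLOB column:
SQL> CREATE TABLE docs (
       id   NUMBER,
       body CLOB
     )
     LOB (body) STORE AS SECUREFILE (CACHE READS);  -- cache LOB blocks on reads only
SQL> -- Or change an existing LOB segment:
SQL> ALTER TABLE docs MODIFY LOB (body) (CACHE READS);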
4. Change "_small_table_threshold" parameter value manually, to avoid direct path reads for the large table.
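A hedged example (hidden parameter; the value 50000 below is only an illustration, and such a change should be made under Oracle Support guidance; SCOPE=SPFILE requires a restart):
SQL> ALTER SYSTEM SET "_small_table_threshold" = 50000 SCOPE=SPFILE;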
How to force 'Direct Path Read' for a specific SQL query?
There are ways to disable direct path read for SQL statements as follows:
1. event 10949 level 1
2. _serial_direct_read = NEVER
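For example, at the session level:
SQL> ALTER SESSION SET EVENTS '10949 trace name context forever, level 1';
SQL> ALTER SESSION SET "_serial_direct_read" = NEVER;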
However, there is no direct method to force direct path read operations, which are faster for some SQL statements.
Sometimes, the same SQL that used to run with direct path reads suddenly switches to conventional cache reads, causing slow performance.
The goal of this document is to provide a method to force direct path read for such SQL statements.
SOLUTION:
There are two methods to force direct path reads for SQL statements.
1. Add a PARALLEL hint to the SQL statement, e.g. /*+ parallel(4) */, so that parallel execution uses direct path read.
2. Set the statistics of the tables involved in the SQL so that the number of blocks in the table is greater than _small_table_threshold, enabling serial direct path read.
(i) Check the value of _small_table_threshold parameter in your DB.
SQL> SELECT nam.ksppinm NAME, val.ksppstvl VALUE
     FROM   x$ksppi nam, x$ksppsv val
     WHERE  nam.indx = val.indx
     AND    nam.ksppinm = '_small_table_threshold';
(ii) Check the number-of-blocks statistic for the table.
SQL> SELECT blocks FROM user_tables WHERE table_name = 'TABLE_NAME';
Example:
If the blocks value in user_tables for the object shows 100 and _small_table_threshold is set to 480, then set the blocks statistic manually to, say, 1000 so that the query goes for direct path read.
(iii) Manually set the number-of-blocks statistic for the tables involved in the SQL to a value greater than "_small_table_threshold".
SQL> EXEC DBMS_STATS.SET_TABLE_STATS('username','tabname',numblks=>n);
Example:
SQL> EXEC DBMS_STATS.SET_TABLE_STATS(user,'TEST',numblks=>1000);
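After re-running the query, a hedged way to confirm that direct path reads actually occurred is to check the session's own statistics (v$mystat joined to v$statname):
SQL> SELECT sn.name, ms.value
     FROM   v$mystat ms, v$statname sn
     WHERE  sn.statistic# = ms.statistic#
     AND    sn.name = 'physical reads direct';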
There is a parameter, _direct_read_decision_statistics_driven, that controls this:

NAME                                      VALUE  DESCRIPTION
----------------------------------------  -----  ----------------------------------------------------------
_direct_read_decision_statistics_driven   TRUE   enable direct read decision based on optimizer statistics

When this parameter is FALSE, the direct path read decision is based on the actual block count in the segment header.
When this parameter is TRUE, the direct path read decision is based on the optimizer statistics.
NOTE: Though the block counts are taken from the optimizer statistics, it is not the optimizer alone that makes the direct path read decision; other factors such as buffer cache size, cached block counts, and PX execution also affect direct path reads.
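If you need the decision to be based on the actual segment block count instead of the statistics, a hedged example (hidden parameter; test carefully before relying on it):
SQL> ALTER SESSION SET "_direct_read_decision_statistics_driven" = FALSE;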
Direct Path Read vs. Scattered Read
For direct path read, SGA is bypassed and the blocks are read directly into PGA.
Direct path read is faster than scattered read because it can avoid latches.
But on the other hand, Oracle needs to count the total number of dirty blocks in memory for each table/segment, or go through the checkpoint queue to count the buffers.
db file scattered read wait event is common for applications with a high amount of large reads (such as full or range scans).
This is true for BI, DWH and DSS workload environments.
The db file scattered read wait event represents a wait for a physical read of multiple Oracle blocks from the disk (unlike db file sequential read which represents reading a single block).
It is usually caused by scanning the entire or a sub-range of a table, index, table partition, or index partition.
If you had a medium-sized table served by scattered reads, and that table has now grown large and performance has degraded, one possible solution is to use direct path read. If you are going to use direct path reads instead of scattered reads for some queries, a good first step is to find all the scattered read wait events. Then examine the plans, the number of table blocks, and the number of dirty blocks, and review the hidden parameters related to table size in the database to make a better decision.
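As a starting point, a hedged sketch that lists sessions currently waiting on multiblock (scattered) reads:
SQL> SELECT sid, sql_id, event, wait_time, seconds_in_wait
     FROM   v$session
     WHERE  event = 'db file scattered read';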
When we look at a query's execution plan and see TABLE ACCESS FULL, there are only two ways to perform the full table scan:
1. Conventional path read or buffer cache read
2. Direct path read
As noted above, direct path read bypasses the SGA, reading blocks directly into the PGA and avoiding latches, at the cost of the dirty-block accounting described earlier.
The checkpoint is done on the segment to flush its dirty buffers, so that the blocks read from disk are up-to-date. You may see the enq: KO - fast object checkpoint event just before the direct path read.
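To observe both events at the instance level, a hedged query against v$system_event:
SQL> SELECT event, total_waits, time_waited_micro
     FROM   v$system_event
     WHERE  event IN ('enq: KO - fast object checkpoint', 'direct path read');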
Generally, having more direct path reads is not an issue in itself; first ask whether any specific SQL is actually facing a problem.
An average wait time of, for example, 5 ms may not be bad.
You should also check the Top SQL section, in order of elapsed time.
Don't forget that when a process tries to read a block into the PGA that has not yet been read from disk, it submits a wait call and updates the statistics for this event.
So, the number of waits is not always the same as the number of read requests.
Direct reads are largely controlled by the _serial_direct_read parameter, whose behavior needs to be checked per database version:
SQL> alter session set "_serial_direct_read"=FALSE;
Session altered.
SQL> show parameter "_serial_direct_read"
NAME                 TYPE    VALUE
-------------------  ------  ------
_serial_direct_read  string  FALSE
Direct Path Read and size of Table
db_big_table_cache_percent_target is new in 12c and it appears to be used as a broad-brush approach for caching direct path reads that result from large-table full-table scans.
There are two dynamic performance views related to db_big_table_cache_percent_target, namely v$bt_scan_cache and v$bt_scan_obj_temps.
It appears (and the name suggests) that db_big_table_cache_percent_target is used like the KEEP pool, except that it does not apply to individual tables and indexes the way assignment to the KEEP pool does.
Instead, setting db_big_table_cache_percent_target determines whether to cache all large tables.
By default, Oracle considers any table larger than about 2% of the buffer cache to be a large table.
For a small table, a full table scan always reads into the buffer cache. If the size of the table or segment is greater than 5 * _small_table_threshold (about 10% of the buffer cache), the table/segment is considered a large table and always uses direct path read. A table between small and large, i.e. between 2% and 10% of the buffer cache, is a medium-sized table; for medium-sized tables, Oracle sometimes uses direct path read and sometimes buffer cache read.
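As a rough worked example using the figures shown later in this article: with _db_block_buffers = 208502 buffers, _small_table_threshold defaults to about 2% of the cache, i.e. 4170 blocks; 5 * 4170 = 20850 blocks is roughly 10% of the cache. A segment above about 20850 blocks would therefore always be read via direct path, and segments between 4170 and 20850 blocks fall into the medium range.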
Automatic Big Table Caching
The Oracle documentation notes that the db_big_table_cache_percent_target parameter is known as "automatic big table caching", and that it specifies the cache section target size for automatic big table caching, expressed as a percentage of the buffer cache size.
Automatic big table caching enables large-table full-table scans (using parallel query) as well as serial queries to use the buffer cache.
This feature is intended for databases with very large buffer caches, because caching a very large table will consume RAM buffers that might otherwise be used by smaller, popular tables.
Starting in Oracle 12.1.0.2, table scans can use a different algorithm in the following scenarios:
• Serial queries: In single-instance configurations (not RAC), serialized (non-parallel) queries can use the automatic big table cache when the db_big_table_cache_percent_target initialization parameter is set to a non-zero value.
• Parallel queries: In single-instance and RAC databases, parallel queries can use the automatic big table cache when the db_big_table_cache_percent_target initialization parameter is set to a non-zero value, and the parallel_degree_policy initialization parameter is set to AUTO or ADAPTIVE.
When you set db_big_table_cache_percent_target, the value indicates the percentage of the buffer cache to reserve for the big table cache.
For example, with db_cache_size=200g, this will reserve 50% of the data buffer (100 gigabytes):
db_big_table_cache_percent_target=50;
The largest value that can be specified is 90, which will only allow 10% of the buffer cache for index access usage (db file sequential reads).
It is important to note that the default value is db_big_table_cache_percent_target=0. Therefore, automatic big table caching is not enabled by default.
If you are considering setting db_big_table_cache_percent_target, consider using the "traditional" method instead, adjusting the large table threshold size and allowing the data buffers to work as-is.
You can also easily assign large tables to the KEEP Pool.
📍So beware that db_big_table_cache_percent_target appears to work only with the default data buffer cache, so multiple-block-size buffers will not benefit from this feature.
Also, remember that db_big_table_cache_percent_target only caches full-table-scan data access (db file scattered reads); this parameter will not be effective for tuning OLTP systems that access data via an index (i.e., high db file sequential reads).
This parameter can be dynamically changed if the workload changes, but there will be a delay in this taking effect on a busy database.
alter system set db_big_table_cache_percent_target=50 scope=both;
In sum, the db_big_table_cache_percent_target parameter is primarily for data warehouse (decision support) systems that perform many large-table full-table scans, and for systems with hundreds of gigabytes of RAM available for data caching.
Oracle offers, in releases 12 and later, an enhancement that may improve performance of ‘large’ tables, Automatic Big Table Caching.
The feature automatically caches such tables into an area of the SGA reserved specifically for this purpose.
The big table cache won’t ‘work’ for direct path reads;
in the example that follows those have been ‘turned off’ for the session to ensure that the table will be loaded into the configured cache.
Let’s look at how it’s configured and what it can do for you when it’s active.
Unlike the In-Memory database option, Automatic Big Table Caching is a no additional cost feature; no additional licensing is required to use it.
It does require a change to memory settings, which may increase the shared pool/sga/memory settings and could require that additional memory be installed (if the database is heavily used and memory can’t be reallocated to the big table cache from the existing SGA/memory settings).
That being said, configuring Automatic Big Table Caching is fairly simple as only one parameter, db_big_table_cache_percent_target, needs to be set.
Depending upon the percentage configured this could create a situation where the current sga or memory settings need to be increased; that will depend upon the size of the big table or tables Oracle will be caching.
[If the table is too large for the cache then it could be ‘cached’ to disk, and the usage query will reflect that.]
Let’s look at a possible configuration and see how it performs.
In this example the cache percent is set to 80 (a rather large number, true, but this is a small, personal database and the table will be quite large in comparison to the available SGA).
Let’s create and populate a table to run an example with:
SQL> create table bigtable( id number, name varchar2(800), pdate date);
Table created.
SQL> begin
       for i in 1..1000000 loop
         insert into bigtable
         values(i, 'Test'||lpad(i, 773, '0'), sysdate + mod(i, 991));
       end loop;
       commit;
     end;
     /
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.gather_table_stats(user, 'BIGTABLE', cascade=>true)
PL/SQL procedure successfully completed.
Let’s check on what the database considers a ‘small’ table:
SQL> connect sys/pass as sysdba
SQL> SELECT a.ksppinm "Parameter", c.ksppstvl "Value"
     FROM   x$ksppi a, x$ksppcv b, x$ksppsv c, v$parameter p
     WHERE  a.indx = b.indx
     AND    a.indx = c.indx
     AND    p.name(+) = a.ksppinm
     AND    UPPER(a.ksppinm) IN (UPPER('_db_block_buffers'), UPPER('_small_table_threshold'));
Parameter               Value
----------------------  ------
_db_block_buffers       208502
_small_table_threshold  4170
The value is in blocks, so any table larger than 4170 blocks will be considered a big table.
Let’s now check on the number of blocks our example table occupies:
SQL> connect pop/pass
Connected.
SQL> set echo on linesize 150 pagesize 100
SQL> column table_name format a35
SQL> alter session set parallel_degree_policy=auto;
Session altered.
SQL> alter session set "_serial_direct_read"=never;
Session altered.
SQL> select blocks from dba_tables where table_name = 'BIGTABLE';
BLOCKS
----------
112097
SQL>
We verify the percent target that was set earlier:
SQL> show parameter db_big_table_cache_percent_target

NAME                               TYPE    VALUE
---------------------------------  ------  ------
db_big_table_cache_percent_target  string  80

SQL>
Now we check to see what the cache shows as being used;
since this example has been run more than once the object count is greater than 1:
SQL> select BT_CACHE_TARGET, OBJECT_COUNT, MEMORY_BUF_ALLOC, MIN_CACHED_TEMP from v$bt_scan_cache;
BT_CACHE_TARGET OBJECT_COUNT MEMORY_BUF_ALLOC MIN_CACHED_TEMP
--------------- ------------ ---------------- ---------------
             80            2           170037            1000

SQL>
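The companion view v$bt_scan_obj_temps, mentioned earlier, shows the per-object 'temperature' that drives the caching decision; a hedged sketch (column names as documented for this view):
SQL> SELECT dataobj#, size_in_blks, temperature, policy
     FROM   v$bt_scan_obj_temps;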
Let’s run a count against the table (there are no indexes on the table) and exercise that cache a bit:
SQL> select count(*) from BIGTABLE;
COUNT(*)
----------
1000000
SQL> select BT_CACHE_TARGET, OBJECT_COUNT, MEMORY_BUF_ALLOC, MIN_CACHED_TEMP from v$bt_scan_cache;

BT_CACHE_TARGET OBJECT_COUNT MEMORY_BUF_ALLOC MIN_CACHED_TEMP
--------------- ------------ ---------------- ---------------
             80            2           170037            1000
That the object count did not increase indicates that the object is cached, and that Oracle is using it to provide the results we requested.
Repeated queries against the table (using full table scans and NOT using direct reads) will provide the same results.
Checking the query statistics, we see nothing but consistent gets, which is what is to be expected when using the big table cache for cached objects:
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     111360  consistent gets
          1  physical reads
          0  redo size
        542  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
Running the same query again produces the same results (since the statistics didn’t change those numbers will not be repeated):
SQL> select count(*) from BIGTABLE;

  COUNT(*)
----------
   1000000

SQL> select BT_CACHE_TARGET, OBJECT_COUNT, MEMORY_BUF_ALLOC, MIN_CACHED_TEMP from v$bt_scan_cache;

BT_CACHE_TARGET OBJECT_COUNT MEMORY_BUF_ALLOC MIN_CACHED_TEMP
--------------- ------------ ---------------- ---------------
             80            2           170037            1000

SQL>
Should another ‘big’ table be accessed (by conventional reads) it would also go into the cache, provided there is enough space in the cache.
Again, this is automatically done by Oracle once the big table cache has been configured.
Automatic Big Table Caching can be a performance improvement when large tables are the norm rather than the exception.
That it is also a feature that doesn't require additional licensing makes it all the more desirable to configure and use.
For scenarios that force or change direct path read behavior and influence the optimizer, see the articles below:
https://dincosman.com/2023/11/12/using-table-stats-hint/
https://www.dnsstuff.com/oracle-wait-event-explained-direct-path-read-temp-jk01
Alireza Kamrani