Dynamic
Statistic vs Dynamic Sampling
Effect of
change OPTIMIZER_DYNAMIC_SAMPLING param in Oracle
When need /
What is challenges
Controlling
Dynamic Statistics
By default, when optimizer statistics are missing, stale, or
insufficient, dynamic statistics automatically run recursive
SQL during parsing to scan a small random sample of table blocks.
About
Dynamic Statistics Levels
The dynamic statistics level controls both
when the database gathers dynamic statistics, and the size of the sample that
the optimizer uses to gather the statistics.
Set the dynamic statistics level using either the OPTIMIZER_DYNAMIC_SAMPLING initialization
parameter or a statement hint.
Note:
Dynamic statistics were called dynamic
sampling in releases earlier than Oracle Database 12c Release
1 (12.1).
The following table describes the levels for dynamic
statistics. Note the following:
- If
dynamic statistics are not disabled, then the database may choose to use
dynamic statistics when a SQL statement uses parallel execution.
- If
the OPTIMIZER_ADAPTIVE_FEATURES initialization parameter
is true, then the optimizer uses dynamic statistics when relevant SQL
plan directives exist.
The database maintains the
resulting statistics in the server result cache, making them available to other
queries.
Table: Dynamic Statistics Levels
Level |
When the
Optimizer Uses Dynamic Statistics |
Sample
Size (Blocks) |
0 |
Do not use
dynamic statistics. |
n/a |
1 |
Use dynamic
statistics for all tables that do not have statistics, but only if the
following criteria are met:
|
32 |
2 |
Use dynamic
statistics if at least one table in the statement has no statistics. This is
the default value. |
64 |
3 |
Use dynamic
statistics if any of the following conditions is true:
|
64 |
4 |
Use dynamic
statistics if any of the following conditions is true:
|
64 |
5 |
The criteria
are identical to level 4, but the database uses a different sample size. |
128 |
6 |
The criteria
are identical to level 4, but the database uses a different sample size. |
256 |
7 |
The criteria
are identical to level 4, but the database uses a different sample size. |
512 |
8 |
The criteria
are identical to level 4, but the database uses a different sample size. |
1024 |
9 |
The criteria
are identical to level 4, but the database uses a different sample size. |
4086 |
10 |
The criteria
are identical to level 4, but the database uses a different sample size. |
All blocks |
11 |
The database
uses adaptive dynamic sampling automatically when the optimizer deems it
necessary. |
Automatically
determined |
Setting
Dynamic Statistics Levels Manually
Determining a database-level setting that would be
beneficial to all SQL statements can be difficult.
When setting the level for dynamic statistics, Oracle
recommends setting the OPTIMIZER_DYNAMIC_SAMPLING initialization
parameter at the session level.
Assumptions
This tutorial assumes the following:
- You
want correct selectivity estimates for the following query, which
has WHERE clause predicates on two correlated columns:
SELECT * FROM
sh.customers WHERE
cust_city='Los Angeles' AND cust_state_province='CA';
- The
preceding query uses serial processing.
- The sh.customers table
contains 932 rows that meet the conditions in the query.
- You
have gathered statistics on the sh.customers table.
- You
created an index on
the cust_city and cust_state_province columns.
- The OPTIMIZER_DYNAMIC_SAMPLING initialization
parameter is set to the default level of 2.
To set
the dynamic statistics level manually:
- Connect
SQL*Plus to the database with the appropriate privileges, and then explain
the execution plan as follows:
EXPLAIN PLAN FOR
SELECT * FROM sh.customers WHERE
cust_city='Los Angeles'
AND
cust_state_province='CA';
- Query
the plan as follows:
SET LINESIZE 130
SET PAGESIZE 0
SELECT *
FROM TABLE(DBMS_XPLAN.DISPLAY);
The output appears below (the
example has been reformatted to fit on the page):
-------------------------------------------------------------------------------
|Id| Operation | Name |Rows|Bytes|Cost
| Time |
-------------------------------------------------------------------------------
| 0| SELECT STATEMENT | | 53|
9593|53(0)|00:00:01|
| 1| TABLE ACCESS BY INDEX ROWID|CUSTOMERS | 53| 9593|53(0)|00:00:01|
|*2| INDEX RANGE SCAN |CUST_CITY_STATE_IND | 53| 9593| 3(0)|00:00:01|
-------------------------------------------------------------------------------
Predicate Information (identified by operation
id):
---------------------------------------------------
2 - access("CUST_CITY"='Los
Angeles' AND "CUST_STATE_PROVINCE"='CA')
The columns in the WHERE clause have a real-world
correlation, but the optimizer is not aware that Los Angeles is in California
and assumes both predicates reduce the number of rows returned.
Thus, the table contains 932 rows
that meet the conditions, but the optimizer estimates 53,
as shown in bold.
If the database had used dynamic statistics for this plan,
then the Note section of the plan output would have indicated this
fact.
The optimizer did not use dynamic statistics because the
statement executed serially, standard statistics exist, and the parameter OPTIMIZER_DYNAMIC_SAMPLING is
set to the default of 2.
- Set
the dynamic statistics level to 4 in the session using the
following statement:
ALTER SESSION SET OPTIMIZER_DYNAMIC_SAMPLING=4;
- Explain
the plan again:
EXPLAIN PLAN FOR
SELECT * FROM
sh.customers
WHERE
cust_city='Los Angeles' AND cust_state_province='CA';
The new plan shows a more accurate estimate of the number of
rows, as shown by the value 932 in bold:
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------
Plan hash value: 2008213504
---------------------------------------------------------------------------
| Id | Operation | Name |Rows |
Bytes |Cost (%CPU)|Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT |
| 932
| 271K| 406 (1)| 00:00:05 |
|* 1 | TABLE ACCESS FULL| CUSTOMERS | 932 | 271K|
406 (1)| 00:00:05 |
---------------------------------------------------------------------------
Predicate Information (identified by operation
id):
---------------------------------------------------
1 - filter("CUST_CITY"='Los
Angeles' AND "CUST_STATE_PROVINCE"='CA')
Note
- dynamic statistics used for this statement (level=4)
The note at the bottom of the plan indicates that the
sampling level is 4.
The additional dynamic statistics made the optimizer aware
of the real-world relationship between
the cust_city and cust_state_province columns, thereby
enabling it to produce a more accurate estimate for the number of rows: 932 rather than 53.
When DBA might change OPTIMIZER_DYNAMIC_SAMPLING
(tune or override)?
A DBA usually considers changing OPTIMIZER_DYNAMIC_SAMPLING
only in specific scenarios, not as a general practice:
Common
Reasons to Increase It
- Missing
or stale statistics
- If
tables don’t have stats and DBMS_STATS cannot be run (e.g., transient
staging tables, external tables).
- Example:
ETL jobs loading into temp tables, then queried immediately.
- Complex
predicates not well estimated by normal stats
- Multi-column
correlated predicates
- Complex
functions or expressions in WHERE clause that lack function-based
indexes.
- Star
schema or data warehouse queries
- Joins
between large fact and dimension tables where stats are often not enough.
- Temporary
tables in batch jobs
- Since
stats are often not gathered, dynamic sampling gives better plans.
Common
Reasons to Decrease It
- High
parse-time overhead
- Dynamic
sampling runs extra queries behind the scenes, which increases
parse time.
- This
is bad for OLTP systems with many short-running queries.
- Stable
environment with good stats
- If
you already have well-maintained statistics, dynamic sampling is
usually unnecessary and just adds overhead.
- Predictability
of plans
- Dynamic
sampling can cause query plans to change at parse time depending on
sample results, which reduces plan stability.
Best
Practices for DBAs
- Don’t
change it globally unless you really need to.
Instead, use hints or SQL Profiles, SPM for specific queries: - OPTIMIZER_DYNAMIC_SAMPLING(n)
hint in SQL.
- Keep
statistics up to date with DBMS_STATS so you don’t need high dynamic
sampling levels.
- Use
higher levels (like 4, 6, 11) only for queries with uncertain stats
(DW, ad-hoc queries).
- For
OLTP, stick with defaults (0–2).
No comments:
Post a Comment