Monday, May 26, 2025

An Overview of Oracle Data Guard Capabilities

 


Oracle Data Guard ensures high availability, data protection, and disaster recovery for enterprise data.

Oracle Data Guard provides a comprehensive set of services that create, maintain, manage, and monitor one or more standby databases to enable production Oracle databases to survive disasters and data corruptions.

Oracle Data Guard maintains these standby databases as copies of the production database.

Then, if the production database becomes unavailable because of a planned or an unplanned outage, Oracle Data Guard can switch any standby database to the production role, minimizing the downtime associated with the outage.

Oracle Data Guard can be used with traditional backup, restoration, and cluster techniques to provide a high level of data protection and data availability.

Oracle Data Guard transport services are also used by other Oracle features such as Oracle Streams and Oracle GoldenGate for efficient and reliable transmission of redo from a source database to one or more remote destinations.

With Oracle Data Guard, administrators can optionally improve production database performance by offloading resource-intensive backup and reporting operations to standby systems.

Oracle Database with Oracle Data Guard

Oracle Data Guard is a high availability and disaster-recovery solution that provides very fast automatic failover (referred to as fast-start failover) in the event of database failures, node failures, corruption, and media failures. Furthermore, the standby databases can be used for read-only access, including reader farms, for reporting, and for testing and development.

Although traditional solutions (such as backup and recovery from tape, storage-based remote mirroring, and database log shipping) can deliver some level of high availability, Oracle Data Guard provides the most comprehensive high availability and disaster recovery solution for Oracle databases.

Oracle Data Guard Advantages Over Traditional Solutions

Oracle Data Guard provides a number of advantages over traditional solutions, including the following:

  • Fast, automatic or automated database failover for data corruptions, lost writes, and database and site failures
  • Automatic block repair, which replaces a corrupted block on the primary or physical standby database by copying a good copy of the block from the other database
  • Most comprehensive protection against data corruptions and lost writes on the primary database
  • Reduced downtime for storage, Oracle ASM, Oracle RAC, and system migrations, for some platform migrations, and for other changes, by using Data Guard switchover
  • Reduced downtime with Oracle Data Guard rolling upgrade capabilities
  • Ability to off-load primary database activities, such as backups, queries, or reporting, without sacrificing RTO and RPO, by using the standby database as a read-only resource through the real-time query capability
  • Ability to integrate non-database files using Oracle Database File System (DBFS) as part of the full site failover operations
  • No need for instance restart, storage remastering, or application reconnections after site failures
  • Transparency to applications
  • Transparent and integrated support for application failover
  • Effective network utilization

For data resident in Oracle databases, Oracle Data Guard, with its built-in zero-data-loss capability, is more efficient, less expensive, and better optimized for data protection and disaster recovery than traditional remote mirroring solutions.

Oracle Data Guard provides a compelling set of technical and business reasons that justify its adoption as the disaster recovery and data protection technology of choice, over traditional remote mirroring solutions.

The types of standby databases are as follows:

  • Physical standby database

Provides a physically identical copy of the primary database, with on-disk database structures that are identical to the primary database on a block-for-block basis. The database schema, including indexes, is the same. A physical standby database is kept synchronized with the primary database through Redo Apply, which recovers the redo data received from the primary database and applies it to the physical standby database.

A physical standby database can receive and apply redo while it is open for read-only access. A physical standby database can therefore be used concurrently for data protection and reporting.

Additionally, a physical standby database can be used to install eligible one-off patches, patch set updates (PSUs), and critical patch updates (CPUs), in rolling fashion.

  • Logical standby database

Contains the same logical information as the production database, although the physical organization and structure of the data can be different.

The logical standby database is kept synchronized with the primary database through SQL Apply, which transforms the data in the redo received from the primary database into SQL statements and then executes the SQL statements on the standby database.

The flexibility of a logical standby database lets you upgrade Oracle Database software (patch sets and new Oracle Database releases) and perform other database maintenance in rolling fashion with almost no downtime.

From Oracle Database 11g onward, the transient logical standby rolling upgrade process can also be used with existing physical standby databases.

  • Snapshot Standby Database

A snapshot standby database is a fully updatable standby database.

Like a physical or logical standby database, a snapshot standby database receives and archives redo data from a primary database. Unlike a physical or logical standby database, a snapshot standby database does not apply the redo data that it receives.

The redo data received by a snapshot standby database is not applied until the snapshot standby is converted back into a physical standby database, after first discarding any local updates made to the snapshot standby database.

A snapshot standby database is best used in scenarios that require a temporary, updatable snapshot of a physical standby database.

For example, you can use the Oracle Real Application Testing option to capture the database workload on a primary and then replay it for test purposes on the snapshot standby.

Because redo data received by a snapshot standby database is not applied until it is converted back into a physical standby, the time needed to recover from a primary database failure is directly proportional to the amount of redo data that needs to be applied.

Oracle Data Guard Advantages Compared to Remote Mirroring Solutions

The following list summarizes the advantages of using Oracle Data Guard compared to using remote mirroring solutions:

  • Better network efficiency—With Oracle Data Guard, only the redo data needs to be sent to the remote site and the redo data can be compressed to provide even greater network efficiency. However, if a remote mirroring solution is used for data protection, typically you must mirror the database files, the online redo log, the archived redo logs, and the control file. If the fast recovery area is on the source volume that is remotely mirrored, then you must also remotely mirror the flashback logs. Thus, compared to Oracle Data Guard, a remote mirroring solution must transmit each change many more times to the remote site.
  • Better performance—Oracle Data Guard transmits only the write I/Os to the redo log files of the primary database, whereas remote mirroring solutions must transmit those writes plus every write I/O to data files, additional members of online log file groups, archived redo log files, and control files.

Oracle Data Guard is designed so that it does not affect the Oracle database writer (DBWR) process that writes to data files, because anything that slows down the DBWR process affects database performance. However, remote mirroring solutions affect DBWR process performance because they subject all DBWR write I/Os to the network and disk I/O delays inherent in synchronous, zero-data-loss configurations.

Compared to remote mirroring, Oracle Data Guard provides better performance and is more efficient: it verifies the state of the standby database and validates the data before applying redo, and it enables you to use the standby database for updates (as a snapshot standby) while it continues to protect the primary database.

  • Better suited for WANs—Remote mirroring solutions based on storage systems often have a distance limitation due to the underlying communication technology (Fibre Channel or ESCON (Enterprise Systems Connection)) used by the storage systems. In a typical example, the maximum distance between the systems connected in a point-to-point fashion and running synchronously can be only 10 kilometers. By using specialized devices, this distance can be extended to 66 kilometers. However, when the data centers are located more than 66 kilometers apart, you must use a series of repeaters and converters from third-party vendors. These devices convert ESCON or Fibre Channel to the appropriate IP, ATM, or SONET networks.

Newer releases also provide features that minimize redo transfer over long distances, such as cascading standby databases and the Far Sync feature (a configuration sketch appears after this list). Far sync instances are part of the Oracle Active Data Guard Far Sync feature, which requires an Oracle Active Data Guard license. A far sync instance consumes very little disk and processing resources, yet provides the ability to fail over to a terminal destination with zero data loss, as well as offload the primary database of other types of overhead (for example, redo transport). A far sync instance manages a control file, receives redo into standby redo logs (SRLs), and archives those SRLs to local archived redo logs, but that is where the similarity with standbys ends. A far sync instance does not have user data files, cannot be opened for access, cannot run redo apply, and can never function in the primary role or be converted to any type of standby database. All redo transport options available to a primary when servicing a typical standby destination are also available to it when servicing a far sync instance. And all redo transport options are available to a far sync instance when servicing terminal destinations (for example, performing redo transport compression, if you have a license for the Oracle Advanced Compression option).

 

 

  • Better resilience and data protection—Oracle Data Guard ensures much better data protection and data resilience than remote mirroring solutions. This is because corruptions introduced on the production database can be mirrored to the standby site by a remote mirroring solution, whereas Oracle Data Guard prevents such corruptions from reaching the standby.

For example, if a stray write occurs to a disk, or there is a corruption in the file system, or the host bus adaptor corrupts a block as it is written to disk, then a remote mirroring solution may propagate this corruption to the disaster-recovery site. Because Oracle Data Guard only propagates the redo data in the logs, and the log file consistency is checked before it is applied, all such external corruptions are eliminated by Oracle Data Guard. Automatic block repair may be possible, thus eliminating any downtime in an Oracle Data Guard configuration.

  • Higher flexibility—Oracle Data Guard is implemented on pure commodity hardware. It requires only a standard TCP/IP-based network link between the two computers. There is no fancy or expensive hardware required. It also allows the storage to be laid out in a different fashion from the primary computer. For example, you can put the files on different disks, volumes, file systems, and so on.
  • Better functionality—Oracle Data Guard provides a full suite of data protection features that form a much more comprehensive and effective solution optimized for data protection and disaster recovery than remote mirroring solutions. Examples include Active Data Guard, Redo Apply for physical standby databases, SQL Apply for logical standby databases, multiple protection modes, push-button automated switchover and failover capabilities, automatic gap detection and resolution, a GUI-driven management and monitoring framework, and cascaded redo log destinations.
  • Higher ROI—Businesses must obtain maximum value from their IT investments and ensure that no IT infrastructure is sitting idle. Oracle Data Guard is designed to allow businesses to get something useful out of their expensive investment in a disaster-recovery site. Typically, this is not possible with remote mirroring solutions.
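As mentioned in the Far Sync discussion above, redo transport to a far sync instance is configured with the same LOG_ARCHIVE_DEST_n parameters used for any other destination. The following is a minimal sketch only, not a definitive configuration; the DB_UNIQUE_NAME values (chicago, farsync1, boston) and the choice of SYNC/ASYNC attributes are assumptions for illustration.

-- On the primary database (chicago, hypothetical): ship redo synchronously
-- to the nearby far sync instance so commits remain protected with zero data loss.
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2 = 'SERVICE=farsync1 SYNC AFFIRM VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=farsync1';

-- On the far sync instance (farsync1, hypothetical): forward the received redo
-- asynchronously to the remote terminal standby (boston, hypothetical).
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2 = 'SERVICE=boston ASYNC VALID_FOR=(STANDBY_LOGFILES,STANDBY_ROLE) DB_UNIQUE_NAME=boston';

The SYNC hop to the nearby far sync instance keeps zero-data-loss protection without imposing WAN latency on the primary, while the ASYNC hop carries the redo the rest of the way.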

Friday, May 16, 2025

Oracle Instance Tuning (Buffer wait classes)


Written by: Alireza Kamrani


Buffer busy waits

Possible Cause:
This wait indicates that there are some buffers in the buffer cache that multiple processes are attempting to access concurrently.

The cause depends on the buffer type. For example, waits for an index block may be caused by a primary key that is based on an ascending sequence.

General Area: Buffer cache, DBWR


Examine V$SESSION while the problem is occurring to determine the type of block in contention.

Query V$WAITSTAT for the wait statistics for each class of buffer.

Common buffer classes that have buffer busy waits include data block, segment header, undo header, and undo block.
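For example, a minimal query of the cumulative per-class wait counts and times (the TIME column is typically reported in hundredths of a second):

SELECT class, count, time
  FROM V$WAITSTAT
 ORDER BY time DESC;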

Check the following V$SESSION_WAIT parameter columns:

• P1: File ID
• P2: Block ID
• P3: Class ID

Causes

To determine the possible causes, first query V$SESSION to identify the value of ROW_WAIT_OBJ# when the session waits for buffer busy waits.

For example:

SELECT row_wait_obj#
  FROM V$SESSION
WHERE EVENT = 'buffer busy waits';

To identify the object and object type contended for, query DBA_OBJECTS using the value for ROW_WAIT_OBJ# that is returned from V$SESSION.

For example:

SELECT owner, object_name, subobject_name, object_type FROM DBA_OBJECTS WHERE data_object_id = &row_wait_obj;

Actions
The action required depends on the class of block contended for and the actual segment.

Segment Header
If the contention is on the segment header, then this is most likely free list contention.
Automatic segment space management (ASSM) in locally managed tablespaces eliminates the need to specify the PCTUSED, FREELISTS, and FREELIST GROUPS parameters.

If possible, switch from manual space management to automatic segment-space management (ASSM).

The following information is relevant if you are unable to use ASSM (for example, because the tablespace uses dictionary space management).

A free list is a list of free data blocks that usually includes blocks existing in several different extents within the segment.

Free lists are composed of blocks in which free space has not yet reached PCTFREE or used space has shrunk below PCTUSED. Specify the number of process free lists with the FREELISTS parameter. The default value of FREELISTS is one. The maximum value depends on the data block size.

To find the current setting for free lists for that segment, run the following:

SELECT SEGMENT_NAME, FREELISTS
  FROM DBA_SEGMENTS
 WHERE SEGMENT_NAME = '&segment_name'
   AND SEGMENT_TYPE = '&segment_type';


Set free lists, or increase the number of free lists. If adding more free lists does not alleviate the problem, then use free list groups (even in single instance this can make a difference).

If using Oracle RAC, then ensure that each instance has its own free list group(s).

Data Block
If the contention is on tables or indexes (not the segment header):

• Check for right-hand indexes. These are indexes that are inserted into at the same point by many processes. For example, those that use sequence number generators for the key values.
• Consider using ASSM, global hash partitioned indexes, or increasing free lists to avoid multiple processes attempting to insert into the same block.
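As a sketch of the hash-partitioned index suggestion (the orders table and order_id column are hypothetical names used only for illustration), a right-hand index can be rebuilt as a global hash-partitioned index so that concurrent inserts are spread across several index segments:

CREATE INDEX orders_id_ix ON orders (order_id)
  GLOBAL PARTITION BY HASH (order_id) PARTITIONS 8;

Eight partitions is an arbitrary starting point; the appropriate number depends on the degree of insert concurrency.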

Undo Header
For contention on rollback segment header:
• If you are not using automatic undo management, then add more rollback segments.

Undo Block
For contention on rollback segment block:
• If you are not using automatic undo management, then consider making rollback segment sizes larger.

db file scattered read
This event signifies that the user process is reading buffers into the SGA buffer cache and is waiting for a physical I/O call to return. A db file scattered read issues a scattered read to read the data into multiple discontinuous memory locations.

A scattered read is usually a multiblock read. It can occur for a fast full scan (of an index) in addition to a full table scan.
The db file scattered read wait event identifies that a full scan is occurring.

When performing a full scan into the buffer cache, the blocks read are read into memory locations that are not physically adjacent to each other.
Such reads are called scattered read calls, because the blocks are scattered throughout memory.

This is why the corresponding wait event is called 'db file scattered read'.

Multiblock reads (up to DB_FILE_MULTIBLOCK_READ_COUNT blocks) due to full scans into the buffer cache show up as waits for 'db file scattered read'.

Check the following V$SESSION_WAIT parameter columns:

• P1: The absolute file number
• P2: The block being read
• P3: The number of blocks (should be greater than 1)

Actions
On a healthy system, physical read waits should be the biggest waits after the idle waits.

However, also consider whether there are direct read waits (signifying full table scans with parallel query) or db file scattered read waits on an operational (OLTP) system that should be doing small indexed accesses.
Other things that could indicate excessive I/O load on the system include the following:
• Poor buffer cache hit ratio
• These wait events accruing most of the wait time for a user experiencing poor response time

Managing Excessive I/O
There are several ways to handle excessive I/O waits.
In the order of effectiveness, these are as follows:
• Reduce the I/O activity by SQL tuning.
• Reduce the need to do I/O by managing the workload.
• Gather system statistics with DBMS_STATS package, allowing the query optimizer to accurately cost possible access paths that use full scans.
• Use Automatic Storage Management.
• Add more disks to reduce the number of I/Os for each disk.
• Alleviate I/O hot spots by redistributing I/O across existing disks.

The first course of action should be to find opportunities to reduce I/O.
Examine the SQL statements being run by sessions waiting for these events and statements causing high physical I/Os from V$SQLAREA.
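For example, a minimal sketch of listing the statements with the most physical reads (the FETCH FIRST clause assumes Oracle Database 12c or later; on earlier releases, use a ROWNUM filter over an ordered subquery):

SELECT sql_id, disk_reads, executions, sql_text
  FROM V$SQLAREA
 ORDER BY disk_reads DESC
 FETCH FIRST 10 ROWS ONLY;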

Factors that can adversely affect the execution plans causing excessive I/O include the following:
• Improperly optimized SQL
• Missing indexes
• High degree of parallelism for the table (skewing the optimizer toward scans)
• Lack of accurate statistics for the optimizer
• Setting the value of the DB_FILE_MULTIBLOCK_READ_COUNT initialization parameter too high, which favors full scans

Inadequate I/O Distribution
Besides reducing I/O, also examine the I/O distribution of files across the disks.
Is I/O distributed uniformly across the disks, or are there hot spots on some disks?
Are the number of disks sufficient to meet the I/O needs of the database?
See the total I/O operations (reads and writes) by the database, and compare those with the number of disks used. Remember to include the I/O activity of LGWR and ARCH processes.
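A simple sketch of checking how reads and writes are spread across the data files (V$FILESTAT counters are cumulative since instance startup; temp file activity is reported separately in V$TEMPSTAT):

SELECT df.file_name, fs.phyrds, fs.phywrts
  FROM V$FILESTAT fs, DBA_DATA_FILES df
 WHERE fs.file# = df.file_id
 ORDER BY (fs.phyrds + fs.phywrts) DESC;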

Finding the SQL Statement executed by Sessions Waiting for I/O

Use the following query to determine, at a point in time, which sessions are waiting for I/O:

SELECT SQL_ADDRESS, SQL_HASH_VALUE
  FROM V$SESSION
WHERE EVENT LIKE 'db file%read'; 

Finding the Object Requiring I/O
To determine the possible causes, first query V$SESSION to identify the value of ROW_WAIT_OBJ# when the session waits for db file scattered read.

For example:

SELECT row_wait_obj# FROM V$SESSION WHERE EVENT = 'db file scattered read';

To identify the object and object type contended for, query DBA_OBJECTS using the value for ROW_WAIT_OBJ# that is returned from V$SESSION.

For example:

SELECT owner, object_name, subobject_name, object_type FROM DBA_OBJECTS WHERE data_object_id = &row_wait_obj;

db file sequential read
This event signifies that the user process is reading a buffer into the SGA buffer cache and is waiting for a physical I/O call to return.

A sequential read is a single-block read.
Single block I/Os are usually the result of using indexes.

Rarely, full table scan calls could get truncated to a single block call because of extent boundaries, or buffers present in the buffer cache. These waits would also show up as db file sequential read.

Check the following V$SESSION_WAIT parameter columns:
• P1: The absolute file number
• P2: The block being read
• P3: The number of blocks (should be 1)

See Also:
"db file scattered read" for information about managing excessive I/O, inadequate I/O distribution, and finding the SQL causing the I/O and the segment the I/O is performed on.

Actions
On a healthy system, physical read waits should be the biggest waits after the idle waits.
However, also consider whether there are db file sequential reads on a large data warehouse that should be seeing mostly full table scans with parallel query.

The following figure shows differences between these wait events:
• db file sequential read (single block read into one SGA buffer)
• db file scattered read (multiblock read into many discontinuous SGA buffers)
• direct read (single or multiblock read into the PGA, bypassing the SGA)






Tuesday, May 6, 2025

Oracle Instance Tuning (Part 5)

Alireza Kamrani


Additional Statistics
There are several statistics that can indicate performance problems that do not have corresponding wait events.

Redo Log Space Requests Statistic
The V$SYSSTAT statistic redo log space requests indicates how many times a server process had to wait for space in the online redo log, not for space in the redo log buffer.

Use this statistic and the wait events as an indication that you must tune checkpoints, DBWR, or archiver activity, not LGWR. Increasing the size of the log buffer does not help.

Read Consistency
Your system might spend excessive time rolling back changes to blocks in order to maintain a consistent view. Consider the following scenarios:

• If there are many small transactions and an active long-running query is running in the background on the same table where the changes are happening, then the query might need to roll back those changes often, in order to obtain a read-consistent image of the table. Compare the following V$SYSSTAT statistics to determine whether this is happening:

• consistent changes: this statistic indicates the number of times a database block has had rollback entries applied to perform a consistent read on the block. Workloads that produce a great deal of consistent changes can consume a great deal of resources.

• consistent gets: this statistic counts the number of logical reads in consistent mode.

• If there are a few very large rollback segments, then your system could be spending a lot of time rolling back the transaction table during delayed block cleanout in order to find out at exactly which system change number (SCN) a transaction was committed. When Oracle Database commits a transaction, the modified blocks are not necessarily all updated with the commit SCN immediately. In that case, the update is done later, on demand, when the block is read or updated. This is called delayed block cleanout.

The ratio of the following V$SYSSTAT statistics should be close to one (a query that computes it appears after this list):

✴️ ratio = (transaction tables consistent reads - undo records applied) / (transaction tables consistent read rollbacks)

▶️The recommended solution is to use automatic undo management.

• If there are insufficient rollback segments, then there is rollback segment (header or block) contention. Evidence of this problem is available from the following:

• Comparing the number of WAITS to the number of GETS in V$ROLLSTAT; the proportion of WAITS to GETS should be small.

• Examining V$WAITSTAT to see whether there are many WAITS for buffers of CLASS 'undo header'.

▶️The recommended solution is to use automatic undo management.
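A hedged sketch of the ratio check from the second scenario above, computed directly from V$SYSSTAT (statistic names exactly as shown in the formula):

SELECT MAX(CASE WHEN name = 'transaction tables consistent reads - undo records applied' THEN value END) /
       NULLIF(MAX(CASE WHEN name = 'transaction tables consistent read rollbacks' THEN value END), 0) AS ratio
  FROM V$SYSSTAT
 WHERE name IN ('transaction tables consistent reads - undo records applied',
                'transaction tables consistent read rollbacks');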

Table Fetch by Continued Row
You can detect migrated or chained rows by checking the value of the table fetch continued row statistic in V$SYSSTAT.
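For example, a minimal check (compare the value against table fetch by rowid to estimate the percentage of fetches that hit continued rows):

SELECT name, value
  FROM V$SYSSTAT
 WHERE name IN ('table fetch continued row', 'table fetch by rowid');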

▶️A small number of chained rows (less than 1%) is unlikely to impact system performance. However, a large percentage of chained rows can affect performance.

▶️Chaining on rows larger than the block size is inevitable. Consider using a tablespace with a larger block size for such data.

However, for smaller rows, you can avoid chaining by using sensible space parameters and good application design.

🔶️For example, do not insert a row with key values filled in and nulls in most other columns, then update that row with the real data, causing the row to grow in size. Rather, insert rows filled with data from the start.

If an UPDATE statement increases the amount of data in a row so that the row no longer fits in its data block, then Oracle Database tries to find another block with enough free space to hold the entire row. If such a block is available, then Oracle Database moves the entire row to the new block. This operation is called row migration.

If the row is too large to fit into any available block, then the database splits the row into multiple pieces and stores each piece in a separate block.
This operation is called row chaining. The database can also chain rows when they are inserted.

Migration and chaining are especially detrimental to performance with the following:

• UPDATE statements that cause migration and chaining, because such statements perform poorly
• Queries that select migrated or chained rows because these must perform additional input and output

The definition of a sample output table named CHAINED_ROWS appears in a SQL script available on your distribution medium. The common name of this script is UTLCHN1.SQL, although its exact name and location vary depending on your platform.
Your output table must have the same column names, data types, and sizes as the CHAINED_ROWS table.
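A hedged sketch of populating and reading such an output table for a hypothetical hr.employees table (the CHAINED_ROWS table is assumed to have been created beforehand by the script):

-- Record the rowids of migrated or chained rows into CHAINED_ROWS.
ANALYZE TABLE hr.employees LIST CHAINED ROWS INTO chained_rows;

-- Review the affected rows.
SELECT owner_name, table_name, head_rowid
  FROM chained_rows
 WHERE table_name = 'EMPLOYEES';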

▶️Increasing PCTFREE can help to avoid migrated rows. If you leave more free space available in the block, then the row has room to grow. You can also reorganize or re-create tables and indexes that have high deletion rates.
If tables frequently have rows deleted, then data blocks can have partially free space in them. If rows are inserted and later expanded, then the inserted rows might land in blocks with deleted rows but still not have enough room to expand. Reorganizing the table ensures that the main free space consists of completely empty blocks.

🔶️Note:
PCTUSED is not the opposite of PCTFREE.

Parse-Related Statistics
The more your application parses, the more potential for contention exists, and the more time your system spends waiting. If parse time CPU represents a large percentage of the CPU time, then time is being spent parsing instead of executing statements.
If this is the case, then it is likely that the application is using literal SQL and so SQL cannot be shared, or the shared pool is poorly configured.

There are several statistics available to identify the extent of time spent parsing by Oracle. Query the parse related statistics from V$SYSSTAT.
For example:

SELECT NAME, VALUE
  FROM V$SYSSTAT
WHERE NAME IN ('parse time cpu', 'parse time elapsed', 'parse count (hard)', 'CPU used by this session');

There are various ratios that can be computed to assist in determining whether parsing may be a problem:

• parse time CPU / parse time elapsed
This ratio indicates how much of the time spent parsing was due to the parse operation itself, rather than waiting for resources, such as latches. A ratio of one is good, indicating that the elapsed time was not spent waiting for highly contended resources.

• parse time CPU / CPU used by this session
This ratio indicates how much of the total CPU used by Oracle server processes was spent on parse-related operations. A ratio closer to zero is good, indicating that the majority of CPU is not spent on parsing.
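A minimal sketch of computing both ratios described above in a single pass over V$SYSSTAT:

SELECT MAX(CASE WHEN name = 'parse time cpu' THEN value END) /
       NULLIF(MAX(CASE WHEN name = 'parse time elapsed' THEN value END), 0) AS parse_cpu_to_elapsed,
       MAX(CASE WHEN name = 'parse time cpu' THEN value END) /
       NULLIF(MAX(CASE WHEN name = 'CPU used by this session' THEN value END), 0) AS parse_cpu_to_total_cpu
  FROM V$SYSSTAT
 WHERE name IN ('parse time cpu', 'parse time elapsed', 'CPU used by this session');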

Wait Events Statistics
The V$SESSION, V$SESSION_WAIT, V$SESSION_WAIT_HISTORY, V$SESSION_EVENT, and V$SYSTEM_EVENT views provide information on what resources were waited for and, if the configuration parameter TIMED_STATISTICS is set to true, how long each resource was waited for.

Investigate wait events and related timing data when performing reactive performance tuning. The events with the most time listed against them are often strong indications of the performance bottleneck.

The following views contain related, but different, views of the same data:

• V$SESSION lists session information for each current session. It lists either the event currently being waited for, or the event last waited for on each session. This view also contains information about blocking sessions, the wait state, and the wait time.

• V$SESSION_WAIT is a current state view. It lists either the event currently being waited for, or the event last waited for on each session, the wait state, and the wait time.

• V$SESSION_WAIT_HISTORY lists the last 10 wait events for each current session and the associated wait time.

• V$SESSION_EVENT lists the cumulative history of events waited for on each session. After a session exits, the wait event statistics for that session are removed from this view.

• V$SYSTEM_EVENT lists the events and times waited for by the whole instance (that is, all session wait events data rolled up) since instance startup.

Because V$SESSION_WAIT is a current state view, it also contains a finer-granularity of information than V$SESSION_EVENT or V$SYSTEM_EVENT. It includes additional identifying data for the current event in three parameter columns: P1, P2, and P3.

For example, V$SESSION_EVENT can show that session 124 (SID=124) had many waits on the db file scattered read, but it does not show which file and block number.

However, V$SESSION_WAIT shows the file number in P1, the block number read in P2, and the number of blocks read in P3 (P1 and P2 let you determine for which segments the wait event is occurring).
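For example, a hedged sketch of mapping P1 (file number) and P2 (block number) to the underlying segment through DBA_EXTENTS (&file and &block are substitution-variable placeholders for the observed values; this query can be slow on databases with many extents):

SELECT owner, segment_name, segment_type
  FROM DBA_EXTENTS
 WHERE file_id = &file
   AND &block BETWEEN block_id AND block_id + blocks - 1;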

This section concentrates on examples using V$SESSION_WAIT.

▶️However, Oracle recommends capturing performance data over an interval and keeping this data for performance and capacity analysis.

This form of rollup data is queried from the V$SYSTEM_EVENT view by AWR.

Most commonly encountered events are described in this chapter, listed in case-sensitive alphabetical order. Other event-related data to examine is also included. The case used for each event name is that which appears in the V$SYSTEM_EVENT view.

Changes to Wait Event Statistics from Past Releases
Starting with Oracle Database 11g, Oracle Database accumulates wait counts and time outs for wait events (such as in the V$SYSTEM_EVENT view) differently than in past releases.

Continuous waits for certain types of resources (such as enqueues) are internally divided into a set of shorter wait calls. In releases prior to Oracle Database 11g, each individual internal wait call was counted as a separate wait. Starting with Oracle Database 11g, a single resource wait is recorded as a single wait, irrespective of the number of internal time outs experienced by the session during the wait.
This change allows Oracle Database to display a more representative wait count, and an accurate total time spent waiting for the resource.

▶️Time outs now refer to the resource wait, instead of the individual internal wait calls. This change also affects the average wait time and the maximum wait time.
For example, if a user session must wait for an enqueue in order for a transaction row lock to update a single row in a table, and it takes 10 seconds to acquire the enqueue, Oracle Database breaks down the enqueue wait into 3-second wait calls.

In this example, there will be three 3-second wait calls, followed by a 1-second wait call.

From the session's perspective, however, there is only one wait on an enqueue.

In releases prior to Oracle Database 11g, the V$SYSTEM_EVENT view would represent this wait scenario as follows:

• TOTAL_WAITS: 4 waits (three 3-second waits, one 1-second wait)

• TOTAL_TIMEOUTS: 3 time outs (the first three waits time out and the enqueue is acquired during the final wait)

• TIME_WAITED: 10 seconds (sum of the times from the 4 waits)

• AVERAGE_WAIT: 2.5 seconds

• MAX_WAIT: 3 seconds

Starting with Oracle Database 11g, this wait scenario is represented as:

• TOTAL_WAITS: 1 wait (one 10-second wait)

• TOTAL_TIMEOUTS: 0 time outs (the enqueue is acquired during the resource wait)

• TIME_WAITED: 10 seconds (time for the resource wait)

• AVERAGE_WAIT: 10 seconds

• MAX_WAIT: 10 seconds

The following common wait events are affected by this change:

• Enqueue waits (such as enq: name - reason waits)

• Library cache lock waits

• Library cache pin waits

• Row cache lock waits

The following statistics are affected by this change:

• Wait counts

• Wait time outs

• Average wait time

• Maximum wait time

The following views are affected by this change:

• V$EVENT_HISTOGRAM

• V$EVENTMETRIC

• V$SERVICE_EVENT
• V$SERVICE_WAIT_CLASS

• V$SESSION_EVENT

• V$SESSION_WAIT

• V$SESSION_WAIT_CLASS

• V$SESSION_WAIT_HISTORY

• V$SYSTEM_EVENT

• V$SYSTEM_WAIT_CLASS

• V$WAITCLASSMETRIC

• V$WAITCLASSMETRIC_HISTORY

Oracle Enqueue Names
This section lists Oracle enqueues. Enqueues are shared memory structures (locks) that serialize access to database resources.
They can be associated with a session or transaction.
Enqueue names are displayed in the LOCK_TYPE column of the DBA_LOCK and DBA_LOCK_INTERNAL data dictionary views.

A resource uniquely identifies an object that can be locked by different sessions within an instance (local resource) or between instances (global resource). Each session that tries to lock the resource will have an enqueue on the resource.

Note:
The names of enqueues and their definitions may change from release to release.

The Oracle user enqueues are:

The action to take depends on the lock type that is causing the most problems. The most common lock waits are generally for:

• TX: Transaction Lock -- Generally due to application or table setup issues, for example row level locking conflicts and ITL allocation
• TM: DML enqueue -- Generally due to application issues, particularly if foreign key constraints have not been indexed.
• ST: Space management enqueue -- Usually caused by too much space management occurring (for example, small extent sizes, lots of sorting, and so on)
• HW: High Water Mark -- Caused by concurrent sessions trying to raise a segment's high water mark when allocating space.

To determine which enqueues are causing the most waits systemwide, examine the V$ENQUEUE_STAT view thus:

SELECT eq_type "Lock", total_req# "Gets", total_wait# "Waits", cum_wait_time
  FROM V$ENQUEUE_STAT
 WHERE total_wait# > 0;

This gives the systemwide number of waits for each lock type.

You can also examine:
• Sessions with high numbers of "enqueue waits" in the V$SESSTAT view
• Sampling of the V$LOCK view to find waiting / blocking sessions
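Two hedged sketches of these checks (the statistic name enqueue waits and the BLOCK column of V$LOCK are as documented; adjust the filters as needed):

-- Sessions with the highest enqueue wait counts
SELECT s.sid, s.username, st.value AS enqueue_waits
  FROM V$SESSTAT st, V$STATNAME sn, V$SESSION s
 WHERE st.statistic# = sn.statistic#
   AND sn.name = 'enqueue waits'
   AND s.sid = st.sid
   AND st.value > 0
 ORDER BY st.value DESC;

-- Sessions currently blocking others (holding a lock another session has requested)
SELECT sid, type, id1, id2, lmode, request
  FROM V$LOCK
 WHERE block = 1;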

The Oracle system enqueues are:

• BL, Buffer Cache Management
• BR, Backup/Restore
• CF, Controlfile Transaction
• CI, Cross-instance Call Invocation
• CU, Bind Enqueue
• DF, Datafile
• DL, Direct Loader Index Creation
• DM, Database Mount
• DR, Distributed Recovery Process
• DW, SecureFiles
• DX, Distributed Transaction
• FP, File Object
• FS, File Set
• IN, Instance Number
• IR, Instance Recovery
• IS, Instance State
• IV, Library Cache Invalidation
• JI, Enqueue used during AJV snapshot refresh
• JQ, Job Queue
• KK, Redo Log "Kick"
• KP, contention in Oracle Data Pump startup and shutdown processes
• KO, Multiple Object Checkpoint
• L[A-P], Library Cache Lock
• LS, Log Start or Switch
• MM, Mount Definition
• MR, Media Recovery
• N[A-Z], Library Cache Pin
• PE, ALTER SYSTEM SET PARAMETER = VALUE
• PF, Password File
• PI, Parallel Slaves
• PR, Process Startup
• PS, Parallel Slave Synchronization
• Q[A-Z], Row Cache
• RO, Object Reuse
• RT, Redo Thread
• RW, Row Wait
• SC, System Change Number
• SM, SMON
• SN, Sequence Number
• SQ, Sequence Number Enqueue
• SR, Synchronized Replication
• SS, Sort Segment
• SV, Sequence Number Value
• TA, Transaction Recovery
• TC, Thread Checkpoint
• TE, Extend Table
• TO, Temporary Table Object Enqueue
• TS, Temporary Segment (also TableSpace)
• TT, Tablespace
• UL, User-defined Locks
• UN, User Name
• US, Undo Segment, Serialization
• WL, Being Written Redo Log
• XA, Instance Attribute Lock
• XI, Instance Registration Lock
• ZA, Exclusive Lock When Moving Audit Table

Statistics Descriptions
This section describes some of the statistics stored in the V$SESSTAT and V$SYSSTAT views. 

The statistics are listed in alphabetical order.

The CLASS column contains a number representing one or more statistics classes. The following class numbers are additive:

• 1, User
• 2, Redo
• 4, Enqueue
• 8, Cache
• 16, OS
• 32, Real Application Clusters
• 64, SQL
• 128, Debug

For example, a class value of 72 represents a statistic that relates to SQL statements and caching.
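Because the class values are bit flags, they can be decoded with BITAND. A minimal sketch listing all cache-related statistics (class bit 8):

SELECT name, class
  FROM V$STATNAME
 WHERE BITAND(class, 8) = 8
 ORDER BY name;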

Some statistics are populated only if the TIMED_STATISTICS initialization parameter is set to true.


Examples of Database Statistics Descriptions:

For descriptions of the other Oracle statistics, see:

https://docs.oracle.com/en/database/oracle/oracle-database/19/refrn/statistics-descriptions-2.html

Monday, May 5, 2025

Oracle Instance Tuning (Part 4)


Alireza Kamrani

Interpreting Oracle Database Statistics

Gather statistics that cover the time when the instance had the performance problem. If you previously captured baseline data for comparison, then you can compare the current data to the data from the baseline that most represents the problem workload.

When comparing two reports, ensure that the two reports are from times where the system was running comparable workloads.

Examine Load

Usually, wait events are the first data examined. However, if you have a baseline report, then check to see if the load has changed. Regardless of whether you have a baseline, it is useful to see whether the resource usage rates are high.

Load-related statistics to examine include redo size, session logical reads, db block changes, physical reads, physical read total bytes, physical writes, physical write total bytes, parse count (total), parse count (hard), and user calls. This data is queried from V$SYSSTAT. It is best to normalize this data over seconds and over transactions. It is also useful to examine the total I/O load in MB per second by using the sum of physical read total bytes and physical write total bytes. The combined value includes the I/O to the buffer cache, redo logs, and archive logs, as well as the I/O performed by Recovery Manager (RMAN) backup and recovery and by any Oracle Database background process.
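A hedged sketch of the combined I/O figure, taken from two cumulative V$SYSSTAT counters (to get MB per second, sample twice and divide the delta by the elapsed seconds of the interval):

SELECT SUM(value) / 1024 / 1024 AS total_io_mb_since_startup
  FROM V$SYSSTAT
 WHERE name IN ('physical read total bytes', 'physical write total bytes');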

In the AWR report, look at the Load Profile section. The data has been normalized over transactions and over seconds.

Changing Load

The load profile statistics over seconds show the changes in throughput (that is, whether the instance is performing more work each second). The statistics over transactions identify changes in the application characteristics by comparing these to the corresponding statistics from the baseline report.

High Rates of Activity

Examine the statistics normalized over seconds to identify whether the rates of activity are very high. It is difficult to make blanket recommendations on high values, because the thresholds are different on each site and are contingent on the application characteristics, the number and speed of CPUs, the operating system, the I/O system, and the Oracle Database release.

The following are some generalized examples (acceptable values vary at each site):

• A hard parse rate of more than 100 a second indicates that there is a very high amount of hard parsing on the system. High hard parse rates cause serious performance issues and must be investigated. Usually, a high hard parse rate is accompanied by latch contention on the shared pool and library cache latches.

• Check whether the sum of the wait times for library cache and shared pool latch events (latch: library cache, latch: library cache pin, latch: library cache lock and latch: shared pool) is significant compared to the statistic DB time found in V$SYSSTAT. If so, examine the SQL ordered by Parse Calls section of the AWR report (a query sketching this comparison appears after this list).

• A high soft parse rate could be in the rate of 300 a second or more. Unnecessary soft parses also limit application scalability. Optimally, a SQL statement should be soft parsed once in each session and executed many times.
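A rough sketch of the latch-time versus DB time comparison described in the second item above (both values are reported in hundredths of a second; the exact set of latch event names varies by release, so treat the IN list as an assumption):

SELECT (SELECT SUM(time_waited)
          FROM V$SYSTEM_EVENT
         WHERE event IN ('latch: shared pool',
                         'latch: library cache',
                         'latch: library cache pin',
                         'latch: library cache lock')) AS latch_wait_time,
       (SELECT value FROM V$SYSSTAT WHERE name = 'DB time') AS db_time
  FROM DUAL;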

Using Wait Event Statistics to Drill Down to Bottlenecks

Whenever an Oracle process waits for something, it records the wait using one of a set of predefined wait events. These wait events are grouped in wait classes. The Idle wait class groups all events that a process waits for when it does not have work to do and is waiting for more work to perform. Non-idle events indicate nonproductive time spent waiting for a resource or action to complete.





Note:

Not all symptoms can be evidenced by wait events.

The most effective way to use wait event data is to order the events by the wait time. This is only possible if TIMED_STATISTICS is set to true. Otherwise, the wait events can only be ranked by the number of times waited, which is often not the ordering that best represents the problem.

To get an indication of where time is spent, follow these steps:

• Examine the data collection for V$SYSTEM_EVENT. The events of interest should be ranked by wait time.

Identify the wait events that have the most significant percentage of wait time. To determine the percentage of wait time, add the total wait time for all wait events, excluding idle events, such as Null event, SQL*Net message from client, SQL*Net message to client, and SQL*Net more data to client. Calculate the relative percentage of the five most prominent events by dividing each event's wait time by the total time waited for all events.

Alternatively, look at the Top 5 Timed Events section at the beginning of the Automatic Workload Repository report. This section automatically orders the wait events (omitting idle events), and calculates the relative percentage:


Top 5 Timed Events
~~~~~~~~~~~~~~~~~~
Event                            Waits   Time (s)   % Total Call Time
------------------------------  --------  ---------  -----------------
CPU time                                       559               88.80
log file parallel write            2,181        28                4.42
SQL*Net more data from client    516,611        27                4.24
db file parallel write            13,383        13                2.04
db file sequential read              563         2                 .27



In some situations, there might be a few events with similar percentages. This can provide extra evidence if all the events are related to the same type of resource request (for example, all I/O related events).


• Look at the number of waits for these events, and the average wait time. For example, for I/O related events, the average time might help identify whether the I/O system is slow. The following example of this data is taken from the Wait Event section of the AWR report:

                                                      Total Wait   Avg wait      Waits
Event                            Waits   Timeouts       Time (s)       (ms)       /txn
------------------------------  --------  ---------  ------------  ---------  ---------
log file parallel write            2,181          0            28         13       41.2
SQL*Net more data from client    516,611          0            27          0    9,747.4
db file parallel write            13,383          0            13          1      252.5


• The top wait events identify the next places to investigate. A table of common wait events is listed in the last section. It is usually a good idea to also have a quick look at high-load SQL.

• Examine the related data indicated by the wait events to see what other information this data provides. Determine whether this information is consistent with the wait event data. In most situations, there is enough data to begin developing a theory about the potential causes of the performance bottleneck.

• To determine whether this theory is valid, cross-check data you have examined with other statistics available for consistency. The appropriate statistics vary depending on the problem, but usually include load profile-related data in V$SYSSTAT, operating system statistics, and so on. Perform cross-checks with other data to confirm or refute the developing theory.

Table of Wait Events and Potential Causes

The following table links wait events to possible causes and gives an overview of the Oracle data:


