Thursday, August 29, 2024

Handling emergency performance problems in Oracle database


Alireza Kamrani 

08/30/2024


This section provides techniques for dealing with performance emergencies. 

You presumably have a methodology for establishing and improving application performance. 

However, in an emergency situation, a component of the system has changed to transform it from a reliable, predictable system to one that is unpredictable and not satisfying user requests.

In this case, the performance engineer must rapidly determine what has changed and take appropriate actions to resume normal service as quickly as possible. 

In many cases, it is necessary to take immediate action, and a rigorous performance improvement project is unrealistic.


After addressing the immediate performance problem, the performance engineer must collect sufficient debugging information either to get better clarity on the performance problem or to at least ensure that it does not happen again.


The method for debugging emergency performance problems is the same as the method described in the previous posts.


However, shortcuts are taken in various stages because of the timely nature of the problem. Keeping detailed notes and records of facts found as the debugging process progresses is essential for later analysis and justification of any remedial actions. This is analogous to a doctor keeping good patient notes for future reference.


Steps in the Emergency Performance Method

The Emergency Performance Method is as follows:

♦️Survey the performance problem and collect its symptoms. This process should include the following:

    • User feedback on how the system is underperforming. Is the problem throughput or response time?
    • Ask the question, "What has changed since we last had good performance?" This answer can give clues to the problem. However, getting unbiased answers in an escalated situation can be difficult. Try to locate some reference points, such as collected statistics or log files, that were taken before and after the problem.
    • Use automatic tuning features to diagnose and monitor the problem. In addition, you can use Oracle Enterprise Manager Cloud Control (Cloud Control) performance features to identify top SQL and sessions.

♦️Sanity-check the hardware utilization of all components of the application system. Check where the highest CPU utilization is, and check the disk, memory usage, and network performance on all the system components. This quick process identifies which tier is causing the problem. If the problem is in the application, then shift analysis to application debugging. Otherwise, move on to database server analysis.

♦️Determine if the database server is constrained on CPU or if it is spending time waiting on wait events. If the database server is CPU-constrained, then investigate the following:

    • Sessions that are consuming large amounts of CPU at the operating system level and database; check V$SESS_TIME_MODEL for database CPU usage
    • Sessions or statements that perform many buffer gets at the database level; check V$SESSTAT and V$SQLSTATS
    • Execution plan changes causing sub-optimal SQL execution; these can be difficult to locate
    • Incorrect setting of initialization parameters
    • Algorithmic issues caused by code changes or upgrades of all components
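
The first two checks above can be sketched as queries against the V$ views (assuming an Oracle 12c-or-later instance and SELECT privileges on these views; the top-10 cutoff is illustrative):

```sql
-- Rank sessions by accumulated DB CPU (V$SESS_TIME_MODEL reports microseconds)
SELECT s.sid, s.serial#, s.username,
       ROUND(t.value / 1e6, 1) AS cpu_seconds
FROM   v$sess_time_model t
       JOIN v$session s ON s.sid = t.sid
WHERE  t.stat_name = 'DB CPU'
ORDER  BY t.value DESC
FETCH FIRST 10 ROWS ONLY;

-- Rank statements by buffer gets to spot logical-I/O-heavy SQL
SELECT sql_id, buffer_gets, executions,
       ROUND(buffer_gets / NULLIF(executions, 0)) AS gets_per_exec
FROM   v$sqlstats
ORDER  BY buffer_gets DESC
FETCH FIRST 10 ROWS ONLY;
```

Cross-check the top sessions here against the top processes reported by the operating system to confirm the database is really where the CPU is going.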

♦️If the database sessions are waiting on events, then follow the wait events listed in V$SESSION_WAIT to determine what is causing serialization. The V$ACTIVE_SESSION_HISTORY view contains a sampled history of session activity, which you can use to perform diagnosis even after an incident has ended and the system has returned to normal operation. In cases of massive contention for the library cache, it might not be possible to log on or submit SQL to the database. In this case, use historical data to determine why there is suddenly contention on this latch. If most waits are for I/O, then examine V$ACTIVE_SESSION_HISTORY to determine the SQL being run by the sessions that are performing the I/O.
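
A minimal sketch of that last step (note that V$ACTIVE_SESSION_HISTORY requires the Diagnostics Pack license; the 30-minute window is an arbitrary illustration):

```sql
-- Which SQL statements account for the most sampled User I/O waits recently?
SELECT sql_id, COUNT(*) AS samples
FROM   v$active_session_history
WHERE  sample_time > SYSTIMESTAMP - INTERVAL '30' MINUTE
       AND wait_class = 'User I/O'
GROUP  BY sql_id
ORDER  BY samples DESC
FETCH FIRST 10 ROWS ONLY;
```

Because ASH is sampled once per second, sample counts approximate time spent: the statements at the top of this list are where the I/O time is going.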

♦️Apply emergency action to stabilize the system. This could involve actions that take parts of the application offline or restrict the workload that can be applied to the system. It could also involve a system restart or the termination of jobs in process. These actions naturally have service-level implications.

♦️Validate that the system is stable. Having made changes and restrictions to the system, validate that the system is now stable, and collect a reference set of statistics for the database. Now follow the rigorous performance method described in the previous posts to bring back all functionality and users to the system. This process may require significant application re-engineering before it is complete.


Alireza Kamrani

Sunday, August 25, 2024

Steps in the Oracle Performance Improvement Method

Database Box


Alireza Kamrani 

08/26/2024


Oracle performance methodology helps you to identify performance problems in an Oracle database. This involves identifying bottlenecks and fixing them. It is recommended that changes be made to a system only after you have confirmed that there is a bottleneck. 

Performance improvement, by its nature, is iterative. 

For this reason, removing the first bottleneck might not lead to performance improvement immediately, because another bottleneck might be revealed. 

Also, in some cases, if serialization points move to a more inefficient sharing mechanism, then performance could degrade. 

With experience, and by following a rigorous method of bottleneck elimination, applications can be debugged and made scalable.

Performance problems generally result from either a lack of throughput, unacceptable user/job response time, or both. The problem might be localized between application modules, or it might be for the entire system.


In this section, you will find essential methods for investigating and diagnosing performance issues:


  1. Perform the following initial standard checks:

a) Get candid feedback from users. Determine the performance project's scope and its performance goals, and set performance goals for the future. This process is key in future capacity planning.


b) Get a full set of operating system, database, and application statistics from the system when the performance is both good and bad. If these are not available, then get whatever is available. Missing statistics are analogous to missing evidence at a crime scene: they make the detective's work harder and more time-consuming.


c) Sanity-check the operating systems of all computers involved with user performance. By sanity-checking the operating system, you look for hardware or operating system resources that are fully utilized. List any over-used resources as symptoms for analysis later. In addition, check that all hardware shows no errors or diagnostics.


  2. Check for the top ten most common mistakes with Oracle Database (presented in a previous post in this group), and determine if any of these are likely to be the problem. List these as symptoms for later analysis. These are included because they represent the most likely problems. ADDM automatically detects and reports nine of these top ten issues.
  3. Build a conceptual model of what is happening on the system, using the symptoms as clues to understand what caused the performance problems. 
  4. Propose a series of remedial actions and the anticipated behavior of the system, then apply them in the order that can benefit the application the most. ADDM produces recommendations, each with an expected benefit. A golden rule in performance work is that you only change one thing at a time and then measure the differences. Unfortunately, system downtime requirements might prohibit such a rigorous investigation method. If multiple changes are applied at the same time, then try to ensure that they are isolated so that the effects of each change can be independently validated.
  5. Validate that the changes made have had the desired effect, and see if the user's perception of performance has improved. Otherwise, look for more bottlenecks, and continue refining the conceptual model until your understanding of the application becomes more accurate.
  6. Repeat the last three steps until performance goals are met or become impossible due to other constraints.
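
Collecting before-and-after statistics, as step 1b requires, is commonly done with AWR snapshots. A hedged sketch (the snapshot IDs 100 and 101 are placeholders for the IDs bracketing your problem window, and both AWR and ADDM require the Diagnostics Pack license):

```sql
-- Capture an AWR snapshot as a reference point (repeat after the change)
EXEC DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT;

-- Run ADDM across the two snapshots and print its findings
VARIABLE task_name VARCHAR2(64)
EXEC DBMS_ADDM.ANALYZE_INST(:task_name, 100, 101)
SELECT DBMS_ADDM.GET_REPORT(:task_name) FROM dual;
```

The ADDM report produced this way is exactly the source of the "recommendations, each with an expected benefit" mentioned in step 4.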


This method identifies the biggest bottleneck and uses an objective approach to performance improvement. The focus is on making large performance improvements by increasing application efficiency and eliminating resource shortages and bottlenecks. In this process, it is anticipated that minimal (less than 10%) performance gains are made from instance tuning, and large gains (100% +) are made from isolating application inefficiencies. 


Alireza Kamrani 

08/26/2024

Saturday, August 17, 2024

Top Ten Mistakes Found in Oracle Database Environment


Database Box


Alireza Kamrani 

08/18/2024


The Oracle Performance Improvement Method:

Oracle performance methodology helps you to identify performance problems in an Oracle database. This involves identifying bottlenecks and fixing them. It is recommended that changes be made to a system only after you have confirmed that there is a bottleneck. 

Performance improvement, by its nature, is iterative. 

For this reason, removing the first bottleneck might not lead to performance improvement immediately, because another bottleneck might be revealed. 

Also, in some cases, if serialization points move to a more inefficient sharing mechanism, then performance could degrade. 

With experience, and by following a rigorous method of bottleneck elimination, applications can be debugged and made scalable.

Performance problems generally result from either a lack of throughput, unacceptable user/job response time, or both. The problem might be localized between application modules, or it might be for the entire system.




This section lists the most common mistakes found in Oracle databases. 

By following the Oracle performance improvement methodology, you should be able to avoid these mistakes altogether. 

If you find these mistakes in your system, then re-engineer the application where the performance effort is worthwhile.


  1. Bad connection management
    The application connects and disconnects for each database interaction. This problem is common with stateless middleware in application servers. It has over two orders of magnitude impact on performance, and is totally unscalable.
  2. Bad use of cursors and the shared pool
    Not using cursors results in repeated parses. If bind variables are not used, then there is hard parsing of all SQL statements. This has an order of magnitude impact in performance, and it is totally unscalable. Use cursors with bind variables that open the cursor and execute it many times. Be suspicious of applications generating dynamic SQL.
  3. Bad SQL
    Bad SQL is SQL that uses more resources than appropriate for the application requirement. This can be a decision support systems (DSS) query that runs for more than 24 hours, or a query from an online application that takes more than a minute. You should investigate SQL that consumes significant system resources for potential improvement. ADDM identifies high load SQL. SQL Tuning Advisor can provide recommendations for improvement.
  4. Use of nonstandard initialization parameters
    These might have been implemented based on poor advice or incorrect assumptions. Most databases provide acceptable performance using only the set of basic parameters. In particular, parameters associated with SPIN_COUNT on latches and undocumented optimizer features can cause a great deal of problems that can require considerable investigation.
    Likewise, optimizer parameters set in the initialization parameter file can override proven optimal execution plans. For these reasons, schemas, schema statistics, and optimizer settings should be managed as a group to ensure consistency of performance.
  5. Getting database I/O wrong
    Many sites lay out their databases poorly over the available disks. Other sites specify the number of disks incorrectly, because they configure disks by disk space and not I/O bandwidth.
  6. Online redo log setup problems
    Many sites run with too few online redo log files and files that are too small. Small redo log files cause system checkpoints to continuously put a high load on the buffer cache and I/O system. If too few redo log files exist, then the archive cannot keep up, and the database must wait for the archiver to catch up.
  7. Serialization of data blocks in the buffer cache due to lack of free lists, free list groups, transaction slots (INITRANS), or shortage of rollback segments. 
    This is particularly common on INSERT-heavy applications, in applications that have raised the block size above 8K, or in applications with large numbers of active users and few rollback segments. Use automatic segment-space management (ASSM) and automatic undo management to solve this problem.
  8. Long full table scans
    Long full table scans for high-volume or interactive online operations could indicate poor transaction design, missing indexes, or poor SQL optimization. Long table scans, by nature, are I/O intensive and unscalable.
  9. High amounts of recursive (SYS) SQL
    Large amounts of recursive SQL executed by SYS could indicate space management activities, such as extent allocations, taking place. This is unscalable and impacts user response time. Use locally managed tablespaces to reduce recursive SQL due to extent allocation. Recursive SQL executed under another user ID is probably SQL and PL/SQL, and this is not a problem. 
  10. Deployment and migration errors
    In many cases, an application uses too many resources because the schema owning the tables has not been successfully migrated from the development environment or from an older implementation. Examples of this are missing indexes or incorrect statistics. These errors can lead to sub-optimal execution plans, poor interactive user performance, and possibly 'db file sequential read' or 'db file scattered read' wait events. When migrating applications of known performance, export the schema statistics to maintain plan stability using the DBMS_STATS package.
    Although these errors are not directly detected by ADDM, ADDM highlights the resulting high load SQL.
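
For mistake 2 above, one way to spot missing bind variables is to group statements that differ only by literal values. A sketch (the HAVING threshold is an arbitrary illustration; requires SELECT on V$SQLSTATS):

```sql
-- Statements sharing a FORCE_MATCHING_SIGNATURE would collapse into one
-- cursor under CURSOR_SHARING=FORCE; many versions usually means literals.
SELECT force_matching_signature,
       COUNT(*)         AS versions,
       SUM(parse_calls) AS total_parses,
       MIN(sql_text)    AS sample_text
FROM   v$sqlstats
WHERE  force_matching_signature <> 0
GROUP  BY force_matching_signature
HAVING COUNT(*) > 10
ORDER  BY versions DESC
FETCH FIRST 10 ROWS ONLY;
```

Statement groups with hundreds of versions and one parse call each are the prime candidates for rewriting with bind variables.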

Sincerely,
Alireza Kamrani

Saturday, August 10, 2024

An Overview of Oracle 23ai Memory Management


Database Box





Alireza Kamrani (08/10/2024)


About Memory Management

The memory structures that must be managed are the system global area (SGA) and the instance program global area (instance PGA). Oracle Database supports various memory management methods, which are chosen by initialization parameter settings.


Unified Memory Management

Unified Memory configures the database instance memory with a single parameter, MEMORY_SIZE. 


Using Unified Memory Management

You can allow the Oracle Database instance to automatically manage and tune memory for you based on total memory size.

The database can dynamically use this memory for SGA, PGA, MGA, UGA, and other memory segments. 

The split between different memory segments is based on the memory sizing of the PDBs currently open in the CDB. 

If huge pages are configured, they can be used for both SGA and PGA. Unified Memory provides an extremely flexible memory configuration.


Automatic Memory Management

Oracle Database can manage the SGA memory and instance PGA memory completely automatically. You designate only the total memory size to be used by the instance, and Oracle Database dynamically exchanges memory between the SGA and the instance PGA as needed to meet processing demands. 

This capability is referred to as automatic memory management.

With this memory management method, the database also dynamically tunes the sizes of the individual SGA components and the sizes of the individual PGAs. 

Oracle recommends automatic memory management for databases where the total size of the SGA and PGA memory is less than or equal to four gigabytes.
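
As a sketch (the sizes are illustrative, and SPFILE-scoped changes take effect only after an instance restart), automatic memory management is typically enabled like this:

```sql
-- Give the instance one total memory budget and let it tune SGA and PGA
ALTER SYSTEM SET memory_max_target = 4G SCOPE=SPFILE;
ALTER SYSTEM SET memory_target = 3G SCOPE=SPFILE;
-- Zero targets act only as minimums, leaving the tuning fully automatic
ALTER SYSTEM SET sga_target = 0 SCOPE=SPFILE;
ALTER SYSTEM SET pga_aggregate_target = 0 SCOPE=SPFILE;
```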


Manual Memory Management

If you prefer to exercise more direct control over the sizes of individual memory components, you can disable automatic memory management and configure the database for manual memory management. There are a few different methods available for manual memory management. Some of these methods retain some degree of automation. The methods therefore vary in the amount of effort and knowledge required by the DBA. These methods are:

  • Automatic shared memory management - for the SGA
  • Manual shared memory management - for the SGA
  • Automatic PGA memory management - for the instance PGA
  • Manual PGA memory management - for the instance PGA

These memory management methods are described later in this chapter.

If you create your database with Database Configuration Assistant (DBCA) and choose the basic installation option, automatic memory management is enabled when system memory is less than or equal to 4 gigabytes. 

When system memory is greater than 4 gigabytes, automatic memory management is disabled, and automatic shared memory management is enabled. 

If you choose advanced installation, then DBCA enables you to select automatic memory management or automatic shared memory management.

Oracle recommends automatic shared memory management when the total size of the SGA and PGA memory is four gigabytes or larger.
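
A comparable sketch for automatic shared memory management, again with illustrative sizes and requiring a restart for SPFILE-scoped changes:

```sql
-- Fix separate SGA and PGA budgets; components within the SGA are auto-tuned
ALTER SYSTEM SET memory_target = 0 SCOPE=SPFILE;
ALTER SYSTEM SET sga_target = 6G SCOPE=SPFILE;
ALTER SYSTEM SET pga_aggregate_target = 2G SCOPE=SPFILE;
```

The design trade-off: ASMM gives up the ability to shift memory between SGA and PGA at run time, but in exchange the SGA can be backed by huge pages.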


MEMORY_SIZE in 23c:


MEMORY_SIZE specifies the size of instance-wide usable memory.

Parameter type: Big integer

Syntax: MEMORY_SIZE = integer [K | M | G]

Default value: 0

Modifiable: Yes. You can use ALTER SYSTEM to increase or decrease the value of this parameter. However, the value of this parameter must always be greater than or equal to its value at instance startup.

Modifiable in a PDB: No

Range of values: 1536 MB to MEMORY_MAX_SIZE

Basic: No

Oracle RAC: Different instances can use different values.

In a text-based initialization parameter file, if you omit MEMORY_MAX_SIZE and include a value for MEMORY_SIZE, then the database automatically sets MEMORY_MAX_SIZE to the value of MEMORY_SIZE. 

♦️If you omit the line for MEMORY_SIZE and include a value for MEMORY_MAX_SIZE, the MEMORY_SIZE parameter defaults to 0. 

After instance startup, you can dynamically change MEMORY_SIZE to a nonzero value, provided that it does not exceed the value of MEMORY_MAX_SIZE.

♦️Based on the value of MEMORY_SIZE, the values for the SGA and PGA parameters SGA_TARGET and PGA_AGGREGATE_LIMIT will be calculated internally for the instance. The sizing will also depend on the large page settings.

♦️When MEMORY_SIZE is set to a nonzero value, the CDB-level values for the following SGA and PGA parameters will be ignored by the database: SGA_MAX_SIZE, PGA_AGGREGATE_LIMIT, and SGA_TARGET. If you assign a value to any of these parameters at the time of instance startup, the database will not report an error, but the parameter value will be ignored. 

If you dynamically modify any of these parameters, an error will be reported.

When MEMORY_SIZE is set to a nonzero value, the MEMORY_MAX_TARGET and MEMORY_TARGET parameters should be set to 0. 

If you attempt to set either of these parameters to a nonzero value, the request will be ignored, and a warning will be written to the alert log at startup. 
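
Putting the rules above together, a hedged sketch of enabling Unified Memory Management (sizes are illustrative; requires Oracle Database 23ai and an instance restart):

```sql
-- MEMORY_MAX_SIZE caps the budget; MEMORY_SIZE can later grow up to it
ALTER SYSTEM SET memory_max_size = 8G SCOPE=SPFILE;
ALTER SYSTEM SET memory_size = 6G SCOPE=SPFILE;
-- MEMORY_TARGET / MEMORY_MAX_TARGET should remain 0 when MEMORY_SIZE is set
ALTER SYSTEM SET memory_target = 0 SCOPE=SPFILE;
```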

Note: This parameter is available starting with Oracle Database 23ai.


♦️Unified Memory Management (UMM): A New Approach in Oracle Database 23c♦️


Prior to Oracle Database 23c, managing memory involved multiple methods:

– Automatic memory management (AMM)
– Automatic shared memory management (ASMM)
– Manual memory management


With the introduction of Oracle Database 23c, a new contender joins the arena: 

Unified Memory Management (UMM).


– Oracle Database 23c introduces UMM, a new memory management method that simplifies configuration and potentially enhances performance.

– UMM builds upon Automatic Memory Management (AMM), offering more flexibility and the ability to leverage huge pages.

– It’s controlled by the new parameter MEMORY_SIZE, which supersedes the AMM parameter MEMORY_TARGET.


With UMM, similar to AMM, you set a total memory size, and Oracle Database dynamically allocates memory between the System Global Area (SGA) and Program Global Area (PGA) as needed.

Huge Page Support: UMM can potentially utilize huge pages, which can boost performance and reduce memory overhead. This suggests it may not be constrained by the 4GB limit like AMM.


Conclusion

UMM holds promise for simplifying memory management in Oracle Database 23c. While initial tests reveal intriguing behavior, comprehensive understanding and best practices await further documentation and experimentation.


Alireza Kamrani 

08/10/2024

Oracle Standby troubleshooting

 💢 Oracle Standby troubleshooting 💢   Written by: Alireza Kamrani. In this post I presented essential methods for evaluating, configuri...