
PART
II

Database Management

CHAPTER
8

Database Tuning

From a tuning perspective, every system has a performance bottleneck that may move from component to component over a period of days or even weeks. The goal of performance design is to make sure that the physical limitations of the applications and the associated hardware—I/O throughput rates, memory sizes, query performance, and so on—do not impact the business performance. If the application performance limits the business process it is supposed to be supporting, the application must be tuned. During the design process, the limits of the application environment must be evaluated, including the hardware and the design of the application’s interactions with the database. No environment provides infinite computing capacity, so every environment is designed to fail at some performance point. In the process of designing the application, you should strive to have your performance needs amply served by the performance capabilities of the environment.

Performance tuning is a part of the life cycle of every database application, and the earlier performance is addressed (preferably before going into production), the more likely it will be successfully resolved. As noted in previous chapters, most performance problems are not isolated symptoms but rather are the result of the system design. Tuning efforts should therefore focus on identifying and fixing the underlying flaws that result in unacceptable performance.

Tuning is the final step in a four-step process: planning, implementing, and monitoring must precede it. If you tune only for the sake of tuning, you are failing to address the full cycle of activity and will likely never resolve the underlying flaws that caused the performance problem.

Most of the database objects that can be tuned are discussed elsewhere in this book—for example, undo segments are covered thoroughly in Chapter 7. This chapter only discusses the tuning-related activities for such objects, while their own chapters cover planning and monitoring activities.

As of Oracle Database 10g, and significantly enhanced in Oracle Database 11g and 12c, you can take advantage of new tuning tools and features, including the Automatic Workload Repository (AWR). For ease of use, and to take advantage of numerous automated monitoring and diagnostic tools, Oracle recommends Oracle Enterprise Manager Cloud Control 12c for routine use as a central dashboard for all monitoring and performance tools. Before jumping into the Cloud Control tools, however, I’ll present some of the prerequisites and principles behind effective proactive and reactive tuning methods.

In the following sections, you will see tuning activities for the following areas:

- Application design
- SQL
- Memory usage
- Data access
- Data manipulation
- Network traffic
- Physical storage
- Logical storage
- Tuning using the Automatic Workload Repository
- Managing resources in a PDB
- Performing Database Replay

Tuning Application Design

Why should a DBA tuning guide include a section on application design? And why should this section come first? Because nothing you can do as a DBA will have as great an impact on the system performance as the design of the application. The requirements for making the DBA’s involvement in application development a reality are described in Chapter 5. In designing an application, you can take several steps to make effective and proper use of the available technology, as described in the following sections.

Effective Table Design

No matter how well designed your database is, poor table design will lead to poor performance; so will overly rigid adherence to relational table designs. While fully relational table designs (said to be in third normal form, or even fourth or fifth normal form) are logically desirable, they are usually physically undesirable in anything but OLTP environments.

The problem with such designs is that although they accurately reflect the ways in which an application’s data is related to other data, they do not reflect the normal access paths that users will employ to access that data. Once the user’s access requirements are evaluated, the fully relational table design will become unworkable for many large queries. Typically, the first problems will occur with queries that return a large number of columns. These columns are usually scattered among several tables, forcing the tables to be joined together during the query. If one of the joined tables is large, the performance of the whole query may suffer unless a software/hardware platform such as Oracle Exadata or Oracle In-Memory Database filters the table columns themselves to only return the desired columns.

In designing the tables for an application, developers should first develop the model in third normal form (3NF) and then consider denormalizing data to meet specific requirements—for example, creating small derived tables (or materialized views) from large, static tables. Can that data be dynamically derived from the large, static tables on demand? Of course. But if the users frequently request it, and the data is largely unchanging, then it makes sense to periodically store that data in the format in which the users will ask for it.

For example, some applications store historical data and current data in the same table. Each row may have a timestamp column, so the current row in a set is the one with the most recent timestamp. Every time a user queries the table for a current row, the user will need to perform a subquery, such as the following:

images
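A minimal sketch of such a subquery, using a hypothetical LEDGER table keyed on ACCOUNT_ID with a TXN_TIMESTAMP column:

   select *
     from ledger l
    where l.account_id = :acct_id
      and l.txn_timestamp =
          (select max(l2.txn_timestamp)      -- find the most recent (current) row for the account
             from ledger l2
            where l2.account_id = l.account_id);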

If two such tables are joined, there will be two subqueries. In a small database, this may not present a performance problem, but as the number of tables and rows increases, performance problems will follow. Partitioning the historical data away from the current data or storing the historical data in a separate table will involve more work for the DBAs and developers but should improve the long-term performance of the application.

User-centered table design, rather than theory-centered table design, will yield a system that better meets the users’ requirements; this is not to say that you should not design the database using 3NF and 4NF methodologies: it’s a good starting point for revealing business requirements and a prerequisite for the physical database design. Physical database design options include separating a single table into multiple tables, and the reverse—combining multiple tables into one. The emphasis should be on providing the users the most direct path possible to the data they want in the format they want.

Distribution of CPU Requirements

When effectively designed and given adequate hardware, an Oracle database application will process I/O requests without excessive waits, will use memory areas without swapping and paging memory to disk, and will use the CPU threads without generating high load averages. Data that is read into memory by one process will be stored in memory and reused by many processes before it is aged out of memory. SQL commands are reused via the shared SQL area (the shared pool), further reducing the burden on the system.

If the I/O burdens of the system are reduced, the CPU burden may increase. You have several options for managing the CPU resources:

- Schedule the CPU load. You should time long-running batch queries or update programs to run at off-peak hours. Rather than run them at lower operating system priority while online users are performing transactions, run them at normal operating system priority at an appropriate time. Maintaining their normal priority level while scheduling the jobs appropriately will minimize potential locking, undo, and CPU conflicts.

- Take advantage of the opportunity to physically shift CPU requirements from one server to another. Wherever possible, isolate the database server from the application’s CPU requirements. The data distribution techniques described in the networking chapters of this book will result in data being stored in its most appropriate place, and the CPU requirements of the application may be separated from the I/O requirements against the database.

- Consider using Oracle’s Real Application Clusters (RAC) technology on traditional hardware platforms or on the Exadata engineered system platform to spread the database access requirements for a single database across multiple instances. See Chapter 12 for a review of RAC features.

- Use the database resource management features. You can use the Database Resource Manager to establish resource allocation plans and resource consumer groups. You can use Oracle’s capabilities to change the resource allocations available to the consumer groups. See Chapter 5 for details on creating and implementing resource consumer groups and resource plans via the Database Resource Manager.

- Use Parallel Query to distribute the processing requirements of SQL statements among multiple CPUs. Parallelism can be used by almost every SQL, DML, and DDL command, including SELECT, CREATE TABLE AS SELECT, CREATE INDEX, RECOVER, partition management, and the SQL*Loader Direct Path loading options.

The degree to which a transaction is parallelized depends on the defined degree of parallelism for the transaction. Each table has a defined degree of parallelism, and a query can override the default degree of parallelism by using the PARALLEL hint. Using Automatic Degree of Parallelism (Auto DOP), Oracle evaluates the number of CPUs available on the server and the number of disks on which the table’s data is stored in order to determine the default degree of parallelism.

The maximum available parallelism level is set at the instance level. The PARALLEL_MAX_SERVERS initialization parameter sets the maximum number of parallel query server processes that can be used at any one time by all the processes in the database. For example, if you set PARALLEL_MAX_SERVERS to 32 for your instance, and you run a query that uses 30 parallel query server processes for its query and sorting operations, then only two parallel query server processes are available for all the rest of the users in the database. Therefore, you need to carefully manage the parallelism you allow for your queries and batch operations. The PARALLEL_ADAPTIVE_MULTI_USER parameter, when set to TRUE, enables an adaptive algorithm designed to improve performance in multi-user environments using parallel execution. The algorithm automatically reduces the requested degree of parallelism according to the system load at query startup time. The effective degree of parallelism is based on the default degree of parallelism, or the degree from the table, or hints, divided by a reduction factor.

For each table, you can set a default degree of parallelism via the PARALLEL clause of the CREATE TABLE and ALTER TABLE commands. The degree of parallelism tells Oracle how many parallel query server processes to attempt to use for each part of the operation. For example, if a query that performs both table scanning and data sorting operations has a degree of parallelism of 8, there could be 16 parallel query server processes used—eight for scanning and eight for sorting. You can also specify a degree of parallelism for an index when it is created, via the PARALLEL clause of the CREATE INDEX command.
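For illustration, the following statements set a default degree of parallelism for an assumed EMPLOYEE table and one of its indexes, and then override the table's default with a hint (the names and degrees are illustrative):

   alter table employee parallel 8;
   create index employee_hire_date_idx on employee(hire_date) parallel 8;

   select /*+ parallel(e, 4) */ count(*)     -- hint overrides the table's default degree
     from employee e;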

The minimum number of parallel query server processes started is set via the PARALLEL_MIN_SERVERS initialization parameter. In general, you should set this parameter to a very low number (less than 12) unless the system is actively used at all hours of the day. Setting this parameter to a low value will force Oracle to repeatedly start new query server processes, but it will greatly decrease the amount of memory held by idle parallel query server processes during low-use periods. If you set a high value for PARALLEL_MIN_SERVERS, you may frequently have idle parallel query server processes on your server, holding onto the memory they had previously acquired but not performing any functions.

Parallelizing operations distributes their processing requirements across multiple CPUs; however, you should use these features carefully. If you use a degree of parallelism of 5 for a large query, you will have five separate processes accessing the data (and five more receiving the processed rows). If you have that many processes accessing the data, you may create contention for the disks on which the data is stored, thus hurting performance. When using Parallel Query, you should selectively apply it to those tables whose data is well distributed over many physical devices. Also, you should avoid using it for all tables; as noted earlier, a single query may use all the available parallel query server processes, eliminating the parallelism for all the rest of the transactions in your database.

Effective Application Design

In addition to the application design topics described later in this chapter, there are several general guidelines for Oracle applications.

First, applications should minimize the number of times they request data from the database. Options include the use of sequences, the use of PL/SQL blocks, and the denormalization of tables. You can use distributed database objects such as materialized views to help reduce the number of times a database is queried.

NOTE

Even mildly inefficient SQL can impact your database’s performance if it is executed frequently enough. SQL that generates few or no physical I/O reads still consumes CPU resources.

Second, different users of the same application should query the database in a very similar fashion. Consistent access paths increase the likelihood that requests may be resolved by information that is already available in the System Global Area (SGA). The sharing of data includes not only the tables and rows retrieved but also the queries that are used. If the queries are identical, a parsed version of a query may already exist in the shared SQL pool, reducing the amount of time needed to process the query. Cursor-sharing enhancements in the optimizer increase the likelihood of statement reuse within the shared pool—but the application needs to be designed with statement reuse in mind.

Third, you should restrict the use of dynamic SQL. Dynamic SQL, by definition, is undefined until run time; an application’s dynamic SQL may select a couple of rows the first time, perform several full table scans of the order table the second time, and inadvertently perform a Cartesian join the third time (or consciously perform a Cartesian join using the CROSS JOIN keyword in a SELECT statement!). In addition, there is no way to guarantee that a dynamically generated SQL statement is syntactically correct until run time. Dynamically generated SQL is a double-edged sword: you have the flexibility to create your SQL on the fly based on user input, but you open yourself up to SQL injection attacks for both your in-house applications and your external website applications.

Fourth, you should minimize the number of times you open and close sessions in the database. If the application repeatedly opens a session, executes a small number of commands, and then closes the session, the performance of the SQL may be a minor factor in the overall performance. The session management may cost more than any other step in the application.

When stored procedures are used, the same code may be executed multiple times, taking advantage of the shared pool. You can also manually compile procedures, functions, and packages to avoid run-time compilation. When you create a procedure, Oracle automatically compiles it. If the procedure later becomes invalid, the database must recompile it before executing it. To avoid incurring this compilation cost at run time, use the ALTER PROCEDURE command shown here:

image
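A sketch of the command, assuming a procedure named PROCESS_ORDERS:

   alter procedure process_orders compile;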

You can view the SQL text for all procedures in a database via the TEXT column in the DBA_SOURCE view. The USER_SOURCE view will display the procedures owned by the user performing the query. Text for packages, functions, and package bodies is also accessible via the DBA_SOURCE and USER_SOURCE views, which in turn reference a table named SYS.SOURCE$.

The first two design guidelines discussed—limiting the number of user accesses and coordinating their requests—require the application developer to know as much as possible about how the data is to be used and the access paths involved. For this reason, it is critical that users be as involved in the application design as they are in the table design. If the users spend long hours drawing pictures of tables with the data modelers and little time with the application developers discussing the access paths, the application will most likely not meet the users’ needs. The access paths should be discussed as part of the data modeling exercise.

Tuning SQL

As with application design, the tuning of SQL statements seems far removed from a DBA’s duties. However, DBAs should be involved in reviewing the SQL that is written as part of the application. A well-designed application may still experience performance problems if the SQL it uses is poorly tuned. Application design and SQL problems cause most of the performance problems in properly designed databases.

The key to tuning SQL is to minimize the search path that the database uses to find the data. In most Oracle tables, each row has a ROWID associated with it. The ROWID contains information about the physical location of the row—its file, the block within that file, and the row within the database block.

When a query with no WHERE clause is executed, the database will usually perform a full table scan, reading every block from the table under the high-water mark (HWM). During a full table scan, the database locates the first block of the table and then reads sequentially through all the other blocks in the table. For large tables, full table scans can be very time-consuming.

When specific rows are queried, the database may use an index to help speed the retrieval of the desired rows. An index maps logical values in a table to their RowIDs—which in turn map them to specific physical locations. Indexes may either be unique—in which case there is no more than one occurrence for each value—or nonunique. Indexes only store RowIDs for NOT NULL values in the indexed columns.

You may index several columns together. This is called a concatenated or composite index, and it will be used if its leading column is used in the query’s WHERE clause. The optimizer can also use a “skip-scan” approach in which a concatenated index is used even if its leading column is not in the query’s WHERE clause.

Indexes must be tailored to the access path needed. Consider the case of a three-column, concatenated index. As shown in the following listing, this index is created on the CITY, STATE, and ZIP columns of the EMPLOYEE table:

image
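Such an index might be created as follows (the index name is illustrative):

   create index emp_city_st_zip
       on employee(city, state, zip);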

If a query of the form

image

is executed, then the leading column of the index (CITY) is not in the WHERE clause. Oracle can use two types of index-based accesses to retrieve the rows—a skip-scan of the index or a full scan of the index. The optimizer will select an execution path based on the index’s statistics—its size, the size of the table, and the selectivity of the index. If users will frequently run this type of query, the index’s columns may need to be reordered with STATE first in order to reflect the actual usage pattern.
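For reference, a query of the form described above might be (the literal values are illustrative):

   select ename
     from employee
    where state = 'NJ'
      and zip = '07030';     -- CITY, the index's leading column, is not referenced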

An index range scan is another index-based access method that Oracle can use to efficiently retrieve selective data. Oracle uses an index range scan when the column in the WHERE clause is compared to a constant with an equality, less-than, or greater-than condition and, for a multicolumn index, that column is the index’s leading column. No ORDER BY clause is required if you want the rows returned in the index order, as in this example, where you are looking for employees hired before August 1, 2012:

image
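A sketch of that query, assuming HIRE_DATE is the leading column of an index on EMPLOYEE:

   select ename, hire_date
     from employee
    where hire_date < to_date('01-AUG-2012', 'DD-MON-YYYY');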

However, if you are using Parallel Query to retrieve these rows using the index, then you will need the ORDER BY clause to return the rows in the index order.

It is critical that the table’s data be as ordered as possible. If users are frequently executing range queries—selecting those values that are within a specified range—then having the data ordered may require fewer data blocks to be read while resolving the query, thus improving performance. The ordered entries in the index will point to a set of neighboring blocks in the table rather than blocks that are scattered throughout the datafile(s); this assumes that the data was ordered when loaded or the query has a GROUP BY clause using the indexed column(s).

For example, consider a range query of the following type:

image
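For example, a range query against the EMPNO column might look like this:

   select empno, ename
     from employee
    where empno between 1000 and 2000;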

This range query will require fewer data blocks to be read if the physical rows in the EMPLOYEE table are ordered by the EMPNO column. To guarantee that the rows are properly ordered in the table, copy the rows to a temporary table with an ORDER BY clause, TRUNCATE the original table, and reload the rows from the newly sorted table. In addition, you should use online segment shrink to reclaim fragmented free space below the HWM for tables with frequent DML activity; this improves cache utilization and requires fewer blocks to be scanned in full table scans. You use the ALTER TABLE … SHRINK SPACE command to compact the free space in a table.
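A minimal sketch of the shrink operation (row movement must be enabled on the table first):

   alter table employee enable row movement;
   alter table employee shrink space;     -- compacts the segment and lowers the HWM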

Impact of Order on Load Rates

Indexes impact the performance of both queries and data loads. During INSERT operations, the rows’ order has a significant impact on load performance. Even in heavily indexed environments, properly ordering the rows prior to INSERT may improve load performance by 50 percent. This assumes you only have one index—you can’t load rows in index order if you have more than one index! Keep in mind that each additional index adds three times the overhead for DML operations when maintaining the index.

As an index grows, Oracle allocates new blocks. If a new index entry is added beyond the previous entry, the new entry will be added to the last block in the index. If the new entry causes Oracle to exceed the space available in that block, the entry will be moved to a new block. There is very little performance impact from this block allocation.

If the inserted rows are not ordered, new index entries will be written to existing index node blocks. If there is no more room in the block where the new value is added, and the block is not the last block in the index, the block’s entries will be split in two. Half the index entries will be left in the original block, and half will be moved to a new block. As a result, the performance suffers during loads (because of the additional space management activity) and during queries (because the index contains more unused space, requiring more blocks to be read for the same number of entries read).

NOTE

There is a significant drop in load performance when an index increases its number of internal levels. To see the number of levels, analyze an index and then select its BLEVEL column value from DBA_INDEXES.

Because of the way Oracle manages its indexes internally, load rates will be affected each time a new index is added (because it is unlikely that inserted rows will be sorted correctly for multiple columns). From a load rate perspective, favor fewer multicolumn indexes over multiple single-column indexes.

Additional Indexing Options

If the data is not very selective, you may consider using bitmap indexes. As described in Chapter 18, bitmap indexes are most effective for queries against large, static data sets with few distinct values. You can create both bitmap indexes and normal (B-tree) indexes on the same table, and Oracle will perform any necessary index conversions dynamically during query processing. See Chapter 18 for details on using bitmap indexes.

NOTE

Avoid creating bitmap indexes on tables modified by online transactions; data warehouse tables, however, are excellent candidates for bitmap indexes.

If two tables are frequently queried together, then clusters may be effective in improving performance. Clusters store rows from multiple tables in the same physical data blocks, based on their logical values (the cluster key).

Queries in which a column’s value is compared to an exact value (rather than a range of values) are called equivalence queries. A hash cluster stores a row in a specific location based on its value in the cluster key column. Every time a row is inserted, its cluster key value is used to determine in which block it should be stored; this same logic can be used during queries to quickly find data blocks that are needed for retrieval. Hash clusters are designed to improve the performance of equivalence queries; they will not be as helpful in improving the performance of the range queries discussed earlier. Performance will be significantly worse with range queries, queries that force a full table scan, or for hash clusters that are frequently updated.

Reverse indexes provide another tuning solution for equivalence queries. In a reverse index, the bytes of the index are stored in reverse order. In a traditional index, two consecutive values are stored next to each other. In a reverse index, consecutive values are not stored next to each other. For example, the values 2004 and 2005 are stored as 4002 and 5002, respectively, in a reverse index. Although not appropriate for range scans, reverse indexes may reduce contention for index blocks if many equivalence queries are performed. Reverse key indexes may need to be rebuilt quite often to perform well. They should also include a large value for PCTFREE to allow for inserts.

NOTE

You cannot reverse a bitmap index.

You can create function-based indexes on expressions involving columns. This query could not use a B-tree index on the NAME column:

image

However, the query

image

could, because the second query does not perform a function on the NAME column. Instead of creating an index on the column NAME, you can create an index on the column expression UPPER(NAME), as shown in the following example:

image
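A sketch of the function-based index and a query that can take advantage of it:

   create index emp_upper_name
       on employee(upper(name));

   select *
     from employee
    where upper(name) = 'KING';    -- can use EMP_UPPER_NAME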

Although function-based indexes can be useful, be sure to consider the following points when creating them:

- Can you restrict the functions that will be used on the column? If so, can you restrict all functions from being performed on the column?

- Do you have adequate storage space for the additional indexes?

- When you drop the table, you will be dropping more indexes (and therefore more extents) than before. How will that impact the time required to drop the table? (This is less of a consideration if you are using locally managed tablespaces, which you should be using if you’re running Oracle Database 10g or later.)

Function-based indexes are useful, but you should implement them sparingly. The more indexes you create on a table, the longer all INSERT, UPDATE, and DELETE operations will take. Of course, this applies to creating any additional indexes on a table, regardless of type.

Text indexes use Oracle’s text options (Oracle Text) to create and manage lists of words and their occurrences—similar to the way a book’s index works. Text indexes are most often used to support applications that perform searches on portions of words with wildcards.

Partitioned tables can have indexes that span all partitions (global indexes) or indexes that are partitioned along with the table partitions (local indexes). From a query-tuning perspective, local indexes may be preferable because they contain fewer entries than global indexes.

Generating Explain Plans

How can you determine which access path the database will use to perform a query? This information can be viewed via the EXPLAIN PLAN command. This command will evaluate the execution path for a query and will place its output into a table (named PLAN_TABLE) in the database. A sample EXPLAIN PLAN command is shown in the following listing:

image
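A representative EXPLAIN PLAN command (the query itself is illustrative):

   explain plan
      set statement_id = 'EMP_RANGE'
      for
   select ename
     from employee
    where empno between 7000 and 7999;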

The first line of this command tells the database that it is to explain its execution plan for the query without actually executing the query. You can optionally include a SET STATEMENT_ID clause to label the explain plan in PLAN_TABLE. Following the keyword FOR, the query to be analyzed is listed.

The account that is running this command must have access to a plan table. Oracle provides the CREATE TABLE command needed for this table in a file named utlxplan.sql, located in the $ORACLE_HOME/rdbms/admin directory. As of Oracle Database 10g, the database also creates a global temporary plan table owned by SYS, with the public synonym PLAN_TABLE, so a single plan table is available to all users.

NOTE

You should drop and re-create the plan table following each Oracle upgrade because new columns may be added by the upgrade scripts.

Query the plan table using the DBMS_XPLAN package’s DISPLAY table function:

image
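For example, to display the most recently explained statement, or a specific statement by its STATEMENT_ID:

   select * from table(dbms_xplan.display);
   select * from table(dbms_xplan.display('PLAN_TABLE', 'EMP_RANGE', 'TYPICAL'));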

You can also use the Oracle-supplied script in $ORACLE_HOME/rdbms/admin/utlxpls.sql to query the plan table for serial execution, or the $ORACLE_HOME/rdbms/admin/utlxplp.sql for parallel execution.

This query will report on the types of operations the database must perform to resolve the query. The output will show the steps of the query execution in a hierarchical fashion, illustrating the relationships between the steps. For example, you may see an index-based step that has a TABLE ACCESS BY INDEX ROWID step as its parent, indicating that the index step is processed first and the RowIDs returned from the index are used to retrieve specific rows from the table.

You can use the SET AUTOTRACE ON command in SQL*Plus to automatically generate the EXPLAIN PLAN output and trace information for every query you run. The autotrace-generated output will not be displayed until after the query has completed, whereas the EXPLAIN PLAN output is generated without running the command. To enable autotrace-generated output, a plan table must either be created in the schema in which the autotrace utility will be used or created in the SYSTEM schema with access granted to the schema that will use the autotrace utility. The script plustrce.sql, located in the $ORACLE_HOME/sqlplus/admin directory, must also be run as SYS before you can SET AUTOTRACE ON. Users must have the PLUSTRACE role enabled prior to executing SET AUTOTRACE ON. For an installation or upgrade to Oracle Database 10g or later, this script is run automatically.

NOTE

To show the EXPLAIN PLAN output without running the query, use the SET AUTOTRACE TRACEONLY EXPLAIN command.

If you use the parallel query options or query remote databases, an additional section of the SET AUTOTRACE ON output will show the text of the queries executed by the parallel query server processes or the query executed within the remote database.

To disable the autotrace feature, use the SET AUTOTRACE OFF command.

The following listing shows how to turn on autotrace and generate an explain plan:

image
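A sketch of such a session, reusing the EMPLOYEE example from earlier in this chapter:

   set autotrace on
   select ename
     from employee
    where empno between 7000 and 7999;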

To understand the explain plan, read the order of operations within the hierarchy from inside out, until you come to a set of operations at the same level of indentation; then read from top to bottom. In this example, there are no operations at the same level of indentation; therefore, you read the order of operations from inside out. The first operation is the index range scan, followed by the table access; the SELECT STATEMENT operation displays the output to the user. Each operation has an ID value (the first column) and a parent ID value (the second number; it is blank in the topmost operation). In more complex explain plans, you may need to use the parent ID values to determine the order of operations.

This plan shows that the data returned to the user comes via a TABLE ACCESS BY INDEX ROWID operation. The RowIDs are supplied by an index range scan of a unique index.

Each step is assigned a “cost.” The cost is cumulative, reflecting the cost of that step plus the costs of all its child steps. You can use the cost values to identify steps that contribute the greatest amount to the overall cost of the query and then target them for specific tuning efforts.

When evaluating the output of the EXPLAIN PLAN command, you should make sure that the most selective indexes (that is, the most nearly unique indexes) are used by the query. If a nonselective index is used, you may be forcing the database to perform unnecessary reads to resolve the query. A full discussion of SQL tuning is beyond the scope of this book, but you should focus your tuning efforts on making sure that the most resource-intensive SQL statements are using the most selective indexes possible.

In general, transaction-oriented applications (such as multi-user systems used for data entry) judge performance by the time it takes to return the first row of a query. For transaction-oriented applications, you should focus your tuning efforts on using indexes to reduce the database’s response time to the query.

If the application is batch oriented (with large transactions and reports), you should focus on improving the time it takes to complete the overall transaction instead of the time it takes to return the first row from the transaction. Improving the overall throughput of the transaction may require using full table scans in place of index accesses—and may improve the overall performance of the application.

If the application is distributed across multiple databases, focus on reducing the number of times database links are used in queries. If a remote database is frequently accessed during a query, the cost of accessing that remote database is paid each time the remote data is accessed. Even if the cost of accessing the remote data is low, accessing it thousands of times will eventually place a performance burden on your application. See the section “Reducing Network Traffic” later in this chapter for additional tuning suggestions for distributed databases.

Tuning Memory Usage

In Oracle Database 10g, you were able to use the Automatic Workload Repository (AWR) toolset to gather and manage statistical data (as described later in this chapter). As of Oracle Database 11g, new initialization parameters were introduced such as MEMORY_TARGET to further automate the overall memory used by Oracle—helping you tune the database automatically when you don’t have time to read the AWR reports! With Oracle Database 12c, you can now tune your SQL statements with “zero effort.” This seems like marketing hype at first glance, but in some cases it’s true, because the Oracle optimizer can use an adaptive execution plan by stopping a plan after initial execution and making run-time adjustments to the plan after finding variations in the initial cardinality estimates.

The data block buffer cache and the shared pool are managed via a least recently used (LRU) algorithm. A preset area is set aside to hold values; when it fills, the least recently used data is eliminated from memory and written back to disk. An adequately sized memory area keeps the most frequently accessed data in memory; accessing less frequently used data requires physical reads.

Managing SGA Pools

You can see the queries performing the logical and physical reads in the database via the V$SQL view. V$SQL reports the cumulative number of logical and physical reads performed for each query currently in the shared pool, as well as the number of times each query was executed. The following script shows the SQL text for the queries in the shared pool, with the most I/O-intensive queries listed first. The query also displays the number of logical reads (buffer gets) per execution:

image
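A sketch of such a script (column formatting commands omitted):

   select buffer_gets, disk_reads, executions,
          round(buffer_gets/greatest(executions,1)) gets_per_exec,
          sql_text
     from v$sql
    order by disk_reads desc, buffer_gets desc;    -- most I/O-intensive first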

If the shared pool has been flushed, queries executed prior to the flush will no longer be accessible via V$SQL. However, the impact of those queries can still be seen, provided the users are still logged in. The V$SESS_IO view records the cumulative logical reads and physical reads performed for each user’s session. You can query V$SESS_IO for each session’s hit ratio, as shown in the following listing:

image

image
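A sketch of a per-session hit ratio query, joined to V$SESSION for the username:

   select s.username, i.block_gets, i.consistent_gets, i.physical_reads,
          round(100*(i.block_gets + i.consistent_gets - i.physical_reads)
                   /(i.block_gets + i.consistent_gets), 2) hit_ratio
     from v$sess_io i, v$session s
    where i.sid = s.sid
      and i.block_gets + i.consistent_gets > 0;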

To see the objects whose blocks are currently in the data block buffer cache, query the X$BH table in SYS’s schema, as shown in the following query (note that the SYS and SYSTEM objects are excluded from the output so the DBA can focus on the application tables and indexes present in the SGA):

image
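One common form of that query is sketched below; it assumes that X$BH.OBJ maps to DBA_OBJECTS.DATA_OBJECT_ID and must be run as SYS:

   select o.owner, o.object_name, o.object_type, count(*) cached_blocks
     from x$bh b, dba_objects o
    where b.obj = o.data_object_id
      and o.owner not in ('SYS','SYSTEM')
    group by o.owner, o.object_name, o.object_type
    order by cached_blocks desc;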

NOTE

You can query the NAME and KIND columns from V$CACHE to see similar data if you are not connected as the SYS user.

There are multiple cache areas within the data block buffer cache:

- DEFAULT cache: This is the standard cache for objects that use the default database block size for the database.

- KEEP cache: This is dedicated to objects you wish to keep in memory at all times. In general, this area is used for small tables with few transactions. This cache is good for lookup tables for such things as state codes, ZIP codes, and salesperson data.

- RECYCLE cache: This is dedicated to objects you wish to flush from memory quickly. Like the KEEP cache, the RECYCLE cache isolates objects in memory so that they do not interfere with the normal functioning of the DEFAULT cache. The KEEP and RECYCLE cache sizes only apply to the default database block size.

- Block size-specific caches (DB_nK_CACHE_SIZE): Oracle supports multiple database block sizes within a single database; you must create a cache for each non-default database block size.

With all the areas of the SGA—the data block buffers, the dictionary cache, and the shared pool—the emphasis should be on sharing data among users. Each of these areas should be large enough to hold the most commonly requested data from the database. In the case of the shared pool, it should be large enough to hold the parsed versions of the most commonly used queries. When they are adequately sized, the memory areas in the SGA can dramatically improve the performance of individual queries and of the database as a whole.

The sizes of the KEEP and RECYCLE buffer pools do not reduce the available space in the data block buffer cache. For a table to use one of the new buffer pools, specify the name of the buffer pool via the BUFFER_POOL parameter within the table’s STORAGE clause. For example, if you want a table to be quickly removed from memory, assign it to the RECYCLE pool. The default pool is named DEFAULT, so you can use the ALTER TABLE command to redirect a table to the DEFAULT pool at a later date. Here is an example of assigning a table to the KEEP buffer pool:

images
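For example (the table name is illustrative):

   alter table state_codes storage (buffer_pool keep);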

If you do not set a size for the KEEP and RECYCLE pools, all of the data and index blocks assigned to those areas go to the default buffer cache.

You can use the LARGE_POOL_SIZE initialization parameter to specify the size of the large pool allocation heap in bytes. The large pool allocation heap is used in shared server systems for session memory, by parallel execution for message buffers, and by backup processes for I/O buffers. By default, the large pool is not created.

As of Oracle Database 10g, you can use Automatic Shared Memory Management (ASMM). To activate ASMM, set a nonzero value for the SGA_TARGET database initialization parameter. After you set SGA_TARGET to the size of the SGA you want (that is, all of the caches added together), you can then set the other cache-related parameters (DB_CACHE_SIZE, SHARED_POOL_SIZE, JAVA_POOL_SIZE, and LARGE_POOL_SIZE) each to 0; if you provide values for these parameters, those values will serve as the lower bound for the automatic tuning algorithm. Shut down and restart the database for the changes to take effect; the database will then begin actively managing the size of the different caches. You can monitor the size of the caches at any time via the V$SGASTAT dynamic performance view. Oracle Database 11g takes the automation a step farther: you can set MEMORY_TARGET to the total amount of memory available to Oracle. The amount of memory specified in MEMORY_TARGET is allocated between the SGA and PGA automatically; when MEMORY_TARGET is set, SGA_TARGET and PGA_AGGREGATE_TARGET are set to 0 and ignored.

As the workload in the database changes, the database will alter the cache sizes to reflect the needs of the application. For example, if there is a heavy batch-processing load at night and a more intensive online transaction load during the day, the database may alter the cache sizes as the load changes. These changes occur automatically, without DBA intervention. If you specify a value for a pool in your initialization parameter file, Oracle will use that as the minimum value for that pool.

NOTE

DBAs can create KEEP and RECYCLE pools in the buffer cache. KEEP and RECYCLE pools are not affected by the dynamic cache resizing and are not part of the DEFAULT buffer pool.

You may wish to selectively “pin” packages in the shared pool. Pinning packages in memory immediately after starting the database will increase the likelihood that a large enough section of contiguous free space is available in memory. As shown in the following listing, the KEEP procedure of the DBMS_SHARED_POOL package designates the packages to pin in the shared pool:

image
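A sketch of pinning a package, assuming a package named APP_OWNER.ORDER_PKG (DBMS_SHARED_POOL is installed via dbmspool.sql if it is not already present):

   execute dbms_shared_pool.keep('APP_OWNER.ORDER_PKG');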

Pinning of packages is more closely related to application management than application tuning, but it can have a performance impact. If you can avoid dynamic management of fragmented memory areas, you minimize the work Oracle has to do when managing the shared pool.

Specifying the Size of the SGA

To enable the automatic management of the caches, set the SGA_TARGET initialization parameter to the size of the SGA.

If you choose to manage the caches manually, you can set the SGA_MAX_SIZE parameter to the size of the SGA. You can then specify the sizes for the individual caches; they can be dynamically altered while the database is running via the ALTER SYSTEM command.

You can also set SGA_TARGET to a size smaller than SGA_MAX_SIZE. Oracle will use the value of SGA_TARGET to initially set the individual caches and can grow them over time to occupy more memory up to SGA_MAX_SIZE. This is a good way to determine what the total memory requirements should be before deploying your database in a production environment.

image

For example, you may specify the following:

image
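For instance, an initialization parameter file might contain settings like these (the values are illustrative and match the discussion that follows):

   SGA_MAX_SIZE     = 512M
   SHARED_POOL_SIZE = 128M
   DB_CACHE_SIZE    = 160M      # buffer cache for the default 8KB block size
   DB_4K_CACHE_SIZE = 4M        # buffer cache for tablespaces using a 4KB block size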

With these parameters, 4MB will be available for data queried from objects in tablespaces with 4KB block sizes. Objects using the standard 8KB block size will use the 160MB cache. While the database is open, you can change the SHARED_POOL_SIZE and DB_CACHE_SIZE parameter values via the ALTER SYSTEM command.

NOTE

With few exceptions, Oracle recommends a single block size of 8KB. Even on Oracle engineered systems such as Exadata, the recommended block size is 8KB.

SGA_TARGET is a dynamic parameter and can be changed through Cloud Control or with the ALTER SYSTEM command.

SGA_TARGET can be increased up to the value of SGA_MAX_SIZE. It can be reduced until any one of the auto-tuned components reaches its minimum size—either a user-specified minimum or an internally determined minimum. Both of these parameters can be used to tune the SGA.

Using the Cost-Based Optimizer

With each release of its software, Oracle has added new features to its optimizer and has enhanced existing features. Effective use of the cost-based optimizer requires that the tables and indexes in your application be analyzed regularly. The frequency with which you analyze the objects depends on the rate of change within the objects. For batch transaction applications, you should reanalyze the objects after each large set of batch transactions. For OLTP applications, you should reanalyze the objects on a time-based schedule (such as via a weekly or nightly process).

NOTE

As of Oracle Database 10g Release 1, the rule-based optimizer is deprecated. There is no reason to use it unless you are supporting a legacy application that only runs under previous versions of Oracle Database.

Statistics on objects are gathered via executions of the DBMS_STATS package’s procedures. If you analyze a table, its associated indexes are automatically analyzed as well. You can analyze a schema (via the GATHER_SCHEMA_STATS procedure) or a specific table (via GATHER_TABLE_STATS). You can also analyze only the indexed columns, thus speeding the analysis process. In general, you should analyze a table’s indexes each time you analyze the table. In the following listing, the PRACTICE schema is analyzed:

image
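A sketch of gathering statistics for the PRACTICE schema with the GATHER AUTO option:

   begin
      dbms_stats.gather_schema_stats(
         ownname => 'PRACTICE',
         options => 'GATHER AUTO');
   end;
   /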

You can view the statistics on tables and indexes via DBA_TABLES, DBA_TAB_COL_STATISTICS, and DBA_INDEXES. Some column-level statistics are still provided in DBA_TAB_COLUMNS, but they are provided there strictly for backward compatibility. The statistics for the columns of partitioned tables are found in DBA_PART_COL_STATISTICS.

NOTE

As of Oracle Database 10g, statistics are automatically gathered in a default installation using the automated maintenance tasks infrastructure (AutoTask) during maintenance windows.

When the command in the preceding listing is executed, all the objects belonging to the PRACTICE schema will be analyzed using the GATHER AUTO option of DBMS_STATS.GATHER_SCHEMA_STATS. You can also choose to estimate statistics based on a specified percentage of the table’s rows, but using the GATHER AUTO option gathers additional statistics that can further improve execution plans.

Tuning Data Access

Even if your tables are properly configured and indexed, your performance may suffer if there are wait events caused by file accesses. In the following sections, I’ll present some general recommendations related to file and tablespace configuration.

As the old saying goes, the best kind of I/O is the I/O you don’t have to do. Given that you do have to perform some I/O, your best investment in database configuration is to use Automatic Storage Management (ASM). While you can get the same general level of performance from a well-tuned OS file system (with optimal queue depths, SAN configuration, and so forth), ASM makes management of Oracle storage easier while at the same time maintaining optimal performance.

Identifying Chained Rows

When a data segment is created, a PCTFREE value is specified. The PCTFREE parameter tells the database how much space should be kept free in each data block. The free space is used when rows that are already stored in the data block extend in length via UPDATE operations.

If an UPDATE to a row causes that row to no longer completely fit in a single data block, that row may be moved to another data block, or the row may be chained to another block. If you are storing rows whose length is greater than the Oracle block size, you will automatically have chaining.

Chaining affects performance because it requires Oracle to look in multiple physical locations for data from the same logical row. By eliminating unnecessary chaining, you reduce the number of physical reads needed to return data from a datafile.

You can avoid chaining by setting the proper value for PCTFREE during the creation of data segments. The default value, 10, should be increased if your application will frequently update NULL values to non-NULL values, or if long text values are frequently updated.

If you must perform mass updates of a table, it will be much faster and more efficient to re-create the table and perform the update as part of the process. No rows will be chained or migrated utilizing this method.

You should use the ANALYZE command only to collect information about chained rows and about freelist blocks. The ANALYZE command has an option that detects and records chained rows in tables. Its syntax is

image
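For example, to record the chained rows of an assumed EMPLOYEE table:

   analyze table employee list chained rows into chained_rows;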

The ANALYZE command will put the output from this operation into a table called CHAINED_ROWS in your local schema. The SQL to create the CHAINED_ROWS table is in a file named utlchain.sql, in the $ORACLE_HOME/rdbms/admin directory. The following query will select the most significant columns from the CHAINED_ROWS table:

image
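A sketch of that query:

   select owner_name, table_name, head_rowid
     from chained_rows
    where table_name = 'EMPLOYEE';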

The output will show the RowIDs for all chained rows, allowing you to quickly see how many of the rows in the table are chained. If chaining is prevalent in a table, that table should be rebuilt with a higher value for PCTFREE.

You can see the impact of row chaining by querying V$SYSSTAT. The V$SYSSTAT entry for the “table fetch continued row” statistic will be incremented each time Oracle selects data from a chained row. This statistic will also be incremented when Oracle selects data from a spanned row—a row that is chained because it is greater than a block in length. Tables with LONG, BLOB, CLOB, and NCLOB datatypes are likely to have spanned rows. The “table fetch continued row” statistic is also available in the AWR reports (or Statspack reports in Oracle Database 10g and earlier).
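For example:

   select name, value
     from v$sysstat
    where name = 'table fetch continued row';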

In addition to chaining rows, Oracle will occasionally move rows. If a row exceeds the space available to its block, the row may be inserted into a different block. The process of moving a row from one block to another is called row migration, and the moved row is called a migrated row. During row migration, Oracle has to dynamically manage space in multiple blocks and access the freelist (the list of blocks available for INSERT operations). A migrated row does not appear as a chained row, but it does impact the performance of your transactions. See Chapter 6 for an example of using the DBMS_ADVISOR package to find and reorganize tables with chained rows.

TIP

Accessing a migrated row increments the count in the “table fetch continued row” statistic.

Using Index-Organized Tables

An index-organized table (IOT) is an index in which an entire row is stored, rather than just the key values for the row. Rather than storing a RowID for the row, the primary key for the row is treated as the row’s logical identifier. Rows in IOTs do not have RowIDs.

Within the IOT, the rows are stored sorted by their primary key values. Thus, any range query that is based on the primary key may benefit because the rows are stored near each other (see the section “Tuning SQL” earlier in this chapter for the steps involved in ordering the data within normal tables). Additionally, any equivalence query based on the primary key may benefit because the table’s data is all stored in the index. In the traditional table/index combination, an index-based access requires an index access followed by a table access. In an IOT, only the IOT is accessed; there is no companion index.

However, the performance gains from a single index access in place of a normal index/table combination access may be minimal—any index-based access should be fast. To help improve performance further, index-organized tables offer additional features:

- An overflow area: By setting the PCTTHRESHOLD parameter when the IOT is created, you can store the primary key data apart from the row data. If the row’s data exceeds the threshold of available space in the block, it will dynamically be moved to an overflow area. You can designate the overflow area to be in a separate tablespace, improving your ability to distribute the I/O associated with the table.

- Secondary indexes: You can create secondary indexes on the IOT. Oracle will use the primary key values as the logical RowIDs for the rows.

- Reduced storage requirements: In a traditional table/index combination, the same key values are stored in two places. In an IOT, they are stored once, reducing the storage requirements.

TIP

When specifying an overflow area, you can use the INCLUDING clause to name the last column that will be stored in the index segment; all columns that follow it in the table definition will be stored in the overflow area:

image
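A sketch of such a definition, assuming an ORDERS_IOT table in which ORDER_MODE is the last column kept in the index segment:

   create table orders_iot (
      order_id     number primary key,
      order_mode   varchar2(10),
      order_date   date,
      order_notes  varchar2(1000))
   organization index
   including order_mode                  -- columns after ORDER_MODE go to the overflow segment
   overflow tablespace orders_ovf;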

Both ORDER_DATE and ORDER_NOTES will be stored in the overflow area.

To create an IOT, use the ORGANIZATION INDEX clause of the CREATE TABLE command. You must specify a primary key when creating an IOT. Within an IOT, you can drop columns or mark them as inactive via the SET UNUSED clause of the ALTER TABLE command.

Tuning Issues for Index-Organized Tables

Like indexes, IOTs may become internally fragmented over time, as values are inserted, updated, and deleted. To rebuild an IOT, use the MOVE clause of the ALTER TABLE command. In the following example, the EMPLOYEE_IOT table is rebuilt, along with its overflow area:

image
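A sketch of the rebuild, using assumed tablespace names:

   alter table employee_iot
     move tablespace iot_data
     overflow tablespace iot_ovf;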

You should avoid storing long rows of data in IOTs. In general, you should avoid using an IOT if the data is longer than 75 percent of the database block size. If the database block size is 4KB, and your rows will exceed 3KB in length, you should investigate the use of normal tables and indexes instead of IOTs. The longer the rows are, and the more transactions are performed against the IOT, the more frequently it will need to be rebuilt.

NOTE

You cannot use LONG datatypes in IOTs, but you can use LOBs. You shouldn’t be using LONG datatypes anymore anyway; their use has been deprecated for several releases. CLOBs have all the functionality of LONG datatypes and more.

As noted earlier in this chapter, indexes impact data load rates. For best results, the primary key index of an IOT should be loaded with sequential values to minimize the costs of index management.

Tuning Data Manipulation

Several data manipulation tasks—usually concerning the manipulation of large quantities of data—may involve the DBA. You have several options when loading and deleting large volumes of data, as described in the following sections.

Bulk Inserts: Using the SQL*Loader Direct Path Option

When used in the Conventional Path mode, SQL*Loader reads records from a file, generates INSERT commands, and passes them to the Oracle kernel. Oracle then finds places for those rows in free blocks in the table and updates any associated indexes.

In Direct Path mode, SQL*Loader creates formatted data blocks and writes directly to the datafiles. This requires occasional checks with the database to get new locations for data blocks, but no other I/O with the database kernel is required. The result is a data load process that is dramatically faster than Conventional Path mode.

If the table is indexed, the indexes will be placed in DIRECT PATH state during the load. After the load is complete, the new keys (index column values) will be sorted and merged with the existing keys in the index. To maintain the temporary set of keys, the load will create a temporary index segment that is at least as large as the largest index on the table. The space requirements for this can be minimized by presorting the data and using the SORTED INDEXES clause in the SQL*Loader control file.

To minimize the amount of dynamic space allocation necessary during the load, the data segment that you are loading into should already be created, with all the space it will need already allocated. You should also presort the data on the columns of the largest index in the table. Sorting the data and leaving the indexes on the table during a Direct Path load will usually yield better performance than if you were to drop the indexes before the load and then re-create them after it completed.

However, keep in mind that direct path load operations always write into newly allocated extents above the table’s high-water mark. Therefore, if you perform parallel DELETEs and then follow them with parallel direct path loads, the space freed by the deletes is never reused, and the disk space allocated to the table will increase much faster than you expect.

To take advantage of the Direct Path option, the table cannot be clustered, and there can be no other active transactions against it. During the load, only NOT NULL, UNIQUE, and PRIMARY KEY constraints will be enforced; after the load has completed, the CHECK and FOREIGN KEY constraints can be automatically reenabled. To force this to occur, use the REENABLE DISABLED_CONSTRAINTS clause in the SQL*Loader control file.

The only exception to this reenabling process is that table insert triggers, when reenabled, are not executed for each of the new rows in the table. A separate process must manually perform whatever commands were to have been performed by this type of trigger.

The SQL*Loader Direct Path loading option provides significant performance improvements over the SQL*Loader Conventional Path loader in loading data into Oracle tables by bypassing SQL processing, buffer cache management, and unnecessary reads for the data blocks. The Parallel Data Loading option of SQL*Loader allows multiple processes to load data into the same table, utilizing spare resources on the system and thereby reducing the overall elapsed times for loading. Given enough CPU and I/O resources, this can significantly reduce the overall loading times.

To use Parallel Data Loading, start multiple SQL*Loader sessions using the PARALLEL keyword (otherwise, SQL*Loader puts an exclusive lock on the table). Each session is an independent session requiring its own control file. The following listing shows an example of a Direct Path load using the DIRECT=TRUE parameter on the command line:

image
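A sketch of such command lines, in which the scott/tiger account and the control file names are assumptions:

sqlldr scott/tiger control=part1.ctl direct=true parallel=true
sqlldr scott/tiger control=part2.ctl direct=true parallel=true
sqlldr scott/tiger control=part3.ctl direct=true parallel=true

Each command is started from its own operating system session so that the three loads run concurrently against the same table.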

Each session creates its own log, bad, and discard files (part1.log, part2.log, part3.log, part1.bad, part2.bad, and so on) by default. Because you have multiple sessions loading data into the same table, only the APPEND option is allowed for Parallel Data Loading. The SQL*Loader REPLACE, TRUNCATE, and INSERT options are not allowed for Parallel Data Loading. If you need to delete the table’s data before starting the load, you must manually delete the data (via DELETE or TRUNCATE commands). You cannot use SQL*Loader to delete the rows automatically if you are using Parallel Data Loading.

NOTE

If you use Parallel Data Loading, indexes are not maintained by the SQL*Loader session. Before starting the loading process, you must drop all indexes on the table and disable all its PRIMARY KEY and UNIQUE constraints. After the loads complete, you can re-create the table’s indexes.

In serial Direct Path loading (PARALLEL=FALSE), SQL*Loader loads data into extents in the table. If the load process fails before the load completes, some data could be committed to the table prior to the process failure. In Parallel Data Loading, each load process creates temporary segments for loading the data. The temporary segments are later merged with the table. If a Parallel Data Loading process fails before the load completes, the temporary segments will not have been merged with the table. If the temporary segments have not been merged with the table being loaded, no data from the load will have been committed to the table.

You can use the SQL*Loader FILE parameter to direct each data loading session to a different datafile. By directing each loading session to its own datafile, you can balance the I/O load of the loading processes. Data loading is very I/O intensive and must be distributed across multiple disks for parallel loading to achieve significant performance improvements over serial loading.

After a Parallel Data Load, each session may attempt to reenable the table’s constraints. As long as at least one load session is still underway, attempting to reenable the constraints will fail. The final loading session to complete should attempt to reenable the constraints, and should succeed. You should check the status of your constraints after the load completes. If the table being loaded has PRIMARY KEY and UNIQUE constraints, you can create the associated indexes in parallel prior to enabling the constraints.

Bulk Data Moves: Using External Tables

You can query data from files outside the database via an object called an external table. An external table’s structure is defined via the ORGANIZATION EXTERNAL clause of the CREATE TABLE command; its syntax closely resembles the SQL*Loader control file syntax.

You cannot manipulate rows in an external table, and you cannot index it—every access of the table results in a full table scan (that is, a full scan of the file at the operating system level). As a result, the performance of queries against external tables tends to be worse than that of queries against tables stored within the database. However, external tables offer a couple of potential benefits for systems that load large sets of data:

Image Because the data is not stored within the database, the data is only stored once (outside the database, rather than both outside and inside the database), thus saving space.

Image Because the data is never loaded into the database, the data-loading time is eliminated.

Given that you cannot index external tables, they are most useful for operations in which large volumes of data are accessed once serially by batch programs. For example, many data warehousing environments have a staging area in which data is loaded into temporary tables prior to rows being inserted into the tables users will query. Instead of loading the data into those temporary tables, you can access the operating system files directly via external tables, saving time and space.

From an architectural perspective, external tables allow you to focus your database contents on the objects users will most commonly use—small codes tables, aggregation tables, and transaction tables—while keeping very large data sets outside the database. You can replace the files accessed by the external tables at any time without incurring any transaction overhead within the database.

Bulk Inserts: Common Traps and Successful Tricks

If your data is not being inserted from a flat file, SQL*Loader will not be a useful solution. For example, if you need to move a large set of data from one table to another, you will likely want to avoid having to write the data to a flat file and then read it back into the database. The fastest way to move data in your database is to move it from one table to another without going out to the operating system.

When you’re moving data from one table to another, there are several common methods for improving the performance of the data migration:

Image Tuning the structures (removing indexes and triggers)

Image Disabling constraints during the data migration

Image Using hints and options to improve the transaction performance

The first of the tips, tuning the structures, involves disabling any triggers or indexes that are on the table into which data is being loaded. For example, if you have a row-level trigger on the target table, that trigger will be executed for every row inserted into the table. If possible, disable the triggers prior to the data load. If the trigger should be executed for every inserted row, you may be able to do a bulk operation once the rows have been inserted, rather than a repeated operation during each INSERT. If properly tuned, the bulk operation will complete faster than the repeated trigger executions. You will need to be sure that the bulk operations execute for all rows that have not already been processed by the triggers.

In addition to disabling triggers, you should disable the indexes on the target table prior to starting the data load. If the indexes are left on the table, Oracle will dynamically manage the indexes as each row is inserted. Rather than continuously manage the index, drop it prior to the start of the load and re-create it when the load has completed.

NOTE

Disabling indexes and triggers resolves most of the performance problems associated with large table-to-table data migration efforts.

In addition to disabling indexes, you should consider disabling constraints on the table. If the source data is already in a table in the database, you can check that data for its adherence to your constraints (such as foreign keys or CHECK constraints) prior to loading it into your target table. Once the data has been loaded, you can reenable the constraints.

If none of those options gives you adequate performance, you should investigate the options Oracle has introduced for data migration tuning. Those options include the following:

Image The APPEND hint for INSERT commands Like the SQL*Loader Direct Path option, the APPEND hint loads blocks of data into a table above the table's high-water mark (HWM), bypassing the free space in existing blocks. Use of the APPEND hint will therefore increase the table's space usage.

Image The NOLOGGING option If you are performing a CREATE TABLE AS SELECT command, use the NOLOGGING option to avoid writing to the redo logs during the operation. If the database utilizes a standby (Data Guard) server, then FORCE LOGGING will be the default behavior so that the inserted data will be logged into the redo logs regardless of the use of the NOLOGGING option.

Image The PARALLEL option Parallel Query uses multiple processes to accomplish a single task. For a CREATE TABLE AS SELECT operation, you can parallelize both the CREATE TABLE portion and the query. If you use the PARALLEL option, you should also use the NOLOGGING option; otherwise, the parallel operations will have to wait due to serialized writes to the online redo log files. However, Oracle best practices suggest logging all operations in a production database.

Before using any of these advanced options, you should first investigate the target table’s structures to make sure you’ve avoided the common traps cited earlier in this section.

You can also use programming logic to force INSERTs to be processed in arrays rather than as an entire set. For example, PL/SQL, Java, and C support array INSERTs, thus reducing the size of the transactions required to process a large set of data.

Bulk Deletes: The TRUNCATE Command

Occasionally, users attempt to delete all the rows from a table at once. When they encounter errors during this process, they complain that the rollback segments are too small, when in fact their transaction is too large.

A second problem occurs once the rows have all been deleted. Even though the segment no longer has any rows in it, it still maintains all the space that was allocated to it. Therefore, deleting all those rows saved not a single byte of allocated space.

The TRUNCATE command resolves both of these problems. It is a DDL command, not a DML command, so it cannot be rolled back. Once you have used the TRUNCATE command on a table, its rows are gone, and none of its DELETE triggers are executed in the process. However, the table retains all its dependent objects—such as grants, indexes, and constraints.

The TRUNCATE command is the fastest way to delete large volumes of data. Because it will delete all the rows in a table, this may force you to alter your application design so that no protected rows are stored in the same table as the rows to be deleted. If you use partitions, you can truncate one partition of a table without affecting the rest of the table’s partitions (see Chapter 18).

A sample TRUNCATE command for a table is shown here:

image
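A minimal form of the command, with the DROP STORAGE clause written out even though it is the default:

TRUNCATE TABLE employee DROP STORAGE;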

This example, in which the EMPLOYEE table’s rows are deleted, shows a powerful feature of TRUNCATE. The DROP STORAGE clause is used to deallocate the non-INITIAL space from the table (this is the default option). Therefore, you can delete all of a table’s rows and reclaim all but its initial extent’s allocated space, without dropping the table.

The TRUNCATE command also works for clusters. In this example, the REUSE STORAGE option is used to leave all allocated space empty within the segment that acquired it:

image
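A sketch of the command, using the cluster name from the text:

TRUNCATE CLUSTER emp_dept REUSE STORAGE;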

When this example command is executed, all the rows in the EMP_DEPT cluster will be instantly deleted.

To truncate a partition, you need to know its name. In the following example, the partition named PART3 of the EMPLOYEE table is truncated via the ALTER TABLE command:

image
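A sketch of the partition-level truncation:

ALTER TABLE employee TRUNCATE PARTITION part3;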

The rest of the partitions of the EMPLOYEE table will be unaffected by the truncation of the PART3 partition. See Chapter 18 for details on creating and managing partitions.

As an alternative, you can create a PL/SQL program that uses dynamic SQL to divide a large DELETE operation into multiple smaller transactions with a COMMIT after each group of rows.

Using Partitions

You can use partitions to isolate data physically. For example, you can store each month’s transactions in a separate partition of an ORDERS table. If you perform a bulk data load or deletion on the table, you can customize the partitions to tune the data manipulation operation. For example:

Image You can truncate a partition and its indexes without affecting the rest of the table.

Image You can drop a partition, via the DROP PARTITION clause of the ALTER TABLE command.

Image You can drop a partition’s local index.

Image You can set a partition to NOLOGGING, reducing the impact of large transactions.

From a performance perspective, the chief advantage of partitions lies in their ability to be managed apart from the rest of the table. For example, being able to truncate a partition enables you to delete a large amount of data from a table (but not all of the table’s data) without generating any redo information. In the short term, the beneficiary of this performance improvement is the DBA; in the longer term, the entire enterprise benefits from the improved availability of the data. See Chapter 18 for details on implementing partitions and subpartitions.

You can use the EXCHANGE PARTITION option to greatly reduce the impact your data-loading processes have on system availability. Start by creating an empty table that has the same column structure as your partitioned table. Load your data into the new table and then analyze the new table. Create indexes on the new table to match the partitioned table’s indexes; the indexes must be local, and not global, indexes. When these steps are complete, alter the partitioned table using the EXCHANGE PARTITION clause to exchange an empty partition with the new table you populated. All the loaded data will now be accessible via the partitioned table. There is little impact to the system availability during this step because it is a DDL operation.

Reducing Network Traffic

As databases and the applications that use them become more distributed, the network that supports the servers may become a bottleneck in the process of delivering data to the users. Because DBAs typically have little control over the network management, it is important to use the database’s capabilities to reduce the number of network packets required for the data to be delivered. Reducing network traffic will reduce your reliance on the network and thus eliminate a potential cause of performance problems.

Replication of Data Using Materialized Views

You can manipulate and query data from remote databases. However, it is not desirable to have large volumes of data constantly sent from one database to another. To reduce the amount of data being sent across the network, you should consider different data replication options.

In a purely distributed environment, each data element exists in one database. When data is required, it is accessed from remote databases via database links. This purist approach is similar to implementing an application strictly in third normal form—an approach that will not easily support any major production application. Modifying the application’s tables to improve data retrieval performance involves denormalizing data. The denormalization process deliberately stores redundant data in order to shorten users’ access paths to the data.

In a distributed environment, replicating data accomplishes this goal. Rather than force queries to cross the network to resolve user requests, selected data from remote servers is replicated to the local server. This can be accomplished via a number of means, as described in the following sections.

Replicated data is out of date as soon as it is created. Replicating data for performance purposes is therefore most effective when the source data is very infrequently changed or when the business processes can support the use of old data.

Oracle’s distributed capabilities offer a means of managing the data replication within a database. Materialized views replicate data from a master source to multiple targets. Oracle provides tools for refreshing the data and updating the targets at specified time intervals.

Materialized views may be read-only or updatable. The management issues for materialized views are covered in Chapter 19; in this section, you will see their performance-tuning aspects.

Before creating a materialized view for replication, you should first create a database link to the source database. The following example creates a private database link called HR_LINK, using the LOC service name:

image
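A sketch of such a link; the HR account name and its password are assumptions, while the link name and service name come from the text:

CREATE DATABASE LINK hr_link CONNECT TO hr IDENTIFIED BY hr_password USING 'loc';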

The CREATE DATABASE LINK command, as shown in this example, has several parameters:

Image The name of the link (HR_LINK, in this case).

Image The account to connect to.

Image The service name of the remote database (as found in the tnsnames.ora file for the server). In this case, the service name is LOC.

Materialized views automate the data replication and refresh processes. When materialized views are created, a refresh interval is established to schedule refreshes of replicated data. Local updates can be prevented, and transaction-based refreshes can be used. Transaction-based refreshes, available for many types of materialized views, send from the master database only those rows that have changed for the materialized view. This capability, described later in this chapter, may significantly improve the performance of your refreshes.

The syntax used to create the materialized view on the local server is shown in the following example, where the materialized view is given a name (LOCAL_EMP) and its storage parameters are specified. Its base query is given as well as its refresh interval. In this case, the materialized view is told to immediately retrieve the master data and then to perform the refresh operation again in seven days (SYSDATE+7).

image
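A sketch of such a materialized view; the tablespace name and the SELECT * against an EMPLOYEE table are assumptions, and a fast refresh also presumes a materialized view log exists on the master table:

CREATE MATERIALIZED VIEW local_emp
  TABLESPACE users
  BUILD IMMEDIATE
  REFRESH FAST
  START WITH SYSDATE NEXT SYSDATE + 7
  AS SELECT * FROM employee@hr_link;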

The REFRESH FAST clause tells the database to use a materialized view log to refresh the local materialized view. The ability to use materialized view logs during refreshes is only available if the materialized view’s base query is simple enough that Oracle can determine which row in the materialized view will change when a row changes in the source tables.

When a materialized view log is used, only the changes to the master table are sent to the targets. If you use a complex materialized view, you must use the REFRESH COMPLETE clause in place of the REFRESH FAST clause. In a complete refresh, the refresh completely replaces the existing data in the materialized view’s underlying table.

Materialized view logs must be created in the master database, via the CREATE MATERIALIZED VIEW LOG command. An example of the CREATE MATERIALIZED VIEW LOG command is shown here:

image
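A sketch of the command, run in the master database against an assumed EMPLOYEE master table:

CREATE MATERIALIZED VIEW LOG ON employee;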

The materialized view log is always created in the same schema as the master table.

You can use simple materialized views with materialized view logs to reduce the amount of network traffic involved in maintaining the replicated data. Because only the changes to the data will be sent via a materialized view log, the maintenance of simple materialized views should use fewer network resources than complex materialized views require, particularly if the master tables are large, fairly static tables. If the master tables are not static, the volume of transactions sent via the materialized view log may not be any less than would be sent to perform a complete refresh. For details on the refresh capabilities of materialized views, see Chapter 19.

Regardless of the refresh option chosen, you should index the materialized view’s base table to optimize queries against the materialized view. From a performance perspective, your goal is to present the users with the data they want in the format they want as quickly as possible. By creating materialized views on remote data, you can avoid traversing database links during queries. By creating materialized views on local data, you can prevent users from repeatedly aggregating large volumes of data, presenting them instead with pre-aggregated data that answers their most common queries.

Using Remote Procedure Calls

When using procedures in a distributed database environment, you can use one of two options: to create a local procedure that references remote tables or to create a remote procedure that is called by a local application.

The proper location for the procedure depends on the distribution of the data and the way the data is to be used. The emphasis should be on minimizing the amount of data that must be sent through the network to resolve the data request. The procedure should reside within the database that contains most of the data used during the procedure’s operations.

For example, consider this procedure:

image
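A sketch of such a procedure follows. The column names (EMPNO, SALARY) are assumptions, and the second parameter is named RAISE_AMT here because RAISE is a PL/SQL keyword; the surrounding text refers to this parameter simply as RAISE.

CREATE OR REPLACE PROCEDURE my_raise
  (my_emp_no IN NUMBER, raise_amt IN NUMBER) AS
  v_salary NUMBER;
BEGIN
  -- the FROM clause references the remote EMPLOYEE table via the HR_LINK database link
  SELECT salary INTO v_salary
    FROM employee@hr_link
   WHERE empno = my_emp_no;

  UPDATE employee@hr_link
     SET salary = v_salary + raise_amt
   WHERE empno = my_emp_no;
END;
/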

In this case, the procedure only accesses a single table (EMPLOYEE) on a remote node (as indicated by the database link HR_LINK). To reduce the amount of data sent across the network, move this procedure to the remote database identified by the database link HR_LINK and remove the reference to that database link from the FROM clause in the procedure. Then, call the procedure from the local database by using the database link, as shown here:

image
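A sketch of the call, using the parameter values discussed next:

EXECUTE my_raise@hr_link(1234, 2000);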

In this case, two parameters are passed to the procedure—MY_EMP_NO is set to 1234, and RAISE is set to 2000. The procedure is invoked using a database link to tell the database where to find the procedure.

The tuning benefit of performing a remote procedure call is that all of the procedure’s processing is performed in the database where the data resides. The remote procedure call minimizes the amount of network traffic necessary to complete the procedure’s processing.

To maintain location transparency, you may create a local synonym that points to the remote procedure. The database link name will be specified in the synonym so that user requests will automatically use the remote database:

image
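A sketch of such a synonym:

CREATE SYNONYM my_raise FOR my_raise@hr_link;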

A user could then enter the command

image
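For example, with the same parameter values used earlier:

EXECUTE my_raise(1234, 2000);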

and it would execute the remote procedure defined by the synonym MY_RAISE.

Using the Automatic Workload Repository

In releases before Oracle Database 10g, Statspack gathered and reported on database statistics, albeit in a strictly text-based format. As of Oracle Database 10g, the Automatic Workload Repository (AWR) enhances the Statspack concept, generating all the statistics found in Statspack and more. In addition, the AWR is highly integrated with Oracle Enterprise Manager (OEM), making it easy to analyze and fix a performance problem.

NOTE

Statspack is still available in Oracle Database 12c as a free option. To use AWR reports, you must be licensed for the Diagnostics pack.

Like Statspack, the AWR collects and maintains performance statistics for problem detection and self-tuning purposes. You can generate reports on the AWR data, and you can access it via views and through Cloud Control 12c. You can report on recent session activity as well as the overall system statistics and SQL usage.

The AWR captures the system statistics on an hourly basis (taking “snapshots” of the database) and stores the data in its repository tables. As with Statspack, the space requirements of the AWR will increase as the historical retention period is increased or the interval between snapshots is decreased. By default, seven days’ worth of data is maintained in the AWR. You can see the snapshots that are stored in the AWR via the DBA_HIST_SNAPSHOT view.
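For example, a quick query along these lines lists the retained snapshots and their time ranges:

SELECT snap_id, begin_interval_time, end_interval_time
  FROM dba_hist_snapshot
 ORDER BY snap_id;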

To enable the AWR, set the STATISTICS_LEVEL initialization parameter to TYPICAL or ALL. If you set STATISTICS_LEVEL to BASIC, you can take manual snapshots of the AWR data, but they will not be as comprehensive as those performed automatically by the AWR. Setting STATISTICS_LEVEL to ALL adds timed OS statistics and plan execution statistics to those gathered with the TYPICAL setting.

Managing Snapshots

To take a manual snapshot, use the CREATE_SNAPSHOT procedure of the DBMS_WORKLOAD_REPOSITORY package:

image
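A sketch of the call:

EXECUTE DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT;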

To alter the snapshot settings, use the MODIFY_SNAPSHOT_SETTINGS procedure. You can modify the retention (in minutes) and the interval (in minutes) for snapshots. The following example changes the interval to 30 minutes for the current database:

image
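A sketch of the call; only the interval is changed here:

EXECUTE DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS (interval => 30);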

To drop a range of snapshots, use the DROP_SNAPSHOT_RANGE procedure, specifying the start and end of the snapshot IDs to drop:

image
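A sketch of the call; the snapshot IDs are assumptions:

EXECUTE DBMS_WORKLOAD_REPOSITORY.DROP_SNAPSHOT_RANGE (low_snap_id => 101, high_snap_id => 110);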

Managing Baselines

You can designate a set of snapshots as a baseline for the performance of the system. The baseline data will be retained for later comparisons with snapshots. Use the CREATE_BASELINE procedure to specify the beginning and ending snapshots for the baseline:

image
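A sketch of the call; the snapshot IDs and baseline name are assumptions:

EXECUTE DBMS_WORKLOAD_REPOSITORY.CREATE_BASELINE (start_snap_id => 101, end_snap_id => 110, baseline_name => 'peak_baseline');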

When you create a baseline, Oracle will assign an ID to the baseline; you can view past baselines via the DBA_HIST_BASELINE view. The snapshots you specify for the beginning and ending of the baseline are maintained until you drop the baseline. To drop the baseline, use the DROP_BASELINE procedure:

image
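A sketch of the call, using the same assumed baseline name:

EXECUTE DBMS_WORKLOAD_REPOSITORY.DROP_BASELINE (baseline_name => 'peak_baseline', cascade => FALSE);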

If you set the CASCADE parameter of the DROP_BASELINE procedure to TRUE, the related snapshots will be dropped when the baseline is dropped.

You can see the AWR data via Cloud Control 12c or via the data dictionary views listed earlier in this section. Additional views supporting the AWR include V$ACTIVE_SESSION_HISTORY (sampled every second), DBA_HIST_SQL_PLAN (execution plans), and DBA_HIST_WR_CONTROL (for the AWR settings).

Generating AWR Reports

You can generate reports from the AWR either via Cloud Control 12c or via the reporting scripts provided. The awrrpt.sql script generates a report based on the differences in statistics between the beginning and ending snapshots. A second report, awrrpti.sql, displays a report based on the beginning and ending snapshots for a specified database and instance.

Both awrrpt.sql and awrrpti.sql are located in the $ORACLE_HOME/rdbms/admin directory. When you execute a report (from any DBA account), you will be prompted for the type of report (HTML or text), the number of days for which snapshots will be listed, the beginning and ending snapshot IDs, and the name for the output file. For RAC environments, you can use awrgrpt.sql to report most of the statistics across all instances.

Running the Automatic Database Diagnostic Monitor Reports

Rather than relying on manual reporting against the AWR table (much as you did with Statspack in previous versions of Oracle), you can use the Automatic Database Diagnostic Monitor (ADDM). Because it is based on AWR data, ADDM requires that the STATISTICS_LEVEL parameter be set (either to TYPICAL or ALL, as recommended earlier). You can access ADDM via the Performance Analysis section of Cloud Control 12c, or you can run an ADDM report manually.

To run ADDM against a set of snapshots, use the addmrpt.sql script located in the $ORACLE_HOME/rdbms/admin directory.

NOTE

You must have the ADVISOR system privilege in order to execute ADDM reports.

Within SQL*Plus, execute the addmrpt.sql script. You will be prompted for the beginning and ending snapshot IDs for the analysis and a name for the output file.

To view the ADDM data, you can use Cloud Control 12c or the advisor data dictionary views. The advisor views include DBA_ADVISOR_TASKS (existing tasks), DBA_ADVISOR_LOG (status and progress on tasks), DBA_ADVISOR_RECOMMENDATIONS (completed diagnostic tasks plus recommendations), and DBA_ADVISOR_FINDINGS. You can implement the recommendations to address the findings identified via ADDM. Figure 8-1 shows a typical AWR report, generated from the default baseline; in this example, the snapshot began on 14-Sep-2013 and ended on 22-Sep-2013. This database seems to be lightly loaded with plenty of CPU and memory resources; for example, latch contention is nonexistent, and there is enough memory to perform all sorting without using disk.


FIGURE 8-1. Sample AWR report via Cloud Control 12c

Using Automatic SQL Tuning Advisor

New as of Oracle Database 11g, Automatic SQL Tuning Advisor runs during the default maintenance window (using AutoTask) and targets the highest-load SQL statements collected in the AWR. Once the automatic SQL tuning begins during a maintenance window, the following steps are performed by the Automatic SQL Tuning Advisor:

1. Identify repeated high-load SQL from AWR statistics. Recently tuned SQL and recursive SQL are ignored.

2. Tune high-load SQL using calls to the SQL Tuning Advisor.

3. Create SQL Profiles for the high-load SQL; performance is tested both with and without the profile.

4. If the performance is better by at least a factor of three, automatically keep the profile; otherwise, note the improvement in the tuning report.

Figure 8-2 shows a summary of the Advisor tasks from Advisor Central; in this example, you can see a summary of the results for the Automatic Database Diagnostic Monitor (ADDM), Segment Advisor, and SQL Tuning Advisor.


FIGURE 8-2. Cloud Control 12c Advisor Central summary

Clicking the SQL Tuning Advisor result link, you can see the Automatic SQL Tuning Result Summary in Figure 8-3. On this database, the SQL Tuning Advisor found 124 potential SQL statements that could be improved by implementing a SQL Profile, adding one or more indexes, collecting statistics more often, or rewriting the SQL statement.


FIGURE 8-3. Automatic SQL Tuning Advisor results

Performance Tuning in a Multitenant Environment

In Chapter 11 I cover the basics of Oracle's multitenant architecture, including the different types of containers available: a multitenant container database (CDB) comprises, at a minimum, the root container and a seed database, plus zero or more pluggable databases (PDBs). Oracle Database 12c databases can also be created as standalone (non-CDB) databases and later converted to PDBs. I also distinguish between common and local users: common users have privileges across all PDBs within a container, whereas local users see their PDB as if it were a standalone database (non-CDB). In a multitenant environment, the traditional USER_, ALL_, and DBA_ data dictionary views are supplemented with CDB_ views that are visible across the entire container to common users.

As you might expect, tuning a multitenant container database (the container itself or one of the PDBs) is much like tuning a non-CDB in that you’re tuning a single instance with many different applications (PDBs) sharing and competing for the same server resources. This is in line with the multitenant database architecture in that there is minimal or no difference between a CDB and a non-CDB from a usage, compatibility, and tuning perspective.

The keys to performance tuning a PDB are monitoring and resource allocation. Not only must you tune individual SQL statements within a PDB, but you also must decide what percentage of the server’s resources a PDB can have if all PDBs are active. Many of the same tools you use for a non-CDB database are also used for a CDB, such as the SQL Tuning Advisor and the familiar AWR and ADDM reports. The big difference in a CDB environment is that tuning SQL statements happens at the PDB level, whereas the AWR, ASH, and ADDM reports are at the instance (CDB) level.

As you will find out in Chapter 11, some initialization parameters can also be set at the PDB level if the default value at the CDB level is not appropriate. I’ll show you how to change some of these parameters at the PDB level in a tuning scenario.

Even though there is database activity at the CDB level, the bulk of the activity should be occurring in each PDB from a logical perspective. Remember that from an instance perspective, it’s still one database instance. Therefore, the standard Oracle tuning methodologies still apply to a CDB environment.

At the CDB level, you want to optimize the amount of memory you need to host one or more PDBs; that’s the reason you’re using a multitenant environment in the first place! In the following sections, I’ll review the standard tuning methodologies as well as how you can change initialization parameters at the PDB level. Using performance reports such as the ASH, ADDM, and AWR reports helps you identify performance issues for the CDB, and using the SQL Tuning Advisor helps you optimize your SQL activity within each PDB.

Tuning Methodology

The standard Oracle tuning methodology, developed and refined over the last several releases of Oracle Database, still applies to multitenant environments. The overall steps are as follows:

1. Identify the tuning goal:

a. Reduce elapsed time for individual queries?

b. Increase the number of users without buying new hardware?

c. Optimize memory usage?

d. Reduce disk space usage (compression, normalization)?

2. Determine the cause of the bottleneck (OS, network, I/O); usually it’s one cause.

3. Tune the application from the top down:

a. Review the application design.

b. Modify the database design (denormalization, materialized views, data warehouse).

c. Tune the SQL code.

d. Tune the instance.

4. Use database analysis tools once you finish step 3:

a. Collect statistics at the instance and OS levels.

b. Use the AWR report to identify wait events and poorly performing SQL.

c. Use the SQL Tuning Advisor, memory advisors, and other advisors to tune and reduce elapsed time, CPU, and I/O.

5. After tuning one or more components, start again at step 2 if the tuning goal has not yet been met.

The biggest focus of any tuning effort is identifying when to stop tuning. In other words, you need to identify the goal for a tuning effort after several users complain that “the database is slow.” Identifying the performance issue, reevaluating the service-level agreements (SLAs) in place, and monitoring database growth and the user base are important factors in deciding how much time you want to spend tuning a database (CDB or non-CDB) before you reach the decision that you need faster hardware, a faster network, or a new database design.

Sizing the CDB

Adjusting parameters at the CDB level is much like tuning a single instance in a non-CDB environment that has several applications with different resource and availability requirements. It’s worth mentioning again that a CDB is a single database instance, but with the added features of the multitenant environment, you have much more control over resource consumption among the several applications (each in their own PDB) in addition to the strong isolation between the applications from a security perspective.

Tuning CDB Memory and CPU Resources

Tuning the memory in a CDB means you’re changing the same memory areas as in a non-CDB:

Image Buffer cache (SGA)

Image Shared pool (SGA)

Image Program Global Area (PGA)

When you calculate the memory requirements for a CDB, your first estimate should be the sum of all corresponding memory requirements for each non-CDB that will become a PDB. Of course, you will eventually want to reduce the total memory footprint for the CDB based on a number of factors. For example, not all PDBs will be active at the same time; therefore, you will likely not need as much total memory allocated to the CDB.

Using Enterprise Manager Cloud Control 12c is a good way to see resource usage across the CDB. In Figure 8-4, the container CDB01 has three PDBs active and two inactive.


FIGURE 8-4. Viewing PDB resource usage within a CDB using Cloud Control 12c

The total memory allocated for the CDB is approximately 5GB. Three non-CDBs would likely use 5GB or more each; all five PDBs in CDB01 may perform just fine in a total of 5GB.

There are a few different approaches to resource allocation among PDBs within a CDB:

Image None Let each PDB use all resources of the CDB if no other PDB is active; when multiple PDBs need resources, they are divided equally.

Image Minimum Each PDB gets a minimum guaranteed resource allocation.

Image Minimum/maximum Each PDB gets both a minimum guaranteed resource allocation and a maximum.

Resource usage allocation in a CDB is measured in shares. By default, all PDBs can consume all resources allocated to the CDB. I cover more details on how shares are allocated and calculated later in the chapter.

Modifying Initialization Parameters

As you will find out in Chapter 11, there is only one SPFILE per CDB instance. All database parameters are stored in the CDB’s SPFILE, but 171 of those parameters (out of a total of 367 for Oracle Database 12c 12.1.0.1) can be changed at the PDB level. The column ISPDB_MODIFIABLE is an easy way to see which parameters you can change at the PDB level:

image
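For example, a query of V$SYSTEM_PARAMETER, which carries the ISPDB_MODIFIABLE column in Oracle Database 12c, lists the parameters that can be set within a PDB:

SELECT name
  FROM v$system_parameter
 WHERE ispdb_modifiable = 'TRUE'
 ORDER BY name;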

When you unplug a PDB, its customized parameters stay with the unplugged PDB and are set when that PDB is plugged back in, regardless of which CDB it is plugged into. When a PDB is cloned, the custom parameters are cloned as well. At the container level, you can also look at the data dictionary view PDB_SPFILE$ to see which parameters are different across PDBs:

image
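A sketch of such a query, assuming the NAME, VALUE$, and PDB_UID columns of PDB_SPFILE$ and joining to DBA_PDBS on its CON_UID column (verify the column names in your release):

SELECT p.pdb_name, s.name, s.value$
  FROM pdb_spfile$ s
  JOIN dba_pdbs p ON p.con_uid = s.pdb_uid
 ORDER BY p.pdb_name, s.name;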

In the TOOL PDB, the SESSIONS parameter is different from the default (at the CDB level); the RPTQA12C PDB has three non-default parameters set.

Using Memory Advisors

The buffer cache in a CDB, shared across all PDBs, behaves much like the buffer cache in a non-CDB: The same LRU algorithms are used to determine when and if a block should stay in the buffer cache. Because the buffer cache is shared, the PDB’s container ID (CON_ID) is also stored in each block. The same container ID is stored in the other SGA and PGA memory areas, such as the shared pool in the SGA and the global PGA. The memory advisors from previous versions of Oracle Database work in much the same way in a multitenant environment; sizing recommendations are at the CDB (instance) level. Individual memory parameters that can be adjusted at the PDB level are limited to SORT_AREA_SIZE and SORT_AREA_RETAINED_SIZE, although in general Oracle best practices dictate that you only set PGA_AGGREGATE_TARGET and let Oracle manage the other memory areas.

Figure 8-5 shows the output from the SGA Memory Advisor launched from Cloud Control 12c.


FIGURE 8-5. CDB SGA Memory Advisor in Cloud Control 12c

Even with several PDBs in the CDB01 container, it appears that the total memory for the CDB can be reduced by at least 1GB and retain good performance for all PDBs.

To accommodate a potentially larger number of sessions in a CDB, the parameter PGA_AGGREGATE_LIMIT was added to place a hard limit on the amount of PGA memory used. The existing parameter PGA_AGGREGATE_TARGET was useful in previous releases as a soft limit, but only for tunable memory. Several sessions using untunable memory (such as PL/SQL applications that allocate large memory arrays) could potentially use up all available PGA memory, causing swap activity at the OS level and affecting performance across all instances on the server. When PGA_AGGREGATE_LIMIT is exceeded, Oracle aborts the PGA memory requests of one or more non-SYSTEM sessions to bring usage back under the limit.

Leveraging AWR Reports

As with all previously described Oracle tuning tools, AWR snapshots include a container ID number, and that container ID is reflected in any AWR report. Figure 8-6 shows an excerpt of the SQL statements executed during the three-hour window specified for the AWR report.


FIGURE 8-6. AWR report in a multitenant environment

The SQL statements run during this window were from two PDBs and the root container. As in a non-CDB environment, your tuning effort will focus first on the statements with the longest elapsed time along with statements whose total time across multiple executions is at the top of the list.

Using the SQL Tuning Advisor

When you run the SQL Tuning Advisor against one or more SQL statements, such as those in Figure 8-6, the advisor runs only in the context of a single PDB. In other words, the recommendations are based only on the performance and resource usage within the PDB. Even if the same SQL statement is run in multiple PDBs, the schema names, statistics, data volumes, and initialization parameters can and will likely be different between PDBs. Therefore, if any recommendations are implemented, they are applied in only a single PDB.

Other new and enhanced SQL Tuning features in Oracle Database 12c can be used for CDBs and non-CDBs:

Image Adaptive SQL plan management

Image Automatic SQL plan baseline evolution

Image SQL management base

Image SQL plan directives

Image Improved statistics gathering performance

The usage of these tools is beyond the scope of this book.

Managing Resource Allocation Within a PDB

In the previous section, I introduced the concept of resource sharing within a CDB by using shares. I’ll expand on that concept by showing how you can allocate shares among PDBs within a CDB. In addition, I’ll talk about resource management within a PDB, which is much like how Resource Manager operates in a non-CDB environment and previous versions of Oracle Database.

Once a portion of resources is allocated to a PDB, Resource Manager will prioritize resource requests by users. In both cases you’ll use the DBMS_RESOURCE_MANAGER package to create and deploy resource allocations.

Using Shares to Manage Inter-PDB Resources

Each PDB that’s plugged into a CDB competes for the resources of the CDB—primarily CPU, parallel servers, and, in the case of Oracle Exadata, I/O. How much of each resource a PDB gets depends on how many shares that PDB was assigned when it was created.

NOTE

Neither consumer groups (using Resource Manager) nor shares can be defined for the root container.

By default, each PDB gets one share unless otherwise specified. When a new PDB is added or an existing PDB is unplugged, the number of shares each PDB has remains the same. Table 8-1 shows a CDB with four PDBs: HR, BI, REPOS, and TOOL. The BI PDB has three shares, and the rest have one each, the default.

PDB Name    Shares    Guaranteed Minimum CPU
HR          1         16.67%
BI          3         50.00%
REPOS       1         16.67%
TOOL        1         16.67%

TABLE 8-1. PDBs and Share Allocation for Four PDBs

The TOOL database, for example, is guaranteed 16.67 percent of the server's CPU resources if it needs them. If one or more of the other PDBs are not active, TOOL can use more than its guaranteed allocation, consuming whatever resources the idle PDBs are not using.

Suppose you create a PDB called NCAL and don’t specify the number of shares; it defaults to 1, with the results shown in Table 8-2.

PDB Name    Shares    Guaranteed Minimum CPU
HR          1         14.29%
BI          3         42.86%
REPOS       1         14.29%
TOOL        1         14.29%
NCAL        1         14.29%

TABLE 8-2. PDBs and Share Allocation for Five PDBs After Adding a New One

The minimum CPU guaranteed for each PDB is automatically recalculated based on the new total number of shares. Each PDB with one share now gets 14.29 percent of the CPU resources, and the amount of CPU resources available (at a minimum) for the BI PDB is now 42.86 percent.

Creating and Modifying Resource Manager Plans

To further refine the resource consumption, you can set limits for each PDB using Resource Manager. From the perspective of the PDB, all resources are controlled by directives created using DBMS_RESOURCE_MANAGER. The amounts of CPU, Exadata I/O, and concurrent parallel servers used by the PDB default to 100 percent but can be adjusted downward, even to 0 percent, depending on the time of day or other circumstances.

The resource plan itself is created at the CDB level, and you create directives for each PDB within the CDB. You can also specify a set of default directives for those PDBs that do not have an explicit set of directives.

Identifying Parameters to Limit PDB Resource Usage

As part of the utilization plan for each PDB, there are two key limits you can control: an overall utilization limit covering CPU, Exadata I/O, and parallel servers, and a separate parallel server limit. These plan directive limits are UTILIZATION_LIMIT and PARALLEL_SERVER_LIMIT, respectively.

The resource directive UTILIZATION_LIMIT defines the percentage of CPU, I/Os, and parallel servers available to a PDB. If UTILIZATION_LIMIT is set at 30, then the PDB can use no more than 30 percent of the resources available to the CDB.

To further refine the resource limits, you can use PARALLEL_SERVER_LIMIT to define the maximum percentage of the CDB’s PARALLEL_SERVERS_TARGET value; this value overrides the UTILIZATION_LIMIT directive but only for parallel resources. The default is 100 percent.

Creating the CDB Resource Plan

The steps for creating a CDB resource plan are similar to those for creating a resource plan in a non-CDB, but with additional steps for each PDB. You create and manage the resource plan from the root container only. Table 8-3 lists the steps and corresponding DBMS_RESOURCE_MANAGER calls needed to create and configure the CDB resource plan.


TABLE 8-3. Steps to Create a Resource Plan with DBMS_RESOURCE_MANAGER Calls

Other key procedures in DBMS_RESOURCE_MANAGER include UPDATE_CDB_PLAN to change the characteristics of the CDB resource plan and DELETE_CDB_PLAN to delete the resource plan and all of its directives. To update and delete individual CDB plan directives, use UPDATE_CDB_PLAN_DIRECTIVE and DELETE_CDB_PLAN_DIRECTIVE.

Here is an example of creating a CDB resource plan for the CDB01 container and defining the plan directives for two of the PDBs in the CDB; a consolidated sketch of the corresponding DBMS_RESOURCE_MANAGER calls follows step 5.

1. Create a pending area for the CDB plan:

image

2. Create a resource plan that manages the TOOL and CCREPOS PDBs to minimize CPU and other resource usage:

image

3. Create a plan directive that gives both the TOOL and CCREPOS PDBs one share. The utilization limit for TOOL should be 50 percent, and for CCREPOS it will be 75 percent:

image

image

4. Validate and submit the pending area:

image

5. Finally, make this resource manager plan the current plan:

image
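A consolidated sketch of steps 1 through 5; the plan name is an assumption, while the PDB names, shares, and utilization limits come from the steps above:

BEGIN
  -- Step 1: create a pending area for the resource plan changes
  DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();

  -- Step 2: create the CDB resource plan
  DBMS_RESOURCE_MANAGER.CREATE_CDB_PLAN(
    plan    => 'TOOL_CCREPOS_PLAN',
    comment => 'Limit resource usage for the TOOL and CCREPOS PDBs');

  -- Step 3: one directive per PDB, each with one share
  DBMS_RESOURCE_MANAGER.CREATE_CDB_PLAN_DIRECTIVE(
    plan               => 'TOOL_CCREPOS_PLAN',
    pluggable_database => 'TOOL',
    shares             => 1,
    utilization_limit  => 50);
  DBMS_RESOURCE_MANAGER.CREATE_CDB_PLAN_DIRECTIVE(
    plan               => 'TOOL_CCREPOS_PLAN',
    pluggable_database => 'CCREPOS',
    shares             => 1,
    utilization_limit  => 75);

  -- Step 4: validate and submit the pending area
  DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
END;
/

-- Step 5: make the new plan the active resource plan for the CDB (run from the root)
ALTER SYSTEM SET resource_manager_plan = 'TOOL_CCREPOS_PLAN';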

Viewing Resource Plan Directives

In Oracle Database 12c you have a data dictionary view called DBA_CDB_RSRC_PLAN_DIRECTIVES that shows the directives of every CDB resource plan. Querying that view, you can see the plan directives you just created for TOOL and CCREPOS:

image

image
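A sketch of such a query, using the assumed plan name from the earlier sketch:

SELECT plan, pluggable_database, shares, utilization_limit, parallel_server_limit
  FROM dba_cdb_rsrc_plan_directives
 WHERE plan = 'TOOL_CCREPOS_PLAN'
 ORDER BY pluggable_database;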

In previous releases of Oracle Database and for non-CDBs in Oracle Database 12c, the corresponding data dictionary view is DBA_RSRC_PLAN_DIRECTIVES.

Managing Resources Within a PDB

Resource plans can manage workloads within a PDB as well. These resource plans manage workloads just as they do in a non-CDB and, not surprisingly, are called PDB resource plans. There are a few restrictions and differences with PDB plans. Table 8-4 shows the parameter and feature differences between non-CDB and PDB resource plans.


TABLE 8-4. Differences Between Non-CDB and PDB Resource Plans

Regardless of the container type, you still view resource plans using the V$RSRC_PLAN dynamic performance view. To find the active CDB resource plan, select the row in V$RSRC_PLAN with CON_ID=1.

Migrating Non-CDB Resource Plans

You will likely convert and plug in many non-CDBs as well as create new PDBs. This process is straightforward, and all of your applications should work as expected. If the non-CDB has a resource plan, it will be converted as well, as long as it meets these conditions:

Image There are no more than eight consumer groups.

Image There are no subplans.

Image All resource allocations are on level 1.

In other words, the migrated resource plan must be compatible with a new PDB resource plan that follows the rules in the previous section. If the plan violates any of these conditions, the plan is converted during the plug-in operation to a plan that is compatible with a PDB. This plan may be unsuitable; you can drop, modify, or create a new resource plan. The original plan is saved in DBA_RSRC_PLAN_DIRECTIVES with the STATUS column having a value of LEGACY.

Performing Database Replay

The Database Replay functionality from previous Oracle Database releases has also been enhanced in Oracle Database 12c to include simultaneous workload replays as a planning tool for estimating how multiple non-CDBs will perform in a CDB environment. You can take production workloads from multiple servers in a non-CDB environment and play them back in various configurations on a single new server to simulate how well they would coexist in a multitenant environment.

Analyze the Source Database Workloads

When you capture workloads for potential multitenant deployment, the workloads typically come from different business units and locations; the peak load for each application is likely to occur at a different time of day, which makes these applications ideal candidates for consolidation. Figure 8-7 shows a typical set of workloads from applications currently running on different servers.


FIGURE 8-7. Candidate workloads for multitenant consolidation

You can also analyze existing PDBs and capture workloads to see how they would perform as a PDB of another CDB on a different server. The general steps you’ll follow as part of this analysis phase are as follows:

1. Capture the workload of an existing non-CDB or PDB.

2. Optionally export the AWR snapshots for the database.

3. Restore the candidate database onto the target system.

4. Make changes to the imported candidate database as needed, such as upgrading to Oracle Database 12c.

5. Copy the generated workload files to the target system.

6. Process the workload as a one-time prerequisite step.

7. Repeat steps 1-6 for all other candidate databases.

8. Configure the target system for replay (such as the workload replay client processes).

9. Replay the workloads for all PDBs within the single CDB on the target system.

Capture Source Database Workloads

On the source database server, you’ll capture the workload for a typical 8-hour or 24-hour period. You’ll want all captured workloads to cover the same time period. To optimize the performance of the replay test, you can optionally export AWR snapshots, SQL profiles, and SQL tuning sets.

Process Workloads on Target System

After you import the candidate database into a PDB of the new CDB, you import the workload generated on the source server. You preprocess the workload files in preparation for the replay, which needs to happen only once for each imported workload. It’s recommended that you replay each imported workload individually to ensure that there are no extreme variations in performance compared to that database’s performance on the original server.

Replay Workloads on Target CDB

After all PDBs have been created and preprocessed, remap any connections that might refer to objects that don't exist on the target system. Create a replay schedule that replays each workload at the same time and rate as it ran on the source system. You can create multiple schedules to see how workloads can be shifted to optimize the CDB's overall performance.

Verify Replay Results

After the replay session is complete, review the reports generated by Consolidated Database Replay to see, for example, if the response time and overall SLA of the databases on their original servers can be met by this consolidation platform. If there are severe regressions, then you can use the tuning methodologies discussed earlier in this chapter and run the replay again. Even after tuning, you may find that the server needs more CPUs or memory. Ideally, you’ll find out that each database runs just as fast as or faster than it did on the original server!

Last, but not least, having accurate and easy-to-manage diagnostic information for your CDB and PDBs is more important than ever. The Automatic Diagnostic Repository (ADR) has the same structure as in previous releases, and the CDB and the PDBs within the CDB have their own subdirectories under the ADR Base directory.

Summary

This chapter does not cover every potential tuning solution. However, there is an underlying approach to the techniques and tools presented throughout this chapter. Before spending your time and resources on the implementation of a new feature, you should first stabilize your environment and architecture—the server, the database, and the application. If the environment is stable, you should be able to quickly accomplish two goals:

1. Successfully re-create the performance problem.

2. Successfully isolate the cause of the problem.

To achieve these goals, you may need to have a test environment available for your performance tests. Once the problem has been successfully isolated, you can apply the steps outlined in this chapter to the problem. In general, your tuning approach should mirror the order of the sections of this chapter:

1. Evaluate application design.

2. Tune SQL.

3. Tune memory usage.

4. Tune data access.

5. Tune data manipulation.

6. Tune network traffic.

7. Tune physical and logical storage.

8. Use the AWR to tune queries.

9. Manage PDB resources.

10. Leverage Database Replay for resource planning.

Depending on the nature of your application, you may choose a different order for the steps, or you may combine steps.

If the application design cannot be altered and the SQL cannot be altered, you can tune the memory and disk areas used by the application. As you alter the memory and disk area settings, you must be sure to revisit the application design and SQL implementation to be sure that your changes do not adversely impact the application. The need to revisit the application design process is particularly important if you choose to use a data replication method, because the timeliness of the replicated data may cause problems within the business process served by the application.

Finally, you’ll need to expand your tuning expertise to include tuning pluggable databases in a multitenant environment. This tuning expertise is easy to come by: The container database can be tuned as a traditional single-instance database and individual statements within a pluggable database can be tuned with the tools you’re already familiar with such as the SQL Tuning Advisor. Resource management in a multitenant environment adds the concept of a share to allocate resources among several pluggable tenants in a container database. Within each pluggable database you’ll once again be able to use the Resource Manager tools you’re familiar with from previous versions of Oracle Database.
