
PART IV
Networked Oracle

CHAPTER 19
Managing Distributed Databases

In a distributed environment, databases on separate servers (hosts) may be accessed during a single transaction or query. Each server can be physically isolated without being logically isolated from the other servers.

A typical distributed database implementation involves corporate headquarters servers that replicate data to departmental servers in various locations. Each database supports local client applications and also has the ability to communicate with other databases in the network. This architecture is shown in Figure 19-1.

FIGURE 19-1. Server/server architecture

Oracle Net makes this architecture possible. Running on each of the servers involved, Oracle Net allows a database request made by one database (or application) to be passed to another database on a separate server. With this functionality, you can communicate with all the databases that are accessible via your network. You can then create synonyms that give applications true network transparency: the user who submits a query will not know the location of the data used to resolve it.

You can configure Oracle to support multimaster replication (in which all databases involved own the data and can serve as the source for data propagation) or single-master replication (in which only one database owns the data). When designing a replication configuration, you should try to restrict the ownership of data as much as possible. As the number of sources for propagation increases, the potential for errors to occur increases, as does the potential administration workload. In the following sections, you will see examples of the different replication capabilities available, followed by management techniques.

Remote Queries

To query a remote database, you must create a database link in the database in which the query will originate. The database link specifies the service name for the remote database and may also specify the username to connect to in the remote database. When a database link is referenced by a SQL statement, Oracle opens a session in the remote database, executes the SQL statement there, and returns the data. You can create database links as public links (created by DBAs, making the links available to all users in the local database) or as private links.

The following example creates a public database link called HR_LINK:

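A minimal sketch of such a command, assuming the remote service name HQ and a placeholder password for the HR account:

CREATE PUBLIC DATABASE LINK HR_LINK
CONNECT TO HR IDENTIFIED BY hr_password   -- placeholder password
USING 'HQ';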

NOTE

As of Oracle Database 11g, passwords are case-sensitive unless you set the initialization parameter SEC_CASE_SENSITIVE_LOGON to FALSE (the default is TRUE).

The CREATE DATABASE LINK command shown in this example has several parameters:

- The optional keyword PUBLIC, which allows DBAs to create links for all users in a database. An additional optional keyword, SHARED, is described later in this chapter.

- The name of the link (HR_LINK, in this example).

- The account to connect to. Alternatively, you can configure the database link to use each local user's own username and password in the remote database; the link shown here connects to a fixed username in the remote database.

- The service name (HQ, in this example).

To use the newly created link, simply add it as a suffix to table names in commands. The following example queries a remote table by using the HR_LINK database link:

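A representative query, assuming the remote EMPLOYEES table uses the standard HR sample-schema columns:

SELECT last_name, salary
  FROM EMPLOYEES@HR_LINK
 WHERE department_id = 10;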

When you execute this query, Oracle will establish a session via the HR_LINK database link and query the EMPLOYEES table in that database. The WHERE clause will be applied to the EMPLOYEES rows, and the matching rows will be returned. The execution of the query is shown graphically in Figure 19-2.

FIGURE 19-2. Sample remote query

The FROM clause in this example refers to EMPLOYEES@HR_LINK. Because the HR_LINK database link specifies the server name, instance name, and owner name, the full name of the table is known. If no account name had been specified in the database link, the user’s account name and password in the local database would have been used during the attempt to log into the remote database.

The management of database links is described in the section “Managing Distributed Data,” later in this chapter.

Remote Data Manipulation: Two-Phase Commit

To support data manipulation across multiple databases, Oracle relies on Two-Phase Commit (2PC). 2PC allows groups of transactions across several nodes to be treated as a unit; either the transactions all COMMIT or they all get rolled back. A set of distributed transactions is shown in Figure 19-3. In that figure, two UPDATE transactions are performed. The first UPDATE goes against a local table (EMPLOYEES); the second, against a remote table (EMPLOYEES@HR_LINK). After the two transactions are performed, a single COMMIT is then executed. If either transaction cannot COMMIT, both transactions will be rolled back.

FIGURE 19-3. Sample distributed transaction
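For example, the pair of updates shown in Figure 19-3 could be coded as follows (the column names are illustrative):

UPDATE employees
   SET salary = salary * 1.1
 WHERE employee_id = 101;

UPDATE employees@HR_LINK
   SET salary = salary * 1.1
 WHERE employee_id = 101;

COMMIT;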

Distributed transactions yield two important benefits: databases on other servers can be updated, and those transactions can be grouped together with others in a logical unit. The second benefit occurs because of the database’s use of 2PC. Here are the two phases:

- The prepare phase: An initiating node, called the global coordinator, notifies all sites involved in the transaction to be ready either to COMMIT or to roll back the transaction.

- The commit phase: If there is no problem with the prepare phase, all sites COMMIT their transactions. If a network or node failure occurs, all sites roll back their transactions.

The use of 2PC is transparent to users. After the commit phase completes, a third phase, the forget phase, allows the coordinating site to erase its status information about the completed transaction. The detailed management of distributed transactions is discussed in the section “Managing Distributed Transactions,” later in this chapter.

Dynamic Data Replication

To improve the performance of queries that use data from remote databases, you may wish to replicate that data on the local server. There are several options for accomplishing this, depending on which Oracle features you are using.

You can use database triggers to replicate data from one table into another. For example, after every INSERT into a table, a trigger may fire to insert that same record into another table—and that table may be in a remote database. Thus, you can use triggers to enforce data replication in simple configurations. If the types of transactions against the base table cannot be controlled, the trigger code needed to perform the replication will be unacceptably complicated.

When using Oracle’s distributed features, you can use materialized views to replicate data between databases. You do not have to replicate an entire table or limit yourself to data from just one table. When replicating a single table, you may use a WHERE clause to restrict which records are replicated, and you may perform GROUP BY operations on the data. You can also join the table with other tables and replicate the result of the queries.

NOTE

You cannot use materialized views to replicate data using LONG, LONG RAW, or user-defined datatypes.

The data in the local materialized view of the remote table(s) will need to be refreshed. You can specify the refresh interval for the materialized view, and the database will automatically take care of the replication procedures. In many cases, the database can use a materialized view log to send over only transaction data (changes to the table); otherwise, the database will perform complete refreshes on the local materialized view. The dynamic replication of data via materialized views is shown in Figure 19-4.

FIGURE 19-4. Replication with materialized views

You can use Data Guard to create and manage a standby database whose content is updated whenever the primary database’s data changes. The standby database can be used as a read-only database to support reporting requirements and then returned to its status as a standby database. See Chapter 15 for details on the use and management of standby databases.

Managing Distributed Data

Before you can worry about managing transactions against remote databases, you have to get the data there and make it globally accessible to other databases. The following sections describe the requisite management tasks: enforcing location transparency and managing the database links, triggers, and materialized views.

NOTE

The examples in this chapter assume that you are using tnsnames.ora files for your database service name resolution.

The Infrastructure: Enforcing Location Transparency

To properly design your distributed databases for long-term use, you must start by making the physical location of the data transparent to the application. The name of a table within a database is unique within the schema that owns it. However, a remote database may have an account with the same name, which may own a table with the same name.

Within distributed databases, two additional layers of object identification must be added. First, the name of the instance that accesses the database must be identified. Next, the name of the host on which that instance resides must be identified. Putting together these four parts of the object’s name—its host, its instance, its owner, and its name—results in a global object name. To access a remote table, you must know that table’s global object name.

The goal of location transparency is to make the first three parts of the global object name—the host, the instance, and the schema—transparent to the user. The first three parts of the global object name are all specified via database links, so any effort at achieving location transparency should start there. First, consider a typical database link:

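A sketch of such a link, with a placeholder password:

CREATE PUBLIC DATABASE LINK HR_LINK
CONNECT TO HR IDENTIFIED BY hr_password
USING 'HQ';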

NOTE

If the GLOBAL_NAMES initialization parameter is set to TRUE, the database link name must be the same as the global name of the remote database.

By using a service name (in this example, HQ), the host and instance name remain transparent to the user. These names are resolved via the local host’s tnsnames.ora file. A partial entry in this file for the service name HQ is shown in the following listing:

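A sketch of such an entry; the protocol and port values shown are typical defaults:

HQ =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = HQ_MW)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = LOC)
    )
  )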

The HOST and SERVICE_NAME lines in this listing fill in the two missing pieces of the global object name. When a user references the HQ service name, the host name is HQ_MW and the service name is LOC. The SERVICE_NAME can be the instance name of the remote database; services are declared via the initialization parameter SERVICE_NAMES, which can include several entries. The default value of SERVICE_NAMES is DB_UNIQUE_NAME.DB_DOMAIN. In a RAC environment, each instance can offer additional services beyond the default service name, and a service specified by SERVICE_NAMES can run on one, several, or all of the RAC instances. Specify INSTANCE_NAME in tnsnames.ora instead of SERVICE_NAME if you want to reach a specific database instance.

This tnsnames.ora file uses parameters for the TCP/IP protocol; other protocols may use different keywords, but their usage is the same. The tnsnames.ora entries provide transparency for the server and instance names.

The HR_LINK database link created via the code given earlier in this section will provide transparency for the first two parts of the global object name. But what if the data moves from the HR schema, or the HR account’s password changes? The database link would have to be dropped and re-created. The same would be true if account-level security were required; you may need to create and maintain multiple database links.

To resolve the transparency of the schema portion of the global object name, you can modify the database link syntax. Consider the database link in the following listing:

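A sketch:

CREATE PUBLIC DATABASE LINK HR_LINK
CONNECT TO CURRENT_USER
USING 'HQ';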

This database link uses the CONNECT TO CURRENT_USER clause. It will use what is known as a connected user database link: the remote database authenticates the connection request using the user’s credentials on the server where the user executes the query. The previous examples were fixed user connections—the same credentials are used to authenticate the connection request regardless of the user making the request. Here is an example of using the connected user database link; not surprisingly, it looks identical to using a fixed user database link:

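For example:

SELECT last_name, salary
  FROM EMPLOYEES@HR_LINK
 WHERE department_id = 10;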

When the user references this link, the database will attempt to resolve the global object name in the following order:

1. It will search the local tnsnames.ora file to determine the proper host name, port, and instance name or service name.

2. It will check the database link for a CONNECT TO specification. If the CONNECT TO CURRENT_USER clause is found, it will attempt to connect to the specified database using the connected user’s username and password.

3. It will search the FROM clause of the query for the object name.

Connected user links are often used to access tables whose rows can be restricted according to the username that is accessing the tables. For example, if the remote database had a table named HR.EMPLOYEES, and every employee were allowed to see their own row in the table, then a database link with a specific connection, such as:

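A sketch of such a link, with a placeholder password:

CREATE DATABASE LINK HR_LINK
CONNECT TO HR IDENTIFIED BY hr_password
USING 'HQ';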

would log in as the HR account (the owner of the table). If this specific connection is used, you cannot restrict the user’s view of the records on the remote host. However, if a connected user link is used, and a view is created on the remote host using the USER pseudo-column, then only that user’s data would be returned from the remote host. A sample database link and view of this type is shown in the following listing:

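A sketch, assuming EMPLOYEES has a LAST_NAME column that matches the Oracle usernames; the view is created in the remote database:

CREATE DATABASE LINK HR_LINK
CONNECT TO CURRENT_USER
USING 'HQ';

CREATE OR REPLACE VIEW MY_EMP_RECORD AS
SELECT *
  FROM EMPLOYEES
 WHERE UPPER(last_name) = USER;   -- USER pseudo-column restricts rows per user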

Either way, the data being retrieved can be restricted. The difference is that when a connected user link is used, the data can be restricted based on the username in the remote database; if a fixed connection is used, the data can be restricted after it has been returned to the local database. The connected user link reduces the amount of network traffic needed to resolve the query and adds an additional level of location transparency to the data.

NOTE

If you are using the Virtual Private Database features of the Oracle Database, you can restrict access to rows and columns without maintaining views for this purpose. See Chapter 10 for details on Virtual Private Database options.

Connected user database links raise a different set of maintenance issues. The tnsnames.ora file must be synchronized across the servers (a requirement that in turn drives the adoption of an LDAP solution such as OID), and the username/password combinations in multiple databases must be synchronized. These issues are addressed in the next sections.

Database Domains

A domain name service allows hosts within a network to be hierarchically organized. Each node within the organization is called a domain, and each domain is labeled by its function. These functions may include COM for companies and EDU for schools. Each domain may have many subdomains. Therefore, each host will be given a unique name within the network; its name contains information about how it fits into the network hierarchy. Host names within a network typically have up to four parts; the leftmost portion of the name is the host’s name, and the rest of the name shows the domain to which the host belongs.

For example, a host may be named HQ.MYCORP.COM. In this example, the host is named HQ and identified as being part of the MYCORP subdomain of the COM domain.

The domain structure is significant for two reasons. First, the host name is part of the global object name. Second, Oracle allows you to specify the DNS version of the host name in database link names, simplifying the management of distributed database connections.

To use DNS names in database links, you first need to add two parameters to your initialization file for the database. The first of these, DB_NAME, should be set to the database name. The second parameter, DB_DOMAIN, specifies the network domain in which the host resides; it defaults to WORLD, and the value cannot be NULL. If a database named LOC is created on the HQ.MYCORP.COM server, its entries will be

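DB_NAME = LOC
DB_DOMAIN = HQ.MYCORP.COM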

NOTE

In a RAC environment, the INSTANCE_NAME cannot be the same as the DB_NAME. Typically, a sequential number is appended to the DB_NAME for each instance. See Chapter 12 for more information on configuring a RAC database.

To enable the usage of the database domain name, you must set the GLOBAL_NAMES parameter to TRUE in your SPFILE or initialization parameter file, as in this example:

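GLOBAL_NAMES = TRUE

For a running instance, the same change can be made dynamically:

ALTER SYSTEM SET GLOBAL_NAMES = TRUE;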

NOTE

GLOBAL_NAMES is set to FALSE by default in Oracle Database 12c.

Once you have set these parameters, the database must be shut down and restarted for changes to DB_NAME or DB_DOMAIN.

NOTE

If you set GLOBAL_NAMES to TRUE, all your database link names must follow the rules described in this section; in other words, GLOBAL_NAMES ensures that database links have the same name as the database to which you connect using the link.

When you use this method of creating global database names, the names of the database links are the same as the databases to which they point. Therefore, a database link that points to the LOC database instance listed earlier would be named LOC.HQ.MYCORP.COM. Here is an example:

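CREATE PUBLIC DATABASE LINK LOC.HQ.MYCORP.COM
USING 'LOCSVC';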

LOCSVC is the service name in tnsnames.ora. Oracle will append the local database’s DB_DOMAIN value to the name of the database link. For example, if the database is in the HQ.MYCORP.COM domain, and the database link is named LOC, the database link will resolve to LOC.HQ.MYCORP.COM whenever it is referenced.

Using global database names establishes a link between the database name, database domain, and database link names. This, in turn, may make it easier to identify and manage database links. For example, you can create a public database link (with no connect string, as shown in the preceding example) in each database that points to every other database. Users within a database no longer need to guess at the proper database link to use; if they know the global database name, they know the database link name. If a table is moved from one database to another, or if a database is moved from one host to another, it is easy to determine which of the old database links must be dropped and re-created. Using global database names is part of migrating from standalone databases to true networks of databases.

Using Shared Database Links

If you use a shared server configuration for your database connections and your application will employ many concurrent database link connections, you may benefit from using shared database links. A shared database link uses shared server connections to support the database link connections. If you have multiple concurrent database link accesses into a remote database, you can use shared database links to reduce the number of server connections required.

To create a shared database link, use the SHARED keyword of the CREATE DATABASE LINK command. As shown in the following listing, you will also need to specify a schema and password for the remote database:

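A sketch; LINK_AUTH and its password are placeholder values for the authentication account:

CREATE SHARED PUBLIC DATABASE LINK HR_LINK_SHARED
CONNECT TO CURRENT_USER
AUTHENTICATED BY link_auth IDENTIFIED BY link_auth_pwd
USING 'HQ';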

The HR_LINK_SHARED database link uses the connected user’s username and password when accessing the HQ database, since this link specifies the CONNECT TO CURRENT_USER clause. To prevent unauthorized attempts to use shared links, the AUTHENTICATED BY clause is required for shared links. In this example, the account used for authentication is an application user’s account, but you can also use an empty schema (one that no user will ever log into) for authentication. The authentication account must have the CREATE SESSION system privilege. When users use the HR_LINK_SHARED link, connections to the remote database are made under their own usernames and passwords.

If you change the password on the authentication account, you will need to drop and re-create each database link that references the account. To simplify maintenance, create an account that is only used for authentication of shared database link connections. The account should only have the CREATE SESSION privilege; it should not have any privileges on any of the application tables.

If your application uses database links infrequently, you should use traditional database links without the SHARED clause. Without the SHARED clause, each database link connection requires a separate connection to the remote database. In general, use shared database links when the number of users accessing a database link is expected to be much larger than the number of server processes in the local database.

Managing Database Links

You can retrieve information about public database links via the DBA_DB_LINKS data dictionary view. You can view private database links via the USER_DB_LINKS data dictionary view. Whenever possible, separate your users among databases by application so that they may all share the same public database links. As a side benefit, these users will usually also be able to share public grants and synonyms.

The columns of the DBA_DB_LINKS data dictionary view are listed in the following table. The password for the link to use is not viewable via DBA_DB_LINKS; it is encrypted in the SYS.LINK$ table since Oracle Database 10g Release 2.

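The classic columns are these (the exact set varies by Oracle release):

OWNER     Owner of the database link
DB_LINK   Name of the database link
USERNAME  Fixed username used to connect, if one was specified
HOST      Connect string or service name of the remote database
CREATED   Date and time the link was created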

NOTE

The number of concurrent open connections to remote databases in a single session is limited by the OPEN_LINKS parameter in the database’s initialization file. Its default value is 4.

The management tasks for database links depend on the level to which you have implemented location transparency in your databases. In the best-case scenario, connected user links are used with service names or aliases; minimal link management in this scenario requires a consistent tnsnames.ora file among all hosts in the domain (or all hosts using the same LDAP server for name resolution) and user account/password combinations that are the same within the domain.

Synchronizing account/password combinations across databases may be difficult, but there are several alternatives. First, you may force all changes to user account passwords to go through a central authority. This central authority would have the responsibility for updating the password for the account in all databases in the network—a time-consuming but valuable task.

Second, you may audit user password changes made via the ALTER USER command by auditing the usage of that command (see Chapter 10). If a user’s password changes in one database, it must be changed on all databases available in the network that are accessed via connected user links.

If any part of the global object name—such as a username—is embedded in the database link, a change affecting that part of the global object name requires that the database link be dropped and re-created. For example, if the HR user’s password were changed, the HR_LINK database link with a specific connection defined earlier would be dropped with:

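DROP DATABASE LINK HR_LINK;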

and the link would have to be re-created using the new account password:

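A sketch, with a placeholder for the new password:

CREATE DATABASE LINK HR_LINK
CONNECT TO HR IDENTIFIED BY new_hr_password
USING 'HQ';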

You cannot create a database link in another user’s account. Suppose you attempt to create a database link in OE’s account, as shown here:

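A sketch, with a placeholder password:

CREATE DATABASE LINK OE.HR_LINK
CONNECT TO HR IDENTIFIED BY hr_password
USING 'HQ';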

In this case, Oracle will not create the HR_LINK database link in OE’s account. Instead, Oracle will create a database link named OE.HR_LINK in the account that executed the CREATE DATABASE LINK command. To create private database links, you must be logged into the database in the account that will own the link.

NOTE

To see which links are currently in use in your session, query V$DBLINK.

Managing Database Triggers

If your data replication needs require synchronized changes in multiple databases, you can use database triggers to replicate data from one table into another. Database triggers execute when specific actions happen. Triggers can fire for each row affected by a statement, once per statement, or when system-wide events occur. When dealing with data replication, you will usually be concerned with triggers that fire for each row of data.

Before creating a replication-related trigger, you must create a database link for the trigger to use. In this case, the link is created in the database that owns the data, accessible to the owner of the table being replicated:

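A sketch:

CREATE DATABASE LINK TRIGGER_LINK
USING 'RMT_DB_1';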

This link, named TRIGGER_LINK, uses the service name RMT_DB_1 to specify the connection to a remote database. The link will attempt to connect to the database RMT_DB_1 using the same username and password as the account using the link.

The trigger shown in the following listing uses this link. The trigger is fired after every row is inserted into the EMPLOYEES table. Because the trigger executes after the row has been inserted, the row’s data has already been validated in the local database. The trigger inserts the same row into a remote table with the same structure, using the TRIGGER_LINK database link just defined. The remote table must already exist.

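A sketch, assuming a simplified EMPLOYEES structure; the column list would match your own table:

CREATE OR REPLACE TRIGGER employees_after_insert
AFTER INSERT ON EMPLOYEES
FOR EACH ROW
BEGIN
  INSERT INTO EMPLOYEES@TRIGGER_LINK
    (employee_id, last_name, first_name, hire_date)
  VALUES
    (:NEW.employee_id, :NEW.last_name, :NEW.first_name, :NEW.hire_date);
END;
/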

This trigger uses the :NEW keyword to reference the values of the row that was just inserted into the local EMPLOYEES table.

NOTE

If you use trigger-based replication, your trigger code must account for potential error conditions at the remote site, such as duplicate key values, space-management problems, or an unavailable remote database.

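You can confirm that the trigger was created by querying DBA_TRIGGERS; a sketch, using the trigger name from the previous example:

SELECT trigger_name, status
  FROM DBA_TRIGGERS
 WHERE table_name = 'EMPLOYEES';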

Sample output from this query is as follows:

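Representative output for the trigger sketched above:

TRIGGER_NAME                   STATUS
------------------------------ --------
EMPLOYEES_AFTER_INSERT         ENABLED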

You can query the text of the trigger from DBA_TRIGGERS, as shown in this example:

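A sketch; TRIGGER_BODY is a LONG column, so increase the SQL*Plus LONG setting first:

SET LONG 4000
SELECT trigger_body
  FROM DBA_TRIGGERS
 WHERE trigger_name = 'EMPLOYEES_AFTER_INSERT';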

It is theoretically possible to create a trigger to replicate all possible permutations of data-manipulation actions on the local database, but this quickly becomes difficult to manage. For a complex environment, you should consider the use of materialized views. For the limited circumstances described earlier, triggers are a very easy solution to implement. The overhead of using triggers for replication purposes is significant, however, so if you use this method, be sure to perform enough tests on bigger tables to determine if the overhead is acceptable.

NOTE

If you use triggers for your data replication, the success of a transaction in the master database is dependent on the success of the remote transaction.

Managing Materialized Views

You can use materialized views to aggregate, pre-join, or replicate data. In an enterprise database environment, data generally flows from an online transaction-processing database into a data warehouse. Normally, the data is prestaged, cleansed, or otherwise processed and then moved into the data warehouse. From there, the data may be copied to other databases or data marts.

You can use materialized views to pre-compute and store aggregate information within a database, to dynamically replicate data between distributed databases, and to synchronize data updates within replicated environments. In replication environments, materialized views enable local access to data that would normally have to be accessed remotely. A materialized view may be based on another materialized view.

In large databases, materialized views help to improve the performance of queries that involve aggregates (including sum, count, average, variance, standard deviation, minimum, and maximum) or table joins. Oracle’s query optimizer will automatically recognize that the materialized view could be used to satisfy the query—a feature known as query rewrite.

NOTE

For best results, make sure the statistics on the materialized view are kept current. Since Oracle Database 10g, statistics on all database objects are collected on a regular basis during predefined maintenance windows as part of the automated maintenance tasks infrastructure (AutoTask windows).

You can use initialization parameters to configure the optimizer to automatically rewrite queries to use the materialized views whenever possible. Because materialized views are transparent to SQL applications, they can be dropped or created without any impact on the execution of the SQL code. You can also create partitioned materialized views, and you can base materialized views on partitioned tables.

Unlike regular views, materialized views store data and take up physical space in your database. Materialized views are populated with data generated from their base queries, and they are refreshed on demand or on a scheduled basis. Therefore, whenever the data accessed by the base query changes, the materialized views should be refreshed to reflect the data changes. The data refresh frequency depends on how much data latency your business can tolerate in the processes supported by the materialized views. You’ll see how to establish your refresh rate later in this chapter.

Creating a materialized view creates several objects in the database. The user creating the materialized view must have the CREATE MATERIALIZED VIEW, CREATE TABLE, and CREATE VIEW privileges, as well as the SELECT privilege on any referenced tables owned by another schema. To create the materialized view in another schema, the user must have the CREATE ANY MATERIALIZED VIEW privilege, plus the SELECT privilege on any referenced tables owned by schemas other than the target schema. To enable query rewrite on a materialized view that references tables in another schema, the user must have the GLOBAL QUERY REWRITE privilege or be explicitly granted the QUERY REWRITE privilege on each referenced table in the other schema. The user must also have the UNLIMITED TABLESPACE privilege (or sufficient quota in the target tablespace). A materialized view can be created in the local database to pull data from a remote master database, or it can reside in the same database as the tables on which it is based.

If you plan to use the query rewrite feature, you must put the following entry in your initialization parameter file:

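QUERY_REWRITE_ENABLED = TRUE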

NOTE

If the OPTIMIZER_FEATURES_ENABLE parameter is set to 10.0.0 or higher, then QUERY_REWRITE_ENABLED defaults to TRUE.

A second parameter, QUERY_REWRITE_INTEGRITY, sets the degree to which Oracle must enforce query rewriting. At the safest level, Oracle does not use query rewrite transformations that rely on unenforced relationships. The valid values for QUERY_REWRITE_INTEGRITY are ENFORCED (Oracle enforces and guarantees consistency and integrity), TRUSTED (query rewrite is supported for declared relationships), and STALE_TOLERATED (query rewrite is supported even if the materialized views are inconsistent with their underlying data). By default, QUERY_REWRITE_INTEGRITY is set to ENFORCED.

Materialized View Planning

Before you can create a materialized view, you must make several decisions, including:

- Whether the materialized view is to be populated with data during creation or afterward

- How often the materialized view is to be refreshed

- What type of refresh to perform

- Whether to maintain a materialized view log

- Whether the refresh should be in-place or out-of-place

You can either have data loaded into the materialized view when it is created, using the BUILD IMMEDIATE option of the CREATE MATERIALIZED VIEW command, or add the BUILD DEFERRED clause to create the materialized view without populating it until its first refresh. The advantage of populating the view on creation is that the data is available as soon as you make the materialized view available. However, if the materialized view is not going to be used right away and the underlying data changes rapidly, the data in the materialized view will become stale quickly. If you defer population, the view contains no data until DBMS_MVIEW.REFRESH executes (manually or on schedule), and the first users of the view must wait for it to populate, causing a one-time performance degradation. If a table already contains the materialized view’s data, you can register it as a materialized view by using the ON PREBUILT TABLE clause.

You must also decide how much stale data is tolerable in terms of your company’s needs. You can base your decision on how frequently the data changes in the table on which the materialized view is based. If your management does not have to have up-to-the-minute information on which to base decisions, you might only need to refresh your materialized view once an hour or once a day. If it is critical for the data to be absolutely accurate at all times, you may need to perform fast refreshes every five minutes throughout the day and night.

There are four forms of refresh when specifying a refresh method during materialized view creation: REFRESH COMPLETE, REFRESH FAST, REFRESH FORCE, and NEVER REFRESH. In a fast (incremental) refresh, materialized view logs are used to track the data changes that have occurred within the table since the last refresh. Only changed information is populated back to the materialized view, on a periodic basis, based on the refresh criteria you have established. The materialized view log is maintained in the same database and schema as the master table for the materialized view. Because the fast refresh just applies changes made since the last refresh, the time taken to perform the refresh should generally be very short.

Another type of incremental refresh is partition change tracking (PCT), somewhat of a hybrid between log-based incremental refreshes and full refreshes. If a base table is partitioned, only the materialized view data corresponding to the changed partitions needs to be refreshed.

In a complete refresh, the data within the materialized view is completely replaced each time the refresh is run. The time required to perform a complete refresh of the materialized view can be substantial. You can either have the refresh performed each time transactions are committed on the master table (REFRESH ON COMMIT) or have it performed only when the DBMS_MVIEW.REFRESH procedure is run (REFRESH ON DEMAND).

When you specify REFRESH FORCE, the refresh process first evaluates whether or not a fast refresh can be run. If it can’t, a complete refresh will be performed. If you specify NEVER REFRESH as the refresh option, the materialized view will not be refreshed. If you do not have a materialized view log created and populated, only complete refreshes can be executed. Oracle Database 12c introduces another type of refresh: out-of-place materialized view refresh. During any type of refresh (COMPLETE, FAST, FORCE, or PCT), the current copy of the materialized view is maintained while a new version is built. Once completed, the current version is dropped and the new copy is renamed. This dramatically improves the availability of the materialized view, with the cost being the additional storage required to build a new copy of the materialized view.

Creating a Materialized View

A sample command used to create the materialized view is shown in the following listing. In this example, the materialized view is given a name (STORE_DEPT_SAL_MV) and its storage parameters are specified as well as its refresh interval and the time at which it will be populated with data. In this case, the materialized view is told to use the complete refresh option and to not populate the data until the DBMS_MVIEW.REFRESH procedure is run. Query rewrite is enabled. This materialized view’s base query is as follows:

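A sketch of such a command; the base tables, columns, storage values, and refresh interval are illustrative assumptions:

CREATE MATERIALIZED VIEW STORE_DEPT_SAL_MV
TABLESPACE users
STORAGE (INITIAL 1M NEXT 1M)
BUILD DEFERRED
REFRESH COMPLETE
START WITH SYSDATE NEXT SYSDATE + 1/24   -- refresh every hour
ENABLE QUERY REWRITE
AS
SELECT d.department_name, SUM(e.salary) AS total_salary
  FROM departments d, employees e
 WHERE d.department_id = e.department_id
 GROUP BY d.department_name;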

NOTE

A materialized view query cannot reference tables or views owned by the user SYS.

The following listing shows another example of a materialized view creation, this time using the REFRESH FAST ON COMMIT clause. To support fast refreshes when commits occur, you will need to create a materialized view log on the base table. See “Managing Materialized View Logs” later in this chapter for details.

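A sketch; the materialized view name and the column lists are illustrative, and the logs are shown for completeness:

CREATE MATERIALIZED VIEW LOG ON employees
WITH ROWID, SEQUENCE (department_id, salary)
INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW LOG ON departments
WITH ROWID, SEQUENCE (department_id, department_name)
INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW STORE_DEPT_SAL_FAST_MV
PARALLEL 4
BUILD IMMEDIATE
REFRESH FAST ON COMMIT
ENABLE QUERY REWRITE
AS
SELECT d.department_name,
       SUM(e.salary)   AS total_salary,
       COUNT(e.salary) AS salary_count,   -- required for fast refresh of aggregates
       COUNT(*)        AS row_count       -- required for fast refresh of aggregates
  FROM departments d, employees e
 WHERE d.department_id = e.department_id
 GROUP BY d.department_name;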

In this example, the same base query is used, but the materialized view is created with REFRESH FAST ON COMMIT so that a fast refresh occurs every time a transaction is committed against any of the materialized view’s base tables. This materialized view is populated with data on creation, and the inserted rows are loaded in parallel. Query rewrite is enabled as well.

NOTE

The fast refresh option will not be used unless a materialized view log is created on the base table for the materialized view. Oracle can perform fast refreshes of joined tables in materialized views.

For both of these examples, the materialized view uses the default storage parameters for its tablespace. You can alter the materialized view’s storage parameters via the ALTER MATERIALIZED VIEW command, as in this example:

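A sketch with illustrative values:

ALTER MATERIALIZED VIEW STORE_DEPT_SAL_MV
PCTFREE 10
STORAGE (NEXT 2M);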

The two most frequently used operations against a materialized view are query execution and fast refresh. Each of these actions requires different resources and has different performance requirements. You may index the base table of the materialized view; for example, adding an index to improve query performance. If you have a materialized view that only uses join conditions and fast refresh, indexes on the primary key columns may improve the fast refresh operations. If your materialized view uses both joins and aggregates and is fast refreshable, as shown in the last example, an index is automatically created for the materialized view unless you specify USING NO INDEX in the CREATE MATERIALIZED VIEW command.

To drop a materialized view, use the DROP MATERIALIZED VIEW command:

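DROP MATERIALIZED VIEW STORE_DEPT_SAL_MV;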

The out-of-place option is new as of Oracle Database 12c. You create a materialized view that will be refreshed out-of-place exactly as you would one refreshed in place; the difference lies in the parameters you specify in the call to DBMS_MVIEW.REFRESH.

Using DBMS_MVIEW and DBMS_ADVISOR

There are multiple supplied packages you can use to manage and evaluate your materialized views, including DBMS_MVIEW, DBMS_ADVISOR, and DBMS_DIMENSION.

The DBMS_MVIEW package subprograms are shown in Table 19-1; this package is used to perform management tasks such as evaluating, registering, or refreshing a materialized view.


TABLE 19-1. DBMS_MVIEW Subprograms

To refresh a single materialized view, use DBMS_MVIEW.REFRESH. Its two main parameters are the name of the materialized view to be refreshed and the method to use. For the method, you can specify ‘c’ for a complete refresh, ‘f’ for fast refresh, and ‘?’ for force. Here’s an example:

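EXECUTE DBMS_MVIEW.REFRESH('STORE_DEPT_SAL_MV', 'c');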

If you are refreshing multiple materialized views via a single call to DBMS_MVIEW.REFRESH, list the names of all the materialized views in the first parameter and their matching refresh methods in the second parameter, as in this example:

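A sketch; MV1 is an assumed second materialized view name:

EXECUTE DBMS_MVIEW.REFRESH('MV1, MV2', 'cf');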

In this example, MV1 will be refreshed via a complete refresh, whereas MV2 will be refreshed via a fast refresh.

Refreshing a materialized view using out-of-place refresh is very similar to refreshing a materialized view using in-place refresh, the only difference being one parameter to the procedure DBMS_MVIEW.REFRESH, as in this example:

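A sketch:

EXECUTE DBMS_MVIEW.REFRESH(list => 'STORE_DEPT_SAL_MV', method => 'C', out_of_place => TRUE);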

Since the outside table (the future version of the materialized view to be refreshed) is loaded using direct path I/O, it can be significantly faster than an in-place complete refresh.

You can use a separate procedure in the DBMS_MVIEW package to refresh all the materialized views that are scheduled to be automatically refreshed. This procedure, named REFRESH_ALL, will refresh each materialized view separately. It does not accept any parameters. The following listing shows an example of its execution:

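EXECUTE DBMS_MVIEW.REFRESH_ALL;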

Because the materialized views will be refreshed via REFRESH_ALL consecutively, they are not all refreshed at the same time (in other words, not in parallel). Therefore, a database or server failure during this procedure may cause the local materialized views to be out of sync with each other. In this case, simply rerun this procedure after the database has been recovered. As an alternative, you can create refresh groups, as described in the next section.

Using the SQL Access Advisor

You can use the SQL Access Advisor to generate recommendations for the creation and indexing of materialized views. The SQL Access Advisor may recommend specific indexes (and types of indexes) to improve the performance of joins and other queries. The SQL Access Advisor may also generate recommendations for altering a materialized view so that it supports query rewrite or fast refreshes. You can execute the SQL Access Advisor from within Oracle Enterprise Manager or via executions of the DBMS_ADVISOR package.

NOTE

For best results from the DBMS_ADVISOR package, you should gather statistics about all tables, indexes, and join columns prior to generating recommendations.

To use the SQL Access Advisor, either from Oracle Cloud Control 12c or via DBMS_ADVISOR, perform the following steps:

1. Create a task.

2. Define the workload.

3. Generate recommendations.

4. View and implement recommendations.

You can create a task in one of two ways: by executing the DBMS_ADVISOR.CREATE_TASK procedure or by using the DBMS_ADVISOR.QUICK_TUNE procedure (as shown in the next section).

The workload consists of one or more SQL statements plus the statistics and attributes that relate to the statement. The workload may include all SQL statements for an application. The SQL Access Advisor ranks the entries in the workload according to statistics and business importance. The workload is created using the DBMS_ADVISOR.CREATE_SQLWKLD procedure. To associate a workload with a parent Advisor task, use the DBMS_ADVISOR.ADD_SQLWKLD_REF procedure. If a workload is not provided, the SQL Access Advisor can generate and use a hypothetical workload based on the dimensions defined in your schema.

Once a task exists and a workload is associated with it, you can generate recommendations via the DBMS_ADVISOR.EXECUTE_TASK procedure. The SQL Access Advisor will consider the workload and the system statistics and will attempt to generate recommendations for tuning the application. You can see the recommendations by executing the DBMS_ADVISOR.GET_TASK_SCRIPT function or via data dictionary views. Each recommendation can be viewed via USER_ADVISOR_RECOMMENDATIONS (there are “ALL” and “DBA” versions of this view available as well). To relate recommendations to a SQL statement, you will need to use the USER_ADVISOR_SQLA_WK_STMTS view and USER_ADVISOR_ACTIONS.

NOTE

See Chapter 6 for more examples of using the DBMS_ADVISOR package.

When you execute the GET_TASK_SCRIPT procedure, Oracle generates an executable SQL file that will contain the commands needed to create, alter, or drop the recommended objects. You should review the generated script prior to executing it, particularly noting the tablespace specifications. Later in this chapter, you will see how to use the QUICK_TUNE procedure to simplify the tuning advisor process for a single command.

To tune a single SQL statement, use the QUICK_TUNE procedure of the DBMS_ADVISOR package. QUICK_TUNE has two input parameters, a task name and a SQL statement. Using QUICK_TUNE shields the user from the steps involved in creating workloads and tasks via DBMS_ADVISOR.

For example, the following procedure call evaluates a query:

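A sketch; the task name and the SQL statement are illustrative:

BEGIN
  DBMS_ADVISOR.QUICK_TUNE(
    DBMS_ADVISOR.SQLACCESS_ADVISOR,
    'MV_TUNE_TASK',
    'SELECT department_id, SUM(salary) FROM employees GROUP BY department_id');
END;
/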

NOTE

The user executing this command needs the ADVISOR system privilege.

The recommendations generated by QUICK_TUNE can be viewed via the data dictionary view USER_ADVISOR_ACTIONS, but they are easier to read if you use the DBMS_ADVISOR procedures to generate a script file. The recommendation in this example is to create a materialized view to support the query. Because only one SQL statement was provided, this recommendation is given in isolation and does not consider any other aspects of the database or application.

You can use the CREATE_FILE procedure to automate the generation of a file containing the scripts needed to implement the recommendations. First, create a directory object to hold the file:

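A sketch; the directory path is illustrative:

CREATE DIRECTORY scripts_dir AS '/u01/app/oracle/scripts';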

Next, execute the CREATE_FILE procedure. It has three input variables: the script (generated by GET_TASK_SCRIPT, to which you pass the name of the task), the output directory, and the name of the file to be created.

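A sketch, continuing the MV_TUNE_TASK example:

BEGIN
  DBMS_ADVISOR.CREATE_FILE(
    DBMS_ADVISOR.GET_TASK_SCRIPT('MV_TUNE_TASK'),
    'SCRIPTS_DIR',
    'mv_tune.sql');
END;
/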

The mv_tune.sql file created by the CREATE_FILE procedure will contain commands similar to those shown in the following listing. Depending on the specific version of Oracle, the recommendations may differ.

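A sketch of what such a script might contain; the advisor’s generated object names and clauses will differ:

CREATE MATERIALIZED VIEW LOG ON employees
WITH ROWID, SEQUENCE (department_id, salary)
INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW emp_dept_sal_mv
REFRESH FAST WITH ROWID
ENABLE QUERY REWRITE
AS
SELECT department_id,
       SUM(salary)   AS sum_salary,
       COUNT(salary) AS cnt_salary,
       COUNT(*)      AS cnt_all
  FROM employees
 GROUP BY department_id;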

The MARK_RECOMMENDATION procedure allows you to annotate the recommendation so that it can be skipped during subsequent script generations. Valid actions for MARK_RECOMMENDATION include ACCEPT, IGNORE, IMPLEMENTED, and REJECT.
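A sketch, rejecting recommendation 1 of the MV_TUNE_TASK example:

EXECUTE DBMS_ADVISOR.MARK_RECOMMENDATION('MV_TUNE_TASK', 1, 'REJECT');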

You can use the TUNE_MVIEW procedure of the DBMS_ADVISOR package to generate recommendations for the reconfiguration of your materialized views. TUNE_MVIEW generates two sets of output results, one for the creation of new materialized views and the other for the removal of previously created materialized views. The end result should be a set of materialized views that can be fast refreshed, replacing materialized views that cannot be fast refreshed.

You can view the TUNE_MVIEW output via the USER_TUNE_MVIEW data dictionary view, or you can generate its scripts via the GET_TASK_SCRIPT and CREATE_FILE procedures shown in the previous listings.

The supplied programs for the DBMS_ADVISOR package are shown in Table 19-2.


TABLE 19-2. DBMS_ADVISOR Subprograms

An additional package, DBMS_DIMENSION, provides these two procedures:

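DESCRIBE_DIMENSION   Displays the definition of a dimension, including its levels, hierarchies, and attributes
VALIDATE_DIMENSION   Verifies that the relationships specified in a dimension are valid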

You can use the DBMS_DIMENSION package to validate and display the structure of your dimensions.

Enforcing Referential Integrity Among Materialized Views

The referential integrity between two related tables, both of which have simple materialized views based on them, may not be enforced in their materialized views. If the tables are refreshed at different times, or if transactions are occurring on the master tables during the refresh, it is possible for the materialized views of those tables to not reflect the referential integrity of the master tables.

If, for example, the EMPLOYEES and DEPARTMENTS tables are related to each other via a primary key/foreign key relationship, then simple materialized views of these tables may contain violations of this relationship, including foreign keys without matching primary keys. In this example, that could mean employees in the EMPLOYEES materialized view with DEPTNO values that do not exist in the DEPARTMENTS materialized view.

This problem has a number of potential solutions. First, time the refreshes to occur when the master tables are not in use. Second, perform the refreshes manually (see the following section for information on this) immediately after locking the master tables or quiescing the database. Third, you may join the tables in the materialized view, creating a complex materialized view that will be based on the master tables (which will be properly related to each other). Fourth, you can force the materialized view updates to occur when transactions are committed in the primary database.

Using refresh groups provides another solution to the referential integrity problem. You can collect related materialized views into refresh groups. The purpose of a refresh group is to coordinate the refresh schedules of its members. Materialized views whose master tables have relationships with other master tables are good candidates for membership in refresh groups. Coordinating the refresh schedules of the materialized views will maintain the master tables’ referential integrity in the materialized views as well. If refresh groups are not used, the data in the materialized views may be inconsistent with regard to the master tables’ referential integrity.

Manipulation of refresh groups is performed via the DBMS_REFRESH package. The procedures within that package are MAKE, ADD, SUBTRACT, CHANGE, DESTROY, and REFRESH, as shown in the following examples. Information about existing refresh groups can be queried from the USER_REFRESH and USER_REFRESH_CHILDREN data dictionary views.

NOTE

Materialized views that belong to a refresh group do not have to belong to the same schema, but they do have to be all stored within the same database.

You can create a refresh group by executing the MAKE procedure in the DBMS_REFRESH package, whose calling parameters are shown here:

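The signature, abridged from the package documentation:

DBMS_REFRESH.MAKE (
  name                 IN VARCHAR2,
  list                 IN VARCHAR2,      -- or: tab IN DBMS_UTILITY.UNCL_ARRAY
  next_date            IN DATE,
  interval             IN VARCHAR2,
  implicit_destroy     IN BOOLEAN        := FALSE,
  lax                  IN BOOLEAN        := FALSE,
  job                  IN BINARY_INTEGER := 0,
  rollback_seg         IN VARCHAR2       := NULL,
  push_deferred_rpc    IN BOOLEAN        := TRUE,
  refresh_after_errors IN BOOLEAN        := FALSE,
  purge_option         IN BINARY_INTEGER := NULL,
  parallelism          IN BINARY_INTEGER := NULL,
  heap_size            IN BINARY_INTEGER := NULL);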

All but the first four of the parameters for this procedure have default values that are usually acceptable. The LIST and TAB parameters are mutually exclusive. You can use the following command to create a refresh group for the materialized views named LOCAL_EMP and LOCAL_DEPT:

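A sketch; the daily refresh interval is an assumption:

EXECUTE DBMS_REFRESH.MAKE('EMP_GROUP', 'LOCAL_EMP, LOCAL_DEPT', SYSDATE, 'SYSDATE+1');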

NOTE

The LIST parameter, which is the second parameter in the listing, has a single quote at its beginning and at its end, with none between. In this example, two materialized views—LOCAL_EMP and LOCAL_DEPT—are passed to the procedure via a single parameter.

The preceding command will create a refresh group named EMP_GROUP, with two materialized views as its members. The refresh group name is enclosed in single quotes, as is the list of members—but not each member.

If the refresh group is going to contain a materialized view that is already a member of another refresh group (for example, during a move of a materialized view from an old refresh group to a newly created refresh group), you must set the LAX parameter to TRUE. A materialized view can only belong to one refresh group at a time.

To add materialized views to an existing refresh group, use the ADD procedure of the DBMS_REFRESH package, whose parameters are as follows:

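The signature, abridged:

DBMS_REFRESH.ADD (
  name IN VARCHAR2,
  list IN VARCHAR2,   -- or: tab IN DBMS_UTILITY.UNCL_ARRAY
  lax  IN BOOLEAN := FALSE);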

As with the MAKE procedure, the ADD procedure’s LAX parameter does not have to be specified unless a materialized view is being moved between two refresh groups. When this procedure is executed with the LAX parameter set to TRUE, the materialized view is moved to the new refresh group and is automatically deleted from the old refresh group.

To remove materialized views from an existing refresh group, use the SUBTRACT procedure of the DBMS_REFRESH package, as in the following example:

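A sketch, removing LOCAL_DEPT from the group created earlier:

EXECUTE DBMS_REFRESH.SUBTRACT('EMP_GROUP', 'LOCAL_DEPT');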

As with the MAKE and ADD procedures, a single materialized view or a list of materialized views (separated by commas) may serve as input to the SUBTRACT procedure. You can alter the refresh schedule for a refresh group via the CHANGE procedure of the DBMS_REFRESH package; here are the parameters:

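The signature, abridged:

DBMS_REFRESH.CHANGE (
  name                 IN VARCHAR2,
  next_date            IN DATE           := NULL,
  interval             IN VARCHAR2       := NULL,
  implicit_destroy     IN BOOLEAN        := NULL,
  rollback_seg         IN VARCHAR2       := NULL,
  push_deferred_rpc    IN BOOLEAN        := NULL,
  refresh_after_errors IN BOOLEAN        := NULL,
  purge_option         IN BINARY_INTEGER := NULL,
  parallelism          IN BINARY_INTEGER := NULL,
  heap_size            IN BINARY_INTEGER := NULL);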

The NEXT_DATE parameter is analogous to the START WITH clause in the CREATE MATERIALIZED VIEW command. For example, to change the EMP_GROUP’s schedule so that it will be replicated every three days, you can execute the following command (which specifies a NULL value for the NEXT_DATE parameter, leaving that value unchanged):

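EXECUTE DBMS_REFRESH.CHANGE(name => 'EMP_GROUP', next_date => NULL, interval => 'SYSDATE+3');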

After this command is executed, the refresh cycle for the EMP_GROUP refresh group will be changed to every three days.

NOTE

Refresh operations on refresh groups may take longer than comparable materialized view refreshes. Group refreshes may also require significant undo segment space to maintain data consistency during the refresh.

You can manually refresh a refresh group via the REFRESH procedure of the DBMS_REFRESH package. The REFRESH procedure accepts the name of the refresh group as its only parameter. The command shown here will refresh the refresh group named EMP_GROUP:

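EXECUTE DBMS_REFRESH.REFRESH('EMP_GROUP');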

To delete a refresh group, use the DESTROY procedure of the DBMS_REFRESH package, as shown in the following example. Its only parameter is the name of the refresh group.

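EXECUTE DBMS_REFRESH.DESTROY('EMP_GROUP');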

You may also implicitly destroy the refresh group. If you set the IMPLICIT_DESTROY parameter to TRUE when you create the group with the MAKE procedure, the refresh group will be deleted (destroyed) when its last member is removed from the group (usually via the SUBTRACT procedure).

NOTE

For performance statistics related to materialized view refreshes, query V$MVREFRESH.

Managing Materialized View Logs

A materialized view log is a table that maintains a record of modifications to the master table in a materialized view. It is stored in the same database as the master table and is only used by simple materialized views. The data in the materialized view log is used during fast refreshes. If you are going to use fast refreshes, create the materialized view log before creating the materialized view.

To create a materialized view log, you must be able to create an AFTER ROW trigger on the table, so you need CREATE TRIGGER and CREATE TABLE privileges. You cannot specify a name for the materialized view log.

Because the materialized view log is a table, it has the full set of table storage clauses available to it. The following example shows the creation of a materialized view log on a table named EMPLOYEES:

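A sketch; the tablespace and storage values are illustrative:

CREATE MATERIALIZED VIEW LOG ON EMPLOYEES
PCTFREE 0
TABLESPACE users
STORAGE (INITIAL 100K NEXT 100K);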

The PCTFREE value for the materialized view log can be set very low (even 0), since there will not be any updates to this table! The size of the materialized view log depends on the number of changes that will be processed during each refresh. The more frequently all the materialized views that reference the master table are refreshed, the less space is needed for the log.

You can modify the storage parameters for the materialized view log via the ALTER MATERIALIZED VIEW LOG command. When using this command, specify the name of the master table. An example of altering the EMPLOYEES table’s materialized view log is shown in the following listing:

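A sketch with an illustrative storage value:

ALTER MATERIALIZED VIEW LOG ON EMPLOYEES
STORAGE (NEXT 200K);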

To drop a materialized view log, use the DROP MATERIALIZED VIEW LOG command, as in this example:

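DROP MATERIALIZED VIEW LOG ON EMPLOYEES;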

Purging the Materialized View Log

The materialized view log contains transient data; records are inserted into the log, used during refreshes, and then deleted. If multiple materialized views use the same master table, they share the same materialized view log. If one of the materialized views is not refreshed for a long period, the materialized view log may never delete any of its records. As a result, the space requirements of the materialized view log will grow.

To reduce the space used by log entries, you can use the PURGE_LOG procedure of the DBMS_MVIEW package. PURGE_LOG takes three parameters: the name of the master table, a NUM variable, and a DELETE flag. The NUM variable specifies the number of least recently refreshed materialized views whose rows will be removed from the materialized view log. For example, if you have three materialized views that use the materialized view log and one of them has not been refreshed for a very long time, you would use a NUM value of 1.

The following listing shows an example of the PURGE_LOG procedure. In this example, the EMPLOYEES table’s materialized view log will be purged of the entries required by the least recently used materialized view:

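A sketch:

EXECUTE DBMS_MVIEW.PURGE_LOG(master => 'EMPLOYEES', num => 1, flag => 'DELETE');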

To further support maintenance efforts, Oracle provides two materialized view-specific options for the TRUNCATE command; if you want to truncate the master table without losing its materialized view log entries, you can use the TRUNCATE command with options like the following:

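TRUNCATE TABLE EMPLOYEES PRESERVE MATERIALIZED VIEW LOG;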

If the EMPLOYEES table’s materialized views are based on primary key values (the default behavior), the materialized view log values will still be valid following an export/import of the EMPLOYEES table. However, if the EMPLOYEES table’s materialized views are based on ROWID values, the materialized view log would be invalid following an export/import of the base table (since different ROWIDs will most likely be assigned during the import). In that case, you should truncate the materialized view log when you truncate the base table, as in this example:

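TRUNCATE TABLE EMPLOYEES PURGE MATERIALIZED VIEW LOG;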

What Kind of Refreshes Can Be Performed?

To see what kind of refresh and rewrite capabilities are possible for your materialized views, you can query the MV_CAPABILITIES_TABLE table. The capabilities may change between versions, so you should reevaluate your refresh capabilities following Oracle software upgrades. To create this table, execute the utlxmv.sql script located in the directory $ORACLE_HOME/rdbms/admin.

The columns of MV_CAPABILITIES_TABLE are shown here:

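The table, as created by utlxmv.sql, contains columns along these lines (consult the script itself for the authoritative definition):

STATEMENT_ID     Optional identifier supplied by the caller
MVOWNER          Owner of the materialized view
MVNAME           Name of the materialized view
CAPABILITY_NAME  Name of the capability (for example, REFRESH_FAST or REWRITE)
POSSIBLE         Whether the capability is possible (T) or not (F)
RELATED_TEXT     Object related to the message, if any
RELATED_NUM      Numeric value related to the message, if any
MSGNO            Message number, if any
MSGTXT           Explanation of why the capability is unavailable
SEQ              Ordering sequence number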

To populate MV_CAPABILITIES_TABLE, execute the procedure DBMS_MVIEW.EXPLAIN_MVIEW, using the name of the materialized view as the input value, as in this example:

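EXECUTE DBMS_MVIEW.EXPLAIN_MVIEW('LOCAL_BOOKSHELF');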

The comments in the utlxmv.sql script itself provide guidance on the interpretation of these column values.

Once the EXPLAIN_MVIEW procedure has been executed, you can query the MV_CAPABILITIES_TABLE to determine your options:

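A sketch:

SELECT capability_name, possible, msgtxt
  FROM MV_CAPABILITIES_TABLE
 WHERE mvname = 'LOCAL_BOOKSHELF'
 ORDER BY seq;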

For the LOCAL_BOOKSHELF materialized view, the query returns the following:

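Representative output, consistent with the discussion that follows (the actual capability rows and message text vary by release):

CAPABILITY_NAME            P MSGTXT
-------------------------- - --------------------------------------------
REFRESH_COMPLETE           Y
REFRESH_FAST               N
REWRITE                    N query rewrite is disabled on the materialized view
REFRESH_FAST_AFTER_INSERT  N the detail table does not have a materialized view log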

Because the QUERY REWRITE clause was not specified during the creation of this materialized view, the query rewrite capabilities are disabled for the LOCAL_BOOKSHELF table. Fast refresh capabilities are not supported, because the base table does not have a materialized view log. If you change your materialized view or its base table, you should regenerate the data in MV_CAPABILITIES_TABLE to see the new options.

As shown in the preceding listing, the LOCAL_BOOKSHELF materialized view cannot use a fast refresh because its base table does not have a materialized view log. Here are some other constraints that will limit your ability to use fast refreshes:

Image The materialized view must not contain references to nonrepeating expressions such as SYSDATE and ROWNUM.

Image The materialized view must not contain references to RAW or LONG RAW datatypes.

Image For materialized views based on joins, ROWIDs from all tables in the FROM list must be part of the SELECT list.

Image If there are outer joins, all the joins must be connected by ANDs, the WHERE clause must contain no other selection criteria, and unique constraints must exist on the join columns of the inner join table.

Image For materialized views based on aggregates, the materialized view logs must contain all columns from the referenced tables, must specify the ROWID and INCLUDING NEW VALUES clauses, and must specify the SEQUENCE clause.

See the Oracle Database Data Warehousing Guide 12c Release 1 (12.1) for additional restrictions related to fast refreshes of complex aggregates.
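As an illustration of the last bullet point, a materialized view log that could support fast refresh of a simple aggregate on a hypothetical SALES table might be created as follows:

CREATE MATERIALIZED VIEW LOG ON sales
  WITH ROWID, SEQUENCE (region, amount)
  INCLUDING NEW VALUES;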

NOTE

You can specify an ORDER BY clause in the CREATE MATERIALIZED VIEW command. The ORDER BY clause will only affect the initial creation of the materialized view; it will not affect any refreshes.

Using Materialized Views to Alter Query Execution Paths

For a large database, a materialized view may offer several performance benefits. You can use materialized views to influence the optimizer to change the execution paths for queries. This feature, called query rewrite, enables the optimizer to use a materialized view in place of the table queried by the materialized view, even if the materialized view is not named in the query. For example, if you have a large SALES table, you may create a materialized view that sums the SALES data by region. If a user queries the SALES table for the sum of the SALES data for a region, Oracle can redirect that query to use your materialized view in place of the SALES table. As a result, you can reduce the number of accesses against your largest tables, thus improving the system performance. Further, because the data in the materialized view is already grouped by region, summarization does not have to be performed at the time the query is issued.

NOTE

You must specify ENABLE QUERY REWRITE in the materialized view definition for the view to be used as part of a query rewrite operation.
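For example, a rewrite-enabled summary of the hypothetical SALES table described earlier might be defined as follows (all names illustrative):

CREATE MATERIALIZED VIEW sales_by_region_mv
  BUILD IMMEDIATE
  REFRESH COMPLETE ON DEMAND
  ENABLE QUERY REWRITE
AS
SELECT region, SUM(amount) AS total_amount
  FROM sales
 GROUP BY region;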

To use the query rewrite capability effectively, you should create a dimension that defines the hierarchies within the table’s data. To execute the CREATE DIMENSION command, you must have the CREATE DIMENSION system privilege. In this example, countries are part of continents, so you can create tables and dimensions to support this hierarchy:

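A minimal sketch, with illustrative CONTINENTS and COUNTRIES tables:

CREATE TABLE continents (
  continent_name VARCHAR2(30) PRIMARY KEY);

CREATE TABLE countries (
  country_name   VARCHAR2(30) PRIMARY KEY,
  continent_name VARCHAR2(30) NOT NULL REFERENCES continents);

CREATE DIMENSION geography
  LEVEL country   IS (countries.country_name)
  LEVEL continent IS (continents.continent_name)
  HIERARCHY country_rollup (
    country CHILD OF continent
    JOIN KEY (countries.continent_name) REFERENCES continent);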

To enable a materialized view for query rewrite, you must place all the master tables for the materialized view in the materialized view’s schema, and you must have the QUERY REWRITE system privilege. In general, you should create materialized views in the same schema as the tables on which they are based; otherwise, you will need to manage the permissions and grants required to create and maintain the materialized views.

NOTE

You can enable or disable query rewrite at the SQL statement level via the REWRITE and NOREWRITE hints. When using the REWRITE hint, you can specify materialized views for the optimizer to consider.

For query rewrite to be enabled, you must set the following initialization parameters:

Image OPTIMIZER_MODE = ALL_ROWS or FIRST_ROWS or FIRST_ROWS_n

Image QUERY_REWRITE_ENABLED = TRUE

Image QUERY_REWRITE_INTEGRITY = STALE_TOLERATED, TRUSTED, or ENFORCED

By default, QUERY_REWRITE_INTEGRITY is set to ENFORCED; in this mode, all constraints must be validated. The optimizer only uses fresh data from the materialized views and only uses those relationships that are based on ENABLED and VALIDATED primary, unique, or foreign key constraints. In TRUSTED mode, the optimizer trusts that the data in the materialized view is fresh and that the relationships declared in dimensions and constraints are correct. In STALE_TOLERATED mode, the optimizer uses materialized views that are valid but contain stale data as well as those that contain fresh data.
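Both rewrite-related parameters can also be changed at the session level for testing:

ALTER SESSION SET query_rewrite_enabled = TRUE;
ALTER SESSION SET query_rewrite_integrity = TRUSTED;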

If you set QUERY_REWRITE_ENABLED to FORCE, the optimizer will rewrite queries to use materialized views even when the estimated query cost of the original query is lower.

If query rewrite occurs, the explain plan for the query will list the materialized view as one of the objects accessed, along with an operation listed as “MAT_VIEW REWRITE ACCESS.” You can use the DBMS_MVIEW.EXPLAIN_REWRITE procedure to see if rewrite is possible for a query and which materialized views would be involved. If the query cannot be rewritten, the procedure will document the reasons.

NOTE

Query rewrite decisions are based on the costs of the different execution paths, so your statistics should be kept up to date.

EXPLAIN_REWRITE takes three input parameters—the query, a materialized view name, and a statement identifier—and can store its output in a table. Oracle provides the CREATE TABLE command for the output table in a script named utlxrw.sql in the $ORACLE_HOME/rdbms/admin directory. The utlxrw.sql script creates a table named REWRITE_TABLE.

You can query REWRITE_TABLE for the original cost, rewritten cost, and the optimizer’s decision. The MESSAGE column will display the reasons for the optimizer’s decision.
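A sketch of the procedure, assuming the hypothetical SALES_BY_REGION_MV from the earlier example:

@?/rdbms/admin/utlxrw.sql

BEGIN
  DBMS_MVIEW.EXPLAIN_REWRITE(
    query        => 'SELECT region, SUM(amount) FROM sales GROUP BY region',
    mv           => 'SALES_BY_REGION_MV',
    statement_id => 'rw_test1');
END;
/

SELECT message, original_cost, rewritten_cost
  FROM rewrite_table
 WHERE statement_id = 'rw_test1';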

If you have used the BUILD DEFERRED option of the CREATE MATERIALIZED VIEW or ALTER MATERIALIZED VIEW command, the query rewrite feature will not be enabled until after the first time the materialized view is refreshed.

NOTE

If bind variables have been used within the query, the optimizer will not rewrite it even though query rewrite has been enabled.

Managing Distributed Transactions

A single logical unit of work may include transactions against multiple databases. For example, a COMMIT may be executed after two tables in separate databases have been updated. Oracle will transparently maintain the integrity between the two databases by ensuring that all the transactions involved either COMMIT or roll back (using the ROLLBACK command or a session failure) as a group. This is accomplished automatically via Oracle’s Two-Phase Commit (2PC) mechanism.

The first phase of the 2PC is the prepare phase. In this phase, each database instance involved in the transaction prepares the information it will need to either COMMIT or roll back its portion of the work. Once prepared, an instance is said to be “in doubt.” The destination instances notify the initiating instance for the transaction (known as the global coordinator) of their status.

Once all instances are prepared, the transaction enters the commit phase, and all nodes are instructed to COMMIT their portion of the logical transaction. The databases all COMMIT the data at the same logical time, preserving the integrity of the distributed data.

NOTE

All databases that perform a COMMIT in a distributed transaction use the same System Change Number (SCN), which is the highest SCN of all databases involved in the transaction.

Resolving In-Doubt Transactions

Transactions against standalone databases may fail due to problems with the database server; for example, there may be a media failure. Working with distributed databases increases the number of potential failure causes during a set of related transactions.

When a distributed transaction is pending, an entry for that transaction will appear in the DBA_2PC_PENDING data dictionary view. When the transaction completes, its DBA_2PC_PENDING record is removed. If the transaction is pending but is not able to complete, its record stays in DBA_2PC_PENDING.
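A representative check of the view:

SELECT local_tran_id, global_tran_id, state, mixed, fail_time
  FROM dba_2pc_pending;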

The RECO (Recoverer) background process periodically checks the DBA_2PC_PENDING view for distributed transactions that failed to complete. Using the information there, the RECO process on a node will automatically attempt to recover the local portion of an in-doubt transaction. It then attempts to establish connections to any other databases involved in the transaction and resolves the distributed portions of the transaction. The related rows in the DBA_2PC_PENDING view in each database are then removed.

NOTE

You can enable and disable the RECO process via the ENABLE DISTRIBUTED RECOVERY and DISABLE DISTRIBUTED RECOVERY clauses of the ALTER SYSTEM command.

The recovery of distributed transactions is performed automatically by the RECO process. You can manually recover the local portion of a distributed transaction, but doing so usually leaves the distributed databases inconsistent, because the remote data will be out of sync with the locally forced result.
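If manual intervention is unavoidable, the local portion can be forced using the local transaction ID recorded in DBA_2PC_PENDING (the ID shown is illustrative):

COMMIT FORCE '1.21.17';    -- or: ROLLBACK FORCE '1.21.17';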

To minimize the number of distributed recoveries necessary, you can influence the way a distributed transaction is processed by assigning each database a commit point strength, which tells Oracle which site should coordinate the commit.

Commit Point Strength

Each set of distributed transactions may reference multiple hosts and databases. Of those, one host and database can normally be singled out as being the most reliable, or as owning the most critical data. This database is known as the commit point site. If data is committed there, it should be committed for all databases. If the transaction against the commit point site fails, the transactions against the other nodes are rolled back. The commit point site also stores information about the status of the distributed transaction.

The commit point site will be selected by Oracle based on each database’s commit point strength. This is set via the initialization file, as shown in the following example:

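A minimal sketch of the parameter file entry; because the parameter is static, a change made via ALTER SYSTEM requires SCOPE=SPFILE and a restart:

COMMIT_POINT_STRENGTH = 100

ALTER SYSTEM SET commit_point_strength = 100 SCOPE=SPFILE;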

The values for the COMMIT_POINT_STRENGTH parameter are set on a scale relative to other nodes participating in distributed transactions. In the preceding example, the value is set to 100 (the default is 1). If another database has a value of 200 for this parameter, that database would be the commit point site for a distributed transaction involving those two databases. The COMMIT_POINT_STRENGTH cannot exceed 255.

Because the scale is relative, you should set up a site-specific scale. Set the commit point strength of your most reliable database to 200. Then, grade the other servers and databases relative to the most reliable database. If, for example, another database is only 80 percent as reliable as the most reliable database, assign it a commit point strength of 160 (80 percent of 200). Fixing a single database at a definite point (in this case, 200) allows the rest of the databases to be graded on an even scale. This scale should result in the proper commit point site being used for each transaction.

Monitoring Distributed Databases

Several key environmental performance measures must be taken into account for databases:

Image The performance of the host

Image The distribution of I/O across disks and controllers

Image The usage of available memory

For distributed databases, you must also consider the following:

Image The capacity of the network and its hardware

Image The load on the network segments

Image The usage of different physical access paths between hosts

None of these can be measured from within the database. The focus of monitoring efforts for distributed databases shifts from being database-centric to being network-centric. The database becomes one part of the monitored environment, rather than the only part that is monitored.

You still need to monitor those aspects of the database that are critical to its success, such as the free space in tablespaces. However, the performance of distributed databases cannot be measured except as part of the performance of the network that supports them. Therefore, all performance-related tests, such as stress tests, must be coordinated with the network management staff. That staff may also be able to verify the effectiveness of your attempts to reduce the database load on the network.

The performance of the individual hosts can usually be monitored via a network monitoring package. This monitoring is performed in a top-down fashion, from network to host to database. Use the monitoring system described in Chapter 6 as an extension to the network and host monitors.

Tuning Distributed Databases

When you’re tuning a standalone database, the goal is to reduce the amount of time it takes to find data. As described in Chapter 8, you can use a number of database structures and options to increase the likelihood that the data will be found in the buffer cache or via an index.

When working with distributed databases, you have an additional consideration: Because data is now not only being retrieved but also being shipped across the network, the performance of a query is made up of the performance of these two steps. You must therefore consider the ways in which data is being transferred across the network, with a goal of reducing the network traffic.

A simple way to reduce network traffic is to replicate data from one node to another. You can do this manually (via the SQL*Plus COPY command), or it can be done automatically by the database (via materialized views). Replicating data improves the performance of queries against remote databases by bringing the data across the network once, usually during a slow period on the local host. Local queries can use the local copy of the data, eliminating the network traffic that would otherwise be required.
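For example, a SQL*Plus COPY of a remote EMPLOYEES table might look like the following sketch (credentials and connect string illustrative):

COPY FROM hr/hr@hq -
CREATE employees_copy -
USING SELECT * FROM employees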

Let’s consider a simple task: selecting a value from a sequence. A company has created a distributed application in which a new sequence value is generated for each row. However, the sequence is local, whereas the insert is performed in a remote database. Because the trigger that generates the sequence value executes for each row, each insert generates a remote operation to fetch the next sequence value.
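A sketch of such a design, with all object names hypothetical:

CREATE OR REPLACE TRIGGER orders_bi
BEFORE INSERT ON orders
FOR EACH ROW
BEGIN
  -- one remote round trip per inserted row
  SELECT order_seq.NEXTVAL@hq_link INTO :NEW.order_id FROM dual;
END;
/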

The impact of this design is apparent when a session’s trace file is examined:

image

In this case, the query is very simple—it selects the next value of the sequence from the DUAL table. But the sequence is remote (as seen in the execution plan), so the time required to fetch the values is 6.29 seconds for 53 rows, out of a total of 6.5 seconds. To tune the application, you either need to reduce the number of trips (such as by performing batch operations instead of row-by-row operations) or eliminate the remote architecture component of the INSERT. If the remote object (the sequence) and the local object (the DUAL table) can reside on the same database, the wait times associated with the remote operations can be eliminated.

NOTE

Since Oracle Database 10g, the DUAL table is an internal table, not a physical table, and therefore does not generate consistent gets as long as you don’t use * as the column list in a query referencing DUAL.

Two problems commonly arise with replicated solutions: First, the local data may become out of sync with the remote data. This is a standard problem with derived data; it limits the usefulness of this option to tables whose data is fairly static. Even if a simple materialized view is used with a materialized view log, the data will not be refreshed continuously—only when scheduled.

The second problem with the replicated data solution is that the copy of the table may not be able to pass updates back to the master table. That is, if a read-only materialized view is used to make a local copy of a remote table, the local copy cannot be updated. If you are using materialized views, you can use updatable materialized views to send changes back to the master site, or you can use writable materialized views to support local ownership of data.

Any updates that must be processed against replicas must also be performed against the master tables. If the table is frequently updated, then replicating the data will not improve your performance unless you are using Oracle’s multimaster replication options. When there is multisite ownership of data, users can make changes in any database designated as an owner of the data. The management of Oracle’s multimaster replication is very involved and requires creating a database environment (with database links and so on) specifically designed to support multidirectional replication of data. See the Oracle replication documentation for details on implementing a multimaster environment.

The performance of the refreshes generally won’t concern your users. What may concern them is the validity and timeliness of the data. If the remote tables are frequently modified and are of considerable size, you are almost forced to use simple materialized views with materialized view logs to keep the data current. Performing complete refreshes in the middle of a workday is generally unacceptable. Therefore, it is the frequency of the refreshes rather than the size of them that determines which type of materialized view will better serve the users. After all, users are most concerned about the performance of the system while they are using it; refreshes performed late at night do not directly affect them. If the tables need to be frequently synchronized, use simple materialized views with materialized view logs.

As was noted previously in this chapter, you may index the underlying tables that are created by the materialized view in the local database. Indexing should also help to improve query performance, at the expense of slowing down the refreshes.
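For example, assuming the LOCAL_BOOKSHELF materialized view and an illustrative TITLE column, you could index the local container table directly:

CREATE INDEX local_bookshelf_title_ix ON local_bookshelf (title);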

Another means of reducing network traffic, via remote procedure calls, is described in Chapter 8. That chapter also includes information on tuning SQL and the application design. If the database was properly structured, tuning the way the application processes data will yield the most significant performance improvements.

Summary

Distributed databases spread out the workload in a database environment to improve both performance and availability. However, merely spreading out the database or replicating a database across multiple locations is only valuable if you can ensure transactional integrity at each site. Therefore, Oracle uses Two-Phase Commit to ensure that a distributed transaction is treated atomically with a single COMMIT point.

To further enhance transparency and usability, you can use database links between databases to give you the flexibility to change the actual location of a database object without any changes to the application or any awareness of the actual location of the database objects to end users.

Use of materialized views is another key element in a distributed environment or even in a standalone environment. Creating a materialized view pre-aggregates the results of a query to improve performance for users who may run the query several times a day and not be aware that they are accessing the aggregate and not the actual tables in the query. The materialized view can be kept up to date on a continuous basis whether the source tables are all in a single database or are in several databases in a distributed environment.
