Oracle Processes - Expert Oracle Database Architecture, Third Edition (2014)

Expert Oracle Database Architecture, Third Edition (2014)

Chapter 5. Oracle Processes

We’ve reached the last piece of the architecture puzzle. We’ve investigated the database and the set of physical files that constitute a database. In covering the memory used by Oracle, we’ve looked at one half of an instance. The last remaining architectural issue to cover is the set of processesthat constitute the other half of the instance.

Each process in Oracle will perform a particular task or set of tasks, and each will have internal memory (PGA memory) allocated by it to perform its job. An Oracle instance has three broad classes of processes:

· Server processes: These perform work based on a client’s request. We have already looked at dedicated and shared servers to some degree. These are the server processes.

· Background processes: These are the processes that start up with the database and perform various maintenance tasks, such as writing blocks to disk, maintaining the online redo log, cleaning up aborted processes, maintaining Automatic Workload Repository (AWR) and so on.

· Slave processes: These are similar to background processes, but they are processes that perform extra work on behalf of either a background or a server process.

Some of these processes, such as the database block writer (DBWn) and the log writer (LGWR), have cropped up already, but here we’ll take a closer look at the function of each, and what each does and why.

Image Note When I use the term process in this chapter, consider it to be synonymous with the term thread on ­operating systems where Oracle is implemented with threads (such as Windows). In the context of this chapter, I use the term process to cover both processes and threads. If you are using an implementation of Oracle that is multiprocess, such as you see on UNIX/Linux, the term process is totally appropriate. If you are using a single-process implementation of Oracle, such as you see on Windows, the term process will actually mean thread within the Oracle process. So, for example, when I talk about the DBWn process, the equivalent on Windows is the DBWn thread within the Oracle process.

Server Processes

Server processes are those that perform work on behalf of a client session. They are the processes that ultimately receive and act on the SQL statements our applications send to the database.

In Chapter 2, we briefly touched on the two main connection types to Oracle, namely the following:

· Dedicated server: There is a dedicated process on the server for your connection. There is a one-to-one mapping between a connection to the database and a server process or thread.

· Shared server: Many sessions share a pool of server processes spawned and managed by the Oracle instance. Your connection is to a database dispatcher, not to a dedicated server process created just for your connection.

Image Note It is important to understand the difference between a connection and a session in Oracle terminology. A connection is just a physical path between a client process and an Oracle instance (e.g., a network connection between you and the instance). A session, on the other hand, is a logical entity in the database, where a client process can execute SQL and so on. Many independent sessions can be associated with a single connection, and these sessions can even exist independently of a connection. We will discuss this further shortly.

Both dedicated and shared server processes have the same job: they process all of the SQL you give to them. When you submit a SELECT * FROM EMP query to the database, an Oracle dedicated/shared server process parses the query and places it into the shared pool (or finds it in the shared pool already, hopefully). This process comes up with the query plan, if necessary, and executes the query plan, perhaps finding the necessary data in the buffer cache or reading the data from disk into the buffer cache.

These server processes are the workhorse processes. Often, you will find these processes to be the highest consumers of CPU time on your system, as they are the ones that do your sorting, your summing, your joining—pretty much everything.

Dedicated Server Connections

In dedicated server mode, there will be a one-to-one mapping between a client connection and a server process (or thread, as the case may be). If you have 100 dedicated server connections on a UNIX/Linux machine, there will be 100 processes executing on their behalf. Graphically, it looks as shown in Figure 5-1.


Figure 5-1. Typical dedicated server connection

Your client application will have Oracle libraries linked into it. These libraries provide the APIs you need in order to talk to the database. These APIs know how to submit a query to the database and process the cursor that is returned. They know how to bundle your requests into network calls that the dedicated server will know how to unbundle. This piece of software is called Oracle Net, although in prior releases you might have known it as SQL*Net or Net8. This is the networking software/protocol that Oracle employs to allow for client/server processing (even in an n-tier architecture, there is a client/server program lurking). Oracle employs this same architecture even if Oracle Net is not technically involved in the picture. That is, even when the client and server are on the same machine this two-process (also known as two-task) architecture is still employed. This architecture provides two benefits:

· Remote execution: It is very natural for the client application to be executing on a machine other than the database itself.

· Address space isolation: The server process has read-write access to the SGA. An errant pointer in a client process could easily corrupt data structures in the SGA if the client process and server process were physically linked together.

In Chapter 2, we saw how these dedicated servers are spawned or created by the Oracle listener process. I won’t cover that process again; rather, we’ll quickly look at what happens when the listener isn’t involved. The mechanism is much the same as it was with the listener, but instead of the listener creating the dedicated server via a fork()/exec() in UNIX/Linux or an interprocess communication (IPC) call in Windows, the client process itself creates it.

Image Note There are many variants of the fork() and exec() calls, such as vfork() and execve(). The call used by Oracle may vary by operating system and implementation, but the net effect is the same. fork() creates a new process that is a clone of the parent process; on UNIX/Linux, this is the only way to create a new process. exec() loads a new program image over the existing program image in memory, thus starting a new program. So, SQL*Plus can fork (copy itself) and then exec the Oracle binary, the dedicated server, overlaying the copy of itself with this new program.

We can see this parent/child process creation clearly on UNIX/Linux when we run the client and server on the same machine:

$ sqlplus eoda/foo
SQL*Plus: Release Production on Thu Mar 20 14:29:00 2014
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Last Successful login time: Thu Mar 20 2014 13:47:01 -07:00
Connected to:
Oracle Database 12c Enterprise Edition Release - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

EODA@ORA12CR1> select a.spid dedicated_server, b.process clientpid
2 from v$process a, v$session b
3 where a.addr = b.paddr
4 and b.sid = sys_context('userenv','sid');

------------------------ ------------------------
18571 18570

EODA@ORA12CR1> !/bin/ps -fp 18571 18570
oracle 18570 11782 0 15:17 pts/4 S+ 0:00 sqlplus
oracle 18571 18570 0 15:17 ? Ss 0:00 oracleORA12CR1(DESCRIPTION=(LOCAL=...

Here, I used a query to discover the process ID (PID) associated with my dedicated server (the SPID from V$PROCESS is the operating system PID of the process that was being used during the execution of that query). The output of /bin/ps –fp includes the parent process id (PPID) and shows the dedicated server process, 18571, is the child of my SQL*Plus process: process id 18570.

Shared Server Connections

Let’s now take a look at the shared server process in more detail. This type of connection mandates the use of Oracle Net even if the client and server are on the same machine—you cannot use a shared server without using the Oracle TNS listener. As described earlier, the client application will connect to the Oracle TNS listener and will be redirected or handed off to a dispatcher. The dispatcher acts as the conduit between the client application and the shared server process. Figure 5-2 is a diagram of the architecture of a shared server connection to the database.


Figure 5-2. Typical shared server connection

Here, we can see that the client applications, with the Oracle libraries linked in, will be physically connected to a dispatcher process. We may have many dispatchers configured for any given instance, but it is not uncommon to have just one dispatcher for many hundreds—even thousands—of users. The dispatcher is simply responsible for receiving inbound requests from the client applications and putting them into a request queue in the SGA. The first available shared server process from the pool of pre-created shared server processes will pick up the request from the queue and attach the UGA of the associated session (the boxes labeled “S” in Figure 5-2). The shared server will process that request and place any output from it into the response queue. The dispatcher constantly monitors the response queue for results and transmits them back to the client application. As far as the client is concerned, it can’t really tell if it is connected via a dedicated server or a shared connection—they appear to be the same. Only at the database level is the difference apparent.

Database Resident Connection Pooling (DRCP)

Database Resident Connection Pooling (DRCP) is an optional method of connecting to the database and establishing a session. It is designed as a more efficient method of connection pooling for application interfaces that do not support efficient connection pooling natively—such as PHP, a general purpose web scripting language. DRCP is a mixture of dedicated server and shared server concepts. It inherits from a shared server the concept of server process pooling, only the processes being pooled will be dedicated servers, not shared servers; it inherits from the dedicated server the concept of—well—being dedicated.

In a shared server connection, the shared server process is shared among many sessions and a single session will tend to use many shared servers. With DRCP, this is not true; the dedicated server process that is selected from the pool will become dedicated to the client process for the life of its session. In a shared server, if I execute three statements against the database in my session, there is a good chance that the three statements will be executed by three different shared server processes. Using DRCP, those same three statements would be executed by the dedicated server assigned to me from the pool—that dedicated server would be mine until my session releases it back to the pool. So DRCP has the pooling capabilities of a shared server and the performance characteristics of a dedicated server. We’ll explore performance of a dedicated versus a shared server more later in this chapter.

Connections vs. Sessions

It surprises many people to discover that a connection is not synonymous with a session. In most people’s eyes they are the same, but the reality is they do not have to be. A connection may have zero, one, or more sessions established on it. Each session is separate and independent, even though they all share the same physical connection to the database. A commit in one session does not affect any other session on that connection. In fact, each session using that connection could use different user identities!

In Oracle, a connection is simply a physical circuit between your client process and the database instance—a network connection, most commonly. The connection may be to a dedicated server process or to a dispatcher. As previously stated, a connection may have zero or more sessions, meaning that a connection may exist with no corresponding sessions. Additionally, a session may or may not have a connection. Using advanced Oracle Net features such as connection pooling, a physical connection may be dropped by a client, leaving the session intact (but idle). When the client wants to perform some operation in that session, it would reestablish the physical connection. Let’s define these terms in more detail:

· Connection: A connection is a physical path from a client to an Oracle instance. A connection is established either over a network or over an IPC mechanism. A connection is typically between a client process and either a dedicated server or a dispatcher. However, using Oracle’s Connection Manager (CMAN), a connection may be between a client and CMAN, and CMAN and the database. Coverage of CMAN is beyond the scope of this book, but Oracle Database Net Services Administrator’s Guide (freely available from covers it in some detail.

· Session: A session is a logical entity that exists in the instance. It is your session state, or a collection of data structures in memory that represents your unique session. It is what would come first to most people’s minds when thinking of a database connection. It is your session on the server, where you execute SQL, commit transactions, and run stored procedures.

We can use SQL*Plus to see connections and sessions in action, and also to recognize that it could be a very common thing indeed for a connection to have more than one session. We’ll simply use the AUTOTRACE command and discover that we have two sessions! Over a single connection, using a single process, we’ll establish two sessions. Here is the first:

EODA@ORA12CR1> select username, sid, serial#, server, paddr, status
2 from v$session
3 where username = USER
4 /

--------------- ---------- ---------- --------- ---------------- --------
EODA 10 15 DEDICATED 00000000727FE9B0 ACTIVE

Now, that shows right now that we have one session: a single dedicated server–connected session. The PADDR column is the address of our sole dedicated server process. Next, we turn on AUTOTRACE to see the statistics of statements we execute in SQL*Plus:

EODA@ORA12CR1> set autotrace on statistics
EODA@ORA12CR1> select username, sid, serial#, server, paddr, status
2 from v$session
3 where username = USER
4 /

--------------- ---------- ---------- --------- ---------------- --------
EODA 10 15 DEDICATED 00000000727FE9B0 ACTIVE

8 recursive calls
0 db block gets
2 consistent gets
0 physical reads
0 redo size
1004 bytes sent via SQL*Net to client
543 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2 rows processed

EODA@ORA12CR1> set autotrace off

In doing so, we now have two sessions, but both are using the same single dedicated server process, as evidenced by them both having the same PADDR value. We can confirm in the operating system that no new processes were created and that we are using a single process—a single connection—for both sessions. Note that one of the sessions (the original session) has a status of ACTIVE. That makes sense: it is running the query to show this information, so of course it is active. But that INACTIVE session—what is that one for? That is the AUTOTRACE session. Its job is to watch our real session and report on what it does.

When we enable AUTOTRACE in SQL*Plus, SQL*Plus will perform the following actions when we execute DML operations (INSERT, UPDATE, DELETE, SELECT, and MERGE):

1. It will create a new session using the current connection, if the secondary session does not already exist.

2. It will ask this new session to query the V$SESSTAT view to remember the initial statistics values for the session in which we will run the DML. This is very similar to the function the watch_stat.sql script performed for us in Chapter 4.

3. It will run the DML operation in the original session.

4. Upon completion of that DML statement, SQL*Plus will request the other session to query V$SESSTAT again and produce the report displayed previously showing the difference in the statistics for the session that executed the DML.

If you turn off AUTOTRACE, SQL*Plus will terminate this additional session and you will no longer see it in V$SESSION. Why does SQL*Plus do this trick? The answer is fairly straightforward. SQL*Plus does it for the same reason that we used a second SQL*Plus session in Chapter 4 to monitor memory and temporary space usage: if we had used a single session to monitor memory usage, we would have been using memory to do the monitoring. By observing the statistics in a single session, we necessarily would change those statistics. If SQL*Plus used a single session to report on the number of I/Os performed, how many bytes were transferred over the network, and how many sorts happened, then the queries used to find these details would be adding to the statistics themselves. They could be sorting, performing I/O, transferring data over the network (one would assume they would), and so on. Hence, we need to use another session to measure correctly.

So far, we’ve seen a connection with one or two sessions. Now we’d like to use SQL*Plus to see a connection with no session. That one is pretty easy. In the same SQL*Plus window used in the previous example, simply type the misleadingly named command, DISCONNECT:

EODA@ORA12CR1> disconnect
Disconnected from Oracle Database 12c Enterprise Edition Release - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

Technically, that command should be called DESTROY_ALL_SESSIONS instead of DISCONNECT, since we haven’t really disconnected physically.

Image Note The true disconnect in SQL*Plus is “exit,” as you would have to exit to completely destroy the connection.

We have, however, closed all of our sessions. If we open another session using some other user account and query (replacing EODA with your account name, of course):

$ sqlplus / as sysdba
SQL*Plus: Release Production on Thu Mar 20 15:57:56 2014
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SYS@ORA12CR1> select * from v$session where username = 'EODA';
no rows selected

We can see that we have no sessions—but we still have a process, a physical connection (using the previous ADDR value):

SYS@ORA12CR1> select username, program
2 from v$process
3 where addr = hextoraw( '00000000727FE9B0' );

--------------- ------------------------------------------------
oracle oracle@cs-xvm2 (TNS V1-V3)

So, here we have a connection with no sessions associated with it. We can use the also misnamed SQL*Plus CONNECT command to create a new session in this existing process (the CONNECT command might be better named CREATE_SESSION). Using the SQL*Plus instance we disconnected in, we’ll execute the following:

EODA@ORA12CR1> connect eoda/foo
EODA@ORA12CR1> select username, sid, serial#, server, paddr, status
2 from v$session
3 where username = USER;

--------------- ---------- ---------- --------- ---------------- --------
EODA 10 25 DEDICATED 00000000727FE9B0 ACTIVE

Notice that we have the same PADDR as before, so we are using the same physical connection, but we have (potentially) a different SID. I say potentially because we could get assigned the same SID—it just depends on whether other people logged in while we were logged out and whether the original SID we had was available.

Image Note On Windows or other thread-based operating systems, you might see different results—the process address may change since you are connected to a threaded process, not just a single purpose process as you would on UNIX/Linux.

So far, these tests were performed using a dedicated server connection, so the PADDR was the process address of our dedicated server process. What happens if we use a shared server?

Image Note To connect via a shared server, your database instance would have to have been started with the necessary setup. Coverage of how to configure a shared server is beyond the scope of this book but is covered in detail in the Oracle Database Net Services Administrator’s Guide.

Let’s log in using a shared server and in that session query:

EODA@ORA12CR1> select a.username, a.sid, a.serial#, a.server,
2 a.paddr, a.status, b.program
3 from v$session a left join v$process b
4 on (a.paddr = b.addr)
5 where a.username = 'EODA'
6 /

--------- ------ -------- --------- ---------------- -------- ---------------------
EODA 485 1105 SHARED 0000000071C4BE68 ACTIVE oracle@cs-xvm2 (S004)

Our shared server connection is associated with a process—the PADDR is there and we can join to V$PROCESS to pick up the name of this process. In this case, we see it is a shared server, as identified by the text S004.

However, if we use another SQL*Plus window to query this same bit of information, while leaving our shared server session idle, we see something like this:

$ sqlplus / as sysdba

SYS@ORA12CR1> select a.username, a.sid, a.serial#, a.server,
2 a.paddr, a.status, b.program
3 from v$session a left join v$process b
4 on (a.paddr = b.addr)
5 where a.username = 'EODA'
6 /

--------- ------ -------- --------- ---------------- -------- ---------------------
EODA 485 1105 NONE 0000000071C46788 INACTIVE oracle@cs-xvm2 (D000)

Notice that our PADDR is different and the name of the process we are associated with has also changed. Our idle shared server connection is now associated with a dispatcher, D000. Hence we have yet another method for observing multiple sessions pointing to a single process. A dispatcher could have hundreds, or even thousands, of sessions pointing to it.

An interesting attribute of shared server connections is that the shared server process we use can change from call to call. If I were the only one using this system (as I am for these tests), running that query over and over as EODA would tend to produce the same PADDR of0000000071C4BE68 over and over. However, if I were to open more shared server connections and start to use those shared server connections in other sessions, then I might notice that the shared server I use varies.

Consider this example. I’ll query my current session information, showing the shared server I’m using. Then in another shared server session, I’ll perform a long-running operation (i.e., I’ll monopolize that shared server). When I ask the database what shared server I’m using again, I’ll (in my current session) most likely see a different one (if the original one is off servicing the other session). In the following example, the code in bold represents a second SQL*Plus session that was connected via a shared server:

EODA@ORA12CR1> select a.username, a.sid, a.serial#, a.server,
2 a.paddr, a.status, b.program
3 from v$session a left join v$process b
4 on (a.paddr = b.addr)
5 where a.username = 'EODA'
6 /

--------- ------ -------- --------- ---------------- -------- ---------------------
EODA 485 1105 SHARED 0000000071C4BE68 ACTIVE oracle@cs-xvm2 (S004)

From another terminal session, connect to the database as the user SCOTT (the shared text in the following connection string maps to an entry in the tnsnames.ora file that instructs SQL*Plus to connect via a shared server connection):

$ sqlplus scott/tiger@shared

SCOTT@ORA12CR1> exec dbms_lock.sleep(20);
PL/SQL procedure successfully completed.

From the original connection as EODA, run the query to show the PROGRAM information again:

EODA@ORA12CR1> select a.username, a.sid, a.serial#, a.server,
2 a.paddr, a.status, b.program
3 from v$session a left join v$process b
4 on (a.paddr = b.addr)
5 where a.username = 'EODA'
6 /

--------- ------ -------- --------- ---------------- -------- ---------------------
EODA 485 1105 SHARED 0000000071C478E8 ACTIVE oracle@cs-xvm2 (S000)

Image Note You need to use an account that has execute privileges on the DBMS_LOCK package. I granted my demo account SCOTT execute privileges on the DBMS_LOCK package to accomplish this: SYS@ORA12CR1> grant execute on dbms_lock to scott;

Notice that the first time I queried, I was using S004 as the shared server. Then, in another session (as SCOTT), I executed a long-running statement that monopolized the shared server, which just happened to be S004 this time. The first nonbusy shared server is the one that gets assigned to do the work, and in this case no one else was asking to use the S004 shared server, so the DBMS_LOCK command took it. When I queried again in the first SQL*Plus session, I got assigned to another shared server process, S000, since the S004 shared server was busy.

It is interesting to note that the parse of a query (returns no rows yet) could be processed by shared server S000, the fetch of the first row by S001, the fetch of the second row by S002, and the closing of the cursor by S003. That is, an individual statement might be processed bit by bit by many shared servers.

So, what we have seen in this section is that a connection—a physical pathway from a client to a database instance—may have zero, one, or more sessions established on it. We have seen one use case of that when using SQL*Plus’s AUTOTRACE facility. Many other tools employ this ability as well. For example, Oracle Forms uses multiple sessions on a single connection to implement its debugging facilities. The n-tier proxy authentication feature of Oracle, used to provide end-to-end identification of users from the browser to the database, makes heavy use of the concept of a single connection with multiple sessions, but each session would use a potentially different user account. We have seen that sessions can use many processes over time, especially in a shared server environment. Also, if we are using connection pooling with Oracle Net, then our session might not be associated with any process at all; the client would drop the connection after an idle time and reestablish it transparently upon detecting activity.

In short, there is a many-to-many relationship between connections and sessions. However, the most common case, the one most of us see day to day, is a one-to-one relationship between a dedicated server and a single session.

Dedicated Server vs. Shared Server vs. DRCP

Before we examine the rest of the processes, let’s discuss why there are three main connection modes and when one might be more appropriate than the other.

When to Use a Dedicated Server

As noted previously, in dedicated server mode there is a one-to-one mapping between the client connection and server process. This is by far the most common method of connection to the Oracle database for all SQL-based applications. It is the simplest to set up and provides the easiest way to establish connections. It requires little to no configuration.

Since there is a one-to-one mapping, you do not have to be concerned that a long-running transaction will block other transactions. Those other transactions will simply proceed via their own dedicated processes. Therefore, it is the only mode you should consider using in a non-OLTP environment where you may have long-running transactions. A dedicated server is the recommended configuration for Oracle, and it scales rather nicely. As long as your server has sufficient hardware (CPU and RAM) to service the number of dedicated server processes your system needs, a dedicated server may be used for thousands of concurrent connections.

Certain operations must be done in a dedicated server mode, such as database startup and shutdown, so every database will have either both or just a dedicated server set up.

When to Use a Shared Server

A shared server setup and configuration, while not difficult, involves an extra step beyond a dedicated server setup. The main difference between the two is not, however, in their setup; it is in their mode of operation. With a dedicated server, there is a one-to-one mapping between client connections and server processes. With a shared server, there is a many-to-one relationship: many clients to a shared server.

As its name implies, a shared server is a shared resource, whereas a dedicated server is not. When using a shared resource, you must be careful to not monopolize it for long periods of time. As you saw previously, use of a simple DBMS_LOCK.SLEEP(20) in one session would monopolize a shared server process for 20 seconds. Monopolization of these shared server resources can lead to a system that appears to hang.

Figure 5-2 depicts two shared servers. If I have three clients and all of them attempt to run a 45-second process more or less at the same time, two of them will get their response in 45 seconds and the third will get its response in 90 seconds. This is rule number one for shared server: make sure your transactions are short in duration. They can be frequent, but they should be short (as characterized by OLTP systems). If they are not short, you will get what appears to be a total system slowdown due to shared resources being monopolized by a few processes. In extreme cases, if all of the shared servers are busy, the system will appear to hang for all users except the lucky few who are monopolizing the shared servers.

Another interesting situation that you may observe when using a shared server is that of an artificial deadlock. With a shared server, a number of server processes are being shared by a potentially large community of users. Consider a situation where you have five shared servers and one hundred user sessions established. At most, five of those user sessions can be active at any point in time. Suppose one of these user sessions updates a row and does not commit. While that user sits there and ponders his or her modification, five other user sessions try to lock that same row. They will, of course, become blocked and will patiently wait for that row to become available. Now the user session that holds the lock on this row attempts to commit its transaction (hence releasing the lock on the row). That user session will find that all of the shared servers are being monopolized by the five waiting sessions. We have an artificial deadlock situation here: the holder of the lock will never get a shared server to permit the commit, unless one of the waiting sessions gives up its shared server. But, unless the waiting sessions are waiting for the lock with a timeout, they will never give up their shared server (you could, of course, have an administrator kill their session via a dedicated server to release this logjam).

For these reasons, a shared server is only appropriate for an OLTP system characterized by short, frequent transactions. In an OLTP system, transactions are executed in milliseconds; nothing ever takes more than a fraction of a second. A shared server is highly inappropriate for a data warehouse. Here, you might execute a query that takes one, two, five, or more minutes. Under a shared server, this would be deadly. If you have a system that is 90 percent OLTP and 10 percent “not quite OLTP,” then you can mix and match dedicated servers and a shared server on the same instance. In this fashion, you can reduce the number of server processes on the machine dramatically for the OLTP users, and make it so that the “not quite OLTP” users do not monopolize their shared servers. In addition, the DBA can use the built-in Resource Manager to further control resource utilization.

Of course, a big reason to use a shared server is when you have no choice. Many advanced connection features require the use of a shared server. If you want to use Oracle Net connection pooling, you must use a shared server. If you want to use database link concentration between databases, then you must use a shared server for those connections.

Image Note If you are already using a connection pooling feature in your application (e.g., you are using the J2EE ­connection pool), and you have sized your connection pool appropriately, using a shared server will only be a performance inhibitor. You already sized your connection pool to cater for the number of concurrent connections that you will get at any point in time; you want each of those connections to be a direct dedicated server connection. Otherwise, you just have a connection pooling feature connecting to yet another connection pooling feature.

Potential Benefits of a Shared Server

What are the benefits of a shared server, bearing in mind that you have to be somewhat careful about the transaction types you let use it? A shared server does three things: it reduces the number of operating system processes/threads, it artificially limits the degree of concurrency, and it reduces the memory needed on the system. Let’s discuss these points in more detail.

Reduces the Number of Operating System Processes/Threads

On a system with thousands of users, the operating system may quickly become overwhelmed in trying to manage thousands of processes. In a typical system, only a fraction of the thousands of users are concurrently active at any point in time. For example, I’ve worked on systems with 5,000 concurrent users. At any one point in time, at most 50 were active. This system would work effectively with 50 shared server processes, reducing the number of processes the operating system has to manage by two orders of magnitude (100 times). The operating system can now, to a large degree, avoid context switching.

Artificially Limits the Degree of Concurrency

Speaking as a person who has been involved in many benchmarks, the benefits of this seem obvious. When running benchmarks, people frequently ask to run as many users as possible until the system breaks. One of the outputs of these benchmarks is always a chart that shows the number of concurrent users versus the number of transactions (see Figure 5-3).


Figure 5-3. Concurrent users vs. transactions per second

Initially, as you add concurrent users, the number of transactions increases. At some point, however, adding additional users does not increase the number of transactions you can perform per second; the graph tends to flatten off. The throughput has peaked and now response time starts to increase. In other words, you are doing the same number of transactions per second, but the end users are observing slower response times. As you continue adding users, you will find that the throughput will actually start to decline. The concurrent user count before this drop-off is the maximum degree of concurrency you want to allow on the system. Beyond this point, the system becomes flooded and queues begin forming to perform work. Much like a backup at a tollbooth, the system can no longer keep up. Not only does response time rise dramatically at this point, but throughput from the system may fall, too, as the overhead of simply context switching and sharing resources between too many consumers takes additional resources itself. If we limit the maximum concurrency to the point right before this drop, we can sustain maximum throughput and minimize the increase in response time for most users. A shared server allows us to limit the maximum degree of concurrency on our system to this number.

An analogy for this process could be a simple door. The width of the door and the width of people limit the maximum people per minute throughput. At low load, there is no problem; however, as more people approach, some forced waiting occurs (CPU time slice). If a lot of people want to get through the door, we get the fallback effect—there are so many people saying “after you” and so many false starts that the throughput falls. Everybody gets delayed getting through. Using a queue means the throughput increases, some people get through the door almost as fast as if there was no queue, while others (the ones put at the end of the queue) experience the greatest delay and might fret that “this was a bad idea.” But when you measure how fast everybody (including the last person) gets through the door, the queued model (shared server) performs better than a free-for-all approach (even with polite people; but conjure up the image of the doors opening when a store has a large sale, with everybody pushing very hard to get through).

Reduces the Memory Needed on the System

This is one of the most highly touted reasons for using a shared server: it reduces the amount of required memory. It does, but not as significantly as you might think, especially given the automatic PGA memory management discussed in Chapter 4, where work areas are allocated to a process, used, and released—and their size varies based on the concurrent workload. So, this was a fact that was truer in older releases of Oracle but is not as meaningful today. Also, remember that when you use a shared server, the UGA is located in the SGA. This means that when switching over to a shared server, you must be able to accurately determine your expected UGA memory needs and allocate appropriately in the SGA via the LARGE_POOL_SIZE parameter. The SGA requirements for the shared server configuration are typically very large. This memory must typically be preallocated and thus can only be used by the database instance.

Image Note It is true that with a resizable SGA, you may grow and shrink this memory over time, but for the most part, it will be owned by the database instance and will not be usable by other processes.

Contrast this with a dedicated server, where anyone can use any memory not allocated to the SGA. If the SGA is much larger due to the UGA being located in it, where does the memory savings come from? It comes from having that many fewer PGAs allocated. Each dedicated/shared server has a PGA. This is process information. It is sort areas, hash areas, and other process-related structures. It is this memory need that you are removing from the system by using a shared server. If you go from using 5,000 dedicated servers to 100 shared servers, it is the cumulative sizes of the 4,900 PGAs (excluding their UGAs) you no longer need that you are saving with a shared server.


So, what about DRCP, the feature (available with Oracle 11g and above)? It has many of the benefits of a shared server such as reduced processes (we are pooling), possible memory savings without the drawbacks. There is no chance of artificial deadlock; for example, the session that holds the lock on the resource in the earlier example would have its own dedicated server dedicated to it from the pool, and that session would be able to release the lock eventually. It doesn’t have the multithreading capability of a shared server; when a client process gets a dedicated server from the pool, it owns that process until that client process releases it. Therefore, it is best suited for client applications that frequently connect, do some relatively short process, and disconnect—over and over and over again; in short, for client processes that have an API that do not have an efficient connection pool of their own.

Dedicated/Shared Server Wrap-up

Unless your system is overloaded, or you need to use a shared server for a specific feature, a dedicated server will probably serve you best. A dedicated server is simple to set up (in fact, there is no setup) and makes tuning easier.

Image Note With shared server connections, a session’s trace information (SQL_TRACE=TRUE output) may be spread across many individual trace files; thus, reconstructing what that session has done is more difficult. With the advent of the DBMS_MONITOR package in Oracle 10g and above, much of the difficulty has been removed, but it still complicates matters. Also, if you have multiple related trace files generated by a session, you can use the TRCSESS utility to combine the trace files.

If you have a very large user community and know that you will be deploying with a shared server, I would urge you to develop and test with a shared server. It will increase your likelihood of failure if you develop under just a dedicated server and never test on a shared server. Stress the system, benchmark it, and make sure that your application is well behaved under a shared server. That is, make sure it does not monopolize shared servers for too long. If you find that it does so during development, it is much easier to fix at that stage than during deployment. You can use features such as the Advanced Queuing (AQ) to turn a long-running process into an apparently short one, but you have to design that into your application. These sorts of things are best done when you are developing. Also, historically, there have been differences between the feature set available to shared server connections versus dedicated server connections. We already discussed the lack of automatic PGA memory management in Oracle9i, for example, but also in the past, things as basic as a hash join between two tables were not available in shared server connections. (Hash joins are available in the current 9i and above releases with shared server!)

Background Processes

The Oracle instance is made up of two things: the SGA and a set of background processes. The background processes perform the mundane maintenance tasks needed to keep the database running. For example, there is a process that maintains the block buffer cache for us, writing blocks out to the data files as needed. Another process is responsible for copying an online redo log file to an archive destination as it fills up. Yet another process is responsible for cleaning up after aborted processes, and so on. Each of these processes is pretty focused on its job, but works in concert with all of the others. For example, when the process responsible for writing to the log files fills one log and goes to the next, it will notify the process responsible for archiving that full log file that there is work to be done.

There is a V$ view you can use to see all of the possible Oracle background processes and determine which ones are in use in your system currently:

EODA@ORA12CR1> select paddr, name, description
2 from v$bgprocess
3 order by paddr desc
4 /

---------------- ---- ----------------------------------------
0000000072FD44C8 MMON Manageability Monitor Process
00000000727FD850 MMNL Manageability Monitor Process 2
00000000723FE138 LREG Listener Registration
00000000723FCFD8 SMON System Monitor Process
00000000723F9BB8 CKPT checkpoint
00000000723F8A58 LGWR Redo etc.
00000000723F78F8 DBW0 db writer process 0
00 VMB0 Volume Membership 0
00 SCRB ASM Scrubbing Master
00 XDMG cell automation manager
00 XDWK cell automation worker actions

401 rows selected.

Rows in this view with a PADDR other than 00 are processes (threads) configured and running on your system.

Image Tip Another way to view currently running background processes is to query V$PROCESS where PNAME is not null.

There are two classes of background processes: those that have a focused job to do (as just described) and those that do a variety of other jobs (i.e., utility processes). For example, there is a utility background process for the internal job queues accessible via theDBMS_JOB/DBMS_SCHEDULER packages. This process monitors the job queues and runs whatever is inside them. In many respects, it resembles a dedicated server process, but without a client connection. Let’s examine each of these background processes, starting with the ones that have a focused job, and then look into the utility processes.

Focused Background Processes

The number, names, and types of focused background processes varies by release. Figure 5-4 depicts a typical set of Oracle background processes that have a focused purpose:


Figure 5-4. Focused background processes

Image Tip For a complete diagram showing all Oracle 12c background processes and memory structures, see the file at

For example, in Oracle 12c Release 1, here’s a database started using a minimum number of init.ora parameters.

SYS@ORA12CR1> create pfile='/tmp/pfile' from spfile;
File created.

SYS@ORA12CR1> !cat /tmp/pfile

In Oracle 12c Release 1, you would have about 22 background processes started up:

EODA@ORA12CR1> select paddr, name, description
2 from v$bgprocess
3 where paddr <> '00'
4 order by paddr desc
5 /

---------------- ----- ----------------------------------------------------------------
000000007F188558 VKRM Virtual sKeduler for Resource Manager
000000007F17FA58 CJQ0 Job Queue Coordinator
000000007F17E8F8 AQPC AQ Process Coord
000000007F17D798 FBDA Flashback Data Archiver Process
000000007F17C638 SMCO Space Manager Process
000000007F17A378 TMON Transport Monitor
000000007F175DF8 MMNL Manageability Monitor Process 2
000000007F174C98 LREG Listener Registration
000000007F173B38 RECO distributed recovery
000000007F1729D8 SMON System Monitor Process
000000007F16F5B8 CKPT checkpoint
000000007F16E458 LGWR Redo etc.
000000007F16D2F8 DBW0 db writer process 0
000000007F16C198 DIA0 diagnosibility process 0
000000007F16B038 DBRM DataBase Resource Manager
000000007F169ED8 DIAG diagnosibility process
000000007F168D78 MMON Manageability Monitor Process
000000007F167C18 MMAN Memory Manager
000000007F166AB8 GEN0 generic0
000000007F165958 VKTM Virtual Keeper of TiMe process
000000007F1647F8 PSP0 process spawner 0
000000007F163698 PMON process cleanup

22 rows selected.

In Oracle 11g Release 2, with that same init.ora file, you would have about 17 background processes started up:

ops$tkyte@ORA11GR2> select paddr, name, description
2 from v$bgprocess
3 where paddr <> '00'
4 order by paddr desc
5 /

-------- ---- ----------------------------------------------------------------
32AF0E64 CJQ0 Job Queue Coordinator
32AEF8B4 QMNC AQ Coordinator
32AEE304 MMNL Manageability Monitor Process 2
32AED82C MMON Manageability Monitor Process
32AECD54 RECO distributed recovery
32AEC27C SMON System Monitor Process
32AEB7A4 CKPT checkpoint
32AEACCC LGWR Redo etc.
32AEA1F4 DBW0 db writer process 0
32AE971C MMAN Memory Manager
32AE8C44 DIA0 diagnosibility process 0
32AE816C PSP0 process spawner 0
32AE7694 DBRM DataBase Resource Manager
32AE6BBC DIAG diagnosibility process
32AE60E4 GEN0 generic0
32AE560C VKTM Virtual Keeper of TiMe process
32AE4B34 PMON process cleanup

17 rows selected.

Using the same init.ora, only replacing MEMORY_TARGET with SGA_TARGET and PGA_AGGREGATE_TARGET, in Oracle 10g Release 2, you might only see 12:

ops$tkyte@ORA10GR2> select paddr, name, description
2 from v$bgprocess
3 where paddr <> '00'
4 order by paddr desc
5 /

-------- ---- ----------------------------------------------------------------
23D27AC4 CJQ0 Job Queue Coordinator
23D27508 QMNC AQ Coordinator
23D26990 MMNL Manageability Monitor Process 2
23D263D4 MMON Manageability Monitor Process
23D25E18 RECO distributed recovery
23D2585C SMON System Monitor Process
23D252A0 CKPT checkpoint
23D24CE4 LGWR Redo etc.
23D24728 DBW0 db writer process 0
23D2416C MMAN Memory Manager
23D23BB0 PSP0 process spawner 0
23D235F4 PMON process cleanup

12 rows selected.

Note that you may not see all of these processes when you start your instance, but the majority of them will be present. You will only see ARCn (the archiver) if you are in ARCHIVELOG mode and have enabled automatic archiving. You will only see the LMD0, LCKn, LMON, and LMSn(more details on those processes shortly) processes if you are running Oracle RAC, a configuration of Oracle that allows many instances on different machines in a cluster to mount and open the same physical database.

Image Note Starting with Oracle 12c, on some UNIX/Linux platforms, you can use a mixture of processes and threads. This feature is enabled by setting the initialization THREADED_EXECUTION parameter to TRUE (the default is FALSE). Nothing has fundamentally changed—all of the “processes” are still there, you just might not see them via the operating system ps command because they are now running as a thread in a larger process.

So, Figure 5-4 depicts roughly what you might see if you started an Oracle instance, and mounted and opened a database. On an operating system where Oracle implements a multiprocess architecture, such as on a UNIX/Linux system, you can physically see these processes. After starting the instance, I observed the following:

$ ps -aef | grep ora_...._$ORACLE_SID | grep -v grep
oracle 2276 1 0 10:33 ? 00:00:00 ora_pmon_ORA12CR1
oracle 2278 1 0 10:33 ? 00:00:00 ora_psp0_ORA12CR1
oracle 2280 1 0 10:33 ? 00:00:04 ora_vktm_ORA12CR1
oracle 2284 1 0 10:33 ? 00:00:00 ora_gen0_ORA12CR1
oracle 2286 1 0 10:33 ? 00:00:00 ora_mman_ORA12CR1
oracle 2290 1 0 10:33 ? 00:00:00 ora_diag_ORA12CR1
oracle 2292 1 0 10:33 ? 00:00:00 ora_dbrm_ORA12CR1
oracle 2294 1 0 10:33 ? 00:00:00 ora_dia0_ORA12CR1
oracle 2297 1 0 10:33 ? 00:00:00 ora_dbw0_ORA12CR1
oracle 2299 1 0 10:33 ? 00:00:00 ora_lgwr_ORA12CR1
oracle 2301 1 0 10:33 ? 00:00:00 ora_ckpt_ORA12CR1
oracle 2303 1 0 10:33 ? 00:00:00 ora_lg00_ORA12CR1
oracle 2305 1 0 10:33 ? 00:00:00 ora_lg01_ORA12CR1
oracle 2307 1 0 10:33 ? 00:00:00 ora_smon_ORA12CR1
oracle 2309 1 0 10:33 ? 00:00:00 ora_reco_ORA12CR1
oracle 2311 1 0 10:33 ? 00:00:00 ora_lreg_ORA12CR1
oracle 2313 1 0 10:33 ? 00:00:01 ora_mmon_ORA12CR1
oracle 2315 1 0 10:33 ? 00:00:00 ora_mmnl_ORA12CR1
oracle 2327 1 0 10:33 ? 00:00:00 ora_p000_ORA12CR1
oracle 2329 1 0 10:33 ? 00:00:00 ora_p001_ORA12CR1
oracle 2331 1 0 10:33 ? 00:00:00 ora_tmon_ORA12CR1
oracle 2333 1 0 10:33 ? 00:00:00 ora_tt00_ORA12CR1
oracle 2335 1 0 10:33 ? 00:00:00 ora_smco_ORA12CR1
oracle 2337 1 0 10:33 ? 00:00:00 ora_fbda_ORA12CR1
oracle 2339 1 0 10:33 ? 00:00:00 ora_aqpc_ORA12CR1
oracle 2343 1 0 10:33 ? 00:00:00 ora_cjq0_ORA12CR1
oracle 2345 1 0 10:33 ? 00:00:00 ora_p002_ORA12CR1
oracle 2347 1 0 10:33 ? 00:00:00 ora_p003_ORA12CR1
oracle 2349 1 0 10:33 ? 00:00:00 ora_p004_ORA12CR1
oracle 2351 1 0 10:33 ? 00:00:00 ora_p005_ORA12CR1
oracle 2353 1 0 10:33 ? 00:00:00 ora_p006_ORA12CR1
oracle 2355 1 0 10:33 ? 00:00:00 ora_p007_ORA12CR1
oracle 2383 1 0 10:33 ? 00:00:00 ora_w000_ORA12CR1
oracle 2385 1 0 10:33 ? 00:00:00 ora_qm02_ORA12CR1
oracle 2389 1 0 10:33 ? 00:00:00 ora_q002_ORA12CR1
oracle 2391 1 0 10:33 ? 00:00:00 ora_q003_ORA12CR1
oracle 2465 1 0 10:38 ? 00:00:00 ora_w001_ORA12CR1

It is interesting to note the naming convention used by these processes. The process name starts with ora_. It is followed by four characters representing the actual name of the process, which are followed by _ORA12CR1. As it happens, my ORACLE_SID (site identifier) is ORA12CR1. On UNIX/Linux, this makes it very easy to identify the Oracle background processes and associate them with a particular instance (on Windows, there is no easy way to do this, as the backgrounds are threads in a larger, single process). What is perhaps most interesting, but not readily apparent from the preceding code, is that they are all really the same exact binary executable program—there is not a separate executable for each “program.” Search as hard as you like, but you will not find the ora_pmon_ORA12CR1 binary executable on disk anywhere. You will not find ora_lgwr_ORA12CR1 or ora_reco_ORA12CR1. These processes are all really oracle (that’s the name of the binary executable that is run). They just alias themselves upon startup to make it easier to identify which process is which. This enables a great deal of object code to be efficiently shared on the UNIX/Linux platform. On Windows, this is not nearly as interesting, as they are just threads within the process, so of course they are one big binary.

Let’s now take a look at the function performed by each major process of interest, starting with the primary Oracle background processes. For a complete listing of the possible background processes and a short synopsis of the function they perform, I will direct you to the appendix of theOracle Database Reference manual available freely on

PMON: The Process Monitor

This process is responsible for cleaning up after abnormally terminated connections. For example, if your dedicated server “fails” or is killed for some reason, PMON is the process responsible for fixing (recovering or undoing work) and releasing your resources. PMON will initiate the rollback of uncommitted work, release locks, and free SGA resources allocated to the failed process.

In addition to cleaning up after aborted connections, PMON is responsible for monitoring the other Oracle background processes and restarting them if necessary (and if possible). If a shared server or a dispatcher fails (crashes), PMON will step in and restart another one (after cleaning up for the failed process). PMON will watch all of the Oracle processes and either restart them or terminate the instance as appropriate. For example, it is appropriate to fail the instance in the event the database log writer process, LGWR, fails. This is a serious error, and the safest path of action is to terminate the instance immediately and let normal recovery fix the data. (Note that this is a rare occurrence and should be reported to Oracle Support immediately.)

Image Note Prior to Oracle 12c, PMON handled the listener registration tasks. Starting with Oracle 12c, the listener ­registration (LREG) background process registers instances and services with a listener.

LREG: Listener Registration Process

The LREG process (available starting with Oracle 12c) is responsible for registering instances and services with the Oracle TNS listener. When an instance starts up, the LREG process polls the well-known port address, unless directed otherwise, to see whether or not a listener is up and running. The well-known/default port used by Oracle is 1521. Now, what happens if the listener is started on some different port? In this case, the mechanism is the same, except that the listener address needs to be explicitly specified by the LOCAL_LISTENER parameter setting. You can also use the REMOTE_LISTENER parameter to instruct LREG to register a service with a remote listener (which is common in Oracle RAC environments).

If the listener is running when the database instance is started, LREG communicates with the listener and passes to it relevant parameters, such as the service name and load metrics of the instance. If the listener was not started, LREG will periodically attempt (typically every 60 seconds) to contact it to register itself. You can also instruct LREG to immediately attempt to register a service with the listener via the ALTER SYSTEM REGISTER command (which is useful in high availability environments).

SMON: The System Monitor

SMON is the process that gets to do all of the system-level jobs. Whereas PMON was interested in individual processes, SMON takes a system-level perspective of things and is a sort of garbage collector for the database. Some of the jobs it does include the following:

· Cleans up temporary space: With the advent of true temporary tablespaces, the chore of cleaning up temporary space has lessened, but it has not gone away. For example, when building an index, the extents allocated for the index during the creation are marked asTEMPORARY. If the CREATE INDEX session is aborted for some reason, SMON is responsible for cleaning them up. Other operations create temporary extents that SMON would be responsible for as well.

· Coalesces free space: If you are using dictionary-managed tablespaces, SMON is responsible for taking extents that are free in a tablespace and contiguous with respect to each other and coalescing them into one larger free extent. This occurs only on dictionary-managed tablespaces with a default storage clause that has PCTINCREASE set to a nonzero value.

· Recovers transactions active against unavailable files: This is similar to its role during database startup. Here, SMON recovers failed transactions that were skipped during instance/crash recovery due to a file(s) not being available to recover. For example, the file may have been on a disk that was unavailable or not mounted. When the file does become available, SMON will recover it.

· Performs instance recovery of a failed node in RAC: In an Oracle RAC configuration, when a database instance in the cluster fails (e.g., the machine the instance was executing on fails), some other node in the cluster will open that failed instance’s redo log files and perform a recovery of all data for that failed instance.

· Cleans up OBJ$: OBJ$ is a low-level data dictionary table that contains an entry for almost every object (table, index, trigger, view, and so on) in the database. Many times, there are entries in here that represent deleted objects, or objects that represent “not there” objects, used in Oracle’s dependency mechanism. SMON is the process that removes these rows that are no longer needed.

· Manage undo segments: SMON will perform the automatic onlining, offlining, and shrinking of undo segments.

· Offlines rollback segments: When using manual rollback segment management (not recommended, you should be using automatic undo management), it is possible for the DBA to offline, or make unavailable, a rollback segment that has active transactions. It may be possible that active transactions are using this offlined rollback segment. In this case, the rollback is not really offlined; it is marked as “pending offline.” In the background, SMON will periodically try to truly take it offline, until it succeeds.

That should give you a flavor of what SMON does. It does many other things, such as flush the monitoring statistics that show up in the DBA_TAB_MODIFICATIONS view, flush the SCN to timestamp mapping information found in the SMON_SCN_TIME table, and so on. The SMONprocess can accumulate quite a lot of CPU over time, and this should be considered normal. SMON periodically wakes up (or is woken up by the other background processes) to perform these housekeeping chores.

RECO: Distributed Database Recovery

RECO has a very focused job: it recovers transactions that are left in a prepared state because of a crash or loss of connection during a two-phase commit (2PC). A 2PC is a distributed protocol that allows for a modification that affects many disparate databases to be committed atomically. It attempts to close the window for distributed failure as much as possible before committing. In a 2PC between N databases, one of the databases—typically (but not always) the one the client logged into initially—will be the coordinator. This one site will ask the other N-1 sites if they are ready to commit. In effect, this one site will go to the N-1 sites and ask them to be prepared to commit. Each of the N-1 sites reports back its prepared state as YES or NO. If any one of the sites votes NO, the entire transaction is rolled back. If all sites vote YES, then the site coordinator broadcasts a message to make the commit permanent on each of the N-1 sites.

Say a site votes YES and is prepared to commit, but before it gets the directive from the coordinator to actually commit, the network fails or some other error occurs, then the transaction becomes an in-doubt distributed transaction. The 2PC tries to limit the window of time in which this can occur, but cannot remove it. If there is a failure right then and there, the transaction will become the responsibility of RECO. RECO will try to contact the coordinator of the transaction to discover its outcome. Until it does that, the transaction will remain in its uncommitted state. When the transaction coordinator can be reached again, RECO will either commit the transaction or roll it back.

It should be noted that if the outage is to persist for an extended period of time, and you have some outstanding transactions, you can commit/roll them back manually. You might want to do this since an in-doubt distributed transaction can cause writers to block readers—this is the one time this can happen in Oracle. Your DBA could call the DBA of the other database and ask her to query the status of those in-doubt transactions. Your DBA can then commit or roll them back, relieving RECO of this task.

CKPT: Checkpoint Process

The checkpoint process doesn’t, as its name implies, do a checkpoint (checkpoints were discussed in Chapter 3 in the section on redo logs)—that’s mostly the job of DBWn. It simply assists with the checkpointing process by updating the file headers of the data files. It used to be that CKPTwas an optional process, but starting with version 8.0 of the database, it is always started, so if you do a ps on UNIX/Linux, you’ll normally see it there (I say “normally” because as of Oracle 12c, it’s possible for the checkpoint process to run within an operating system thread, and therefore won’t be displayed as a process).

The job of updating data files’ headers with checkpoint information used to belong to the LGWR; however, as the number of files increased along with the size of a database over time, this additional task for LGWR became too much of a burden. If LGWR had to update dozens, or hundreds, or even thousands, of files, there would be a good chance sessions waiting to commit these transactions would have to wait far too long. CKPT removes this responsibility from LGWR.

DBWn: Database Block Writer

The database block writer (DBWn) is the background process responsible for writing dirty blocks to disk. DBWn will write dirty blocks from the buffer cache, usually to make more room in the cache (to free buffers for reads of other data) or to advance a checkpoint (to move forward the position in an online redo log file from which Oracle would have to start reading, to recover the instance in the event of failure). As we discussed in Chapter 3 when Oracle switches log files, a checkpoint is signaled. Oracle needs to advance the checkpoint so that it no longer needs the online redo log file it just filled up. If it hasn’t been able to do that by the time we need to reuse that redo log file, we get the “checkpoint not complete” message and we must wait.

Image Note Advancing log files is only one of many ways for checkpoint activity to occur. There are incremental ­checkpoints controlled by parameters such as FAST_START_MTTR_TARGET and other triggers that cause dirty blocks to be flushed to disk.

As you can see, the performance of DBWn can be crucial. If it does not write out blocks fast enough to free buffers (buffers that can be reused to cache some other blocks) for us, we will see both the number and duration of waits on Free Buffer Waits and Write Complete Waits start to grow.

We can configure more than one DBWn; in fact, in Oracle 11g, we can configure up to 36, and in Oracle 12c, up to 100, as evidenced by the following query:

EODA@ORA12CR1> select name, description from v$bgprocess
2 where description like 'db writer process%';

----- ----------------------------------------------------------------
DBW0 db writer process 0
DBW1 db writer process 1
DBW2 db writer process 2
BW97 db writer process 97
BW98 db writer process 98
BW99 db writer process 99

100 rows selected.

Most systems run with one database block writer, but larger, multi-CPU systems can make use of more than one. This is generally done to distribute the workload of keeping a large block buffer cache in the SGA clean, flushing the dirtied (modified) blocks to disk.

Optimally, the DBWn uses asynchronous I/O to write blocks to disk. With asynchronous I/O, DBWn gathers up a batch of blocks to be written and gives them to the operating system. DBWn does not wait for the operating system to actually write the blocks out; rather, it goes back and collects the next batch to be written. As the operating system completes the writes, it asynchronously notifies DBWn that it completed the writes. This allows DBWn to work much faster than if it had to do everything serially. We’ll see later in the “Slave Processes” section how we can use I/O slaves to simulate asynchronous I/O on platforms or configurations that do not support it.

I would like to make one final point about DBWn. It will, almost by definition, write out blocks scattered all over disk—DBWn does lots of scattered writes. When you do an update, you’ll be modifying index blocks that are stored here and there, and data blocks that are also randomly distributed on disk. LGWR, on the other hand, does lots of sequential writes to the redo log. This is an important distinction and one of the reasons that Oracle has a redo log and the LGWR process as well as the DBWn process. Scattered writes are significantly slower than sequential writes. By having the SGA buffer dirty blocks and the LGWR process do large sequential writes that can re-create these dirty buffers, we achieve an increase in performance. The fact that DBWn does its slow job in the background while LGWR does its faster job while the user waits gives us better overall performance. This is true even though Oracle may technically be doing more I/O than it needs to (writes to the log and to the data file); the writes to the online redo log could in theory be skipped if, during a commit, Oracle physically wrote the modified blocks out to disk instead. In practice, it does not happen this way. LGWR writes the redo information to the online redo logs for every transaction, and DBWn flushes the database blocks to disk in the background.

LGWR: Log Writer

The LGWR process is responsible for flushing to disk the contents of the redo log buffer located in the SGA. It does this when one of the following is true:

· Every three seconds

· Whenever a COMMIT or ROLLBACK is issued

· When LGWR is asked to switch log files

· When the redo buffer gets one-third full or contains 1MB of cached redo log data

For these reasons, having an enormous (hundreds/thousands of megabytes) redo log buffer is not practical; Oracle will never be able to use it all since it pretty much continuously flushes it. The logs are written to with sequential writes as compared to the scattered I/O DBWn must perform. Doing large batch writes like this is much more efficient than doing many scattered writes to various parts of a file. This is one of the main reasons for having a LGWR and redo logs in the first place. The efficiency in just writing out the changed bytes using sequential I/O outweighs the additional I/O incurred. Oracle could just write database blocks directly to disk when you commit, but that would entail a lot of scattered I/O of full blocks, and this would be significantly slower than letting LGWR write the changes out sequentially.

Image Note Starting with Oracle 12c, Oracle will start additional Log Writer Worker (LG0) processes on multiprocessor machines to increase the performance of writing to the online redo log files.

ARCn: Archive Process

The job of the ARCn process is to copy an online redo log file to another location when LGWR fills it up. These archived redo log files can then be used to perform media recovery. Whereas online redo log is used to fix the data files in the event of a power failure (when the instance is terminated), archived redo logs are used to fix data files in the event of a hard disk failure. If we lose the disk drive containing the data file, /u01/dbfile/ORA12CR1/system01.dbf, we can go to our backups from last week, restore that old copy of the file, and ask the database to apply all of the archived and online redo logs generated since that backup took place. This will catch up that file with the rest of the data files in our database, and we can continue processing with no loss of data.

ARCn typically copies online redo log files to at least two other locations (redundancy being a key to not losing data). These other locations may be disks on the local machine or, more appropriately, at least one will be located on another machine altogether, in the event of a catastrophic failure. In many cases, these archived redo log files are copied by some other process to some tertiary storage device, such as tape. They may also be sent to another machine to be applied to a standby database, a failover option offered by Oracle. We’ll discuss the processes involved in that shortly.

DIAG: Diagnosability Process

In past releases, the DIAG process was used exclusively in a RAC environment. As of Oracle 11g, with the ADR (Advanced Diagnostic Repository), it is responsible for monitoring the overall health of the instance, and it captures information needed in the processing of instance failures. This applies to both single instance configurations as well as multi-instance RAC configurations.

FBDA: Flashback Data Archiver Process

This process available with Oracle 11g Release 1 and above. It is the key component of the flashback data archive capability—the ability to query data “as of” long periods of time ago (for example, to query data in a table as it appeared one year ago, five years ago, and so on). This long term historical query capability is achieved by maintaining a history of the row changes made to every row in a table over time. This history, in turn, is maintained by the FBDA process in the background. This process functions by working soon after a transaction commits. The FBDA process will read the UNDO generated by that transaction and roll back the changes made by the transaction. It will then record these rolled back (the original values) rows in the flashback data archive for us.

DBRM: Database ResourceManager Process

This process implements the resource plans that may be configured for a database instance. It sets the resource plans in place and performs various operations related to enforcing/implementing those resource plans. The resource manager allows the administrators of a database to have fine grained control over the resources used by the database instance, by applications accessing the database, or by individual users accessing the database.

GEN0: General Task Execution Process

This process provides, as expected by its name, a general task execution thread for the database. The main goal of this process is to offload potentially blocking processing (processing that would cause a process to stop while it occurs) from some other process and perform it in the background. For example, if the main ASM process needs to perform some blocking file operation, but that operation could safely be done in the background (ASM can safely continue processing before the operation completes), then the ASM process may request the GEN0 process to perform this operation and let GEN0 notify it upon completion. It is similar in nature to the slave processes described further later in this chapter.

Remaining Common Focused Processes

Depending on the features of Oracle you are using, other focused processes may be visible. Some are listed here with a brief description of their function.

Image Note Appendix F (Background Processes) of the Oracle Database Reference manual, available on, has a complete listing of the background processes and their functions.

Most of the processes described previously are nonnegotiable—you will have them if you have an Oracle instance running. (ARCn is technically optional but is, in my opinion, mandatory for all production databases!) The following processes are optional and will appear only if you make use of the specific feature. The following processes are unique to a database instance using ASM, as discussed in Chapter 3:

· Automatic Storage Management Background (ASMB) process: The ASMB process runs in a database instance that is making use of ASM. It is responsible for communicating to the ASM instance that is managing the storage, providing updated statistics to the ASM instance, and providing a heartbeat to the ASM instance, letting it know that it is still alive and functioning.

· ReBALance (RBAL) process: The RBAL process also runs in a database instance that is making use of ASM. It is responsible for processing a rebalance request (a redistribution request) as disks are added/removed to and from an ASM disk group.

The following processes are found in an Oracle RAC instance. RAC is a configuration of Oracle whereby multiple instances, each running on a separate node (typically a separate physical computer) in a cluster, may mount and open a single database. It gives you the ability to have more than one instance accessing, in a full read-write fashion, a single set of database files. The primary goals of RAC are twofold:

· High availability: With Oracle RAC, if one node/computer in the cluster fails due to a software, hardware, or human error, the other nodes may continue to function. The database will be accessible via the other nodes. You might lose some computing power, but you won’t lose access to the database.

· Scalability: Instead of buying larger and larger machines to handle an increasing workload (known as vertical scaling), RAC allows you to add resources in the form of more machines in the cluster (known as horizontal scaling). Instead of trading your 4 CPU machine in for one that can grow to 8 or 16 CPUs, RAC gives you the option of adding another relatively inexpensive 4 CPU machine (or more than one).

The following processes are unique to a RAC environment. You will not see them otherwise.

· Lock monitor (LMON) process: The LMON process monitors all instances in a cluster to detect the failure of an instance. It then facilitates the recovery of the global locks held by the failed instance. It is also responsible for reconfiguring locks and other resources when instances leave or are added to the cluster (as they fail and come back online, or as new instances are added to the cluster in real time).

· Lock manager daemon (LMD0) process: The LMD process handles lock manager service requests for the global cache service (keeping the block buffers consistent between instances). It works primarily as a broker sending requests for resources to a queue that is handled by the LMSn processes. The LMD handles global deadlock detection/resolution and monitors for lock timeouts in the global environment. Also, starting with Oracle 12c, there may be LDDn slave processes spawned by the LMD0 process to assist with the workload.

· Lock manager server (LMSn) process: As noted earlier, in a RAC environment, each instance of Oracle is running on a different machine in a cluster, and they all access, in a read-write fashion, the same exact set of database files. To achieve this, the SGA block buffer caches must be kept consistent with respect to each other. This is one of the main goals of the LMSn process. In earlier releases of Oracle Parallel Server (OPS) this was accomplished via a ping. That is, if a node in the cluster needed a read-consistent view of a block that was locked in exclusive mode by another node, the exchange of data was done via a disk flush (the block was pinged). This was a very expensive operation just to read data. Now, with the LMSn, this exchange is done via very fast cache-to-cache exchange over the clusters’ high-speed connection. You may have up to ten LMSn processes per instance.

· Lock (LCK0) process: This process is very similar in functionality to the LMD process described earlier, but it handles requests for all global resources other than database block buffers.

The following are common background processes seen with most single instance or RAC instances:

· Process Spawner (PSP0) Process: This process is responsible for spawning (starting/creating) the various background processes. It is the process that creates new processes/threads for the Oracle Instance. It does most of its work during instance startup.

· Virtual Keeper of Time (VKTM) Process: Implements a consistent, fine-grained clock for the Oracle instance. It is responsible for providing both wall clock time (human readable) as well as an extremely high resolution timer (not necessarily built using wall clock time, more of a ticker that increments for very small units of time) used to measure durations and intervals.

· Virtual Sch(K)eduler for Resource Manager (VKRM) Process: Scheduler for the resource manager. Manages CPU scheduling and managed processes with active resource plans.

· Space Management Coordinator (SMCO) Process: This process is part of the manageability infrastructure. It coordinates the proactive space management features of the database such as the processes that discover space that could be reclaimed and the processes that perform the reclamation.

Utility Background Processes

These background processes are totally optional, based on your need for them. They provide facilities not necessary to run the database day to day, unless you are using them yourself, such as the job queues, or are making use of a feature that uses them, such as the diagnostic capabilities (starting with Oracle 10g and above).

These processes will be visible in UNIX/Linux as any other background process would be. If you do a ps, you will see them. In my ps listing from the beginning of the focused background processes section (reproduced in part here), you can see that I have

· Job queues configured. The CJQ0 process is the job queue coordinator.

· Oracle AQ configured, as evidenced by the Qnnn (AQ queue process) and QMNC (AQ monitor process).

· Automatic memory management enabled, as evidenced by the Memory Manager (MMAN) process.

· Oracle manageability/diagnostic features enabled, as evidenced by the Manageability Monitor (MMON) and Manageability Monitor Light (MMNL) processes.

Let’s take a look at the various processes you might see depending on the features you are using.

CJQ0 and Jnnn Processes: Job Queues

In the first 7.0 release, Oracle provided replication in the form of a database object known as a snapshot. Job queues were the internal mechanism by which these snapshots were refreshed, or made current.

A job queue process monitored a job table that told it when it needed to refresh various snapshots in the system. In Oracle 7.1, Oracle Corporation exposed this facility for all to use via a database package called DBMS_JOB. So a process that was solely the domain of the snapshot in 7.0 became the “job queue” in 7.1 and later versions. Over time, the parameters for controlling the behavior of the job queue (how frequently it should be checked and how many queue processes there should be) changed in name from SNAPSHOT_REFRESH_INTERVAL andSNAPSHOT_REFRESH_PROCESSES to JOB_QUEUE_INTERVAL and JOB_QUEUE_PROCESSES. In current releases only the JOB_QUEUE_PROCESSES parameter is exposed as a user-tunable setting.

You may have up to 1,000 job queue processes. Their names will be J000 . . . J999. These processes are used heavily in replication as part of the materialized view refresh process. Streams-based replication (starting with Oracle9i Release 2 and above) uses AQ for replication and therefore does not use the job queue processes. Developers also frequently use the job queues in order to schedule one-off (background) jobs or recurring jobs such as sending an e-mail in the background or processing a long-running batch process in the background. By doing some work in the background, you can make a long task seem to take much less time to an impatient end user (he feels like it went faster, even though it might not be done yet). This is similar to what Oracle does with LGWR and DBWn processes; they do much of their work in the background, so you don’t have to wait for them to complete all tasks in real time.

The Jnnn, where nnn represents a number, processes are very much like a shared server, but with aspects of a dedicated server. They are shared in the sense that they process one job after the other, but they manage memory more like a dedicated server would (their UGA memory is in the PGA, not the SGA). Each job queue process will run exactly one job at a time, one after the other, to completion. That is why we may need multiple processes if we wish to run jobs at the same time. There is no threading or preempting of a job. Once a job is running, it will run to completion (or failure).

You will notice that the Jnnn processes come and go over time. That is, if you configure up to 1,000 of them, you will not see 1,000 of them start up with the database. Rather, a sole process, the Job Queue Coordinator (CJQ0) will start up, and as it sees jobs that need to be run in the job queue table, it will start the Jnnn processes. As the Jnnn processes complete their work and discover no new jobs to process, they will start to exit, to go away. So, if you schedule most of your jobs to run at 2:00 AM when no one is around, you might well never actually see these Jnnnprocesses.

QMNC and Qnnn: Advanced Queues

The QMNC process is to the AQ tables what the CJQ0 process is to the job table. It monitors the advanced queues and alerts waiting message dequeuers that a message has become available. QMNC and Qnnn are also responsible for queue propagation—that is, the ability of a message that was enqueued (added) in one database to be moved to a queue in another database for dequeueing.

The Qnnn process are to the QMNC process what the Jnnn processes are to the CJQ0 process. They are notified by the QMNC process of work that needs to be performed, and they process the work.

The QMNC and Qnnn processes are optional background processes. The parameter AQ_TM_PROCESSES specifies creation of up to 40 of these processes named Qnnn, (where nn is a number 0..15 or a letter a..z) and a single QMNC process.

Unlike the Jnnn processes used by the job queues, the Qnnn processes are persistent. If you set AQ_TM_PROCESSES to 10, you will see ten Q0nn processes and the QMNC process at database startup and for the entire life of the instance.

Oracle automatically adjusts the number of queue processes and therefore rarely do you need to set AQ_TM_PROCESSES manually. If you do set this parameter, Oracle still automatically adjusts the number of processes spawned, and uses the value of AQ_TM_PROCESSES as a minimum number of processes to create.

Image Note Starting with Oracle 12c, there’s an Advanced Queue Process Coordinator (AQPC) process. Its purpose is to create and manage master advanced queuing processes (starting, stopping, and so on). Statistics related to this process can be queried from theGV$AQ_BACKGROUND_COORDINATOR view.

EMNC: Event Monitor Processes

The EMNC process is part of the AQ architecture. It is used to notify queue subscribers of messages they would be interested in. This notification is performed asynchronously. There are Oracle Call Interface (OCI) functions available to register a callback for message notification. The callback is a function in the OCI program that will be invoked automatically whenever a message of interest is available in the queue. The EMNn background process is used to notify the subscriber. The EMNC process is started automatically when the first notification is issued for the instance. The application may then issue an explicit message_receive(dequeue) to retrieve the message.

MMAN: Memory Manager

This process is available starting with Oracle 10g and above and is used by the automatic SGA sizing feature. The MMAN process coordinates the sizing and resizing of the shared memory components (the default buffer pool, the shared pool, the Java pool, and the large pool).

MMON, MMNL, and Mnnn: Manageability Monitors

These processes are used to populate the Automatic Workload Repository (AWR), a feature available starting with Oracle 10g. The MMNL process flushes statistics from the SGA to database tables on a scheduled basis. The MMON process is used to auto-detect database performance issues and implement the self-tuning features. The Mnnn processes are similar to the Jnnn or Qnnn processes for the job queues; the MMON process will request these slave processes to perform work on its behalf. The Mnnn processes are transient in nature—they will come and go as needed.

CTWR: Change Tracking Processes

This is an optional process starting with Oracle 10g database and above. The CTWR process is responsible for maintaining the change tracking file, as described in Chapter 3.

RVWR: Recovery Writer

This process (available with Oracle 10g database and above), is responsible for maintaining the before images of blocks in the Fast Recovery Area (described in Chapter 3) used with the FLASHBACK DATABASE command.

DMnn/DWnn: Data Pump Master/Worker Processes

Data Pump was a feature added in Oracle 10g Release 1 of the database. It was designed as a complete rewrite of the legacy export/import processes. Data Pump runs entirely in the server and the API to it is via PL/SQL. Since Data Pump runs in the server, support for performing the various Data Pump operations were added. The Data Pump master (DMnn) collects all inputs from client processes (it is the process that receives the API inputs) and then coordinates the worker processes (the DWnn) which perform the real work—the DMnn processes do the actual processing of the metadata and data.

TMON/TT00: Transport Monitor and Redo Transport Slave

Starting with Oracle 12c, two Data Guard related processes are automatically started when your instance starts: a transport monitor process (TMON) and a redo transport slave (TT00). TMON will start and monitor a number of TT00 processes. The TT00 processes are used to inform the LGWRprocess if and when it needs to generate heartbeat redo. Even if you don’t implement Data Guard, you may see the processes started. You don’t need to worry about these processes, just be aware that they’re there and will be used if you implement Data Guard.

If you do implement Data Guard, there will be a number of other processes started to facilitate the shipping of redo information from one database to another, and apply it. See the Data Guard Concepts and Administration manual from Oracle for full details.

Remaining Utility Background Processes

So, is that the complete list? No, not by a long shot, there are many others depending on what features you implemented. For example, there are Streams apply and capture processes present when you implement products such as Oracle GoldenGate, Oracle XStream, Oracle Streams, and so on. However, the preceding list covers most of the common background processes you will encounter.

Slave Processes

Now we are ready to look at the last class of Oracle processes: the slave processes. There are two types of slave processes with Oracle, I/O slaves and parallel query slaves.

I/O Slaves

I/O slaves are used to emulate asynchronous I/O for systems or devices that do not support it. For example, tape devices (which are notoriously slow) do not support asynchronous I/O. By using I/O slaves, we can mimic for tape drives what the operating system normally provides for disk drives. Just as with true asynchronous I/O, the process writing to the device batches a large amount of data and hands it off to be written. When the data is successfully written, the writer (our I/O slave this time, not the operating system) signals the original invoker, who removes this batch of data from its list of data that needs to be written. In this fashion, we can achieve a much higher throughput, since the I/O slaves are the ones waiting for the slow device, while their caller is off doing other important work getting the data together for the next write.

I/O slaves are used in a couple of places in Oracle. DBWn and LGWR can make use of them to simulate asynchronous I/O, and RMAN will make use of them when writing to tape.

Two parameters control the use of I/O slaves:

· BACKUP_TAPE_IO_SLAVES: This parameter specifies whether I/O slaves are used by RMAN to back up, copy, or restore data to tape. Since this parameter is designed around tape devices, and tape devices may be accessed by only one process at any time, this parameter is a Boolean, and not the number of slaves to use, as you might expect. RMAN will start up as many slaves as necessary for the number of physical devices being used. When BACKUP_TAPE_IO_SLAVES = TRUE, an I/O slave process is used to write to or read from a tape device. If this parameter is FALSE (the default), then I/O slaves are not used for backups. Instead, the dedicated server process engaged in the backup will access the tape device.

· DBWR_IO_SLAVES: This parameter specifies the number of I/O slaves used by the DBW0 process. The DBW0 process and its slaves always perform the writing to disk of dirty blocks in the buffer cache. By default, the value is 0 and I/O slaves are not used. Note that if you set this parameter to a nonzero value, LGWR and ARCn will use their own I/O slaves as well, up to four I/O slaves for LGWR and ARCn will be permitted.

The DBWn I/O slaves appear with the name I1nn, and the LGWR I/O slaves appear with the name I2nn.

Pnnn: Parallel Query Execution Servers

Oracle 7.1.6 introduced the parallel query capability into the database. This is the capability to take a SQL statement such as a SELECT, CREATE TABLE, CREATE INDEX, UPDATE, and so on and create an execution plan that consists of many execution plans that can be done simultaneously. The outputs of each of these plans are merged together into one larger result. The goal is to do an operation in a fraction of the time it would take if you did it serially. For example, say you have a really large table spread across ten different files. You have 16 CPUs at your disposal, and you need to execute an ad hoc query on this table. It might be advantageous to break the query plan into 32 little pieces and really make use of that machine, as opposed to just using one process to read and process all of that data serially.

When using parallel query, you will see processes named Pnnn—these are the parallel query execution servers themselves. During the processing of a parallel statement, your server process will be known as the Parallel Query Coordinator. Its name won’t change at the operating system level, but as you read documentation on parallel query, when you see references to the coordinator process, know that it is simply your original server process.

Prior to Oracle 12c, the default number of parallel execution servers started was zero. You could modify that behavior by specifying a non-zero value for the PARALLEL_MIN_SERVERS parameter. Starting with Oracle 12c, your instance will automatically create several parallel server processes. This occurs because the PARALLEL_MIN_SERVERS parameter is set to a non-zero value (derived from CPU_COUNT * PARALLEL_THREADS_PER_CPU * 2). For example, on my two CPU box (CPU_COUNT parameter is 2 and PARALLEL_THREADS_PER_CPUparameter is 2), we see the following eight parallel execution servers running:

$ ps -ef | grep ora_p00 | grep -v grep

oracle 31086 1 0 Apr06 ? 00:00:03 ora_p000_ORA12CR1
oracle 31088 1 0 Apr06 ? 00:00:05 ora_p001_ORA12CR1
oracle 31104 1 0 Apr06 ? 00:00:02 ora_p002_ORA12CR1
oracle 31106 1 0 Apr06 ? 00:00:02 ora_p003_ORA12CR1
oracle 31108 1 0 Apr06 ? 00:00:02 ora_p004_ORA12CR1
oracle 31110 1 0 Apr06 ? 00:00:02 ora_p005_ORA12CR1
oracle 31112 1 0 Apr06 ? 00:00:02 ora_p006_ORA12CR1
oracle 31114 1 0 Apr06 ? 00:00:02 ora_p007_ORA12CR1

Image Tip See Chapter 14 for full details on parallel processing.


We’ve covered the files used by Oracle, from the lowly but important parameter file to data files, redo log files, and so on. We’ve taken a look inside the memory structures used by Oracle, both in the server processes and the SGA. We’ve seen how different server configurations, such as shared server versus dedicated server mode for connections, will have a dramatic impact on how memory is used by the system. Lastly, we looked at the processes (or threads, depending on the operating system) that enable Oracle to do what it does. Now we are ready to look at the implementation of some other features of Oracle, such as locking, concurrency controls, and transactions.