Expert Oracle Database Architecture, Third Edition (2014)

Introduction

The inspiration for the material contained in this book comes from my experiences developing Oracle software, and from working with fellow Oracle developers to help them build reliable and robust applications based on the Oracle database. The book is basically a reflection of what I do every day and of the issues I see people encountering each and every day.

I covered what I felt was most relevant, namely the Oracle database and its architecture. I could have written a similarly titled book explaining how to develop an application using a specific language and architecture—for example, one using JavaServer Pages that speaks to Enterprise JavaBeans, which in turn uses JDBC to communicate with Oracle. However, at the end of the day, you really do need to understand the topics covered in this book in order to build such an application successfully. This book deals with what I believe needs to be universally known to develop successfully with Oracle, whether you are a Visual Basic programmer using ODBC, a Java programmer using EJBs and JDBC, or a Perl programmer using DBI Perl. This book does not promote any specific application architecture; it does not compare three tier to client/server. Rather, it covers what the database can do and what you must understand about the way it works. Since the database is at the heart of any application architecture, the book should have a broad audience.

As the title suggests, Expert Oracle Database Architecture concentrates on the database architecture and how the database itself works. I cover the Oracle database architecture in depth: the files, memory structures, and processes that comprise an Oracle database and instance. I then move on to discuss important database topics such as locking, concurrency controls, how transactions work, and redo and undo, and why it is important for you to know about these things. Lastly, I examine the physical structures in the database such as tables, indexes, and datatypes, covering techniques for making optimal use of them.

What This Book Is About

One of the problems with having plenty of development options is that it’s sometimes hard to figure out which one might be the best choice for your particular needs. Everyone wants as much flexibility as possible (as many choices as they can possibly have), but they also want things to be very cut and dried—in other words, easy. Oracle presents developers with almost unlimited choice. No one ever says, “You can’t do that in Oracle.” Rather, they say, “How many different ways would you like to do that in Oracle?” I hope that this book will help you make the correct choice.

This book is aimed at those people who appreciate the choice but would also like some guidelines and practical implementation details on Oracle features and functions. For example, Oracle has a really neat feature called parallel execution. The Oracle documentation tells you how to use this feature and what it does. Oracle documentation does not, however, tell you when you should use this feature and, perhaps even more important, when you should not use this feature. It doesn’t always tell you the implementation details of this feature, and if you’re not aware of them, this can come back to haunt you (I’m not referring to bugs, but the way the feature is supposed to work and what it was really designed to do).

In this book I strove to not only describe how things work, but also explain when and why you would consider using a particular feature or implementation. I feel it is important to understand not only the “how” behind things, but also the “when” and “why” as well as the “when not” and “why not!”

Who Should Read This Book

The target audience for this book is anyone who develops applications with Oracle as the database back end. It is a book for professional Oracle developers who need to know how to get things done in the database. The practical nature of the book means that many sections should also be very interesting to the DBA. Most of the examples in the book use SQL*Plus to demonstrate the key features, so you won’t find out how to develop a really cool GUI—but you will find out how the Oracle database works, what its key features can do, and when they should (and should not) be used.

This book is for anyone who wants to get more out of Oracle with less work. It is for anyone who wants to see new ways to use existing features. It is for anyone who wants to see how these features can be applied in the real world (not just examples of how to use the feature, but why the feature is relevant in the first place). Another category of people who would find this book of interest is technical managers in charge of the developers who work on Oracle projects. In some respects, it is just as important that they understand why knowing the database is crucial to success. This book can provide ammunition for managers who would like to get their personnel trained in the correct technologies or ensure that personnel already know what they need to know.

To get the most out of this book, the reader should have:

· Knowledge of SQL. You don’t have to be the best SQL coder ever, but a good working knowledge will help.

· An understanding of PL/SQL. This isn’t a prerequisite, but it will help you to absorb the examples. This book will not, for example, teach you how to program a FOR loop or declare a record type; the Oracle documentation and numerous books cover this well. However, that’s not to say that you won’t learn a lot about PL/SQL by reading this book. You will. You’ll become very intimate with many features of PL/SQL, you’ll see new ways to do things, and you’ll become aware of packages/features that perhaps you didn’t know existed.

· Exposure to some third-generation language (3GL), such as C or Java. I believe that anyone who can read and write code in a 3GL will be able to successfully read and understand the examples in this book.

· Familiarity with the Oracle Database Concepts manual.

A few words on that last point: due to the Oracle documentation set’s vast size, many people find it to be somewhat intimidating. If you’re just starting out or haven’t read any of it as yet, I can tell you that the Oracle Database Concepts manual is exactly the right place to start. It’s about 450 pages long (I know that because I wrote some of the pages and edited every one) and touches on many of the major Oracle concepts that you need to know about. It may not give you each and every technical detail (that’s what the other 10,000 to 20,000 pages of documentation are for), but it will educate you on all the important concepts. This manual touches the following topics (to name a few):

· The structures in the database, and how data is organized and stored

· Distributed processing

· Oracle’s memory architecture

· Oracle’s process architecture

· Schema objects you will be using (tables, indexes, clusters, and so on)

· Built-in datatypes and user-defined datatypes

· SQL stored procedures

· How transactions work

· The optimizer

· Data integrity

· Concurrency control

I will come back to these topics myself time and time again. These are the fundamentals. Without knowledge of them, you will create Oracle applications that are prone to failure. I encourage you to read through the manual and get an understanding of some of these topics.

How This Book Is Structured

To help you use this book, most chapters are organized into four general sections (described in the list that follows). These aren’t rigid divisions, but they will help you navigate quickly to the area you need more information on. This book has 15 chapters, and each is like a “minibook”—a virtually stand-alone component. Occasionally, I refer to examples or features in other chapters, but you could pretty much pick a chapter out of the book and read it on its own. For example, you don’t have to read Chapter 10 on database tables to understand or make use of Chapter 14 on parallelism.

The format and style of many of the chapters are virtually identical:

· An introduction to the feature or capability.

· Why you might want to use the feature or capability (or not). I outline when you would consider using this feature and when you would not want to use it.

· How to use this feature. The information here isn’t just a copy of the material in the SQL reference; rather, it’s presented in step-by-step manner: here is what you need, here is what you have to do, and these are the switches you need to go through to get started. Topics covered in this section will include:

    · How to implement the feature

    · Examples, examples, examples

    · How to debug this feature

    · Caveats of using this feature

    · How to handle errors (proactively)

· A summary to bring it all together

There will be lots of examples and lots of code, all of which is available for download from the Source Code area of www.apress.com. The following sections present a detailed breakdown of the content of each chapter.

Chapter 1: Developing Successful Oracle Applications

This chapter sets out my essential approach to database programming. All databases are not created equal, and in order to develop database-driven applications successfully and on time, you need to understand exactly what your particular database can do and how it does it. If you do not know what your database can do, you run the risk of continually reinventing the wheel—developing functionality that the database already provides. If you do not know how your database works, you are likely to develop applications that perform poorly and do not behave in a predictable manner.

The chapter takes an empirical look at some applications where a lack of basic understanding of the database has led to project failure. With this example-driven approach, the chapter discusses the basic features and functions of the database that you, the developer, need to understand. The bottom line is that you cannot afford to treat the database as a black box that will simply churn out the answers and take care of scalability and performance by itself.

Chapter 2: Architecture Overview

This chapter covers the basics of Oracle architecture. We start with some clear definitions of two terms that are widely misunderstood in the Oracle world, namely instance and database. We then cover two new types of databases introduced in Oracle 12c, namely container database and pluggable database. We also take a quick look at the System Global Area (SGA) and the processes behind the Oracle instance, and examine how the simple act of “connecting to Oracle” takes place.

Chapter 3: Files

This chapter covers in depth the eight types of files that make up an Oracle database and instance. From the simple parameter file to the data and redo log files, we explore what they are, why they are there, and how we use them.

Chapter 4: Memory Structures

This chapter covers how Oracle uses memory, both in the individual processes (Process Global Area, or PGA, memory) and shared memory (SGA). We explore the differences between manual and automatic PGA memory management, automatic shared memory management (introduced in Oracle 10g), and automatic memory management (introduced in Oracle 11g), and see when each is appropriate. After reading this chapter, you will have an understanding of exactly how Oracle uses and manages memory.

Chapter 5: Oracle Processes

This chapter offers an overview of the types of Oracle processes (server processes versus background processes). It also goes into much more depth on the differences in connecting to the database via a shared server or dedicated server process. We’ll also take a look, process by process, at most of the background processes (such as LGWR, DBWR, PMON, SMON, and LREG) that we’ll see when starting an Oracle instance and discuss the functions of each.

Chapter 6: Locking and Latching

Different databases have different ways of doing things (what works well in SQL Server may not work as well in Oracle), and understanding how Oracle implements locking and concurrency control is absolutely vital to the success of your application. This chapter discusses Oracle’s basic approach to these issues, the types of locks that can be applied (DML, DDL, and latches), and the problems that can arise if locking is not implemented carefully (deadlocking, blocking, and escalation).

Chapter 7: Concurrency and Multiversioning

In this chapter, we’ll explore my favorite Oracle feature, multiversioning, and how it affects concurrency controls and the very design of an application. Here we will see that all databases are not created equal and that their very implementation can have an impact on the design of our applications. We’ll start by reviewing the various transaction isolation levels as defined by the ANSI SQL standard and see how they map to the Oracle implementation (as well as how the other databases map to this standard). Then we’ll take a look at what implications multiversioning, the feature that allows Oracle to provide nonblocking reads in the database, might have for us.

Chapter 8: Transactions

Transactions are a fundamental feature of all databases; they are part of what distinguishes a database from a file system. And yet, they are often misunderstood, and many developers do not even know that they are accidentally not using them. This chapter examines how transactions should be used in Oracle and also exposes some bad habits that may have been picked up when developing with other databases. In particular, we look at the implications of atomicity and how it affects statements in Oracle. We also discuss transaction control statements (COMMIT, SAVEPOINT, and ROLLBACK), integrity constraints, distributed transactions (the two-phase commit, or 2PC), and finally autonomous transactions.

Chapter 9: Redo and Undo

It can be said that developers do not need to understand the detail of redo and undo as much as DBAs, but developers do need to know the role they play in the database. After first defining redo, we examine what exactly a COMMIT does. We discuss how to find out how much redo is being generated and how to significantly reduce the amount of redo generated by certain operations using the NOLOGGING clause. We also investigate redo generation in relation to issues such as block cleanout and log contention.

In the undo section of the chapter, we examine the role of undo data and the operations that generate the most/least undo. Finally, we investigate the infamous ORA-01555: snapshot too old error, its possible causes, and how to avoid it.

Chapter 10: Database Tables

Oracle now supports numerous table types. This chapter looks at each type: heap organized (i.e., the default, “normal” table), index organized, index clustered, hash clustered, nested, temporary, and object. It discusses when, how, and why you should use them. Most of the time, the heap organized table is sufficient, but this chapter will help you recognize when one of the other types might be more appropriate.

Chapter 11: Indexes

Indexes are a crucial aspect of your application design. Correct implementation requires an in-depth knowledge of the data, how it is distributed, and how it will be used. Too often, indexes are treated as an afterthought in application development, and performance suffers as a consequence.

This chapter examines in detail the different types of indexes, including B*Tree, bitmap, function-based, and application domain indexes, and discusses where they should and should not be used. I’ll also answer some common queries in the “Frequently Asked Questions and Myths About Indexes” section, such as “Do indexes work on views?” and “Why isn’t my index getting used?”

Chapter 12: Datatypes

There are a lot of datatypes to choose from. This chapter explores each of the 22 built-in datatypes, explaining how they are implemented, and how and when to use each one. First up is a brief overview of National Language Support (NLS), a basic knowledge of which is necessary to fully understand the simple string types in Oracle. We then move on to the ubiquitous NUMBER type. Next the LONG and LONG RAW types are covered, mostly from a historical perspective. The main objective here is to show how to deal with legacy LONG columns in applications and migrate them to the LOB type. Next, we delve into the various datatypes for storing dates and times, and investigate how to manipulate them to get what we need. The ins and outs of time zone support are also covered.

Next up are the LOB datatypes. We’ll cover how they are stored and what each of the many settings, such as IN ROW, CHUNK, RETENTION, and CACHE, means to us. When dealing with LOBs, it is important to understand how they are implemented and how they are stored by default, especially when it comes to tuning their retrieval and storage. We close the chapter by looking at the ROWID and UROWID types. These are special types, proprietary to Oracle, that represent the address of a row. We’ll cover when to use them as a column datatype in a table (which is almost never).

Chapter 13: Partitioning

Partitioning is designed to facilitate the management of very large tables and indexes by applying divide-and-conquer logic: breaking up a table or index into many smaller and more manageable pieces. It is an area where the DBA and developer must work together to maximize application availability and performance. Features introduced in Oracle 11g and Oracle 12c are also covered in detail.

This chapter covers both table and index partitioning. We look at partitioning using local indexes (common in data warehouses) and global indexes (common in OLTP systems).

Chapter 14: Parallel Execution

This chapter introduces the concept of and uses for parallel execution in Oracle. We’ll start by looking at when parallel processing is useful and should be considered, as well as when it should not be considered. After gaining that understanding, we move on to the mechanics of parallel query, the feature most people associate with parallel execution. Next, we cover parallel DML (PDML), which allows us to perform modifications using parallel execution. We’ll see how PDML is physically implemented and why that implementation leads to a series of restrictions regarding PDML.

We then move on to parallel DDL. This, in my opinion, is where parallel execution really shines. Typically, DBAs have small maintenance windows in which to perform large operations. Parallel DDL gives DBAs the ability to fully exploit the machine resources they have available, permitting them to finish large, complex operations in a fraction of the time it would take to do them serially.

The chapter closes on procedural parallelism, the means by which we can execute application code in parallel. We cover two techniques here. The first is parallel pipelined functions, or the ability of Oracle to execute stored functions in parallel dynamically. The second is “do it yourself” (DIY) parallelism, whereby we design the application to run concurrently.

Chapter 15: Data Loading and Unloading

The first half of the chapter focuses on external tables, a highly efficient means by which to bulk load and unload data. If you perform a lot of data loading, you should strongly consider using external tables. Also discussed in detail is the external table preprocessing feature that allows for operating system commands to be executed automatically as part of selecting from an external table.

The second half of this chapter focuses on SQL*Loader (SQLLDR) and covers the various ways in which we can use this tool to load and modify data in the database. Issues discussed include loading delimited data, updating existing rows and inserting new ones, unloading data, and calling SQLLDR from a stored procedure. Again, SQLLDR is a well-established and crucial tool, but it is the source of many questions with regard to its practical use.

Source Code and Updates

The best way to digest the material in this book is to thoroughly work through and understand the hands-on examples. As you work through the examples in this book, you may decide that you prefer to type in all the code by hand. Many readers choose to do this because it is a good way to get familiar with the coding techniques that are being used.

Whether you want to type the code in or not, all the source code for this book is available in the Source Code section of the Apress web site (www.apress.com). If you like to type in the code, you can use the source code files to check the results you should be getting—they should be your first stop if you think you might have typed an error. If you don’t like typing, then downloading the source code from the Apress web site is a must! Either way, the code files will help you with updates and debugging.

Errata

Apress makes every effort to make sure that there are no errors in the text or the code. However, to err is human, and as such we recognize the need to keep you informed of any mistakes as they’re discovered and corrected. Errata sheets are available for all our books at www.apress.com. If you find an error that hasn’t already been reported, please let us know. The Apress web site acts as a focus for other information and support, including the code from all Apress books, sample chapters, previews of forthcoming titles, and articles on related topics.