Expert Oracle SQL: Optimization, Deployment, and Statistics (2014)

PART 4. Optimization

CHAPTER 14. Why Do Things Go Wrong?

As I mentioned at the beginning of Chapter 1, SQL is a declarative programming language. In theory, you just have to say what you want to have done and then something works out how to do it. In the case of an Oracle database, that “something” is the CBO, and it should devise an appropriate execution plan for you. In theory you shouldn’t have to worry about getting incorrect results and you shouldn’t have to worry about performance. Since you are reading this book you have almost certainly realized that life isn’t as simple as that and that you do, in fact, have to worry quite a bit. With the exception of bugs, the results of a correctly written SQL statement will be correct, but optimal performance, or even acceptable performance, is far from guaranteed, even from a bug-free database.

So why is performance often so poor? We have touched on many of the issues already in this book. Without the correct set of extended statistics, cardinality errors will occur, and cardinality errors will totally confuse the CBO. But cardinality errors aren’t the only things that confuse the CBO. We discussed scalar subquery caching in Chapter 4. The CBO has no way to assess the benefits of scalar subquery caching, nor has it any way of knowing which blocks will be in the buffer cache when the query runs.

The subject of caching is just one example of missing information. Sometimes an index can only be used when the indexed column or columns are known to be not null. The CBO might know that an indexed column is not null because there is a NOT NULL constraint on the column or because an IS NOT NULL predicate appears in the WHERE clause. In the absence of such information the CBO will have to assume that the column may, in fact, be NULL and thus avoid the use of an index.
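As a quick illustrative sketch (the table and column names here are hypothetical and not taken from any schema used in this book), consider a nullable column ORDER_DATE that has a single-column index on it:

-- The index cannot satisfy the first query because rows with a NULL
-- ORDER_DATE have no index entry. The IS NOT NULL predicate in the second
-- query makes the index a legal access path, but the two queries only return
-- the same count if you know that ORDER_DATE is never actually NULL.
SELECT COUNT (*) FROM orders;

SELECT COUNT (*) FROM orders WHERE order_date IS NOT NULL;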

What about contention? Do you think that if an index is being updated by another session while you are querying it, your performance may suffer? I can tell you that it might. I have seen real-life cases where I have had to use an index that appeared to the CBO to be a poor choice, but the query performed well because my chosen index wasn’t being updated by other processes at the time of the query. On the other hand, the CBO’s chosen index was being updated by other processes, causing poor performance.

The concept of declarative programming languages means that the CBO is supposed to be omniscient, and so we can easily place most of the blame for poor execution plans on the inadequacies of the CBO. However, it is not within the remit of the CBO to design your physical database; if an index is missing the CBO can’t create it for you.

You can see that many of the concepts covered in this chapter have already been addressed earlier in the book, but I want to bring them together in one place and expand on them. Let us begin by taking another look at the single biggest problem for the CBO: cardinality errors.

Cardinality Errors

The term cardinality error refers to a significant discrepancy between the CBO’s estimate of the number of rows returned by a row source operation in a candidate execution plan and the actual number of rows that would be returned were that execution plan to be selected and run. Cardinality estimates are the most important pieces of information that the CBO generates while performing final state optimization. There are several reasons for cardinality errors. Let me begin with the biggest source of cardinality errors: correlated columns.

Correlation of Columns

I introduced the concept of column correlation in Chapter 6, and in Chapter 9 Listing 9-11 showed a correlation between two columns named TRANSACTION_DATE and POSTING_DATE. The correlation between these two date columns is easy to see, but in other cases it may not be. Consider the columns in the SALES table in the SH schema:

· PROD_ID

· CUST_ID

· TIME_ID

· PROMO_ID

· QUANTITY_SOLD

· AMOUNT_SOLD

These columns are reasonably realistic for a data warehouse table, although the example data is quite artificial. I am going to ask you to imagine that this table is for a wine merchant that has the following:

· 100 types of wine

· 10,000 customers

· 10 years of trading history (assume 10 values of TIME_ID)

· A wine promotion every month

· Sold 10,000,000 bottles of wine

· An average of 10 bottles of wine sold per order

This means that there are 1,000,000 rows in the table.

Let us see which columns in the table are correlated with each other. Let us start with PROD_ID and CUST_ID. At first glance these columns seem to have nothing to do with each other. But hang on; let us assume that we want to estimate the number of sales of a specific bottle of wine. We know that there are 100 types of wine so the selectivity will be 1% and the cardinality estimate (the average number of sales transactions for that wine) will be 1% of 1,000,000, or 10,000. Using the same logic we can see that there are 10,000 customers, so each customer will place 0.01% of the orders and thus the average customer places 100 orders over the ten-year period.

So now let us ask the question: How many bottles of a specific wine did a specific customer buy? Well, the CBO will simply multiply the two selectivities: 1% for the wine times 0.01% for the customer gives 0.0001%, and 0.0001% of 1,000,000 rows is just one row. In other words, the CBO assumes that each customer placed exactly one order for each of the 100 types of wine. This is unlikely to be accurate because typically customers will not sample every wine; they will buy lots of bottles of the wine they like. So CUST_ID and PROD_ID are correlated.

Tip If you realize that you will not find a row for each possible combination of two columns, the two columns are correlated and the CBO will underestimate cardinalities when predicates on both columns are present and there are no extended statistics.
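To make the tip concrete, here is a hedged sketch of the remedy for the wine merchant example: create a column-group extension on (CUST_ID, PROD_ID) and regather statistics so that the CBO can use the real number of distinct combinations instead of multiplying the individual column selectivities.

DECLARE
   ext_name   VARCHAR2 (128);
BEGIN
   -- Create the extension for the correlated column group...
   ext_name :=
      DBMS_STATS.create_extended_stats ('SH', 'SALES', '(CUST_ID,PROD_ID)');
   -- ...and regather statistics so that the new extension is populated.
   DBMS_STATS.gather_table_stats ('SH'
                                 ,'SALES'
                                 ,method_opt => 'FOR ALL COLUMNS SIZE AUTO');
END;
/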

Does each customer buy wine every year? Probably not. Customers come and go, so CUST_ID and TIME_ID are correlated. Will each customer avail themselves of every promotion? Unlikely. Most customers are likely to place orders of the same or similar amounts each time and it is virtually certain that not every combination of CUST_ID and AMOUNT_SOLD will be found in the table. So CUST_ID is correlated to every other column in the table! What about PROD_ID? Will every wine have been sold in every year? No: a 2012 wine wouldn’t have been sold in 2011 or earlier, and after it sells out you can’t get any more. Do promotions apply to every product? No. If you keep going with these thoughts you can see that almost every column is correlated to every other.

I have worked with some applications where almost every cardinality estimate in every critical execution plan is significantly lower than it should be because of the impact of column correlation.

Statistics Feedback and DBMS_STATS.SEED_COL_USAGE Features

The fact that determining the correct set of extended statistics to create is nigh on impossible for most complex applications is not lost on the optimizer team, who have provided two approaches for addressing this problem.

Let us start with a look at statistics feedback, an extended and renamed variant of the cardinality feedback feature that I mentioned in Chapter 6. If you get a significant cardinality error in 12cR1 or later, and you haven’t disabled statistics feedback by setting OPTIMIZER_ADAPTIVE_FEATURES to FALSE, then a SQL Plan Directive will most likely be created. The existence of a SQL Plan Directive may or may not precipitate one or both of the following consequences:

· Dynamic sampling for that statement or similar statements executed in the future, whether issued by the same session or a different session.

· The automatic creation of extended statistics for certain column groups the next time statistics are gathered.

In my opinion these sorts of dynamic adaptive features are undesirable in a production environment. As I explained in Chapter 6, we don’t normally want untested execution plans to be run for the first time in a production environment. But despite the brevity of this overview, I feel that SQL Plan Directives are a potentially useful feature for identifying the need for extended statistics on a test system during the early stages of a project lifecycle. The use of dynamic sampling during these early stages is also useful for identifying issues that extended statistics can’t currently address, such as correlated columns in multiple tables. You can read more about SQL Plan Directives in the SQL tuning guide.
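If you want to see what statistics feedback has been up to on a 12cR1 test system, a sketch along the following lines lists the directives recorded against the SH schema; the final command sets the parameter mentioned above that disables the adaptive behavior.

SELECT d.directive_id, o.object_name, d.type, d.state, d.reason
  FROM dba_sql_plan_directives d, dba_sql_plan_dir_objects o
 WHERE d.directive_id = o.directive_id
   AND o.owner = 'SH'
   AND o.object_type = 'TABLE';

-- Disable statistics feedback and the other 12cR1 adaptive features.
ALTER SYSTEM SET optimizer_adaptive_features = FALSE;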

An entirely separate feature that may be easier to manage was introduced and supported in 11gR2 but only documented for the first time in 12cR1. The DBMS_STATS.SEED_COL_USAGE procedure allows you to collect information on the need for column group statistics while you run a particular workload. Listing 14-1 shows the basic approach.

Listing 14-1. Obtaining extended statistics through DBMS_STATS.SEED_COL_USAGE

BEGIN
   DBMS_STATS.reset_col_usage (ownname => 'SH', tabname => NULL);   -- Optional
   DBMS_STATS.seed_col_usage (NULL, NULL, 3600);
END;
/

--- Wait a little over an hour

DECLARE
   dummy   CLOB;
BEGIN
   FOR r IN (SELECT DISTINCT object_name
               FROM dba_objects, sys.col_group_usage$
              WHERE obj# = object_id AND owner = 'SH' AND object_type = 'TABLE')
   LOOP
      SELECT DBMS_STATS.create_extended_stats ('SH', r.object_name)
        INTO dummy
        FROM DUAL;
   END LOOP;
END;
/

Listing 14-1 begins by calling the undocumented DBMS_STATS.RESET_COL_USAGE procedure to remove any unrepresentative data about the SH schema from previous test runs, but this is optional. The next call is to DBMS_STATS.SEED_COL_USAGE, which starts the monitoring of SQL statements for 3,600 seconds. After the hour has elapsed you can create extended statistics based on the column usage statistics obtained. Notice that the call to DBMS_STATS.CREATE_EXTENDED_STATS omits the specification of any extension as these are automatically obtained from the data dictionary.

You can work out the tables that need extended statistics by looking at the data dictionary table SYS.COL_GROUP_USAGE$ as shown in Listing 14-1. As an alternative, if you pass NULL as the table name (as opposed to omitting the table name) you can create the necessary extensions for the SH schema as a whole in one call.
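For the record, here is a sketch of that one-call variant; as in Listing 14-1, the return value is simply discarded.

DECLARE
   dummy   CLOB;
BEGIN
   -- A NULL table name asks CREATE_EXTENDED_STATS to create every extension
   -- recorded for the SH schema during the monitored workload.
   SELECT DBMS_STATS.create_extended_stats ('SH', NULL) INTO dummy FROM DUAL;
END;
/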

Let me leave correlated columns now and move on to another major source of cardinality errors: functions.

Functions

When you include a function call in a predicate the CBO will normally have no idea what the selectivity of the predicate will be. Consider Listing 14-2:

Listing 14-2. Function call in predicate

SELECT *
FROM some_table t1, another_table t2
WHERE some_function (t1.some_column) = 1 AND t1.c1 = t2.c2;

Suppose that the object statistics for SOME_TABLE say that the table has 1,000,000 rows. How many rows from SOME_TABLE will remain after applying the filter predicate some_function (t1.some_column) = 1? You can’t tell, can you? And neither can the CBO. If only one or two rows are selected from SOME_TABLE then perhaps a NESTED LOOPS join with ANOTHER_TABLE is warranted. But perhaps all 1,000,000 rows will be selected, in which case a HASH JOIN will be the only way to avoid disastrous performance. I deliberately chose meaningless table and function names in the above example. Suppose the function call in Listing 14-2 was actually MOD (t1.some_column, 2)? What then? Since MOD is a function supplied by the SQL engine, surely the CBO should realize that the function call is likely to filter out only about half the rows? Well, not really. The SQL language provides dozens of built-in functions, and it would be impractical for the CBO to hold rules for identifying cardinality estimates for each one. The only practical thing for the CBO to do is to treat them in the same way as it treats user-written functions.

When the CBO has no clue as to how to determine cardinality it makes an arbitrary guess using one of a number of fixed built-in values. In the case of a function call a selectivity estimate of 1% is used.
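If you do know roughly how selective one of your own functions is, you can tell the CBO with ASSOCIATE STATISTICS rather than living with the fixed 1% guess. Here is a hedged sketch using a hypothetical function that mimics the MOD example, declared with a default selectivity of 50 percent:

CREATE OR REPLACE FUNCTION is_even (p_n NUMBER)
   RETURN NUMBER
   DETERMINISTIC
IS
BEGIN
   RETURN 1 - MOD (p_n, 2);   -- 1 for even values, 0 for odd values
END;
/

-- Declare that a predicate such as is_even (t1.some_column) = 1 keeps
-- roughly half of the rows.
ASSOCIATE STATISTICS WITH FUNCTIONS is_even DEFAULT SELECTIVITY 50;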

Stale Statistics

I will be brief here. I have already discussed the topic of stale statistics at length in Chapter 6 and will come back to it again in Chapter 20. Just add stale statistics (in a non-TSTATS environment) to the growing list of reasons why cardinality errors occur.

Daft Data Types

Daft data types is the title of a section in Chapter 6 of the book Cost-Based Oracle Fundamentals. That chapter is dedicated entirely to the topic of cardinality errors, and I have unashamedly copied the section title. Listing 14-3 is also almost directly copied from the same section in Jonathan Lewis’ book.

Listing 14-3. Daft data types

CREATE TABLE t2
AS
   SELECT d1
         ,TO_NUMBER (TO_CHAR (d1, 'yyyymmdd')) n1
         ,TO_CHAR (d1, 'yyyymmdd') c1
     FROM (SELECT TO_DATE ('31-Dec-1999', 'dd-Mon-yyyy') + ROWNUM d1
             FROM all_objects
            WHERE ROWNUM <= 1827);

SELECT *
  FROM t2
 WHERE d1 BETWEEN TO_DATE ('30-Dec-2002', 'dd-mon-yyyy')
              AND TO_DATE ('05-Jan-2003', 'dd-mon-yyyy');

------------------------------------------
| Id | Operation | Name | Rows |
------------------------------------------
| 0 | SELECT STATEMENT | | 8 |
|* 1 | TABLE ACCESS FULL| T2 | 8 |
------------------------------------------

SELECT *
FROM t2
WHERE n1 BETWEEN 20021230 AND 20030105;

------------------------------------------
| Id | Operation | Name | Rows |
------------------------------------------
| 0 | SELECT STATEMENT | | 396 |
|* 1 | TABLE ACCESS FULL| T2 | 396 |
------------------------------------------

Listing 14-3 generates one row for each date between January 1, 2000 and December 31, 2004. The first query uses a date column to pick the seven rows between December 30, 2002 and January 5, 2003 inclusive, and the cardinality estimate is almost right. The second query uses a predicate on a numeric column that stores dates in YYYYMMDD format. The cardinality estimate is now way off because when viewed as a number the range 20,021,230 to 20,030,105 seems like quite a big chunk of the total range 20,000,101 to 20,041,231: roughly 22%, and 22% of 1,827 rows is about 400, which is where the estimate of 396 comes from.

I could copy and paste much more of Jonathan’s chapter but I don’t need to. The key thing is not to try and memorize a bunch of reasons why the CBO gets things wrong but rather to develop a mindset that allows you to work out what goes wrong when you need to.

I think I have said all I want to about cardinality errors. But cardinality errors are not the only reasons why the CBO gets things wrong. Let us continue this story of CBO issues by turning our attention to caching effects.

Caching Effects

Many Oracle specialists develop a habit of using the term “cache” as an abbreviation for “buffer cache.” I do that myself all the time. However, there are many different sorts of cache in an Oracle database. I covered the OCI, result, and function caches in Chapter 4 and went into quite some detail about the scalar subquery cache in Chapter 13. All of these caches have an effect on the elapsed time and resource consumption of a SQL statement. The CBO has no way of quantifying the benefits of any of these caches and therefore doesn’t consider caching effects in its calculations in any way.

I’d like to discuss one particular case where the CBO might one day be able to think things through a little bit more than it currently does. It involves the index clustering factor statistic that I explained in Chapter 9. Listing 14-4 shows an interesting test case.

Listing 14-4. Caching effects of index clustering with a single-column index

ALTER SESSION SET statistics_level=all;

SET PAGES 900 LINES 200 SERVEROUTPUT OFF
COMMIT;
ALTER SESSION ENABLE PARALLEL DML;

CREATE TABLE t1
(
   n1       INT
  ,n2       INT
  ,filler   CHAR (10)
)
NOLOGGING;

INSERT /*+ parallel(t1 10) */
      INTO t1
   WITH generator
        AS (    SELECT ROWNUM rn
                  FROM DUAL
            CONNECT BY LEVEL <= 4500)
   SELECT TRUNC (ROWNUM / 80000)
         ,ROWNUM + 5000 * (MOD (ROWNUM, 2))
         ,RPAD ('X', 10)
     FROM generator, generator;

COMMIT;

CREATE INDEX t1_n1
ON t1 (n1)
NOLOGGING
PARALLEL 10;

COLUMN index_name FORMAT a10

SELECT index_name, clustering_factor
FROM all_indexes
WHERE index_name = 'T1_N1';
INDEX_NAME CLUSTERING_FACTOR
---------- -----------------
T1_N1 74102

ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT MAX (filler)
FROM t1
WHERE n1 = 2;

SELECT *
FROM TABLE (
DBMS_XPLAN.display_cursor (NULL
,NULL
,'basic cost iostats -predicate'));

----------------------------------------------------------------------------------------------
| Id | Operation | Name | Cost (%CPU)| A-Rows |A-Time | Buffers | Reads |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 458 (100)| 1 | 00:00.58| 440 | 440 |
| 1 | SORT AGGREGATE | | | 1 |00:00.58 | 440 | 440 |
| 2 | TABLE ACCESS BY INDEX RO| T1 | 458 (1)| 80000 |00:00.56 | 440 | 440 |
| 3 | INDEX RANGE SCAN | T1_N1 | 165 (0)| 80000 |00:00.21 | 159 | 159 |
----------------------------------------------------------------------------------------------

ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT /*+ full(t1) */
MAX (filler)
FROM t1
WHERE n1 = 2;

SELECT *
FROM TABLE (
DBMS_XPLAN.display_cursor (NULL
,NULL
,'basic cost iostats -predicate'));
--------------------------------------------------------------------------------------
| Id | Operation | Name | Cost (%CPU)| A-Rows |A-Time | Buffers | Reads |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 19507 (100)| 1 | 00:08.67| 71596 | 71592 |
| 1 | SORT AGGREGATE | | | 1 |00:08.67 | 71596 | 71592 |
| 2 | TABLE ACCESS FULL| T1 | 19507 (1)| 80000 |00:08.67 | 71596 | 71592 |
--------------------------------------------------------------------------------------

Listing 14-4 creates a table T1 with 20,250,000 rows. The rows are added ordered by column N1. A single-column index, T1_N1, is then created on column N1 and statistics are gathered. The clustering factor of the index is quite low at 74,102. To make it a level playing field I flush the buffer cache before each query.

When we query the table looking for rows with a specific value of N1 the CBO elects to use the index T1_N1. When we force a full table scan with a hint we see a much higher cost and a much higher actual elapsed time. The CBO has, correctly, concluded that a full table scan would take much longer. Listing 14-5 shows us what happens when we add a second column to the index.

Listing 14-5. Caching effects of index clustering with a multi-column index

DROP INDEX t1_n1;

CREATE INDEX t1_n1_n2 ON t1(n1,n2);

SELECT index_name, clustering_factor
FROM all_indexes
WHERE index_name = 'T1_N1_N2';

INDEX_NAME CLUSTERING_FACTOR
---------- -----------------
T1_N1_N2 18991249

ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT MAX (filler)
FROM t1
WHERE n1 = 2;

SELECT *
FROM TABLE (DBMS_XPLAN.display_cursor (NULL,
NULL,
'basic cost iostats -predicate'));
--------------------------------------------------------------------------------------
| Id | Operation | Name | Cost (%CPU)| A-Rows |A-Time | Buffers | Reads |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 19507 (100)| 1 | 00:09.78| 71596 | 71592 |
| 1 | SORT AGGREGATE | | | 1 |00:09.78 | 71596 | 71592 |
| 2 | TABLE ACCESS FULL| T1 | 19507 (1)| 80000 |00:09.77 | 71596 | 71592 |
--------------------------------------------------------------------------------------

ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT /*+ index(t1 t1_n1_n2) */ MAX (filler)
FROM t1
WHERE n1 = 2;

SELECT *
FROM TABLE (DBMS_XPLAN.display_cursor (NULL,
NULL,
'basic cost iostats -predicate'));
------------------------------------------------------------------------------------------------
| Id | Operation | Name | Cost (%CPU)| A-Rows |A-Time | Buffers | Reads |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 75014 (100)| 1 | 00:00.67| 74497 | 494 |
| 1 | SORT AGGREGATE | | | 1 |00:00.67 | 74497 | 494 |
| 2 | TABLE ACCESS BY INDEX R| T1 | 75014 (1)| 80000 |00:00.65 | 74497 | 494 |
| 3 | INDEX RANGE SCAN | T1_N1_N2 | 231 (0)| 80000 |00:00.23 | 214 | 214 |
------------------------------------------------------------------------------------------------

Listing 14-5 drops index T1_N1 and creates the index T1_N1_N2 on the columns N1 and N2. When we run the query now a full table scan is selected by the CBO. When we force the use of the new index with a hint we see that the CBO has given a much higher cost to the multi-column index operation than it did when considering access with the single-column index T1_N1. Why? T1_N1_N2 is slightly larger than T1_N1, but that doesn’t explain the huge increase in cost. Has the CBO got it wrong?

Well, yes and no. If you look at the runtime statistics for the indexed operation you will see that, in fact, the number of consistent gets has increased massively (from 440 in Listing 14-4 to 74,497 in Listing 14-5) with the addition of N2 to the index, just as the CBO suspected. The reason for the huge increase in logical I/Os is that the index entries for T1_N1_N2 are ordered differently from those in T1_N1. I have constructed the data in such a way that consecutive index entries in T1_N1_N2 are almost guaranteed to reference different table blocks, whereas most consecutive index entries in T1_N1 reference the same table block and can be processed as part of the same logical I/O operation. This difference is reflected in the index clustering factor statistic for T1_N1_N2, which, at 18,991,249 (almost equal to the number of rows in the table, as expected), is much higher than that for T1_N1. This explains how the CBO arrived at the correspondingly higher cost estimate for the query using T1_N1_N2.

Although the CBO has correctly realized that index access through T1_N1_N2 involves a large number of logical I/O operations, it hasn’t realized that the operations are to the same small set of table blocks. As a consequence there are just 494 physical I/Os for indexed access and the elapsed time is similarly low. The number of physical I/Os for the full table scan is a massive 71,592 because the table isn’t held in its entirety in the buffer cache.

Transitive Closure

If you search the Internet for a definition of transitive closure then you are likely to find a lot of confusing mathematical gobbledygook. Let me give you the layman’s explanation: if A=B and B=C then A=C. Pretty simple, really, but you might be surprised at how little Oracle understands transitive closure. Listing 14-6 shows three interesting queries and their associated execution plans.

Listing 14-6. Transitive closure examples

CREATE TABLE t3 AS SELECT ROWNUM c1 FROM DUAL CONNECT BY LEVEL <=10;
CREATE TABLE t4 AS SELECT MOD(ROWNUM,10)+100 c1 FROM DUAL CONNECT BY LEVEL <= 100;
CREATE TABLE t5 AS SELECT MOD(ROWNUM,10) c1,RPAD('X',30) filler FROM DUAL
CONNECT BY LEVEL <= 10000;

CREATE INDEX t5_i1 ON t5(c1);

SELECT COUNT (*)
FROM t3, t5
WHERE t3.c1 = t5.c1 AND t3.c1 = 1;

-----------------------------------------------
| Id | Operation | Name | Rows |
-----------------------------------------------
| 0 | SELECT STATEMENT | | 1 |
| 1 | SORT AGGREGATE | | 1 |
| 2 | MERGE JOIN CARTESIAN| | 1000 |
|* 3 | TABLE ACCESS FULL | T3 | 1 |
| 4 | BUFFER SORT | | 1000 |
|* 5 | INDEX RANGE SCAN | T5_I1 | 1000 |
-----------------------------------------------

Predicate Information(identified by operation id):
---------------------------------------------------

3 - filter("T3"."C1"=1)
5 - access("T5"."C1"=1)

SELECT *
FROM t3, t4, t5
WHERE t3.c1 = t4.c1 AND t4.c1 = t5.c1;

--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 39 | 8 (13)| 00:00:01 |
| 1 | NESTED LOOPS | | | | | |
| 2 | NESTED LOOPS | | 1 | 39 | 8 (13)| 00:00:01 |
|* 3 | HASH JOIN | | 1 | 6 | 7 (15)| 00:00:01 |
| 4 | TABLE ACCESS FULL | T3 | 10 | 30 | 3 (0)| 00:00:01 |
| 5 | TABLE ACCESS FULL | T4 | 100 | 300 | 3 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | T5_I1 | 1000 | | 1 (0)| 00:00:01 |
| 7 | TABLE ACCESS BY INDEX ROWID| T5 | 1 | 33 | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("T3"."C1"="T4"."C1")
6 - access("T4"."C1"="T5"."C1")

SELECT *
FROM t3, t4, t5
WHERE t3.c1 = t5.c1 AND t4.c1 = t5.c1;

----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 39 | 25 (4)| 00:00:01 |
|* 1 | HASH JOIN | | 1 | 39 | 25 (4)| 00:00:01 |
|* 2 | HASH JOIN | | 1 | 36 | 22 (5)| 00:00:01 |
| 3 | TABLE ACCESS FULL| T4 | 100 | 300 | 3 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL| T5 | 10000 | 322K| 18 (0)| 00:00:01 |
| 5 | TABLE ACCESS FULL | T3 | 10 | 30 | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information(identified by operation id):
---------------------------------------------------

1 - access("T3"."C1"="T5"."C1")
2 - access("T4"."C1"="T5"."C1")

The predicates in our first query are T3.C1=1 and T3.C1=T5.C1. If you look at the predicate information shown in the execution plan for this first query you will see that the T3.C1=T5.C1 predicate has been replaced by T5.C1=1. Transitive closure is generally recognized in simple situations like this, when a literal value or bind variable is present, but anything more complex will confuse the CBO. The second and third queries are semantically equivalent. The two predicates T3.C1=T4.C1 AND T4.C1=T5.C1 are clearly equivalent to the two predicates T3.C1=T5.C1 AND T4.C1=T5.C1. Both pairs of predicates imply that all three columns are identical. And yet we can see that the selected execution plans for the last two statements are different: the CBO has no “transitive closure” transformation. Unless the performance of the two execution plans is identical they can’t both be optimal! A sketch of the manual workaround appears below. However, “transitive closure” is just one of many potential transformations that the CBO might one day implement that don’t currently exist. Let us consider the implications of this a bit more.
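Until such a transformation exists, the practical workaround is to perform the transitive closure yourself: write the logically redundant third predicate into the statement so that the CBO can consider the full set of join options. A minimal sketch using the tables from Listing 14-6:

SELECT *
  FROM t3, t4, t5
 WHERE t3.c1 = t4.c1
   AND t4.c1 = t5.c1
   AND t3.c1 = t5.c1;   -- redundant, but it gives the CBO more options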

Unsupported Transformations

Consider the two SQL statements in Listing 14-7. Take a look and see which statement you think is better written.

Listing 14-7. Unsupported transformations

CREATE TABLE t6
AS
SELECT ROWNUM c1,MOD(ROWNUM, 2) c2, RPAD ('X', 10) filler
FROM DUAL
CONNECT BY LEVEL <= 1000;
CREATE TABLE t7 AS
SELECT ROWNUM c1,ROWNUM c2 FROM DUAL CONNECT BY LEVEL <= 10;

CREATE INDEX t6_i1 ON t6(c2,c1);

WITH subq AS
(SELECT t6.c2, MAX (t6.c1) max_c1
FROM t6
GROUP BY t6.c2)
SELECT c2, subq.max_c1
FROM t7 LEFT JOIN subq USING (c2);

-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 290 | 4 (23)| 00:00:01 |
|* 1 | HASH JOIN OUTER | | 10 | 290 | 4 (23)| 00:00:01 |
| 2 | TABLE ACCESS FULL | T7 | 10 | 30 | 2 (0)| 00:00:01 |
| 3 | VIEW | | 2 | 52 | 2 (20)| 00:00:01 |
| 4 | HASH GROUP BY | | 2 | 14 | 2 (20)| 00:00:01 |
| 5 | TABLE ACCESS FULL| T6 | 1000 | 7000 | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("T7"."C2"="SUBQ"."C2"(+))

SELECT c2, (SELECT /*+ no_unnest */ MAX (c1)
FROM t6
WHERE t6.c2 = t7.c2) max_c1
FROM t7;

--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 30 | 3 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 8 | | |
| 2 | FIRST ROW | | 1 | 8 | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN (MIN/MAX)| T6_I1 | 1 | 8 | 2 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL | T7 | 10 | 30 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("T6"."C2"=:B1)

If you look closely at the two statements you will see that they are semantically equivalent. Both queries list all the rows from T7 together with the corresponding maximum value (if any) from T6 for the matching value of C2. So which is the superior construct?

The upper query in Listing 14-7 will obtain the maximum value of C1 in T6 for each value of C2. This calculation is guaranteed to be performed exactly once for each value of C2 in T6.

The lower query is constructed in such a way that the maximum value of C1 is only calculated for values of C2 present in T7. However, the maximum value of C1 for a value of C2 that is present in T7 is potentially calculated more than once. Subquery caching may mitigate this. Furthermore, we have the MAX/MIN optimization introduced in Listing 10-8 to consider.

Seem familiar? Should we do everything one time (including stuff that we may never need to do) or should we avoid doing stuff that we never need to do at the expense of risking doing stuff that we do need to do more than once? This is precisely the same sort of argument surrounding indexed access to a table with nested loops versus hash joins and full table scans. Get your mind thinking this way. This is what it is all about.

So it seems that the nature of the data and the availability of a suitable index should dictate which construct is superior. If T7 is a tiny table with just a few rows and T6 has millions of rows with thousands of values of C2, then it seems that the latter construct would be superior. On the other hand, if T7 has millions of rows and T6 has just a few thousand (small but sufficiently large to risk blowing the scalar subquery cache) then it is probably better to use the former construct.

But there is one final complication: Oracle 12cR1 will normally unnest the subquery in the lower query to transform it into the former. To prevent this heuristic transformation from happening you need to include a NO_UNNEST hint.

Missing Information

Another reason why the CBO may not identify a suitable execution plan is that it doesn’t understand business rules. If you don’t somehow embed business rules in your SQL the CBO may end up doing the wrong thing. Take a look at Listing 14-8, which shows two queries against the ORDERS table in the example OE schema.

Listing 14-8. Missing information

SELECT sales_rep_id, SUM (order_total)
FROM oe.orders
WHERE order_status = 7
GROUP BY sales_rep_id
ORDER BY sales_rep_id;

-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 144 | 4 (25)| 00:00:01 |
| 1 | SORT GROUP BY | | 9 | 144 | 4 (25)| 00:00:01 |
|* 2 | TABLE ACCESS FULL| ORDERS | 73 | 1168 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------

SELECT sales_rep_id, SUM (order_total)
FROM oe.orders
WHERE order_status = 7 AND sales_rep_id IS NOT NULL
GROUP BY sales_rep_id
ORDER BY sales_rep_id;

-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 2 (0)| 00:00:01 |
| 1 | SORT GROUP BY NOSORT | | 9 | 2 (0)| 00:00:01 |
|* 2 | TABLE ACCESS BY INDEX ROWID| ORDERS | 49 | 2 (0)| 00:00:01 |
|* 3 | INDEX FULL SCAN | ORD_SALES_REP_IX | 70 | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

The first query lists the total order value for each SALES_REP_ID where ORDER_STATUS is 7. The execution plan uses a full table scan and uses a sort to group the data, as the data is required to be sorted on output.

The second query does precisely the same thing except that we have added an extra predicate: AND SALES_REP_ID IS NOT NULL. As you can see, that extra predicate has resulted in a completely different execution plan that, amongst other things, requires no sort. The point is that there is no NOT NULL constraint on SALES_REP_ID because some transactions, including all online sales, have no salesman. You may know, however, that ORDER_STATUS=7 implies the use of a salesman.1 Unless you supply the extra predicate the CBO has no way of knowing that it is safe to use the index and thus produces a potentially inferior execution plan.
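One way to embed such a business rule once and for all is to wrap it in a view so that nobody has to remember to type the extra predicate; the view name below is my own invention and is not part of the OE sample schema.

CREATE OR REPLACE VIEW oe.status7_orders_v
AS
   SELECT *
     FROM oe.orders
    WHERE order_status = 7
      AND sales_rep_id IS NOT NULL;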

Bad Physical Design

I am one of those individuals who likes to blame other people or other things when things go wrong. My wife will tell you that, in my mind, nothing is ever my fault. That is one of the things I love about the CBO. It is a good fall guy when it comes to poor performance! When my SQL performs poorly (and yes, I sometimes write poorly performing SQL) I can usually claim that the CBO should have figured out what I did wrong and should have fixed it. However, the CBO can’t be blamed for a missing index! I would have to find somebody else to blame for that!

There is, of course, much more to physical database design than just working out which indexes to create. Top amongst those considerations is working out which indexes to drop. The next chapter, Chapter 15, will go into the topic of physical database design in much more detail, so I will stop now. I just want you to add bad physical design to the ever-increasing list of reasons why things go wrong.

Contention

When the CBO considers what execution plan to use for a statement it assumes nothing about what other database activity may be going on at the same time. Consideration of concurrent activity might affect the optimal execution plan.

Consider the following scenario. We run ten jobs one after the other. The first job inserts data into a table T1 with JOB_ID=1. The second job inserts data with JOB_ID=2 and so on. Suppose that during the execution of the last of the ten jobs we decide to query some of the data for JOB_ID=1. Our query specifies WHERE JOB_ID=1 AND PRODUCT_CODE=27. The CBO, blissfully unaware of any contention issues, elects to use an index on PRODUCT_CODE because it is more selective and has a low clustering factor. But the last of the ten jobs is bashing away, inserting lots of rows with PRODUCT_CODE=27. This means that our query might have to do a lot of work to roll back these inserts and reconstruct the blocks as they were at the time the query started.

Now suppose we hint our query to use an index on JOB_ID. The use of the less selective index on JOB_ID might be more efficient, as the section of the index covering JOB_ID=1 is no longer being updated while the query runs. Furthermore, the table blocks being accessed filled up long ago and are no longer being changed. The fact that neither the index blocks nor the table blocks accessed through the JOB_ID index are modified during the course of our query may limit or indeed eliminate the rollback activity. A sketch of such a hinted query appears below. I once worked with a query that had been running for several hours; by changing it to use an ostensibly poor index the execution time dropped to a couple of minutes!
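Here is roughly what the hinted query in this scenario might look like; the table, column, and index names are illustrative rather than taken from a real schema.

-- Force the quieter, less selective JOB_ID index instead of the
-- PRODUCT_CODE index that the CBO would otherwise pick.
SELECT /*+ index(t1 t1_job_id_ix) */
       *
  FROM t1
 WHERE job_id = 1 AND product_code = 27;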

Summary

This chapter may have depressed you. All this negative talk about a seemingly never-ending list of things that can, and more often than not do, prevent the CBO from coming up with the optimal execution plan may seem daunting. And my list only scratches the surface. There are many more reasons why things go wrong that I haven’t even covered here. The more complex a SQL statement is, and in particular the more complex the predicates are, the less likely the CBO is to guess right. Don’t get too depressed, though. The main point of this chapter is to remove any rose-tinted spectacles that you may have been wearing. Once you understand that things do go wrong, and the main reasons why they go wrong, you are in a better position to do something about it! Let us get started in doing something about it right now. On to Chapter 15 and a look at how good physical database design can help SQL performance.

__________________

1It would have made more sense for me to suggest that direct sales have a sales representative and online sales do not. Unfortunately, the sample data doesn’t follow that rule!