Expert Oracle SQL: Optimization, Deployment, and Statistics (2014)
What is this book about?
How many of us have been woken at some antisocial hour because a SQL statement that was performing well suddenly started behaving badly because of a changed execution plan? In most circumstances, Oracle would probably recommend that customers suffering repeatedly from such issues investigate the use of SQL Plan Baselines, Oracle’s strategic feature for stabilizing execution plans in a production environment. This book introduces TSTATS, the name given by one of my clients to a controversial technology that bypasses altogether the need to gather object statistics on a production system and can be thought of as an alternative to SQL Plan Baselines.
Although Chapter 6 and Chapter 20 are dedicated to the issue of deploying and managing statistics in a production environment, the main theme of the book is tuning SQL for Oracle databases. There are other excellent books covering Oracle database performance in general, but this book is focused specifically on SQL tuning.
In my opinion, the key to finding the best solution to a SQL tuning problem usually lies in fully understanding the problem that you are addressing as well as in understanding the technologies at your disposal. A large portion of this book is dedicated to a study of how the cost-based optimizer (CBO) and the runtime engine work and how to obtain and interpret diagnostic data, such as the execution plans displayed by functions from the DBMS_XPLAN package.
Some readers may find it surprising is that I make very little reference to the famous 10046 and 10053 traces that form the foundation of many books on SQL performance. In practice, I use a 10046 trace about once a year and a 10053 trace about once every three years. In my opinion, there are easier ways to diagnose the vast majority of SQL tuning problems, and I will explain the techniques that I use in detail.
You will notice that I have used the term “in my opinion” twice in the last few paragraphs, and I will use it many more times throughout this book. The fear of alienating audiences dissuades many authors from expressing controversial opinions, particularly if they are not perfectly aligned with Oracle-recommended practice. But there is often more than one way to look at a topic, and I hope this book provides you with something new to think about.
But don’t be too worried that this book is all about philosophy and grandstanding. There is a lot of technical content in this book that you won’t find in other books or in blogs and plenty of examples to help you through.
Why did I write this book?
The process that led me to write this book began with a talk by Kyle Hailey (http://www.oraclerealworld.com/) at the 2010 UK Oracle User Group (UKOUG) national conference. The topic of Kyle’s talk is immaterial, but he mentioned en passant a book called SQL Tuning written in 2003 by Dan Tow and published by O’Reilly. I was sitting next to Jonathan Lewis (http://jonathanlewis.wordpress.com/) in the audience, and Jonathan agreed with Kyle that this was an excellent book, one of only a handful that he recommends on his blog. I felt obliged to buy the book and can confirm that it is an outstanding publication.
The small difficulty I have in 2014 with Dan’s book is that it focuses almost entirely on a scientific and foolproof way to determine the correct join order for a set of tables. Although join order is still a performance concern, it is less so in 2014 than it was in 2003 for several reasons:
· Since the CBO was introduced into the Oracle database product in version 7, it has become increasingly more capable of identifying the correct join order.
· New options, such as right-deep join trees and star transformations, which I will cover in chapters 11 and 13 respectively, mean that there is even less possibility that the CBO will pick a disastrous join order.
· We now have at our disposal Wolfgang Breitling’s Tuning by Cardinality Feedback (http://www.centrexcc.com/Tuning%20by%20Cardinality%20Feedback.pdf) tuning technique, which I will discuss briefly in Chapter 6 and which provides a simpler approach to solving simple problems like join order. Wolfgang’s approach is nowhere near as formal and foolproof as Dan’s, but it works 90% of the time and is easier to master.
Although join order is less of a problem in 2014 than it was in 2003, there are new challenges. The base database product (the part that requires no extra licensing) now includes analytic functions, parallel processing, and the MODEL clause, all of which open up ever more sophisticated options for business logic in the database layer and in SQL statements in particular. Licensed options, such as partitioning, can also help solve some performance problems that otherwise might be very cumbersome, at the very least, to solve. All these nice new features generate complexity, and with that complexity comes the need to understand more aspects of how a SQL statement behaves. Chapter 17, for example, is dedicated entirely to the topic of sorting.
And so the idea of this book was born. In December 2011, at the next UKOUG conference, I was still mulling over the idea of writing this book and looked for some encouragement from other authors that I knew. I received a mixed response. Yes, a new book on SQL tuning would be nice. But given the amount of work involved, I would be crazy to undertake it.
But I was already emotionally committed and a short while later I asked Christian Antognini, author of Troubleshooting Oracle Performance, to introduce me to his publisher at Apress.
Running the examples
Scripts to run the SQL statements in the listings in this book can be downloaded from http://www.apress.com/9781430259770. If you want to run the scripts yourself, I would recommend using version 12cR1 of the database product, although most of the scripts do run on 11gR2. The following are additional requirements:
· The database should have an 8k block size and you should set the initialization parameter db_file_multiblock_read to 128.
· The sample schemas (SCOTT, OE, HR, PM and SH) need to be installed. See the Sample Schemas manual and the $ORACLE_HOME/rdbms/admin/scott.sql script for more details.
As the book has been reviewed it has become clear that the different ways of installing the example schemas can lead to inconsistencies in execution plans. The downloadable materials include a set of object statistics for the example schemas that can be installed with datapump import. These statistics should help you reproduce the results shown in this book. Full instructions are included in the README file included in the materials.
At the time of publication, the only point release of Oracle database 12cR1 available is 188.8.131.52, and unfortunately there is a bug related to join cardinality estimates that renders one of the key elements of the TSTATS technology described in chapters 6 and 20 unworkable as described. Hopefully this bug will be fixed in a later point release, but in the meantime the downloadable code includes a workaround: rather than removing the high- and low-value column statistics altogether, the high value of the column statistic is set very high and the low value set very low.
The structure of this book
This book is composed of five parts:
Part 1 introduces some basic concepts. I cover the SQL language itself and the basics of execution plans. I introduce the cost-based optimizer (CBO) and the runtime engine and give an overview of my approaches to optimization and managing object statistics in a production environment. Even if you are very experienced with SQL I would recommend that you at least skim this first part as a command of the concepts covered is crucial to following the rest of the book.
Part 2 covers more advanced aspects of SQL and execution plans and explains how object statistics are used by the CBO to help it select an execution plan.
Part 3 provides a study of the CBO. I don’t get bogged down with lots of formulas for costing; I cover the essential details of access method, join order, and join method that you will need during your SQL tuning life. I also take a detailed look at the all of the optimizer transformations that you are likely to encounter.
Part 4 covers optimizing SQL. Now that we have a firm grounding in the tools of the trade, it is finally time to look at how we can apply all this knowledge to solving real SQL performance issues. I cover physical database design and rewriting SQL, and then take a detailed look at that most controversial of topics: hints. There is also a chapter dedicated to sorts and another that covers a range of advanced techniques for solving unusual problems.
Part 5 is a single chapter dedicated to TSTATS, a technique for managing object statistics in a production environment. TSTATS virtually eliminates unwanted execution changes without the need to manage repositories of SQL Plan Baselines. A controversial chapter to be sure, but the techniques described in this chapter have proven themselves in a number of mission-critical applications over a number of years.
The key messages of the book
As with most problems in life, solving a SQL tuning problem or a production instability problem can be made much easier, and sometimes trivial, by fully understanding it. An obvious statement, perhaps, but I have lost count of the number of times I have seen people trying to solve a SQL performance problem without understanding it. For example, it may be that the best solution to a performance problem is to gather statistics. Perhaps you just need to stop and restart the SQL. Perhaps you need to run the SQL Tuning Advisor and create a SQL profile. But don’t just pick one of these options at random and work through the list when it doesn’t work. For example, if your SQL statement includes a temporary table then the SQL Tuning Advisor is unlikely to be of much use because the temporary table will be empty when the SQL Tuning Advisor runs. You need to begin by reviewing the SQL statement!
Why are so many problems with poorly performing SQL approached in a haphazard way? One reason is the pressure that technicians are put under, often at antisocial hours, to do something quickly. The other reason is that very few people have enough knowledge to approach a performance problem in a systematic way. I can’t help you with the first of these two problems, but hopefully after reading this book you will at least have the knowledge and the skill, if not always the time, to approach your performance problems in a systematic way, starting with the problem and working towards a solution, rather than the other way around.
I want to end this introduction with a second message. Enjoy yourself! Enjoy reading this book and take pride and pleasure in your work. It will take time to master all the principles in this book, but the journey will hopefully be a rewarding one for your clients, your employers, and, above all, yourself