Preface - High Performance MySQL (2012)

High Performance MySQL (2012)

Preface

We wrote this book to serve the needs of not just the MySQL application developer but also the MySQL database administrator. We assume that you are already relatively experienced with MySQL. We also assume some experience with general system administration, networking, and Unix-like operating systems.

The second edition of this book presented a lot of information to readers, but no book can provide complete coverage of a topic. Between the second and third editions, we took notes on literally thousands of interesting problems we’d solved or seen others solve. When we started to outline the third edition, it became clear that not only would full coverage of these topics require three to five thousand pages, but the book still wouldn’t be complete. After reflecting on this problem, we realized that the second edition’s emphasis on deep coverage was actually self-limiting, in the sense that it often didn’t teach readers how to think about MySQL.

As a result, this third edition has a different focus from the second edition. We still convey a lot of information, and we still emphasize the same goals, such as reliability and correctness. But we’ve also tried to imbue the book with a deeper purpose: we want to teach the principles of why MySQL works as it does, not just the facts about how it works. We’ve included more illustrative stories and case studies, which demonstrate the principles in action. We build on these to try to answer questions such as “Given MySQL’s internal architecture and operation, what practical effects arise in real usage? Why do those effects matter? How do they make MySQL well suited (or not well suited) for particular needs?”

Ultimately, we hope that your knowledge of MySQL’s internals will help you in situations beyond the scope of this book. And we hope that your newfound insight will help you to learn and practice a methodical approach to designing, maintaining, and troubleshooting systems that are built on MySQL.

How This Book Is Organized

We fit a lot of complicated topics into this book. Here, we explain how we put them together in an order that makes them easier to learn.

A Broad Overview

Chapter 1, MySQL Architecture and History is dedicated to the basics—things you’ll need to be familiar with before you dig in deeply. You need to understand how MySQL is organized before you’ll be able to use it effectively. This chapter explains MySQL’s architecture and key facts about its storage engines. It helps you get up to speed if you aren’t familiar with some of the fundamentals of a relational database, including transactions. This chapter will also be useful if this book is your introduction to MySQL but you’re already familiar with another database, such as Oracle. We also include a bit of historical context: the changes to MySQL over time, recent ownership changes, and where we think it’s headed.

Building a Solid Foundation

The early chapters cover material we hope you’ll reference over and over as you use MySQL.

Chapter 2, Benchmarking MySQL discusses the basics of benchmarking—that is, determining what sort of workload your server can handle, how fast it can perform certain tasks, and so on. Benchmarking is an essential skill for evaluating how the server behaves under load, but it’s also important to know when it’s not useful.

Chapter 3, Profiling Server Performance introduces you to the response time–oriented approach we take to troubleshooting and diagnosing server performance problems. This framework has proven essential to solving some of the most puzzling cases we’ve seen. Although you might choose to modify our approach (we developed it by modifying Cary Millsap’s approach, after all), we hope you’ll avoid the pitfalls of not having any method at all.

In Chapters 4 through 6, we introduce three topics that together form the foundation for a good logical and physical database design. In Chapter 4, Optimizing Schema and Data Types, we cover the various nuances of data types and table design. Chapter 5, Indexing for High Performanceextends the discussion to indexes—that is, physical database design. A firm understanding of indexes and how to use them well is essential for using MySQL effectively, so you’ll probably find yourself returning to this chapter repeatedly. And Chapter 6, Query Performance Optimizationwraps the topics together by explaining how MySQL executes queries and how you can take advantage of its query optimizer’s strengths. This chapter also presents specific examples of many common classes of queries, illustrating where MySQL does a good job and how to transform queries into forms that use its strengths.

Up to this point, we’ve covered the basic topics that apply to any database: tables, indexes, data, and queries. Chapter 7, Advanced MySQL Features goes beyond the basics and shows you how MySQL’s advanced features work. We examine topics such as partitioning, stored procedures, triggers, and character sets. MySQL’s implementation of these features is different from other databases, and a good understanding of them can open up new opportunities for performance gains that you might not have thought about otherwise.

Configuring Your Application

The next two chapters discuss how to make MySQL, your application, and your hardware work well together. In Chapter 8, Optimizing Server Settings, we discuss how you can configure MySQL to make the most of your hardware and to be reliable and robust. Chapter 9, Operating System and Hardware Optimization explains how to get the most out of your operating system and hardware. We discuss solid-state storage in depth, and we suggest hardware configurations that might provide better performance for larger-scale applications.

Both chapters explore MySQL internals to some degree. This is a recurring theme that continues all the way through the appendixes: learn how it works internally, and you’ll be empowered to understand and reason about the consequences.

MySQL as an Infrastructure Component

MySQL doesn’t exist in a vacuum. It’s part of an overall application stack, and you’ll need to build a robust overall architecture for your application. The next set of chapters is about how to do that.

In Chapter 10, Replication, we discuss MySQL’s killer feature: the ability to set up multiple servers that all stay in sync with a master server’s changes. Unfortunately, replication is perhaps MySQL’s most troublesome feature for some people. This doesn’t have to be the case, and we show you how to ensure that it keeps running well.

Chapter 11, Scaling MySQL discusses what scalability is (it’s not the same thing as performance), why applications and systems don’t scale, and what to do about it. If you do it right, you can scale MySQL to suit nearly any purpose. Chapter 12, High Availability delves into a related-but-distinct topic: how to ensure that MySQL stays up and functions smoothly. In Chapter 13, MySQL in the Cloud, you’ll learn about what’s different when you run MySQL in cloud computing environments.

In Chapter 14, Application-Level Optimization, we explain what we call full-stack optimization—optimization from the frontend to the backend, all the way from the user’s experience to the database.

The best-designed, most scalable architecture in the world is no good if it can’t survive power outages, malicious attacks, application bugs or programmer mistakes, and other disasters. That’s why Chapter 15, Backup and Recovery discusses various backup and recovery strategies for your MySQL databases. These strategies will help minimize your downtime in the event of inevitable hardware failure and ensure that your data survives such catastrophes.

Miscellaneous Useful Topics

In the last chapter and the book’s appendixes, we delve into several topics that either don’t fit well into any of the earlier chapters, or are referenced often enough in multiple chapters that they deserve a bit of special attention.

Chapter 16, Tools for MySQL Users explores some of the open source and commercial tools that can help you manage and monitor your MySQL servers more efficiently.

Appendix A introduces the three major unofficial versions of MySQL that have arisen over the last few years, including the one that our company maintains. It’s worth knowing what else is available; many problems that are difficult or intractable with MySQL are solved elegantly by one of the variants. Two of the three (Percona Server and MariaDB) are drop-in replacements, so the effort involved in trying them out is not large. However, we hasten to add that we think most users are well served by sticking with the official MySQL distribution from Oracle.

Appendix B shows you how to inspect your MySQL server. Knowing how to get status information from the server is important; knowing what that information means is even more important. We cover SHOW INNODB STATUS in particular detail, because it provides deep insight into the operations of the InnoDB transactional storage engine. There is a lot of discussion of InnoDB’s internals in this appendix.

Appendix C shows you how to copy very large files from place to place efficiently—a must if you are going to manage large volumes of data. Appendix D shows you how to really use and understand the all-important EXPLAIN command. Appendix E shows you how to decipher what’s going on when queries are requesting locks that interfere with each other. And finally, Appendix F is an introduction to Sphinx, a high-performance, full-text indexing system that can complement MySQL’s own abilities.

Software Versions and Availability

MySQL is a moving target. In the years since Jeremy wrote the outline for the first edition of this book, numerous releases of MySQL have appeared. MySQL 4.1 and 5.0 were available only as alpha versions when the first edition went to press, but today MySQL 5.1 and 5.5 are the backbone of many large online applications. As we completed this third edition, MySQL 5.6 was the unreleased bleeding edge.

We didn’t rely on a single version of MySQL for this book. Instead, we drew on our extensive collective knowledge of MySQL in the real world. The core of the book is focused on MySQL 5.1 and MySQL 5.5, because those are what we consider the “current” versions. Most of our examples assume you’re running some reasonably mature version of MySQL 5.1, such as MySQL 5.1.50 or newer or newer. We have made an effort to note features or functionalities that might not exist in older releases or that might exist only in the upcoming 5.6 series. However, the definitive reference for mapping features to specific versions is the MySQL documentation itself. We expect that you’ll find yourself visiting the annotated online documentation (http://dev.mysql.com/doc/) from time to time as you read this book.

Another great aspect of MySQL is that it runs on all of today’s popular platforms: Mac OS X, Windows, GNU/Linux, Solaris, FreeBSD, you name it! However, we are biased toward GNU/Linux[1] and other Unix-like operating systems. Windows users are likely to encounter some differences. For example, file paths are completely different on Windows. We also refer to standard Unix command-line utilities; we assume you know the corresponding commands in Windows.[2]

Perl is the other rough spot when dealing with MySQL on Windows. MySQL comes with several useful utilities that are written in Perl, and certain chapters in this book present example Perl scripts that form the basis of more complex tools you’ll build. Percona Toolkit—which is indispensable for administering MySQL—is also written in Perl. However, Perl isn’t included with Windows. In order to use these scripts, you’ll need to download a Windows version of Perl from ActiveState and install the necessary add-on modules (DBI and DBD::mysql) for MySQL access.


[1] To avoid confusion, we refer to Linux when we are writing about the kernel, and GNU/Linux when we are writing about the whole operating system infrastructure that supports applications.

[2] You can get Windows-compatible versions of Unix utilities at http://unxutils.sourceforge.net or http://gnuwin32.sourceforge.net.

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic

Used for new terms, URLs, email addresses, usernames, hostnames, filenames, file extensions, pathnames, directories, and Unix commands and utilities.

Constant width

Indicates elements of code, configuration options, database and table names, variables and their values, functions, modules, the contents of files, or the output from commands.

Constant width bold

Shows commands or other text that should be typed literally by the user. Also used for emphasis in command output.

Constant width italic

Shows text that should be replaced with user-supplied values.

NOTE

This icon signifies a tip, suggestion, or general note.

WARNING

This icon indicates a warning or caution.