Basics - MySQL Troubleshooting (2012)

Chapter 1. Basics

When troubleshooting, you can generally save time by starting with the simplest possible causes and working your way to more complicated ones. I work dozens of trouble tickets at MySQL Support every month. For most of them, we start from trivial requests for information, and the final resolution may—as we’ll see in some examples—be trivial as well, but sometimes we have quite an adventure in between. So it always pays to start with the basics.

The typical symptom of a basic problem is running a query and getting unexpected results. The problem could manifest itself as results that are clearly wrong, no results at all when you know there are matching rows, or odd behavior in the application. In short, this section depends on your having a good idea of what your application should be doing and what the query results should look like. Cases in which the source of wrong behavior is not so clear will be discussed later in this book.

We will always return to these basics, even with the trickiest errors or in situations where you don't know what caused the wrong behavior in your application. This process, which we'll discuss in depth in Localizing the Problem (Minimizing the Test Case), can also be called creating a minimal test case.

Incorrect Syntax

This sounds absolutely trivial, but still can be tricky to find. I recommend you approach the possibility of incorrect SQL syntax very rigorously, like any other possible problem.

An error such as the following is easy to see:

SELECT * FRO t1 WHERE f1 IN (1,2,1);

In this case, it is clear that the user just forgot to type an “m”, and the error message clearly reports this (I have broken the output lines to fit the page):

mysql> SELECT * FRO t1 WHERE f1 IN (1,2,1);
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that
corresponds to your MySQL server version for the right syntax to use near 'FRO
t1 WHERE f1 IN (1,2,1)' at line 1

Unfortunately, not all syntax errors are so trivial. I once worked on a trouble ticket concerning a query like this:

SELECT id FROM t1 WHERE accessible=1;

The problem was a migration issue; the query worked fine in version 5.0 but stopped working in version 5.1. The problem was that, in version 5.1, “accessible” is a reserved word. We added quotes (these can be backticks or double quotes, depending on your SQL mode), and the query started working again:

SELECT `id` FROM `t1` WHERE `accessible`=1;

The actual query looked a lot more complicated, with a large JOIN and a complex WHERE condition, so the simple error was hard to pick out among all the distractions. Our first task was to reduce the complex query to the simple one-line SELECT just shown, which is an example of a minimal test case. Once we saw that the one-liner triggered the same error as the big original query, we quickly realized that the programmer had simply stumbled over a reserved word.
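The reserved-word trap is easy to reproduce outside MySQL as well. Here is a minimal sketch using Python's built-in sqlite3 module, where SQLite's reserved word `order` plays the role that `accessible` played in MySQL 5.1; the table and column names are invented for the illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Unquoted, "order" is a reserved word in SQLite's grammar, just as
# "accessible" became reserved in MySQL 5.1: the statement fails to parse.
try:
    conn.execute("CREATE TABLE t1 (order INT)")
    failed = False
except sqlite3.OperationalError:
    failed = True

# Quoting the identifier (double quotes here; backticks in MySQL's
# default SQL mode) makes the same statement legal.
conn.execute('CREATE TABLE t1 ("order" INT)')
conn.execute('INSERT INTO t1 ("order") VALUES (1)')
rows = conn.execute('SELECT "order" FROM t1').fetchall()
```

The fix is always the same: quote the identifier, or rename it to something that is not reserved in any server version you target.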

§ The first lesson is to check your query for syntax errors as the first troubleshooting step.

But what do you do if you don’t know the query? For example, suppose the query was built by an application. Even more fun is in store when it’s a third-party library that dynamically builds queries.

Let’s consider this PHP code:

$query = 'SELECT * FROM t4 WHERE f1 IN(';
for ($i = 1; $i < 21; $i ++)
    $query .= "'row$i,";
$query = rtrim($query, ',');
$query .= ')';
$result = mysql_query($query);

Looking at the script, it is not easy to see where the error is. Fortunately, we can alter the code to print the query using an output function. In the case of PHP, this can be the echo operator. So we modify the code as follows:

echo $query;
//$result = mysql_query($query);

Once the program shows us the actual query it’s trying to submit, the problem jumps right out:

$ php ex1.php
SELECT * FROM t4 WHERE f1 IN('row1,'row2,'row3,'row4,'row5,'row6,'row7,'row8,
'row9,'row10,'row11,'row12,'row13,'row14,'row15,'row16,'row17,'row18,'row19,'row20)

If you still can’t find the error, try running this query in the MySQL command-line client:

mysql> SELECT * FROM t4 WHERE f1 IN('row1,'row2,'row3,'row4,'row5,'row6,'row7,'row8,
'row9,'row10,'row11,'row12,'row13,'row14,'row15,'row16,'row17,'row18,'row19,'row20);
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that
corresponds to your MySQL server version for the right syntax to use near 'row2,
'row3,'row4,'row5,'row6,'row7,'row8,'row9,'row10,'row11,'row12,'row13,'row' at
line 1

The problem is that the closing apostrophe is missing from each value. Going back to the PHP code, I have to change:

$query .= "'row$i,";

to the following:

$query .= "'row$i',";
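A way to avoid this whole class of bug is not to splice quoted literals into the SQL string at all, but to generate one placeholder per value and let the driver do the quoting. The old PHP mysql_* API used above predates prepared statements, so here is a sketch of the pattern with Python's built-in sqlite3 module; the table name t4 matches the example, everything else is invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t4 (f1 TEXT)")
conn.executemany("INSERT INTO t4 VALUES (?)", [("row1",), ("row7",), ("row99",)])

values = ["row%d" % i for i in range(1, 21)]      # row1 .. row20
placeholders = ", ".join("?" for _ in values)     # "?, ?, ..., ?"
query = "SELECT * FROM t4 WHERE f1 IN (%s)" % placeholders

# The driver quotes each bound value itself, so a missing apostrophe
# in hand-built literals can no longer happen.
rows = conn.execute(query, values).fetchall()
```

Only the placeholder list is built by string manipulation; the values never touch the SQL text, so there is nothing to mis-quote.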

§ An important debugging technique, therefore, consists of this: always try to view the query exactly as the MySQL server receives it. Don’t debug only application code; get the query!

Unfortunately, you can’t always use output functions. One possible reason, which I mentioned before, is that you’re using a third-party library written in a compiled language to generate the SQL. Your application might also be using high-level abstractions, such as libraries that offer a CRUD (create, read, update, delete) interface. Or you might be in a production environment where you don’t want users to be able to see the query while you are testing particular queries with specific parameters. In such cases, check the MySQL general query log. Let’s see how it works using a new example.

This is the PHP application where the problem exists:

private function create_query($columns, $table)
{
    $query = "insert into $table set ";
    foreach ($columns as $column) {
        $query .= $column['column_name'] . '=';
        $query .= $this->generate_for($column);
        $query .= ', ';
    }
    return rtrim($query, ',') . ';';
}

private function generate_for($column)
{
    switch ($column['data_type']) {
        case 'int':
            return rand();
        case 'varchar':
        case 'text':
            return "'" . str_pad(md5(rand()), rand(1, $column['character_maximum_length']),
                md5(rand()), STR_PAD_BOTH) . "'";
        default:
            return "''";
    }
}

This code inserts rows into the table defined in Example 1-1.

Example 1-1. Sample table of common troubleshooting situations

CREATE TABLE items(
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    short_description VARCHAR(255),
    description TEXT,
    example TEXT,
    explanation TEXT,
    additional TEXT
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Now it is time to start using the general query log. This log contains every single query the MySQL server receives. Many production applications don't use it on a day-to-day basis, because it grows extremely fast under high load, and writing to it takes MySQL server resources that are needed for more important purposes. Starting with version 5.1, you can turn the general query log on temporarily to record just the query you need:

mysql> SET GLOBAL general_log='on';
Query OK, 0 rows affected (0.00 sec)

You can also send the log to a table, which lets you easily sort log entries because you can query a log table like any other MySQL table:

mysql> SET GLOBAL log_output='table';
Query OK, 0 rows affected (0.00 sec)

Now let’s run the application. After an iteration that executes the problem code, query the table containing the general log to find the problem query:

mysql> SELECT * FROM mysql.general_log\G
*************************** 1. row ***************************
  event_time: 2011-07-13 02:54:37
   user_host: root[root] @ localhost []
   thread_id: 27515
   server_id: 60
command_type: Connect
    argument: root@localhost on collaborate2011
*************************** 2. row ***************************
  event_time: 2011-07-13 02:54:37
   user_host: root[root] @ localhost []
   thread_id: 27515
   server_id: 60
command_type: Query
    argument: INSERT INTO items SET id=1908908263,
short_description='8786db20e5ada6cece1306d44436104c',
description='fc84e1dc075bca3fce13a95c41409764',
example='e4e385c3952c1b5d880078277c711c41',
explanation='ba0afe3fb0e7f5df1f2ed3f2303072fb',
additional='2208b81f320e0d704c11f167b597be85',
*************************** 3. row ***************************
  event_time: 2011-07-13 02:54:37
   user_host: root[root] @ localhost []
   thread_id: 27515
   server_id: 60
command_type: Quit
    argument:

We are interested in the second row and query:

INSERT INTO items SET id=1908908263,
short_description='8786db20e5ada6cece1306d44436104c',
description='fc84e1dc075bca3fce13a95c41409764',
example='e4e385c3952c1b5d880078277c711c41',
explanation='ba0afe3fb0e7f5df1f2ed3f2303072fb',
additional='2208b81f320e0d704c11f167b597be85',

The error again is trivial: a superfluous comma at the end of the query. The problem was generated in this part of the PHP code:

    $query .= ', ';
}
return rtrim($query, ',') . ';';

The rtrim function would remove a trailing comma, but the string built by the loop actually ends with a comma followed by a space. Because the last character is a space, which is not in the set of characters rtrim was told to strip, rtrim removes nothing.
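The same off-by-one-character behavior is easy to demonstrate in any language whose trim function takes a set of characters to strip; here is a sketch in Python, whose str.rstrip works like PHP's rtrim (the sample string is invented):

```python
# The loop appended ", " after every column, so the string ends with a
# comma FOLLOWED BY a space.
query = "insert into items set a=1, b=2, "

# Stripping only ',' removes nothing: the last character is a space.
broken = query.rstrip(",")

# Stripping both characters (or trimming whitespace first) fixes it.
fixed = query.rstrip(", ")

# Better still, join the pieces so no trailing separator is ever produced.
columns = ["a=1", "b=2"]
joined = "insert into items set " + ", ".join(columns)
```

The join idiom sidesteps the bug entirely, because there is never a separator to trim in the first place.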

Now that we have the query that caused the error in our application, we can turn off the general query log:

mysql> SET GLOBAL general_log='off';
Query OK, 0 rows affected (0.08 sec)

In this section, we learned a few important things:

§ Incorrect syntax can be the source of real-life problems.

§ You should test exactly the same query that the MySQL server gets.

§ Programming language output functions and the general query log can help you quickly find the query that the application sends to the MySQL server.

Wrong Results from a SELECT

This is another frequent problem: users of an application don't see the updates they made, see them in the wrong order, or see something they don't expect.

There are two main reasons for getting wrong results: something is wrong with your SELECT query, or the data in the database differs from what you expect. I'll start with the first case.

When I went over examples for this section, I had to either show some real-life examples or write my own toy cases. The real-life examples can be overwhelmingly large, but the toy cases wouldn’t be much help to you, because nobody writes such code. So I’ve chosen to use some typical real-life examples, but simplified them dramatically.

The first example involves a common user mistake when using huge joins. We will use Example 1-1, described in the previous section. This table contains my collection of MySQL features that cause the common usage mistakes I deal with in MySQL Support. Each mistake has a row in the items table. I have another table of links to resources for information. Because there's a many-to-many relationship between items and links, I also maintain an items_links table to tie them together. Here are the definitions of the items and items_links tables (we don't need links in this example):

mysql> DESC items;
+-------------------+--------------+------+-----+---------+----------------+
| Field             | Type         | Null | Key | Default | Extra          |
+-------------------+--------------+------+-----+---------+----------------+
| id                | int(11)      | NO   | PRI | NULL    | auto_increment |
| short_description | varchar(255) | YES  |     | NULL    |                |
| description       | text         | YES  |     | NULL    |                |
| example           | text         | YES  |     | NULL    |                |
| explanation       | text         | YES  |     | NULL    |                |
| additional        | text         | YES  |     | NULL    |                |
+-------------------+--------------+------+-----+---------+----------------+
6 rows in set (0.30 sec)

mysql> DESC items_links;
+--------+---------+------+-----+---------+-------+
| Field  | Type    | Null | Key | Default | Extra |
+--------+---------+------+-----+---------+-------+
| iid    | int(11) | YES  | MUL | NULL    |       |
| linkid | int(11) | YES  | MUL | NULL    |       |
+--------+---------+------+-----+---------+-------+
2 rows in set (0.11 sec)

The first query I wrote worked fine and returned a reasonable result:

mysql> SELECT count(*) FROM items WHERE id IN (SELECT id FROM items_links);
+----------+
| count(*) |
+----------+
|       10 |
+----------+
1 row in set (0.12 sec)

...until I compared the number returned with the total number of links:

mysql> SELECT count(*) FROM items_links;
+----------+
| count(*) |
+----------+
|        6 |
+----------+
1 row in set (0.09 sec)

How could it be possible to have more links than associations?

Let's examine the query, which I made specially for this book, once more. It is simple and contains only two parts: a subquery:

SELECT id FROM items_links

and an outer query:

SELECT count(*) FROM items WHERE id IN ...

The subquery can be a good place to start troubleshooting because one should be able to execute it independently. Therefore, we can expect a complete result set:

mysql> SELECT id FROM items_links;
ERROR 1054 (42S22): Unknown column 'id' in 'field list'

Surprise! We have a typo, and actually there is no field named id in the items_links table; it says iid (for “items ID”) instead. If we rewrite our query so that it uses the correct identifiers, it will work properly:

mysql> SELECT count(*) FROM items WHERE id IN (SELECT iid FROM items_links);
+----------+
| count(*) |
+----------+
|        4 |
+----------+
1 row in set (0.08 sec)

§ We just learned a new debugging technique. If a SELECT query does not work as expected, split it into smaller chunks, and then analyze each part until you find the cause of the incorrect behavior.

NOTE

If you specify the full column name by using the format table_name.column_name, you can prevent the problems described here in the first place because you will get an error immediately:

mysql> SELECT count(*) FROM items WHERE items.id IN
    (SELECT items_links.id FROM items_links);
ERROR 1054 (42S22): Unknown column 'items_links.id' in 'field list'

A good tool for testing is the simple MySQL command-line client that comes with a MySQL installation. We will discuss the importance of this tool in Chapter 6.

But why didn’t MySQL return the same error for the first query? We have a field named id in the items table, so MySQL thought we wanted to run a dependent subquery that actually selects items.id from items_links. A “dependent subquery” is one that refers to fields from the outer query.

We can also use EXPLAIN EXTENDED followed by SHOW WARNINGS to find the mistake. If we run these commands on the original query, we get:

mysql> EXPLAIN EXTENDED SELECT count(*) FROM items WHERE id IN
    (SELECT id FROM items_links)\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: items
         type: index
possible_keys: NULL
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 10
     filtered: 100.00
        Extra: Using where; Using index
*************************** 2. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: items_links
         type: index
possible_keys: NULL
          key: iid_2
      key_len: 5
          ref: NULL
         rows: 6
     filtered: 100.00
        Extra: Using where; Using index
2 rows in set, 2 warnings (0.54 sec)

mysql> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1276
Message: Field or reference 'collaborate2011.items.id' of SELECT #2 was resolved
in SELECT #1
*************************** 2. row ***************************
  Level: Note
   Code: 1003
Message: select count(0) AS `count(*)` from `collaborate2011`.`items` where
<in_optimizer>(`collaborate2011`.`items`.`id`,<exists>(select 1 from
`collaborate2011`.`items_links` where
(<cache>(`collaborate2011`.`items`.`id`) =
`collaborate2011`.`items`.`id`)))
2 rows in set (0.00 sec)

Row 2 of the EXPLAIN EXTENDED output shows that the subquery is actually dependent: select_type is DEPENDENT SUBQUERY.

Before moving on from this example, I want to show one more technique that will help you avoid getting lost when your query contains lots of table references. It is easy to get lost if you join 10 or more tables in a single query, even when you know how they should be joined.

The interesting part of the previous example was the output of SHOW WARNINGS. The MySQL server does not always execute a query exactly as it was typed; it invokes the optimizer to create a better execution plan so that the user usually gets the results faster. Following EXPLAIN EXTENDED, the SHOW WARNINGS command shows the optimized query.

In our example, the SHOW WARNINGS output contains two notes. The first is:

Field or reference 'collaborate2011.items.id' of SELECT #2 was resolved in SELECT #1

This note clearly shows that the server resolved the value of id in the subquery from the items table rather than from items_links.

The second note contains the optimized query:

select count(0) AS `count(*)` from `collaborate2011`.`items` where <in_optimizer>
(`collaborate2011`.`items`.`id`,<exists>
(select 1 from `collaborate2011`.`items_links` where
(<cache>(`collaborate2011`.`items`.`id`) = `collaborate2011`.`items`.`id`)))

This output also shows that the server takes the value of id from the items table.

Now let’s compare the previous listing with the result of EXPLAIN EXTENDED on the correct query:

mysql> EXPLAIN EXTENDED SELECT count(*) FROM items WHERE id IN
    (SELECT iid FROM items_links)\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: items
         type: index
possible_keys: NULL
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 10
     filtered: 100.00
        Extra: Using where; Using index
*************************** 2. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: items_links
         type: index_subquery
possible_keys: iid,iid_2
          key: iid
      key_len: 5
          ref: func
         rows: 1
     filtered: 100.00
        Extra: Using index; Using where
2 rows in set, 1 warning (0.03 sec)

mysql> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: select count(0) AS `count(*)` from `collaborate2011`.`items` where
<in_optimizer>(`collaborate2011`.`items`.`id`,<exists>
(<index_lookup>(<cache>(`collaborate2011`.`items`.`id`) in
items_links on iid where (<cache>(`collaborate2011`.`items`.`id`) =
`collaborate2011`.`items_links`.`iid`))))
1 row in set (0.00 sec)

The optimized query this time looks completely different, and really compares items.id with items_links.iid as we intended.

§ We just learned another lesson: use EXPLAIN EXTENDED followed by SHOW WARNINGS to find how a query was optimized (and executed).

The value of select_type in the correct query is still DEPENDENT SUBQUERY. How can that be if we resolve the field name from the items_links table? The explanation starts with the part of the SHOW WARNINGS output that reads as follows:

where (<cache>(`collaborate2011`.`items`.`id`) =
`collaborate2011`.`items_links`.`iid`)

The subquery is still dependent because the IN clause of the outer query requires each row of the subquery to be checked against the outer query's value of id. This issue came up in the discussion of report #12106 in the MySQL Community Bugs Database.

§ I added a link to the bug report because it provides another important lesson: if you doubt the behavior of your query, use good sources to find information. The community bug database is one such source.

There can be many different reasons why a SELECT query behaves incorrectly, but the general method of investigation is always the same:

§ Split the query into small chunks, and then execute them one by one until you see the cause of the problem.

§ Use EXPLAIN EXTENDED followed by SHOW WARNINGS to get the query execution plan and information on how it was actually executed.

§ If you don’t understand the MySQL server behavior, use the Internet and good sources of information. Appendix A includes a list of useful resources.

When the Problem May Have Been a Previous Update

If a SELECT returns a result set you don’t expect, this does not always mean something is wrong with the query itself. Perhaps you didn’t insert, update, or delete data that you thought you had.

Before you investigate this possibility, you should faithfully carry out the investigation in the previous section, where we discussed a badly written SELECT statement. Here I examine the possibility that you have a good SELECT that is returning the values you asked for, and that the problem is your data itself. To make sure the problem is in the data and not the SELECT, try to reduce it to a simple query on a single table. If the table is small, go ahead and remove all the WHERE clauses, as well as any GROUP BY clauses, and examine the full data set with a brute-force SELECT * FROM table_name. For a larger table, judiciously use WHERE to cull the values you want to examine, and consider COUNT(*) if you just want to make sure the number of rows matching the query is what you expect.

Once you are sure the SELECT works fine, this means the data is inconsistent and you need to find where it is broken. There can be a lot of reasons: a badly applied backup, an incorrect UPDATE statement, or a slave getting out of sync with the master, just to name the most common. In this section, we’ll look at some examples where a bad DELETE or UPDATE isn’t revealed until a later SELECT. In a later section, we’ll address those puzzling cases where the problem turns up long after it was triggered, and show you how to work backward to find the error. This section does not deal directly with problems in transactions, which are discussed in Chapter 2. Here I show cases where data in the database is stable, i.e., all transactions, if used, were completed. I will continue using examples reduced down from real cases.

Let’s start from the best possible case, when data inconsistency was noticed right after the error was made. We will use the following initial data set:

mysql> CREATE TEMPORARY TABLE t1(f1 INT);
Query OK, 0 rows affected (0.01 sec)

mysql> CREATE TEMPORARY TABLE t2(f2 INT);
Query OK, 0 rows affected (0.08 sec)

mysql> INSERT INTO t1 VALUES(1);
Query OK, 1 row affected (0.01 sec)

mysql> SELECT * FROM t1;
+------+
| f1   |
+------+
|    1 |
+------+
1 row in set (0.00 sec)

In the application, the tables shown were temporary tables containing partial result sets selected from the main log table. This is a technique frequently used in stored routines when only a small set of data from the main table is needed but the user doesn’t want to change the data in the main table or lock the table.

So in this example, after finishing with a result set, the user wanted to delete rows from both tables. It always looks amazing when you can do things in a single query. But real life can work out differently from your plans, and you can get unexpected results or side effects:

mysql> DELETE FROM t1, t2 USING t1, t2;
Query OK, 0 rows affected (0.00 sec)

If the user paid attention to the string printed in response to the DELETE, he would realize right away that something had gone wrong. No rows were affected by the DELETE, meaning that it did nothing. The output from a statement is often not so obvious, however, and sometimes it is never seen, because the SQL statement is run within a program or script with no human being to watch over the results. In general, you should always check the information returned by a statement execution to learn how many rows were affected and whether this value is the same as you expect. In an application, you must call the information functions explicitly.
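In application code, that check amounts to reading the affected-row count the driver exposes after every data-changing statement. Here is a sketch with Python's built-in sqlite3 module, where cursor.rowcount plays the role of MySQL's affected-rows count (the data is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (f1 INTEGER)")
conn.execute("INSERT INTO t1 VALUES (1)")

# A DELETE that matches nothing still succeeds; only the affected-row
# count reveals that it did no work.
cur = conn.execute("DELETE FROM t1 WHERE f1 = 999")
matched_none = (cur.rowcount == 0)      # the application should react here

cur = conn.execute("DELETE FROM t1 WHERE f1 = 1")
really_deleted = (cur.rowcount == 1)
```

A zero count after a statement that was supposed to change data is exactly the silent failure described above, and the application, not a human at a console, has to notice it.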

Let’s continue. If you run SELECT immediately, you could be surprised, thinking that the query was incorrect or even that the query cache had not been cleared:

mysql> SELECT * FROM t1;
+------+
| f1   |
+------+
|    1 |
+------+
1 row in set (0.00 sec)

You can be sure this is not a cache or some other problem if you convert the SELECT to ask for the number of rows. This easy example shows how we can use different ways to query a table to be sure the data is consistent:

mysql> SELECT count(*) FROM t1;
+----------+
| count(*) |
+----------+
|        1 |
+----------+
1 row in set (0.00 sec)

COUNT(*) still returns a positive number here, which shows the table is not empty as desired. As an attentive user would have seen, the DELETE didn’t actually remove any rows. To find out why, we can convert the DELETE to the corresponding SELECT. This will show us which rows satisfy the condition for the delete.

Although our simple example had no WHERE clause, the technique is certainly useful to check the impact of a WHERE clause in a delete or update. The rows returned by a SELECT are the ones that DELETE would delete or that UPDATE would change:

mysql> SELECT * FROM t1, t2;
Empty set (0.00 sec)

Consistent with previous results, this returns an empty set. That’s why no rows were removed! It still might not be clear why this happens, but now that we have a SELECT, we can use familiar techniques from the previous section. For this case, our best choice is to run the SELECT with EXPLAIN EXTENDED and analyze its output:

mysql> \W
Show warnings enabled.
mysql> EXPLAIN EXTENDED SELECT * FROM t1, t2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
         type: system
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1
     filtered: 100.00
        Extra:
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: t2
         type: system
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 0
     filtered: 0.00
        Extra: const row not found
2 rows in set, 1 warning (0.03 sec)

Note (Code 1003): select '1' AS `f1`,'0' AS `f2` from `test`.`t1` join `test`.`t2`

The final note in the output shows that the query was modified to an (inner) JOIN, which can return rows from each table only if there are matching rows in the other table. For each row in table t1, there must be at least one matching row in table t2. Because table t2 is empty, the join naturally returns an empty set.
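The empty-join effect itself is portable SQL and can be reproduced in a few lines with Python's built-in sqlite3 module (same table names as the example above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (f1 INTEGER)")
conn.execute("CREATE TABLE t2 (f2 INTEGER)")
conn.execute("INSERT INTO t1 VALUES (1)")

# t2 stays empty, so the cross join t1, t2 has no row pairs to produce,
# which is exactly why the multi-table DELETE removed nothing.
rows = conn.execute("SELECT * FROM t1, t2").fetchall()
```

The SELECT form makes the mistake visible and harmless: you can stare at the empty result set instead of wondering why a DELETE left your tables untouched.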

§ We just learned another important technique that helps us find out what is wrong with an UPDATE or DELETE: convert it to a SELECT with the same JOIN and WHERE conditions. With a SELECT, you can use EXPLAIN EXTENDED[3] to get the actual execution plan, as well as to manipulate the result set without the risk of modifying the wrong rows.

Here’s a more complex example using UPDATE. We will use the items table again:

mysql> SELECT SUBSTR(description, 1, 20), additional IS NULL FROM items;
+----------------------------+--------------------+
| SUBSTR(description, 1, 20) | additional IS NULL |
+----------------------------+--------------------+
| NULL                       |                  1 |
| NULL                       |                  1 |
| One who has TRIGGER        |                  1 |
| mysql> revoke insert       |                  1 |
| NULL                       |                  0 |
+----------------------------+--------------------+
5 rows in set (0.00 sec)

mysql> SELECT description IS NULL, additional IS NULL FROM items;
+---------------------+--------------------+
| description IS NULL | additional IS NULL |
+---------------------+--------------------+
|                   1 |                  1 |
|                   1 |                  1 |
|                   0 |                  1 |
|                   0 |                  1 |
|                   1 |                  0 |
+---------------------+--------------------+
5 rows in set (0.00 sec)

The description and additional fields are of type TEXT. In this example, we will use an erroneous query that is supposed to replace NULL values with more descriptive text (“no description” for one table and “no additional comments” for the other):

mysql> UPDATE items SET description = 'no description' AND
    additional = 'no additional comments' WHERE description IS NULL;
Query OK, 3 rows affected, 3 warnings (0.13 sec)
Rows matched: 3  Changed: 3  Warnings: 3

This query updates some data (“3 rows affected”), but let’s check whether we have the proper values in the table now:

mysql> SELECT SUBSTR(description, 1, 20), additional IS NULL FROM items;
+----------------------------+--------------------+
| SUBSTR(description, 1, 20) | additional IS NULL |
+----------------------------+--------------------+
| 0                          |                  1 |
| 0                          |                  1 |
| One who has TRIGGER        |                  1 |
| mysql> revoke insert       |                  1 |
| 0                          |                  0 |
+----------------------------+--------------------+
5 rows in set (0.09 sec)

As we see, three rows changed their values in the description field, but 0 is different from the “no description” string we thought we were setting. Furthermore, the values in the additional field have not changed at all. To find out why this happened, we should check the warnings. Notice that the statements returned by the server report a warnings count of three:

Query OK, 3 rows affected, 3 warnings (0.13 sec)
Rows matched: 3  Changed: 3  Warnings: 3

mysql> SHOW WARNINGS;
+---------+------+----------------------------------------------------+
| Level   | Code | Message                                            |
+---------+------+----------------------------------------------------+
| Warning | 1292 | Truncated incorrect DOUBLE value: 'no description' |
| Warning | 1292 | Truncated incorrect DOUBLE value: 'no description' |
| Warning | 1292 | Truncated incorrect DOUBLE value: 'no description' |
+---------+------+----------------------------------------------------+
3 rows in set (0.00 sec)

The message looks strange. Why does it complain about a DOUBLE when description and additional are TEXT fields, as the following queries prove?

mysql> SHOW FIELDS FROM items LIKE 'description';
+-------------+------+------+-----+---------+-------+
| Field       | Type | Null | Key | Default | Extra |
+-------------+------+------+-----+---------+-------+
| description | text | YES  |     | NULL    |       |
+-------------+------+------+-----+---------+-------+
1 row in set (0.13 sec)

mysql> SHOW FIELDS FROM items LIKE 'additional';
+------------+------+------+-----+---------+-------+
| Field      | Type | Null | Key | Default | Extra |
+------------+------+------+-----+---------+-------+
| additional | text | YES  |     | NULL    |       |
+------------+------+------+-----+---------+-------+
1 row in set (0.13 sec)

We also want to know why we did not get any warning about the additional field, when it was not changed at all.

Let’s split the query into chunks and examine what is going on in each of them:

UPDATE items

This is a common start for an UPDATE, and nothing’s wrong with it:

SET description = 'no description' AND additional = 'no additional comments'

This is the SET clause. Let’s examine what it is actually doing. What does the keyword AND mean in this case? Let me add parentheses to the query to highlight operator precedence:

SET description = ('no description' AND additional = 'no additional comments')

So actually, the statement calculates the expression:

'no description' and additional = 'no additional comments'

and then assigns the result to description. Evaluating the expression produces a Boolean result, represented as a LONGLONG value. To prove this, start the MySQL command-line client with the --column-type-info option and run the SELECT again:

$ mysql --column-type-info
mysql> SELECT 'no description' AND additional = 'no additional comments' FROM items;
Field 1: `'no description' AND additional = 'no additional comments'`
Catalog: `def`
Database: ``
Table: ``
Org_table: ``
Type: LONGLONG
Collation: binary (63)
Length: 1
Max_length: 1
Decimals: 0
Flags: BINARY NUM

+------------------------------------------------------------+
| 'no description' AND additional = 'no additional comments' |
+------------------------------------------------------------+
|                                                          0 |
|                                                          0 |
|                                                          0 |
|                                                          0 |
|                                                          0 |
+------------------------------------------------------------+
5 rows in set, 5 warnings (0.09 sec)

We clearly see that the result of the expression is 0, which was stored in the description field. And because our update to the additional field got buried inside the absurd expression, nothing was assigned to that field, nor did the server see any reason to warn about it.
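Because operator precedence works the same way in most SQL dialects, the misassignment can be reproduced with Python's built-in sqlite3 module too (a two-column toy table; SQLite may store the resulting boolean 0 as the text '0' in a TEXT column, much as MySQL stored it in the TEXT field):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (description TEXT, additional TEXT)")
conn.execute("INSERT INTO items VALUES (NULL, NULL)")

# Everything right of the first '=' parses as ONE boolean expression,
# whose value (0) is assigned to description; additional is never set.
conn.execute(
    "UPDATE items SET description = 'no description' AND "
    "additional = 'no additional comments' WHERE description IS NULL"
)
row = conn.execute("SELECT description, additional FROM items").fetchone()
```

After the update, description holds a zero rather than the intended text, and additional is still NULL, mirroring the MySQL result above.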

Now you can rewrite the query without logic errors:

UPDATE items SET description = 'no description',
    additional = 'no additional comments' WHERE description IS NULL;

You can examine the WHERE clause if you want, but in this case it has no error.

This example shows the importance of return values and information about query execution. Let’s discuss them a bit more.


[3] Since version 5.6.3, you can use EXPLAIN with UPDATE and DELETE as well, but converting such a query into SELECT still makes sense because it is easier to examine an actual result set and manipulate it than to just use EXPLAIN. This is especially true for complicated JOINs when EXPLAIN shows it examined more rows than were actually updated.

Getting Information About a Query

As we saw in the previous section, the server returns some important information about each query, displaying some of it directly in the MySQL client and making some of it easy to obtain through commands such as SHOW WARNINGS. When SQL is called from an application, it’s just as important to retrieve this information and check to make sure nothing suspicious is going on. All programming APIs for MySQL support functions that retrieve the query information returned by the server. In this section, we will discuss these functions. I refer just to the C API because I had to choose one language, and most of the other APIs are based on the C API.[4]

Number of rows affected

Let’s start with the simple output we saw earlier, which is displayed after each insert, update, or delete and shows how many rows were inserted, updated, or deleted:

Query OK, N rows affected

This means the query executed fine and changed N rows.

To get the same information in an application, use the call:

mysql_affected_rows()

This returns a positive number of rows if there were changes, 0 if nothing changed, or –1 in case of error.

For UPDATE statements, if the client flag CLIENT_FOUND_ROWS was set, this function returns the number of rows that matched the WHERE condition, which is not always the same as those that were actually changed.

NOTE

Using affected rows is turned off by default in Connector/J because this feature is not JDBC-compliant and will break most applications that rely on found (matched) rows instead of affected rows for DML statements. But it does cause correct update counts from INSERT ... ON DUPLICATE KEY UPDATE statements to be returned by the server. The useAffectedRows connection string property tells Connector/J whether to set the CLIENT_FOUND_ROWS flag when connecting to the server.

Number of matched rows

The string in the output that indicates this is:

Rows matched: M

which shows how many rows satisfy the WHERE conditions.

The following C function:

mysql_info()

returns additional information about the most recent query as a string. For an UPDATE, it returns a string like:

Rows matched: # Changed: # Warnings: #

where each # is the number of matched rows, changed rows, and warnings, respectively. You should parse the line for “Rows matched: #” to find out how many rows were matched.

Number of changed rows

The string in the output that indicates this is:

Changed: P

which shows how many rows were actually changed. Note that M (rows matched) and P (rows changed) can be different. Perhaps the columns you wanted to change already contained the values you specified; in that case, the rows are counted under “Matched” but not under “Changed.”

In an application, retrieve this information using mysql_info() as before, but in this case, parse for “Changed: #.”

Warnings: number and message

The string in the output that indicates this is:

Warnings: R

You get warnings if the server detected something unusual and worth reporting during the query, but the query could still be executed and the rows could be modified. Be sure to check for warnings anyway, because they will let you know about potential problems.

In your application, you can retrieve warnings in a few different ways. You can use mysql_info() again and parse for “Warnings: #”. You can also issue:

mysql_warning_count()

If there are warnings, run a SHOW WARNINGS query to get the text message that describes what happened. Another option is:

mysql_sqlstate()

This retrieves the most recent SQLSTATE. For example, “42000” means a syntax error. “00000” means 0 errors and 0 warnings.

NOTE

A value called SQLSTATE is defined by the ANSI SQL standard to indicate the status of a statement. The states are set to status codes, defined in the standard, that indicate whether a call completed successfully or returned an exception. The SQLSTATE is returned as a character string. To find out which values the MySQL server can return, refer to “Server Error Codes and Messages” in the MySQL Reference Manual.

Errors

It is also always useful to check for errors. The following functions report the error value for the most recent SQL statement:

mysql_errno()

This returns the MySQL number of the latest error. For instance, a syntax error will generate the number 1064, whereas 0 means no error.

mysql_error()

This returns a text representation of the latest error. For a syntax error, it would be something like:

You have an error in your SQL syntax; check the manual that

corresponds to your MySQL server version for the right syntax

to use near 'FRO t1 WHERE f1 IN (1,2,1)' at line 1

This can be convenient for storing messages in a separate logfile that you can examine at any time.

NOTE

The official MySQL documentation contains a list of errors that the MySQL server can return and a list of client errors.

RETRIEVING ERROR STRINGS THROUGH PERROR

The perror utility comes with the MySQL distribution and is a very useful tool for troubleshooting. perror provides the error string corresponding to an error code for both MySQL and the server host’s operating system. You can find these numbers in brackets after the error message received in either the MySQL command-line client, the information function, or the error logfile. Here are a few examples:

mysql> CREATE TABLE t2(f1 INT NOT NULL PRIMARY

-> KEY, f2 INT, FOREIGN KEY(f2) REFERENCES t1(f2)) ENGINE=InnoDB;

ERROR 1005 (HY000): Can't create table 'test.t2' (errno: 150)

mysql> \q

$ perror 150

MySQL error code 150: Foreign key constraint is incorrectly formed

$ perror 2

OS error code 2: No such file or directory

$ perror 136

MySQL error code 136: No more room in index file

These codes are printed by the MySQL client when a command produces an error and can be retrieved by programs through mysql_error(). But in situations where you come across an error number and don’t know what it means, use this utility.


[4] You can find a detailed description of the C API syntax at http://dev.mysql.com/doc/refman/5.5/en/c.html.

Tracing Back Errors in Data

If you rigorously check the results of your queries and updates, you’ll catch many of the problems that could otherwise go undetected for weeks and cause a lot of grief when the problems finally grow too large to miss. But problems do creep up on you. Sometimes a SELECT suddenly starts returning wrong results, but your experiments with the query just confirm there is nothing wrong with it.

In this case, you need to imitate user actions, but in reverse order, until you find the source of the error. If you are lucky, you will catch the problem in a single step. Usually it will take multiple steps, and sometimes a very long time.

A lot of these issues happen because the data is different on the master and slave in a replication environment. One common problem is duplicate values where they are supposed to be unique (e.g., if a user relies on an INSERT ON DUPLICATE KEY UPDATE statement but a table has a different structure on the master and slave). For such setups, the user usually notices the problem later when SELECT statements query the slave, instead of noticing them when the INSERT takes place. Things become even worse when this happens during circular replication.

To illustrate this problem, we’ll work with a stored procedure that inserts into a table from a temporary table that was created to hold the results of other selects. This is another example of a common technique when a user wants to handle data from large tables without the risk of modifying data inadvertently or blocking other applications that are using the large tables.

Let’s create the main table and a temporary table. In a real application, the temporary table would hold a result set from some calculation that is waiting to be stored in the main table:

CREATE TABLE t1(f1 INT) ENGINE=InnoDB;

CREATE TEMPORARY TABLE t2(f1 INT) ENGINE=InnoDB;

Now create values in the temporary table:

INSERT INTO t2 VALUES(1),(2),(3);

The stored routine moves data from the temporary table into the main table. It checks first to make sure something is in the temporary table before doing the move. Our version looks like this:

DELIMITER |

CREATE PROCEDURE p1()

BEGIN

DECLARE m INT UNSIGNED DEFAULT NULL;

CREATE TEMPORARY TABLE IF NOT EXISTS t2(f1 INT) ENGINE=InnoDB;

SELECT MAX(f1) INTO m FROM t2;

IF m IS NOT NULL

THEN

INSERT INTO t1(f1) SELECT f1 FROM t2;

END IF;

END

|

DELIMITER ;

This routine creates the temporary table if it does not exist when the routine is called. This prevents the errors that would occur if the temporary table did not exist, but at the same time leads to new issues, as we will see.

NOTE

The example uses the MAX function just to check whether there is at least one row in the table. I prefer MAX to COUNT because InnoDB tables do not store the number of rows they contain, but calculate this value every time the COUNT function is called. Therefore, MAX(indexed_field) is faster than COUNT.

If a slave restarted after the first insert but before the stored procedure call, the temporary table on the slave would be empty and the main table on the slave would have no data. In that case, we will get the following on the master:

mysql> SELECT * FROM t1;

+------+

| f1 |

+------+

| 1 |

| 2 |

| 3 |

+------+

3 rows in set (0.03 sec)

Whereas on the slave we get:

mysql> SELECT * FROM t1;

Empty set (0.00 sec)

Even worse, if we insert into t1 after the stored procedure call runs, we will have a total mess in the slave’s data.

Suppose we notice the error in an application that reads data from the main table. We now need to find out how data has been inserted into the slave table: was it a direct update on the slave, or was data replicated from the master?

WARNING

MySQL replication does not check data consistency for you, so updates of the same objects using both the SQL replication thread and the user thread on the slave leave the data different from the master, which in turn can lead to failure during later replicated events.

Because we imitated this situation in our example, we know at this point why data corruption happened: the slave was restarted after the first insert but before the stored procedure call. In a real-life situation, issues tend to be noticed later when a user issues a select:

mysql> SELECT * FROM t1;

Empty set (0.00 sec)

When you see unexpected results from SELECT, you need to find out whether this is caused by the query itself or by something that went wrong earlier. The query just shown is so simple that it couldn’t produce errors unless the table was corrupted, so we must try to go back to see how the table was modified.

Our generic example is in a replicated environment with a read-only slave, so we can be sure that the wrong data arose in one of two ways: either the master inserted the wrong data, or the data was corrupted during replication.

So check first whether the master has the wrong data:

master> SELECT * FROM t1;

+------+

| f1 |

+------+

| 1 |

| 2 |

| 3 |

+------+

3 rows in set (0.03 sec)

The master’s data is correct, so the source of the problem lies in the replication layer.

But why did it happen? Replication seems to be running fine,[5] so we suspect a logic error on the master. Having discovered a possible source of the problem, you need to analyze the stored procedure and the calls on the master to find a fix.

As I said before, the slave server was restarted after the events that insert data into the temporary table were replicated, but before the stored procedure call that selects and inserts data into the main table. The restart dropped the temporary table, so the slave just re-created an empty one and inserted no data.

In this case, you can either switch to row-based replication or rewrite the procedure so it does not rely on the existence of the temporary table. Another approach is to truncate and then refill the table so that a sudden restart will not leave the slave without data.

One might think that this example is very artificial and that you can’t predict when a server will suddenly restart. This is correct, but restarts are sure to happen from time to time. Therefore, you need to worry about such errors.

Actually, a slave replicates binary log events one by one, and when data is created within an atomic event (e.g., a transaction or stored procedure call), the slave would not be affected by this issue. But again, this example was just a simple one to show the concept behind events that do happen in real life.

§ When you experience a problem on a statement that you know is correct, check what your application did before you saw the problem.

More details on replication errors are in Chapter 5.

SINGLE SERVER EXAMPLE

I once worked on a web application that stored measurements made in different cutting systems. The user could add a system, then edit the rules about how it saved its measurements.

I first encountered the error while testing a web page with a list of systems:

Existing systems

* System 1

* Test

* test2

* test2

* test2

Enter name of new system:

<>

Description:

<>

<Go!>

The list should not contain duplicate systems, because there is no sense in describing the same rule twice. So I was very surprised to see several items with the same name.

The code that printed the data used objects, and I could not see what statements were sent to the MySQL server just by reading the code:

return $this->addParameters(array(Field::ITEMS => DAO::system()->getPlainList()));

I used logging to get a real query. It happened to be correct:

SELECT `system`.`id`, `system`.`name`, `system`.`description` FROM `system`

Next I examined the contents of the table:

mysql> SELECT * FROM system;

+----+----------+-------------------------------------------------+

| id | name | description |

+----+----------+-------------------------------------------------+

| 1 | System 1 | Man and woman clothing construction |

| 2 | Test | Testing Geometric set |

| 3 | test2 | New test |

| 4 | test2 | foobar |

| 8 | test2 | |

+----+----------+-------------------------------------------------+

5 rows in set (0.00 sec)

So the SELECT was accurately returning the data set that existed in the table. I switched to checking the code that updated the table:

$system = System::factory()

->setName($this->form->get(Field::NAME))

->setDescription(

$this->form->get(Field::DESCRIPTION)

);

DAO::system()->take($system);

Again I used logging to get the query:

INSERT INTO `system` (`id`, `name`, `description`) VALUES ('', 'test2', '')

The query was right too! id seemed to be an auto-incremented field that got set automatically.

But at the same time, the query hints at the potential problem: it must be running repeatedly with no check for uniqueness. Making this guess, I decided to check the table definition:

mysql> SHOW CREATE TABLE system\G

*************************** 1. row ***************************

Table: system

Create Table: CREATE TABLE `system` (

`id` int(11) NOT NULL AUTO_INCREMENT,

`name` varchar(255) NOT NULL,

`description` tinytext NOT NULL,

PRIMARY KEY (`id`)

) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8

1 row in set (0.09 sec)

And this clearly showed the source of the problem: the name field is not defined as UNIQUE. When creating the table, I probably was planning to use id as a unique identifier, but because I also use MySQL’s ability to generate an auto-incremented value for id in the INSERT, nothing prevented me from using the same name repeatedly.

To solve the problem, I manually removed the superfluous rows and added a UNIQUE index:

ALTER TABLE system ADD UNIQUE(name)

We’re done for now with problems related to wrong results. Next we’ll turn to some other problems that occur frequently.


[5] We will discuss in great detail how to troubleshoot replication failures in Chapter 5, so I’m skipping the explanation here.

Slow Queries

One common issue with SQL applications is performance regression. In this section, I’ll describe some basic actions to take when you hit performance problems. Don’t worry about the details; just pick up the essential ideas. As your knowledge deepens, you’ll find yourself using them in a more educated and therefore more effective manner.

When I considered which problems to include in this chapter, I questioned whether I should mention performance problems at all. There are a lot of sources describing performance issues in detail, starting with the wonderful “Optimization” chapter in the MySQL Reference Manual and extending to books published by O’Reilly. I will add a short overview of useful sources at the end of this book. One can easily spend one’s career on this subject or drown in the flood of available information.

I will be speaking here mostly about SELECT queries. At the end of this section, I briefly address how to deal with a slow query that modifies data.

Three main techniques will help you work on slow queries: tuning the query itself, tuning your tables (including adding indexes), and tuning the server. Let’s look at them in detail.

Tuning a Query with Information from EXPLAIN

The most powerful tool for query tuning is our old acquaintance EXPLAIN. This statement provides information about how the server actually executes a query. Details of MySQL EXPLAIN are covered quite well in the MySQL Reference Manual, and I will not repeat that information here. Rather, I’ll pick out what I’ve found to be the most important and useful elements of the output.

The first lines you should look at are type, which shows the type of join performed, and rows, which shows an estimate of how many rows were examined during the query. (For instance, if the query had to scan the whole table, the number of rows would equal the number of rows in the table.) A multitable join has to examine the product of the numbers of rows examined in each table. Thus, if the query examines 20 rows in one table and 30 in another, the join performs a total of 600 examinations. EXPLAIN output contains a row for each table in the JOIN. We will see this in the following examples.

EXPLAIN reports a join even when you operate on a single table. This may sound a bit strange, but the MySQL optimizer internally treats any query as a join, even if it’s a join on one table.

Let’s look into the EXPLAIN output from the previous section again:

mysql> EXPLAIN EXTENDED SELECT count(*) FROM items WHERE id IN

(SELECT iid FROM items_links)\G

*************************** 1. row ***************************

id: 1

select_type: PRIMARY

table: items

type: index

possible_keys: NULL

key: PRIMARY

key_len: 4

ref: NULL

rows: 10

filtered: 100.00

Extra: Using where; Using index

*************************** 2. row ***************************

id: 2

select_type: DEPENDENT SUBQUERY

table: items_links

type: index_subquery

possible_keys: iid,iid_2

key: iid

key_len: 5

ref: func

rows: 1

filtered: 100.00

Extra: Using index; Using where

2 rows in set, 1 warning (0.48 sec)

The number of examined rows is 10 times 1, because the subquery executes once for every row in the outer query. The type of the first query is index, which means that the whole index will be read. The type of the second query is index_subquery. This is an index lookup function that works similarly to the ref type. So, in this example, the optimizer reads all index entries from the items table and one row from the items_links table for each of the ten rows from items.

How can we find out if this is a good plan for the query? First, let’s repeat the query results and check how long the query actually took:

mysql> SELECT count(*) FROM items WHERE id IN (SELECT iid FROM items_links);

+----------+

| count(*) |

+----------+

| 4 |

+----------+

1 row in set (0.08 sec)

The MySQL server examined 10 rows and returned 4. How fast is it? To answer this question, let’s count the number of rows in both tables:

mysql> SELECT count(*) FROM items;

+----------+

| count(*) |

+----------+

| 10 |

+----------+

1 row in set (0.11 sec)

mysql> SELECT count(*) FROM items_links;

+----------+

| count(*) |

+----------+

| 6 |

+----------+

1 row in set (0.00 sec)

We have 10 rows in the items table, each with a unique ID. The items_links table has 6 rows with nonunique IDs (iid). For the current number of rows, the plan looks good, but it also shows a potential problem. Right now we have fewer links than items. The difference is not large yet, but it will become more noticeable as the numbers diverge.

To test this guess and to show you an example of query tuning, I will insert a few rows into the items table. The id is defined as INT NOT NULL AUTO_INCREMENT PRIMARY KEY, so we are guaranteed that no existing link will refer to a newly inserted row. This allows us to imitate a realistic situation that arises when a user needs to get a small number of links (six in our case) from a large table. The following statements are just a quick hack to create a lot of rows by repeatedly selecting all the rows in the table and inserting more:

mysql> INSERT INTO items( short_description , description,

example, explanation, additional) SELECT short_description , description,

example, explanation, additional FROM items;

Query OK, 10 rows affected (0.17 sec)

Records: 10 Duplicates: 0 Warnings: 0

<Repeat this query a few times>

mysql> INSERT INTO items( short_description , description,

example, explanation, additional) SELECT short_description , description,

example, explanation, additional FROM items;

Query OK, 2560 rows affected (3.77 sec)

Records: 2560 Duplicates: 0 Warnings: 0

Now let’s see whether our query plan changed:

mysql> EXPLAIN EXTENDED SELECT count(*) FROM items WHERE id IN

-> (SELECT iid FROM items_links)\G

*************************** 1. row ***************************

id: 1

select_type: PRIMARY

table: items

type: index

possible_keys: NULL

key: PRIMARY

key_len: 4

ref: NULL

rows: 5136

filtered: 100.00

Extra: Using where; Using index

*************************** 2. row ***************************

id: 2

select_type: DEPENDENT SUBQUERY

table: items_links

type: index_subquery

possible_keys: iid,iid_2

key: iid

key_len: 5

ref: func

rows: 1

filtered: 100.00

Extra: Using index; Using where

2 rows in set, 1 warning (0.09 sec)

The query execution plan is the same—and it is going to examine 5,136 rows for just six links! Is there any way to rewrite the query so it will execute faster?

The type of subquery is index_subquery. This means that the optimizer uses an index lookup function that replaces the subquery completely. The output from SHOW WARNINGS shows how the query has been rewritten:

mysql> SHOW WARNINGS\G

*************************** 1. row ***************************

Level: Note

Code: 1003

Message: select count(0) AS `count(*)` from `collaborate2011`.`items` where

<in_optimizer>(`collaborate2011`.`items`.`id`,<exists>

(<index_lookup>(<cache>(`collaborate2011`.`items`.`id`) in

items_links on iid where (<cache>(`collaborate2011`.`items`.`id`) =

`collaborate2011`.`items_links`.`iid`))))

1 row in set (0.00 sec)

The output is intimidating, but at least we can see some kind of join here. What if we rewrite the query to be more explicit about the columns on which the join is performed? We will also rewrite the subquery into an explicit JOIN; with current versions of MySQL, this method can dramatically improve performance:

mysql> \W

Show warnings enabled.

mysql> EXPLAIN EXTENDED SELECT count(*) FROM items JOIN

items_links ON (items.id=items_links.iid)\G

*************************** 1. row ***************************

id: 1

select_type: SIMPLE

table: items_links

type: index

possible_keys: iid,iid_2

key: iid_2

key_len: 5

ref: NULL

rows: 6

filtered: 100.00

Extra: Using index

*************************** 2. row ***************************

id: 1

select_type: SIMPLE

table: items

type: eq_ref

possible_keys: PRIMARY

key: PRIMARY

key_len: 4

ref: collaborate2011.items_links.iid

rows: 1

filtered: 100.00

Extra: Using index

2 rows in set, 1 warning (0.05 sec)

Note (Code 1003): select count(0) AS `count(*)` from `collaborate2011`.`items`

join `collaborate2011`.`items_links` where (`collaborate2011`.`items`.`id` =

`collaborate2011`.`items_links`.`iid`)

The result looks encouraging because it does not scan all the rows from the items table. But does the query work correctly?

mysql> SELECT count(*) FROM items JOIN items_links ON

(items.id=items_links.iid);

+----------+

| count(*) |

+----------+

| 6 |

+----------+

1 row in set (0.10 sec)

We get a count of six instead of four. This is because the join returns a row for every match, even when the same item matches more than one link. We can fix this by adding the DISTINCT keyword:

mysql> SELECT count(distinct items.id) FROM items JOIN items_links ON

(items.id=items_links.iid);

+--------------------------+

| count(distinct items.id) |

+--------------------------+

| 4 |

+--------------------------+

1 row in set (0.12 sec)

NOTE

You can use a query rewriting technique to confirm that DISTINCT is needed. Just replace count(*) with items.id to see the duplicate values.

With DISTINCT, is the query still fast? Let’s try EXPLAIN once again:

mysql> EXPLAIN EXTENDED SELECT count(distinct items.id) FROM items

JOIN items_links ON (items.id=items_links.iid)\G

*************************** 1. row ***************************

id: 1

select_type: SIMPLE

table: items_links

type: index

possible_keys: iid,iid_2

key: iid_2

key_len: 5

ref: NULL

rows: 6

filtered: 100.00

Extra: Using index

*************************** 2. row ***************************

id: 1

select_type: SIMPLE

table: items

type: eq_ref

possible_keys: PRIMARY

key: PRIMARY

key_len: 4

ref: collaborate2011.items_links.iid

rows: 1

filtered: 100.00

Extra: Using index

2 rows in set, 1 warning (0.00 sec)

Note (Code 1003): select count(distinct `collaborate2011`.`items`.`id`) AS

`count(distinct items.id)` from `collaborate2011`.`items` join

`collaborate2011`.`items_links` where (`collaborate2011`.`items`.`id` =

`collaborate2011`.`items_links`.`iid`)

It still examines six rows. So we can consider the query to be optimized for this particular data set. I will explain later in this chapter why the structure of the data and its size matter.

In our example, the data set was small, so I could not make it run really slowly, even on my laptop. Still, the execution times of the original and optimized queries differ substantially. This was the original query:

mysql> SELECT count(*) FROM items WHERE id IN (SELECT iid FROM

items_links );

+----------+

| count(*) |

+----------+

| 4 |

+----------+

1 row in set (0.21 sec)

And here is the optimized query:

mysql> SELECT count(distinct items.id) FROM items JOIN items_links

ON (items.id=items_links.iid);

+--------------------------+

| count(distinct items.id) |

+--------------------------+

| 4 |

+--------------------------+

1 row in set (0.10 sec)

We achieved a twofold improvement, even for such a small data set! For our test, this is just 0.11 seconds, but for millions of rows, the improvement can be far greater.

§ You just learned a basic query tuning technique with the help of EXPLAIN: reading information about what currently happens and comparing it to what you wish to reach. A similar process can tune any query, from the simplest to the most complex.

Table Tuning and Indexes

In the previous section, we introduced the process of tuning queries. In all the examples, EXPLAIN output contained information about indexes. But what if a table has no index at all? Or if the indexes are not being used? How should you choose when, where, and which index to add?

The MySQL server uses indexes when it can limit the number of rows examined. Therefore, having indexes on columns listed in WHERE, JOIN, and GROUP BY clauses can speed up queries. An index on the column in an ORDER BY clause can make sense as well, because it allows the server to sort more efficiently.

With those principles in mind, adding indexes becomes a simple task. Consider tables from the previous example, but without any indexes:

mysql> CREATE TEMPORARY TABLE items SELECT * FROM items;

Query OK, 5120 rows affected (6.97 sec)

Records: 5120 Duplicates: 0 Warnings: 0

mysql> CREATE TEMPORARY TABLE items_links SELECT * FROM items_links;

Query OK, 6 rows affected (0.36 sec)

Records: 6 Duplicates: 0 Warnings: 0

mysql> SHOW CREATE TABLE items;

+-------+----------------------------------------------------------------------+

| Table | Create Table |

+-------+----------------------------------------------------------------------+

| items | CREATE TEMPORARY TABLE `items` (

`id` int(11) NOT NULL DEFAULT '0',

`short_description` varchar(255) DEFAULT NULL,

`description` text,

`example` text,

`explanation` text,

`additional` text

) ENGINE=InnoDB DEFAULT CHARSET=utf8 |

+-------+----------------------------------------------------------------------+

1 row in set (0.10 sec)

mysql> SHOW CREATE TABLE items_links;

+-------------+----------------------------------------------------------------+

| Table | Create Table |

+-------------+----------------------------------------------------------------+

| items_links | CREATE TEMPORARY TABLE `items_links` (

`iid` int(11) DEFAULT NULL,

`linkid` int(11) DEFAULT NULL

) ENGINE=InnoDB DEFAULT CHARSET=utf8 |

+-------------+----------------------------------------------------------------+

1 row in set (0.00 sec)

As you can see, no index is specified. Let’s try an unoptimized query on these tables, and then improve it:

mysql> EXPLAIN EXTENDED SELECT count(distinct items.id) FROM items JOIN

items_links ON (items.id=items_links.iid)\G

*************************** 1. row ***************************

id: 1

select_type: SIMPLE

table: items_links

type: ALL

possible_keys: NULL

key: NULL

key_len: NULL

ref: NULL

rows: 6

filtered: 100.00

Extra:

*************************** 2. row ***************************

id: 1

select_type: SIMPLE

table: items

type: ALL

possible_keys: NULL

key: NULL

key_len: NULL

ref: NULL

rows: 5137

filtered: 100.00

Extra: Using where; Using join buffer

2 rows in set, 1 warning (0.00 sec)

Note (Code 1003): select count(distinct `collaborate2011`.`items`.`id`) AS

`count(distinct items.id)` from `collaborate2011`.`items` join

`collaborate2011`.`items_links` where (`collaborate2011`.`items`.`id` =

`collaborate2011`.`items_links`.`iid`)

The type became ALL, the most expensive type, because it means all rows will be read. The query is examining 6*5,137 = 30,822 rows now. This is even worse than the query we considered to be slow in our earlier example.

Let’s examine the query in detail:

SELECT count(distinct items.id)...

This query returns the number of unique not-null values in the result set. It would make sense to add an index on items.id, so that this search could use it.

Another part of the same query:

...FROM items JOIN items_links ON (items.id=items_links.iid)

The join refers to the id column from items and the iid column from items_links. So it makes sense to add indexes to both of those columns.

mysql> ALTER TABLE items ADD INDEX(id);

Query OK, 5120 rows affected (4.78 sec)

Records: 5120 Duplicates: 0 Warnings: 0

mysql> ALTER TABLE items_links ADD INDEX(iid);

Query OK, 6 rows affected (0.04 sec)

Records: 6 Duplicates: 0 Warnings: 0

Now we can see how this affects the query plan:

mysql> EXPLAIN EXTENDED SELECT count(distinct items.id) FROM items

JOIN items_links ON (items.id=items_links.iid)\G

*************************** 1. row ***************************

id: 1

select_type: SIMPLE

table: items_links

type: index

possible_keys: iid

key: iid

key_len: 5

ref: NULL

rows: 6

filtered: 100.00

Extra: Using index

*************************** 2. row ***************************

id: 1

select_type: SIMPLE

table: items

type: ref

possible_keys: id

key: id

key_len: 4

ref: collaborate2011.items_links.iid

rows: 1

filtered: 100.00

Extra: Using index

2 rows in set, 1 warning (0.00 sec)

Note (Code 1003): select count(distinct `collaborate2011`.`items`.`id`) AS

`count(distinct items.id)` from `collaborate2011`.`items` join

`collaborate2011`.`items_links` where (`collaborate2011`.`items`.`id` =

`collaborate2011`.`items_links`.`iid`)

This looks much better than before, with a single exception: the ref type for table items is worse than the eq_ref we got in the previous section. The ref type is used because we added a plain index, whereas the original table had a unique index on the same column. Because IDs are unique and are supposed to stay that way, we can easily make this index unique too (ALTER TABLE items ADD UNIQUE(id), which appears as id_2 in the output below):

mysql> EXPLAIN EXTENDED SELECT count(distinct items.id) FROM items

JOIN items_links ON (items.id=items_links.iid)\G

*************************** 1. row ***************************

id: 1

select_type: SIMPLE

table: items_links

type: index

possible_keys: iid

key: iid

key_len: 5

ref: NULL

rows: 6

filtered: 100.00

Extra: Using index

*************************** 2. row ***************************

id: 1

select_type: SIMPLE

table: items

type: eq_ref

possible_keys: id_2,id

key: id_2

key_len: 4

ref: collaborate2011.items_links.iid

rows: 1

filtered: 100.00

Extra: Using index

2 rows in set, 1 warning (0.00 sec)

Note (Code 1003): select count(distinct `collaborate2011`.`items`.`id`) AS

`count(distinct items.id)` from `collaborate2011`.`items` join

`collaborate2011`.`items_links` where (`collaborate2011`.`items`.`id` =

`collaborate2011`.`items_links`.`iid`)

Now, when the faster-executing type eq_ref is used, we can drop the redundant plain index on items.id. This is especially important if you care about the speed of queries that modify data, because updating each index takes time. We will discuss when query tuning should be stopped in the following section.
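Dropping the redundant index is a single statement. The sketch below assumes the plain index created earlier was auto-named id (MySQL names an unnamed index after its first column); confirm the actual names first:

```sql
-- Confirm the index names; an unnamed index on column id is auto-named id,
-- and a second index on the same column becomes id_2.
SHOW CREATE TABLE items\G

-- Drop the now-redundant plain index, keeping the unique one (id_2).
ALTER TABLE items DROP INDEX id;
```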

You just learned how indexes affect query execution and when it makes sense to add them.

CHOOSING YOUR OWN EXECUTION PLAN

There are cases when an index actually slows down a query. In such cases, you can drop the index, or use an IGNORE INDEX clause if you need the index for other queries. You can also use FORCE INDEX to tell the optimizer which index you want it to use. These clauses can also be very useful for query tuning when you want to see how a particular index affects performance. Just try EXPLAIN with these clauses, and analyze the output.
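For example, using the tables from this chapter, you could compare the two plans like this (a sketch; the index name iid is the one created above):

```sql
-- Ask the optimizer to avoid the iid index...
EXPLAIN SELECT count(distinct items.id)
FROM items JOIN items_links IGNORE INDEX (iid)
    ON (items.id = items_links.iid)\G

-- ...or insist that it use it, then compare the two outputs.
EXPLAIN SELECT count(distinct items.id)
FROM items JOIN items_links FORCE INDEX (iid)
    ON (items.id = items_links.iid)\G
```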

Although the use of IGNORE INDEX and FORCE INDEX may sound appealing, you should avoid using them in production when possible, unless you are ready to re-examine every query that uses such a clause after each subsequent upgrade.

This is necessary because the optimizer always tries to choose the best execution plan. As it improves from version to version, it can create different plans for the same JOINs. When you don't force or ignore an index, it simply creates the best plan it can. But if you explicitly specify how it should use indexes in one table of a multitable JOIN, that rule can affect the other tables, and after an upgrade the final execution plan could change to something worse than it was before.

Using IGNORE INDEX and FORCE INDEX in queries that access only one table is relatively safe. In all other cases, you must check to make sure the query execution plan was not changed during the upgrade.

Another problem with using IGNORE INDEX and FORCE INDEX in production is that the best execution plan for particular tables depends on the data they store. In normal operation, the optimizer checks table statistics and adjusts the plan as necessary, which it can’t do when you use IGNORE INDEX and FORCE INDEX. If you use these clauses, you must regularly check whether they are still effective throughout the application’s life.

When to Stop Optimizing

The previous sections discussed simple queries. Even there we found ways to improve the queries, sometimes with increasingly better results over several iterations of tuning. When you work with complicated queries that have many JOIN conditions, many columns in WHERE clauses, and GROUP BY clauses, you have even more choices. It's easy to imagine that you could always find a way to make performance a little faster and that these improvements could continue forever. So the question is when to consider a query properly optimized and put a stop to the research.

An in-depth knowledge of performance techniques could help you choose the proper solution. But there are still basic considerations that can help you to stop, even if you don’t consider yourself an expert.

First, you need to find out what the query does. For example, the following query:

SELECT * FROM contacts

always returns all columns and rows from the table, and no optimization can be applied to it.

But even if you extract all columns, adding a JOIN can change the situation:

SELECT * FROM child JOIN parent ON (child.pid=parent.id)

This can be optimized because the ON condition limits the result set. The same analysis can be applied to queries with WHERE and GROUP BY conditions.

The second thing you need to look at is the join type from the EXPLAIN output. Although you will be trying to get the best possible join types, keep in mind the limitations of your data. For example, a condition on a nonunique column can never lead to the type eq_ref or better.

Your data is very important when you optimize queries. Differences in data can lead to completely different results for the same execution plan. The most trivial example is to compare results for a single row in a table to results for a table where more than 50% of the rows have the same value. In these cases, using indexes can decrease performance rather than increase it.
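One hedged way to see how selective your data actually is, before trusting an index, is to compare index cardinality with the table's row count (a sketch using the items table from this chapter):

```sql
-- Cardinality is the optimizer's estimate of distinct values in the index;
-- the closer it is to the number of rows, the more selective the index.
SHOW INDEX FROM items;

-- Or measure the ratio exactly for one column:
SELECT COUNT(DISTINCT id) AS distinct_ids, COUNT(*) AS total_rows
FROM items;
```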

§ Here is another rule: do not rely only on the EXPLAIN output; make sure to measure the actual query execution time.
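In practice this just means timing the query itself; the mysql client prints the wall-clock time after each statement. The SQL_NO_CACHE modifier (available in the 5.x servers this book covers) keeps the query cache from masking the real execution cost:

```sql
-- Time the real query; the client prints the elapsed time after the result.
SELECT SQL_NO_CACHE count(distinct items.id)
FROM items JOIN items_links ON (items.id = items_links.iid);
```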

Another thing you should keep in mind is the effect of indexes on changes to the table. Although indexes usually improve the speed of SELECT queries, they slightly decrease the speed of queries that modify data, especially INSERT statements. Therefore, it can sometimes be sensible to live with slow SELECT queries if it speeds up the execution of inserts. Always keep in mind the overall performance of your application, not just a single query.

Effects of Options

Suppose you have completely optimized your query and can’t find any ways to tune it better, but it’s still slow. Can you do anything to improve its performance? Yes. There are server options that allow you to tune factors affecting the query, such as the size of temporary tables in memory, buffers for sorting, and so on. Some options specific to a particular storage engine, such as InnoDB, can also be useful for query optimizing.

I will describe these configuration options in more detail in Chapter 3. Here I’ll give an overview of how to use them to improve performance.

Tuning server options is a somewhat global activity because a change can potentially affect every query on the server (or in the case of engine-specific options, every query that refers to a table using that storage engine). But some options are used for particular kinds of optimization, and if your query does not meet the right conditions, it remains unaffected.

The first options to check are buffer sizes. Each memory buffer is allocated for specific reasons. The general rule is that large buffers mean higher performance—but only if the query can use the larger size for the particular role played by that buffer.

And of course there are trade-offs when you increase buffer sizes. Here are some of the issues large buffers can cause. I don’t want to dissuade you from setting large buffers, because under the right circumstances it’s a great way to improve performance significantly. You just need to keep the following issues in mind and adjust sizes reasonably.

Swapping

A large buffer may lead to swapping at the operating-system level and therefore slow performance, depending on the size of RAM on your system. In general, the MySQL server works fast if all the memory it needs sits in physical RAM. When it starts swapping, performance degrades dramatically.

Swapping can happen when you allocate more memory to buffers than your server physically has in RAM. Please note that some buffers are allocated for each user thread. To determine how much memory the server can allocate for such buffers, use the formula max_connections * buffer_size. Calculate the sum of this product over all per-thread buffers, and make sure the total is less than the amount of memory the mysqld server can use. This calculation is not decisive, because mysqld can actually allocate more memory than you explicitly specify.
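You can compute such an estimate from inside the server, because the options are exposed as system variables. This sketch uses only the buffers named in the server's own crash-report formula; a real installation has more per-thread buffers (join_buffer_size, and so on):

```sql
-- Rough upper bound on buffer memory, in kilobytes
SELECT (@@key_buffer_size
        + (@@read_buffer_size + @@sort_buffer_size) * @@max_connections)
       / 1024 AS estimated_kbytes;
```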

Startup time

The more memory mysqld needs to allocate, the longer it takes to start.

Stale data

There are also scaling issues, mostly for caches shared between threads. Enlarging the buffers that back these caches can lead to memory fragmentation. You will generally notice the fragmentation problem after the server has been running for hours, when old data needs to be removed from the buffer in order to make room for new data. This can cause a fast server to suddenly slow down. I show this in an example in Chapter 3.

After finishing with buffers, consider the other options discussed in Chapter 3. Pay attention not only to options explicitly designated for performance tuning, such as optimizer options, but also to options that control how safely data is written. The safer you make a transaction, the more checks are needed and the more slowly the query executes. But be careful with such options; relax them only when you can sacrifice safety for performance.

When you tune options, it is especially important to consider performance as a whole, because every option affects the whole server. For example, there is no sense in tuning engine-specific options if you don’t use that engine. This might appear too obvious, but I have seen way too many installations where MyISAM options were huge when only the InnoDB storage engine was used, or vice versa. This is especially important to remember if you take some generic configuration as a template.

The MySQL server allows you to change most of its configuration dynamically. This is valuable for performance testing. Change options and rerun queries to make sure you get good results before applying the changes in the configuration file. It is also always a good idea to apply changes step by step, so that you can attribute bad effects to the right option and easily go back. We will discuss this technique in detail in Chapter 6.
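A typical test cycle looks like this (a sketch; the variable and values are illustrative, not a recommendation):

```sql
-- Check the current value, change it dynamically, and retest.
SHOW GLOBAL VARIABLES LIKE 'tmp_table_size';
SET GLOBAL tmp_table_size = 64 * 1024 * 1024;  -- new connections pick this up

-- ...rerun the query from a new connection and compare timings...

SET GLOBAL tmp_table_size = 16 * 1024 * 1024;  -- roll back if no improvement
```

Only once a change proves itself this way should it go into the configuration file.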

Queries That Modify Data

We discussed effects on the performance of SELECT queries, and in this section, we turn to tuning queries that modify data. UPDATE and DELETE queries can use the same conditions as SELECT to limit the number of rows affected. Therefore, the same rules for tuning can be applied to these queries.

We saw in When the Problem May Have Been a Previous Update how to convert UPDATE and DELETE queries to SELECT queries and run EXPLAIN on them. You can use this technique to troubleshoot performance issues on versions prior to 5.6.3, which introduced EXPLAIN for INSERT, UPDATE, and DELETE queries, but remember that UPDATE and DELETE are sometimes executed slightly differently from the corresponding SELECT.
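As a sketch, the UPDATE used later in this section could be checked this way before running it:

```sql
-- Original data-modifying query:
-- UPDATE items SET description = 'no description',
--        additional = 'no additional comments'
-- WHERE description IS NULL;

-- Equivalent SELECT for EXPLAIN on pre-5.6.3 servers:
EXPLAIN SELECT * FROM items WHERE description IS NULL\G
```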

Check whether indexes were used by querying the Handler_% status variables before and after query execution:

mysql> SHOW STATUS LIKE 'Handler_%'; 1

+----------------------------+-------+

| Variable_name | Value |

+----------------------------+-------+

| Handler_commit | 0 |

| Handler_delete | 0 |

| Handler_discover | 0 |

| Handler_prepare | 0 |

| Handler_read_first | 0 |

| Handler_read_key | 0 |

| Handler_read_last | 0 |

| Handler_read_next | 0 |

| Handler_read_prev | 0 |

| Handler_read_rnd | 0 |

| Handler_read_rnd_next | 19 |

| Handler_rollback | 0 |

| Handler_savepoint | 0 |

| Handler_savepoint_rollback | 0 |

| Handler_update | 0 |

| Handler_write | 17 |

+----------------------------+-------+

16 rows in set (0.00 sec)

1

Here I used SHOW STATUS, which is a synonym of SHOW SESSION STATUS and shows the status variables for the current session.

NOTE

It is convenient to reset these variables before testing using a FLUSH STATUS query.
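For example, before each test run:

```sql
FLUSH STATUS;                   -- reset the session status counters
-- ...run the query under test...
SHOW STATUS LIKE 'Handler_%';   -- counters now reflect mostly that one query
```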

We’ll talk about particular variables in the preceding list as we continue. You should be aware, though, that these are cumulative values, so they increase as you issue each query. Now let’s tune our example query from When the Problem May Have Been a Previous Update so that it will update nullable columns:

mysql> UPDATE items SET description = 'no description', additional

= 'no additional comments' WHERE description IS NULL;

Query OK, 0 rows affected (6.95 sec)

Rows matched: 0 Changed: 0 Warnings: 0

This changed no rows because we corrupted the data in an earlier step: we have 0 in each field instead of NULL now. But the query runs very slowly. Let’s look at the handler variables:

mysql> show status like 'Handler_%';

+----------------------------+-------+

| Variable_name | Value |

+----------------------------+-------+

| Handler_commit | 1 |

| Handler_delete | 0 |

| Handler_discover | 0 |

| Handler_prepare | 0 |

| Handler_read_first | 1 |

| Handler_read_key | 2 |

| Handler_read_last | 0 |

| Handler_read_next | 0 |

| Handler_read_prev | 0 |

| Handler_read_rnd | 0 |

| Handler_read_rnd_next | 5140 |

| Handler_rollback | 0 |

| Handler_savepoint | 0 |

| Handler_savepoint_rollback | 0 |

| Handler_update | 0 |

| Handler_write | 17 |

+----------------------------+-------+

16 rows in set (0.01 sec)

What jumps out is the high value of Handler_read_rnd_next, which shows how often the next row in a datafile was read. A high value typically means that a table scan was used, which is not good for performance. Handler_read_key is a related variable showing the number of index read requests. It shouldn’t be so low in relation to Handler_read_rnd_next, because such a ratio means many rows were read directly instead of through an index. In addition, the values of Handler_commit and Handler_read_first increased slightly; these show, respectively, the number of commits and the number of times the first entry in an index was read. The unobtrusive 1 in Handler_read_first shows that the server read a first entry in an index, which can be a symptom of a full index scan.

Hopefully, this run-through of a few Handler_% status variables has shown you how they can be used to check how queries are running. I’ll leave the question of whether the speed of this query can be improved as homework for you.

I’ll just spend a little space on INSERT queries. They have no conditions that limit the set of rows affected, so indexes on a table merely slow them down, because each insert has to update the indexes. The performance of inserts should therefore be tuned using server options. Here especially, the options offered by the InnoDB storage engine can be helpful.

One way to speed up inserts is to combine many of them in one statement, also called “bulk insert”:

INSERT INTO t1 (f1, f2, f3, ...) VALUES (v1, v2, v3, ...), (v1, v2, v3, ...), ...;

But please note that inserts lock rows or even whole tables, so other queries are denied access while the insert runs. So I’ll end this section by repeating the general rule:

§ Keep the performance of the whole application in mind while tuning any single query.

No Silver Bullet

We just learned how tuning server options can dramatically improve performance. We’ve also seen in this chapter how to tune a particular query so it runs really fast. Tuning a query and tuning a server are often alternative solutions to performance problems. Is there any general rule about which kind of tuning you should start with?

I am afraid not. Tuning server options looks so promising that many people think finding and changing the right option will make mysqld run like a rocket. If you believe that too, I have to disappoint you: badly written queries will still be a drain on server resources. And you may enjoy good performance for a few hours after restarting the server only to see it decrease again, because every query will need a lot of resources and your caches will fill up. At times, the server will be inundated with millions of queries that want more and more resources.

However, tuning every single query might not be an option. Some of them are called rarely, so there is no need to spend human resources on them. Others may query all the rows in a table, which defeats attempts to optimize them.

I usually advocate some kind of “mixed” mode. Tune server options first, paying particular attention to options specific to your storage engine, then tune queries. After the important queries are tuned, go back to the server options and consider what you can tune more, then back to the rest of the queries, and so on, until you are happy with performance.

You can also start with the slowest queries in your application or find ones that can obviously benefit from trivial optimizations, then turn to server options. Consult the status variables, as shown earlier. I will describe them in more detail in Chapter 6.

Last but not least: use numerous information sources on performance tuning to create your own strategy.

When the Server Does Not Answer

Occasionally a MySQL client gets the catastrophic message "Lost connection to MySQL server during query" or "MySQL server has gone away." Although I hope you will never face this problem, it’s good to be prepared in case it happens. There are two main reasons for this problem that lie within the MySQL installation itself: server issues (most likely a crash) or the misuse of connection options (usually timeout options or max_allowed_packet).

We will discuss the configuration of connections in Chapter 3. Problems caused by hardware and third-party software will be touched on in Chapter 4. Here I want to briefly describe what to do if the server crashes.

First, determine whether you really had a server crash. You can do this with the help of process status monitors. If you run mysqld_safe or another daemon that restarts the server after a failure, the error log will contain a message indicating the server has been restarted. When mysqld starts, it always prints something like this to the error logfile:

110716 14:01:44 [Note] /apps/mysql-5.1/libexec/mysqld: ready for connections.

Version: '5.1.59-debug' socket: '/tmp/mysql51.sock' port: 3351 Source distribution

So if you find such a message, the server has been restarted. If there is no message and the server is up and running, a lost connection is most likely caused by the misuse of connection options, which will be discussed in Chapter 3.

NOTE

If you remember when your MySQL server was originally started, you can use the status variable Uptime, which shows the number of seconds since the server started:

mysql> SHOW GLOBAL STATUS LIKE 'uptime';

+---------------+-------+

| Variable_name | Value |

+---------------+-------+

| Uptime | 10447 |

+---------------+-------+

1 row in set (0.00 sec)

This information also helps when you want to check whether a mysqld failure was caused by an operating system restart. Just compare the value of this variable with the operating system uptime.

I rely on error logfiles because, in my support experience, customers sometimes notice a problem hours after a server crash, or even some time after a planned mysqld restart.

If you confirm that the server has restarted, you should examine the error log again and look for information about the crash itself. Usually you can derive enough information about the crash from the error log to avoid the same situation in the future. We’ll discuss how to investigate the small number of difficult cases you may encounter in Chapter 6. Now let’s go back to the error logfile and see examples of its typical contents in case of a server crash. I’ll list a large extract here:

Version: '5.1.39' socket: '/tmp/mysql_sandbox5139.sock' port: 5139

MySQL Community Server (GPL)

091002 14:56:54 - mysqld got signal 11 ;

This could be because you hit a bug. It is also possible that this binary

or one of the libraries it was linked against is corrupt, improperly built,

or misconfigured. This error can also be caused by malfunctioning hardware.

We will try our best to scrape up some info that will hopefully help diagnose

the problem, but since we have already crashed, something is definitely wrong

and this may fail.

key_buffer_size=8384512

read_buffer_size=131072

max_used_connections=1

max_threads=151

threads_connected=1

It is possible that mysqld could use up to

key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 338301 K

bytes of memory

Hope that's ok; if not, decrease some variables in the equation.

thd: 0x69e1b00

Attempting backtrace. You can use the following information to find out

where mysqld died. If you see no messages after this, something went

terribly wrong...

stack_bottom = 0x450890f0 thread_stack 0x40000

/users/ssmirnova/blade12/5.1.39/bin/mysqld(my_print_stacktrace+0x2e)[0x8ac81e]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(handle_segfault+0x322)[0x5df502]

/lib64/libpthread.so.0[0x3429e0dd40]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN6String4copyERKS_+0x16)[0x5d9876]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN14Item_cache_str5storeEP4Item+0xc9)

[0x52ddd9]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN26select_singlerow_subselect9send_

dataER4ListI4ItemE+0x45)

[0x5ca145]

/users/ssmirnova/blade12/5.1.39/bin/mysqld[0x6386d1]

/users/ssmirnova/blade12/5.1.39/bin/mysqld[0x64236a]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN4JOIN4execEv+0x949)[0x658869]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN30subselect_single_select_

engine4execEv+0x36c)[0x596f3c]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN14Item_subselect4execEv+0x26)[0x595d96]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN24Item_singlerow_subselect8val_

realEv+0xd)[0x595fbd]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN14Arg_comparator18compare_real_

fixedEv+0x39)[0x561b89]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN12Item_func_ne7val_intEv+0x23)[0x568fb3]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN4JOIN8optimizeEv+0x12ef)[0x65208f]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_

LISTjR4ListIS1_ES2_jP8

st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0xa0)

[0x654850]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_Z13handle_selectP3THDP6st_lexP13select_

resultm+0x16c)

[0x65a1cc]

/users/ssmirnova/blade12/5.1.39/bin/mysqld[0x5ecbda]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_Z21mysql_execute_commandP3THD+0x602)

[0x5efdd2]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_Z11mysql_parseP3THDPKcjPS2_+0x357)

[0x5f52f7]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_Z16dispatch_command19enum_server_

commandP3THDPcj+0xe93)

[0x5f6193]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_Z10do_commandP3THD+0xe6)[0x5f6a56]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(handle_one_connection+0x246)[0x5e93f6]

/lib64/libpthread.so.0[0x3429e061b5]

/lib64/libc.so.6(clone+0x6d)[0x34292cd39d]

Trying to get some variables.

Some pointers may be invalid and cause the dump to abort...

thd->query at 0x6a39e60 = select 1 from `t1` where `c0` <>

(select geometrycollectionfromwkb(`c3`) from `t1`)

thd->thread_id=2

thd->killed=NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains

information that should help you find out what is causing the crash.

The key line indicating the reason for the crash is:

091002 14:56:54 - mysqld got signal 11 ;

This means the MySQL server was killed after it asked the operating system for a resource (e.g., access to a file or RAM) and got an error with code 11. In most operating systems, this signal refers to a segmentation fault. You can find more detailed information in the user manual for your operating system; run man signal on Unix and Linux. On Windows, similar cases usually generate a log message like “mysqld got exception 0xc0000005.” Search the Windows documentation for the meanings of these exception codes.

The following is the excerpt from the log about a query that was running in the thread that crashed the server:

Trying to get some variables.

Some pointers may be invalid and cause the dump to abort...

thd->query at 0x6a39e60 = SELECT 1 FROM `t1` WHERE `c0` <>

(SELECT geometrycollectionfromwkb(`c3`) FROM `t1`)

thd->thread_id=2

thd->killed=NOT_KILLED

To diagnose, rerun the query to see whether it was the cause of the crash:

mysql> SELECT 1 FROM `t1` WHERE `c0` <> (SELECT

geometrycollectionfromwkb(`c3`) FROM `t1`);

ERROR 2013 (HY000): Lost connection to MySQL server during query

NOTE

When I recommend repeating problems, I assume you will use a development server and not your production server. We will discuss how to troubleshoot safely in an environment dedicated to this purpose in Sandboxes. Please don’t try to repeat this example; it is based on a known bug, #47780, that’s fixed in current versions. The fix exists as of versions 5.0.88, 5.1.41, 5.5.0, and 6.0.14.

At this point, you have identified the source of the crash and confirmed this was really the source, but you have to rewrite the query so it does not cause a crash next time. Now we can get help from the backtrace that was printed in the log:

Attempting backtrace. You can use the following information to find out

where mysqld died. If you see no messages after this, something went

terribly wrong...

stack_bottom = 0x450890f0 thread_stack 0x40000

/users/ssmirnova/blade12/5.1.39/bin/mysqld(my_print_stacktrace+0x2e)[0x8ac81e]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(handle_segfault+0x322)[0x5df502]

/lib64/libpthread.so.0[0x3429e0dd40]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN6String4copyERKS_+0x16)[0x5d9876]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN14Item_cache_str5storeEP4Item+0xc9)

[0x52ddd9]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN26select_singlerow_subselect9

send_dataER4ListI4ItemE+0x45)

[0x5ca145]

/users/ssmirnova/blade12/5.1.39/bin/mysqld[0x6386d1]

/users/ssmirnova/blade12/5.1.39/bin/mysqld[0x64236a]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN4JOIN4execEv+0x949)[0x658869]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN30subselect_single_select_engine4execEv

+0x36c)[0x596f3c]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN14Item_subselect4execEv+0x26)[0x595d96]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN24Item_singlerow_subselect8val_realEv

+0xd)[0x595fbd]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN14Arg_comparator18compare_real_fixedEv

+0x39)[0x561b89]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN12Item_func_ne7val_intEv+0x23)[0x568fb3]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN4JOIN8optimizeEv+0x12ef)[0x65208f]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_

LISTjR4ListIS1_ES2_jP8

st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0xa0)

[0x654850]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_Z13handle_selectP3THDP6st_lexP13select_

resultm+0x16c)

[0x65a1cc]

/users/ssmirnova/blade12/5.1.39/bin/mysqld[0x5ecbda]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_Z21mysql_execute_commandP3THD+0x602)

[0x5efdd2]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_Z11mysql_parseP3THDPKcjPS2_+0x357)

[0x5f52f7]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_Z16dispatch_command19enum_server_

commandP3THDPcj+0xe93)

[0x5f6193]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_Z10do_commandP3THD+0xe6)[0x5f6a56]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(handle_one_connection+0x246)[0x5e93f6]

/lib64/libpthread.so.0[0x3429e061b5]

/lib64/libc.so.6(clone+0x6d)[0x34292cd39d]

The relevant lines are the calls to Item_subselect and Item_singlerow_subselect:

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN14Item_subselect4execEv+0x26)[0x595d96]

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN24Item_singlerow_subselect8val_realEv

+0xd)[0x595fbd]

How did I decide these were the culprits? In this case, I recognized the calls from my previous troubleshooting, but a good rule of thumb is to start from the top. The first few functions are usually signal-handling or operating system calls, which can be relevant but are of no help in these circumstances because you cannot do anything about them; then follow the calls into the MySQL code. Examine these from top to bottom to find the ones you can affect. You can’t do anything about String::copy or Item_cache_str::store, for instance, but you can rewrite a subselect, so we’ll start from there.

Here, even without looking into the source code for mysqld, we can play around to find the cause of the crash. It’s a good guess that the use of a subquery is the problem because subqueries can be converted easily to JOIN. Let’s try rewriting the query and testing it:

mysql> SELECT 1 FROM `t1` WHERE `c0` <> geometrycollectionfromwkb(`c3`);

Empty set (0.00 sec)

The new query does not crash, so all you need to do is change the query in the application to match.

§ You just learned something important about MySQL troubleshooting: the first thing to check in case of an unknown error is the error logfile. Always have it turned on.

Here I want to add a note about bugs. When you are faced with a crash and identify the reason, check the MySQL bug database for similar problems. When you find a bug that looks like the one you hit, check whether it has been fixed, and if so, upgrade your server to the version where the bug was fixed (or a newer one). This can save you time because you won’t have to fix the problematic query.

If you can’t find a bug similar to one you hit, try downloading the latest version of the MySQL server and running your query. If the bug is reproducible there, report it. It’s important to use the latest versions of stable general availability (GA) releases because they contain all current bug fixes and many old issues won’t reappear. Chapter 6 discusses how to safely test crashes in sandbox environments.

Crashes can be caused not only by particular queries, but also by the environment in which the server runs. The most common reason is the lack of free RAM. This happens particularly when the user allocates huge buffers. As I noted before, mysqld always needs slightly more RAM than the sum of all buffers. Usually the error logfile contains a rough estimation of the RAM that can be used. It looks like this:

key_buffer_size=235929600

read_buffer_size=4190208

max_used_connections=17

max_connections=2048

threads_connected=13

It is possible that mysqld could use up to

key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 21193712 K

-----

21193712K ~= 20G

Such estimations are not precise, but still worth checking. The one just shown claims that mysqld can use up to 20G RAM! You can get powerful boxes nowadays, but it is worth checking whether you really have 20G RAM.

Another issue in the environment is other applications that run along with the MySQL server. It is always a good idea to dedicate a server for MySQL in production because other applications can use resources that you expect MySQL to use. We will describe how to debug the effects of other applications on mysqld in Chapter 4.

Issues with Solutions Specific to Storage Engines

Actually, any problem this book discusses could have specific nuances based on the storage engine you use. We will touch upon these aspects throughout the book. In this section, I want to show a few features of storage engines that are independent of other problems. We’ll cover a few basic problems that use tools specific to MyISAM or InnoDB because these are the most popular and frequently used storage engines. If you use a third-party storage engine, consult its user manual for useful tools.

Errors related to a storage engine are either reported back to the client or recorded in the error logfile. Usually the name of the storage engine appears in the error message. In rare cases, you will get an error number not known by the perror utility. This is usually a symptom of an issue with a storage engine.

One common storage engine issue is corruption. This is not always the fault of the storage engine, but can have an external cause such as disk damage, a system crash, or a MySQL server crash. For example, if somebody runs kill -9 on the server’s process, she is almost asking for data corruption. We will discuss here what to do in case of MyISAM and InnoDB corruption. We will not discuss how to fix corruption of a third-party storage engine; consult its documentation for guidance. As a general recommendation, you can try CHECK TABLE, which many storage engines support. (CHECK TABLE for the MyISAM storage engine is explained in Repairing a MyISAM table from SQL.)

Corruption is a difficult problem to diagnose because the user might not notice it until the MySQL server accesses a corrupted table. The symptoms can also be misleading. In the best case, you will get an error message. However, the problem might be manifested by incorrect execution of queries or even a server shutdown. If problems crop up suddenly on a particular table, always check for corruption.

NOTE

Once you suspect corruption, you need to repair the corrupted table. It’s always a good practice to back up table files before doing a repair so you can go back if something goes wrong.

MyISAM Corruption

MyISAM stores every table as a set of three files: table_name.frm contains the table structure (schema), table_name.MYD contains the data, and table_name.MYI contains the index. Corruption can damage the datafile, the index file, or both. In such cases, you will get an error like "ERROR 126 (HY000): Incorrect key file for table './test/t1.MYI'; try to repair it" or "Table './test/t2' is marked as crashed and last (automatic?) repair failed" when you access the table. The error message can vary, but check for the words “repair” or “crashed” as a clue that the table is corrupted.

The SQL statements CHECK TABLE and REPAIR TABLE diagnose and repair corruption. From the operating system shell, you can also use the myisamchk utility for the same purpose. One advantage of myisamchk is that it works without access to a running MySQL server. For instance, you can try to repair a table after a crash before bringing the server up again.

Repairing a MyISAM table from SQL

CHECK TABLE without parameters shows the current table status:

mysql> CHECK TABLE t2;
+---------+-------+----------+------------------------------------------------------+
| Table   | Op    | Msg_type | Msg_text                                             |
+---------+-------+----------+------------------------------------------------------+
| test.t2 | check | warning  | Table is marked as crashed and last repair failed    |
| test.t2 | check | warning  | Size of indexfile is: 1806336 Should be: 495616      |
| test.t2 | check | error    | Record-count is not ok; is 780 Should be: 208        |
| test.t2 | check | warning  | Found 780 key parts. Should be: 208                  |
| test.t2 | check | error    | Corrupt                                              |
+---------+-------+----------+------------------------------------------------------+
5 rows in set (0.09 sec)

This is an example of output for a corrupted table. Your first resort is to run REPAIR TABLE without parameters:

mysql> REPAIR TABLE t2;
+---------+--------+----------+----------------------------------------+
| Table   | Op     | Msg_type | Msg_text                               |
+---------+--------+----------+----------------------------------------+
| test.t2 | repair | warning  | Number of rows changed from 208 to 780 |
| test.t2 | repair | status   | OK                                     |
+---------+--------+----------+----------------------------------------+
2 rows in set (0.05 sec)

This time we were lucky and the table was repaired successfully. We can run CHECK TABLE again to confirm this:

mysql> CHECK TABLE t2;
+---------+-------+----------+----------+
| Table   | Op    | Msg_type | Msg_text |
+---------+-------+----------+----------+
| test.t2 | check | status   | OK       |
+---------+-------+----------+----------+
1 row in set (0.02 sec)

If a simple REPAIR TABLE run does not help, there are two more options. REPAIR TABLE EXTENDED works more slowly than the bare REPAIR TABLE, but can fix 99% of errors. As a last resort, run REPAIR TABLE USE_FRM, which does not trust the information in the index file. Instead, it drops and then recreates the index using the description from the table_name.frm file and fills it with values from the table_name.MYD file.

NOTE

For the very same purpose, you can use a utility named mysqlcheck. This program works by sending CHECK and REPAIR statements to the server. It also has very nice options, such as --all-databases, which can help you perform table maintenance effectively.

mysqlcheck connects to the MySQL server as any other client does, and thus can be used remotely.

Repairing a MyISAM table using myisamchk

All of these steps can also be performed using myisamchk, which has a lot of additional table maintenance options. I won’t describe all the features of the utility here, but instead concentrate on those specific to table repair.

myisamchk directly accesses table files and does not require the MySQL server to be started. This can be very useful in some situations. At the same time, myisamchk requires exclusive access to table files, and you should avoid using it when the MySQL server is running.

WARNING

If you have to use myisamchk while the server is running, issue the queries FLUSH TABLES and LOCK TABLE table_name WRITE, wait until the latter query returns a command prompt, and then run myisamchk in a parallel session. If other processes besides myisamchk access the table while myisamchk is running, even worse corruption can occur.

A basic recovery command is:

$myisamchk --backup --recover t2
- recovering (with sort) MyISAM-table 't2'
Data records: 208
- Fixing index 1
- Fixing index 2
Data records: 780

The --backup option tells myisamchk to back up the datafile before trying to fix the table, and --recover does the actual repair. If this command is insufficient, you can use the --safe-recover option. The latter option uses a recovery method that has existed since very early versions of MySQL and can find issues that the simple --recover option cannot. An even more drastic option is --extend-check.

You can also use the option --sort-recover, which uses sorting to resolve the keys even when the temporary file is very large.

Among other options, which I recommend you study carefully, is the very useful --description option, which prints a description of the table. Taken together with -v or its synonym, --verbose, it will print additional information. You can specify the -v option twice or even three times to get more information.

InnoDB Corruption

InnoDB stores its data and indexes in shared tablespaces. If the server was started with the option --innodb_file_per_table, each table created after that point also has its own datafile, but the table definition still resides in the shared tablespace. Understanding how table files are stored helps you maintain the data directory and backups effectively.

InnoDB is a transactional storage engine and has internal mechanisms that automatically fix most kinds of corruption. It does this recovery at server startup. The following excerpt from the error log, taken after a backup by MySQL Enterprise Backup (MEB) using the mysqlbackup --copy-back command, shows a typical recovery[6]:

InnoDB: The log file was created by ibbackup --apply-log at
InnoDB: ibbackup 110720 21:33:50
InnoDB: NOTE: the following crash recovery is part of a normal restore.
InnoDB: The log sequence number in ibdata files does not match
InnoDB: the log sequence number in the ib_logfiles!
110720 21:37:15 InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Last MySQL binlog file position 0 98587529, file name ./blade12-bin.000002
110720 21:37:15 InnoDB Plugin 1.0.17 started; log sequence number 1940779532
110720 21:37:15 [Note] Event Scheduler: Loaded 0 events
110720 21:37:15 [Note] ./libexec/mysqld: ready for connections.
Version: '5.1.59-debug' socket: '/tmp/mysql_ssmirnova.sock' port: 33051
Source distribution

But sometimes corruption is so severe that InnoDB cannot repair it without user intervention. For such situations, the startup option --innodb_force_recovery exists. It can be set to any value from 0 to 6 (0 means no forced recovery, 1 is the lowest level of forced recovery, and 6 is the highest). When the server starts in forced recovery mode, you can run certain types of queries against the damaged table, but operations that modify data are blocked. Certain SELECT statements are still allowed, as are DROP statements. At level 6, for instance, you can run only queries of the form SELECT * FROM table_name with no qualifying clauses: no WHERE, ORDER BY, or anything else.

In case of corruption, try each level of --innodb_force_recovery, starting from 1 and increasing, until you are able to start the server and query the problem table. Your prior investigation should have uncovered which table is corrupted. Dump it to a file using SELECT INTO OUTFILE, then recreate it using DROP and CREATE. Finally, restart the server with --innodb_force_recovery=0 and load the dump. If the problem persists, try to find other tables that are corrupted and go through the process until the server is fine again.

If you need to begin your repair of a database by using a positive value for --innodb_force_recovery, the error log often mentions it explicitly through messages such as this:

InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.1/en/forcing-recovery.html
InnoDB: about forcing recovery.

The error log is also where you will find information about unsuccessful automatic recovery and startup failures.

NOTE

InnoDB writes checksums for data, index, and log pages immediately before writing them to disk, and verifies the checksums immediately after reading pages from disk. This allows it to catch the majority of problems early. Usually when you encounter InnoDB corruption, it means you have issues with either the disk or the RAM.


[6] MySQL Enterprise Backup (MEB), formerly known as InnoDB HotBackup, is a tool that creates hot online backups of InnoDB tables and warm online backups of tables that use other storage engines. We will discuss backup methods in Chapter 7.

Permission Issues

MySQL has a complex privilege scheme, allowing you to tune precisely which users and hosts are allowed to perform one or another operation. Since version 5.5, MySQL also has pluggable authentication.

Although it has advantages, this scheme is complicated. For example, having user1@hostA different from user2@hostA and user1@hostB makes it easy to mix up their privileges. It is even easier to do this when the username is the same and the host changes.

MySQL allows you to tune access at the object and connection level. You can restrict a user’s access to a particular table, column, and so on.

Users usually experience two kinds of permission issues:

§ Users who should be able to connect to the server find they cannot, or users who should not be able to connect find that they can.

§ Users can connect to the server, but can’t use objects to which they are supposed to have access, or can use objects to which they are not supposed to have access.

Before you start troubleshooting these problems, you need to find out whether you can connect to the server.

After you succeed in connecting as the user you’re troubleshooting (we will discuss the case when connection is not possible a bit later in this chapter), run the query:

SELECT USER(), CURRENT_USER()

The USER() function returns the connection parameters the client actually used: usually the username that was specified and the hostname of the box where the client is running. CURRENT_USER() returns the username and hostname pair of the account that the server matched in its privilege tables; this is the pair mysqld uses to check access to database objects. By comparing the results of these functions, you can find out why mysqld uses privileges different from what you expected. A typical problem is trying to use a % wildcard for the hostname:

root> GRANT ALL ON book.* TO sveta@'%';
Query OK, 0 rows affected (0.00 sec)

root> GRANT SELECT ON book.* TO sveta@'localhost';
Query OK, 0 rows affected (0.00 sec)

If I now connect as sveta and try to create a table, I get an error:

$mysql -usveta book
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 30
Server version: 5.1.52 MySQL Community Server (GPL)

Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> CREATE TABLE t1(f1 INT);
ERROR 1142 (42000): CREATE command denied to user 'sveta'@'localhost' for table 't1'

The problem is that the MySQL server expands sveta to sveta@localhost, not to the wildcard:

mysql> SELECT user(), current_user();
+-----------------+-----------------+
| user()          | current_user()  |
+-----------------+-----------------+
| sveta@localhost | sveta@localhost |
+-----------------+-----------------+
1 row in set (0.00 sec)

If you don’t understand why one or another host was chosen, run a query like this:

mysql> SELECT user, host FROM mysql.user WHERE user='sveta' ORDER BY host DESC;
+-------+-----------+
| user  | host      |
+-------+-----------+
| sveta | localhost |
| sveta | %         |
+-------+-----------+
2 rows in set (0.00 sec)

MySQL sorts rows in this table from the most specific host value to the least specific and uses the first match it finds. Therefore, it connected me as the account sveta@localhost, which does not have the CREATE privilege.
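This sorting rule can be mimicked in a few lines. The following sketch is only an illustration of the principle, not MySQL’s actual matching code (which also handles netmasks and the _ wildcard); the function names are hypothetical:

```python
def host_specificity(host):
    """Sort key approximating MySQL's ordering of mysql.user rows:
    hosts with no wildcards come first, a bare '%' comes last."""
    has_wildcard = any(c in host for c in "%_")
    return (host == "%", has_wildcard, host)

def pick_account(user, client_host, accounts):
    """Return the first (user, host) entry that matches, after sorting
    candidates from most specific to least specific host."""
    candidates = [(u, h) for (u, h) in accounts if u == user]
    candidates.sort(key=lambda a: host_specificity(a[1]))
    for u, h in candidates:
        if h == client_host or h == "%":   # simplified: exact match or bare wildcard
            return (u, h)
    return None

accounts = [("sveta", "%"), ("sveta", "localhost")]
print(pick_account("sveta", "localhost", accounts))  # ('sveta', 'localhost')
```

The more specific entry sveta@localhost wins over sveta@'%', exactly as in the session above; a connection from any other host would fall through to the wildcard account.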

§ USER() and CURRENT_USER(), together with the query SELECT user, host FROM mysql.user ORDER BY host DESC, are the first resort in case of a permission issue.

Another issue with privileges arises when it is not possible to connect as the specified user. In this case, you can usually learn the reason from the error message, which looks similar to the following:

$mysql -usveta books
ERROR 1044 (42000): Access denied for user 'sveta'@'localhost' to database 'books'

After seeing this, you know the user credentials. Connect as the root superuser, and check whether such a user exists and has the required privileges:

mysql> SELECT user, host FROM mysql.user WHERE user='sveta' ORDER BY host DESC;
+-------+-----------+
| user  | host      |
+-------+-----------+
| sveta | localhost |
| sveta | %         |
+-------+-----------+
2 rows in set (0.00 sec)

mysql> SHOW GRANTS FOR 'sveta'@'localhost';
+-------------------------------------------------+
| Grants for sveta@localhost                      |
+-------------------------------------------------+
| GRANT USAGE ON *.* TO 'sveta'@'localhost'       |
| GRANT SELECT ON `book`.* TO 'sveta'@'localhost' |
+-------------------------------------------------+
2 rows in set (0.00 sec)

mysql> SHOW GRANTS FOR 'sveta'@'%';
+-------------------------------------------------+
| Grants for sveta@%                              |
+-------------------------------------------------+
| GRANT USAGE ON *.* TO 'sveta'@'%'               |
| GRANT ALL PRIVILEGES ON `book`.* TO 'sveta'@'%' |
+-------------------------------------------------+
2 rows in set (0.00 sec)

In this output, you can see that user 'sveta'@'localhost' has privileges only on the database named book, but no privileges on the books database. Now you can fix the problem: give user sveta@localhost the necessary privileges.

The previous examples discussed users who lacked necessary privileges. Users who are granted superfluous privileges can be handled in the same way; you just need to revoke the unnecessary privileges.

WARNING

MySQL privileges are detached from objects they control: this means mysqld does not check for the existence of an object when you grant a privilege on it and does not remove a privilege when all objects it grants access to are removed. This is both a great advantage, because it allows us to grant necessary privileges in advance, and a potential cause of an issue if used without care.

As a best practice, I recommend studying carefully how MySQL privileges work. This is especially important if you grant privileges at the object level, because you should understand how a grant at one level affects grants at other levels. The same considerations apply to revoking privileges, which can be even more critical: if you think you revoked a privilege but it is still present, unwanted access remains possible.