Sorting Query Results - MySQL Cookbook (2007)

MySQL Cookbook (2007)

Chapter 7. Sorting Query Results

Introduction

This chapter covers sorting, an operation that is extremely important for controlling how MySQL displays results from SELECT statements. Sorting is performed by adding an ORDER BY clause to a query. Without such a clause, MySQL is free to return rows in any order, so sorting helps bring order to disorder and makes query results easier to examine and understand. (Sorting is also performed implicitly when you use a GROUP BY clause, as discussed in Controlling Summary Display Order.)

You can sort rows of a query result several ways:

§ Using a single column, a combination of columns, or even parts of columns

§ Using ascending or descending order

§ Using the result of an expression

§ Using case-sensitive or case-insensitive string comparisons

§ Using temporal ordering

The driver_log table is used for several examples in this chapter; it contains columns for recording daily mileage logs for a set of truck drivers:

mysql>SELECT * FROM driver_log;

+--------+-------+------------+-------+

| rec_id | name | trav_date | miles |

+--------+-------+------------+-------+

| 1 | Ben | 2006-08-30 | 152 |

| 2 | Suzi | 2006-08-29 | 391 |

| 3 | Henry | 2006-08-29 | 300 |

| 4 | Henry | 2006-08-27 | 96 |

| 5 | Ben | 2006-08-29 | 131 |

| 6 | Henry | 2006-08-26 | 115 |

| 7 | Suzi | 2006-09-02 | 502 |

| 8 | Henry | 2006-09-01 | 197 |

| 9 | Ben | 2006-09-02 | 79 |

| 10 | Henry | 2006-08-30 | 203 |

+--------+-------+------------+-------+

Many other examples use the mail table (used in earlier chapters):

mysql>SELECT * FROM mail;

+---------------------+---------+---------+---------+---------+---------+

| t | srcuser | srchost | dstuser | dsthost | size |

+---------------------+---------+---------+---------+---------+---------+

| 2006-05-11 10:15:08 | barb | saturn | tricia | mars | 58274 |

| 2006-05-12 12:48:13 | tricia | mars | gene | venus | 194925 |

| 2006-05-12 15:02:49 | phil | mars | phil | saturn | 1048 |

| 2006-05-13 13:59:18 | barb | saturn | tricia | venus | 271 |

| 2006-05-14 09:31:37 | gene | venus | barb | mars | 2291 |

| 2006-05-14 11:52:17 | phil | mars | tricia | saturn | 5781 |

| 2006-05-14 14:42:21 | barb | venus | barb | venus | 98151 |

| 2006-05-14 17:03:01 | tricia | saturn | phil | venus | 2394482 |

| 2006-05-15 07:17:48 | gene | mars | gene | saturn | 3824 |

| 2006-05-15 08:50:57 | phil | venus | phil | venus | 978 |

| 2006-05-15 10:25:52 | gene | mars | tricia | saturn | 998532 |

| 2006-05-15 17:35:31 | gene | saturn | gene | mars | 3856 |

| 2006-05-16 09:00:28 | gene | venus | barb | mars | 613 |

| 2006-05-16 23:04:19 | phil | venus | barb | venus | 10294 |

| 2006-05-17 12:49:23 | phil | mars | tricia | saturn | 873 |

| 2006-05-19 22:21:51 | gene | saturn | gene | venus | 23992 |

+---------------------+---------+---------+---------+---------+---------+

Other tables are used occasionally as well. You can create most of them with scripts found in the tables directory of the recipes distribution.

Using ORDER BY to Sort Query Results

Problem

Output rows from a query don’t come out in the order you want.

Solution

Add an ORDER BY clause to the query to sort the result rows.

Discussion

The contents of the driver_log and mail tables shown in the chapter introduction are disorganized and difficult to make any sense of. The exception is that the values in the id and t columns are in order, but that’s just coincidental. Rows do tend to be returned from a table in the order they were originally inserted, but only until the table is subjected to delete and update operations. Rows inserted after that are likely to be returned in the middle of the result set somewhere. Many MySQL users notice this disturbance in row retrieval order, which leads them to ask, “How can I store rows in my table so they come out in a particular order when I retrieve them?” The answer to this question is, “That’s the wrong question.” Storing rows is the server’s job, and you should let the server do it. Besides, even if you can specify storage order, how would that help you if you want to see results sorted in different orders at different times?

When you select rows, they’re pulled out of the database and returned in whatever order the server happens to use. This order might change, even for statements that don’t sort rows, depending on which index the server happens to use when it executes a statement, because the index can affect the retrieval order. Even if your rows appear to come out in the proper order naturally, a relational database makes no guarantee about the order in which it returns rows—unless you tell it how. To arrange the rows from a query result into a specific order, sort them by adding an ORDER BYclause to your SELECT statement. Without ORDER BY, you may find that the retrieval order changes when you modify the contents of your table. With an ORDER BY clause, MySQL will always sort rows the way you indicate.

ORDERBY has the following general characteristics:

§ You can sort using a single column of values or multiple columns.

§ You can sort any column in either ascending order (the default) or descending order.

§ You can refer to sort columns by name or by using an alias.

This section shows some basic sorting techniques, such as how to name the sort columns and specify the sort direction. The following sections illustrate how to perform more complex sorts. Paradoxically, you can even use ORDER BY to disorder a result set, which is useful for randomizing the rows or (in conjunction with LIMIT) for picking a row at random from a result set. Those uses for ORDER BY are described in Chapter 13.

The following set of examples demonstrates how to sort on a single column or multiple columns and how to sort in ascending or descending order. The examples select the rows in the driver_log table but sort them in different orders so that you can compare the effect of the differentORDER BY clauses.

This query produces a single-column sort using the driver name:

mysql>SELECT * FROM driver_log ORDER BY name;

+--------+-------+------------+-------+

| rec_id | name | trav_date | miles |

+--------+-------+------------+-------+

| 1 | Ben | 2006-08-30 | 152 |

| 9 | Ben | 2006-09-02 | 79 |

| 5 | Ben | 2006-08-29 | 131 |

| 8 | Henry | 2006-09-01 | 197 |

| 6 | Henry | 2006-08-26 | 115 |

| 4 | Henry | 2006-08-27 | 96 |

| 3 | Henry | 2006-08-29 | 300 |

| 10 | Henry | 2006-08-30 | 203 |

| 7 | Suzi | 2006-09-02 | 502 |

| 2 | Suzi | 2006-08-29 | 391 |

+--------+-------+------------+-------+

The default sort direction is ascending. You can make the direction for an ascending sort explicit by adding ASC after the sorted column’s name:

SELECT * FROM driver_log ORDER BY name ASC;

The opposite (or reverse) of ascending order is descending order, specified by adding DESC after the sorted column’s name:

mysql>SELECT * FROM driver_log ORDER BY name DESC;

+--------+-------+------------+-------+

| rec_id | name | trav_date | miles |

+--------+-------+------------+-------+

| 2 | Suzi | 2006-08-29 | 391 |

| 7 | Suzi | 2006-09-02 | 502 |

| 10 | Henry | 2006-08-30 | 203 |

| 8 | Henry | 2006-09-01 | 197 |

| 6 | Henry | 2006-08-26 | 115 |

| 4 | Henry | 2006-08-27 | 96 |

| 3 | Henry | 2006-08-29 | 300 |

| 5 | Ben | 2006-08-29 | 131 |

| 9 | Ben | 2006-09-02 | 79 |

| 1 | Ben | 2006-08-30 | 152 |

+--------+-------+------------+-------+

If you closely examine the output from the queries just shown, you’ll notice that although the rows are sorted by name, the rows for any given name aren’t in any special order. (The trav_date values aren’t in date order for Henry or Ben, for example.) That’s because MySQL doesn’t sort something unless you tell it to:

§ The overall order of rows returned by a query is indeterminate unless you specify an ORDER BY clause.

§ Within a group of rows that sort together based on the values in a given column, the order of values in other columns also is indeterminate unless you name them in the ORDER BY clause.

To more fully control output order, specify a multiple-column sort by listing each column to use for sorting, separated by commas. The following query sorts in ascending order by name and by trav_date within the rows for each name:

mysql>SELECT * FROM driver_log ORDER BY name, trav_date;

+--------+-------+------------+-------+

| rec_id | name | trav_date | miles |

+--------+-------+------------+-------+

| 5 | Ben | 2006-08-29 | 131 |

| 1 | Ben | 2006-08-30 | 152 |

| 9 | Ben | 2006-09-02 | 79 |

| 6 | Henry | 2006-08-26 | 115 |

| 4 | Henry | 2006-08-27 | 96 |

| 3 | Henry | 2006-08-29 | 300 |

| 10 | Henry | 2006-08-30 | 203 |

| 8 | Henry | 2006-09-01 | 197 |

| 2 | Suzi | 2006-08-29 | 391 |

| 7 | Suzi | 2006-09-02 | 502 |

+--------+-------+------------+-------+

Multiple-column sorts can be descending as well, but DESC must be specified after each column name to perform a fully descending sort:

mysql>SELECT * FROM driver_log ORDER BY name DESC, trav_date DESC;

+--------+-------+------------+-------+

| rec_id | name | trav_date | miles |

+--------+-------+------------+-------+

| 7 | Suzi | 2006-09-02 | 502 |

| 2 | Suzi | 2006-08-29 | 391 |

| 8 | Henry | 2006-09-01 | 197 |

| 10 | Henry | 2006-08-30 | 203 |

| 3 | Henry | 2006-08-29 | 300 |

| 4 | Henry | 2006-08-27 | 96 |

| 6 | Henry | 2006-08-26 | 115 |

| 9 | Ben | 2006-09-02 | 79 |

| 1 | Ben | 2006-08-30 | 152 |

| 5 | Ben | 2006-08-29 | 131 |

+--------+-------+------------+-------+

Multiple-column ORDER BY clauses can perform mixed-order sorting where some columns are sorted in ascending order and others in descending order. The following query sorts by name in descending order and then by trav_date in ascending order for each name:

mysql>SELECT * FROM driver_log ORDER BY name DESC, trav_date;

+--------+-------+------------+-------+

| rec_id | name | trav_date | miles |

+--------+-------+------------+-------+

| 2 | Suzi | 2006-08-29 | 391 |

| 7 | Suzi | 2006-09-02 | 502 |

| 6 | Henry | 2006-08-26 | 115 |

| 4 | Henry | 2006-08-27 | 96 |

| 3 | Henry | 2006-08-29 | 300 |

| 10 | Henry | 2006-08-30 | 203 |

| 8 | Henry | 2006-09-01 | 197 |

| 5 | Ben | 2006-08-29 | 131 |

| 1 | Ben | 2006-08-30 | 152 |

| 9 | Ben | 2006-09-02 | 79 |

+--------+-------+------------+-------+

The ORDER BY clauses in the queries shown thus far refer to the sorted columns by name. You can also name the columns by using aliases. That is, if an output column has an alias, you can refer to the alias in the ORDER BY clause:

mysql>SELECT name, trav_date, miles AS distance FROM driver_log

-> ORDER BY distance;

+-------+------------+----------+

| name | trav_date | distance |

+-------+------------+----------+

| Ben | 2006-09-02 | 79 |

| Henry | 2006-08-27 | 96 |

| Henry | 2006-08-26 | 115 |

| Ben | 2006-08-29 | 131 |

| Ben | 2006-08-30 | 152 |

| Henry | 2006-09-01 | 197 |

| Henry | 2006-08-30 | 203 |

| Henry | 2006-08-29 | 300 |

| Suzi | 2006-08-29 | 391 |

| Suzi | 2006-09-02 | 502 |

+-------+------------+----------+

Columns specified by aliases can be sorted in either ascending or descending order, just like named columns:

mysql>SELECT name, trav_date, miles AS distance FROM driver_log

-> ORDER BY distance DESC;

+-------+------------+----------+

| name | trav_date | distance |

+-------+------------+----------+

| Suzi | 2006-09-02 | 502 |

| Suzi | 2006-08-29 | 391 |

| Henry | 2006-08-29 | 300 |

| Henry | 2006-08-30 | 203 |

| Henry | 2006-09-01 | 197 |

| Ben | 2006-08-30 | 152 |

| Ben | 2006-08-29 | 131 |

| Henry | 2006-08-26 | 115 |

| Henry | 2006-08-27 | 96 |

| Ben | 2006-09-02 | 79 |

+-------+------------+----------+

SHOULD YOU SORT QUERY RESULTS YOURSELF?

If you’re issuing a SELECT statement from within one of your own programs, you can retrieve an unsorted result set into a data structure, and then sort the data structure using your programming language. But why reinvent the wheel? The MySQL server is built to sort efficiently, and you may as well let it do its job.

A possible exception to this principle occurs when you need to sort a set of rows several different ways. In this case, rather than issuing several queries that differ only in the ORDER BY clause, it might be faster to retrieve the rows once, and re-sort them as necessary within your program. The attractiveness of this strategy generally diminishes if your result sets are very large and sorting them will use lots of memory and processing time.

Using Expressions for Sorting

Problem

You want to sort a query result based on values calculated from a column, rather than using the values actually stored in the column.

Solution

Put the expression that calculates the values in the ORDER BY clause.

Discussion

One of the columns in the mail table shows how large each mail message is, in bytes:

mysql>SELECT * FROM mail;

+---------------------+---------+---------+---------+---------+---------+

| t | srcuser | srchost | dstuser | dsthost | size |

+---------------------+---------+---------+---------+---------+---------+

| 2006-05-11 10:15:08 | barb | saturn | tricia | mars | 58274 |

| 2006-05-12 12:48:13 | tricia | mars | gene | venus | 194925 |

| 2006-05-12 15:02:49 | phil | mars | phil | saturn | 1048 |

| 2006-05-13 13:59:18 | barb | saturn | tricia | venus | 271 |

...

Suppose that you want to retrieve rows for “big” mail messages (defined as those larger than 50,000 bytes), but you want them to be displayed and sorted by sizes in terms of kilobytes, not bytes. In this case, the values to sort are calculated by an expression:

FLOOR((size+1023)/1024)

Wondering about the +1023 in the FLOOR() expression? That’s there so that size values group to the nearest upper boundary of the 1024-byte categories. Without it, the values group by lower boundaries (for example, a 2047-byte message would be reported as having a size of 1 kilobyte rather than 2). This technique is discussed in more detail in Categorizing Noncategorical Data.

There are two ways to use an expression for sorting query results. First, you can put the expression directly in the ORDER BY clause:

mysql>SELECT t, srcuser, FLOOR((size+1023)/1024)

-> FROM mail WHERE size > 50000

-> ORDER BY FLOOR((size+1023)/1024);

+---------------------+---------+-------------------------+

| t | srcuser | FLOOR((size+1023)/1024) |

+---------------------+---------+-------------------------+

| 2006-05-11 10:15:08 | barb | 57 |

| 2006-05-14 14:42:21 | barb | 96 |

| 2006-05-12 12:48:13 | tricia | 191 |

| 2006-05-15 10:25:52 | gene | 976 |

| 2006-05-14 17:03:01 | tricia | 2339 |

+---------------------+---------+-------------------------+

Second, if you are sorting by an expression named in the output column list, you can give it an alias and refer to the alias in the ORDER BY clause:

mysql>SELECT t, srcuser, FLOOR((size+1023)/1024) AS kilobytes

-> FROM mail WHERE size > 50000

-> ORDER BY kilobytes;

+---------------------+---------+-----------+

| t | srcuser | kilobytes |

+---------------------+---------+-----------+

| 2006-05-11 10:15:08 | barb | 57 |

| 2006-05-14 14:42:21 | barb | 96 |

| 2006-05-12 12:48:13 | tricia | 191 |

| 2006-05-15 10:25:52 | gene | 976 |

| 2006-05-14 17:03:01 | tricia | 2339 |

+---------------------+---------+-----------+

Although you can write the ORDER BY clause either way, there are at least two reasons you might prefer to use the alias method:

§ It’s easier to write the alias in the ORDER BY clause than to repeat the (rather cumbersome) expression—and if you change one, you’ll need to change the other.

§ The alias may be useful for display purposes, to provide a more meaningful column label. Note how the third column heading for the second of the two preceding queries is more meaningful.

Displaying One Set of Values While Sorting by Another

Problem

You want to sort a result set using values that you’re not selecting.

Solution

That’s not a problem. You can use columns in the ORDER BY clause that don’t appear in the output column list.

Discussion

ORDERBY is not limited to sorting only those columns named in the output column list. It can sort using values that are“hidden” (that is, not displayed in the query output). This technique is commonly used when you have values that can be represented different ways and you want to display one type of value but sort by another. For example, you may want to display mail message sizes not in terms of bytes, but as strings such as 103K for 103 kilobytes. You can convert a byte count to that kind of value using this expression:

CONCAT(FLOOR((size+1023)/1024),'K')

However, such values are strings, so they sort lexically, not numerically. If you use them for sorting, a value such as 96K sorts after 2339K, even though it represents a smaller number:

mysql>SELECT t, srcuser,

-> CONCAT(FLOOR((size+1023)/1024),'K') AS size_in_K

-> FROM mail WHERE size > 50000

-> ORDER BY size_in_K;

+---------------------+---------+-----------+

| t | srcuser | size_in_K |

+---------------------+---------+-----------+

| 2006-05-12 12:48:13 | tricia | 191K |

| 2006-05-14 17:03:01 | tricia | 2339K |

| 2006-05-11 10:15:08 | barb | 57K |

| 2006-05-14 14:42:21 | barb | 96K |

| 2006-05-15 10:25:52 | gene | 976K |

+---------------------+---------+-----------+

To achieve the desired output order, display the string, but use the actual numeric size for sorting:

mysql>SELECT t, srcuser,

-> CONCAT(FLOOR((size+1023)/1024),'K') AS size_in_K

-> FROM mail WHERE size > 50000

-> ORDER BY size;

+---------------------+---------+-----------+

| t | srcuser | size_in_K |

+---------------------+---------+-----------+

| 2006-05-11 10:15:08 | barb | 57K |

| 2006-05-14 14:42:21 | barb | 96K |

| 2006-05-12 12:48:13 | tricia | 191K |

| 2006-05-15 10:25:52 | gene | 976K |

| 2006-05-14 17:03:01 | tricia | 2339K |

+---------------------+---------+-----------+

Displaying values as strings but sorting them as numbers also can bail you out of some otherwise difficult situations. Members of sports teams typically are assigned a jersey number, which normally you might think should be stored using a numeric column. Not so fast! Some players like to have a jersey number of zero (0), and some like double-zero (00). If a team happens to have players with both numbers, you cannot represent them using a numeric column, because both values will be treated as the same number. The way out of the problem is to store jersey numbers as strings:

CREATE TABLE roster

(

name CHAR(30), # player name

jersey_num CHAR(3) # jersey number

);

Then the jersey numbers will display the same way you enter them, and 0 and 00 will be treated as distinct values. Unfortunately, although representing numbers as strings solves the problem of distinguishing 0 and 00, it introduces a different problem. Suppose that a team has the following players:

mysql>SELECT name, jersey_num FROM roster;

+-----------+------------+

| name | jersey_num |

+-----------+------------+

| Lynne | 29 |

| Ella | 0 |

| Elizabeth | 100 |

| Nancy | 00 |

| Jean | 8 |

| Sherry | 47 |

+-----------+------------+

The problem occurs when you try to sort the team members by jersey number. If those numbers are stored as strings, they’ll sort lexically, and lexical order often differs from numeric order. That’s certainly true for the team in question:

mysql>SELECT name, jersey_num FROM roster ORDER BY jersey_num;

+-----------+------------+

| name | jersey_num |

+-----------+------------+

| Ella | 0 |

| Nancy | 00 |

| Elizabeth | 100 |

| Lynne | 29 |

| Sherry | 47 |

| Jean | 8 |

+-----------+------------+

The values 100 and 8 are out of place. But that’s easily solved. Display the string values, but use the numeric values for sorting. To accomplish this, add zero to the jersey_num values to force a string-to-number conversion:

mysql>SELECT name, jersey_num FROM roster ORDER BY jersey_num+0;

+-----------+------------+

| name | jersey_num |

+-----------+------------+

| Ella | 0 |

| Nancy | 00 |

| Jean | 8 |

| Lynne | 29 |

| Sherry | 47 |

| Elizabeth | 100 |

+-----------+------------+

The technique of displaying one value but sorting by another is also useful when you want to display composite values that are formed from multiple columns but that don’t sort the way you want. For example, the mail table lists message senders using separate srcuser and srchostvalues. If you want to display message senders from the mail table as email addresses in srcuser@srchost format with the username first, you can construct those values using the following expression:

CONCAT(srcuser,'@',srchost)

However, those values are no good for sorting if you want to treat the hostname as more significant than the username. Instead, sort the results using the underlying column values rather than the displayed composite values:

mysql>SELECT t, CONCAT(srcuser,'@',srchost) AS sender, size

-> FROM mail WHERE size > 50000

-> ORDER BY srchost, srcuser;

+---------------------+---------------+---------+

| t | sender | size |

+---------------------+---------------+---------+

| 2006-05-15 10:25:52 | gene@mars | 998532 |

| 2006-05-12 12:48:13 | tricia@mars | 194925 |

| 2006-05-11 10:15:08 | barb@saturn | 58274 |

| 2006-05-14 17:03:01 | tricia@saturn | 2394482 |

| 2006-05-14 14:42:21 | barb@venus | 98151 |

+---------------------+---------------+---------+

The same idea commonly is applied to sorting people’s names. Suppose that you have a table names that contains last and first names. To display rows sorted by last name first, the query is straightforward when the columns are displayed separately:

mysql>SELECT last_name, first_name FROM name

-> ORDER BY last_name, first_name;

+-----------+------------+

| last_name | first_name |

+-----------+------------+

| Blue | Vida |

| Brown | Kevin |

| Gray | Pete |

| White | Devon |

| White | Rondell |

+-----------+------------+

If instead you want to display each name as a single string composed of the first name, a space, and the last name, you can begin the query like this:

SELECT CONCAT(first_name,' ',last_name) AS full_name FROM name ...

But then how do you sort the names so they come out in the last name order? The answer is to display the composite names, but refer to the constituent values in the ORDER BY clause:

mysql>SELECT CONCAT(first_name,' ',last_name) AS full_name

-> FROM name

-> ORDER BY last_name, first_name;

+---------------+

| full_name |

+---------------+

| Vida Blue |

| Kevin Brown |

| Pete Gray |

| Devon White |

| Rondell White |

+---------------+

Controlling Case Sensitivity of String Sorts

Problem

String sorting operations are case-sensitive when you don’t want them to be, or vice versa.

Solution

Alter the comparison characteristics of the sorted values.

Discussion

Chapter 5 discusses how string comparison properties depend on whether the strings are binary or nonbinary:

§ Binary strings are sequences of bytes. They are compared byte by byte using numeric byte values. Character set and lettercase have no meaning for comparisons.

§ Nonbinary strings are sequences of characters. They have a character set and collation and are compared character by character using the order defined by the collation.

These properties apply to string sorting as well, because sorting is based on comparison. To alter the sorting properties of a string column, you must alter its comparison properties. (For a summary of which string data types are binary and nonbinary, see Choosing a String Data Type.)

The examples in this section use a table that has case-insensitive and case-sensitive nonbinary columns, and a binary column:

CREATE TABLE str_val

(

ci_str CHAR(3) CHARACTER SET latin1 COLLATE latin1_swedish_ci,

cs_str CHAR(3) CHARACTER SET latin1 COLLATE latin1_general_cs,

bin_str BINARY(3)

);

Suppose that the table has the following contents:

+--------+--------+---------+

| ci_str | cs_str | bin_str |

+--------+--------+---------+

| AAA | AAA | AAA |

| aaa | aaa | aaa |

| bbb | bbb | bbb |

| BBB | BBB | BBB |

+--------+--------+---------+

Each column contains the same values, but the natural sort orders for the column data types produce three different results:

§ The case-insensitive collation sorts a and A together, placing them before b and B. However, for a given letter, it does not necessarily order one lettercase before another, as shown by the following result:

§ mysql>SELECT ci_str FROM str_val ORDER BY ci_str;

§ +--------+

§ | ci_str |

§ +--------+

§ | AAA |

§ | aaa |

§ | bbb |

§ | BBB |

+--------+

§ The case-sensitive collation puts A and a before B and b, and sorts uppercase before lowercase:

§ mysql>SELECT cs_str FROM str_val ORDER BY cs_str;

§ +--------+

§ | cs_str |

§ +--------+

§ | AAA |

§ | aaa |

§ | BBB |

§ | bbb |

+--------+

§ The binary strings sort numerically. Assuming that uppercase letters have numeric values less than those of lowercase letters, a binary sort results in the following ordering:

§ mysql>SELECT bin_str FROM str_val ORDER BY bin_str;

§ +---------+

§ | bin_str |

§ +---------+

§ | AAA |

§ | BBB |

§ | aaa |

§ | bbb |

+---------+

You get the same result for a nonbinary string column that has a binary collation, as long as the column contains single-byte characters (for example, CHAR(3) CHARACTER SET latin1 COLLATE latin1_bin). For multibyte characters, a binary collation still produces a numeric sort, but the character values use multibyte numbers.

To alter the sorting properties of each column, use the techniques described in Controlling Case Sensitivity in String Comparisons for controlling how string comparisons work:

§ To sort case-insensitive strings in case-sensitive fashion, order the sorted values using a case-sensitive collation:

§ mysql>SELECT ci_str FROM str_val

§ -> ORDER BY ci_str COLLATE latin1_general_cs;

§ +--------+

§ | ci_str |

§ +--------+

§ | AAA |

§ | aaa |

§ | BBB |

§ | bbb |

+--------+

§ To sort case-sensitive strings in case-insensitive fashion, order the sorted values using a case-insensitive collation:

§ mysql>SELECT cs_str FROM str_val

§ -> ORDER BY cs_str COLLATE latin1_swedish_ci;

§ +--------+

§ | cs_str |

§ +--------+

§ | AAA |

§ | aaa |

§ | bbb |

§ | BBB |

+--------+

Another possibility is to sort using values that have all been converted to the same lettercase, which makes lettercase irrelevant:

mysql>SELECT cs_str FROM str_val

-> ORDER BY UPPER(cs_str);

+--------+

| cs_str |

+--------+

| AAA |

| aaa |

| bbb |

| BBB |

+--------+

§ Binary strings sort using numeric byte values, so there is no concept of lettercase involved. However, because letters in different cases have different byte values, comparisons of binary strings effectively are case-sensitive. (That is, a and A are unequal.) To sort binary strings using a case-insensitive ordering, convert them to nonbinary strings and apply an appropriate collation. For example, to perform a case-insensitive sort, use a statement like this:

§ mysql>SELECT bin_str FROM str_val

§ -> ORDER BY CONVERT(bin_str USING latin1) COLLATE latin1_swedish_ci;

§ +---------+

§ | bin_str |

§ +---------+

§ | AAA |

§ | aaa |

§ | bbb |

§ | BBB |

+---------+

If the default collation is case-insensitive (as is true for latin1), you can omit the COLLATE clause.

Date-Based Sorting

Problem

You want to sort rows in temporal order.

Solution

Sort using a date or time data type. If some parts of the values are irrelevant for the sort that you want to accomplish, ignore them.

Discussion

Many database tables include date or time information and it’s very often necessary to sort results in temporal order. MySQL knows how to sort temporal data types, so there’s no special trick to ordering values in DATE, DATETIME, TIME, or TIMESTAMP columns. Let’s begin with a table that contains values for each of those types:

mysql>SELECT * FROM temporal_val;

+------------+---------------------+----------+---------------------+

| d | dt | t | ts |

+------------+---------------------+----------+---------------------+

| 1970-01-01 | 1884-01-01 12:00:00 | 13:00:00 | 1980-01-01 02:00:00 |

| 1999-01-01 | 1860-01-01 12:00:00 | 19:00:00 | 2021-01-01 03:00:00 |

| 1981-01-01 | 1871-01-01 12:00:00 | 03:00:00 | 1975-01-01 04:00:00 |

| 1964-01-01 | 1899-01-01 12:00:00 | 01:00:00 | 1985-01-01 05:00:00 |

+------------+---------------------+----------+---------------------+

Using an ORDER BY clause with any of these columns sorts the values into the appropriate order:

mysql>SELECT * FROM temporal_val ORDER BY d;

+------------+---------------------+----------+---------------------+

| d | dt | t | ts |

+------------+---------------------+----------+---------------------+

| 1964-01-01 | 1899-01-01 12:00:00 | 01:00:00 | 1985-01-01 05:00:00 |

| 1970-01-01 | 1884-01-01 12:00:00 | 13:00:00 | 1980-01-01 02:00:00 |

| 1981-01-01 | 1871-01-01 12:00:00 | 03:00:00 | 1975-01-01 04:00:00 |

| 1999-01-01 | 1860-01-01 12:00:00 | 19:00:00 | 2021-01-01 03:00:00 |

+------------+---------------------+----------+---------------------+

mysql> SELECT * FROM temporal_val ORDER BY dt;

+------------+---------------------+----------+---------------------+

| d | dt | t | ts |

+------------+---------------------+----------+---------------------+

| 1999-01-01 | 1860-01-01 12:00:00 | 19:00:00 | 2021-01-01 03:00:00 |

| 1981-01-01 | 1871-01-01 12:00:00 | 03:00:00 | 1975-01-01 04:00:00 |

| 1970-01-01 | 1884-01-01 12:00:00 | 13:00:00 | 1980-01-01 02:00:00 |

| 1964-01-01 | 1899-01-01 12:00:00 | 01:00:00 | 1985-01-01 05:00:00 |

+------------+---------------------+----------+---------------------+

mysql> SELECT * FROM temporal_val ORDER BY t;

+------------+---------------------+----------+---------------------+

| d | dt | t | ts |

+------------+---------------------+----------+---------------------+

| 1964-01-01 | 1899-01-01 12:00:00 | 01:00:00 | 1985-01-01 05:00:00 |

| 1981-01-01 | 1871-01-01 12:00:00 | 03:00:00 | 1975-01-01 04:00:00 |

| 1970-01-01 | 1884-01-01 12:00:00 | 13:00:00 | 1980-01-01 02:00:00 |

| 1999-01-01 | 1860-01-01 12:00:00 | 19:00:00 | 2021-01-01 03:00:00 |

+------------+---------------------+----------+---------------------+

mysql> SELECT * FROM temporal_val ORDER BY ts;

+------------+---------------------+----------+---------------------+

| d | dt | t | ts |

+------------+---------------------+----------+---------------------+

| 1981-01-01 | 1871-01-01 12:00:00 | 03:00:00 | 1975-01-01 04:00:00 |

| 1970-01-01 | 1884-01-01 12:00:00 | 13:00:00 | 1980-01-01 02:00:00 |

| 1964-01-01 | 1899-01-01 12:00:00 | 01:00:00 | 1985-01-01 05:00:00 |

| 1999-01-01 | 1860-01-01 12:00:00 | 19:00:00 | 2021-01-01 03:00:00 |

+------------+---------------------+----------+---------------------+

Sometimes a temporal sort uses only part of a date or time column. In that case, use an expression that extracts the part or parts you need and sort the result using the expression. Some examples of this are given in the next few recipes.

Sorting by Calendar Day

Problem

You want to sort by day of the calendar year.

Solution

Sort using the month and day of date values, ignoring the year.

Discussion

Sorting in calendar order differs from sorting by date. You need to ignore the year part of the dates and sort using only the month and day to order rows in terms of where they fall during the calendar year. Suppose that you have an event table that looks like this when values are ordered by actual date of occurrence:

mysql>SELECT date, description FROM event ORDER BY date;

+------------+-------------------------------------+

| date | description |

+------------+-------------------------------------+

| 1215-06-15 | Signing of the Magna Carta |

| 1732-02-22 | George Washington's birthday |

| 1776-07-14 | Bastille Day |

| 1789-07-04 | US Independence Day |

| 1809-02-12 | Abraham Lincoln's birthday |

| 1919-06-28 | Signing of the Treaty of Versailles |

| 1944-06-06 | D-Day at Normandy Beaches |

| 1957-10-04 | Sputnik launch date |

| 1958-01-31 | Explorer 1 launch date |

| 1989-11-09 | Opening of the Berlin Wall |

+------------+-------------------------------------+

To put these items in calendar order, sort them by month, and then by day within month:

mysql>SELECT date, description FROM event

-> ORDER BY MONTH(date), DAYOFMONTH(date);

+------------+-------------------------------------+

| date | description |

+------------+-------------------------------------+

| 1958-01-31 | Explorer 1 launch date |

| 1809-02-12 | Abraham Lincoln's birthday |

| 1732-02-22 | George Washington's birthday |

| 1944-06-06 | D-Day at Normandy Beaches |

| 1215-06-15 | Signing of the Magna Carta |

| 1919-06-28 | Signing of the Treaty of Versailles |

| 1789-07-04 | US Independence Day |

| 1776-07-14 | Bastille Day |

| 1957-10-04 | Sputnik launch date |

| 1989-11-09 | Opening of the Berlin Wall |

+------------+-------------------------------------+

MySQL also has a DAYOFYEAR() function that you might suspect would be useful for calendar day sorting:

mysql>SELECT date, description FROM event ORDER BY DAYOFYEAR(date);

+------------+-------------------------------------+

| date | description |

+------------+-------------------------------------+

| 1958-01-31 | Explorer 1 launch date |

| 1809-02-12 | Abraham Lincoln's birthday |

| 1732-02-22 | George Washington's birthday |

| 1944-06-06 | D-Day at Normandy Beaches |

| 1215-06-15 | Signing of the Magna Carta |

| 1919-06-28 | Signing of the Treaty of Versailles |

| 1789-07-04 | US Independence Day |

| 1776-07-14 | Bastille Day |

| 1957-10-04 | Sputnik launch date |

| 1989-11-09 | Opening of the Berlin Wall |

+------------+-------------------------------------+

That appears to work, but only because the table doesn’t have rows in it that expose a problem with using DAYOFYEAR() for sorting: it can generate the same value for different calendar days. For example, February 29 of leap years and March 1 of nonleap years have the same day-of-year value:

mysql>SELECT DAYOFYEAR('1996-02-29'), DAYOFYEAR('1997-03-01');

+-------------------------+-------------------------+

| DAYOFYEAR('1996-02-29') | DAYOFYEAR('1997-03-01') |

+-------------------------+-------------------------+

| 60 | 60 |

+-------------------------+-------------------------+

This property means that DAYOFYEAR() won’t necessarily produce correct results for calendar sorting. It can group dates that actually occur on different calendar days.

If a table represents dates using separate year, month, and day columns, calendar sorting requires no date-part extraction. Just sort the relevant columns directly. For large datasets, sorting using separate date-part columns can be much faster than sorts based on extracting pieces of DATEvalues. There’s no overhead for part extraction, but more important, you can index the date-part columns separately—something not possible with a DATE column. The principle here is that you should design the table to make it easy to extract or sort by the values that you expect to use a lot.

Sorting by Day of Week

Problem

You want to sort rows in day-of-week order.

Solution

Use DAYOFWEEK() to convert a date column to its numeric day-of-week value.

Discussion

Day-of-week sorting is similar to calendar-day sorting, except that you use different functions to get at the relevant ordering values.

You can get the day of the week using DAYNAME(), but that produces strings that sort lexically rather than in day-of-week order (Sunday, Monday, Tuesday, and so forth). Here the technique of displaying one value but sorting by another is useful (see Displaying One Set of Values While Sorting by Another). Display day names using DAYNAME(), but sort in day-of-week order using DAYOFWEEK(), which returns numeric values from 1 to 7 for Sunday through Saturday:

mysql>SELECT DAYNAME(date) AS day, date, description

-> FROM event

-> ORDER BY DAYOFWEEK(date);

+----------+------------+-------------------------------------+

| day | date | description |

+----------+------------+-------------------------------------+

| Sunday | 1809-02-12 | Abraham Lincoln's birthday |

| Sunday | 1776-07-14 | Bastille Day |

| Monday | 1215-06-15 | Signing of the Magna Carta |

| Tuesday | 1944-06-06 | D-Day at Normandy Beaches |

| Thursday | 1989-11-09 | Opening of the Berlin Wall |

| Friday | 1732-02-22 | George Washington's birthday |

| Friday | 1958-01-31 | Explorer 1 launch date |

| Friday | 1957-10-04 | Sputnik launch date |

| Saturday | 1919-06-28 | Signing of the Treaty of Versailles |

| Saturday | 1789-07-04 | US Independence Day |

+----------+------------+-------------------------------------+

If you want to sort rows in day-of-week order but treat Monday as the first day of the week and Sunday as the last, you can use a the MOD() function to map Monday to 0, Tuesday to 1, ..., Sunday to 6:

mysql>SELECT DAYNAME(date), date, description

-> FROM event

-> ORDER BY MOD(DAYOFWEEK(date)+5, 7);

+---------------+------------+-------------------------------------+

| DAYNAME(date) | date | description |

+---------------+------------+-------------------------------------+

| Monday | 1215-06-15 | Signing of the Magna Carta |

| Tuesday | 1944-06-06 | D-Day at Normandy Beaches |

| Thursday | 1989-11-09 | Opening of the Berlin Wall |

| Friday | 1732-02-22 | George Washington's birthday |

| Friday | 1957-10-04 | Sputnik launch date |

| Friday | 1958-01-31 | Explorer 1 launch date |

| Saturday | 1789-07-04 | US Independence Day |

| Saturday | 1919-06-28 | Signing of the Treaty of Versailles |

| Sunday | 1776-07-14 | Bastille Day |

| Sunday | 1809-02-12 | Abraham Lincoln's birthday |

+---------------+------------+-------------------------------------+

The following table shows the DAYOFWEEK() expressions to use for putting any day of the week first in the sort order:

Day to list first

DAYOFWEEK() expression

Sunday

DAYOFWEEK(date)

Monday

MOD(DAYOFWEEK(date)+5, 7)

Tuesday

MOD(DAYOFWEEK(date)+4, 7)

Wednesday

MOD(DAYOFWEEK(date)+3, 7)

Thursday

MOD(DAYOFWEEK(date)+2, 7)

Friday

MOD(DAYOFWEEK(date)+1, 7)

Saturday

MOD(DAYOFWEEK(date)+0, 7)

Another function that you can use for day-of-week sorting is WEEKDAY(), although it returns a different set of values (0 for Monday through 6 for Sunday).

Sorting by Time of Day

Problem

You want to sort rows in time-of-day order.

Solution

Pull out the hour, minute, and second from the column that contains the time, and use them for sorting.

Discussion

Time-of-day sorting can be done different ways, depending on your column type. If the values are stored in a TIME column named timecol, just sort them directly using ORDER BY timecol. To put DATETIME or TIMESTAMP values in time-of-day order, extract the time parts and sort them. For example, the mail table contains DATETIME values, which can be sorted by time of day like this:

mysql>SELECT * FROM mail ORDER BY HOUR(t), MINUTE(t), SECOND(t);

+---------------------+---------+---------+---------+---------+---------+

| t | srcuser | srchost | dstuser | dsthost | size |

+---------------------+---------+---------+---------+---------+---------+

| 2006-05-15 07:17:48 | gene | mars | gene | saturn | 3824 |

| 2006-05-15 08:50:57 | phil | venus | phil | venus | 978 |

| 2006-05-16 09:00:28 | gene | venus | barb | mars | 613 |

| 2006-05-14 09:31:37 | gene | venus | barb | mars | 2291 |

| 2006-05-11 10:15:08 | barb | saturn | tricia | mars | 58274 |

| 2006-05-15 10:25:52 | gene | mars | tricia | saturn | 998532 |

| 2006-05-14 11:52:17 | phil | mars | tricia | saturn | 5781 |

| 2006-05-12 12:48:13 | tricia | mars | gene | venus | 194925 |

...

You can also use TIME_TO_SEC(), which strips off the date part and returns the time part as the corresponding number of seconds:

mysql>SELECT * FROM mail ORDER BY TIME_TO_SEC(t);

+---------------------+---------+---------+---------+---------+---------+

| t | srcuser | srchost | dstuser | dsthost | size |

+---------------------+---------+---------+---------+---------+---------+

| 2006-05-15 07:17:48 | gene | mars | gene | saturn | 3824 |

| 2006-05-15 08:50:57 | phil | venus | phil | venus | 978 |

| 2006-05-16 09:00:28 | gene | venus | barb | mars | 613 |

| 2006-05-14 09:31:37 | gene | venus | barb | mars | 2291 |

| 2006-05-11 10:15:08 | barb | saturn | tricia | mars | 58274 |

| 2006-05-15 10:25:52 | gene | mars | tricia | saturn | 998532 |

| 2006-05-14 11:52:17 | phil | mars | tricia | saturn | 5781 |

| 2006-05-12 12:48:13 | tricia | mars | gene | venus | 194925 |

...

Sorting Using Substrings of Column Values

Problem

You want to sort a set of values using one or more substrings of each value.

Solution

Extract the hunks you want and sort them separately.

Discussion

This is a specific application of sorting by expression value (see Using Expressions for Sorting). If you want to sort rows using just a particular portion of a column’s values, extract the substring you need and use it in the ORDER BY clause. This is easiest if the substrings are at a fixed position and length within the column. For substrings of variable position or length, you may still be able to use them for sorting if there is some reliable way to identify them. The next several recipes show how to use substring extraction to produce specialized sort orders.

Sorting by Fixed-Length Substrings

Problem

You want to sort using parts of a column that occur at a given position within the column.

Solution

Pull out the parts you need with LEFT(), MID(), or RIGHT(), and sort them.

Discussion

Suppose that you have a housewares table that acts as a catalog for houseware furnishings, and that items are identified by 10-character ID values consisting of three subparts: a three-character category abbreviation (such as DIN for “dining room” or KIT for “kitchen”), a five-digit serial number, and a two-character country code indicating where the part is manufactured:

mysql>SELECT * FROM housewares;

+------------+------------------+

| id | description |

+------------+------------------+

| DIN40672US | dining table |

| KIT00372UK | garbage disposal |

| KIT01729JP | microwave oven |

| BED00038SG | bedside lamp |

| BTH00485US | shower stall |

| BTH00415JP | lavatory |

+------------+------------------+

This is not necessarily a good way to store complex ID values, and later we’ll consider how to represent them using separate columns (see Using an AUTO_INCREMENT Column to Create Multiple Sequences). But for now, assume that the values must be stored as just shown.

If you want to sort rows from this table based on the id values, just use the entire column value:

mysql>SELECT * FROM housewares ORDER BY id;

+------------+------------------+

| id | description |

+------------+------------------+

| BED00038SG | bedside lamp |

| BTH00415JP | lavatory |

| BTH00485US | shower stall |

| DIN40672US | dining table |

| KIT00372UK | garbage disposal |

| KIT01729JP | microwave oven |

+------------+------------------+

But you might also have a need to sort on any of the three subparts (for example, to sort by country of manufacture). For that kind of operation, it’s helpful to use functions that pull out pieces of a column, such as LEFT(), MID(), and RIGHT(). These functions can be used to break apart theid values into their three components:

mysql>SELECT id,

-> LEFT(id,3) AS category,

-> MID(id,4,5) AS serial,

-> RIGHT(id,2) AS country

-> FROM housewares;

+------------+----------+--------+---------+

| id | category | serial | country |

+------------+----------+--------+---------+

| DIN40672US | DIN | 40672 | US |

| KIT00372UK | KIT | 00372 | UK |

| KIT01729JP | KIT | 01729 | JP |

| BED00038SG | BED | 00038 | SG |

| BTH00485US | BTH | 00485 | US |

| BTH00415JP | BTH | 00415 | JP |

+------------+----------+--------+---------+

Any of those fixed-length substrings of the id values can be used for sorting, either alone or in combination. To sort by product category, extract the category value and use it in the ORDER BY clause:

mysql>SELECT * FROM housewares ORDER BY LEFT(id,3);

+------------+------------------+

| id | description |

+------------+------------------+

| BED00038SG | bedside lamp |

| BTH00485US | shower stall |

| BTH00415JP | lavatory |

| DIN40672US | dining table |

| KIT00372UK | garbage disposal |

| KIT01729JP | microwave oven |

+------------+------------------+

To sort rows by product serial number, use MID() to extract the middle five characters from the id values, beginning with the fourth:

mysql>SELECT * FROM housewares ORDER BY MID(id,4,5);

+------------+------------------+

| id | description |

+------------+------------------+

| BED00038SG | bedside lamp |

| KIT00372UK | garbage disposal |

| BTH00415JP | lavatory |

| BTH00485US | shower stall |

| KIT01729JP | microwave oven |

| DIN40672US | dining table |

+------------+------------------+

This appears to be a numeric sort, but it’s actually a string sort, because MID() returns strings. It just so happens that the lexical and numeric sort order are the same in this case because the “numbers” have leading zeros to make them all the same length.

To sort by country code, use the rightmost two characters of the id values:

mysql>SELECT * FROM housewares ORDER BY RIGHT(id,2);

+------------+------------------+

| id | description |

+------------+------------------+

| KIT01729JP | microwave oven |

| BTH00415JP | lavatory |

| BED00038SG | bedside lamp |

| KIT00372UK | garbage disposal |

| DIN40672US | dining table |

| BTH00485US | shower stall |

+------------+------------------+

You can also sort using combinations of substrings. For example, to sort by country code and serial number, the query looks like this:

mysql>SELECT * FROM housewares ORDER BY RIGHT(id,2), MID(id,4,5);

+------------+------------------+

| id | description |

+------------+------------------+

| BTH00415JP | lavatory |

| KIT01729JP | microwave oven |

| BED00038SG | bedside lamp |

| KIT00372UK | garbage disposal |

| BTH00485US | shower stall |

| DIN40672US | dining table |

+------------+------------------+

Sorting by Variable-Length Substrings

Problem

You want to sort using parts of a column that do not occur at a given position within the column.

Solution

Figure out some way to identify the parts you need so that you can extract them. Otherwise, you’re out of luck.

Discussion

If the substrings that you want to use for sorting vary in length, you need a reliable means of extracting just the part of the column values that you want. To see how this works, create a housewares2 table that is like the housewares table used in Recipe 7.14, except that it has no leading zeros in the serial number part of the id values:

mysql>SELECT * FROM housewares2;

+------------+------------------+

| id | description |

+------------+------------------+

| DIN40672US | dining table |

| KIT372UK | garbage disposal |

| KIT1729JP | microwave oven |

| BED38SG | bedside lamp |

| BTH485US | shower stall |

| BTH415JP | lavatory |

+------------+------------------+

The category and country parts of the id values can be extracted and sorted using LEFT() and RIGHT(), just as for the housewares table. But now the numeric segments of the values have different lengths and cannot be extracted and sorted using a simple MID() call. Instead, useSUBSTRING() to skip over the first three characters. Then, of the remainder beginning with the fourth character (the first digit), take everything but the rightmost two columns. One way to do this is as follows:

mysql>SELECT id, LEFT(SUBSTRING(id,4),CHAR_LENGTH(SUBSTRING(id,4)-2))

-> FROM housewares2;

+------------+------------------------------------------------------+

| id | LEFT(SUBSTRING(id,4),CHAR_LENGTH(SUBSTRING(id,4)-2)) |

+------------+------------------------------------------------------+

| DIN40672US | 40672 |

| KIT372UK | 372 |

| KIT1729JP | 1729 |

| BED38SG | 38 |

| BTH485US | 485 |

| BTH415JP | 415 |

+------------+------------------------------------------------------+

But that’s more complex than necessary. The SUBSTRING() function takes an optional third argument specifying a desired result length, and we know that the length of the middle part is equal to the length of the string minus five (three for the characters at the beginning and two for the characters at the end). The following query demonstrates how to get the numeric middle part by beginning with the ID, and then stripping off the rightmost suffix:

mysql>SELECT id, SUBSTRING(id,4), SUBSTRING(id,4,CHAR_LENGTH(id)-5)

-> FROM housewares2;

+------------+-----------------+-----------------------------------+

| id | SUBSTRING(id,4) | SUBSTRING(id,4,CHAR_LENGTH(id)-5) |

+------------+-----------------+-----------------------------------+

| DIN40672US | 40672US | 40672 |

| KIT372UK | 372UK | 372 |

| KIT1729JP | 1729JP | 1729 |

| BED38SG | 38SG | 38 |

| BTH485US | 485US | 485 |

| BTH415JP | 415JP | 415 |

+------------+-----------------+-----------------------------------+

Unfortunately, although the final expression correctly extracts the numeric part from the IDs, the resulting values are strings. Consequently, they sort lexically rather than numerically:

mysql>SELECT * FROM housewares2

-> ORDER BY SUBSTRING(id,4,CHAR_LENGTH(id)-5);

+------------+------------------+

| id | description |

+------------+------------------+

| KIT1729JP | microwave oven |

| KIT372UK | garbage disposal |

| BED38SG | bedside lamp |

| DIN40672US | dining table |

| BTH415JP | lavatory |

| BTH485US | shower stall |

+------------+------------------+

How to deal with that? One way is to add zero, which tells MySQL to perform a string-to-number conversion that results in a numeric sort of the serial number values:

mysql>SELECT * FROM housewares2

-> ORDER BY SUBSTRING(id,4,CHAR_LENGTH(id)-5)+0;

+------------+------------------+

| id | description |

+------------+------------------+

| BED38SG | bedside lamp |

| KIT372UK | garbage disposal |

| BTH415JP | lavatory |

| BTH485US | shower stall |

| KIT1729JP | microwave oven |

| DIN40672US | dining table |

+------------+------------------+

But in this particular case, a simpler solution is possible. It’s not necessary to calculate the length of the numeric part of the string, because the string-to-number conversion operation strips off trailing nonnumeric suffixes and provides the values needed to sort on the variable-length serial number portion of the id values. That means the third argument to SUBSTRING() actually isn’t needed:

mysql>SELECT * FROM housewares2

-> ORDER BY SUBSTRING(id,4)+0;

+------------+------------------+

| id | description |

+------------+------------------+

| BED38SG | bedside lamp |

| KIT372UK | garbage disposal |

| BTH415JP | lavatory |

| BTH485US | shower stall |

| KIT1729JP | microwave oven |

| DIN40672US | dining table |

+------------+------------------+

In the preceding example, the ability to extract variable-length substrings is based on the different kinds of characters in the middle of the ID values, compared to the characters on the ends (that is, digits versus nondigits). In other cases, you may be able to use delimiter characters to pull apart column values. For the next examples, assume a housewares3 table with id values that look like this:

mysql>SELECT * FROM housewares3;

+---------------+------------------+

| id | description |

+---------------+------------------+

| 13-478-92-2 | dining table |

| 873-48-649-63 | garbage disposal |

| 8-4-2-1 | microwave oven |

| 97-681-37-66 | bedside lamp |

| 27-48-534-2 | shower stall |

| 5764-56-89-72 | lavatory |

+---------------+------------------+

To extract segments from these values, use SUBSTRING_INDEX( str , c , n ). It searches a string str for the n-th occurrence of a given character c and returns everything to the left of that character. For example, the following call returns 13-478:

SUBSTRING_INDEX('13-478-92-2','-',2)

If n is negative, the search for c proceeds from the right and returns the rightmost string. This call returns 478-92-2:

SUBSTRING_INDEX('13-478-92-2','-',-3)

By combining SUBSTRING_INDEX() calls with positive and negative indexes, it’s possible to extract successive pieces from each id value. One way is to extract the first n segments of the value, and then pull off the rightmost one. By varying n from 1 to 4, we get the successive segments from left to right:

SUBSTRING_INDEX(SUBSTRING_INDEX(id,'-',1),'-',-1)

SUBSTRING_INDEX(SUBSTRING_INDEX(id,'-',2),'-',-1)

SUBSTRING_INDEX(SUBSTRING_INDEX(id,'-',3),'-',-1)

SUBSTRING_INDEX(SUBSTRING_INDEX(id,'-',4),'-',-1)

The first of those expressions can be optimized, because the inner SUBSTRING_INDEX() call returns a single-segment string and is sufficient by itself to return the leftmost id segment:

SUBSTRING_INDEX(id,'-',1)

Another way to obtain substrings is to extract the rightmost n segments of the value, and then pull off the first one. Here we vary n from −4 to −1:

SUBSTRING_INDEX(SUBSTRING_INDEX(id,'-',-4),'-',1)

SUBSTRING_INDEX(SUBSTRING_INDEX(id,'-',-3),'-',1)

SUBSTRING_INDEX(SUBSTRING_INDEX(id,'-',-2),'-',1)

SUBSTRING_INDEX(SUBSTRING_INDEX(id,'-',-1),'-',1)

Again, an optimization is possible. For the fourth expression, the inner SUBSTRING_INDEX() call is sufficient to return the final substring:

SUBSTRING_INDEX(id,'-',-1)

These expressions can be difficult to read and understand, and you probably should try experimenting with a few of them to see how they work. Here is an example that shows how to get the second and fourth segments from the id values:

mysql>SELECT

-> id,

-> SUBSTRING_INDEX(SUBSTRING_INDEX(id,'-',2),'-',-1) AS segment2,

-> SUBSTRING_INDEX(SUBSTRING_INDEX(id,'-',4),'-',-1) AS segment4

-> FROM housewares3;

+---------------+----------+----------+

| id | segment2 | segment4 |

+---------------+----------+----------+

| 13-478-92-2 | 478 | 2 |

| 873-48-649-63 | 48 | 63 |

| 8-4-2-1 | 4 | 1 |

| 97-681-37-66 | 681 | 66 |

| 27-48-534-2 | 48 | 2 |

| 5764-56-89-72 | 56 | 72 |

+---------------+----------+----------+

To use the substrings for sorting, use the appropriate expressions in the ORDER BY clause. (Remember to force a string-to-number conversion by adding zero if you want the sort to be numeric rather than lexical.) The following two queries order the results based on the second id segment. The first sorts lexically, the second numerically:

mysql>SELECT * FROM housewares3

-> ORDER BY SUBSTRING_INDEX(SUBSTRING_INDEX(id,'-',2),'-',-1);

+---------------+------------------+

| id | description |

+---------------+------------------+

| 8-4-2-1 | microwave oven |

| 13-478-92-2 | dining table |

| 873-48-649-63 | garbage disposal |

| 27-48-534-2 | shower stall |

| 5764-56-89-72 | lavatory |

| 97-681-37-66 | bedside lamp |

+---------------+------------------+

mysql> SELECT * FROM housewares3

-> ORDER BY SUBSTRING_INDEX(SUBSTRING_INDEX(id,'-',2),'-',-1)+0;

+---------------+------------------+

| id | description |

+---------------+------------------+

| 8-4-2-1 | microwave oven |

| 873-48-649-63 | garbage disposal |

| 27-48-534-2 | shower stall |

| 5764-56-89-72 | lavatory |

| 13-478-92-2 | dining table |

| 97-681-37-66 | bedside lamp |

+---------------+------------------+

The substring-extraction expressions here are messy, but at least the column values to which we’re applying the expressions have a consistent number of segments. To sort values that have varying numbers of segments, the job can be more difficult. The next section shows an example illustrating why that is.

Sorting Hostnames in Domain Order

Problem

You want to sort hostnames in domain order, with the rightmost parts of the names more significant than the leftmost parts.

Solution

Break apart the names, and sort the pieces from right to left.

Discussion

Hostnames are strings and therefore their natural sort order is lexical. However, it’s often desirable to sort hostnames in domain order, where the rightmost segments of the hostname values are more significant than the leftmost segments. Suppose that you have a table hostname that contains the following names:

mysql>SELECT name FROM hostname ORDER BY name;

+--------------------+

| name |

+--------------------+

| cvs.php.net |

| dbi.perl.org |

| jakarta.apache.org |

| lists.mysql.com |

| mysql.com |

| www.kitebird.com |

+--------------------+

The preceding query demonstrates the natural lexical sort order of the name values. That differs from domain order, as shown by the following table.

Lexical order

Domain order

cvs.php.net

www.kitebird.com

dbi.perl.org

mysql.com

jakarta.apache.org

lists.mysql.com

lists.mysql.com

cvs.php.net

mysql.com

jakarta.apache.org

www.kitebird.com

dbi.perl.org

Producing domain-ordered output is a substring-sorting problem, where it’s necessary to extract each segment of the names so they can be sorted in right-to-left fashion. There is also an additional complication if your values contain different numbers of segments, as our example hostnames do. (Most of them have three segments, but mysql.com has only two.)

To extract the pieces of the hostnames, begin by using SUBSTRING_INDEX() in a manner similar to that described previously in Sorting by Variable-Length Substrings. The hostname values have a maximum of three segments, from which the pieces can be extracted left to right like this:

SUBSTRING_INDEX(SUBSTRING_INDEX(name,'.',-3),'.',1)

SUBSTRING_INDEX(SUBSTRING_INDEX(name,'.',-2),'.',1)

SUBSTRING_INDEX(name,'.',-1)

These expressions work properly as long as all the hostnames have three components. But if a name has fewer than three, you don’t get the correct result, as the following query demonstrates:

mysql>SELECT name,

-> SUBSTRING_INDEX(SUBSTRING_INDEX(name,'.',-3),'.',1) AS leftmost,

-> SUBSTRING_INDEX(SUBSTRING_INDEX(name,'.',-2),'.',1) AS middle,

-> SUBSTRING_INDEX(name,'.',-1) AS rightmost

-> FROM hostname;

+--------------------+----------+----------+-----------+

| name | leftmost | middle | rightmost |

+--------------------+----------+----------+-----------+

| cvs.php.net | cvs | php | net |

| dbi.perl.org | dbi | perl | org |

| lists.mysql.com | lists | mysql | com |

| mysql.com | mysql | mysql | com |

| jakarta.apache.org | jakarta | apache | org |

| www.kitebird.com | www | kitebird | com |

+--------------------+----------+----------+-----------+

Notice the output for the mysql.com row; it has mysql for the value of the leftmost column, where it should have an empty string. The segment-extraction expressions work by pulling off the rightmost n segments, and then returning the leftmost segment of the result. The source of the problem for mysql.com is that if there aren’t n segments, the expression simply returns the leftmost segment of however many there are. To fix this problem, add a sufficient number of periods at the beginning of the hostname values to guarantee that they have the requisite number of segments:

mysql>SELECT name,

-> SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT('..',name),'.',-3),'.',1)

-> AS leftmost,

-> SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT('.',name),'.',-2),'.',1)

-> AS middle,

-> SUBSTRING_INDEX(name,'.',-1) AS rightmost

-> FROM hostname;

+--------------------+----------+----------+-----------+

| name | leftmost | middle | rightmost |

+--------------------+----------+----------+-----------+

| cvs.php.net | cvs | php | net |

| dbi.perl.org | dbi | perl | org |

| lists.mysql.com | lists | mysql | com |

| mysql.com | | mysql | com |

| jakarta.apache.org | jakarta | apache | org |

| www.kitebird.com | www | kitebird | com |

+--------------------+----------+----------+-----------+

That’s pretty ugly. But these expressions do serve to extract the substrings that are needed for sorting hostname values correctly in right-to-left fashion:

mysql>SELECT name FROM hostname

-> ORDER BY

-> SUBSTRING_INDEX(name,'.',-1),

-> SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT('.',name),'.',-2),'.',1),

-> SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT('..',name),'.',-3),'.',1);

+--------------------+

| name |

+--------------------+

| www.kitebird.com |

| mysql.com |

| lists.mysql.com |

| cvs.php.net |

| jakarta.apache.org |

| dbi.perl.org |

+--------------------+

If you had hostnames with a maximum of four segments rather than three, you’d need to add to the ORDER BY clause another SUBSTRING_INDEX() expression that adds three dots at the beginning of the hostname values.

Sorting Dotted-Quad IP Values in Numeric Order

Problem

You want to sort strings that represent IP numbers in numeric order.

Solution

Break apart the strings, and sort the pieces numerically. Or just use INET_ATON().

Discussion

If a table contains IP numbers represented as strings in dotted-quad notation (111.122.133.144), they’ll sort lexically rather than numerically. To produce a numeric ordering instead, you can sort them as four-part values with each part sorted numerically. Or, to be more efficient, you can represent the IP numbers as 32-bit unsigned integers, which take less space and can be ordered by a simple numeric sort. This section shows both methods.

To sort string-valued dotted-quad IP numbers, use a technique similar to that for sorting hostnames, but with the following differences:

§ Dotted quads always have four segments, so there’s no need to add dots to the value before extracting substrings.

§ Dotted quads sort left to right, so the order of the substrings used in the ORDER BY clause is opposite to that used for hostname sorting.

§ The segments of dotted-quad values are numbers, so add zero to each substring to tell MySQL to use a numeric sort rather than a lexical one.

Suppose that you have a hostip table with a string-valued ip column containing IP numbers:

mysql>SELECT ip FROM hostip ORDER BY ip;

+-----------------+

| ip |

+-----------------+

| 127.0.0.1 |

| 192.168.0.10 |

| 192.168.0.2 |

| 192.168.1.10 |

| 192.168.1.2 |

| 21.0.0.1 |

| 255.255.255.255 |

+-----------------+

The preceding query produces output sorted in lexical order. To sort the ip values numerically, you can extract each segment and add zero to convert it to a number using an ORDER BY clause like this:

mysql>SELECT ip FROM hostip

-> ORDER BY

-> SUBSTRING_INDEX(ip,'.',1)+0,

-> SUBSTRING_INDEX(SUBSTRING_INDEX(ip,'.',-3),'.',1)+0,

-> SUBSTRING_INDEX(SUBSTRING_INDEX(ip,'.',-2),'.',1)+0,

-> SUBSTRING_INDEX(ip,'.',-1)+0;

+-----------------+

| ip |

+-----------------+

| 21.0.0.1 |

| 127.0.0.1 |

| 192.168.0.2 |

| 192.168.0.10 |

| 192.168.1.2 |

| 192.168.1.10 |

| 255.255.255.255 |

+-----------------+

However, although that ORDER BY produces a correct result, it involves a lot of messing around. A simpler solution is possible: use the INET_ATON() function to convert network addresses in string form directly to their underlying numeric values and sort those numbers:

mysql>SELECT ip FROM hostip ORDER BY INET_ATON(ip);

+-----------------+

| ip |

+-----------------+

| 21.0.0.1 |

| 127.0.0.1 |

| 192.168.0.2 |

| 192.168.0.10 |

| 192.168.1.2 |

| 192.168.1.10 |

| 255.255.255.255 |

+-----------------+

If you’re tempted to sort by simply adding zero to the ip value and using ORDER BY on the result, consider the values that kind of string-to-number conversion actually produces:

mysql>SELECT ip, ip+0 FROM hostip;

+-----------------+---------+

| ip | ip+0 |

+-----------------+---------+

| 127.0.0.1 | 127 |

| 192.168.0.2 | 192.168 |

| 192.168.0.10 | 192.168 |

| 192.168.1.2 | 192.168 |

| 192.168.1.10 | 192.168 |

| 255.255.255.255 | 255.255 |

| 21.0.0.1 | 21 |

+-----------------+---------+

The conversion retains only as much of each value as can be interpreted as a valid number. The remainder would be unavailable for sorting purposes, even though it’s necessary to produce a correct ordering.

Use of INET_ATON() in the ORDER BY clause is more efficient than six SUBSTRING_INDEX() calls. Moreover, if you’re willing to consider storing IP addresses as numbers rather than as strings, you avoid having to perform any conversion at all when sorting. You gain an additional benefit as well because if you index the column, the query optimizer may be able to use the index for certain queries. Numeric IP addresses have 32 bits, so you can use an INT UNSIGNED column to store them. For cases when you need to display those values in dotted-quad notation, convert them with the INET_NTOA() function.

Floating Values to the Head or Tail of the Sort Order

Problem

You want a column to sort the way it normally does, except for a few values that you want at the beginning or end of the sort order. For example, suppose that you want to sort a list in lexical order except for certain high-priority values that should appear first no matter where they fall in the normal sort order.

Solution

Add another sort column to the ORDER BY clause that places those few values where you want them. The remaining sort columns will have their usual effect for the other values.

Discussion

If you want to sort a result set normally except that you want particular values first, create an additional sort column that is 0 for those values and 1 for everything else. This allows you to float the values to the head of the sort order. To put the values at the tail instead, use the additional column to map the values to 1 and all other values to 0.

For example, when a sorted column contains NULL values, MySQL puts them all together in the sort order (at the beginning for an ascending sort, at the end for a descending sort). It may seem a bit odd that NULL values are grouped, given that (as the following query shows) they are not considered equal in comparisons:

mysql>SELECT NULL = NULL;

+-------------+

| NULL = NULL |

+-------------+

| NULL |

+-------------+

On the other hand, NULL values conceptually do seem more similar to each other than to non-NULL values, and there’s no good way to distinguish one NULL from another, anyway. Normally, NULL values form a group at the beginning of the sort order (or at the end, if you specify DESC). If you want NULL values at a specific end of the sort order, you can force them to be placed where you want. Suppose that you have a table t with the following contents:

mysql>SELECT val FROM t;

+------+

| val |

+------+

| 3 |

| 100 |

| NULL |

| NULL |

| 9 |

+------+

Normally, sorting puts the NULL values at the beginning for an ascending sort:

mysql>SELECT val FROM t ORDER BY val;

+------+

| val |

+------+

| NULL |

| NULL |

| 3 |

| 9 |

| 100 |

+------+

To put them at the end instead, introduce an extra ORDER BY column that maps NULL values to a higher value than non-NULL values:

mysql>SELECT val FROM t ORDER BY IF(val IS NULL,1,0), val;

+------+

| val |

+------+

| 3 |

| 9 |

| 100 |

| NULL |

| NULL |

+------+

The IF() expression creates a new column for the sort that is used as the primary sort value.

For descending sorts, NULL values group at the end. To put them at the beginning instead, use the same technique, but reverse the second and third arguments of the IF() function to map NULL values to a lower value than non-NULL values:

IF(val IS NULL,0,1)

The same technique is useful for floating values other than NULL values to either end of the sort order. Suppose that you want to sort mail table messages in sender/recipient order, but you want to put messages for a particular sender first. In the real world, the most interesting sender might bepostmaster or root. Those names don’t appear in the table, so let’s use phil as the name of interest instead:

mysql>SELECT t, srcuser, dstuser, size

-> FROM mail

-> ORDER BY IF(srcuser='phil',0,1), srcuser, dstuser;

+---------------------+---------+---------+---------+

| t | srcuser | dstuser | size |

+---------------------+---------+---------+---------+

| 2006-05-16 23:04:19 | phil | barb | 10294 |

| 2006-05-12 15:02:49 | phil | phil | 1048 |

| 2006-05-15 08:50:57 | phil | phil | 978 |

| 2006-05-14 11:52:17 | phil | tricia | 5781 |

| 2006-05-17 12:49:23 | phil | tricia | 873 |

| 2006-05-14 14:42:21 | barb | barb | 98151 |

| 2006-05-11 10:15:08 | barb | tricia | 58274 |

| 2006-05-13 13:59:18 | barb | tricia | 271 |

| 2006-05-14 09:31:37 | gene | barb | 2291 |

| 2006-05-16 09:00:28 | gene | barb | 613 |

| 2006-05-15 17:35:31 | gene | gene | 3856 |

| 2006-05-15 07:17:48 | gene | gene | 3824 |

| 2006-05-19 22:21:51 | gene | gene | 23992 |

| 2006-05-15 10:25:52 | gene | tricia | 998532 |

| 2006-05-12 12:48:13 | tricia | gene | 194925 |

| 2006-05-14 17:03:01 | tricia | phil | 2394482 |

+---------------------+---------+---------+---------+

The value of the extra sort column is 0 for rows in which the srcuser value is phil, and 1 for all other rows. By making that the most significant sort column, rows for messages sent by phil float to the top of the output. (To sink them to the bottom instead, either sort the column in reverse order using DESC, or reverse the order of the second and third arguments of the IF() function.)

You can also use this technique for particular conditions, not just specific values. To put first those rows where people sent messages to themselves, do this:

mysql>SELECT t, srcuser, dstuser, size

-> FROM mail

-> ORDER BY IF(srcuser=dstuser,0,1), srcuser, dstuser;

+---------------------+---------+---------+---------+

| t | srcuser | dstuser | size |

+---------------------+---------+---------+---------+

| 2006-05-14 14:42:21 | barb | barb | 98151 |

| 2006-05-19 22:21:51 | gene | gene | 23992 |

| 2006-05-15 17:35:31 | gene | gene | 3856 |

| 2006-05-15 07:17:48 | gene | gene | 3824 |

| 2006-05-12 15:02:49 | phil | phil | 1048 |

| 2006-05-15 08:50:57 | phil | phil | 978 |

| 2006-05-11 10:15:08 | barb | tricia | 58274 |

| 2006-05-13 13:59:18 | barb | tricia | 271 |

| 2006-05-16 09:00:28 | gene | barb | 613 |

| 2006-05-14 09:31:37 | gene | barb | 2291 |

| 2006-05-15 10:25:52 | gene | tricia | 998532 |

| 2006-05-16 23:04:19 | phil | barb | 10294 |

| 2006-05-14 11:52:17 | phil | tricia | 5781 |

| 2006-05-17 12:49:23 | phil | tricia | 873 |

| 2006-05-12 12:48:13 | tricia | gene | 194925 |

| 2006-05-14 17:03:01 | tricia | phil | 2394482 |

+---------------------+---------+---------+---------+

If you have a pretty good idea about the contents of your table, you can sometimes eliminate the extra sort column. For example, srcuser is never NULL in the mail table, so the previous query can be rewritten as follows to use one less column in the ORDER BY clause (this relies on the property that NULL values sort ahead of all non-NULL values):

mysql>SELECT t, srcuser, dstuser, size

-> FROM mail

-> ORDER BY IF(srcuser=dstuser,NULL,srcuser), dstuser;

+---------------------+---------+---------+---------+

| t | srcuser | dstuser | size |

+---------------------+---------+---------+---------+

| 2006-05-14 14:42:21 | barb | barb | 98151 |

| 2006-05-19 22:21:51 | gene | gene | 23992 |

| 2006-05-15 17:35:31 | gene | gene | 3856 |

| 2006-05-15 07:17:48 | gene | gene | 3824 |

| 2006-05-12 15:02:49 | phil | phil | 1048 |

| 2006-05-15 08:50:57 | phil | phil | 978 |

| 2006-05-11 10:15:08 | barb | tricia | 58274 |

| 2006-05-13 13:59:18 | barb | tricia | 271 |

| 2006-05-16 09:00:28 | gene | barb | 613 |

| 2006-05-14 09:31:37 | gene | barb | 2291 |

| 2006-05-15 10:25:52 | gene | tricia | 998532 |

| 2006-05-16 23:04:19 | phil | barb | 10294 |

| 2006-05-14 11:52:17 | phil | tricia | 5781 |

| 2006-05-17 12:49:23 | phil | tricia | 873 |

| 2006-05-12 12:48:13 | tricia | gene | 194925 |

| 2006-05-14 17:03:01 | tricia | phil | 2394482 |

+---------------------+---------+---------+---------+

See Also

The technique of introducing additional sort columns is useful with UNION queries that produce the union of multiple SELECT statements. You can cause the SELECT results to appear one after the other and sort the rows within each individual SELECT. See Combining Several Result Sets in a Single Query for details.

Sorting in User-Defined Orders

Problem

You want to define a nonstandard sort order for the values in a column.

Solution

Use FIELD() to map column values to a sequence that places the values in the desired order.

Discussion

Recipe 7.14 showed how to make a specific group of rows go to the head of the sort order. If you want to impose a specific order on all values in a column, use the FIELD() function to map them to a list of numeric values and use the numbers for sorting. FIELD() compares its first argument to the following arguments and returns a number indicating which one of them it matches. The following FIELD() call compares value to str1, str2, str3, and str4, and returns 1, 2, 3, or 4, depending on which one of them value is equal to:

FIELD(value,str1,str2,str3,str4)

The number of comparison values need not be four; FIELD() takes a variable-length argument list. If value is NULL or none of the values match, FIELD() returns 0.

FIELD() can be used to sort an arbitrary set of values into any order you please. For example, to display driver_log rows for Henry, Suzi, and Ben, in that order, do this:

mysql>SELECT * FROM driver_log

-> ORDER BY FIELD(name,'Henry','Suzi','Ben');

+--------+-------+------------+-------+

| rec_id | name | trav_date | miles |

+--------+-------+------------+-------+

| 10 | Henry | 2006-08-30 | 203 |

| 8 | Henry | 2006-09-01 | 197 |

| 6 | Henry | 2006-08-26 | 115 |

| 4 | Henry | 2006-08-27 | 96 |

| 3 | Henry | 2006-08-29 | 300 |

| 7 | Suzi | 2006-09-02 | 502 |

| 2 | Suzi | 2006-08-29 | 391 |

| 5 | Ben | 2006-08-29 | 131 |

| 9 | Ben | 2006-09-02 | 79 |

| 1 | Ben | 2006-08-30 | 152 |

+--------+-------+------------+-------+

You can use FIELD() with column substrings, too. To sort items from the housewares table by country of manufacture using the order US, UK, JP, SG, do this:

mysql>SELECT id, description FROM housewares

-> ORDER BY FIELD(RIGHT(id,2),'US','UK','JP','SG');

+------------+------------------+

| id | description |

+------------+------------------+

| DIN40672US | dining table |

| BTH00485US | shower stall |

| KIT00372UK | garbage disposal |

| KIT01729JP | microwave oven |

| BTH00415JP | lavatory |

| BED00038SG | bedside lamp |

+------------+------------------+

More generally, FIELD() can be used to sort any kind of category-based values into a specific order when the categories don’t sort naturally into that order.

Sorting ENUM Values

Problem

ENUMvalues don’t sort like other string columns.

Solution

Learn how they work, and exploit those properties to your own advantage.

Discussion

ENUM is considered a string data type, but ENUM values actually are stored numerically with values ordered the same way they are listed in the table definition. These numeric values affect how enumerations are sorted, which can be very useful. Suppose that you have a table named weekdaycontaining an enumeration column day that has weekday names as its members:

CREATE TABLE weekday

(

day ENUM('Sunday','Monday','Tuesday','Wednesday',

'Thursday','Friday','Saturday')

);

Internally, MySQL defines the enumeration values Sunday through Saturday in that definition to have numeric values from 1 to 7. To see this for yourself, create the table using the definition just shown, and then insert into it a row for each day of the week. However, to make the insertion order differ from sorted order (so that you can see the effect of sorting), add the days in random order:

mysql>INSERT INTO weekday (day) VALUES('Monday'),('Friday'),

-> ('Tuesday'), ('Sunday'), ('Thursday'), ('Saturday'), ('Wednesday');

Then select the values, both as strings and as the internal numeric value (the latter are obtained by using +0 to effect a string-to-number conversion):

mysql>SELECT day, day+0 FROM weekday;

+-----------+-------+

| day | day+0 |

+-----------+-------+

| Monday | 2 |

| Friday | 6 |

| Tuesday | 3 |

| Sunday | 1 |

| Thursday | 5 |

| Saturday | 7 |

| Wednesday | 4 |

+-----------+-------+

Notice that because the query includes no ORDER BY clause, the rows are returned in unsorted order. If you add an ORDER BY day clause, it becomes apparent that MySQL uses the internal numeric values for sorting:

mysql>SELECT day, day+0 FROM weekday ORDER BY day;

+-----------+-------+

| day | day+0 |

+-----------+-------+

| Sunday | 1 |

| Monday | 2 |

| Tuesday | 3 |

| Wednesday | 4 |

| Thursday | 5 |

| Friday | 6 |

| Saturday | 7 |

+-----------+-------+

What about occasions when you do want to sort ENUM values in lexical order? Force them to be treated as strings for sorting using the CAST() function:

mysql>SELECT day, day+0 FROM weekday ORDER BY CAST(day AS CHAR);

+-----------+-------+

| day | day+0 |

+-----------+-------+

| Friday | 6 |

| Monday | 2 |

| Saturday | 7 |

| Sunday | 1 |

| Thursday | 5 |

| Tuesday | 3 |

| Wednesday | 4 |

+-----------+-------+

If you always (or nearly always) sort a nonenumeration column in a specific nonlexical order, consider changing the data type to ENUM, with its values listed in the desired sort order. To see how this works, create a color table containing a string column, and populate it with some sample rows:

mysql>CREATE TABLE color (name CHAR(10));

mysql> INSERT INTO color (name) VALUES ('blue'),('green'),

-> ('indigo'),('orange'),('red'),('violet'),('yellow');

Sorting by the name column at this point produces lexical order because the column contains CHAR values:

mysql>SELECT name FROM color ORDER BY name;

+--------+

| name |

+--------+

| blue |

| green |

| indigo |

| orange |

| red |

| violet |

| yellow |

+--------+

Now suppose that you want to sort the column by the order in which colors occur in the rainbow. (This order is given by the name “Roy G. Biv,” where successive letters of that name indicate the first letter of the corresponding color name.) One way to produce a rainbow sort is to useFIELD():

mysql>SELECT name FROM color

-> ORDER BY

-> FIELD(name,'red','orange','yellow','green','blue','indigo','violet');

+--------+

| name |

+--------+

| red |

| orange |

| yellow |

| green |

| blue |

| indigo |

| violet |

+--------+

To accomplish the same end without FIELD(), use ALTER TABLE to convert the name column to an ENUM that lists the colors in the desired sort order:

mysql>ALTER TABLE color

-> MODIFY name

-> ENUM('red','orange','yellow','green','blue','indigo','violet');

After converting the table, sorting on the name column produces rainbow sorting naturally with no special treatment:

mysql>SELECT name FROM color ORDER BY name;

+--------+

| name |

+--------+

| red |

| orange |

| yellow |

| green |

| blue |

| indigo |

| violet |

+--------+