Working with Database Structures - Using MySQL - Learning MySQL (2007)

Learning MySQL (2007)

Part II. Using MySQL

Chapter 6. Working with Database Structures

This chapter shows you how to create your own databases, add and remove structures such as tables and indexes, and make choices about column types in your tables. It focuses on the syntax and features of SQL, and not the semantics of conceiving, specifying, and refining a database design; you’ll find an introductory description of database design techniques in Chapter 4. To work through this chapter, you need to understand how to work with an existing database and its tables, as discussed in Chapter 5.

This chapter lists the structures in the sample music database used in this book; detail on how to load the database is presented in Chapter 2. If you’ve followed those instructions, you’ll already have the database available and know how to restore the database after you’ve modified its structures.

When you finish this chapter, you’ll have all the basics required to create, modify, and delete database structures. Together with the techniques you learned in Chapter 5, you’ll have the skills to carry out a wide range of basic operations. Chapters 7, 8, and 9 cover skills that allow you to do more advanced operations with MySQL.

Creating and Using Databases

When you’ve finished designing a database, the first practical step to take with MySQL is to create it. You do this with the CREATE DATABASE statement. Suppose you want to create a database with the name lucy. Here’s the statement you’d type in the monitor:

mysql> CREATE DATABASE lucy;

Query OK, 1 row affected (0.10 sec)

We assume here that you know how to connect to and use the monitor, as described in Chapter 3. We also assume that you’re able to connect as the root user or as another user who can create, delete, and modify structures (you’ll find a detailed discussion on user privileges in Chapter 9). Note that when you create the database, MySQL says that one row was affected. This isn’t in fact a normal row in any specific database—but a new entry added to the list that you see with SHOW DATABASES.

Behind the scenes, MySQL creates a new directory under the data directory for the new database and stores the text file db.opt that lists the database options; for example, the file might contain:

default-character-set=latin1

default-collation=latin1_swedish_ci

These particular two lines specify the default character set and collation of the new database. We’ll look at what these mean later, but you generally won’t need to know much about the db.opt file or access it directly.

Once you’ve created the database, the next step is to use it—that is, choose it as the database you’re working with. You do this with the MySQL command:

mysql> USE lucy;

Database changed

As discussed previously in Chapter 5, this command must be entered on one line and need not be terminated with a semicolon, though we usually do so automatically through habit. Once you’ve used the database, you can start creating tables, indexes, and other structures using the steps discussed next in Creating Tables.”

Before we move on to creating other structures, let’s discuss a few features and limitations of creating databases. First, let’s see what happens if you create a database that already exists:

mysql> CREATE DATABASE lucy;

ERROR 1007 (HY000): Can't create database 'lucy'; database exists

You can avoid this error by adding the IF NOT EXISTS keyword phrase to the statement:

mysql> CREATE DATABASE IF NOT EXISTS lucy;

Query OK, 0 rows affected (0.00 sec)

You can see that MySQL didn’t complain, but it didn’t do anything either: the 0 rows affected message indicates that no data was changed. This addition is useful when you’re adding SQL statements to a script: it prevents the script from aborting on error.

Let’s discuss how to choose database names and the use of character case. Database names define physical directory (or folder) names on disk. On some operating systems, directory names are case-sensitive; on others, case doesn’t matter. For example, Unix-like systems such as Linux and Mac OS X are typically case-sensitive, while Windows isn’t. The result is that database names have the same restrictions: when case matters to the operating system, it matters to MySQL. For example, on a Linux machine, LUCY, lucy, and Lucy are different database names; on Windows, they refer to just one database. Using incorrect capitalization under Linux or Mac OS X will cause MySQL to complain:

mysql> select artIst.Artist_id from ARTist;

ERROR 1146 (42S02): Table 'music.ARTist' doesn't exist

but under Windows, this will normally work. To make your SQL machine-independent, we recommend that you consistently use lowercase names for databases (and for tables, columns, aliases, and indexes).

There are other restrictions on database names. They can be at most 64 characters in length. You also shouldn’t use MySQL reserved words—such as SELECT, FROM, and USE—as names for structures; these can confuse the MySQL parser, making it impossible to interpret the meaning of your statements. There’s a way around this problem: you can enclose the reserved word with the backtick symbol (‵) on either side, but it’s more trouble remembering to do so than it’s worth. In addition, you can’t use selected characters in the names: specifically, you can’t use the forward slash, backward slash, semicolon, and period characters, and a database name can’t end in whitespace. Again, the use of these characters confuses the MySQL parser and can result in unpredictable behavior. For example, here’s what happens when you insert a semicolon into a database name:

mysql> CREATE DATABASE IF NOT EXISTS lu;cy;

Query OK, 1 row affected (0.00 sec)

ERROR 1064 (42000): You have an error in your SQL syntax. Check the manual

that corresponds to your MySQL server version for the right syntax to use

near 'cy' at line 1

Since more than one SQL statement can be on a single line, the result is that a database lu is created, and then an error is generated by the very short, unexpected SQL statement cy;.

Creating Tables

This section covers topics on table structures. We show you how to:

§ Create tables, through introductory examples

§ Choose names for tables and table-related structures

§ Understand and choose column types

§ Understand and choose keys and indexes

§ Use the proprietary MySQL AUTO_INCREMENT feature

When you finish this section, you’ll have completed all of the basic material on creating database structures; the remainder of this chapter covers the sample music database used in the book, and how to alter and remove existing structures.

Basics

For our examples in this section, we’ll assume that the database music hasn’t been created. If you want to follow the examples, and you have already loaded the database, you can drop it for this section and reload it later; dropping it removes the database, tables, and all of the data, but the original is easy to restore by following the steps in Chapter 2. Here’s how you drop it temporarily:

mysql> DROP DATABASE music;

Query OK, 4 rows affected (0.06 sec)

The DROP statement is discussed further at the end of this chapter in Deleting Structures.”

To begin, create the database music using the statement:

mysql> CREATE DATABASE music;

Query OK, 1 row affected (0.00 sec)

Then select the database with:

mysql> USE music;

Database changed

We’re now ready to begin creating the tables that’ll hold our data. Let’s create a table to hold artist details. Here’s the statement that we use:

mysql> CREATE TABLE artist (

-> artist_id SMALLINT(5) NOT NULL DEFAULT 0,

-> artist_name CHAR(128) DEFAULT NULL,

-> PRIMARY KEY (artist_id)

-> );

Query OK, 0 rows affected (0.06 sec)

Don’t panic: even though MySQL reports that zero rows were affected, it’s definitely created the table:

mysql> SHOW TABLES;

+-----------------+

| Tables_in_music |

+-----------------+

| artist |

+-----------------+

1 row in set (0.00 sec)

Let’s consider all this in detail. The CREATE TABLE statement has three major sections:

1. The CREATE TABLE statement, which is followed by the table name to create. In this example, it’s artist.

2. A list of one or more columns to add to the table. In this example, we’ve added two: artist_id SMALLINT(5) NOT NULL DEFAULT 0 and artist_name CHAR(128) default NULL. We’ll discuss these in a moment.

3. Optional key definitions. In this example, we’ve defined a single key: PRIMARY KEY (artist_id). We’ll discuss keys and indexes in detail later in this section.

Notice that the CREATE TABLE component is followed by an opening parenthesis that’s matched by a closing parenthesis at the end of the statement. Notice also that the other components are separated by commas. There are other elements that you can add to a CREATE TABLE statement, and we’ll discuss some in a moment.

Let’s discuss the column specifications. The basic syntax is as follows: name type [NOT NULL | NULL] [DEFAULT value]. The name field is the column name, and it has the same limitations as database names, as discussed in the previous section. It can be at most 64 characters in length, backward and forward slashes aren’t allowed, periods aren’t allowed, it can’t end in whitespace, and case sensitivity is dependent on the underlying operating system. The type defines how and what is stored in the column; for example, we’ve seen that it can be set to CHAR for strings,SMALLINT for numbers, or TIMESTAMP for a date and time.

If you specify NOT NULL, a row isn’t valid without a value for the column; if you specify NULL or omit the clause, a row can exist without a value for the column. If you specify a value with the DEFAULT clause, it’ll be used to populate the column when you don’t otherwise provide data; this is particularly useful when you frequently reuse a default value such as a country name. The value must be a constant (such as 0, "cat", or 20060812045623), except if the column is of the type TIMESTAMP. Types are discussed in detail later in this section.

The NOT NULL and DEFAULT features can be used together. If you specify NOT NULL and add a DEFAULT value, the default is used when you don’t provide a value for the column. Sometimes, this works fine:

mysql> INSERT INTO artist SET artist_name = "Duran Duran";

Query OK, 1 row affected (0.05 sec)

And sometimes it doesn’t:

mysql> INSERT INTO artist SET artist_name = "Bob The Builder";

ERROR 1062 (23000): Duplicate entry '0' for key 1

Whether it works or not is dependent on the underlying constraints and conditions of the database: in this example, artist_id has a default value of 0, but it’s also the primary key. Having two rows with the same primary-key value isn’t permitted, and so the second attempt to insert a row with no values (and a resulting primary-key value of 0) fails. We discuss primary keys in detail later in this section.

Column names have fewer restrictions than database and table names. What’s more, they’re not dependent on the operating system: the names are case-insensitive and portable across all platforms. All characters are allowed in column names, though if you want terminate them withwhitespace or include periods (or other special characters such as the semicolon), you’ll need to enclose the name with a backtick symbol (`) on either side. We recommend that you consistently choose lowercase names for developer-driven choices (such as database, alias, and table names) and avoid characters that require you to remember to use backticks. We also recommend being descriptive with your choices: name doesn’t mean much outside of the context of the artist table, but artist_name has universal meaning across the music database. We like using the underscore character to separate words, but that’s just a matter of style and taste; you could use underscores or dashes, or omit the word-separating formatting altogether. As with database and table names, the longest column name is 64 characters in length.

Collation and Character Sets

Because not everyone wants to store English strings, it’s important that a database server be able to manage non-English characters and different ways of sorting characters. When you’re comparing or sorting strings, how MySQL evaluates the result depends on the character set and collationused. Character sets define what characters can be stored; for example, you may need to store non-English characters such as ٱ or ü. A collation defines how strings are ordered, and there are different collations for different languages: for example, the position of the character ü in the alphabet is different in two German orderings, and different again in Swedish and Finnish.

In our previous string-comparison examples, we ignored the collation and character-set issue, and just let MySQL use its defaults; the default character set is latin1, and the default collation is latin1_swedish_ci. MySQL can be configured to use different character sets and collation orders at the connection, database, table, and column levels.

You can list the character sets available on your server with the SHOW CHARACTER SET command. This shows a short description for each character set, its default collation, and the maximum number of bytes used for each character in that character set:

mysql> SHOW CHARACTER SET;

+----------+-----------------------------+---------------------+--------+

| Charset | Description | Default collation | Maxlen |

+----------+-----------------------------+---------------------+--------+

| big5 | Big5 Traditional Chinese | big5_chinese_ci | 2 |

| dec8 | DEC West European | dec8_swedish_ci | 1 |

| cp850 | DOS West European | cp850_general_ci | 1 |

| hp8 | HP West European | hp8_english_ci | 1 |

| koi8r | KOI8-R Relcom Russian | koi8r_general_ci | 1 |

| latin1 | cp1252 West European | latin1_swedish_ci | 1 |

| latin2 | ISO 8859-2 Central European | latin2_general_ci | 1 |

| swe7 | 7bit Swedish | swe7_swedish_ci | 1 |

| ascii | US ASCII | ascii_general_ci | 1 |

| ujis | EUC-JP Japanese | ujis_japanese_ci | 3 |

| sjis | Shift-JIS Japanese | sjis_japanese_ci | 2 |

| hebrew | ISO 8859-8 Hebrew | hebrew_general_ci | 1 |

| tis620 | TIS620 Thai | tis620_thai_ci | 1 |

| euckr | EUC-KR Korean | euckr_korean_ci | 2 |

| koi8u | KOI8-U Ukrainian | koi8u_general_ci | 1 |

| gb2312 | GB2312 Simplified Chinese | gb2312_chinese_ci | 2 |

| greek | ISO 8859-7 Greek | greek_general_ci | 1 |

| cp1250 | Windows Central European | cp1250_general_ci | 1 |

| gbk | GBK Simplified Chinese | gbk_chinese_ci | 2 |

| latin5 | ISO 8859-9 Turkish | latin5_turkish_ci | 1 |

| armscii8 | ARMSCII-8 Armenian | armscii8_general_ci | 1 |

| utf8 | UTF-8 Unicode | utf8_general_ci | 3 |

| ucs2 | UCS-2 Unicode | ucs2_general_ci | 2 |

| cp866 | DOS Russian | cp866_general_ci | 1 |

| keybcs2 | DOS Kamenicky Czech-Slovak | keybcs2_general_ci | 1 |

| macce | Mac Central European | macce_general_ci | 1 |

| macroman | Mac West European | macroman_general_ci | 1 |

| cp852 | DOS Central European | cp852_general_ci | 1 |

| latin7 | ISO 8859-13 Baltic | latin7_general_ci | 1 |

| cp1251 | Windows Cyrillic | cp1251_general_ci | 1 |

| cp1256 | Windows Arabic | cp1256_general_ci | 1 |

| cp1257 | Windows Baltic | cp1257_general_ci | 1 |

| binary | Binary pseudo charset | binary | 1 |

| geostd8 | GEOSTD8 Georgian | geostd8_general_ci | 1 |

| cp932 | SJIS for Windows Japanese | cp932_japanese_ci | 2 |

| eucjpms | UJIS for Windows Japanese | eucjpms_japanese_ci | 3 |

+----------+-----------------------------+---------------------+--------+

36 rows in set (0.30 sec)

For example, the latin1 character set is actually the Windows code page 1252 that supports West European languages. The default collation for this character set is latin1_swedish_ci, which follows Swedish conventions to sort accented characters (English is handled as you’d expect). This collation is case-insensitive, as indicated by the letters ci. Finally, each character takes up one byte. By comparison, if you use the ucs2 character set, each character would take up to two bytes of storage.

Similarly, you can list the collation orders and the character sets they apply to:

mysql> SHOW COLLATION;

+----------------------+----------+-----+---------+----------+---------+

| Collation | Charset | Id | Default | Compiled | Sortlen |

+----------------------+----------+-----+---------+----------+---------+

| big5_chinese_ci | big5 | 1 | Yes | Yes | 1 |

...

| latin1_german1_ci | latin1 | 5 | | | 0 |

| latin1_swedish_ci | latin1 | 8 | Yes | Yes | 1 |

| latin1_danish_ci | latin1 | 15 | | | 0 |

| latin1_german2_ci | latin1 | 31 | | Yes | 2 |

| latin1_bin | latin1 | 47 | | Yes | 1 |

| latin1_general_ci | latin1 | 48 | | | 0 |

| latin1_general_cs | latin1 | 49 | | | 0 |

..

| hebrew_general_ci | hebrew | 16 | Yes | | 0 |

...

| gb2312_chinese_ci | gb2312 | 24 | Yes | Yes | 1 |

...

| utf8_persian_ci | utf8 | 208 | | Yes | 8 |

| utf8_esperanto_ci | utf8 | 209 | | Yes | 8 |

...

| eucjpms_japanese_ci | eucjpms | 97 | Yes | Yes | 1 |

| eucjpms_bin | eucjpms | 98 | | Yes | 1 |

+----------------------+----------+-----+---------+----------+---------+

126 rows in set (0.02 sec)

You can see the current defaults on your server as follows:

mysql> SHOW VARIABLES LIKE 'c%';

+--------------------------+----------------------------+

| Variable_name | Value |

+--------------------------+----------------------------+

| character_set_client | latin1 |

| character_set_connection | latin1 |

| character_set_database | latin1 |

| character_set_filesystem | binary |

| character_set_results | latin1 |

| character_set_server | latin1 |

| character_set_system | utf8 |

| character_sets_dir | /usr/share/mysql/charsets/ |

| collation_connection | latin1_swedish_ci |

| collation_database | latin1_swedish_ci |

| collation_server | latin1_swedish_ci |

...

+--------------------------+----------------------------+

14 rows in set (0.00 sec)

When you’re creating a database, you can set the default character set and sort order for the database and its tables. For example, if you want to use the latin1 character set and the latin1_swedish_cs (case-sensitive) collation order, you would write:

mysql> CREATE DATABASE rose DEFAULT CHARACTER SET latin1 COLLATE latin1_swedish_cs;

Query OK, 1 row affected (0.00 sec)

As we’ve previously discussed, there’s no need to do this if you’ve installed your MySQL correctly for your language and region, and if you’re not planning on internationalizing your application. You can also control the character set and collation for individual tables or columns, but we won’t go into the detail of how to do that here.

Other Features

This section briefly describes other features of the MySQL CREATE TABLE statement. It includes an example using the IF NOT EXISTS feature, and a list of advanced features and where to find more about them in this book.

You can use the IF NOT EXISTS keyword phrase when creating a table, and it works much as it does for databases. Here’s an example that won’t report an error even when the artist table exists:

mysql> CREATE TABLE IF NOT EXISTS artist (

-> artist_id SMALLINT(5) NOT NULL DEFAULT 0,

-> artist_name CHAR(128) DEFAULT NULL,

-> PRIMARY KEY (artist_id)

-> );

Query OK, 0 rows affected (0.00 sec)

It’s actually hard to tell success from failure here: zero rows are affected whether or not the table exists, and no warning is reported when the table does exist.

There are a wide range of additional features you can add to a CREATE TABLE statement. Many of these are advanced and aren’t discussed in this book, but you can find more information in the MySQL manual under the heading “CREATE TABLE syntax.” These additional features include:

The AUTO_INCREMENT feature for numeric columns

This feature allows you to automatically create unique identifiers for a table. We discuss it in detail later in this chapter in The AUTO_INCREMENT Feature.”

Column comments

You can add a comment to a column; this is displayed when you use the SHOW CREATE TABLE command that we discuss later in this section.

Foreign key constraints

You can tell MySQL to check whether data in one or more columns matches data in another table. For example, you might want to prevent an album from being added to the music database unless there’s a matching artist in the artist table. As we explain in Table Types,” we don’t recommend using foreign key constraints for most applications. This feature is currently supported for only the InnoDB table type.

Creating temporary tables

If you create a table using the keyword phrase CREATE TEMPORARY TABLE, it’ll be removed (dropped) when the monitor connection is closed. This is useful for copying and reformatting data because you don’t have to remember to clean up.

Advanced table options

You can control a wide range of features of the table using table options. These include the starting value of AUTO_INCREMENT, the way indexes and rows are stored, and options to override the information that the MySQL query optimizer gathers from the table.

Control over index structures

Since MySQL 4.1, for some table types, you’ve been able to control what type of internal structure—such as a B-tree or hash table—MySQL uses for its indexes. You can also tell MySQL that you want a full text or spatial data index on a column, allowing special types of search.

You can check the CREATE TABLE statement for a table using the SHOW CREATE TABLE statement introduced in Chapter 5. This often shows you output that includes some of the advanced features we’ve just discussed; the output rarely matches what you actually typed to create the table. Here’s an example for the artist table:

mysql> SHOW CREATE TABLE artist;

+--------+------------------------------------------------+

| Table | Create Table |

+--------+------------------------------------------------+

| artist | CREATE TABLE `artist` (

`artist_id` smallint(5) NOT NULL default '0',

`artist_name` char(128) default NULL,

PRIMARY KEY (`artist_id`)

) ENGINE=MyISAM DEFAULT CHARSET=latin1 |

+--------+------------------------------------------------+

1 row in set (0.08 sec)

We’ve reformatted the output slightly to fit better in this book. You’ll notice that the output includes content added by MySQL that wasn’t in our original CREATE TABLE statement:

§ The names of the table and columns are enclosed in backticks. This isn’t necessary, but it does avoid any parsing problems that can occur through using reserved words and special characters, as discussed previously

§ An additional default ENGINE clause is included, which explicitly states the table type that should be used. The setting in a default installation of MySQL is MyISAM, so it has no effect in this example

§ An additional DEFAULT CHARSET=latin1 clause is included, which tells MySQL what character set is used by the columns in the table. Again, this has no effect in a default, Latin-character-set-based installation

Column Types

This section describes the column types you can use in MySQL. It explains when each should be used and any limitations it has. We’ve ordered the choices in two sections: first, the commonly used, and, second, the less frequently used choices. Skip the second part if you want to and revisit it when one of the common choices doesn’t fit your needs; it’s certainly worth reviewing when you’re tackling the exercises at the end of this chapter.

Common column types

The following are the six commonly used column types in MySQL tables:

INT[(width)] [UNSIGNED] [ZEROFILL]

The most commonly used numeric type. Stores integer (whole number) values in the range –2,147,483,648 to 2,147,483,647. If the optional UNSIGNED keyword is added, the range is 0 to 4,294,967,295. The keyword INT is short for INTEGER, and they can be used interchangeably. AnINT column requires four bytes of storage space.

You can also include optional width and ZEROFILL arguments to left-pad the values with zeros up to the specified length. The maximum width is 255. The width parameter has no effect on what is stored. If you store a value wider than the width, the width value is ignored. Consider this example:

mysql> CREATE TABLE numbers (my_number INT(4) ZEROFILL );

Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO numbers VALUES(3),(33),(333),(3333),(33333),(333333);

Query OK, 6 rows affected (0.00 sec)

Records: 6 Duplicates: 0 Warnings: 0

mysql> SELECT * FROM numbers;

+-----------+

| my_number |

+-----------+

| 0003 |

| 0033 |

| 0333 |

| 3333 |

| 33333 |

| 333333 |

+-----------+

6 rows in set (0.00 sec)

You can see that numbers shorter than four digits wide are zero-padded to four digits; once the numbers are longer than four digits long, they are shown unaffected by the width and the ZEROFILL parameters.

If you use ZEROFILL, MySQL automatically adds UNSIGNED to the declaration (since zero filling makes sense only in the context of positive numbers).

DECIMAL[(width[,decimals])] [UNSIGNED] [ZEROFILL]

A commonly used numeric type. Stores a fixed-point number such as a salary or distance, with a total of width digits of which some smaller number are decimals that follow a decimal point. For example, a column declared as price DECIMAL(4,2) should be used to store values in the range –99.99 to 99.99. If you try to store a value that’s outside this range, it will be stored as the closest value in the allowed range. For example, 100 would be stored as 99.99, and –100 would be stored as –99.99. Note that MySQL versions before 5.03 would allow an extra digit for positive values (numbers from –99.99 to 999.99 could be stored). The width is optional, and a value of 10 is assumed when this is omitted. The maximum value of width is 255.

The number of decimals is optional and, when omitted, a value of 0 is assumed; the maximum value of decimals should be two less than the value of width. If you’re storing only positive values, use the UNSIGNED keyword as described for INT. If you want zero padding, use theZEROFILL keyword for the same behavior as described for INT. The keyword DECIMAL has three identical, interchangeable alternatives: DEC, NUMERIC, and FIXED.

Prior to MySQL version 5.0.3, a DECIMAL column was stored as a string, and so required exactly the number of bytes of storage space as the length of the value (plus up to two bytes for a minus sign and a decimal point if required). Beginning with version 5.0.3, a binary format was introduced that uses four bytes for every nine digits. Under both approaches, the value retrieved is identical to the value stored; this isn’t always the case with other types that contain decimal points, such as the FLOAT and DOUBLE types described later.

DATE

Stores and displays a date in the format YYYY-MM-DD for the range 1000-01-01 to 9999-12-31. Dates must always be input as year, month, and day triples, but the format of the input can vary, as shown in the following examples:

YYYY-MM-DD or YY-MM-DD

It’s optional whether you provide two-digit or four-digit years. We strongly recommend that you use the four-digit version to avoid confusion about the century. In practice, if you use the two-digit version, you’ll find that 70 to 99 are interpreted as 1970 to 1999, and 00 to 69 are interpreted as 2000 to 2069.

YYYY/MM/DD, YYYY:MM:DD, YY/MM/DD, or other punctuated formats

MySQL allows any punctuation characters to separate the components of a date. We recommend using dashes and, again, avoiding the two-digit years.

YYYY-M-D, YYYY-MM-D, or YYYY-M-DD

When punctuation is used (again, any punctuation character is allowed), single-digit days and months can be specified as such. For example, February 2, 2006, can be specified as 2006-2-2. The two-digit year equivalent is available, but not recommended.

YYYYMMDD or YYMMDD

Punctuation can be omitted in both date styles, but the digit sequences must be six or eight digits in length.

You can also input a date by providing both a date and time in the formats described later for DATETIME and TIMESTAMP, but only the date component is stored in a DATE type column. Regardless of the input type, the storage and display type is always YYYY-MM-DD. The zero date 0000-00-00 is allowed in all versions and can be used to represent an unknown or dummy value. If an input date is out of range, the zero date 0000-00-00 is stored. By default, from MySQL 5.0.2 onward, the zero date is stored when you insert an invalid date such as 2007-02-31. Prior to that version, invalid dates are stored provided the month is in the range 0 to 12, and the day is in the range 0 to 31. Consider this example:

mysql> CREATE TABLE testdate (mydate DATE);

Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO testdate VALUES ('2007/02/0');

Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO testdate VALUES ('2007/02/1');

Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO testdate VALUES ('2007/02/31');

Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO testdate VALUES ('2007/02/100');

Query OK, 1 row affected, 1 warning (0.00 sec)

With a version of MySQL older than 5.0.2, we would have:

mysql> SELECT * FROM testdate;

+------------+

| mydate |

+------------+

| 2007-02-00 |

| 2007-02-01 |

| 2007-02-31 |

| 0000-00-00 |

+------------+

4 rows in set (0.00 sec)

while with version 5.0.2 onwards, we have:

mysql> SELECT * FROM testdate;

+------------+

| mydate |

+------------+

| 2007-02-00 |

| 2007-02-01 |

| 0000-00-00 |

| 0000-00-00 |

+------------+

4 rows in set (0.01 sec)

Note also that the date is displayed in the YYYY-MM-DD format, regardless of how it was input.

TIME

Stores a time in the format HHH:MM:SS for the range -838:59:59 to 838:59:59. The values that can be stored are outside the range of the 24-hour clock to allow large differences between time values (up to 34 days, 22 hours, 59 minutes, and 59 seconds) to be computed and stored. Times must always be input in the order days, hours, minutes, and seconds, using the following formats:

DD HH:MM:SS, HH:MM:SS, DD HH:MM, HH:MM, DD HH, or SS

The DD represents a one-digit or two-digit value of days in the range 0 to 34. The DD value is separated from the hour value, HH, by a space, while the other components are separated by a colon. Note that MM:SS is not a valid combination, since it cannot be disambiguated from HH:MM.

For example, if you insert 2 13:25:59 into a TIME type column, the value 61:25:59 is stored, since the sum of 2 days (48 hours) and 13 hours is 61 hours. If you try inserting a value that’s out of bounds, a warning is generated, and the value is limited to the maximum time available. Similarly, if you try inserting an incorrect value, a warning is generated and the value is set to zero. You can use the SHOW WARNINGS command to reports the details of the warning generated by the previous SQL statement.

Let’s try all these out in practice:

mysql CREATE TABLE test_time(id SMALLINT, mytime TIME);

Query OK, 0 rows affected (0.00 sec)

mysql INSERT INTO test_time VALUES(1, "2 13:25:59");

Query OK, 1 row affected (0.00 sec)

mysql INSERT INTO test_time VALUES(2, "35 13:25:59");

Query OK, 1 row affected, 1 warning (0.00 sec)

mysql SHOW WARNINGS;

+---------+------+----------------------------------------------------------+

| Level | Code | Message |

+---------+------+----------------------------------------------------------+

| Warning | 1264 | Out of range value adjusted for column 'mytime' at row 1 |

+---------+------+----------------------------------------------------------+

1 row in set (0.00 sec)

mysql INSERT INTO test_time VALUES(3, "-35 13:25:59");

Query OK, 1 row affected, 1 warning (0.00 sec)

mysql INSERT INTO test_time VALUES(4, "35 13:25:69");

Query OK, 1 row affected, 1 warning (0.00 sec)

mysql SHOW WARNINGS;

+---------+------+---------------------------------------------+

| Level | Code | Message |

+---------+------+---------------------------------------------+

| Warning | 1265 | Data truncated for column 'mytime' at row 1 |

+---------+------+---------------------------------------------+

1 row in set (0.00 sec)

mysql SELECT * FROM test_time;

+----+------------+

| id | mytime |

+----+------------+

| 1 | 61:25:59 |

| 2 | 838:59:59 |

| 3 | -838:59:59 |

| 4 | 00:00:00 |

+----+------------+

4 rows in set (0.00 sec)

Note how the out-of-range and invalid times are stored.

H:M:S, and single-, double-, and triple-digit combinations

You can use different combinations of digits when inserting or updating data; MySQL converts them into the internal time format and displays them consistently. For example, 1:1:3 is equivalent to 01:01:03. Different numbers of digits can be mixed; for example, 1:12:3 is equivalent to 01:12:03. Consider these examples:

mysql> CREATE TABLE mytime (testtime TIME);

Query OK, 0 rows affected (0.12 sec)

mysql> INSERT INTO mytime VALUES

-> ('-1:1:1'), ('1:1:1'),

-> ('1:23:45'), ('123:4:5'),

-> ('123:45:6'), ('-123:45:6');

Query OK, 4 rows affected (0.00 sec)

Records: 4 Duplicates: 0 Warnings: 0

mysql> SELECT * FROM mytime;

+------------+

| testtime |

+------------+

| -01:01:01 |

| 01:01:01 |

| 01:23:45 |

| 123:04:05 |

| 123:45:06 |

| -123:45:06 |

+------------+

5 rows in set (0.01 sec)

Note that hours are shown with two digits for values within the range –99 to +99.

HHMMSS, MMSS, and SS

Punctuation can be omitted, but the digit sequences must be two, four, or six digits in length. Note that the rightmost pair of digits is always interpreted as a SS (seconds) value, the second next rightmost pair (if present) as MM (minutes), and the third rightmost pair (if present) as HH(hours). The result is that a value such as 1222 is interpreted as 12 minutes and 22 seconds, not 12 hours and 22 minutes.

You can also input a time by providing both a date and time in the formats described for DATETIME and TIMESTAMP, but only the time component is stored in a TIME type column. Regardless of the input type, the storage and display type is always HH:MM:SS. The zero time 00:00:00can be used to represent an unknown or dummy value. If an input date is invalid or out of range, the zero time 00:00:00 is stored. The TIME type has an additional fraction component for storing fractions of seconds, but, while a time value can be input with a fractional component, it is presently truncated before storage by MySQL; we’ve therefore omitted it from our discussions.

TIMESTAMP

Stores and displays a date and time pair in the format YYYY-MM-DD HH:MM:SS for the range 1970-01-01 00:00:00 to sometime in 2037. The behavior of this type has varied over the life of MySQL (and continues to do so!), and this section describes only the version implemented since MySQL 4.1. The key features of a TIMESTAMP column are twofold. First, if you assign NULL to it, it’s set to the current date and time. Second, a developer-selected TIMESTAMP column in a table can be automatically updated to the current date and time when a row is inserted or updated. You can always explicitly set a column to a value you want by assigning that value to the column, regardless of whether it’s the automatically updating column. The automatic update feature is discussed later in this section. A nonupdating near-equivalent is the DATETIME type described later in this section.

The value stored always matches the template YYYY-MM-DD HH:MM:SS, but the value can be provided in a wide range of formats:

YYYY-MM-DD HH:MM:SS or YY-MM-DD HH:MM:SS

The date and time components follow the same relaxed restrictions as the DATE and TIME components described previously (however, as of MySQL 5.0.2, zero values aren’t permitted). This includes allowance for any punctuation characters, including (unlike TIME) flexibility in the punctuation used in the time component. For example, 2005/02/15 12+22+23 is valid.

YYYYMMDDHHMMSS or YYMMDDHHMMSS

Punctuation can be omitted, but the string should be either 12 or 14 digits in length. We recommend only the unambiguous 14-digit version, for the reasons discussed for the DATE type. You can specify values with other lengths without providing separators, but we don’t recommend doing so.

Let’s discuss the automatic-update feature in detail. Only one TIMESTAMP column per table can be automatically set to the current date and time on insert or update. You control this by following these steps when creating a table:

1. Choose the column you want to be automatically updated.

2. If you have other TIMESTAMP columns in the table, set the ones that precede the selected column in the CREATE TABLE statement to have a constant default (such as DEFAULT 0).

3. For the automatically updating column, decide which behavior you want:

a. If you want the timestamp to be set only when a new row is inserted into the table, add DEFAULT CURRENT_TIMESTAMP to the end of the column declaration.

b. If you don’t want a default timestamp but want the current time to be used whenever the data in a row is updated, add ON UPDATE CURRENT_TIMESTAMP to the end of the column declaration.

c. If you want both of the above—that is, you want the timestamp to be set to the current time in each new row or whenever an existing row is modified— add DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP to the end of the column declaration.

If you specify DEFAULT NULL for a TIMESTAMP column, it will be interpreted differently depending on whether there are any other TIMESTAMP columns before it in the table. DEFAULT NULL is handled as DEFAULT CURRENT_TIMESTAMP for the first timestamp column, but asDEFAULT 0 for any subsequent ones.

Consider this example:

mysql> CREATE TABLE mytime(id INT NOT NULL,

-> changetime TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP);

Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO mytime VALUES(1,''),(2,'2006-07-16 1:2:3'),(3,NULL);

Query OK, 3 rows affected, 2 warnings (0.01 sec)

Records: 3 Duplicates: 0 Warnings: 2

mysql> SELECT * FROM mytime;

+----+---------------------+

| id | changetime |

+----+---------------------+

| 1 | 0000-00-00 00:00:00 |

| 2 | 2006-07-16 01:02:03 |

| 3 | 2006-07-16 01:05:24 |

+----+---------------------+

3 rows in set (0.00 sec)

Note how the current time is stored when we ask to insert a NULL value. Now, let’s change the id for the first row:

mysql> UPDATE mytime SET id=4 WHERE id=1;

Query OK, 1 row affected (0.08 sec)

Rows matched: 1 Changed: 1 Warnings: 0

mysql> SELECT * FROM mytime;

+----+---------------------+

| id | changetime |

+----+---------------------+

| 4 | 2006-07-16 01:05:42 |

| 2 | 2006-07-16 01:02:03 |

| 3 | 2006-07-16 01:05:24 |

+----+---------------------+

3 rows in set (0.00 sec)

As you can see, the timestamp is updated to the current timestamp.

There are other variations on how you can control which column updates automatically, but if you stick to the previous steps, you’ll get the behavior you want. You can find more examples of using timestamps later in The Sample Music Database.”

CHAR[(width)]

The most commonly used string type. CHAR stores a fixed-length string (such as a name, address, or city) of length width. If a width is not provided, CHAR(1) is assumed. The maximum value of width is 255. With MySQL versions between 4.1.0 and 5.0.2, MySQL accepts values greater than 255 and silently changes the CHAR type to the smallest TEXT type that is suitable; we discuss the TEXT type later in this section.

You can in fact define a special CHAR(0) NULL column that takes up only one bit of storage. This provides two handy features. First, it allows you to include a dummy column in a table that doesn’t do anything (which might be useful as a placeholder for a future feature, or to be backward-compatible with an old application). Second, it allows you to store one of two values: NULL or the empty string '', giving you very compact storage of binary (Boolean) values. To help you understand this better, let’s create a table with a CHAR(0) field, and an id field to help differentiate between entries:

mysql> CREATE TABLE bool(id INT, bit CHAR(0) NULL);

Query OK, 0 rows affected (0.02 sec)

Now, let’s add three values: an empty string '', NULL, and the character 1:

mysql> INSERT INTO bool VALUES (1,''), (2,NULL), (3,'1');

Query OK, 3 rows affected, 1 warning (0.01 sec)

Records: 3 Duplicates: 0 Warnings: 1

These all look the same:

mysql> SELECT * FROM bool;

+----+------+

| id | bit |

+----+------+

| 1 | |

| 2 | |

| 3 | |

+----+------+

3 rows in set (0.00 sec)

However, one is NULL:

mysql> SELECT * FROM bool WHERE bit IS NULL;

+----+------+

| id | bit |

+----+------+

| 2 | |

+----+------+

1 row in set (0.00 sec)

and the other two aren’t:

mysql> SELECT * FROM bool WHERE bit IS NOT NULL;

+----+------+

| id | bit |

+----+------+

| 1 | |

| 3 | |

+----+------+

2 rows in set (0.01 sec)

In all other cases, the CHAR type takes exactly the number of bytes in storage space as the width of the column (assuming your chosen character set uses one byte per character). Values that are less than width characters in length are stored left-aligned in the allocated space, with space character padding on the right side. All trailing spaces are ignored when retrieving and displaying values, as in this example:

mysql> CREATE TABLE show_padding(mystring CHAR(10));

Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO show_padding VALUES ('a'),('abc'),('abcde'),('abcdefg ');

Query OK, 4 rows affected (0.00 sec)

Records: 4 Duplicates: 0 Warnings: 0

mysql> SELECT * FROM show_padding;

+----------+

| mystring |

+----------+

| a |

| abc |

| abcde |

| abcdefg |

+----------+

4 rows in set (0.01 sec)

As you can see, the trailing spaces aren’t shown in the last row. They’re also ignored if you try to find strings that have a trailing space:

mysql> SELECT * FROM show_padding WHERE mystring LIKE '% ';

Empty set (0.00 sec)

Since trailing spaces are ignored, no matches are reported.

Note that this has an interesting side effect: you can’t differentiate between strings of spaces alone; the strings " " and " " are considered to be the same thing. Consequently, you can’t use one value in the primary key if you’ve already got the other. Consider an example; we can create a table to store names and email addresses, with the email address as the primary key:

mysql> CREATE TABLE contacts (name CHAR(40), email CHAR(40) PRIMARY KEY);

Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO contacts VALUES('Sarah', 'sarah@learningmysql.com');

Query OK, 1 row affected (0.01 sec)

So far, so good. Now, if we don’t know someone’s email address, we can store an empty string:

mysql> INSERT INTO contacts VALUES('Zahra', '');

Query OK, 1 row affected (0.00 sec)

Note that an empty string is not NULL, so MySQL doesn’t complain; however, since the email address is the primary key, we can’t store another empty string. Let’s try storing a single space:

mysql> INSERT INTO Contacts VALUES('Samaneh', ' ');

ERROR 1062 (23000): Duplicate entry '' for key 1

MySQL complains about a duplicate key, since the single space is treated as an empty string. Trying to insert the string "not sure" works, but then "not sure " (with a trailing space) doesn’t work:

mysql> INSERT INTO Contacts VALUES('Samaneh', 'not sure');

Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO Contacts VALUES('Sadri', 'not sure ');

ERROR 1062 (23000): Duplicate entry 'not sure' for key 1

Leading spaces don’t cause any problems:

mysql> INSERT INTO Contacts VALUES('Saleh', ' not sure');

Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO Contacts VALUES('Susan', ' not sure');

Query OK, 1 row affected (0.00 sec)

You should use the BLOB or TEXT types described later if you don’t want this behavior.

That concludes our discussion of the six common column types used in MySQL. You’ll find examples using some of these types in The Sample Music Database,” later in this chapter. The remainder of this section covers the other type choices available in MySQL, beginning with the other choices for numeric values.

Other integer types

In Common column types,” we saw the INT type for storing integer numbers. In this section, we’ll look at a few other integer types that you can use. We recommend that you always choose the smallest possible type to store values. For example, if you’re storing age values, choose TINYINTinstead of the regular INT. Smaller types require less storage space; this reduces disk and memory requirements and speeds up the retrieval of data from disk. Indeed, column type tuning is a key step that professional database tuners use in optimizing database applications.

Here is the list of the integer types—besides INT—that you can choose from. Be aware that the general issues described for INT apply to these types as well:

BOOLEAN

A type introduced in MySQL 4.1 that stores a Boolean value of false (zero) or true (nonzero). For example, it might be used to store whether a person is alive (true) or dead (false), a customer is active (true) or inactive (false), or whether a customer wants to receive emails (true) or not (false). The BOOLEAN type has the synonyms BOOL and BIT. It is equivalent to TINYINT(1), and so requires one byte of storage space; you can achieve more compact, one-bit Boolean values by using CHAR(0), as described previously.

TINYINT[(width)] [UNSIGNED] [ZEROFILL]

Stores integer (whole number) values in the range –128 to 127. The width, UNSIGNED, and ZEROFILL options behave as for INT. When UNSIGNED is used, a column can store values in the range 0 to 255. A TINYINT column requires one byte of storage space.

SMALLINT[(width)] [UNSIGNED] [ZEROFILL]

Stores integer (whole number) values in the range –32,768 to 32,767. The width, UNSIGNED, and ZEROFILL options behave as for INT. When UNSIGNED is used, a column can store values in the range 0 to 65,535. A SMALLINT column requires two bytes of storage space.

MEDIUMINT[(width)] [UNSIGNED] [ZEROFILL]

Stores integer (whole number) values in the range –8,388,608 to 8,388,607. The width, UNSIGNED, and ZEROFILL options behave as for INT. When UNSIGNED is used, a column can store values in the range 0 to 16,777,215. A MEDIUMINT column requires three bytes of storage space.

BIGINT[(width)] [UNSIGNED] [ZEROFILL]

Stores integer (whole number) values in the range –9,223,372,036,854,775,808 to 9,223,372,036,854,775,807. The width, UNSIGNED, and ZEROFILL options behave as for INT. When UNSIGNED is used, a column can store values in the range 0 to 18,446,744,073,709,551,615. ABIGINT column requires eight bytes of storage space.

Other rational number types

In Common column types,” we discussed the fixed-point DECIMAL type. There are two other types that support decimal points: DOUBLE (also known as REAL) and FLOAT. They’re designed to store approximate numeric values rather than the exact values stored by DECIMAL. Why would you want approximate values? The answer is that many numbers with a decimal point are approximations of real quantities. For example, suppose you earn $50,000 per annum and you want to store it as a monthly wage. When you convert it to a per-month amount, it’s $4,166 plus 66 and 2/3rds cents. If you store this as $4,166.67, it’s not exact enough to convert to a yearly wage (since 12 multiplied by $4,166.67 is $50,000.04). However, if you store 2/3rds with enough decimal places, it’s a closer approximation, and you’ll find that it is accurate enough to correctly multiply to obtain the original value in a high-precision environment such as MySQL. That’s where DOUBLE and FLOAT are useful: they let you store values such as 2/3rds or pi with a large number of decimal places, allowing accurate approximate representations of exact quantities.

Let’s continue the previous example using DOUBLE. Suppose you create a table as follows:

mysql> CREATE TABLE wage (monthly DOUBLE);

Query OK, 0 rows affected (0.09 sec)

You can now insert the monthly wage using:

mysql> INSERT INTO wage VALUES (50000/12);

Query OK, 1 row affected (0.00 sec)

When you multiply it to a yearly value, you get an accurate approximation:

mysql> SELECT monthly*12 FROM wage;

+------------+

| monthly*12 |

+------------+

| 50000 |

+------------+

1 row in set (0.00 sec)

Here are the details of the DOUBLE and FLOAT types:

FLOAT[(width, decimals)] [UNSIGNED] [ZEROFILL] or FLOAT[(precision)] [UNSIGNED] [ZEROFILL]

Stores floating-point numbers. It has two optional syntaxes: the first allows an optional number of decimals and an optional display width, and the second allows an optional precision that controls the accuracy of the approximation measured in bits. Without parameters, the type stores small, four-byte, single-precision floating-point values; usually, you use it without providing any parameters. When precision is between 0 and 24, the default behavior occurs. When precision is between 25 and 53, the type behaves as for DOUBLE. The width has no effect on what is stored, only on what is displayed. The UNSIGNED and ZEROFILL options behave as for INT.

DOUBLE[(width, decimals)] [UNSIGNED] [ZEROFILL]

Stores floating-point numbers. It has one optional syntax: it allows an optional number of decimals and an optional display width. Without parameters, the type stores normal, eight-byte, double-precision floating point values; usually, you use it without providing any parameters. Thewidth has no effect on what is stored, only on what is displayed. The UNSIGNED and ZEROFILL options behave as for INT. The DOUBLE type has two identical synonyms: REAL and DOUBLE PRECISION. The REAL alternative can be made to behave as FLOAT using a nondefault parameter to the MySQL server, but this is not discussed here.

Other date and time types

We discussed the DATE, TIME, and TIMESTAMP types in Common column types.” There are two more date and time types: YEAR for storing only year values, and DATETIME for storing date and time combinations without the automatic-update feature of TIMESTAMP. These work as follows:

YEAR[(digits)]

Stores a two- or four-digit year, depending on whether 2 or 4 is passed as the optional digits parameter. Without the parameter, four digits is the default. The two-digit version stores values from 70 to 69, representing 1970 to 2069; again, we caution against using two-digit dates. The four-digit version stores values in the range 1901 to 2155, as well as the zero year, 0000. Illegal values are converted to the zero date. You can input year values as either strings (such as '2005') or integers (such as 2005). The YEAR type requires one byte of storage space.

DATETIME

Stores and displays a date and time pair in the format YYYY-MM-DD HH:MM:SS for the range 1000-01-01 00:00:00 to 9999-12-31 23:59:59. As for TIMESTAMP, the value stored always matches the template YYYY-MM-DD HH:MM:SS, but the value can be input in the same formats listed for the TIMESTAMP description. If you assign only a date to a DATETIME column, the zero time 00:00:00 is assumed. If you assign only a time to a DATETIME column, the zero date 0000-00-00 is assumed. This type does not have the automatic update features ofTIMESTAMP.

Other string types

The remaining types in MySQL are variants of the string type; here’s a list that you can choose from—excepting CHAR, which was described in Common column types”:

VARCHAR(width)

A commonly used string type. Stores variable-length strings (such as names, addresses, or cities) up to a maximum width. The maximum value of width is 65,535 characters.

Prior to MySQL version 5.0.3, the maximum length was 255 characters. Trying to specify a longer length would cause an error in versions up to 4.1.0. Between versions 4.1.0 and 5.0.3, the server would silently change the column type to the smallest TEXT type that would hold values of that length

A VARCHAR type incurs one or two extra bytes of overhead to store the length of the string, depending on whether the string is shorter than or longer than 255 characters.

Trailing spaces are removed when a value is stored; you can use TEXT or BLOB types to avoid this behavior.

BINARY(width) and VARBINARY(width)

Available since MySQL 4.1.2, these are equivalent to CHAR and VARCHAR but allow you to store binary strings. Binary strings have no character set, and sorting them is case-sensitive. Read the descriptions of CHAR and VARCHAR for other details. If you’re using a MySQL version earlier than 4.1.2, you can create the same behavior by adding the keyword BINARY after the CHAR or VARCHAR declaration, as in CHAR(12) BINARY.

BLOB

The commonly used type for storing large data. Stores a variable amount of data (such as an image, video, or other nontext file) up to 65,535 bytes in length. The data is treated as binary—that is, no character set is assumed, and comparisons and sorts are case-sensitive. There is no trailing-space-removal behavior as for the CHAR or VARCHAR types. In addition, a DEFAULT clause is not permitted, and you must take a prefix of the value when using it in an index (this is discussed in the next section).

TEXT

A commonly used type for storing large string data objects. Stores a variable amount of data (such as a document or other text file) up to 65,535 bytes in length. It is identical to BLOB, except that the data is treated as belonging to a character set. Since MySQL 4.1, the character set can be set for each column, and prior to that the character set of the server was assumed. Comparisons and sorts are case-insensitive.

TINYBLOB and TINYTEXT

Identical to BLOB and TEXT, respectively, except that a maximum of 255 bytes can be stored.

MEDIUMBLOB and MEDIUMTEXT

Identical to BLOB and TEXT, respectively, except that a maximum of 16,777,215 bytes can be stored.

LONGBLOB and LONGTEXT

Identical to BLOB and TEXT, respectively, except that a maximum of four gigabytes of data can be stored. The effective maximum can vary depending on the memory available on the server and its configuration.

ENUM('value1'[,'value2'[, ...]]

A list, or enumeration of string values. A column of type ENUM can be set to a value from the list value1, value2, and so on, up to a maximum of 65,535 different values. While the values are stored and retrieved as strings, what’s stored in the database is an integer representation. The enumerated column can contain NULL (stored as NULL), the empty string '' (stored as 0), or any of the valid elements (stored as 1, 2, 3, and so on). You can prevent NULL values from being accepted by declaring the column as NOT NULL when creating the table.

This type is a compact way of storing values from a list of predefined values, such as state or country names. Consider this example using fruit names; the name can be any one of the predefined values Apple, Orange, or Pear (in addition to NULL and the empty string):

mysql> CREATE TABLE fruits_enum ( fruit_name ENUM('Apple', 'Orange', 'Pear') );

Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO fruits_enum VALUES ('Apple');

Query OK, 1 row affected (0.00 sec)

If you try inserting a value that’s not in the list, MySQL warns you that it didn’t store the data you asked:

mysql> INSERT INTO fruits_enum VALUES ('Banana');

Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> SHOW WARNINGS;

+---------+------+-------------------------------------------------+

| Level | Code | Message |

+---------+------+-------------------------------------------------+

| Warning | 1265 | Data truncated for column 'fruit_name' at row 1 |

+---------+------+-------------------------------------------------+

1 row in set (0.00 sec)

Similarly, a list of several allowed values isn’t accepted either:

mysql> INSERT INTO fruits_enum VALUES ('Apple,Orange');

Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> SHOW WARNINGS;

+---------+------+-------------------------------------------------+

| Level | Code | Message |

+---------+------+-------------------------------------------------+

| Warning | 1265 | Data truncated for column 'fruit_name' at row 1 |

+---------+------+-------------------------------------------------+

1 row in set (0.00 sec)

Displaying the contents of the table, you can see that when you try to store anything that’s not in the valid values, an empty string is stored instead:

mysql> SELECT * FROM fruits_enum;

+------------+

| fruit_name |

+------------+

| Apple |

| |

| |

+------------+

3 rows in set (0.00 sec)

You can also specify a default value other than the empty string:

mysql> CREATE TABLE new_fruits_enum ( fruit_name ENUM('Apple', 'Orange', 'Pear')

-> DEFAULT 'Pear');

Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO new_fruits_enum VALUES();

Query OK, 1 row affected (0.00 sec)

mysql> SELECT * FROM new_fruits_enum;

+------------+

| fruit_name |

+------------+

| Pear |

+------------+

1 row in set (0.00 sec)

Here, not specifying a value results in the default value Pear being stored.

SET('value1'[,'value2'[, ...]])

A set of string values. A column of type SET can be set to zero or more values from the list value1, value2, and so on, up to a maximum of 64 different values. While the values are strings, what’s stored in the database is an integer representation. SET differs from ENUM in that each row can store only one ENUM value in a column, but can store multiple SET values. This type is useful for storing a selection of choices from a list, such as user preferences. Consider this example using fruit names; the name can be any combination of the predefined values:

mysql> CREATE TABLE fruits_set ( fruit_name SET('Apple', 'Orange', 'Pear') );

Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO fruits_set VALUES ('Apple');

Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO fruits_set VALUES ('Banana');

Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> SHOW WARNINGS;

+---------+------+-------------------------------------------------+

| Level | Code | Message |

+---------+------+-------------------------------------------------+

| Warning | 1265 | Data truncated for column 'fruit_name' at row 1 |

+---------+------+-------------------------------------------------+

1 row in set (0.00 sec)

mysql> INSERT INTO fruits_set VALUES ('Apple,Orange');

Query OK, 1 row affected (0.00 sec)

mysql> SELECT * FROM fruits_set;

+--------------+

| fruit_name |

+--------------+

| Apple |

| |

| Apple,Orange |

+--------------+

3 rows in set (0.01 sec)

Again, note that we can store multiple values from the set in a single field, and that an empty string is stored for invalid input.

As with numeric types, we recommend that you always choose the smallest possible type to store values. For example, if you’re storing a city name, use CHAR or VARCHAR, rather than, say, the TEXT type. Having shorter columns helps keep your table size down, which in turns helps performance when the server has to search through a table.

Using a fixed size with the CHAR type is often faster than using a variable size with VARCHAR, since the MySQL server knows where each row starts and ends, and can quickly skip over rows to find the one it needs. However, with fixed-length fields, any space that you don’t use is wasted. For example, if you allow up to 40 characters in a city name, then CHAR(40) will always use up 40 characters, no matter how long the city name actually is. If you declare the city name to be VARCHAR(40), then you’ll use up only as much space as you need, plus one byte to store the name length. If the average city name is 10 characters long, this means that using a variable length field will take up 29 fewer bytes per entry; this can make a big difference if you’re storing millions of addresses.

In general, if storage space is at a premium or you expect large variations in the length of strings that are to be stored, use a variable-length field; if performance is a priority, use a fixed length.

Keys and Indexes

You’ll find that almost all tables you use will have a PRIMARY KEY clause declared in their CREATE TABLE statement. The reasons why you need a primary key are discussed in Chapter 4. This section discusses how primary keys are declared, what happens behind the scenes when you do so, and why you might want to also create other keys and indexes on your data.

A primary key uniquely identifies each row in a table. When you declare one to MySQL, it creates a new file on disk that stores information about where the data from each row in the table is stored. This information is called an index, and its purpose is to speed up searches that use the primary key. For example, when you declare PRIMARY KEY (artist_id) in the artist table in the music database, MySQL creates a structure that allows it to find rows that match a specific artist_id (or a range of identifiers) extremely quickly. This is very useful to match artists to albums, tracks, and playlist information. You can display the indexes available on a table using the SHOW INDEX command:

mysql> SHOW INDEX FROM artist;

+--------+------------+----------+--------------+-------------+-----------+...

| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation |...

+--------+------------+----------+--------------+-------------+-----------+...

| artist | 0 | PRIMARY | 1 | artist_id | A |...

+--------+------------+----------+--------------+-------------+-----------+...

... +-------------+----------+--------+------+------------+---------+

... | Cardinality | Sub_part | Packed | Null | Index_type | Comment |

... +-------------+----------+--------+------+------------+---------+

... | 6 | NULL | | | BTREE | |

... +-------------+----------+--------+------+------------+---------+

1 row in set (0.00 sec)

We’ve wrapped the output here so that it would fit on the page. The cardinality is the number of unique values in the index; for an index on a primary key, this is the same as the number of rows in the table.

Note that all columns that are part of a primary key must be declared as NOT NULL, since they must have a value for the row to be valid. Without the index, the only way to find rows in the table is to read each one from disk and check whether it matches the artist_id you’re searching for. For tables with many rows, this exhaustive, sequential searching is extremely slow. However, you can’t just index everything; we’ll come back to this point at the end of this section.

You can create other indexes on the data in a table. You do this so that other searches—on other columns or combinations of columns—are extremely fast and in order to avoid sequential scans. For example, suppose you often want to search by artist_name. You can drop the table and modify the CREATE TABLE definition to add an extra index:

mysql> DROP TABLE artist;

Query OK, 0 rows affected (0.01 sec)

mysql> CREATE TABLE artist (

-> artist_id SMALLINT(5) NOT NULL DEFAULT 0,

-> artist_name CHAR(128) DEFAULT NULL,

-> PRIMARY KEY (artist_id),

-> KEY artist_name (artist_name)

-> );

Query OK, 0 rows affected (0.06 sec)

You can see we’ve used the keyword KEY to tell MySQL that we want an extra index; you can use the word INDEX in place of KEY. Following this, we’ve named the index—in this example, we’ve named it after the column name—and then we’ve included the column to index in parentheses. You can also add indexes after tables are created—in fact, you can pretty much change anything about a table after its creation—and this is discussed in Altering Structures.”

You can build an index on more than one column. For example, consider the following customer table:

mysql> CREATE TABLE customer (

-> cust_id INT(4) NOT NULL DEFAULT 0,

-> firstname CHAR(50),

-> secondname CHAR(50),

-> surname CHAR(50),

-> PRIMARY KEY (cust_id),

-> KEY names (firstname, secondname, surname));

Query OK, 0 rows affected (0.01 sec)

You can see that we’ve added a primary key index on the cust_id identifier column, and we’ve also added another index—called names—that includes the firstname, secondname, and surname columns in this order. Let’s now consider how you can use that extra index.

You can use the names index for fast searching by combinations of the three name columns. For example, it’s useful in the following query:

mysql> SELECT * FROM customer WHERE

-> firstname = "Rose" AND

-> secondname = "Elizabeth" AND

-> surname = "Williams";

We know it helps the search, because all columns listed in the index are used in the query. You can use the EXPLAIN statement to check whether what you think should happen is in fact happening:

mysql> EXPLAIN SELECT * FROM customer WHERE

-> firstname = "Rose" AND

-> secondname = "Elizabeth" AND

-> surname = "Williams";

+----+-------------+----------+------+---------------+...

| id | select_type | table | type | possible_keys |...

+----+-------------+----------+------+---------------+...

| 1 | SIMPLE | customer | ref | names |...

+----+-------------+----------+------+---------------+...

...+-------+---------+-------------------+------+-------------+

...| key | key_len | ref | rows | Extra |

...+-------+---------+-------------------+------+-------------+

...| names | 153 | const,const,const | 1 | Using where |

...+-------+---------+-------------------+------+-------------+

1 row in set (0.00 sec)

We’ve reformatted the output slightly to fit better in the book. You can see that MySQL reports that the possible_keys are names (meaning that the index could be used for this query) and that the key that it’s decided to use is names. So, what you expect and what is happening are the same, and that’s good news! You’ll find out more about the EXPLAIN statement in Chapter 7.

The index we’ve created is also useful for queries on only the firstname column. For example, it can be used by the following query:

mysql> SELECT * FROM customer WHERE

-> firstname = "Rose";

You can use EXPLAIN to check whether the index is being used. The reason it can be used is because the firstname column is the first listed in the index. In practice, this means that the index clusters, or stores together, information about rows for all people with the same first name, and so the index can be used to find anyone with a matching first name.

The index can also be used for searches involving combinations of first name and second name, for exactly the same reasons we’ve just discussed. The index clusters together people with the same first name, and within that it clusters people with identical first names ordered by second name. So, it can be used for this query:

mysql> SELECT * FROM customer WHERE

-> firstname = "Rose" AND

-> secondname = "Elizabeth";

However, the index can’t be used for this query because the leftmost column in the index, firstname, does not appear in the query:

mysql> SELECT * FROM customer WHERE

-> surname = "Williams" AND

-> secondname = "Elizabeth";

The index should help narrow down the set of rows to a smaller set of possible answers. For MySQL to be able to use an index, the query needs to meet both the following conditions:

1. The leftmost column listed in the KEY (or PRIMARY KEY) clause must be in the query.

2. The query must contain no OR clauses for columns that aren’t indexed.

Again, you can always use the EXPLAIN statement to check whether an index can be used for a particular query.

Before we finish this section, here are a few ideas on how to choose and design indexes. When you’re considering adding an index, think about the following:

§ Indexes cost space on disk, and they need to be updated whenever data changes. If your data changes frequently, or lots of data changes when you do make a change, indexes will slow the process down. However, in practice, since SELECT statements (data reads) are usually much more common than other statements (data modifications), indexes are usually beneficial.

§ Only add an index that’ll be used frequently. Don’t bother indexing columns before you see what queries your users and your applications need. You can always add indexes afterward.

§ If all columns in an index are used in all queries, list the column with the highest number of duplicates at the left of the KEY clause. This minimizes index size.

§ The smaller the index, the faster it’ll be. If you index large columns, you’ll get a larger index. This is a good reason to ensure your columns are as small as possible when you design your tables.

§ For long columns, you can use only a prefix of the values from a column to create the index. You can do this by adding a value in parentheses after the column definition, such as KEY names (firstname(3), secondname(2), surname(10)). This means that only the first three characters of firstname are indexed, then the first two characters of secondname, and then 10 characters from surname. This is a significant saving over indexing 50 characters from each of the 3 columns! When you do this, your index will be less able to uniquely identify rows, but it’ll be much smaller and still reasonably good at finding matching rows.

The AUTO_INCREMENT Feature

MySQL’s proprietary AUTO_INCREMENT feature allows you to create a unique identifier for a row without running a SELECT query. Here’s how it works. Suppose you drop and re-create the artist table as follows:

mysql> DROP TABLE artist;

Query OK, 0 rows affected (0.01 sec)

mysql> CREATE TABLE artist (

-> artist_id SMALLINT(5) NOT NULL AUTO_INCREMENT,

-> artist_name CHAR(128) DEFAULT NULL,

-> PRIMARY KEY (artist_id)

-> );

Query OK, 0 rows affected (0.06 sec)

You can now insert rows, without providing an artist_id:

mysql> INSERT INTO artist VALUES (NULL, "The Shamen");

Query OK, 1 row affected (0.06 sec)

mysql> INSERT INTO artist VALUES (NULL, "Probot");

Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO artist VALUES (NULL, "The Cult");

Query OK, 1 row affected (0.00 sec)

When you view the data in this table you can see that each artist has a meaningful artist_id:

mysql> SELECT * FROM artist;

+-----------+-------------+

| artist_id | artist_name |

+-----------+-------------+

| 1 | The Shamen |

| 2 | Probot |

| 3 | The Cult |

+-----------+-------------+

3 rows in set (0.01 sec)

Each time an artist is inserted, a unique artist_id is created for the new row.

Let’s consider how the new feature works. You can see that the artist_id column is declared as an integer with the clauses NOT NULL AUTO_INCREMENT. The AUTO_INCREMENT keyword tells MySQL that when a value isn’t provided for this column, the value allocated should be one more than the maximum currently stored in the table. The AUTO_INCREMENT sequence begins at 1 for an empty table.

The NOT NULL is required for AUTO_INCREMENT columns; when you insert NULL (or 0, though this isn’t recommended), the MySQL server automatically finds the next available identifier and assigns it to the new row. You can manually insert negative values if the column was not defined as UNSIGNED; however, for the next automatic increment, MySQL will simply use the largest (most positive) value in the column, or start from 1 if there are no positive values.

The AUTO_INCREMENT feature has the following requirements:

§ The column it is used on must be indexed.

§ The column that is it used on cannot have a DEFAULT value.

§ There can be only one AUTO_INCREMENT column per table.

MySQL supports different table types; we’ll learn more about these in Table Types” in Chapter 7. When you’re using the default MyISAM table type, you can use the AUTO_INCREMENT feature on keys that comprise multiple columns. In our music database example, we could create thealbum table as follows:

mysql> CREATE TABLE album (

-> artist_id INT(5) NOT NULL,

-> album_id INT(4) NOT NULL AUTO_INCREMENT,

-> album_name CHAR(128) DEFAULT NULL,

-> PRIMARY KEY (artist_id, album_id)

-> );

Query OK, 0 rows affected (0.00 sec)

You can see that the primary key is on two columns—artist_id and album_id—and that the AUTO_INCREMENT feature is applied to the album_id column.

Suppose you want to insert two albums for The Shamen, the artist we added earlier with an artist_id of 1. Here’s how you do it:

mysql> INSERT INTO album VALUES (1, NULL, "Boss Drum");

Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO album VALUES (1, NULL, "Entact");

Query OK, 1 row affected (0.00 sec)

Now, let’s inspect the results:

mysql> SELECT * FROM album WHERE artist_id = 1;

+-----------+----------+------------+

| artist_id | album_id | album_name |

+-----------+----------+------------+

| 1 | 1 | Boss Drum |

| 1 | 2 | Entact |

+-----------+----------+------------+

2 rows in set (0.00 sec)

You can see that the correct album_id values are assigned; this is just as we’d expect. Now, consider what happens when we add two albums for the artist “The Cult”:

mysql> INSERT INTO album VALUES (3, NULL, "Electric");

Query OK, 1 row affected (0.01 sec)

mysql> INSERT INTO album VALUES (3, NULL, "Sonic Temple");

Query OK, 1 row affected (0.00 sec)

Here are the results:

mysql> SELECT * FROM album WHERE artist_id = 3;

+-----------+----------+--------------+

| artist_id | album_id | album_name |

+-----------+----------+--------------+

| 3 | 1 | Electric |

| 3 | 2 | Sonic Temple |

+-----------+----------+--------------+

2 rows in set (0.00 sec)

You can see how the feature works with two columns in the primary key: it’s reused the artist_id value that was used for The Cult, and the weak key (album_id) is incremented automatically. This ensures that the album primary key (the combination of artist_id and album_id) is unique for each album. We now have albums 1 and 2 for The Shamen (with an artist_id of 1), and albums 1 and 2 for The Cult (with an artist_id of 3).

While the AUTO_INCREMENT feature is useful, it isn’t portable to other database environments, and it hides the logical steps to creating new identifiers. It can also lead to ambiguity; for example, dropping or truncating a table will reset the counter, but deleting selected rows (with a WHEREclause) doesn’t reset the counter. Consider an example; let’s create the table count that contains an auto-incrementing field counter:

mysql> CREATE TABLE count (counter INT AUTO_INCREMENT KEY);

Query OK, 0 rows affected (0.13 sec)

mysql> INSERT INTO count VALUES (),(),(),(),(),();

Query OK, 6 rows affected (0.01 sec)

Records: 6 Duplicates: 0 Warnings: 0

mysql> SELECT * FROM count;

+---------+

| counter |

+---------+

| 1 |

| 2 |

| 3 |

| 4 |

| 5 |

| 6 |

+---------+

6 rows in set (0.00 sec)

Inserting several values works as expected. Now, let’s delete a few rows and then add six new rows:

mysql> DELETE FROM count WHERE counter > 4;

Query OK, 2 rows affected (0.00 sec)

mysql> INSERT INTO count VALUES (),(),(),(),(),();

Query OK, 6 rows affected (0.00 sec)

Records: 6 Duplicates: 0 Warnings: 0

mysql> SELECT * FROM count;

+---------+

| counter |

+---------+

| 1 |

| 2 |

| 3 |

| 4 |

| 7 |

| 8 |

| 9 |

| 10 |

| 11 |

| 12 |

+---------+

10 rows in set (0.00 sec)

Here, we see that the counter is not reset, and continues from 7. If, however, we delete all the data in the table, the counter is reset to 1:

mysql> TRUNCATE TABLE count;

Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO count VALUES (),(),(),(),(),();

Query OK, 6 rows affected (0.01 sec)

Records: 6 Duplicates: 0 Warnings: 0

mysql> SELECT * FROM count;

+---------+

| counter |

+---------+

| 1 |

| 2 |

| 3 |

| 4 |

| 5 |

| 6 |

+---------+

6 rows in set (0.00 sec)

Instead of relying on MySQL to handle incrementing fields as you hope, you can manage the process in program code that you write to interact with the database. We don’t use an auto-incrementing field in the final music database specification, described fully in the next section. However, we do use one in our wedding gift registry in Chapter 15.

The Sample Music Database

We’ve used the music database extensively in this and the previous chapter, so you’re already familiar with its structure. This section explains the steps we took to express our sample music database as SQL statements for loading into MySQL. It also lists the complete SQL statements used to create the structures, which you’ll find a useful reference for discussions in later chapters.

Let’s begin by discussing how we structured the file that contains the SQL statements. You can download the file music.sql from the book’s web site. We created the table using the monitor, and created the file from the output of one of MySQL’s commands for dumping SQL, and then edited it for readability. You’ll find more about how to dump SQL statements to a file in Chapter 10.

The music.sql file is structured as follows:

1. Drop the database if it exists, and then create it.

2. Use the database.

3. Create the tables.

4. Insert the data.

This structure allows you to reload the database—using the SOURCE command discussed in Chapter 3—at any time without having to worry about whether the database, tables, or data exist. Loading the file just wipes the database and starts again. Of course, in a production environment, always ensure your backups are reasonably up-to-date before commencing a restore operation that involves dropping tables or deleting existing data.

The first three lines of the file carry out the first two steps:

DROP DATABASE IF EXISTS music;

CREATE DATABASE music;

USE music;

The next section of the file creates the tables (the third step), and that’s the focus of this section; we don’t list the insert statements in this book, but they’re easily viewed in music.sql. Let’s start by looking at how we created the artist table:

CREATE TABLE artist (

artist_id SMALLINT(5) NOT NULL DEFAULT 0,

artist_name CHAR(128) DEFAULT NULL,

PRIMARY KEY (artist_id)

);

The table has a structure that’s derived from the design in Chapter 4. The artist_id is the primary key; because of this, and as required by MySQL, we’ve added a NOT NULL clause. The DEFAULT clause inserts a default value for the column if we don’t provide one. If a field doesn’t have a default value, MySQL reports an error if a value isn’t provided for it during an insert operation. In the artist table, the artist_id will be set to 0 if we don’t provide an artist_id ourselves. MySQL will complain the second time we try to do this, since artist_id is the primary key of the artist table, and we can’t have two rows with the same primary key.

We’ve used the SMALLINT type for the artist_id because it’s a numeric identifier, and a SMALLINT allows us to have around 65,000 artists; we’ve limited its display width to 5 characters.

We’ve decided that 128 characters is more than we’d need for any likely artist_name. We use the CHAR type instead of the VARCHAR type so that each row has an fixed, predictable size; this allows MySQL to better optimize the retrieval of rows from its files, typically making the application faster despite the files being typically larger than if VARCHAR was used. We haven’t added a NOT NULL clause to the artist_name, and have instead assumed that whatever application we build will do the checking for us. In general, the fewer the constraints and conditions that are built into the database, the faster it is for MySQL to work with. However, MySQL now optimizes for NOT NULL columns, so it is better to declare NOT NULL where the data will never be NULL. See the "Data Size" section of the MySQL manual for details.

The album table follows a similar rationale:

CREATE TABLE album (

artist_id SMALLINT(5) NOT NULL DEFAULT 0,

album_id SMALLINT(4) NOT NULL DEFAULT 0,

album_name CHAR(128) DEFAULT NULL,

PRIMARY KEY (artist_id,album_id)

);

We’ve declared the artist_id to be the same type as in artist. This is important as otherwise MySQL couldn’t use indexes to join tables together to resolve queries (which is a very common cause of odd results in EXPLAIN output). We’ve used SMALLINT for the album_id, since we don’t expect more than 65,000 albums per artist! We define album_name as a CHAR(128) because 128 characters seems long enough for album titles. Again, we’ve added NOT NULL for the primary key, added DEFAULT clauses to make the behavior predictable, and gone with only fixed-length types to improve performance.

The track table is created as follows:

CREATE TABLE track (

track_id SMALLINT(3) NOT NULL DEFAULT 0,

track_name CHAR(128) DEFAULT NULL,

artist_id SMALLINT(5) NOT NULL DEFAULT 0,

album_id SMALLINT(4) NOT NULL DEFAULT 0,

time TIME DEFAULT NULL,

PRIMARY KEY (artist_id,album_id,track_id)

);

The reasoning behind the choices for the first four columns is the same as for the other tables. The time column stores the duration of each track, and we’ve chosen to use the TIME type to store this. Using the TIME type—in preference to a numeric type such as DECIMAL—makes it easy to do math such as summing values to find the running time for an album. It also gives you flexibility in formats for the time data, as discussed previously. Despite this, you’ll see that in music.sql we use the format HH:MM:SS because we prefer to keep SQL queries readable and unambiguous.

The final table is played:

CREATE TABLE played (

artist_id SMALLINT(5) NOT NULL DEFAULT 0,

album_id SMALLINT(4) NOT NULL DEFAULT 0,

track_id SMALLINT(3) NOT NULL DEFAULT 0,

played TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP,

PRIMARY KEY (artist_id,album_id,track_id,played)

);

The choices for the first three columns are again as previously described. The played column makes use of the TIMESTAMP type and its automatic-update feature: we want the value to be set to the current date and time whenever a row is inserted (and, for good measure, whenever it’s updated, which we don’t plan to do). To use the feature, whenever we play a track, we create a new row with the artist_id, album_id, and track_id, and set the played column to NULL. Since all columns form the primary key, it’s acceptable to have more than one entry for a specific combination of artist, album, and track, as long as the timestamps aren’t the same. We can reasonably assume that two tracks won’t be played at the same time in a single-user application, and can also add instructions to enforce this in any application that uses this database.

Altering Structures

We’ve shown you all the basics you need for creating databases, tables, indexes, and columns. In this section, you’ll learn how to add, remove, and change columns, databases, tables, and indexes in structures that already exist.

Adding, Removing, and Changing Columns

You can use the ALTER TABLE statement to add new columns to a table, remove existing columns, and change column names, types, and lengths.

Let’s begin by considering how you modify existing columns. Consider an example in which we rename a table column. The played table has a column—also called played—that contains the time the track was played. To change the name of this column to last_played, you would write:

mysql> ALTER TABLE played CHANGE played last_played TIMESTAMP;

Query OK, 12 rows affected (0.03 sec)

Records: 12 Duplicates: 0 Warnings: 0

You can see that MySQL processes and alters each row. What actually happens behind the scenes is that MySQL creates a new table with the new structure, copies the data into that table, removes the original played table, and renames the table to played. You can check the result with theSHOW COLUMNS statement:

mysql> SHOW COLUMNS FROM played;

+----------------+-------------+------+-----+-------------------+-------+

| Field | Type | Null | Key | Default | Extra |

+----------------+-------------+------+-----+-------------------+-------+

| artist_id | smallint(5) | | PRI | 0 | |

| album_id | smallint(4) | | PRI | 0 | |

| track_id | smallint(3) | | PRI | 0 | |

| last_played | timestamp | YES | PRI | CURRENT_TIMESTAMP | |

+----------------+-------------+------+-----+-------------------+-------+

4 rows in set (0.01 sec)

In the previous example, you can see that we provided four parameters to the ALTER TABLE statement with the CHANGE keyword:

1. The table name, played

2. The original column name, played

3. The new column name, last_played

4. The column type, TIMESTAMP

You must provide all four; that means you need to respecify the type and any clauses that go with it. In the previous example, it just happens that the TIMESTAMP type defaults to:

DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP

If you want to rename the artist_name column to artist-name, you would write:

ALTER TABLE artist CHANGE artist_name artist-name CHAR(128) DEFAULT NULL;

If you want to modify the type and clauses of a column, but not its name, you can use the MODIFY keyword:

mysql> ALTER TABLE artist MODIFY artist_name CHAR(64) DEFAULT "Unknown";

Query OK, 6 rows affected (0.01 sec)

Records: 6 Duplicates: 0 Warnings: 0

You can also do this with the CHANGE keyword, but by specifying the same column name twice:

mysql> ALTER TABLE artist CHANGE artist_name artist_name CHAR(64) DEFAULT "Unknown";

Query OK, 6 rows affected (0.03 sec)

Records: 6 Duplicates: 0 Warnings: 0

Be careful when you’re modifying types:

§ Don’t use incompatible types, since you’re relying on MySQL to successfully convert data from one format to another (for example, converting an INT column to a DATETIME column isn’t likely to do what you hoped).

§ Don’t truncate the data unless that’s what you want. If you reduce the size of a type, the values will be edited to match the new width, and you can lose data.

Suppose you want to add an extra column to an existing table. Here’s how to do it with the ALTER TABLE statement:

mysql> ALTER TABLE artist ADD formed YEAR;

Query OK, 6 rows affected (0.02 sec)

Records: 6 Duplicates: 0 Warnings: 0

You must supply the ADD keyword, the new column name, and the column type and clauses. This example adds the new column, formed, as the last column in the table, as shown with the SHOW COLUMNS statement:

mysql> SHOW COLUMNS FROM artist;

+-------------+-------------+------+-----+---------+-------+

| Field | Type | Null | Key | Default | Extra |

+-------------+-------------+------+-----+---------+-------+

| artist_id | smallint(5) | | PRI | 0 | |

| artist_name | char(64) | YES | | Unknown | |

| formed | year(4) | YES | | NULL | |

+-------------+-------------+------+-----+---------+-------+

3 rows in set (0.02 sec)

If you want it to instead be the first column, use the FIRST keyword as follows:

mysql> ALTER TABLE artist ADD formed YEAR FIRST;

Query OK, 6 rows affected (0.04 sec)

Records: 6 Duplicates: 0 Warnings: 0

mysql> SHOW COLUMNS FROM artist;

+-------------+-------------+------+-----+---------+-------+

| Field | Type | Null | Key | Default | Extra |

+-------------+-------------+------+-----+---------+-------+

| formed | year(4) | YES | | NULL | |

| artist_id | smallint(5) | | PRI | 0 | |

| artist_name | char(64) | YES | | Unknown | |

+-------------+-------------+------+-----+---------+-------+

3 rows in set (0.01 sec)

If you want it added in a specific position, use the AFTER keyword:

mysql> ALTER TABLE artist ADD formed YEAR AFTER artist_id;

Query OK, 6 rows affected (0.03 sec)

Records: 6 Duplicates: 0 Warnings: 0

mysql> SHOW COLUMNS FROM artist;

+-------------+-------------+------+-----+---------+-------+

| Field | Type | Null | Key | Default | Extra |

+-------------+-------------+------+-----+---------+-------+

| artist_id | smallint(5) | | PRI | 0 | |

| formed | year(4) | YES | | NULL | |

| artist_name | char(64) | YES | | Unknown | |

+-------------+-------------+------+-----+---------+-------+

3 rows in set (0.01 sec)

To remove a column, use the DROP keyword followed by the column name. Here’s how to get rid of the newly added formed column:

mysql> ALTER TABLE artist DROP formed;

Query OK, 6 rows affected (0.02 sec)

Records: 6 Duplicates: 0 Warnings: 0

This removes both the column structure and any data contained in that column. It also removes the column from any index it was in; if it’s the only column in the index, the index is dropped, too. You can’t remove a column if it’s the only one in a table; to do this, you drop the table instead as explained later in Deleting Structures.” Be careful when dropping columns; you discard both the data and the structure of your table. When the structure of a table changes, you will generally have to modify any INSERT statements that you use to insert values in a particular order. We described INSERT statements in The INSERT Statement” in Chapter 5.

MySQL allows you to specify multiple alterations in a single ALTER TABLE statement by separating them with commas. Here’s an example that adds a new column and adjusts another:

mysql> ALTER TABLE artist ADD formed YEAR, MODIFY artist_name char(256);

Query OK, 6 rows affected, 1 warning (0.08 sec)

Records: 6 Duplicates: 0 Warnings: 0

It’s very efficient to join multiple modifications in a single operation, as it potentially saves the cost of creating a new table, copying data from the old table to the new table, dropping the old table, and renaming the new table with the name of the old table for each modification individually.

Adding, Removing, and Changing Indexes

As we discussed previously, it’s often hard to know what indexes are useful before the application you’re building is used. You might find that a particular feature of the application is much more popular than you expected, causing you to evaluate how to improve performance for the associated queries. You’ll therefore find it useful to be able to add, alter, and remove indexes on the fly after your application is deployed. This section shows you how. Modifying indexes does not affect the data stored in the table.

We’ll start with adding a new index. Imagine that the artist table is frequently queried using a WHERE clause that specifies an artist_name. To speed this query, you’ve decided to add a new index, which you’ve named by_name. Here’s how you add it after the table is created:

mysql> ALTER TABLE artist ADD INDEX by_name (artist_name);

Query OK, 6 rows affected (0.02 sec)

Records: 6 Duplicates: 0 Warnings: 0

Again, you can use the terms KEY and INDEX interchangeably. You can check the results with the SHOW CREATE TABLE statement:

mysql> SHOW CREATE TABLE artist;

+--------+-----------------------------------------------+

| Table | Create Table |

+--------+-----------------------------------------------+

| artist | CREATE TABLE `artist` ( |

| | `artist_id` smallint(5) NOT NULL default '0', |

| | `artist_name` char(128) default NULL, |

| | PRIMARY KEY (`artist_id`), |

| | KEY `by_name` (`artist_name`) |

| | ) ENGINE=MyISAM DEFAULT CHARSET=latin1 |

+--------+-----------------------------------------------+

As expected, the new index forms part of the table structure. You can also specify a primary key for a table after it’s created:

mysql> ALTER TABLE artist ADD PRIMARY KEY (artist_id);

Now let’s consider how to remove an index. To remove a non-primary-key index, you do the following:

mysql> ALTER TABLE artist DROP INDEX by_name;

Query OK, 6 rows affected (0.01 sec)

Records: 6 Duplicates: 0 Warnings: 0

You can drop a primary-key index as follows:

mysql> ALTER TABLE artist DROP PRIMARY KEY;

MySQL won’t allow you to have multiple primary keys in a table. If you want to change the primary key, you’ll have to remove the existing index before adding the new one. Consider this example:

mysql> CREATE TABLE staff (staff_id INT, name CHAR(40));

Query OK, 0 rows affected (0.01 sec)

mysql> ALTER TABLE staff ADD PRIMARY KEY (staff_id);

Query OK, 0 rows affected (0.00 sec)

Records: 0 Duplicates: 0 Warnings: 0

mysql> ALTER TABLE staff ADD PRIMARY KEY (name);

ERROR 1068 (42000): Multiple primary key defined

mysql> ALTER TABLE staff DROP PRIMARY KEY;

Query OK, 0 rows affected (0.00 sec)

Records: 0 Duplicates: 0 Warnings: 0

mysql> ALTER TABLE staff ADD PRIMARY KEY (name);

Query OK, 0 rows affected (0.00 sec)

Records: 0 Duplicates: 0 Warnings: 0

MySQL complains when we try to add the second primary key on name; we have to drop the existing primary key on staff_id, and then add one on name.

You can’t modify an index once it’s been created. However, sometimes you’ll want to; for example, you might want to reduce the number of characters indexed from a column or add another column to the index. The best method to do this is to drop the index and then create it again with the new specification. For example, suppose you decide that you want the by_name index to include only the first 10 characters of the artist_name. Simply do the following:

mysql> ALTER TABLE artist DROP INDEX by_name;

Query OK, 6 rows affected (0.02 sec)

Records: 6 Duplicates: 0 Warnings: 0

mysql> ALTER TABLE artist ADD INDEX by_name (artist_name(10));

Query OK, 6 rows affected (0.03 sec)

Records: 6 Duplicates: 0 Warnings: 0

Renaming Tables and Altering Other Structures

We’ve seen how to modify columns and indexes in a table; now let’s see how to modify tables themselves. It’s easy to rename a table. Suppose you want to rename played to playlist. Use the following command:

mysql> ALTER TABLE played RENAME TO playlist;

Query OK, 0 rows affected (0.00 sec)

The TO keyword is optional.

There are several other things you can do with ALTER statements:

§ Change the default character set and collation order for a database, a table, or a column.

§ Change the order of the rows in a table. This is useful only if you know you want to access the rows in a particular order and you want to help get the data into or near that order.

§ Manage and change constraints. For example, you can add and remove foreign keys.

You can find more about these operations in the MySQL manual under the “ALTER DATABASE” and “ALTER TABLE” headings.

Beginning with MySQL 5.1, you can also change the name of a database using the new RENAME DATABASE command:

mysql> RENAME DATABASE old_database_name new_database_name;

Query OK, 0 rows affected (0.01 sec)

Deleting Structures

In the previous section, we showed how you can delete columns and rows from a database; now we’ll describe how to remove databases and tables.

Dropping Databases

Removing, or dropping, a database is straightforward. Here’s how you drop the music database:

mysql> DROP DATABASE music;

Query OK, 4 rows affected (0.01 sec)

The number of rows returned in the response is the number of tables removed. You should take care when dropping a database, since all its tables, indexes, and columns are deleted, as are all the associated disk-based files and directories that MySQL uses to maintain them.

If a database doesn’t exist, trying to drop it causes MySQL to report an error. Let’s try dropping the music database again:

mysql> DROP DATABASE music;

ERROR 1008 (HY000): Can't drop database 'music'; database doesn't exist

You can avoid the error, which is useful when including the statement in a script, by using the IF EXISTS phrase:

mysql> DROP DATABASE IF EXISTS music;

Query OK, 0 rows affected, 1 warning (0.00 sec)

You can see that a warning is reported, since the music database has already been dropped. You can always check what the warning was with the SHOW WARNINGS statement, which has been available since MySQL 4.1.0:

mysql> SHOW WARNINGS;

+-------+------+-----------------------------------------------------+

| Level | Code | Message |

+-------+------+-----------------------------------------------------+

| Note | 1008 | Can't drop database 'music'; database doesn't exist |

+-------+------+-----------------------------------------------------+

1 row in set (0.00 sec)

The warning is also generated with the error if you leave out the IF EXISTS clause.

Removing Tables

Removing tables is as easy as removing a database. Let’s create and remove a table from the music database:

mysql> CREATE TABLE temp (temp INT(3), PRIMARY KEY (temp));

Query OK, 0 rows affected (0.00 sec)

mysql> DROP TABLE temp;

Query OK, 0 rows affected (0.00 sec)

Don’t worry: the 0 rows affected message is misleading. You’ll find the table is definitely gone.

You can use the IF EXISTS phrase to prevent errors. Let’s try dropping the temp table again:

mysql> DROP TABLE IF EXISTS temp;

Query OK, 0 rows affected, 1 warning (0.00 sec)

Again, you can investigate the warning indicates with the SHOW WARNINGS statement:

mysql> SHOW WARNINGS;

+-------+------+----------------------+

| Level | Code | Message |

+-------+------+----------------------+

| Note | 1051 | Unknown table 'temp' |

+-------+------+----------------------+

1 row in set (0.00 sec)

You can drop more than one table in a single statement by separating table names with commas:

mysql> DROP TABLE IF EXISTS temp, temp1, temp2;

Query OK, 0 rows affected, 3 warnings (0.00 sec)

You can see three warnings because none of these tables existed.

Exercises

All exercises here concern the music database. You’ll find that the CREATE TABLE statements in The Sample Music Database” are a useful reference.

1. You’ve decided to store more information about artists and albums. Specifically, for artists, you want to store the names of people who have worked with the artist (for example, vocalists, guitarists, trumpeters, and drummers), when they began working with the artist, and when they stopped working with the artist (if they have done so).

For albums, you want to store the name of the album producer, when the album was released, and where the album was recorded. Design tables or columns that can store this information, and explain the advantages and disadvantages of your design. Choose the column types you need, explaining the advantages and disadvantages of your choices.

2. There are five types for storing temporal data: DATETIME, DATE, TIME, YEAR, and TIMESTAMP. Explain what each is used for, and give an example of a situation in which you would choose to use it.

3. You’ve decided to use the AUTO_INCREMENT feature. List the three requirements that must be met by the column you’re applying it to.

4. Why can only one column in a table have the AUTO_INCREMENT feature?

5. Using the monitor, create a table with the following statement:

mysql> CREATE TABLE exercise (field1 INT(3));

Using the ALTER TABLE statement, make field1 the primary key, carrying out any additional steps you need to make this possible. Add a second column, field2, of type CHAR(64) with a DEFAULT 5 clause. Create an index on a prefix of 10 characters from field2.