MySQL Cookbook (2007)

Chapter 4. Table Management

Introduction

This chapter covers topics that relate to creating and populating tables:

§ Cloning a table

§ Copying one table to another

§ Using temporary tables

§ Generating unique table names

§ Determining what storage engine a table uses or converting a table to use a different storage engine

To create and load the mail table used for examples in this chapter, change location into the tables directory of the recipes distribution, and run this command:

%mysql cookbook < mail.sql

Cloning a Table

Problem

You need into create a table that has exactly the same structure as an existing table.

Solution

Use CREATE TABLE ... LIKE to clone the table structure. If it’s also necessary to copy some or all of the rows from the original table to the new one, use INSERT INTO ... SELECT.

Discussion

When you need to create a new table that is just like an existing table, use this statement:

CREATE TABLEnew_table LIKE original_table;

The structure of the new table will be exactly the same as that of the original table, with a few exceptions: CREATE TABLE ... LIKE does not copy foreign key definitions, and it doesn’t copy any DATA DIRECTORY or INDEX DIRECTORY table options that the table might use.

The new table will be empty. If you also need to copy the rows from the original table to the new table, use an INSERT INTO ... SELECT statement:

INSERT INTOnew_table SELECT * FROM original_table;

To copy only part of the table, add an appropriate WHERE clause that identifies which rows to copy. For example, these statements create a copy of the mail table named mail2 and populate it with the rows only for mail sent by barb:

mysql>CREATE TABLE mail2 LIKE mail;

mysql> INSERT INTO mail2 SELECT * FROM mail WHERE srcuser = 'barb';

mysql> SELECT * FROM mail2;

+---------------------+---------+---------+---------+---------+-------+

+---------------------+---------+---------+---------+---------+-------+

+---------------------+---------+---------+---------+---------+-------+

For more information about INSERT ... SELECT, see Saving a Query Result in a Table.

Saving a Query Result in a Table

Problem

You want to save the result from a SELECT statement into a table rather than display it.

Solution

If the table already exists, just use INSERT INTO ... SELECT to retrieve rows into it. If the table does not exist yet, use CREATE TABLE ... SELECT to create it on the fly from the SELECT result.

Discussion

The MySQL server normally returns the result of a SELECT statement to the client that issued the statement. For example, when you issue a statement from within the mysql program, the server returns the result to mysql, which in turn displays it to you on the screen. It’s also possible to save the results of a SELECT statement in a table, which is useful in a number of ways:

§ You can easily create a complete or partial copy of a table. If you’re developing an algorithm that modifies a table, it’s safer to work with a copy of a table so that you need not worry about the consequences of mistakes. Also, if the original table is large, creating a partial copy can speed the development process because queries run against it will take less time.

§ For a data-loading operation based on information that might be malformed, you can load new rows into a temporary table, perform some preliminary checks, and correct the rows as necessary. When you’re satisfied that the new rows are okay, copy them from the temporary table into your main table.

§ Some applications maintain a large repository table and a smaller working table into which rows are inserted on a regular basis, copying the working table rows to the repository periodically and clearing the working table.

§ If you’re performing a number of similar summary operations on a large table, it may be more efficient to select summary information once into a second table and use that for further analysis, rather than run expensive summary operations repeatedly on the original table.

This section shows how to retrieve a result set into a table. The table names src_tbl and dst_tbl in the examples refer to the source table from which rows are selected and the destination table into which they are stored, respectively.

If the destination table already exists, use INSERT ... SELECT to copy the result set into it. For example, if dst_tbl contains an integer column i and a string column s, the following statement copies rows from src_tbl into dst_tbl, assigning column val to i and column name to s:

INSERT INTO dst_tbl (i, s) SELECT val, name FROM src_tbl;

The number of columns to be inserted must match the number of selected columns, and the correspondence between sets of columns is established by position rather than name. In the special case that you want to copy all columns from one table to another, you can shorten the statement to this form:

INSERT INTO dst_tbl SELECT * FROM src_tbl;

To copy only certain rows, add a WHERE clause that selects those rows:

INSERT INTO dst_tbl SELECT * FROM src_tbl

WHERE val > 100 AND name LIKE 'A%';

The SELECT statement can produce values from expressions, too. For example, the following statement counts the number of times each name occurs in src_tbl and stores both the counts and the names in dst_tbl:

INSERT INTO dst_tbl (i, s) SELECT COUNT(*), name

FROM src_tbl GROUP BY name;

If the destination table does not exist, you can create it first with a CREATE TABLE statement, and then copy rows into it with INSERT ... SELECT. A second option is to use CREATE TABLE ... SELECT, which creates the destination table directly from the result of the SELECT. For example, to create dst_tbl and copy the entire contents of src_tbl into it, do this:

CREATE TABLE dst_tbl SELECT * FROM src_tbl;

MySQL creates the columns in dst_tbl based on the name, number, and type of the columns in src_tbl. Should you want to copy only certain rows, add an appropriate WHERE clause. To create an empty table, use a WHERE clause that is always false:

CREATE TABLE dst_tbl SELECT * FROM src_tbl WHERE 0;

To copy only some of the columns, name the ones you want in the SELECT part of the statement. For example, if src_tbl contains columns a, b, c, and d, you can copy just b and d like this:

CREATE TABLE dst_tbl SELECT b, d FROM src_tbl;

To create columns in a different order from that in which they appear in the source table, name them in the desired order. If the source table contains columns a, b, and c, but you want them to appear in the destination table in the order c, a, and b, do this:

CREATE TABLE dst_tbl SELECT c, a, b FROM src_tbl;

To create additional columns in the destination table besides those selected from the source table, provide appropriate column definitions in the CREATE TABLE part of the statement. The following statement creates id as an AUTO_INCREMENT column in dst_tbl and adds columns a, b, andc from src_tbl:

CREATE TABLE dst_tbl

(

id INT NOT NULL AUTO_INCREMENT,

PRIMARY KEY (id)

)

SELECT a, b, c FROM src_tbl;

The resulting table contains four columns in the order id, a, b, c. Defined columns are assigned their default values. This means that id, being an AUTO_INCREMENT column, will be assigned successive sequence numbers starting from one. (See Creating a Sequence Column and Generating Sequence Values.)

If you derive a column’s values from an expression, it’s prudent to provide an alias to give the column a name. Suppose that src_tbl contains invoice information listing items in each invoice. The following statement then generates a summary of each invoice named in the table, along with the total cost of its items. The second column includes an alias because the default name for an expression is the expression itself, which is a difficult name to work with later:

CREATE TABLE dst_tbl

SELECT inv_no, SUM(unit_cost*quantity) AS total_cost

FROM src_tbl

GROUP BY inv_no;

CREATETABLE ... SELECT is extremely convenient, but it does have some limitations. These stem primarily from the fact that the information available from a result set is not as extensive as what you can specify in a CREATETABLE statement. If you derive a table column from an expression, for example, MySQL has no idea whether the column should be indexed or what its default value is. If it’s important to include this information in the destination table, use the following techniques:

§ To make the destination table an exact copy of the source table, use the cloning technique described in Cloning a Table.

§ If you want indexes in the destination table, you can specify them explicitly. For example, if src_tbl has a PRIMARY KEY on the id column, and a multiple-column index on state and city, you can specify them for dst_tbl as well:

§ CREATE TABLE dst_tbl (PRIMARY KEY (id), INDEX(state,city))

SELECT * FROM src_tbl;

§ Column attributes such as AUTO_INCREMENT and a column’s default value are not copied to the destination table. To preserve these attributes, create the table, and then use ALTER TABLE to apply the appropriate modifications to the column definition. For example, if src_tbl has an idcolumn that is not only a PRIMARY KEY but an AUTO_INCREMENT column, copy the table, and then modify it:

§ CREATE TABLE dst_tbl (PRIMARY KEY (id)) SELECT * FROM src_tbl;

ALTER TABLE dst_tbl MODIFY id INT UNSIGNED NOT NULL AUTO_INCREMENT;

Creating Temporary Tables

Problem

You need a table only for a short time, and then you want it to disappear automatically.

Solution

Create a TEMPORARY table, and let MySQL take care of removing it.

Discussion

Some operations require a table that exists only temporarily and that should disappear when it’s no longer needed. You can of course issue a DROP TABLE statement explicitly to remove a table when you’re done with it. Another option is to use CREATE TEMPORARY TABLE. This statement is just like CREATE TABLE except that it creates a transient table that disappears when your connection to the server closes, if you haven’t already removed it yourself. This is extremely useful behavior because you need not remember to remove the table. MySQL drops it for you automatically.TEMPORARY can be used with the usual table-creation methods:

§ Create the table from explicit column definitions:

CREATE TEMPORARY TABLEtbl_name (...column definitions...);

§ Create the table from an existing table:

CREATE TEMPORARY TABLEnew_table LIKE original_table;

§ Create the table on the fly from a result set:

CREATE TEMPORARY TABLEtbl_name SELECT ... ;

Temporary tables are connection-specific, so several clients each can create a temporary table having the same name without interfering with each other. This makes it easier to write applications that use transient tables, because you need not ensure that the tables have unique names for each client. (See Generating Unique Table Names for further discussion of table-naming issues.)

Another property of temporary tables is that they can be created with the same name as a permanent table. In this case, the temporary table “hides” the permanent table for the duration of its existence, which can be useful for making a copy of a table that you can modify without affecting the original by mistake. The DELETE statement in the following set of statements removes rows from a temporary mail table, leaving the original permanent one unaffected:

mysql>CREATE TEMPORARY TABLE mail SELECT * FROM mail;

mysql> SELECT COUNT(*) FROM mail;

+----------+

| COUNT(*) |

+----------+

| 16 |

+----------+

mysql> DELETE FROM mail;

mysql> SELECT COUNT(*) FROM mail;

+----------+

| COUNT(*) |

+----------+

| 0 |

+----------+

mysql> DROP TABLE mail;

mysql> SELECT COUNT(*) FROM mail;

+----------+

| COUNT(*) |

+----------+

| 16 |

+----------+

Although temporary tables created with CREATE TEMPORARY TABLE have the preceding benefits, keep the following caveats in mind:

§ To reuse the temporary table within a given session, you’ll still need to drop it explicitly before recreating it. It’s only the last use within a session that you need no explicit DROP TABLE for. (If you’ve already created a temporary table with a given name, attempting to create a second one with that name results in an error.)

§ If you modify a temporary table that “hides” a permanent table with the same name, be sure to test for errors resulting from dropped connections if you’re using a programming interface that has reconnect capability enabled. If a client program automatically reconnects after it detects a dropped connection, you’ll be modifying the permanent table after the reconnect, not the temporary table.

§ Some APIs support persistent connections or connection pools. Use of these prevents temporary tables from being dropped as you expect when your script ends because the connection remains open for reuse by other scripts. Your script has no control over when the connection closes. This means it can be prudent to issue the following statement prior to creating a temporary table, just in case it’s still hanging around from the previous execution of the script:

DROP TEMPORARY TABLE IF EXISTStbl_name

The TEMPORARY keyword is useful here if the temporary table has already been dropped. It prevents the statement from dropping any permanent table that happens to have the same name.

Checking or Changing a Table’s Storage Engine

Problem

You need to check which storage engine a table uses so that you can determine what engine capabilities are applicable. Or you need to change a table’s storage engine because you realize that another engine has capabilities that are more desirable for the way you use the table.

Solution

To determine a table’s storage engine, you can use any of several statements. To change the table’s engine, use ALTER TABLE with an ENGINE clause.

Discussion

MySQL supports several storage engines, each of which have differing characteristics. For example, the InnoDB and BDB engines support transactions, whereas MyISAM does not. If you need to know whether a table supports transactions, check which storage engine it uses. If you need to use the table in transactions but the table’s engine does not support them, you can convert the table to use a transaction-capable engine.

To determine the current engine for a table, check INFORMATION_SCHEMA or use the SHOW TABLE STATUS or SHOW CREATE TABLE statement. For the mail table, obtain engine information as follows:

mysql>SELECT ENGINE FROM INFORMATION_SCHEMA.TABLES

-> WHERE TABLE_SCHEMA = 'cookbook' AND TABLE_NAME = 'mail';

+--------+

| ENGINE |

+--------+

| MyISAM |

+--------+

mysql> SHOW TABLE STATUS LIKE 'mail'\G

*************************** 1. row ***************************

Name: mail

Engine: MyISAM

...

mysql> SHOW CREATE TABLE mail\G

*************************** 1. row ***************************

Table: mail

Create Table: CREATE TABLE `mail` (

`t` datetime DEFAULT NULL,

`srcuser` char(8) DEFAULT NULL,

`srchost` char(20) DEFAULT NULL,

`dstuser` char(8) DEFAULT NULL,

`dsthost` char(20) DEFAULT NULL,

`size` bigint(20) DEFAULT NULL,

KEY `t` (`t`)

) ENGINE=MyISAM DEFAULT CHARSET=latin1

To change the storage engine used for a table, use ALTER TABLE with an ENGINE specifier. For example, to convert the mail table to use the InnoDB storage engine, use this statement:

ALTER TABLE mail ENGINE = InnoDB;

Be aware that converting a large table to a different storage engine might take a long time and be expensive in terms of CPU and I/O activity.

See Also

To determine which storage engines are supported by your MySQL server, see Determining Which Storage Engines the Server Supports.

Generating Unique Table Names

Problem

You need to create a table with a name that is guaranteed not to exist already.

Solution

If you can create a TEMPORARY table, it doesn’t matter if the name exists already. Otherwise, try to generate a value that is unique to your client program and incorporate it into the table name.

Discussion

MySQL is a multiple-client database server, so if a given script that creates a transient table might be invoked by several clients simultaneously, you must take care to keep multiple invocations of the script from fighting over the same table name. If the script creates tables using CREATETEMPORARY TABLE, there is no problem because different clients can create temporary tables having the same name without clashing.

If you do not want to use a TEMPORARY table, you should make sure that each invocation of the script creates a uniquely named table and that it drops that table when it is no longer needed. To accomplish this, incorporate into the name some value that is guaranteed to be unique per invocation. A timestamp won’t work, because it’s easily possible for two instances of a script to be invoked within the same second. A random number may be somewhat better. For example, in Java, you can use the java.util.Random() class to create a table name like this:

import java.util.Random;

import java.lang.Math;

Random rand = new Random ();

int n = rand.nextInt (); // generate random number

n = Math.abs (n); // take absolute value

String tblName = "tmp_tbl_" + n;

Unfortunately, random numbers only reduce the possibility of name clashes, they do not eliminate it. Process ID (PID) values are a better source of unique values. PIDs are reused over time, but never for two processes at the same time, so a given PID is guaranteed to be unique among the set of currently executing processes. You can use this fact to create unique table names as follows:

Perl:

my $tbl_name = "tmp_tbl_$$";

Ruby:

tbl_name = "tmp_tbl_" + Process.pid.to_s

PHP:

$tbl_name = "tmp_tbl_" . posix_getpid ();

Python:

import os

tbl_name = "tmp_tbl_%d" % os.getpid ()

Note that even if you create a table name using a value such as a PID that is guaranteed to be unique to a given script invocation, there may still be a chance that the table will already exist. This can happen if a previous invocation of the script with the same PID created a table with the same name, but crashed before removing the table. On the other hand, any such table cannot still be in use because it will have been created by a process that is no longer running. Under these circumstances, it’s safe to remove the table if it does exist by issuing the following statement:

DROP TABLE IF EXISTStbl_name

Then you can go ahead and create the new table.

Connection identifiers are another source of unique values. The MySQL server reuses these numbers over time, but no two simultaneous connections to the server have the same ID. To get your connection ID, execute this statement, and retrieve the result:

SELECT CONNECTION_ID();

Some MySQL APIs expose the connection ID directly without requiring any statement to be executed. For example, in Perl DBI, use the mysql_thread_id attribute of your database handle:

my $tbl_name = "tmp_tbl_" . $dbh->{mysql_thread_id};

In Ruby DBI, do this:

tbl_name = "tmp_tbl_" + dbh.func(:thread_id).to_s