Obtaining and Using Metadata - MySQL Cookbook (2007)

MySQL Cookbook (2007)

Chapter 9. Obtaining and Using Metadata

Introduction

Most of the SQL statements used so far have been written to work with the data stored in the database. That is, after all, what the database is designed to hold. But sometimes you need more than just data values. You need information that characterizes or describes those values—that is, the statement metadata. Metadata information is used most often in relation to processing result sets, but also is available for other aspects of your interaction with MySQL. This chapter describes how to obtain and use the following types of metadata:

Information about statement results

For statements that delete or update rows, you can determine how many rows were changed. For a SELECT statement, you can find out the number of columns in the result set, as well as information about each column in the result set, such as the column name and its display width. Such information often is essential for processing the results. For example, if you’re formatting a tabular display, you can determine how wide to make each column and whether to justify values to the left or right.

Information about tables and databases

Information pertaining to the structure of tables and databases is useful for applications that need to enumerate a list of tables in a database or databases hosted on a server (for example, to present a display allowing the user to select one of the available choices). You can also use this information to determine whether tables or databases exist. Another use for table metadata is to determine the legal values for ENUM or SET columns.

Information about the MySQL server

Some APIs provide information about the database server or about the status of your current connection with the server. Knowing the server version can be useful for determining whether it supports a given feature, which helps you build adaptive applications. Information about the connection includes such values as the current user and the default database.

In general, metadata information is closely tied to the implementation of the database system, so it tends to be somewhat database-dependent. This means that if an application uses techniques shown in this chapter, it might need some modification if you port it to other database systems. For example, lists of tables and databases in MySQL are available by issuing SHOW statements. However, SHOW is a MySQL-specific extension to SQL, so even if you’re using an API like Perl or Ruby DBI, PEAR DB, DB-API, or JDBC that gives you a database-independent way of issuing statements, the SQL itself is database-specific and will need to be changed to work with other engines.

A more portable source of metadata is the INFORMATION_SCHEMA database, which is available as of MySQL 5.0. This metadata database contains information about databases, tables, columns, character sets, and so forth. INFORMATION_SCHEMA has some advantages over SHOW:

§ Other database systems support INFORMATION_SCHEMA, so applications that use it are likely to be more portable than those that use SHOW statements.

§ INFORMATION_SCHEMA is used with standardSELECT syntax, so it’s more similar to other data-retrieval operations than SHOW statements.

Because of those advantages, recipes in this chapter use INFORMATION_SCHEMA rather than SHOW when possible.

A disadvantage of INFORMATION_SCHEMA is that statements to access it are more verbose than the corresponding SHOW statements. That doesn’t matter so much when you’re writing programs, but for interactive use, SHOW statements can be more attractive because they require less typing. And if you haven’t yet upgraded to MySQL 5.0 or higher, SHOW is your only choice.

The scripts containing the code for the examples in this chapter are found in the metadata directory of the recipes distribution. (Some of them use utility functions located in the lib directory.) To create any tables that you need for trying the examples, use the scripts in the tables directory.

As already indicated, recipes developed in this chapter tend to use INFORMATION_SCHEMA rather than SHOW. If you have a version of MySQL older than 5.0, the recipes distribution from the first edition of MySQL Cookbook might be helpful. That distribution used SHOW statements because INFORMATION_SCHEMA did not exist in MySQL then. Briefly, the SHOW statements that provide information similar to the contents of certain INFORMATION_SCHEMA tables are listed in the following table:

INFORMATION_SCHEMA table

SHOW statement

SCHEMATA

SHOW DATABASES

TABLES

SHOW TABLES

COLUMNS

SHOW COLUMNS

Obtaining the Number of Rows Affected by a Statement

Problem

You want to know how many rows were changed by an SQL statement.

Solution

Sometimes the row count is the return value of the function that issues the statement. Other times the count is returned by a separate function that you call after issuing the statement.

Discussion

For statements that affect rows (UPDATE, DELETE, INSERT, REPLACE), each API provides a way to determine the number of rows involved. For MySQL, the default meaning of “affected by” is “changed by,” not “matched by.” That is, rows that are not changed by a statement are not counted, even if they match the conditions specified in the statement. For example, the following UPDATE statement results in an “affected by” value of zero because it does not change any columns from their current values, no matter how many rows the WHERE clause matches:

UPDATE limbs SET arms = 0 WHERE arms = 0;

The MySQL server allows a client to set a flag when it connects to indicate that it wants rows-matched counts, not rows-changed counts. In this case, the row count for the preceding statement would be equal to the number of rows with an arms value of 0, even though the statement results in no net change to the table. However, not all MySQL APIs expose this flag. The following discussion indicates which APIs enable you to select the type of count you want and which use the rows-matched count by default rather than the rows-changed count.

Perl

In Perl DBI scripts, the row count for statements that modify rows is returned by do():

my $count = $dbh->do ($stmt);

# report 0 rows if an error occurred

printf "Number of rows affected: %d\n", (defined ($count) ? $count : 0);

If you prepare a statement first and then execute it, execute() returns the row count:

my $sth = $dbh->prepare ($stmt);

my $count = $sth->execute ();

printf "Number of rows affected: %d\n", (defined ($count) ? $count : 0);

You can tell MySQL whether to return rows-changed or rows-matched counts by specifying mysql_client_found_rows in the options part of the data source name argument of the connect() call when you connect to the MySQL server. Set the option to 0 for rows-changed counts and 1 for rows-matched counts. Here’s an example:

my %conn_attrs = (PrintError => 0, RaiseError => 1, AutoCommit => 1);

my $dsn = "DBI:mysql:cookbook:localhost;mysql_client_found_rows=1";

my $dbh = DBI->connect ($dsn, "cbuser", "cbpass", \%conn_attrs);

mysql_client_found_rows changes the row-reporting behavior for the duration of the connection.

Although the default behavior for MySQL itself is to return rows-changed counts, recent versions of the Perl DBI driver for MySQL automatically request rows-matched counts unless you specify otherwise. For applications that depend on a particular behavior, it’s best to explicitly set themysql_client_found_rows option in the DSN to the appropriate value.

Ruby

For statements that modify rows, Ruby DBI returns row counts similarly to Perl DBI scripts for the do method. That is, do itself returns the count:

count = dbh.do(stmt)

puts "Number of rows affected: #{count}"

If you use execute to execute a statement, execute does not return the row count. Instead, use the statement handle rows method to get the count after executing the statement:

sth = dbh.execute(stmt)

puts "Number of rows affected: #{sth.rows}"

The Ruby DBI driver for MySQL returns rows-changed counts by default, but the driver supports a mysql_client_found_rows option that enables you to control whether the server returns rows-changed or rows-matched counts. Its use is analogous to Perl DBI. For example, to request rows-matched counts, do this:

dsn = "DBI:Mysql:database=cookbook;host=localhost;mysql_client_found_rows=1"

dbh = DBI.connect(dsn, "cbuser", "cbpass")

PHP

In PHP, invoke the connection object’s affectedRows() method to find out how many rows a statement changed:

$result =& $conn->query ($stmt);

# report 0 rows if the statement failed

$count = (PEAR::isError ($result) ? 0 : $conn->affectedRows ());

print ("Number of rows affected: $count\n");

Python

Python’s DB-API makes the rows-changed count available as the value of the statement cursor’s rowcount attribute:

cursor = conn.cursor ()

cursor.execute (stmt)

print "Number of rows affected: %d" % cursor.rowcount

To obtain rows-matched counts instead, import the MySQLdb client constants and pass the FOUND_ROWS flag in the client_flag parameter of the connect() method:

import MySQLdb.constants.CLIENT

conn = MySQLdb.connect (db = "cookbook",

host = "localhost",

user = "cbuser",

passwd = "cbpass",

client_flag = MySQLdb.constants.CLIENT.FOUND_ROWS)

Java

For statements that modify rows, the MySQL Connector/J JDBC driver provides rows-matched counts rather than rows-changed counts. This is done for conformance with the JDBC specification.

The Java JDBC interface provides row counts two different ways, depending on the method you invoke to execute the statement. If you use executeUpdate(), the row count is its return value:

Statement s = conn.createStatement ();

int count = s.executeUpdate (stmt);

s.close ();

System.out.println ("Number of rows affected: " + count);

If you use execute(), that method returns true or false to indicate whether the statement produces a result set. For statements such as UPDATE or DELETE that return no result set, execute() returns false and the row count is available by calling the getUpdateCount() method:

Statement s = conn.createStatement ();

if (!s.execute (stmt))

{

// there is no result set, print the row count

System.out.println ("Number of rows affected: " + s.getUpdateCount ());

}

s.close ();

Obtaining Result Set Metadata

Problem

You already know how to retrieve the rows of a result set (Issuing Statements and Retrieving Results). Now you want to know things about the result set, such as the column names and data types, or how many rows and columns there are.

Solution

Use the appropriate capabilities provided by your API.

Discussion

For statements such as SELECT that generate a result set, you can get a number of types of metadata. This section discusses the information provided by each API, using programs that show how to display the result set metadata available after issuing a sample statement (SELECT name,foods FROM profile). One of the simplest uses for this information is illustrated by several of the example programs: when you retrieve a row of values from a result set and you want to process them in a loop, the column count stored in the metadata serves as the upper bound on the loop iterator.

Perl

Using the Perl DBI interface, you can obtain result sets two ways. These differ in the scope of result set metadata available to your scripts:

Process the statement using a statement handle

In this case, you invoke prepare() to get the statement handle. This handle has an execute() method that you invoke to generate the result set, and then you fetch the rows in a loop. With this approach, access to the metadata is available while the result set is active—that is, after the call to execute() and until the end of the result set is reached. When the row-fetching method finds that there are no more rows, it invokes finish() implicitly, which causes the metadata to become unavailable. (That also happens if you explicitly call finish() yourself.) Thus, normally it’s best to access the metadata immediately after calling execute(), making a copy of any values that you’ll need to use beyond the end of the fetch loop.

Process the statement using a database handle method that returns the result set in a single operation

With this method, any metadata generated while processing the statement will have been disposed of by the time the method returns, although you can still determine the number of rows and columns from the size of the result set.

When you use the statement handle approach to process a statement, DBI makes result set metadata available after you invoke the handle’s execute() method. This information is available primarily in the form of references to arrays. For each such type of metadata, the array has one element per column in the result set. Array references are accessed as attributes of the statement handle. For example, $sth->{NAME} points to the column name array, with individual column names available as elements of this array:

$name = $sth->{NAME}->[$i];

Or you can access the entire array like this:

@names = @{$sth->{NAME}};

The following table lists the attribute names through which you access array-based metadata and the meaning of values in each array. Names that begin with uppercase are standard DBI attributes and should be available for most database engines. Attribute names that begin with mysql_ are MySQL-specific and nonportable; the kinds of information they provide might be available in other database systems, but under different attribute names.

Attribute name

Array element meaning

NAME

Column name

NAME_lc

Column name in lowercase

NAME_uc

Column name in uppercase

NULLABLE

0 or empty string = column values cannot be NULL

1 = column values can be NULL

2 = unknown

PRECISION

Column width

SCALE

Number of decimal places (for numeric columns)

TYPE

Data type (numeric DBI code)

mysql_is_blob

True if column has a BLOB (or TEXT) type

mysql_is_key

True if column is part of a key

mysql_is_num

True if column has a numeric type

mysql_is_pri_key

True if column is part of a primary key

mysql_max_length

Actual maximum length of column values in result set

mysql_table

Name of table the column is part of

mysql_type

Data type (numeric internal MySQL code)

mysql_type_name

Data type name

Some types of metadata, listed in the following table, are accessible as references to hashes rather than arrays. These hashes have one element per column value. The element key is the column name and its value is the position of the column within the result set. For example:

$col_pos = $sth->{NAME_hash}->{col_name};

Attribute name

Hash element meaning

NAME_hash

Column name

NAME_hash_lc

Column name in lowercase

NAME_hash_uc

Column name in uppercase

The number of columns in a result set and the number of placeholders in a prepared statement are available as scalar values:

$num_cols = $sth->{NUM_OF_FIELDS};

$num_placeholders = $sth->{NUM_OF_PARAMS};

Here’s some example code that shows how to execute a statement and display result set metadata:

my $stmt = "SELECT name, foods FROM profile";

printf "Statement: %s\n", $stmt;

my $sth = $dbh->prepare ($stmt);

$sth->execute();

# metadata information becomes available at this point ...

printf "NUM_OF_FIELDS: %d\n", $sth->{NUM_OF_FIELDS};

print "Note: statement has no result set\n" if $sth->{NUM_OF_FIELDS} == 0;

for my $i (0 .. $sth->{NUM_OF_FIELDS}-1)

{

printf "--- Column %d (%s) ---\n", $i, $sth->{NAME}->[$i];

printf "NAME_lc: %s\n", $sth->{NAME_lc}->[$i];

printf "NAME_uc: %s\n", $sth->{NAME_uc}->[$i];

printf "NULLABLE: %s\n", $sth->{NULLABLE}->[$i];

printf "PRECISION: %s\n", $sth->{PRECISION}->[$i];

printf "SCALE: %s\n", $sth->{SCALE}->[$i];

printf "TYPE: %s\n", $sth->{TYPE}->[$i];

printf "mysql_is_blob: %s\n", $sth->{mysql_is_blob}->[$i];

printf "mysql_is_key: %s\n", $sth->{mysql_is_key}->[$i];

printf "mysql_is_num: %s\n", $sth->{mysql_is_num}->[$i];

printf "mysql_is_pri_key: %s\n", $sth->{mysql_is_pri_key}->[$i];

printf "mysql_max_length: %s\n", $sth->{mysql_max_length}->[$i];

printf "mysql_table: %s\n", $sth->{mysql_table}->[$i];

printf "mysql_type: %s\n", $sth->{mysql_type}->[$i];

printf "mysql_type_name: %s\n", $sth->{mysql_type_name}->[$i];

}

$sth->finish (); # release result set because we didn't fetch its rows

If you use the preceding code to execute the statement SELECT name, foods FROM profile, the output looks like this:

Statement: SELECT name, foods FROM profile

NUM_OF_FIELDS: 2

--- Column 0 (name) ---

NAME_lc: name

NAME_uc: NAME

NULLABLE:

PRECISION: 20

SCALE: 0

TYPE: 1

mysql_is_blob:

mysql_is_key:

mysql_is_num: 0

mysql_is_pri_key:

mysql_max_length: 7

mysql_table: profile

mysql_type: 254

mysql_type_name: char

--- Column 1 (foods) ---

NAME_lc: foods

NAME_uc: FOODS

NULLABLE: 1

PRECISION: 42

SCALE: 0

TYPE: 1

mysql_is_blob:

mysql_is_key:

mysql_is_num: 0

mysql_is_pri_key:

mysql_max_length: 21

mysql_table: profile

mysql_type: 254

mysql_type_name: char

To get a row count from a result set generated by calling execute(), you must fetch the rows and count them yourself. The use of $sth->rows() to get a count for SELECT statements is explicitly deprecated in the DBI documentation.

You can also obtain a result set by calling one of the DBI methods that uses a database handle rather than a statement handle, such as selectall_arrayref() or selectall_hashref(). These methods provide no access to column metadata. That information already will have been disposed of by the time the method returns, and is unavailable to your scripts. However, you can derive column and row counts by examining the result set itself. The way you do this depends on the kind of data structure a method produces. These structures and the way you use them to obtain result set row and column counts are discussed in Issuing Statements and Retrieving Results.

Ruby

Ruby DBI provides result set metadata after you execute a statement with execute, and access to metadata is possible until you invoke the statement handle finish method. The column_names method returns an array of column names (which is empty if there is no result set). If there is a result set, the column_info method returns an array of ColumnInfo objects, one for each column. A ColumnInfo object is similar to a hash and has the elements shown in the following table. Element names that begin with an underscore are MySQL-specific and may not be present for other database engines. (Most of these elements are not present unless you use Ruby DBI 0.1.1 or higher.)

Member name

Member meaning

name

Column name

sql_type

XOPEN type number

type_name

XOPEN type name

precision

Column width

scale

Number of decimal places (for numeric columns)

nullable

True if column allows NULL values

indexed

True if column is indexed

primary

True if column is part of a primary key

unique

True if column is part of a unique index

mysql_type

Data type (numeric internal MySQL code)

mysql_type_name

Data type name

mysql_length

Column width

mysql_max_length

Actual maximum length of column values in result set

mysql_flags

Data type flags

Here’s some example code that shows how to execute a statement and display the result set metadata values for the columns:

stmt = "SELECT name, foods FROM profile"

puts "Statement: " + stmt

sth = dbh.execute(stmt)

# metadata information becomes available at this point ...

puts "Number of columns: #{sth.column_names.size}"

puts "Note: statement has no result set" if sth.column_names.size == 0

sth.column_info.each_with_index do |info, i|

printf "--- Column %d (%s) ---\n", i, info["name"]

printf "sql_type: %s\n", info["sql_type"]

printf "type_name: %s\n", info["type_name"]

printf "precision: %s\n", info["precision"]

printf "scale: %s\n", info["scale"]

printf "nullable: %s\n", info["nullable"]

printf "indexed: %s\n", info["indexed"]

printf "primary: %s\n", info["primary"]

printf "unique: %s\n", info["unique"]

printf "mysql_type: %s\n", info["mysql_type"]

printf "mysql_type_name: %s\n", info["mysql_type_name"]

printf "mysql_length: %s\n", info["mysql_length"]

printf "mysql_max_length: %s\n", info["mysql_max_length"]

printf "mysql_flags: %s\n", info["mysql_flags"]

end

sth.finish

If you use the preceding code to execute the statement SELECT name, foods FROM profile, the output looks like this:

Statement: SELECT name, foods FROM profile

Number of columns: 2

--- Column 0 (name) ---

sql_type: 12

type_name: VARCHAR

precision: 20

scale: 0

nullable: false

indexed: false

primary: false

unique: false

mysql_type: 254

mysql_type_name: VARCHAR

mysql_length: 20

mysql_max_length: 7

mysql_flags: 4097

--- Column 1 (foods) ---

sql_type: 12

type_name: VARCHAR

precision: 42

scale: 0

nullable: true

indexed: false

primary: false

unique: false

mysql_type: 254

mysql_type_name: VARCHAR

mysql_length: 42

mysql_max_length: 21

mysql_flags: 2048

To get a row count from a result set generated by calling execute, fetch the rows and count them yourself. The sth.rows method is not guaranteed to work for result sets.

You can also obtain a result set by calling one of the DBI methods that uses a database handle rather than a statement handle, such as select_one or select_all. These methods provide no access to column metadata. That information already will have been disposed of by the time the method returns, and is unavailable to your scripts. However, you can derive column and row counts by examining the result set itself.

PHP

In PHP, metadata information for SELECT statements is available from PEAR DB after a successful call to query() and remains accessible up to the point at which you free the result set.

To check whether metadata is available, verify that the query result is a result set, and then pass it to the connection object tableInfo() method , which returns a structure containing an array of column information. Each array element contains the members shown in the following table.

Member name

Member meaning

name

Column name

type

Data type name

len

Data length

flags

Data type flags

PEAR DB also makes row and column counts for a result set available via the result’s numRows() and numCols() methods.

The following code shows how to access and display result set metadata:

$stmt = "SELECT name, foods FROM profile";

print ("Statement: $stmt\n");

$result =& $conn->query ($stmt);

if (PEAR::isError ($result))

die ("Statement failed\n");

# metadata information becomes available at this point ...

if (is_a ($result, "DB_result")) # statement generates a result set

{

$nrows = $result->numRows ();

$ncols = $result->numCols ();

$info =& $conn->tableInfo ($result);

}

else # statement generates no result set

{

$nrows = 0;

$ncols = 0;

}

print ("Number of rows: $nrows\n");

print ("Number of columns: $ncols\n");

if ($ncols == 0)

print ("Note: statement has no result set\n");

for ($i = 0; $i < $ncols; $i++)

{

$col_info = $info[$i];

printf ("--- Column %d (%s) ---\n", $i, $col_info["name"]);

printf ("type: %s\n", $col_info["type"]);

printf ("len: %s\n", $col_info["len"]);

printf ("flags: %s\n", $col_info["flags"]);

}

if ($ncols > 0) # dispose of result set, if there is one

$result->free ();

The output from the program looks like this:

Statement: SELECT name, foods FROM profile

Number of rows: 10

Number of columns: 2

--- Column 0 (name) ---

type: char

len: 7

flags: not_null

--- Column 1 (foods) ---

type: char

len: 21

flags: set

Python

For statements that produce a result set, Python’s DB-API makes row and column counts available, as well as a few pieces of information about individual columns.

To get the row count for a result set, access the cursor’s rowcount attribute. The column count is not available directly, but after calling fetchone() or fetchall(), you can determine the count as the length of any result set row tuple. It’s also possible to determine the column count without fetching any rows by using cursor.description. This is a tuple containing one element per column in the result set, so its length tells you how many columns are in the set. (If the statement generates no result set, such as for UPDATE, the value of description is None.) Each element of the description tuple is another tuple that represents the metadata for the corresponding column of the result. There are seven metadata values per column; the following code shows how to access them and what they mean:

stmt = "SELECT name, foods FROM profile"

print "Statement: ", stmt

cursor = conn.cursor ()

cursor.execute (stmt)

# metadata information becomes available at this point ...

print "Number of rows:", cursor.rowcount

if cursor.description == None: # no result set

ncols = 0

else:

ncols = len (cursor.description)

print "Number of columns:", ncols

if ncols == 0:

print "Note: statement has no result set"

for i in range (ncols):

col_info = cursor.description[i]

# print name, and then other information

print "--- Column %d (%s) ---" % (i, col_info[0])

print "Type: ", col_info[1]

print "Display size: ", col_info[2]

print "Internal size:", col_info[3]

print "Precision: ", col_info[4]

print "Scale: ", col_info[5]

print "Nullable: ", col_info[6]

cursor.close

The output from the program looks like this:

Statement: SELECT name, foods FROM profile

Number of rows: 10

Number of columns: 2

--- Column 0 (name) ---

Type: 254

Display size: 7

Internal size: 20

Precision: 20

Scale: 0

Nullable: 0

--- Column 1 (foods) ---

Type: 254

Display size: 21

Internal size: 42

Precision: 42

Scale: 0

Nullable: 1

Java

JDBC makes result set metadata available through a ResultSetMetaData object, which you obtain by calling the getMetaData() method of your ResultSet object. The metadata object provides access to several kinds of information. Its getColumnCount() method returns the number of columns in the result set. Other types of metadata, illustrated by the following code, provide information about individual columns and take a column index as their argument. For JDBC, column indexes begin at 1 rather than 0, which differs from the other APIs covered here.

String stmt = "SELECT name, foods FROM profile";

System.out.println ("Statement: " + stmt);

Statement s = conn.createStatement ();

s.executeQuery (stmt);

ResultSet rs = s.getResultSet ();

ResultSetMetaData md = rs.getMetaData ();

// metadata information becomes available at this point ...

int ncols = md.getColumnCount ();

System.out.println ("Number of columns: " + ncols);

if (ncols == 0)

System.out.println ("Note: statement has no result set");

for (int i = 1; i <= ncols; i++) // column index values are 1-based

{

System.out.println ("--- Column " + i

+ " (" + md.getColumnName (i) + ") ---");

System.out.println ("getColumnDisplaySize: " + md.getColumnDisplaySize (i));

System.out.println ("getColumnLabel: " + md.getColumnLabel (i));

System.out.println ("getColumnType: " + md.getColumnType (i));

System.out.println ("getColumnTypeName: " + md.getColumnTypeName (i));

System.out.println ("getPrecision: " + md.getPrecision (i));

System.out.println ("getScale: " + md.getScale (i));

System.out.println ("getTableName: " + md.getTableName (i));

System.out.println ("isAutoIncrement: " + md.isAutoIncrement (i));

System.out.println ("isNullable: " + md.isNullable (i));

System.out.println ("isCaseSensitive: " + md.isCaseSensitive (i));

System.out.println ("isSigned: " + md.isSigned (i));

}

rs.close ();

s.close ();

The output from the program looks like this:

Statement: SELECT name, foods FROM profile

Number of columns: 2

--- Column 1 (name) ---

getColumnDisplaySize: 20

getColumnLabel: name

getColumnType: 1

getColumnTypeName: CHAR

getPrecision: 20

getScale: 0

getTableName: profile

isAutoIncrement: false

isNullable: 0

isCaseSensitive: false

isSigned: false

--- Column 2 (foods) ---

getColumnDisplaySize: 42

getColumnLabel: foods

getColumnType: 1

getColumnTypeName: CHAR

getPrecision: 42

getScale: 0

getTableName: profile

isAutoIncrement: false

isNullable: 1

isCaseSensitive: false

isSigned: false

The row count of the result set is not available directly; you must fetch the rows and count them.

There are several other JDBC result set metadata calls, but many of them provide no useful information for MySQL. If you want to try them, get a JDBC reference to see what the calls are and modify the program to see what, if anything, they return.

Determining Whether a Statement Produced a Result Set

Problem

You just executed an SQL statement, but you’re not sure whether it produced a result set.

Solution

Check the column count in the metadata. If the count is zero, there is no result set.

Discussion

If you write an application that accepts statement strings from an external source such as a file or a user entering text at the keyboard, you may not necessarily know whether it’s a statement such as SELECT that produces a result set or a statement such as UPDATE that does not. That’s an important distinction, because you process statements that produce a result set differently from those that do not. Assuming that no error occurred, one way to tell the difference is to check the metadata value that indicates the column count after executing the statement (as shown in Obtaining Result Set Metadata). A column count of zero indicates that the statement was an INSERT, UPDATE, or some other statement that returns no result set. A nonzero value indicates the presence of a result set, and you can go ahead and fetch the rows. This technique distinguishes SELECT from non-SELECT statements, even for SELECT statements that return an empty result set. (An empty result is different from no result. The former returns no rows, but the column count is still correct; the latter has no columns at all.)

Some APIs provide ways to distinguish statement types other than checking the column count:

§ In JDBC, you can issue arbitrary statements using the execute() method, which directly indicates whether there is a result set by returning true or false.

§ In PHP, PEAR DB programs should check the result from statement-execution methods to see whether the return value is a DB_result object:

§ $result =& $conn->query ($stmt);

§ if (PEAR::isError ($result))

§ die ("Statement failed\n");

§ if (is_a ($result, "DB_result"))

§ {

§ # statement generates a result set

§ }

§ else

§ {

§ # statement generates no result set

}

Do this instead of checking the column count because attempting to invoke numCols() on a result that isn’t a DB_result object causes an error.

§ In Python, the value of cursor.description is None for statements that produce no result set.

Using Metadata to Format Query Output

Problem

You want to produce a nicely formatted result set display.

Solution

Let the result set metadata help you. It provides important information about the structure and content of the results.

Discussion

Metadata information is valuable for formatting query results, because it tells you several important things about the columns (such as the names and display widths), even if you don’t know what the query was. For example, you can write a general-purpose function that displays a result set in tabular format with no knowledge about what the query might have been. The following Java code shows one way to do this. It takes a result set object and uses it to get the metadata for the result. Then it uses both objects in tandem to retrieve and format the values in the result. The output is similar to that produced by mysql: a row of column headers followed by the rows of the result, with columns nicely boxed and lined up vertically. Here’s a sample of what the function displays, given the result set generated by the query SELECT id, name, birth FROM profile:

+----------+--------------------+----------+

|id |name |birth |

+----------+--------------------+----------+

|1 |Fred |1970-04-13|

|2 |Mort |1969-09-30|

|3 |Brit |1957-12-01|

|4 |Carl |1973-11-02|

|5 |Sean |1963-07-04|

|6 |Alan |1965-02-14|

|7 |Mara |1968-09-17|

|8 |Shepard |1975-09-02|

|9 |Dick |1952-08-20|

|10 |Tony |1960-05-01|

|11 |Juan |NULL |

+----------+--------------------+----------+

Number of rows selected: 11

The primary problem an application like this must solve is to determine the proper display width of each column. The getColumnDisplaySize() method returns the column width, but we actually need to take into consideration other pieces of information:

§ The length of the column name has to be considered (it might be longer than the column width).

§ We’ll print the word “NULL” for NULL values, so if the column can contain NULL values, the display width must be at least four.

The following Java function, displayResultSet(), formats a result set, taking the preceding factors into account. It also counts rows as it fetches them to determine the row count, because JDBC doesn’t make that value available directly from the metadata.

public static void displayResultSet (ResultSet rs) throws SQLException

{

ResultSetMetaData md = rs.getMetaData ();

int ncols = md.getColumnCount ();

int nrows = 0;

int[] width = new int[ncols + 1]; // array to store column widths

StringBuffer b = new StringBuffer (); // buffer to hold bar line

// calculate column widths

for (int i = 1; i <= ncols; i++)

{

// some drivers return -1 for getColumnDisplaySize();

// if so, we'll override that with the column name length

width[i] = md.getColumnDisplaySize (i);

if (width[i] < md.getColumnName (i).length ())

width[i] = md.getColumnName (i).length ();

// isNullable() returns 1/0, not true/false

if (width[i] < 4 && md.isNullable (i) != 0)

width[i] = 4;

}

// construct +---+---... line

b.append ("+");

for (int i = 1; i <= ncols; i++)

{

for (int j = 0; j < width[i]; j++)

b.append ("-");

b.append ("+");

}

// print bar line, column headers, bar line

System.out.println (b.toString ());

System.out.print ("|");

for (int i = 1; i <= ncols; i++)

{

System.out.print (md.getColumnName (i));

for (int j = md.getColumnName (i).length (); j < width[i]; j++)

System.out.print (" ");

System.out.print ("|");

}

System.out.println ();

System.out.println (b.toString ());

// print contents of result set

while (rs.next ())

{

++nrows;

System.out.print ("|");

for (int i = 1; i <= ncols; i++)

{

String s = rs.getString (i);

if (rs.wasNull ())

s = "NULL";

System.out.print (s);

for (int j = s.length (); j < width[i]; j++)

System.out.print (" ");

System.out.print ("|");

}

System.out.println ();

}

// print bar line, and row count

System.out.println (b.toString ());

System.out.println ("Number of rows selected: " + nrows);

}

If you want to be more elaborate, you can also test whether a column contains numeric values, and format it as right-justified if so. In Perl DBI scripts, this is easy to check, because you can access the mysql_is_num metadata attribute. For other APIs, it is not so easy unless there is some equivalent “column is numeric” metadata value available. If not, you must look at the data-type indicator to see whether it’s one of the several possible numeric types.

Another shortcoming of the displayResultSet() function is that it prints columns using the width of the column as specified in the table definition, not the maximum width of the values actually present in the result set. The latter value is often smaller. You can see this in the sample output that precedes the listing for displayResultSet(). The id and name columns are 10 and 20 characters wide, even though the widest values are only two and seven characters long, respectively. In Perl, Ruby, PHP, and DB-API, you can get the maximum width of the values present in the result set. To determine these widths in JDBC, you must iterate through the result set and check the column value lengths yourself. This requires a JDBC 2.0 driver that provides scrollable result sets. If you have such a driver (MySQL Connector/J is one), the column-width calculation code in the displayResultSet() function can be modified as follows:

// calculate column widths

for (int i = 1; i <= ncols; i++)

{

width[i] = md.getColumnName (i).length ();

// isNullable() returns 1/0, not true/false

if (width[i] < 4 && md.isNullable (i) != 0)

width[i] = 4;

}

// scroll through result set and adjust display widths as necessary

while (rs.next ())

{

for (int i = 1; i <= ncols; i++)

{

byte[] bytes = rs.getBytes (i);

if (!rs.wasNull ())

{

int len = bytes.length;

if (width[i] < len)

width[i] = len;

}

}

}

rs.beforeFirst (); // rewind result set before displaying it

With that change, the result is a more compact query result display:

+--+-------+----------+

|id|name |birth |

+--+-------+----------+

|1 |Fred |1970-04-13|

|2 |Mort |1969-09-30|

|3 |Brit |1957-12-01|

|4 |Carl |1973-11-02|

|5 |Sean |1963-07-04|

|6 |Alan |1965-02-14|

|7 |Mara |1968-09-17|

|8 |Shepard|1975-09-02|

|9 |Dick |1952-08-20|

|10|Tony |1960-05-01|

|11|Juan |NULL |

+--+-------+----------+

Number of rows selected: 11

See Also

The Ruby DBI::Utils::TableFormatter module has an ascii method that produces a formatted display much like that described in this section. Use it like this:

dbh.execute(stmt) do |sth|

DBI::Utils::TableFormatter.ascii(sth.column_names, sth.fetch_all)

end

Listing or Checking Existence of Databases or Tables

Problem

You want a list of databases hosted by the MySQL server or a list of tables in a database. Or you want to check whether a particular database or table exists.

Solution

Use INFORMATION_SCHEMA to get this information. The SCHEMATA table contains a row for each database, and the TABLES table contains a row for each table in each database.

Discussion

To retrieve the list of databases hosted by the server, use this statement:

SELECT SCHEMA_NAME FROM INFORMATION_SCHEMA.SCHEMATA;

Add an ORDER BY SCHEMA_NAME clause if you want a sorted result.

To check whether a specific database exists, use a WHERE clause with a condition that names the database. If you get a row back, the database exists. If not, it doesn’t. The following Ruby method shows how to perform an existence test for a database:

def database_exists(dbh, db_name)

return dbh.select_one("SELECT SCHEMA_NAME

FROM INFORMATION_SCHEMA.SCHEMATA

WHERE SCHEMA_NAME = ?", db_name) != nil

end

To obtain a list of tables in a database, name the database in the WHERE clause of a statement that selects from the TABLES table:

SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES

WHERE TABLE_SCHEMA = 'cookbook';

Add an ORDER BY TABLE_NAME clause if you want a sorted result.

To obtain a list of tables in the default database, use this statement instead:

SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES

WHERE TABLE_SCHEMA = DATABASE();

If no database has been selected, DATABASE() returns NULL, and no rows match, which is the correct result.

To check whether a specific table exists, use a WHERE clause with a condition that names the table. Here’s a Ruby method that performs an existence test for a table in a given database:

def table_exists(dbh, db_name, tbl_name)

return dbh.select_one(

"SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES

WHERE TABLE_SCHEMA = ? AND TABLE_NAME = ?",

db_name, tbl_name) != nil

end

NOTE

The results retrieved from INFORMATION_SCHEMA depend on your privileges. You’ll see information only for those databases or tables for which you have some privileges. This means that an existence test will return false if the given object exists but you have no privileges for accessing it.

Some APIs provide a database-independent way to get database or table lists. In Perl DBI, the database handle tables() method returns a list of tables in the default database:

@tables = $dbh->tables ();

The Ruby method is similar:

tables = dbh.tables

In Java, you can use JDBC methods designed to return lists of databases or tables. For each method, invoke your connection object’s getMetaData() method and use the resulting DatabaseMetaData object to retrieve the information you want. Here’s how to produce a list of databases:

// get list of databases

DatabaseMetaData md = conn.getMetaData ();

ResultSet rs = md.getCatalogs ();

while (rs.next ())

System.out.println (rs.getString (1)); // column 1 = database name

rs.close ();

A similar procedure lists the tables in a given database:

// get list of tables in database named by dbName; if

// dbName is the empty string, the default database is used

DatabaseMetaData md = conn.getMetaData ();

ResultSet rs = md.getTables (dbName, "", "%", null);

while (rs.next ())

System.out.println (rs.getString (3)); // column 3 = table name

rs.close ();

Accessing Table Column Definitions

Problem

You want to find out what columns a table has and how they are defined.

Solution

There are several ways to do this. You can obtain column definitions from INFORMATION_SCHEMA, from SHOW statements, or from mysqldump.

Discussion

Information about the structure of tables enables you to answer questions such as “What columns does a table contain and what are their types?” or “What are the legal values for an ENUM or SET column?” In MySQL, there are several ways to find out about a table’s structure:

§ Retrieve the information from INFORMATION_SCHEMA. The COLUMNS table contains the column definitions.

§ Use a SHOW COLUMNS statement.

§ Use the SHOW CREATE TABLE statement or the mysqldump command-line program to obtain a CREATE TABLE statement that displays the table’s structure.

The following sections discuss how you can ask MySQL for table information using each of these methods. To try the examples, create the following item table that lists item IDs, names, and the colors in which each item is available:

CREATE TABLE item

(

id INT UNSIGNED NOT NULL AUTO_INCREMENT,

name CHAR(20),

colors SET('chartreuse','mauve','lime green','puce') DEFAULT 'puce',

PRIMARY KEY (id)

);

Using INFORMATION_SCHEMA to get table structure

To obtain information about the columns in a table by checking INFORMATION_SCHEMA, use a statement of the following form:

mysql>SELECT * FROM INFORMATION_SCHEMA.COLUMNS

-> WHERE TABLE_SCHEMA = 'cookbook' AND TABLE_NAME = 'item'\G

*************************** 1. row ***************************

TABLE_CATALOG: NULL

TABLE_SCHEMA: cookbook

TABLE_NAME: item

COLUMN_NAME: id

ORDINAL_POSITION: 1

COLUMN_DEFAULT: NULL

IS_NULLABLE: NO

DATA_TYPE: int

CHARACTER_MAXIMUM_LENGTH: NULL

CHARACTER_OCTET_LENGTH: NULL

NUMERIC_PRECISION: 10

NUMERIC_SCALE: 0

CHARACTER_SET_NAME: NULL

COLLATION_NAME: NULL

COLUMN_TYPE: int(10) unsigned

COLUMN_KEY: PRI

EXTRA: auto_increment

PRIVILEGES: select,insert,update,references

COLUMN_COMMENT:

*************************** 2. row ***************************

TABLE_CATALOG: NULL

TABLE_SCHEMA: cookbook

TABLE_NAME: item

COLUMN_NAME: name

ORDINAL_POSITION: 2

COLUMN_DEFAULT: NULL

IS_NULLABLE: YES

DATA_TYPE: char

CHARACTER_MAXIMUM_LENGTH: 20

CHARACTER_OCTET_LENGTH: 20

NUMERIC_PRECISION: NULL

NUMERIC_SCALE: NULL

CHARACTER_SET_NAME: latin1

COLLATION_NAME: latin1_swedish_ci

COLUMN_TYPE: char(20)

COLUMN_KEY:

EXTRA:

PRIVILEGES: select,insert,update,references

COLUMN_COMMENT:

*************************** 3. row ***************************

TABLE_CATALOG: NULL

TABLE_SCHEMA: cookbook

TABLE_NAME: item

COLUMN_NAME: colors

ORDINAL_POSITION: 3

COLUMN_DEFAULT: puce

IS_NULLABLE: YES

DATA_TYPE: set

CHARACTER_MAXIMUM_LENGTH: 32

CHARACTER_OCTET_LENGTH: 32

NUMERIC_PRECISION: NULL

NUMERIC_SCALE: NULL

CHARACTER_SET_NAME: latin1

COLLATION_NAME: latin1_swedish_ci

COLUMN_TYPE: set('chartreuse','mauve','lime green','puce')

COLUMN_KEY:

EXTRA:

PRIVILEGES: select,insert,update,references

COLUMN_COMMENT:

Here are some of the COLUMNS table values likely to be of most use:

§ COLUMN_NAME indicates the column name.

§ ORDINAL_POSITION is the position of the column within the table definition.

§ COLUMN_DEFAULT is the column’s default value.

§ IS_NULLABLE is YES or NO to indicate whether the column can contain NULL values.

§ DATA_TYPE and COLUMN_TYPE provide data-type information. DATA_TYPE is the data-type keyword and COLUMN_TYPE contains additional information such as type attributes.

§ CHARACTER_SET_NAME and COLLATION_NAME indicate the character set and collation for string columns. They are NULL for nonstring columns.

To retrieve information only about a single column, add a condition to the WHERE clause that names the appropriate COLUMN_NAME value:

mysql>SELECT * FROM INFORMATION_SCHEMA.COLUMNS

-> WHERE TABLE_SCHEMA = 'cookbook' AND TABLE_NAME = 'item'

-> AND COLUMN_NAME = 'colors'\G

*************************** 1. row ***************************

TABLE_CATALOG: NULL

TABLE_SCHEMA: cookbook

TABLE_NAME: item

COLUMN_NAME: colors

ORDINAL_POSITION: 3

COLUMN_DEFAULT: puce

IS_NULLABLE: YES

DATA_TYPE: set

CHARACTER_MAXIMUM_LENGTH: 32

CHARACTER_OCTET_LENGTH: 32

NUMERIC_PRECISION: NULL

NUMERIC_SCALE: NULL

CHARACTER_SET_NAME: latin1

COLLATION_NAME: latin1_swedish_ci

COLUMN_TYPE: set('chartreuse','mauve','lime green','puce')

COLUMN_KEY:

EXTRA:

PRIVILEGES: select,insert,update,references

COLUMN_COMMENT:

If you want only certain types of information, replace SELECT * with a list of the values of interest:

mysql>SELECT COLUMN_NAME, DATA_TYPE, IS_NULLABLE

-> FROM INFORMATION_SCHEMA.COLUMNS

-> WHERE TABLE_SCHEMA = 'cookbook' AND TABLE_NAME = 'item';

+-------------+-----------+-------------+

| COLUMN_NAME | DATA_TYPE | IS_NULLABLE |

+-------------+-----------+-------------+

| id | int | NO |

| name | char | YES |

| colors | set | YES |

+-------------+-----------+-------------+

INFORMATION_SCHEMA content is easy to use from within programs. Here’s a PHP function that illustrates this process. It takes database and table name arguments, selects from INFORMATION_SCHEMA to obtain a list of the table’s column names, and returns the names as an array. TheORDERBYORDINAL_POSITION clause ensures that the names in the array are returned in table definition order.

function get_column_names ($conn, $db_name, $tbl_name)

{

$stmt = "SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS

WHERE TABLE_SCHEMA = ? AND TABLE_NAME = ?

ORDER BY ORDINAL_POSITION";

$result =& $conn->query ($stmt, array ($db_name, $tbl_name));

if (PEAR::isError ($result))

return (FALSE);

$names = array();

while (list ($col_name) = $result->fetchRow ())

$names[] = $col_name;

$result->free ();

return ($names);

}

The equivalent routine using Ruby DBI looks like this:

def get_column_names(dbh, db_name, tbl_name)

stmt = "SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS

WHERE TABLE_SCHEMA = ? AND TABLE_NAME = ?

ORDER BY ORDINAL_POSITION"

return dbh.select_all(stmt, db_name, tbl_name).collect { |row| row[0] }

end

And in Python, it looks like this:

def get_column_names (conn, db_name, tbl_name):

stmt = """

SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS

WHERE TABLE_SCHEMA = %s AND TABLE_NAME = %s

ORDER BY ORDINAL_POSITION

"""

cursor = conn.cursor ()

cursor.execute (stmt, (db_name, tbl_name))

names = []

for row in cursor.fetchall ():

names.append (row[0])

cursor.close ()

return (names)

In Perl DBI, this operation is trivial, because selectcol_arrayref() returns the first column of the query result directly:

sub get_column_names

{

my ($dbh, $db_name, $tbl_name) = @_;

my $stmt = "SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS

WHERE TABLE_SCHEMA = ? AND TABLE_NAME = ?

ORDER BY ORDINAL_POSITION";

my $ref = $dbh->selectcol_arrayref ($stmt, undef, $db_name, $tbl_name);

return defined ($ref) ? @{$ref} : ();

}

The routines just shown return an array containing only column names. If you require additional column information, you can write more general routines that return an array of structures, in which each structure contains information about a given column. The lib directory of the recipesdistribution contains some examples. Look for routines named get_column_info() in the library files.

Using SHOW COLUMNS to get table structure

The SHOW COLUMNS statement produces one row of output for each column in the table, with each row providing various pieces of information about the corresponding column.[12]The following example demonstrates the output that SHOW COLUMNS produces for the item table.

mysql>SHOW COLUMNS FROM item\G

*************************** 1. row ***************************

Field: id

Type: int(10) unsigned

Null: NO

Key: PRI

Default: NULL

Extra: auto_increment

*************************** 2. row ***************************

Field: name

Type: char(20)

Null: YES

Key:

Default: NULL

Extra:

*************************** 3. row ***************************

Field: colors

Type: set('chartreuse','mauve','lime green','puce')

Null: YES

Key:

Default: puce

Extra:

The information displayed by the statement is as follows:

§ Field indicates the column’s name.

§ Type shows the data type.

§ Null is YES if the column can contain NULL values, NO otherwise.

§ Key provides information about whether the column is indexed.

§ Default indicates the default value.

§ Extra lists miscellaneous information.

The format of SHOW COLUMNS changes occasionally, but the fields just described should always be available. SHOW FULL COLUMNS displays additional fields.

SHOWCOLUMNS supports a LIKE clause that takes an SQL pattern:

SHOW COLUMNS FROMtbl_name LIKE 'pattern';

The pattern is interpreted the same way as for the LIKE operator in the WHERE clause of a SELECT statement. (For information about pattern matching, see Pattern Matching with SQL Patterns.) With a LIKE clause, SHOW COLUMNS displays information for any column having a name that matches the pattern. If you specify a literal column name, the string matches only that name and SHOW COLUMNS displays information only for that column. However, a trap awaits the unwary here. If your column name contains SQL pattern characters (% or _) and you want to match them literally, you must escape them with a backslash in the pattern string to avoid matching other names as well. The % character isn’t used very often in column names, but _ is quite common, so it’s possible that you’ll run into this issue. Suppose that you have a table that contains the results of carbon dioxide measurements in a column named co_2, and trigonometric cosine and cotangent calculations in columns named cos1, cos2, cot1, and cot2. If you want to get information only for the co_2 column, you can’t use this statement:

SHOW COLUMNS FROMtbl_name LIKE 'co_2';

The _ character means “match any character” in pattern strings, so the statement would return rows for co_2, cos2, and cot2. To match only the co_2 column, write the SHOW command like this:

SHOW COLUMNS FROMtbl_name LIKE 'co\_2';

Within a program, you can use your API language’s pattern matching capabilities to escape SQL pattern characters before putting the column name into a SHOW statement. For example, in Perl, Ruby, and PHP, you can use the following expressions.

Perl:

$name =~ s/([%_])/\\$1/g;

Ruby:

name.gsub!(/([%_])/, '\\\\\1')

PHP:

$name = ereg_replace ("([%_])", "\\\\1", $name);

For Python, import the re module, and use its sub() method:

name = re.sub (r'([%_])', r'\\\1', name)

For Java, use the java.util.regex package:

import java.util.regex.*;

Pattern p = Pattern.compile("([_%])");

Matcher m = p.matcher(name);

name = m.replaceAll ("\\\\$1");

If these expressions appear to have too many backslashes, remember that the API language processor itself interprets backslashes and strips off a level before performing the pattern match. To get a literal backslash into the result, it must be doubled in the pattern. PHP has another level on top of that because it strips a set and the pattern processor strips a set.

The need to escape % and _ characters to match a LIKE pattern literally also applies to other forms of the SHOW statement that allow a name pattern in the LIKE clause, such as SHOW TABLES and SHOW DATABASES.

Using CREATE TABLE to get table structure

Another way to obtain table structure information from MySQL is from the CREATE TABLE statement that defines the table. You can get this information using the SHOW CREATE TABLE statement:

mysql>SHOW CREATE TABLE item\G

*************************** 1. row ***************************

Table: item

Create Table: CREATE TABLE `item` (

`id` int(10) unsigned NOT NULL AUTO_INCREMENT,

`name` char(20) DEFAULT NULL,

`colors` set('chartreuse','mauve','lime green','puce') DEFAULT 'puce',

PRIMARY KEY (`id`)

) ENGINE=MyISAM DEFAULT CHARSET=latin1

From the command line, the same information is available from mysqldump if you use the --no-data option, which tells mysqldump to dump only the structure of the table and not its data:

%mysqldump --no-data cookbook item

-- MySQL dump 10.10

--

-- Host: localhost Database: cookbook

-- ------------------------------------------------------

-- Server version 5.0.27-log

--

-- Table structure for table `item`

--

CREATE TABLE `item` (

`id` int(10) unsigned NOT NULL auto_increment,

`name` char(20) default NULL,

`colors` set('chartreuse','mauve','lime green','puce') default 'puce',

PRIMARY KEY (`id`)

) ENGINE=MyISAM DEFAULT CHARSET=latin1;

This format is highly informative and easy to read because it shows column information in a format similar to the one you used to create the table in the first place. It also shows the index structure clearly, whereas the other methods do not. However, you’ll probably find this method for checking table structure more useful for visual examination than for use within programs. The information isn’t provided in regular row-and-column format, so it’s more difficult to parse. Also, the format is somewhat subject to change whenever the CREATE TABLE statement is enhanced, which happens from time to time as MySQL’s capabilities are extended.


[12] SHOWCOLUMNSFROMtbl_name is equivalent to SHOWFIELDSFROMtbl_name or DESCRIBEtbl_name.

Getting ENUM and SET Column Information

Problem

You want to know what members an ENUM or SET column has.

Solution

This problem is a subset of getting table structure metadata. Obtain the column definition from the table metadata, and then extract the member list from the definition.

Discussion

It’s often useful to know the list of legal values for an ENUM or SET column. Suppose that you want to present a web form containing a pop-up menu that has options corresponding to each legal value of an ENUM column, such as the sizes in which a garment can be ordered, or the available shipping methods for delivering a package. You could hardwire the choices into the script that generates the form, but if you alter the column later (for example, to add a new enumeration value), you introduce a discrepancy between the column and the script that uses it. If instead you look up the legal values using the table metadata, the script always produces a pop-up that contains the proper set of values. A similar approach can be used with SET columns.

To find out what values an ENUM or SET column can have, get the column definition using one of the techniques described in Accessing Table Column Definitions and look at the data type in the definition. For example, if you select from the INFORMATION_SCHEMA COLUMNS table, theCOLUMN_TYPE value for the colors column of the item table looks like this:

set('chartreuse','mauve','lime green','puce')

ENUM columns are similar, except that they say enum rather than set. For either data type, the allowable values can be extracted by stripping off the initial word and the parentheses, splitting at the commas, and removing the enclosing quotes from the individual values. Let’s write aget_enumorset_info() routine to break out these values from the data-type definition. While we’re at it, we can have the routine return the column’s type, its default value, and whether values can be NULL. Then the routine can be used by scripts that may need more than just the list of values. Here is a version in Ruby. Its arguments are a database handle, a database name, a table name, and a column name. It returns a hash with entries corresponding to the various aspects of the column definition (or nil if the column does not exist):

def get_enumorset_info(dbh, db_name, tbl_name, col_name)

row = dbh.select_one(

"SELECT COLUMN_NAME, COLUMN_TYPE, IS_NULLABLE, COLUMN_DEFAULT

FROM INFORMATION_SCHEMA.COLUMNS

WHERE TABLE_SCHEMA = ? AND TABLE_NAME = ? AND COLUMN_NAME = ?",

db_name, tbl_name, col_name)

return nil if row.nil?

info = {}

info["name"] = row[0]

return nil unless row[1] =~ /^(ENUM|SET)\((.*)\)$/i

info["type"] = $1

# split value list on commas, trim quotes from end of each word

info["values"] = $2.split(",").collect { |val| val.sub(/^'(.*)'$/, "\\1") }

# determine whether column can contain NULL values

info["nullable"] = (row[2].upcase == "YES")

# get default value (nil represents NULL)

info["default"] = row[3]

return info

end

The routine uses case-insensitive matching when checking the data type and nullable attributes. This guards against future possible lettercase changes in metadata results.

The following example shows one way to access and display each element of the hash returned by get_enumorset_info():

info = get_enumorset_info(dbh, db_name, tbl_name, col_name)

puts "Information for " + db_name + "." + tbl_name + "." + col_name + ":"

if info.nil?

puts "No information available (not an ENUM or SET column?)"

else

puts "Name: " + info["name"]

puts "Type: " + info["type"]

puts "Legal values: " + info["values"].join(",")

puts "Nullable: " + (info["nullable"] ? "yes" : "no")

puts "Default value: " + (info["default"].nil? ? "NULL" : info["default"])

end

That code produces the following output for the item table colors column:

Information for cookbook.item.colors:

Name: colors

Type: set

Legal values: chartreuse,mauve,lime green,puce

Nullable: yes

Default value: puce

Equivalent routines for other APIs are similar. Such routines are especially handy for generating list elements in web forms. (See Recipes and .)

Using Table Structure Information in Applications

Problem

It’s all well and good to be able to obtain table structure information, but what can you use it for?

Solution

Lots of things are possible: you can display lists of table columns, create web form elements, produce ALTER TABLE statements for modifying ENUM or SET columns, and more.

Discussion

This section describes some uses for the table structure information that MySQL makes available.

Displaying column lists

Probably the simplest use of table information is to present a list of the table’s columns. This is common in web-based or GUI applications that allow users to construct statements interactively by selecting a table column from a list and entering a value against which to compare column values. The get_column_names() routines shown in Accessing Table Column Definitions can serve as the basis for such list displays.

Interactive record editing

Knowledge of a table’s structure can be very useful for interactive record-editing applications. Suppose that you have an application that retrieves a record from the database, displays a form containing the record’s content so a user can edit it, and then updates the record in the database after the user modifies the form and submits it. You can use table structure information for validating column values. For example, if a column is an ENUM, you can find out the valid enumeration values and check the value submitted by the user against them to determine whether it’s legal. If the column is an integer type, check the submitted value to make sure that it consists entirely of digits, possibly preceded by a + or - sign character. If the column contains dates, look for a legal date format.

But what if the user leaves a field empty? If the field corresponds to, say, a CHAR column in the table, do you set the column value to NULL or to the empty string? This too is a question that can be answered by checking the table’s structure. Determine whether the column can contain NULLvalues. If it can, set the column to NULL; otherwise, set it to the empty string.

Mapping column definitions onto web page elements

Some data types such as ENUM and SET correspond naturally to elements of web forms:

§ An ENUM has a fixed set of values from which you choose a single value. This is analogous to a group of radio buttons, a pop-up menu, or a single-pick scrolling list.

§ A SET column is similar, except that you can select multiple values; this corresponds to a group of checkboxes or a multiple-pick scrolling list.

By using the table metadata to access the definitions for these types of columns, you can easily determine a column’s legal values and map them onto the appropriate form element automatically. This enables you to present users with a list of applicable values from which selections can be made easily without any typing. Getting ENUM and SET Column Information discussed how to get definitions for these types of columns. The methods developed there are used in Chapter 19, which discusses form generation in more detail.

Adding elements to ENUM or SET column definitions

When you need to modify a column definition, you can use ALTER TABLE. However, it’s really a pain to add a new element to an ENUM or SET column definition because you must list not only the new element, but all the existing elements, the default value, and NOT NULL if the column cannot contain NULL values. Suppose that you want to add “hot pink” to the colors column of an item table that has this structure:

CREATE TABLE item

(

id INT UNSIGNED NOT NULL AUTO_INCREMENT,

name CHAR(20),

colors SET('chartreuse','mauve','lime green','puce') DEFAULT 'puce',

PRIMARY KEY (id)

);

To change the column definition, use ALTER TABLE as follows:

ALTER TABLE item

MODIFY colors

SET('chartreuse','mauve','lime green','puce','hot pink')

DEFAULT 'puce';

The ENUM definition doesn’t contain many elements, so that statement isn’t very difficult to enter manually. However, the more elements a column has, the more difficult and error prone it is to type statements like that. To avoid retyping the existing definition just to add a new element, you have a choice of strategies:

§ Write a script that does the work for you. It can examine the table definition and use the column metadata to generate the ALTER TABLE statement.

§ Use mysqldump to get a CREATE TABLE statement that contains the current column definition, and modify the statement in a text editor to produce the appropriate ALTER TABLE statement that changes the definition.

As an implementation of the first approach, let’s develop a Python script add_element.py that generates the appropriate ALTER TABLE statement automatically when given database and table names, an ENUM or SET column name, and the new element value. add_element.py will use that information to figure out the correct ALTER TABLE statement and display it:

%add_element.py cookbook item colors "hot pink"

ALTER TABLE `cookbook`.`item`

MODIFY `colors`

set('chartreuse','mauve','lime green','puce','hot pink')

DEFAULT 'puce';

By having add_element.py produce the statement as its output, you have the choice of shoving it into mysql for immediate execution or saving the output into a file:

%add_element.py cookbook item colors "hot pink" | mysql cookbook

% add_element.py cookbook item colors "hot pink" > stmt.sql

The first part of the add_element.py script imports the requisite modules and checks the command-line arguments. This is fairly straightforward:

#!/usr/bin/python

# add_element.py - produce ALTER TABLE statement to add an element

# to an ENUM or SET column

import sys

import MySQLdb

import Cookbook

if len (sys.argv) != 5:

print "Usage: add_element.py db_name tbl_name col_name new_element"

sys.exit (1)

(db_name, tbl_name, col_name, new_elt) = (sys.argv[1:5])

After connecting to the MySQL server (code not shown, but is present in the script), the script checks INFORMATION_SCHEMA to retrieve the column definition, whether it allows NULL values, and its default value. The following code does this, checking to make sure that the column really exists in the table:

stmt = """

SELECT COLUMN_TYPE, IS_NULLABLE, COLUMN_DEFAULT

FROM INFORMATION_SCHEMA.COLUMNS

WHERE TABLE_SCHEMA = %s AND TABLE_NAME = %s AND COLUMN_NAME = %s

"""

cursor = conn.cursor ()

cursor.execute (stmt, (db_name, tbl_name, col_name))

info = cursor.fetchone ()

cursor.close

if info == None:

print "Could not retrieve information for table %s.%s, column %s" \

% (db_name, tbl_name, col_name)

sys.exit (1)

At this point, if the SELECT statement succeeded, the information produced by it is available as a tuple stored in the info variable. We’ll need to use several elements from this tuple. The most important is the COLUMN_TYPE value, which provides the enum(...) or set(...) string containing the column’s definition. We can use this string to verify that the column really is an ENUM or SET, and then add the new element to the string just before the closing parenthesis. For the colors column, we want to change this:

set('chartreuse','mauve','lime green','puce')

To this:

set('chartreuse','mauve','lime green','puce','hot pink')

It’s also necessary to check whether column values can be NULL and what the default value is so that the program can add the appropriate information to the ALTER TABLE statement. The code that does all this is as follows:

# get data type string; make sure it begins with ENUM or SET

type = info[0]

if type[0:5].upper() != "ENUM(" and type[0:4].upper() != "SET(":

print "table %s.%s, column %s is not an ENUM or SET" % \

(db_name, tbl_name, col_name)

sys.exit(1)

# insert comma and properly quoted new element just before closing paren

type = type[0:len(type)-1] + "," + conn.literal (new_elt) + ")"

# if column cannot contain NULL values, add "NOT NULL"

if info[1].upper() == "YES":

nullable = ""

else:

nullable = "NOT NULL ";

# construct DEFAULT clause (quoting value as necessary)

default = "DEFAULT " + conn.literal (info[2])

print "ALTER TABLE `%s`.`%s`\n MODIFY `%s`\n %s\n %s%s;" \

% (db_name, tbl_name, col_name, type, nullable, default)

That’s it. You now have a working ENUM- or SET-altering program. Still, add_element.py is fairly basic and can be improved in various ways:

§ Make sure that the element value you’re adding to the column isn’t already there.

§ Modify add_element.py to take more than one argument after the column name and add all of them to the column definition at the same time.

§ Add an option to indicate that the named element should be deleted rather than added.

Another approach to altering ENUM or SET columns involves capturing the current definition in a file and editing the file to produce the proper ALTER TABLE statement.

1. Run mysqldump to get the CREATE TABLE statement that contains the column definition:

%mysqldump --no-data cookbook item > test.txt

The --no-data option tells mysqldump not to dump the data from the table; it’s used here because only the table-creation statement is needed. The resulting file, test.txt, should contain this statement:

CREATE TABLE `item` (

`id` int(10) unsigned NOT NULL AUTO_INCREMENT,

`name` char(20) DEFAULT NULL,

`colors` set('chartreuse','mauve','lime green','puce') DEFAULT 'puce',

PRIMARY KEY (`id`)

) ENGINE=MyISAM DEFAULT CHARSET=latin1;

2. Edit the test.txt file to remove everything but the definition for the colors column:

`colors` set('chartreuse','mauve','lime green','puce') DEFAULT 'puce',

3. Modify the definition to produce an ALTER TABLE statement that has the new element and a semicolon at the end:

4. ALTER TABLE item MODIFY

5. `colors` set('chartreuse','mauve','lime green','puce','hot pink')

DEFAULT 'puce';

6. Write test.txt back out to save it, and then get out of the editor and feed test.txt as a batch file to mysql:

%mysql cookbook < test.txt

For simple columns, this procedure is more work than just typing the ALTER TABLE statement manually. However, for ENUM and SET columns with long and ungainly definitions, using an editor to create a mysql batch file from mysqldump output makes a lot of sense. This technique also is useful when you want to delete or reorder members of an ENUM or SET column, or to add or delete members from the column definition.

Selecting all except certain columns

Sometimes you want to retrieve “almost all” the columns from a table. Suppose that you have an image table that contains a BLOB column named data used for storing images that might be very large, and other columns that characterize the BLOB column, such as its ID, a description, and so forth. It’s easy to write a SELECT * statement that retrieves all the columns, but if all you need is the descriptive information about the images and not the images themselves, it’s inefficient to drag the BLOB values over the connection along with the other columns. Instead, you want to select everything in the row except the data column.

Unfortunately, there is no way to say directly in SQL, “select all columns except this one.” You must explicitly name all the columns except data. On the other hand, it’s easy to construct that kind of statement by using table structure information. Extract the list of column names, delete the one to be excluded, and then construct a SELECT statement from those columns that remain. The following example shows how to do this in PHP, using the get_column_names() function developed earlier in the chapter to obtain the column names for a table:

$names = get_column_names ($conn, $db_name, $tbl_name);

$stmt = "";

# construct list of columns to select: all but "data"

foreach ($names as $index => $name)

{

if ($name == "data")

continue;

if ($stmt != "") # put commas between column names

$stmt .= ", ";

$stmt .= "`$name`";

}

$stmt = "SELECT $stmt FROM `$db_name`.`$tbl_name`";

The equivalent Perl code for constructing the statement is a bit shorter (and correspondingly more cryptic):

my @names = get_column_names ($dbh, $db_name, $tbl_name);

my $stmt = "SELECT `"

. join ("`, `", grep (!/^data$/, @names))

. "` FROM `$db_name`.`$tbl_name`";

Whichever language you use, the result is a statement that you can use to select all columns but data. It will be more efficient than SELECT * because it won’t pull the BLOB values over the network. Of course, this process does involve an extra round trip to the server to execute the statement that retrieves the column names, so you should consider the context in which you plan to use the SELECT statement. If you’re going to retrieve only a single row, it might be more efficient simply to select the entire row than to incur the overhead of the extra round trip. But if you’re selecting many rows, the reduction in network traffic achieved by skipping the BLOB columns will be worth the overhead of the additional query for getting table structure.

Getting Server Metadata

Problem

You want the MySQL server to tell you about itself.

Solution

Several SQL functions and SHOW statements return information about the server.

Discussion

MySQL offers several SQL functions and statements that provide you with information about the server itself and about your current client connection. A few that you may find useful are listed here. To obtain the information provided by any of them, issue the statement, and then process its result set. Both SHOW statements allow a LIKE ' pattern ' clause for limiting the results only to those rows matching the pattern.

Statement

Information produced by statement

SELECT VERSION()

Server version string

SELECT DATABASE()

Default database name (NULL if none)

SELECT USER()

Current user as given by client when connecting

SELECT CURRENT_USER()

User used for checking client privileges

SHOW GLOBAL STATUS

Server global status indicators

SHOW VARIABLES

Server configuration variables

A given API might provide alternative ways to access these types of information. For example, JDBC has several database-independent methods for obtaining server metadata. Use your connection object to obtain the database metadata, and then invoke the appropriate methods to get the information in which you’re interested. You should consult a JDBC reference for a complete list, but here are a few representative examples:

DatabaseMetaData md = conn.getMetaData ();

// can also get this with SELECT VERSION()

System.out.println ("Product version: " + md.getDatabaseProductVersion ());

// this is similar to SELECT USER() but doesn't include the hostname

System.out.println ("Username: " + md.getUserName ());

Writing Applications That Adapt to the MySQL Server Version

Problem

You want to use a given feature that is only available in a particular version of MySQL.

Solution

Ask the server for its version number. If the server is too old to support a given feature, maybe you can fall back to a workaround, if one exists.

Discussion

Each version of MySQL adds features. If you’re writing an application that requires certain features, check the server version to determine if they are present; if not, you must perform some sort of workaround (assuming there is one).

To get the server version, issue a SELECT VERSION() statement. The result is a string that looks something like 5.0.13-rc or 4.1.10a. In other words, it returns a string consisting of major , minor, and “teeny” version numbers, possibly some letter at the end of the “teeny” version, and possibly some suffix. The version string can be used as is for presentation purposes if you want to produce a status display for the user. However, for comparisons, it’s simpler to work with a number—in particular, a five-digit number in Mmmtt format, in which M, mm, tt are the major, minor, and teeny version numbers. The conversion can be performed by splitting the string at the periods, stripping off from the third piece the suffix that begins with the first nonnumeric character, and then joining the pieces. For example, 5.0.13-rc. becomes 50013, and 4.1.10a becomes40110.

Here’s a Perl DBI function that takes a database handle argument and returns a two-element list that contains both the string and numeric forms of the server version. The code assumes that the minor and teeny version parts are less than 100 and thus no more than two digits each. That should be a valid assumption, because the source code for MySQL itself uses the same format.

sub get_server_version

{

my $dbh = shift;

my ($ver_str, $ver_num);

my ($major, $minor, $teeny);

# fetch result into scalar string

$ver_str = $dbh->selectrow_array ("SELECT VERSION()");

return undef unless defined ($ver_str);

($major, $minor, $teeny) = split (/\./, $ver_str);

$teeny =~ s/\D*$//; # strip any nonnumeric suffix if present

$ver_num = $major*10000 + $minor*100 + $teeny;

return ($ver_str, $ver_num);

}

To get both forms of the version information at once, call the function like this:

my ($ver_str, $ver_num) = get_server_version ($dbh);

To get just one of the values, call it as follows:

my $ver_str = (get_server_version ($dbh))[0]; # string form

my $ver_num = (get_server_version ($dbh))[1]; # numeric form

The following examples demonstrate how to use the numeric version value to check whether the server supports certain features:

my $ver_num = (get_server_version ($dbh))[1];

printf "Quoted identifiers: %s\n", ($ver_num >= 32306 ? "yes" : "no");

printf "UNION statement: %s\n", ($ver_num >= 40000 ? "yes" : "no");

printf "Subqueries: %s\n", ($ver_num >= 40100 ? "yes" : "no");

printf "Views: %s\n", ($ver_num >= 50001 ? "yes" : "no");

printf "Strict SQL mode: %s\n", ($ver_num >= 50002 ? "yes" : "no");

printf "Events: %s\n", ($ver_num >= 50106 ? "yes" : "no");

Determining the Default Database

Problem

Has any database been selected as the default database? What is its name?

Solution

Use the DATABASE() function.

Discussion

SELECTDATABASE() returns the name of the default database or NULL if no database has been selected. The following Ruby code uses the statement to present a status display containing information about the current connection:

db = dbh.select_one("SELECT DATABASE()")[0]

puts "Default database: " + (db.nil? ? "(no database selected)" : db)

Note that before MySQL 4.1.1, DATABASE() returns an empty string (not NULL) if there is no current database.

Monitoring the MySQL Server

Problem

You want to find out how the server was configured or monitor its state.

Solution

SHOWVARIABLESand SHOWSTATUS are useful for this.

Discussion

The SHOW VARIABLES and SHOW STATUS statements provide server configuration and performance information:

mysql>SHOW VARIABLES;

+---------------------------------+-------------------+

| Variable_name | Value |

+---------------------------------+-------------------+

| back_log | 50 |

| basedir | /usr/local/mysql/ |

| bdb_cache_size | 8388600 |

| bdb_log_buffer_size | 0 |

| bdb_home | |

...

mysql> SHOW /*!50002 GLOBAL */ STATUS;

+--------------------------+----------+

| Variable_name | Value |

+--------------------------+----------+

| Aborted_clients | 319 |

| Aborted_connects | 22 |

| Bytes_received | 32085033 |

| Bytes_sent | 26379272 |

| Connections | 65684 |

...

Both statements allow a LIKE ' pattern ' clause that takes an SQL pattern. In that case, only rows for variable names that match the pattern are returned.

The /*!50002 GLOBAL */ comment is present in the SHOW STATUS statement due to a change made in MySQL 5.0.2; before MySQL 5.0.2, status variables were global (server-wide values). In 5.0.2, status variables have global and session (per-connection) values, and SHOW STATUS has been extended to take GLOBAL or SESSION modifiers, with the default being, if neither is given, to display the session values. The comment causes servers from MySQL 5.0.2 and up to display the global values. Servers before 5.0.2 ignore the comment, but display global values because only global values exist in those versions. This idiom is used throughout this chapter.

System and status variable information can be useful for writing administrative applications. For example, the MyISAM key cache hit rate is a measure of how often key requests are satisfied from the cache without reading keys from disk. The following formula calculates the hit rate, whereKey_reads and Key_read_requests indicate the number of disk reads and number of requests, respectively:

1 - (Key_reads / Key_read_requests)

Values close to 1 indicate a high hit rate, which means that the key cache is very efficient. Values close to 0 indicate a low hit rate (possibly a sign that you should increase the value of the key_buffer_size system variable to use a larger cache). It’s easy to calculate the hit rate in a program, as the following Ruby code illustrates:

# Execute SHOW STATUS to get relevant server status variables, use

# names and values to construct hash of values keyed by names.

stat_hash = {}

stmt = "SHOW /*!50002 GLOBAL */ STATUS LIKE 'Key_read%'"

dbh.select_all(stmt).each do |name, value|

stat_hash[name] = value

end

key_reads = stat_hash["Key_reads"].to_f

key_reqs = stat_hash["Key_read_requests"].to_f

hit_rate = key_reqs == 0 ? 0 : 1.0 - (key_reads / key_reqs)

puts " Key_reads: #{key_reads}"

puts "Key_read_requests: #{key_reqs}"

puts " Hit rate: #{hit_rate}"

As another example, you might write a long-running program that probes the server periodically to monitor its activity. A simple application of this type might ask the server to report the number of connections it’s received and its uptime, to determine a running display of average connection activity. The statements to obtain this information are:

SHOW /*!50002 GLOBAL */ STATUS LIKE 'Connections';

SHOW /*!50002 GLOBAL */ STATUS LIKE 'Uptime';

If you want to avoid having to reconnect each time you issue the statements, you can ask the server for its client timeout period and probe it at intervals shorter than that value. You can get the timeout value (in seconds) with this statement:

SHOW VARIABLES LIKE 'wait_timeout';

The default value is 28800 (8 hours), but it might be configured to a different value on your system.

For system variables, an alternative way to access their values is by referring to them as @@ var_name. For example:

mysql>SELECT @@wait_timeout;

+----------------+

| @@wait_timeout |

+----------------+

| 28800 |

+----------------+

This provides the convenience that you can select multiple values in a single row, and you can use the values directly within expressions.

THE “MYSQL UNCERTAINTY PRINCIPLE”

Heisenberg’s uncertainty principle for measurement of quantum phenomena has a MySQL analog. If you monitor MySQL’s status to see how it changes over time, you might notice the curious effect that, for some of the indicators, each time you take a measurement, you change the value you’re measuring! For example, you can determine the number of statements the server has received by using the following statement:

SHOW /*!50002 GLOBAL */ STATUS LIKE 'Questions'

However, that statement is itself a statement, so each time you issue it, you cause the Questions value to change. In effect, your performance assessment instrument contaminates its own measurements, something you might want to take into account.

Determining Which Storage Engines the Server Supports

Problem

You want to know whether you can create a table using a given storage engine.

Solution

Use the SHOW ENGINES statement to ask the server which storage engines it supports.

Discussion

The SHOW ENGINES statement provides information about which storage engines the server supports. Its output looks like this:

mysql>SHOW ENGINES\G

*************************** 1. row ***************************

Engine: MyISAM

Support: DEFAULT

Comment: Default engine as of MySQL 3.23 with great performance

*************************** 2. row ***************************

Engine: MEMORY

Support: YES

Comment: Hash based, stored in memory, useful for temporary tables

*************************** 3. row ***************************

Engine: InnoDB

Support: YES

Comment: Supports transactions, row-level locking, and foreign keys

*************************** 4. row ***************************

Engine: BerkeleyDB

Support: NO

Comment: Supports transactions and page-level locking

...

The Engine value indicates a storage engine name and the Support value indicates its status. If Support is YES or DEFAULT, the engine is available. If the value is NO or DISABLED, the engine is not available. The following Ruby method uses those rules to determine engine status and return a list of the engines that are available:

def get_storage_engines(dbh)

engines = []

dbh.select_all("SHOW ENGINES").each do |engine, support|

engines << engine if ["YES", "DEFAULT"].include?(support.upcase)

end

return engines

end