MySQL Cookbook (2007)

Chapter 2. Writing MySQL-Based Programs

Introduction

This chapter discusses how to write programs that use MySQL. It covers basic application programming interface (API) operations that are fundamental to the programming recipes developed in later chapters. These operations include connecting to the MySQL server, issuing statements, and retrieving the results.

MySQL-based client programs can be written using several languages. The languages covered in this book are Perl , Ruby, PHP, Python, and Java, for which we’ll use the interfaces in the following table. Appendix A indicates where to get the software for each interface.

Language	Interface
Perl	Perl DBI
Ruby	Ruby DBI
PHP	PEAR DB
Python	DB-API
Java	JDBC

MySQL client APIs provide the following capabilities, each of which is covered in a section of this chapter:

Connecting to the MySQL server, selecting a database, and disconnecting from the server

Every program that uses MySQL must first establish a connection to the server, and most programs also select a default database to use. In addition, well-behaved MySQL programs close the connection to the server when they’re done with it.

Checking for errors

Many people write MySQL programs that perform no error checking at all. Such programs are difficult to debug when things go wrong. Any database operation can fail and you should know how to find out when that occurs and why. That knowledge enables you to take appropriate action such as terminating the program or informing the user of the problem.

Executing SQL statements and retrieving results

The whole point of connecting to a database server is to execute SQL statements. Each API provides at least one way to do this, as well as several methods for processing statements results.

Handling special characters and NULL values in statements

One way to write a statement that refers to specific data values is to embed the values directly in the statement string. However, some characters such as quotes and backslashes have special meaning, and you must take certain precautions when constructing statements containing them. The same is true for NULL values. If you do not handle these properly, your programs may generate SQL statements that are erroneous or that yield unexpected results. If you incorporate data from external sources into queries, you might become open to SQL injection attacks. Most APIs provide a way of writing statements that enables you to refer to data values symbolically. When you execute the statement, you supply the data values separately and the API places them into the statement string after properly encoding any special characters or NULL values.

Identifying NULL values in result sets

NULL values are special not only when you construct statements, but also in results returned from statements. Each API provides a convention for recognizing and dealing with them.

No matter which programming language you use, it’s necessary to know how to perform each of the fundamental database API operations, so each operation is shown in all five languages. Seeing how each API handles a given operation should help you see the correspondences between APIs more easily and better understand the recipes shown in the following chapters, even if they’re written in a language you don’t use very much. (Later chapters usually illustrate recipe implementations using only one or two languages.)

It may seem overwhelming to see each recipe in several languages if you’re interested only in one particular API. If so, I advise you to approach the recipes as follows: read just the introductory part that provides the general background, and then go directly to the section for the language in which you’re interested. Skip the other languages. Should you develop an interest in writing programs in other languages later, you can always come back and read the other sections then.

This chapter also discusses the following topics, which are not directly part of MySQL APIs but can help you use them more easily:

Writing library files

As you write program after program, you may find that there are certain operations you carry out repeatedly. Library files provide a way to encapsulate the code for these operations so that you can perform them from multiple scripts without including all the code in each script. This reduces code duplication and makes your programs more portable. This section shows how to write a library file for each API that includes a routine for connecting to the server—one operation that every program that uses MySQL must perform. (Later chapters develop additional library routines for other operations.)

Additional techniques for obtaining connection parameters

An early section on establishing connections to the MySQL server relies on connection parameters hardwired into the code. However, there are several other ways to obtain parameters, ranging from storing them in a separate file to allowing the user to specify them at runtime.

To avoid manually typing in the example programs, you should obtain the recipes source distribution. (See Appendix A.) Then, when an example says something like “create a file named xyz that contains the following information ...,” you can just use the corresponding file from therecipes distribution. The scripts for this chapter are located under the api directory, with the exception of the library files that can be found in the lib directory.

The primary table used for examples in this chapter is named profile. It first appears in Issuing Statements and Retrieving Results, which you should know in case you skip around in the chapter and wonder where it came from. See also the section at the very end of the chapter about resetting the profile table to a known state for use in later chapters.

NOTE

The programs discussed here can be run from the command line. For instructions on invoking programs for each of the languages covered here, see Appendix B.

Assumptions

To use the material in this chapter most effectively, you should make sure that the following assumptions are satisfied:

§ MySQL programming support must be installed for any language processors that you plan to use. If you need to install any of the APIs, see Appendix A.

§ You should already have set up a MySQL user account for accessing the server and a database to use for trying statements. As described in Setting Up a MySQL User Account, the examples in this book use a MySQL account that has a username and password of cbuser and cbpass, and we’ll connect to a MySQL server running on the local host to access a database named cookbook. If you need to create the account or the database, see the instructions in that recipe.

§ The discussion here assumes a certain basic understanding of the API languages. If a recipe uses language constructs with which you’re not familiar, consult a good general text that covers that language. Appendix D lists some resources that may be helpful.

§ Proper execution of some of the programs might require that you set certain environment variables. See Appendix B, for general information about setting environment variables, and Writing Library Files for details about environment variables that apply specifically to searching for library files.

MySQL Client API Architecture

One thing that all MySQL client programs have in common, no matter which language you use, is that they connect to the server using some kind of application programming interface that implements a communications protocol. This is true regardless of the program’s purpose, whether it’s a command-line utility, a job that runs automatically on a predetermined schedule, or a script that’s used from a web server to make database content available over the Web. MySQL APIs provide a standard way for you, the application developer, to express database operations. Each API translates your instructions into something the MySQL server can understand.

The server itself speaks a low-level protocol that I call the raw protocol. This is the level at which direct communication takes place over the network between the server and its clients. A client establishes a connection to the port on which the server is listening and communicates with it by speaking the client-server protocol in its most basic terms. (Basically, the client fills in data structures and shoves them over the network.) It’s not productive to attempt to communicate directly with the server at this level, nor to write programs that do so. The raw protocol is a binary communication stream that is efficient, but not particularly easy to use, a fact that usually deters developers from attempting to write programs that talk to the server this way. More convenient access to the MySQL server is available through a programming interface that is written at a level above that of the raw protocol. The interface handles the details of the raw protocol on behalf of your programs. It provides calls for operations such as connecting to the server, sending statements, retrieving the results of statements, and obtaining statement status information.

Most MySQL APIs do not implement the raw protocol directly. Instead, they are linked to and rely on the MySQL client library that is included with MySQL distributions. The client library is written in C and thus provides the basis of an interface for communicating with the server from within C programs. The majority of the standard clients in the MySQL distribution are written in C and use this API. You can use it in your own programs, too, and should consider doing so if you want the most efficient programs possible. However, most third-party application development is not done in C. Instead, the C API is most often used indirectly as an embedded library within other languages. This is how MySQL communication is implemented for Perl, Ruby, PHP, Python, and several other languages. The MySQL API for these higher-level languages is written as a“wrapper” around the C routines, which are linked to the language processor.

The benefit of this approach is that it allows a language processor to talk to the MySQL server on your behalf using the C routines while providing an interface in which you specify database operations more conveniently. For example, scripting languages such as Perl or Ruby typically make it easy to manipulate text without having to allocate string buffers or dispose of them when you’re done with them the way you do in C. Higher-level languages let you concentrate more on what you’re trying to do and less on the details that you must think about when you’re writing directly in C.

This book doesn’t cover the C API in any detail because we never use it directly; the programs developed in this book use higher-level interfaces that are built on top of the C API. However, if you’d like to write MySQL client programs in C, the following sources of information may be helpful:

§ The MySQL Reference Manual contains a chapter that describes all the C API functions. You should also have a look at the source for the standard MySQL clients provided with the MySQL source distribution that are written in C. Source distributions and the manual both are available at the MySQL web site, http://dev.mysql.com/, and you can obtain the manual in printed form from MySQL Press.

§ The book MySQL by Paul DuBois (Sams) contains reference material for the C API, and also includes a chapter that provides detailed tutorial instructions for writing MySQL programs in C. The chapter is available online at http://www.kitebird.com/mysql-book/. The source code for the sample programs discussed in the chapter is available from the same site for you to study and use. Those programs were deliberately written for instructional purposes, so you may find them easier to understand than the standard clients in the MySQL source distribution.

Java interfaces for MySQL do not use the C client library. They implement the raw protocol directly but map protocol operations onto the JDBC interface. You write your Java programs using standard JDBC calls, and JDBC passes your requests for database operations to the lower-level MySQL interface, which converts them to operations that communicate with the MySQL server using the raw protocol.

The MySQL programming interfaces used in this book share a common design principle: they all use a two-level architecture. The top level of this architecture provides database-independent methods that implement database access in a portable way that’s the same no matter which database management system you’re using, be it MySQL, PostgreSQL, Oracle, or whatever. The lower level consists of a set of drivers, each of which implements the details for a particular database system. The two-level architecture enables application programs to use an abstract interface that is not tied to the details involved with accessing any particular database server. This enhances portability of your programs because you just select a different lower-level driver to use a different type of database. That’s the theory, at least. In practice, perfect portability can be somewhat elusive:

§ The interface methods provided by the top level of the architecture are consistent regardless of the driver you use, but it’s still possible to issue SQL statements that contain constructs supported only by a particular server. For MySQL, a good example is the SHOW statement that provides information about database and table structure. If you use SHOW with a non-MySQL server, an error is the likely result.

§ Lower-level drivers often extend the abstract interface to make it more convenient to get at database-specific features. For example, the MySQL driver for Perl DBI makes the most recent AUTO_INCREMENT value available as an attribute of the database handle so that you can access it as$dbh->{mysql_insertid}. These features often make it easier to write a program initially, but at the same time make it less portable and require some rewriting should you port the program to use with another database system.

Despite these factors that compromise portability to some extent, the general portability characteristics of the two-level architecture provide significant benefits for MySQL developers.

Another thing that the APIs used in this book have in common is that they are object-oriented. Whether you write in Perl, Ruby, PHP, Python, or Java, the operation that connects to the MySQL server returns a value that enables you to process statements in an object-oriented manner. For example, when you connect to the database server, you get a database connection object that you use to further interact with the server. The interfaces also provide other objects, such as objects for statements, result sets, or metadata.

Now let’s see how to use these programming interfaces to perform the most fundamental MySQL operations: connecting to and disconnecting from the server.

Connecting, Selecting a Database, and Disconnecting

Problem

You need to establish a connection to the server to access a database, and to shut down the connection when you’re done.

Solution

Each API provides routines for connecting and disconnecting. The connection routines require that you provide parameters specifying the hostname that is running the MySQL server and the MySQL account that you want to use. You can also select a default database.

Discussion

The programs in this section show how to perform three fundamental operations that are common to the vast majority of MySQL programs:

Establishing a connection to the MySQL server

Every program that uses MySQL does this, no matter which API you use. The details on specifying connection parameters vary between APIs, and some APIs provide more flexibility than others. However, there are many common elements. For example, you must specify the host that is running the server, as well as the username and password for the MySQL account to use for accessing the server.

Selecting a database

Most MySQL programs select a default database.

Disconnecting from the server

Each API provides a way to close an open connection. It’s best to close the connection as soon as you’re done using the server so that it can free up any resources that are allocated to servicing the connection. Otherwise, if your program performs additional computations after accessing the server, the connection will be held open longer than necessary. It’s also preferable to close the connection explicitly. If a program simply terminates without closing the connection, the MySQL server eventually notices, but shutting down the connection explicitly enables the server to perform an immediate orderly close on its end.

The example programs for each API in this section show how to connect to the server, select the cookbook database, and disconnect.

On occasion you might want to write a MySQL program that doesn’t select a database. This could be the case if you plan to issue a statement that doesn’t require a default database, such as SHOW VARIABLES or SELECT VERSION(). Or perhaps you’re writing an interactive program that connects to the server and enables the user to specify the database after the connection has been made. To cover such situations, the discussion for each API also indicates how to connect without selecting any default database.

Perl

To write MySQL scripts in Perl, you should have the DBI module installed, as well as the MySQL-specific driver module, DBD::mysql. Appendix A contains information on getting these modules if they’re not already installed.

Here is a simple Perl script that connects to the cookbook database and then disconnects:

#!/usr/bin/perl

# connect.pl - connect to the MySQL server

use strict;

use warnings;

use DBI;

my $dsn = "DBI:mysql:host=localhost;database=cookbook";

my $dbh = DBI->connect ($dsn, "cbuser", "cbpass")

or die "Cannot connect to server\n";

print "Connected\n";

$dbh->disconnect ();

print "Disconnected\n";

To try the script, create a file named connect.pl that contains the preceding code and run it from the command line. (Under Unix, you may need to change the path on the first line of the script if your Perl program is located somewhere other than /usr/bin/perl.) You should see the program print two lines of output indicating that it connected and disconnected successfully:

%connect.pl

Connected

Disconnected

If you need background on running Perl programs, see Appendix B.

The use strict line turns on strict variable checking and causes Perl to complain about any variables that are used without having been declared first. This is a sensible precaution because it helps find errors that might otherwise go undetected.

The use warnings line turns on warning mode so that Perl produces warnings for any questionable constructs. Our example script has no such constructs, but it’s a good idea to get in the habit of enabling warnings to catch problems that occur during the script development process. usewarnings is similar to specifying the Perl -w command-line option, but provides more control over which warnings you want to see. (Execute a perldoc warnings command for more information.)

The use DBI statement tells Perl that the program needs to load the DBI module. It’s unnecessary to load the MySQL driver module (DBD::mysql) explicitly, because DBI does that itself when the script connects to the database server.

The next two lines establish the connection to MySQL by setting up a data source name (DSN) and calling the DBI connect() method. The arguments to connect() are the DSN, the MySQL username and password, and any connection attributes you want to specify. The DSN is required. The other arguments are optional, although usually it’s necessary to supply a username and password.

The DSN specifies which database driver to use and other options indicating where to connect. For MySQL programs, the DSN has the format DBI:mysql:options . The second colon in the DSN is not optional, even if you specify no options.

The three DSN components have the following meanings:

§ The first component is always DBI. It’s not case-sensitive; dbi or Dbi would do just as well.

§ The second component tells DBI which database driver to use. For MySQL, the name must be mysql, and it is case-sensitive. You can’t use MySQL, MYSQL, or any other variation.

§ The third component, if present, is a semicolon-separated list of name = value pairs that specify additional connection options. The order of any option pairs you provide doesn’t matter. For our purposes here, the two most relevant options are host and database. They specify the hostname where the MySQL server is running and the default database you want to use.

Given this information, the DSN for connecting to the cookbook database on the local host localhost looks like this:

DBI:mysql:host=localhost;database=cookbook

If you leave out the host option, its default value is localhost. Thus, these two DSNs are equivalent:

DBI:mysql:host=localhost;database=cookbook

DBI:mysql:database=cookbook

If you omit the database option, the connect() operation selects no default database.

The second and third arguments of the connect() call are your MySQL username and password. You can also provide a fourth argument following the password to specify attributes that control DBI’s behavior when errors occur. With no attributes, DBI by default prints error messages when errors occur but does not terminate your script. That’s why connect.pl checks whether connect() returns undef to indicate failure:

my $dbh = DBI->connect ($dsn, "cbuser", "cbpass")

or die "Cannot connect to server\n";

Other error-handling strategies are possible. For example, you can tell DBI to terminate the script automatically when an error occurs in a DBI call by disabling the PrintError attribute and enabling RaiseError instead. Then you don’t have to check for errors yourself (although you also lose the ability to decide how your program will recover from errors):

my $dbh = DBI->connect ($dsn, $user_name, $password,

{PrintError => 0, RaiseError => 1});

Checking for Errors discusses error handling further.

Another common attribute is AutoCommit, which sets the connection’s auto-commit mode for transactions. In MySQL, this is enabled by default for new connections, but we’ll set it from this point on to make the initial connection state explicit:

my $dbh = DBI->connect ($dsn, $user_name, $password,

{PrintError => 0, RaiseError => 1, AutoCommit => 1});

As shown, the fourth argument to connect() is a reference to a hash of connection attribute name/value pairs. An alternative way of writing this code is as follows:

my %conn_attrs = (PrintError => 0, RaiseError => 1, AutoCommit => 1);

my $dbh = DBI->connect ($dsn, $user_name, $password, \%conn_attrs);

Use whichever style you prefer. The scripts in this book use the %conn_attr hash to make the connect() call simpler to read.

Assuming that connect()succeeds, it returns a database handle that contains information about the state of the connection. (In DBI parlance, references to objects are called handles.) Later we’ll see other handles such as statement handles, which are associated with particular statements. Perl DBI scripts in this book conventionally use $dbh and $sth to signify database and statement handles.

Additional connection parameters. For localhost connections, you can provide a mysql_socket option in the DSN to specify the path to the Unix domain socket:

my $dsn = "DBI:mysql:host=localhost;database=cookbook"

. ";mysql_socket=/var/tmp/mysql.sock";

For non-localhost (TCP/IP) connections, you can provide a port option to specify the port number:

my $dsn = "DBI:mysql:host=mysql.example.com;database=cookbook"

. ";port=3307";

Ruby

To write MySQL scripts in Ruby, you should have the DBI module installed, as well as the MySQL-specific driver module. Both are included in the Ruby DBI distribution. Appendix A contains information on getting Ruby DBI if it’s not already installed.

Here is a simple Ruby script that connects to the cookbook database and then disconnects:

#!/usr/bin/ruby -w

# connect.rb - connect to the MySQL server

require "dbi"

begin

dsn = "DBI:Mysql:host=localhost;database=cookbook"

dbh = DBI.connect(dsn, "cbuser", "cbpass")

puts "Connected"

rescue

puts "Cannot connect to server"

exit(1)

end

dbh.disconnect

puts "Disconnected"

To try the script, create a file named connect.rb that contains the preceding code. (Under Unix, you may need to change the path on the first line of the script if your Ruby program is located somewhere other than /usr/bin/ruby.) You should see the program print two lines of output indicating that it connected and disconnected successfully:

%connect.rb

Connected

Disconnected

If you need background on running Ruby programs, see Appendix B.

The -w option turns on warning mode so that Ruby produces warnings for any questionable constructs. Our example script has no such constructs, but it’s a good idea to get in the habit of using -w to catch problems that occur during the script development process.

The require statement tells Ruby that the program needs to load the DBI module. It’s unnecessary to load the MySQL driver module explicitly, because DBI does that itself when the script connects to the database server.

The connection is established by passing a data source name and the MySQL username and password to the connect() method. The DSN is required. The other arguments are optional, although usually it’s necessary to supply a username and password.

The DSN specifies which database driver to use and other options that indicate where to connect. For MySQL programs, the DSN typically has one of these formats:

DBI:Mysql:db_name:host_name

DBI:Mysql:name=value;name=value ...

The DSN components have the following meanings:

§ The first component is always DBI or dbi.

§ The second component tells DBI which database driver to use. For MySQL, the name is Mysql.

§ The third component, if present, is either a database name and hostname separated by a colon, or a semicolon-separated list of name = value pairs that specify additional connection options. The order of any option pairs you provide doesn’t matter. For our purposes here, the two most relevant options are host and database. They specify the hostname of the server on which MySQL is running and the default database you want to use. As with Perl DBI, the second colon in the DSN is not optional, even if you specify no options.

Given this information, the DSN for connecting to the cookbook database on the local host localhost looks like this:

DBI:Mysql:host=localhost;database=cookbook

If you leave out the host option, its default value is localhost. Thus, these two DSNs are equivalent:

DBI:Mysql:host=localhost;database=cookbook

DBI:Mysql:database=cookbook

If you omit the database option, the connect() operation selects no default database.

Assuming that connect() succeeds, it returns a database handle that contains information about the state of the connection. Ruby DBI scripts in this book conventionally use dbh to signify a database handle.

If the connect() method fails, there is no special return value to check for. Ruby programs raise exceptions when problems occur. To handle errors, put the statements that might fail inside a begin block, and use a rescue clause that contains the error-handling code. Exceptions that occur at the top level of a script (that is, outside of any begin block) are caught by the default exception handler, which prints a stack trace and exits.

Additional connection parameters. For localhost connections, you can provide a socket option in the DSN to specify the path to the Unix domain socket:

dsn = "DBI:Mysql:host=localhost;database=cookbook" +

";socket=/var/tmp/mysql.sock"

For non-localhost (TCP/IP) connections, you can provide a port option to specify the port number:

dsn = "DBI:Mysql:host=mysql.example.com;database=cookbook" +

";port=3307"

PHP

To write PHP scripts that use MySQL, your PHP interpreter must have MySQL support compiled in. If it doesn’t, your scripts will be unable to connect to your MySQL server. Should that occur, check the instructions included with your PHP distribution to see how to enable MySQL support.

PHP actually has two extensions that enable the use of MySQL. The first, mysql, is the original MySQL extension. It provides a set of functions that have names beginning with mysql_. The second, mysqli, or “MySQL improved,” provides functions with names that begin with mysqli_. For purposes of this book, you can use either extension, although I recommend mysqli.

In any case, PHP scripts in this book won’t use either extension directly. Instead, they use the DB module from the PHP Extension and Add-on Repository (PEAR) . The PEAR DB module provides an interface to whichever underlying MySQL extension that you decide to use. This means that in addition to whichever PHP MySQL extension you choose, it’s also necessary to have PEAR installed. Appendix A contains information on getting PEAR if it’s not already installed.

PHP scripts usually are written for use with a web server. I’ll assume that if you’re going to use PHP that way here, you can simply copy PHP scripts into your server’s document tree, request them from your browser, and they will execute. For example, if you run Apache as the web server on the host localhost and you install a PHP script myscript.php at the top level of the Apache document tree, you should be able to access the script by requesting this URL:

http://localhost/myscript.php

This book uses the .php extension (suffix) for PHP script filenames, so your web server must be configured to recognize the .php extension. Otherwise, when you request a PHP script from your browser, the server will simply send the literal text of the script and that is what you’ll see in your browser window. You don’t want this to happen, particularly if the script contains the username and password that you use for connecting to MySQL. For information about configuring Apache for use with PHP, see Using Apache to Run Web Scripts.

PHP scripts often are written as a mixture of HTML and PHP code, with the PHP code embedded between the special <?php and ?> tags. Here is a simple example:

<html>

<head><title>A simple page</title></head>

<body>

<p>

<?php

print ("I am PHP code, hear me roar!\n");

</p>

</body>

</html>

PHP can be configured to recognize “short” tags as well, which are written as <? and ?>. This book does not assume that you have short tags enabled, so none of the PHP scripts shown here use them.

Here is a simple PHP script that connects to the cookbook database and then disconnects:

<?php

# connect.php - connect to the MySQL server

require_once "DB.php";

$dsn = "mysqli://cbuser:cbpass@localhost/cookbook";

$conn =& DB::connect ($dsn);

if (PEAR::isError ($conn))

die ("Cannot connect to server\n");

print ("Connected\n");

$conn->disconnect ();

print ("Disconnected\n");

For brevity, when I show examples consisting entirely of PHP code, typically I’ll omit the enclosing <?php and ?> tags. (Thus, if you see no tags in a PHP example, assume that <?php and ?> surround the entire block of code that is shown.) Examples that switch between HTML and PHP code do include the tags, to make it clear what is PHP code and what is not.

The require_once statement accesses the DB.php file that is required to use the PEAR DB module. require_once is just one of several PHP file-inclusion statements:

§ include instructs PHP to read the named file. require is like include except that PHP reads the file even if the require occurs inside a control structure that never executes (such as an if block for which the condition is never true).

§ include_once and require_once are likeinclude and require except that if the file has already been read, its contents are not processed again. This is useful for avoiding multiple-declaration problems that can easily occur in situations where library files include other library files.

$dsn is the data source name that indicates how to connect to the database server. Its general syntax is as follows:

phptype://user_name:password@host_name/db_name

The phptype value is the PHP driver type. For MySQL, it should be either mysql or mysqli to indicate which MySQL extension to use. You can choose either one, as long as your PHP interpreter has the chosen extension compiled in.

The PEAR DB connect() method uses the DSN to connect to MySQL. If the connection attempt succeeds, connect() returns a connection object that can be used to access other MySQL-related methods. PHP scripts in this book conventionally use $conn to signify connection objects.

If the connection attempt fails, connect() returns an error object. To determine whether the returned object represents an error, use the PEAR::isError() method.

Note that the assignment of the connect() result uses the =& operator and not the = operator. =& assigns a reference to the return value, whereas = creates a copy of the value. In this context, = would create another object that is not needed. (PHP scripts in this book generally use =& for assigning the result of connection attempts, but see Using MySQL-Based Storage with the PHP Session Manager for one instance that uses = to make sure that the assigned connection object persists longer than the function call in which it occurs.)

The last part of the DSN shown in the preceding example is the database name. To connect without selecting a default database, just omit it from the end of the DSN:

$dsn = "mysqli://cbuser:cbpass@localhost";

$conn =& DB::connect ($dsn);

To try the connect.php script, copy it to your web server’s document tree and request it using your browser. Alternatively, if you have a standalone version of the PHP interpreter that can be run from the command line, you can try the script without a web server or browser:

%php connect.php

Connected

Disconnected

If you need background on running PHP programs, see Appendix B.

As an alternative to specifying the DSN in string format, you can provide the connection parameters using an array:

$dsn = array

(

"phptype" => "mysqli",

"username" => "cbuser",

"password" => "cbpass",

"hostspec" => "localhost",

"database" => "cookbook"

);

$conn =& DB::connect ($dsn);

if (PEAR::isError ($conn))

print ("Cannot connect to server\n");

To connect without selecting a default database using an array-format DSN, omit the database member from the array.

Additional connection parameters. To use a specific Unix domain socket file or TCP/IP port number, modify the parameters used at connect time. The following two examples use an array-format DSN to do this.

For localhost connections, you can specify a pathname for the Unix domain socket file by including a socket member in the DSN array:

$dsn = array

(

"phptype" => "mysqli",

"username" => "cbuser",

"password" => "cbpass",

"hostspec" => "localhost",

"socket" => "/var/tmp/mysql.sock",

"database" => "cookbook"

);

$conn =& DB::connect ($dsn);

if (PEAR::isError ($conn))

print ("Cannot connect to server\n");

For non-localhost (TCP/IP) connections, you can specify the port number by including a port member in the DSN array:

$dsn = array

(

"phptype" => "mysqli",

"username" => "cbuser",

"password" => "cbpass",

"hostspec" => "mysql.example.com",

"port" => 3307,

"database" => "cookbook"

);

$conn =& DB::connect ($dsn);

if (PEAR::isError ($conn))

print ("Cannot connect to server\n");

You can use the PHP initialization file (typically named php.ini) to specify a default hostname, username, password, socket path, or port number. For the mysql extension, set the values of the mysql.default_host, mysql.default_user, mysql.default_password,mysql.default_socket, or mysql.default_port configuration variables. For mysqli, the corresponding variable names begin with mysqli (and the password variable is mysql_default_pw). These variables affect PHP scripts globally: for scripts that do not specify those parameters, the defaults from php.ini are used.

Python

To write MySQL programs in Python, you need the MySQLdb module that provides MySQL connectivity for Python’s DB-API interface. Appendix A, contains information on getting MySQLdb if it’s not already installed.

To use the DB-API interface, import the database driver module that you want to use (which is MySQLdb for MySQL programs). Then create a database connection object by calling the driver’s connect() method. This object provides access to other DB-API methods, such as theclose() method that severs the connection to the database server. Here is a short Python program, connect.py, that illustrates these operations:

#!/usr/bin/python

# connect.py - connect to the MySQL server

import sys

import MySQLdb

try:

conn = MySQLdb.connect (db = "cookbook",

host = "localhost",

user = "cbuser",

passwd = "cbpass")

print "Connected"

except:

print "Cannot connect to server"

sys.exit (1)

conn.close ()

print "Disconnected"

To try the script, create a file named connect.py that contains the preceding code. (Under Unix, you may need to change the path on the first line of the script if your Python program is located somewhere other than /usr/bin/python.) You should see the program print two lines of output indicating that it connected and disconnected successfully:

%connect.py

Connected

Disconnected

If you need background on running Python programs, see Appendix B.

The import lines give the script access to the sys module (needed for the sys.exit() method) and to the MySQLdb module. Then the script attempts to establish a connection to the MySQL server by calling connect() to obtain a connection object, conn. Python scripts in this book conventionally use conn to signify connection objects.

If the connect() method fails, there is no special return value to check for. Python programs raise exceptions when problems occur. To handle errors, put the statements that might fail inside a try statement and use an except clause that contains the error-handling code. Exceptions that occur at the top level of a script (that is, outside of any try statement) are caught by the default exception handler, which prints a stack trace and exits.

Because the connect() call uses named arguments, their order does not matter. If you omit the host argument from the connect() call, its default value is localhost. If you omit the db argument or pass a db value of "" (the empty string), the connect() operation selects no default database. If you pass a value of None, however, the call will fail.

Additional connection parameters. For localhost connections, you can provide a unix_socket parameter to specify the path to the Unix domain socket file:

conn = MySQLdb.connect (db = "cookbook",

host = "localhost",

unix_socket = "/var/tmp/mysql.sock",

user = "cbuser",

passwd = "cbpass")

For non-localhost (TCP/IP) connections, you can provide a port parameter to specify the port number:

conn = MySQLdb.connect (db = "cookbook",

host = "mysql.example.com",

port = 3307,

user = "cbuser",

passwd = "cbpass")

Java

Database programs in Java are written using the JDBC interface, together with a driver for the particular database engine you want to access. That is, the JDBC architecture provides a generic interface used in conjunction with a database-specific driver. Java is similar to Ruby and Python in that you don’t test specific method calls for return values that indicate an error. Instead, you provide handlers to be called when exceptions are thrown.

Java programming requires a software development kit (SDK), and you will need to set your JAVA_HOME environment variable to the location where your SDK is installed. To write MySQL-based Java programs, you’ll also need a MySQL-specific JDBC driver. Programs in this book use MySQL Connector/J, the driver provided by MySQL AB. Appendix A, has information on getting MySQL Connector/J if it’s not already installed. Appendix B, has information about obtaining an SDK and setting JAVA_HOME.

The following Java program, Connect.java, illustrates how to connect to and disconnect from the MySQL server, and select cookbook as the default database:

// Connect.java - connect to the MySQL server

import java.sql.*;

public class Connect

{

public static void main (String[] args)

{

Connection conn = null;

String url = "jdbc:mysql://localhost/cookbook";

String userName = "cbuser";

String password = "cbpass";

try

{

Class.forName ("com.mysql.jdbc.Driver").newInstance ();

conn = DriverManager.getConnection (url, userName, password);

System.out.println ("Connected");

}

catch (Exception e)

{

System.err.println ("Cannot connect to server");

System.exit (1);

}

if (conn != null)

{

try

{

conn.close ();

System.out.println ("Disconnected");

}

catch (Exception e) { /* ignore close errors */ }

}

The import java.sql.* statement references the classes and interfaces that provide access to the data types you use to manage different aspects of your interaction with the database server. These are required for all JDBC programs.

Connecting to the server is a two-step process. First, register the database driver with JDBC by calling Class.forName(). The Class.forName() method requires a driver name; for MySQL Connector/J, use com.mysql.jdbc.Driver. Then call DriverManager.getConnection()to initiate the connection and obtain a Connection object that maintains information about the state of the connection. Java programs in this book conventionally use conn to signify connection objects.

DriverManager.getConnection()takes three arguments: a URL that describes where to connect and the database to use, the MySQL username, and the password. The URL string has this format:

jdbc:driver://host_name/db_name

This format follows the Java convention that the URL for connecting to a network resource begins with a protocol designator. For JDBC programs, the protocol is jdbc, and you’ll also need a subprotocol designator that specifies the driver name (mysql, for MySQL programs). Many parts of the connection URL are optional, but the leading protocol and subprotocol designators are not. If you omit host_name, the default host value is localhost. If you omit the database name, the connect operation selects no default database. However, you should not omit any of the slashes in any case. For example, to connect to the local host without selecting a default database, the URL is:

jdbc:mysql:///

To try the program, compile it and execute it. The class statement indicates the program’s name, which in this case is Connect. The name of the file containing the program should match this name and include a .java extension, so the filename for the program is Connect.java.^[7^] Compile the program using javac:

%javac Connect.java

If you prefer a different Java compiler, just substitute its name for javac.

The Java compiler generates compiled byte code to produce a class file named Connect.class. Use the java program to run the class file (specified without the .class extension):

%java Connect

Connected

Disconnected

You might need to set your CLASSPATH environment variable before the example program will compile and run. The value of CLASSPATH should include at least your current directory (.) and the path to the MySQL Connector/J JDBC driver. If you need background on running Java programs or setting CLASSPATH, see Appendix B.

BEWARE OF CLASS.FORNAME()!

The example program Connect.java registers the JDBC driver like this:

Class.forName ("com.mysql.jdbc.Driver").newInstance ();

You’re supposed to be able to register drivers without invoking newInstance(), like so:

Class.forName ("com.mysql.jdbc.Driver");

However, that call doesn’t work for some Java implementations, so be sure to use newInstance(), or you may find yourself enacting the Java motto, “write once, debug everywhere.”

Some JDBC drivers (MySQL Connector/J among them) allow you to specify the username and password as parameters at the end of the URL. In this case, you omit the second and third arguments of the getConnection() call. Using that URL style, the code that establishes the connection in the example program can be written like this:

// connect using username and password included in URL

Connection conn = null;

String url = "jdbc:mysql://localhost/cookbook?user=cbuser&password=cbpass";

try

{

Class.forName ("com.mysql.jdbc.Driver").newInstance ();

conn = DriverManager.getConnection (url);

System.out.println ("Connected");

}

The character that separates the user and password parameters should be &, not ;.

Additional connection parameters. MySQL Connector/J does not support Unix domain socket file connections, so even connections for which the hostname is localhost are made via TCP/IP. You can specify an explicit port number by adding :port_num to the hostname in the connection URL:

String url = "jdbc:mysql://mysql.example.com:3307/cookbook";

^[7^]If you make a copy of Connect.java to use as the basis for a new program, you’ll need to change the class name in the class statement to match the name of your new file.

Checking for Errors

Problem

Something went wrong with your program, and you don’t know what.

Solution

Everyone has problems getting programs to work correctly. But if you don’t anticipate difficulties by checking for errors, you make the job a lot harder. Add some error-checking code so that your programs can help you figure out what went wrong.

Discussion

After working through Connecting, Selecting a Database, and Disconnecting, you now know how to connect to the MySQL server. It’s also a good idea to know how to check for errors and how to retrieve specific error information from the API, so that’s what we’ll cover next. You’re probably anxious to see how to do more interesting things (such as issuing statements and getting back the results), but error checking is fundamentally important. Programs sometimes fail, especially during development, and if you don’t know how to determine why failures occur, you’ll be flying blind.

When errors occur, MySQL provides three values:

§ A MySQL-specific error number

§ A MySQL-specific descriptive text error message

§ An SQLSTATE error code that is a five-character value defined according to the ANSI and ODBC standards

Various sections in this recipe in this section show how to access this information. Most of the later recipes in this book that display error information print only the MySQL-specific values, but the recipes here show how to access the SQLSTATE value as well, for those APIs that expose it.

The example programs demonstrate how to check for errors but will in fact execute without any problems if your MySQL account is set up properly. Thus, you may have to modify the examples slightly to force errors to occur so that the error-handling statements are triggered. That is easy to do. For example, you can change a connection-establishment call to supply a bad password.

A general debugging aid that is not specific to any API is to check the MySQL server’s query log to see what statements the server actually is receiving. (This requires that you have query logging enabled and that you have access to the log.) The query log often will show you that a statement is malformed in a particular way and give you a clue that your program is not constructing the proper statement string. If you’re running a script under a web server and it fails, check the web server’s error log.

DON’T SHOOT YOURSELF IN THE FOOT: CHECK FOR ERRORS

The principle that you should check for errors is not so obvious or widely appreciated as one might hope. Many messages posted on MySQL-related mailing lists are requests for help with programs that fail for reasons unknown to the people that wrote them. In a surprising number of cases, the reason these people are mystified by their programs is that they put in no error checking, and thus gave themselves no way to know that there was a problem or to find out what it was! You cannot help yourself this way. Plan for failure by checking for errors so that you can take appropriate action when they occur.

Perl

The DBI module provides two attributes that control what happens when DBI method invocations fail:

§ PrintError, if enabled, causes DBI to print an error message using warn().

§ RaiseError, if enabled, causes DBI to print an error message usingdie(). This terminates your script.

By default, PrintError is enabled, and RaiseError is disabled, so a script continues executing after printing a message if an error occurs. Either or both attributes can be specified in the connect() call. Setting an attribute to 1 or 0 enables or disables it, respectively. To specify either or both attributes, pass them in a hash reference as the fourth argument to the connect() call.

The following code sets only the AutoCommit attribute and uses the default settings for the error-handling attributes. If the connect() call fails, this results in a warning message, but the script continues to execute:

my %conn_attrs = (AutoCommit => 1);

my $dbh = DBI->connect ($dsn, "cbuser", "cbpass", \%conn_attrs);

However, because you really can’t do much if the connection attempt fails, it’s often prudent to exit instead after DBI prints a message:

my %conn_attrs = (AutoCommit => 1);

my $dbh = DBI->connect ($dsn, "cbuser", "cbpass", \%conn_attrs)

or exit;

To print your own error messages, leave RaiseError disabled, and disable PrintError as well. Then test the results of DBI method calls yourself. When a method fails, the $DBI::err, $DBI::errstr, and $DBI::state variables contain the MySQL error number, a descriptive error string, and the SQLSTATE value, respectively:

my %conn_attrs = (PrintError => 0, AutoCommit => 1);

my $dbh = DBI->connect ($dsn, "cbuser", "cbpass", \%conn_attrs)

or die "Connection error: "

. "$DBI::errstr ($DBI::err/$DBI::state)\n";

If no error occurs, $DBI::err will be 0 or undef, $DBI::errstr will be the empty string or undef, and $DBI::state will be empty or 00000.

When you check for errors, access these variables immediately after invoking the DBI method that sets them. If you invoke another method before using them, their values will be reset.

The default settings (PrintError enabled, RaiseError disabled) are not so useful if you’re printing your own messages. In this case, DBI prints a message automatically, and then your script prints its own message. This is at best redundant, and at worst confusing to the person using the script.

If you enable RaiseError, you can call DBI methods without checking for return values that indicate errors. If a method fails, DBI prints an error and terminates your script. If the method returns, you can assume it succeeded. This is the easiest approach for script writers: let DBI do all the error checking! However, if PrintError and RaiseError both are enabled, DBI may call warn() and die() in succession, resulting in error messages being printed twice. To avoid this problem, it’s best to disable PrintError whenever you enable RaiseError. That’s the approach generally used in this book, as illustrated here:

my %conn_attrs = (PrintError => 0, RaiseError => 1, AutoCommit => 1);

my $dbh = DBI->connect ($dsn, "cbuser", "cbpass", \%conn_attrs);

If you don’t want the all-or-nothing behavior of enabling RaiseError for automatic error checking versus having to do all your own checking, you can adopt a mixed approach. Individual handles have PrintError and RaiseError attributes that can be enabled or disabled selectively. For example, you can enable RaiseError globally by turning it on when you call connect(), and then disable it selectively on a per-handle basis. Suppose that you have a script that reads the username and password from the command-line arguments, and then loops while the user enters statements to be executed. In this case, you’d probably want DBI to die and print the error message automatically if the connection fails (there’s not much you can do if the user doesn’t provide a valid name and password). After connecting, on the other hand, you wouldn’t want the script to exit just because the user enters a syntactically invalid statement. It would be better for the script to trap the error, print a message, and then loop to get the next statement. The following code shows how this can be done. The do() method used in the example executes a statement and returnsundef to indicate an error:

my $user_name = shift (@ARGV);

my $password = shift (@ARGV);

my %conn_attrs = (PrintError => 0, RaiseError => 1, AutoCommit => 1);

my $dbh = DBI->connect ($dsn, $user_name, $password, \%conn_attrs);

$dbh->{RaiseError} = 0; # disable automatic termination on error

print "Enter queries to be executed, one per line; terminate with Control-D\n";

while (<>) # read and execute queries

{

$dbh->do ($_) or warn "Query failed: $DBI::errstr ($DBI::err)\en";

}

$dbh->{RaiseError} = 1; # re-enable automatic termination on error

If RaiseError is enabled, you can trap errors without terminating your program by executing code within an eval block. If an error occurs within the block, eval fails and returns a message in the $@ variable. Typically, you use eval something like this:

eval

{

# statements that might fail go here...

};

if ($@)

{

print "An error occurred: $@\n";

}

This technique is commonly used to implement transactions. For an example, see Using Transactions in Perl Programs.

Using RaiseError in combination with eval differs from using RaiseError alone in the following ways:

§ Errors terminate only the eval block, not the entire script.

§ Any error terminates the eval block, whereas RaiseError applies only to DBI-related errors.

When you use eval with RaiseError enabled, be sure to disable PrintError. Otherwise, in some versions of DBI, an error may simply cause warn() to be called without terminating the eval block as you expect.

In addition to using the error-handling attributes PrintError and RaiseError, you can get lots of useful information about your script’s execution by turning on DBI’s tracing mechanism. Invoke the trace() method with an argument indicating the trace level. Levels 1 to 9 enable tracing with increasingly more verbose output, and level 0 disables tracing:

DBI->trace (1); # enable tracing, minimal output

DBI->trace (3); # elevate trace level

DBI->trace (0); # disable tracing

Individual database and statement handles have trace() methods, too. That means you can localize tracing to a single handle if you want.

Trace output normally goes to your terminal (or, in the case of a web script, to the web server’s error log). You can write trace output to a specific file by providing a second argument indicating a filename:

DBI->trace (1, "/tmp/trace.out");

If the trace file already exists, trace output is appended to the end; the file’s contents are not cleared first. Beware of turning on a file trace while developing a script, and then forgetting to disable the trace when you put the script into production. You’ll eventually find to your chagrin that the trace file has become quite large. Or worse, a filesystem will fill up, and you’ll have no idea why!

Ruby

Ruby signals errors by raising exceptions and Ruby programs handle errors by catching exceptions in a rescue clause of a begin block. Ruby DBI methods raise exceptions when they fail and provide error information by means of a DBI::DatabaseError object. To get the MySQL error number, error message, and SQLSTATE value, access the err, errstr, and state methods of this object. The following example shows how to trap exceptions and access error information in a DBI script:

begin

dsn = "DBI:Mysql:host=localhost;database=cookbook"

dbh = DBI.connect(dsn, "cbuser", "cbpass")

puts "Connected"

rescue DBI::DatabaseError => e

puts "Cannot connect to server"

puts "Error code: #{e.err}"

puts "Error message: #{e.errstr}"

puts "Error SQLSTATE: #{e.state}"

exit(1)

end

PHP

A PEAR DB method indicates success or failure by means of its return value. If the method fails, the return value is an error object. If the method succeeds, the return value is something else:

§ The connect() method returns a connection object for interacting with the database server.

§ The query() method for executing SQL statements returns a result set object for statements such as SELECT that return rows, or the DB_OK value for statements such as INSERT, UPDATE, or DELETE that modify rows.

To determine whether a method return value is an error object, pass it to the PEAR::isError() method, check the PEAR::isError() result, and take action accordingly. For example, the following code prints “Connected” if connect() succeeds and exits with a generic error message if not:

$dsn = "mysqli://cbuser:cbpass@localhost/cookbook";

$conn =& DB::connect ($dsn);

if (PEAR::isError ($conn))

die ("Cannot connect to server\n");

print ("Connected\n");

To obtain more specific information when a PEAR DB method fails, use the methods provided by the error object:

§ getCode()and getMessage() return an error number and message, respectively. These are standard values provided by PEAR that are not MySQL specific.

§ getUserInfo()and getDebugInfo() return MySQL-specific information.

The following listing shows how each method displays the error information returned by PEAR DB when a connect error occurs:

$dsn = "mysqli://cbuser:cbpass@localhost/cookbook";

$conn =& DB::connect ($dsn);

if (PEAR::isError ($conn))

{

print ("Cannot connect to server.\n");

printf ("Error code: %d\n", $conn->getCode ());

printf ("Error message: %s\n", $conn->getMessage ());

printf ("Error debug info: %s\n", $conn->getDebugInfo ());

printf ("Error user info: %s\n", $conn->getUserInfo ());

exit (1);

}

Python

Python signals errors by raising exceptions, and Python programs handle errors by catching exceptions in the except clause of a try statement. To obtain MySQL-specific error information, name an exception class, and provide a variable to receive the information. Here’s an example:

try:

conn = MySQLdb.connect (db = "cookbook",

host = "localhost",

user = "cbuser",

passwd = "cbpass")

print "Connected"

except MySQLdb.Error, e:

print "Cannot connect to server"

print "Error code:", e.args[0]

print "Error message:", e.args[1]

sys.exit (1)

If an exception occurs, the first and second elements of e.args are set to the error number and error message, respectively. (Note that the Error class is accessed through the MySQLdb driver module name.)

Java

Java programs handle errors by catching exceptions. If you simply want to do the minimum amount of work, print a stack trace to inform the user where the problem lies:

try

{

/* ... some database operation ... */

}

catch (Exception e)

{

e.printStackTrace ();

}

The stack trace shows the location of the problem but not necessarily what the problem is. It may not be all that meaningful except to you, the program’s developer. To be more specific, you can print the error message and code associated with an exception:

§ All Exception objects support the getMessage() method. JDBC methods may throw exceptions using SQLException objects; these are like Exception objects but also support getErrorCode() and getSQLState() methods. getErrorCode() and getMessage() return the MySQL-specific error number and message string. getSQLState() returns a string containing the SQLSTATE value.

§ You can also get information about nonfatal warnings, which some methods generate using SQLWarning objects. SQLWarning is a subclass of SQLException, but warnings are accumulated in a list rather than thrown immediately, so they don’t interrupt your program, and you can print them at your leisure.

The following example program, Error.java, demonstrates how to access error messages by printing all the error information it can get its hands on. It attempts to connect to the MySQL server and prints exception information if the attempt fails. Then it issues a statement and prints exception and warning information if the statement fails:

// Error.java - demonstrate MySQL error-handling

import java.sql.*;

public class Error

{

public static void main (String[] args)

{

Connection conn = null;

String url = "jdbc:mysql://localhost/cookbook";

String userName = "cbuser";

String password = "cbpass";

try

{

Class.forName ("com.mysql.jdbc.Driver").newInstance ();

conn = DriverManager.getConnection (url, userName, password);

System.out.println ("Connected");

tryQuery (conn); // issue a query

}

catch (Exception e)

{

System.err.println ("Cannot connect to server");

System.err.println (e);

if (e instanceof SQLException) // JDBC-specific exception?

{

// print general message, plus any database-specific message

// (e must be cast from Exception to SQLException to

// access the SQLException-specific methods)

System.err.println ("SQLException: " + e.getMessage ());

System.err.println ("SQLState: "

+ ((SQLException) e).getSQLState ());

System.err.println ("VendorCode: "

+ ((SQLException) e).getErrorCode ());

}

finally

{

if (conn != null)

{

try

{

conn.close ();

System.out.println ("Disconnected");

}

catch (SQLException e)

{

// print general message, plus any database-specific message

System.err.println ("SQLException: " + e.getMessage ());

System.err.println ("SQLState: " + e.getSQLState ());

System.err.println ("VendorCode: " + e.getErrorCode ());

}

public static void tryQuery (Connection conn)

{

try

{

// issue a simple query

Statement s = conn.createStatement ();

s.execute ("USE cookbook");

s.close ();

// print any accumulated warnings

SQLWarning w = conn.getWarnings ();

while (w != null)

{

System.err.println ("SQLWarning: " + w.getMessage ());

System.err.println ("SQLState: " + w.getSQLState ());

System.err.println ("VendorCode: " + w.getErrorCode ());

w = w.getNextWarning ();

}

catch (SQLException e)

{

// print general message, plus any database-specific message

System.err.println ("SQLException: " + e.getMessage ());

System.err.println ("SQLState: " + e.getSQLState ());

System.err.println ("VendorCode: " + e.getErrorCode ());

}

Writing Library Files

Problem

You notice that you’re writing the same code to perform common operations in multiple programs.

Solution

Put routines to perform those operations in a library file, and have your programs access the library. Then write the code only once. You might need to set an environment variable so that your scripts can find the library.

Discussion

This section describes how to put code for common operations in library files. Encapsulation (or modularization) isn’t really a “recipe” so much as a programming technique. Its principal benefit is that you don’t have to repeat code in each program you write. Instead, you just call a routine that’s in the library. For example, by putting the code for connecting to the cookbook database into a library routine, you need not write out all the parameters associated with making that connection. Simply invoke the routine from your program, and you’re connected.

Connection establishment isn’t the only operation you can encapsulate, of course. Later sections in this book develop other utility functions to be placed in library files. All such files, including those shown in this section, can be found under the lib directory of the recipes distribution. As you write your own programs, you’ll probably identify several operations that you perform often and that are good candidates for inclusion in a library. The techniques demonstrated in this section will help you write your own library files.

Library files have other benefits besides making it easier to write programs. They can help portability. For example, if you write connection parameters directly into each program that connects to the MySQL server, you have to change all those programs if you move them to another machine that uses different parameters. If instead you write your programs to connect to the database by calling a library routine, you localize the changes that need to be made: it’s necessary to modify only the affected library routine, not all the programs that use it.

Code encapsulation also can improve security in some ways. If you make a private library file readable only to yourself, only scripts run by you can execute routines in the file. Or suppose that you have some scripts located in your web server’s document tree. A properly configured server will execute the scripts and send their output to remote clients. But if the server becomes misconfigured somehow, the result can be that your scripts are sent to clients as plain text, thus displaying your MySQL username and password. (And you’ll probably realize it too late. Oops.) If the code for establishing a connection to the MySQL server is placed in a library file that’s located outside the document tree, those parameters won’t be exposed to clients.

WARNING

Be aware that if you install a library file to be readable by your web server, you don’t have much security should you share the web server with other developers. Any of those developers can write a web script to read and display your library file because, by default, the script runs with the permissions of the web server and thus will have access to the library.

The examples of programs that follow demonstrate how to write, for each API, a library file that contains a routine for connecting to the cookbook database on the MySQL server. The calling program can use the error-checking techniques discussed in Checking for Errors to determine whether a connection attempt fails. The connection routine for each language except PHP returns a database handle or connection object when it succeeds or raises an exception if the connection cannot be established. The PHP routine returns an object that represents a connection or an error, because that is what the PEAR DB connection method does (it does not raise an exception).

Libraries are of no utility in themselves, so each one’s use is illustrated by a short “test harness” program. You can use any of these harness programs as the basis for creating new programs of your own: make a copy of the file and add your own code between the connect and disconnect calls.

Library file writing involves not only the question of what to put in the file but also subsidiary issues such as where to install the file so it can be accessed by your programs, and (on multiuser systems such as Unix) how to set its access privileges so its contents aren’t exposed to people who shouldn’t see it.

Choosing a library file installation location

If you install a library file in a directory that a language processor searches by default, programs written in that language need do nothing special to access the library. However, if you install a library file in a directory that the language processor does not search by default, you’ll have to tell your scripts how to find the library. There are two common ways to do this:

§ Most languages provide a statement that can be used within a script to add directories to the language processor search path. This requires that you modify each script that needs the library.

§ You can set an environment or configuration variable that changes the language processor search path. This approach requires that each user who uses scripts that require the library to set the appropriate variable. Alternatively, if the language processor has a configuration file, you might be able to set a parameter in the file that affects scripts globally for all users.

We’ll use the second approach. For our API languages, the following table shows the relevant variables. In each case, the variable value is a directory or list of directories.

Language	Variable name	Variable type
Perl	PERL5LIB	Environment variable
Ruby	RUBYLIB	Environment variable
PHP	include_path	Configuration variable
Python	PYTHONPATH	Environment variable
Java	CLASSPATH	Environment variable

For general information on setting environment variables, see Appendix B. You can use those instructions to set environment variables to the values in the following discussion.

Suppose that you want to install library files in a directory that language processors do not search by default. For purposes of illustration, let’s use /usr/local/lib/mcb on Unix or C:\lib\mcb on Windows. (To put the files somewhere else, adjust the pathnames in the variable settings accordingly. For example, you might want to use a different directory, or you might want to put libraries for each language in separate directories.)

Under Unix, if you put Perl library files in the /usr/local/lib/mcb directory, you can set the PERL5LIB environment variable. For a shell in the Bourne shell family (sh, bash, ksh), set the variable like this in the appropriate startup file:

export PERL5LIB=/usr/local/lib/mcb

NOTE

If you are using the original Bourne shell, sh, you may need to split this into two commands:

PERL5LIB=/usr/local/lib/mcb

export PERL5LIB

For a shell in the C shell family (csh, tcsh), set PERL5LIB like this in your .login file:

setenv PERL5LIB /usr/local/lib/mcb

Under Windows, if you put Perl library files in C:\lib\mcb, you can set PERL5LIB as follows:

PERL5LIB=C:\lib\mcb

In each case, the variable setting tells Perl to look in the specified directory for library files, in addition to whatever other directories it would search by default. If you set PERL5LIB to name multiple directories, the separator character between directory pathnames is colon (:) on Unix orsemicolon (;) on Windows.

The other environment variables (RUBYLIB, PYTHONPATH, and CLASSPATH) are specified using the same syntax.

NOTE

Setting these environment variables as just discussed should suffice for scripts that you run from the command line. But for scripts that are intended to be executed by a web server, you’ll likely have to configure the server as well so that it can find the library files. See Using Apache to Run Web Scripts for details on how to do this.

For PHP, the search path is defined by the value of the include_path variable in the php.ini PHP initialization file. On Unix, the file’s pathname is likely to be /usr/lib/php.ini or /usr/local/lib/php.ini. Under Windows, the file is likely to be found in the Windows directory or under the main PHP installation directory. The value of include_path is defined with a line like this:

include_path = "value"

value is specified using the same syntax as for environment variables that name directories. That is, it’s a list of directory names, with the names separated by colons on Unix or semicolons on Windows. For example, on Unix, if you want PHP to look for include files in the current directory and in /usr/local/lib/mcb, set include_path like this:

include_path = ".:/usr/local/lib/mcb"

On Windows, to search the current directory and C:\lib\mcb, set include_path like this:

include_path = ".;C:\lib\mcb"

If you modify the php.ini file, and PHP is running as an Apache module, you’ll need to restart Apache to make your changes take effect.

Setting library file access privileges

Questions about file ownership and access mode are issues about which you’ll need to make decisions if you’re using a multiple-user system such as Unix:

§ If a library file is private and contains code to be used only by you, the file can be placed under your own account and made accessible only to you. Assuming that a library file mylib is already owned by you, you can make it private like this:

%chmod 600 mylib

§ If the library file is to be used only by your web server, you can install it in a server library directory and make the file owned by and accessible only to the server user ID. You may need to be root to do this. For example, if the web server runs as wwwusr, the following commands make the file private to that user:

§ #chown wwwusr mylib

# chmod 600 mylib

§ If the library file is public, you can place it in a location that your programming language searches automatically when it looks for libraries. (Most language processors search for libraries in some default set of directories.) You may need to be root to install files in one of these directories. Then you can make the file world-readable:

#chmod 444 mylib

Now let’s construct a library for each API. Each section here demonstrates how to write the library file itself and discusses how to use the library from within programs.

Perl

In Perl, library files are called modules, and typically have an extension of .pm (“Perl module”). Here’s a sample module file, Cookbook.pm, that implements a module named Cookbook. (It’s conventional for the basename of a Perl module file to be the same as the identifier on the packageline in the file.)

package Cookbook;

# Cookbook.pm - library file with utility method for connecting to MySQL

# via Perl DBI module

use strict;

use warnings;

use DBI;

my $db_name = "cookbook";

my $host_name = "localhost";

my $user_name = "cbuser";

my $password = "cbpass";

my $port_num = undef;

my $socket_file = undef;

# Establish a connection to the cookbook database, returning a database

# handle. Raise an exception if the connection cannot be established.

sub connect

{

my $dsn = "DBI:mysql:host=$host_name";

my %conn_attrs = (PrintError => 0, RaiseError => 1, AutoCommit => 1);

$dsn .= ";database=$db_name" if defined $db_name;

$dsn .= ";mysql_socket=$socket_file" if defined $socket_file;

$dsn .= ";port=$port_num" if defined $port_num;

return (DBI->connect ($dsn, $user_name, $password, \%conn_attrs));

}

1; # return true

The module encapsulates the code for establishing a connection to the MySQL server into a connect() method, and the package identifier establishes a Cookbook namespace for the module, so you invoke the connect() method using the module name:

$dbh = Cookbook::connect ();

The final line of the module file is a statement that trivially evaluates to true. This is needed because Perl assumes that something is wrong with a module and exits after reading it if the module doesn’t return a true value.

Perl locates library files by searching through the directories named in its @INC array. This array contains a default list of directories. To check the value of this variable on your system, invoke Perl as follows at the command line:

%perl -V

The last part of the output from the command shows the directories listed in the @INC array. If you install a library file in one of those directories, your scripts will find it automatically. If you install the module somewhere else, you need to tell your scripts where to find it by setting thePERL5LIB environment variable, as discussed earlier in the introduction to this recipe.

After installing the Cookbook.pm module, try it from a test harness script harness.pl written as follows:

#!/usr/bin/perl

# harness.pl - test harness for Cookbook.pm library

use strict;

use warnings;

use Cookbook;

my $dbh;

eval

{

$dbh = Cookbook::connect ();

print "Connected\n";

};

die "$@" if $@;

$dbh->disconnect ();

print "Disconnected\n";

harness.pl has no useDBI statement. It’s not necessary because the Cookbook.php library file itself imports the DBI module, so any script that uses Cookbook also gains access to DBI.

If you don’t want to bother catching connection errors explicitly, you can write the body of the script more simply. In this case, Perl will catch any connection exception and terminate the script after printing the error message generated by the connect() method:

my $dbh = Cookbook::connect ();

print "Connected\n";

$dbh->disconnect ();

print "Disconnected\n";

Ruby

The following Ruby library file, Cookbook.rb, defines a Cookbook class that implements a connect method:

# Cookbook.rb - library file with utility method for connecting to MySQL

# via Ruby DBI module

require "dbi"

# Establish a connection to the cookbook database, returning a database

# handle. Raise an exception if the connection cannot be established.

class Cookbook

@@host = "localhost"

@@db_name = "cookbook"

@@user_name = "cbuser"

@@password = "cbpass"

# class method for connecting to server to access

# cookbook database; returns database handle object.

def Cookbook.connect

return DBI.connect("DBI:Mysql:host=#{@@host};database=#{@@db_name}",

@@user_name, @@password)

end

The connect method is defined in the library as Cookbook.connect because Ruby class methods are defined as class_name.method_name.

Ruby locates library files by searching through the directories named in its $LOAD_PATH variable (also known as $:), which is an array that contains a default list of directories. To check the value of this variable on your system, use Ruby to execute this statement:

puts $LOAD_PATH

If you install a library file in one of those directories, your scripts will find it automatically. If you install the file somewhere else, you need to tell your scripts where to find it by setting the RUBYLIB environment variable, as discussed earlier in the introduction to this recipe.

After installing the Cookbook.rb library file, try it from a test harness script harness.rb written as follows:

#!/usr/bin/ruby -w

# harness.rb - test harness for Cookbook.rb library

require "Cookbook"

begin

dbh = Cookbook.connect

print "Connected\n"

rescue DBI::DatabaseError => e

puts "Cannot connect to server"

puts "Error code: #{e.err}"

puts "Error message: #{e.errstr}"

exit(1)

end

dbh.disconnect

print "Disconnected\n"

harness.rb has no require statement for the DBI module. It’s not necessary, because the Cookbook module itself imports DBI, so any script that imports Cookbook also gains access to DBI.

If you just want a script to die if an error occurs without checking for an exception yourself, write the body of the script like this:

dbh = Cookbook.connect

print "Connected\n"

dbh.disconnect

print "Disconnected\n"

PHP

The contents of PHP library files are written like regular PHP scripts. You can write such a file, Cookbook.php, that implements a Cookbook class with a connect() method as follows:

<?php

# Cookbook.php - library file with utility method for connecting to MySQL

# via PEAR DB module

require_once "DB.php";

class Cookbook

{

# Establish a connection to the cookbook database, returning a

# connection object, or an error object if an error occurs.

function connect ()

{

$dsn = array (

"phptype" => "mysqli",

"username" => "cbuser",

"password" => "cbpass",

"hostspec" => "localhost",

"database" => "cookbook"

);

return (DB::connect ($dsn));

}

} # end Cookbook

Although most PHP examples throughout this book don’t show the <?php and ?> tags, I’ve shown them as part of Cookbook.php here to emphasize that library files must enclose all PHP code within those tags. The PHP interpreter doesn’t make any assumptions about the contents of a library file when it begins parsing it because you might include a file that contains nothing but HTML. Therefore, you must use <?php and ?> to specify explicitly which parts of the library file should be considered as PHP code rather than as HTML, just as you do in the main script.

PHP looks for libraries by searching the directories named in the value of the include_path variable in the PHP initialization file, as described earlier in the introduction to this recipe. Assuming that Cookbook.php is installed in one of those directories, you can access it from a test harness script harness.php written as follows:

<?php

# harness.php - test harness for Cookbook.php library

require_once "Cookbook.php";

$conn =& Cookbook::connect ();

if (PEAR::isError ($conn))

die ("Cannot connect to server: " . $conn->getMessage () . "\n");

print ("Connected\n");

$conn->disconnect ();

print ("Disconnected\n");

harness.php has no statement to include DB.php. It’s not necessary because the Cookbook module itself includes DB.php, which gives any script that includes Cookbook.php access to DB.php.

WHERE SHOULD PHP LIBRARY FILES BE INSTALLED?

PHP scripts often are placed in the document tree of your web server, and clients can request them directly. For PHP library files, I recommend that you place them somewhere outside the document tree, especially if (like Cookbook.php) they contain usernames and passwords. This is particularly true if you use a different extension such as .inc for the names of include files. If you do that and install include files in the document tree, they might be requested directly by clients and be displayed as plain text, exposing their contents. To prevent that from happening, reconfigure Apache so that it treats files with the .inc extension as PHP code to be processed by the PHP interpreter rather than being displayed literally.

Python

Python libraries are written as modules and referenced from scripts using import or from statements. To create a method for connecting to MySQL, we can write a module file Cookbook.py:

# Cookbook.py - library file with utility method for connecting to MySQL

# via MySQLdb module

import MySQLdb

host_name = "localhost"

db_name = "cookbook"

user_name = "cbuser"

password = "cbpass"

# Establish a connection to the cookbook database, returning a connection

# object. Raise an exception if the connection cannot be established.

def connect ():

return MySQLdb.connect (db = db_name,

host = host_name,

user = user_name,

passwd = password)

The filename basename determines the module name, so the module is called Cookbook. Module methods are accessed through the module name; thus you would invoke the connect() method of the Cookbook module like this:

conn = Cookbook.connect ();

The Python interpreter searches for modules in directories named in the sys.path variable. You can find out what the default value of sys.path is on your system by running Python interactively and entering a couple of commands:

%python

>>> import sys

>>> sys.path

If you install Cookbook.py in one of the directories named by sys.path, your scripts will find it with no special handling. If you install Cookbook.py somewhere else, you’ll need to set the PYTHONPATH environment variable, as discussed earlier in the introduction to this recipe.

After installing the Cookbook.py library file, try it from a test harness script harness.py written as follows:

#!/usr/bin/python

# harness.py - test harness for Cookbook.py library

import sys

import MySQLdb

import Cookbook

try:

conn = Cookbook.connect ()

print "Connected"

except MySQLdb.Error, e:

print "Cannot connect to server"

print "Error code:", e.args[0]

print "Error message:", e.args[1]

sys.exit (1)

conn.close ()

print "Disconnected"

NOTE

The Cookbook.py file imports the MySQLdb module, but a script that imports Cookbook does not thereby gain access to MySQLdb. If the script needs MySQLdb-specific information (such as MySQLdb.Error), the script must also import MySQLdb.

If you just want a script to die if an error occurs without checking for an exception yourself, write the body of the script like this:

conn = Cookbook.connect ()

print "Connected"

conn.close ()

print "Disconnected"

Java

Java library files are similar to Java programs in most ways:

§ The class line in the source file indicates a class name.

§ The file should have the same name as the class (with a .java extension).

§ You compile the .java file to produce a .class file.

Java library files also differ from Java programs in some ways:

§ Unlike regular program files, Java library files have no main() function.

§ A library file should begin with a package identifier that specifies the location of the class within the Java namespace.

A common convention for Java package identifiers is to begin them with the reverse domain of the code author; this helps make identifiers unique and avoids conflict with classes written by other authors. Domain names proceed right to left from more general to more specific within the domain namespace, whereas the Java class namespace proceeds left to right from general to specific. Thus, to use a domain as the prefix for a package name within the Java class namespace, it’s necessary to reverse it. In my case, the domain is kitebird.com, so if I write a library file and place it under mcb within my domain’s namespace, the library begins with a package statement like this:

package com.kitebird.mcb;

Java packages developed for this book are placed within the com.kitebird.mcb namespace to ensure their uniqueness in the package namespace.

The following library file, Cookbook.java, defines a Cookbook class that implements a connect() method for connecting to the cookbook database. connect() returns a Connection object if it succeeds, and throws an exception otherwise. To help the caller deal with failures, theCookbook class also defines getErrorMessage() and printErrorMessage() utility methods that return the error message as a string or print it to System.err.

// Cookbook.java - library file with utility method for connecting to MySQL

// via MySQL Connector/J

package com.kitebird.mcb;

import java.sql.*;

public class Cookbook

{

// Establish a connection to the cookbook database, returning

// a connection object. Throw an exception if the connection

// cannot be established.

public static Connection connect () throws Exception

{

String url = "jdbc:mysql://localhost/cookbook";

String user = "cbuser";

String password = "cbpass";

Class.forName ("com.mysql.jdbc.Driver").newInstance ();

return (DriverManager.getConnection (url, user, password));

}

// Return an error message as a string

public static String getErrorMessage (Exception e)

{

StringBuffer s = new StringBuffer ();

if (e instanceof SQLException) // JDBC-specific exception?

{

// print general message, plus any database-specific message

s.append ("Error message: " + e.getMessage () + "\n");

s.append ("Error code: " + ((SQLException) e).getErrorCode () + "\n");

}

else

{

s.append (e + "\n");

}

return (s.toString ());

}

// Get the error message and print it to System.err

public static void printErrorMessage (Exception e)

{

System.err.println (Cookbook.getErrorMessage (e));

}

The routines within the class are declared using the static keyword, which makes them class methods rather than instance methods. That is done here because the class is used directly rather than creating an object from it and invoking the methods through the object.

To use the Cookbook.java file, compile it to produce Cookbook.class, and then install the class file in a directory that corresponds to the package identifier. This means that Cookbook.class should be installed in a directory named com/kitebird/mcb (or com\kitebird\mcb under Windows) that is located under some directory named in your CLASSPATH setting. For example, if CLASSPATH includes /usr/local/lib/mcb under Unix, you can install Cookbook.class in the /usr/local/lib/mcb/com/kitebird/mcb directory. (See the Java discussion in Connecting, Selecting a Database, and Disconnecting for more information about the CLASSPATH variable.)

To use the Cookbook class from within a Java program, import it, and then invoke the Cookbook.connect() method. The following test harness program, Harness.java, shows how to do this:

// Harness.java - test harness for Cookbook library class

import java.sql.*;

import com.kitebird.mcb.Cookbook;

public class Harness

{

public static void main (String[] args)

{

Connection conn = null;

try

{

conn = Cookbook.connect ();

System.out.println ("Connected");

}

catch (Exception e)

{

Cookbook.printErrorMessage (e);

System.exit (1);

}

finally

{

if (conn != null)

{

try

{

conn.close ();

System.out.println ("Disconnected");

}

catch (Exception e)

{

String err = Cookbook.getErrorMessage (e);

System.out.println (err);

}

Harness.java also shows how to use the error message utility methods from the Cookbook class when a MySQL-related exception occurs:

§ printErrorMessage() takes the exception object and uses it to print an error message to System.err.

§ getErrorMessage() returns the error message as a string. You can display the message yourself, write it to a logfile, or whatever.

Issuing Statements and Retrieving Results

Problem

You want your program to send an SQL statement to the MySQL server and retrieve whatever result it produces.

Solution

Some statements return only a status code; others return a result set (a set of rows). Some APIs provide different methods for issuing each type of statement. If so, use the method that’s appropriate for the statement to be executed.

Discussion

There are two general categories of SQL statements that you can execute. Some statements retrieve information from the database; others make changes to that information. These two types of statements are handled differently. In addition, some APIs provide several different routines for issuing statements, which complicates matters further. Before we get to the examples demonstrating how to issue statements from within each API, I’ll show the database table that the examples use, and then further discuss the two statement categories and outline a general strategy for processing statements in each category.

In Chapter 1, we created a table named limbs to try some sample statements. In this chapter, we’ll use a different table named profile. It’s based on the idea of a “buddy list,” that is, the set of people we like to keep in touch with while we’re online. To maintain a profile about each person, we can use the following table definition:

CREATE TABLE profile

(

id INT UNSIGNED NOT NULL AUTO_INCREMENT,

name CHAR(20) NOT NULL,

birth DATE,

color ENUM('blue','red','green','brown','black','white'),

foods SET('lutefisk','burrito','curry','eggroll','fadge','pizza'),

cats INT,

PRIMARY KEY (id)

);

The profile table indicates the things that are important to us about each buddy: name, age, favorite color, favorite foods, and number of cats—obviously one of those goofy tables that are used only for examples in a book!^[8^] The table also includes an id column containing unique values so that we can distinguish rows from each other, even if two buddies have the same name. id and name are declared as NOT NULL because they’re each required to have a value. The other columns are allowed to be NULL (and that is implicitly their default value) because we might not know the value to put into them for any given individual. That is, we’ll use NULL to signify “unknown.”

Notice that although we want to keep track of age, there is no age column in the table. Instead, there is a birth column of DATE type. That’s because ages change, but birthdays don’t. If we recorded age values, we’d have to keep updating them. Storing the birth date is better because it’s stable, and we can use it to calculate age at any given moment. (Calculating Ages discusses age calculations.) color is an ENUM column; color values can be any one of the listed values. foods is a SET, which allows the value to be chosen as any combination of the individual set members. That way we can record multiple favorite foods for any buddy.

To create the table, use the profile.sql script in the tables directory of the recipes distribution. Change location into that directory, and then run the following command:

%mysql cookbook < profile.sql

Another way to create the table is to issue the CREATE TABLE statement manually from within the mysql program, but I recommend that you use the script because it also loads sample data into the table. That way you can experiment with the table, and then restore it if you change its contents by running the script again. (See the last recipe of this chapter on the importance of restoring the profile table.)

The initial contents of the profile table as loaded by the profile.sql script look like this:

mysql>SELECT * FROM profile;

+----+---------+------------+-------+-----------------------+------+

+----+---------+------------+-------+-----------------------+------+

| 1 | Fred | 1970-04-13 | black | lutefisk,fadge,pizza | 0 |

| 2 | Mort | 1969-09-30 | white | burrito,curry,eggroll | 3 |

| 3 | Brit | 1957-12-01 | red | burrito,curry,pizza | 1 |

| 4 | Carl | 1973-11-02 | red | eggroll,pizza | 4 |

| 5 | Sean | 1963-07-04 | blue | burrito,curry | 5 |

| 6 | Alan | 1965-02-14 | red | curry,fadge | 1 |

| 7 | Mara | 1968-09-17 | green | lutefisk,fadge | 1 |

| 8 | Shepard | 1975-09-02 | black | curry,pizza | 2 |

| 9 | Dick | 1952-08-20 | green | lutefisk,fadge | 0 |

| 10 | Tony | 1960-05-01 | white | burrito,pizza | 0 |

+----+---------+------------+-------+-----------------------+------+

Although most of the columns in the profile table allow NULL values, none of the rows in the sample dataset actually contain NULL yet. That is because NULL values complicate statement processing a bit and I don’t want to deal with those complications until we get to Recipes and .

SQL statement categories

SQL statements can be divided into two broad categories:

§ Statements that do not return a result set (that is, a set of rows). Statements in this category include INSERT, DELETE, and UPDATE. As a general rule, statements of this type generally change the database in some way. There are some exceptions, such as USE db_name, which changes the default (current) database for your session without making any changes to the database itself. The example data-modifying statement used in this section is an UPDATE:

UPDATE profile SET cats = cats+1 WHERE name = 'Fred'

We’ll cover how to issue this statement and determine the number of rows that it affects.

§ Statements that return a result set, such as SELECT, SHOW, EXPLAIN, and DESCRIBE. I refer to such statements generically as SELECT statements, but you should understand that category to include any statement that returns rows. The example row-retrieval statement used in this section is a SELECT:

SELECT id, name, cats FROM profile

We’ll cover how to issue this statement, fetch the rows in the result set, and determine the number of rows and columns in the result set. (If you need information such as the column names or data types, you must access the result set metadata. Obtaining Result Set Metadata shows how to do this.)

The first step in processing an SQL statement is to send it to the MySQL server for execution. Some APIs (those for Perl, Ruby, and Java, for example) recognize a distinction between the two categories of statements and provide separate calls for executing them. Other APIs (such as those for PHP and Python) have a single call that can be used for any statement. However, one thing all APIs have in common is that you don’t use any special character to indicate the end of the statement. No terminator is necessary because the end of the statement string implicitly terminates the statement. This differs from the way that you issue statements in the mysql program, where you terminate statements using a semicolon (;) or \g. (It also differs from the way that I normally show the syntax for SQL statements in examples, because usually I include semicolons to make it clear where statements end.)

When you send a statement to the server, you should be prepared to handle errors if it did not execute successfully. Do not neglect to do this! If a statement fails and you proceed on the basis that it succeeded, your program won’t work. For the most part, error-checking code is not shown in this section, but that is for brevity. The sample scripts in the recipes distribution from which the examples are taken do include error handling, based on the techniques illustrated in Checking for Errors.

If a statement does execute without error, your next step depends on the type of statement you issued. If it’s one that returns no result set, there’s nothing else to do, unless you want to check how many rows were affected by the statement. If the statement does return a result set, you can fetch its rows, and then close the result set. If you are executing a statement in a context where you don’t necessarily know whether it returns a result set, Determining Whether a Statement Produced a Result Set discusses how to tell.

Perl

The Perl DBI module provides two basic approaches to SQL statement execution, depending on whether you expect to get back a result set. To issue a statement such as INSERT or UPDATE that returns no result set, use the database handle do() method. It executes the statement and returns the number of rows affected by it, or undef if an error occurs. For example, if Fred gets a new cat, the following statement can be used to increment his cats count by one:

my $count = $dbh->do ("UPDATE profile SET cats = cats+1

WHERE name = 'Fred'");

if ($count) # print row count if no error occurred

{

$count += 0;

print "Number of rows updated: $count\n";

}

If the statement executes successfully but affects no rows, do() returns a special value, "0E0" (that is, the value zero in scientific notation, expressed as a string). "0E0" can be used for testing the execution status of a statement because it is true in Boolean contexts (unlike undef). For successful statements, it can also be used when counting how many rows were affected because it is treated as the number zero in numeric contexts. Of course, if you print that value as is, you’ll print "0E0", which might look kind of weird to people who use your program. The preceding example shows one way to make sure that doesn’t happen: add zero to the value to explicitly coerce it to numeric form so that it displays as 0. You can also use printf with a %d format specifier to cause an implicit numeric conversion:

my $count = $dbh->do ("UPDATE profile SET cats = cats+1

WHERE name = 'Fred'");

if ($count) # print row count if no error occurred

{

printf "Number of rows updated: %d\n", $count;

}

If RaiseError is enabled, your script terminates automatically if a DBI-related error occurs, so you don’t need to bother checking $count to see whether do() failed. In that case, you can simplify the code somewhat:

my $count = $dbh->do ("UPDATE profile SET cats = cats+1

WHERE name = 'Fred'");

printf "Number of rows updated: %d\n", $count;

To process a statement such as SELECT that does return a result set, use a different approach that involves several steps:

1. Specify the statement to be executed by calling prepare() using the database handle. If prepare() is successful, it returns a statement handle to use with all subsequent operations on the statement. (If an error occurs, the script terminates if RaiseError is enabled; otherwise,prepare() returns undef.)

2. Call execute() to execute the statement and generate the result set.

3. Perform a loop to fetch the rows returned by the statement. DBI provides several methods that you can use in this loop; we cover them shortly.

4. If you don’t fetch the entire result set, release resources associated with it by calling finish().

The following example illustrates these steps, using fetchrow_array() as the row-fetching method and assuming that RaiseError is enabled so that errors terminate the script:

my $sth = $dbh->prepare ("SELECT id, name, cats FROM profile");

$sth->execute ();

my $count = 0;

while (my @val = $sth->fetchrow_array ())

{

print "id: $val[0], name: $val[1], cats: $val[2]\n";

++$count;

}

$sth->finish ();

print "Number of rows returned: $count\n";

The row-fetching loop just shown is followed by a call to finish(), which closes the result set and tells the server that it can free any resources associated with it. You don’t actually need to call finish() if you fetch every row in the set, because DBI notices when you’ve reached the last row and releases the set for you. The only time it’s important to invoke finish() explicitly is when you don’t fetch the entire result set. Thus, the example could have omitted the finish() call without ill effect.

The example illustrates that if you want to know how many rows a result set contains, you should count them yourself while you’re fetching them. Do not use the DBI rows() method for this purpose; the DBI documentation discourages this practice. (The reason is that rows() is not necessarily reliable for SELECT statements—not because of some deficiency in DBI, but because of differences in the behavior of various database engines.)

DBI has several methods that fetch a row at a time. The one used in the previous example, fetchrow_array(), returns an array containing the next row or an empty list when there are no more rows. Array elements are accessed as $val[0], $val[1], ..., and are present in the array in the same order they are named in the SELECT statement. The size of the array tells you how many columns the result set has.

The fetchrow_array() method is most useful for statements that explicitly name the columns to select. (If you retrieve columns with SELECT *, there are no guarantees about the positions of columns within the array.)

fetchrow_arrayref() is like fetchrow_array(), except that it returns a reference to the array or undef when there are no more rows. Array elements are accessed as $ref->[0], $ref->[1], and so forth. As with fetchrow_array(), the values are present in the order named in the statement:

my $sth = $dbh->prepare ("SELECT id, name, cats FROM profile");

$sth->execute ();

my $count = 0;

while (my $ref = $sth->fetchrow_arrayref ())

{

print "id: $ref->[0], name: $ref->[1], cats: $ref->[2]\n";

++$count;

}

print "Number of rows returned: $count\n";

fetchrow_hashref()returns a reference to a hash structure, or undef when there are no more rows:

my $sth = $dbh->prepare ("SELECT id, name, cats FROM profile");

$sth->execute ();

my $count = 0;

while (my $ref = $sth->fetchrow_hashref ())

{

print "id: $ref->{id}, name: $ref->{name}, cats: $ref->{cats}\n";

++$count;

}

print "Number of rows returned: $count\n";

To access the elements of the hash, use the names of the columns that are selected by the statement ($ref->{id}, $ref->{name}, and so forth). fetchrow_hashref() is particularly useful for SELECT * statements because you can access elements of rows without knowing anything about the order in which columns are returned. You just need to know their names. On the other hand, it’s more expensive to set up a hash than an array, so fetchrow_hashref() is slower than fetchrow_array() or fetchrow_arrayref(). It’s also possible to “lose” row elements if they have the same name because column names must be unique. The following statement selects two values, but fetchrow_hashref() would return a hash structure containing a single element named id:

SELECT id, id FROM profile

To avoid this problem, you can use column aliases to ensure that like-named columns have distinct names in the result set. The following statement retrieves the same columns as the previous statement, but gives them the distinct names id and id2:

SELECT id, id AS id2 FROM profile

Admittedly, this statement is pretty silly. However, if you’re retrieving columns from multiple tables, you can very easily run into the problem of having columns in the result set that have the same name. An example where this occurs may be seen in Referring to Join Output Column Names in Programs.

In addition to the methods for performing the statement execution process just described, DBI provides several high-level retrieval methods that issue a statement and return the result set in a single operation. All of these are database handle methods that create and dispose of the statement handle internally before returning the result set. Where the methods differ is the form in which they return the result. Some return the entire result set, others return a single row or column of the set, as summarized in the following table:

Method	Return value
selectrow_array()	First row of result set as an array
selectrow_arrayref()	First row of result set as a reference to an array
selectrow_hashref()	First row of result set as a reference to a hash
selectcol_arrayref()	First column of result set as a reference to an array
selectall_arrayref()	Entire result set as a reference to an array of array references
selectall_hashref()	Entire result set as a reference to a hash of hash references

Most of these methods return a reference. The exception is selectrow_array(), which selects the first row of the result set and returns an array or a scalar, depending on how you call it. In array context, selectrow_array() returns the entire row as an array (or the empty list if no row was selected). This is useful for statements from which you expect to obtain only a single row:

my @val = $dbh->selectrow_array ("SELECT name, birth, foods FROM profile

WHERE id = 3");

When selectrow_array() is called in array context, the return value can be used to determine the size of the result set. The column count is the number of elements in the array, and the row count is 1 or 0:

my $ncols = @val;

my $nrows = ($ncols ? 1 : 0);

You can also invoke selectrow_array() in scalar context, in which case it returns only the first column from the row. This is especially convenient for statements that return a single value:

my $buddy_count = $dbh->selectrow_array ("SELECT COUNT(*) FROM profile");

If a statement returns no result, selectrow_array() returns an empty array or undef, depending on whether you call it in array or scalar context.

selectrow_arrayref() and selectrow_hashref() select the first row of the result set and return a reference to it or undef if no row was selected. To access the column values, treat the reference the same way you treat the return value from fetchrow_arrayref() orfetchrow_hashref(). You can also use the reference to get the row and column counts:

my $ref = $dbh->selectrow_arrayref ($stmt);

my $ncols = (defined ($ref) ? @{$ref} : 0);

my $nrows = ($ncols ? 1 : 0);

my $ref = $dbh->selectrow_hashref ($stmt);

my $ncols = (defined ($ref) ? keys (%{$ref}) : 0);

my $nrows = ($ncols ? 1 : 0);

With selectcol_arrayref(), a reference to a single-column array is returned, representing the first column of the result set. Assuming a non-undef return value, elements of the array are accessed as $ref->[ i ] for the value from row i. The number of rows is the number of elements in the array, and the column count is 1 or 0:

my $ref = $dbh->selectcol_arrayref ($stmt);

my $nrows = (defined ($ref) ? @{$ref} : 0);

my $ncols = ($nrows ? 1 : 0);

selectall_arrayref() returns a reference to an array, where the array contains an element for each row of the result. Each of these elements is a reference to an array. To access row i of the result set, use $ref->[i] to get a reference to the row. Then treat the row reference the same way as a return value from fetchrow_arrayref() to access individual column values in the row. The result set row and column counts are available as follows:

my $ref = $dbh->selectall_arrayref ($stmt);

my $nrows = (defined ($ref) ? @{$ref} : 0);

my $ncols = ($nrows ? @{$ref->[0]} : 0);

selectall_hashref() is somewhat similar to selectall_arrayref() but returns a reference to a hash, each element of which is a hash reference to a row of the result. To call it, specify an argument that indicates which column to use for hash keys. For example, if you’re retrieving rows from the profile table, the primary key is the id column:

my $ref = $dbh->selectall_hashref ("SELECT * FROM profile", "id");

Then access rows using the keys of the hash. For example, if one of the rows has a key column value of 12, the hash reference for the row is accessed as $ref->{12}. That row value is keyed on column names, which you can use to access individual column elements (for example, $ref->{12}->{name}). The result set row and column counts are available as follows:

my @keys = (defined ($ref) ? keys (%{$ref}) : ());

my $nrows = scalar (@keys);

my $ncols = ($nrows ? keys (%{$ref->{$keys[0]}}) : 0);

The various selectall_ XXX () methods are useful when you need to process a result set more than once because Perl DBI provides no way to “rewind” a result set. By assigning the entire result set to a variable, you can iterate through its elements multiple times.

Take care when using the high-level methods if you have RaiseError disabled. In that case, a method’s return value may not always enable you to distinguish an error from an empty result set. For example, if you call selectrow_array() in scalar context to retrieve a single value, anundef return value is particularly ambiguous because it may indicate any of three things: an error, an empty result set, or a result set consisting of a single NULL value. If you need to test for an error, you can check the value of $DBI::errstr, $DBI::err, or $DBI::state.

Ruby

As with Perl DBI, Ruby DBI provides two approaches to SQL statement execution. With either approach, if a statement-execution method fails with an error, it raises an exception.

For statements such as INSERT or UPDATE that return no result set, invoke the do database handle method. Its return value indicates the number of rows affected:

count = dbh.do("UPDATE profile SET cats = cats+1 WHERE name = 'Fred'")

puts "Number of rows updated: #{count}"

For statements such as SELECT that return a result set, invoke the execute database handle method. execute returns a statement handle to use for fetching the rows of the result set. The statement handle has several methods of its own that enable row fetching to be done in different ways. After you are done with the statement handle, invoke its finish method. (Unlike Perl DBI, where it is necessary to invoke finish only if you do not fetch the entire result set, in Ruby you should call finish for every statement handle that you create.) To determine the number of rows in the result set, count them as you fetch them.

The following example executes a SELECT statement and uses the statement handle’s fetch method in a while loop:

count = 0

sth = dbh.execute("SELECT id, name, cats FROM profile")

while row = sth.fetch do

printf "id: %s, name: %s, cats: %s\n", row[0], row[1], row[2]

count += 1

end

sth.finish

puts "Number of rows returned: #{count}"

You can also use fetch as an iterator that returns each row in turn:

count = 0

sth = dbh.execute("SELECT id, name, cats FROM profile")

sth.fetch do |row|

printf "id: %s, name: %s, cats: %s\n", row[0], row[1], row[2]

count += 1

end

sth.finish

puts "Number of rows returned: #{count}"

In iterator context (such as just shown), the each method is a synonym for fetch.

The fetch method returns DBI::Row objects. As just shown, you can access column values within the row by position (beginning with 0). DBI::Row objects also provide access to columns by name:

sth.fetch do |row|

printf "id: %s, name: %s, cats: %s\n",

row["id"], row["name"], row["cats"]

end

To fetch all rows at once, use fetch_all, which returns an array of DBI::Row objects:

sth = dbh.execute("SELECT id, name, cats FROM profile")

rows = sth.fetch_all

sth.finish

rows.each do |row|

printf "id: %s, name: %s, cats: %s\n",

row["id"], row["name"], row["cats"]

end

row.size tells you the number of columns in the result set.

To fetch each row as a hash keyed on column names, use the fetch_hash method. It can be called in a loop or used as an iterator. The following example shows the iterator approach:

count = 0

sth = dbh.execute("SELECT id, name, cats FROM profile")

sth.fetch_hash do |row|

printf "id: %s, name: %s, cats: %s\n",

row["id"], row["name"], row["cats"]

count += 1

end

sth.finish

puts "Number of rows returned: #{count}"

The preceding examples invoke execute to get a statement handle and then invoke finish when that handle is no longer needed. Another way to invoke execute is with a code block. In this case, it passes the statement handle to the block and invokes finish on that handle automatically. For example:

count = 0

dbh.execute("SELECT id, name, cats FROM profile") do |sth|

sth.fetch do |row|

printf "id: %s, name: %s, cats: %s\n", row[0], row[1], row[2]

count += 1

end

puts "Number of rows returned: #{count}"

Ruby DBI has some higher-level database handle methods for executing statements to produce result sets:

§ select_oneexecutes a query and returns the first row as an array (or nil if the result is empty):

row = dbh.select_one("SELECT id, name, cats FROM profile WHERE id = 3")

§ select_allexecutes a query and returns an array of DBI::Row objects, one per row of the result set. The array is empty if the result is empty:

rows = dbh.select_all( "SELECT id, name, cats FROM profile")

The select_all method is useful when you need to process a result set more than once because Ruby DBI provides no way to “rewind” a result set. By fetching the entire result set as an array of row objects, you can iterate through its elements multiple times. If you need to run through the rows only once, you can apply an iterator directly to select_all:

dbh.select_all("SELECT id, name, cats FROM profile").each do |row|

printf "id: %s, name: %s, cats: %s\n",

row["id"], row["name"], row["cats"]

end

PHP

In PHP, PEAR DB doesn’t have separate methods for issuing statements that return result sets and those that do not. Instead, the query() method executes any kind of statement. query() takes a statement string argument and returns a result value that you can test with PEAR::isError()to determine whether the statement failed. If PEAR::isError() is true, it means that your statement was bad: it was syntactically invalid, you didn’t have permission to access a table named in the statement, or some other problem prevented the statement from executing.

If the statement executed properly, what you do at that point depends on the type of statement. For statements such as INSERT or UPDATE that don’t return rows, the statement has completed. If you want, you can call the connection object’s affectedRows() method to find out how many rows were changed:

$result =& $conn->query ("UPDATE profile SET cats = cats+1

WHERE name = 'Fred'");

if (PEAR::isError ($result))

die ("Oops, the statement failed");

printf ("Number of rows updated: %d\n", $conn->affectedRows ());

For statements such as SELECT that return a result set, the query() method returns a result set object. Generally, you use this object to call a row-fetching method in a loop, and then call its free() method to release the result set. Here’s an example that shows how to issue a SELECTstatement and use the result set object to fetch the rows:

$result =& $conn->query ("SELECT id, name, cats FROM profile");

if (PEAR::isError ($result))

die ("Oops, the statement failed");

while ($row =& $result->fetchRow (DB_FETCHMODE_ORDERED))

print ("id: $row[0], name: $row[1], cats: $row[2]\n");

printf ("Number of rows returned: %d\n", $result->numRows ());

$result->free ();

The example demonstrates several methods that result set objects have:

§ To obtain the rows in the result set, execute a loop in which you invoke the fetchRow() method.

§ To obtain a count of the number of rows in the result set, invoke numRows().

§ When there are no more rows, invoke the free() method.

The fetchRow() method returns the next row of the result set or NULL when there are no more rows. fetchRow() takes an argument that indicates what type of value it should return. As shown in the preceding example, with an argument of DB_FETCHMODE_ORDERED, fetchRow()returns an array with elements that correspond to the columns selected by the statement and that are accessed using numeric subscripts. The size of the array indicates the number of columns in the result set.

With an argument of DB_FETCHMODE_ASSOC, fetchRow() returns an associative array containing values that are accessed by column name:

$result =& $conn->query ("SELECT id, name, cats FROM profile");

if (PEAR::isError ($result))

die ("Oops, the statement failed");

while ($row =& $result->fetchRow (DB_FETCHMODE_ASSOC))

{

printf ("id: %s, name: %s, cats: %s\n",

$row["id"], $row["name"], $row["cats"]);

}

printf ("Number of rows returned: %d\n", $result->numRows ());

$result->free ();

With an argument of DB_FETCHMODE_OBJECT, fetchRow() returns an object having members that are accessed using the column names:

$result =& $conn->query ("SELECT id, name, cats FROM profile");

if (PEAR::isError ($result))

die ("Oops, the statement failed");

while ($row =& $result->fetchRow (DB_FETCHMODE_OBJECT))

print ("id: $row->id, name: $row->name, cats: $row->cats\n");

printf ("Number of rows returned: %d\n", $result->numRows ());

$result->free ();

If you invoke fetchRow() without an argument, it uses the default fetch mode. Normally, this is DB_FETCHMODE_ORDERED unless you have changed it by calling the setFetchMode() connection object method. For example, to use DB_FETCHMODE_ASSOC as the default fetch mode for a connection, do this:

$conn->setFetchMode (DB_FETCHMODE_ASSOC);

Python

The Python DB-API interface does not have distinct calls for SQL statements that return a result set and those that do not. To process a statement in Python, use your database connection object to get a cursor object. Then use the cursor’s execute() method to send the statement to the server. If the statement fails with an error, execute() raises an exception. Otherwise, if there is no result set, the statement is completed, and you can use the cursor’s rowcount attribute to determine how many rows were changed:

cursor = conn.cursor ()

cursor.execute ("UPDATE profile SET cats = cats+1 WHERE name = 'Fred'")

print "Number of rows updated: %d" % cursor.rowcount

Note that rowcount is an attribute, not a method. Refer to it as rowcount, not rowcount(), or an exception will be raised.

NOTE

The Python DB-API specification indicates that database connections should begin with auto-commit mode disabled, so MySQLdb disables auto-commit when it connects to the MySQL server. One implication is that if you use transactional tables, modifications to them will be rolled back when you close the connection unless you commit the changes first. Changes to nontransactional tables such as MyISAM tables are committed automatically, so this issue does not arise. For more information on auto-commit mode, see Chapter 15 (Using Transactions in Python Programs in particular).

If the statement does return a result set, fetch its rows, and then close the set. DB-API provides a couple of methods for retrieving rows. fetchone() returns the next row as a sequence (or None when there are no more rows):

cursor = conn.cursor ()

cursor.execute ("SELECT id, name, cats FROM profile")

while 1:

row = cursor.fetchone ()

if row == None:

break

print "id: %s, name: %s, cats: %s" % (row[0], row[1], row[2])

print "Number of rows returned: %d" % cursor.rowcount

cursor.close ()

As you can see from the preceding example, the rowcount attribute is useful for SELECT statements, too; it indicates the number of rows in the result set.

len(row) tells you the number of columns in the result set.

Another row-fetching method, fetchall(), returns the entire result set as a sequence of row sequences. You can iterate through the sequence to access the rows:

cursor = conn.cursor ()

cursor.execute ("SELECT id, name, cats FROM profile")

rows = cursor.fetchall ()

for row in rows:

print "id: %s, name: %s, cats: %s" % (row[0], row[1], row[2])

print "Number of rows returned: %d" % cursor.rowcount

cursor.close ()

DB-API doesn’t provide any way to rewind a result set, so fetchall() can be convenient when you need to iterate through the rows of the result set more than once or access individual values directly. For example, if rows holds the result set, you can access the value of the third column in the second row as rows[1][2] (indexes begin at 0, not 1).

To access row values by column name, specify the DictCursor cursor type when you create the cursor object. This causes rows to be returned as Python dictionary objects with named elements:

cursor = conn.cursor (MySQLdb.cursors.DictCursor)

cursor.execute ("SELECT id, name, cats FROM profile")

for row in cursor.fetchall ():

print "id: %s, name: %s, cats: %s" % (row["id"], row["name"], row["cats"])

print "Number of rows returned: %d" % cursor.rowcount

cursor.close ()

The preceding example also demonstrates how to use fetch_all() directly as an iterator.

Java

The JDBC interface provides specific object types for the various phases of SQL statement processing. Statements are issued in JDBC by using Java objects of one type. The results, if there are any, are returned as objects of another type.

To issue a statement, the first step is to get a Statement object by calling the createStatement() method of your Connection object:

Statement s = conn.createStatement ();

Now use the Statement object to send the statement to the server. JDBC provides several methods for doing this. Choose the one that’s appropriate for the type of statement you want to issue: executeUpdate() for statements that don’t return a result set, executeQuery() for statements that do, and execute() when you don’t know. Each method raises an exception if the statement fails with an error.

The executeUpdate() method sends a statement that generates no result set to the server and returns a count indicating the number of rows that were affected. When you’re done with the statement object, close it. The following example illustrates this sequence of events:

Statement s = conn.createStatement ();

int count = s.executeUpdate (

"UPDATE profile SET cats = cats+1 WHERE name = 'Fred'");

s.close (); // close statement

System.out.println ("Number of rows updated: " + count);

For statements that return a result set, use executeQuery(). Then get a result set object, and use it to retrieve the row values. When you’re done, close the result set and statement objects:

Statement s = conn.createStatement ();

s.executeQuery ("SELECT id, name, cats FROM profile");

ResultSet rs = s.getResultSet ();

int count = 0;

while (rs.next ()) // loop through rows of result set

{

int id = rs.getInt (1); // extract columns 1, 2, and 3

String name = rs.getString (2);

int cats = rs.getInt (3);

System.out.println ("id: " + id

+ ", name: " + name

+ ", cats: " + cats);

++count;

}

rs.close (); // close result set

s.close (); // close statement

System.out.println ("Number of rows returned: " + count);

The ResultSet object returned by the getResultSet() method of your Statement object has a number of methods of its own, such as next() to fetch rows and various get XXX () methods that access columns of the current row. Initially, the result set is positioned just before the first row of the set. Call next() to fetch each row in succession until it returns false, which means that there are no more rows. To determine the number of rows in a result set, count them yourself, as shown in the preceding example.

To access column values, use methods such as getInt(), getString(), getFloat(), and getDate(). To obtain the column value as a generic object, use getObject(). The get XXX () calls can be invoked with an argument that indicates either column position (beginning at 1, not 0) or column name. The previous example shows how to retrieve the id, name, and cats columns by position. To access columns by name instead, the row-fetching loop of that example can be rewritten as follows:

while (rs.next ()) // loop through rows of result set

{

int id = rs.getInt ("id");

String name = rs.getString ("name");

int cats = rs.getInt ("cats");

System.out.println ("id: " + id

+ ", name: " + name

+ ", cats: " + cats);

++count;

}

You can retrieve a given column value using any get XXX () call that makes sense for the data type. For example, you can use getString() to retrieve any column value as a string:

String id = rs.getString ("id");

String name = rs.getString ("name");

String cats = rs.getString ("cats");

System.out.println ("id: " + id

+ ", name: " + name

+ ", cats: " + cats);

Or you can use getObject() to retrieve values as generic objects and convert the values as necessary. The following example uses toString() to convert object values to printable form:

Object id = rs.getObject ("id");

Object name = rs.getObject ("name");

Object cats = rs.getObject ("cats");

System.out.println ("id: " + id.toString ()

+ ", name: " + name.toString ()

+ ", cats: " + cats.toString ());

To find out how many columns are in the result set, access the result set’s metadata:

ResultSet rs = s.getResultSet ();

ResultSetMetaData md = rs.getMetaData (); // get result set metadata

int ncols = md.getColumnCount (); // get column count from metadata

The third JDBC statement-execution method, execute(), works for either type of statement. It’s particularly useful when you receive a statement string from an external source and don’t know whether it generates a result set. The return value from execute() indicates the statement type so that you can process it appropriately: if execute() returns true, there is a result set, otherwise not. Typically, you’d use it something like this, where stmtStr represents an arbitrary SQL statement:

Statement s = conn.createStatement ();

if (s.execute (stmtStr))

{

// there is a result set

ResultSet rs = s.getResultSet ();

// ... process result set here ...

rs.close (); // close result set

}

else

{

// there is no result set, just print the row count

System.out.println ("Number of rows affected: " + s.getUpdateCount ());

}

s.close (); // close statement

^[8^]Actually, it’s not that goofy. The table uses several different data types for its columns, and these come in handy later for illustrating how to solve problems that pertain to specific data types.

Handling Special Characters and NULL Values in Statements

Problem

You need to construct SQL statements that refer to data values containing special characters such as quotes or backslashes, or special values such as NULL. Or you are constructing statements using data obtained from external sources and want to avoid being subject to SQL injection attacks.

Solution

Use your API’s placeholder mechanism or quoting function to make data safe for insertion.

Discussion

Up to this point in the chapter, our statements have used “safe” data values that require no special treatment. For example, we can easily construct the following SQL statements from within a program by putting the data values literally into the statement strings:

SELECT * FROM profile WHERE age > 40 AND color = 'green'

INSERT INTO profile (name,color) VALUES('Gary','blue')

However, some data values are not so easily handled and can cause problems if you are not careful. Statements might use values that contain special characters such as quotes, backslashes , binary data, or values that are NULL. The following discussion describes the difficulties caused by these types of values and the proper methods for handling them.

Suppose that you want to execute this INSERT statement:

INSERT INTO profile (name,birth,color,foods,cats)

VALUES('Alison','1973-01-12','blue','eggroll',4);

There’s nothing unusual about that. But if you change the name column value to something like De'Mont that contains a single quote, the statement becomes syntactically invalid:

INSERT INTO profile (name,birth,color,foods,cats)

VALUES('De'Mont','1973-01-12','blue','eggroll',4);

The problem is that there is a single quote inside a single-quoted string. To make the statement legal, the quote could be escaped by preceding it either with a single quote or with a backslash:

INSERT INTO profile (name,birth,color,foods,cats)

VALUES('De''Mont','1973-01-12','blue','eggroll',4);

INSERT INTO profile (name,birth,color,foods,cats)

VALUES('De\'Mont','1973-01-12','blue','eggroll',4);

Alternatively, you could quote the name value itself within double quotes rather than within single quotes (assuming that the ANSI_QUOTES SQL mode is not enabled):

INSERT INTO profile (name,birth,color,foods,cats)

VALUES("De'Mont",'1973-01-12','blue','eggroll',4);

If you are writing a statement literally in your program, you can escape or quote the name value by hand because you know what the value is. But if a variable holds the name value, you don’t necessarily know what the variable’s value is. Worse yet, single quote isn’t the only character you must be prepared to deal with; double quotes and backslashes cause problems, too. And if you want to store binary data such as images or sound clips in your database, a value might contain anything—not just quotes or backslashes, but other characters such as nulls (zero-valued bytes). The need to handle special characters properly is particularly acute in a web environment where statements are constructed using form input (for example, if you’re searching for rows that match search terms entered by the remote user). You must be able to handle any kind of input in a general way, because you can’t predict in advance what kind of information a user will supply. In fact, it is not uncommon for malicious users to enter garbage values containing problematic characters in a deliberate attempt to compromise the security of your server. That is a standard technique for finding insecure scripts that can be exploited.

The SQL NULL value is not a special character, but it too requires special treatment. In SQL, NULL indicates “no value.” This can have several meanings depending on context, such as “unknown,” “missing,” “out of range,” and so forth. Our statements thus far have not used NULL values, to avoid dealing with the complications that they introduce, but now it’s time to address these issues. For example, if you don’t know De’Mont’s favorite color, you can set the color column to NULL—but not by writing the statement like this:

INSERT INTO profile (name,birth,color,foods,cats)

VALUES('De''Mont','1973-01-12','NULL','eggroll',4);

Instead, the NULL value should have no enclosing quotes:

INSERT INTO profile (name,birth,color,foods,cats)

VALUES('De''Mont','1973-01-12',NULL,'eggroll',4);

If you were writing the statement literally in your program, you’d simply write the word “NULL” without enclosing quotes. But if the color value comes from a variable, the proper action is not so obvious. You must know something about the variable’s value to be able to determine whether to enclose it within quotes when you construct the statement.

There are two general means at your disposal for dealing with special characters such as quotes and backslashes, and with special values such as NULL:

§ Use placeholders in the statement string to refer to data values symbolically, and then bind the data values to the placeholders when you execute the statement. Generally, this is the preferred method because the API itself will do all or most of the work for you of providing quotes around values as necessary, quoting or escaping special characters within the data value, and possibly interpreting a special value to map onto NULL without enclosing quotes.

§ Use a quoting function (if your API provides one) for converting data values to a safe form that is suitable for use in statement strings.

This section shows how to use these techniques to handle special characters and NULL values for each API. One of the examples demonstrated here shows how to insert a profile table row that contains De'Mont for the name value and NULL for the color value. However, the principles shown here have general utility and handle any special characters, including those found in binary data. (See Chapter 18 for examples showing how to work with images, which are one kind of binary data.) Also, the principles are not limited to INSERT statements. They work for other kinds of statements as well, such as SELECT. One of the other examples shown here demonstrates how to execute a SELECT statement using placeholders.

Special characters come up in other contexts that are not covered here:

§ The placeholder and quoting techniques described here are only for data values and not for identifiers such as database or table names. For discussion of the problem of quoting identifiers, refer to Handling Special Characters in Identifiers.

§ This section covers the issue of getting special characters into your database. A related issue not covered here is the inverse operation of transforming special characters in values returned from your database for display in various contexts. For example, if you’re generating HTML pages that include values taken from your database, you have to convert < and > characters in those values to the HTML entities < and > to make sure they display properly. Encoding Special Characters in Web Output discusses that topic.

Using placeholders

Placeholders enable you to avoid writing data values literally into SQL statements. Using this approach, you write the statement using placeholders—special characters that indicate where the values go. One common placeholder character is ?. For APIs that use that character, the INSERTstatement would be rewritten to use placeholders like this:

INSERT INTO profile (name,birth,color,foods,cats)

VALUES(?,?,?,?,?)

You then pass the statement string to the database and supply the data values separately. The values are bound to the placeholders to replace them, resulting in a statement that contains the data values.

One of the benefits of using placeholders is that parameter binding operations automatically handle escaping of characters such as quotes and backslashes. This can be especially useful if you’re inserting binary data such as images into your database or using data values with unknown content such as input submitted by a remote user through a form on a web page. Also, there is usually some special value that you can bind to a placeholder to indicate that you want an SQL NULL value in the resulting statement.

A second benefit of placeholders is that you can “prepare” a statement in advance and then reuse it by binding different values to it each time it’s executed. Some database interfaces have this capability, which allows some preparsing or even execution planning to be done prior to executing the statement. For a statement that is executed multiple times later, this reduces overhead because anything that can be done prior to execution need be done only once, not once per execution. Prepared statements thus encourage statement reuse. Statements become more generic because they contain placeholders rather than specific data values. If you’re executing an operation over and over, you may be able to reuse a prepared statement and simply bind different data values to it each time you execute it. If so, you gain a performance benefit, at least for database systems that support query planning. For example, if a program issues a particular type of SELECT statement several times while it runs, such a database system can construct a plan for the statement and then reuse it each time, rather than rebuild the plan over and over. MySQL doesn’t build query plans in advance, so you don’t get any performance boost from using prepared statements. However, if you port a program to a database that does not use query plans and you’ve written your program to use prepared statements, you can get this advantage of prepared statements automatically. You don’t have to convert from nonprepared statements to enjoy that benefit.

A third benefit is that code that uses placeholder-based statements can be easier to read, although that’s somewhat subjective. As you read through this section, compare the statements used here with those from Issuing Statements and Retrieving Results that did not use placeholders to see which you prefer.

GENERATING A LIST OF PLACEHOLDERS

You cannot bind an array of data values to a single placeholder. Each value must be bound to a separate placeholder. If you want to use placeholders for a list of data values that may vary in number, you must construct a list of placeholder characters. For example, in Perl, the following statement creates a string consisting of n placeholder characters separated by commas:

$str = join (",", ("?") xn);

The x repetition operator, when applied to a list, produces n copies of the list, so the join() call joins these lists to produce a single string containing n comma-separated instances of the ? character. This is handy when you want to bind an array of data values to a list of placeholders in a statement string because the size of the array indicates how many placeholder characters are needed:

$str = join (",", ("?") x @values);

In Ruby, the * operator can be used to similar effect:

str = (["?"] * values.size).join(",")

Another Perl method of generating a list of placeholders that is perhaps less cryptic looks like this:

$str = "?" if @values;

for (my $i = 1; $i < @values; $i++)

{

$str .= ",?";

}

This method’s syntax is less Perl-specific and therefore easier to translate into other languages. For example, the equivalent method in Python looks like this:

str = ""

if len (values) > 0:

str = "?"

for i in range (1, len (values)):

str = str + ",?"

Using a quoting function

Some APIs provide a quoting function that takes a data value as its argument and returns a properly quoted and escaped value suitable for safe insertion into an SQL statement. This is less common than using placeholders, but it can be useful for constructing statements that you do not intend to execute immediately. However, you do need to have a connection open to the database server while you use such a quoting function because the proper quoting rules cannot be selected until the database driver is known. (Some database systems have different quoting rules from others.)

Perl

To use placeholders in Perl DBI scripts, put a ? in your SQL statement string at each location where you want to insert a data value, and then bind the values to the statement. You can bind values by passing them to do() or execute(), or by calling a DBI method specifically intended for placeholder substitution.

With do(), you can add the profile row for De’Mont by passing the statement string and the data values in the same call:

my $count = $dbh->do ("INSERT INTO profile (name,birth,color,foods,cats)

VALUES(?,?,?,?,?)",

undef,

"De'Mont", "1973-01-12", undef, "eggroll", 4);

The arguments after the statement string should be undef followed by the data values, one value for each placeholder. (The undef argument that follows the statement string is a historical artifact, but it must be present.)

Alternatively, use prepare() plus execute(). Pass the statement string to prepare() to get a statement handle, and then use that handle to pass the data values via execute():

my $sth = $dbh->prepare ("INSERT INTO profile (name,birth,color,foods,cats)

VALUES(?,?,?,?,?)");

my $count = $sth->execute ("De'Mont", "1973-01-12", undef, "eggroll", 4);

If you need to issue the same statement over and over again, you can use prepare() once and call execute() each time you need to run the statement.

In either case, the resulting statement generated by DBI is as follows:

INSERT INTO profile (name,birth,color,foods,cats)

VALUES('De\'Mont','1973-01-12',NULL,'eggroll','4')

Note how DBI adds quotes around data values, even though there were none around the ? placeholder characters in the original statement string. (The placeholder mechanism adds quotes around numeric values, too, but that’s okay because the MySQL server performs type conversion as necessary to convert strings to numbers.) Also note the DBI convention that when you bind undef to a placeholder, DBI puts a NULL into the statement and correctly refrains from adding enclosing quotes.

You can use these methods for other types of statements as well. For example, the following SELECT statement uses a placeholder to look for rows that have a cats value larger than 2:

my $sth = $dbh->prepare ("SELECT * FROM profile WHERE cats > ?");

$sth->execute (2);

while (my $ref = $sth->fetchrow_hashref ())

{

print "id: $ref->{id}, name: $ref->{name}, cats: $ref->{cats}\n";

}

Another way to bind values to placeholders is to use the bind_param() call. It takes two arguments: a placeholder position and a value to bind to the placeholder at that position. (Placeholder positions begin with 1, not 0.) The preceding INSERT and SELECT examples can be rewritten to use bind_param() as follows:

my $sth = $dbh->prepare ("INSERT INTO profile (name,birth,color,foods,cats)

VALUES(?,?,?,?,?)");

$sth->bind_param (1, "De'Mont");

$sth->bind_param (2, "1973-01-12");

$sth->bind_param (3, undef);

$sth->bind_param (4, "eggroll");

$sth->bind_param (5, 4);

my $count = $sth->execute ();

my $sth = $dbh->prepare ("SELECT * FROM profile WHERE cats > ?");

$sth->bind_param (1, 2);

$sth->execute ();

while (my $ref = $sth->fetchrow_hashref ())

{

print "id: $ref->{id}, name: $ref->{name}, cats: $ref->{cats}\n";

}

No matter which method you use for placeholders, don’t put any quotes around the ? characters, not even for placeholders that represent strings. DBI adds quotes as necessary on its own. In fact, if you do put quotes around the placeholder character, DBI interprets it as the literal string constant "?", not as a placeholder.

The high-level retrieval methods such as selectrow_array() and selectall_arrayref() can be used with placeholders, too. Like the do() method, the arguments are the statement string and undef, followed by the data values to bind to the placeholders that occur in the statement string. Here’s an example:

my $ref = $dbh->selectall_arrayref (

"SELECT name, birth, foods FROM profile

WHERE id > ? AND color = ?",

undef, 3, "green");

Perl DBI also provides a quote() database handle method as an alternative to using placeholders. Here’s how to use quote() to create a statement string that inserts a new row in the profile table:

my $stmt = sprintf ("INSERT INTO profile (name,birth,color,foods,cats)

VALUES(%s,%s,%s,%s,%s)",

$dbh->quote ("De'Mont"),

$dbh->quote ("1973-01-12"),

$dbh->quote (undef),

$dbh->quote ("eggroll"),

$dbh->quote (4));

my $count = $dbh->do ($stmt);

The statement string generated by this code is the same as when you use placeholders. The %s format specifiers are written without enclosing quotes because quote() provides them automatically as necessary: non-undef values are inserted with quotes, and undef values are inserted asNULL without quotes.

Ruby

Ruby DBI uses ? as the placeholder character in SQL statements and nil as the value to use for binding an SQL NULL value to a placeholder.

To use placeholders with do, pass the statement string followed by the data values to bind to the placeholders:

count = dbh.do("INSERT INTO profile (name,birth,color,foods,cats)

VALUES(?,?,?,?,?)",

"De'Mont", "1973-01-12", nil, "eggroll", 4)

Alternatively, pass the statement string to prepare to get a statement handle, and then use that handle to invoke execute with the data values:

sth = dbh.prepare("INSERT INTO profile (name,birth,color,foods,cats)

VALUES(?,?,?,?,?)")

count = sth.execute("De'Mont", "1973-01-12", nil, "eggroll", 4)

Regardless of how you construct the statement, DBI includes a properly escaped quote and a properly unquoted NULL value:

INSERT INTO profile (name,birth,color,foods,cats)

VALUES('De\'Mont','1973-01-12',NULL,'eggroll',4)

The approach that uses prepare plus execute is useful for a statement that is executed multiple times with different data values. For a statement to be executed just once, you can skip the prepare step. Pass the statement string to the database handle execute method, followed by the data values. The following example processes a SELECT statement this way:

sth = dbh.execute("SELECT * FROM profile WHERE cats > ?", 2)

sth.fetch do |row|

printf "id: %s, name: %s, cats: %s\n", row["id"], row["name"], row["cats"]

end

sth.finish

The Ruby DBI placeholder mechanism provides quotes around data values as necessary when they are bound to the statement string, so do not put quotes around the ? characters in the string.

The Ruby DBI quote() database handle method is an alternative to placeholders. The following example uses quote() to produce the INSERT statement for De’Mont:

stmt = sprintf "INSERT INTO profile (name,birth,color,foods,cats)

VALUES(%s,%s,%s,%s,%s)",

dbh.quote("De'Mont"),

dbh.quote("1973-01-12"),

dbh.quote(nil),

dbh.quote("eggroll"),

dbh.quote(4)

count = dbh.do(stmt)

The statement string generated by this code is the same as when you use placeholders. The %s format specifiers are written without enclosing quotes because quote() provides them automatically as necessary: non-nil values are inserted with quotes, and nil values are inserted as NULLwithout quotes.

PHP

The PEAR DB module allows placeholders to be used with the query() method that executes SQL statements, or you can use prepare() to prepare a statement, and execute() to supply the data values and execute the prepared statement. PEAR DB uses ? as the placeholder marker in SQL statements and the PHP NULL as the value to use when binding an SQL NULL value to a placeholder.

With query(), pass the statement string followed by an array that contains the data values to bind to the placeholders:

$result =& $conn->query ("INSERT INTO profile (name,birth,color,foods,cats)

VALUES(?,?,?,?,?)",

array ("De'Mont","1973-01-12",NULL,"eggroll",4));

if (PEAR::isError ($result))

die ("Oops, the statement failed");

Alternatively, pass the statement string to prepare() to get a statement object. Pass this object and the array of data values to execute():

$stmt =& $conn->prepare ("INSERT INTO profile (name,birth,color,foods,cats)

VALUES(?,?,?,?,?)");

if (PEAR::isError ($stmt))

die ("Oops, statement preparation failed");

$result =& $conn->execute ($stmt,

array ("De'Mont","1973-01-12",NULL,"eggroll",4));

if (PEAR::isError ($result))

die ("Oops, statement execution failed");

The statement constructed either way includes a properly escaped quote and a properly unquoted NULL value:

INSERT INTO profile (name,birth,color,foods,cats)

VALUES('De\'Mont','1973-01-12',NULL,'eggroll',4)

If there is only one placeholder, the array of data values has only a single member:

$result =& $conn->query ("SELECT * FROM profile WHERE cats > ?", array (2));

if (PEAR::isError ($result))

die ("Oops, the statement failed");

while ($row =& $result->fetchRow (DB_FETCHMODE_ASSOC))

{

printf ("id: %s, name: %s, cats: %s\n",

$row["id"], $row["name"], $row["cats"]);

}

$result->free ();

In that case, you can specify the data value without using an array. The following two query() invocations are equivalent:

$result =& $conn->query ("SELECT * FROM profile WHERE cats > ?", array (2));

$result =& $conn->query ("SELECT * FROM profile WHERE cats > ?", 2)

The PEAR DB placeholder mechanism provides quotes around data values as necessary when they are bound to the statement string, so do not put quotes around the ? characters in the string.

The PEAR DB quoteSmart() method can be used instead of placeholders for quoting data values. The following example inserts the row for De’Mont after using quoteSmart() to construct the statement string:

$stmt = sprintf ("INSERT INTO profile (name,birth,color,foods,cats)

VALUES(%s,%s,%s,%s,%s)",

$conn->quoteSmart ("De'Mont"),

$conn->quoteSmart ("1973-01-12"),

$conn->quoteSmart (NULL),

$conn->quoteSmart ("eggroll"),

$conn->quoteSmart (4));

$result =& $conn->query ($stmt);

if (PEAR::isError ($result))

die ("Oops, the statement failed");

PEAR DB also has an escapeSimple() quoting method, but it’s inferior to quoteSmart(). For example, it doesn’t handle NULL values properly.

Python

Python’s MySQLdb module implements placeholders using format specifiers in the SQL statement string. To use placeholders, invoke the execute() method with two arguments: a statement string containing format specifiers and a sequence containing the values to bind to the statement string. To add the profile table row for De’Mont, the code looks like this:

cursor = conn.cursor ()

cursor.execute ("""

INSERT INTO profile (name,birth,color,foods,cats)

VALUES(%s,%s,%s,%s,%s)

""", ("De'Mont", "1973-01-12", None, "eggroll", 4))

Some of the Python DB-API driver modules support several format specifiers (such as %d for integers and %f for floating-point numbers). With MySQLdb, you should use a placeholder of %s to format all data values as strings. MySQL will perform type conversion as necessary. To place a literal % character into the statement, use %% in the statement string.

The parameter binding mechanism adds quotes around data values where necessary. DB-API treats None as logically equivalent to the SQL NULL value, so you can bind None to a placeholder to produce a NULL in the statement string. The statement that is sent to the server by the precedingexecute() call looks like this:

INSERT INTO profile (name,birth,color,foods,cats)

VALUES('De\'Mont','1973-01-12',NULL,'eggroll',4)

If you have only a single value val to bind to a placeholder, you can write it as a sequence using the syntax ( val ,). The following SELECT statement demonstrates this:

cursor = conn.cursor ()

cursor.execute ("SELECT * FROM profile WHERE cats = %s", (2,))

for row in cursor.fetchall ():

print row

cursor.close ()

Python’s placeholder mechanism provides quotes around data values as necessary when they are bound to the statement string, so do not put quotes around the %s format specifiers in the string.

With MySQLdb, an alternative method of quoting data values is to use the literal() method. To produce the INSERT statement for De’Mont by using literal(), do this:

cursor = conn.cursor ()

stmt = """

INSERT INTO profile (name,birth,color,foods,cats)

VALUES(%s,%s,%s,%s,%s)

""" % \

(conn.literal ("De'Mont"), \

conn.literal ("1973-01-12"), \

conn.literal (None), \

conn.literal ("eggroll"), \

conn.literal (4))

cursor.execute (stmt)

Java

JDBC provides support for placeholders if you use prepared statements. Recall that the process for issuing nonprepared statements in JDBC is to create a Statement object and then pass the statement string to one of the statement-issuing functions executeUpdate() , executeQuery(),or execute(). To use a prepared statement instead, create a PreparedStatement object by passing a statement string containing ? placeholder characters to your connection object’s prepareStatement() method. Then bind your data values to the statement using set XXX () methods. Finally, execute the statement by calling executeUpdate(), executeQuery(), or execute() with an empty argument list.

Here is an example that uses executeUpdate() to issue an INSERT statement that adds the profile table row for De’Mont:

PreparedStatement s;

int count;

s = conn.prepareStatement (

"INSERT INTO profile (name,birth,color,foods,cats)"

+ " VALUES(?,?,?,?,?)");

s.setString (1, "De'Mont"); // bind values to placeholders

s.setString (2, "1973-01-12");

s.setNull (3, java.sql.Types.CHAR);

s.setString (4, "eggroll");

s.setInt (5, 4);

count = s.executeUpdate ();

s.close (); // close statement

The set XXX () methods that bind data values to statements take two arguments: a placeholder position (beginning with 1, not 0) and the value to bind to the placeholder. Choose each value-binding call to match the data type of the column to which the value is bound: setString() to bind a string to the name column, setInt() to bind an integer to the cats column, and so forth. (Actually, I cheated a bit by using setString() to treat the date value for birth as a string.)

One difference between JDBC and the other APIs is that you don’t specify a special value to bind a NULL to a placeholder by specifying some special value (such as undef in Perl or nil in Ruby). Instead, you invoke a special method setNull(), in which the second argument indicates the type of the column (java.sql.Types.CHAR for a string, java.sql.Types.INTEGER for an integer, and so forth).

The set XXX () calls add quotes around data values if necessary, so do not put quotes around the ? placeholder characters in the statement string.

For a statement that returns a result set, the preparation process is similar, but you execute the prepared statement with executeQuery() instead:

PreparedStatement s;

s = conn.prepareStatement ("SELECT * FROM profile WHERE cats > ?");

s.setInt (1, 2); // bind 2 to first placeholder

s.executeQuery ();

// ... process result set here ...

s.close (); // close statement

Handling Special Characters in Identifiers

Problem

You need to construct SQL statements that refer to identifiers containing special characters.

Solution

Quote the identifiers so that they can be inserted safely into statement strings.

Discussion

Handling Special Characters and NULL Values in Statements discusses how to handle special characters in data values by using placeholders or quoting methods. Special characters also can be present in identifiers such as database, table, and column names. For example, the table namesome table contains a space, which is not allowed by default:

mysql>CREATE TABLE some table (i INT);

ERROR 1064 (42000): You have an error in your SQL syntax near 'table (i INT)'

Special characters are handled differently in identifiers than in data values. To make an identifier safe for insertion into an SQL statement, quote it by enclosing it within backticks:

mysql>CREATE TABLE `some table` (i INT);

Query OK, 0 rows affected (0.04 sec)

If a quoting character appears within the identifier itself, double it when quoting the identifier. For example, quote abc`def as `abc``def`.

In MySQL, backticks are always allowed for identifier quoting. If the ANSI_QUOTES SQL mode is enabled, the double-quote character also is legal for quoting identifiers. Thus, both of the following statements are equivalent with the ANSI_QUOTES SQL mode enabled:

CREATE TABLE `some table` (i INT);

CREATE TABLE "some table" (i INT);

If it’s necessary to know which identifier quoting characters are allowable, issue a SELECT @@sql_mode statement to retrieve the SQL mode and check whether its value includes ANSI_QUOTES.

Be aware that although strings in MySQL normally can be quoted using either single-quote or double-quote characters ('abc', "abc"), that is not true when ANSI_QUOTES is enabled. In that case, MySQL interprets 'abc' as a string and "abc" as an identifier, so you must use only single quotes for strings.

Within a program, you can use an identifier-quoting routine if your API provides one, or write one yourself if it does not. Perl DBI has a quote_identifier() method that returns a properly quoted identifier. For an API that has no such method, you can quote an identifier by enclosing it within backticks and doubling any backticks that occur within the identifier. Here’s a Ruby routine that does so:

def quote_identifier(ident)

return "`" + ident.gsub(/`/, "``") + "`"

end

If you’re willing to assume that an identifier has no internal backticks, you can simply enclose it within backticks.

Portability note: If you write your own identifier-quoting routines, remember that other DBMSs may require different quoting conventions.

In some contexts, identifiers might be used as data values, and should be handled as such. If you select information from INFORMATION_SCHEMA, the metadata database, it’s common to indicate which rows to return by specifying database object names in the WHERE clause. For example, this statement retrieves the column names for the profile table in the cookbook database:

SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS

WHERE TABLE_SCHEMA = 'cookbook' AND TABLE_NAME = 'item';

The database and table names are used here as data values, not as identifiers. Were you to construct this statement within a program, you would parameterize them using placeholders, not identifier quoting. For example, you might do this in Ruby:

names = dbh.select_all(

"SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS

WHERE TABLE_SCHEMA = ? AND TABLE_NAME = ?",

db_name, tbl_name)

Identifying NULL Values in Result Sets

Problem

A query result includes NULL values, but you’re not sure how to tell where they are.

Solution

Your API probably has some special value that represents NULL by convention. You just have to know what it is and how to test for it.

Discussion

Handling Special Characters and NULL Values in Statements described how to refer to NULL values when you send statements to the database server. In this section, we’ll deal instead with the question of how to recognize and process NULL values that are returned from the database server. In general, this is a matter of knowing what special value the API maps NULL values onto, or what method to call. These values are shown in the following table:

Language	NULL-detection value or method
Perl DBI	undef value
Ruby DBI	nil value
PHP PEAR DB	A NULL or unset value
Python DB-API	None value
Java JDBC	wasNull() method

The following sections show a very simple application of NULL value detection. The examples retrieve a result set and print all values in it, mapping NULL values onto the printable string “NULL”.

To make sure the profile table has a row that contains some NULL values, use mysql to issue the following INSERT statement, and then issue the SELECT statement to verify that the resulting row has the expected values:

mysql>INSERT INTO profile (name) VALUES('Juan');

mysql> SELECT * FROM profile WHERE name = 'Juan';

+----+------+-------+-------+-------+------+

+----+------+-------+-------+-------+------+

+----+------+-------+-------+-------+------+

The id column might contain a different number, but the other columns should appear as shown, with values of NULL.

Perl

Perl DBI represents NULL values using undef. It’s easy to detect such values using the defined() function, and it’s particularly important to do so if you enable warnings with the Perl -w option or by including a use warnings line in your script. Otherwise, accessing undef values causes Perl to issue the following complaint:

Use of uninitialized value

To avoid this warning, test column values that might be undef with defined() before using them. The following code selects a few columns from the profile column and prints “NULL” for any undefined values in each row. This makes NULL values explicit in the output without activating any warning messages:

my $sth = $dbh->prepare ("SELECT name, birth, foods FROM profile");

$sth->execute ();

while (my $ref = $sth->fetchrow_hashref ())

{

printf "name: %s, birth: %s, foods: %s\n",

defined ($ref->{name}) ? $ref->{name} : "NULL",

defined ($ref->{birth}) ? $ref->{birth} : "NULL",

defined ($ref->{foods}) ? $ref->{foods} : "NULL";

}

Unfortunately, all that testing of column values is ponderous and becomes worse the more columns there are. To avoid this, you can test and set undefined values in a loop prior to printing them. Then the amount of code you have to write to perform the tests is constant, not proportional to the number of columns to be tested. The loop also makes no reference to specific column names, so it can more easily be copied and pasted to other programs or used as the basis for a utility routine:

my $sth = $dbh->prepare ("SELECT name, birth, foods FROM profile");

$sth->execute ();

while (my $ref = $sth->fetchrow_hashref ())

{

foreach my $key (keys (%{$ref}))

{

$ref->{$key} = "NULL" unless defined ($ref->{$key});

}

printf "name: %s, birth: %s, foods: %s\n",

$ref->{name}, $ref->{birth}, $ref->{foods};

}

If you fetch rows into an array rather than into a hash, you can use map to convert any undef values:

my $sth = $dbh->prepare ("SELECT name, birth, foods FROM profile");

$sth->execute ();

while (my @val = $sth->fetchrow_array ())

{

@val = map { defined ($_) ? $_ : "NULL" } @val;

printf "name: %s, birth: %s, foods: %s\n",

$val[0], $val[1], $val[2];

}

Ruby

Ruby DBI represents NULL values using nil, which can be identified by applying the nil? method to a value. The following example uses nil? to determine whether to print result set values as is or as the string "NULL" for NULL values:

dbh.execute("SELECT name, birth, foods FROM profile") do |sth|

sth.fetch do |row|

for i in 0...row.length

row[i] = "NULL" if row[i].nil? # is the column value NULL?

end

printf "id: %s, name: %s, cats: %s\n", row[0], row[1], row[2]

end

A shorter alternative to the for loop is the collect! method, which takes each array element in turn and replaces it with the value returned by the code block:

row.collect! { |val| val.nil? ? "NULL" : val }

PHP

PHP represents SQL NULL values in result sets as the PHP NULL value. To determine whether a value from a result set represents a NULL value, compare it to the PHP NULL value using the === triple equal operator:

if ($val === NULL)

{

# $val is a NULL value

}

In PHP, the triple equal operator means “exactly equal to.” The usual == equal to comparison operator is not suitable here. If you use ==, PHP considers the NULL value, the empty string, and 0 all equal to each other.

An alternative to using === to test for NULL values is to use isset():

if (!isset ($val))

{

# $val is a NULL value

}

The following code uses the === operator to identify NULL values in a result set and print them as the string "NULL":

$result =& $conn->query ("SELECT name, birth, foods FROM profile");

if (PEAR::isError ($result))

die ("Oops, the statement failed");

while ($row =& $result->fetchRow ())

{

foreach ($row as $key => $value)

{

if ($row[$key] === NULL)

$row[$key] = "NULL";

}

print ("name: $row[0], birth: $row[1], foods: $row[2]\n");

}

$result->free ();

Python

Python DB-API programs represent NULL in result sets using None. The following example shows how to detect NULL values:

cursor = conn.cursor ()

cursor.execute ("SELECT name, birth, foods FROM profile")

for row in cursor.fetchall ():

row = list (row) # convert nonmutable tuple to mutable list

for i in range (0, len (row)):

if row[i] == None: # is the column value NULL?

row[i] = "NULL"

print "name: %s, birth: %s, foods: %s" % (row[0], row[1], row[2])

cursor.close ()

The inner loop checks for NULL column values by looking for None and converts them to the string "NULL". Note how the example converts row to a mutable object prior to the loop; fetchall() returns rows as sequence values, which are nonmutable (read-only).

Java

For JDBC programs, if it’s possible for a column in a result set to contain NULL values, it’s best to check for them explicitly. The way to do this is to fetch the value and then invoke wasNull(), which returns true if the column is NULL and false otherwise. For example:

Object obj = rs.getObject (index);

if (rs.wasNull ())

{ /* the value's a NULL */ }

The preceding example uses getObject(), but the principle holds for other get XXX () calls as well.

Here’s an example that prints each row of a result set as a comma-separated list of values, with "NULL" printed for each NULL value:

Statement s = conn.createStatement ();

s.executeQuery ("SELECT name, birth, foods FROM profile");

ResultSet rs = s.getResultSet ();

ResultSetMetaData md = rs.getMetaData ();

int ncols = md.getColumnCount ();

while (rs.next ()) // loop through rows of result set

{

for (int i = 0; i < ncols; i++) // loop through columns

{

String val = rs.getString (i+1);

if (i > 0)

System.out.print (", ");

if (rs.wasNull ())

System.out.print ("NULL");

else

System.out.print (val);

}

System.out.println ();

}

rs.close (); // close result set

s.close (); // close statement

Techniques for Obtaining Connection Parameters

Problem

You need to obtain connection parameters for a script so that it can connect to a MySQL server.

Solution

There are lots of ways to do this. Take your pick from the alternatives in the following section.

Discussion

Any program that connects to MySQL needs to specify connection parameters such as the username, password, and hostname. The recipes shown so far have put connection parameters directly into the code that attempts to establish the connection, but that is not the only way for your programs to obtain the parameters. This section briefly surveys some of the techniques you can use and then shows how to implement two of them.

Hardwire the parameters into the program

The parameters can be given either in the main source file or in a library file that is used by the program. This technique is convenient because users need not enter the values themselves. The flip side is that it’s not very flexible. To change the parameters, you must modify your program.

Ask for the parameters interactively

In a command-line environment, you can ask the user a series of questions. In a web or GUI environment, this might be done by presenting a form or dialog. Either way, this gets to be tedious for people who use the application frequently, due to the need to enter the parameters each time.

Get the parameters from the command line

This method can be used either for commands that you run interactively or that are run from within a script. Like the method of obtaining parameters interactively, this requires you to supply parameters each time you use MySQL and can be similarly tiresome. (A factor that significantly mitigates this burden is that many shells enable you to easily recall commands from your history list for re-execution.)

Get the parameters from the execution environment

The most common way to use this method is to set the appropriate environment variables in one of your shell’s startup files (such as .profile for sh, bash, ksh; or .login for csh or tcsh). Programs that you run during your login session then can get parameter values by examining their environment.

Get the parameters from a separate file

With this method, you store information such as the username and password in a file that programs can read before connecting to the MySQL server. Reading parameters from a file that’s separate from your program gives you the benefit of not having to enter them each time you use the program, while allowing you to avoid hardwiring the values into the program itself. This technique is especially convenient for interactive programs, because then you need not enter parameters each time you run the program. Also, storing the values in a file enables you to centralize parameters for use by multiple programs, and you can use the file access mode for security purposes. For example, you can keep other users from reading the file by setting its mode to restrict access to yourself.

The MySQL client library itself supports an option file mechanism, although not all APIs provide access to it. For those that don’t, workarounds may exist. (As an example, Java supports the use of properties files and supplies utility routines for reading them.)

Use a combination of methods

It’s often useful to combine some of the preceding methods, to give users the flexibility of providing parameters different ways. For example, MySQL clients such as mysql and mysqladmin look for option files in several locations and read any that are present. They then check the command-line arguments for further parameters. This enables users to specify connection parameters in an option file or on the command line.

These methods of obtaining connection parameters do involve some security concerns. Here is a brief summary of the issues:

§ Any method that stores connection parameters in a file may compromise your system’s security unless the file is protected against access by unauthorized users. This is true whether parameters are stored in a source file, an option file, or a script that invokes a command and specifies the parameters on the command line. (Web scripts that can be read only by the web server don’t qualify as secure if other users have administrative access to the server.)

§ Parameters specified on the command line or in environment variables are not particularly secure. While a program is executing, its command-line arguments and environment may be visible to other users who run process status commands such as ps -e. In particular, storing the password in an environment variable perhaps is best limited to those situations in which you’re the only user on the machine or you trust all other users.

The rest of this section shows how to process command-line arguments to get connection parameters and how to read parameters from option files.

Getting parameters from the command line

The usual MySQL convention for command-line arguments (that is, the convention followed by standard clients such as mysql and mysqladmin) is to allow parameters to be specified using either a short option or a long option. For example, the username cbuser can be specified either as -ucbuser (or -ucbuser) or --user=cbuser. In addition, for either of the password options (-p or --password), the password value may be omitted after the option name to indicate that the program should prompt for the password interactively.

The next set of example programs shows how to process command arguments to obtain the hostname, username, and password. The standard flags for these are -h or --host, -u or --user, and -p or --password. You can write your own code to iterate through the argument list, but in general, it’s much easier to use existing option-processing modules written for that purpose.

To enable a script to use other options, such as --port or --socket, you can use the code shown but extend the option-specifier arrays to include additional options. You’ll also need to modify the connection-establishment code slightly to use the option values if they are given.

For those APIs shown here (Perl, Ruby, Python), the programs presented use a getopt()-style function. For Java, look under the api directory in the recipes distribution for sample code that is not shown here, as well as instructions for using it.

NOTE

Insofar as possible, the examples mimic the option-handling behavior of the standard MySQL clients. An exception is that option-processing libraries may not allow for making the password value optional, and they provide no way of prompting the user for a password interactively if a password option is specified without a password value. Consequently, the example scripts are written so that if you use -p or --password, you must provide the password value following the option.

Perl. Perl passes command-line arguments to scripts via the@ARGV array, which can be processed using the GetOptions() function of the Getopt::Long module. The following program shows how to parse the command arguments for connection parameters.

#!/usr/bin/perl

# cmdline.pl - demonstrate command-line option parsing in Perl

use strict;

use warnings;

use DBI;

use Getopt::Long;

$Getopt::Long::ignorecase = 0; # options are case sensitive

$Getopt::Long::bundling = 1; # allow short options to be bundled

# connection parameters - all missing (undef) by default

my $host_name;

my $password;

my $user_name;

GetOptions (

# =s means a string value is required after the option

"host|h=s" => \$host_name,

"password|p=s" => \$password,

"user|u=s" => \$user_name

) or exit (1); # no error message needed; GetOptions() prints its own

# any remaining nonoption arguments are left

# in @ARGV and can be processed here as necessary

# construct data source name

my $dsn = "DBI:mysql:database=cookbook";

$dsn .= ";host=$host_name" if defined ($host_name);

# connect to server

my %conn_attrs = (PrintError => 0, RaiseError => 1, AutoCommit => 1);

my $dbh = DBI->connect ($dsn, $user_name, $password, \%conn_attrs);

print "Connected\n";

$dbh->disconnect ();

print "Disconnected\n";

The arguments to GetOptions() are pairs of option specifiers and references to the script variables into which option values should be placed. An option specifier lists both the long and short forms of the option (without leading dashes), followed by =s if the option requires a following value. For example, "host|h=s" allows both --host and -h and indicates that a following string value is required. You need not pass the @ARGV array because GetOptions() uses it implicitly. When GetOptions() returns, @ARGV contains any remaining nonoption arguments.

Ruby. Ruby programs access command-line arguments via the ARGV array, which you can process with theGetoptLong.new() method. The following program uses this method to parse the command arguments for connection parameters:

#!/usr/bin/ruby -w

# cmdline.rb - demonstrate command-line option parsing in Ruby

require "getoptlong"

require "dbi"

# connection parameters - all missing (nil) by default

host_name = nil

password = nil

user_name = nil

opts = GetoptLong.new(

[ "--host", "-h", GetoptLong::REQUIRED_ARGUMENT ],

[ "--password", "-p", GetoptLong::REQUIRED_ARGUMENT ],

[ "--user", "-u", GetoptLong::REQUIRED_ARGUMENT ]

)

# iterate through options, extracting whatever values are present;

# opt will be the long-format option, arg is its value

opts.each do |opt, arg|

case opt

when "--host"

host_name = arg

when "--password"

password = arg

when "--user"

user_name = arg

end

# any remaining nonoption arguments are left

# in ARGV and can be processed here as necessary

# construct data source name

dsn = "DBI:Mysql:database=cookbook"

dsn << ";host=#{host_name}" unless host_name.nil?

# connect to server

begin

dbh = DBI.connect(dsn, user_name, password)

puts "Connected"

rescue DBI::DatabaseError => e

puts "Cannot connect to server"

puts "Error code: #{e.err}"

puts "Error message: #{e.errstr}"

exit(1)

end

dbh.disconnect()

puts "Disconnected"

To process the ARGV array, use the GetoptLong.new() method, and pass information to it that indicates which options to recognize. Each argument to this method is an array of three values:

§ The long option name.

§ The short option name.

§ A flag that indicates whether the option requires a value. The allowable flags are GetoptLong::NO_ARGUMENT (option takes no value), GetoptLong::REQUIRED_ARGUMENT (option requires a value), and GetoptLong::OPTIONAL_ARGUMENT (option value is optional). For the example program, all options require a value.

Python. Python passes command arguments to scripts as a list via the sys.argv variable. You can access this variable and process its contents by importing the sys and getopt modules. The following program illustrates how to get parameters from the command arguments and use them for establishing a connection to the server:

#!/usr/bin/python

# cmdline.py - demonstrate command-line option parsing in Python

import sys

import getopt

import MySQLdb

try:

opts, args = getopt.getopt (sys.argv[1:],

"h:p:u:",

[ "host=", "password=", "user=" ])

except getopt.error, e:

# for errors, print program name and text of error message

print "%s: %s" % (sys.argv[0], e)

sys.exit (1)

# default connection parameter values (all empty)

host_name = password = user_name = ""

# iterate through options, extracting whatever values are present

for opt, arg in opts:

if opt in ("-h", "--host"):

host_name = arg

elif opt in ("-p", "--password"):

password = arg

elif opt in ("-u", "--user"):

user_name = arg

# any remaining nonoption arguments are left in

# args and can be processed here as necessary

try:

conn = MySQLdb.connect (db = "cookbook",

host = host_name,

user = user_name,

passwd = password)

print "Connected"

except MySQLdb.Error, e:

print "Cannot connect to server"

print "Error:", e.args[1]

print "Code:", e.args[0]

sys.exit (1)

conn.close ()

print "Disconnected"

getopt() takes either two or three arguments:

§ A list of command arguments to be processed. This should not include the program name, sys.argv[0], so use sys.argv[1:] to refer to the list of arguments that follow the program name.

§ A string naming the short option letters. In cmdline.py, each of these is followed by a colon character (:) to indicate that the option requires a following value.

§ An optional list of long option names. In cmdline.py, each name is followed by = to indicate that the option requires a following value.

getopt() returns two values. The first is a list of option/value pairs, and the second is a list of any remaining nonoption arguments following the last option. cmdline.py iterates through the option list to determine which options are present and what their values are. Note that although you do not specify leading dashes in the option names passed to getopt(), the names returned from that function do include leading dashes.

Getting parameters from option files

If your API allows it, you can specify connection parameters in a MySQL option file and the API will read the parameters from the file for you. For APIs that do not support option files directly, you may be able to arrange to read other types of files in which parameters are stored or to write your own functions that read option files.

Specifying Connection Parameters Using Option Files describes the format of MySQL option files. I’ll assume that you’ve read the discussion there and concentrate here on how to use option files from within programs. Under Unix, user-specific options are specified by convention in~/.my.cnf (that is, in the .my.cnf file in your home directory). However, the MySQL option file mechanism can look in several different files if they exist (no option file is required to exist). The standard search order is /etc/my.cnf, the my.cnf file in the MySQL installation directory, and the~/.my.cnf file for the current user. Under Windows, the option files you can use are the my.ini file in your MySQL installation directory (for example, C:\Program Files\MySQL\MySQL Server 5.0), my.ini in your Windows directory (this is something like C:\Windows or C:\WINNT), or themy.cnf file.

If multiple option files exist and a given parameter is specified in several of them, the last value found takes precedence.

MySQL option files are not used by your own programs unless you tell them to do so:

§ Perl, Ruby, and Python provide direct API support for reading option files; simply indicate that you want to use them at the time that you connect to the server. It’s possible to specify that only a particular file should be read, or that the standard search order should be used to look for multiple option files.

§ PHP and Java do not support option files. As a workaround for PHP, we’ll write a simple option file parsing function. For Java, we’ll adopt a different approach that uses properties files.

Although the conventional name under Unix for the user-specific option file is .my.cnf in the current user’s home directory, there’s no rule that your programs must use this particular file. You can name an option file anything you like and put it wherever you want. For example, you might set up a file named mcb.cnf and install it in the /usr/local/lib/mcb directory for use by scripts that access the cookbook database. Under some circumstances, you might even want to create multiple option files. Then, from within any given script, you can select the file that’s appropriate for the type of permissions the script needs. For example, you might have one option file, mcb.cnf, that lists parameters for a full-access MySQL account, and another file, mcb-ro.cnf, that lists connection parameters for an account that needs only read-only access to MySQL. Another possibility is to list multiple groups within the same option file and have your scripts select options from the appropriate group.

Perl. Perl DBI scripts can use option files. To take advantage of this, place the appropriate option specifiers in the third component of the data source name string:

§ To specify an option group, use mysql_read_default_group=groupname . This tells MySQL to search the standard option files for options in the named group and in the [client] group. The groupname value should be written without the square brackets that are part of the line that begins the group. For example, if a group in an option file begins with a [my_prog] line, specify my_prog as the groupname value. To search the standard files but look only in the [client] group, groupname should be client.

§ To name a specific option file, use mysql_read_default_file=filename in the DSN. When you do this, MySQL looks only in that file and only for options in the [client] group.

§ If you specify both an option file and an option group, MySQL reads only the named file, but looks for options both in the named group and in the [client] group.

The following example tells MySQL to use the standard option file search order to look for options in both the [cookbook] and [client] groups:

my %conn_attrs = (PrintError => 0, RaiseError => 1, AutoCommit => 1);

# basic DSN

my $dsn = "DBI:mysql:database=cookbook";

# look in standard option files; use [cookbook] and [client] groups

$dsn .= ";mysql_read_default_group=cookbook";

my $dbh = DBI->connect ($dsn, undef, undef, \%conn_attrs);

The next example explicitly names the option file located in $ENV{HOME}, the home directory of the user running the script. Thus, MySQL will look only in that file and will use options from the [client] group:

my %conn_attrs = (PrintError => 0, RaiseError => 1, AutoCommit => 1);

# basic DSN

my $dsn = "DBI:mysql:database=cookbook";

# look in user-specific option file owned by the current user

$dsn .= ";mysql_read_default_file=$ENV{HOME}/.my.cnf";

my $dbh = DBI->connect ($dsn, undef, undef, \%conn_attrs);

If you pass an empty value (undef or the empty string) for the username or password arguments of the connect() call, connect() uses whatever values are found in the option file or files. A nonempty username or password in the connect() call overrides any option file value. Similarly, a host named in the DSN overrides any option file value. You can use this behavior to enable DBI scripts to obtain connection parameters both from option files as well as from the command line as follows:

1. Create $host_name, $user_name, and $password variables and initialize them to undef. Then parse the command-line arguments to set the variables to non-undef values if the corresponding options are present on the command line. (The cmdline.pl Perl script shown earlier in this recipe for processing command-line arguments demonstrates how to do this.)

2. After parsing the command arguments, construct the DSN string, and call connect(). Use mysql_read_default_group and mysql_read_default_file in the DSN to specify how you want option files to be used, and, if $host_name is not undef, add host=$host_nameto the DSN. In addition, pass $user_name and $password as the username and password arguments to connect(). These will be undef by default; if they were set from the command-line arguments, they will have non-undef values that override any option file values.

If a script follows this procedure, parameters given by the user on the command line are passed to connect() and take precedence over the contents of option files.

Ruby. Ruby DBI scripts can access option files by using a mechanism analogous to that used for Perl DBI, and the following examples correspond exactly to those shown in the preceding Perl discussion.

This example use the standard option file search order to look for options in both the [cookbook] and [client] groups:

# basic DSN

dsn = "DBI:Mysql:database=cookbook"

# look in standard option files; use [cookbook] and [client] groups

dsn << ";mysql_read_default_group=cookbook"

dbh = DBI.connect(dsn, nil, nil)

The following example uses the .my.cnf file in the current user’s home directory to obtain parameters from the [client] group:

# basic DSN

dsn = "DBI:Mysql:database=cookbook"

# look in user-specific option file owned by the current user

dsn << ";mysql_read_default_file=#{ENV['HOME']}/.my.cnf"

dbh = DBI.connect(dsn, nil, nil)

PHP. PHP has no native support for using MySQL option files. To work around that limitation, use a function that reads an option file, such as theread_mysql_option_file() function shown in the following listing. It takes as arguments the name of an option file and an option group name or an array containing group names. (Group names should be named without square brackets.) It then reads any options present in the file for the named group or groups. If no option group argument is given, the function looks by default in the [client] group. The return value is an array of option name/value pairs, or FALSE if an error occurs. It is not an error for the file not to exist. (Note that quoted option values and trailing #-style comments following option values are legal in MySQL option files, but this function does not handle those constructs.)

function read_mysql_option_file ($filename, $group_list = "client")

{

if (is_string ($group_list)) # convert string to array

$group_list = array ($group_list);

if (!is_array ($group_list)) # hmm ... garbage argument?

return (FALSE);

$opt = array (); # option name/value array

if (!@($fp = fopen ($filename, "r"))) # if file does not exist,

return ($opt); # return an empty list

$in_named_group = 0; # set nonzero while processing a named group

while ($s = fgets ($fp, 1024))

{

$s = trim ($s);

if (ereg ("^[#;]", $s)) # skip comments

continue;

if (ereg ("^\[([^]]+)]", $s, $arg)) # option group line?

{

# check whether we're in one of the desired groups

$in_named_group = 0;

foreach ($group_list as $key => $group_name)

{

if ($arg[1] == $group_name)

{

$in_named_group = 1; # we are in a desired group

break;

}

continue;

}

if (!$in_named_group) # we're not in a desired

continue; # group, skip the line

if (ereg ("^([^ \t=]+)[ \t]*=[ \t]*(.*)", $s, $arg))

$opt[$arg[1]] = $arg[2]; # name=value

else if (ereg ("^([^ \t]+)", $s, $arg))

$opt[$arg[1]] = ""; # name only

# else line is malformed

}

return ($opt);

}

Here are a couple of examples showing how to use read_mysql_option_file(). The first reads a user’s option file to get the [client] group parameters and then uses them to connect to the server. The second reads the system-wide option file, /etc/my.cnf, and prints the server startup parameters that are found there (that is, the parameters in the [mysqld] and [server] groups):

$opt = read_mysql_option_file ("/u/paul/.my.cnf");

$dsn = array

(

"phptype" => "mysqli",

"username" => $opt["user"],

"password" => $opt["password"],

"hostspec" => $opt["host"],

"database" => "cookbook"

);

$conn =& DB::connect ($dsn);

if (PEAR::isError ($conn))

print ("Cannot connect to server\n");

$opt = read_mysql_option_file ("/etc/my.cnf", array ("mysqld", "server"));

foreach ($opt as $name => $value)

print ("$name => $value\n");

Python. The MySQLdb module for DB-API provides direct support for using MySQL option files. Specify an option file or option group using read_default_file or read_default_group arguments to the connect() method. These two arguments act the same way as themysql_read_default_file and mysql_read_default_group options for the Perl DBI connect() method (see the Perl discussion earlier in this section). To use the standard option file search order to look for options in both the [cookbook] and [client] groups, do something like this:

conn = MySQLdb.connect (db = "cookbook", read_default_group = "cookbook")

The following example shows how to use the .my.cnf file in the current user’s home directory to obtain parameters from the [client] group:

option_file = os.environ["HOME"] + "/" + ".my.cnf"

conn = MySQLdb.connect (db = "cookbook", read_default_file = option_file)

You must import the os module to access os.environ.

Java. The MySQL Connector/J JDBC driver doesn’t support option files. However, the Java class library provides support for reading properties files that contain lines in name=value format. This is somewhat similar to MySQL option file format, although there are some differences (for example, properties files do not allow [groupname] lines). Here is a simple properties file:

# this file lists parameters for connecting to the MySQL server

user=cbuser

password=cbpass

host=localhost

The following program, ReadPropsFile.java, shows one way to read a properties file named Cookbook.properties to obtain connection parameters. The file must be in some directory that is named in your CLASSPATH variable, or else you must specify it using a full pathname (the example shown here assumes that the file is in a CLASSPATH directory):

import java.sql.*;

import java.util.*; // need this for properties file support

public class ReadPropsFile

{

public static void main (String[] args)

{

Connection conn = null;

String url = null;

String propsFile = "Cookbook.properties";

Properties props = new Properties ();

try

{

props.load (ReadPropsFile.class.getResourceAsStream (propsFile));

}

catch (Exception e)

{

System.err.println ("Cannot read properties file");

System.exit (1);

}

try

{

// construct connection URL, encoding username

// and password as parameters at the end

url = "jdbc:mysql://"

+ props.getProperty ("host")

+ "/cookbook"

+ "?user=" + props.getProperty ("user")

+ "&password=" + props.getProperty ("password");

Class.forName ("com.mysql.jdbc.Driver").newInstance ();

conn = DriverManager.getConnection (url);

System.out.println ("Connected");

}

catch (Exception e)

{

System.err.println ("Cannot connect to server");

}

finally

{

try

{

if (conn != null)

{

conn.close ();

System.out.println ("Disconnected");

}

catch (SQLException e) { /* ignore close errors */ }

}

If you want getProperty() to return a particular default value when the named property is not found, pass that value as a second argument. For example, to use 127.0.0.1 as the default host value, call getProperty() like this:

String hostName = props.getProperty ("host", "127.0.0.1");

The Cookbook.java library file developed earlier in the chapter (Writing Library Files) includes an extra library call in the version of the file that you’ll find in the lib directory of the recipes distribution: a propsConnect() routine that is based on the concepts discussed here. To use it, set up the contents of the properties file, Cookbook.properties, and copy the file to the same location where you installed Cookbook.class. You can then establish a connection within a program by importing the Cookbook class and calling Cookbook.propsConnect() rather than by callingCookbook.connect().

Conclusion and Words of Advice

This chapter discusses the basic operations provided by each of our APIs for handling various aspects of interaction with the MySQL server. These operations enable you to write programs that issue any kind of statement and retrieve the results. Up to this point, we’ve used simple statements because the focus is on the APIs rather than on SQL. The next chapter focuses on SQL instead, to show how to ask the database server more complex questions.

Before you proceed, it would be a good idea to reset the profile table used in this chapter to a known state. Several statements in later chapters use this table; by reinitializing it, you’ll get the same results displayed in those chapters when you run the statements shown there. To reset the table, change location into the tables directory of the recipes distribution, and run the following commands:

%mysql cookbook < profile.sql

% mysql cookbook < profile2.sql