Data Types - Appendixes - MySQL in a Nutshell (2008)

MySQL in a Nutshell (2008)

Part V. Appendixes

This part of the book contains quick-reference information that applies to many of the chapters and is often desired in a compact format by a programmer or administrator.

Appendix A. Data Types

When a table is created using the CREATE TABLE statement, every column in a table must be declared as one of the data types supported by MySQL. A column in a table can be added or changed using the ALTER TABLE statement. Data types can be organized in three basic groups: numeric, date and time, and string. This appendix provides a listing of data types along with their limitations.

Numeric Data Types

Standard SQL numeric data types are allowed: accurate numeric data types (i.e., BIGINT, DECIMAL, INTEGER, MEDIUMINT, NUMERIC, SMALLINT, and TINYINT) and approximate numeric data types (i.e., DOUBLE and FLOAT). For all numeric data types, you can use the UNSIGNED andZEROFILL flags depending on your needs. If UNSIGNED is omitted, SIGNED is assumed. A numeric data type has different allowable ranges based on whether it’s SIGNED or UNSIGNED. The ZEROFILL flag instructs MySQL to pad the unused spaces to the left of a number with zeros. For example, a column with a data type set to INT(10) using ZEROFILL will display the number 5 as 0000000005. If the ZEROFILL flag is used, UNSIGNED is assumed for the column. When subtracting values where one is UNSIGNED, the results will become UNSIGNED.

For several of the numeric data types, you can specify a width for displaying. This number cannot exceed 255. The display width is a factor only when ZEROFILL is used for the column. You may also specify the number of digits allowed for the decimals, including the decimal point.

Approximate numeric data types store floating-point numbers such as fractions where an approximation must be made. For instance, an accurate number, per se, cannot be stored for 1/3 because the decimal point for 3 continues on endlessly. MySQL provides two approximate numeric data types: FLOAT and DOUBLE with their synonyms.

Following is a list of numeric data types. They’re not organized alphabetically; instead, they’re organized in ascending order based on the size of numeric values they can contain, with the column data types that may be used for approximate numbers listed last.

Name

BIT

Synopsis

BIT[(width)]

This is a bit-field data column type. With this column you can specify the maximum number of bits. It accepts 1 to 64. If no width is given, 1 bit is assumed. You can use this column to store binary data (i.e., data composed of 1s and 0s). You can also use this column in conjunction with functions like BIN(). Here is an example with a column that has a data type of BIT(8):

SELECT server_id

FROM servers

WHERE status = BIT(4);

This statement will return rows where the status column has a value of 00000100, which is the binary equivalent of 4.

Name

TINYINT

Synopsis

TINYINT[(width)] [UNSIGNED] [ZEROFILL]

This data column type can be used for tiny integers. The signed range can be from –128 to 127; unsigned can be from 0 to 255. The default if no width is given is 4. This column type can be useful for a simple logical column. For example, TINYINT(1) can be used for a column in which you only want a value of 1 or 0, yes or no. You could just as easily use BOOLEAN, though.

Name

BOOL, BOOLEAN

Synopsis

BOOL

This data column type, BOOL, is synonymous with BOOLEAN and TINYINT(1). In fact, if you set a column to this type and then use the DESCRIBE statement to see the description of the column, it will show it as a TINYINT(1). It can be useful for a simple logical column in which you want only a true or false value. For example, if a column labeled active was a BOOLEAN type, you could do something like the following:

SELECT client_name AS 'Client',

IF(active, 'Active', 'Inactive') AS Status

FROM clients;

This statement will show each client name in the table with the words Active or Inactive next to each name. This works because the IF() function checks for a value of 1 or 0 for the value given; it returns the second parameter given if the value is 1, the third parameter if the value is 0.

Name

SMALLINT

Synopsis

SMALLINT[(width)] [UNSIGNED] [ZEROFILL]

Use this data column type for small integer values. The signed range can be from –32,768 to 32,767; unsigned can be from 0 to 65,535. The default if no width is given is 6.

Name

MEDIUMINT

Synopsis

MEDIUMINT[(width)] [UNSIGNED] [ZEROFILL]

This data column type is for integer values of medium size. The signed range can be from –8,388,608 to 8,388,607; unsigned can be from 0 to 16,777,215. The default if no width is given is 9.

Name

INT, INTEGER

Synopsis

INT[(width)] [UNSIGNED] [ZEROFILL]

This is probably the most common numeric data column type used. The signed range can be from –2,147,483,648 to 2,147,483,647; unsigned can be from 0 to 4,294,967,295. The default if no width is given is 11. INTEGER is a synonym for this data type.

Name

BIGINT, SERIAL

Synopsis

BIGINT[(width)] [UNSIGNED] [ZEROFILL]

This data column type is for integer values of a large size. The signed range can be from –9,223,372,036,854,775,808 to 9,223,372,036,854,775,807; unsigned can be from 0 to 18,446,744,073,709,551,615. The default if no width is given is 20. SERIAL is a synonym for this data type, but with specific column options: BIGINT, UNSIGNED, NOT NULL, AUTO_INCREMENT, and UNIQUE.

Name

FLOAT

Synopsis

FLOAT[(width, decimals)|(bit_precision)] [UNSIGNED] [ZEROFILL]

You can specify a level of precision for this data type. It may be from 0 to 24 for single-precision floating-point numbers, and from 25 to 53 for double-precision floating-point numbers. To make it ODBC-compatible, you can use the second syntax that specifies the precision only in bits. If you don’t give a precision with FLOAT, a single-precision floating point is assumed. If the SQL mode for the server has the REAL_AS_FLOAT option enabled, REAL is a synonym for FLOAT. This is set with the --sql-mode option when starting the server.

Name

DOUBLE, DOUBLE PRECISION

Synopsis

DOUBLE[(width, decimals)] [UNSIGNED] [ZEROFILL]

Use this data column type for approximate, floating-point numbers. The width given is the maximum width of the total digits—on both sides of the decimal place, not including the decimal point. The decimals given are the number of decimals allowed and the number for which it zero-fills. For example, a column type setting of DOUBLE(4,2) that is given a value of 123.4 will return a warning and store the value as 99.99, the maximum amount given the width. If the same column is given a value of 12.3, it won’t generate any warnings and it will store the value as 12.30.

The signed range of values can be from –1.7976931348623157E+308 to –2.2250738585072014E–308; the unsigned range from 2.2250738585072014E–308 to 1.7976931348623157E+308. Usually these limits aren’t possible due to the hardware and operating system limits of the server. The accuracy of this data type is up to about 15 decimal places.

With this data type, if UNSIGNED is specified, negative values are not allowed. If the SQL mode for the server does not have the REAL_AS_FLOAT option enabled, REAL is a synonym for DOUBLE.

Name

DEC, DECIMAL, FIXED, NUMERIC

Synopsis

DECIMAL[(width[, decimals])] [UNSIGNED] [ZEROFILL]

This data column type is similar to FLOAT, but it’s used for accurate, fixed-point numbers. When calculating with a DECIMAL column, it has a 65-digit level of precision. The width given is the maximum width of the total digits—on both sides of the decimal place, not including the decimal point or the negative sign if a negative number. The decimals given are the number of decimals allowed and the number for which it zero-fills. For example, a column type setting of DECIMAL(4,2) that is given a value of 123.4 will return a warning and store the value as 99.99, the maximum amount given the width. If the same column is given a value of 12.3, it won’t generate any warnings and will store the value as 12.30. If UNSIGNED is used with DECIMAL, negative values are not allowed.

MySQL stores numbers in DECIMAL columns as strings. Therefore, numbers outside the maximum numeric range for this data type may be stored in a DECIMAL column. This data type may be retrieved and displayed as a string, but in a numeric context (i.e., as part of a calculation), it cannot exceed the values shown in the table.

Date and Time Data Types

There are a few column data types for storing date and time values. They are listed in Table A-1. The table also lists the valid ranges for each data type. If a value that is not permitted or that is outside the acceptable range for the data type is inserted, zeros are used instead. You can override this feature by starting the server with --sql-mode='ALLOW_INVALID_DATES. As of version 5.0.2 of MySQL, warnings are generated when inserting invalid dates or times. For dates that are inserted with only 2 digits for the year, values from 00 to 69 are assumed to be in the 21st century. For years from 70 to 99, they are assumed to be in the 20th century.

Table A-1. Date and time data types

Data type

Format

Range

DATE

yyyy-mm-dd

1000-01-01 to 9999-12-31

DATETIME

yyyy-mm-dd hh:mm:ss

1000-01-01 00:00:00 to 9999-12-31 00:00:00

TIMESTAMP

yyyy-mm-dd hh:mm:ss

1970-01-01 00:00:00 to 2037-12-31 23:59:59

TIME

hh:mm:ss

–838:59:59 to 838:59:59

YEAR[(2|4)]

yy or yyyy

1970 to 2069, or 1901 to 2155

Times values may be given either as a string or numerically. To give them as a string, you can enter a value as d hh:mm:ss.f. In this format, d stands for the number of days, and has an allowable range of 0 to 34. The f stands for a fractional number of seconds. This value will not be stored, though. The ability to store fractional seconds is expected to be added in future releases of MySQL. You don’t have to specify values for all elements of a time. Instead, you can enter a time value using one of these formats: hh:mm:ss.f, hh:mm:ss, hh:mm, or just ss. If you want to include the number of days, you can use these formats: d hh:mm:ss, d hh:mm, or d hh. You can also drop the colons and just enter hhmmss, but you can’t add minutes onto the end of that format. The data type TIMESTAMP stores the date and time as the number of seconds since the epoch (its earliest date allowed), but it displays this number with the format yyyy-mm-dd hh:mm:ss. MySQL will automatically convert a date or time to its numeric date equivalent when it is used in a numeric context, and it will convert a numeric date to a date or time. If it isn’t given a date, MySQL will use a default of the current date—this is the primary difference between this column data type and DATETIME.

String Data Types

There are several column data types for storing strings. String data types are case-sensitive, so lowercase and uppercase letters remain unchanged when stored or retrieved. For a few of the string data types, you may specify a maximum column width. If a string is entered in a column that exceeds the width set for the column, the string will be right-truncated when stored. Binary strings are case-sensitive.

Following is a list of string data types. They’re not organized alphabetically; instead, they’re organized in ascending order based on the size of string values they can contain. This list also includes the width in characters or bytes for each data type.

Name

CHAR

Synopsis

CHAR(width) [BINARY|ASCII|UNICODE] [CHARACTER SET character_set]

[COLLATE collation]

The CHAR data type is a fixed-width column, padded with spaces to the right as needed. The spaces are not included in the results when queried. This column may be from 0 to 255 characters wide. The default if no width is given is 1. This type is synonymous with CHARACTER. You can also use NATIONAL CHARACTER or NCHAR to indicate that a predefined national character set is to be used. Columns are right-padded with spaces when stored. FULLTEXT indexing and searching may be performed on a CHAR column with a MyISAM table.

As of version 4.1 of MySQL, you can specify the ASCII attribute for use with the CHAR data type. This will set the column to the latin1 character set. Also as of version 4.1 of MySQL, you can specify the UNICODE attribute, which will set the column to the ucs2 character set.

Name

VARCHAR

Synopsis

VARCHAR(width) [BINARY]

The VARCHAR data type adjusts its width and does not pad the strings stored. Any trailing spaces contained in a string that is stored are removed. This column may be from 0 to 65,535 characters wide. The type of character set given can affect the number of characters given, as some require more than one byte per character (e.g., UTF-8). The default if no width is given is 1. This type is synonymous with CHARACTER VARYING. You can also use NATIONAL VARCHAR to indicate that a predefined national character set is to be used. FULLTEXT indexing and searching may be performed on a VARCHAR column with a MyISAM table.

Name

BINARY

Synopsis

BINARY(width)

This data type stores data as binary strings, not as character strings like CHAR does. The width given is for the maximum width in bytes—this value must be specified. This data type replaces CHAR BINARY. Before version 4.1.2 of MySQL, if you added the BINARY flag after CHAR, it instructed MySQL to treat the values as byte strings for sorting and comparing. If a BINARY column is used in an expression, all elements of the expression are treated as binary.

Name

VARBINARY

Synopsis

VARBINARY(width) [CHARACTER SET character_set] [COLLATE collation]

This data type stores data as binary strings, not as character strings like VARCHAR. The width given is for the maximum width in bytes—this value must be specified. If you want to use a character set for the column other than the default for the table, you can give one for the column. Values are sorted based on the collation of the character set for the column. This data type replaces VARCHAR BINARY. Before version 4.1.2 of MySQL, if you added the BINARY flag after VARCHAR, it instructed MySQL to treat the values as byte strings for sorting and comparing. All elements of the expression are then treated as binary.

Name

TINYBLOB

Synopsis

TINYBLOB

This column data type allows for the storage of binary data. The maximum width is 255 bytes.

Name

TINYTEXT

Synopsis

TINYTEXT[CHARACTER SET character_set] [COLLATE collation]

This column data type allows for the storage of text data. The maximum width is 255 bytes. If you want to use a character set for the column other than the default for the table, you can give one for the column. Values are sorted based on the collation of the character set for the column.

Name

BLOB

Synopsis

BLOB[(width)]

This column data type allows for the storage of a large amount of binary data. You may give a width with this data type; the maximum is 65,535 bytes. If you attempt to store a value in a BLOB column that is larger than its limit, unless the server is set to SQL strict mode, the data will be truncated and a warning message will be generated. If strict mode is on, the data will be rejected and an error will be returned. A BLOB column cannot have a default value. For sorting data, the value given for the system variable max_sort_length will be used. Only the number of bytes specified by that variable for each column will be included in sorts.

Name

TEXT

Synopsis

TEXT[(width)] [CHARACTER SET character_set] [COLLATE collation]

This column data type allows for the storage of a large amount of text data. You may give a width with this data type; the maximum is 65,535 bytes. If you attempt to store a value in a TEXT column that is larger than its limit, unless the server is set to SQL strict mode, the data will be truncated and a warning message will be generated. If strict mode is on, the data will be rejected and an error will be returned. A TEXT column cannot have a default value. For sorting data, the value given for the system variable max_sort_length will be used. Only the number of bytes specified by that variable for each column will be included in sorts. FULLTEXT indexing and searching may be performed on a TEXT column with a MyISAM table, but not on a BLOB column. If you want to use a character set for the column other than the default for the table, you can give one for the column. Values are sorted based on the collation of the character set for the column.

Name

MEDIUMBLOB

Synopsis

MEDIUMBLOB

This column data type allows for the storage of a large amount of binary data. The maximum width is 16,777,215 bytes.

Name

MEDIUMTEXT

Synopsis

MEDIUMTEXT [CHARACTER SET character_set] [COLLATE collation]

This column data type allows for the storage of a large amount of text data. The maximum width is 16,777,215 bytes. If you want to use a character set for the column other than the default for the table, you can give one for the column. Values are sorted based on the collation of the character set for the column.

Name

LONGBLOB

Synopsis

LONGBLOB

This column data type allows for the storage of a large amount of binary data. The maximum width is 4 GB.

Name

LONGTEXT

Synopsis

LONGTEXT [CHARACTER SET character_set] [COLLATE collation]

This column data type allows for the storage of a large amount of text data. The maximum width is 4 GB. If you want to use a character set for the column other than the default for the table, you can give one for the column. Values are sorted based on the collation of the character set for the column.

Name

ENUM

Synopsis

ENUM('value', ...) [CHARACTER SET character_set] [COLLATE collation]

An ENUM column is one in which all possible choices are enumerated (e.g., ENUM('yes', 'no', 'maybe')). It’s possible for it to contain a blank value (i.e., '') and NULL. If an ENUM column is set up to allow NULL values, NULL will be the default value. If an ENUM column is set up with NOT NULL, NULL isn’t allowed and the default value becomes the first element given.

MySQL stores a numeric index of the enumerated values in the column, 1 being the first value. The values can be retrieved when the column is used in a numeric context (e.g., SELECT col1 + 0 FROM table1;). The reverse may be performed when entering data into a column (e.g.,UPDATE table1 SET col1 = 3; to set the value to the third element). The column values are sorted in ascending order based on the numeric index, not on their corresponding enumerated values. If you want to use a character set for the column other than the default for the table, you can give one for the column. Values are sorted based on the collation of the character set for the column.

Name

SET

Synopsis

SET('value', ...) [CHARACTER SET character_set] [COLLATE collation]

The SET data type is similar to ENUM, except that a SET column can hold multiple values (e.g., UPDATE table1 SET col1 = 'a, b';). For this data type, values may be filtered with the FIND_IN_SET() function. If you want to use a character set for the column other than the default for the table, you can give one for the column. Values are sorted based on the collation of the character set for the column.