MySQL Stored Procedure Programming (2009)

Part II. Stored Program Construction

Chapter 10. Stored Functions

A stored function is a stored program that returns a value. While stored procedures may return values via OUT or INOUT variables, a function can—and must—return data only via a single RETURN value. Unlike stored procedures, stored functions can be used in expressions wherever you can use a built-in function of the same return data type and can be used inside of SQL statements such as SELECT, UPDATE, DELETE, and INSERT.

In this chapter we will look at how and when to use stored functions.

The use of stored functions can improve the readability and maintainability of stored program code by encapsulating commonly used business rules or formulas. You can also use stored function return values to control the overall program flow.

Using stored functions in standard SQL statements can simplify the syntax of the SQL by hiding complex calculations and avoiding the repetitive coding of these calculations throughout your code. Stored functions can also be used in SQL to implement operations that would otherwise require subqueries or joins, although you need to be careful to avoid possible performance problems that can occur if a function called from a SQL statement itself calls other SQL statements.

Stored functions may not return result sets and may not include dynamic SQL.

Creating Stored Functions

We provided an overview of the CREATE FUNCTION statement in Chapter 7, but we will recap here. You create a stored function using the following syntax:

CREATE FUNCTION function_name (parameter[,...])

RETURNS datatype

[LANGUAGE SQL]

[ [NOT] DETERMINISTIC ]

[ {CONTAINS SQL | NO SQL | MODIFIES SQL DATA | READS SQL DATA} ]

[ SQL SECURITY {DEFINER|INVOKER} ]

[ COMMENT comment_string ]function_statements

Most of the options for the CREATE FUNCTION statement also apply to CREATE PROCEDURE and are documented in Chapter 7. However, the following are unique to stored functions:

§ The RETURNS clause is mandatory and defines the data type that the function will return.

§ You cannot specify the IN, OUT, or INOUT modifiers to parameters. All parameters are implicitly IN parameters.

§ The function body must contain one or more RETURN statements, which terminate function execution and return the specified result to the calling program, as described in the following section.

The RETURN Statement

The RETURN statement terminates stored function execution and returns the specified value to the calling program. You can have as many RETURN statements in your stored function as makes sense. Example 10-1 shows an example of a stored function that has multiple RETURN statements.

Example 10-1. Simple stored function with multiple RETURN statements

CREATE FUNCTION cust_status(in_status CHAR(1))

RETURNS VARCHAR(20)

BEGIN

IF in_status = 'O' THEN

RETURN('Overdue');

ELSEIF in_status = 'U' THEN

RETURN('Up to date');

ELSEIF in_status = 'N' THEN

RETURN('New');

END IF;

END;

However, it is usually regarded as good practice to include only a single RETURN statement ("one way in and one way out"), and to use variable assignments within conditional statements to change the return value. Aside from arguably resulting in more comprehensible program flow, using a single RETURN statement can avoid the situation in which none of the RETURN statements get executed. "Falling out" of a function, rather than exiting cleanly via a RETURN statement, will cause a runtime error, as shown in Example 10-2.

Example 10-2. "Falling out" of a function without executing a RETURN statement

mysql> SELECT cust_status('X');

ERROR 1321 (2F005): FUNCTION cust_status ended without RETURN

Example 10-3 shows our previous example recoded to include only a single RETURN statement.

Example 10-3. Simple stored function with single RETURN statement

CREATE FUNCTION cust_status(in_status CHAR(1))

RETURNS VARCHAR(20)

BEGIN

DECLARE long_status VARCHAR(20);

IF in_status = 'O' THEN

SET long_status='Overdue';

ELSEIF in_status = 'U' THEN

SET long_status='Up to date';

ELSEIF in_status = 'N' THEN

SET long_status='New';

END IF;

RETURN(long_status);

END;

Tip

It is good practice to include only a single RETURN statement—as the last line of executable code—in your stored functions . Avoid any flow control that could allow the stored function to terminate without calling a RETURN statement.

Parameters to Stored Functions

Stored functions can include multiple parameters, but these may only be IN parameters. That is, you can specify neither the OUT nor INOUT clause (nor even the IN clause) when defining your parameters (see Chapter 7 for a more detailed description of OUT and INOUT parameters). So, for instance, the function defined in Example 10-4 will not compile.

Example 10-4. Function will not compile due to the INOUT clause

CREATE FUNCTION f_inout(INOUT x INT) RETURNS INT

BEGIN

SET x=1;

RETURN(1);

END;

Tip

Stored functions cannot include OUT or INOUT parameters; if you need to return multiple variables from your stored program, then a procedure is possibly more appropriate than a function.

The DETERMINISTIC and SQL Clauses

When binary logging is enabled, MySQL needs to know if a stored function that modifies SQL is deterministic—that is, if it always performs the same actions and returns the same results when provided with the same inputs. Since the default for stored programs is NOT DETERMINISTIC CONTAINS SQL, you need to explicitly set the appropriate keywords in order for the function to compile when binary logging is enabled. This requirement relates to the need to ensure that changes made in the stored function can be correctly replicated to another server. If the actions performed by the function are nondeterministic, then correct replication cannot be assured.

A nondeterministic routine is one that can produce different outputs when provided with the same inputs. In this context, "outputs" include not just the return values of the stored program, but also any modifications that may be made to data within the MySQL databases. Currently, MySQL only cares about the determinism of a function or a procedure in the context of replication. In the future, however, the DETERMINISTIC keyword may also be used to perform certain optimizations (such as caching function return values) or to allow a function to be used in an index or partition definition.

If you declare a stored function without one of the SQL mode clauses NO SQL or READS SQL, and if you have not specified the DETERMINISTIC clause, and if the binary log is enabled, you may receive the following error:

ERROR 1418 (HY000): This function has none of DETERMINISTIC, NO SQL, or READS SQL DATA in

its declaration and binary logging is enabled (you *might* want to use the less safe

log_bin_trust_function_creators variable)

To avoid this error, you must do one of the following:

§ Specify one or more of the DETERMINISTIC, NO SQL, and/or READS SQL DATA keywords in your stored function definition.

§ Set the value of log_bin_trust_routine_creators to 1 (SET GLOBAL log_bin_trust_routine_creators = 1)

Of course, you should not specify that a stored function is DETERMINISTIC if it is not, and you should avoid setting log_bin_trust_routine_creators to 1 unless you are unconcerned about the correctness of data recovery or replication. Therefore, as a general rule, you should avoid creating nondeterministic stored functions that modify data.

The use of the NOW function or any similar time-based functions does not necessarily cause a stored function to become nondeterministic (at least from a replication perspective), since MySQL logs the timestamp in the binary log, resulting in NOW( ) being calculated correctly during replication or recovery. Likewise, a single random number will also not cause the routine to become nondeterministic, since the seed to the random number generator will be identical on the slave and during data recovery. However, multiple calls to RAND( ) will cause a routine to become nondeterministic.

This restriction on nondeterministic routines applied to both stored functions and stored procedures in the initial production release of MySQL 5.0, but from 5.0.16 on it applies only to stored functions.

If your function is nondeterministic, and it reads but does not modify the database, then you may use the clauses NOT DETERMINISTIC READS SQL DATA to allow the function to be created. If the function is nondeterministic and performs no database access at all, then we recommend using NOT DETERMINISTIC NO SQL.

The relevant ANSI standard intended that the NO SQL clause should pertain only to "external" stored programs written in nondatabase languages such as (for instance) Java or PHP. Therefore, the use of NO SQL may not be strictly correct from a standards perspective. However, we think that the alternatives—to specify READS SQL DATA for a function that performs no database access at all or to declare a nondeterministic function as DETERMINISTIC—are clearly unacceptable. Therefore, we recommend that you use NO SQL when required to denote that a stored function performs no database operations.

Issues relating to replication and nondeterministic functions are expected to be resolved in MySQL 5.1 with the introduction of row-level binary logging.

SQL Statements in Stored Functions

You can include SQL statements within stored functions , although you should be very careful about including SQL statements in a stored function that might itself be used inside a SQL statement (more on that later).

However, you cannot return a result set from a stored function: trying to create a stored function that contains a SELECT statement without an INTO clause will result in a 1415 error, as shown in Example 10-5.

Example 10-5. Stored functions cannot return result sets

mysql> CREATE FUNCTION test_func( )

-> RETURNS INT

-> BEGIN

-> SELECT 'Hello World';

-> RETURN 1;

-> END;$$

ERROR 1415 (0A000): Not allowed to return a result set from a function

Calling Stored Functions

A function can be called by specifying its name and parameter list wherever an expression of the appropriate data type may be used. To show how stored functions can be called, we'll use the simple stored function shown in Example 10-6.

Example 10-6. Simple stored function

CREATE FUNCTION isodd(input_number int)

RETURNS int

BEGIN

DECLARE v_isodd INT;

IF MOD(input_number,2)=0 THEN

SET v_isodd=FALSE;

ELSE

SET v_isodd=TRUE;

END IF;

RETURN(v_isodd);

END ;

From the MySQL command line, we can invoke our simple stored function in a number of ways. Example 10-7 shows how to call the stored function from a SET statement and from a SELECT statement.

Example 10-7. Calling a stored function from the MySQL command line

mysql> SET @x=isodd(42);

Query OK, 0 rows affected (0.00 sec)

mysql> SELECT @x;

+------+

| @x |

+------+

| 0 |

+------+

1 row in set (0.02 sec)

mysql> SELECT isodd(42)

-> ;

+-----------+

| isodd(42) |

+-----------+

| 0 |

+-----------+

From within a stored procedure, we can invoke the function both within a SET clause and within a variety of flow control statements. Example 10-8 shows how to call a stored function from within a SET statement, as well as from an IF statement.

Example 10-8. Calling a stored function from within a stored procedure

SET l_isodd=isodd(aNumber);

IF (isodd(aNumber)) THEN

SELECT CONCAT(aNumber," is odd") as isodd;

ELSE

SELECT CONCAT(aNumber," is even") AS isodd;

END IF;

Programming languages support a variety of methods for calling a stored function. Java and .NET languages (VB.NET and C#) provide methods to call stored functions directly. However, in many of the dynamic languages (PHP, Perl, Python) there is no API for directly accessing a stored function. (We give guidelines for common programming languages in Chapters 12 through 17.)

If a language does not support a method for directly calling a stored function, you should embed the call in a SELECT statement without a FROM clause and retrieve the function result from the subsequent result set. For instance, in PHP, with the mysqli interface, we can retrieve a stored function result as shown in Example 10-9.

Example 10-9. Calling a stored function from PHP

$stmt=$my_db->prepare("SELECT isodd(?)") or die($my_db->error);

$stmt->bind_param('i',$aNumber) or die($stmt->error);

$stmt->execute( ) or die($stmt->error);

$stmt->bind_result($isodd);

$stmt->fetch( );

if ($isodd == 1 )

printf("%d is an odd number\n",$aNumber);

else

printf("%d is an even number\n",$aNumber);

Some languages specifically support calling stored functions . For instance, Java JDBC allows a stored function to be called directly, as shown in Example 10-10.

Example 10-10. JDBC support for stored functions

CallableStatement PreparedFunc =

MyConnect.prepareCall("{ ? = call isodd( ? ) }");

PreparedFunc.registerOutParameter(1, Types.INTEGER);

PreparedFunc.setInt(1, aNumber);

PreparedFunc.execute( );

if (PreparedFunc.getInt(1) == 1)

System.out.println(aNumber + " is odd");

else

System.out.println(aNumber + " is even");

Using Stored Functions in SQL

So far, we have looked at stored functions as though they were simply a variant on the stored procedure syntax—a special type of stored procedure that can return a value. While this is certainly a valid use for a stored function, stored functions have an additional and significant role to play: asuser-defined functions (UDFs ) within SQL statements.

Consider the SELECT statement shown in Example 10-11: it returns a count of customers by status, with the one-byte status code decoded into a meaningful description. It also sorts by the decoded customer status. Notice that we must repeat the rather awkward CASE statement in both theSELECT list and the ORDER BY clause.

Example 10-11. SQL statement with multiple CASE statements

SELECT CASE customer_status

WHEN 'U' THEN 'Up to Date'

WHEN 'N' THEN 'New'

WHEN 'O' THEN 'Overdue'

END as Status, count(*) as Count

FROM customers

GROUP BY customer_status

ORDER BY CASE customer_status

WHEN 'U' THEN 'Up to Date'

WHEN 'N' THEN 'New'

WHEN 'O' THEN 'Overdue'

END

Now imagine an application with many similar CASE statements, as well as complex calculations involving business accounting logic, scattered throughout our application. Such statements—often with embedded expressions far more complex than the one shown in Example 10-11—result in code that is difficult to understand and maintain. Whenever the CASE constructs or business calculations need to be modified, it will be necessary to find and then modify a large number of SQL statements, affecting many different modules.

Stored functions can help us minimize this problem, by centralizing the complex code in one program unit, and then deploying that program wherever needed. Example 10-12 shows the result of transferring the logic in the previous query's CASE expression into a stored function.

Example 10-12. Stored function for use in our SQL statement

CREATE FUNCTION cust_status(IN in_status CHAR(1))

RETURNS VARCHAR(20)

BEGIN

DECLARE long_status VARCHAR(20);

IF in_status = 'O' THEN

SET long_status='Overdue';

ELSEIF in_status = 'U' THEN

SET long_status='Up to date';

ELSEIF in_status = 'N' THEN

SET long_status='New';

END IF;

RETURN(long_status);

END;

We can now use this function in our SQL statement, as shown in Example 10-13.

Example 10-13. Stored function in a SQL statement

SELECT cust_status(customer_status) as Status, count(*) as Count

FROM customers

GROUP BY customer_status

ORDER BY cust_status(customer_status);

Notice that the repetition has been removed and the query is also much more readable, since it is hiding the details of the customer status formula. If and when a programmer needs to understand the logic used to determine customer status, she can open up the stored function and take a look.

Using SQL in Stored Functions

You can include SQL statements inside of stored functions that are themselves used within SQL statements as user-defined functions. However, be careful when doing so, since functions calling SQL inside of SQL statements can lead to unpredictable and often poor performance.

For instance, consider the stored function shown in Example 10-14.

Example 10-14. Stored function to return customer count for a sales rep

CREATE FUNCTION customers_for_rep(in_rep_id INT)

RETURNS INT

READS SQL DATA

BEGIN

DECLARE customer_count INT;

SELECT COUNT(*)

INTO customer_count

FROM customers

WHERE sales_rep_id=in_rep_id;

RETURN(customer_count);

END;

This function returns the number of customers assigned to a given sales representative. We might use this function in a stored program when calculating a commission, as shown in Example 10-15.

Example 10-15. Using the sales rep function in a stored program

IF customers_for_rep(in_employee_id) > 0 THEN

CALL calc_sales_rep_bonus(in_employee_id);

ELSE

CALL calc_nonrep_bonus(in_employee_id);

END IF;

If this stored function is called for a single employee, then the use of the stored function is probably appropriate—it improves the clarity of the business logic, and performance would be no worse than it would be with an embedded SQL statement.

However, consider the case where we want to issue a query listing all the sales representatives with more than 10 customers together with their customer counts. In standard SQL, the query might look like that shown in Example 10-16.

Example 10-16. Standard SQL to retrieve sales reps with more than 10 customers

SELECT employee_id,COUNT(*)

FROM employees JOIN customers

ON (employee_id=sales_rep_id)

GROUP BY employee_id

HAVING COUNT(*) > 10

ORDER BY COUNT(*) desc;

Alternately, we can use our stored function, which will—apparently—avoid the join between employees and customers and also avoid a GROUP BY. The stored function version of the query is shown in Example 10-17.

Example 10-17. Function-based query to retrieve sales reps with more than 10 customers

SELECT employee_id,customers_for_rep(employee_id)

FROM employees

WHERE customers_for_rep(employee_id)>10

ORDER BY customers_for_rep(employee_id) desc

Although the stored function solution looks a lot simpler, it actually takes much longer to run than the standard SQL. For every row retrieved from the employees table, the stored function must be called three times (once for the SELECT, once for the WHERE, and once for the ORDER BY). Furthermore, each invocation of the stored function performs a full table scan of the customers table—resulting in three such full scans for each employee row. In contrast, the standard SQL performs just one scan of the customers table and then joins that to the employees table using the primary key (employee_id).

For our sample data, the standard SQL returned the required results almost instantaneously, while the stored function solution took almost half a minute. Figure 10-1 compares the execution times for the two solutions.

Using a stored function inside of a SQL statement that, in turn, contains SQL will not always cause such extreme response time degradation. In general, though, you should think twice about using a function that contains SQL inside of another SQL statement unless the embedded SQL is very efficient—such as a SQL statement that retrieves data via a quick index lookup.

Comparison of performance between standard SQL and SQL using a stored function containing embedded SQL

Figure 10-1. Comparison of performance between standard SQL and SQL using a stored function containing embedded SQL

Tip

Be careful using SQL inside of stored functions that are called by other SQL statements. The resulting performance can be very poor unless the stored function is extremely efficient.

Conclusion

A stored function is a special type of stored program that returns a single result. Stored functions can be used in SQL statements or within other stored programs wherever an expression that returns a corresponding data type can be used.

Stored functions have the following limitations when compared to stored procedures:

§ They may not include OUT or INOUT parameters.

§ They may not return result sets.

A stored function terminates when a RETURN statement is encountered. In general, it is good practice to include a single RETURN statement at the end of the function rather than including multiple RETURN statements inside flow control statements. If a stored function terminates without issuing a RETURN statement, an error will be raised.

You can use stored functions within standard SQL. Doing so can improve the readability and maintainability of the SQL by centralizing the definition of complex calculations, decodes, or other application logic.

Be careful, however, when using stored functions inside SQL statements if those functions embed SQL statements. Stored functions that include SQL can often perform badly when included within standard SQL statements.