Praise for Gray Hat Hacking: The Ethical Hacker’s Handbook, Fourth Edition (2015)

PART II. From Vulnerability to Exploit

CHAPTER 15. Exploiting Web Applications

This chapter shows you advanced techniques for finding and exploiting common vulnerabilities in web applications, even with proper security controls in place. You will learn how to find design flaws in real scenarios and, more importantly, how to fix them.

In particular, this chapter covers the following topics:

• Overview of the most common web vulnerabilities in the last decade

• SQL injection via MD5 hash injection and multibyte encoding injection

• Exploiting type conversion in MySQL 5.x

• Hunting cross-site scripting (XSS)

• Unicode normalization forms attack with Fiddler2 Proxy

Overview of the Top 10 Web Vulnerabilities

In June of 2013, the Open Web Application Security Project (OWASP) released the following list of the top 10 web vulnerabilities:

• A1: Injection

• A2: Broken Authentication and Session Management

• A3: Cross-Site Scripting (XSS)

• A4: Insecure Direct Object References

• A5: Security Misconfigurations

• A6: Sensitive Data Exposure

• A7: Missing Function-Level Access Controls

• A8: Cross-Site Request Forgery (CSRF)

• A9: Using Components with Known Vulnerabilities

• A10: Unvalidated Redirects and Forwards

In order to analyze the evolution of vulnerabilities over the past 10 years, here is the OWASP top 10 list of web vulnerabilities from 2004:

• A1: Unvalidated Input

• A2: Broken Access Control

• A3: Broken Authentication and Session Management

• A4: Cross-Site Scripting (XSS)

• A5: Buffer Overflows

• A6: Injection Flaws

• A7: Improper Error Handling

• A8: Insecure Storage

• A9: Denial of Service

• A10: Insecure Configuration Management

Table 15-1 compares these two lists so we can see the vulnerabilities that have been in the top 10 for a decade.

Table 15-1 Comparison of the OWASP Top 10 Lists from 2004 and 2013

At this point, you might be wondering why we have the same vulnerabilities found 10 years ago in modern applications—especially with the current security-awareness programs and secure code reviews added to the development life cycle.

The problem commonly lies in the poor design of the applications. This chapter does not describe how the OWASP vulnerabilities work, because they have existed for a decade and therefore plenty of information is available on the Internet. Instead, this chapter provides you with real scenarios where the applications can be compromised without the need to bypass any security control but rather by taking advantage of the poor design and implementation of security controls. The examples in this chapter focus only on the 10-year-old vulnerabilities mentioned in Table 15-1.

MD5 Hash Injection

Authentication is a component of the access control mechanism responsible for making sure that only valid subjects can log on to a system. By breaking the authentication, attackers can gain unauthorized access to sensitive information such as bank accounts, social security numbers, medical records, and so on. This information can be sold in the underground, generating significant revenue for criminals, which explains why this mechanism has been a constant target for hackers for the last 10 years (refer to Table 15-1).

When dealing with authentication design, it is recommended that you store the hash of the password in the database instead of in plain text so that in case of a breach, attackers will need to reverse the hash data in order to get the plain text, which is not possible by design.

CAUTION Although it is not possible to reverse the hash, it is possible to generate the same output hash with different source data—a good example is the MD5 collision attack. It is recommended that you replace MD5 with a stronger hash such as SHA-512 to protect passwords.

In Lab 15-1, an MD5 hash is used to try to protect the users’ passwords; however, there are some flaws in the implementation that can allow an attacker to perform SQL injection to bypass the authentication.

Lab 15-1: Injecting the Hash

NOTE This lab, like all the labs, has a unique README file with instructions for setup. See Appendix A for more information.

Go to directory /GH4/15/1/ on your web root folder (check the README file for this lab) and open the login.php script. The important portions of the file are shown here:

We can see a good secure coding practice for avoiding SQL injection: mysql_real_escape_string() is used on the labeled lines of the listing. So how, then, is the injection possible?

The PHP hash() function has an option to output the message digest in raw binary format if the third parameter is set to TRUE. This is the case in our example, which uses the MD5 algorithm. The raw output stored in the variable $p can contain any character, including a single quote, which is commonly needed to perform SQL injection. To check how it works, run hash.php, which is located in the same web root folder. Its content and execution results are shown here:

You can see that the output generated some nonprintable characters, a double quote, a colon, and so on. Therefore, we need to find a combination of characters that can generate MD5 raw output with our injection string embedded that’s able to bypass the login check.
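The difference between hex and raw digest output can be reproduced outside PHP. The following is a minimal Python sketch, assuming hashlib's digest() as a stand-in for PHP's hash() with the third parameter set to TRUE; the helper names and the sample input are illustrative and do not come from hash.php.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Hex digest: 32 characters drawn only from 0-9 and a-f."""
    return hashlib.md5(data).hexdigest()

def md5_raw(data: bytes) -> bytes:
    """Raw digest: 16 bytes, each of which can take any value from 0x00 to 0xFF."""
    return hashlib.md5(data).digest()

sample = b"password"          # arbitrary sample input, not taken from the lab
print(md5_hex(sample))        # safe to paste into a quoted SQL string
print(md5_raw(sample))        # may contain quotes, backslashes, and control bytes
```

Because the hex form is restricted to sixteen harmless characters, only the raw form can smuggle a single quote into a query.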

So, what combination of characters can we use for injection? Here, the first rule is that the string should be as small as possible so it can be generated by the MD5 raw output relatively quickly; otherwise, it could take hours or even months to find a match.

One of the smallest injection strings for bypassing authentication in MySQL is ′=′, which takes advantage of how type conversion during SQL expression evaluation works. Therefore, let’s discuss this concept before brute-forcing the MD5 raw output.

Type Conversion in MySQL 5.x

You’ll be surprised at the end of this exercise when you see the weird results MySQL can produce when the type conversion feature is used.

For this exercise, let’s assume we know the username (admin) but do not know the password (of course). Therefore, if we execute the following query with the nonexistent password string1, we get no results:

Internally, MySQL is executing something like this:

Select user, pass from users where user=′admin′ and 0

NOTE MySQL does not have a proper Boolean type; instead, TRUE is equal to 1 and FALSE is equal to 0.

What we need in order to bypass authentication is to force MySQL to return 1 instead of 0 when evaluating the password. The following query will suffice for our purposes because 0=0 is TRUE and therefore would return 1, thus giving us the admin password:

So, how can we force MySQL to evaluate 0=0? Here is where type conversion comes into play. The following query will help us to achieve our requirement:

Select user, pass from users where user=′admin′ and pass=′string1′=′string2′

Here, string1 is a sequence of arbitrary characters (for example, X₁ X₂ …X_n) and string2 is also a sequence of arbitrary characters (for example, Y₁ Y₂ …Y_n).

The expression pass=′string1′=′string2′ is analyzed from left to right and therefore parsed as (pass=′string1′) = ′string2′. The expression pass=′string1′ returns 0 (because there is no password in the users table equal to ′string1′), leaving us a new expression to be evaluated: 0=′string2′. The = operator cannot compare two values of different types directly, so MySQL implicitly converts ′string2′ to a Double (so that it can be compared to 0). Because this value does not begin with a number, the conversion yields 0, and we get the final expression 0=0, which is TRUE and therefore returns 1.
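The evaluation order just described can be modeled in a few lines of Python. This is only an approximation of MySQL's behavior under stated assumptions: the string-to-number rule is reduced to "use the leading numeric prefix, otherwise 0", string comparison is treated as exact, and the function names are hypothetical.

```python
import re

# Leading numeric prefix: optional whitespace and sign, then digits with an
# optional fraction and exponent. This is a simplification of MySQL's rules.
_NUMERIC_PREFIX = re.compile(
    r"\s*[+-]?(\d+\.?\d*([eE][+-]?\d+)?|\.\d+([eE][+-]?\d+)?)"
)

def mysql_str_to_number(text: str) -> float:
    """Approximate the implicit cast of a string to a Double."""
    match = _NUMERIC_PREFIX.match(text)
    return float(match.group(0)) if match else 0.0

def password_check(stored_pass: str, string1: str, string2: str) -> int:
    """Model pass='string1'='string2' parsed as (pass='string1') = 'string2'."""
    left = 1 if stored_pass == string1 else 0      # first comparison yields 0 or 1
    right = mysql_str_to_number(string2)           # second operand is cast to a number
    return 1 if left == right else 0

print(password_check("s3cret", "string1", "string2"))  # wrong password, non-numeric right side
print(password_check("s3cret", "string1", "1abc"))     # right side casts to 1
```

The second call shows why a right-hand string that starts with a nonzero digit defeats the bypass: the final comparison becomes 0=1.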

Table 15-2 simplifies the type conversion process just explained.

Table 15-2 MySQL Type Conversion Dissected

We tested that (pass=′string1′=′string2′) is equal to 0=0, thus giving us the following query:

Select user, pass from users where user=′admin′ and 0=0

Therefore, we were able to bypass authentication without knowing the password, as expected!

In order to replicate this, connect to the gh4book database (as detailed in the README file) and execute the following command:

Now let’s look at how to generate our injection string in the next section.

MD5 Raw Output Brute Force

It is now time to brute-force MD5 raw output until it contains our injection string ′=′. We can do this by running brute.php, which is found in the repository. Here is the code:

Anomalous Sequences to Consider

Some anomalous sequences could make the previous assertion false. Here are two examples:

1. Consider the case where string2 begins with the number 1 (that is, Y₁=1). In this case, we would end up comparing something like this:

pass=′X₁ X₂ ..X_n′=′1 Y₂ ..Y_n′

Here, the MySQL CAST conversion converts the rightmost side to 1 successfully, and the final comparison would be 0=1, which turns out to be FALSE, and our attack would not be successful! Such sequences in fact exist. One of these problematic sequences is the string “abnlaw,” which generates the following pattern:

pass=′AåÛën•2′′=′1…′

2. An even more improbable, anomalous case is one where, for example, we have string2 include other characters such as the < symbol (another MySQL operator) and end up in a well-formed sequence. Let’s say that Y₁=a, Y₂=w, Y₃ =′, Y₄=<, Y₅=′ and Y₆=1 so that string2 =aw′<′1.

In this hypothetical (but possible) case, the final comparison would be

pass=′X₁ X₂..X_n′=′aw′<′1′

and that would be evaluated as follows:

However, for the purpose of this discussion (and given the improbability of these anomalous sequences actually occurring), we would assume this to be a probabilistic attack and we would dismiss such cases.

The script will try to find the source data that generates the raw output containing our injection string. After running our script, we get the source data “esvh,” which will indeed generate raw output containing our injection string ′=′. As you’ll remember, this is needed to force MySQL to perform a type conversion that allows us to bypass the authentication via a SQL injection attack.
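The search itself is a plain exhaustive loop. The sketch below is a Python analogue of the idea, not a port of brute.php: the alphabet (lowercase letters), the length limit, and the names find_candidate and is_usable are assumptions. The usability check covers only the simplest anomalous case (a digit from 1 to 9 immediately after the injection string).

```python
import hashlib
import itertools
import string
from typing import Optional

def is_usable(digest: bytes, needle: bytes) -> bool:
    """True if needle occurs and the byte right after it is not a digit 1-9."""
    pos = digest.find(needle)
    if pos < 0:
        return False
    following = digest[pos + len(needle):][:1]     # empty if needle ends the digest
    return not following.isdigit() or following == b"0"

def find_candidate(needle: bytes = b"'='", max_len: int = 5,
                   alphabet: str = string.ascii_lowercase,
                   skip: frozenset = frozenset()) -> Optional[str]:
    """Return the first source string whose raw MD5 digest embeds a usable needle."""
    for length in range(1, max_len + 1):
        for letters in itertools.product(alphabet, repeat=length):
            candidate = "".join(letters)
            if candidate in skip:                  # values already known to be invalid
                continue
            digest = hashlib.md5(candidate.encode()).digest()
            if is_usable(digest, needle):
                return candidate
    return None

if __name__ == "__main__":
    print(find_candidate())                        # may take a while in pure Python
```

Calling is_usable(hashlib.md5(b"esvh").digest(), b"'='") is a quick way to check the value reported in the text, and the skip parameter lets a rerun exclude a candidate that produced an invalid injection.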

As already explained, in order to bypass authentication, the right portion of the evaluation must start with a nonnumeric character or with the number 0. The following is an invalid injection because the second string starts with the number 1:

If you encounter this scenario, just rerun brute.php to generate a new string, but this time make sure to skip the value that generated the invalid injection:

How the SQL Injection Works

Now that the injection string “esvh” needed to bypass authentication has been identified, let’s test it:

1. Go to http://<your_ip>/GH4/15/1/access.html.

2. Enter user admin and password esvh and then click Submit to send the data to login.php, as shown here:

3. Because the password is alphabetic, it won’t be altered by mysql_real_escape_string() in the code listing for login.php.

4. The string “esvh” is converted into raw output and pasted into the SQL query, allowing us to bypass authentication, as shown here:

You can see the message “User found!!” here, which confirms we were able to bypass the authentication. The content of the raw output was intentionally printed out to show the full injection; string1 and string2 represent the left- and right-side portions of the query, respectively.

We can see in this exercise that the security controls were in place to prevent a SQL injection attack; however, placing the raw binary output of the MD5 hash into the query introduced a vulnerability to the authentication module. Actually, any of the 42 or so hashing algorithms supported by PHP (MD5, SHA256, crc32, and so on) can be exploited in the same way in a similar scenario.

NOTE The key point to keep in mind when hunting SQL injections is to analyze the input validation controls, trying to find a potential weakness.

Even when other, more secure technologies such as cryptography are used, if the implementation is wrong, input validation can be bypassed easily. One example is when implementing AES-128 with CBC (Cipher Block Chaining)¹ without a ciphertext integrity check.

From a developer’s point of view, make sure you use parameterized SQL queries (see the “For Further Reading” section) when creating queries based on user input.
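As a rough illustration of that advice, the following Python sketch uses the standard sqlite3 module as a stand-in for MySQL. The table layout, the sample password, and the function names are invented for the example. The query text contains only placeholders, and the digest is stored in hex form, so no user-controlled byte is ever parsed as SQL.

```python
import hashlib
import sqlite3

def make_db() -> sqlite3.Connection:
    """Create an in-memory database with one hypothetical account."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (user TEXT PRIMARY KEY, pass TEXT)")
    conn.execute(
        "INSERT INTO users VALUES (?, ?)",
        ("admin", hashlib.sha512(b"correct horse").hexdigest()),
    )
    return conn

def login(conn: sqlite3.Connection, user: str, password: str) -> bool:
    """Check credentials with bound parameters instead of string concatenation."""
    digest = hashlib.sha512(password.encode()).hexdigest()   # hex, never raw bytes
    row = conn.execute(
        "SELECT user FROM users WHERE user = ? AND pass = ?",
        (user, digest),
    ).fetchone()
    return row is not None

conn = make_db()
print(login(conn, "admin", "correct horse"))
print(login(conn, "admin", "' or 1=1#"))
```

The values travel to the database engine separately from the statement, so a quote inside either field is treated as data.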

Multibyte Encoding Injection

Multibyte encoding is the capability of a computer system to support a wide range of characters represented in more than one byte in order to understand different languages. So, if the system is a browser set to English only, it will need to know 128 characters from that language, but if it is set to Chinese, the browser will need to know more than 1,000 characters! In order to have multilingual systems, the UTF-8 standard was created. Here are the main points to understand about this topic:

• A language is called charset in the computer systems world.

• Encoding is the alphabet used by the charset.

• Encoding means to represent a symbol (character) with a sequence of bytes.

• More than one byte to represent a character is called multibyte encoding.

• Multibyte encoding helps to represent larger character sets, such as those for Asian languages.

• One-byte encoding can produce 256 characters.

• Two-byte encoding can produce 65,536 characters.

• The Unicode standard, with its UTF-8 implementation, is the most common multibyte encoding used nowadays.

• UTF-8 can encode all ASCII characters with one byte and uses up to four bytes for encoding other characters.

• UTF-8 allows systems from different countries with different languages (charsets) to communicate in a transparent way.
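The byte counts in the points above are easy to confirm. A small Python sketch follows; the sample characters are arbitrary choices, and encoded_length is a hypothetical helper.

```python
def encoded_length(char: str, charset: str) -> int:
    """Number of bytes needed to represent char in the given charset."""
    return len(char.encode(charset))

# ASCII letter, accented Latin letter, CJK ideograph, emoji
for char in ["A", "\u00f1", "\u4e2d", "\U0001f600"]:
    print(repr(char), encoded_length(char, "utf-8"), char.encode("utf-8").hex())

# The same ideograph takes two bytes in GBK and three in UTF-8
print(encoded_length("\u4e2d", "gbk"), "\u4e2d".encode("gbk").hex())
```

The last line is the important one for what follows: the same character has a different byte sequence in each charset, so two layers that disagree about the charset also disagree about where one character ends and the next begins.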

Multibyte encoding injection is a technique for sending language-specific malicious characters that can confuse the system in order to go undetected by security controls and enable the compromise of the applications.

But, are multilingual environments common? They definitely are, especially with globalization, where companies have facilities all around the world. These companies might have Asian, African, Spanish, and other languages enabled on all their systems at the same time.

Understanding the Vulnerability

When all the parties involved in a process speak the same language, there is no miscommunication and the possibility for errors is low. But what if one party speaks Spanish (via UTF-8 charset) and the other one speaks Chinese (via GBK charset)? In this case, there definitely could be a miscommunication that will lead to a vulnerability such as SQL injection.

CAUTION Think about this attack in real-life terms. You speak Spanish and need to explain to a Chinese person how to get to a specific address. There is a potential risk that this person will get lost trying to follow your directions. In an IT environment, you do not get lost, you get hacked!

Although this issue was explained back in 2006 by Chris Shiflett,² we will go a little bit deeper into this topic and demonstrate that even with the mysql_real_escape_string() filter, the attack is still possible. In our scenario, the attacker sends a combination of Chinese-encoded bytes to the application server, which is set to Latin and therefore not able to understand the message. However, the backend database does understand Chinese and therefore properly translates those encoded bytes—which, unfortunately, are malicious SQL commands! Let’s see how it works.

Lab 15-2: Leverage Multibyte Encoding

Let’s start by changing the character set of our Users table (from Lab 15-1) to the Chinese character set by logging into the MySQL/gh4book database and executing the following instructions:

We can confirm our change by looking at the Collation column, where gbk_chinese_ci has been set (it is the default for the GBK charset).

NOTE As you may have already identified, because we only set the “pass” field as gbk, the “user” field is not vulnerable to this attack.

The final step is to make sure the DB client is configured for Chinese by enabling the GBK charset. Otherwise, you might get a “mix of collations” error on the server side and therefore won’t be able to inject your string. This error occurs because the client might be set to Latin while trying to communicate with a Chinese DB column, thus causing an error like the one shown here:

Go to the /GH4/15/2/ directory on your web root folder (check the README file for this lab) and open the login.php script. The important portions of the file are shown here:

The gbk charset is enabled in our login.php script at the labeled line. Go to http://<your_ip>/GH4/15/2/access.html and enter any username and the classic injection string ′ or 1=1# in the password field, as shown here:

As you can see, our single quote has been properly escaped by the mysql_real_escape_string() function in login.php. Therefore, sending to MySQL the password \′ or 1=1# (which does not exist) gives us a “Login failed” response.

So now our challenge is to remove that backslash that has been added (encoded as %5c) so that our single quote is not escaped and we can perform the SQL injection. How do we do that? We need to find a way to inject into the MySQL query the following string:

%bf%5c%27 or 1=1#

This way, the multibyte %bf%5c can be translated into a valid Chinese character (because the password column has the GBK charset configured), thus removing the backslash as planned. The following steps accomplish the SQL injection:

Table 15-3 Charsets that Use the Backslash Character to Bypass Filters

1. From the browser, we want PHP to add the escape symbol, so we send the following POST request via the password field:

%bf%27 or 1=1#

2. The Apache PHP server is not set to Chinese, so it will not detect the multibyte character injected and thus will forward the same string:

%bf%27 or 1=1#

3. The PHP filter detects a single quote and escapes it with a backslash (%5c). It then sends the following escaped string to MySQL:

%bf%5c%27 or 1=1#

4. MySQL is set to Chinese and therefore translates %bf%5c into a Chinese character, removing the escape symbol (backslash). Here is the new string after translation:

5. MySQL now will process the SQL injection because the single quote is unescaped, and it retrieves the first row from the Users table, like so:

In order to replicate the attack, you will need to use a browser proxy to intercept/modify the POST request. In this example, we will use Tamper Data, which is an add-on for Firefox. Therefore, let’s resend our attack using the aforementioned adjustments, as shown here:

Finally, as expected, we are able to bypass the login mechanism again:

The question mark shown in the response represents a Chinese character not translated by the browser; however, because our single quote was not escaped, the first user in the table was retrieved successfully!
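The byte-level effect of the escaping and translation steps can be reproduced in Python. This sketch assumes a hypothetical naive_escape function that works byte by byte with no knowledge of the charset (standing in for the PHP filter in this misconfigured setup) and uses Python's gbk and latin-1 codecs to model how each layer reads the same bytes.

```python
def naive_escape(raw: bytes) -> bytes:
    """Prefix quotes and backslashes with a backslash, one byte at a time."""
    out = bytearray()
    for byte in raw:
        if byte in (0x27, 0x22, 0x5C):    # single quote, double quote, backslash
            out.append(0x5C)
        out.append(byte)
    return bytes(out)

payload = b"\xbf' or 1=1#"                # %bf%27 or 1=1#
escaped = naive_escape(payload)           # %bf%5c%27 or 1=1#

as_latin1 = escaped.decode("latin-1")     # how a Latin layer reads the bytes
as_gbk = escaped.decode("gbk")            # how a GBK layer reads the same bytes

print(escaped)
print(repr(as_latin1))                    # the backslash still guards the quote
print(repr(as_gbk))                       # 0xbf 0x5c merged into one character
```

Read as Latin, the quote is still escaped; read as GBK, the backslash has been absorbed into a two-byte character and the quote stands alone.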

In Table 15-3, you will find examples of other charsets that can be used to bypass input validation controls (using the backslash character when escaping) via multibyte injection.

NOTE The charsets mentioned in Table 15-3 can be supported by other databases and web and application servers that might also be targets for multibyte injection.

In our exercise, if we had configured login.php to understand Chinese characters by encoding the input with the GBK charset using

then it would have treated the multibyte %bf%27 as an invalid Chinese character (remember that this sequence does not exist in the GBK charset) and the engine would return the question mark symbol, removing the single quote and thus preventing the injection, as shown in the following illustration (consult the previous illustration showing the Tamper Data add-on to replicate the injection):

Therefore, it’s a good idea in our defense-in-depth approach to prevent multibyte injection by configuring all the application layers (DB, app server, clients, and so on) with the same charset so that they communicate in the same language. The most recommended charset nowadays is UTF-8.
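One way to enforce a single charset at the application boundary is to reject any input that is not a valid byte sequence in that charset. The Python sketch below shows the idea; is_valid_in_charset is a hypothetical helper, and rejecting the input outright is a design choice that differs from the substitution behavior described above.

```python
def is_valid_in_charset(raw: bytes, charset: str) -> bool:
    """True only if every byte sequence in raw is well formed for charset."""
    try:
        raw.decode(charset)               # strict decoding raises on malformed input
    except UnicodeDecodeError:
        return False
    return True

attack = b"\xbf' or 1=1#"                 # 0xbf 0x27 is not a valid GBK pair
print(is_valid_in_charset(attack, "gbk"))
print(is_valid_in_charset(attack, "utf-8"))
print(is_valid_in_charset("\u4e2d".encode("gbk"), "gbk"))
```

Validating before escaping means the filter and the database agree on character boundaries, which is the property the attack depends on breaking.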

As a penetration tester, you should test all the different charsets mentioned in Table 15-3 as part of your automated test cases.

Hunting Cross-site Scripting (XSS)

If you’re not familiar with XSS attacks, make sure you read the OWASP article “Cross-site Scripting (XSS)” at http://tinyurl.com/3hl5rxt. Here are the main points you need to know about XSS:

• XSS is a client-side attack executed in the browser.

• JavaScript and VBScript are the main languages used in this attack.

• XSS is prevented by implementing proper output validation.

Nowadays it’s difficult to find XSS vulnerabilities, even if the developer did not implement any output validation, because the browsers have built-in protection for this attack.

When hunting XSS vulnerabilities, the first step is to identify the input fields (cookies, headers, forms, and so on) in the web application that will send back to the browser the data entered in those fields, either immediately (reflected XSS) or later after a specific query (stored XSS). Here are some common scenarios where XSS can be found:

• Search fields The search term entered will be reflected in the response (for example, “The name <search-term-you-entered> was not found”).

• Contact forms This is where most XSS is found. Usually, if the user enters an invalid value in the form, such as a wrong email address, date, and so on, the error is detected and all the information entered will be sent back, filling out the contact form automatically so that the user only needs to fix the appropriate field. Attackers will take advantage of this behavior by purposely entering a wrong email address, for example, and the injection in another field will be executed while the contact form is being filled out again in the browser.

• Error messages Many XSS bugs have been found in the error messages returned by applications such as Apache, .NET, Java, PHP, Perl, and more. This usually occurs when a wrong URI, an invalid filename, or an invalid data format is entered.

• HTML links The data entered in the input fields is used to generate dynamic HTML links in the response.

• Injection in JavaScript blocks This scenario occurs when the application creates JavaScript code based on the data entered by the users. Such scenarios include showing a pop-up message with the action performed, filling out HTML elements dynamically, and creating DOM elements such as a list of states based on the country selected.

Injecting malicious code into JavaScript blocks can help you easily bypass the browser’s protection, so let’s see how it works.

Lab 15-3: Basic XSS Injection into a JavaScript Block

The js.php script from Lab 15-3 fills out a textarea based on the info received from a ‘data’ GET parameter:

This gives us the following result:

But, as we can see in the js.php source code, the input received is inserted into a JavaScript block. Therefore, in order to perform an XSS attack, we can send the following string in the ‘data’ parameter:

mitnick′;alert(′XSS HERE!!′);var c=′

Here is the source of the browser page after we send the malicious string (underlined):

Here we have a single quote and semicolon to complete the var a= instruction, the malicious code to execute, and the extra code to close the remaining single quote to avoid a syntax error that could prevent the XSS execution. This gives us the following alert message shown in the browser:

CAUTION In this attack, there was no need to insert the <SCRIPT> tag to successfully execute the XSS attack. We can easily bypass weak output validation if it only relies on filtering out this JavaScript tag. As long as the single quote is not filtered out, the attack is possible.
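The mechanics can be sketched in Python. The template below is hypothetical: it only assumes, as the text describes, that the parameter ends up inside a single-quoted JavaScript string assigned to a variable named a. The safer variant serializes the value as a JSON string literal and escapes the characters that could close the surrounding script element; it is one possible mitigation, not the lab's fix.

```python
import json

def render_unsafe(data: str) -> str:
    """Concatenate user input straight into a single-quoted JavaScript string."""
    return "<script>var a='" + data + "';</script>"

def render_safer(data: str) -> str:
    """Emit the value as a JSON string literal, with HTML-significant characters escaped."""
    literal = (
        json.dumps(data)
        .replace("<", "\\u003c")
        .replace(">", "\\u003e")
        .replace("&", "\\u0026")
    )
    return "<script>var a=" + literal + ";</script>"

payload = "mitnick';alert('XSS HERE!!');var c='"
print(render_unsafe(payload))   # three statements: assignment, alert, assignment
print(render_safer(payload))    # one assignment of a harmless string
```

In the unsafe output the payload's first quote terminates the string, so the alert call becomes a statement of its own. In the safer output the whole payload stays inside one double-quoted literal.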

Audit your source code and make sure you detect all the inputs received that are being sent back to the browser, and make sure there is proper output HTML encoding. The best approach to accomplish this task is to use automated source code review tools such as IBM Security AppScan Source, which is very good at detecting these potential bugs. Basically, the tool will trace all the inputs and then detect the ones going back to the browser:

AppScan Source will realize the $user variable is being sent back to the browser without being properly encoded and will flag this variable as being “XSS vulnerable.” The tool is very powerful because it will also detect stored XSS by making sure that all the data being retrieved from the database, configuration files, or session context that is going back to the browser is properly encoded.

Unicode Normalization Forms Attack

Nowadays if a good XSS filter is implemented, it is really hard to successfully perform XSS. You can find multiple ways to bypass filters by looking at the OWASP Filter Evasion Cheat Sheet inspired by RSnake’s work. This cheat sheet can be found here:

https://www.owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet

The OWASP XSS Filter Evasion Cheat Sheet assumes the single quote, greater-than, and less-than symbols are not being filtered out, which is a very uncommon scenario nowadays. Therefore, you will sadly realize that a basic output HTML encoding will stop all those attacks, as implemented in the transforme.php script found at Lab 15-4.

Lab 15-4: Leveraging Unicode Normalization

Before learning how Unicode normalization works, let’s see how a common application with well-known filters like htmlspecialchars() helps to prevent most cross-site scripting attacks:

Here are the translations performed by htmlspecialchars(), per the PHP site:

• & (ampersand) becomes &amp;.

• " (double quote) becomes &quot; when ENT_NOQUOTES is not set.

• ' (single quote) becomes &#039; (or &apos;) only when ENT_QUOTES is set.

• < (less than) becomes &lt;.

• > (greater than) becomes &gt;.

You can see here that the string ′><SCRIPT>alert(1)</SCRIPT> sent to transforme.php was properly encoded in the response:

With this simple filter, almost all the attacks explained in the XSS Filter Evasion Cheat Sheet, with the exception of the US-ASCII encoding attack (applicable in specific scenarios and with Apache Tomcat only), are useless.
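The same kind of output encoding is available in most languages. As a point of comparison, here is a short Python sketch using the standard html.escape function, which plays a role similar to htmlspecialchars() with ENT_QUOTES. Note that Python emits &#x27; for the single quote rather than &#039;. The wrapper name encode_for_html is an assumption.

```python
import html

def encode_for_html(value: str) -> str:
    """Encode &, <, >, and both quote characters for safe placement in HTML."""
    return html.escape(value, quote=True)

payload = "'><SCRIPT>alert(1)</SCRIPT>"
print(encode_for_html(payload))
print(html.escape(payload, quote=False))   # quotes left untouched
```

With quote=False the leading single quote survives, which is the same gap that opens in PHP when ENT_QUOTES is not set.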

But wait, the script transforme.php uses normalization, so we still have a chance to bypass the XSS filter.

Unicode Normalization Introduction

Per Unicode.org, Unicode Normalization Forms are defined as follows: “When implementations keep strings in a normalized form, they can be assured that equivalent strings have a unique binary representation.”³ The Unicode standard defines two types of equivalence between characters: canonical equivalence and compatibility equivalence.

Canonical equivalence is a fundamental equivalency between characters or sequences of characters that represent the same abstract character, and when correctly displayed should always have the same visual appearance and behavior. Compatibility equivalence is a weaker equivalence between characters or sequences of characters that represent the same abstract character, but may have a different visual appearance or behavior.

Based on this explanation, let’s look at how canonical equivalence works. Table 15-4 shows that in UTF-8 the letter A can be represented in different ways, depending on country, language, and purpose.

Table 15-4 UTF-8 Table with Different Representations of the Letter A

Therefore, canonical equivalency normalization means that every time one of these versions of the letter A is entered into your system, you will always treat it as the common Latin capital letter A, which definitely is helpful in the following scenarios:

• The text needs to be compared for sorting.

• The text needs to be compared for searching.

• Consistent storage representation is required, such as for unique usernames.

But normalization can also introduce vulnerabilities such as account hijacking, as has been detailed at the Spotify Labs website.⁴

So, how does this help us perform our XSS attack? What if we send multibyte UTF-8 characters that are not filtered by htmlspecialchars() but are normalized by the application into our malicious character? For example, as you saw earlier, the single quote character (encoded as %27) will be filtered as &#039;, but we know that the single quote has other UTF-8 representations, such as ef bc 87, as shown here:

The PHP filter won’t recognize this UTF-8 combination as malicious and will therefore not filter it, thus allowing normalization to do its job on the next line of transforme.php and sending us back in the browser the unfiltered single quote! Got it? If not, don’t worry. We’ll discuss this process in detail in the next section.
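The ordering problem can be demonstrated with Python's standard library. This sketch assumes NFKC normalization and uses html.escape in place of htmlspecialchars(); the two function names are hypothetical and only contrast the two possible orders of operations.

```python
import html
import unicodedata

def escape_then_normalize(value: str) -> str:
    """Vulnerable order: the filter runs before the dangerous character exists."""
    return unicodedata.normalize("NFKC", html.escape(value, quote=True))

def normalize_then_escape(value: str) -> str:
    """Safer order: normalize first so the filter sees the final characters."""
    return html.escape(unicodedata.normalize("NFKC", value), quote=True)

fullwidth_quote = "\uff07"                 # FULLWIDTH APOSTROPHE, UTF-8 ef bc 87
print(fullwidth_quote.encode("utf-8").hex())
print(escape_then_normalize(fullwidth_quote))
print(normalize_then_escape(fullwidth_quote))
```

The first function returns a bare single quote, while the second returns its encoded form. Applying the filter after every transformation of the data closes the gap.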

Normalization Forms

Normalization Forms are four algorithms that determine whether any two Unicode strings are equivalent to each other (see Table 15-5).

Table 15-5 Normalization Forms (Unicode.org)

All these algorithms are idempotent transformations, meaning that a string that is already in one of these normalized forms will not be modified if processed again by the same algorithm.

You may have already noticed that our transforme.php script uses the NFKC algorithm, where characters are decomposed by compatibility and then recomposed by canonical equivalence.
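The four forms and the idempotence property can be inspected directly with Python's unicodedata module. The sample characters are arbitrary choices: U+212B (ANGSTROM SIGN) has a canonical equivalent, while U+FB01 (the "fi" ligature) has only a compatibility equivalent.

```python
import unicodedata

FORMS = ("NFD", "NFC", "NFKD", "NFKC")

def all_forms(text: str) -> dict:
    """Return text normalized under each of the four Normalization Forms."""
    return {form: unicodedata.normalize(form, text) for form in FORMS}

def code_points(text: str) -> str:
    """Render a string as a list of U+XXXX code points for easy comparison."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)

for sample in ("\u212b", "\ufb01"):
    for form, result in all_forms(sample).items():
        print(code_points(sample), form, code_points(result))
```

Only the two compatibility forms (NFKD and NFKC) change the ligature, which is why a compatibility form is the interesting one from an attacker's point of view.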

You can get all details and examples of Normalization Forms at Unicode.org.

Preparing the Environment for Testing

Install Fiddler2 Proxy and the x5s plug-in as described in the README file for Lab 15-4 in the repository.

Fiddler2 is a free, powerful HTTP proxy that runs on Windows and has multiple features to help in testing web applications. We are going to use the x5s plug-in created by Casaba Security, LLC, which describes this tool as follows:

x5s is a plugin for the free Fiddler HTTP proxy that actively injects tiny test cases into every user-controlled input of a Web-application in order to elicit and identify encoding issues that could lead to XSS vulnerability.

The x5s plug-in is pretty easy to configure; the steps for doing so can be found at its website.⁵ Basically, you need to manually crawl the web application so that Fiddler Proxy can identify potential input fields to be tested. This information is used by the x5s plug-in to inject its own test cases, trying to find XSS vulnerabilities.

Following are the steps to start hunting XSS via the x5s plug-in:

1. Start Fiddler2.

2. Go to the x5s tab and enable basic configuration, as shown here:

3. Go to the Test Case Configuration tab and enable just one test case (the one with code point U+FF1C), as shown:

This is the core functionality of the plug-in. As explained previously, the tool will inject specific UTF-8-encoded characters (shown in the Source/Test-case column) that it expects to be transformed into specific characters (shown in the Target column) by the application, thus helping us to bypass filters and perform our XSS attack.

XSS Testing via the x5s Plug-In

Browse to transforme.php, enter any data in the input field (as shown next), and click Submit so that it can be detected by Fiddler and so that x5s can do its magic:

Right after clicking the Submit button, go to the Results tab for the x5s plug-in and review the response (shown here):

Notice that the Transformation column reads “Transformed,” which means that the injected code point U+FF1C was transformed (thanks to normalization) to U+003C.

NOTE Although the U+FF1C code point is displayed, internally x5s is sending its UTF-8 encoded value (in this case, %ef%bc%9c).

Based on the transforme.php code, we were able to bypass the htmlspecialchars() function because it is receiving a UTF-8 value (%ef%bc%9c) that is not in the list of characters to be filtered out; then, normalization is applied to the string, which transforms the injection into the less-than character.

Launching the Attack Manually

Now that we know the application is using normalization, we can prepare an attack to successfully execute XSS, because our injected value will be placed in the value parameter of the input text field. Here is the classic example for injecting XSS into HTML forms using the single quote character to modify the form element:

<input type='text' name='data' value=' ' onMouseOver=alert(111) a='

As explained before, the UTF-8 representation of our malicious single quote is ef bc 87, so we will inject %ef%bc%87 in order to transform it to %27. Our final encoded string looks like this:

%ef%bc%87%20onMouseOver%3dalert(111)%20a%3d%ef%bc%87
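As a cross-check, the encoded string can be rebuilt and decoded with a short Python sketch. It assumes NFKC as the server-side normalization form, and the surrounding input tag is taken from the example above rather than from the actual page source; the variable names are illustrative.

```python
import unicodedata
from urllib.parse import quote, unquote

FULLWIDTH_APOSTROPHE = "\uff07"  # UTF-8 bytes: ef bc 87

# The raw injection: fullwidth quotes wrapped around the event handler.
raw = f"{FULLWIDTH_APOSTROPHE} onMouseOver=alert(111) a={FULLWIDTH_APOSTROPHE}"

# Percent-encode everything except the parentheses.
encoded = quote(raw, safe="()")
print(encoded)  # %EF%BC%87%20onMouseOver%3Dalert(111)%20a%3D%EF%BC%87

# The lowercase string from the text decodes to the same raw value.
book_string = "%ef%bc%87%20onMouseOver%3dalert(111)%20a%3d%ef%bc%87"
assert unquote(book_string) == raw

# Assumption: the server normalizes with NFKC after filtering.
normalized = unicodedata.normalize("NFKC", raw)
print(f"<input type='text' name='data' value='{normalized}'>")
# <input type='text' name='data' value='' onMouseOver=alert(111) a=''>
```

Percent-encoding is case-insensitive in the hex digits, so the uppercase output of quote() and the lowercase string shown above are equivalent on the wire.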

So, let’s send the malicious string to the transforme.php script, as shown next:

Here, we check the page source from the browser:

We can see our injection string was able to bypass the filter and therefore was able to alter the HTML response! As a result, if you move your mouse over the input text, you’ll see that the XSS was successfully executed, as shown here:

NOTE Although we used the x5s plug-in to find XSS vulnerabilities, it can also be used to test for SQL injection; just make sure to review all the responses thoroughly when looking for a SQL syntax error.

Adding Your Own Test Case

Now that we have a new test case, we can easily add it to the x5s plug-in. In order to add your own code point to be injected, you need to edit the ShortMappingList.xml file located in the default Scripts directory where x5s was installed:

%USERPROFILE%\Documents\Fiddler2\Scripts\

Just add a new UnicodeTestMapping node with its own description, as shown here:

CAUTION Do not add the new description as the first XML node in the configuration file because, for some reason, Fiddler will fail to load the plug-in.

The most important options here are the Target and Source code points. After saving the file, restart Fiddler, go to the Test Case Configuration tab, and you will see that “My first Transformable Test Case” has been added (as shown next) and is ready to be injected for your next pen testing efforts:

When performing black box testing, it’s difficult to identify how the applications are configured; for this reason, it’s imperative that you test all the different cases in order to identify an attack vector. This testing must be automated using a tool such as Fiddler. Once a potential vulnerability has been identified, try to exploit it by testing it manually. Finally, add any new test cases you identify to your automated system, as we did with the x5s plug-in, so that with each new effort, your testing capabilities become stronger and broader.
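One way to look for new Source/Target pairs worth adding is to search the Unicode range for code points that normalize to a character you care about. The following Python sketch does that under the assumption that the target application uses NFKC; the function name is hypothetical, and the results depend on the Unicode database bundled with your Python version.

```python
import sys
import unicodedata

def transformable_sources(target, form="NFKC"):
    """Return every code point, other than target, that normalizes to it."""
    hits = []
    for cp in range(sys.maxunicode + 1):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # skip surrogates, which are not valid characters
        ch = chr(cp)
        if ch != target and unicodedata.normalize(form, ch) == target:
            hits.append(ch)
    return hits

# Candidate Source code points for the single quote and less-than targets.
for target in ("'", "<"):
    for ch in transformable_sources(target):
        utf8 = ch.encode("utf-8").hex(" ")
        print(f"U+{ord(ch):04X} ({utf8}) -> U+{ord(target):04X}")
```

Each line printed is a candidate mapping; whether a given application actually performs the transformation still has to be confirmed by injecting the code point and reviewing the response.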

Summary

Here’s a rundown of what you learned in this chapter:

• How to perform SQL injection attacks by taking advantage of poor authentication implementations via hashing algorithms such as MD5.

• The importance of making sure systems are configured to recognize (“speak”) the same language to avoid multibyte injection attacks.

• How to recognize scenarios where you can force your XSS attacks to succeed.

• The importance of Unicode normalization and how it can be exploited.

• That even applications with proper security controls in place can be attacked successfully due to a misconfiguration.

• How to identify (and attack) the security controls needed to protect your applications.

References

1. Regalado, Daniel (2013, September 6). CBC Byte Flipping Attack – 101 Approach. Retrieved from Regalado (In) Security: danuxx.blogspot.com/2013/09/cbc-byte-flipping-attack-101-approach.html.

2. Shiflett, Chris (2006, January 6). addslashes() Versus mysql_real_escape_string(). Retrieved from Shiflett.org: shiflett.org/blog/2006/jan/addslashes-versus-mysql-real-escape-string.

3. Davis, Mark, and Ken Whistler (2014, June 5). Unicode Normalization Forms. Retrieved from Unicode Technical Reports: unicode.org/reports/tr15/.

4. Goldman, Mikael (2013, June 18). Creative Usernames and Spotify Account Hijacking. Retrieved from Spotify Labs: labs.spotify.com/2013/06/18/creative-usernames/.

5. Hernandez, John (2009/2010). x5s - automated XSS testing assistant. Casaba Security. Retrieved from CodePlex: xss.codeplex.com/documentation?referringTitle=Home.

For Further Reading

List of charset code tables www.fileformat.info/info/charset/index.htm.

Simplified Chinese GBK msdn.microsoft.com/en-US/goglobal/cc305153.aspx.

UTF-8 encoding table www.utf8-chartable.de/unicode-utf8-table.pl.

OWASP top 10 list released June of 2013 www.owasp.org/index.php/Category:OWASP_Top_Ten_Project.

OWASP top 10 list released in 2004 www.owasp.org/index.php/2004_Updates_OWASP_Top_Ten_Project.

OWASP parameterized SQL queries www.owasp.org/index.php/Query_Parameterization_Cheat_Sheet.