CORS in Action: Creating and consuming cross-origin APIs (2015)

Appendix C. What is CSRF?

Chapter 6 introduced the concept of cross-site request forgery (CSRF). This appendix takes a closer look at CSRF.

C.1. What is CSRF?

Let’s step out of the CORS mindset for a bit and talk about regular, old same-origin requests. Cookies are always included on same-origin requests, regardless of how that request was initiated. If you’re logged in to www.twitter.com, any time your browser navigates to a www.twitter.com site, the cookies will be included in the request. It doesn’t matter where the request originates: you can visit www.twitter.com directly or click a link to go to www.twitter.com. Even if a page merely links to an image hosted on www.twitter.com, the request for that image will include your cookies. You have no control over this behavior. If your browser has cookies associated with a site, they’re always included on the request.

Suppose a hacker creates a page that adds a new tweet to Twitter. Whenever someone visits this site, it sends a request to Twitter to create a tweet that says, “I have hacked your site!” (see figure C.1). If the hacker can somehow trick you into visiting his page, the tweet will be added to your own Twitter feed!

Figure C.1. CSRF exists because cookies are always included on requests, regardless of where the request comes from. Luckily, Twitter protects itself from CSRF with an authenticity_token.

This is at the heart of CSRF: an unauthorized site makes a request on your behalf using your cookies.

Note

We often think of hacking in terms of a hacker gaining access to your data. But in the case of CSRF, the hacker can do damage without ever reading your data. Actions with side effects such as adding a tweet or changing a password can have devastating consequences, without ever compromising your data.

Of course, Twitter has taken steps to guard against this issue. Figure C.2 shows an actual request to Twitter to create a new tweet.

Figure C.2. Sending a new tweet request to Twitter. The authenticity_token guards against CSRF.

Along with things like the text of the tweet, the request includes an authenticity_token. This token is an encrypted string that Twitter uses to verify that the request is coming from Twitter’s own servers, and not from someone else. If Twitter receives a request without this authenticity_token (or with an invalid token), the request will fail.

Twitter’s authenticity_token is an example of a CSRF token. A CSRF token is a server-generated, cryptographically secure token that’s included on requests to verify that the request comes from a trusted server. It’s similar to the Origin header in that it helps validate where the request originates, but because the CSRF token is cryptographically secure, it can’t be generated by anyone but Twitter. This ensures that a request to create a new tweet comes only from Twitter’s own web page.

Figure C.3 shows the lifecycle of Twitter’s authenticity_token. When you first make a request to Twitter, its server generates a unique authenticity_token and includes it as a hidden form field in the HTML response. Next, when you compose a new tweet and click the Tweet button, the text of the new tweet and the authenticity_token are sent to Twitter’s servers. Finally, Twitter’s servers compare the authenticity_token against the expected value. If they match, the new tweet is created; otherwise, the request is rejected.

Figure C.3. How Twitter uses the authenticity_token field to guard against CSRF

CSRF protection works because it introduces an “active” form of protection (the CSRF token) to a “passive” form of protection (the cookie). By passive, I mean that the browser will always include the cookie on requests, without looking at where the request comes from. The CSRF token fills this hole by serving as a marker that indicates where the request is coming from. It’s active because the client making the request must manually add the token to the request. There is no way for the browser to automatically add the CSRF token to the request, or to even know what the value of the CSRF token is.
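The generate, embed, and compare lifecycle described above can be sketched in a few lines of Node.js. This is only an illustration under stated assumptions, not Twitter's implementation: the in-memory `expectedTokens` store, the `renderForm` and `isValidSubmission` helpers, the session IDs, and the `/tweet` action are all hypothetical names introduced here.

```javascript
const crypto = require('crypto');

// Hypothetical in-memory store: session ID -> token the server expects back.
const expectedTokens = new Map();

// Step 1: when serving the page, generate a token and embed it as a hidden field.
function renderForm(sessionId) {
  const token = crypto.randomBytes(32).toString('hex');
  expectedTokens.set(sessionId, token);
  return '<form method="POST" action="/tweet">' +
    '<input type="hidden" name="authenticity_token" value="' + token + '">' +
    '<input type="text" name="status">' +
    '</form>';
}

// Steps 2 and 3: the form submission carries the token back, and the server
// compares it against the value it expects for that session.
function isValidSubmission(sessionId, submittedToken) {
  const expected = expectedTokens.get(sessionId);
  if (typeof expected !== 'string' || typeof submittedToken !== 'string') {
    return false;
  }
  const a = Buffer.from(expected);
  const b = Buffer.from(submittedToken);
  // timingSafeEqual requires equal lengths, so check that first.
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}
```

A forged page can cause the browser to send the victim's cookies, but it has no way to read the token out of the legitimate page, so its submission fails the comparison.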

What is in a CSRF token?

We’ve talked about validating the CSRF token in abstract terms, but what exactly is inside the CSRF token that needs to be validated? Different servers implement CSRF tokens differently. The CSRF token in Express (from the CSURF package from https://github.com/expressjs/csurf) looks like this:

CSRF token = salt + crypto(salt + secret)

Here is what each of those pieces means:

· Secret— The secret is a per-server secret value. This can be set by the user in the session (which is important for coordinating secrets across servers, as you’ll see later on), otherwise it will be randomly generated.

· Salt— The salt is another random value. But unlike the secret, the user cannot choose its value. The salt also has a fixed number of characters; at the time of this writing, Express’s salt has 10 characters (the number of characters in the salt comes into play when validating the token).

· Crypto— The crypto function hashes the salt plus the secret using SHA1, and then base64 encodes the result. Hashing is a one-way operation that can’t be reversed or decrypted.

Finally, the plain salt value is prepended to the hashed value, and the result is the CSRF token.

Let’s look at an example of how to calculate the CSRF token. Suppose the server’s secret is SECRET, and the salt is 0123456789. The first step is to hash the value 0123456789SECRET (the salt plus the secret). Suppose the result of this hash is ABCDEF (the hash value will look completely different from the secret and the salt). Finally, the salt is prepended to the hashed value, which gives 0123456789ABCDEF. This is the CSRF token.
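The calculation can be tried out with Node's built-in crypto module. The sketch below follows the formula as given in this sidebar (salt + crypto(salt + secret)), using the SECRET and 0123456789 values from the example; it is not the library's actual source, and the `generateSalt` and `createToken` helper names are assumptions made for illustration.

```javascript
const crypto = require('crypto');

// Hypothetical helper: a random salt of the requested length (hex characters).
function generateSalt(length) {
  return crypto.randomBytes(length).toString('hex').slice(0, length);
}

// crypto(salt + secret): SHA-1 hash of the concatenation, base64 encoded.
// The plain salt is then prepended to form the token.
function createToken(secret, salt) {
  const hash = crypto.createHash('sha1').update(salt + secret).digest('base64');
  return salt + hash;
}

// A fixed salt is used here so the example is repeatable; a real token
// would use a fresh random salt, e.g. generateSalt(10).
const token = createToken('SECRET', '0123456789');
console.log(token.slice(0, 10)); // 0123456789 (the salt, readable as-is)
console.log(token.length);       // 38 (10 salt characters + 28 base64 characters)
```

Because the salt changes from token to token, the same secret produces a different token each time, yet every one of them can still be checked against that secret.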

When validating the token, the server doesn’t decrypt the token and look at each part. (In fact, it can’t decrypt the token, because hashing the token is a one-way operation that can’t be reversed.) Instead, it looks at the incoming request CSRF token and grabs the first 10 characters. This is the salt value for this token. It then combines the salt value with the server secret to generate another token (using the same equation just noted). If the newly generated token matches the request CSRF token, the request is valid.

Turning again to the example, when the server receives the CSRF token 0123456789ABCDEF, it first strips off the first 10 characters to get the salt, which is 0123456789. Next, it runs the salt through the same equation. If this new value matches the CSRF token from the request, the request is valid.
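The validation steps can be sketched the same way: take the first 10 characters as the salt, recompute the token with the server secret, and compare. As before, this follows the formula in this sidebar rather than any library's source; `createToken` and `verifyToken` are hypothetical helper names, and the constant-time comparison is a choice made for this sketch.

```javascript
const crypto = require('crypto');

const SALT_LENGTH = 10; // salt length stated in the text

// Same equation as before: salt + base64(SHA-1(salt + secret)).
function createToken(secret, salt) {
  const hash = crypto.createHash('sha1').update(salt + secret).digest('base64');
  return salt + hash;
}

// Recompute the token from the salt embedded in the incoming token and
// check that the result matches what the request sent.
function verifyToken(secret, token) {
  if (typeof token !== 'string' || token.length <= SALT_LENGTH) {
    return false;
  }
  const salt = token.slice(0, SALT_LENGTH);
  const expected = Buffer.from(createToken(secret, salt));
  const actual = Buffer.from(token);
  return expected.length === actual.length &&
    crypto.timingSafeEqual(expected, actual);
}
```

Nothing is ever decrypted here. A token made up by someone who doesn't know the secret simply fails to match the recomputed value.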

Based on this explanation, the CSRF token may sound very similar to the Origin header. After all, they both describe where a request originates. However, there are some key differences, as summarized in table C.1. These differences make it a good idea to use a CSRF token even if an Origin header is available. CSRF protection is especially important for simple requests, where there is no preflight to protect the server from invalid requests.

Table C.1. Differences between the Origin header and CSRF token

Origin header	CSRF token
Set by the browser	Set by the server
Value is in plain text	Value is encrypted
Can be guessed (and spoofed using tools like curl)	Cannot be guessed or spoofed
Only present on cross-origin requests (although some browsers, such as Chrome and Safari, include Origin headers on some same-origin requests)	Present on cross-origin and same-origin requests
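To make the last row of the table concrete, here is a small sketch of an Origin check. The `TRUSTED_ORIGINS` allowlist and the `checkOrigin` helper are hypothetical; the `headers` object is assumed to use lowercase header names, as Node's HTTP server provides them.

```javascript
// Hypothetical allowlist of origins this server trusts.
const TRUSTED_ORIGINS = ['https://www.example.com'];

// Classifies a request by its Origin header.
function checkOrigin(headers) {
  const origin = headers.origin;
  if (origin === undefined) {
    // Common on same-origin requests: the header gives no information here.
    return 'absent';
  }
  return TRUSTED_ORIGINS.includes(origin) ? 'trusted' : 'untrusted';
}

console.log(checkOrigin({ origin: 'https://www.example.com' })); // trusted
console.log(checkOrigin({ origin: 'https://evil.example' }));    // untrusted
console.log(checkOrigin({}));                                    // absent
```

The 'absent' case is why an Origin check alone isn't enough: the server still needs a CSRF token to decide whether such a request is legitimate.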

The next section looks at how to implement CSRF protection for same-origin requests. The techniques here don’t apply to CORS, but they’re useful for getting an understanding of how CSRF protection works.

C.2. Implementing CSRF protection for same-origin requests

It may be easier to understand CSRF tokens by looking at a new example that isolates the core concepts of CSRF. Listing C.1 shows a simple web server that implements a CSRF token. Note that this is new server code, so you should put this in a new app.js file (but running it is the same as before—just run node app.js). The Express framework has middleware support for CSRF tokens. You can install this middleware (and its dependencies) by running the following command:

npm install express body-parser cookie-parser express-session csurf

Listing C.1. Example of implementing CSRF protection

Once the code is set up and running, you can visit the page at http://localhost:2468/csrftest. This page displays an input box containing the CSRF token itself, along with a Submit button, as shown in figure C.4.

Figure C.4. CSRF token test page

The HTML source of this page is shown in figure C.5. The page is a simple web form with a text box and submit box. The text box is named _csrf, and contains the value of the CSRF token.

Figure C.5. HTML source for the CSRF sample

Clicking the Submit button will send a POST request to the server with the CSRF token in the POST body. In this sample, the CSRF token is included as part of the POST body, but it can be included anywhere in the request, including in the request URL or as a request header. The server reads the value of the CSRF token from the _csrf field, and checks that it’s a valid value. If the token is valid, the server responds with a successful message.
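A server that accepts the token from several places needs a lookup order. The sketch below shows one possible order, assuming the request has already been parsed into `body`, `query`, and `headers` objects the way Express-style middleware would do it. The `_csrf` field name comes from the sample; the `readCsrfToken` helper and the `x-csrf-token` header name are assumptions chosen for illustration.

```javascript
// Looks for the token in the POST body, then the query string, then a header.
// Returns null when no token is present anywhere.
function readCsrfToken(req) {
  return (req.body && req.body._csrf) ||
    (req.query && req.query._csrf) ||
    (req.headers && req.headers['x-csrf-token']) ||
    null;
}

console.log(readCsrfToken({ body: { _csrf: 'from-body' } }));               // from-body
console.log(readCsrfToken({ query: { _csrf: 'from-url' } }));               // from-url
console.log(readCsrfToken({ headers: { 'x-csrf-token': 'from-header' } })); // from-header
console.log(readCsrfToken({ body: {} }));                                   // null
```

Sending the token in a header is the usual choice for requests made from JavaScript, where there is no form field to carry it.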

Figure C.6 shows the lifecycle of the CSRF token. First, the server generates the CSRF token, which is embedded somewhere in the client’s HTML. When the user performs an action, such as clicking the Submit button, the request includes the CSRF token. Finally, the server reads the CSRF token and checks whether or not it’s valid.

Figure C.6. Lifecycle of a CSRF token

The sample also lets you change the value of the CSRF token by typing in the text box. If you navigate back to the form, edit the CSRF token to something new, and click the Submit button again, you should see an error message with an HTTP status 403. Figure C.7 shows both the success and error messages.

Figure C.7. Sample app with valid (left) and invalid (right) CSRF tokens

As you can see from this example, the Express middleware takes care of the details of implementing the CSRF token.
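To get a feel for what such middleware might be doing under the hood, here is a dependency-free sketch of the same flow using only Node's built-in http and crypto modules. It is not Listing C.1 and does not use the csurf package. The `sid` cookie, the in-memory `secrets` store, the response messages, and all helper names are assumptions made for illustration, and the token follows the salt-plus-hash formula described earlier in this appendix.

```javascript
const crypto = require('crypto');
const http = require('http');

const SALT_LENGTH = 10;
const secrets = new Map(); // hypothetical store: session ID -> secret

function tokenFor(secret, salt) {
  return salt + crypto.createHash('sha1').update(salt + secret).digest('base64');
}

function isValid(secret, token) {
  if (typeof secret !== 'string' || typeof token !== 'string') return false;
  if (token.length <= SALT_LENGTH) return false;
  return tokenFor(secret, token.slice(0, SALT_LENGTH)) === token;
}

// Reads the hypothetical "sid" session cookie from a Cookie header.
function sessionIdFrom(cookieHeader) {
  const match = /(?:^|;\s*)sid=([0-9a-f]+)/.exec(cookieHeader || '');
  return match ? match[1] : null;
}

// GET: make sure the session has a secret, then render the form with a token.
function handleGet(cookieHeader) {
  let sid = sessionIdFrom(cookieHeader);
  if (!sid || !secrets.has(sid)) {
    sid = crypto.randomBytes(16).toString('hex');
    secrets.set(sid, crypto.randomBytes(18).toString('base64'));
  }
  const salt = crypto.randomBytes(SALT_LENGTH).toString('hex').slice(0, SALT_LENGTH);
  const body = '<form method="POST" action="/csrftest">' +
    '<input type="text" name="_csrf" value="' + tokenFor(secrets.get(sid), salt) + '">' +
    '<input type="submit" value="Submit"></form>';
  return { status: 200, sid, body };
}

// POST: read _csrf from the URL-encoded body and validate it for this session.
function handlePost(cookieHeader, rawBody) {
  const secret = secrets.get(sessionIdFrom(cookieHeader));
  const token = new URLSearchParams(rawBody).get('_csrf');
  return isValid(secret, token)
    ? { status: 200, body: 'CSRF token is valid' }
    : { status: 403, body: 'Invalid CSRF token' };
}

function createServer() {
  return http.createServer((req, res) => {
    if (req.url !== '/csrftest') { res.writeHead(404); res.end(); return; }
    if (req.method === 'GET') {
      const page = handleGet(req.headers.cookie);
      res.writeHead(page.status, {
        'Content-Type': 'text/html',
        'Set-Cookie': 'sid=' + page.sid + '; HttpOnly; Path=/'
      });
      res.end(page.body);
      return;
    }
    if (req.method !== 'POST') { res.writeHead(405); res.end(); return; }
    let raw = '';
    req.on('data', (chunk) => { raw += chunk; });
    req.on('end', () => {
      const result = handlePost(req.headers.cookie, raw);
      res.writeHead(result.status, { 'Content-Type': 'text/plain' });
      res.end(result.body);
    });
  });
}
```

To try it, call createServer().listen(2468) and visit http://localhost:2468/csrftest. Submitting the form unchanged should return the success message, while editing the text box first should return a 403. The secret is tied to the session cookie so that a token issued to one visitor is not accepted for another.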
