Symmetric Encryption - Adaptive Code via C#. Agile coding with design patterns and SOLID principles (2014)


Chapter 5. Symmetric Encryption

This chapter discusses the basics of symmetric encryption algorithms. Message integrity checking and hash functions are covered in Chapter 6. The use of cryptography on a network is discussed in Chapter 9.

WARNING

Many of the recipes in this chapter are too low-level for general-purpose use. We recommend that you first try to find what you need in Chapter 9 before resorting to building solutions yourself using the recipes in this chapter. If you do use these recipes, please be careful, read all of our warnings, and do consider using the higher-level constructs we suggest.

5.1. Deciding Whether to Use Multiple Encryption Algorithms

Problem

You need to figure out whether to support multiple encryption algorithms in your system.

Solution

There is no right answer. It depends on your needs, as we discuss in the following section.

Discussion

Clearly, if you need to support multiple encryption algorithms for standards compliance or legacy support, you should do so. Beyond that, there are two schools of thought. The first school of thought recommends that you support multiple algorithms to allow users to pick their favorite. The other benefit of this approach is that if an algorithm turns out to be seriously broken, supporting multiple algorithms can make it easier for users to switch.

However, the other school of thought points out that in reality, many users will never switch algorithms, even if one is broken. Moreover, by supporting multiple algorithms, you risk adding more complexity to your application, which can be detrimental. In addition, if there are multiple interoperating implementations of a protocol you're creating, other developers often will implement only their own preferred algorithms, potentially leading to major interoperability problems.

We personally prefer picking a single algorithm that will do a good enough job of meeting the needs of all users. That way, the application is simpler to comprehend, and there are no interoperability issues. If you choose well-regarded algorithms, the hope is that there won't be a break that actually impacts end users. However, if there is such a break, you should make the algorithm easy to replace. Many cryptographic APIs, such as the OpenSSL EVP interface (discussed in Recipe 5.17), provide an interface to help out here.

See Also

Recipe 5.17

5.2. Figuring Out Which Encryption Algorithm Is Best

Problem

You need to figure out which encryption algorithm you should use.

Solution

Use something well regarded that fits your needs. We recommend AES for general-purpose use. If you're willing to go against the grain and are paranoid, you can use Serpent, which isn't quite as fast as AES but is believed to have a much higher security margin.

If you really feel that you need the fastest possible secure solution, consider the SNOW 2.0 stream cipher, which currently looks very good. It appears to have a much better security margin than the popular favorite, RC4, and is even faster. However, it is fairly new. If you're highly risk-averse, we recommend AES or Serpent. Although popular, RC4 would never be the best available choice.

Discussion

WARNING

Be sure to read this discussion carefully, as well as other related discussions. While a strong encryption algorithm is a great foundation, there are many ways to use strong encryption primitives in an insecure way.

There are two general types of ciphers:

Block ciphers

These work by encrypting a fixed-size chunk of data (a block). Data that isn't aligned to the size of the block needs to be padded somehow. The same input always produces the same output.

Stream ciphers

These work by generating a stream of pseudo-random data, then using XOR[1] to combine the stream with the plaintext.

There are many different ways of using block ciphers; these are called block cipher modes. Selecting a mode and using it properly is important to security. Many block cipher modes are designed to produce a result that acts just like a stream cipher. Each block cipher mode has its advantages and drawbacks. See Recipe 5.4 for information on selecting a mode.

Stream ciphers generally are used as designed. You don't hear people talking about stream cipher modes. This class of ciphers can be made to act as block ciphers, but that generally destroys their best property (their speed), so they are typically not used that way.
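The XOR step that stream ciphers rely on can be sketched in a few lines of Python. This is only an illustration: toy_keystream is a hypothetical generator (a hash of the key and a counter) standing in for a real stream cipher such as SNOW 2.0, and it should not be used to protect real data.

```python
import hashlib

def toy_keystream(key: bytes, length: int) -> bytes:
    """Hypothetical keystream generator: SHA-256 of key || counter.
    A stand-in for a real stream cipher, for illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor_stream(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XORing the data with the keystream."""
    keystream = toy_keystream(key, len(data))
    return bytes(d ^ k for d, k in zip(data, keystream))

key = b"0123456789abcdef"
plaintext = b"attack at dawn"
ciphertext = xor_stream(key, plaintext)
recovered = xor_stream(key, ciphertext)  # the same operation decrypts
```

Because XOR is its own inverse, one function serves for both directions, and no padding is needed: the ciphertext is exactly as long as the plaintext.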

We recommend the use of only those ciphers that have been studied by the cryptographic community and are held in wide regard.

There are a large number of symmetric encryption algorithms. However, unless you need a particular algorithm for the sake of interoperability or standards, we recommend using one of a very small number of well-regarded algorithms. AES, the Advanced Encryption Standard, is a great general-purpose block cipher. It is among the fastest block ciphers, is extremely well studied, and is believed to provide a high level of security. It can also use key lengths up to 256 bits.

AES has recently replaced Triple-DES (3DES), a variant of the original Data Encryption Standard (DES), as the block cipher of choice, partially because of its status as a U.S. government standard, and partially because of its widespread endorsement by leading cryptographers. However, Triple-DES is still considered a very secure alternative to AES. In fact, in some ways it is a more conservative solution, because it has been studied for many more years than has AES, and because AES is based on a relatively new breed of block cipher that is far less understood than the traditional underpinnings upon which Triple-DES is based.[2]

Nonetheless, AES is widely believed to be able to resist any practical attack currently known that could be launched against any block cipher. Today, many cryptographers would feel just as safe using AES as they would using Triple-DES. In addition, AES always uses longer effective keys and is capable of key sizes up to 256 bits, which should offer vastly more security than Triple-DES, with its effective 112-bit keys.[3] (The actual key length can be either 128 or 192 bits, but not all of the bits have an impact on security.) DES itself is, for all intents and purposes, insecure because of its short key length. Finally, AES is faster than DES, and much faster than Triple-DES.

Serpent is a block cipher that has received significant scrutiny and is believed to have a higher security margin than AES. Some cryptographers worry that AES may be easy to break in 5 to 10 years because of its nontraditional nature and its simple algebraic structure. Serpent is significantly more conservative in every way, but it is slower. Nonetheless, it's at least three times faster than Triple-DES and is more than fast enough for all practical purposes.

Of course, because AES is a standard, you won't lose your job if AES turns out to be broken, whereas you'll probably get in trouble if Serpent someday falls!

RC4 is the only widely used stream cipher. It is quite fast but difficult to use properly, because of a major weakness in initialization (when using a key to initialize the cipher). In addition, while there is no known practical attack against RC4, there are some theoretical problems that show this algorithm to be far from optimal. In particular, RC4's output is fairly easy to distinguish from a true random generator, which is a bad sign. (See Recipe 5.23 for information on how to use RC4 securely.)

SNOW is a new stream cipher that makes significant improvements on old principles. Besides the fact that it's likely to be more secure than RC4, it is also faster: an optimized C version runs nearly twice as fast for us as a good, optimized assembly implementation of RC4 does. It has also received a fair amount of scrutiny, though not nearly as much as AES. Nothing significant has been found in it, and even the minor theoretical issues in the first version were fixed, resulting in SNOW 2.0.

Table 5-1 shows some of the fastest noncommercial implementations for popular patent-free algorithms we could find and run on our own x86-based hardware. (There may, of course, be faster implementations out there.) Generally, the implementations were optimized assembly. Speeds are measured in cycles per byte for the Pentium III, which should give a good indication of how the algorithms perform in general.

On a 1 GHz machine, you would need an algorithm running at 1 cycle per byte to be able to encrypt 1 gigabyte per second. On a 3 GHz machine, you would only need the algorithm to run at 3 cycles per byte. Some of the implementations listed in the table are therefore capable of handling gigabit speeds fairly effortlessly on reasonable PC hardware.

Note that you won't generally quite get such speeds in practice as a result of overhead from cache misses and other OS-level issues, but you may come within a cycle or two per byte.
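The relationship between clock speed, cycles per byte, and throughput is simple division. The following sketch makes the arithmetic explicit; the function names are our own, and the results are best-case figures that ignore the cache and OS overhead just mentioned.

```python
GIGABIT_PER_SEC = 1e9 / 8  # one gigabit per second, expressed in bytes per second

def throughput_bytes_per_sec(clock_hz: float, cycles_per_byte: float) -> float:
    """Best-case bytes encrypted per second at a given clock rate."""
    return clock_hz / cycles_per_byte

def max_cycles_per_byte(clock_hz: float, target_bytes_per_sec: float) -> float:
    """Slowest cipher (in cycles per byte) that still sustains the target rate."""
    return clock_hz / target_bytes_per_sec

# 1 GHz at 1 cpb and 3 GHz at 3 cpb both give one gigabyte per second.
one_ghz = throughput_bytes_per_sec(1e9, 1.0)
three_ghz = throughput_bytes_per_sec(3e9, 3.0)

# A gigabit link on a 1 GHz machine needs a cipher at 8 cpb or better.
gigabit_budget = max_cycles_per_byte(1e9, GIGABIT_PER_SEC)
```

For instance, a cipher measured at 6.4 cycles per byte clears the gigabit budget on a 1 GHz machine, while one at 14.1 cycles per byte needs a faster clock to do so.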

Table 5-1. Noncommercial implementations for popular patent-free encryption algorithms

| Cipher | Key size | Speed[4] | Implementation | Notes |
|---|---|---|---|---|
| AES | 128 bits[5] | 14.1 cpb in asm, 22.6 cpb in C | Brian Gladman's[6] | The assembly version currently works only on Windows. |
| AES | 128 bits | 41.3 cpb | OpenSSL | This could be a heck of a lot better and should probably improve in the near future. Currently, we recommend Brian Gladman's C code instead. Perhaps OpenSSL will incorporate Brian's code soon! |
| Triple DES | 192 bits[7] | 108.2 cpb | OpenSSL | |
| SNOW 2.0 | 128 or 256 bits | 6.4 cpb | Fast reference implementation[8] | This implementation is written in C. |
| RC4 | Up to 256 bits (usually 128 bits) | 10.7 cpb | OpenSSL | |
| Serpent | 128, 192, or 256 bits | 35.6 cpb | Fast reference implementation | It gets a lot faster on 64-bit platforms and is at least as fast as AES in hardware. |
| Blowfish | Up to 256 bits (usually 128 bits) | 23.2 cpb | OpenSSL | |

[4] All timing values are best cases based on empirical testing and assume that the data being processed is already in cache. Do not expect to quite match these speeds in practice.

[5] AES supports 192-bit and 256-bit keys, but the algorithm then runs slower.

[6] http://fp.gladman.plus.com/AES/

[7] The effective strength of Triple DES is theoretically no greater than 112 bits.

[8] Available from http://www.it.lth.se/cryptology/snow/

As we mentioned, we generally prefer AES (when used properly), which is not only a standard but also is incredibly fast for a block cipher. It's not quite as fast as RC4, but it seems to have a far better security margin. If speed does make a difference to you, you can choose SNOW 2.0, which is actually faster than RC4. Or, in some environments, you can use an AES mode of operation that allows for parallelization, which really isn't possible in an interoperable way using RC4. Particularly in hardware, AES in counter mode can achieve much higher speeds than even SNOW can.

Clearly, Triple-DES isn't fast in the slightest; we have included it in Table 5-1 only to give you a point of reference. In our opinion, you really shouldn't need to consider anything other than AES unless you need interoperability, in which case performance is practically irrelevant anyway!

See Also

§ Brian Gladman's Cryptographic Technology page: http://fp.gladman.plus.com/AES/

§ OpenSSL home page: http://www.openssl.org/

§ SNOW home page: http://www.it.lth.se/cryptology/snow/

§ Serpent home page: http://www.cl.cam.ac.uk/~rja14/serpent.html

§ Recipe 5.4, Recipe 5.23


[1] Or some other in-group operation, such as modular addition.

[2] Most block ciphers are known as Feistel ciphers, a construction style dating back to the early 1970s. AES is a Square cipher, which is a new style of block cipher construction, dating only to 1997.

[3] This assumes that a meet-in-the-middle attack is practical. Otherwise, the effective strength is 168 bits. In practice, even 112 bits is enough.

5.3. Selecting an Appropriate Key Length

Problem

You are using a cipher with a variable key length and need to decide which key length to use.

Solution

Strike a balance between long-term security needs and speed requirements. The weakest commonly used key length we would recommend in practice would be Triple-DES keys (112 effective bits). For almost all other algorithms worth considering, it is easy to use 128-bit keys, and you should do so. Some would even recommend using a key size that's twice as big as the effective strength you'd like (but this is unnecessary if you properly use a nonce when you encrypt; see Section 5.3.3).

Discussion

Some ciphers offer configurable key lengths. For example, AES allows 128-bit, 192-bit, or 256-bit keys, whereas RC4 allows for many different sizes, but 40 bits and 128 bits are the common configurations. The ease with which an attacker can perform a brute-force attack (trying out every possible key) is based not only on key length, but also on the financial resources of the attacker. 56-bit keys are trivial for a well-funded government to break, and even a person with access to a reasonable array of modern desktop hardware can break 56-bit keys fairly quickly. Therefore, the lifetime of 56-bit keys is unreasonable for any security needs. Unfortunately, there are still many locations where 40-bit keys or 56-bit keys are used, because weak encryption used to be the maximum level of encryption that could be exported from the United States.
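To get a feel for how key length and attacker resources interact, the following sketch estimates the expected brute-force time for several key lengths. The search rate of 10^12 keys per second is purely an assumption chosen for illustration; substitute whatever figure matches the attacker you are worried about.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

def years_to_search(key_bits: int, keys_per_second: float) -> float:
    """Expected years to find a key by brute force.
    On average the attacker must try half of the 2**key_bits keys."""
    expected_trials = 2 ** key_bits / 2
    return expected_trials / keys_per_second / SECONDS_PER_YEAR

ASSUMED_RATE = 1e12  # hypothetical attacker: 10**12 keys per second

for bits in (40, 56, 80, 112, 128):
    print(f"{bits:3d}-bit key: {years_to_search(bits, ASSUMED_RATE):.3g} years")
```

At that assumed rate, a 56-bit key falls in roughly ten hours on average, while each additional key bit doubles the attacker's work.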

TIP

Symmetric key length recommendations do not apply to public key lengths. See Recipe 7.3 for public key length recommendations.

Supporting cryptographically weak configurations is a risky proposition. Not only are the people who are legitimately using those configurations at risk, but unless you are extremely careful in your protocol design, it is also possible that an attacker can force the negotiation of an insecure configuration by acting as a "man in the middle" during the initial phases of a connection, before full-fledged encryption begins. Such an attack is often known as a rollback attack, because the attacker forces the communicating parties to use a known insecure version of the protocol. (We discuss how to thwart such attacks in Recipe 10.7.)

In the real world, people try very hard to get to 80 bits of effective security, which we feel is the minimum effective strength you should accept. Generally, 128 bits of effective security is considered probably enough for all time, if the best attack that can be launched against a system is brute force. However, even if using the right encryption mode, that still assumes no cryptographic weaknesses in the cipher whatsoever.

In addition, depending on the way you use encryption, there are precomputation and collision attacks that allow the attacker to do better than brute force. The general rule of thumb is that the effective strength of a block cipher is actually half the key size, assuming the cipher has no known attacks that are better than brute force.

However, if you use random data properly, you generally get a bit of security back for each bit of the data (assuming it's truly random; see Recipe 11.1 for more discussion about this). The trick is using such data properly. In CBC mode, generally the initialization vector for each message sent should be random, and it will thwart these attacks. In most other modes, the initialization vector acts more like a nonce, where it must be different for each message but doesn't have to be completely random. In such cases, you can select a random value at key setup time, then construct per-message initializers by combining the random value and a message counter.
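The last approach, a random value chosen at key setup combined with a message counter, might look like the sketch below. The class name and the 8-byte/8-byte split are our own assumptions for illustration; the right sizes depend on the mode and block size in use. For CBC, by contrast, you would simply draw a fresh random IV (for example with os.urandom) for every message.

```python
import os

class NonceGenerator:
    """Hypothetical helper: per-message nonces built from a random prefix
    (chosen once at key setup) and a sequential message counter."""

    def __init__(self, random_bytes: int = 8, counter_bytes: int = 8):
        self._prefix = os.urandom(random_bytes)  # picked once per key
        self._counter_bytes = counter_bytes
        self._counter = 0

    def next_nonce(self) -> bytes:
        if self._counter >= 2 ** (8 * self._counter_bytes):
            # Never wrap around: a repeated nonce under the same key is unsafe.
            raise OverflowError("counter exhausted; set up a new key")
        nonce = self._prefix + self._counter.to_bytes(self._counter_bytes, "big")
        self._counter += 1
        return nonce

gen = NonceGenerator()
nonces = [gen.next_nonce() for _ in range(1000)]
```

The generator must be created anew (with a new random prefix) whenever a new key is set up, and its counter must not be reset while the same key remains in use.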

In any event, with a 128-bit key, we strongly recommend that you not build a system without a 64-bit random value being used in some fashion to protect against attack.

Should you use key lengths greater than 128 bits, especially considering that so many algorithms provide for them? For example, AES allows for 128-bit, 192-bit, and 256-bit keys. Longer key lengths provide more security, yet for AES they are less efficient (in most other variable key length ciphers, setup gets more expensive, but encryption does not). In several of our own benchmarks, 128-bit AES is generally only about 33% faster than 256-bit AES. Also, 256-bit AES runs at least 50% faster than Triple-DES does. When it was the de facto standard, Triple-DES was considered adequate for almost all applications.

In the real world, 128 bits of security may be enough for all time, even considering that the ciphers we use today are probably nowhere near as good as they could be. And if it ever becomes something to worry about, it will be news on geek web sites like Slashdot. Basically, when the U.S. government went through the AES standardization process, they were thinking ahead in asking for algorithms capable of supporting 192-bit and 256-bit keys, just in case future advances like quantum computing somehow reduce the effective key strength of symmetric algorithms.

Until there's a need for bigger keys, we recommend sticking with 128-bit keys when using AES, as there is no reason to take the efficiency hit of longer keys. We say this particularly because we don't see anything on the horizon that is even a remote threat.

However, this advice assumes you're really getting 128 bits of effective strength. If you refuse to use random data to prevent against collision and precomputation attacks, it definitely makes sense to move to larger key sizes to obtain your desired security margin.

See Also

Recipe 5.3, Recipe 7.3, Recipe 10.7, Recipe 11.1

5.4. Selecting a Cipher Mode

Problem

You need to use a low-level interface to encryption. You have chosen a block cipher and need to select the mode in which to use that cipher.

Solution

There are various tradeoffs. For general-purpose use, we recommend CWC mode in conjunction with AES, as we discuss in the following section. If you wish to do your own message authentication, we recommend CTR mode, as long as you're careful with it.

Discussion

First, we should emphasize that you should use a low-level mode only if it is absolutely necessary, because of the ease with which accidental security vulnerabilities can arise. For general-purpose use, we recommend a high-level abstraction, such as that discussed in Recipe 5.16.

With that out of the way, we'll note that each cipher mode has its advantages and drawbacks. Certain drawbacks are common to all of the popular cipher modes and should usually be solved at another layer. In particular:

§ If a network attack destroys or modifies data in transit, any cipher mode that does not perform integrity checking will, if the attacker does his job properly, fail to detect an error. The modes we discuss that provide built-in integrity checking are CWC, CCM, and OCB.

§ When an attacker does tamper with a data stream by adding or truncating, most modes will be completely unable to recover. In some limited circumstances, CFB mode can recover, but this problem is nonetheless better solved at the protocol layer.

§ Especially when padding is not necessary, the ciphertext length gives away information about the length of the original message, which can occasionally be useful to an attacker. This is a covert channel, but one that most people choose to ignore. If you wish to eliminate risks with regard to this problem, pad to a large length, even if padding is not needed. To get rid of the risk completely, send fixed-size messages at regular intervals, whether or not there is "real" data to send. Bogus messages to eliminate covert channels are called cover traffic.

§ Block ciphers leak information about the key as they get used. Some block cipher modes leak a lot more information than others. In particular, CBC mode leaks a lot more information than something like CTR mode.
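The length-hiding padding suggested in the list above can be sketched as follows. The framing (a 4-byte length prefix followed by zero fill) and the 1,024-byte bucket size are assumptions made for this example, not a standard format. The padding is applied to the plaintext before encryption and stripped after decryption.

```python
def pad_to_bucket(message: bytes, bucket: int = 1024) -> bytes:
    """Prefix the true length, then zero-fill up to a multiple of `bucket`."""
    framed = len(message).to_bytes(4, "big") + message
    padded_len = -(-len(framed) // bucket) * bucket  # round up to a bucket boundary
    return framed + b"\x00" * (padded_len - len(framed))

def unpad(padded: bytes) -> bytes:
    """Recover the original message using the length prefix."""
    true_len = int.from_bytes(padded[:4], "big")
    return padded[4:4 + true_len]

short = pad_to_bucket(b"hi")
longer = pad_to_bucket(b"x" * 1000)
```

Both messages above pad to the same 1,024 bytes, so an observer of the ciphertext learns only which bucket a message falls into, not its exact length.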

WARNING

If you do not use a cipher mode that provides built-in integrity checking, be sure to use a MAC (message authentication code) whenever encrypting.
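As a rough sketch of what attaching a MAC to ciphertext can look like, the following uses HMAC-SHA256 from the Python standard library, computed over the nonce and the ciphertext. The choice of HMAC-SHA256, the function names, and the nonce-then-ciphertext-then-tag layout are our assumptions for illustration; the MAC key should be separate from the encryption key.

```python
import hashlib
import hmac
import os

TAG_LEN = 32  # HMAC-SHA256 output size in bytes

def protect(mac_key: bytes, nonce: bytes, ciphertext: bytes) -> bytes:
    """Append a MAC computed over nonce || ciphertext."""
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag

def verify(mac_key: bytes, blob: bytes, nonce_len: int = 16):
    """Check the MAC before anything is decrypted; return (nonce, ciphertext)."""
    body, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("MAC check failed: message was modified")
    return body[:nonce_len], body[nonce_len:]

mac_key = os.urandom(32)
nonce = os.urandom(16)
ciphertext = b"pretend this is ciphertext"  # placeholder for real cipher output
blob = protect(mac_key, nonce, ciphertext)
```

The receiver calls verify first and decrypts only if it succeeds, so tampered ciphertext is rejected before it ever reaches the cipher.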

In the following sections, we'll go over the important properties of each of the most popular modes, pointing out the tradeoffs involved with each (we'll avoid discussing the details of the modes here; we'll do that in later recipes). Note that if a problem is listed for only a single cipher mode and goes unmentioned elsewhere, it is not a problem for those other modes. For each of the modes we discuss, speed is not a significant concern; the only thing that has a significant impact on performance is the underlying block cipher.[9]

Electronic Code Book (ECB) mode

This mode simply breaks up a message into blocks and directly encrypts each block with the raw encryption operation. It does not have any desirable security properties and should not be used under any circumstances. We cover raw encryption as a building block for building other modes, but we don't cover ECB itself because of its poor security properties.

ECB has been standardized by NIST (the U.S. National Institute for Standards and Technology).

The primary disadvantages of ECB mode are:

§ Encrypting a block of a fixed value always yields the same result, making ECB mode particularly susceptible to dictionary attacks.

§ When encrypting more than one block and sending the results over an untrusted medium, it is particularly easy to add or remove blocks without detection (that is, ECB is susceptible to tampering, capture replay, and other problems). All other cipher modes that lack integrity checking have similar problems, but ECB is particularly bad.

§ The inputs to the block cipher are never randomized because they are always exactly equal to the corresponding block of plaintext.

§ Offline precomputation is feasible.

The mode does have certain advantages, but do note that other modes share these advantages:

§ Multiblock messages can be broken up, and the pieces encrypted in parallel.

§ Random access of messages is possible; the 1,024th block can be decrypted without decrypting other data blocks.

However, the advantages of ECB do not warrant its use.

We do discuss how to use ECB to encrypt a block at a time in Recipe 5.5, when it is necessary in implementing other cryptographic primitives.
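The first disadvantage listed above is easy to see in a sketch. The block cipher here is a toy four-round Feistel network built from SHA-256, invented for this example so that the code is self-contained; it is not AES and offers no real security.

```python
import hashlib

BLOCK = 16  # block size in bytes

def _round(key: bytes, i: int, half: bytes) -> bytes:
    """Round function of the toy Feistel cipher (illustration only)."""
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right

def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right

def ecb_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """ECB: every block is encrypted independently with the raw operation."""
    if len(plaintext) % BLOCK:
        raise ValueError("plaintext must be block-aligned (padding omitted here)")
    return b"".join(toy_encrypt_block(key, plaintext[i:i + BLOCK])
                    for i in range(0, len(plaintext), BLOCK))

key = b"k" * 16
message = b"YELLOW SUBMARINE" * 2 + b"ANOTHER BLOCK..."
ciphertext = ecb_encrypt(key, message)
blocks = [ciphertext[i:i + BLOCK] for i in range(0, len(ciphertext), BLOCK)]
```

The first two ciphertext blocks are identical because the first two plaintext blocks are, which is exactly the pattern leak that makes dictionary attacks possible. The same independence is what allows any single block to be decrypted on its own.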

Cipher Block Chaining (CBC) mode

CBC mode is a simple extension to ECB mode that adds a significant amount of security. CBC works by breaking the message up into blocks, then using XOR to combine the ciphertext of the previous block with the plaintext of the current block. The result is then encrypted in ECB mode. The very first block of plaintext is XOR'd with an initialization vector (IV). The IV can be publicly known, and it must be randomly selected for maximum security. Many people use sequential IVs or even fixed IVs, but that is not at all recommended. For example, SSL has had security problems in the past when using CBC without random IVs. Also note that if there are common initial strings, CBC mode can remain susceptible to dictionary attacks if no IV or similar mechanism is used. As with ECB, padding is required, unless messages are always block-aligned.

CBC has been standardized by NIST.

The primary disadvantages of CBC mode are:

§ Encryption cannot be parallelized (though decryption can be, and there are encryption workarounds that break interoperability; see Recipe 5.14).

§ There is no possibility of offline precomputation.

§ Capture replay of entire or partial messages can be possible without additional consideration.

§ The mode requires an initial input that must be random. It is not sufficient to use a unique but predictable value.

§ The mode leaks more information than is optimal. We wouldn't use it to output more than 2^40 blocks.

The primary advantage of CBC mode is that it captures the desirable properties of ECB mode, while removing most of the drawbacks.

We discuss CBC mode in Recipe 5.6.
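A minimal sketch of the chaining described in this section follows. As before, the block cipher is a toy SHA-256-based Feistel network made up for the example (not AES), padding is omitted, and the function names are our own.

```python
import hashlib
import os

BLOCK = 16

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key: bytes, i: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy Feistel cipher standing in for a real block cipher."""
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right

def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right

def cbc_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """XOR each plaintext block with the previous ciphertext block, then encrypt.
    A fresh random IV is generated per message and sent as the first block."""
    if len(plaintext) % BLOCK:
        raise ValueError("plaintext must be block-aligned (padding omitted here)")
    prev = os.urandom(BLOCK)
    out = [prev]
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, _xor(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)

def cbc_decrypt(key: bytes, data: bytes) -> bytes:
    blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]
    # Each block depends only on two ciphertext blocks, so this step parallelizes.
    return b"".join(_xor(toy_decrypt_block(key, cur), prev)
                    for prev, cur in zip(blocks, blocks[1:]))

key = b"k" * 16
message = b"YELLOW SUBMARINE" * 2
ciphertext = cbc_encrypt(key, message)
```

Unlike in ECB, the two identical plaintext blocks now encrypt to different ciphertext blocks, and encrypting the same message twice gives different output because the IV differs.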

Counter (CTR) mode

Whereas ECB and CBC are block-based modes, counter (CTR) mode and the rest of the modes described in this section simulate a stream cipher. That is, they use block-based encryption as an underlying primitive to produce a pseudo-random stream of data, known as a keystream. The plaintext is turned into ciphertext by XOR'ing it with the keystream.

CTR mode generates a block's worth of keystream by encrypting a counter using ECB mode. The result of the encryption is a block of keystream. The counter is then incremented. Generally, the counter being publicly known is acceptable, though it's always better to keep it a secret if possible. The counter can start at a particular value, such as zero, or something chosen at random, and increment by one every time. (The initial counter value is a nonce, which is subtly different from an initialization vector; see Recipe 4.9.) Alternatively, the counter can be modified every time using a deterministic pseudo-random number generator that doesn't repeat until all possible values are generated. The only significant requirements are that the counter value never be repeated and that both sides of an encryption channel know the order in which to use counters. In practice, part of the counter is usually chosen randomly at keying time, and part is sequential. Both parts help thwart particular kinds of risks.

Despite being over 20 years old, CTR mode has only recently been standardized by NIST as part of the AES standardization process.

The primary disadvantages of CTR mode are:

§ Flipping bits in the plaintext is very easy because flipping a ciphertext bit flips the corresponding plaintext bit (this problem is shared with all stream cipher modes). As with other encryption algorithms, message integrity checks are absolutely necessary for adequate security.

§ Reusing {key, counter} pairs is disastrous. Generally, if there is any significant risk of reusing a {key, nonce} pair (e.g., across reboot), it is best to avoid ever reusing a single key across multiple messages (or data streams). (See Recipe 4.11 for advice if you wish to use one base secret and derive multiple secrets from it.)

§ CTR mode has inadequate security when using ciphers with 64-bit blocks, unless you use a large random nonce and a small counter, which drastically limits the number of messages that can be sent. For this reason, OCB is probably still preferable for such ciphers, but CTR is clearly better for 128-bit block ciphers.

The primary advantages of CTR mode are:

§ The keystream can be precomputed.

§ The keystream computation can be done in parallel.

§ Random access into the keystream is possible. (The 1,024th byte can be decrypted with only a single raw encryption operation.)

§ For ciphers where raw encryption and decryption require separate algorithms (particularly AES), only a single algorithm is necessary. In such a case, the faster of the two algorithms can be used (though you will get incompatible results if you use decryption where someone else uses encryption).

§ CTR mode leaks incredibly little information about the key. After 2^64 encryptions, an attacker would learn about a bit's worth of information on a 128-bit key.

CTR mode is old and simple, and its security properties are well understood. It has recently gained a lot of favor in the cryptographic community over other solutions for using block ciphers in streaming modes, particularly as the world moves to AES with its 128-bit blocks.

Many of the "better" modes that provide built-in integrity checking, such as CWC and CCM mode, use CTR mode as a component because of its desirable properties.

We discuss CTR mode in Recipe 5.9.
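The sketch below illustrates the counter construction, random access, and the bit-flipping problem. It assumes a counter block made of an 8-byte nonce followed by an 8-byte block index, and it uses a keyed SHA-256 hash as a stand-in for the raw block encryption (CTR only ever needs the forward direction). Neither choice is prescribed by the text; both are for illustration only.

```python
import hashlib
import os

BLOCK = 16

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Stand-in for raw block encryption (not a real cipher)."""
    return hashlib.sha256(key + block).digest()[:BLOCK]

def keystream_block(key: bytes, nonce: bytes, index: int) -> bytes:
    """Encrypt the counter block nonce || index to get one keystream block."""
    return toy_encrypt_block(key, nonce + index.to_bytes(8, "big"))

def ctr_xor(key: bytes, nonce: bytes, data: bytes, start_block: int = 0) -> bytes:
    """Encrypt or decrypt; `start_block` allows random access into the stream."""
    out = bytearray()
    for i in range(0, len(data), BLOCK):
        ks = keystream_block(key, nonce, start_block + i // BLOCK)
        out += bytes(d ^ k for d, k in zip(data[i:i + BLOCK], ks))
    return bytes(out)

key = os.urandom(16)
nonce = os.urandom(8)  # must never repeat under the same key
message = b"Pay Alice $100. Pay Bob $250. Pay Carol $75."
ciphertext = ctr_xor(key, nonce, message)

# Random access: decrypt only the third block.
third_block = ctr_xor(key, nonce, ciphertext[32:48], start_block=2)

# Bit flipping: changing a ciphertext bit changes the same plaintext bit.
tampered = bytearray(ciphertext)
tampered[0] ^= 0x01
tampered_plaintext = ctr_xor(key, nonce, bytes(tampered))
```

The tampered message decrypts without complaint and differs from the original in exactly one bit, which is why a MAC or an authenticated mode is needed alongside CTR.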

Output Feedback (OFB) mode

OFB mode is another streaming mode, much like CTR mode. The keystream is generated by continually encrypting the last block of keystream to produce the next block. The first block of keystream is generated by encrypting a nonce. OFB mode shares many properties with CTR mode, although CTR mode has additional benefits. Therefore, OFB mode is seeing less and less use these days.

OFB mode has been standardized by NIST.

The primary disadvantages of OFB mode are:

§ Bit-flipping attacks are easy, as with any streaming mode. Again, integrity checks are a must.

§ Reusing a {key, nonce} pair is disastrous (but is easy to avoid). Generally, if there is any significant risk of reusing a {key, nonce} pair (e.g., across reboot), it is best to avoid reusing a single key across multiple messages or data streams. (See Recipe 4.11 for advice if you wish to use one base secret, and derive multiple secrets from it.)

§ Keystream computation cannot be done in parallel.

The primary advantages of OFB mode are:

§ Keystreams can be precomputed.

§ For ciphers where raw encryption and decryption operations require separate algorithms (particularly AES), only a single algorithm is necessary. In such a case, the faster of the two algorithms can be used (though you will get incompatible results if you use decryption where someone else uses encryption).

§ It does not have nonce-size problems when used with 64-bit block ciphers.

§ When used properly, it leaks information at the same (slow) rate that CTR mode does.

We discuss OFB mode in Recipe 5.8.
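The feedback loop that distinguishes OFB from CTR can be sketched as follows. Again, a keyed SHA-256 hash stands in for raw block encryption, and the function names are hypothetical.

```python
import hashlib
import os

BLOCK = 16

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Stand-in for raw block encryption (not a real cipher)."""
    return hashlib.sha256(key + block).digest()[:BLOCK]

def ofb_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Each keystream block is the encryption of the previous keystream block;
    the first is the encryption of the nonce. Inherently sequential."""
    out = bytearray()
    state = nonce
    while len(out) < length:
        state = toy_encrypt_block(key, state)
        out += state
    return bytes(out[:length])

def ofb_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XORing with the OFB keystream."""
    keystream = ofb_keystream(key, nonce, len(data))
    return bytes(d ^ k for d, k in zip(data, keystream))

key = os.urandom(16)
nonce = os.urandom(BLOCK)  # one block; must never repeat under the same key
message = b"output feedback mode example"
ciphertext = ofb_xor(key, nonce, message)
```

Note that ofb_keystream never looks at the message, so the keystream can be precomputed before any data arrives, but each block requires the previous one, so the work cannot be spread across processors.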

Cipher Feedback (CFB) mode

CFB mode generally works similarly to OFB mode, except that in its most common configuration, it produces keystream by always encrypting the last block of ciphertext, instead of the last block of keystream.

CFB mode has been standardized by NIST.

The primary disadvantages of CFB mode are:

§ Bit-flipping attacks are easy, as with any streaming mode. Again, integrity checks are a must.

§ Reusing a {key, nonce} pair is disastrous (but is easy to avoid). Generally, if there is any significant risk of reusing a {key, nonce} pair (e.g., across reboot), it is best to avoid reusing a single key across multiple messages or data streams.

§ Encryption cannot be parallelized (though decryption can be).

The primary advantages of CFB mode are:

§ For ciphers where raw encryption and decryption operations require separate algorithms (particularly AES), only a single algorithm is necessary. In such a case, the faster of the two algorithms can be used.

§ A minor bit of precomputational work can be done in advance of receiving a block-sized element of data, but this is not very significant compared to CTR mode or OFB mode.

§ It does not have nonce-size problems when used with 64-bit block ciphers.

These days, CFB mode is rarely used because CTR mode and OFB mode provide more advantages with no additional drawbacks.

We discuss CFB mode in Recipe 5.7.

Carter-Wegman + CTR (CWC) mode

CWC mode is a high-level encryption mode that provides both encryption and built-in message integrity, similar to CCM and OCB modes (discussed later).

CWC is a new mode, introduced by Tadayoshi Kohno, John Viega, and Doug Whiting. NIST is currently considering CWC mode for standardization.

The primary disadvantages of CWC are:

§ The required nonce must never be reused (this is easy to avoid).

§ It isn't well suited for use with 64-bit block ciphers. It does work well with AES, of course.

The primary advantages of CWC mode are:

§ CWC ensures message integrity in addition to performing encryption.

§ The additional functionality requires minimal message expansion. (You would need to send the same amount of data to perform integrity checking with any of the cipher modes described earlier.)

§ CWC is parallelizable (hardware implementations can achieve speeds above 10 gigabits per second).

§ CWC has provable security properties while using only a single block cipher key. This means that under reasonable assumptions on the underlying block cipher, the mode provides excellent secrecy and message integrity if the nonce is always unique.

§ CWC leverages all the good properties of CTR mode, such as being able to handle messages without padding and being slow to leak information.

§ For ciphers where raw encryption and decryption operations require separate algorithms (particularly AES), only a single algorithm is necessary. In such a case, the faster of the two algorithms can be used (though you will get incompatible results if you use decryption where someone else uses encryption).

We believe that the advantages of CWC mode make it more appealing for general-purpose use than all other modes. However, the problem of repeating nonces is a serious one that developers often get wrong. See Recipe 5.10, where we provide a high-level wrapper to CWC mode that is designed to circumvent such problems.
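Two of the properties listed above, handling messages without padding and needing only the cipher's encryption direction, come from the CTR layer that CWC builds on. The following is a minimal sketch of that layer alone; it does not implement CWC's integrity check. The names toy_enc, ctr_xcrypt, and BLK are hypothetical, and toy_enc is an insecure stand-in for a real block cipher such as AES.

```c
#include <stddef.h>
#include <string.h>

#define BLK 16  /* block size in bytes (assumed) */

/* Insecure stand-in for a real block cipher's encrypt operation. */
void toy_enc(const unsigned char *key, const unsigned char *in,
             unsigned char *out) {
  int i;
  for (i = 0; i < BLK; i++) out[i] = (unsigned char)((in[i] ^ key[i]) + 1);
}

/* XOR the data with the keystream E(nonce || counter). The same call both
 * encrypts and decrypts, so the cipher's decrypt operation is never needed,
 * and len does not have to be a multiple of the block size.
 */
void ctr_xcrypt(const unsigned char *key, const unsigned char nonce[BLK - 4],
                const unsigned char *in, unsigned char *out, size_t len) {
  unsigned char ctrblk[BLK], ks[BLK];
  unsigned long ctr = 0;
  size_t i, n, off = 0;

  memcpy(ctrblk, nonce, BLK - 4);
  while (off < len) {
    n = (len - off < BLK) ? len - off : BLK;
    /* The last four bytes of the block hold a big-endian block counter. */
    ctrblk[BLK - 4] = (unsigned char)((ctr >> 24) & 0xff);
    ctrblk[BLK - 3] = (unsigned char)((ctr >> 16) & 0xff);
    ctrblk[BLK - 2] = (unsigned char)((ctr >> 8) & 0xff);
    ctrblk[BLK - 1] = (unsigned char)(ctr & 0xff);
    toy_enc(key, ctrblk, ks);
    for (i = 0; i < n; i++) out[off + i] = in[off + i] ^ ks[i];
    off += n;
    ctr++;
  }
}
```

Because the keystream depends only on the key, the nonce, and the counter, reusing a nonce with the same key produces the same keystream twice, which is why the nonce must never repeat.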

Offset Codebook (OCB) mode

OCB mode is a patented encryption mode that you must license to use.[10] CWC offers similar properties and is not restricted by patents.

OCB is reasonably new. It was introduced by Phil Rogaway and is based on earlier work at IBM. Both parties have patents covering this work, and a patent held by the University of Maryland also may apply. OCB is not under consideration by any standards movements.

The primary disadvantages of OCB mode are:

§ It is restricted by patents.

§ The required nonce must never be reused (this is easy to avoid).

§ It isn't well suited for use with 64-bit block ciphers. It does work well with AES, of course.

The primary advantages of OCB mode are:

§ OCB ensures message integrity in addition to performing encryption.

§ The additional functionality requires minimal message expansion (you would need to send the same amount of data to perform integrity checking with any of the previously mentioned cipher modes).

§ OCB is fully parallelizable (hardware implementations can achieve speeds above 10 gigabits per second).

§ OCB has provable security properties while using only a single block cipher key. This means that under reasonable assumptions on the underlying block cipher, the mode provides excellent secrecy and message integrity if the nonce is always unique.

§ Messages can be of arbitrary length (there is no need for block alignment).

§ For ciphers where raw encryption and decryption operations require separate algorithms (particularly AES), only a single algorithm is necessary. In such a case, the faster of the two algorithms can be used (though you will get incompatible results if you use decryption where someone else uses encryption).

Because of its patent status and the availability of free alternatives with essentially identical properties (particularly CWC mode), we recommend against using OCB mode. If you're interested in using it anyway, see Phil Rogaway's OCB page at http://www.cs.ucdavis.edu/~rogaway/ocb/.

CTR plus CBC-MAC (CCM) mode

While OCB mode has appealing properties, its patent status makes it all but useless for most applications. CCM is another alternative that provides many of the same properties, without any patent encumbrance. There are some disadvantages of CCM mode, however:

§ While encryption and decryption can be parallelized, the message integrity check cannot be. OCB and CWC both avoid this limitation.

§ In some applications, CCM can be nonoptimal because the length of the message must be known before processing can begin.

§ The required nonce must never be reused (this is easy to avoid).

§ It isn't well suited to 64-bit block ciphers. It does work well with AES, of course.

CCM is also fairly new (more recent than OCB, but a bit older than CWC). It was introduced by Doug Whiting, Russ Housley, and Niels Ferguson. NIST is currently considering it for standardization.

The primary advantages of CCM mode are:

§ CCM ensures message integrity in addition to performing encryption.

§ The message integrity functionality requires minimal message expansion (you would need to send the same amount of data to perform integrity checking with any of the previously mentioned cipher modes).

§ CCM has provable security properties while using only a single key. This means that under reasonable assumptions on the underlying block cipher, the mode provides near-optimal secrecy and message integrity if the required nonce is always unique.

§ CCM leverages most of the good properties of CTR mode, such as being able to handle messages without padding and being slow to leak information.

§ For ciphers where raw encryption and decryption operations require separate algorithms (particularly AES), only a single algorithm is necessary. In such a case, the faster of the two algorithms can be used (though you will get incompatible results if you use decryption where someone else uses encryption).

In this book, we focus on CWC mode instead of CCM mode because CWC mode offers additional advantages, even though in many environments those advantages are minor. However, if you wish to use CCM mode, we recommend that you grab an off-the-shelf implementation of it because the mode is somewhat complex in comparison to standard modes. As of this writing, there are three free, publicly available implementations of CCM mode:

§ The reference implementation: http://hifn.com/support/ccm.htm

§ The implementation from Secure Software: http://www.securesoftware.com/ccm.php

§ The implementation from Brian Gladman: http://fp.gladman.plus.com/AES/ccm.zip

See Also

§ CCM reference implementation: http://hifn.com/support/ccm.htm

§ CCM implementation from Secure Software: http://www.securesoftware.com/ccm.php

§ CCM implementation from Brian Gladman: http://fp.gladman.plus.com/AES/ccm.zip

§ CWC home page: http://www.zork.org/cwc/

§ OCB home page: http://www.cs.ucdavis.edu/~rogaway/ocb/

§ Recipe 4.9, Recipe 4.11, Recipe 5.5-Recipe 5.10, Recipe 5.14, Recipe 5.16


[9] Integrity-aware modes will necessarily be slower than raw encryption modes, but CWC and OCB are faster than combining an integrity primitive with a standard mode, and CCM is just as fast as doing so.

[10] At least one other patent also needs to be licensed to use this mode legally.

5.5. Using a Raw Block Cipher

Problem

You're trying to make one of our implementations for other block cipher modes work. They all use raw encryption operations as a foundation, and you would like to understand how to plug in third-party implementations.

Solution

Raw operations on block ciphers consist of three operations: key setup, encryption of a block, and decryption of a block. In other recipes, we provide three macros that you need to implement to use our code. In the discussion for this recipe, we'll look at several desirable bindings for these macros.

Discussion

WARNING

Do not use raw encryption operations in your own designs! Such operations should only be used as a fundamental building block by skilled cryptographers.

Raw block ciphers operate on fixed-size chunks of data. That size is called the block size. The input and output are of this same fixed length. A block cipher also requires a key, which may be of a different length than the block size. Sometimes an algorithm will allow variable-length keys, but the block size is generally fixed.

Setting up a block cipher generally involves turning the raw key into a key schedule. Basically, the key schedule is just a set of keys derived from the original key in a cipher-dependent manner. You need to create the key schedule only once; it's good for every use of the underlying key because raw encryption always gives the same result for any {key, input} pair (the same is true for decryption).

Once you have a key schedule, you can generally pass it, along with an input block, into the cipher encryption function (or the decryption function) to get an output block.

To keep the example code as simple as possible, we've written it assuming you are going to want to use one and only one cipher with it (though it's not so difficult to make the code work with multiple ciphers).

To get the code in this book working, you need to define several macros:

SPC_BLOCK_SZ

Denotes the block size of the cipher in bytes.

SPC_KEY_SCHED

This macro must be an alias for the key schedule type that goes along with your cipher. This value will be library-specific and can be implemented by typedef instead of through a macro. Note that the key schedule type should be an array of bytes of some fixed size, so that we can ask for the size of the key schedule using sizeof(SPC_KEY_SCHED).

SPC_ENCRYPT_INIT(sched, key, keybytes) and SPC_DECRYPT_INIT(sched, key, keybytes)

Both of these macros take a pointer to a key schedule to write into, the key used to derive that schedule, and the number of bytes in that key. If you are using an algorithm with fixed-size keys, you can ignore the third parameter. Note that once you've built a key schedule, you shouldn't be able to tell the difference between different key lengths. In many implementations, initializing for encryption and initializing for decryption are the same operation.

SPC_DO_ENCRYPT(sched, in, out) and SPC_DO_DECRYPT(sched, in, out)

Both of these macros are expected to take a pointer to a key schedule and two pointers to memory corresponding to the input block and the output block. Both blocks are expected to be of size SPC_BLOCK_SZ.
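To make the shape of these macros concrete, here is a sketch of a binding to a deliberately trivial toy cipher. It is useful only for exercising code that depends on the macros and provides no security whatsoever. The toy_set_key, toy_encrypt_block, and toy_decrypt_block functions are hypothetical names invented for this illustration.

```c
#include <stddef.h>

/* TOY binding for testing the macro interface only. NOT a real cipher. */
#define SPC_BLOCK_SZ 16

typedef struct {
  unsigned char k[SPC_BLOCK_SZ];  /* "key schedule": the key cycled to 16 bytes */
} SPC_KEY_SCHED;

void toy_set_key(SPC_KEY_SCHED *sched, const unsigned char *key,
                 size_t keybytes) {
  size_t i;
  for (i = 0; i < SPC_BLOCK_SZ; i++)
    sched->k[i] = keybytes ? key[i % keybytes] : 0;
}

/* Works byte by byte, so in and out may point to the same block. */
void toy_encrypt_block(const SPC_KEY_SCHED *sched, const unsigned char *in,
                       unsigned char *out) {
  size_t i;
  for (i = 0; i < SPC_BLOCK_SZ; i++)
    out[i] = (unsigned char)((in[i] ^ sched->k[i]) + 1);
}

void toy_decrypt_block(const SPC_KEY_SCHED *sched, const unsigned char *in,
                       unsigned char *out) {
  size_t i;
  for (i = 0; i < SPC_BLOCK_SZ; i++)
    out[i] = (unsigned char)((unsigned char)(in[i] - 1) ^ sched->k[i]);
}

/* Key setup is the same in both directions for this toy. */
#define SPC_ENCRYPT_INIT(sched, key, keybytes) toy_set_key(sched, key, keybytes)
#define SPC_DECRYPT_INIT(sched, key, keybytes) toy_set_key(sched, key, keybytes)
#define SPC_DO_ENCRYPT(sched, in, out) toy_encrypt_block(sched, in, out)
#define SPC_DO_DECRYPT(sched, in, out) toy_decrypt_block(sched, in, out)
```

Note that the block operations tolerate in and out pointing to the same memory. The CBC code in Recipe 5.6 encrypts its chaining buffer in place, so any binding you supply needs the same property.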

In the following sections, we'll provide some bindings for these macros for Brian Gladman's AES implementation and for the OpenSSL API. Unfortunately, we cannot use Microsoft's CryptoAPI because it does not allow for exchanging symmetric encryption keys without encrypting them (see Recipe 5.26 and Recipe 5.27 to see how to work around this limitation)—and that would add significant complexity to what we're trying to achieve with this recipe. In addition, AES is only available in the .NET framework, which severely limits portability across various Windows versions. (The .NET framework is available only for Windows XP and Windows .NET Server 2003.)

Brian Gladman's AES implementation

Brian Gladman has written the fastest freely available AES implementation to date. He has a version in x86 assembly that works with Windows and a portable C version that is faster than the assembly versions other people offer. It's available from his web page at http://fp.gladman.plus.com/AES/.

To bind his implementation to our macros, do the following:

#include "aes.h"

#define SPC_BLOCK_SZ 16

typedef aes_ctx SPC_KEY_SCHED;

#define SPC_ENCRYPT_INIT(sched, key, keybytes) aes_enc_key(key, keybytes, sched)

#define SPC_DECRYPT_INIT(sched, key, keybytes) aes_dec_key(key, keybytes, sched)

#define SPC_DO_ENCRYPT(sched, in, out) aes_enc_block(in, out, sched)

#define SPC_DO_DECRYPT(sched, in, out) aes_dec_block(in, out, sched)

OpenSSL block cipher implementations

Next, we'll provide implementations for these macros for all of the ciphers in OpenSSL 0.9.7. Note that the block size for all of the algorithms listed in this section is 8 bytes, except for AES, which is 16.

Table 5-2 lists the block ciphers that OpenSSL exports, along with the header file you need to include for each cipher and the associated type for the key schedule.

Table 5-2. Block ciphers supported by OpenSSL

| Cipher | Header file | Key schedule type |
|---|---|---|
| AES | openssl/aes.h | AES_KEY |
| Blowfish | openssl/blowfish.h | BF_KEY |
| CAST5 | openssl/cast.h | CAST_KEY |
| DES | openssl/des.h | DES_key_schedule |
| 3-key Triple-DES | openssl/des.h | DES_EDE_KEY |
| 2-key Triple-DES | openssl/des.h | DES_EDE_KEY |
| IDEA | openssl/idea.h | IDEA_KEY_SCHEDULE |
| RC2 | openssl/rc2.h | RC2_KEY |
| RC5 | openssl/rc5.h | RC5_32_KEY |

Table 5-3 provides implementations of the SPC_ENCRYPT_INIT macro for each of the block ciphers listed in Table 5-2.

Table 5-3. Implementations for the SPC_ENCRYPT_INIT macro for each OpenSSL-supported block cipher

| Cipher | OpenSSL-based SPC_ENCRYPT_INIT implementation |
|---|---|
| AES | AES_set_encrypt_key(key, keybytes * 8, sched) |
| Blowfish | BF_set_key(sched, keybytes, key) |
| CAST5 | CAST_set_key(sched, keybytes, key) |
| DES | DES_set_key_unchecked((DES_cblock *)key, sched) |
| 3-key Triple-DES | DES_set_key_unchecked((DES_cblock *)key, &sched->ks1); DES_set_key_unchecked((DES_cblock *)(key + 8), &sched->ks2); DES_set_key_unchecked((DES_cblock *)(key + 16), &sched->ks3); |
| 2-key Triple-DES | DES_set_key_unchecked((DES_cblock *)key, &sched->ks1); DES_set_key_unchecked((DES_cblock *)(key + 8), &sched->ks2); |
| IDEA | idea_set_encrypt_key(key, sched); |
| RC2 | RC2_set_key(sched, keybytes, key, keybytes * 8); |
| RC5 | RC5_32_set_key(sched, keybytes, key, 12); |

In most of the implementations in Table 5-3, SPC_DECRYPT_INIT will be the same as SPC_ENCRYPT_INIT (you can define one to the other). The two exceptions are AES and IDEA. For AES:

#define SPC_DECRYPT_INIT(sched, key, keybytes) \

AES_set_decrypt_key(key, keybytes * 8, sched)

For IDEA:

#define SPC_DECRYPT_INIT(sched, key, keybytes) { \

IDEA_KEY_SCHEDULE tmp;\

idea_set_encrypt_key(key, &tmp);\

idea_set_decrypt_key(&tmp, sched);\

}

Table 5-4 and Table 5-5 provide implementations of the SPC_DO_ENCRYPT and SPC_DO_DECRYPT macros.

Table 5-4. Implementations for the SPC_DO_ENCRYPT macro for each OpenSSL-supported block cipher

| Cipher | OpenSSL-based SPC_DO_ENCRYPT implementation |
|---|---|
| AES | AES_encrypt(in, out, sched) |
| Blowfish | BF_ecb_encrypt(in, out, sched, 1) |
| CAST5 | CAST_ecb_encrypt(in, out, sched, 1) |
| DES | DES_ecb_encrypt(in, out, sched, 1) |
| 3-key Triple-DES | DES_ecb3_encrypt((DES_cblock *)in, (DES_cblock *)out, &sched->ks1, &sched->ks2, &sched->ks3, 1); |
| 2-key Triple-DES | DES_ecb3_encrypt((DES_cblock *)in, (DES_cblock *)out, &sched->ks1, &sched->ks2, &sched->ks1, 1); |
| IDEA | idea_ecb_encrypt(in, out, sched); |
| RC2 | RC2_ecb_encrypt(in, out, sched, 1); |
| RC5 | RC5_32_ecb_encrypt(in, out, sched, 1); |

Table 5-5. Implementations for the SPC_DO_DECRYPT macro for each OpenSSL-supported block cipher

| Cipher | OpenSSL-based SPC_DO_DECRYPT implementation |
|---|---|
| AES | AES_decrypt(in, out, sched) |
| Blowfish | BF_ecb_encrypt(in, out, sched, 0) |
| CAST5 | CAST_ecb_encrypt(in, out, sched, 0) |
| DES | DES_ecb_encrypt(in, out, sched, 0) |
| 3-key Triple-DES | DES_ecb3_encrypt((DES_cblock *)in, (DES_cblock *)out, &sched->ks1, &sched->ks2, &sched->ks3, 0); |
| 2-key Triple-DES | DES_ecb3_encrypt((DES_cblock *)in, (DES_cblock *)out, &sched->ks1, &sched->ks2, &sched->ks1, 0); |
| IDEA | idea_ecb_encrypt(in, out, sched); |
| RC2 | RC2_ecb_encrypt(in, out, sched, 0); |
| RC5 | RC5_32_ecb_encrypt(in, out, sched, 0); |

See Also

§ Brian Gladman's AES page: http://fp.gladman.plus.com/AES/

§ OpenSSL home page: http://www.openssl.org/

§ Recipe 5.4, Recipe 5.26, Recipe 5.27.

5.6. Using a Generic CBC Mode Implementation

Problem

You want a more high-level interface for CBC mode than your library provides. Alternatively, you want a portable CBC interface, or you have only a block cipher implementation and you would like to use CBC mode.

Solution

CBC mode XORs each plaintext block with the previous output block before encrypting. The first block is XOR'd with the IV. Many libraries provide a CBC implementation. If you need code that implements CBC mode, you will find it in the following discussion.

Discussion

WARNING

You should probably use a higher-level abstraction, such as the one discussed in Recipe 5.16. Use a raw mode only when absolutely necessary, because there is a huge potential for introducing a security vulnerability by accident. If you still want to use CBC, be sure to use a message authentication code with it (see Chapter 6).

CBC mode is a way to use a raw block cipher and, if used properly, it avoids all the security risks associated with using the block cipher directly. CBC mode works on a message in blocks, where blocks are a unit of data on which the underlying cipher operates. For example, AES uses 128-bit blocks, whereas older ciphers such as DES almost universally use 64-bit blocks.

See Recipe 5.4 for a discussion of the advantages and disadvantages of this mode, as well as a comparison to other cipher modes.

CBC mode works (as illustrated in Figure 5-1) by taking the ciphertext output for the previous block, XOR'ing that with the plaintext for the current block, and encrypting the result with the raw block cipher. The very first block of plaintext gets XOR'd with an initialization vector, which needs to be randomly selected to ensure meeting security goals but which may be publicly known.

WARNING

Many people use sequential IVs or even fixed IVs, but that is not at all recommended. For example, SSL has had security problems in the past when using CBC without random IVs. Also note that if there are common initial strings, CBC mode can remain susceptible to dictionary attacks if no IV or similar mechanism is used. As with ECB, padding is required unless messages are always block-aligned.

Figure 5-1. CBC mode
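The chaining just described can be sketched in a few lines for messages that are already a whole number of blocks. This is an illustration of the data flow only, not the optimized implementation presented later in this recipe. The names BLK, toy_enc, toy_dec, cbc_encrypt_blocks, and cbc_decrypt_blocks are hypothetical, and the toy cipher is an insecure placeholder for a real one.

```c
#include <stddef.h>
#include <string.h>

#define BLK 8  /* toy block size in bytes (assumed) */

/* Insecure stand-ins for a real block cipher, so the sketch runs. */
void toy_enc(const unsigned char *key, const unsigned char *in,
             unsigned char *out) {
  int i;
  for (i = 0; i < BLK; i++) out[i] = (unsigned char)((in[i] ^ key[i]) + 1);
}

void toy_dec(const unsigned char *key, const unsigned char *in,
             unsigned char *out) {
  int i;
  for (i = 0; i < BLK; i++)
    out[i] = (unsigned char)((unsigned char)(in[i] - 1) ^ key[i]);
}

/* C[b] = E(P[b] XOR C[b-1]), where C[-1] is the IV. */
void cbc_encrypt_blocks(const unsigned char *key, const unsigned char *iv,
                        const unsigned char *in, unsigned char *out, size_t n) {
  unsigned char chain[BLK];
  size_t b;
  int i;

  memcpy(chain, iv, BLK);
  for (b = 0; b < n; b++) {
    for (i = 0; i < BLK; i++) chain[i] ^= in[b * BLK + i];
    toy_enc(key, chain, chain);          /* chain now holds ciphertext block b */
    memcpy(out + b * BLK, chain, BLK);
  }
}

/* P[b] = D(C[b]) XOR C[b-1], where C[-1] is the IV. */
void cbc_decrypt_blocks(const unsigned char *key, const unsigned char *iv,
                        const unsigned char *in, unsigned char *out, size_t n) {
  unsigned char prev[BLK], cur[BLK];
  size_t b;
  int i;

  memcpy(prev, iv, BLK);
  for (b = 0; b < n; b++) {
    memcpy(cur, in + b * BLK, BLK);      /* keep a copy in case in == out */
    toy_dec(key, cur, out + b * BLK);
    for (i = 0; i < BLK; i++) out[b * BLK + i] ^= prev[i];
    memcpy(prev, cur, BLK);
  }
}
```

Because each ciphertext block feeds into the next, two identical plaintext blocks in one message encrypt to different ciphertext blocks, and changing the IV changes the ciphertext from the first block onward.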

Many libraries already come with an implementation of CBC mode for any ciphers they support. Some don't, however. For example, you may only get an implementation of the raw block cipher when you obtain reference code for a new cipher.

Generally, CBC mode requires padding. Because the cipher operates on block-sized quantities, it needs to have a way of handling messages that do not break up evenly into block-sized parts. This is done by adding padding to each message, as described in Recipe 5.11. Padding always adds to the length of a message. If you wish to avoid message expansion, you have a couple of options. You can ensure that your messages always have a length that is a multiple of the block size; in that case, you can simply turn off padding. Otherwise, you have to use a different mode. See Recipe 5.4 for our mode recommendations. If you're really a fan of CBC mode, you can support arbitrary-length messages without message expansion using a modified version of CBC mode known as ciphertext stealing or CTS mode. We do not discuss CTS mode in the book, but there is a recipe about it on this book's web site.
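As a sketch of the padding scheme that the code in this recipe applies, where every pad byte holds the number of pad bytes and a block-aligned message gains one full block of padding, consider the following helpers. The names pad_message, unpad_message, and BLOCK_SZ are hypothetical and are not part of the API presented in this recipe.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK_SZ 16  /* block size in bytes (assumed) */

/* Copies msg into out and appends padding. out must have room for the
 * padded length, which is returned.
 */
size_t pad_message(const unsigned char *msg, size_t len, unsigned char *out) {
  size_t pad = BLOCK_SZ - (len % BLOCK_SZ);  /* 1..BLOCK_SZ, never 0 */

  memcpy(out, msg, len);
  memset(out + len, (int)pad, pad);
  return len + pad;
}

/* Checks the padding at the end of buf. Returns 1 and stores the unpadded
 * length in *len on success, or returns 0 if the padding is malformed.
 */
int unpad_message(const unsigned char *buf, size_t padded_len, size_t *len) {
  size_t pad, i;

  if (padded_len == 0 || padded_len % BLOCK_SZ) return 0;
  pad = buf[padded_len - 1];
  if (pad == 0 || pad > BLOCK_SZ) return 0;
  for (i = 1; i <= pad; i++)
    if (buf[padded_len - i] != pad) return 0;
  *len = padded_len - pad;
  return 1;
}
```

A 5-byte message therefore grows to 16 bytes, and a 16-byte message grows to 32. A padding check like this one should run only on data whose integrity has already been verified, which is one more reason to pair CBC with a message authentication code.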

Here, we present a reasonably optimized implementation of CBC mode that builds upon the raw block cipher interface presented in Recipe 5.5. It also requires the spc_memset( ) function from Recipe 13.2.

The high-level API

This implementation has two APIs. The first API is the high-level API, which takes a message as input and returns a dynamically allocated result. This API only deals with padded messages. If you want to turn off cipher padding, you will need to use the incremental interface.

unsigned char *spc_cbc_encrypt(unsigned char *key, size_t kl, unsigned char *iv,

unsigned char *in, size_t il, size_t *ol);

unsigned char *spc_cbc_decrypt(unsigned char *key, size_t kl, unsigned char *iv,

unsigned char *in, size_t il, size_t *ol);

Both functions pass out the number of bytes in the result by writing to the memory pointed to by the final argument. If decryption fails for some reason, spc_cbc_decrypt( ) will return 0. Such an error means that the input was not a multiple of the block size, or that the padding was wrong.

TIP

These two functions erase the key from memory before exiting. You may want to have them erase the plaintext as well.

Here's the implementation of the above interface:

#include <stdlib.h>

#include <string.h>

unsigned char *spc_cbc_encrypt(unsigned char *key, size_t kl, unsigned char *iv,

unsigned char *in, size_t il, size_t *ol) {

SPC_CBC_CTX ctx;

size_t tmp;

unsigned char *result;

if (!(result = (unsigned char *)malloc(((il / SPC_BLOCK_SZ) * SPC_BLOCK_SZ) +

SPC_BLOCK_SZ))) return 0;

spc_cbc_encrypt_init(&ctx, key, kl, iv);

spc_cbc_encrypt_update(&ctx, in, il, result, &tmp);

spc_cbc_encrypt_final(&ctx, result+tmp, ol);

*ol += tmp;

return result;

}

unsigned char *spc_cbc_decrypt(unsigned char *key, size_t kl, unsigned char *iv,

unsigned char *in, size_t il, size_t *ol) {

int success;

size_t tmp;

SPC_CBC_CTX ctx;

unsigned char *result;

if (!(result = (unsigned char *)malloc(il))) return 0;

spc_cbc_decrypt_init(&ctx, key, kl, iv);

spc_cbc_decrypt_update(&ctx, in, il, result, &tmp);

if (!(success = spc_cbc_decrypt_final(&ctx, result+tmp, ol))) {

*ol = 0;

spc_memset(result, 0, il);

free(result);

return 0;

}

*ol += tmp;

result = (unsigned char *)realloc(result, *ol);

return result;

}

Note that this code depends on the SPC_CBC_CTX data type, as well as the incremental CBC interface, neither of which we have yet discussed.

SPC_CBC_CTX data type

Let's look at the SPC_CBC_CTX data type. It's defined as:

typedef struct {

SPC_KEY_SCHED ks;

int ix;

int pad;

unsigned char iv[SPC_BLOCK_SZ];

unsigned char ctbuf[SPC_BLOCK_SZ];

} SPC_CBC_CTX;

The ks field is an expanded version of the cipher key. The ix field is basically used to determine how much data is needed before we have processed data that is a multiple of the block length. The pad field specifies whether the API needs to add padding or should expect messages to be exactly block-aligned. The iv field is used to store the initialization vector for the next block of encryption. The ctbuf field is only used in decryption to cache ciphertext until we have enough to fill a block.

Incremental initialization

To begin encrypting or decrypting, we need to initialize the mode. Initialization is different for each mode. Here are the functions for initializing an SPC_CBC_CTX object:

void spc_cbc_encrypt_init(SPC_CBC_CTX *ctx, unsigned char *key, size_t kl,

unsigned char *iv) {

SPC_ENCRYPT_INIT(&(ctx->ks), key, kl);

spc_memset(key, 0, kl);

memcpy(ctx->iv, iv, SPC_BLOCK_SZ);

ctx->ix = 0;

ctx->pad = 1;

}

void spc_cbc_decrypt_init(SPC_CBC_CTX *ctx, unsigned char *key, size_t kl,

unsigned char *iv) {

SPC_DECRYPT_INIT(&(ctx->ks), key, kl);

spc_memset(key, 0, kl);

memcpy(ctx->iv, iv, SPC_BLOCK_SZ);

ctx->ix = 0;

ctx->pad = 1;

}

These functions are identical, except that they call the appropriate method for keying, which may be different depending on whether we're encrypting or decrypting. Both of these functions erase the key that you pass in!

Note that the initialization vector (IV) must be selected randomly. You should also avoid encrypting more than about 2^40 blocks of data using a single key. See Recipe 4.9 for more on initialization vectors.
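One way to obtain a random IV on a Unix-like system is to read it from the operating system's random device. The following sketch assumes that /dev/urandom exists and is a suitable source on your platform. The function name random_iv is hypothetical, and the block size is assumed to be 16 bytes.

```c
#include <stdio.h>
#include <stddef.h>

#define SPC_BLOCK_SZ 16  /* block size in bytes (assumed) */

/* Fills iv with SPC_BLOCK_SZ random bytes read from /dev/urandom.
 * Returns 1 on success, 0 if the device could not be opened or read.
 */
int random_iv(unsigned char iv[SPC_BLOCK_SZ]) {
  FILE *f = fopen("/dev/urandom", "rb");
  size_t got;

  if (!f) return 0;
  setvbuf(f, NULL, _IONBF, 0);  /* do not read ahead more than we need */
  got = fread(iv, 1, SPC_BLOCK_SZ, f);
  fclose(f);
  return got == SPC_BLOCK_SZ;
}
```

The IV does not need to be kept secret, so it can be sent along with the ciphertext, but a fresh one must be generated for every message.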

Now we can add data as we get it using the spc_cbc_encrypt_update( ) and spc_cbc_decrypt_update( ) functions. These functions are particularly useful when a message comes in pieces. You'll get the same results as if the message had come in all at once. When you wish to finish encrypting or decrypting, you call spc_cbc_encrypt_final( ) or spc_cbc_decrypt_final( ), as appropriate.

TIP

You're responsible for making sure the proper init, update, and final calls are made, and that they do not happen out of order.

Incremental encrypting

The function spc_cbc_encrypt_update( ) has the following signature:

int spc_cbc_encrypt_update(SPC_CBC_CTX *ctx, unsigned char *in, size_t il,

unsigned char *out, size_t *ol);

This function has the following arguments:

ctx

Pointer to the SPC_CBC_CTX object associated with the current message.

in

Pointer to the plaintext data to be encrypted.

il

Number indicating how many bytes of plaintext are to be encrypted.

out

Pointer to a buffer where any incremental ciphertext output should be written.

ol

Pointer into which the number of ciphertext bytes written to the output buffer is placed. This argument may be NULL, in which case the caller is already expected to know the length of the output.

WARNING

Our implementation of this function always returns 1, but a hardware-based implementation might have an unexpected failure, so it's important to check the return value!

This API is in the spirit of PKCS #11,[11] which provides a standard cryptographic interface to hardware. We do this so that the above functions can have the bulk of their implementations replaced with calls to PKCS #11-compliant hardware. Generally, PKCS #11 reverses the order of input and output argument sets. Also, it does not securely wipe key material.

WARNING

Because this API is PKCS #11-compliant, it's somewhat more low-level than it needs to be and therefore is a bit difficult to use properly. First, you need to be sure that the output buffer is big enough to hold the input; otherwise, you will have a buffer overflow. Second, you need to make sure the out argument always points to the first unused byte in the output buffer; otherwise, you will keep overwriting the same data every time spc_cbc_encrypt_update( ) outputs data.

If you are using padding and you know the length of the input message in advance, you can calculate the output length easily. If the message is of a length that is an exact multiple of the block size, the output message will be a block larger. Otherwise, the message will get as many bytes added to it as necessary to make the input length a multiple of the block size. Using integer math, we can calculate the output length as follows, where il is the input length:

((il / SPC_BLOCK_SZ) * SPC_BLOCK_SZ) + SPC_BLOCK_SZ

If we do not have the entire message at once, when using padding the easiest thing to do is to assume there may be an additional block of output. That is, if you pass in 7 bytes, allocating 7 + SPC_BLOCK_SZ is safe. If you wish to be a bit more precise, you can always add SPC_BLOCK_SZ bytes to the input length, then reduce the number to the next block-aligned size. For example, if we have an 8-byte block, and we call spc_cbc_encrypt_update( ) with 7 bytes, there is no way to get more than 8 bytes of output, no matter how much data was buffered internally. Note that if no data was buffered internally, we won't get any output!

Of course, you can exactly determine the amount of data to pass in if you are keeping track of how many bytes are buffered at any given time (which you can do by looking at ctx->ix). If you do that, add the buffered length to your input length. The amount of output is always the largest block-aligned value less than or equal to this total length.
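The two length rules above can be written as small helpers. This is only a sketch: the function names are hypothetical, the block size is assumed to be 16 bytes, and the buffered argument stands for the value you would read from ctx->ix.

```c
#include <stddef.h>

#define SPC_BLOCK_SZ 16  /* block size in bytes (assumed) */

/* Total ciphertext length for a complete padded message of il bytes. */
size_t cbc_padded_len(size_t il) {
  return ((il / SPC_BLOCK_SZ) * SPC_BLOCK_SZ) + SPC_BLOCK_SZ;
}

/* Number of bytes a single encrypt-update call writes, given how many
 * bytes are already buffered in the context and how many are passed in.
 */
size_t cbc_encrypt_update_len(size_t buffered, size_t il) {
  return ((buffered + il) / SPC_BLOCK_SZ) * SPC_BLOCK_SZ;
}
```

For example, with 9 bytes buffered and 7 bytes of new input, exactly one block is written; with 5 bytes buffered and 7 bytes of new input, nothing is written until a later call or the final call.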

If you're not using padding, you will get a block of output for every block of input. To switch off padding, you can call the following function, passing in 0 for the second argument:

void spc_cbc_set_padding(SPC_CBC_CTX *ctx, int pad) {

ctx->pad = pad;

}

Here's our implementation of spc_cbc_encrypt_update( ):

int spc_cbc_encrypt_update(SPC_CBC_CTX *ctx, unsigned char *in, size_t il,

unsigned char *out, size_t *ol) {

/* Keep a ptr to the start of out, which we advance; we calculate ol by subtraction later. */

int i;

unsigned char *start = out;

/* If we have leftovers, but not enough to fill a block, XOR them into the right

* places in the IV slot and return. It's not much stuff, so one byte at a time

* is fine.

*/

if (il < SPC_BLOCK_SZ-ctx->ix) {

while (il--) ctx->iv[ctx->ix++] ^= *in++;

if (ol) *ol = 0;

return 1;

}

/* If we did have leftovers, and we're here, fill up a block then output the

* ciphertext.

*/

if (ctx->ix) {

while (ctx->ix < SPC_BLOCK_SZ) --il, ctx->iv[ctx->ix++] ^= *in++;

SPC_DO_ENCRYPT(&(ctx->ks), ctx->iv, ctx->iv);

for (i = 0; i < SPC_BLOCK_SZ / sizeof(int); i++)

((unsigned int *)out)[i] = ((unsigned int *)(ctx->iv))[i];

out += SPC_BLOCK_SZ;

}

/* Operate on word-sized chunks, because it's easy to do so. You might gain a

* couple of cycles per loop by unrolling and getting rid of i if you know your

* word size a priori.

*/

while (il >= SPC_BLOCK_SZ) {

for (i = 0; i < SPC_BLOCK_SZ / sizeof(int); i++)

((unsigned int *)(ctx->iv))[i] ^= ((unsigned int *)in)[i];

SPC_DO_ENCRYPT(&(ctx->ks), ctx->iv, ctx->iv);

for (i = 0; i < SPC_BLOCK_SZ / sizeof(int); i++)

((unsigned int *)out)[i] = ((unsigned int *)(ctx->iv))[i];

out += SPC_BLOCK_SZ;

in += SPC_BLOCK_SZ;

il -= SPC_BLOCK_SZ;

}

/* Deal with leftovers... one byte at a time is fine. */

for (i = 0; i < il; i++) ctx->iv[i] ^= in[i];

ctx->ix = il;

if (ol) *ol = out-start;

return 1;

}

The following spc_cbc_encrypt_final( ) function outputs any remaining data and securely wipes the key material in the context, along with all the intermediate state. If padding is on, it will output one block. If padding is off, it won't output anything. If padding is off and the total length of the input wasn't a multiple of the block size, spc_cbc_encrypt_final( ) will return 0. Otherwise, it will always succeed.

int spc_cbc_encrypt_final(SPC_CBC_CTX *ctx, unsigned char *out, size_t *ol) {

int ret;

unsigned char pad;

if (ctx->pad) {

pad = SPC_BLOCK_SZ - ctx->ix;

while (ctx->ix < SPC_BLOCK_SZ) ctx->iv[ctx->ix++] ^= pad;

SPC_DO_ENCRYPT(&(ctx->ks), ctx->iv, out);

spc_memset(ctx, 0, sizeof(SPC_CBC_CTX));

if(ol) *ol = SPC_BLOCK_SZ;

return 1;

}

if(ol) *ol = 0;

ret = !(ctx->ix);

spc_memset(ctx, 0, sizeof(SPC_CBC_CTX));

return ret;

}

This function has the following arguments:

ctx

Pointer to the SPC_CBC_CTX object being used for the current message.

out

Pointer to the output buffer, if any. It may be NULL when padding is disabled.

ol

The number of output bytes written to the output buffer is placed into this pointer. This argument may be NULL, in which case the output length is not written.

Incremental decryption

The CBC decryption API is largely similar to the encryption API, with one major exception. When encrypting, we can output a block of data every time we take in a block of data. When decrypting, that's not possible. We can decrypt data, but until we know that a block isn't the final block, we can't output it because part of the block may be padding. Of course, with padding turned off, that restriction could go away, but our API acts the same with padding off, just to ensure consistent behavior.
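Because the last full block is always held back, the amount of plaintext a single decrypt-update call writes can be predicted from the number of ciphertext bytes already buffered and the number passed in. The helper below is a sketch derived from reading the decryption listing in this recipe. Its name is hypothetical, the block size is assumed to be 16 bytes, and buffered stands for the value of ctx->ix.

```c
#include <stddef.h>

#define SPC_BLOCK_SZ 16  /* block size in bytes (assumed) */

/* Bytes written by one decrypt-update call. All complete blocks are
 * released except the most recent one, which stays buffered because it
 * may turn out to be the final, padded block.
 */
size_t cbc_decrypt_update_len(size_t buffered, size_t il) {
  size_t total = buffered + il;

  if (total == 0) return 0;
  return ((total - 1) / SPC_BLOCK_SZ) * SPC_BLOCK_SZ;
}
```

With a full block buffered, passing in a single additional byte releases that block, which is how a call can write more bytes than it takes in.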

The spc_cbc_decrypt_update( ) function, shown later in this section, has the following signature:

int spc_cbc_decrypt_update(SPC_CBC_CTX *ctx, unsigned char *in, size_t il,

unsigned char *out, size_t *ol);

This function has the following arguments:

ctx

Pointer to the SPC_CBC_CTX object being used for the current message.

in

Pointer to the ciphertext input buffer.

il

Number of bytes contained in the ciphertext input buffer.

out

Pointer to a buffer where any incremental plaintext output should be written.

ol

Pointer into which the number of output bytes written to the output buffer is placed. This argument may be NULL, in which case the output length is not written.

This function can output up to SPC_BLOCK_SZ - 1 bytes more than is input, depending on how much data has previously been buffered.

int spc_cbc_decrypt_update(SPC_CBC_CTX *ctx, unsigned char *in, size_t il,

unsigned char *out, size_t *ol) {

int i;

unsigned char *next_iv, *start = out;

/* If there's not enough stuff to fit in ctbuf, dump it in there and return */

if (il < SPC_BLOCK_SZ - ctx->ix) {

while (il--) ctx->ctbuf[ctx->ix++] = *in++;

if (ol) *ol = 0;

return 1;

}

/* If there's stuff in ctbuf, fill it. */

if (ctx->ix % SPC_BLOCK_SZ) {

while (ctx->ix < SPC_BLOCK_SZ) {

ctx->ctbuf[ctx->ix++] = *in++;

--il;

}

}

if (!il) {

if (ol) *ol = 0;

return 1;

}

/* If we get here, and the ctbuf is full, it can't be padding. Spill it. */

if (ctx->ix) {

SPC_DO_DECRYPT(&(ctx->ks), ctx->ctbuf, out);

for (i = 0; i < SPC_BLOCK_SZ / sizeof(int); i++) {

((int *)out)[i] ^= ((int *)ctx->iv)[i];

((int *)ctx->iv)[i] = ((int *)ctx->ctbuf)[i];

}

out += SPC_BLOCK_SZ;

}

if (il > SPC_BLOCK_SZ) {

SPC_DO_DECRYPT(&(ctx->ks), in, out);

for (i = 0; i < SPC_BLOCK_SZ / sizeof(int); i++)

((int *)out)[i] ^= ((int *)ctx->iv)[i];

next_iv = in;

out += SPC_BLOCK_SZ;

in += SPC_BLOCK_SZ;

il -= SPC_BLOCK_SZ;

} else next_iv = ctx->iv;

while (il > SPC_BLOCK_SZ) {

SPC_DO_DECRYPT(&(ctx->ks), in, out);

for (i = 0; i < SPC_BLOCK_SZ / sizeof(int); i++)

((int *)out)[i] ^= ((int *)next_iv)[i];

next_iv = in;

out += SPC_BLOCK_SZ;

in += SPC_BLOCK_SZ;

il -= SPC_BLOCK_SZ;

}

/* Store the IV. */

for (i = 0; i < SPC_BLOCK_SZ / sizeof(int); i++)

((int *)ctx->iv)[i] = ((int *)next_iv)[i];

ctx->ix = 0;

while (il--) ctx->ctbuf[ctx->ix++] = *in++;

if (ol) *ol = out - start;

return 1;

}

Finalizing CBC-mode decryption is done with spc_cbc_decrypt_final( ), whose listing follows. This function will return 1 if there are no problems or 0 if the total input length is not a multiple of the block size or if padding is on and the padding is incorrect.

If the call is successful and padding is on, the function will write into the output buffer anywhere from 0 to SPC_BLOCK_SZ bytes. If padding is off, a successful call will write SPC_BLOCK_SZ bytes into the output buffer, or nothing at all if there was no input.

As with spc_cbc_encrypt_final( ), this function will securely erase the contents of the context object before returning.

int spc_cbc_decrypt_final(SPC_CBC_CTX *ctx, unsigned char *out, size_t *ol) {

unsigned int i;

unsigned char pad;

if (ctx->ix != SPC_BLOCK_SZ) {

/* If there was no input, and there's no padding, then everything is OK. */

/* Record the result now; the fields are gone once the context is wiped. */

int ok = (!ctx->ix && !ctx->pad);

if (ol) *ol = 0;

spc_memset(&(ctx->ks), 0, sizeof(SPC_KEY_SCHED));

spc_memset(ctx, 0, sizeof(SPC_CBC_CTX));

return ok;

}

if (!ctx->pad) {

SPC_DO_DECRYPT(&(ctx->ks), ctx->ctbuf, out);

for (i = 0; i < SPC_BLOCK_SZ / sizeof(int); i++)

((int *)out)[i] ^= ((int *)ctx->iv)[i];

if (ol) *ol = SPC_BLOCK_SZ;

spc_memset(ctx, 0, sizeof(SPC_CBC_CTX));

return 1;

}

SPC_DO_DECRYPT(&(ctx->ks), ctx->ctbuf, ctx->ctbuf);

spc_memset(&(ctx->ks), 0, sizeof(SPC_KEY_SCHED));

for (i = 0; i < SPC_BLOCK_SZ / sizeof(int); i++)

((int *)ctx->ctbuf)[i] ^= ((int *)ctx->iv)[i];

pad = ctx->ctbuf[SPC_BLOCK_SZ - 1];

if (pad > SPC_BLOCK_SZ) {

if (ol) *ol = 0;

spc_memset(ctx, 0, sizeof(SPC_CBC_CTX));

return 0;

}

for (i = 1; i < pad; i++) {

if (ctx->ctbuf[SPC_BLOCK_SZ - 1 - i] != pad) {

if (ol) *ol = 0;

spc_memset(ctx, 0, sizeof(SPC_CBC_CTX));

return 0;

}

}

for (i = 0; i < SPC_BLOCK_SZ - pad; i++)

*out++ = ctx->ctbuf[i];

if (ol) *ol = SPC_BLOCK_SZ - pad;

spc_memset(ctx, 0, sizeof(SPC_CBC_CTX));

return 1;

}
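The padding check at the end of spc_cbc_decrypt_final( ) can be studied in isolation. The following is a minimal sketch, not part of the recipe's API: the name pkcs_unpad_len( ) and the fixed 16-byte block size are assumptions made for illustration. The sketch also treats a pad byte of zero as malformed, on the assumption that PKCS-style padding always appends at least one byte.

```c
#include <stddef.h>

#define BLOCK_SZ 16  /* assumed block size; the recipe uses SPC_BLOCK_SZ */

/* Hypothetical helper: returns the number of plaintext bytes in a
 * decrypted final block, or -1 if the trailing padding is malformed. */
int pkcs_unpad_len(const unsigned char block[BLOCK_SZ]) {
  unsigned char pad = block[BLOCK_SZ - 1];
  size_t i;

  /* The pad value must be in the range 1..BLOCK_SZ. */
  if (pad == 0 || pad > BLOCK_SZ) return -1;

  /* Each of the last `pad` bytes must hold the value `pad`. */
  for (i = 1; i < pad; i++)
    if (block[BLOCK_SZ - 1 - i] != pad) return -1;

  return BLOCK_SZ - pad;
}
```

A block that ends in eleven bytes of value 11 yields 5 bytes of plaintext, and a block consisting of sixteen bytes of value 16 yields none. Because a padding failure is observable to the caller, it is worth verifying a message authentication code before acting on the result.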

See Also

§ PKCS #11 web page: http://www.rsasecurity.com/rsalabs/pkcs/pkcs-11/

§ Recipe 4.9, Recipe 5.4, Recipe 5.5, Recipe 5.11, Recipe 5.16, Recipe 13.2


[11] PKCS #11 is available from http://www.rsasecurity.com/rsalabs/pkcs/pkcs-11/.

5.7. Using a Generic CFB Mode Implementation

Problem

You want a more high-level interface for CFB mode than your library provides. Alternatively, you want a portable CFB interface, or you have only a block cipher implementation and would like to use CFB mode.

Solution

CFB mode generates keystream by encrypting a "state" buffer, which starts out being the nonce and changes after each output, based on the actual outputted value.

Many libraries provide a CFB implementation. If you need code that implements this mode, you will find it in the Discussion section that follows.

Discussion

WARNING

You should probably use a higher-level abstraction, such as the one discussed in Recipe 5.16. Use a raw mode only when absolutely necessary, because there is a huge potential for introducing a security vulnerability by accident. If you still want to use CFB, be sure to use a message authentication code with it (see Chapter 6).

CFB is a stream-based mode. Encryption occurs by XOR'ing the keystream bytes with the plaintext bytes, as shown in Figure 5-2. The keystream is generated one block at a time, and it is always dependent on the previous keystream block as well as the plaintext data XOR'd with the previous keystream block.

CFB does this by keeping a "state" buffer, which is initially the nonce. As a block's worth of data gets encrypted, the state buffer has some or all of its bits shifted out and ciphertext bits shifted in. The amount of data shifted in before each encryption operation is the "feedback size," which is often the block size of the cipher, meaning that the state buffer is always replaced by the ciphertext of the previous block. See Figure 5-2 for a graphical view of CFB mode.

Figure 5-2. CFB mode

The block size of the cipher is important to CFB mode because keystream is produced in block-sized chunks and therefore requires keeping track of block-sized portions of the ciphertext. CFB is fundamentally a streaming mode, however, because the plaintext is encrypted simply by XOR'ing with the CFB keystream.
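To make the feedback concrete, here is a small sketch of full-block-feedback CFB. Everything in it is an assumption made for illustration: toy_encrypt( ) is an insecure stand-in for a real block cipher, the 8-byte block size is arbitrary, and the function handles only whole blocks. It shows that decryption also runs the cipher in the encrypt direction, and that it is the ciphertext that is fed back into the state.

```c
#include <stddef.h>
#include <string.h>

#define TOY_BLOCK 8   /* toy block size, chosen only for the demo */

/* Stand-in for a block cipher's encrypt operation. NOT secure: it only
 * needs to be a keyed function so that the feedback is visible. */
static void toy_encrypt(const unsigned char key[TOY_BLOCK],
                        const unsigned char in[TOY_BLOCK],
                        unsigned char out[TOY_BLOCK]) {
  unsigned char tmp[TOY_BLOCK];
  size_t i;

  for (i = 0; i < TOY_BLOCK; i++)
    tmp[i] = (unsigned char)((in[(i + 1) % TOY_BLOCK] ^ key[i]) * 167u + 13u);
  memcpy(out, tmp, TOY_BLOCK);
}

/* Full-block-feedback CFB; len must be a multiple of TOY_BLOCK here.
 * Pass decrypt = 0 to encrypt and decrypt = 1 to decrypt. */
void toy_cfb(const unsigned char key[TOY_BLOCK],
             const unsigned char nonce[TOY_BLOCK],
             const unsigned char *in, unsigned char *out,
             size_t len, int decrypt) {
  unsigned char state[TOY_BLOCK], ks[TOY_BLOCK];
  size_t off, i;

  memcpy(state, nonce, TOY_BLOCK);        /* the state starts out as the nonce */
  for (off = 0; off < len; off += TOY_BLOCK) {
    toy_encrypt(key, state, ks);          /* keystream = E(state) */
    for (i = 0; i < TOY_BLOCK; i++) {
      /* The ciphertext byte is the input when decrypting, the output otherwise. */
      unsigned char c = decrypt ? in[off + i]
                                : (unsigned char)(in[off + i] ^ ks[i]);
      out[off + i] = (unsigned char)(in[off + i] ^ ks[i]);
      state[i] = c;                       /* ciphertext replaces the state */
    }
  }
}
```

Because each state block is the previous ciphertext block, changing a plaintext byte in one block changes the keystream of the next block, which is the property that distinguishes CFB from OFB and CTR.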

In Recipe 5.4, we discuss the advantages and drawbacks of CFB and compare it to other popular modes.

These days, CFB mode is rarely used because CTR and OFB modes (CTR mode in particular) provide more advantages, with no additional drawbacks. Of course, we recommend a higher-level mode over all of these, one that provides stronger security guarantees—for example, CWC or CCM mode.

Many libraries already come with an implementation of CFB mode for any ciphers they support. However, some don't. For example, you may only get an implementation of the raw block cipher when you obtain reference code for a new cipher.

In the following sections we present a reasonably optimized implementation of CFB mode that builds upon the raw block cipher interface presented in Recipe 5.5. It also requires the spc_memset( ) function from Recipe 13.2.

TIP

This implementation is only for the case where the feedback size is equal to the cipher block size. This is the most efficient mechanism and is no less secure than other feedback sizes, so we strongly recommend this approach.

The high-level API

This implementation has two APIs. The first is a high-level API, which takes a message as input and returns a dynamically allocated result.

unsigned char *spc_cfb_encrypt(unsigned char *key, size_t kl, unsigned char *nonce,

unsigned char *in, size_t il);

unsigned char *spc_cfb_decrypt(unsigned char *key, size_t kl, unsigned char *nonce,

unsigned char *in, size_t il);

Both of the previous functions output the same number of bytes as were input, unless a memory allocation error occurs, in which case 0 is returned.

TIP

These two functions erase the key from memory before exiting. You may want to have them erase the plaintext as well.

Here's the implementation of the interface:

#include <stdlib.h>

#include <string.h>

unsigned char *spc_cfb_encrypt(unsigned char *key, size_t kl, unsigned char *nonce,

unsigned char *in, size_t il) {

SPC_CFB_CTX ctx;

unsigned char *out;

if (!(out = (unsigned char *)malloc(il))) return 0;

spc_cfb_init(&ctx, key, kl, nonce);

spc_cfb_encrypt_update(&ctx, in, il, out);

spc_cfb_final(&ctx);

return out;

}

unsigned char *spc_cfb_decrypt(unsigned char *key, size_t kl, unsigned char *nonce,

unsigned char *in, size_t il) {

SPC_CFB_CTX ctx;

unsigned char *out;

if (!(out = (unsigned char *)malloc(il))) return 0;

spc_cfb_init(&ctx, key, kl, nonce);

spc_cfb_decrypt_update(&ctx, in, il, out);

spc_cfb_final(&ctx);

return out;

}

Note that this code depends on the SPC_CFB_CTX data type and the incremental CFB interface, both discussed in the following sections.

The incremental API

Let's look at the SPC_CFB_CTX data type. It's defined as:

typedef struct {

SPC_KEY_SCHED ks;

int ix;

unsigned char nonce[SPC_BLOCK_SZ];

} SPC_CFB_CTX;

The ks field is an expanded version of the cipher key (block ciphers generally use a single key to derive multiple keys for internal use). The ix field is used to determine how much keystream we have buffered. The nonce field is really the buffer in which we store the input to the next encryption, and it is the place where intermediate keystream bytes are stored.

To begin encrypting or decrypting, we need to initialize the mode. Initialization is the same operation for both encryption and decryption:

void spc_cfb_init(SPC_CFB_CTX *ctx, unsigned char *key, size_t kl, unsigned char

*nonce) {

SPC_ENCRYPT_INIT(&(ctx->ks), key, kl);

spc_memset(key,0, kl);

memcpy(ctx->nonce, nonce, SPC_BLOCK_SZ);

ctx->ix = 0;

}

TIP

Note again that we remove the key from memory during this operation.

Never use the same nonce (often called an IV in this context; see Recipe 4.9) twice with a single key. To implement that recommendation effectively, never reuse a key. Alternatively, pick a random starting IV each time you key, and never output more than about 2^40 blocks using a single key.
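Reading the block ceiling above as 2^40 (two to the fortieth power) blocks, it can help to turn the limit into a byte count and a simple budget check. The sketch below is only an illustration; the names MAX_BLOCKS_PER_KEY, max_bytes_per_key( ), and within_key_budget( ) are hypothetical and not part of the recipe.

```c
#include <stdint.h>

/* Approximate per-key output ceiling: about 2^40 blocks. */
#define MAX_BLOCKS_PER_KEY (UINT64_C(1) << 40)

/* Bytes that may be produced under one key for a given block size. */
uint64_t max_bytes_per_key(uint64_t block_size_bytes) {
  return MAX_BLOCKS_PER_KEY * block_size_bytes;
}

/* Returns 1 if emitting `more` additional blocks stays within the ceiling. */
int within_key_budget(uint64_t blocks_so_far, uint64_t more) {
  if (blocks_so_far > MAX_BLOCKS_PER_KEY) return 0;
  return more <= MAX_BLOCKS_PER_KEY - blocks_so_far;
}
```

With a 16-byte block, the ceiling works out to 2^44 bytes, or 16 TiB, per key. A caller that tracks the number of blocks it has produced can rekey once within_key_budget( ) reports that the next message would not fit.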

Now we can add data as we get it using the spc_cfb_encrypt_update( ) or spc_cfb_decrypt_update( ) function, as appropriate. These functions are particularly useful when a message may arrive in pieces. You'll get the same results as if it all arrived at once. When you want to finish encrypting or decrypting, call spc_cfb_final( ) .

TIP

You're responsible for making sure the proper init, update, and final calls are made, and that they do not happen out of order.

The function spc_cfb_encrypt_update( ), which is shown later in this section, has the following signature:

int spc_cfb_encrypt_update(SPC_CFB_CTX *ctx, unsigned char *in, size_t il,

unsigned char *out);

This function has the following arguments:

ctx

Pointer to the SPC_CFB_CTX object associated with the current message.

in

Pointer to the plaintext data to be encrypted.

il

Number of bytes of plaintext to be encrypted.

out

Pointer to the output buffer, which needs to be at least as long as the input plaintext data.

WARNING

Our implementation of this function always returns 1, but a hardware-based implementation might have an unexpected failure, so it's important to check the return value!

This API is in the spirit of PKCS #11, which provides a standard cryptographic interface to hardware. We do this so that the above functions can have the bulk of their implementations replaced with calls to PKCS #11-compliant hardware. PKCS #11 APIs generally pass out data explicitly indicating the length of data outputted, while we ignore that because it will always be zero on failure or the size of the input buffer on success. Also note that PKCS #11-based calls tend to order their arguments differently from the way we do, and they will not generally wipe key material, as we do in our initialization and finalization routines.

WARNING

Because this API is developed with PKCS #11 in mind, it's somewhat more low-level than it needs to be and therefore is a bit difficult to use properly. First, you need to be sure the output buffer is big enough to hold the input; otherwise, you will have a buffer overflow. Second, you need to make sure the out argument always points to the first unused byte in the output buffer. Otherwise, you will keep overwriting the same data every time spc_cfb_encrypt_update( ) outputs.
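The two pitfalls in the warning above come down to pointer bookkeeping. The following sketch shows one way a caller might handle it. It is not the recipe's code: demo_update( ) is a hypothetical stand-in with the same shape as the update functions (it writes exactly il bytes and returns 1), and process_in_chunks( ) is an illustrative wrapper name.

```c
#include <stddef.h>

/* Hypothetical stand-in for an update function: writes exactly il bytes
 * to out and returns 1 on success. The XOR constant is arbitrary. */
static int demo_update(const unsigned char *in, size_t il, unsigned char *out) {
  size_t i;

  for (i = 0; i < il; i++) out[i] = (unsigned char)(in[i] ^ 0x5A);
  return 1;
}

/* Feeds a message in pieces of at most `chunk` bytes. Returns the number
 * of bytes written, or 0 on failure or if the output buffer is too small. */
size_t process_in_chunks(const unsigned char *in, size_t il,
                         unsigned char *out, size_t out_cap, size_t chunk) {
  size_t done = 0;

  /* Check the output capacity up front to rule out an overflow. */
  if (out_cap < il || chunk == 0) return 0;

  while (done < il) {
    size_t n = il - done < chunk ? il - done : chunk;

    /* out + done is the first unused byte of the output buffer. */
    if (!demo_update(in + done, n, out + done)) return 0;
    done += n;
  }
  return done;
}
```

Keeping a single running offset for both buffers works here because a streaming mode produces exactly one output byte per input byte.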

Here's our implementation of spc_cfb_encrypt_update( ) :

int spc_cfb_encrypt_update(SPC_CFB_CTX *ctx, unsigned char *in, size_t il,

unsigned char *out) {

int i;

if (ctx->ix) {

while (ctx->ix) {

if (!il--) return 1;

/* Split into two statements so ctx->ix is not read and modified in one expression. */

*out = *in++ ^ ctx->nonce[ctx->ix];

ctx->nonce[ctx->ix++] = *out++;

ctx->ix %= SPC_BLOCK_SZ;

}

}

if (!il) return 1;

while (il >= SPC_BLOCK_SZ) {

SPC_DO_ENCRYPT(&(ctx->ks), ctx->nonce, ctx->nonce);

for (i = 0; i < SPC_BLOCK_SZ / sizeof(int); i++) {

((int *)ctx->nonce)[i] = ((int *)out)[i] = ((int *)in)[i] ^

((int *)ctx->nonce)[i];

}

il -= SPC_BLOCK_SZ;

in += SPC_BLOCK_SZ;

out += SPC_BLOCK_SZ;

}

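/* No partial block left: stop here so the state isn't encrypted twice. */

if (!il) return 1;

SPC_DO_ENCRYPT(&(ctx->ks), ctx->nonce, ctx->nonce);

for (i = 0; i < il; i++) {

*out = *in++ ^ ctx->nonce[ctx->ix];

ctx->nonce[ctx->ix++] = *out++;

}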

return 1;

}

Decryption has a similar API, but a different implementation:

int spc_cfb_decrypt_update(SPC_CFB_CTX *ctx, unsigned char *in, size_t il,

unsigned char *out) {

int i, x;

unsigned char c;

if (ctx->ix) {

while (ctx->ix) {

if (!il--) return 1;

c = *in;

*out++ = *in++ ^ ctx->nonce[ctx->ix];

ctx->nonce[ctx->ix++] = c;

ctx->ix %= SPC_BLOCK_SZ;

}

}

if (!il) return 1;

while (il >= SPC_BLOCK_SZ) {

SPC_DO_ENCRYPT(&(ctx->ks), ctx->nonce, ctx->nonce);

for (i = 0; i < SPC_BLOCK_SZ / sizeof(int); i++) {

x = ((int *)in)[i];

((int *)out)[i] = x ^ ((int *)ctx->nonce)[i];

((int *)ctx->nonce)[i] = x;

}

il -= SPC_BLOCK_SZ;

in += SPC_BLOCK_SZ;

out += SPC_BLOCK_SZ;

}

/* No partial block left: stop here so the state isn't encrypted twice. */

if (!il) return 1;

SPC_DO_ENCRYPT(&(ctx->ks), ctx->nonce, ctx->nonce);

for (i = 0; i < il; i++) {

c = *in;

*out++ = *in++ ^ ctx->nonce[ctx->ix];

ctx->nonce[ctx->ix++] = c;

}

return 1;

}

To finalize either encryption or decryption, use spc_cfb_final( ) , which never needs to output anything, because CFB is a streaming mode:

int spc_cfb_final(SPC_CFB_CTX *ctx) {

spc_memset(ctx, 0, sizeof(SPC_CFB_CTX));

return 1;

}
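The finalization routine relies on spc_memset( ) from Recipe 13.2, which is not reproduced in this recipe. As a rough illustration of the idea, here is a minimal stand-in, assuming that writing through a volatile pointer is enough to keep the compiler from discarding the stores. The names demo_secure_memset( ), DEMO_CTX, and demo_final( ) are hypothetical, and the struct fields are placeholders rather than the real context layout.

```c
#include <stddef.h>

/* Stand-in for spc_memset( ): writes through a volatile pointer so the
 * compiler is less likely to discard the stores as dead. */
void *demo_secure_memset(void *dst, int c, size_t len) {
  volatile unsigned char *p = (volatile unsigned char *)dst;

  while (len--) *p++ = (unsigned char)c;
  return dst;
}

typedef struct {
  unsigned char ks[32];     /* placeholder for an expanded key */
  int           ix;
  unsigned char nonce[16];
} DEMO_CTX;

/* Wipes the object that ctx points to. The pointer itself is passed to
 * the wipe routine, so the pointed-to context is what gets cleared. */
int demo_final(DEMO_CTX *ctx) {
  demo_secure_memset(ctx, 0, sizeof(*ctx));
  return 1;
}
```

Using sizeof(*ctx) rather than a type name keeps the length in step with the object if the context type is ever changed.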

See Also

Recipe 4.9, Recipe 5.4, Recipe 5.5, Recipe 5.16, Recipe 13.2

5.8. Using a Generic OFB Mode Implementation

Problem

You want a more high-level interface for OFB mode than your library provides. Alternatively, you want a portable OFB interface, or you have only a block cipher implementation and you would like to use OFB mode.

Solution

OFB mode encrypts by generating keystream, then combining the keystream with the plaintext via XOR. OFB generates keystream one block at a time. Each block of keystream is produced by encrypting the previous block of keystream, except for the first block, which is generated by encrypting the nonce.

Many libraries provide an OFB implementation. If you need code implementing this mode, you will find it in the Discussion section that follows.

Discussion

WARNING

You should probably use a higher-level abstraction, such as the one discussed in Recipe 5.16. Use a raw mode only when absolutely necessary, because there is a huge potential for introducing a security vulnerability by accident. If you still want to use OFB, be sure to use a message authentication code with it.

OFB mode is a stream-based mode. Encryption occurs by XOR'ing the keystream bytes with the plaintext bytes, as shown in Figure 5-3. The keystream is generated one block at a time, by encrypting the previous keystream block.[12] The first block is generated by encrypting the nonce.

Figure 5-3. OFB mode

This mode shares many properties with counter mode (CTR), but CTR mode has additional benefits. OFB mode is therefore seeing less and less use these days. Of course, we recommend a higher-level mode than both of these modes, one that provides stronger security guarantees—for example, CWC or CCM mode.

In Recipe 5.4, we discuss the advantages and drawbacks of OFB and compare it to other popular modes.

Many libraries already come with an implementation of OFB mode for any ciphers they support. However, some don't. For example, you may only get an implementation of the raw block cipher when you obtain reference code for a new cipher.

In the following sections we present a reasonably optimized implementation of OFB mode that builds upon the raw block cipher interface presented in Recipe 5.5. It also requires the spc_memset( ) function from Recipe 13.2.

The high-level API

This implementation has two APIs. The first is a high-level API, which takes a message as input and returns a dynamically allocated result.

unsigned char *spc_ofb_encrypt(unsigned char *key, size_t kl, unsigned char *nonce,

unsigned char *in, size_t il);

unsigned char *spc_ofb_decrypt(unsigned char *key, size_t kl, unsigned char *nonce,

unsigned char *in, size_t il);

Both of these functions output the same number of bytes as were input, unless a memory allocation error occurs, in which case 0 is returned. The decryption routine is exactly the same as the encryption routine and is implemented by macro.

TIP

These two functions also erase the key from memory before exiting. You may want to have them erase the plaintext as well.

Here's the implementation of the interface:

#include <stdlib.h>

#include <string.h>

unsigned char *spc_ofb_encrypt(unsigned char *key, size_t kl, unsigned char *nonce,

unsigned char *in, size_t il) {

SPC_OFB_CTX ctx;

unsigned char *out;

if (!(out = (unsigned char *)malloc(il))) return 0;

spc_ofb_init(&ctx, key, kl, nonce);

spc_ofb_update(&ctx, in, il, out);

spc_ofb_final(&ctx);

return out;

}

#define spc_ofb_decrypt spc_ofb_encrypt

Note that the previous code depends on the SPC_OFB_CTX data type and the incremental OFB interface, both discussed in the following sections.

The incremental API

Let's look at the SPC_OFB_CTX data type. It's defined as:

typedef struct {

SPC_KEY_SCHED ks;

int ix;

unsigned char nonce[SPC_BLOCK_SZ];

} SPC_OFB_CTX;

The ks field is an expanded version of the cipher key (block ciphers generally use a single key to derive multiple keys for internal use). The ix field is used to determine how much of the last block of keystream we have buffered (i.e., that hasn't been used yet). The nonce field is really the buffer in which we store the current block of the keystream.

To begin encrypting or decrypting, we need to initialize the mode. Initialization is the same operation for both encryption and decryption:

void spc_ofb_init(SPC_OFB_CTX *ctx, unsigned char *key, size_t kl, unsigned char

*nonce) {

SPC_ENCRYPT_INIT(&(ctx->ks), key, kl);

spc_memset(key,0, kl);

memcpy(ctx->nonce, nonce, SPC_BLOCK_SZ);

ctx->ix = 0;

}

TIP

Note again that we remove the key from memory during this operation.

Never use the same nonce (often called an IV in this context) twice with a single key. Use a secure random value or a counter. See Recipe 4.9 for more information on nonces.

Now we can add data as we get it using the spc_ofb_update( ) function. This function is particularly useful when a message arrives in pieces. You'll get the same results as if it all arrived at once. When you want to finish encrypting or decrypting, call spc_ofb_final( ).
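The claim that piecewise processing matches one-shot processing is easy to check with a small model. The sketch below is not the recipe's implementation: it uses an insecure stand-in cipher (ofb_toy_encrypt( )), an arbitrary 8-byte block, and hypothetical names throughout, and it generates keystream lazily, one block at a time, only when a byte is actually needed.

```c
#include <stddef.h>
#include <string.h>

#define OFB_TOY_BLOCK 8   /* toy block size, chosen only for the demo */

typedef struct {
  unsigned char key[OFB_TOY_BLOCK];
  unsigned char ks[OFB_TOY_BLOCK];  /* current keystream block */
  size_t        ix;                 /* bytes of ks already used; 0 = need a new block */
} TOY_OFB;

/* Insecure stand-in for a block cipher, encrypting blk in place. */
static void ofb_toy_encrypt(const unsigned char *key, unsigned char *blk) {
  unsigned char tmp[OFB_TOY_BLOCK];
  size_t i;

  for (i = 0; i < OFB_TOY_BLOCK; i++)
    tmp[i] = (unsigned char)((blk[(i + 1) % OFB_TOY_BLOCK] ^ key[i]) * 167u + 13u);
  memcpy(blk, tmp, OFB_TOY_BLOCK);
}

void toy_ofb_init(TOY_OFB *c, const unsigned char *key, const unsigned char *nonce) {
  memcpy(c->key, key, OFB_TOY_BLOCK);
  memcpy(c->ks, nonce, OFB_TOY_BLOCK);   /* first keystream block will be E(nonce) */
  c->ix = 0;
}

/* Encrypts or decrypts il bytes; the operation is the same in both directions. */
void toy_ofb_update(TOY_OFB *c, const unsigned char *in, size_t il,
                    unsigned char *out) {
  while (il--) {
    /* Next keystream block = E(previous keystream block). */
    if (c->ix == 0) ofb_toy_encrypt(c->key, c->ks);
    *out++ = (unsigned char)(*in++ ^ c->ks[c->ix]);
    c->ix = (c->ix + 1) % OFB_TOY_BLOCK;
  }
}
```

Feeding a 20-byte message as pieces of 8, 5, and 7 bytes produces the same output as a single 20-byte call, because the context remembers how far into the current keystream block it is.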

TIP

You're responsible for making sure the init, update, and final calls do not happen out of order.

The function spc_ofb_update( ) has the following signature:

int spc_ofb_update(SPC_OFB_CTX *ctx, unsigned char *in, size_t il, unsigned char *out);

This function has the following arguments:

ctx

Pointer to the SPC_OFB_CTX object associated with the current message.

in

Pointer to a buffer containing the data to be encrypted or decrypted.

il

Number of bytes contained in the input buffer.

out

Pointer to the output buffer, which needs to be at least as long as the input buffer.

WARNING

Our implementation of this function always returns 1, but a hardware-based implementation might have an unexpected failure, so it's important to check the return value!

This API is in the spirit of PKCS #11, which provides a standard cryptographic interface to hardware. We do this so that the above functions can have the bulk of their implementations replaced with calls to PKCS #11-compliant hardware. PKCS #11 APIs generally pass out data explicitly indicating the length of data outputted, while we ignore that because it will always be zero on failure or the size of the input buffer on success. Also note that PKCS #11-based calls tend to order their arguments differently from the way we do, and they will not generally wipe key material, as we do in our initialization and finalization routines.

WARNING

Because this API is developed with PKCS #11 in mind, it's somewhat more low-level than it needs to be, and therefore is a bit difficult to use properly. First, you need to be sure the output buffer is big enough to hold the input; otherwise, you will have a buffer overflow. Second, you need to make sure the out argument always points to the first unused byte in the output buffer. Otherwise, you will keep overwriting the same data every time spc_ofb_update( ) outputs.

Here's our implementation of spc_ofb_update( ) :

int spc_ofb_update(SPC_OFB_CTX *ctx, unsigned char *in, size_t il, unsigned char

*out) {

int i;

if (ctx->ix) {

while (ctx->ix) {

if (!il--) return 1;

*out++ = *in++ ^ ctx->nonce[ctx->ix++];

ctx->ix %= SPC_BLOCK_SZ;

}

}

if (!il) return 1;

while (il >= SPC_BLOCK_SZ) {

SPC_DO_ENCRYPT(&(ctx->ks), ctx->nonce, ctx->nonce);

for (i = 0; i < SPC_BLOCK_SZ / sizeof(int); i++)

((int *)out)[i] = ((int *)in)[i] ^ ((int *)ctx->nonce)[i];

il -= SPC_BLOCK_SZ;

in += SPC_BLOCK_SZ;

out += SPC_BLOCK_SZ;

}

/* No partial block left: stop here so no keystream block is skipped. */

if (!il) return 1;

SPC_DO_ENCRYPT(&(ctx->ks), ctx->nonce, ctx->nonce);

for (i = 0; i < il; i++) *out++ = *in++ ^ ctx->nonce[ctx->ix++];

return 1;

}

To finalize either encryption or decryption, use the spc_ofb_final( ) call, which never needs to output anything, because OFB is a streaming mode:

int spc_ofb_final(SPC_OFB_CTX *ctx) {

spc_memset(ctx, 0, sizeof(SPC_OFB_CTX));

return 1;

}

See Also

Recipe 4.9, Recipe 5.4, Recipe 5.5, Recipe 5.16, Recipe 13.2


[12] As with CFB mode, the "feedback size" could conceivably be smaller than the block size, but such schemes aren't secure.

5.9. Using a Generic CTR Mode Implementation

Problem

You want to use counter (CTR) mode and your library doesn't provide an interface, or you want to use a more high-level interface than your library provides. Alternatively, you would like a portable CTR interface, or you have only a block cipher implementation and you would like to use CTR mode.

Solution

CTR mode encrypts by generating keystream, then combining the keystream with the plaintext via XOR. This mode generates keystream one block at a time by encrypting plaintexts that are the same, except for an ever-changing counter, as shown in Figure 5-4. Generally, the counter value starts at zero and is incremented sequentially.

Figure 5-4. Counter (CTR) mode

Few libraries provide a CTR implementation, because it has only recently come into favor, despite the fact that it is a very old mode with great properties. We provide code implementing this mode in the Discussion section that follows.

Discussion

WARNING

You should probably use a higher-level abstraction, such as the one discussed in Recipe 5.16. Use a raw mode only when absolutely necessary, because there is a huge potential for introducing a security vulnerability by accident. If you still want to use CTR mode, be sure to use a message authentication code with it.

CTR mode is a stream-based mode. Encryption occurs by XOR'ing the keystream bytes with the plaintext bytes. The keystream is generated one block at a time by encrypting a plaintext block that includes a counter value. Given a single key, the counter value must be unique for every encryption.

This mode has many benefits over the "standard" modes (e.g., ECB, CBC, CFB, and OFB). However, we recommend a higher-level mode, one that provides stronger security guarantees (i.e., message integrity detection), such as CWC or CCM modes. Most high-level modes use CTR mode as a component.

In Recipe 5.4, we discuss the advantages and drawbacks of CTR mode and compare it to other popular modes.

Like most other modes, CTR mode requires a nonce (often called an IV in this context). Most modes use the nonce as an input to encryption, and thus require something the same size as the algorithm's block length. With CTR mode, the input to encryption is generally the concatenation of the nonce and a counter. The counter is usually at least 32 bits, depending on the maximum amount of data you might want to encrypt with a single {key, nonce} pair. We recommend using a good random value for the nonce.

In the following sections we present a reasonably optimized implementation of CTR mode that builds upon the raw block cipher interface presented in Recipe 5.5. It also requires the spc_memset( ) function from Recipe 13.2. By default, we use a 6-byte counter, which leaves room for a nonce of SPC_BLOCK_SZ - 6 bytes. With AES and other ciphers with 128-bit blocks, this is sufficient space.
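The block layout just described, a nonce followed by a counter, can be sketched in a few lines. This is an illustration under stated assumptions rather than the recipe's code: the names are hypothetical, the block size is fixed at 16 bytes, and the counter is treated as a big-endian integer in the trailing 6 bytes.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_BLOCK_SZ  16   /* assumed 128-bit block, as with AES */
#define DEMO_CTR_BYTES 6    /* counter width described in the text */

/* Builds nonce || counter, with the counter bytes set to zero. The nonce
 * must supply DEMO_BLOCK_SZ - DEMO_CTR_BYTES (here, 10) bytes. */
void demo_ctr_block_init(unsigned char blk[DEMO_BLOCK_SZ],
                         const unsigned char *nonce) {
  memcpy(blk, nonce, DEMO_BLOCK_SZ - DEMO_CTR_BYTES);
  memset(blk + DEMO_BLOCK_SZ - DEMO_CTR_BYTES, 0, DEMO_CTR_BYTES);
}

/* Big-endian increment of the trailing counter. Returns 1 normally, or 0
 * if the counter wrapped around to zero, meaning the {key, nonce} pair
 * must not be used for any more blocks. */
int demo_ctr_block_next(unsigned char blk[DEMO_BLOCK_SZ]) {
  size_t i = DEMO_BLOCK_SZ;

  while (i-- > DEMO_BLOCK_SZ - DEMO_CTR_BYTES)
    if (++blk[i]) return 1;   /* no carry out of this byte: done */
  return 0;                   /* every counter byte rolled over */
}
```

Only the last 6 bytes ever change, so the nonce portion stays intact for the life of the message. A 6-byte counter allows 2^48 blocks before it wraps.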

WARNING

CTR mode with 64-bit blocks is highly susceptible to birthday attacks unless a large portion of the nonce is random, which in turn limits the length of the message you can send with a given key. In short, don't use CTR mode with 64-bit block ciphers.

The high-level API

This implementation has two APIs. The first is a high-level API, which takes a message as input and returns a dynamically allocated result.

unsigned char *spc_ctr_encrypt(unsigned char *key, size_t kl, unsigned char *nonce,

unsigned char *in, size_t il);

unsigned char *spc_ctr_decrypt(unsigned char *key, size_t kl, unsigned char *nonce,

unsigned char *in, size_t il);

Both of the previous functions output the same number of bytes as were input, unless a memory allocation error occurs, in which case 0 is returned. The decryption routine is exactly the same as the encryption routine, and it is implemented by macro.

TIP

These two functions also erase the key from memory before exiting. You may want to have them erase the plaintext as well.

Here's the implementation of the interface:

#include <stdlib.h>

#include <string.h>

unsigned char *spc_ctr_encrypt(unsigned char *key, size_t kl, unsigned char *nonce,

unsigned char *in, size_t il) {

SPC_CTR_CTX ctx;

unsigned char *out;

if (!(out = (unsigned char *)malloc(il))) return 0;

spc_ctr_init(&ctx, key, kl, nonce);

spc_ctr_update(&ctx, in, il, out);

spc_ctr_final(&ctx);

return out;

}

#define spc_ctr_decrypt spc_ctr_encrypt

Note that this code depends on the SPC_CTR_CTX data type and the incremental CTR interface, both discussed in the following sections. In particular, the nonce size varies depending on the value of the SPC_CTR_BYTES macro (introduced in the next subsection).

The incremental API

Let's look at the SPC_CTR_CTX data type. It's defined as:

typedef struct {

SPC_KEY_SCHED ks;

int ix;

unsigned char ctr[SPC_BLOCK_SZ];

unsigned char ksm[SPC_BLOCK_SZ];

} SPC_CTR_CTX;

The ks field is an expanded version of the cipher key (block ciphers generally use a single key to derive multiple keys for internal use). The ix field is used to determine how much of the last block of keystream we have buffered (i.e., that hasn't been used yet). The ctr block holds the plaintext used to generate keystream blocks. Buffered keystream is held in ksm.

To begin encrypting or decrypting, you need to initialize the mode. Initialization is the same operation for both encryption and decryption, and it depends on a statically defined value SPC_CTR_BYTES, which is used to compute the nonce size.

#define SPC_CTR_BYTES 6

void spc_ctr_init(SPC_CTR_CTX *ctx, unsigned char *key, size_t kl, unsigned char

*nonce) {

SPC_ENCRYPT_INIT(&(ctx->ks), key, kl);

spc_memset(key, 0, kl);

memcpy(ctx->ctr, nonce, SPC_BLOCK_SZ - SPC_CTR_BYTES);

spc_memset(ctx->ctr + SPC_BLOCK_SZ - SPC_CTR_BYTES, 0, SPC_CTR_BYTES);

ctx->ix = 0;

}

TIP

Note again that we remove the key from memory during this operation.

Now you can add data as you get it using the spc_ctr_update( ) function. This function is particularly useful when a message arrives in pieces. You'll get the same results as if it all arrived at once. When you want to finish encrypting or decrypting, call spc_ctr_final( ).

TIP

You're responsible for making sure the initialization, updating, and finalization calls do not happen out of order.

The function spc_ctr_update( ) has the following signature:

int spc_ctr_update(SPC_CTR_CTX *ctx, unsigned char *in, size_t il, unsigned char *out);

This function has the following arguments:

ctx

Pointer to the SPC_CTR_CTX object associated with the current message.

in

Pointer to a buffer containing the data to be encrypted or decrypted.

il

Number of bytes contained in the input buffer.

out

Pointer to the output buffer, which needs to be at least as long as the input buffer.

WARNING

Our implementation of this function always returns 1, but a hardware-based implementation might have an unexpected failure, so it's important to check the return value!

This API is in the spirit of PKCS #11, which provides a standard cryptographic interface to hardware. We do this so that the above functions can have the bulk of their implementations replaced with calls to PKCS #11-compliant hardware. PKCS #11 APIs generally pass out data explicitly indicating the length of data outputted, while we ignore that because it will always be zero on failure or the size of the input buffer on success. Also note that PKCS #11-based calls tend to order their arguments differently from the way we do, and they will not generally wipe key material, as we do in our initialization and finalization routines.

WARNING

Because this API is developed with PKCS #11 in mind, it's somewhat more low-level than it needs to be, and therefore is a bit difficult to use properly. First, you need to be sure the output buffer is big enough to hold the input; otherwise, you will have a buffer overflow. Second, you need to make sure the out argument always points to the first unused byte in the output buffer. Otherwise, you will keep overwriting the same data every time spc_ctr_update( ) outputs data.

Here's our implementation of spc_ctr_update( ) , along with a helper function:

static inline void ctr_increment(unsigned char *ctr) {

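/* The counter occupies the last SPC_CTR_BYTES bytes of the block, after the nonce. */

unsigned char *stop = ctr + SPC_BLOCK_SZ - SPC_CTR_BYTES;

unsigned char *x = ctr + SPC_BLOCK_SZ;

while (x-- != stop) if (++(*x)) return;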

}

int spc_ctr_update(SPC_CTR_CTX *ctx, unsigned char *in, size_t il, unsigned char

*out) {

int i;

if (ctx->ix) {

while (ctx->ix) {

if (!il--) return 1;

*out++ = *in++ ^ ctx->ksm[ctx->ix++];

ctx->ix %= SPC_BLOCK_SZ;

}

}

if (!il) return 1;

while (il >= SPC_BLOCK_SZ) {

SPC_DO_ENCRYPT(&(ctx->ks), ctx->ctr, out);

ctr_increment(ctx->ctr);

for (i = 0; i < SPC_BLOCK_SZ / sizeof(int); i++)

((int *)out)[i] ^= ((int *)in)[i];

il -= SPC_BLOCK_SZ;

in += SPC_BLOCK_SZ;

out += SPC_BLOCK_SZ;

}

/* No partial block left: stop here so no counter value is skipped. */

if (!il) return 1;

SPC_DO_ENCRYPT(&(ctx->ks), ctx->ctr, ctx->ksm);

ctr_increment(ctx->ctr);

for (i = 0; i < il; i++)

*out++ = *in++ ^ ctx->ksm[ctx->ix++];

return 1;

}

To finalize either encryption or decryption, use the spc_ctr_final( ) call, which never needs to output anything, because CTR is a streaming mode:

int spc_ctr_final(SPC_CTR_CTX *ctx) {

spc_memset(ctx, 0, sizeof(SPC_CTR_CTX));

return 1;

}

See Also

Recipe 4.9, Recipe 5.4, Recipe 5.5, Recipe 5.16, Recipe 13.2

5.10. Using CWC Mode

Problem

You want to use CWC mode to get encryption and message integrity in a single mode.

Solution

Use the reference implementation available from http://www.zork.org/cwc/, or use Brian Gladman's implementation, available from http://fp.gladman.plus.com/AES/cwc.zip.

Discussion

CWC mode is a mode of operation for providing both encryption and message integrity. This mode is parallelizable, fast in both software and hardware (where it can achieve speeds of 10 gigabits per second), unencumbered by patents, and provably secure to good bounds with standard assumptions. (We compare CWC to other modes in Recipe 5.4.)

CWC mode is not simple to implement because it uses a universal hash function as a component that is conceptually straightforward but somewhat complex to implement well. We therefore recommend using an off-the-shelf implementation, such as the implementation on the official CWC web page (http://www.zork.org/cwc/).

Here, we'll discuss how to use the distribution available from the CWC web page. This implementation has a set of macros similar to the macros we develop in Recipe 5.5 allowing you to bind the library to any AES implementation. In particular, if you edit local_options.h , you need to do the following:

1. Set AES_KS_T to whatever value you would set SPC_KEY_SCHED (see Recipe 5.5).

2. Set CWC_AES_SETUP to whatever value you would set SPC_ENCRYPT_INIT (see Recipe 5.5).

3. Set CWC_AES_ENCRYPT to whatever value you would set SPC_DO_ENCRYPT (see Recipe 5.5).

Once those bindings are made, the Zork CWC implementation has a simple API that accepts an entire message at once:

int cwc_init(cwc_t ctx[1], u_char key[ ], int keybits);

void cwc_encrypt_message(cwc_t ctx[1], u_char a[ ], u_int32 alen, u_char pt[ ],

u_int32 ptlen, u_char nonce[11], u_char output[ ]);

int cwc_decrypt_message(cwc_t ctx[1], u_char a[ ], u_int32 alen, u_char ct[ ],

u_int32 ctlen, u_char nonce[11], u_char output[ ]);

void cwc_cleanup(cwc_t ctx[1]);

If you have very large messages, this API insists that you buffer them before encrypting or decrypting. That's not a fundamental limitation of CWC mode, but only of this implementation. A future version of the implementation might change that, but do note that it would require partially decrypting a message before the library could determine whether the message is authentic. The API above does not decrypt if the message isn't authentic.

TIP

If you need to operate on very large messages, check out Brian Gladman's CWC implementation, which works incrementally.

This API looks slightly different from the all-in-one APIs we've presented for other modes in this chapter. It's actually closer to the incremental mode. The CWC mode has a notion of individual messages. It is intended that each message be sent individually. You're expected to use a single key for a large number of messages, but each message gets its own nonce. Generally, each message is expected to be short but can be multiple gigabytes.

Note that encrypting a message grows the message by 16 bytes. The extra 16 bytes at the end are used for ensuring the integrity of the message (it is effectively the result of a message authentication code; see Chapter 6).
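Since the output is always 16 bytes longer than the plaintext, it is worth computing the buffer size in one place and guarding against overflow of the 32-bit length type used by the API shown above. The helper names below are hypothetical and are not part of the CWC distribution.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_CWC_TAG_LEN 16u   /* bytes added by encryption, per the text */

/* Returns the ciphertext size for ptlen bytes of plaintext, or 0 if the
 * sum would not fit in 32 bits. */
uint32_t demo_cwc_ct_len(uint32_t ptlen) {
  if (ptlen > UINT32_MAX - DEMO_CWC_TAG_LEN) return 0;
  return ptlen + DEMO_CWC_TAG_LEN;
}

/* Allocates an output buffer large enough for the encrypted message and
 * stores its size in *outlen. Returns NULL on overflow or allocation
 * failure. The caller frees the buffer with free( ). */
unsigned char *demo_cwc_alloc_output(uint32_t ptlen, uint32_t *outlen) {
  uint32_t n = demo_cwc_ct_len(ptlen);

  if (!n) return NULL;
  *outlen = n;
  return (unsigned char *)malloc(n);
}
```

A zero-length plaintext still needs a 16-byte output buffer, because the authentication value is produced even when there is nothing to encrypt.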

The previous API assumes that you have the entire message to encrypt or decrypt at once. In the following discussion, we'll walk through each of its functions and their arguments.

The cwc_init( ) function allows us to initialize a CWC context object of type cwc_t that can be reused across multiple messages. Generally, a single key will be used for an entire session. The first argument is a pointer to the cwc_t object (the declaration as an array of one is a specification saying that the pointer is only to a single object rather than to an array of objects). The second argument is the AES key, which must be a buffer of 16, 24, or 32 bytes. The third argument specifies the number of bits in the key (128, 192 or 256). The function fails if keybits is not a correct value.

The cwc_encrypt_message( ) function has the following arguments:

ctx

Pointer to the cwc_t context object.

a

Buffer containing optional data that you would like to authenticate, but that does not need to be encrypted, such as plaintext headers in the HTTP protocol.

alen

Length of extra authentication data buffer, specified in bytes. It may be zero if there is no such data.

pt

Buffer containing the plaintext you would like to encrypt and authenticate.

ptlen

Length of the plaintext buffer. It may be zero if there is no data to be encrypted.

nonce

Pointer to an 11-byte buffer, which must be unique for each message. (See Recipe 4.9 for hints on nonce selection.)

output

Buffer into which the ciphertext is written. This buffer must always be at least ptlen + 16 bytes in size because the message grows by 16 bytes when the authentication value is added.

This function always succeeds. The cwc_decrypt_message( ) function, on the other hand, returns 1 on success, and 0 on failure. Failure occurs only if the message integrity check fails, meaning the data has somehow changed since it was originally encrypted. This function has the following arguments:

ctx

Pointer to the cwc_t context object.

a

Buffer containing optional data that you would like to authenticate, but that was not encrypted, such as plaintext headers in the HTTP protocol.

alen

Length of extra authentication data buffer, specified in bytes. It may be zero if there is no such data.

ct

Buffer containing the ciphertext you would like to authenticate and decrypt if it is valid.

ctlen

Length of the ciphertext buffer. It may be zero if there is no data to be decrypted.

nonce

Pointer to an 11-byte buffer holding the same nonce that was used when the message was encrypted. (See Recipe 4.9 for hints on nonce selection.)

output

Buffer into which the plaintext is written. This buffer must always be at least ctlen - 16 bytes in size because the message shrinks by 16 bytes when the authentication value is removed.

The cwc_cleanup( ) function simply wipes the contents of the cwc context object passed into it.
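The length bookkeeping described above (ciphertext is always the plaintext plus 16 bytes, and anything shorter than 16 bytes cannot be a valid ciphertext) can be captured in two small helpers. This is only a sketch: CWC_TAG_LEN, cwc_ct_len( ), and cwc_pt_len( ) are hypothetical names that are not part of the CWC distribution.

```c
#include <assert.h>
#include <stddef.h>

#define CWC_TAG_LEN 16  /* bytes of authentication value appended to each message */

/* Size of the output buffer that cwc_encrypt_message() needs for ptlen bytes. */
static size_t cwc_ct_len(size_t ptlen) {
  assert(ptlen <= (size_t)-1 - CWC_TAG_LEN);  /* guard against size_t overflow */
  return ptlen + CWC_TAG_LEN;
}

/* Size of the output buffer that cwc_decrypt_message() needs for ctlen bytes.
 * Returns 1 and sets *ptlen on success. Returns 0 and leaves *ptlen untouched
 * if ctlen is too short to contain the authentication value at all. */
static int cwc_pt_len(size_t ctlen, size_t *ptlen) {
  if (ctlen < CWC_TAG_LEN) return 0;
  *ptlen = ctlen - CWC_TAG_LEN;
  return 1;
}
```

Checking the ciphertext length before allocating the output buffer avoids an unsigned underflow in ctlen - 16 when a truncated message arrives.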

See Also

§ CWC implementation from Brian Gladman: http://fp.gladman.plus.com/AES/cwc.zip

§ CWC home page: http://www.zork.org/cwc

§ Recipe 5.4, Recipe 5.5

5.11. Manually Adding and Checking Cipher Padding

Problem

You want to add padding to data manually, then check it manually when decrypting.

Solution

There are many subtle ways in which padding can go wrong, so use an off-the-shelf scheme, such as PKCS block cipher padding.

Discussion

TIP

Padding is applied to plaintext; when decrypting, you must check for proper padding of the resulting data to determine where the plaintext message actually ends.

Generally, it is not a good idea to add padding yourself. If you're using a reasonably high-level abstraction, padding will be handled for you. In addition, padding often isn't required, for example, when using a stream cipher or one of many common block cipher modes (including CWC, CTR, CCM, OFB, and CFB).

Because ECB mode really shouldn't be used for stream-based encryption, the only common case where padding is actually interesting is when you're using CBC mode.

If you are in a situation where you do need padding, we recommend that you use a standard scheme. There are many subtle things that can go wrong (although the most important requirement is that padding always be unambiguous[13]), and there's no good reason to wing it.

The most widespread standard padding for block ciphers is called PKCS block padding. The goal of PKCS block padding is that the last byte of the padded plaintext should unambiguously describe how much padding was added to the message. PKCS padding sets every byte of padding to the number of bytes of padding added. If the input is block-aligned, an entire block of padding is added. For example, if four bytes of padding were needed, the proper padding would be:

0x04040404

If you're using a block cipher with 64-bit (8-byte) blocks, and the input is block-aligned, the padding would be:

0x0808080808080808

Here's an example API for adding and removing padding:

void spc_add_padding(unsigned char *pad_goes_here, int ptlen, int bl) {

int i, n = bl - ptlen % bl; /* 1 to bl bytes of padding; a full block if aligned */

for (i = 0; i < n; i++) *(pad_goes_here + i) = (unsigned char)n;

}

int spc_remove_padding(unsigned char *lastblock, int bl) {

unsigned char i, n = lastblock[bl - 1];

unsigned char *p = lastblock + bl;

/* In your programs you should probably throw an exception or abort instead. */

if (n > bl || n <= 0) return -1;

for (i = n; i; i--) if (*--p != n) return -1;

return bl - n;

}

The spc_add_padding( ) function adds padding directly to a preallocated buffer called pad_goes_here. The function takes as input the length of the plaintext and the block length of the cipher. From that information, we figure out how many bytes to add, and we write the result into the appropriate buffer.

The spc_remove_padding( ) function deals with unencrypted plaintext. As input, we pass it the final block of plaintext, along with the block length of the cipher. The function looks at the last byte to see how many padding bytes should be present. If the final byte is bigger than the block length or is less than one, the padding is not in the right format, indicating a decryption error. Finally, we check to see whether the padded bytes are all in the correct format. If everything is in order, the function will return the number of valid bytes in the final block of data, which could be anything from zero to one less than the block length.
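To make the rule concrete, here is a self-contained round trip that pads a buffer in place and then validates the padding. It is a sketch under the PKCS rule stated above (add n = bl - (ptlen mod bl) bytes, each with the value n); pkcs_pad( ) and pkcs_unpad( ) are hypothetical names, and unlike the functions above they operate on the whole buffer rather than on the final block only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Appends PKCS block padding to buf, which holds ptlen bytes of plaintext and
 * must have room for up to bl more bytes. Returns the padded length. */
static size_t pkcs_pad(unsigned char *buf, size_t ptlen, size_t bl) {
  size_t n;
  assert(bl > 0 && bl <= 255);  /* the pad count must fit in one byte */
  n = bl - ptlen % bl;          /* 1..bl; a whole block if already aligned */
  memset(buf + ptlen, (int)n, n);
  return ptlen + n;
}

/* Checks the padding at the end of buf (len bytes, a nonzero multiple of bl).
 * Returns the plaintext length, or -1 if the padding is malformed. */
static long pkcs_unpad(const unsigned char *buf, size_t len, size_t bl) {
  size_t i, n;
  if (bl == 0 || len == 0 || len % bl != 0) return -1;
  n = buf[len - 1];             /* last byte says how much padding there is */
  if (n == 0 || n > bl) return -1;
  for (i = 0; i < n; i++)
    if (buf[len - 1 - i] != n) return -1;
  return (long)(len - n);
}
```

With a 16-byte block, 12 bytes of plaintext gain four bytes of 0x04, while 16 bytes of plaintext gain a full block of 0x10.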


[13] Because of this, it's impossible to avoid adding data to the end of the message, even when the message is block-aligned, at least if you want your padding scheme to work with arbitrary binary data.

5.12. Precomputing Keystream in OFB, CTR, CCM, or CWC Modes (or with Stream Ciphers)

Problem

You want to save computational resources when data is actually flowing over a network by precomputing keystream so that encryption or decryption will consist merely of XOR'ing data with the precomputed keystream.

Solution

If your API has a function that performs keystream generation, use that. Otherwise, call the encryption routine, passing in N bytes set to 0, where N is the number of bytes of keystream you wish to precompute.

Discussion

Most cryptographic APIs do not have an explicit way to precompute keystream for cipher modes where such precomputation makes sense. Fortunately, any byte XOR'd with zero returns the original byte. Therefore, to recover the keystream, we can "encrypt" a string of zeros. Then, when we have data that we really do wish to encrypt, we need only XOR that data with the stored keystream.
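The trick can be demonstrated end to end with a stand-in cipher. The sketch below uses a toy xorshift-based stream cipher purely for illustration (it is NOT secure, and toy_stream, toy_init( ), toy_crypt( ), toy_keystream( ), and xor_with_keystream( ) are all hypothetical names); any real stream cipher or streaming mode with an encrypt call of the same shape could take its place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stream cipher state (xorshift32). For illustration only; NOT secure. */
typedef struct { uint32_t s; } toy_stream;

static void toy_init(toy_stream *c, uint32_t key) {
  c->s = key ? key : 1;  /* the xorshift state must be nonzero */
}

/* Encrypts (or decrypts) n bytes: out[i] = in[i] ^ next keystream byte. */
static void toy_crypt(toy_stream *c, const unsigned char *in,
                      unsigned char *out, size_t n) {
  size_t i;
  assert(in && out);
  for (i = 0; i < n; i++) {
    c->s ^= c->s << 13;
    c->s ^= c->s >> 17;
    c->s ^= c->s << 5;
    out[i] = in[i] ^ (unsigned char)(c->s & 0xff);
  }
}

/* Precomputes n bytes of keystream by "encrypting" zeros in place. */
static void toy_keystream(toy_stream *c, unsigned char *ks, size_t n) {
  memset(ks, 0, n);
  toy_crypt(c, ks, ks, n);  /* 0 ^ k == k */
}

/* Later, when data arrives, encryption is only an XOR. */
static void xor_with_keystream(const unsigned char *ks, const unsigned char *in,
                               unsigned char *out, size_t n) {
  size_t i;
  for (i = 0; i < n; i++) out[i] = in[i] ^ ks[i];
}
```

XOR'ing a message with the stored keystream gives the same bytes as calling toy_crypt( ) directly with an identically keyed context. Each precomputed keystream byte must be used for exactly one message byte and then discarded.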

If you have the source for the encryption algorithm, you can remove the final XOR operation to create a keystream-generating function. For example, the spc_ctr_update( ) function from Recipe 5.9 can be adapted easily into the following keystream generator:

int spc_ctr_keystream(SPC_CTR_CTX *ctx, size_t il, unsigned char *out) {

int i;

if (ctx->ix) {

while (ctx->ix) {

if (!il--) return 1;

*out++ = ctx->ksm[ctx->ix++];

ctx->ix %= SPC_BLOCK_SZ;

}

}

if (!il) return 1;

while (il >= SPC_BLOCK_SZ) {

SPC_DO_ENCRYPT(&(ctx->ks), ctx->ctr, out);

ctr_increment(ctx->ctr);

il -= SPC_BLOCK_SZ;

out += SPC_BLOCK_SZ;

}

SPC_DO_ENCRYPT(&(ctx->ks), ctx->ctr, ctx->ksm);

ctr_increment(ctx->ctr);

for (i = 0; i < il; i++) *out++ = ctx->ksm[ctx->ix++];

return 1;

}

Note that we simply remove the in argument along with the XOR operation whenever we write to the output buffer.

5.13. Parallelizing Encryption and Decryption in Modes That Allow It (Without Breaking Compatibility)

Problem

You want to parallelize encryption, decryption, or keystream generation.

Solution

Only some cipher modes are naturally parallelizable in a way that doesn't break compatibility. In particular, CTR mode is naturally parallelizable, as is decryption with CBC and CFB. There are two basic strategies: one is to treat the message in an interleaved fashion, and the other is to break it up into a single chunk for each parallel process.

The first strategy is generally more practical. However, it is often difficult to make either technique result in a speed gain when processing messages in software.

Discussion

TIP

Parallelizing encryption and decryption does not necessarily result in a speed improvement. To provide any chance of a speedup, you'll certainly need to ensure that multiple processors are working in parallel. Even in such an environment, data sets may be too small to run faster when they are processed in parallel.

Some cipher modes can have independent parts of the message operated upon independently. In such cases, there is the potential for parallelization. For example, with CTR mode, the keystream is computed in blocks, where each block of keystream is generated by encrypting a unique plaintext block. Those blocks can be computed in any order.

In CBC, CFB, and OFB modes, encryption can't really be parallelized because the ciphertext for a block is necessary to create the ciphertext for the next block; thus, we can't compute ciphertext out of order. However, for CBC and CFB, when we decrypt, things are different. Because we only need the ciphertext of a block to decrypt the next block, we can decrypt the next block before we decrypt the first one.
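The asymmetry is easy to see in code. In the sketch below, CBC encryption has to walk the message in order, but each block can be decrypted on its own because it needs only two ciphertext blocks. The 8-byte toy "block cipher" is an invertible stand-in for illustration only (it is NOT secure), and all of the names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define BLK 8  /* toy block size in bytes */

/* Toy invertible "block cipher". For illustration only; NOT secure. */
static void toy_enc(const unsigned char *k, const unsigned char *in,
                    unsigned char *out) {
  int i;
  for (i = 0; i < BLK; i++)
    out[i] = (unsigned char)((in[i] ^ k[i]) + k[(i + 1) % BLK]);
}

static void toy_dec(const unsigned char *k, const unsigned char *in,
                    unsigned char *out) {
  int i;
  for (i = 0; i < BLK; i++) {
    unsigned char t = (unsigned char)(in[i] - k[(i + 1) % BLK]);
    out[i] = (unsigned char)(t ^ k[i]);
  }
}

/* Sequential by nature: block b cannot start until block b - 1 is finished. */
static void cbc_encrypt(const unsigned char *k, const unsigned char *iv,
                        const unsigned char *pt, unsigned char *ct,
                        size_t nblocks) {
  const unsigned char *prev = iv;
  unsigned char x[BLK];
  size_t b;
  int i;
  for (b = 0; b < nblocks; b++) {
    for (i = 0; i < BLK; i++) x[i] = pt[b * BLK + i] ^ prev[i];
    toy_enc(k, x, ct + b * BLK);
    prev = ct + b * BLK;
  }
}

/* Decrypts block b alone. It reads only ciphertext block b and block b - 1
 * (or the IV), so calls for different b are independent of each other. */
static void cbc_decrypt_block(const unsigned char *k, const unsigned char *iv,
                              const unsigned char *ct, unsigned char *pt,
                              size_t b) {
  const unsigned char *prev = b ? ct + (b - 1) * BLK : iv;
  unsigned char x[BLK];
  int i;
  assert(ct != pt);  /* in-place use would clobber a block another call needs */
  toy_dec(k, ct + b * BLK, x);
  for (i = 0; i < BLK; i++) pt[b * BLK + i] = x[i] ^ prev[i];
}
```

Calling cbc_decrypt_block( ) for the blocks in reverse order, or from several threads at once, recovers the same plaintext as a front-to-back pass.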

There are two reasonable strategies for parallelizing the work. When a message shows up all at once, you might divide it roughly into equal parts and handle each part separately. Alternatively, you can take an interleaved approach, where alternating blocks are handled by different threads. That is, the actual message is separated into two different plaintexts, as shown in Figure 5-5.

Figure 5-5. Encryption through interleaving

If done correctly, both approaches will result in the correct output. We generally prefer the interleaving approach, because all threads can do work with just a little bit of data available. This is particularly true in hardware, where buffers are small.
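As a minimal sketch of the interleaved approach, the function below processes every stride-th block of a CTR stream, starting at a given block. Running it once with first = 0 and once with first = 1 (stride 2) touches disjoint blocks, so the two calls could be handed to separate threads. The keystream block function is a toy stand-in for illustration only (it is NOT secure), the counter is simply the block index, and the names toy_ks_block( ) and ctr_lane( ) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLK 8  /* toy block size in bytes */

/* Toy keystream block derived from key and counter. Illustration only. */
static void toy_ks_block(uint64_t key, uint64_t ctr, unsigned char out[BLK]) {
  uint64_t x = (key ^ ctr) * 0x9E3779B97F4A7C15ULL;
  int i;
  x ^= x >> 29;
  for (i = 0; i < BLK; i++) out[i] = (unsigned char)(x >> (8 * i));
}

/* Processes blocks first, first + stride, first + 2 * stride, ... of the
 * message. The final block of the message may be partial. */
static void ctr_lane(uint64_t key, const unsigned char *in, unsigned char *out,
                     size_t len, size_t first, size_t stride) {
  unsigned char ks[BLK];
  size_t b, i, off, n;
  assert(stride > 0);
  for (b = first; b * BLK < len; b += stride) {
    off = b * BLK;
    n = len - off < BLK ? len - off : BLK;
    toy_ks_block(key, (uint64_t)b, ks);  /* counter value == block index */
    for (i = 0; i < n; i++) out[off + i] = in[off + i] ^ ks[i];
  }
}
```

Because each lane derives its counter from the block index, the combined output is byte-for-byte identical to a single pass with first = 0 and stride = 1, regardless of the order in which the lanes run.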

With a noninterleaving approach, you must wait at least until the length of the message is known, which is often when all of the data is finally available. Even if the message length is known in advance, you must wait for a large percentage of the data to show up before the second thread can be launched.

Even the interleaved approach is a lot easier when the size of the message is known in advance because it makes it easier to get the message all in one place. If you need the whole message to come in before you know the length, parallelization may not be worthwhile, because in many cases, waiting for an entire message to come in before beginning work can introduce enough latency to thwart the benefits of parallelization.

If you aren't generally going to get an entire message all at once, but you are able to determine the biggest message you might get, another reasonably easy approach is to allocate a result buffer big enough to hold the largest possible message.

For the sake of simplicity, let's assume that the message arrives all at once and you might want to process a message with two parallel threads. The following code provides an example API that can handle CTR mode encryption and decryption in parallel (remember that encryption and decryption are the same operation in CTR mode).

Because we assume the message is available up front, all of the information we need to operate on a message is passed into the function spc_pctr_setup( ), which requires a context object (here, the type is SPC_CTR2_CTX), the key, the key length in bytes, a nonce SPC_BLOCK_SZ - SPC_CTR_BYTES in length, the input buffer, the length of the message, and the output buffer. This function does not do any of the encryption and decryption, nor does it copy the input buffer anywhere.

To process the first block, as well as every second block after that, call spc_pctr_do_odd( ), passing in a pointer to the context object. To process the remaining blocks (the second block and every second block after that), call spc_pctr_do_even( ) in the same way. Nothing else is required because the input and output buffers used are the ones passed to the spc_pctr_setup( ) function. The two calls work on disjoint blocks, so they can run on separate threads; when both have finished, call spc_pctr_final( ) to wipe the context. If you test, you'll notice that the results are exactly the same as with the CTR mode implementation from Recipe 5.9.

This code requires the preliminaries from Recipe 5.5, as well as the spc_memset( ) function from Recipe 13.2.

#include <stdlib.h>

#include <string.h>

typedef struct {

SPC_KEY_SCHED ks;

size_t len;

unsigned char ctr_odd[SPC_BLOCK_SZ];

unsigned char ctr_even[SPC_BLOCK_SZ];

unsigned char *inptr_odd;

unsigned char *inptr_even;

unsigned char *outptr_odd;

unsigned char *outptr_even;

} SPC_CTR2_CTX;

static void pctr_increment(unsigned char *ctr) {

unsigned char *x = ctr + SPC_CTR_BYTES;

while (x-- != ctr) if (++(*x)) return;

}

void spc_pctr_setup(SPC_CTR2_CTX *ctx, unsigned char *key, size_t kl,

unsigned char *nonce, unsigned char *in, size_t len,

unsigned char *out) {

SPC_ENCRYPT_INIT(&(ctx->ks), key, kl);

spc_memset(key,0, kl);

memcpy(ctx->ctr_odd, nonce, SPC_BLOCK_SZ - SPC_CTR_BYTES);

spc_memset(ctx->ctr_odd + SPC_BLOCK_SZ - SPC_CTR_BYTES, 0, SPC_CTR_BYTES);

memcpy(ctx->ctr_even, nonce, SPC_BLOCK_SZ - SPC_CTR_BYTES);

spc_memset(ctx->ctr_even + SPC_BLOCK_SZ - SPC_CTR_BYTES, 0, SPC_CTR_BYTES);

pctr_increment(ctx->ctr_even);

ctx->inptr_odd = in;

ctx->inptr_even = in + SPC_BLOCK_SZ;

ctx->outptr_odd = out;

ctx->outptr_even = out + SPC_BLOCK_SZ;

ctx->len = len;

}

void spc_pctr_do_odd(SPC_CTR2_CTX *ctx) {

size_t i, j;

unsigned char final[SPC_BLOCK_SZ];

for (i = 0; i + SPC_BLOCK_SZ < ctx->len; i += 2 * SPC_BLOCK_SZ) {

SPC_DO_ENCRYPT(&(ctx->ks), ctx->ctr_odd, ctx->outptr_odd);

pctr_increment(ctx->ctr_odd);

pctr_increment(ctx->ctr_odd);

for (j = 0; j < SPC_BLOCK_SZ / sizeof(int); j++)

((int *)ctx->outptr_odd)[j] ^= ((int *)ctx->inptr_odd)[j];

ctx->outptr_odd += SPC_BLOCK_SZ * 2;

ctx->inptr_odd += SPC_BLOCK_SZ * 2;

}

if (i < ctx->len) {

SPC_DO_ENCRYPT(&(ctx->ks), ctx->ctr_odd, final);

for (j = 0; j < ctx->len - i; j++)

ctx->outptr_odd[j] = final[j] ^ ctx->inptr_odd[j];

}

}

void spc_pctr_do_even(SPC_CTR2_CTX *ctx) {

size_t i, j;

unsigned char final[SPC_BLOCK_SZ];

for (i = SPC_BLOCK_SZ; i + SPC_BLOCK_SZ < ctx->len; i += 2 * SPC_BLOCK_SZ) {

SPC_DO_ENCRYPT(&(ctx->ks), ctx->ctr_even, ctx->outptr_even);

pctr_increment(ctx->ctr_even);

pctr_increment(ctx->ctr_even);

for (j = 0; j < SPC_BLOCK_SZ / sizeof(int); j++)

((int *)ctx->outptr_even)[j] ^= ((int *)ctx->inptr_even)[j];

ctx->outptr_even += SPC_BLOCK_SZ * 2;

ctx->inptr_even += SPC_BLOCK_SZ * 2;

}

if (i < ctx->len) {

SPC_DO_ENCRYPT(&(ctx->ks), ctx->ctr_even, final);

for (j = 0; j < ctx->len - i; j++)

ctx->outptr_even[j] = final[j] ^ ctx->inptr_even[j];

}

}

int spc_pctr_final(SPC_CTR2_CTX *ctx) {

spc_memset(ctx, 0, sizeof(SPC_CTR2_CTX));

return 1;

}

See Also

Recipe 5.5, Recipe 5.9, Recipe 13.2

5.14. Parallelizing Encryption and Decryption in Arbitrary Modes (Breaking Compatibility)

Problem

You are using a cipher mode that is not intrinsically parallelizable, but you have a large data set and want to take advantage of multiple processors at your disposal.

Solution

Treat the data as multiple streams of interleaved data.

Discussion

TIP

Parallelizing encryption and decryption does not necessarily result in a speed improvement. To provide any chance of a speedup, you will certainly need to ensure that multiple processors are working in parallel. Even in such an environment, data sets may be too small to run faster when they are processed in parallel.

Recipe 5.13 demonstrates how to parallelize CTR mode encryption on a per-block level using a single encryption context. Instead of having spc_pctr_do_even( ) and spc_pctr_do_odd( ) share a key and nonce, you could use two separate encryption contexts. In such a case, there is no need to limit your choice of mode to one that is intrinsically parallelizable. However, note that you won't get the same results when using two separate contexts as you do when you use a single context, even if you use the same key and IV or nonce (remembering that IV/nonce reuse is a bad idea—and that certainly applies here).

One consideration is how much to interleave. There's no need to interleave on a block level. For example, if you are using two parallel encryption contexts, you could encrypt the first 1,024 bytes of data with the first context, then alternate every 1,024 bytes.

Generally, it is best to use a different key for each context. You can derive multiple keys from a single base key, as shown in Recipe 4.11.

It's easiest to consider interleaving only at the plaintext level, particularly if you're using a block-based mode, where padding will generally be added for each cipher context. In such a case, you would send the encrypted data in multiple independent streams and reassemble it after decryption.
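Here is one way the plaintext-level split and reassembly might look. It is a sketch only: the helper names are hypothetical, the chunk size and number of contexts are parameters rather than anything fixed by this recipe, and the actual encryption of each stream is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Which of nctx contexts handles the byte at offset off, when the message is
 * dealt out round-robin in chunk-sized pieces. */
static size_t lane_for_offset(size_t off, size_t chunk, size_t nctx) {
  assert(chunk > 0 && nctx > 0);
  return (off / chunk) % nctx;
}

/* Copies the bytes belonging to `lane` out of msg into stream, in order.
 * Returns the number of bytes written. */
static size_t split_lane(const unsigned char *msg, size_t len, size_t chunk,
                         size_t nctx, size_t lane, unsigned char *stream) {
  size_t off, n, w = 0;
  assert(chunk > 0 && nctx > 0 && lane < nctx);
  for (off = lane * chunk; off < len; off += chunk * nctx) {
    n = len - off < chunk ? len - off : chunk;
    memcpy(stream + w, msg + off, n);
    w += n;
  }
  return w;
}

/* Inverse of split_lane(): scatters a decrypted stream back into msg.
 * Returns the number of bytes consumed from stream. */
static size_t merge_lane(unsigned char *msg, size_t len, size_t chunk,
                         size_t nctx, size_t lane, const unsigned char *stream) {
  size_t off, n, r = 0;
  assert(chunk > 0 && nctx > 0 && lane < nctx);
  for (off = lane * chunk; off < len; off += chunk * nctx) {
    n = len - off < chunk ? len - off : chunk;
    memcpy(msg + off, stream + r, n);
    r += n;
  }
  return r;
}
```

With two contexts and 1,024-byte chunks, a 2,500-byte message splits into one stream of 1,476 bytes and one of 1,024 bytes. The receiver needs the total length to put the pieces back in place.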

See Also

Recipe 4.11, Recipe 5.13

5.15. Performing File or Disk Encryption

Problem

You want to encrypt a file or a disk.

Solution

If you're willing to use a nonce or an initialization vector, standard modes such as CBC and CTR are acceptable. For file-at-a-time encryption, you can avoid the use of a nonce or IV altogether by using the LION construction, described in Section 5.15.3.

Generally, keys will be generated from a password. For that, use PKCS #5, as discussed in Recipe 4.10.

Discussion

Disk encryption is usually done in fixed-size chunks at the operating system level. File encryption can be performed in chunks so that random access to an encrypted file doesn't require decrypting the entire file. This also has the benefit that part of a file can be changed without reencrypting the entire file.

CBC mode is commonly used for this purpose, and it is used on chunks that are a multiple of the block size of the underlying block cipher, so that padding is never necessary. This eliminates any message expansion that one would generally expect with CBC mode.

However, when people are doing disk or file encryption with CBC mode, they often use a fixed initialization vector. That's a bad idea because an initialization vector is expected to be random for CBC mode to obtain its security goals. Using a fixed IV leads to dictionary-like attacks that can often lead to recovering, at the very least, the beginning of a file.

Other modes that require only a nonce (not an initialization vector) tend to be streaming modes. These fail miserably when used for disk encryption if the nonce does not change every single time the contents associated with that nonce change.

TIP

Keys for disk encryption are generally created from a password. Such keys will be only as strong as the password. See Recipe 4.10 for a discussion of turning a password into a cryptographic key.

For example, if you're encrypting file-by-file in 8,192-byte chunks, you need a separate nonce for each 8,192-byte chunk, and you need to select a new nonce every single time you want to protect a modified version of that chunk. You cannot just make incremental changes, then reencrypt with the same nonce.

In fact, even for modes where sequential nonces are possible, they really don't make much sense in the context of file encryption. For example, some people think they can use just one CTR mode nonce for the entire disk. But if you ever reuse the same piece of keystream, there are attacks. Therefore, any time you change even a small piece of data, you will have to reencrypt the entire disk using a different nonce to maintain security. Clearly, that isn't practical.

Therefore, no matter what mode you choose to use, you should choose random initial values.

Many people don't like IVs or nonces for file encryption because of storage space issues. They believe they shouldn't "waste" space on storing an IV or nonce. When you're encrypting fixed-size chunks, there are not any viable alternatives; if you want to ensure security, you must use an IV.
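To put the storage cost in perspective, the following sketch lays out a file as a sequence of records, each holding an IV followed by one chunk of ciphertext. The 8,192-byte chunk comes from the earlier example; the 16-byte IV (one AES block) and the record layout itself are assumptions made for illustration, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK_SZ 8192  /* plaintext bytes per chunk */
#define IV_SZ    16    /* assumed IV size: one AES block */

/* Offset in the stored file of chunk n's record (IV, then ciphertext). */
static size_t chunk_record_offset(size_t n) {
  return n * (IV_SZ + CHUNK_SZ);
}

/* Maps a plaintext offset to the chunk holding it and the position within. */
static void locate(size_t pt_off, size_t *chunk, size_t *within) {
  assert(chunk && within);
  *chunk = pt_off / CHUNK_SZ;
  *within = pt_off % CHUNK_SZ;
}

/* Total bytes spent on IVs for a file with pt_len bytes of plaintext. */
static size_t iv_overhead(size_t pt_len) {
  size_t chunks = (pt_len + CHUNK_SZ - 1) / CHUNK_SZ;  /* round up */
  return chunks * IV_SZ;
}
```

Under these assumptions a 1 MB file carries 2,048 bytes of IVs, an overhead of about 0.2 percent.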

If you're willing to accept message expansion, you might want to consider a high-level mode such as CWC, so that you can also incorporate integrity checks. In practice, integrity checks are usually ignored on filesystems, though, and the filesystems trust that the operating system's access control system will ensure integrity.

If you're willing to encrypt and decrypt on a per-file basis, where you cannot decrypt the file in parts, you can get rid of the need for an initialization vector by using LION, a construction that takes a stream cipher and a hash function and turns them into a single block cipher with an arbitrary, variable block size. You then use that cipher in ECB mode.

Throughout this book, we repeatedly advise against using raw block cipher operations for things like file encryption. However, when the block size is always the same length as the message you want to encrypt, ECB mode isn't so bad. The only problem is that, given a {key, plaintext} pair, an unchanged file will always encrypt to the same value. Therefore, an attacker who has seen a particular file encrypted once can find any unchanged versions of that file encrypted with the same key. A single change in the file thwarts this problem, however. In practice, most people probably won't be too concerned with this kind of problem.

Using raw block cipher operations with LION is useful only if the block size really is the size of the file. You can't break the file up into 8,192-byte chunks or anything like that, which can have a negative impact on performance, particularly as the file size gets larger.

Considering what we've discussed, something like CBC mode with a randomly chosen IV per block is probably the best solution for pretty much any use, even if it does take up some additional disk space. Nonetheless, we recognize that people may want to take an approach where they only need to have a key, and no IV or nonce.

Therefore, we'll show you LION, built out of the RC4 implementation from Recipe 5.23 and SHA1 (see Recipe 6.7). The structure of LION is shown in Figure 5-6.

TIP

While we cover RC4 because it is popular, we strongly recommend you use SNOW 2.0 instead, because it seems to have a much more comfortable security margin.

The one oddity of this technique is that files must be longer than the output size of the message digest function (20 bytes in the case of SHA1). Therefore, if you have files that small, you will either need to come up with a nonambiguous padding scheme, which is quite complicated to do securely, or you'll need to abandon LION (either just for small messages or in general).

LION requires a key that is twice as long as the output size of the message digest function. As with regular CBC-style encryption for files, if you're using a cipher that takes fixed-size keys, we expect you'll generate a key of the appropriate length from a password.

Figure 5-6. The structure of LION

We also assume a SHA1 implementation with a very standard API. Here, we use an API that works with OpenSSL, which should be easily adaptable to other libraries. To switch hash functions, replace the SHA1 calls as appropriate, and change the value of HASH_SZ to be the digest size of the hash function that you wish to use.

The function spc_lion_encrypt( ) encrypts its first argument, putting the result into the memory pointed to by the second argument. The third argument specifies the size of the message, and the last argument is the key. Again, note that the input size must be larger than the hash function's output size.

The spc_lion_decrypt( ) function takes a similar argument set as spc_lion_encrypt( ), merely performing the inverse operation.

#include <stdio.h>

#include <openssl/rc4.h>

#include <openssl/sha.h>

#define HASH_SZ 20

#define NUM_WORDS (HASH_SZ / sizeof(int))

void spc_lion_encrypt(char *in, char *out, size_t blklen, char *key) {

int i, tmp[NUM_WORDS];

RC4_KEY k;

/* Round 1: R = R ^ RC4(L ^ K1) */

for (i = 0; i < NUM_WORDS; i++)

tmp[i] = ((int *)in)[i] ^ ((int *)key)[i];

RC4_set_key(&k, HASH_SZ, (char *)tmp);

RC4(&k, blklen - HASH_SZ, in + HASH_SZ, out + HASH_SZ);

/* Round 2: L = L ^ SHA1(R) */

SHA1(out + HASH_SZ, blklen - HASH_SZ, out);

for (i = 0; i < NUM_WORDS; i++)

((int *)out)[i] ^= ((int *)in)[i];

/* Round 3: R = R ^ RC4(L ^ K2) */

for (i = 0; i < NUM_WORDS; i++)

tmp[i] = ((int *)out)[i] ^ ((int *)key)[i + NUM_WORDS];

RC4_set_key(&k, HASH_SZ, (char *)tmp);

RC4(&k, blklen - HASH_SZ, out + HASH_SZ, out + HASH_SZ);

}

void spc_lion_decrypt(char *in, char *out, size_t blklen, char *key) {

int i, tmp[NUM_WORDS];

RC4_KEY k;

for (i = 0; i < NUM_WORDS; i++)

tmp[i] = ((int *)in)[i] ^ ((int *)key)[i + NUM_WORDS];

RC4_set_key(&k, HASH_SZ, (char *)tmp);

RC4(&k, blklen - HASH_SZ, in + HASH_SZ, out + HASH_SZ);

SHA1(out + HASH_SZ, blklen - HASH_SZ, out);

for (i = 0; i < NUM_WORDS; i++) {

((int *)out)[i] ^= ((int *)in)[i];

tmp[i] = ((int *)out)[i] ^ ((int *)key)[i];

}

RC4_set_key(&k, HASH_SZ, (char *)tmp);

RC4(&k, blklen - HASH_SZ, out + HASH_SZ, out + HASH_SZ);

}
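If you want to experiment with the three-round structure without linking against OpenSSL, the following self-contained sketch follows the same rounds using toy primitives. The FNV-1a hash and xorshift keystream here are stand-ins for illustration only and are NOT secure; the 8-byte digest size and all function names are assumptions of this sketch, not part of the recipe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HSZ 8  /* toy digest size in bytes; the key is 2 * HSZ bytes */

/* Toy hash (64-bit FNV-1a). Illustration only; NOT cryptographic. */
static void toy_hash(const unsigned char *d, size_t n, unsigned char md[HSZ]) {
  uint64_t h = 0xcbf29ce484222325ULL;
  size_t i;
  for (i = 0; i < n; i++) { h ^= d[i]; h *= 0x100000001b3ULL; }
  for (i = 0; i < HSZ; i++) md[i] = (unsigned char)(h >> (8 * i));
}

/* Toy stream cipher (xorshift64) keyed with HSZ bytes. NOT secure. */
static void toy_stream(const unsigned char k[HSZ], const unsigned char *in,
                       unsigned char *out, size_t n) {
  uint64_t s = 0;
  size_t i;
  for (i = 0; i < HSZ; i++) s |= (uint64_t)k[i] << (8 * i);
  if (!s) s = 1;  /* the xorshift state must be nonzero */
  for (i = 0; i < n; i++) {
    s ^= s << 13; s ^= s >> 7; s ^= s << 17;
    out[i] = in[i] ^ (unsigned char)s;
  }
}

static void xor_hsz(unsigned char *dst, const unsigned char *a,
                    const unsigned char *b) {
  size_t i;
  for (i = 0; i < HSZ; i++) dst[i] = a[i] ^ b[i];
}

/* in and out must not overlap, and len must be greater than HSZ. */
static void toy_lion_encrypt(const unsigned char *in, unsigned char *out,
                             size_t len, const unsigned char key[2 * HSZ]) {
  unsigned char t[HSZ];
  assert(len > HSZ);
  xor_hsz(t, in, key);                             /* R = R ^ S(L ^ K1) */
  toy_stream(t, in + HSZ, out + HSZ, len - HSZ);
  toy_hash(out + HSZ, len - HSZ, t);               /* L = L ^ H(R) */
  xor_hsz(out, in, t);
  xor_hsz(t, out, key + HSZ);                      /* R = R ^ S(L ^ K2) */
  toy_stream(t, out + HSZ, out + HSZ, len - HSZ);
}

static void toy_lion_decrypt(const unsigned char *in, unsigned char *out,
                             size_t len, const unsigned char key[2 * HSZ]) {
  unsigned char t[HSZ];
  assert(len > HSZ);
  xor_hsz(t, in, key + HSZ);                       /* undo round 3 */
  toy_stream(t, in + HSZ, out + HSZ, len - HSZ);
  toy_hash(out + HSZ, len - HSZ, t);               /* undo round 2 */
  xor_hsz(out, in, t);
  xor_hsz(t, out, key);                            /* undo round 1 */
  toy_stream(t, out + HSZ, out + HSZ, len - HSZ);
}
```

Decryption simply replays the rounds in reverse order with the two key halves swapped, which is why no separate inverse of the hash or the stream cipher is needed.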

See Also

Recipe 4.10, Recipe 5.23, Recipe 6.7

5.16. Using a High-Level, Error-Resistant Encryption and Decryption API

Problem

You want to do encryption or decryption without the hassle of worrying about choosing an encryption algorithm, performing an integrity check, managing a nonce, and so on.

Solution

Use the following "Encryption Queue" implementation, which relies on the reference CWC mode implementation (discussed in Recipe 5.10) and the key derivation function from Recipe 4.11.

Discussion

WARNING

Be sure to take into account the fact that functions in this API can fail, particularly the decryption functions. If a decryption function fails, you need to fail gracefully. In Recipe 9.12, we discuss many issues that help ensure robust network communication that we don't cover here.

This recipe provides an easy-to-use interface to symmetric encryption. The two ends of communication must set up cipher queues in exactly the same configuration. Thereafter, they can exchange messages easily until the queues are destroyed.

This code relies on the reference CWC implementation discussed in Recipe 5.10. We use CWC mode because it gives us both encryption and integrity checking using a single key with a minimum of fuss.

We add a new data type, SPC_CIPHERQ, which is responsible for keeping track of queue state. Here's the declaration of the SPC_CIPHERQ data type:

typedef struct {

cwc_t ctx;

unsigned char nonce[SPC_BLOCK_SZ];

} SPC_CIPHERQ;

SPC_CIPHERQ objects are initialized by calling spc_cipherq_setup( ), which requires the code from Recipe 5.5, as well as an implementation of the randomness API discussed in Recipe 11.2:

#include <stdlib.h>

#include <string.h>

#include <cwc.h>

#define MAX_KEY_LEN (32) /* 256 bits */

size_t spc_cipherq_setup(SPC_CIPHERQ *q, unsigned char *basekey, size_t keylen,

size_t keyuses) {

unsigned char dk[MAX_KEY_LEN];

unsigned char salt[5];

spc_rand(salt, 5);

spc_make_derived_key(basekey, keylen, salt, 5, 1, dk, keylen);

if (!cwc_init(&(q->ctx), dk, keylen * 8)) return 0;

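memcpy(q->nonce, salt, 5);

/* Start the message counter (the remaining nonce bytes) at zero. */

spc_memset(q->nonce + 5, 0, sizeof(q->nonce) - 5);

/* Wipe the derived key along with the base key. */

spc_memset(dk, 0, sizeof(dk));

spc_memset(basekey, 0, keylen);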

return keyuses + 1;

}

The function has the following arguments:

q

SPC_CIPHERQ context object.

basekey

Shared key used by both ends of communication (the "base key" that will be used to derive session keys).

keylen

Length of the shared key in bytes, which must be 16, 24, or 32.

keyuses

Indicates how many times the current key has been used to initialize a SPC_CIPHERQ object. If you are going to reuse keys, it is important that this argument be used properly.

WARNING

On error, spc_cipherq_setup() returns 0. Otherwise, it returns the next value it would expect to receive for the keyuses argument. Be sure to save this value if you ever plan to reuse keys.

Note also that basekey is erased upon successful initialization.

Every time you initialize an SPC_CIPHERQ object, a key specifically for use with that queue instance is generated, using the basekey and the keyuses arguments. To derive the key, we use the key derivation function discussed in Recipe 4.11. Note that this is useful when two parties share a long-term key that they wish to keep reusing. However, if you exchange a session key at connection establishment (i.e., using one of the techniques from Chapter 8), the key derivation step is unnecessary, because reusing {key, nonce} pairs is already incredibly unlikely in such a situation.

Both communicating parties must initialize their queue with identical parameters.

When you're done with a queue, you should wipe its key material and nonce state by calling spc_cipherq_cleanup( ):

void spc_cipherq_cleanup(SPC_CIPHERQ *q) {

spc_memset(q, 0, sizeof(SPC_CIPHERQ));

}

Here are implementations of the encryption and decryption operations (including a helper function). Each returns a newly allocated buffer containing the results of the appropriate operation, which the caller must release with free( ), or NULL on failure:

static void increment_counter(SPC_CIPHERQ *q) {

if (!++q->nonce[10]) if (!++q->nonce[9]) if (!++q->nonce[8]) if (!++q->nonce[7])

if (!++q->nonce[6]) ++q->nonce[5];

}

unsigned char *spc_cipherq_encrypt(SPC_CIPHERQ *q, unsigned char *m, size_t mlen,

size_t *ol) {

unsigned char *ret;

if (!(ret = (unsigned char *)malloc(mlen + 16))) {

if (ol) *ol = 0;

return 0;

}

cwc_encrypt_message(&(q->ctx), 0, 0, m, mlen, q->nonce, ret);

increment_counter(q);

if (ol) *ol = mlen + 16;

return ret;

}

unsigned char *spc_cipherq_decrypt(SPC_CIPHERQ *q, unsigned char *m, size_t mlen,

size_t *ol) {

unsigned char *ret;

if (mlen < 16 || !(ret = (unsigned char *)malloc(mlen - 16))) {

if (ol) *ol = 0;

return 0;

}

if (!cwc_decrypt_message(&(q->ctx), 0, 0, m, mlen, q->nonce, ret)) {

free(ret);

if (ol) *ol = 0;

return 0;

}

increment_counter(q);

if (ol) *ol = mlen - 16;

return ret;

}

The functions spc_cipherq_encrypt( ) and spc_cipherq_decrypt( ) each take four arguments:

q

SPC_CIPHERQ object to use for encryption or decryption.

m

Message to be encrypted or decrypted.

mlen

Length of the message to be encrypted or decrypted, in bytes.

ol

The number of bytes returned from the encryption or decryption operation is stored in the size_t that this pointer refers to. This may be NULL if you don't need the information. On success, the number of bytes returned will always be the message length plus 16 bytes for encryption, or the message length minus 16 bytes for decryption; on failure, 0 is stored.

These functions don't check for counter rollover because you can use this API to send over 250 trillion messages with a single key, which should be adequate for any use.
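If you would rather detect rollover than rely on the size of the counter, a variant of the increment helper can report it. This is a sketch: increment_counter_checked( ) is a hypothetical name, and it assumes the layout used in the code above, where bytes 5 through 10 of the 11-byte nonce hold the message counter with the least significant byte last.

```c
#include <assert.h>
#include <stddef.h>

#define NONCE_LEN   11
#define COUNTER_OFF 5  /* bytes 5..10 hold the big-endian message counter */

/* Increments the counter portion of the nonce. Returns 1 on success, or 0 if
 * the counter wrapped around to zero, in which case the caller must rekey
 * before sending another message. */
static int increment_counter_checked(unsigned char nonce[NONCE_LEN]) {
  size_t i = NONCE_LEN;
  assert(nonce);
  while (i-- > COUNTER_OFF)
    if (++nonce[i]) return 1;  /* no carry out of this byte, so we are done */
  return 0;                    /* every counter byte wrapped to zero */
}
```

The bytes before COUNTER_OFF are never touched, so any salt stored there survives the increment.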

TIP

Instead of using such a large counter, it is a good idea to use only five bytes for the counter and initialize the rest with a random salt value. The random salt helps prevent against a class of problems in which the attacker amortizes the cost of an attack by targeting a large number of possible keys at once. In Recipe 9.12, we show a similar construction that uses both a salt and a counter in the nonce.

If you do think you might send more messages under a single key, be sure to rekey in time. (This scheme is set up to handle at least four trillion keyings with a single base key.)

In the previous code, the nonces are separately managed by both parties in the communication. They each increment by one when appropriate, and will fail to decrypt a message with the wrong nonce. Thus, this solution prevents capture replay attacks and detects message drops or message reordering, all as a result of implicit message numbering. Some people like explicit message numbering and would send at least a message number, if not the entire nonce, with each message (though you should always compare against the previous nonce to make sure it's increasing). In addition, if there's a random portion to the nonce as we suggested above, the random portion needs to be communicated to both parties. In Recipe 9.12, we send the nonce explicitly with each message, which helps communicate the portion randomly selected at connection setup time.
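For the explicit-numbering variant, the check that each received message number is larger than the last one accepted might look like the following sketch. The function name and the choice to compare only the counter portion of the nonce are our assumptions; the counter is taken to be stored big-endian, so a bytewise comparison matches numeric order.

```c
#include <stddef.h>
#include <string.h>

/* last: counter bytes of the most recently accepted message (updated on success).
 * recv: counter bytes taken from the incoming message.
 * Returns 1 and records recv if it is strictly greater than last; returns 0
 * for a replayed or out-of-order message, leaving last unchanged.
 */
int demo_accept_counter(unsigned char *last, const unsigned char *recv, size_t len) {
  /* memcmp( ) compares as unsigned bytes, which is numeric order for
   * big-endian counters of equal length. The counter is not secret, so a
   * non-constant-time comparison is acceptable here.
   */
  if (memcmp(recv, last, len) <= 0) return 0;
  memcpy(last, recv, len);
  return 1;
}
```

A check like this should run only after the message has authenticated successfully; otherwise an attacker could advance the stored counter with a forged message.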

It's possible to mix and match calls to spc_cipherq_encrypt( ) and spc_cipherq_decrypt( ) using a single context. However, if you want to use this API in this manner, do so only if the communicating parties send messages in lockstep. If parties can communicate asynchronously (that is, without taking turns), there is the possibility for a race condition in which the SPC_CIPHERQ states on each side of the communication get out of sync, which will needlessly cause decryption operations to fail.

If you need to perform asynchronous communication with an infrastructure like this, you could use two SPC_CIPHERQ instances, one where the client encrypts messages for the server to decrypt, and another where the server encrypts messages for the client to decrypt.

The choice you need to make is whether each SPC_CIPHERQ object should be keyed separately or should share the same key. Sharing the same key is possible, as long as you ensure that the same {key, nonce} pair is never reused. The way to do this is to manage two sets of nonces that can never collide. Generally, you do this by setting the high bit of the nonce buffer to 1 in one context and 0 in another context.

Here's a function that takes an existing context that has been set up, but not otherwise used, and turns it into two contexts with the same key:

void spc_cipherq_async_setup(SPC_CIPHERQ *q1, SPC_CIPHERQ *q2) {

memcpy(q2, q1, sizeof(SPC_CIPHERQ));

q1->nonce[0] &= 0x7f; /* The upper bit of q1's nonce is always 0. */

q2->nonce[0] |= 0x80; /* The upper bit of q2's nonce is always 1. */

}

We show a similar trick in which we use only one abstraction in Recipe 9.12.

See Also

Recipe 4.11, Recipe 5.5, Recipe 5.10, Recipe 9.12, Recipe 11.2

5.17. Performing Block Cipher Setup (for CBC, CFB, OFB, and ECB Modes) in OpenSSL

Problem

You need to set up a cipher so that you can perform encryption and/or decryption operations in CBC, CFB, OFB, or ECB mode.

Solution

Here are the steps you need to perform for cipher setup in OpenSSL, using their high-level API:

1. Make sure your code includes openssl/evp.h and links to libcrypto (-lcrypto).

2. Decide which algorithm and mode you want to use, looking up the mode in Table 5-6 to determine which function instantiates an OpenSSL object representing that mode. Note that OpenSSL provides only a CTR mode implementation for AES. See Recipe 5.9 for more on CTR mode.

3. Instantiate a cipher context (type EVP_CIPHER_CTX).

4. Pass a pointer to the cipher context to EVP_CIPHER_CTX_init( ) to initialize memory properly.

5. Choose an IV or nonce, if appropriate to the mode (all except ECB).

6. Initialize the mode by calling EVP_EncryptInit_ex( ) or EVP_DecryptInit_ex( ), as appropriate:

   int EVP_EncryptInit_ex(EVP_CIPHER_CTX *ctx, const EVP_CIPHER *type, ENGINE *engine,
                          unsigned char *key, unsigned char *ivornonce);

   int EVP_DecryptInit_ex(EVP_CIPHER_CTX *ctx, const EVP_CIPHER *type, ENGINE *engine,
                          unsigned char *key, unsigned char *ivornonce);

7. If desired, perform any additional configuration the cipher may allow (see Recipe 5.20).

Discussion

WARNING

Use the raw OpenSSL API only when absolutely necessary because there is a huge potential for introducing a security vulnerability by accident. For general-purpose use, we recommend a high-level abstraction, such as that discussed in Recipe 5.16.

The OpenSSL EVP API is a reasonably high-level interface to a multitude of cryptographic primitives. It attempts to abstract out most algorithm dependencies, so that algorithms are easy to swap.[14]

The EVP_EncryptInit_ex( ) and EVP_DecryptInit_ex( ) functions set up a cipher context object to be used for further operations. Both take the same five arguments, which provide all the information necessary before encryption or decryption can begin:

ctx

Pointer to an EVP_CIPHER_CTX object, which stores cipher state across calls.

type

Pointer to an EVP_CIPHER object, which represents the cipher configuration to use (see the later discussion).

engine

Pointer to an ENGINE object representing the actual implementation to use. For example, if you want to use hardware acceleration, you can pass in an ENGINE object that represents your cryptographic accelerator.

key

Pointer to the encryption key to be used.

ivornonce

Pointer to an initialization vector or nonce, if appropriate (use NULL otherwise). For CBC, CFB, and OFB modes, the initialization vector or nonce is always the same size as the block size of the cipher, which is often different from the key size of the cipher.

There are also deprecated versions of these calls, EVP_EncryptInit( ) and EVP_DecryptInit( ) , that are the same except that they do not take the engine argument, and they use only the built-in software implementation.

Calling a function that returns an EVP_CIPHER object will cause the cipher's implementation to load dynamically and place information about the algorithm into an internal table if it has not yet done so. Alternatively, you can load all possible symmetric ciphers at once with a call to the function OpenSSL_add_all_ciphers( ), or all ciphers and message digest algorithms with a call to the function OpenSSL_add_all_algorithms( ) (neither function takes any arguments). For algorithms that have been loaded, you can retrieve pointers to their objects by name using the EVP_get_cipherbyname( ) function, which takes a single parameter of type char *, representing the desired cipher configuration.

Table 5-6 summarizes the possible functions that can load ciphers (if necessary) and return EVP_CIPHER objects. The table also shows the strings that can be used to look up loaded ciphers.

TIP

As noted in Recipe 5.2, we personally recommend AES-based solutions, or (of the ciphers OpenSSL offers) Triple-DES if AES is not appropriate. If you use other algorithms, be sure to research them thoroughly.

Table 5-6. Cipher instantiation reference

Cipher | Key strength / actual size (if different) | Cipher mode | Call for EVP_CIPHER object | Cipher lookup string
AES | 128 bits | ECB | EVP_aes_128_ecb( ) | aes-128-ecb
AES | 128 bits | CBC | EVP_aes_128_cbc( ) | aes-128-cbc
AES | 128 bits | CFB | EVP_aes_128_cfb( ) | aes-128-cfb
AES | 128 bits | OFB | EVP_aes_128_ofb( ) | aes-128-ofb
AES | 192 bits | ECB | EVP_aes_192_ecb( ) | aes-192-ecb
AES | 192 bits | CBC | EVP_aes_192_cbc( ) | aes-192-cbc
AES | 192 bits | CFB | EVP_aes_192_cfb( ) | aes-192-cfb
AES | 192 bits | OFB | EVP_aes_192_ofb( ) | aes-192-ofb
AES | 256 bits | ECB | EVP_aes_256_ecb( ) | aes-256-ecb
AES | 256 bits | CBC | EVP_aes_256_cbc( ) | aes-256-cbc
AES | 256 bits | CFB | EVP_aes_256_cfb( ) | aes-256-cfb
AES | 256 bits | OFB | EVP_aes_256_ofb( ) | aes-256-ofb
Blowfish | 128 bits | ECB | EVP_bf_ecb( ) | bf-ecb
Blowfish | 128 bits | CBC | EVP_bf_cbc( ) | bf-cbc
Blowfish | 128 bits | CFB | EVP_bf_cfb( ) | bf-cfb
Blowfish | 128 bits | OFB | EVP_bf_ofb( ) | bf-ofb
CAST5 | 128 bits | ECB | EVP_cast_ecb( ) | cast-ecb
CAST5 | 128 bits | CBC | EVP_cast_cbc( ) | cast-cbc
CAST5 | 128 bits | CFB | EVP_cast_cfb( ) | cast-cfb
CAST5 | 128 bits | OFB | EVP_cast_ofb( ) | cast-ofb
DES | Effective: 56 bits; actual: 64 bits | ECB | EVP_des_ecb( ) | des-ecb
DES | Effective: 56 bits; actual: 64 bits | CBC | EVP_des_cbc( ) | des-cbc
DES | Effective: 56 bits; actual: 64 bits | CFB | EVP_des_cfb( ) | des-cfb
DES | Effective: 56 bits; actual: 64 bits | OFB | EVP_des_ofb( ) | des-ofb
DESX | Effective[15]: 120 bits; actual: 128 bits | CBC | EVP_desx_cbc( ) | desx
3-key Triple-DES | Effective: 112 bits; actual: 192 bits | ECB | EVP_des_ede3( ) | des-ede3
3-key Triple-DES | Effective: 112 bits; actual: 192 bits | CBC | EVP_des_ede3_cbc( ) | des-ede3-cbc
3-key Triple-DES | Effective: 112 bits; actual: 192 bits | CFB | EVP_des_ede3_cfb( ) | des-ede3-cfb
3-key Triple-DES | Effective: 112 bits; actual: 192 bits | OFB | EVP_des_ede3_ofb( ) | des-ede3-ofb
2-key Triple-DES | Effective: 112 bits; actual: 128 bits | ECB | EVP_des_ede( ) | des-ede
2-key Triple-DES | Effective: 112 bits; actual: 128 bits | CBC | EVP_des_ede_cbc( ) | des-ede-cbc
2-key Triple-DES | Effective: 112 bits; actual: 128 bits | CFB | EVP_des_ede_cfb( ) | des-ede-cfb
2-key Triple-DES | Effective: 112 bits; actual: 128 bits | OFB | EVP_des_ede_ofb( ) | des-ede-ofb
IDEA | 128 bits | ECB | EVP_idea_ecb( ) | idea-ecb
IDEA | 128 bits | CBC | EVP_idea_cbc( ) | idea-cbc
IDEA | 128 bits | CFB | EVP_idea_cfb( ) | idea-cfb
IDEA | 128 bits | OFB | EVP_idea_ofb( ) | idea-ofb
RC2™ | 128 bits | ECB | EVP_rc2_ecb( ) | rc2-ecb
RC2™ | 128 bits | CBC | EVP_rc2_cbc( ) | rc2-cbc
RC2™ | 128 bits | CFB | EVP_rc2_cfb( ) | rc2-cfb
RC2™ | 128 bits | OFB | EVP_rc2_ofb( ) | rc2-ofb
RC4™ | 40 bits | n/a | EVP_rc4_40( ) | rc4-40
RC4™ | 128 bits | n/a | EVP_rc4( ) | rc4
RC5™ | 128 bits | ECB | EVP_rc5_32_16_12_ecb( ) | rc5-ecb
RC5™ | 128 bits | CBC | EVP_rc5_32_16_12_cbc( ) | rc5-cbc
RC5™ | 128 bits | CFB | EVP_rc5_32_16_12_cfb( ) | rc5-cfb
RC5™ | 128 bits | OFB | EVP_rc5_32_16_12_ofb( ) | rc5-ofb

[15] There are known plaintext attacks against DESX that reduce the effective strength to 60 bits, but these are generally considered infeasible.

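For OFB mode, encryption and decryption are identical operations, so EVP_EncryptInit_ex( ) and EVP_DecryptInit_ex( ) are interchangeable in that case. CFB mode is also stream-based, but it feeds the ciphertext back into the cipher, so its encryption and decryption steps differ; use the initialization function that matches the operation you are performing.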

WARNING

While RC4 can be set up using these instructions, you must be very careful to set it up securely. We discuss how to do so in Recipe 5.23.

Here is an example of setting up an encryption context using 128-bit AES in CBC mode:

#include <stdlib.h>
#include <openssl/evp.h>
#include <openssl/rand.h>

/* key must be of size EVP_MAX_KEY_LENGTH.
 * iv must be of size EVP_MAX_IV_LENGTH.
 * Returns 0 on failure.
 */
EVP_CIPHER_CTX *sample_setup(unsigned char *key, unsigned char *iv) {
  EVP_CIPHER_CTX *ctx;

  /* This uses the OpenSSL PRNG. See Recipe 11.9 */
  if (RAND_bytes(key, EVP_MAX_KEY_LENGTH) != 1) return 0;
  if (RAND_bytes(iv, EVP_MAX_IV_LENGTH) != 1) return 0;
  if (!(ctx = (EVP_CIPHER_CTX *)malloc(sizeof(EVP_CIPHER_CTX)))) return 0;
  EVP_CIPHER_CTX_init(ctx);
  if (!EVP_EncryptInit_ex(ctx, EVP_aes_128_cbc( ), 0, key, iv)) {
    EVP_CIPHER_CTX_cleanup(ctx);
    free(ctx);
    return 0;
  }
  return ctx;
}

This example selects a key and initialization vector at random. Both of these items need to be communicated to any party that needs to decrypt the data. The caller therefore needs to be able to recover this information. In this example, we handle this by having the caller pass in allocated memory, which we fill with the new key and IV. The caller can then communicate them to the other party in whatever manner is appropriate.

Note that to make replacing algorithms easier, we always create keys and initialization vectors of the maximum possible length, using macros defined in the openssl/evp.h header file.

See Also

Recipe 5.2, Recipe 5.9, Recipe 5.16, Recipe 5.18, Recipe 5.20, Recipe 5.23


[14] EVP stands for "envelope."

5.18. Using Variable Key-Length Ciphers in OpenSSL

Problem

You're using a cipher with an adjustable key length, yet OpenSSL provides no default cipher configuration for your desired key length.

Solution

Initialize the cipher without a key, call EVP_CIPHER_CTX_set_key_length( ) to set the appropriate key length, then set the key.

Discussion

Many of the ciphers supported by OpenSSL support variable key lengths. Whereas some, such as AES, have an available call for each possible key length, others (in particular, RC4) allow for nearly arbitrary byte-aligned keys. Table 5-7 lists ciphers supported by OpenSSL, and the varying key lengths those ciphers can support.

Table 5-7. Variable key sizes

Cipher | OpenSSL-supported key sizes | Algorithm's possible key sizes
AES | 128, 192, and 256 bits | 128, 192, and 256 bits
Blowfish | Up to 256 bits | Up to 448 bits
CAST5 | 40-128 bits | 40-128 bits
RC2 | Up to 256 bits | Up to 1,024 bits
RC4 | Up to 256 bits | Up to 2,048 bits
RC5 | Up to 256 bits | Up to 2,040 bits

While RC2, RC4, and RC5 support absurdly high key lengths, it really is overkill to use more than a 256-bit symmetric key. There is not likely to be any greater security, only less efficiency. Therefore, OpenSSL puts a hard limit of 256 bits on key sizes.

When calling the OpenSSL cipher initialization functions, you can set to NULL any value you do not want to provide immediately. If the cipher requires data you have not yet provided, clearly encryption will not work properly.

Therefore, we can choose a cipher using EVP_EncryptInit_ex( ) without specifying a key, then set the key size using EVP_CIPHER_CTX_set_key_length( ) , which takes two arguments: the first is the context initialized by the call to EVP_EncryptInit_ex( ), and the second is the new key length in bytes.

Finally, we can set the key by calling EVP_EncryptInit_ex( ) again, passing in the context and any new data, along with NULL for any parameters we've already set. For example, the following code would set up a 256-bit version of Blowfish in CBC mode:

#include <stdlib.h>
#include <openssl/evp.h>

/* key must point to 32 bytes; iv must point to 8 bytes (the Blowfish block size).
 * Returns 0 on failure.
 */
EVP_CIPHER_CTX *blowfish_256_cbc_setup(unsigned char *key, unsigned char *iv) {
  EVP_CIPHER_CTX *ctx;

  if (!(ctx = (EVP_CIPHER_CTX *)malloc(sizeof(EVP_CIPHER_CTX)))) return 0;
  EVP_CIPHER_CTX_init(ctx);

  /* Uses 128-bit keys by default. We pass in NULLs for the parameters that we'll
   * fill in after properly setting the key length.
   */
  if (!EVP_EncryptInit_ex(ctx, EVP_bf_cbc( ), 0, 0, 0) ||
      !EVP_CIPHER_CTX_set_key_length(ctx, 32) ||
      !EVP_EncryptInit_ex(ctx, 0, 0, key, iv)) {
    EVP_CIPHER_CTX_cleanup(ctx);
    free(ctx);
    return 0;
  }
  return ctx;
}

5.19. Disabling Cipher Padding in OpenSSL in CBC Mode

Problem

You're encrypting in CBC or ECB mode, and the length of your data to encrypt is always a multiple of the block size. You would like to avoid padding because it adds an extra, unnecessary block of output.

Solution

OpenSSL has a function that can turn padding on and off for a context object:

int EVP_CIPHER_CTX_set_padding(EVP_CIPHER_CTX *ctx, int pad);

Discussion

Particularly when you are implementing another encryption mode, you may always be operating on block-sized chunks, and it can be inconvenient to deal with padding. Alternatively, some odd protocol may require a nonstandard padding scheme that causes you to pad the data manually before encryption (and to remove the pad manually after decryption).

The second argument of this function should be zero to turn padding off, and non-zero to turn it on.
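To see why block-aligned input gains a whole extra block when padding is left on, it helps to look at the standard block padding scheme itself, in which n bytes of value n are appended and n is always between 1 and the block size. The following sketch computes the padded length and pads and unpads a buffer by hand. The demo_ function names are ours, and the code assumes a block size no larger than 255 bytes; a protocol with a nonstandard scheme would replace the pad and unpad routines with its own.

```c
#include <stddef.h>
#include <string.h>

/* Length after standard block padding: at least one byte is always added. */
size_t demo_padded_len(size_t inl, size_t bs) {
  return (inl / bs + 1) * bs;
}

/* Appends n bytes of value n, where n = bs - (inl % bs). buf must have room
 * for demo_padded_len(inl, bs) bytes. Returns the padded length.
 */
size_t demo_pad(unsigned char *buf, size_t inl, size_t bs) {
  size_t n = bs - (inl % bs);

  memset(buf + inl, (int)n, n);
  return inl + n;
}

/* Returns the length with the pad removed, or (size_t)-1 if the pad is malformed. */
size_t demo_unpad(const unsigned char *buf, size_t len, size_t bs) {
  size_t i, n;

  if (len == 0 || len % bs) return (size_t)-1;
  n = buf[len - 1];
  if (n == 0 || n > bs) return (size_t)-1;
  for (i = len - n; i < len; i++)
    if (buf[i] != n) return (size_t)-1;
  return len - n;
}
```

With a 16-byte block, demo_padded_len(16, 16) is 32: the input was already aligned, yet a full block of pad is appended so that the receiver can always tell pad from data. That is the extra block this recipe lets you avoid.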

5.20. Performing Additional Cipher Setup in OpenSSL

Problem

Using OpenSSL, you want to adjust a configurable parameter of a cipher other than the key length.

Solution

OpenSSL provides an obtuse, ioctl()-style API for setting uncommon cipher parameters on a context object:

int EVP_CIPHER_CTX_ctrl(EVP_CIPHER_CTX *ctx, int type, int arg, void *ptr);

Discussion

OpenSSL doesn't provide much flexibility in adjusting cipher characteristics. For example, the three AES configurations are three specific instantiations of a cipher called Rijndael, which has nine different configurations. However, OpenSSL supports only the three standard ones.

Nevertheless, there are two cases in which OpenSSL does allow for configurability. In the first case, it allows for setting the "effective key bits" in RC2. As a result, the RC2 key is crippled so that it is only as strong as the effective size set. We feel that this functionality is completely useless.

In the second case, OpenSSL allows you to set the number of rounds used internally by the RC5 algorithm. By default, RC5 uses 12 rounds. Although the algorithm itself allows a variable number of rounds, OpenSSL allows you to set the number only to 8, 12, or 16.

The function EVP_CIPHER_CTX_ctrl( ) can be used to set or query either of these values, given a cipher of the appropriate type. This function has the following arguments:

ctx

Pointer to the cipher context to be modified.

type

Value indicating which operation to perform (more on this a little later).

arg

Numerical value to set, if appropriate (it is otherwise ignored).

ptr

Pointer to an integer for querying the numerical value of a property, if appropriate (the result is placed in the integer being pointed to).

The type argument can be one of the four macros defined in openssl/evp.h:

EVP_CTRL_GET_RC2_KEY_BITS

EVP_CTRL_SET_RC2_KEY_BITS

EVP_CTRL_GET_RC5_ROUNDS

EVP_CTRL_SET_RC5_ROUNDS

For example, to set an RC5 context to use 16 rounds:

EVP_CIPHER_CTX_ctrl(ctx, EVP_CTRL_SET_RC5_ROUNDS, 16, NULL);

To query the number of rounds, putting the result into an integer named r:

EVP_CIPHER_CTX_ctrl(ctx, EVP_CTRL_GET_RC5_ROUNDS, 0, &r);

5.21. Querying Cipher Configuration Properties in OpenSSL

Problem

You want to get information about a particular cipher context in OpenSSL.

Solution

For most properties, OpenSSL provides macros for accessing them. For other things, we can access the members of the cipher context structure directly.

To get the actual object representing the cipher:

EVP_CIPHER *EVP_CIPHER_CTX_cipher(EVP_CIPHER_CTX *ctx);

To get the block size of the cipher:

int EVP_CIPHER_CTX_block_size(EVP_CIPHER_CTX *ctx);

To get the key length of the cipher:

int EVP_CIPHER_CTX_key_length(EVP_CIPHER_CTX *ctx);

To get the length of the initialization vector:

int EVP_CIPHER_CTX_iv_length(EVP_CIPHER_CTX *ctx);

To get the cipher mode being used:

int EVP_CIPHER_CTX_mode(EVP_CIPHER_CTX *ctx);

To see if automatic padding is disabled:

int pad = (ctx->flags & EVP_CIPH_NO_PADDING);

To see if we are encrypting or decrypting:

int encr = (ctx->encrypt);

To retrieve the original initialization vector:

char *iv = (ctx->oiv);

Discussion

The EVP_CIPHER_CTX_cipher( ) function is actually implemented as a macro that returns an object of type EVP_CIPHER. The cipher itself can be queried, but interesting queries can also be made on the context object through appropriate macros.

All functions returning lengths return them in bytes.

The EVP_CIPHER_CTX_mode( ) function returns one of the following predefined values:

EVP_CIPH_ECB_MODE

EVP_CIPH_CBC_MODE

EVP_CIPH_CFB_MODE

EVP_CIPH_OFB_MODE

5.22. Performing Low-Level Encryption and Decryption with OpenSSL

Problem

You have set up your cipher and want to perform encryption and decryption.

Solution

Use the following suite of functions:

int EVP_EncryptUpdate(EVP_CIPHER_CTX *ctx, unsigned char *out, int *outl,

unsigned char *in, int inl);

int EVP_EncryptFinal_ex(EVP_CIPHER_CTX *ctx, unsigned char *out, int *outl);

int EVP_DecryptUpdate(EVP_CIPHER_CTX *ctx, unsigned char *out, int *outl,

unsigned char *in, int inl);

int EVP_DecryptFinal_ex(EVP_CIPHER_CTX *ctx, unsigned char *out, int *outl);

Discussion

WARNING

As a reminder, use a raw mode only if you really know what you're doing. For general-purpose use, we recommend a high-level abstraction, such as that discussed in Recipe 5.16. Additionally, be sure to include some sort of integrity validation whenever encrypting, as we discuss throughout Chapter 6.

The signatures for the encryption and decryption routines are identical, and the actual routines are completely symmetric. Therefore, we'll only discuss the behavior of the encryption functions, and you can infer the behavior of the decryption functions from that.

EVP_EncryptUpdate( ) has the following arguments:

ctx

Pointer to the cipher context previously initialized with EVP_EncryptInit_ex( ).

out

Buffer into which any output is placed.

outl

Pointer to an integer, into which the number of bytes written to the output buffer is placed.

in

Buffer containing the data to be encrypted.

inl

Number of bytes contained in the input buffer.

EVP_EncryptFinal_ex( ) takes the following arguments:

ctx

Pointer to the cipher context previously initialized with EVP_EncryptInit_ex( ).

out

Buffer into which any output is placed.

outl

Pointer to an integer, into which the number of bytes written to the output buffer is placed.

There are two phases to encryption in OpenSSL: update, and finalization. The basic idea behind update mode is that you're feeding in data to encrypt, and if there's incremental output, you get it. Calling the finalization routine lets OpenSSL know that all the data to be encrypted with this current context has already been given to the library. OpenSSL then does any cleanup work necessary, and it will sometimes produce additional output. After a cipher is finalized, you need to reinitialize it if you plan to reuse it, as described in Recipe 5.17.

In CBC and ECB modes, the cipher cannot always encrypt all the plaintext you give it as that plaintext arrives, because it requires block-aligned data to operate. In the finalization phase, those algorithms add padding if appropriate, then yield the remaining output. Note that, because of the internal buffering that can happen in these modes, the output to any single call of EVP_EncryptUpdate( ) or EVP_EncryptFinal_ex( ) can be about a full block larger or smaller than the actual input. If you're encrypting data into a single buffer, you can always avoid overflow if you make the output buffer an entire block bigger than the input buffer. Remember, however, that if padding is turned off (as described in Recipe 5.19), the library will be expecting block-aligned data, and the output will always be the same size as the input.
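The buffering described here can be modeled with a few lines of arithmetic. The following sketch is a simplified model of encryption in a block mode with padding enabled, not OpenSSL's actual code; the DEMO_BUF_MODEL type and function names are ours. It tracks only how many bytes each call would emit, which is enough to see why the number of bytes written can differ from the number of bytes passed in.

```c
#include <stddef.h>

typedef struct {
  size_t block_size;  /* cipher block size in bytes */
  size_t buffered;    /* input bytes held back; always less than block_size */
} DEMO_BUF_MODEL;

/* Models one update call: only whole blocks are emitted, and the remainder
 * is held back for the next call. Returns the number of output bytes.
 */
size_t demo_model_update(DEMO_BUF_MODEL *m, size_t inl) {
  size_t total = m->buffered + inl;
  size_t out = (total / m->block_size) * m->block_size;

  m->buffered = total - out;
  return out;
}

/* Models finalization with padding on: the held-back bytes are padded out to
 * exactly one block, which is emitted.
 */
size_t demo_model_final(DEMO_BUF_MODEL *m) {
  m->buffered = 0;
  return m->block_size;
}
```

Feeding this model 100 bytes at a time with a 16-byte block yields 96 bytes from the first call (4 held back) and 96 from the second (8 held back). The output offset therefore falls behind the input offset, so code that writes into a single buffer needs to track the two positions separately.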

In OFB and CFB modes, the call to EVP_EncryptUpdate( ) will always return the amount of data you passed in, and EVP_EncryptFinal_ex( ) will never return any data. This is because these modes are stream-based modes that don't require aligned data to operate. Therefore, it is sufficient to call only EVP_EncryptUpdate( ), skipping finalization entirely. Nonetheless, you should always call the finalization function so that the library has the chance to do any internal cleanup that may be necessary. For example, if you're using a cryptographic accelerator, the finalization call essentially gives the hardware license to free up resources for other operations.

These functions all return 1 on success, and 0 on failure. EVP_EncryptFinal_ex( ) will fail if padding is turned off and the data is not block-aligned. EVP_DecryptFinal_ex( ) will fail if the decrypted padding is not in the proper format. Additionally, any of these functions may fail if they are using hardware acceleration and the underlying hardware throws an error. Beyond those problems, they should not fail. Note again that when decrypting, this API has no way of determining whether the data decrypted properly. That is, the data may have been modified in transit; other means are necessary to ensure integrity (i.e., use a MAC, as we discuss throughout Chapter 6).

Here's an example function that, when given an already instantiated cipher context, encrypts an entire plaintext message 100 bytes at a time into a single heap-allocated buffer, which is returned at the end of the function. This example demonstrates how you can perform multiple encryption operations over time and keep encrypting into a single buffer. This code will work properly with any of the OpenSSL-supported cipher modes.

#include <stdlib.h>
#include <openssl/evp.h>

/* The integer pointed to by rb receives the number of bytes in the output.
 * Note that the malloced buffer can be realloced right before the return.
 */
unsigned char *encrypt_example(EVP_CIPHER_CTX *ctx, unsigned char *data, int inl,
                               int *rb) {
  int i, ol, tmp;
  unsigned char *ret;

  ol = 0;
  if (!(ret = (unsigned char *)malloc(inl + EVP_CIPHER_CTX_block_size(ctx)))) abort( );
  /* The input is indexed by bytes consumed (i * 100) and the output by bytes
   * produced (ol). The two differ in block modes because of internal buffering.
   */
  for (i = 0; i < inl / 100; i++) {
    if (!EVP_EncryptUpdate(ctx, &ret[ol], &tmp, &data[i * 100], 100)) abort( );
    ol += tmp;
  }
  if (inl % 100) {
    if (!EVP_EncryptUpdate(ctx, &ret[ol], &tmp, &data[i * 100], inl % 100)) abort( );
    ol += tmp;
  }
  if (!EVP_EncryptFinal_ex(ctx, &ret[ol], &tmp)) abort( );
  ol += tmp;
  if (rb) *rb = ol;
  return ret;
}

Here's a simple function for decryption that decrypts an entire message at once:

#include <stdlib.h>
#include <openssl/evp.h>

char *decrypt_example(EVP_CIPHER_CTX *ctx, unsigned char *ct, int inl) {
  /* We're going to null-terminate the plaintext under the assumption that it's
   * non-null terminated ASCII text. The null can otherwise be ignored if it
   * wasn't necessary, though the length of the result should be passed back in
   * such a case.
   */
  int ol, tmp;
  char *pt;

  if (!(pt = (char *)malloc(inl + EVP_CIPHER_CTX_block_size(ctx) + 1))) abort( );
  /* The finalization call releases the last block, which the update call holds
   * back when padding is enabled, and fails if the padding is malformed.
   */
  if (!EVP_DecryptUpdate(ctx, (unsigned char *)pt, &ol, ct, inl) ||
      !EVP_DecryptFinal_ex(ctx, (unsigned char *)&pt[ol], &tmp)) {
    free(pt);
    return 0;
  }
  ol += tmp;
  if (!ol) { /* There is no data to decrypt */
    free(pt);
    return 0;
  }
  pt[ol] = 0;
  return pt;
}

See Also

Recipe 5.16, Recipe 5.17

5.23. Setting Up and Using RC4

Problem

You want to use RC4 securely.

Solution

You can't be very confident about the security of RC4 for general-purpose use, owing to theoretical weaknesses. However, if you're willing to use only a very few RC4 outputs (a limit of about 100,000 bytes of output), you can take a risk, as long as you properly set it up.

Before using the standard initialization functions provided by your cryptographic library, take one of the following two steps:

§ Cryptographically hash the key material before using it.

§ Discard the first 256 bytes of the generated keystream.

After initialization, RC4 is used just as any block cipher in a streaming mode is used.

Most libraries implement RC4, but it is so simple that we provide an implementation in the following section.

Discussion

RC4 is a simple cipher that is really easy to use once you have it set up securely, which is actually difficult to do! Due to this key-setup problem, RC4's theoretical weaknesses, and the availability of faster solutions that look more secure, we recommend you just not use RC4. If you're looking for a very fast solution, we recommend SNOW 2.0.

In this recipe, we'll start off ignoring the RC4 key-setup problem. We'll show you how to use RC4 properly, giving a complete implementation. Then, after all that, we'll discuss how to set it up securely.

WARNING

As with any other symmetric encryption algorithm, it is particularly important to use a MAC along with RC4 to ensure data integrity. We discuss MACs extensively in Chapter 6.

RC4 requires a little bit of state, including a 256-byte buffer and two 8-bit counters. Here's a declaration for an RC4_CTX data type:

typedef struct {

unsigned char sbox[256];

unsigned char i, j;

} RC4_CTX;

In OpenSSL, the same sort of context is named RC4_KEY, which is a bit of a misnomer. Throughout this recipe, we will use RC4_CTX, but our implementation is otherwise compatible with OpenSSL's (our functions have the same names and parameters). You'll only need to include the correct header file, and alias RC4_CTX to RC4_KEY.

The "official" RC4 key setup function isn't generally secure without additional work, but we need to have it around anyway:

#include <stdlib.h>

void RC4_set_key(RC4_CTX *c, size_t keybytes, unsigned char *key) {

int i, j;

unsigned char keyarr[256], swap;

c->i = c->j = 0;

for (i = j = 0; i < 256; i++, j = (j + 1) % keybytes) {

c->sbox[i] = i;

keyarr[i] = key[j];

}

for (i = j = 0; i < 256; i++) {

j += c->sbox[i] + keyarr[i];

j %= 256;

swap = c->sbox[i];

c->sbox[i] = c->sbox[j];

c->sbox[j] = swap;

}

}

The RC4 function has the following arguments:

c

Pointer to an RC4_CTX object.

n

Number of bytes to encrypt.

in

Buffer to encrypt.

out

Output buffer.

void RC4(RC4_CTX *c, size_t n, unsigned char *in, unsigned char *out) {

unsigned char swap;

while (n--) {

c->j += c->sbox[++c->i];

swap = c->sbox[c->i];

c->sbox[c->i] = c->sbox[c->j];

c->sbox[c->j] = swap;

swap = c->sbox[c->i] + c->sbox[c->j];

*out++ = *in++ ^ c->sbox[swap];

}

}

That's it for an RC4 implementation. This function can be used incrementally or as an "all-in-one" solution.
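To illustrate both styles of use, here is a self-contained sketch that restates the key setup and keystream loop compactly, then checks that encrypting a message in two calls and decrypting it in one recovers the plaintext. The demo_ names are ours, chosen so that the sketch compiles on its own without clashing with OpenSSL's symbols. It uses the plain key setup, so the cautions later in this recipe still apply.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
  unsigned char sbox[256];
  unsigned char i, j;
} DEMO_RC4_CTX;

/* Standard RC4 key schedule; keybytes must be between 1 and 256. */
void demo_rc4_set_key(DEMO_RC4_CTX *c, size_t keybytes, const unsigned char *key) {
  unsigned int i, j = 0;
  unsigned char swap;

  c->i = c->j = 0;
  for (i = 0; i < 256; i++) c->sbox[i] = (unsigned char)i;
  for (i = 0; i < 256; i++) {
    j = (j + c->sbox[i] + key[i % keybytes]) & 0xff;
    swap = c->sbox[i]; c->sbox[i] = c->sbox[j]; c->sbox[j] = swap;
  }
}

/* XORs n bytes of keystream into the data; in and out may be the same buffer. */
void demo_rc4(DEMO_RC4_CTX *c, size_t n, const unsigned char *in, unsigned char *out) {
  unsigned char swap;

  while (n--) {
    c->i++;
    c->j = (unsigned char)(c->j + c->sbox[c->i]);
    swap = c->sbox[c->i]; c->sbox[c->i] = c->sbox[c->j]; c->sbox[c->j] = swap;
    *out++ = *in++ ^ c->sbox[(unsigned char)(c->sbox[c->i] + c->sbox[c->j])];
  }
}

/* Encrypts msg incrementally (two calls), then decrypts all at once with a
 * freshly keyed context. ct and pt must each hold n bytes.
 * Returns 1 if the plaintext is recovered, 0 otherwise.
 */
int demo_rc4_roundtrip(const unsigned char *key, size_t keybytes,
                       const unsigned char *msg, size_t n,
                       unsigned char *ct, unsigned char *pt) {
  DEMO_RC4_CTX enc, dec;
  size_t half = n / 2;

  demo_rc4_set_key(&enc, keybytes, key);
  demo_rc4(&enc, half, msg, ct);                    /* first chunk */
  demo_rc4(&enc, n - half, msg + half, ct + half);  /* rest of the message */

  demo_rc4_set_key(&dec, keybytes, key);
  demo_rc4(&dec, n, ct, pt);                        /* all-in-one */
  return memcmp(pt, msg, n) == 0;
}
```

Because the context carries the keystream position between calls, splitting the input at any point produces the same ciphertext as a single call, and the receiver only has to consume bytes in the same order.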

Now let's look at how to key RC4 properly.

Without going into the technical details of the problems with RC4 key setup, it's sufficient to say that the real problem occurs when you key multiple RC4 instances with related keys. For example, in some circles it is common to use a truncated base key, then concatenate a counter for each message (which is not a good idea in and of itself because it reduces the effective key strength).

The first way to solve this problem is to use a cryptographic hash function to randomize the key. If your key is 128 bits, you can use MD5 and take the entire digest value, or you can use a hash function with a larger digest, such as SHA1 or SHA-256, truncating the result to the appropriate size.

Here's some code for setting up an RC4 context by hashing key material using MD5 (include openssl/md5.h to have this work directly with OpenSSL's implementation). MD5 is fine for this purpose; you can also use SHA1 and truncate to 16 bytes.

/* Assumes you have not yet initialized the context, but have allocated it.
 * key must point to 16 bytes of key material.
 */
void secure_rc4_setup1(RC4_CTX *ctx, unsigned char *key) {
  unsigned char res[16]; /* 16 is the size in bytes of the resulting MD5 digest. */

  MD5(key, 16, res);
  RC4_set_key(ctx, 16, res);
}

Note that RC4 does not use an initialization vector.

Another option is to start using RC4, but throw away the first 256 bytes' worth of keystream. One easy way to do that is to encrypt 256 bytes of garbage and ignore the results:

/* Assumes an already instantiated RC4 context. */

void secure_rc4_setup2(RC4_CTX *ctx) {

unsigned char buf[256] = {0}; /* unsigned to match the RC4( ) prototype */

RC4(ctx, sizeof(buf), buf, buf);

spc_memset(buf, 0, sizeof(buf));

}

5.24. Using One-Time Pads

Problem

You want to use an encryption algorithm that has provable secrecy properties, and deploy it in a fashion that does not destroy the security properties of the algorithm.

Solution

Settle for more realistic security goals. Do not use a one-time pad.

Discussion

One-time pads are provably secure if implemented properly. Unfortunately, they are rarely used properly. A one-time pad is very much like a stream cipher. Encryption is simply XOR'ing the message with the keystream. The security comes from having every single bit of the keystream be truly random instead of merely cryptographically random. If portions of the keystream are reused, the security of data encrypted with those portions is incredibly weak.
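The mechanics are simple enough to show in a few lines, which also makes the danger of reuse concrete. In this sketch (the function name is ours), the same routine encrypts and decrypts, and the pad is assumed to be at least as long as the message.

```c
#include <stddef.h>

/* XORs n bytes of in with the pad. The same call encrypts and decrypts.
 * The pad must supply at least n bytes that have never been used before.
 */
void demo_otp_xor(const unsigned char *pad, const unsigned char *in,
                  unsigned char *out, size_t n) {
  size_t i;

  for (i = 0; i < n; i++) out[i] = in[i] ^ pad[i];
}
```

If two messages p1 and p2 are encrypted with the same pad bytes, XORing the two ciphertexts cancels the pad and leaves p1 XOR p2. An eavesdropper learns that value without knowing anything about the pad, which is why reused portions of the keystream offer so little protection.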

There are a number of big hurdles when using one-time pads:

§ It is very close to impossible to generate a truly random keystream in software. (See Chapter 11 for more information.)

§ The keystream must somehow be shared between client and server. Because there can be no algorithm to produce the keystream, some entity will need to produce the keystream and transmit it securely to both parties.

§ The keystream must be as long as the message. If you have a message that's bigger than the keystream you have remaining, you can't send the entire message.

§ Integrity checking is just as important with one-time pads as with any other encryption technique. As with the output of any stream cipher, if you modify a bit in the ciphertext generated by a one-time pad, the corresponding bit of the plaintext will flip. In addition, one-time pads have no built-in mechanism for detecting truncation or additive attacks. Message authentication in a provably secure manner essentially requires a keystream twice the data length.

Basically, the secure deployment of one-time pads is almost always highly impractical. You are generally far better off using a good high-level interface to encryption and decryption, such as the one provided in Recipe 5.16.

See Also

Recipe 5.16

5.25. Using Symmetric Encryption with Microsoft's CryptoAPI

Problem

You are developing an application that will run on Windows and make use of symmetric encryption. You want to use Microsoft's CryptoAPI.

Solution

Microsoft's CryptoAPI is available on most widely deployed versions of Windows, so it is a reasonable solution for many uses of symmetric encryption. CryptoAPI contains a small, yet nearly complete, set of functions for creating and manipulating symmetric encryption keys (which the Microsoft documentation usually refers to as session keys), exchanging keys, and encrypting and decrypting data. While the following discussion will not provide you with all the finer details of using CryptoAPI, it will give you enough background to get started using the API successfully.

Discussion

CryptoAPI is designed as a high-level interface to various cryptographic constructs, including hashes, MACs, public key encryption, and symmetric encryption. Its support for public key cryptography makes up the majority of the API, but there is also a small subset of functions for symmetric encryption.

Before you can do anything with CryptoAPI, you first need to acquire a provider context. CryptoAPI provides a generic API that wraps around Cryptographic Service Providers (CSPs), which are responsible for doing all the real work. Microsoft provides several different CSPs that provide implementations of various algorithms. For symmetric cryptography, two CSPs are widely available and of interest: Microsoft Base Cryptographic Service Provider and Microsoft Enhanced Cryptographic Service Provider. A third, Microsoft AES Cryptographic Service Provider, ships only with newer versions of Windows (Windows XP and later). The Base CSP provides RC2, RC4, and DES implementations. The Enhanced CSP adds two-key Triple-DES and three-key Triple-DES. The AES CSP adds implementations for AES with 128-bit, 192-bit, and 256-bit key lengths.

For our purposes, we'll concentrate only on the Enhanced CSP. Acquiring a provider context is done with the following code. We use the CRYPT_VERIFYCONTEXT flag here because we will not be using private keys with the context. It doesn't necessarily hurt to omit the flag (which we will do in Recipe 5.26 and Recipe 5.27, for example), but if you don't need public key access with the context, you should use the flag. Some CSPs may require user input when CryptAcquireContext( ) is called without CRYPT_VERIFYCONTEXT.

#include <windows.h>
#include <wincrypt.h>

HCRYPTPROV SpcGetCryptContext(void) {
  HCRYPTPROV hProvider;

  if (!CryptAcquireContext(&hProvider, 0, MS_ENHANCED_PROV, PROV_RSA_FULL,
                           CRYPT_VERIFYCONTEXT)) return 0;
  return hProvider;
}

Once a provider context has been successfully acquired, you need a key. The API provides three ways to obtain a key object, which is stored by CryptoAPI as an opaque object to which you'll have only a handle:

CryptGenKey( )

Generates a random key.

CryptDeriveKey( )

Derives a key from a password or passphrase.

CryptImportKey( )

Creates a key object from key data in a buffer.

All three functions return a new key object that keeps the key data hidden and has associated with it a symmetric encryption algorithm and a set of flags that control the behavior of the key. The key data can be obtained from the key object using CryptExportKey( ) if the key object allows it. The CryptExportKey( ) and CryptImportKey( ) functions provide the means for exchanging keys.

NOTE

The CryptExportKey( ) function will only allow you to export a symmetric encryption key encrypted with another key. For maximum portability across all versions of Windows, a public key should be used. However, Windows 2000 introduced the ability to encrypt the symmetric encryption key with another symmetric encryption key. Similarly, CryptImportKey( ) can only import symmetric encryption keys that are encrypted.

If you need the raw key data, you must first export the key in encrypted form, then decrypt it (see Recipe 5.27). While this may seem like a lot of extra work, the reason is that CryptoAPI was designed with the goal of making it very difficult (if not impossible) to unintentionally disclose sensitive information.

Generating a new key with CryptGenKey( ) that can be exported is very simple, as illustrated in the following code. If you don't want the new key to be exportable, simply remove the CRYPT_EXPORTABLE flag.

HCRYPTKEY SpcGetRandomKey(HCRYPTPROV hProvider, ALG_ID Algid, DWORD dwSize) {
  DWORD     dwFlags;
  HCRYPTKEY hKey;

  /* The key size in bits goes in the upper 16 bits of the flags. */
  dwFlags = ((dwSize << 16) & 0xFFFF0000) | CRYPT_EXPORTABLE;
  if (!CryptGenKey(hProvider, Algid, dwFlags, &hKey)) return 0;
  return hKey;
}
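The flags argument built in SpcGetRandomKey( ) packs two things into one DWORD: the requested key size in the upper 16 bits and the behavior flags in the lower 16 bits. The following portable sketch shows only that arithmetic. The names beginning with sketch_ or SKETCH_ are hypothetical stand-ins so that the example compiles without the Windows headers; SKETCH_CRYPT_EXPORTABLE is assumed to match the value of CRYPT_EXPORTABLE in wincrypt.h.

```c
#include <stdint.h>

/* Local stand-in for the CRYPT_EXPORTABLE constant in <wincrypt.h>. */
#define SKETCH_CRYPT_EXPORTABLE 0x00000001u

/* Place the key size (in bits) in the upper 16 bits of the flags word and
 * keep the behavior flags in the lower 16 bits. */
uint32_t sketch_keygen_flags(uint32_t key_bits, uint32_t flags) {
  return ((key_bits << 16) & 0xFFFF0000u) | (flags & 0x0000FFFFu);
}

/* Recover the requested key size in bits from a packed flags word. */
uint32_t sketch_key_bits(uint32_t packed) {
  return packed >> 16;
}
```

For example, requesting a 128-bit exportable key yields the flags word 0x00800001. Leaving the upper 16 bits zero asks the CSP for its default key size for the algorithm.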

Deriving a key with CryptDeriveKey( ) is a little more complex. It requires a hash object to be created and passed into it in addition to the same arguments required by CryptGenKey( ). Note that once the hash object has been used to derive a key, additional data cannot be added to it, and it should be immediately destroyed.

HCRYPTKEY SpcGetDerivedKey(HCRYPTPROV hProvider, ALG_ID Algid, LPTSTR password) {
  BOOL       bResult;
  DWORD      cbData;
  HCRYPTKEY  hKey;
  HCRYPTHASH hHash;

  if (!CryptCreateHash(hProvider, CALG_SHA1, 0, 0, &hHash)) return 0;

  /* Hash the password bytes (not characters, in case TCHAR is wide). */
  cbData = lstrlen(password) * sizeof(TCHAR);
  if (!CryptHashData(hHash, (BYTE *)password, cbData, 0)) {
    CryptDestroyHash(hHash);
    return 0;
  }

  bResult = CryptDeriveKey(hProvider, Algid, hHash, CRYPT_EXPORTABLE, &hKey);
  CryptDestroyHash(hHash);
  return (bResult ? hKey : 0);
}

Importing a key with CryptImportKey( ) is, in most cases, just as easy as generating a new random key. Most often, you'll be importing data obtained directly from CryptExportKey( ), so you'll already have an encrypted key in the form of a SIMPLEBLOB, as required by CryptImportKey( ). If you need to import raw key data, things get a whole lot trickier—see Recipe 5.26 for details.

HCRYPTKEY SpcImportKey(HCRYPTPROV hProvider, BYTE *pbData, DWORD dwDataLen,
                       HCRYPTKEY hPublicKey) {
  HCRYPTKEY hKey;

  if (!CryptImportKey(hProvider, pbData, dwDataLen, hPublicKey, CRYPT_EXPORTABLE,
                      &hKey)) return 0;
  return hKey;
}

When a key object is created, the cipher to use is tied to that key, and it must be specified as an argument to either CryptGenKey( ) or CryptDeriveKey( ). It is not required as an argument by CryptImportKey( ) because the cipher information is stored as part of the SIMPLEBLOB structure that is required. Table 5-8 lists the symmetric ciphers that are available using one of the three Microsoft CSPs.

Table 5-8. Symmetric ciphers supported by Microsoft Cryptographic Service Providers

Cipher           | Cryptographic Service Provider | ALG_ID constant | Key length           | Block size
RC2              | Base, Enhanced, AES            | CALG_RC2        | 40 bits              | 64 bits
RC4              | Base                           | CALG_RC4        | 40 bits              | n/a
RC4              | Enhanced, AES                  | CALG_RC4        | 128 bits             | n/a
DES              | Base, Enhanced, AES            | CALG_DES        | 56 bits              | 64 bits
2-key Triple-DES | Enhanced, AES                  | CALG_3DES_112   | 112 bits (effective) | 64 bits
3-key Triple-DES | Enhanced, AES                  | CALG_3DES       | 168 bits (effective) | 64 bits
AES              | AES                            | CALG_AES_128    | 128 bits             | 128 bits
AES              | AES                            | CALG_AES_192    | 192 bits             | 128 bits
AES              | AES                            | CALG_AES_256    | 256 bits             | 128 bits
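The ALG_ID constants in Table 5-8 are not arbitrary numbers: each one packs an algorithm class, an algorithm type (block or stream), and an algorithm-specific ID into a single value, which is what the GET_ALG_TYPE( ) macro relies on. The sketch below reproduces that decomposition in portable C. Every SK_ name is a local stand-in, assumed to mirror the corresponding macro or constant in wincrypt.h, and sk_is_stream_cipher( ) is a hypothetical helper.

```c
#include <stdint.h>

/* Local stand-ins mirroring the ALG_ID layout macros in <wincrypt.h>. */
#define SK_GET_ALG_CLASS(x) ((x) & (7u << 13))
#define SK_GET_ALG_TYPE(x)  ((x) & (15u << 9))
#define SK_GET_ALG_SID(x)   ((x) & 511u)

#define SK_ALG_CLASS_DATA_ENCRYPT (3u << 13)
#define SK_ALG_TYPE_BLOCK         (3u << 9)
#define SK_ALG_TYPE_STREAM        (4u << 9)

/* Two of the cipher identifiers from Table 5-8. */
#define SK_CALG_RC4  0x00006801u
#define SK_CALG_3DES 0x00006603u

/* Returns nonzero if algid names a data-encryption stream cipher. */
int sk_is_stream_cipher(uint32_t algid) {
  return SK_GET_ALG_CLASS(algid) == SK_ALG_CLASS_DATA_ENCRYPT &&
         SK_GET_ALG_TYPE(algid) == SK_ALG_TYPE_STREAM;
}
```

A test of this kind lets code decide whether it must leave room for padding (block ciphers) or can assume that the ciphertext is the same length as the plaintext (stream ciphers).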

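The default cipher mode to be used depends on the underlying CSP and the algorithm that's being used, but it's generally CBC mode. The Microsoft Base and Enhanced CSPs provide support for CBC, CFB, and ECB modes (see Recipe 5.4 for a discussion of cipher modes). CryptoAPI also defines a constant for OFB mode, but Microsoft's documentation states that its own CSPs do not implement it, so expect a request for OFB to fail unless you are using a third-party CSP that supports it. The mode can be set using the CryptSetKeyParam( ) function: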

BOOL SpcSetKeyMode(HCRYPTKEY hKey, DWORD dwMode) {
  return CryptSetKeyParam(hKey, KP_MODE, (BYTE *)&dwMode, 0);
}

#define SpcSetMode_CBC(hKey) SpcSetKeyMode((hKey), CRYPT_MODE_CBC)
#define SpcSetMode_CFB(hKey) SpcSetKeyMode((hKey), CRYPT_MODE_CFB)
#define SpcSetMode_ECB(hKey) SpcSetKeyMode((hKey), CRYPT_MODE_ECB)
#define SpcSetMode_OFB(hKey) SpcSetKeyMode((hKey), CRYPT_MODE_OFB)

In addition, the initialization vector for block ciphers will be set to zero, which is almost certainly not what you want. The function presented below, SpcSetIV( ), will allow you to set the IV for a key explicitly or will generate a random one for you. The IV should always be the same size as the block size for the cipher in use.

BOOL SpcSetIV(HCRYPTPROV hProvider, HCRYPTKEY hKey, BYTE *pbIV) {
  BOOL  bResult;
  BYTE  *pbTemp;
  DWORD dwBlockLen, dwDataLen;

  if (!pbIV) {
    /* No IV supplied: generate a random one that is one block long. */
    dwDataLen = sizeof(dwBlockLen);
    if (!CryptGetKeyParam(hKey, KP_BLOCKLEN, (BYTE *)&dwBlockLen, &dwDataLen, 0))
      return FALSE;
    dwBlockLen /= 8;  /* KP_BLOCKLEN is reported in bits */
    if (!(pbTemp = (BYTE *)LocalAlloc(LMEM_FIXED, dwBlockLen))) return FALSE;
    bResult = CryptGenRandom(hProvider, dwBlockLen, pbTemp);
    if (bResult)
      bResult = CryptSetKeyParam(hKey, KP_IV, pbTemp, 0);
    LocalFree(pbTemp);
    return bResult;
  }
  return CryptSetKeyParam(hKey, KP_IV, pbIV, 0);
}

Once you have a key object, it can be used for encrypting and decrypting data. Access to the low-level algorithm implementation is not permitted through CryptoAPI. Instead, a high-level OpenSSL EVP-like interface is provided (see Recipe 5.17 and Recipe 5.22 for details on OpenSSL's EVP API), though it's somewhat simpler. Both encryption and decryption can be done incrementally, but there is only a single function for each.

The CryptEncrypt( ) function is used to encrypt data all at once or incrementally. As a convenience, the function can also pass the plaintext to be encrypted to a hash object to compute the hash as data is passed through for encryption. CryptEncrypt( ) can be somewhat tricky to use because it places the resulting ciphertext into the same buffer as the plaintext. If you're using a stream cipher, this is no problem because the ciphertext is usually the same size as the plaintext, but if you're using a block cipher, the ciphertext can be up to a whole block longer than the plaintext. The following convenience function handles the buffering issues transparently for you. It requires the spc_memcpy( ) function from Recipe 13.2.

BYTE *SpcEncrypt(HCRYPTKEY hKey, BOOL bFinal, BYTE *pbData, DWORD *cbData) {
  BYTE   *pbResult;
  DWORD  dwBlockLen, dwBufLen, dwDataLen;
  ALG_ID Algid;

  dwDataLen = sizeof(ALG_ID);
  if (!CryptGetKeyParam(hKey, KP_ALGID, (BYTE *)&Algid, &dwDataLen, 0)) return 0;

  if (GET_ALG_TYPE(Algid) != ALG_TYPE_STREAM) {
    /* Block cipher: leave room for up to one extra block of padding. */
    dwDataLen = sizeof(DWORD);
    if (!CryptGetKeyParam(hKey, KP_BLOCKLEN, (BYTE *)&dwBlockLen, &dwDataLen, 0))
      return 0;
    dwBlockLen /= 8;  /* KP_BLOCKLEN is reported in bits */
    dwBufLen = ((*cbData + (dwBlockLen * 2) - 1) / dwBlockLen) * dwBlockLen;
    if (!(pbResult = (BYTE *)LocalAlloc(LMEM_FIXED, dwBufLen))) return 0;
    CopyMemory(pbResult, pbData, *cbData);
    /* In: number of plaintext bytes. Out: number of ciphertext bytes.
     * The final argument is the size of the buffer. */
    dwDataLen = *cbData;
    if (!CryptEncrypt(hKey, 0, bFinal, 0, pbResult, &dwDataLen, dwBufLen)) {
      LocalFree(pbResult);
      return 0;
    }
    *cbData = dwDataLen;
    return pbResult;
  }

  /* Stream cipher: the ciphertext is the same length as the plaintext. */
  if (!(pbResult = (BYTE *)LocalAlloc(LMEM_FIXED, *cbData))) return 0;
  CopyMemory(pbResult, pbData, *cbData);
  if (!CryptEncrypt(hKey, 0, bFinal, 0, pbResult, cbData, *cbData)) {
    LocalFree(pbResult);
    return 0;
  }
  return pbResult;
}

The return from SpcEncrypt( ) will be a buffer allocated with LocalAlloc( ) that contains the ciphertext version of the plaintext that's passed as an argument into the function as pbData. If the function fails for some reason, the return from the function will be NULL, and a call to GetLastError( ) will return the error code. This function has the following arguments:

hKey

Key to use for performing the encryption.

bFinal

Boolean value that should be passed as FALSE for incremental encryption except for the last piece of plaintext to be encrypted. To encrypt all at once, pass TRUE for bFinal in the single call to SpcEncrypt( ). When CryptEncrypt( ) gets the final plaintext to encrypt, it performs any cleanup that is needed to reset the key object back to a state where a new encryption or decryption operation can be performed with it.

pbData

Plaintext.

cbData

Pointer to a DWORD type that should hold the length of the plaintext pbData buffer. If the function returns successfully, it will be modified to hold the number of bytes returned in the ciphertext buffer.
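To make the buffer-size issue concrete, the following portable sketch computes how many ciphertext bytes to expect when a block cipher encrypts a complete message in a single final call. It assumes the CSP's default PKCS #5 padding, which always appends between one byte and one full block. The function names are hypothetical and are not part of CryptoAPI.

```c
#include <stddef.h>

/* Convert a block length reported in bits (as KP_BLOCKLEN is) to bytes. */
size_t sketch_block_bytes(size_t block_bits) {
  return block_bits / 8;
}

/* Bytes of ciphertext produced when plain_len bytes are encrypted in one
 * final call with PKCS #5 padding: the plaintext is extended to the next
 * multiple of the block size, and a whole extra block is added when it is
 * already an exact multiple. Returns 0 if block_bytes is 0 or the result
 * would overflow size_t. */
size_t sketch_padded_len(size_t plain_len, size_t block_bytes) {
  size_t blocks;

  if (block_bytes == 0) return 0;
  blocks = plain_len / block_bytes + 1;
  if (blocks > (size_t)-1 / block_bytes) return 0;
  return blocks * block_bytes;
}
```

With a 64-bit block cipher such as Triple-DES, 5 bytes of plaintext become 8 bytes of ciphertext, while 8 bytes of plaintext become 16. A buffer sized for the plaintext alone is therefore never large enough for the final block.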

Decryption works similarly to encryption. The function CryptDecrypt( ) performs decryption either all at once or incrementally, and it also supports the convenience function of passing plaintext data to a hash object to compute the hash of the plaintext as it is decrypted. The primary difference between encryption and decryption is that when decrypting, the plaintext will never be any longer than the ciphertext, so the handling of data buffers is less complicated. The following function, SpcDecrypt( ), mirrors the SpcEncrypt( ) function presented previously.

BYTE *SpcDecrypt(HCRYPTKEY hKey, BOOL bFinal, BYTE *pbData, DWORD *cbData) {
  BYTE   *pbResult;
  DWORD  dwBlockLen, dwDataLen;
  ALG_ID Algid;

  dwDataLen = sizeof(ALG_ID);
  if (!CryptGetKeyParam(hKey, KP_ALGID, (BYTE *)&Algid, &dwDataLen, 0)) return 0;

  if (GET_ALG_TYPE(Algid) != ALG_TYPE_STREAM) {
    /* Block cipher: round the buffer up to a whole number of blocks. */
    dwDataLen = sizeof(DWORD);
    if (!CryptGetKeyParam(hKey, KP_BLOCKLEN, (BYTE *)&dwBlockLen, &dwDataLen, 0))
      return 0;
    dwBlockLen /= 8;  /* KP_BLOCKLEN is reported in bits */
    dwDataLen = ((*cbData + dwBlockLen - 1) / dwBlockLen) * dwBlockLen;
    if (!(pbResult = (BYTE *)LocalAlloc(LMEM_FIXED, dwDataLen))) return 0;
  } else {
    if (!(pbResult = (BYTE *)LocalAlloc(LMEM_FIXED, *cbData))) return 0;
  }

  /* CryptDecrypt( ) works in place and updates *cbData to the plaintext length. */
  CopyMemory(pbResult, pbData, *cbData);
  if (!CryptDecrypt(hKey, 0, bFinal, 0, pbResult, cbData)) {
    LocalFree(pbResult);
    return 0;
  }
  return pbResult;
}

Finally, when you're finished using a key object, be sure to destroy the object by calling CryptDestroyKey( ) and passing the handle to the object to be destroyed. Likewise, when you're done with a provider context, you must release it by calling CryptReleaseContext( ).

See Also

Recipe 5.4, Recipe 5.17, Recipe 5.22, Recipe 5.26, Recipe 5.27, Recipe 13.2

5.26. Creating a CryptoAPI Key Object from Raw Key Data

Problem

You have a symmetric key from another API, such as OpenSSL, that you would like to use with CryptoAPI. Therefore, you must create a CryptoAPI key object with the key data.

Solution

The Microsoft CryptoAPI is designed to prevent unintentional disclosure of sensitive key information. To do this, key information is stored in opaque data objects by the Cryptographic Service Provider (CSP) used to create the key object. Key data is exportable from key objects, but the data must be encrypted with another key to prevent accidental disclosure of the raw key data.

Discussion

In Recipe 5.25, we created a convenience function, SpcGetCryptContext( ), for obtaining a handle to a CSP context object. This function uses the CRYPT_VERIFYCONTEXT flag with the underlying CryptAcquireContext( ) function, which serves to prevent the use of private keys with the obtained context object. To be able to import and export symmetric encryption keys, you need to obtain a handle to a CSP context object without that flag, and use that CSP context object for creating the keys you wish to use. We'll create a new function called SpcGetExportableContext( ) that will return a CSP context object suitable for creating, importing, and exporting symmetric encryption keys.

#include <windows.h>
#include <wincrypt.h>

HCRYPTPROV SpcGetExportableContext(void) {
  HCRYPTPROV hProvider;

  if (!CryptAcquireContext(&hProvider, 0, MS_ENHANCED_PROV, PROV_RSA_FULL, 0)) {
    /* The default key container does not exist yet, so create it. */
    if (GetLastError( ) != NTE_BAD_KEYSET) return 0;
    if (!CryptAcquireContext(&hProvider, 0, MS_ENHANCED_PROV, PROV_RSA_FULL,
                             CRYPT_NEWKEYSET)) return 0;
  }
  return hProvider;
}

SpcGetExportableContext( ) will obtain a handle to the Microsoft Enhanced Cryptographic Service Provider that allows for the use of private keys. Public key pairs are stored in containers by the underlying CSP. This function will use the default container, creating it if it doesn't already exist.

Every public key container can have a special public key pair known as an exchange key, which is the key that we'll use to encrypt the exported key data. The function CryptGetUserKey( ) is used to obtain the exchange key. If it doesn't exist, SpcImportKeyData( ), listed later in this section, will create a 1,024-bit exchange key, which will be stored as the exchange key in the public key container so future attempts to get the key will succeed. The special algorithm identifier AT_KEYEXCHANGE is used to reference the exchange key.

Symmetric keys are always imported via CryptImportKey( ) in "simple blob" format, which is identified by the SIMPLEBLOB constant. A simple blob is composed of a BLOBHEADER structure, followed by an ALG_ID for the algorithm used to encrypt the key data. The encrypted key data follows the BLOBHEADER and ALG_ID header information. To import the raw key data into a CryptoAPI key, a simple blob structure must be constructed and passed to CryptImportKey( ).
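The layout of the simple blob header can be sketched in portable C as follows. The SkBlobHeader and SkAlgId types and the SK_ constants are local stand-ins, assumed to match the BLOBHEADER and ALG_ID definitions in wincrypt.h, and sk_write_simpleblob_header( ) is a hypothetical helper rather than a CryptoAPI function.

```c
#include <stdint.h>
#include <string.h>

typedef uint32_t SkAlgId;   /* stand-in for ALG_ID */

typedef struct {            /* stand-in for BLOBHEADER */
  uint8_t  bType;
  uint8_t  bVersion;
  uint16_t reserved;
  SkAlgId  aiKeyAlg;        /* algorithm of the key carried in the blob */
} SkBlobHeader;

#define SK_SIMPLEBLOB       0x1
#define SK_CUR_BLOB_VERSION 2
#define SK_HEADER_LEN       (sizeof(SkBlobHeader) + sizeof(SkAlgId))

/* Write the BLOBHEADER and the ALG_ID of the wrapping (exchange) key to the
 * start of buf. Returns the offset at which the encrypted key data belongs,
 * or 0 if buf is too small to hold the header. */
size_t sk_write_simpleblob_header(unsigned char *buf, size_t buf_len,
                                  SkAlgId key_alg, SkAlgId wrap_alg) {
  SkBlobHeader hdr;

  if (buf_len < SK_HEADER_LEN) return 0;
  hdr.bType    = SK_SIMPLEBLOB;
  hdr.bVersion = SK_CUR_BLOB_VERSION;
  hdr.reserved = 0;
  hdr.aiKeyAlg = key_alg;
  memcpy(buf, &hdr, sizeof(hdr));
  memcpy(buf + sizeof(hdr), &wrap_alg, sizeof(wrap_alg));
  return SK_HEADER_LEN;
}
```

On typical platforms the header occupies 12 bytes: 8 for the BLOBHEADER and 4 for the ALG_ID. Everything after that offset is the key data encrypted with the exchange key.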

Finally, the raw key data must be encrypted using CryptEncrypt( ) and the exchange key. (The CryptEncrypt( ) function is described in more detail in Recipe 5.25.) The return from SpcImportKeyData( ) will be a handle to a CryptoAPI key object if the operation was performed successfully; otherwise, it will be 0. The CryptoAPI makes a copy of the key data internally in the key object it creates, so the key data passed into the function may be safely freed. The spc_memset( ) function from Recipe 13.2 is used here to destroy the unencrypted key data before returning.

HCRYPTKEY SpcImportKeyData(HCRYPTPROV hProvider, ALG_ID Algid, BYTE *pbKeyData,
                           DWORD cbKeyData) {
  BOOL       bResult = FALSE;
  BYTE       *pbData = 0;
  DWORD      cbData, cbHeaderLen, cbKeyLen, dwDataLen;
  ALG_ID     *pAlgid;
  HCRYPTKEY  hImpKey = 0, hKey = 0;
  BLOBHEADER *pBlob;

  /* Get the exchange key, creating it if the container has none. */
  if (!CryptGetUserKey(hProvider, AT_KEYEXCHANGE, &hImpKey)) {
    if (GetLastError( ) != NTE_NO_KEY) goto done;
    if (!CryptGenKey(hProvider, AT_KEYEXCHANGE, (1024 << 16), &hImpKey))
      goto done;
  }

  /* Ask how large the encrypted key data will be, then allocate the blob. */
  cbData = cbKeyData;
  cbHeaderLen = sizeof(BLOBHEADER) + sizeof(ALG_ID);
  if (!CryptEncrypt(hImpKey, 0, TRUE, 0, 0, &cbData, cbData)) goto done;
  if (!(pbData = (BYTE *)LocalAlloc(LMEM_FIXED, cbData + cbHeaderLen)))
    goto done;

  /* Encrypt the raw key data in place, just past the header. */
  CopyMemory(pbData + cbHeaderLen, pbKeyData, cbKeyData);
  cbKeyLen = cbKeyData;
  if (!CryptEncrypt(hImpKey, 0, TRUE, 0, pbData + cbHeaderLen, &cbKeyLen, cbData))
    goto done;

  /* Fill in the simple blob header. */
  pBlob = (BLOBHEADER *)pbData;
  pAlgid = (ALG_ID *)(pbData + sizeof(BLOBHEADER));
  pBlob->bType = SIMPLEBLOB;
  pBlob->bVersion = 2;
  pBlob->reserved = 0;
  pBlob->aiKeyAlg = Algid;
  dwDataLen = sizeof(ALG_ID);
  if (!CryptGetKeyParam(hImpKey, KP_ALGID, (BYTE *)pAlgid, &dwDataLen, 0))
    goto done;

  bResult = CryptImportKey(hProvider, pbData, cbData + cbHeaderLen, hImpKey, 0,
                           &hKey);
  if (bResult) spc_memset(pbKeyData, 0, cbKeyData);

done:
  if (pbData) LocalFree(pbData);
  if (hImpKey) CryptDestroyKey(hImpKey);
  return (bResult ? hKey : 0);
}

See Also

Recipe 5.25, Recipe 13.2

5.27. Extracting Raw Key Data from a CryptoAPI Key Object

Problem

You have a symmetric key stored in a CryptoAPI key object that you want to use with another API, such as OpenSSL.

Solution

The Microsoft CryptoAPI is designed to prevent unintentional disclosure of sensitive key information. To do this, key information is stored in opaque data objects by the Cryptographic Service Provider (CSP) used to create the key object. Key data is exportable from key objects, but the data must be encrypted with another key to prevent accidental disclosure of the raw key data.

To extract the raw key data from a CryptoAPI key, you must first export the key using the CryptoAPI function CryptExportKey( ). The key data obtained from this function will be encrypted with another key, which you can then use to decrypt the encrypted key data to obtain the raw key data that another API, such as OpenSSL, can use.

Discussion

To export a key using the CryptExportKey( ) function, you must provide the function with another key that will be used to encrypt the key data that's to be exported. Recipe 5.26 includes a function, SpcGetExportableContext( ), that obtains a handle to a CSP context object suitable for exporting keys created with it. The CSP context object uses a "container" to store public key pairs. Every public key container can have a special public key pair known as an exchange key, which is the key that we'll use to decrypt the exported key data.

The function CryptGetUserKey( ) is used to obtain the exchange key. If it doesn't exist, SpcExportKeyData( ), listed later in this section, will create a 1,024-bit exchange key, which will be stored as the exchange key in the public key container so future attempts to get the key will succeed. The special algorithm identifier AT_KEYEXCHANGE is used to reference the exchange key.

Symmetric keys are always exported via CryptExportKey( ) in "simple blob" format, specified by the SIMPLEBLOB constant passed to CryptExportKey( ). The data returned in the buffer from CryptExportKey( ) will have a BLOBHEADER structure, followed by an ALG_ID for the algorithm used to encrypt the key data. The encrypted key data will follow the BLOBHEADER and ALG_ID header information. For extracting the raw key data from a CryptoAPI key, the data in the BLOBHEADER structure and the ALG_ID are of no interest, but you must be aware of their existence so that you can skip over them to find the encrypted key data.
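Skipping the header is simple pointer arithmetic, but it is worth checking the buffer length and blob type before trusting the offset. The following portable sketch does that on a plain byte buffer. The sizes are written out as constants on the assumption that sizeof(BLOBHEADER) is 8 and sizeof(ALG_ID) is 4, as they are on Windows; the SK_ names and sk_simpleblob_payload( ) are hypothetical and not part of CryptoAPI.

```c
#include <stddef.h>

#define SK_BLOBHEADER_LEN 8u    /* assumed sizeof(BLOBHEADER) */
#define SK_ALGID_LEN      4u    /* assumed sizeof(ALG_ID) */
#define SK_SIMPLEBLOB     0x1u  /* assumed value of SIMPLEBLOB */

/* Given an exported simple blob, return a pointer to the encrypted key data
 * and store its length in *payload_len. Returns NULL if the buffer is too
 * short to contain any key data or is not marked as a simple blob. */
const unsigned char *sk_simpleblob_payload(const unsigned char *blob,
                                           size_t blob_len,
                                           size_t *payload_len) {
  const size_t header_len = SK_BLOBHEADER_LEN + SK_ALGID_LEN;

  if (!blob || !payload_len || blob_len <= header_len) return NULL;
  if (blob[0] != SK_SIMPLEBLOB) return NULL;  /* bType is the first byte */
  *payload_len = blob_len - header_len;
  return blob + header_len;
}
```

The pointer and length returned here correspond to what is handed to CryptDecrypt( ) along with the exchange key.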

Finally, the encrypted key data can be decrypted using CryptDecrypt( ) and the exchange key. The CryptDecrypt( ) function is described in more detail in Recipe 5.25. The decrypted data is the raw key data that can now be passed off to other APIs or used in protocols that already provide their own protection for the key. The return from SpcExportKeyData( ) will be a buffer allocated with LocalAlloc( ) that contains the unencrypted symmetric key if no errors occur; otherwise, NULL will be returned.

#include <windows.h>
#include <wincrypt.h>

BYTE *SpcExportKeyData(HCRYPTPROV hProvider, HCRYPTKEY hKey, DWORD *cbData) {
  BOOL      bResult = FALSE;
  BYTE      *pbData = 0, *pbKeyData = 0;
  HCRYPTKEY hExpKey = 0;

  /* Get the exchange key, creating it if the container has none. */
  if (!CryptGetUserKey(hProvider, AT_KEYEXCHANGE, &hExpKey)) {
    if (GetLastError( ) != NTE_NO_KEY) goto done;
    if (!CryptGenKey(hProvider, AT_KEYEXCHANGE, (1024 << 16), &hExpKey))
      goto done;
  }

  /* The first call reports the required buffer size; the second exports. */
  if (!CryptExportKey(hKey, hExpKey, SIMPLEBLOB, 0, 0, cbData)) goto done;
  if (!(pbData = (BYTE *)LocalAlloc(LMEM_FIXED, *cbData))) goto done;
  if (!CryptExportKey(hKey, hExpKey, SIMPLEBLOB, 0, pbData, cbData))
    goto done;

  /* Skip the BLOBHEADER and ALG_ID, then decrypt the key data in place. */
  pbKeyData = pbData + sizeof(BLOBHEADER) + sizeof(ALG_ID);
  (*cbData) -= (sizeof(BLOBHEADER) + sizeof(ALG_ID));
  bResult = CryptDecrypt(hExpKey, 0, TRUE, 0, pbKeyData, cbData);

done:
  if (hExpKey) CryptDestroyKey(hExpKey);
  if (!bResult && pbData) LocalFree(pbData);
  else if (pbData) MoveMemory(pbData, pbKeyData, *cbData);
  return (bResult ? (BYTE *)LocalReAlloc(pbData, *cbData, 0) : 0);
}

See Also

Recipe 5.25, Recipe 5.26