Threat Modeling: Designing for Security (2014)

Part IV. Threat Modeling in Technologies and Tricky Areas

Chapter 16. Threats to Cryptosystems

Cryptography is the art of communicating securely in the presence of adversaries. Cryptography, or crypto, as nearly everyone calls it (to save two syllables), figures into a good number of defenses. You can use it as part of how you address spoofing, tampering, repudiation, and information disclosure. It's important for people working in and around security to understand cryptographic tools; and perhaps more important, to understand the common mistakes made while using them, because failing to understand these mistakes can lead to overconfidence, and mistakes made by overconfident people are a major source of real-world problems.

Cryptographers have been enumerating threats for a very long time. They have done more than most other branches of security to quantify the security of their systems, and have over a long time evolved their thinking about threat models (in the sense of what attacks they worry about). And frankly, crypto can be a lot of fun. If you want to dabble in cryptography, the right place to get started is by breaking cryptosystems. (In a lab, not breaking into other people's systems in possible violation of ethics or law. Thanks! I knew you'd understand.) There are many targets, and the worst you'll do is not break anything. In contrast, if you try to get started by building cryptosystems without developing knowledge of how to break them, you'll inevitably have false confidence in what you've built. It's easy to build a cryptosystem whose flaws you can't see.

This chapter is mostly a reference chapter. It does not replace a good text on cryptography, but encourages you to use the largest building blocks available, rather than rolling your own. If you need to roll your own, you need to first read all of Cryptography Engineering by Niels Ferguson, Bruce Schneier, and Yoshi Kohno (Wiley, 2012). I don't want to scare anyone away from crypto, nor do I want you to be overconfident.

In this chapter, you'll first learn about cryptographic primitives. Primitives are the smallest units of cryptography that anyone will ever use. (They're often called building blocks, but this book uses “building block” to refer to the building blocks in the 4-stage model.) After the cryptographic primitives, you'll learn about the classic crypto attackers and attacks against cryptosystems. Those are followed by advice on building with crypto, and a set of things worth remembering. The chapter closes with a discussion of an over-reliance on secrecy to protect a system in the context of Kerckhoffs, the person who crystalized that discussion.

Cryptographic Primitives

There are only a few basic cryptographic primitives: symmetric and asymmetric encryption, hash functions, and pseudo-random number generators (PRNGs). Understanding them will help you avoid mistakes in their use. This section also covers a set of techniques that are useful for preserving privacy—that is, preventing certain types of information disclosure attacks. Lastly, you'll look at a few important modern constructions that you should be familiar with.

Basic Primitives

The primitives just mentioned are at the very heart of cryptography. If you don't understand them, they're easy to confuse, and confusing them will leave you or your customers insecure. This section uses some standard conventions and terminology: Alice and Bob are typically the people who want to communicate. They do this by sending messages, which are also called plaintext. The cryptosystems they've agreed to use often have ciphers and sometimes keys. A cipher has various functions, such as encrypt and decrypt. The following sections begin with the category of system you're most likely familiar with, symmetric cryptosystems. The others are shown in Table 16.1.

Table 16.1 Cryptographic Primitives

Primitive              | Goal            | Input                    | Output
Symmetric cryptography | Confidentiality | Plaintext (any length)   | Ciphertext (same length)
Asymmetric             | Authentication  | Plaintext (any length)   | Fixed-length signature
Asymmetric             | Key agreement   | Counterparty information | A session key
Hash                   | Fingerprint     | Plaintext (any length)   | Fixed-length, message-dependent fingerprint
PRNG                   | Randomness      | Various                  | Hard-to-predict bits

Symmetric Cryptography

Symmetric encryption systems—also known as private key or secret key systems, or even just ciphers—use a key known to both (or all) sides of a conversation, thus the name symmetric. The Caesar cipher is an example of a (very simple) symmetric system. Letters are shifted three letters to the right, with A being represented by D, B by E, and so on. Someone must either know the key, guess the key or break the system to read the message. Ideally, you use a system that is harder to break.
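
To make the mechanics concrete, here is a minimal sketch of the Caesar cipher in Python. It is illustrative only; a three-letter shift is trivially broken, which is exactly why you want something stronger.

```python
# Minimal sketch of the Caesar cipher (illustrative only; offers no real security).
def caesar(text, key=3, decrypt=False):
    shift = -key if decrypt else key
    out = []
    for ch in text.upper():
        if ch.isalpha():
            # Shift within A-Z, wrapping around with modulo 26
            out.append(chr((ord(ch) - ord('A') + shift) % 26 + ord('A')))
        else:
            out.append(ch)
    return ''.join(out)

ciphertext = caesar("ATTACK AT DAWN")         # 'DWWDFN DW GDZQ'
plaintext = caesar(ciphertext, decrypt=True)  # 'ATTACK AT DAWN'
```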

Assume that Alice and Bob have shared a key. The original, unencrypted message Alice wants to send to Bob is the plaintext. She and Bob have previously agreed on a symmetric cipher to use, along with a key. The plaintext and key are inputs to the cipher's encryption function, which outputs a ciphertext. She sends this ciphertext to Bob, not worrying about who might see it, because if the cryptosystem is strong, that ciphertext is unintelligible to everyone without the key. Those with the key feed the key and the ciphertext to the decryption function, which outputs the plaintext. The symmetry of the system results from the same key being used on both ends. Symmetric systems protect against information disclosure attacks when the key is secure. Unfortunately, establishing the key in a secure way is tremendously tricky. Years ago, governments would send keys in briefcases handcuffed to couriers, which is expensive and slow.

There are two types of symmetric algorithms, block ciphers and stream ciphers. A block cipher takes a block of input (often 128 bits) and encrypts it with a key, producing an output block of the same size as the input. Many constructions that use a block cipher also use an initialization vector (IV). This is a random number that is input to the cipher along with the key and the first block to ensure that identical plaintexts are not encrypted the same way.

A stream cipher takes a key and produces a key stream. That key stream can be XORed with a plaintext bit stream to produce ciphertext. The same key stream is produced at the receiver, which can then use XOR to get the plaintext. If you ever reuse a key stream, your ciphertexts can be XORed, and the output of that is the XOR of the plaintexts. That's highly undesirable. The most common stream cipher, RC4, has a host of issues. Be sure to review the advice in Cryptography Engineering before you use RC4. (Let me save you time—RC4 is not mentioned in Cryptography Engineering, because it's so hard to use securely, but as you review the advice there, you'll find something that will work safely. Just don't use RC4.)
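
That keystream-reuse problem is easy to demonstrate. In the sketch below (in Python, with os.urandom standing in for a stream cipher's output), XORing two ciphertexts produced under the same keystream cancels the keystream and exposes the XOR of the plaintexts:

```python
import os

def xor(data, keystream):
    # XOR each byte of data with the corresponding keystream byte
    return bytes(d ^ k for d, k in zip(data, keystream))

keystream = os.urandom(32)      # stand-in for a stream cipher's keystream
p1 = b"attack at dawn; bring maps"
p2 = b"retreat at 7:15; burn files"

c1 = xor(p1, keystream)
c2 = xor(p2, keystream)

# Eve never sees the keystream, but XORing the ciphertexts cancels it out:
leak = xor(c1, c2)
assert leak == xor(p1, p2)      # the XOR of the plaintexts is exposed
```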

The simple rule for symmetric encryption is to use AES in CBC or CTR mode. AES is the U.S. “Advanced Encryption Standard,” and the modes refer to the precise ways bits are managed. (CBC is “cipher block chaining,” while CTR is “counter” mode.) The pros and cons of each are subtle, and debate rages on. However, using one or the other will almost always be the right choice. To be explicit, this means you should avoid ECB mode. ECB stands for “electronic code book,” although you can think of it as “extreme cryptographic blunder” because it's an extreme cryptographic blunder to use the mode.
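
As a rough illustration of that advice, the following sketch uses the third-party Python cryptography package (an assumed choice; use whatever vetted library your platform provides) to encrypt and decrypt with AES in CTR mode. Note that the nonce must never be reused with a given key, and that this provides confidentiality only; you would add a MAC, as discussed later in this chapter.

```python
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key = os.urandom(32)     # 256-bit AES key from the OS random number generator
nonce = os.urandom(16)   # CTR nonce; must never be reused with the same key

encryptor = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
ciphertext = encryptor.update(b"attack at dawn") + encryptor.finalize()

decryptor = Cipher(algorithms.AES(key), modes.CTR(nonce)).decryptor()
plaintext = decryptor.update(ciphertext) + decryptor.finalize()
assert plaintext == b"attack at dawn"
```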

Asymmetric Cryptography

In an asymmetric encryption system, each party has a set of mathematically related keys, and through the use of really cool math, manages a variety of useful results, including encryption, key agreement, and digital signatures. Asymmetric encryption systems are also called public key systems, because one of the keys is made public.

Note

Asymmetric cryptography is also known as public key encryption.

This section presents a simplified version of the math that doesn't use much more than high school algebra and then discusses how asymmetric systems can be used.

Alice and Bob each pick a number. Alice picks a, Bob picks b. They then agree on a number g. Maybe their number is the closing of the Dow Jones index on the previous day. It can be public; it just needs to be hard for an adversary to influence.

Alice calculates A = g^a and Bob calculates B = g^b. Alice then sends A, and Bob sends B. Then Alice calculates a secret key, s = B^a, and Bob calculates s = A^b. Now, because multiplication is commutative, A^b = B^a (or, really, g^ab = g^ba). If you assume that it's hard to calculate a from g and g^a, then it's hard for anyone who doesn't know a or b to figure out what s is. Alice and Bob have now agreed on a secret without ever sending that secret back and forth. It's not that hard to figure the nth root of g^a, and from there you can approximate, but this is a simplified description. Therefore, to get more security, we'll add one more element, which is that Alice and Bob do all of this math modulo p. Math modulo p is like clock arithmetic. 12 hours after 7, it's 7. So 7 modulo 12 is 7, and 19 modulo 12 is also 7. We can replace 12 with any other number, which is called p. When a, b, g, and p are large, and satisfy certain mathematical properties, then figuring out a or b from g^a and g^b is hard. Because it's hard, Alice and Bob could use s as a key in a symmetric encryption system. They want to do this because symmetric encryption is fast, and asymmetric encryption is slow.

This is way cool, and it's worth taking a few moments to work through some of the math yourself to see it work. However, if you use the preceding system as presented, you will be successfully attacked. The example is designed to help you get a feel for what's possible and why. It is not intended to replace a full college-level course in mathematical cryptography, which is needed before you can make interesting mistakes in this space.
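
If you'd like to try it, here is a toy worked example in Python, with numbers far too small to be secure; pow(base, exponent, modulus) does the modular exponentiation:

```python
# Toy Diffie-Hellman-style key agreement. The numbers are tiny and the
# parameters are not validated; this is for building intuition only.
p = 23            # public modulus
g = 5             # public base
a = 6             # Alice's secret
b = 15            # Bob's secret

A = pow(g, a, p)  # Alice sends A = g^a mod p
B = pow(g, b, p)  # Bob sends   B = g^b mod p

s_alice = pow(B, a, p)   # Alice computes B^a mod p
s_bob = pow(A, b, p)     # Bob computes   A^b mod p
assert s_alice == s_bob  # both arrive at the same shared secret
```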

Once you've worked through the math, you can use asymmetric systems for three things beyond online key agreement: encryption, offline key exchange, and signatures. The cryptosystem's encrypt function allows anyone to encrypt a message using a public key so that only the person with the private key can read it. If the message is itself a key for another cryptosystem, then that key can be used to encrypt or decrypt messages; this is offline key exchange. (The usual reason for this is that symmetric systems are much, much faster than asymmetric systems.) Most important, asymmetric signature systems can be used to sign a message, and anyone with the public key can verify the signature. A quirk of the math in RSA, one of the first public key cryptosystems, leads many people to overgeneralize with the claim that signatures are the inverse of encryption.

The pitfalls related to effective key exchange, signatures, and encryption are substantial and often subtle. Using systems such as Diffie-Hellman, RSA, or DSA as originally presented is foolish.

Hashes

The third important building block is the hash, also called a one-way function or message digest. A hash takes an input of some arbitrary length and produces a fixed-length digest or hash of the input. Ideally, any change to the input completely transforms the output. This means that other building blocks can assume they'll always see a fixed-length input. For example, a signature algorithm can assume that it will always see input of the same length, and defer the problem of arbitrary-length input to the hash function.

Hashes are also used to store passwords, usually with a salt. A salt is an additional input to a hash function, designed to ensure that identical inputs create different outputs. Therefore, if Alice and Bob are both using the password Secret1, then the simple approach of storing a hash of each of their passwords would reveal that Alice and Bob have the same password (but not what it is). With salting, the threat of an attacker sorting the hashes to spot duplicates is mitigated. There are other threats, and more information on the storage of passwords, in Chapter 14, “Accounts and Identity.”
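
A small sketch in Python shows the effect of a salt. (This is only to illustrate the idea; as Chapter 14 discusses, real password storage should use a slow, dedicated password-hashing scheme rather than a bare hash.)

```python
import hashlib
import os

def salted_hash(password):
    # A per-user random salt makes identical passwords hash to different values.
    # Real systems should use a slow, dedicated password-hashing scheme (see Chapter 14).
    salt = os.urandom(16)
    digest = hashlib.sha256(salt + password.encode()).hexdigest()
    return salt, digest

alice = salted_hash("Secret1")
bob = salted_hash("Secret1")
assert alice[1] != bob[1]   # same password, different stored hashes
```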

Hashes can also be used to create fingerprints. A fingerprint in this context is a hash of something longer, designed to be used by people to authenticate that each person has the same copy of the longer thing (such as a file or message). This approach is useful because hashes are faster to calculate than digital signatures. For example, if you want to sign a copy of a web browser, you could calculate the signature on all the millions of bytes of the browser, or you could calculate the signature on the 16 to 64 bytes (128 to 512 bits) of a hash. It's also possible for people to validate a fingerprint. Some people will print public key fingerprints on a business card to allow you to validate the authenticity of that public key. Printing the entire key would be less usable.

Lastly, hashes can be used to calculate message authentication codes (MACs). A message authentication code uses a key known to each side of a protocol, and the key and message are hashed together to produce a MAC. The key is kept private on each endpoint, while the message and MAC are sent.
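
For example, HMAC is a widely used MAC construction built from a hash, available in Python's standard library; a minimal sketch:

```python
import hashlib
import hmac

key = b"shared secret key known to Alice and Bob"
message = b"transfer 100 units to account 42"

# HMAC mixes the key and the message; the tag travels alongside the message.
tag = hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(key, message, tag):
    # The receiver recomputes the MAC and compares in constant time.
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)

assert verify(key, message, tag)
```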

The most famous of these hashes, MD5, has now been effectively and publicly broken. Do not expect MD5 to give you any security, although sometimes you'll need to use it because of a counterparty. Instead, use the latest in the SHA series from NIST.

Randomness

The final important cryptographic primitive is random numbers, sometimes provided by dice. But of course dice are too slow to provide enough random numbers for cryptographic use, so they are usually provided by pseudo-random number generators (PRNGs). For a simple example, if Alice chooses her public key based on a bad random number source (say, microseconds since boot), then an attacker can decide how long he thinks most systems will stay up, and start searching at the median value. If Alice uses a good random number, then the search is much harder. The essence here is that you have some entropy (information unknown to the attacker), and you want to make the best possible use of that entropy.

If you have access to a good hardware source of randomness, which produces lots of entropy, use it. Otherwise, you'll need to use a PRNG to give you more pseudo-random bits. Unfortunately, PRNGs are incredibly difficult to build well, and many things that appear random in computer systems turn out to be somewhat predictable. That's why John von Neumann said, “Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin” (von Neumann, 1951). Fortunately, most operating systems now include decent-quality PRNGs. Make sure you're using one everywhere you need randomness.
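
In Python, for example, os.urandom and the secrets module draw from the operating system's PRNG, while the random module is designed for simulation rather than security:

```python
import os
import random
import secrets

session_key = os.urandom(32)    # OS PRNG: suitable for keys, nonces, and tokens
token = secrets.token_hex(16)   # also backed by the OS PRNG

# NOT for security: the Mersenne Twister is fast but predictable once observed.
simulation_sample = random.random()
```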

Note that randomness is easier for human-operated endpoints, which can take randomness from keyboard interrupts, mouse movements, and other inputs that are hard for an attacker to observe. If you have machines in a data center, you may need to augment randomness. There are hardware devices to do this, sometimes included in a CPU.

Note

As this book went to press, a great deal of controversy erupted over possible backdoors in random number generators. The advice remains that you should rely on the operating system. The risk associated with altering PRNGs is larger than the risk that your library's RNG is malicious. You want to avoid the sorts of mistakes the Debian maintainers introduced into OpenSSL (Debian, 2008). If you are implementing an operating system, then use the hardware RNG with the mixing approach now used by FreeBSD (Goodin, 2013).

Privacy Primitives

There is a small number of primitives that are useful for addressing information disclosure:

§ Steganography: This is the art of hiding messages—for example, encoding your message in the least-significant bits of an image file, such as a photograph. Effective steganography is useful for hiding messages from attackers to conceal the existence of communication. It is also used to encode anti-piracy information in audio and video (“watermarking”). If you need to implement steganography, the best work tends to appear at the Information Hiding Conference.

§ Mixes and Mix networks: These are tools for taking a set of messages, mixing them together, and then sending them out in a way that unlinks the messages from the sender. They generally require latency to work.

§ Onion routing: This is a technique for sending messages over a network that unlinks messages from senders, like mix networks. Each message is wrapped in multiple layers of encryption which are wrapped or unwrapped one at a time (like an onion). Tor is the best-known onion router implementation today.

§ Blinding: These techniques allow a system to sign a message without seeing it. Blinding is covered in Chapter 8, “Defensive Tactics and Technologies,” in the section on mitigating privacy threats.

Modern Cryptographic Primitives

These primitives are more specific than the basic primitives introduced above. When your problem can be addressed with one of these, that's far better than cobbling something together.

Perfect Forward Secrecy

Perfect forward secrecy (PFS) is a property that ensures that if one of the (long-term) keys involved in a protocol is later compromised, messages that were sent earlier under the forward secrecy scheme still cannot be decrypted.

Authenticated Key Exchange

Authenticated key exchange (AKE) is so hard to do right that cryptographers are now developing primitives to address it. AKE is focused on channels (confidential, authenticated, integrity-protected) created over an adversary-controlled connection. There is a related problem, authenticated and confidential channel establishment (ACCE), which is similarly complex. If you find yourself in these spaces, defer to experts and their work. See “On the Security of TLS Renegotiation” (Giesen, 2013) for the state of the art as this book is being written.

Random Oracles

An oracle is a construct that answers questions put to it. In cryptography, the answers an oracle gives are consistent. The random oracle is one that responds to any query with a response chosen randomly from all possible responses. For any given query, it will always return the same response. A random oracle acts much the way we want hash functions to act, but it is mathematically simple. Oracles are almost always available to the attacker.
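
A random oracle is easy to simulate, which is part of what makes it a convenient modeling tool: pick a fresh random answer for each new query and remember it, as in this Python sketch.

```python
import secrets

class RandomOracle:
    # Answers every query with a random value, but always gives the same
    # answer to the same query.
    def __init__(self, output_bytes=32):
        self._answers = {}
        self._output_bytes = output_bytes

    def query(self, message):
        if message not in self._answers:
            self._answers[message] = secrets.token_bytes(self._output_bytes)
        return self._answers[message]

oracle = RandomOracle()
assert oracle.query(b"hello") == oracle.query(b"hello")   # consistent
assert oracle.query(b"hello") != oracle.query(b"hello!")  # independent (with overwhelming probability)
```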

If you see a proof of something with a random oracle, be aware that today's hash functions are not random oracles.

Proven Cryptosystems

When reading about a cryptosystem, you will occasionally see that it is accompanied by a mathematical proof, and you may be tempted to consider any proven cryptosystem superior to those without proofs. It is helpful to be aware of three things. First, the proofs may or may not cover the same things that you're worried about (unless you have a personal burning hatred of second preimages). Second, the proofs are sometimes exceptionally complex mathematical constructs, which can be hard even for mathematicians to follow. If the system has been altered to make a proof possible, then there's a trade-off. It may be the right one, and you should evaluate that against your needs. Lastly, most proofs rest on a set of implicit assumptions that cryptographers expect their audience will understand (Koblitz, 2007).

Certificates and PKI

If Alice publishes a public key, such as A, then anyone who knows that the private part of key A is maintained by Alice can encrypt a message such that only Alice can read it. You can think of this like a set of public mailboxes with names on them. One is labeled Alice, another is labeled Bob. Anyone can walk up to the appropriate mailbox and drop in a letter for Alice or Bob. Only Alice can decrypt messages dropped in the Alice box, and only Bob can decrypt messages in the Bob box, because each has the appropriate private key. Anyone can create a mailbox with their name on it, and anyone can (virtually) walk up and drop a message in that mailbox. Of course, nothing prevents me from creating a mailbox with your name on it, which means I have a key needed to read messages intended for you. If I can trick some senders into sending me messages for you, then I can read those messages. If I'm clever, once I read them, I can forward them on so you'll never know they were read by someone else.

In theory, if all these public keys are collected into a directory, then anyone can use the directory to look up the key for a person. Among the many troubles with this in practice are the many people named Alice, or figuring out if Bob is Bob, Bobby, Robert, or something else in the directory. Some party may attempt to vouch for who's who, creating certificates of various forms, by signing a public key and some associated identifying information. These parties are often called certificate authorities or certificate issuers. The snarky may ask who made them authorities.

Each certificate has either two or three parts, depending on how you count. Using the most common way of counting, a certificate has two parts: the public and the private. The public part includes a public key and a signature on it. The other counting method identifies a public part, a private part, and a certificate file. When sending certificates, it is important to only send the public part or the certificate file. The private part should never be shared. You want to share the public part and the certificate, and it's fine to do so, or to reveal the fingerprint of either. Ideally, the private key is generated on the machine on which it will be used, and never leaves that machine.

Classic Threat Actors

In discussing protocols, cryptographers use names for the participants. This helps people keep track of the human meaning of the variables. The first letter of the name is used as the mathematical variable:

§ Alice and Bob are the classic actors in cryptographic protocols. They were introduced in the first paper on RSA, and have been with us ever since, communicating in the presence of adversaries. Sometimes there are additional participants, with names added in alphabetic order, such as Carol, Chuck, or Dave. The additional participants will usually not include “E” names, because those are reserved for Eve, the oldest attacker. Generally, references to “Alice and Bob” mean “the parties trying to communicate securely.”

§ Eve is an eavesdropper. She is able to tap into the network between Alice and Bob, but is either unwilling or unable to modify their communications. Another way to state this is that Eve only takes advantage of information disclosure threats to violate confidentiality. If you're a movie fan, you'll know that Eve may have three faces, but she almost always goes by “Eve.” Other actor names show more variation.

§ Mallory can also tap into the network but is willing and able to create, destroy, or modify communication content. Usually, Mallory is able to eavesdrop as well, and if he can't, the reason why should be clearly and convincingly articulated. Mallory is occasionally called some other M name. Mallory takes advantage of network spoofing, tampering, information disclosure, and denial-of-service threats. Sometimes a message created by Mallory can implement a repudiation threat, but that's a second-order effect of successful spoofing or tampering.

§ Alice or Bob as traitor can happen because sometimes Alice or Bob are not only communicating in the presence of adversaries. Sometimes after a betrayal, blackmail, or other threat, Bob has actively turned against Alice (or vice versa). These attacks commonly subvert a shared secret or, by carefully constructing messages, convince Alice to solve equations that either leak information about her private keys or trick her or her software into having those keys perform operations not intended by Alice. These threats map less well to STRIDE, but they are usually information disclosure or elevation of privilege (insofar as Bob is getting answers not intended by Alice). Even when Alice and Bob are trustworthy, insufficient authentication can lead to the same impacts.

§ Trent is a “trusted third party” who, by mutual agreement, can perform operations that everyone will believe. (See the “Perspective” sidebar immediately after this list for more on trust.)

§ Victor is also a trusted third party, but one who only performs verification operations.

Note

The word “trust” is used in a somewhat unusual way by cryptographers: Trusted parties can betray the people who trust them. The term is used to describe what the trusted party can do, rather than what you expect them to do. Therefore, “a trusted, but not trustworthy person was caught selling secrets.” This use of “trust” leads to a great deal of confusion. Terms like “relied-upon” are more clear, but sometimes awkward to use. When you see systems that assert trust in Trent or Victor, ask yourself, do you want to rely on that party to behave in a trustworthy way, and what's the downside if they betray you? Are there alternatives that allow you to address that threat?

Attacks Against Cryptosystems

Successful cryptographic attacks usually either lead to information disclosure or allow tampering that the cryptography is intended to prevent. The following list of cryptographic attacks focuses both on attacks that help you understand what can go wrong with your systems and on attacks that are reasonably understandable to those who are not specializing in cryptography.

Note

Linear and differential cryptanalysis are intentionally excluded from this list. They are mathematical attacks on the design of cryptosystems, and if you're engineering software, then the recommended algorithms have been vetted against those attacks, so you can't accidentally make yourself vulnerable. The excluded attacks are complex to describe and easy to describe incorrectly. More importantly, if you are applying the overall advice to use well-understood primitives, they are unlikely to be relevant.

§ Known ciphertext: Known ciphertext attacks are those for which nothing but the ciphertext is available to Eve (or the ciphertext and addressing information). For example, a radio intercept of a ciphertext might include some indicator of the recipient. Ideally, the best attack against such messages is a brute-force one, whereby Eve (who we assume has knowledge of the cryptosystem) must try each possible key to determine whether the message can be decrypted.

§ Chosen ciphertext: A chosen ciphertext attack is one in which Mallory can insert a chosen ciphertext. For example, if your payroll database has (block-)encrypted salary values, you might not know the CEO's salary, but it might be fun to insert the ciphertext of his salary into the database cell that applies to you. There's also a set of attacks in which inserting chosen ciphertext gets you less information than the full plaintext, but such information disclosure can often be leveraged into a larger attack.

§ Chosen plaintext: Choosing your opponent's plaintext is a great attack and often seems fanciful. However, during World War II, allied cryptanalysts were trying to figure out Japanese code names for islands. They instructed the base at Midway Island to radio (in the clear) that they were running out of water. Shortly after that, they intercepted a message that stated “AF is short on water,” thus indicating that AF was Midway. (Presumably, the intercept was in Japanese. [Kahn, 1996].) It is usually surprisingly easy to get your plaintext into modern protocols, because everything is being done by programs that have no ability to notice that their input is bizarre.

§ Adaptive chosen: These are variants of the chosen ciphertext or plaintext attacks in which the adversary can inject something, observe your response, and then inject something else.

§ Man-in-the-middle (MITM): These are executed by Mallory, sitting on the network and altering traffic as it goes by, preventing Bob from seeing Alice's real messages. It is almost always a mistake to assume that Mallory can't do this because of any non-cryptographic reason. Even if you are building for a very specific and controlled environment, over the lifetime of your implementation the environment is likely to change in unpredictable ways. MITM attacks include (but are not limited to) relay, replay, reflection, and downgrade attacks.

§ Relay, or chess master, attacks: How do you beat a chess master if you're a rank amateur? It's easy! Play chess by mail against two masters at once. If you're trying to convince one chess master to vouch for your skills, this is an interesting attack. Of course, not everyone cares how good you are at chess, but the attack generalizes. The modern approach is to be a MITM between Alice and Bob. If Alice is a web browser, and Bob is a web server, then Mallory can present a web server to Alice, Alice will dutifully enter her password, and Mallory will submit it via HTTPS to Bob.

§ Replay: A replay attack is one in which Mallory captures messages and resends them. For example, let's say Mallory knows that spymaster Alice uses time to address messages to her field agents. (That is, Bob listens at 7:30, Charlie listens at 8:00.) After Alice sends the encrypted, authenticated message “You've been discovered! Flee!” at 7:30, Mallory gleefully re-broadcasts it every half-hour for weeks, with police waiting by the ticket counters at the train station.

§ Reflection: In these attacks, Mallory plays back Alice's content to Alice. They're a little more subtle than relay, replay, or other MITM attacks. For example, let's say there's an authentication protocol that relies on a secret key, k, and the authentication consists of encrypting a random nonce with k. Then Alice would send a hello, Bob sends n, and Alice sends n back encrypted with key k. When Bob receives n encrypted with k, he moves to an authenticated state. If Mallory wants to impersonate Alice, then he watches Bob send n, and then asks Bob to authenticate himself, with the same nonce, n. Bob conveniently sends Mallory the encrypted nonce, which Mallory then sends, pretending to be Alice.

§ Downgrade attacks: These are attacks against protocols, executed by a MITM. Downgrade attacks can occur when an insecure version of a protocol is updated but the client or server is unsure what version the other side speaks. Mallory stands in the middle, impersonating each to the other, and forces them to use the less secure version. There are three classes of defense against downgrades: The first hashes all previous messages into the later messages; the second relies on one side or the other (or both) having a memory of what version its counterparties use, and that memory is used to detect and either alert or refuse the change; while the third and most secure defense involves not speaking the old version.

§ Birthday attacks: These are named after the surprising fact that if you have just 23 people in a room, there's a 50 percent chance that two of them will have the same birthday, and a 99 percent chance with 57 people. You can find many good explanations of this online (and a short calculation appears after this list), so let's discuss the attack. An attacker generates a set of documents, looking for any two that will hash to the same value. Hashes, like birthdays, have a fixed set of possible values. When an attacker has two documents that have the same hash, she can substitute one for the other. For example, one document might say “attack at dawn,” the other might say “retreat at 7:15.” If they hash to the same thing, software is unlikely to catch the substitution.

§ Traffic analysis: These attacks involve looking at a set of messages and learning things from the patterns without ever actually understanding the message. For example, if Bob and Charlie routinely retransmit Alice's messages, perhaps Alice is more senior to them in the organization. Similarly, if Alice always responds quicker to Charlie's messages than to Bob's, then perhaps Alice reports to Charlie. Thus, information is disclosed to Eve even if she can't obtain any message content. Message size can also play an important role in traffic analysis. For example, if your “buy” button is the only 3,968 byte image on your site, an attacker watching HTTPS traffic can identify the request for that image.

§ Length extension attacks: If Mallet knows the hash of “foo,” it is trivial for him to calculate the hash of “foobar.” That's true even if “foo” is a complex data structure containing secrets known only to Alice and Bob. This attack affected a lot of people using hashes for authentication in URLs, to which attackers could add extra parameters (Duong, 2009).

§ Timing attacks: Subtle differences in the time it takes to perform cryptographic operations can reveal information such as the length of a message, where in your code an operation fails, or the hamming weight of a private key. (Hamming weight is the sum of the 1 bits in a string.) These differences can be detected over network connections. The simplest mitigation is to always specify a fixed length of time before returning.

§ Side channel attacks: These are focused on extracting information from the physical state of the computer. For example, the power draw of a computer changes as it executes instructions and the CPU makes a small amount of sound. In various quantum cryptographic systems, mirrors are used, and the positions and movements of the mirrors reveal information. “Acoustic Cryptanalysis: On Nosy People and Noisy Machines” (Shamir, 2013) contains a clear introduction to acoustic cryptanalysis.

§ Rubber hose cryptanalysis: Sometimes the easiest way to beat a cryptosystem is to beat the person using it until they give you the key. The degree of violence varies by locale and attacker. Some may literally use a rubber hose, others may just lock Alice up until she gives in.
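
To see where the birthday numbers come from, here is a quick calculation of the probability that at least two of n people share a birthday, assuming 365 equally likely birthdays; the same math applies to hash outputs.

```python
def collision_probability(n, space=365):
    # Probability that at least two of n samples collide in a space of
    # `space` equally likely values.
    p_unique = 1.0
    for i in range(n):
        p_unique *= (space - i) / space
    return 1 - p_unique

print(collision_probability(23))   # ~0.507
print(collision_probability(57))   # ~0.990
# For hashes, use space = 2**(hash output bits); collisions become likely
# around the square root of the space, not the full space.
```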

Building with Crypto

This section discusses a few points of cryptographic engineering, including making choices about cryptosystems to use, planning for upgrades, authenticating at the outermost layer, and some techniques for managing keys.

Making Choices

Because cryptography is hard, it's easy to decide that the right design includes flexibility and negotiation. If you're designing a protocol, and you're not sure if you should use CBC or CTR mode, why not negotiate it at the start? The reason is that such negotiation explodes your attack surface, your analysis work, your future compatibility work, and your test plan. It provides all sorts of places for a MITM or other attacker to wedge themselves in. Therefore, make a decision about what you're building, and keep it simple. (This point derives from Thomas Ptacek's blog article “Applied Cryptography Engineering” [Ptacek, 2013].) If you have multiple authentication options, then the attacker gets to choose the weakest one to attack. Once they can attack authentication, they can perform all sorts of other attacks.

Preparing for Upgrades

To the extent that you're building your own protocols, you should assume that there will be failures. A useful defensive pattern is to incorporate a hash of all previous messages you've sent into the next message. However, a MITM who is terminating the connections and relaying the data can still bypass this. If you do not have a plan and test suite for upgrades, you almost certainly will end up being unable to securely upgrade.
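
A hypothetical sketch of that pattern: each side keeps a running hash of the transcript so far, and that value is bound into (for example, MACed into) later messages, so a counterparty whose view of the conversation differs will detect the mismatch.

```python
import hashlib

class Transcript:
    # Running hash of every message sent or received so far (a hypothetical
    # sketch; a real protocol binds this value into an authenticated message).
    def __init__(self):
        self._hash = hashlib.sha256()

    def add(self, message):
        self._hash.update(len(message).to_bytes(4, "big"))  # length-prefix framing
        self._hash.update(message)

    def digest(self):
        return self._hash.hexdigest()

alice, bob = Transcript(), Transcript()
for msg in (b"hello, versions=1,2", b"chosen version=2"):
    alice.add(msg)
    bob.add(msg)
assert alice.digest() == bob.digest()   # views match; a tampered message would not
```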

Key Management

Key management is about ensuring that each party has the right cryptographic keys, and that there's some mapping between each cryptographic key and an account or roles within a system.

The cryptographic methods for key management can be split into symmetric systems—that is, those that rely on a shared key—and asymmetric systems.

Shared key systems usually mean Kerberos. If you need to use a shared key system, Kerberos has much to recommend it. Such systems suffer from spoofing problems, whereby each entity which knows a key can use it in ways that implicate the other parties associated with that key.

Asymmetric systems reduce the problem of distributing keys in a confidential and authentic way by removing the confidentiality requirement; therefore, asymmetric systems are generally preferable. You still need to authenticate keys, and there are a number of ways to do so:

§ Perhaps the simplest is variously called TOFU (trust on first use) or persistence. The idea, introduced to a broad audience by SSH, was to get a key on first connect and to then persist that key, warning the user if it changed. There are two malicious scenarios in which it would change, and one benign one. The malicious scenarios are the start or end of a man-in-the-middle attack (either way, the key changes). In the benign scenario, the systems administrator changes the key. Key rotation is generally desirable, as it limits exposure to undetected compromise; however, it can be operationally difficult to do in a secure and smooth way. (A rough sketch of key persistence appears after this list.)

§ A system called Convergence builds on the idea of memory by sharing the key that various participants see. Discord between perspectives is generally indicative of an attack (Marlinspike, 2011). A wrinkle is the common security goal of “one key, one service, one machine.” For example, a 100-machine “web farm” at a bank might have 100 different keys, one per machine, and the Convergence system will issue a lot of alarms.

§ PGP introduced a system called web of trust. The idea is that each participant maintains their own set of keys, and any participant can sign a key. Such a signature indicates that they vouch for the key mapping to an identifier (in PGP's case, e-mail addresses and names). Web of trust doesn't scale particularly well, in part because of usability, and in part because the meaning of a signature varies from person to person.

§ Another option is having a limited set of parties vouch for the keys, by issuing certificates. There's an entire industry of organizations that will sign cryptographic keys for a fee under the banner of “public key infrastructures” or PKI. There are also mechanisms such as DNSSEC. DNSSEC is built on a root key, operated by Verisign on behalf of ICANN, which is responsible for delegating (signing) various zones, such as .com. There are a variety of systems, most prevalently DANE, for inserting other keys into the DNSSEC root of trust. Criticisms of DNSSEC include that Verisign also operates a wiretapping infrastructure business, the Verisign NetDiscovery Lawful Interception Service (Verisign, 2007), and that the company is based primarily in the United States, where it may be subject to a variety of interference.

§ PKI can be seen as a limited web of trust, with only a few entities able to vouch for a key. Certificate Pinning takes that limitation a step further, with software deciding that only a small set of certificate issuers will be allowed to issue certificates that will be relied upon. For example, Google's Chrome web browser limits who can issue a certificate for Gmail.
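
As a rough illustration of the persistence (TOFU) approach sketched in the first bullet, the following hypothetical Python code pins the fingerprint of the first key seen for each host and complains if a later connection presents a different one. The file name and fingerprint format are assumptions for illustration; SSH's known_hosts file is the real-world model.

```python
import hashlib
import json
import os

STORE = "known_keys.json"   # hypothetical persistence file, akin to SSH's known_hosts

def fingerprint(public_key_bytes):
    return hashlib.sha256(public_key_bytes).hexdigest()

def check_key(host, public_key_bytes):
    known = {}
    if os.path.exists(STORE):
        with open(STORE) as f:
            known = json.load(f)
    fp = fingerprint(public_key_bytes)
    if host not in known:
        known[host] = fp                      # trust on first use: pin the key
        with open(STORE, "w") as f:
            json.dump(known, f)
        return "first use: key pinned"
    if known[host] == fp:
        return "key matches the pinned value"
    # A changed key is either the start or end of a MITM attack, or a benign rotation.
    return "WARNING: key has changed since first use"
```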

Authenticating before Decrypting

Moxie Marlinspike, who created the Convergence system mentioned earlier, asserts that all cryptographic systems should ensure that they authenticate a message before doing anything else (Marlinspike, 2011). He outlines a number of attacks that were made possible by violating what he calls his “Cryptographic Doom Principle.” Of course, you need to remember to check the message authentication code in constant time. (The authenticate-before-decrypting approach is also called “encrypt-then-MAC.”)
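
A minimal sketch of that ordering in Python, assuming a ciphertext (such as the AES-CTR output shown earlier) and an HMAC tag computed over it (encrypt-then-MAC): the receiver checks the tag in constant time and refuses to touch the ciphertext if the check fails.

```python
import hashlib
import hmac

def seal(mac_key, ciphertext):
    # Sender: the MAC is computed over the ciphertext (encrypt-then-MAC).
    return ciphertext + hmac.new(mac_key, ciphertext, hashlib.sha256).digest()

def open_sealed(mac_key, blob):
    ciphertext, tag = blob[:-32], blob[-32:]
    expected = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):   # constant-time comparison
        raise ValueError("authentication failed; ciphertext not decrypted")
    return ciphertext   # only now is it handed to the decryption function
```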

Things to Remember About Crypto

This advice is more general than the advice for builders, but it's a set of things which you should remember as you're using crypto.

Use a Cryptosystem Designed by Professionals

The design and analysis of new cryptosystems is a specialized field of mathematics, and the tools that cryptanalysts have developed over the last 40 years go through amateur systems like a chainsaw goes through puppies. Really. It's that ugly. If you don't know what an adaptive chosen plaintext attack is or why you'd need to worry about it, let me simplify this: Use the highest level protocol or system designed by professionals which meets your security requirements.

There is only one exception to this advice: If you work for a national cryptographic agency, you may have in-house algorithms. Please send me a copy. I'll review them and let you know if they're acceptable. Other than that, just don't do it.

Use Cryptographic Code Built and Tested by Professionals

This might look like a restatement of the previous point. It's not. Rather, you should use the crypto libraries built into your operating system, or otherwise use an implementation that's solid and well tested. A lot of subtle mistakes can creep in if you implement your own. In short, it is faster, cheaper, and more reliable to reuse existing crypto code than to re-implement it.

Note

Several of the issues discussed in this chapter are the subject of fierce debate in the cryptographic engineering or research communities. Which block cipher mode to use or how to gather and manage randomness are not simple questions. If you're interested in those debates, you can find them. Or you can leave them to those professionals.

Cryptography Is Not Magic Security Dust

Cryptography offers a powerful toolset for solving problems. However, you have to apply the tools appropriately to solve a well-defined problem at hand. For example, if you're coding an SSL connection (using a standard SSL library, of course), have you thought through the failure modes and what you want to have happen in each, and implemented that error-handling code?

Assume It Will All Become Public

This is a restatement of Kerckhoffs's Principle, “The system must not require that it is a secret, and it must be able to fall into enemy hands.” The upshot of this is that your system will almost always include two parts: a system, which is widely known and deployed, and a key, which is regularly and easily changed—see, for example, the password management systems in Chapter 14. Those password management systems are the best we know how to do, in large part because security experts have discussed them at length. The only secrets are the passwords themselves, and those are (relatively) easily changed. See the “Secret Systems: Kerckhoffs and His Principles” section later in this chapter for more details on Kerckhoffs's Principle.

There are many ways in which programmers violate this principle, but one is so common it's worth calling out: embedding a symmetric key in your code. That key is available to everyone with a debugger. Don't do it.

You Still Need to Manage Keys

Key management is hard. All the preceding discussion aside, there are times when you want to manage keys either with local code or manually to ensure that what's happening is what you think is happening. For example, OpenSSL can't check whether the domain name matches the certificate presented, because that's a separate layer. Similarly, the CertGetCertificateChain() call will by default chain to any of the 100 or so roots in the Windows Root Certificate Store.

Whatever keys you are managing, you'll need to ensure that you can create them (with sufficient entropy available), store them securely, and revoke or expire them appropriately.

Secret Systems: Kerckhoffs and His Principles

Auguste Kerckhoffs proposed a set of principles for cryptosystems in 1883. There are two that are generally referred to as “Kerckhoffs's Principle.” They follow his first principle, which is that a system must be undecipherable. The two that are often invoked today are as follows:

1. [The cryptographic system] must not be required to be secret, and it must be able to fall into the hands of the enemy without inconvenience.

2. Its key must be communicable and retainable without the help of written notes, and changeable or modifiable at the will of the correspondents.

From these, we get a variety of restatements, many of which lose the subtlety of the original (and in at least one case, the translation adds an interesting subtlety) [Kerckhoffs, 1883; Petitcolas 2013]. The core concept is that the system must not be secret, which implies that we need something that users can modify. This can be read as an early statement that security by obscurity doesn't work. While that's true, at some level, security requires that some things (such as keys) remain secret. It may be helpful to consider the difference between obscurity, in the sense of difficult to find, and secret, in the sense of intentionally hidden.

Kerckhoffs's principles survive because they carry with them a lot of subtle wisdom. For example, the first part of Kerckhoffs's third principle relates to useful properties of passwords.

There is nothing inherently wrong with a system being secret, as long as everyone relying on the security of the system has sufficient confidence that it has undergone sufficient security scrutiny. If your customers or prospects are arguing about the security of your secret sauce, then by definition they do not have sufficient confidence in the scrutiny and threat modeling it has undergone.

There is another aspect of secrecy, which is that it works differently in the physical and computer worlds. In the physical world, attackers who want to steal the Colonel's secret recipe have to physically enter the kitchen and watch as the secret herbs and spices are blended. Doing so puts them at risk. They are at risk of exposure, being challenged, or even arrested for trespassing. Even probing the security measures around the kitchen involves a degree of risk. Contrast that with trying to break the security of a popular program. It's popular, so anyone can get a copy. You can set up a computer, and probe away to your heart's content. You can debug your exploit code in the comfort of your own home, with zero risk.

This difference is usually implicit, and leads to very different thinking about the value of secrecy around security measures in various communities. Protecting the location of a security camera makes a lot less sense if you've released a brick-for-brick model of your headquarters (Swire, 2004). Similarly, it may be that when you operate a service, you can get more value from obscurity than when you ship a product. That's because you may be able to observe probing. However, you only get that value from obscurity if you actually have ways to detect such probes, analyze what you detect, and respond when appropriate.

Summary

Cryptography is an important set of tools and techniques for addressing threats. There are a wide variety of very sharp tools in the cryptographer's toolbox. The most important are symmetric and asymmetric cryptosystems, hashes, and randomness. You should understand each of these well enough to avoid misusing them. There are more specific primitives for privacy, and a wide set of more specific tools such as forward secrecy and certificates. Cryptographers like to name the parties to their protocols. The names have a pattern and you'll sometimes see that pattern outside of cryptographic documents.

There are a wide variety of threats to cryptosystems; it is likely the area where threats and mitigations have been studied the longest. This chapter has presented a small attack library; more complete ones fill entire books.

If you find yourself building with cryptography, you'll want to make choices, not offer up an infinite list of options (and test cases). Such a list of options will make it harder to plan for upgrades, which is a tremendously important thing to get right from the start. Many of your problems will involve key management. Key management ranges from persistence to a trusted third party vouching for a key. Recall that “trusted” in security is an inverse of its use in polite society: The trusted party is the one who can betray you, and minimizing trust (who can betray you) is a good engineering goal.

You should always use a cryptosystem designed, implemented, and tested by professionals. Your operating system probably comes with a library of these. You should assume that your cryptosystem will become public, and not let that worry you.

What you should worry about is that cryptography is hard. Your goal should be to do it well enough that rather than attacking your crypto, the attackers bypass it. That brings you back to the chess games and “how deep to go” questions that you learned about in Chapter 7, “Processing and Managing Threats.”