Modern Cryptography: Applied Mathematics for Encryption and Informanion Security (2016)
PART IV. Applications
Chapter 17. Cryptanalysis
In this chapter we will cover the following:
Classic techniques of cryptanalysis
Modern methods
Rainbow tables
The birthday paradox
Other methods for breaching cryptography
What does it mean to “break” a cipher? It means finding a method to decrypt the message that is more efficient than using a brute-force attempt—simply trying every possible key. If the encryption algorithm uses a 128-bit key, for example, that means 2128 possible keys to try. In the decimal number system, that is 3.402 * 1038 possible keys. If you are able to attempt 1 million keys every second, it could still take as long as 10,790,283,070,806,014,188,970,529 years to break a cipher with a 128-bit key.
This brings us to cryptanalysis: what is it? Cryptanalysis involves using techniques (other than brute force) to attempt to derive the key. Keep in mind that any attempt to crack any non-trivial cryptographic algorithm is simply that—an attempt. There is no guarantee of any method working, and whether a method works or not, it will probably require a long and tedious process. This should make sense, because if cracking encryption were a trivial process, encryption would be useless.
In fact, cryptanalysis is a very tedious and at times frustrating endeavor. It is entirely possible to work for months, and your only result may be to have a little more information on the key that was used for encrypting a given message. What we see in movies, when encryption is broken in hours or even minutes, is simply not realistic.
Who needs to understand cryptanalysis? Obviously certain intelligence-gathering agencies and military personnel have a need for a strong working knowledge of cryptanalysis—beyond what this chapter will provide. In some cases, law enforcement officers and forensics analysts need to understand cryptanalysis at least well enough to know what is feasible and what is not. It is not uncommon for suspects to use encryption to hide evidence from authorities. Furthermore, cryptanalysis techniques are often used to test algorithms. If a new algorithm, or a variation of an old algorithm, is proposed, you can begin testing that algorithm by subjecting it to appropriate cryptanalysis techniques.
Note
Providing a thorough exploration of cryptanalysis would take more than a single chapter of a book. In fact, many introductory cryptography books ignore the topic altogether. The goal of this chapter is simply to acquaint you with the fundamentals of cryptanalysis. If you want to study this topic in more depth, this chapter should provide a solid foundation. You will also find links and suggestions for further study here. If you don’t want to learn more about cryptanalysis, this chapter will provide more than enough information for most security professionals.
Classic Cryptanalysis Methods
Recall that this book began with a study of classic ciphers to help you become comfortable with the concepts of cryptography before delving into more modern algorithms. As with ciphers, it is often easier for a student of cryptography to understand classic cryptanalysis methods before attempting to study modern methods. This section will examine methods that are effective only against classic ciphers. These won’t help you break RSA, AES, or similar ciphers, but they might help you more fully understand the concepts of cryptanalysis.
Frequency Analysis
Frequency analysis is the basic tool for breaking most classical ciphers. In natural languages, certain letters of the alphabet appear more frequently than others. By examining those frequencies, you can derive some information about the key that was used. This method is very effective against classic ciphers such as Caesar, Vigenère, and so on. It is far less effective against modern methods, however. In fact, with modern methods, the most likely result is that you will get some basic information about the key, but you will not get the key. Remember that in the English language, the two most common three-letter words are the and and. The most common single-letter words are I and a. And if you see double letters in a word, they are most likely ee or oo.
Here are a few general facts that will help you—at least with English as the plain text:
T is the most common first letter of a word.
E is the most common last letter of a word.
The is the most common word.
Very common two-letter combinations are he, re, an, th, er, and in.
Very common three-letter combinations are ent, ing, ion, and and.
As mentioned, these methods are quite effective with any of the classic ciphers discussed in Chapters 1 and 2, but are not effective against more modern ciphers. When I’m teaching introductory cryptography courses, I often conduct a lab to illustrate this. In this lab, I ask each student to write a brief paragraph, select some substitution shift (such as +1, –2, +3, and so on), and then apply that single substitution cipher to their message. Essentially, they are applying a Caesar cipher to their messages. Then I ask the students to exchange the resulting cipher text with a classmate. Each student then applies frequency analysis to attempt to break the cipher. Having conducted this lab on numerous occasions for many years, I’ve typically found that about half the class can crack the cipher within 10 to 15 minutes, and these students usually have no prior background in cryptography or cryptanalysis. This lab serves multiple purposes: it introduces the students to primitive cryptanalysis, and it illustrates the fact that classic ciphers are no longer adequate for modern cryptographic purposes.
Let’s try an example. Consider this cipher text that was created using a single substitution cipher:
Notice the three, two-letter words:
Notice, too, that qh appears twice. There are quite a few two-letter words in English, but an and of are two of the most common. If you assume of, then that means q is o and h is f.
Now let’s look at the three-letter words: qpg and vig. Continuing our assumption that q is o and h is f, that would mean that qpg is a word starting with o. How many three-letter words start with o? It turns out that there are quite a few: oil, oak, one, off, ohm, and so on. If you have some knowledge of the subject matter of the message, you might be able to determine if a less common word (such as oak or oil) is likely; if not, you can pick a very common word—either off or one. If you guess one, then that means q is o, h is f, p is n, and g is e. Can we determine anything from this? Let’s take a look:
In every case, the cipher text is +2 from the plain text (if our guesses are right). If you apply this to the entire cipher text, and reduce each letter by two, you get this:
You have just deciphered the message. If your guesses lead to different shifts (such as some are +2, some are +3), then either you are wrong or it is a multi-substitution cipher.
Kasiski Examination
The Kasiski examination (sometimes called Kasiski’s Test or Kasiski’s Method), developed by Friedrich Kasiski in 1863, is a method of attacking polyalphabetic substitution ciphers, such as the Vigenère cipher. You can use this method to deduce the length of the keyword used in the polyalphabetic substitution cipher. Once you discover the length of the keyword, you line up the cipher text in n columns, where n is the length of the keyword. Then each column can be treated as a mono-alphabetic substitution cipher and then cracked with simple frequency analysis. The method involves looking for repeated strings in the cipher text—the longer the cipher text, the more effective this method will be.
Note
Friedrich Kasiski (1805–1881) was a German officer and cryptographer. He published a book on cryptography in 1863 entitled Die Geheimschriften und die Dechiffrir-Kunst (Secret Writing and the Art of Deciphering).
Modern Methods
Of course, cracking modern cryptographic methods is no trivial task. In fact, the most likely outcome of any attempt is failure. But with enough time and resources (such as computational power, sample cipher texts and plain texts, and so on) it is possible.
Here are some attack techniques that can be used to try to crack modern ciphers:
Known plain-text attack With this technique, the attacker obtains a number of plain-text/cipher-text pairs. Using this information, the attacker attempts to derive information about the key being used. Any chance of success requires that the attacker have many thousands of plain-text/cipher-text pairs.
Chosen plain-text attack In this case, the attacker obtains the cipher texts corresponding to a set of plain texts of his own choosing. This enables the attacker to attempt to derive the key used and thus decrypt other messages encrypted with that key. This can be difficult, but it is not impossible.
Cipher text–only attack Here, the attacker has access only to a collection of cipher texts. The attack is completely successful if the corresponding plain texts—or, even better, the key—can be deduced. The ability to obtain any information about the underlying plain text is considered a success. This method is far more likely than using known plain text, but it’s also the most difficult method.
Related-key attack This attack is similar to a chosen plain-text attack, except the attacker can obtain cipher texts encrypted under two different keys. This is a very useful attack if the attacker can obtain the plain text and matching cipher text.
The chosen plain-text and known plain-text attacks often puzzle students who are new to cryptanalysis. How, they ask, can you get samples of plain text and cipher text? You cannot simply ask the target to hand over such samples, can you? Actually, it is not that difficult. Consider, for example, that many people use signature blocks in their e-mails. If you send an e-mail to the target and get a response, the signature block is an example of plain text. If you later intercept an encrypted e-mail from that target, you already know to match the encrypted text with the plain-text signature block at the end of the e-mail. This is just one trivial example of how to get plain-text/cipher-text pairs.
More and different mathematically sophisticated methods can also be used, including linear, differential, integral, and mod n cryptanalysis. Some of these are applications of known plain-text or chosen plain-text attacks.
Linear Cryptanalysis
This technique, invented by Japanese cryptographer Mitsuru Matsui, is a known plaintext attack that uses a linear approximation to describe the behavior of the block cipher. Given enough pairs of plain text and corresponding cipher text, you can obtain bits of information about the key. Obviously, the more pairs of plain text and cipher text you have, the greater the chance of success. Linear cryptanalysis is based on finding affine approximations to the action of a cipher. It is commonly used on block ciphers.
Note
Mitsuru Matsui is known not only for the discovery of linear cryptanalysis, but he has published extensively on other cryptanalysis issues.1
Remember that cryptanalysis is an attempt to crack cryptography. For example, with the 56-bit Data Encryption Standard (DES) key, a brute-force attack could take up to 256 attempts. Linear cryptanalysis will take 247 known plain texts.2 This is better than brute force, but it is still impractical in most situations. Matsui first applied linear cryptanalysis to the FEAL (Fast Data Encipherment Algorithm) cipher and then later to DES. However, the DES application required 247 known plain-text samples, making it impractical.
Note
FEAL was proposed as an alternative to DES, and there have been various revisions of the cipher. Unfortunately, it was susceptible to several types of cryptanalysis.
In linear cryptanalysis, a linear equation expresses the equality of two expressions that consist of binary variables that are XOR’d. For example, in Figure 17-1, the equation XORs the sum of the first and third plain-text bits, and the first cipher-text bit is equal to the second bit of the key.
FIGURE 17-1 The basics of linear cryptanalysis
You can use linear cryptanalysis to slowly re-create the key that was used. After doing this for each bit, you will have an equation of the form shown in Figure 17-2.
FIGURE 17-2 The form of linear cryptanalysis
You can then use Matsui’s Algorithm 2, using known plain-text/cipher-text pairs, to guess the values of the key bits involved in the approximation. The right side of the equation is the partial key (the object is to derive some bits for part of the key). Now count how many times the approximation holds true over all the known plain-text/cipher-text pairs. This count is called T. The partial key that has a T value and that has the greatest absolute difference from half the number of plain-text/cipher-text pairs is determined to be the most likely set of values for those key bits. In this way, you can derive a probable partial key.
Differential Cryptanalysis
Invented by Eli Biham and Adi Shamir, differential cryptanalysis is a form of cryptanalysis that is applicable to symmetric key algorithms. According to Christopher Swenson, “Differential cryptanalysis focuses on finding a relationship between the changes that occur in the output bits as a result of changing some of the input bits.”3 It originally worked only with chosen plain text. However, later research found it could also work with known plain text and cipher text only.
The differential cryptanalysis attacker examines pairs of plain-text inputs that are related by some constant difference. The usual way to define the differences is via an XOR operation, but other methods can also be used. The attacker computes the differences in the resulting cipher texts and looks for some statistical pattern. The resulting differences are called the differential. The basic idea is that by analyzing the changes in some chosen plain text, and noting the difference in the outputs resulting from encrypting each text, it is possible to recover some properties of the key.
Differential cryptanalysis measures the XOR difference between two values. Differentials are often denoted with the symbol Ω; for example, you might have a differential Ωa and another differential Ωb. A characteristic is composed of two matching differentials; for example, differential Ωa in the input produces differential Ωb in the output. The characteristic demonstrates that the specified differential in the input leads to a particular differential in the output. Differential cryptanalysis is also about probabilities. So the question being asked is, What is the probability that a given differential in the input Ωa will lead to a particular differential in the output Ωb?
In most cases, differential analysis starts with the substitution box (S-box). Because most symmetric ciphers use S-boxes, this is a natural and convenient place to start. If, for example, you assume an input of X1 that produces output of Y1 and an input of X2 that produces an output of Y2, this produces a differential (that is, the difference between X1 and X2 produces the difference between Y1 and Y2). This is expressed as follows:
Next you need to consider the relationship between input differentials and output differentials. To do this, you have to consider all possible values of Ωi and measure how this changes the values of Ωo. For each possible value of X1, X2, and Ωi, you measure the change in Y1, Y2, and Ωo, and then record that information.
Note
Even though differential cryptanalysis was publically discovered in the 1980s, DES is resistant to differential cryptanalysis based on the structure of the DES S-box. Since the DES S-box is the portion of DES that was constructed by the NSA, it stands to reason that the NSA was aware of differential cryptanalysis in the 1970s.
Higher-Order Differential Cryptanalysis
An improvement on differential cryptanalysis was developed by Lars Knudsen in 1994. Higher-order differential cryptanalysis focuses on the differences between differences that would be found with ordinary differential cryptanalysis. This technique has been shown to be more powerful than ordinary differential cryptanalysis. Specifically, it was applied to a symmetric algorithm known as the KN-cipher, which had previously been proven to be immune to standard differential cryptanalysis. Higher-order differential cryptanalysis has also been applied to a variety of other algorithms, including CAST.
Truncated Differential Cryptanalysis
Ordinary differential cryptanalysis focuses on the full difference between two texts and the resulting cipher text, but truncated differentials cryptanalysis analyzes only partial differences.4 By taking partial differences into account, it is possible to use two or more differences within the same plain-text/cipher-text pair. As the name suggests, this technique is only interested in making predictions of some of the bits, instead of the entire block. It has been applied to Twofish, Camellia, Skipjack, IDEA, and other block ciphers.
Impossible Differential Cryptanalysis
Put another way, standard differential cryptanalysis is concerned with differences that propagate through the cipher with a greater probability than expected. Impossible differential cryptanalysis is looking for differences that have a probability of 0 at some point in the algorithm. This has been used against Camellia, ARIA, TEA, Rijndael, Twofish, Serpent, Skipjack, and other algorithms.
Integral Cryptanalysis
Integral cryptanalysis was first described by Lars Knudsen. This attack is particularly useful against block ciphers based on substitution-permutation networks. While differential analysis looks at pairs of inputs that differ by only 1 bit position, with all other bits identical, integral analysis, for block size b, holds b – k bits constant and runs the other k bits through all 2k possibilities. So k = 1 is just differential cryptanalysis; k > 1 is a new technique: integral cryptanalysis.
Mod n Cryptanalysis
Mod n cryptanalysis can be used for either block or stream ciphers. This method was developed in 1999 by John Kelsey, Bruce Schneier, and David Wagner. This excerpt from a paper written by the inventors provides a good overview of the technique:
Nearly all modern statistical attacks on product ciphers work by learning some way to distinguish the output of all but the last rounds from a random permutation. In a linear attack, there is a slight correlation between the plain text and the last-round input; in a differential attack, the relationship between a pair of inputs to the last round isn’t quite random. Partitioning attacks, higher-order differential attacks, differential attacks, and related-key attacks all fit into this pattern.
Mod n cryptanalysis is another attack along these lines. We show that, in some cases, the value of the last-round input modulo n is correlated to the value of the plain text modulo n. In this case, the attacker can use this correlation to collect information about the last-round subkey. Ciphers that sufficiently attenuate statistics based on other statistical effects (linear approximations, differential characteristics, etc.) are not necessarily safe from correlations modulo n.5
Asymmetric Cryptanalysis and RSA
So far, we have focused on symmetric ciphers, particularly block ciphers. However, there are also known weaknesses in asymmetric ciphers. Because RSA is the most widely used asymmetric cipher, I will focus on it here. This section will introduce you to several issues with RSA; for more information, refer to any of the other sources referenced in this section.
Recent studies have discovered potential flaws in RSA. Heninger and Shacham6 found that RSA implementations that used a smaller modulus were susceptible to cryptanalysis attacks. In their study, they considered RSA implementations that used a small exponent in the algorithm. A smaller modulus is sometimes used to increase the efficiency of the RSA algorithm. However, the size of the modulus value also could be used to reduce the set of possible factors and thus decrease the time required to factor the public key. In fact, a great many RSA implementations use e = 216 + 1 = 65537. So a cryptanalysis already has the public key and thus has e and n. And the n is relatively small, making it possible, with extensive computing power and time, to derive the private key. The authors of this study clearly showed that it is possible to derive the private RSA key, which would render that particular RSA encryption implementation useless.
In their methodology, Heninger and Shacham formulated a series of linear equations that would progressively approximate the RSA private key. The approximations were based on factoring the public key. This technique is very similar to the linear cryptanalysis method for cryptanalysis of symmetric key algorithms.7, 8
Zhao and Qi9 also used implementations that have a smaller modulus operator. The authors of this study applied modular arithmetic, a subset of number theory, to analyze weaknesses in RSA. Many implementations of RSA use a shorter modulus operator to make the algorithm execute more quickly. Like Heninger and Shacham, Zhao and Qi showed that, based on the mathematical relationships between the elements of the RSA algorithm, increases in efficiency resulting from a smaller modulus will also render a decrease in the efficacy of that RSA implementation.
In their study, Zhao and Qi used a lattice matrix attack on the RSA implementation to attempt to factor the public key and derive the private key. The specifics of this mathematical methodology are not relevant to this discussion; what is significant, however, is that the researchers used an approach different from Heninger and Shacham and achieved the same results on RSA applications using a small modulus.
Aciiçmez and Schindler10 examined the RSA cryptographic algorithm as implemented in Secure Sockets Layer (SSL). Given that RSA and SSL/TLS are used for online banking and e-commerce, the security of any implementation of the protocol is an important topic. Aciiçmez and Schindler wanted to determine whether there were flaws in the implementation that would allow an unintended third party to break the SSL implementation. The authors explained how a particular type of cryptanalysis can be used to break this specific implementation of RSA. This analysis was dependent upon essential elements of number theory.
In their study of SSL using RSA, Aciiçmez and Schindler examined the timing for modular arithmetic operations used in that specific implementation of RSA. This ultimately led to a method for factoring the public key, thus yielding the private key used in that RSA implementation. This methodology is important because normal approaches to factoring the public key are entirely too time consuming to be of practical use (Swenson, 2008). It is important to derive some additional information about the implementation of RSA in order to attempt a more practical approach to factoring. By using number theory, specifically with respect to the functionality of modular arithmetic, the researchers significantly decreased the time required for factoring the public key. Clearly these findings are significant and show an important problem with some implementations of RSA.
These studies are simply a sample of known attack vectors against RSA. Many of these attacks depend on a small modulus. And you have seen that many RSA implementations use the same small modulus (e = 216 + 1 = 65537). It is also true that increases in computing power will make these attacks, as well as brute-force attempts to crack RSA, even more practical. So far, the cryptography community has reacted by simply using ever larger key sizes for RSA. It seems likely that an entirely new asymmetric algorithm will be needed. But in the meantime, when you implement RSA, make sure that you not only use a large key size, but that you be wary of using too small a modulus value.
General Rules for Cryptanalysis
Regardless of the technique used, three resources are required for all cryptanalysis:
Time The number of “primitive operations” that must be performed. This is quite loose; primitive operations could be basic computer instructions, such as addition, XOR, shift, and so forth, or entire encryption methods.
Memory The amount of storage required to perform the attack.
Data The quantity of plain texts and cipher texts required.
With infinite time, memory, and data, any cipher can be broken. Of course, nobody has infinite resources, and, in fact, resources are generally quite limited—particularly time. It would do no good to break the encryption of military communications and learn of an attack, if that break occurred several weeks after the attack. The information is useful only if it is timely.
Note
In general, the primary advantage that a government entity has in cryptanalysis is the resource of memory. A supercomputer, or even a cluster of high-end servers, can put far more resources to breaking a cipher than an individual with a single computer can. Even with those resources, however, breaking a modern cipher is far from trivial. It is still an onerous, resource-intensive task with no guarantee of success. And breaking a modern cipher with a PC is simply not feasible.
With any cryptanalysis attack, varying degrees of success are possible:
Total break The attacker deduces the secret key.
Global deduction The attacker discovers a functionally equivalent algorithm for encryption and decryption, but without learning the key.
Instance (local) deduction The attacker discovers additional plain texts (or cipher texts) not previously known.
Information deduction The attacker gains some Shannon information about plain texts (or cipher texts) not previously known.
Distinguishing algorithm The attacker can distinguish the cipher from a random permutation.
Consider this list carefully. A total break may be the only type of cryptanalysis success you’ve considered. However, that is only one possible definition of a successful cryptanalysis—and, in fact, it is the least likely outcome. In general, if your cryptanalysis produces more information about the key than was previously known, it is considered a success.
Rainbow Tables
Many passwords are stored with a cryptographic hash, which prevents the network or database administrator from reading the password. Recall from Chapter 9 that cryptographic hashes are not reversible—it is not merely computationally difficult, or impractical, to “unhash” something; it is mathematically impossible. It would seem, then, that breaking passwords protected by cryptographic hashes is impossible. However, that is not actually the case.
A rainbow table is a precomputed table of hashes. The most primitive way to create such a table would be to precompute hashes of all possible passwords of a given size. Assuming a standard English keyboard, there are 26 characters in uppercase and 26 in lowercase (52 possibilities), 10 digits, and about 8 special characters (#, !, $, and so on)—or about 70 possible values available for each character (the value 70 is just a rough estimate to illustrate this concept). A single character password could have 70 possible values, whereas a two-character password could have 702, or 4900 possible values. Even an 8-character password could have up to 576,480,100,000,000 possible values! So calculating tables that account for all passwords of any length from 5 to 10 characters would be computationally intensive and would require a great deal of storage.
Because this method for composing precomputed tables of hashes is very primitive, hash chains can be used instead to make this process efficient and to reduce the space needed to store the precomputed hashes. To create a hash chain, you use a reduction function, we will call R, that maps hash values back to plain-text values. This is not unhashing or reversing a hash; it is merely a method to precompute hashes more quickly.
Another, even more advanced, method is to replace the reduction function with a sequence of related reduction functions R1…Rk. The issue then becomes how to implement this process. For example, Microsoft Windows stores password hashes in the Security Account Manager (SAM) file. To find a password, you would first obtain the SAM file for a target machine, and then search the contents through rainbow tables for matches.
The Windows password-cracker tool Ophcrack can automate this process for you. After loading a CD/DVD containing rainbow tables, Ophcrack boots to a live version of Linux. Then it launches a tool that copies the SAM file from the Windows machine and searches the rainbow tables on the CD/DVD for a match. Figure 17-3 shows the Ophcrack interface taken from an actual system, so some information has been redacted for security purposes.
FIGURE 17-3 Ophcrack interface
Note
Rainbow tables get very large. As of this writing, no portable rainbow tables can return passwords that are more than 12 to 14 characters in length. Therefore, longer passwords are useful for thwarting this attack.
The Birthday Paradox
If none of the previous methods works, and you are forced to rely on brute force, finding the key may not take as much time as you think. Consider DES, for example, with its 56-bit key. This means 256 possible keys, or in decimal notation, 72,057,594,037,927,936 possible keys. Even if you could try 1 million keys per second, this would take 72,057,594,037 seconds, or about 2285 years—much too long. You should realize that you probably won’t have to try every single key; you might get the right key just by chance in the first 100 you try, though the odds against that are beyond astronomical! Once you pass the first half the possible keys (at about the 1142-year mark) it gets increasingly likely that you will hit the key with your next attempt. That still is not helpful.
There is a mathematical puzzle that can help, however, called the birthday paradox (sometimes called the birthday problem). Here’s the basic idea: How many people would you need to have in a room to have a strong likelihood that 2 would have the same birthday (month and day, not year)? Obviously, if you put 367 people in a room, at least 2 of them must have the same birthday, as there are only 365 days in a year (plus one more on a leap year). However, we are not asking how many people you need to guarantee a match, just how many you need to create a strong probability. It just so happens that with only 23 people in the room, there is a 50 percent chance that 2 people will have the same birthday.
How is this possible with so few people? How is it that such a low number can work? Basic probability tells us that when events are independent of one another, the probability of all the events occurring is equal to a product of the probabilities of each event occurring. So the probability that the first person does not share a birthday with any previous person is 1.0, or 100 percent, since there are no previous people in the set: That can be written as 365/365. For the second person, there is only one preceding person, and the odds that the second person has a birthday different from the first are 364/365. For the third person, the odds of having a different birthday than either of the two preceding people are 363/365. Since each of these is independent, we can compute the probability as follows:
365/365 * 364/365 * 363/365 * 362/365 … * 342/365
(342 is the probability of the 23rd person sharing a birthday with a preceding person.) Let’s convert these to decimal values (truncating at the third decimal point):
1 * 0.997 * 0.994 * 0.991 * 0.989 * 0.986 * … 0.936 = 0.49 or 49%
This 0.49, or 49 percent, is the probability that none of the 23 will have a birthday in common, which means there is a 0.51, or 51 percent (better than even odds) probability that 2 of the 23 will have a birthday in common.
Just for reference, if 30 people are involved, the probability that 2 have the same birthday is 0.706, or 70.6 percent. With 50 people the probability raises to 0.97, or 97 percent, which is quite high. This does not simply apply to birthdays, of course. The same concept can be applied to any set of data, and it is often used in cryptography and cryptanalysis.
In reference to cryptographic hash functions, the goal is to find two different inputs that produce the same output. When two inputs produce the same output from a cryptographic hash, this is referred to as a collision. It just so happens that from any set of n elements, the number of samples required to get a match or collision with a greater than 0.5, or 50 percent, probability is 1.174 √n. Returning to the preceding birthday problem, 1.174 √365 = 22.49.
You can apply this to other cryptographic issues as well. Let’s return to the DES algorithm. Recall that it has 72,057,594,037,927,936 possible keys. To guarantee that you find the correct key, you may need to check all 72,057,594,037,927,936 possibilities. However, if all you are seeking is a greater than 50 percent chance of finding the right key, you can try 1.774 √72,057,594,037,927,936 possible keys—only 476,204,499 possible keys. This is far more manageable. In fact, again referring to our earlier scenario with DES, if you can attempt 1 million keys per second, you have a 50 percent chance of finding the right key in 7.9 minutes. Although 1 million keys per second is beyond the capability of your standard personal computer, it is well within the capabilities of supercomputers or even computing clusters.
The birthday paradox is one reason why larger key sizes are necessary for security. If we move our attention from DES to an AES 128-bit key, there are approximately 3.402 * 1038 possible keys. Applying the birthday paradox gives us 1.774 √(3.402 * 1038), or 32,724,523,986,760,744,567, keys that need to be attempted to have a 50 percent chance of finding a match. This number is large enough that it is computationally infeasible to break it through brute-force methods, even with the birthday paradox.
Note
For more details on this mathematical problem and how it applies to cryptography, you can consult Chapter 2 of Modern Cryptanalysis: Techniques for Advanced Code Breaking (Wiley, 2008), which provides even more detailed coverage of the birthday paradox. Professor Dan Boneh also created a very good video on this topic at www.youtube.com/watch?v=ZZovSCFZffM.
Other Methods
Cryptanalysis is a formal process whereby one applies specific techniques in an attempt to crack cryptography. However, as you have seen in this chapter, one’s ability to crack a cipher completely is quite limited. In many cases, the best one can hope to achieve is to derive some additional information about the target key. In many cases, breaking cryptography is so time consuming, and the probability of actually deriving the key is so small, that cryptanalysis provides no practical means of breaking ciphers. Put another way, cryptanalysis is an excellent tool for testing the security of a given cipher, but it is usually not efficient enough to make it a practical means for situations that require that the cipher be compromised in a short period of time. For example, a law enforcement agency that needs to break the encryption on a hard drive is unlikely to find cryptanalysis techniques of much use. However, using other methods for compromising ciphers might provide more immediate results. These methods all depend on some flaw in the implementation of the cryptography.
Other Passwords
Particularly with e-mail and hard drive encryption, a user is typically required to know some password in order to decrypt and access the information. Many hard drive encryption tools use very strong encryption. For example, Microsoft BitLocker, introduced with Windows 7, uses AES with a 128-bit key.
Several open source hard drive encryption tools use AES with a 256-bit key. It is simply not feasible to break this key. However, because it is possible that the user has used the same password (or a substantially similar permutation) somewhere else—such as with an e-mail account, or with a Windows password—you can check these sources for the key. You might even use Ophcrack on a Windows computer and use those passwords (and permutations of those passwords) to try and decrypt the encrypted partition or e-mail.
After the fact, many people are not even aware that their e-mail password has been cracked. Several web sites keep lists of e-mail accounts that have been breached, including haveibeenpwned.com and PwnedList.com.
Note
If you are a law enforcement officer attempting to breach an encrypted drive belonging to a suspect, you can check these sites to see if the suspect’s e-mail account has been breached. You can then use the e-mail password, and close permutations, to attempt to decrypt the hard drive.
Related Data
It is often the case that people choose passwords that have some meaning for them. This makes memorization much easier. I frequently advise forensic analysts to learn all they can about a suspect, and photographing or videotaping the area where the computer is seized can be quite valuable as well. If, for example, the suspect is an enthusiastic fan of a particular sports team, with memorabilia extensively displayed in his or her home or office, this type of information might aid in breaking encrypted drives, phones, and e-mails. It is at least reasonably likely that this individual’s cryptography password is related to a beloved sports team.
Spyware
Of course, the easiest way to breach encryption is to see the password as the user types it in. This is often accomplished via spyware. For intelligence-gathering agencies, this is a viable option. For law enforcement, this can be done only with a warrant, and for civilians this is simply not an option.
Side-Channel Attack
Side-channel attacks are more difficult to accomplish. They are based on studying the physical components of a cryptographic system, seeking clues as to the cryptographic keys. For example, a gateway router that implements a virtual private network might be examined to study its power consumption or timing, or to see if there are electromagnetic leaks that provide cryptographic clues that can be used in cryptanalysis.
There are variations of this attack, including the following:
Timing attacks Attacks that measure how much time various computations take to perform.
Data remanence Attacks that gather sensitive data or portions thereof that remain in memory after having been deleted.
Electromagnetic attacks Attacks based on leaked electromagnetic radiation.
Electromagnetic emanations are particularly interesting with respect to the most common sort of network cable, unshielded twisted pair (UTP). You can attempt to detect the signal passing through UTP cable and derive information from it. As you might suspect, this is a nontrivial task. In fact, in response to this issue, the U.S. National Security Agency established the TEMPEST standard, which prohibits cables used to transmit classified data from being within a certain distance of cables transmitting unclassified data to prevent information “leakage.”
Resources
As stated at the beginning of this chapter, the goal of this chapter is to get you familiar with the concepts of cryptanalysis. If you are seeking more specific information, perhaps even tutorials on cryptanalysis, I suggest you first master the information in this chapter, and then consider the following resources:
“A Tutorial on Linear and Differential Cryptanalysis,” by Howard M. Heys, Memorial University of Newfoundland, www.engr.mun.ca/~howard/PAPERS/ldc_tutorial.pdf.
“Differential Cryptanalysis Tutorial,” The Amazing King web site, www.theamazingking.com/crypto-diff.php. This source is not a typical scholarly resource and provides information in a way that is easy to understand.
“Linear Cryptanalysis Tutorial,” The Amazing King web site, www.theamazingking.com/crypto-linear.php.
“A Self-Study Course in Block-Cipher Cryptanalysis,” by Bruce Schneier, www.schneier.com/paper-self-study.pdf.
“A Tutorial on Linear and Differential Cryptanalysis,” by Howard M. Heys. Cryptologia 26, 3 (July 2002), 189–221, http://dx.doi.org/10.1080/0161-110291890885.
“Cryptography and Network Security” video, by Prof. D. Mukhopadhyay, www.youtube.com/watch?v=xcBqraHhcJU.
Conclusions
This chapter provided a broad general introduction to cryptanalysis. You should be very familiar with classic techniques such as frequency analysis, and you should understand the concept of the birthday paradox, as well as how it is applied to cryptographic problems.
For modern methods such as linear cryptanalysis and differential cryptanalysis, you should have a general understanding of the concepts and some grasp of the applications. It is not critical that you be able to apply these techniques. You should, however, have a working knowledge of rainbow tables.
Test Your Knowledge
1. What is brute-force cracking?
2. Applying the birthday paradox, how many keys would you need to try out of n possible keys to have a 50 percent chance of finding a match?
3. In the English language, __________ is the most common first letter of a word.
4. __________ is the most common word in the English language.
5. With a __________, the attacker obtains a number of plain-text/cipher-text pairs.
6. __________ is based on finding affine approximations to the action of a cipher.
7. Using ___________, the attacker discovers additional plain texts (or cipher text) not previously known.
8. In _____________, the attacker discovers a functionally equivalent algorithm for encryption and decryption, but without learning the key.
9. A __________ is essentially a precomputed table of hashes.
10. The concept of a hash-chain is to use a ___________, we will call R, that __________.
Answers
1. Trying every possible key to guess the right one
2. 774√n
3. T
4. the
5. known plain-text attack
6. Linear cryptanalysis
7. instance (local) deduction
8. global deduction
9. rainbow table
10. reduction function, maps hash values back to plain-text values
Endnotes
1. For a list of Mitsuru Matsui’s publications, see www.cryptographersworld.com/a.php?a=64.
2. M. Matsui, “Linear Cryptanalysis Method for DES Cipher,” 1999, www.cs.bgu.ac.il/~beimel/Courses/crypto2001/Matsui.pdf.
3. C. Swenson, Modern Cryptanalysis: Techniques for Advanced Code Breaking (Wiley, 2008).
4. J. Kelsey, B. Schneier, and D. Wagner, “Mod n Cryptanalysis, with Applications Against RC5P and M6,” 1999, www.schneier.com/paper-mod3.pdf.
5. N. Heninger and H. Shacham, “Reconstructing RSA Private Keys from Random Key Bits,” Advances in Cryptology Lecture Notes in Computer Science, 1 (1) (2009).
6. B. Su, W. Wu, and W. Zhang, “Security of the SMS4 Block Cipher Against Differential Cryptanalysis,” Journal of Computer Science and Technology, 26(1), 130–138 (2011).
7. A. Alekseychuk, L. Kovalchuk, and S. Pal’chenko, “Cryptographic Parameters of s-Boxes that Characterize the Security of GOST-like Block Ciphers Against Linear and Differential Cryptanalysis,” Zakhist Inform, 2, 12–23 (2007).
8. Y. Zhao and W. Qi, “Small Private-Exponent Attack on RSA with Primes Sharing Bits,” Lecture Notes in Computer Science, 4779, 221–229 (2007).
9. O. Aciiçmez and W. Schindler, “A vulnerability in RSA implementations due to instruction cache analysis and its demonstration on OpenSSL,” Proceedings of the 2008 Cryptographer’s Track at the RSA conference on topics in cryptology, http://portal.acm.org/citation.cfm?id=1791688.1791711&coll=DL&dl=GUIDE&CFID=9605139&CFTOKEN=26457223.