Introduction To Network Security: Theory And Practice (2015)
Chapter 2. Data Encryption Algorithms
The history of using secret writing to protect valuable information is probably as long as the history of written language itself. Computer cryptography was created to protect confidential data in digital forms, and it thrives in the Internet era. Data encryption is a critical component of computer cryptography. It uses encryption algorithms and secret keys to transform data from that which is readable to that which is unintelligible. Encryption algorithms must be reversible, so that data can be transformed, using the same secret key, from the unintelligible form back to its original form. Encryption algorithms of this kind are referred to as conventional encryption algorithms or symmetric-key encryption algorithms.
For example, let π be a fixed permutation of the 26 English letters, which maps letter A to π(A), letter B to π(B), and so on through letter Z to π(Z). Replacing each letter in a given English message according to this mapping, we can transform the message to a new form that is unintelligible to an untrained eye. This is a simple encryption algorithm, where the secret key is π. Exercise 1.7 uses this algorithm, where the secret key is FEBDTAIGHKLMNJPQRSOCVWXYZU. Replacing each letter according to the reverse mapping, namely, replacing π(A) with A, π(B) with B, and so on, the data in the new form can be transformed back to its original form. Thus, devising an encryption algorithm is not difficult. What is difficult is to devise good encryption algorithms.
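To make this concrete, the following Python sketch (ours, not part of the text) implements the letter-substitution cipher with the key from Exercise 1.7; symbols other than letters are left unchanged.

```python
# A minimal sketch of the letter-substitution cipher described above, using
# the permuted alphabet from Exercise 1.7 as the secret key.
import string

KEY = "FEBDTAIGHKLMNJPQRSOCVWXYZU"            # maps A->F, B->E, C->B, ...
ENC = dict(zip(string.ascii_uppercase, KEY))  # forward mapping
DEC = dict(zip(KEY, string.ascii_uppercase))  # reverse mapping

def encrypt(message):
    """Replace each letter according to the permutation; keep other symbols."""
    return "".join(ENC.get(ch, ch) for ch in message.upper())

def decrypt(ciphertext):
    """Replace each letter according to the reverse mapping."""
    return "".join(DEC.get(ch, ch) for ch in ciphertext.upper())

print(encrypt("ATTACK AT DAWN"))            # FCCFBL FC DFXJ
print(decrypt(encrypt("ATTACK AT DAWN")))   # ATTACK AT DAWN
```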
Good encryption algorithms must satisfy a number of requirements. This chapter first describes design criteria of encryption algorithms. It then presents several common block cipher encryption algorithms, including Data Encryption Standard (DES), triple-DES, and Advanced Encryption Standard (AES). It also introduces common block cipher modes and the RC4 stream cipher. Finally, it describes how to generate secret keys.
2.1 Data Encryption Algorithm Design Criteria
Any message written over a fixed set of symbols can always be represented as a binary string. A binary string is a sequence of 0's and 1's. Several standard binary encoding schemes, referred to as character code sets, have been established to encode various sets of computer symbols for different written languages. For example, the ASCII code set encodes English letters and other commonly used symbols into binary strings; the GB 2312-80 code set encodes simplified Chinese characters; the EBCDIC code set encodes western European languages; and the ISO 8859 code set encodes accented Latin and non-Latin European languages, including Greek, Semitic, and Hebrew. The Unicode code set and the ISO 10646 code set intend to unify all code sets to encode all languages. Without loss of generality, we assume that plaintext data and ciphertext data are binary strings.
Binary digits 0 and 1 are called bits. To reduce computation overhead costs, encryption algorithms should only use bit operations that are easy to implement on electronic computers. For instance, permuting bits in a binary string is a simple binary operation.
Let X be a binary string. Define the length of X, denoted by |X|, to be the number of bits contained in X. If |X| = l, we say that X is an l-bit binary string.
Let b ∈ {0, 1} and k be a non-negative integer. We use b^k to denote the following k-bit binary string:
b^k = bb ⋯ b (k copies of b).
Let X = x_1 x_2 ⋯ x_m and Y = y_1 y_2 ⋯ y_n be two binary strings, where each x_i, y_j ∈ {0, 1}. We use XY to denote the concatenation of X and Y; that is,
XY = x_1 x_2 ⋯ x_m y_1 y_2 ⋯ y_n.
For clarity, we may also use X ∥ Y to denote the concatenation XY.
2.1.1 ASCII Code
The ASCII code set consists of all 7-bit binary strings (see Appendix A), representing non-negative integers from 0 to 127. The first 32 ASCII codes and the last ASCII code are control codes, which are not displayable. ASCII codes from number 32 to 126 encode uppercase and lowercase English letters, decimal digits, punctuation marks, and arithmetic operation notations. Because a byte that is an 8-bit binary string is the basic storage unit in a computer, we often use one byte to represent one ASCII code by prepending a zero. This allows us to expand the ASCII code set to represent up to 128 extra symbols by setting the leftmost bit in each ASCII code from 0 to 1. Sometimes, we also use the leftmost bit as a parity bit for error detection. In any case, using the 8-bit ASCII code set to encode an English message will result in a binary string of length divisible by 8. Using other code sets such as Unicode to encode data may result in a binary string of length divisible by 16. Without loss of generality, we assume that any plaintext message is encoded as a binary string of length divisible by 8.
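As an illustration (ours, not from the text), the following Python sketch encodes an English message into a binary string of 8-bit ASCII codes and decodes it back.

```python
# Encode an English message as a binary string of 8-bit ASCII codes
# (each 7-bit ASCII code is prepended with a 0 to fill one byte).
def to_bits(message):
    return "".join(format(ord(ch), "08b") for ch in message)

def from_bits(bits):
    return "".join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))

bits = to_bits("FUN")
print(bits)              # 010001100101010101001110
print(len(bits) % 8)     # 0 -- the length is divisible by 8
print(from_bits(bits))   # FUN
```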
2.1.2 XOR Encryption
The exclusive-OR operation, denoted by ⊕ or XOR, is a simple binary operation, where
0 ⊕ 0 = 0,  0 ⊕ 1 = 1,  1 ⊕ 0 = 1,  1 ⊕ 1 = 0.
Thus, for any bit b ∈ {0, 1}, we have
b ⊕ b = 0 and b ⊕ 0 = b.
We can think of the exclusive-OR operation for a single bit as addition modulo 2. Let X = x_1 x_2 ⋯ x_n and Y = y_1 y_2 ⋯ y_n be two n-bit binary strings. Define
X ⊕ Y = (x_1 ⊕ y_1)(x_2 ⊕ y_2) ⋯ (x_n ⊕ y_n).
Thus, X ⊕ X = 0^n and X ⊕ Y ⊕ Y = X.
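The following Python sketch (ours) computes the XOR of two equal-length binary strings and checks the two identities above.

```python
# Bitwise XOR of two equal-length binary strings, illustrating the
# identities X XOR X = 0^n and X XOR Y XOR Y = X.
def xor(x, y):
    assert len(x) == len(y)
    return "".join("1" if a != b else "0" for a, b in zip(x, y))

X = "01000110"
Y = "10011010"
print(xor(X, Y))           # 11011100
print(xor(X, X))           # 00000000  (X XOR X = 0^8)
print(xor(xor(X, Y), Y))   # 01000110  (X XOR Y XOR Y = X)
```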
We use E, D, and K to denote an encryption algorithm, a decryption algorithm, and a secret key, respectively. When E and D appear in the same context, it is understood that D is the reverse algorithm of E.
Let n be a positive integer divisible by 8 and K an n-bit secret key. Divide the plaintext data M into a sequence of blocks
M = M_1 M_2 ⋯ M_m,
where each block M_i is n bits long, except possibly the last block M_m. If |M_m| < n, add an 8-bit control code at the end of M_m once or several times to obtain a new block whose length is exactly n. For example, we may use the control code nl = 00001010 to pad M_m. This procedure is called padding. For simplicity, we still use M_m to represent the new block.
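A simple sketch of this padding procedure (assuming, as above, that the block length n is a multiple of 8):

```python
# Pad a short final block with the 8-bit control code nl = 00001010 until
# its length reaches the block length n.
NL = "00001010"

def pad(block, n):
    while len(block) < n:
        block += NL
    return block

print(pad("01001110", 16))   # 0100111000001010  ('N' followed by one nl code)
```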
An encryption algorithm that encrypts one block at a time is called a block cipher algorithm (or simply block cipher). In a block cipher algorithm, the value of n is often selected to be 64 or 128. When n equals the length of the basic code used in the underlying language, for example, when n = 8, we call the encryption algorithm a stream cipher algorithm. Thus, on the surface, the difference between a block cipher and a stream cipher is the length of the block. In stream cipher algorithms, padding is not needed.
We can use the XOR operation to design an encryption algorithm. Let K be a secret key of length n. The encryption algorithm E encrypts a plaintext block M_i to produce a ciphertext block C_i as follows:
C_i = E_K(M_i) = M_i ⊕ K.
The decryption algorithm D decrypts C_i back into M_i as follows:
D_K(C_i) = C_i ⊕ K = (M_i ⊕ K) ⊕ K = M_i.
XOR encryption is the simplest encryption algorithm. For example, let n = 16 and K = 1001101010011011; then E_K encrypts FUN as follows:
Plaintext:    F          U          N          (Padding)
ASCII:        01000110   01010101   01001110   00001010
Secret key:   10011010   10011011   10011010   10011011
Ciphertext:   11011100   11001110   11010100   10010001
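The computation in this table can be reproduced with the following Python sketch (ours), which encrypts the padded message block by block and then decrypts it.

```python
# Reproduce the XOR-encryption example above: n = 16, K = 1001101010011011,
# plaintext "FUN" padded with nl = 00001010.
def xor(x, y):
    return "".join("1" if a != b else "0" for a, b in zip(x, y))

K = "1001101010011011"
M = "".join(format(ord(c), "08b") for c in "FUN") + "00001010"   # 32 bits after padding

blocks = [M[i:i + 16] for i in range(0, len(M), 16)]
C = "".join(xor(b, K) for b in blocks)                   # encryption: C_i = M_i XOR K
print(C)                                                 # 11011100110011101101010010010001
M2 = "".join(xor(C[i:i + 16], K) for i in range(0, len(C), 16))   # decryption
print(M2 == M)                                           # True
```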
The XOR encryption algorithm is simple and fast. But the resulting level of security is low. For example, eavesdroppers can easily calculate the secret key from a plaintext–ciphertext pair as follows:
M_i ⊕ C_i = M_i ⊕ (M_i ⊕ K) = K.
Attacks such as this that derive secret keys using a small number of samples of ciphertext data and the corresponding plaintext data are referred to as known-plaintext attacks.
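The following sketch shows how a single known plaintext–ciphertext block pair from the example above gives away the key.

```python
# Known-plaintext attack on the XOR cipher: one plaintext/ciphertext block
# pair reveals the key, since M_i XOR C_i = M_i XOR (M_i XOR K) = K.
def xor(x, y):
    return "".join("1" if a != b else "0" for a, b in zip(x, y))

M1 = "0100011001010101"   # first plaintext block ("FU")
C1 = "1101110011001110"   # corresponding ciphertext block
print(xor(M1, C1))        # 1001101010011011 -- the secret key K
```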
To protect XOR encryptions from known-plaintext attacks, users must change encryption keys frequently. If each encryption key is used exactly once, the XOR encryption algorithm offers the best security there is. This security is known as information-theoretic security. This method is referred to as the one-time pad. To implement the one-time-pad scheme, one must first generate a long list of encryption keys sufficient for applications in the foreseeable future, make two identical copies, and distribute the list to the sender and the receiver. Both sides must then use the same keys synchronously and remove a key from the list once it is used. The one-time-pad scheme is secure and simple. However, it is not scalable. Implementing one-time pads for network communications would require each pair of users to generate, transmit, and store a huge number of secret keys, which is a formidable task. Thus, we must explore different methods to devise encryption algorithms that are not only secure, but also practical. On the other hand, although it is not wise to use it alone, the XOR operation remains a major operation employed in all mainstream encryption algorithms.
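For completeness, here is a minimal one-time-pad sketch (ours, using Python's secrets module for key generation); it illustrates the scheme itself, while the key-distribution burden discussed above remains the real obstacle.

```python
# A minimal one-time-pad sketch: a fresh, uniformly random key as long as
# the message, used exactly once and then discarded.
import secrets

def otp_encrypt(message):
    key = secrets.token_bytes(len(message))                  # one-time key
    ciphertext = bytes(m ^ k for m, k in zip(message, key))  # bytewise XOR
    return key, ciphertext

def otp_decrypt(key, ciphertext):
    return bytes(c ^ k for c, k in zip(ciphertext, key))

key, ct = otp_encrypt(b"FUN")
print(otp_decrypt(key, ct))   # b'FUN'
```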
2.1.3 Criteria of Data Encryptions
Encryption keys must be kept secret at all times. Encryption algorithms may also be kept secret. A secret encryption algorithm is by itself a cryptosystem. For example, during World War II while fighting against the Japanese forces, the U.S. Marine Corps used the language spoken by Navajos, a remote native American Indian tribe, to generate secret codes. This encryption scheme was never broken by the Japanese.
Keeping encryption algorithms secret, however, does not help to study and verify the security of these algorithms, nor does it help to establish encryption standards. Thus, we assume that encryption algorithms are publicly disclosed, an assumption called Kerckhoffs' principle. Only encryption keys are to be kept secret. To be practical, encryption keys must be reusable.
Good encryption algorithms must satisfy the following criteria.
2.1.3.1 Efficiency
The operations used in encryption and decryption algorithms must be easy to implement on hardware and software. Executing these algorithms should only consume moderate resources. In particular, the time complexity and the space complexity of the algorithms must each be kept within a small constant factor of the input size.
To achieve efficiency, encryption algorithms should only employ operations that are easy to implement on a computer. The following operations are common in mainstream encryption algorithms: exclusive-OR, permutation, substitution, circular shift, and operations on finite fields. Permutation, substitution, and circular shift are unary operations. The circular shift operation is a special form of permutation. A permutation is a one-to-one mapping, while a substitution is a many-to-one mapping.
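The following sketch (ours) illustrates three of these unary operations on an 8-bit string; the permutation table and the 4-bit substitution table are made up for illustration and are not taken from any standard algorithm.

```python
# Illustrative examples of permutation, circular shift, and substitution
# applied to an 8-bit binary string.
def permute(bits, table):
    """Permutation: a one-to-one rearrangement of bit positions."""
    return "".join(bits[i] for i in table)

def circular_left_shift(bits, k):
    """Circular shift: a special permutation that rotates the string."""
    k %= len(bits)
    return bits[k:] + bits[:k]

def substitute(bits, sbox):
    """Substitution: replace each 4-bit group by a table lookup
    (substitutions in general need not be one-to-one)."""
    out = ""
    for i in range(0, len(bits), 4):
        out += format(sbox[int(bits[i:i + 4], 2)], "04b")
    return out

X = "01000110"
print(permute(X, [7, 6, 5, 4, 3, 2, 1, 0]))   # 01100010 (bits reversed)
print(circular_left_shift(X, 3))              # 00110010
S = [0x9, 0x4, 0xA, 0xB, 0xD, 0x1, 0x8, 0x5,
     0x6, 0x2, 0x0, 0x3, 0xC, 0xE, 0xF, 0x7]  # a toy 4-bit substitution table
print(substitute(X, S))                       # 11011000
```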
2.1.3.2 Resistance to Statistical Analysis
Encryption algorithms must destroy any statistical structure in the plaintext data, making any statistical analysis useless. The common way to achieve this goal is to require encryption algorithms to satisfy the diffusion and confusion requirements.
1. Diffusion means that a change of a single bit in the plaintext string will cause a number of bits in the ciphertext string to change. These bits should be distributed as evenly as possible across the ciphertext string. In other words, every single bit in the ciphertext data is affected by a number of bits evenly spread across the plaintext string.
2. Confusion means that a change of a single bit in the encryption key will cause a number of bits in the ciphertext string to change. These bits should be distributed as evenly as possible across the ciphertext string. In other words, every single bit in the ciphertext data is affected by a number of bits evenly spread across the encryption key.
Diffusion and confusion are also referred to as avalanche effects.
Diffusion may be achieved by repeatedly executing a fixed sequence of operations for a fixed number of rounds on strings generated from the previous round.
Confusion may be achieved by the following method:
1. Generate a number of subkeys from the encryption key.
2. Use the first subkey to operate on the plaintext string in the first round.
3. Use each subsequent subkey in each subsequent round to operate on the new string generated from the previous round.
Combining these two methods in a coherent way, we may be able to obtain an encryption algorithm that offers both diffusion and confusion.
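The toy cipher sketched below (ours; not a real algorithm) combines the two methods: subkeys derived from the encryption key are mixed in each round, and each round also applies a substitution and a bit permutation. Counting how many ciphertext bits flip when a single plaintext bit or key bit is flipped gives a rough picture of diffusion and confusion.

```python
# A toy 16-bit iterated cipher (for illustration only): each round XORs in a
# subkey, substitutes each 4-bit group, and permutes the 16 bit positions.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]                 # toy 4-bit S-box
PERM = [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15]   # bit permutation

def subkeys(key16, rounds=4):
    """Derive one 16-bit subkey per round by rotating the key left by r bits."""
    return [((key16 << r) | (key16 >> (16 - r))) & 0xFFFF for r in range(rounds)]

def encrypt(block16, key16, rounds=4):
    state = block16
    for sk in subkeys(key16, rounds):
        state ^= sk                                                       # mix in subkey
        state = sum(SBOX[(state >> (4 * i)) & 0xF] << (4 * i) for i in range(4))
        state = sum(((state >> i) & 1) << PERM[i] for i in range(16))     # permute bits
    return state

def flipped_bits(a, b):
    return bin(a ^ b).count("1")

M, K = 0b0100011001010101, 0b1001101010011011
print(flipped_bits(encrypt(M, K), encrypt(M ^ 1, K)))   # ciphertext bits changed by flipping 1 plaintext bit
print(flipped_bits(encrypt(M, K), encrypt(M, K ^ 1)))   # ciphertext bits changed by flipping 1 key bit
```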
2.1.3.3 Resistance to Brute-Force Attacks
Suppose that the encryption key K is k bits long. After eavesdropping a ciphertext message C, the eavesdropper could use brute force to decipher C by calculating D_K'(C) for each k-bit binary string K'. If D_K'(C) is readable and makes sense, then it would likely be the original plaintext message. As there are 2^k different k-bit binary strings, such a brute-force attack incurs a time complexity in the magnitude of 2^k. Thus, k must be sufficiently large to thwart brute-force attacks.
This time complexity of 2^k is often used as a benchmark to determine the effectiveness of a cryptanalysis method. If a cryptanalysis method can break an encryption algorithm with a time complexity much less than 2^k, then this method is considered useful.
What value of k (the length of the key) would be sufficient depends on computing technologies in the near future. It is a common belief that using k ≥ 128 would be sufficient for many years to come.
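To illustrate the scale involved, the following sketch (ours) brute-forces the 16-bit XOR cipher of Section 2.1.2: it tries all 2^16 keys and keeps those whose candidate plaintext consists of printable ASCII. With k = 16 the search finishes in a fraction of a second; with k = 128 the 2^128 trials are far beyond reach.

```python
# Brute-force attack on the 16-bit XOR cipher from Section 2.1.2.
C = bytes([0b11011100, 0b11001110, 0b11010100, 0b10010001])   # ciphertext of "FUN" + nl

def decrypt(ciphertext, key_bytes):
    return bytes(c ^ key_bytes[i % 2] for i, c in enumerate(ciphertext))

candidates = []
for key in range(2 ** 16):                        # try every 16-bit key
    kb = key.to_bytes(2, "big")
    m = decrypt(C, kb)
    if all(32 <= b < 127 or b == 10 for b in m):  # printable ASCII or newline
        candidates.append((kb, m))

print(len(candidates))                                                  # many keys survive a 4-byte sample
print(bytes([0b10011010, 0b10011011]) in [k for k, _ in candidates])    # True: the real key is among them
```

Note that with only four ciphertext bytes many candidate keys survive the printable-text filter; a longer ciphertext sample with known structure narrows the search to the true key.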
2.1.3.4 Resistance to Other Attacks
Encryption algorithms must also resist other types of attacks. In addition to the known-plaintext attacks, these attacks include chosen-plaintext attacks and mathematical attacks.
In a chosen-plaintext attack, the attacker chooses a particular plaintext message as bait and lures his opponents into encrypting it, where the chosen plaintext contains information useful to the attacker. During World War II, for example, the intelligence department of the U.S. Pacific Fleet suspected that a certain code frequently occurring in intercepted Japanese encrypted messages meant "Midway Atoll". To confirm this suspicion, the U.S. intelligence deliberately had a plaintext message sent out requesting a replacement for a broken facility at Midway, luring the Japanese intelligence into encrypting a reference to it. From this, they were able to confirm that their suspicion was indeed correct.
In mathematical attacks, the attacker uses mathematical methods to decipher encrypted messages. These methods include differential cryptanalysis, linear cryptanalysis, and algebraic cryptanalysis. Detailed discussions of these attacks are beyond the scope of this book.
2.1.4 Implementation Criteria
Implementations of encryption algorithms must resist side-channel attacks. Side-channel attacks do not attack the algorithms directly. Instead, they exploit loopholes in the implementation environments. For example, the timing attack is a common side-channel attack. In a timing attack, the attacker analyzes the computing time of certain operations, which may help obtain useful information about the encryption key. Timing attacks can be effective if the running time of certain operations in the underlying encryption algorithm fluctuates substantially depending on the bit values of the encryption key.
One way to combat timing attacks is to flatten the computation-time differences between instructions, for example, by executing a few redundant operations along the code paths that would otherwise take much less time to execute.
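As a concrete illustration (ours), the sketch below contrasts an early-exit comparison, whose running time depends on how many leading bytes of a guess are correct, with a flattened comparison that always touches every byte; Python's standard library provides the same protection via hmac.compare_digest.

```python
# Flattening timing differences in a comparison involving secret data.
import hmac

def leaky_equal(secret, guess):
    """Early-exit comparison: returns sooner the earlier a mismatch occurs,
    which leaks information through timing."""
    if len(secret) != len(guess):
        return False
    for s, g in zip(secret, guess):
        if s != g:
            return False
    return True

def constant_time_equal(secret, guess):
    """Accumulate differences over every byte; no early exit."""
    if len(secret) != len(guess):
        return False
    diff = 0
    for s, g in zip(secret, guess):
        diff |= s ^ g
    return diff == 0

print(constant_time_equal(b"secret key", b"secret key"))   # True
print(hmac.compare_digest(b"secret key", b"secret key"))   # True (standard-library equivalent)
```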