Foundations of Coding: Compression, Encryption, Error Correction (2015)

Chapter 3. Cryptology

3.4 Public Key Cryptography

3.4.1 Motivations and Main Principles

We have seen that symmetric cryptosystems can be almost secure and efficient in terms of computation time. Nevertheless, from the middle of the seventies, new questions were raised:

· Before using a symmetric cryptosystem, how is a key agreed upon?

· How is a secure communication between two entities with no preliminary key exchange established ?

In order to answer these questions, Diffie and Hellman laid the foundations of public key cryptosystems in 1976, using the analogy with a mail-box and assuming that Bob is the only person having the key:

· any person is able to send a message to Bob;

· only Bob is able to read the messages stored in his mail- box.

On the other hand, a symmetric cryptosystem can be seen as a safe whose key is shared between Alice and Bob.

Keeping in mind such a system and in order to keep the notation introduced in Equation (3.1 ), one has . To be more precise, is a public key that is published in some kind of directory (actually in the form of a certificate: see Section 3.6.2) in such way that anyone is able to obtain this key, check its origin, and encrypt a message with it. is a private and secret key that is kept and used by Bob in order to decrypt the ciphertexts. As the two keys are distinct, public key encryption is also called asymmetric encryption(Figure 3.16).

nfg016

Figure 3.16 Principle of public key encryption

Is it possible to give the characteristics of such a cryptosystem using the notions of information theory? In such a context, the measure of the secrecy consists in computing the remaining information in the private key, assuming that the public key is known: . Besides, an asymmetric algorithm must satisfy with the functions D and E publicly known. This point implies that the key is completely determined once the public key is known: one only has to invert the function . In theory, at least, it implies that and, thus, that an asymmetric system has no secret at all. In practice, the two keys and are linked but they are chosen in such way that it is far too difficult to compute the value of the key from .

For this purpose, one uses the principle of one-way functions, which was introduced in Section 1.3.3.4. Therefore, the secret of a public key cryptosystem cannot be characterized by Shannon's information theory; this secret does not come from the uncertainty of the private key but rather from the inner difficulty of computing from the only and the ciphertext C. The mathematical tool that enables one to characterize this difficulty is the theory of algorithmic complexity.

The most well-known and most widely used public key encryption algorithm—mainly because of its ease of use—is the RSA system, whose name comes from its designers, Rivest, Shamir, and Adleman. The difficulty of RSA is based on the difficulty of factoring an integer.

3.4.2 Rivest-Shamir-Adleman (RSA) Encryption

When combining the Euler theorem and the Chinese Remainder theorem (see page 38), one is able to obtain a common congruence relation for all elements in when n is the product of two prime numbers. This result, which was presented by Rivest, Shamir, and Adleman in 1978, is at the origin of RSA public key cryptography.

Theorem 24 (RSA Theorem)

Let be the product of two prime numbers.Let a be any element in . Then, for any positive integer k, one has .

Proof

There are two cases: if a is invertible in , then one directly applies the Euler theorem and thus . Computing the power of this relation and multiplying it by a, one obtains the desired relation.

If a is zero, the relation modulo n is immediate. Otherwise, one uses the Fermat theorem twice modulo p and modulo q.

Indeed, if a is nonzero modulo p prime, then . Computing the power and multiplying by a, one obtains .

On the other hand, if , then one has necessarily . In the same way, one obtains a similar relation modulo q: .

As p and q are coprime, it is enough to apply the uniqueness of the Chinese Remainder theorem to have .

From this statement, one deduces the RSA cryptosystem. In such system, a user generates a pair (public key and private key) using the following procedure:

1. Choose two large primes p and q. p and q must contain at least 150 digits each.

2. Compute

3. Choose a small integer e coprime with

4. Compute d, the inverse of e modulo .

5. Release the pair as the RSA public key.

6. Keep the pair secret: this pair is the RSA private key.

Therefore, one has

1. RSA encryption:

2. RSA decryption:

One easily checks that, for , one has . Indeed, , according to RSA theorem.

Exercise 3.16 (RSA Encryption)

1. Determine the public key and the private key for and with (and justify the validity of this choice).

2. Encrypt the letter B in ASCII code (66) using the public key and check that the private key actually enables one to recover the original message.

Solution on page 310.

Exercise 3.17 (RSA Encryption/Decryption)

We now consider the toy RSA public key .

1. What is the ciphertext associated with the message using this key?

2. Compute the private key corresponding to the public key .

3. Decrypt the message . One may use the following result: .

4. Is it possible to obtain a ciphertext equal to 625 using this public key? Same question with the private key.

Solution on page 311.

Exercise 3.18 (Vigenère Encryption and RSA)

We consider a cryptosystem over the alphabet where each symbol in the alphabet is represented with a number between 0 and 26 as follows: ngr003

Given a key and a plaintext , the ciphertext is given by

This is what we call Vigenère encryption. Here is a French ciphertext that Oscar intercepted for you:

HWQIO QVPIF TDIHT Y-WAF NGY-F COMVI CGEVZ CVIAF JDFZK

YLYHG YGEHR SHMMX CVHBF AJYKN ZIXHP ZHEQY YJRHT YWMUK

YKPBY YGEHA G-DY- YWDTF MHFZK YZYHX CISVI CHIVZ}

1. Propose a method for the cryptanalysis of the Vigenère encryption starting by determining the length of the key.

2. Oscar and Eve also work on the cryptanalysis of this text they have intercepted. Oscar sends to Eve the length of the key that he has managed to determine. For this, he uses the RSA public key of Eve . Ivan intercepts the RSA ciphertext containing the length of the key: he obtains 10. What is the length of the key?

3. Eve has managed to decrypt the second and the third encryption key. She sends them to Oscar using the RSA public key of the latter . In the same way, Ivan intercepts the ciphertexts of (he obtains 48) and (4). What are the values of and ?

Note: one may use the following results:

and .

4. Now that you have the length of the key, decrypt this text. Specify the key used for the encryption and the decryption (in the form of a string of characters). The repartition (in %) of the symbols used in a text written in French is summarized in Table 3.11.

Solution on page 311.

Table 3.11 Frequential Repartition of Symbols in a Text Written in French

–	18.35%
E	14.86%
S	6.97%
A	6.40%
N	6.23%
…	…

3.4.2.1 Efficiency and Robustness of RSA

Thanks to the algorithms that we have seen in the previous sections:

· It is easy to generate large prime numbers, at least when accepting an error margin (see Miller–Rabin test, Section 1.3.4.1, on page 44). In the case of RSA, the error is not very dangerous. Indeed, if one commits an error and believes that p and q are prime numbers, one will soon realize that they are not: either the key d is not invertible or some parts of the decrypted message are incomprehensible. In this case, one is still able to change the RSA keys (by recomputing p and q);

· Computing the pair is very easy: it is enough to apply the extended Euclidean algorithm;

· Finally, encryption and decryption are performed using modular exponentiation. We have seen that such exponentiation can be performed quite efficiently.

The security provided by RSA mainly relies on the difficulty of factoring large integers. Indeed, if an attacker is able to factorize the number of the public key, then he is able to deduce directly and thus to compute the private key from the public key using the extended Euclidean algorithm. Therefore, if there exists a fast algorithm for factoring large numbers, breaking RSA also becomes easy.

After 20 years of research, no more efficient method than factoring n has been published to break RSA. However, the reciprocal proposition : “ if factoring large numbers is hard then breaking an RSA ciphertext is hard” has not been proved. In other words, today, one does not know if breaking RSA is as difficult as factoring n.

However, one is able to demonstrate the equivalence between “determining d from ” and “factoring n.” One way is trivial: knowing p and q, one is able to easily compute d for this is what is done during the key generation. Reciprocally, computing p and q from can be obtained:

· by a deterministic algorithm if e is small (typically —see Exercise 3.19);

· by a probabilistic algorithm (a variant of Miller–Rabin algorithm) in general (Exercise 3.20).

Exercise 3.19 (Factoring n from n, e, and d for e Small)

We consider the RSA cryptosystem, with as keys, where e is “small.”

1. Prove that there exists , such that .

2. One assumes that p and q are different from 2 and 3. Prove that .

3. Then propose an algorithm for factoring n.

Solution on page 313.

Exercise 3.20 (Factoring n from n, e, and d)

One considers an RSA cryptosystem . Let , such that divides (In other words: ).

1. Show that there exists a variant of the Miller–Rabin algorithm that enables one to factorize n, that is, to determine p and q. The algorithm should perform only a single attempt and should return either the factors of n or an error.

2. What is the average number of calls of this algorithm that one should perform in order to find out the factorization of n?

Solution on page 313.

Hence, breaking RSA by computing the private key d from the public key is as difficult as factoring n (this problem is known to be difficult if p and q are very large). The confidence in RSA relies on this statement. Current factoring limits deal with numbers of approximately 232 decimal digits (so far, the current record⁶ is held by Kleinjung, Aoki, Franke, Lenstra, Thomé, Bos, Gaudry, Kruppa, Montgomery, Osvik, te Riele, Timofeev, and Zimmermann, obtained in December 2009, during the RSA-768 challenge). Today, one considers that a modulo n of 2048 bits is secure.

3.4.2.2 Attacks on RSA

The key generation which was presented previously is a simplified version, and in practice, one should be careful when generating the keys. These precautions are the results of attacks exploiting some relations between the RSA encryption parameters. Some attacks will be presented in the sequel in the form of exercises. Recommendations concerning the implementation of RSA-based cryptosystems (key generation, etc.) are summarized in the PKCS#1 standard. This standard is regularly updated (depending on the breakthroughs in the cryptanalysis of RSA).

Exercise 3.21 (Common Exponent Attack on RSA)

The RSA public keys of William, Jack, and Averell are, respectively, equal to , and . Joe sends the same message x to each one of them, with , using their respective public keys.

Prove that Lucky Luke, who is able to obtain , and on the network, can easily compute x.

Hint. One recalls (or assumes !) that for a and k integers, the method of Newton–Raphson (convergence of the sequence ) enables one to compute very quickly the value of , with an arithmetic complexity bound of the order of .

Solution on page 315.

Exercise 3.22 (Common Modulus Attack on RSA)

An implementation of RSA gives the same modulus n (product of two prime numbers) to Alice and Bob but with two different pairs of keys and . Moreover, one assumes that and are coprime (it is true in general). Then, one assumes that Alice and Bob receive the same message M encrypted using their respective public key. Oscar intercepts both messages and , which are known to be two ciphertexts of the same plaintext. Show that Oscar is then able to easily recover the message M.

Solution on page 316.

Exercise 3.23 (Well Chosen Ciphertext Attack on RSA)

Eve intercepts the ciphertext c sent by Bob to Alice: . In order to decrypt c, Eve proceeds in the following way:

1. Eve chooses an integer randomly and computes x:= ;

2. Eve computes ;

3. Eve asks Alice to sign y using her private key; Alice sends to Eve.

Show that Eve is now able to recover the message m sent by Bob. Morality?

Solution on page 316.

Exercise 3.24 (Factorial Attack or Pollard p-1 Attack)

Let p and q be the two prime numbers that are used to build the modulus of an RSA public key. The integer is even; let B be the smallest integer such that (factorial of B) is a multiple of . In other words, there exists an integer such that:

1. Let be a prime factor of . Show that .

2. Let be an integer. Show that .

3. Let ; deduce that is a multiple of p.

4. Let k be an integer; what is the cost of computing ?

5. Deduce a bound on the cost of computing with respect to B and .

6. If is a divisor of , with B small, then an attacker who knows n but neither p nor q is able to perform the following attack: assuming that C is a small upper bound of B (), he computes

7. Prove that, if G is neither equal to 1 nor n, then an attacker has necessarily broken RSA with the modulo n.

8. One assumes that is a divisor of with ; moreover, one assumes that is small (for instance ).

Show that this attack enables one to break RSA. Give an upper bound on the cost of this attack.

9. What is the counter-measure ?

Solution on page 316.

3.4.3 El Gamal Encryption

While RSA relies on the problem of factoring integers, El Gamal encryption exploits the Discrete Logarithm Problem (DLP), which was studied in Section 1.3.3.3. The key generation is performed in the following way:

· Let p be a prime number for which the DLP is difficult in .

· Let be a primitive root.

· Let s be an integer and .

· Then, the public key is the triplet .

· The corresponding private key is .

1. El Gamal encryption: Let M be the plaintext one wishes to encrypt and let be a secret random number. One has

equation

2. El Gamal decryption: For , one defines

Indeed .

Therefore, it is possible to easily encrypt a message using modular exponentiation. Decryption without the private key is equivalent to solving the discrete logarithm.

However, when one has the private key, it is enough to perform a modular exponentiation, followed by the extended Euclidean algorithm to find the inverse of an element in . The main interest of this kind of encryption is that, for a given level of security, it can work with smaller numbers – hence quicker –than RSA. Indeed, the complexity of the attacks performed on RSA and DLP make it that the size of the RSA parameters must be roughly twice as large as those required for an El Gamal encryption.

Besides, as DLP can be applied in any group, one is able to extend El Gamal to other groups in which the Discrete Logarithm Problem is much more difficult. For instance, this is the case for the group of points on an elliptic curve, see Section 1.3.6, in which index calculus attacks do not succeed. On the basis of the conjectured complexity bound of the currently known attacks on RSA and Elliptic Curve Cryptography (ECC), the NESSIE⁷ project as well as the U.S. NIST thus proposed the equivalence guide of Table 3.12.

Table 3.12 Conjectured Compared Security of Block Ciphers

	80	112	128	196	256
bits	Skipjack	3-DES	AES-small	AES-medium	AES-large
RSA	1024	2048	3072	7168	13312
ECC	192	224	256	384	521

Indeed, the only known attacks for the discrete logarithm in elliptic curve groups are of baby-step/giant-step or Pollard's rho type, with complexity bound , where n is the number of points in the curve. Thus, the recommended number of points in the elliptic curve (and therefore the cardinal number of the underlying finite field) should be around , where t is the required security.