MODULAR ARITHMETIC WITH THE MULTIPLICATIVE AND AFFINE CIPHERS - Hacking Secret Ciphers with Python (2013)

Hacking Secret Ciphers with Python (2013)

Chapter 14: MODULAR ARITHMETIC WITH THE MULTIPLICATIVE AND AFFINE CIPHERS

Topics Covered In This Chapter:

· Modular Arithmetic

· “Mod” is “Remainder Of”(Sort Of)

· GCD: Greatest Common Divisor (aka Greatest Common Factor)

· Multiple Assignment Trick

· Euclid’s Algorithm for Finding the GCD of Two Numbers

· “Relatively Prime”

· The Multiplicative Cipher

· Finding Modular Inverses

· The cryptomath Module

“People have been defending their own privacy for centuries with whispers, darkness, envelopes, closed doors, secret handshakes, and couriers. The technologies of the past did not allow for strong privacy, but electronic technologies do.”

Eric Hughes, “A Cypherpunk's Manifesto”, 1993

The multiplicative and affine ciphers are similar to the Caesar cipher, except instead of adding a key to a symbol’s index in a string, these ciphers use multiplication. But before we learn how to encrypt and decrypt with these ciphers, we’re going to need to learn a little math. This knowledge is also needed for the last cipher in this book, the RSA cipher.

Oh No Math!

Don’t let it scare you that you need to learn some math. The principles here are easy to learn from pictures, and we’ll see that they are directly useful in cryptography.

Math Oh Yeah!

That’s more like it.

Modular Arithmetic (aka Clock Arithmetic)

This is a clock in which I’ve replaced the 12 with a 0. (I’m a programmer. I think it’s weird that the day begins at 12 AM instead of 0 AM.) Ignore the hour, minute, and second hands. We just need to pay attention to the numbers.

Figure 14-1. A clock with a zero o’clock.

3 O’Clock + 5 Hours = 8 O’Clock

If the current time is 3 o’clock, what time will it be in 5 hours? This is easy enough to figure out. 3 + 5 = 8. It will be 8 o’clock. Think of the hour hand on the clock in Figure 14-1 starting at 3, and then moving 5 hours clockwise. It will end up at 8. This is one way we can double-check our math.

10 O’Clock + 5 Hours = 3 O’Clock

If the current time is 10 o’clock, what time will it be in 5 hours? If you add 10 + 5, you get 15. But 15 o’clock doesn’t make sense for clocks like the one to the right. It only goes up to 12. So to find out what time it will be, we subtract 15 – 12 = 3. The answer is it will be 3 o’clock. (Whether or not it is 3 AM or 3PM depends on if the current time is 10 AM or 10 PM. But it doesn’t matter for modular arithmetic.)

If you think of the hour hand as starting at 10 and then moving forward 5 hours, it will land on 3. So double-checking our math by moving the hour hand clockwise shows us that we are correct.

10 O’Clock + 200 Hours = 6 O’Clock

If the current time is 10 o’clock, what time will it be in 200 hours? 200 + 10 = 210, and 210 is larger than 12. So we subtract 210 – 12 = 198. But 198 is still larger than 12, so we subtract 12 again. 198 – 12 = 186. If we keep subtracting 12 until the difference is less than 12, we end up with 6. If the current time is 10 o’clock, the time 200 hours later will be 6 o’clock.

If we wanted to double check our 10 o’clock + 200 hours math, we would keep moving the hour hand around and around the clock face. When we’ve moved the hour hand the 200 th time, it will end up landing on 6.

The % Mod Operator

This sort of “wrap-around” arithmetic is called modular arithmetic. We say “fifteen mod twelve” is equal to 3. (Just like how “15 o’clock” mod twelve would be “3 o’clock.) In Python, the mod operator is the % percent sign. Try typing the following into the interactive shell:

>>> 15 % 12

3

>>> 210 % 12

6

>>> 10 % 10

0

>>> 20 % 10

0

>>>

“Mod” is “Division Remainder”(Sort Of)

You can think of the mod operator as a “division remainder” operator. 21 ÷ 5 = 4 remainder 1. And 21 % 5 = 1. This works pretty well for positive numbers, but not for negative numbers. -21 ÷ 5 = -4 remainder -1. But the result of a mod operation will never be negative. Instead, think of that -1 remainder as being the same as 5 – 1, which comes to 4. This is exactly what -21 % 5 evaluates to:

>>> -21 % 5

4

>>>

But for the purposes of cryptography in this book, we’ll only be modding positive numbers.

Practice Exercises, Chapter 14, Set A

Practice exercises can be found at http://invpy.com/hackingpractice14A.

GCD: Greatest Common Divisor (aka Greatest Common Factor)

Factors are the numbers that can be multiplied to produce a particular number. Look at this simple multiplication:

4 × 6 = 24

In the above math problem, we say 4 and 6 are factors of 24. (Another name for factor is divisor.) The number 24 also has some other factors:

8 × 3 = 24

12 × 2 = 24

24 × 1 = 24

From the above three math problems, we can see that 8 and 3 are also factors of 24, as are 12 and 2, and 24 and 1. So we can say the factors of 24 are: 1, 2, 3, 4, 6, 8, 12, and 24.

Let’s look at the factors of 30:

1 × 30 = 30

2 × 15 = 30

3 × 10 = 30

5 × 6 = 30

So the factors of 30 are 1, 2, 3, 5, 6, 10, 15, and 30. (Notice that any number will always have 1 and itself as factors.) If you look at the list of factors for 24 and 30, you can see that the factors that they have in common are 1, 2, 3, and 6. The greatest number of these is 6, so we call 6 the greatest common factor (or, more commonly, the greatest common divisor) of 24 and 30.

Visualize Factors and GCD with Cuisenaire Rods

Figure 14-2. Each Cuisenaire rod has a different color for each integer length.

Above are some rectangular blocks with a width of 1 unit, 2 units, 3 units, and so on. The block’s length can be used to represent a number. You can count the number of squares in each block to determine the length and number. These blocks (sometimes called Cuisenaire rods) can be used to visualize math operations, like 3 + 2 = 5 or 5 × 3 = 15:

Figure 14-3. Using Cuisenaire rods to demonstrate addition and multiplication.

If we represent the number 30 as a block that is 30 units long, a number is a factor of 30 if the number’s blocks can evenly fit with the 30-block. You can see that 3 and 10 are factors of 30:

Figure 14-4. Cuisenaire rods demonstrating factors.

But 4 and 7 are not factors of 30, because the 4-blocks and 7-blocks won’t evenly fit into the 30-block:

Figure 14-5. Cuisenaire rods demonstrating numbers that are not factors of 30.

The Greatest Common Divisor of two blocks (that is, two numbers represented by those blocks) is the longest block that can evenly fit both blocks.

Figure 14-6. Cuisenaire rods demonstrating Greatest Common Divisor.

More information about Cuisenaire rods can be found at http://invpy.com/cuisenaire.

Practice Exercises, Chapter 14, Set B

Practice exercises can be found at http://invpy.com/hackingpractice14B.

Multiple Assignment

Our GCD function will use Python’s multiple assignment trick. The multiple assignment trick lets you assign more than one variable with a single assignment statement. Try typing the following into the interactive shell:

>>> spam, eggs = 42, 'Hello'

>>> spam

42

>>> eggs

'Hello'

>>> a, b, c, d = ['Alice', 'Bob', 'Carol', 'David']

>>> a

'Alice'

>>> b

'Bob'

>>> c

'Carol'

>>> d

'David'

>>>

The variable names on the left side of the = operator and the values on the right side of the = operator are separated by a comma. You can also assign each of the values in a list to its own variable, if the number of items in the list is the same as the number of variables on the left side of the =operator.

Be sure to have the same number of variables as you have values, otherwise Python will raise an error that says the call needs more or has too many values:

>>> a, b, c = 1, 2

Traceback (most recent call last):

File "<pyshell#8>", line 1, in <module>

a, b, c = 1, 2

ValueError: need more than 2 values to unpack

>>> a, b, c = 1, 2, 3, 4, 5, 6

Traceback (most recent call last):

File "<pyshell#9>", line 1, in <module>

a, b, c = 1, 2, 3, 4, 5, 6

ValueError: too many values to unpack

>>>

Swapping Values with the Multiple Assignment Trick

One of the main uses of the multiple assignment trick is to swap the values in two variables. Try typing the following into the interactive shell:

>>> spam = 'hello'

>>> eggs = 'goodbye'

>>> spam, eggs = eggs, spam

>>> spam

'goodbye'

>>> eggs

'hello'

We will use this swapping trick in our implementation of Euclid’s algorithm.

Euclid’s Algorithm for Finding the GCD of Two Numbers

Figuring out the GCD of two numbers will be important for doing the multiplicative and affine ciphers. It seems simple enough: just look at the numbers and write down any factors you can think of, then compare the lists and find the largest number that is in both of them.

But to program a computer to do it, we’ll need to be more precise. We need an algorithm (that is, a specific series of steps we execute) to find the GCD of two numbers.

A mathematician who lived 2,000 years ago named Euclid came up with an algorithm for finding the greatest common divisor of two numbers. Here’s a statue of Euclid at Oxford University:

Figure 14-7. Euclid may or may not have looked like this.

Of course since no likeness or description of Euclid exists in any historical document, no one knows what he actually looked like at all. (Artists and sculptors just make it up.) This statue could also be called, “Statue of Some Guy with a Beard”.

Euclid’s GCD algorithm is short. Here’s a function that implements his algorithm as Python code, which returns the GCD of integers a and b:

def gcd(a, b):

while a != 0:

a, b = b % a, a

return b

If you call this function from the interactive shell and pass it 24 and 30 for the a and b parameters, the function will return 6. You could have done this yourself with pencil and paper. But since you’ve programmed a computer to do this, it can easily handle very large numbers:

>>> gcd(24, 30)

6

>>> gcd(409119243, 87780243)

6837

>>>

How Euclid’s algorithm works is beyond the scope of this book, but you can rely on this function to return the GCD of the two integers you pass it.

“Relatively Prime”

Relatively prime numbers are used for the multiplicative and affine ciphers. We say that two numbers are relatively prime if their greatest common divisor is 1. That is, the numbers a and b are relatively prime to each other if gcd(a, b) == 1.

Practice Exercises, Chapter 14, Set C

Practice exercises can be found at http://invpy.com/hackingpractice14C.

The Multiplicative Cipher

In the Caesar cipher, encrypting and decrypting symbols involved converting them to numbers, adding or subtracting the key, and then converting the new number back to a symbol.

What if instead of adding the key to do the encryption, we use multiplication? There would be a “wrap-around” issue, but the mod operator would solve that. For example, let’s use the symbol set of just uppercase letters and the key 7. Here’s a list of the letters and their numbers:

0

1

2

3

4

5

6

7

8

9

10

11

12

A

B

C

D

E

F

G

H

I

J

K

L

M

13

14

15

16

17

18

19

20

21

22

23

24

25

N

O

P

Q

R

S

T

U

V

W

X

Y

Z

To find what the symbol F encrypts to with key 7, multiply its number (5) by 7 and mod by 26 (to handle the “wrap-around” with our 26-symbol set). Then use that number’s symbol. (5 × 7) mod 26 = 9, and 9 is the number for the symbol J. So F encrypts to J in the multiplicative cipher with key 7. Do the same with all of the letters:

Table 14-1. Encrypting each letter with the multiplicative cipher with key 7.

Plaintext

Symbol

Number

Encryption with Key 7

Ciphertext

Symbol

A

0

(0 * 7) % 26 = 0

A

B

1

(1 * 7) % 26 = 7

H

C

2

(2 * 7) % 26 = 14

O

D

3

(3 * 7) % 26 = 21

V

E

4

(4 * 7) % 26 = 2

C

F

5

(5 * 7) % 26 = 9

J

G

6

(6 * 7) % 26 = 16

Q

H

7

(7 * 7) % 26 = 23

X

I

8

(8 * 7) % 26 = 4

E

J

9

(9 * 7) % 26 = 11

L

K

10

(10 * 7) % 26 = 18

S

L

11

(11 * 7) % 26 = 25

Y

M

12

(12 * 7) % 26 = 6

G

N

13

(13 * 7) % 26 = 13

N

O

14

(14 * 7) % 26 = 20

U

P

15

(15 * 7) % 26 = 1

B

Q

16

(16 * 7) % 26 = 8

I

R

17

(17 * 7) % 26 = 15

P

S

18

(18 * 7) % 26 = 22

W

T

19

(19 * 7) % 26 = 3

D

U

20

(20 * 7) % 26 = 10

K

V

21

(21 * 7) % 26 = 17

R

W

22

(22 * 7) % 26 = 24

Y

X

23

(23 * 7) % 26 = 5

F

Y

24

(24 * 7) % 26 = 12

M

Z

25

(25 * 7) % 26 = 19

T

You will end up with this mapping for the key 7: to encrypt you replace the top letter with the letter under it, and vice versa to decrypt:

A

B

C

D

E

F

G

H

I

J

K

L

M

N

O

P

Q

R

S

T

U

V

W

X

Y

Z

A

H

O

V

C

J

Q

X

E

L

S

Y

G

N

U

B

I

P

W

D

K

R

Y

F

M

T

It wouldn’t take long for an attacker to brute-force through the first 7 keys. But the good thing about the multiplicative cipher is that it can work with very large keys, like 8,953,851 (which has the letters of the alphabet map to the letters AXUROLIFCZWTQNKHEBYVSPMJGD). It would take quite some time for a computer to brute-force through nearly nine million keys.

Practice Exercises, Chapter 14, Set D

Practice exercises can be found at http://invpy.com/hackingpractice14D.

Multiplicative Cipher + Caesar Cipher = The Affine Cipher

One downside to the multiplicative cipher is that the letter A always maps to the letter A. This is because A’s number is 0, and 0 multiplied by anything will always be 0. We can fix this by adding a second key that performs a Caesar cipher encryption after the multiplicative cipher’s multiplication and modding is done.

This is called the affine cipher. The affine cipher has two keys. “Key A” is the integer that the letter’s number is multiplied by. After modding this number by 26, “Key B” is the integer that is added to the number. This sum is also modded by 26, just like in the original Caesar cipher.

This means that the affine cipher has 26 times as many possible keys as the multiplicative cipher. It also ensures that the letter A does not always encrypt to the letter A.

Figure 14-8. The encryption and decryption are mirrors of each other.

The First Affine Key Problem

There are two problems with the multiplicative cipher’s key and affine cipher’s Key A. You cannot just use any number for Key A. For example, if you chose the key 8, here is the mapping you would end up with:

A

B

C

D

E

F

G

H

I

J

K

L

M

N

O

P

Q

R

S

T

U

V

W

X

Y

Z

A

I

Q

Y

G

O

W

E

M

U

C

K

S

A

I

Q

Y

G

O

W

E

M

U

C

K

S

This mapping doesn’t work at all! Both the letters C and P encrypt to Q. When we encounter a Q in the ciphertext, how do we know which it decrypts to?! The same problem exists for encrypting A and N, F and S, and many others.

So some keys will work in the affine cipher while others will not. The secret to determining which key numbers will work is this:

In the affine cipher, the Key A number and the size of the symbol set must be relatively prime to each other. That is, gcd(key, size of symbol set) == 1.

We can use the gcd() function we wrote earlier to test this. The key 7 works as an affine cipher key because gcd(7, 26) returns 1. The larger key 8,953,851 will also work because gcd(8953851, 26) also returns 1. However, the key 8 did not work because gcd(8, 26) is 2. If the GCD of the key and the symbol set size is not 1, then they are not relatively prime and the key won’t work.

The math we learned earlier sure is coming in handy now. We need to know how mod works because it is part of the GCD and affine cipher algorithms. And we need to know how GCD works because that will tell us if a pair of numbers is relatively prime. And we need to know if a pair of numbers is relatively prime or not in order to choose valid keys for the affine cipher.

The second problem with affine cipher’s key is discussed in the next chapter.

Decrypting with the Affine Cipher

In the Caesar cipher, we used addition to encrypt and subtraction to decrypt. In the affine cipher, we use multiplication to encrypt. You might think that we need to divide to decrypt with the affine cipher. But if you try this yourself, you’ll quickly see that it doesn’t work. To decrypt with the affine cipher, we need to multiply by the key’s modular inverse.

A modular inverse (which we will call i) of two numbers (which we will call a and m) is such that (a * i) % m == 1. For example, let’s find the modular inverse of “5 mod 7”. There is some number i where (5 * i) % 7 will equal “1”. We will have to brute-force this calculation:

· 1 isn’t the modular inverse of 5 mod 7, because (5 * 1) % 7 = 5.

· 2 isn’t the modular inverse of 5 mod 7, because (5 * 2) % 7 = 3.

· 3 is the modular inverse of 5 mod 7, because (5 * 3) % 7 = 1.

The encryption key and decryption keys for the affine cipher are two different numbers. The encryption key can be anything we choose as long as it is relatively prime to 26 (which is the size of our symbol set). If we have chosen the key 7 for encrypting with the affine cipher, the decryption key will be the modular inverse of 7 mod 26:

· 1 is not the modular inverse of 7 mod 26, because (7 * 1) % 26 = 7.

· 2 is not the modular inverse of 7 mod 26, because (7 * 2) % 26 = 14.

· 3 is not the modular inverse of 7 mod 26, because (7 * 3) % 26 = 21.

· 4 is not the modular inverse of 7 mod 26, because (7 * 4) % 26 = 2.

· 5 is not the modular inverse of 7 mod 26, because (7 * 5) % 26 = 9.

· 6 is not the modular inverse of 7 mod 26, because (7 * 6) % 26 = 16.

· 7 is not the modular inverse of 7 mod 26, because (7 * 7) % 26 = 23.

· 8 is not the modular inverse of 7 mod 26, because (7 * 8) % 26 = 4.

· 9 is not the modular inverse of 7 mod 26, because (7 * 9) % 26 = 11.

· 10 is not the modular inverse of 7 mod 26, because (7 * 10) % 26 = 18.

· 11 is not the modular inverse of 7 mod 26, because (7 * 11) % 26 = 25.

· 12 is not the modular inverse of 7 mod 26, because (7 * 12) % 26 = 6.

· 13 is not the modular inverse of 7 mod 26, because (7 * 13) % 26 = 13.

· 14 is not the modular inverse of 7 mod 26, because (7 * 14) % 26 = 20.

· 15 is the modular inverse of 7 mod 26, because (7 * 15) % 26 = 1.

So the affine cipher decryption key is 15. To decrypt a ciphertext letter, we take that letter’s number and multiply it by 15, and then mod 26. This will be the number of the original plaintext’s letter.

Finding Modular Inverses

In order to calculate the modular inverse to get the decryption key, we could take a brute-force approach and start testing the integer 1, and then 2, and then 3, and so on like we did above. But this will be very time-consuming for large keys like 8,953,851.

There is an algorithm for finding the modular inverse just like there was for finding the Greatest Common Divisor. Euclid’s Extended Algorithm can be used to find the modular inverse of a number:

def findModInverse(a, m):

if gcd(a, m) != 1:

return None # no mod inverse exists if a & m aren't relatively prime

u1, u2, u3 = 1, 0, a

v1, v2, v3 = 0, 1, m

while v3 != 0:

q = u3 // v3 # // is the integer division operator

v1, v2, v3, u1, u2, u3 = (u1 - q * v1), (u2 - q * v2), (u3 - q * v3), v1, v2, v3

return u1 % m

You don’t have to understand how Euclid’s Extended Algorithm works in order to make use of it. We’re just going to have our programs call this function. If you’d like to learn more about how it works, you can read http://invpy.com/euclid.

The // Integer Division Operator

You may have noticed the // operator used in the findModInverse() function above. This is the integer division operator. It divides two numbers and rounds down. Try typing the following into the interactive shell:

>>> 41 // 7

5

>>> 41 / 7

5.857142857142857

>>> 10 // 5

2

>>> 10 / 5

2.0

>>>

Notice that an expression with the // integer division operator always evaluates to an int, not a float.

Source Code of the cryptomath Module

The gcd() and findModInverse() functions will be used by more than one of our cipher programs later in this book, so we should put this code into a separate module. In the file editor, type in the following code and save it as cryptomath.py:

Source code for cryptomath.py

1. # Cryptomath Module

2. # http://inventwithpython.com/hacking (BSD Licensed)

3.

4. def gcd(a, b):

5. # Return the GCD of a and b using Euclid's Algorithm

6. while a != 0:

7. a, b = b % a, a

8. return b

9.

10.

11. def findModInverse(a, m):

12. # Returns the modular inverse of a % m, which is

13. # the number x such that a*x % m = 1

14.

15. if gcd(a, m) != 1:

16. return None # no mod inverse if a & m aren't relatively prime

17.

18. # Calculate using the Extended Euclidean Algorithm:

19. u1, u2, u3 = 1, 0, a

20. v1, v2, v3 = 0, 1, m

21. while v3 != 0:

22. q = u3 // v3 # // is the integer division operator

23. v1, v2, v3, u1, u2, u3 = (u1 - q * v1), (u2 - q * v2), (u3 - q * v3), v1, v2, v3

24. return u1 % m

The GCD algorithm is described earlier in this chapter. The findModInverse() function implements an algorithm called Euclid’s Extended Algorithm. How these functions work is beyond the scope of this book, but you don’t have to know how the code works in order to make use of it.

From the interactive shell, you can try out these functions after importing the module. Try typing the following into the interactive shell:

>>> import cryptomath

>>> cryptomath.gcd(24, 32)

8

>>> cryptomath.gcd(37, 41)

1

>>> cryptomath.findModInverse(7, 26)

15

>>> cryptomath.findModInverse(8953851, 26)

17

>>>

Practice Exercises, Chapter 14, Set E

Practice exercises can be found at http://invpy.com/hackingpractice14E.

Summary

Since the multiplicative cipher is the same thing as the affine cipher except using Key B of 0, we won’t have a separate program for the multiplicative cipher. And since it is just a less secure version of the affine cipher, you shouldn’t use it anyway. The source code to our affine cipher program will be presented in the next chapter.

The math presented in this chapter isn’t so hard to understand. Modding with the % operator finds the “remainder” between two numbers. The Greatest Common Divisor function returns the largest number that can divide two numbers. If the GCD of two numbers is 1, we say that those numbers are “relatively prime” to each other. The most useful algorithm to find the GCD of two numbers is Euclid’s Algorithm.

The affine cipher is sort of like the Caesar cipher, except it uses multiplication instead of addition to encrypt letters. Not all numbers will work as keys though. The key number and the size of the symbol set must be relatively prime towards each other.

To decrypt with the affine cipher we also use multiplication. To decrypt, the modular inverse of the key is the number that is multiplied. The modular inverse of “a mod m” is a number i such that (a * i) % m == 1. To write a function that finds the modular inverse of a number, we use Euclid’s Extended Algorithm.

Once we understand these math concepts, we can write a program for the affine cipher in the next chapter.