Hacking Secret Ciphers with Python (2013)

Chapter 4: STRINGS AND WRITING PROGRAMS

Topics Covered In This Chapter:

· Strings

· String concatenation and replication

· Using IDLE to write source code

· Saving and running programs in IDLE

· The print() function

· The input() function

· Comments

That's enough of integers and math for now. Python is more than just a calculator. In this chapter, we will learn how to store text in variables, combine text together, and display text on the screen. We will also make our first program, which greets the user with the text, “Hello World!” and lets the user type in a name.

Strings

In Python, we work with little chunks of text called string values (or simply strings). All of our cipher and hacking programs deal with string values to turn plaintext like 'One if by land, two if by space.' into ciphertext like 'Tqe kg im npqv, jst kg im oapxe.'. The plaintext and ciphertext are represented in our program as string values, and there are many ways that Python code can manipulate these values.

We can store string values inside variables just like integer and floating point values. When we type strings, we put them in between two single quotes (') to show where the string starts and ends. Type this into the interactive shell:

>>> spam = 'hello'

>>>

The single quotes are not part of the string value. Python knows that 'hello' is a string and spam is a variable because strings are surrounded by quotes and variable names are not.

If you type spam into the shell, you should see the contents of the spam variable (the 'hello' string). This is because Python will evaluate a variable to the value stored inside it: in this case, the string 'hello'.

>>> spam = 'hello'

>>> spam

'hello'

>>>

Strings can have almost any keyboard character in them. (We’ll talk about special “escape characters” later.) These are all examples of strings:

>>> 'hello'

'hello'

>>> 'Hi there!'

'Hi there!'

>>> 'KITTENS'

'KITTENS'

>>> ''

''

>>> '7 apples, 14 oranges, 3 lemons'

'7 apples, 14 oranges, 3 lemons'

>>> 'Anything not pertaining to elephants is irrelephant.'

'Anything not pertaining to elephants is irrelephant.'

>>> 'O*&#wY%*&OcfsdYO*&gfC%YO*&%3yc8r2'

'O*&#wY%*&OcfsdYO*&gfC%YO*&%3yc8r2'

Notice that the '' string has zero characters in it; there is nothing in between the single quotes. This is known as a blank string or empty string.

String Concatenation with the + Operator

You can add together two string values into one new string value by using the + operator. Doing this is called string concatenation. Try entering 'Hello' + 'World!' into the shell:

>>> 'Hello' + 'World!'

'HelloWorld!'

>>>

To put a space between “Hello” and “World!”, put a space at the end of the 'Hello' string, just before the closing single quote, like this:

>>> 'Hello ' + 'World!'

'Hello World!'

>>>

Remember, Python will concatenate exactly the strings you tell it to concatenate. If you want a space in the resulting string, there must be a space in one of the two original strings.

The + operator can concatenate two string values into a new string value ('Hello ' + 'World!' to 'Hello World!'), just like it could add two integer values into a new integer value (2 + 2 to 4). Python knows what the + operator should do because of the data types of the values. Every value is of a data type. The data type of the value 'Hello' is a string. The data type of the value 5 is an integer. The data type of a value tells us (and the computer) what kind of data the value is.

The + operator can be used in an expression with two strings or two integers. If you try to use the + operator with a string value and an integer value, you will get an error. Type this code into the interactive shell:

>>> 'Hello' + 42

Traceback (most recent call last):

File "<stdin>", line 1, in <module>

TypeError: Can't convert 'int' object to str implicitly

>>> 'Hello' + '42'

'Hello42'

>>>
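If you do need to join a string and a number, one option is to turn the number into a string first. The short sketch below assumes Python's built-in str() function, which this chapter has not introduced yet, and the variable names greeting, number and combined are only for illustration.

```python
# A minimal sketch: convert the integer to a string before concatenating.
# str() is a built-in function that has not been covered in this chapter yet.
greeting = 'Hello'
number = 42

combined = greeting + str(number)   # str(42) evaluates to the string '42'
print(combined)                     # prints Hello42
```

Because str(number) evaluates to '42', the + operator sees two strings and concatenates them instead of raising a TypeError.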

String Replication with the * Operator

You can also use the * operator on a string and an integer to do string replication. This will replicate (that is, repeat) a string by however many times the integer value is. Type the following into the interactive shell:

>>> 'Hello' * 3

'HelloHelloHello'

>>> spam = 'Abcdef'

>>> spam = spam * 3

>>> spam

'AbcdefAbcdefAbcdef'

>>> spam = spam * 2

>>> spam

'AbcdefAbcdefAbcdefAbcdefAbcdefAbcdef'

>>>

The * operator can work with two integer values (it will multiply them). It can also work with a string value and an integer value (it will replicate the string). But it cannot work with two string values, which would cause an error:

>>> 'Hello' * 'world!'

Traceback (most recent call last):

File "<stdin>", line 1, in <module>

TypeError: can't multiply sequence by non-int of type 'str'

>>>

What string concatenation and string replication show is that operators in Python can do different things based on the data types of the values they operate on. The + operator can do addition or string concatenation. The * operator can do multiplication or string replication.
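The summary above can be tried out as a small sketch. The variable names are made up for this example, and type() is a built-in function (not covered in this chapter) that reports the data type of a value.

```python
# The same operator does a different job depending on the data types involved.
int_sum = 2 + 2            # + with two integers: addition
str_join = 'ab' + 'cd'     # + with two strings: concatenation
int_product = 3 * 4        # * with two integers: multiplication
str_repeat = 'ab' * 3      # * with a string and an integer: replication

print(int_sum)       # prints 4
print(str_join)      # prints abcd
print(int_product)   # prints 12
print(str_repeat)    # prints ababab

# type() shows which data type each result has.
print(type(int_sum), type(str_join))   # prints <class 'int'> <class 'str'>
```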

Printing Values with the print() Function

There is another type of Python instruction called a print() function call. Type the following into the interactive shell:

>>> print('Hello!')

Hello!

>>> print(42)

42

>>>

A function (like print() in the above example) has code inside it that performs a task, such as printing values on the screen. There are many different functions that come with Python. To call a function means to execute the code that is inside the function.

The instructions in the above example pass a value to the print() function in between the parentheses, and the print() function will print the value to the screen. The values that are passed when a function is called are called arguments. (Arguments are just values. We call them arguments when they are passed to a function call.) When we begin to write programs, the way we make text appear on the screen is with the print() function.

You can pass an expression to the print() function instead of a single value. This is because the value that is actually passed to the print() function is the evaluated value of that expression. Try this string concatenation expression in the interactive shell:

>>> spam = 'Al'

>>> print('Hello, ' + spam)

Hello, Al

>>>

The 'Hello, ' + spam expression evaluates to 'Hello, ' + 'Al', which then evaluates to the string value 'Hello, Al'. This string value is what is passed to the print() function call.
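One way to convince yourself that print() only ever receives the finished value is to store the result of the expression in a variable first. This is just a sketch, and the variable name message is an invented name for illustration.

```python
spam = 'Al'

# The expression is evaluated before anything is printed.
message = 'Hello, ' + spam   # evaluates to 'Hello, Al'

print(message)               # prints Hello, Al
print('Hello, ' + spam)      # prints Hello, Al as well
```

Both print() calls are passed the same string value, so they display the same text.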

Escape Characters

Sometimes we might want to use a character that cannot easily be typed into a string value. For example, we might want to put a single quote character as part of a string. But we would get an error message because Python thinks that single quote is the quote ending the string value, and the text after it is bad Python code instead of just the rest of the string. Type the following into the interactive shell:

>>> print('Al's cat is named Zophie.')

File "<stdin>", line 1

print('Al's cat is named Zophie.')

SyntaxError: invalid syntax

>>>

To use a single quote in a string, we need to use escape characters. An escape character is a backslash character followed by another character, for example \t, \n or \'. The backslash tells Python that the character after it has a special meaning. Type the following into the interactive shell:

>>> print('Al\'s cat is named Zophie.')

Al's cat is named Zophie.

>>>

An escape character helps us print out letters that are hard to type into the source code. Table 4-1 shows some escape characters in Python:

Table 4-1. Escape Characters

Escape Character	What Is Actually Printed
\\	Backslash (\)
\'	Single quote (')
\"	Double quote (")
\n	Newline
\t	Tab

The backslash always precedes an escape character, even if you just want a backslash in your string. This line of code would not print what you intended:

>>> print('He flew away in a green\teal helicopter.')

He flew away in a green eal helicopter.

This is because the “t” in “teal” was seen as an escape character since it came after a backslash. The escape character \t simulates pushing the Tab key on your keyboard. Escape characters are there so that strings can have characters that cannot be typed in.

Instead, try this code:

>>> print('He flew away in a green\\teal helicopter.')

He flew away in a green\teal helicopter.
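To see that an escape character counts as a single character in the string, you can count the characters. The sketch below assumes the built-in len() function (not introduced in this chapter), which returns the number of characters in a string. The variable names are only for illustration.

```python
# '\\' is one backslash character, so the letter t after it is kept.
escaped_backslash = 'green\\teal'
# '\t' is one tab character, so the letter t is used up by the escape.
tab_character = 'green\teal'

print(escaped_backslash)        # prints green\teal
print(len(escaped_backslash))   # prints 10
print(len(tab_character))       # prints 9
```

The first string has ten characters (five letters, one backslash, four letters), while the second has nine (five letters, one tab, three letters).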

Quotes and Double Quotes

Strings don’t always have to be in between two single quotes in Python. You can use double quotes instead. These two lines print the same thing:

>>> print('Hello world')

Hello world

>>> print("Hello world")

Hello world

But you cannot mix single and double quotes. This line will give you an error:

>>> print('Hello world")

SyntaxError: EOL while scanning single-quoted string

>>>

I like to use single quotes so I don’t have to hold down the shift key on the keyboard to type them. It’s easier to type, and the computer doesn’t care either way.

But remember, just like you have to use the escape character \' to have a single quote in a string surrounded by single quotes, you need the escape character \" to have a double quote in a string surrounded by double quotes. For example, look at these two lines:

>>> print('I asked to borrow Alice\'s car for a week. She said, "Sure."')

I asked to borrow Alice's car for a week. She said, "Sure."

>>> print("She said, \"I can't believe you let him borrow your car.\"")

She said, "I can't believe you let him borrow your car."

You do not need to escape double quotes in single-quote strings, and you do not need to escape single quotes in double-quote strings. The Python interpreter is smart enough to know that if a string starts with one kind of quote, the other kind of quote doesn’t mean the string is ending.
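The kind of quote you choose only changes how you type the string, not the string value itself. The sketch below checks this with the == comparison operator, which has not been covered yet. It evaluates to True when two values are the same. The variable names are made up for this example.

```python
# The same text typed with single quotes and with double quotes.
single = 'She said, "Sure."'
double = "She said, \"Sure.\""
print(single == double)   # prints True

# The same idea with an apostrophe inside the string.
cat_a = 'Al\'s cat'
cat_b = "Al's cat"
print(cat_a == cat_b)     # prints True
```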

Practice Exercises, Chapter 4, Set A

Practice exercises can be found at http://invpy.com/hackingpractice4A.

Indexing

Your encryption programs will often need to get a single character from a string. Indexing is the adding of square brackets [ and ] to the end of a string value (or a variable containing a string) with a number between them. This number is called the index, and tells Python which position in the string has the character you want. The index of the first character in a string is 0. The index 1 is for the second character, the index 2 is for the third character, and so on.

Type the following into the interactive shell:

>>> spam = 'Hello'

>>> spam[0]

'H'

>>> spam[1]

'e'

>>> spam[2]

'l'

Notice that the expression spam[0] evaluates to the string value 'H', since H is the first character in the string 'Hello'. Remember that indexes start at 0, not 1. This is why the H’s index is 0, not 1.

Figure 4-1. The string 'Hello' and its indexes.

Indexing can be used with a variable containing a string value or a string value by itself such as 'Zophie'. Type this into the interactive shell:

>>> 'Zophie'[2]

'p'

The expression 'Zophie'[2] evaluates to the string value 'p'. This 'p' string is just like any other string value, and can be stored in a variable. Type the following into the interactive shell:

>>> eggs = 'Zophie'[2]

>>> eggs

'p'

>>>

If you enter an index that is too large for the string, Python will display an “index out of range” error message. There are only 5 characters in the string 'Hello'. If we try to use the index 10, then Python will display an error saying that our index is “out of range”:

>>> 'Hello'[10]

Traceback (most recent call last):

File "<stdin>", line 1, in <module>

IndexError: string index out of range

>>>
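Since each indexed character is an ordinary string value, it can be used with the + and * operators from earlier in this chapter. Here is a small sketch. The variable names word, ends and stutter are only for illustration.

```python
word = 'Hello'

# word[0] is 'H' and word[4] is 'o' (the fifth and last character).
ends = word[0] + word[4]
print(ends)       # prints Ho

# Replicate the first character three times, then add the whole word.
stutter = word[0] * 3 + word
print(stutter)    # prints HHHHello
```

The largest index that works for 'Hello' is 4, because the five characters are numbered 0 through 4.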

Negative Indexes

Negative indexes start at the end of a string and go backwards. The negative index -1 is the index of the last character in a string. The index -2 is the index of the second to last character, and so on.

Type the following into the interactive shell:

>>> 'Hello'[-1]

'o'

>>> 'Hello'[-2]

'l'

>>> 'Hello'[-3]

'l'

>>> 'Hello'[-4]

'e'

>>> 'Hello'[-5]

'H'

>>> 'Hello'[0]

'H'

>>>

Notice that -5 and 0 are the indexes for the same character. Most of the time your code will use positive indexes, but sometimes it will be easier to use negative indexes.
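The link between negative and positive indexes can be sketched with the built-in len() function, which is not covered in this chapter and returns the number of characters in a string. A negative index refers to the same character as that index plus the length of the string. The variable names here are invented for the example.

```python
word = 'Hello'
length = len(word)   # 5 characters

# -1 + 5 is 4, so word[-1] and word[4] are the same character.
print(word[-1] == word[length - 1])   # prints True

# -5 + 5 is 0, so word[-5] and word[0] are the same character.
print(word[-5] == word[0])            # prints True
```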

Slicing

If you want to get more than one character from a string, you can use slicing instead of indexing. A slice also uses the [ and ] square brackets but has two integer indexes instead of one. The two indexes are separated by a colon (:). Type the following into the interactive shell:

>>> 'Howdy'[0:3]

'How'

>>>

The string that the slice evaluates to begins at the first index and goes up to, but not including, the second index. The 0 index of the string value 'Howdy' is the H and the 3 index is the d. Since a slice goes up to but not including the second index, the slice 'Howdy'[0:3] evaluates to the string value 'How'.

Try typing the following into the interactive shell:

>>> 'Hello world!'[0:5]

'Hello'

>>> 'Hello world!'[6:12]

'world!'

>>> 'Hello world!'[-6:-1]

'world'

>>> 'Hello world!'[6:12][2]

'r'

>>>

Notice that the expression 'Hello world!'[6:12][2] first evaluates to 'world!'[2] which is an indexing that further evaluates to 'r'.

Unlike indexing, slicing will never give you an error if you give it an index that is too large for the string. It will just return the widest matching slice it can:

>>> 'Hello'[0:999]

'Hello'

>>> 'Hello'[2:999]

'llo'

>>> 'Hello'[1000:2000]

''

>>>

The expression 'Hello'[1000:2000] returns a blank string because the index 1000 is after the end of the string, so there are no possible characters this slice could include.
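The slicing rules so far can be collected into one short sketch. The variable names are only for illustration.

```python
message = 'Hello world!'

first_word = message[0:5]      # indexes 0 to 4; index 5 is not included
second_word = message[6:11]    # indexes 6 to 10
clipped = message[6:999]       # 999 is too large, so the slice stops at the end
nothing = message[1000:2000]   # starts after the end, so the result is blank

print(first_word)    # prints Hello
print(second_word)   # prints world
print(clipped)       # prints world!
print(nothing == '') # prints True
```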

Blank Slice Indexes

If you leave out the first index of a slice, Python will automatically think you want to specify index 0 for the first index. The expressions 'Howdy'[0:3] and 'Howdy'[:3] evaluate to the same string:

>>> 'Howdy'[:3]

'How'

>>> 'Howdy'[0:3]

'How'

>>>

If you leave out the second index, Python will automatically think you want to specify the rest of the string:

>>> 'Howdy'[2:]

'wdy'

>>>

Slicing is a simple way to get a “substring” from a larger string. (But really, a “substring” is still just a string value like any other string.) Try typing the following into the shell:

>>> myName = 'Zophie the Fat Cat'

>>> myName[-7:]

'Fat Cat'

>>> myName[:10]

'Zophie the'

>>> myName[7:]

'the Fat Cat'

>>>
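One handy consequence of blank slice indexes is that splitting a string at any index and concatenating the two pieces gives back the original string. A minimal sketch follows. The names head and tail are made up for this example, and == is the comparison operator, which has not been covered yet.

```python
myName = 'Zophie the Fat Cat'

head = myName[:6]   # everything before index 6
tail = myName[6:]   # everything from index 6 to the end

print(head)                     # prints Zophie
print(tail)                     # prints  the Fat Cat (with a leading space)
print(head + tail == myName)    # prints True
```

This works because the first slice stops just before index 6 and the second slice starts exactly at index 6, so no character is left out or repeated.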

Practice Exercises, Chapter 4, Set B

Practice exercises can be found at http://invpy.com/hackingpractice4B.

Writing Programs in IDLE’s File Editor

Until now we have been typing instructions one at a time into the interactive shell. When we write programs though, we type in several instructions and have them run without waiting on us for the next one. Let’s write our first program!

The software program that provides the interactive shell is called IDLE, the Interactive DeveLopment Environment. IDLE also has another part besides the interactive shell called the file editor.

At the top of the Python shell window, click on File ► New Window. A new blank window will appear for us to type our program in. This window is the file editor. The bottom right of the file editor window shows the line and column where the cursor currently is in the file.

Figure 4-2. The file editor window. The cursor is at line 1, column 0.

You can always tell the difference between the file editor window and the interactive shell window because the interactive shell will always have the >>> prompt in it.

Hello World!

A tradition for programmers learning a new language is to make their first program display the text “Hello world!” on the screen. We’ll create our own Hello World program now.

Enter the following text into the new file editor window. We call this text the program’s source code because it contains the instructions that Python will follow to determine exactly how the program should behave.

Source Code of Hello World

This code can be downloaded from http://invpy.com/hello.py. If you get errors after typing this code in, compare it to the book’s code with the online diff tool at http://invpy.com/hackingdiff (or email me at al@inventwithpython.com if you are still stuck.)

hello.py

1. # This program says hello and asks for my name.

2. print('Hello world!')

3. print('What is your name?')

4. myName = input()

5. print('It is good to meet you, ' + myName)

The IDLE program will give different types of instructions different colors. After you are done typing this code in, the window should look like this:

Figure 4-3. The file editor window will look like this after you type in the code.

Saving Your Program

Once you’ve entered your source code, save it so that you won’t have to retype it each time you start IDLE. To do so, from the menu at the top of the File Editor window, choose File ► Save As. The Save As window should open. Enter hello.py in the File Name field, then click Save. (See Figure 4-4.)

You should save your programs every once in a while as you type them. That way, if the computer crashes or you accidentally exit from IDLE you won’t lose everything you’ve typed. As a shortcut, you can press Ctrl-S on Windows and Linux or ⌘-S on OS X to save your file.

Figure 4-4. Saving the program.

A video tutorial of how to use the file editor is available from this book's website at http://invpy.com/hackingvideos.

Running Your Program

Now it’s time to run our program. Click on Run ► Run Module or just press the F5 key on your keyboard. Your program should run in the shell window that appeared when you first started IDLE. Remember, you have to press F5 from the file editor’s window, not the interactive shell’s window.

When your program asks for your name, go ahead and enter it as shown in Figure 4-5:

Figure 4-5. What the interactive shell looks like when running the “Hello World” program.

Now when you push Enter, the program should greet you (the user, that is, the one using the program) by name. Congratulations! You’ve written your first program. You are now a beginning computer programmer. (You can run this program again if you like by pressing F5 again.)

If you get an error that looks like this:

Hello world!

What is your name?

Albert

Traceback (most recent call last):

File "C:/Python27/hello.py", line 4, in <module>

myName = input()

File "<string>", line 1, in <module>

NameError: name 'Albert' is not defined

...this means you are running the program with Python 2, instead of Python 3. This makes the penguin in the first chapter sad. (The error is caused by the input() function call, which does different things in Python 2 and 3.) Please install Python 3 from http://python.org/getit before continuing.

Opening The Programs You’ve Saved

Close the file editor by clicking on the X in the top corner. To reload a saved program, choose File ► Open from the menu. Do that now, and in the window that appears choose hello.py and press the Open button. Your saved hello.py program should open in the File Editor window.

How the “Hello World” Program Works

Each line that we entered is an instruction that tells Python exactly what to do. A computer program is a lot like a recipe. Do the first step first, then the second, and so on until you reach the end. Each instruction is followed in sequence, beginning from the very top of the program and working down the list of instructions. After the program executes the first line of instructions, it moves on and executes the second line, then the third, and so on.

We call the program’s following of instructions step-by-step the program execution, or just the execution for short. The execution starts at the first line of code and then moves downward. The execution can skip around instead of just going from top to bottom, and we’ll find out how to do this in the next chapter.

Let’s look at our program one line at a time to see what it’s doing, beginning with line number 1.

Comments

hello.py

1. # This program says hello and asks for my name.

This line is called a comment. Comments are not for the computer, but for you, the programmer. The computer ignores them. They’re used to remind you of what the program does or to tell others who might look at your code what it is that your code is trying to do. Any text following a # sign (called the pound sign) is a comment. (To make it easier to read the source code, this book prints out comments in a light gray-colored text.)

Programmers usually put a comment at the top of their code to give the program a title. The IDLE program displays comments in red text to help them stand out.

Functions

A function is kind of like a mini-program inside your program. It contains lines of code that are executed from top to bottom. Python provides some built-in functions that we can use (you’ve already used the print() function). The great thing about functions is that we only need to know what the function does, but not how it does it. (You need to know that the print() function displays text on the screen, but you don’t need to know how it does this.)

A function call is a piece of code that tells our program to run the code inside a function. For example, your program can call the print() function whenever you want to display a string on the screen. The print() function takes the value you type in between the parentheses as input and displays the text on the screen. Because we want to display Hello world! on the screen, we type the print function name, followed by an opening parenthesis, followed by the 'Hello world!' string and a closing parenthesis.

The print() function

hello.py

2. print('Hello world!')

3. print('What is your name?')

Line 2 is a call to the print() function (with the string to be printed going inside the parentheses). We add parentheses to the end of function names to make it clear that we’re referring to a function named print(), not a variable named print. The parentheses at the end of the function let us know we are talking about a function, much like the quotes around the number '42' tell us that we are talking about the string '42' and not the integer 42.

Line 3 is another print() function call. This time, the program displays “What is your name?”

The input() function

hello.py

4. myName = input()

Line 4 has an assignment statement with a variable (myName) and a function call (input()). When input() is called, the program waits for the user to type in some text and press Enter. The text string that the user types in (their name) becomes the string value that is stored in myName.

Like expressions, function calls evaluate to a single value. The value that the function call evaluates to is called the return value. (In fact, we can also use the word “returns” to mean the same thing for function calls as “evaluates”.) In this case, the return value of the input() function is the string that the user typed in (their name). If the user typed in Albert, the input() function call evaluates to (that is, returns) the string 'Albert'.

The function named input() does not need any arguments (unlike the print() function), which is why there is nothing in between the parentheses.

hello.py

5. print('It is good to meet you, ' + myName)

For line 5’s print() call, we use the plus operator (+) to concatenate the string 'It is good to meet you, ' and the string stored in the myName variable, which is the name that our user input into the program. This is how we get the program to greet us by name.
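To trace lines 4 and 5 without typing anything, the sketch below replaces the input() call with a fixed string, as if the user had typed Albert and pressed Enter. This is only a stand-in for illustration. The real hello.py calls input() on line 4, and the variable name greeting does not appear in the original program.

```python
# Stand-in for line 4. In hello.py this line is: myName = input()
myName = 'Albert'

# Line 5's expression, evaluated separately so we can look at the result.
greeting = 'It is good to meet you, ' + myName
print(greeting)   # prints It is good to meet you, Albert
```

Whatever string input() returns takes the place of 'Albert' here, which is why the program greets each user with the name they typed.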

Ending the Program

Once the program executes the last line, it stops. At this point it has terminated or exited and all of the variables are forgotten by the computer, including the string we stored in myName. If you try running the program again and typing a different name it will print that name.

Hello world!

What is your name?

Alan

It is good to meet you, Alan

Remember, the computer only does exactly what you program it to do. In this program it is programmed to ask you for your name, let you type in a string, and then say hello and display the string you typed.

But computers are dumb. The program doesn’t care if you type in your name, someone else’s name, or just something silly. You can type in anything you want and the computer will treat it the same way:

Hello world!

What is your name?

poop

It is good to meet you, poop

Practice Exercises, Chapter 4, Set C

Practice exercises can be found at http://invpy.com/hackingpractice4C.

Summary

Writing programs is just about knowing how to speak the computer’s language. While you learned a little bit of this in the last chapter, in this chapter you’ve put together several Python instructions to make a complete program that asks for the user’s name and then greets them.

All of our programs later in this book will be more complex and sophisticated, but don’t worry. The programs will all be explained line by line. And you can always enter instructions into the interactive shell to see what they do before they are all put into a complete program.

Now let’s start with our first encryption program: the reverse cipher.