Py Filling: Lists, Tuples, Dictionaries, and Sets - Introducing Python (2014)

Chapter 3. Py Filling: Lists, Tuples, Dictionaries, and Sets

In Chapter 2 we started at the bottom with Python’s basic data types: booleans, integers, floats, and strings. If you think of those as atoms, the data structures in this chapter are like molecules. That is, we combine those basic types in more complex ways. You will use these every day. Much of programming consists of chopping and glueing data into specific forms, and these are your hacksaws and glue guns.

Lists and Tuples

Most computer languages can represent a sequence of items indexed by their integer position: first, second, and so on down to the last. You’ve already seen Python strings, which are sequences of characters. You’ve also had a little preview of lists, which you’ll now see are sequences of anything.

Python has two other sequence structures: tuples and lists. These contain zero or more elements. Unlike strings, the elements can be of different types. In fact, each element can be any Python object. This lets you create structures as deep and complex as you like.

Why does Python contain both lists and tuples? Tuples are immutable; when you assign elements to a tuple, they’re baked in the cake and can’t be changed. Lists are mutable, meaning you can insert and delete elements with great enthusiasm. I’ll show many examples of each, with an emphasis on lists.

NOTE

By the way, you might hear two different pronunciations for tuple. Which is right? If you guess wrong, do you risk being considered a Python poseur? No worries. Guido van Rossum, the creator of Python, tweeted “I pronounce tuple too-pull on Mon/Wed/Fri and tub-pull on Tue/Thu/Sat. On Sunday I don’t talk about them. :)”

Lists

Lists are good for keeping track of things by their order, especially when the order and contents might change. Unlike strings, lists are mutable. You can change a list in-place, add new elements, and delete or overwrite existing elements. The same value can occur more than once in a list.

Create with [] or list()

A list is made from zero or more elements, separated by commas, and surrounded by square brackets:

>>> empty_list = [ ]

>>> weekdays = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']

>>> big_birds = ['emu', 'ostrich', 'cassowary']

>>> first_names = ['Graham', 'John', 'Terry', 'Terry', 'Michael']

You can also make an empty list with the list() function:

>>> another_empty_list = list()

>>> another_empty_list

[]

NOTE

Comprehensions shows one more way to create a list, called a list comprehension.

The weekdays list is the only one that actually takes advantage of list order. The first_names list shows that values do not need to be unique.

NOTE

If you only want to keep track of unique values and don’t care about order, a Python set might be a better choice than a list. In the previous example, big_birds could have been a set. You’ll read about sets a little later in this chapter.

Convert Other Data Types to Lists with list()

Python’s list() function converts other data types to lists. The following example converts a string to a list of one-character strings:

>>> list('cat')

['c', 'a', 't']

This example converts a tuple (coming up after lists in this chapter) to a list:

>>> a_tuple = ('ready', 'fire', 'aim')

>>> list(a_tuple)

['ready', 'fire', 'aim']

As I mentioned earlier in Split with split(), use split() to chop a string into a list by some separator string:

>>> birthday = '1/6/1952'

>>> birthday.split('/')

['1', '6', '1952']

What if you have more than one separator string in a row in your original string? Well, you get an empty string as a list item:

>>> splitme = 'a/b//c/d///e'

>>> splitme.split('/')

['a', 'b', '', 'c', 'd', '', '', 'e']

If you had used the two-character separator string // instead, you would get this:

>>> splitme = 'a/b//c/d///e'

>>> splitme.split('//')

['a/b', 'c/d', '/e']
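If the empty strings are not wanted, one way to drop them is to filter the result of split(). The sketch below is an illustration rather than part of the original example; the names parts and spaced are made up for it. It also shows that split() with no argument treats a run of whitespace as a single separator.

```python
splitme = 'a/b//c/d///e'

# Keep only the non-empty pieces produced by split('/').
parts = [piece for piece in splitme.split('/') if piece]
print(parts)               # ['a', 'b', 'c', 'd', 'e']

# With no argument, split() treats any run of whitespace as one separator.
spaced = 'a b  c   d'
print(spaced.split())      # ['a', 'b', 'c', 'd']

# With an explicit ' ' separator, repeated spaces give empty strings again.
print(spaced.split(' '))   # ['a', 'b', '', 'c', '', '', 'd']
```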

Get an Item by Using [ offset ]

As with strings, you can extract a single value from a list by specifying its offset:

>>> marxes = ['Groucho', 'Chico', 'Harpo']

>>> marxes[0]

'Groucho'

>>> marxes[1]

'Chico'

>>> marxes[2]

'Harpo'

Again, as with strings, negative indexes count backward from the end:

>>> marxes[-1]

'Harpo'

>>> marxes[-2]

'Chico'

>>> marxes[-3]

'Groucho'

>>>

NOTE

The offset has to be a valid one for this list—a position you have assigned a value previously. If you specify an offset before the beginning or after the end, you’ll get an exception (error). Here’s what happens if we try to get the sixth Marx brother (offset 5 counting from 0), or the fifth before the end:

>>> marxes = ['Groucho', 'Chico', 'Harpo']

>>> marxes[5]

Traceback (most recent call last):

  File "<stdin>", line 1, in <module>

IndexError: list index out of range

>>> marxes[-5]

Traceback (most recent call last):

  File "<stdin>", line 1, in <module>

IndexError: list index out of range
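Two common ways to avoid this exception are sketched here: check the offset against len() before using it, or try the lookup and catch IndexError. The names offset, found, and found_again are only for illustration, and try and except are covered properly in a later chapter.

```python
marxes = ['Groucho', 'Chico', 'Harpo']
offset = 5

# Option 1: compare the offset with the list's length first.
# Valid offsets run from -len(marxes) up to len(marxes) - 1.
if -len(marxes) <= offset < len(marxes):
    found = marxes[offset]
else:
    found = None

# Option 2: try the lookup and catch the exception.
try:
    found_again = marxes[offset]
except IndexError:
    found_again = None

print(found, found_again)   # None None
```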

Lists of Lists

Lists can contain elements of different types, including other lists, as illustrated here:

>>> small_birds = ['hummingbird', 'finch']

>>> extinct_birds = ['dodo', 'passenger pigeon', 'Norwegian Blue']

>>> carol_birds = [3, 'French hens', 2, 'turtledoves']

>>> all_birds = [small_birds, extinct_birds, 'macaw', carol_birds]

So what does all_birds, a list of lists, look like?

>>> all_birds

[['hummingbird', 'finch'], ['dodo', 'passenger pigeon', 'Norwegian Blue'], 'macaw',

[3, 'French hens', 2, 'turtledoves']]

Let’s look at the first item in it:

>>> all_birds[0]

['hummingbird', 'finch']

The first item is a list: in fact, it’s small_birds, the first item we specified when creating all_birds. You should be able to guess what the second item is:

>>> all_birds[1]

['dodo', 'passenger pigeon', 'Norwegian Blue']

It’s the second item we specified, extinct_birds. If we want the first item of extinct_birds, we can extract it from all_birds by specifying two indexes:

>>> all_birds[1][0]

'dodo'

The [1] refers to the list that’s the second item in all_birds, whereas the [0] refers to the first item in that inner list.

Change an Item by [ offset ]

Just as you can get the value of a list item by its offset, you can change it:

>>> marxes = ['Groucho', 'Chico', 'Harpo']

>>> marxes[2] = 'Wanda'

>>> marxes

['Groucho', 'Chico', 'Wanda']

Again, the list offset needs to be a valid one for this list.

You can’t change a character in a string in this way, because strings are immutable. Lists are mutable. You can change how many items a list contains, and the items themselves.
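To see the difference, here is a small sketch of what happens when you try the same assignment on a string. The variable names are invented for the example. Because the string can't be changed, the usual workaround is to build a new string from pieces of the old one.

```python
name = 'Harpo'
error_seen = False

try:
    name[0] = 'Z'              # strings do not support item assignment
except TypeError:
    error_seen = True

print(error_seen)              # True
print(name)                    # Harpo

# Build a new string instead of changing the old one.
new_name = 'Z' + name[1:]
print(new_name)                # Zarpo
```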

Get a Slice to Extract Items by Offset Range

You can extract a subsequence of a list by using a slice:

>>> marxes = ['Groucho', 'Chico', 'Harpo']

>>> marxes[0:2]

['Groucho', 'Chico']

A slice of a list is also a list.

As with strings, slices can step by values other than one. The next example starts at the beginning and goes right by 2:

>>> marxes[::2]

['Groucho', 'Harpo']

Here, we start at the end and go left by 2:

>>> marxes[::-2]

['Harpo', 'Groucho']

And finally, the trick to reverse a list:

>>> marxes[::-1]

['Harpo', 'Chico', 'Groucho']
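Two details about slices are easy to miss, so here is a short sketch of them (the names trimmed, nothing, and backward are just for this example). Unlike a single offset, a slice that runs past the end of the list does not raise an exception. And because a slice is a new list, the reversing trick leaves the original list untouched.

```python
marxes = ['Groucho', 'Chico', 'Harpo']

# Offsets beyond the end are quietly trimmed in a slice.
trimmed = marxes[1:10]
nothing = marxes[5:]
print(trimmed)     # ['Chico', 'Harpo']
print(nothing)     # []

# A slice is a new list, so [::-1] does not reverse marxes itself.
backward = marxes[::-1]
print(backward)    # ['Harpo', 'Chico', 'Groucho']
print(marxes)      # ['Groucho', 'Chico', 'Harpo']
```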

Add an Item to the End with append()

The traditional way of adding items to a list is to append() them one by one to the end. In the previous examples, we forgot Zeppo, but that’s all right because the list is mutable, so we can add him now:

>>> marxes.append('Zeppo')

>>> marxes

['Groucho', 'Chico', 'Harpo', 'Zeppo']

Combine Lists by Using extend() or +=

You can merge one list into another by using extend(). Suppose that a well-meaning person gave us a new list of Marxes called others, and we’d like to merge them into the main marxes list:

>>> marxes = ['Groucho', 'Chico', 'Harpo', 'Zeppo']

>>> others = ['Gummo', 'Karl']

>>> marxes.extend(others)

>>> marxes

['Groucho', 'Chico', 'Harpo', 'Zeppo', 'Gummo', 'Karl']

Alternatively, you can use +=:

>>> marxes = ['Groucho', 'Chico', 'Harpo', 'Zeppo']

>>> others = ['Gummo', 'Karl']

>>> marxes += others

>>> marxes

['Groucho', 'Chico', 'Harpo', 'Zeppo', 'Gummo', 'Karl']

If we had used append(), others would have been added as a single list item rather than merging its items:

>>> marxes = ['Groucho', 'Chico', 'Harpo', 'Zeppo']

>>> others = ['Gummo', 'Karl']

>>> marxes.append(others)

>>> marxes

['Groucho', 'Chico', 'Harpo', 'Zeppo', ['Gummo', 'Karl']]

This again demonstrates that a list can contain elements of different types. In this case, four strings, and a list of two strings.
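A quick way to compare the approaches is to look at the lengths of the results. In this sketch, the names combined, nested, and merged are made up for illustration. It also shows plain +, which builds a new list and leaves both originals alone.

```python
marxes = ['Groucho', 'Chico', 'Harpo', 'Zeppo']
others = ['Gummo', 'Karl']

combined = marxes + others     # + builds a new list
print(len(combined))           # 6
print(len(marxes))             # 4, marxes is unchanged

nested = list(marxes)
nested.append(others)          # append() adds others as one item
print(len(nested))             # 5

merged = list(marxes)
merged.extend(others)          # extend() adds each item of others
print(len(merged))             # 6
```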

Add an Item by Offset with insert()

The append() function adds items only to the end of the list. When you want to add an item before any offset in the list, use insert(). Offset 0 inserts at the beginning. An offset beyond the end of the list inserts at the end, like append(), so you don’t need to worry about Python throwing an exception.

>>> marxes = ['Groucho', 'Chico', 'Harpo', 'Zeppo']

>>> marxes.insert(3, 'Gummo')

>>> marxes

['Groucho', 'Chico', 'Harpo', 'Gummo', 'Zeppo']

>>> marxes.insert(10, 'Karl')

>>> marxes

['Groucho', 'Chico', 'Harpo', 'Gummo', 'Zeppo', 'Karl']

Delete an Item by Offset with del

Our fact checkers have just informed us that Gummo was indeed one of the Marx Brothers, but Karl wasn’t. Let’s undo that last insertion:

>>> del marxes[-1]

>>> marxes

['Groucho', 'Chico', 'Harpo', 'Gummo', 'Zeppo']

When you delete an item by its position in the list, the items that follow it move back to take the deleted item’s space, and the list’s length decreases by one. If we delete 'Harpo' from the last version of the marxes list, we get this as a result:

>>> marxes = ['Groucho', 'Chico', 'Harpo', 'Gummo', 'Zeppo']

>>> marxes[2]

'Harpo'

>>> del marxes[2]

>>> marxes

['Groucho', 'Chico', 'Gummo', 'Zeppo']

>>> marxes[2]

'Gummo'

NOTE

del is a Python statement, not a list method—you don’t say marxes[-2].del(). It’s sort of the reverse of assignment (=): it detaches a name from a Python object and can free up the object’s memory if that name was the last reference to it.

Delete an Item by Value with remove()

If you’re not sure or don’t care where the item is in the list, use remove() to delete it by value. Goodbye, Gummo:

>>> marxes = ['Groucho', 'Chico', 'Harpo', 'Gummo', 'Zeppo']

>>> marxes.remove('Gummo')

>>> marxes

['Groucho', 'Chico', 'Harpo', 'Zeppo']
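Two behaviors of remove() are worth a quick sketch. If the value occurs more than once, only the first occurrence is deleted. If the value isn't in the list at all, remove() raises a ValueError, so this example tests with in first.

```python
first_names = ['Graham', 'John', 'Terry', 'Terry', 'Michael']

first_names.remove('Terry')      # removes only the first 'Terry'
print(first_names)               # ['Graham', 'John', 'Terry', 'Michael']

# 'Eric' is not in the list; calling remove('Eric') directly
# would raise ValueError, so check before removing.
if 'Eric' in first_names:
    first_names.remove('Eric')
print(first_names)               # ['Graham', 'John', 'Terry', 'Michael']
```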

Get an Item by Offset and Delete It by Using pop()

You can get an item from a list and delete it from the list at the same time by using pop(). If you call pop() with an offset, it will return the item at that offset; with no argument, it uses -1. So, pop(0) returns the head (start) of the list, and pop() or pop(-1) returns the tail (end), as shown here:

>>> marxes = ['Groucho', 'Chico', 'Harpo', 'Zeppo']

>>> marxes.pop()

'Zeppo'

>>> marxes

['Groucho', 'Chico', 'Harpo']

>>> marxes.pop(1)

'Chico'

>>> marxes

['Groucho', 'Harpo']

NOTE

It’s computing jargon time! Don’t worry, these won’t be on the final exam. If you use append() to add new items to the end and pop() to remove them from the same end, you’ve implemented a data structure known as a LIFO (last in, first out) queue. This is more commonly known as a stack. pop(0) would create a FIFO (first in, first out) queue. These are useful when you want to collect data as they arrive and work with either the oldest first (FIFO) or the newest first (LIFO).
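Here is a minimal sketch of both ideas, using made-up names. The last part uses deque from the standard collections module, which the text above does not mention; it is an alternative that is designed for removing items from the left end.

```python
from collections import deque

# Stack (LIFO): append() and pop() work on the same end.
stack = []
stack.append('first')
stack.append('second')
stack.append('third')
newest = stack.pop()
print(newest)            # third

# Queue (FIFO): add at one end, remove from the other with pop(0).
queue = ['first', 'second', 'third']
oldest = queue.pop(0)
print(oldest)            # first

# deque offers popleft() for the same job.
line = deque(['first', 'second', 'third'])
also_oldest = line.popleft()
print(also_oldest)       # first
```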

Find an Item’s Offset by Value with index()

If you want to know the offset of an item in a list by its value, use index():

>>> marxes = ['Groucho', 'Chico', 'Harpo', 'Zeppo']

>>> marxes.index('Chico')

1
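As a small sketch of the edge cases: if the value appears more than once, index() reports only the first offset, and if the value is missing it raises a ValueError. The name position and the use of -1 to mean "not found" are choices made for this example.

```python
words = ['a', 'deer', 'a', 'female', 'deer']

first_deer = words.index('deer')
print(first_deer)              # 1, the first match only

# 'doe' is missing, and words.index('doe') would raise ValueError,
# so test with in before asking for the offset.
target = 'doe'
if target in words:
    position = words.index(target)
else:
    position = -1
print(position)                # -1
```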

Test for a Value with in

The Pythonic way to check for the existence of a value in a list is using in:

>>> marxes = ['Groucho', 'Chico', 'Harpo', 'Zeppo']

>>> 'Groucho' in marxes

True

>>> 'Bob' in marxes

False

The same value may be in more than one position in the list. As long as it’s in there at least once, in will return True:

>>> words = ['a', 'deer', 'a', 'female', 'deer']

>>> 'deer' in words

True

NOTE

If you check for the existence of some value in a list often and don’t care about the order of items, a Python set is a more appropriate way to store and look up unique values. We’ll talk about sets a little later in this chapter.

Count Occurrences of a Value by Using count()

To count how many times a particular value occurs in a list, use count():

>>> marxes = ['Groucho', 'Chico', 'Harpo']

>>> marxes.count('Harpo')

1

>>> marxes.count('Bob')

0

>>> snl_skit = ['cheeseburger', 'cheeseburger', 'cheeseburger']

>>> snl_skit.count('cheeseburger')

3

Convert to a String with join()

Combine with join() discusses join() in greater detail, but here’s another example of what you can do with it:

>>> marxes = ['Groucho', 'Chico', 'Harpo']

>>> ', '.join(marxes)

'Groucho, Chico, Harpo'

But wait: you might be thinking that this seems a little backward. join() is a string method, not a list method. You can’t say marxes.join(', '), even though it seems more intuitive. The argument to join() is a string or any iterable sequence of strings (including a list), and its output is a string. If join() were just a list method, you couldn’t use it with other iterable objects such as tuples or strings. If you did want it to work with any iterable type, you’d need special code for each type to handle the actual joining. It might help to remember: join() is the opposite of split(), as demonstrated here:

>>> friends = ['Harry', 'Hermione', 'Ron']

>>> separator = ' * '

>>> joined = separator.join(friends)

>>> joined

'Harry * Hermione * Ron'

>>> separated = joined.split(separator)

>>> separated

['Harry', 'Hermione', 'Ron']

>>> separated == friends

True
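One catch is that every item handed to join() must already be a string. The following sketch reuses the carol_birds list from earlier in the chapter, which mixes integers and strings, and converts each item with str() first. The names as_strings and line are made up for the example.

```python
carol_birds = [3, 'French hens', 2, 'turtledoves']

# ' '.join(carol_birds) would raise TypeError because 3 and 2 are integers.
# Convert every item to a string first.
as_strings = [str(item) for item in carol_birds]
line = ' '.join(as_strings)
print(line)    # 3 French hens 2 turtledoves
```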

Reorder Items with sort()

You’ll often need to sort the items in a list by their values rather than their offsets. Python provides two functions:

§ The list function sort() sorts the list itself, in place.

§ The general function sorted() returns a sorted copy of the list.

If the items in the list are numeric, they’re sorted by default in ascending numeric order. If they’re strings, they’re sorted in alphabetical order:

>>> marxes = ['Groucho', 'Chico', 'Harpo']

>>> sorted_marxes = sorted(marxes)

>>> sorted_marxes

['Chico', 'Groucho', 'Harpo']

sorted_marxes is a copy, and creating it did not change the original list:

>>> marxes

['Groucho', 'Chico', 'Harpo']

But, calling the list function sort() on the marxes list does change marxes:

>>> marxes.sort()

>>> marxes

['Chico', 'Groucho', 'Harpo']

If the elements of your list are all of the same type (such as strings in marxes), sort() will work correctly. You can sometimes even mix types—for example, integers and floats—because they are automatically converted to one another by Python in expressions:

>>> numbers = [2, 1, 4.0, 3]

>>> numbers.sort()

>>> numbers

[1, 2, 3, 4.0]

The default sort order is ascending, but you can add the argument reverse=True to set it to descending:

>>> numbers = [2, 1, 4.0, 3]

>>> numbers.sort(reverse=True)

>>> numbers

[4.0, 3, 2, 1]
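Two further points, sketched with made-up names. First, "alphabetical" order is case sensitive: uppercase letters sort before lowercase ones unless you pass a key function, an optional argument that this section has not covered. Second, in Python 3, numbers and strings can't be compared with each other, so sorting a list that mixes them raises a TypeError.

```python
marxes = ['groucho', 'Chico', 'Harpo']

# Uppercase letters sort before lowercase ones by default.
default_order = sorted(marxes)
print(default_order)     # ['Chico', 'Harpo', 'groucho']

# key= names a function applied to each item before comparing.
folded_order = sorted(marxes, key=str.lower)
print(folded_order)      # ['Chico', 'groucho', 'Harpo']

# Numbers and strings can't be compared with each other in Python 3.
mixed = [2, 'one', 3.0]
sort_failed = False
try:
    mixed.sort()
except TypeError:
    sort_failed = True
print(sort_failed)       # True
```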

Get Length by Using len()

len() returns the number of items in a list:

>>> marxes = ['Groucho', 'Chico', 'Harpo']

>>> len(marxes)

3

Assign with =, Copy with copy()

When you assign one list to more than one variable, changing the list in one place also changes it in the other, as illustrated here:

>>> a = [1, 2, 3]

>>> a

[1, 2, 3]

>>> b = a

>>> b

[1, 2, 3]

>>> a[0] = 'surprise'

>>> a

['surprise', 2, 3]

So what’s in b now? Is it still [1, 2, 3], or ['surprise', 2, 3]? Let’s see:

>>> b

['surprise', 2, 3]

Remember the sticky note analogy in Chapter 2? b just refers to the same list object as a; therefore, whether we change the list contents by using the name a or b, it’s reflected in both:

>>> b

['surprise', 2, 3]

>>> b[0] = 'I hate surprises'

>>> b

['I hate surprises', 2, 3]

>>> a

['I hate surprises', 2, 3]

You can copy the values of a list to an independent, fresh list by using any of these methods:

§ The list copy() function

§ The list() conversion function

§ The list slice [:]

Our original list will be a again. We’ll make b with the list copy() function, c with the list() conversion function, and d with a list slice:

>>> a = [1, 2, 3]

>>> b = a.copy()

>>> c = list(a)

>>> d = a[:]

Again, b, c, and d are copies of a: they are new objects with their own values and no connection to the original list object [1, 2, 3] to which a refers. Changing a does not affect the copies b, c, and d:

>>> a[0] = 'integer lists are boring'

>>> a

['integer lists are boring', 2, 3]

>>> b

[1, 2, 3]

>>> c

[1, 2, 3]

>>> d

[1, 2, 3]
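One caveat that the integer example doesn't show: these three techniques copy only the outer list. If the list contains other lists, the copy shares those inner lists with the original. The sketch below uses deepcopy() from the standard copy module, which is not covered in this section, to make a fully independent copy.

```python
import copy

a = [1, [8, 9]]
b = a.copy()             # new outer list, same inner list
c = copy.deepcopy(a)     # new outer list and new inner list

a[1][0] = 'changed'      # change the inner list through a

print(b)    # [1, ['changed', 9]]
print(c)    # [1, [8, 9]]
```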

Tuples

Similar to lists, tuples are sequences of arbitrary items. Unlike lists, tuples are immutable, meaning you can’t add, delete, or change items after the tuple is defined. So, a tuple is similar to a constant list.

Create a Tuple by Using ()

The syntax to make tuples is a little inconsistent, as we’ll demonstrate in the examples that follow.

Let’s begin by making an empty tuple using ():

>>> empty_tuple = ()

>>> empty_tuple

()

To make a tuple with one or more elements, follow each element with a comma. This works for one-element tuples:

>>> one_marx = 'Groucho',

>>> one_marx

('Groucho',)

If you have more than one element, follow all but the last one with a comma:

>>> marx_tuple = 'Groucho', 'Chico', 'Harpo'

>>> marx_tuple

('Groucho', 'Chico', 'Harpo')

Python includes parentheses when echoing a tuple. You don’t need them—it’s the trailing commas that really define a tuple—but using parentheses doesn’t hurt. You can use them to enclose the values, which helps to make the tuple more visible:

>>> marx_tuple = ('Groucho', 'Chico', 'Harpo')

>>> marx_tuple

('Groucho', 'Chico', 'Harpo')

Tuples let you assign multiple variables at once:

>>> marx_tuple = ('Groucho', 'Chico', 'Harpo')

>>> a, b, c = marx_tuple

>>> a

'Groucho'

>>> b

'Chico'

>>> c

'Harpo'

This is sometimes called tuple unpacking.

You can use tuples to exchange values in one statement without using a temporary variable:

>>> password = 'swordfish'

>>> icecream = 'tuttifrutti'

>>> password, icecream = icecream, password

>>> password

'tuttifrutti'

>>> icecream

'swordfish'

>>>

The tuple() conversion function makes tuples from other things:

>>> marx_list = ['Groucho', 'Chico', 'Harpo']

>>> tuple(marx_list)

('Groucho', 'Chico', 'Harpo')

Tuples versus Lists

You can often use tuples in place of lists, but they have many fewer functions—there is no append(), insert(), and so on—because they can’t be modified after creation. Why not just use lists instead of tuples everywhere?

§ Tuples use less space.

§ You can’t clobber tuple items by mistake.

§ You can use tuples as dictionary keys (see the next section).

§ Named tuples (see Named Tuples) can be a simple alternative to objects.

§ Function arguments are passed as tuples (see Functions).
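Two of these points can be sketched briefly. The board dictionary and its contents are invented for the example, and dictionaries themselves are explained in the next section.

```python
marx_tuple = ('Groucho', 'Chico', 'Harpo')

# Assigning to an item of a tuple raises TypeError.
clobbered = True
try:
    marx_tuple[2] = 'Wanda'
except TypeError:
    clobbered = False
print(clobbered)          # False
print(marx_tuple)         # ('Groucho', 'Chico', 'Harpo')

# A tuple can be a dictionary key; a list cannot.
board = {(0, 0): 'rook', (0, 1): 'knight'}
print(board[(0, 1)])      # knight
```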

I won’t go into much more detail about tuples here. In everyday programming, you’ll use lists and dictionaries more. Which is a perfect segue to…

Dictionaries

A dictionary is similar to a list, but the order of items doesn’t matter, and they aren’t selected by an offset such as 0 or 1. Instead, you specify a unique key to associate with each value. This key is often a string, but it can actually be any of Python’s immutable types: boolean, integer, float, tuple, string, and others that you’ll see in later chapters. Dictionaries are mutable, so you can add, delete, and change their key-value elements.

If you’ve worked with languages that support only arrays or lists, you’ll love dictionaries.

NOTE

In other languages, dictionaries might be called associative arrays, hashes, or hashmaps. In Python, a dictionary is also called a dict to save syllables.

Create with {}

To create a dictionary, you place curly brackets ({}) around comma-separated key : value pairs. The simplest dictionary is an empty one, containing no keys or values at all:

>>> empty_dict = {}

>>> empty_dict

{}

Let’s make a small dictionary with quotes from Ambrose Bierce’s The Devil’s Dictionary:

>>> bierce = {

... "day": "A period of twenty-four hours, mostly misspent",

... "positive": "Mistaken at the top of one's voice",

... "misfortune": "The kind of fortune that never misses",

... }

>>>

Typing the dictionary’s name in the interactive interpreter will print its keys and values:

>>> bierce

{'misfortune': 'The kind of fortune that never misses',

'positive': "Mistaken at the top of one's voice",

'day': 'A period of twenty-four hours, mostly misspent'}

NOTE

In Python, it’s okay to leave a comma after the last item of a list, tuple, or dictionary. Also, you don’t need to indent, as I did in the preceding example, when you’re typing keys and values within the curly braces. It just helps readability.

Convert by Using dict()

You can use the dict() function to convert two-value sequences into a dictionary. (You might run into such key-value sequences at times, such as “Strontium, 90, Carbon, 14”, or “Vikings, 20, Packers, 7”.) The first item in each sequence is used as the key and the second as the value.

First, here’s a small example using lol (a list of two-item lists):

>>> lol = [ ['a', 'b'], ['c', 'd'], ['e', 'f'] ]

>>> dict(lol)

{'c': 'd', 'a': 'b', 'e': 'f'}

NOTE

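Remember that the examples here show keys in an arbitrary order, which is how Python behaved when this was written. Since Python 3.7, a dictionary keeps its keys in the order in which you added them, so your output might list them in a different order than shown.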

We could have used any sequence containing two-item sequences. Here are other examples.

A list of two-item tuples:

>>> lot = [ ('a', 'b'), ('c', 'd'), ('e', 'f') ]

>>> dict(lot)

{'c': 'd', 'a': 'b', 'e': 'f'}

A tuple of two-item lists:

>>> tol = ( ['a', 'b'], ['c', 'd'], ['e', 'f'] )

>>> dict(tol)

{'c': 'd', 'a': 'b', 'e': 'f'}

A list of two-character strings:

>>> los = [ 'ab', 'cd', 'ef' ]

>>> dict(los)

{'c': 'd', 'a': 'b', 'e': 'f'}

A tuple of two-character strings:

>>> tos = ( 'ab', 'cd', 'ef' )

>>> dict(tos)

{'c': 'd', 'a': 'b', 'e': 'f'}

The section Iterate Multiple Sequences with zip() introduces you to a function called zip() that makes it easy to create these two-item sequences.
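As a brief preview, and only a sketch with made-up variable names, zip() can pair up two separate lists so that dict() can turn them into a dictionary:

```python
elements = ['Strontium', 'Carbon']
numbers = [90, 14]

# zip() pairs the first items together, then the second items.
pairs = list(zip(elements, numbers))
print(pairs)                 # [('Strontium', 90), ('Carbon', 14)]

# dict() accepts those two-item tuples directly.
isotopes = dict(zip(elements, numbers))
print(isotopes['Carbon'])    # 14
```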

Add or Change an Item by [ key ]

Adding an item to a dictionary is easy. Just refer to the item by its key and assign a value. If the key was already present in the dictionary, the existing value is replaced by the new one. If the key is new, it’s added to the dictionary with its value. Unlike lists, you don’t need to worry about Python throwing an exception during assignment by specifying an index that’s out of range.

Let’s make a dictionary of most of the members of Monty Python, using their last names as keys, and first names as values:

>>> pythons = {

... 'Chapman': 'Graham',

... 'Cleese': 'John',

... 'Idle': 'Eric',

... 'Jones': 'Terry',

... 'Palin': 'Michael',

... }

>>> pythons

{'Cleese': 'John', 'Jones': 'Terry', 'Palin': 'Michael',

'Chapman': 'Graham', 'Idle': 'Eric'}

We’re missing one member: the one born in America, Terry Gilliam. Here’s an attempt by an anonymous programmer to add him, but he’s botched the first name:

>>> pythons['Gilliam'] = 'Gerry'

>>> pythons

{'Cleese': 'John', 'Gilliam': 'Gerry', 'Palin': 'Michael',

'Chapman': 'Graham', 'Idle': 'Eric', 'Jones': 'Terry'}

And here’s some repair code by another programmer who is Pythonic in more than one way:

>>> pythons['Gilliam'] = 'Terry'

>>> pythons

{'Cleese': 'John', 'Gilliam': 'Terry', 'Palin': 'Michael',

'Chapman': 'Graham', 'Idle': 'Eric', 'Jones': 'Terry'}

By using the same key ('Gilliam'), we replaced the original value 'Gerry' with 'Terry'.

Remember that dictionary keys must be unique. That’s why we used last names for keys instead of first names here—two members of Monty Python have the first name Terry! If you use a key more than once, the last value wins:

>>> some_pythons = {

... 'Graham': 'Chapman',

... 'John': 'Cleese',

... 'Eric': 'Idle',

... 'Terry': 'Gilliam',

... 'Michael': 'Palin',

... 'Terry': 'Jones',

... }

>>> some_pythons

{'Terry': 'Jones', 'Eric': 'Idle', 'Graham': 'Chapman',

'John': 'Cleese', 'Michael': 'Palin'}

We first assigned the value 'Gilliam' to the key 'Terry' and then replaced it with the value 'Jones'.

Combine Dictionaries with update()

You can use the update() function to copy the keys and values of one dictionary into another.

Let’s define the pythons dictionary, with all members:

>>> pythons = {

... 'Chapman': 'Graham',

... 'Cleese': 'John',

... 'Gilliam': 'Terry',

... 'Idle': 'Eric',

... 'Jones': 'Terry',

... 'Palin': 'Michael',

... }

>>> pythons

{'Cleese': 'John', 'Gilliam': 'Terry', 'Palin': 'Michael',

'Chapman': 'Graham', 'Idle': 'Eric', 'Jones': 'Terry'}

We also have a dictionary of other humorous persons called others:

>>> others = { 'Marx': 'Groucho', 'Howard': 'Moe' }

Now, along comes another anonymous programmer who thinks the members of others should be members of Monty Python:

>>> pythons.update(others)

>>> pythons

{'Cleese': 'John', 'Howard': 'Moe', 'Gilliam': 'Terry',

'Palin': 'Michael', 'Marx': 'Groucho', 'Chapman': 'Graham',

'Idle': 'Eric', 'Jones': 'Terry'}

What happens if the second dictionary has the same key as the dictionary into which it’s being merged? The value from the second dictionary wins:

>>> first = {'a': 1, 'b': 2}

>>> second = {'b': 'platypus'}

>>> first.update(second)

>>> first

{'b': 'platypus', 'a': 1}
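Note that update() changes the first dictionary in place. If you want to keep the original intact, one approach is to update a copy instead. This sketch uses the dictionary copy() function, which appears later in this chapter; the name merged is made up for the example.

```python
first = {'a': 1, 'b': 2}
second = {'b': 'platypus'}

merged = first.copy()      # work on a copy, not on first itself
merged.update(second)

print(merged == {'a': 1, 'b': 'platypus'})   # True
print(first == {'a': 1, 'b': 2})             # True, first is unchanged
```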

Delete an Item by Key with del

Our anonymous programmer’s code was correct—technically. But, he shouldn’t have done it! The members of others, although funny and famous, were not in Monty Python. Let’s undo those last two additions:

>>> del pythons['Marx']

>>> pythons

{'Cleese': 'John', 'Howard': 'Moe', 'Gilliam': 'Terry',

'Palin': 'Michael', 'Chapman': 'Graham', 'Idle': 'Eric',

'Jones': 'Terry'}

>>> del pythons['Howard']

>>> pythons

{'Cleese': 'John', 'Gilliam': 'Terry', 'Palin': 'Michael',

'Chapman': 'Graham', 'Idle': 'Eric', 'Jones': 'Terry'}

Delete All Items by Using clear()

To delete all keys and values from a dictionary, use clear() or just reassign an empty dictionary ({}) to the name:

>>> pythons.clear()

>>> pythons

{}

>>> pythons = {}

>>> pythons

{}
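The two approaches differ when another name refers to the same dictionary. In this sketch, with an invented second name alias, clear() empties the one dictionary that both names share, whereas reassigning {} only points the first name at a new, empty dictionary.

```python
pythons = {'Cleese': 'John'}
alias = pythons

pythons.clear()            # empties the dictionary both names refer to
after_clear = dict(alias)
print(after_clear)         # {}

pythons = {'Idle': 'Eric'}
alias = pythons

pythons = {}               # pythons now names a new, empty dictionary
print(alias)               # {'Idle': 'Eric'}
print(pythons)             # {}
```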

Test for a Key by Using in

If you want to know whether a key exists in a dictionary, use in. Let’s redefine the pythons dictionary again, this time omitting a name or two:

>>> pythons = {'Chapman': 'Graham', 'Cleese': 'John',

... 'Jones': 'Terry', 'Palin': 'Michael'}

Now let’s see who’s in there:

>>> 'Chapman' in pythons

True

>>> 'Palin' in pythons

True

Did we remember to add Terry Gilliam this time?

>>> 'Gilliam' in pythons

False

Drat.

Get an Item by [ key ]

This is the most common use of a dictionary. You specify the dictionary and key to get the corresponding value:

>>> pythons['Cleese']

'John'

If the key is not present in the dictionary, you’ll get an exception:

>>> pythons['Marx']

Traceback (most recent call last):

  File "<stdin>", line 1, in <module>

KeyError: 'Marx'

There are two good ways to avoid this. The first is to test for the key at the outset by using in, as you saw in the previous section:

>>> 'Marx' in pythons

False

The second is to use the special dictionary get() function. You provide the dictionary, key, and an optional value. If the key exists, you get its value:

>>> pythons.get('Cleese')

'John'

If not, you get the optional value, if you specified one:

>>> pythons.get('Marx', 'Not a Python')

'Not a Python'

Otherwise, you get None (which displays nothing in the interactive interpreter):

>>> pythons.get('Marx')

>>>
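A common use of get() with a default value is counting. The sketch below, with made-up names, tallies how often each word occurs; it previews the for loop from the next chapter. Because get() returns 0 for a word that hasn't been seen yet, no KeyError can occur.

```python
words = ['a', 'deer', 'a', 'female', 'deer']

counts = {}
for word in words:
    # Start from 0 the first time a word appears.
    counts[word] = counts.get(word, 0) + 1

print(counts['deer'])          # 2
print(counts.get('doe', 0))    # 0
```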

Get All Keys by Using keys()

You can use keys() to get all the keys in a dictionary. We’ll use a different sample dictionary for the next few examples:

>>> signals = {'green': 'go', 'yellow': 'go faster', 'red': 'smile for the camera'}

>>> signals.keys()

dict_keys(['green', 'red', 'yellow'])

NOTE

In Python 2, keys() just returns a list. Python 3 returns dict_keys(), which is an iterable view of the keys. This is handy with large dictionaries because it doesn’t use the time and memory to create and store a list that you might not use. But often you actually do want a list. In Python 3, you need to call list() to convert a dict_keys object to a list.

>>> list( signals.keys() )

['green', 'red', 'yellow']

In Python 3, you also need to use the list() function to turn the results of values() and items() into normal Python lists. I’m using that in these examples.

Get All Values by Using values()

To obtain all the values in a dictionary, use values():

>>> list( signals.values() )

['go', 'smile for the camera', 'go faster']

Get All Key-Value Pairs by Using items()

When you want to get all the key-value pairs from a dictionary, use the items() function:

>>> list( signals.items() )

[('green', 'go'), ('red', 'smile for the camera'), ('yellow', 'go faster')]

Each key and value is returned as a tuple, such as ('green', 'go').

Assign with =, Copy with copy()

As with lists, if you make a change to a dictionary, it will be reflected in all the names that refer to it.

>>> signals = {'green': 'go', 'yellow': 'go faster', 'red': 'smile for the camera'}

>>> save_signals = signals

>>> signals['blue'] = 'confuse everyone'

>>> save_signals

{'blue': 'confuse everyone', 'green': 'go',

'red': 'smile for the camera', 'yellow': 'go faster'}

To actually copy keys and values from a dictionary to another dictionary and avoid this, you can use copy():

>>> signals = {'green': 'go', 'yellow': 'go faster', 'red': 'smile for the camera'}

>>> original_signals = signals.copy()

>>> signals['blue'] = 'confuse everyone'

>>> signals

{'blue': 'confuse everyone', 'green': 'go',

'red': 'smile for the camera', 'yellow': 'go faster'}

>>> original_signals

{'green': 'go', 'red': 'smile for the camera', 'yellow': 'go faster'}

Sets

A set is like a dictionary with its values thrown away, leaving only the keys. As with a dictionary, each key must be unique. You use a set when you only want to know that something exists, and nothing else about it. Use a dictionary if you want to attach some information to the key as a value.

At some bygone time, in some places, set theory was taught in elementary school along with basic mathematics. If your school skipped it (or covered it and you were staring out the window as I often did), Figure 3-1 shows the ideas of union and intersection.

Suppose that you take the union of two sets that have some keys in common. Because a set must contain only one of each item, the union of two sets will contain only one of each key. The null or empty set is a set with zero elements. In Figure 3-1, an example of a null set would be female names beginning with X.

Figure 3-1. Common things to do with sets

Create with set()

To create a set, you use the set() function or enclose one or more comma-separated values in curly brackets, as shown here:

>>> empty_set = set()

>>> empty_set

set()

>>> even_numbers = {0, 2, 4, 6, 8}

>>> even_numbers

{0, 8, 2, 4, 6}

>>> odd_numbers = {1, 3, 5, 7, 9}

>>> odd_numbers

{9, 3, 1, 5, 7}

Sets are unordered, so the interpreter might print the items in a different order than you typed them.

NOTE

Because [] creates an empty list, you might expect {} to create an empty set. Instead, {} creates an empty dictionary. That’s also why the interpreter prints an empty set as set() instead of {}. Why? Dictionaries were in Python first and took possession of the curly brackets.
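You can confirm this with the built-in type() function, as in this short sketch (the variable names are made up for the example):

```python
curly = {}
empty_set = set()
small_set = {1, 2}

print(type(curly))        # <class 'dict'>
print(type(empty_set))    # <class 'set'>
print(type(small_set))    # <class 'set'>
```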

Convert from Other Data Types with set()

You can create a set from a list, string, tuple, or dictionary, discarding any duplicate values.

First, let’s take a look at a string with more than one occurrence of some letters:

>>> set( 'letters' )

{'l', 'e', 't', 'r', 's'}

Notice that the set contains only one 'e' and one 't', even though 'letters' contained two of each.

Now, let’s make a set from a list:

>>> set( ['Dasher', 'Dancer', 'Prancer', 'Mason-Dixon'] )

{'Dancer', 'Dasher', 'Prancer', 'Mason-Dixon'}

This time, a set from a tuple:

>>> set( ('Ummagumma', 'Echoes', 'Atom Heart Mother') )

{'Ummagumma', 'Atom Heart Mother', 'Echoes'}

When you give set() a dictionary, it uses only the keys:

>>> set( {'apple': 'red', 'orange': 'orange', 'cherry': 'red'} )

{'apple', 'cherry', 'orange'}
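Because set() discards duplicates, it is a handy way to count or collect the distinct values in a list. The following is only a sketch; the reindeer list with repeats is invented for the example:

```python
# A hypothetical list with some repeated names.
reindeer = ['Dasher', 'Dancer', 'Dasher', 'Prancer', 'Dancer']

# set() keeps one copy of each value; the order of the result is not guaranteed.
unique = set(reindeer)
print(len(reindeer), len(unique))   # 5 3

# sorted() returns a list, so the result prints in a predictable order.
print(sorted(unique))               # ['Dancer', 'Dasher', 'Prancer']
```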

Test for Value by Using in

This is the most common use of a set. We’ll make a dictionary called drinks. Each key is the name of a mixed drink, and the corresponding value is a set of its ingredients:

>>> drinks = {

... 'martini': {'vodka', 'vermouth'},

... 'black russian': {'vodka', 'kahlua'},

... 'white russian': {'cream', 'kahlua', 'vodka'},

... 'manhattan': {'rye', 'vermouth', 'bitters'},

... 'screwdriver': {'orange juice', 'vodka'}

... }

Even though both are enclosed by curly braces ({ and }), a set is just a collection of values, and a dictionary is one or more key : value pairs.

Which drinks contain vodka? (Note that I’m previewing the use of for, if, and, and or from the next chapter for these tests.)

>>> for name, contents in drinks.items():

...     if 'vodka' in contents:

...         print(name)

...

screwdriver

martini

black russian

white russian

We want something with vodka but are lactose intolerant, and think vermouth tastes like kerosene:

>>> for name, contents in drinks.items():

...     if 'vodka' in contents and not ('vermouth' in contents or

...         'cream' in contents):

...         print(name)

...

screwdriver

black russian

We’ll rewrite this a bit more succinctly in the next section.

Combinations and Operators

What if you want to check for combinations of set values? Suppose that you want to find any drink that has orange juice or vermouth. We’ll use the set intersection operator, which is an ampersand (&):

>>> for name, contents in drinks.items():

...     if contents & {'vermouth', 'orange juice'}:

...         print(name)

...

screwdriver

martini

manhattan

The result of the & operator is a set, which contains all the items that appear in both of the sets that you compare. If neither of those ingredients is in contents, the & returns an empty set, which is considered False.
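To make the true-or-false part concrete, here is a small sketch that looks at one drink on its own. The names wanted and unwanted are invented for this example:

```python
martini = {'vodka', 'vermouth'}
wanted = {'vermouth', 'orange juice'}
unwanted = {'cream', 'kahlua'}

# One ingredient in common: a non-empty set, which counts as True.
print(martini & wanted)           # {'vermouth'}
print(bool(martini & wanted))     # True

# Nothing in common: an empty set, which counts as False.
print(martini & unwanted)         # set()
print(bool(martini & unwanted))   # False
```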

Now, let’s rewrite the example from the previous section, in which we wanted vodka but neither cream nor vermouth:

>>> for name, contents in drinks.items():

...     if 'vodka' in contents and not contents & {'vermouth', 'cream'}:

...         print(name)

...

screwdriver

black russian
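As an aside that this chapter doesn’t otherwise cover, sets also have an isdisjoint() method, which returns True when two sets share no members. The sketch below is one possible way to write the same test with it; the matches list is added only so that the result can be inspected:

```python
# The same drinks dictionary used throughout this section.
drinks = {
    'martini': {'vodka', 'vermouth'},
    'black russian': {'vodka', 'kahlua'},
    'white russian': {'cream', 'kahlua', 'vodka'},
    'manhattan': {'rye', 'vermouth', 'bitters'},
    'screwdriver': {'orange juice', 'vodka'},
}

matches = []
for name, contents in drinks.items():
    # isdisjoint() is True when contents has neither vermouth nor cream.
    if 'vodka' in contents and contents.isdisjoint({'vermouth', 'cream'}):
        matches.append(name)

print(sorted(matches))   # ['black russian', 'screwdriver']
```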

Let’s save the ingredient sets for these two drinks in variables, just to save typing in the coming examples:

>>> bruss = drinks['black russian']

>>> wruss = drinks['white russian']

The following are examples of all the set operators. Some have special punctuation, some have special functions, and some have both. We’ll use test sets a (contains 1 and 2) and b (contains 2 and 3):

>>> a = {1, 2}

>>> b = {2, 3}

You get the intersection (members common to both sets) with the special punctuation symbol & or the set intersection() function, as demonstrated here:

>>> a & b

{2}

>>> a.intersection(b)

{2}

This snippet uses our saved drink variables:

>>> bruss & wruss

{'kahlua', 'vodka'}

In this example, you get the union (members of either set) by using | or the set union() function:

>>> a | b

{1, 2, 3}

>>> a.union(b)

{1, 2, 3}

And here’s the alcoholic version:

>>> bruss | wruss

{'cream', 'kahlua', 'vodka'}

The difference (members of the first set but not the second) is obtained by using the character - or difference():

>>> a - b

{1}

>>> a.difference(b)

{1}

>>> bruss - wruss

set()

>>> wruss - bruss

{'cream'}

By far, the most common set operations are union, intersection, and difference. I’ve included the others for completeness in the examples that follow, but you might never use them.

The exclusive or (items in one set or the other, but not both) uses ^ or symmetric_difference():

>>> a ^ b

{1, 3}

>>> a.symmetric_difference(b)

{1, 3}

This finds the exclusive ingredient in our two russian drinks:

>>> bruss ^ wruss

{'cream'}

You can check whether one set is a subset of another (all members of the first set are also in the second set) by using <= or issubset():

>>> a <= b

False

>>> a.issubset(b)

False

Adding cream to a black russian makes a white russian, so wruss is a superset of bruss:

>>> bruss <= wruss

True

Is any set a subset of itself? Yup.

>>> a <= a

True

>>> a.issubset(a)

True

To be a proper subset, the second set needs to have all the members of the first and more. Calculate it by using <:

>>> a < b

False

>>> a < a

False

>>> bruss < wruss

True

A superset is the opposite of a subset (all members of the second set are also members of the first). This uses >= or issuperset():

>>> a >= b

False

>>> a.issuperset(b)

False

>>> wruss >= bruss

True

Any set is a superset of itself:

>>> a >= a

True

>>> a.issuperset(a)

True

And finally, you can find a proper superset (the first set has all members of the second, and more) by using >:

>>> a > b

False

>>> wruss > bruss

True

You can’t be a proper superset of yourself:

>>> a > a

False
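One difference between the punctuation and the functions is worth knowing: the functions accept any iterable (such as a list), whereas the operators expect a set on both sides. Here is a short sketch of that behavior; the flag operator_failed is a name made up for this example:

```python
a = {1, 2}

# The functions accept any iterable, such as a list.
print(a.union([2, 3]))         # {1, 2, 3}
print(a.issubset([1, 2, 3]))   # True

# The operators insist on sets, so mixing in a list raises a TypeError.
operator_failed = False
try:
    a | [2, 3]
except TypeError:
    operator_failed = True

print(operator_failed)         # True
```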

Compare Data Structures

To review: you make a list by using square brackets ([]), a tuple by using commas, and a dictionary by using curly brackets ({}). In each case, you access a single element with square brackets:

>>> marx_list = ['Groucho', 'Chico', 'Harpo']

>>> marx_tuple = 'Groucho', 'Chico', 'Harpo'

>>> marx_dict = {'Groucho': 'banjo', 'Chico': 'piano', 'Harpo': 'harp'}

>>> marx_list[2]

'Harpo'

>>> marx_tuple[2]

'Harpo'

>>> marx_dict['Harpo']

'harp'

For the list and tuple, the value between the square brackets is an integer offset. For the dictionary, it’s a key. For all three, the result is a value.
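Sets are the odd one out here. They have no offsets and no keys, so square brackets don’t work on them; you test for membership with in instead. A minimal sketch, using a hypothetical marx_set that is not part of the examples above:

```python
marx_set = {'Groucho', 'Chico', 'Harpo'}

# A set has no order, so asking for "element 2" raises a TypeError.
indexing_failed = False
try:
    marx_set[2]
except TypeError:
    indexing_failed = True

print(indexing_failed)       # True

# Membership testing is what sets are for.
print('Harpo' in marx_set)   # True
```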

Make Bigger Data Structures

We worked up from simple booleans, numbers, and strings to lists, tuples, sets, and dictionaries. You can combine these built-in data structures into bigger, more complex structures of your own. Let’s start with three different lists:

>>> marxes = ['Groucho', 'Chico', 'Harpo']

>>> pythons = ['Chapman', 'Cleese', 'Gilliam', 'Jones', 'Palin']

>>> stooges = ['Moe', 'Curly', 'Larry']

We can make a tuple that contains each list as an element:

>>> tuple_of_lists = marxes, pythons, stooges

>>> tuple_of_lists

(['Groucho', 'Chico', 'Harpo'],

['Chapman', 'Cleese', 'Gilliam', 'Jones', 'Palin'],

['Moe', 'Curly', 'Larry'])

And, we can make a list that contains the three lists:

>>> list_of_lists = [marxes, pythons, stooges]

>>> list_of_lists

[['Groucho', 'Chico', 'Harpo'],

['Chapman', 'Cleese', 'Gilliam', 'Jones', 'Palin'],

['Moe', 'Curly', 'Larry']]

Finally, let’s create a dictionary of lists. In this example, let’s use the name of the comedy group as the key and the list of members as the value:

>>> dict_of_lists = {'Marxes': marxes, 'Pythons': pythons, 'Stooges': stooges}

>>> dict_of_lists

{'Stooges': ['Moe', 'Curly', 'Larry'],

'Marxes': ['Groucho', 'Chico', 'Harpo'],

'Pythons': ['Chapman', 'Cleese', 'Gilliam', 'Jones', 'Palin']}
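To reach a single name inside any of these structures, you chain square brackets: the first pair picks the inner list, and the second pair picks the element within it. This sketch rebuilds the three structures so that it can run on its own:

```python
marxes = ['Groucho', 'Chico', 'Harpo']
pythons = ['Chapman', 'Cleese', 'Gilliam', 'Jones', 'Palin']
stooges = ['Moe', 'Curly', 'Larry']

tuple_of_lists = marxes, pythons, stooges
list_of_lists = [marxes, pythons, stooges]
dict_of_lists = {'Marxes': marxes, 'Pythons': pythons, 'Stooges': stooges}

# First index chooses the inner list; second index chooses the name.
print(tuple_of_lists[0][2])          # Harpo
print(list_of_lists[1][0])           # Chapman
print(dict_of_lists['Stooges'][1])   # Curly
```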

Your only limitations are those in the data types themselves. For example, dictionary keys need to be immutable, so a list, dictionary, or set can’t be a key for another dictionary. But a tuple can be. For example, you could index sites of interest by GPS coordinates (latitude, longitude, and altitude; see Maps for more mapping examples):

>>> houses = {

...     (44.79, -93.14, 285): 'My House',

...     (38.89, -77.03, 13): 'The White House'

...     }
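Here is a sketch of both halves of that rule: looking up a value by its tuple key, and what happens if you try a list as a key instead. The flag list_key_failed is a name invented for this example:

```python
houses = {
    (44.79, -93.14, 285): 'My House',
    (38.89, -77.03, 13): 'The White House',
}

# The whole tuple is the key.
print(houses[(38.89, -77.03, 13)])   # The White House

# A list is mutable, so Python refuses to use it as a key.
list_key_failed = False
try:
    houses[[44.79, -93.14, 285]] = 'Still My House'
except TypeError:
    list_key_failed = True

print(list_key_failed)               # True
```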

Things to Do

In this chapter, you saw more complex data structures: lists, tuples, dictionaries, and sets. Using these and those from Chapter 2 (numbers and strings), you can represent elements in the real world with great variety.

3.1. Create a list called years_list, starting with the year of your birth, and each year thereafter until the year of your fifth birthday. For example, if you were born in 1980, the list would be years_list = [1980, 1981, 1982, 1983, 1984, 1985].

If you’re less than five years old and reading this book, I don’t know what to tell you.

3.2. In which year in years_list was your third birthday? Remember, you were 0 years of age for your first year.

3.3. In which year in years_list were you the oldest?

3.4. Make a list called things with these three strings as elements: "mozzarella", "cinderella", "salmonella".

3.5. Capitalize the element in things that refers to a person and then print the list. Did it change the element in the list?

3.6. Make the cheesy element of things all uppercase and then print the list.

3.7. Delete the disease element from things, collect your Nobel Prize, and print the list.

3.8. Create a list called surprise with the elements "Groucho", "Chico", and "Harpo".

3.9. Lowercase the last element of the surprise list, reverse it, and then capitalize it.

3.10. Make an English-to-French dictionary called e2f and print it. Here are your starter words: dog is chien, cat is chat, and walrus is morse.

3.11. Using your three-word dictionary e2f, print the French word for walrus.

3.12. Make a French-to-English dictionary called f2e from e2f. Use the items method.

3.13. Using f2e, print the English equivalent of the French word chien.

3.14. Make and print a set of English words from the keys in e2f.

3.15. Make a multilevel dictionary called life. Use these strings for the topmost keys: 'animals', 'plants', and 'other'. Make the 'animals' key refer to another dictionary with the keys 'cats', 'octopi', and 'emus'. Make the 'cats' key refer to a list of strings with the values 'Henri', 'Grumpy', and 'Lucy'. Make all the other keys refer to empty dictionaries.

3.16. Print the top-level keys of life.

3.17. Print the keys for life['animals'].

3.18. Print the values for life['animals']['cats'].