Writing Idiomatic Python (2013)

4. Control Structures and Functions

4.1 If Statements

4.1.1 Avoid placing conditional branch code on the same line as the colon

Using indentation to indicate scope (like you already do everywhere else in Python) makes it easy to determine what will be executed as part of a conditional statement. if, elif, and else statements should always be on their own line. No code should follow the :.

4.1.1.1 Harmful

name = 'Jeff'

address = 'New York, NY'

if name: print(name)

print(address)

4.1.1.2 Idiomatic

name = 'Jeff'
address = 'New York, NY'

if name:
    print(name)
print(address)

4.1.2 Avoid repeating variable name in compound if statement

When you want to check a variable against a number of values, repeatedly listing the variable being checked is unnecessarily verbose. Testing membership in an iterable with in makes the code clearer and easier to read.

4.1.2.1 Harmful

is_generic_name = False
name = 'Tom'

if name == 'Tom' or name == 'Dick' or name == 'Harry':
    is_generic_name = True

4.1.2.2 Idiomatic

name = 'Tom'

is_generic_name = name in ('Tom', 'Dick', 'Harry')

4.1.3 Avoid comparing directly to True, False, or None

For any object, be it a built-in or user defined, there is a “truthiness” associated with the object. When checking if a condition is true, prefer relying on the implicit “truthiness” of the object in the conditional statement. The rules regarding “truthiness” are reasonably straightforward. All of the following are considered False:

· None

· False

· zero for numeric types

· empty sequences

· empty dictionaries

· a value of 0 or False returned when either __len__ or __nonzero__ (named __bool__ in Python 3) is called

Everything else is considered True (and thus most things are implicitly True). The last condition for determining False, by checking the value returned by __len__ or __nonzero__ (__bool__ in Python 3), allows you to define how “truthiness” should work for any class you create.
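As a minimal sketch of that last rule, the hypothetical Playlist class below (it is not part of the original text) defines only __len__, so an empty playlist counts as False and a non-empty one as True.

```python
class Playlist(object):
    """Hypothetical container used only to illustrate truthiness."""

    def __init__(self, songs=None):
        self.songs = list(songs) if songs is not None else []

    def __len__(self):
        # Python falls back to __len__ when __bool__ (Python 3) or
        # __nonzero__ (Python 2) is not defined on the class.
        return len(self.songs)


empty = Playlist()
full = Playlist(['Intro', 'Outro'])

empty_is_truthy = bool(empty)   # False, because len(empty) == 0
full_is_truthy = bool(full)     # True, because len(full) == 2
```

Because if calls the same machinery as bool(), writing if empty: would skip its body here without any explicit comparison.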

if statements in Python make use of “truthiness” implicitly, and you should too. Instead of checking if a variable foo is True like this

if foo == True:

you should simply check if foo:.

There are a number of reasons for this. The most obvious is that if your code changes and foo becomes an int instead of True or False, your if statement still works. But at a deeper level, the reasoning is based on the difference between equality and identity. Using == determines if two objects have the same value (as defined by their __eq__ method). Using is determines if the two objects are actually the same object.

Note that while there are cases where is works as if it were comparing for equality, these are special cases and shouldn’t be relied upon.
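To make the distinction concrete, here is a small sketch using two lists with the same contents. The variable names are chosen only for illustration.

```python
first = [1, 2, 3]
second = [1, 2, 3]   # a separate list with the same contents
alias = first        # another name for the very same list

same_value = (first == second)      # True: the contents compare equal
same_object = (first is second)     # False: two distinct objects
alias_is_first = (alias is first)   # True: both names refer to one object
```

Equality asks whether the values match, while identity asks whether there is only one object involved.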

As a consequence, avoid comparing directly to False and None and empty sequences like [], {}, and (). If a list named my_list is empty, the condition in if my_list: evaluates to False.

There are times, however, when comparing directly to None is not just recommended, but required. A function checking if an argument whose default value is None was actually set must compare directly to None like so:

def insert_value(value, position=None):
    """Inserts a value into my container, optionally at the
    specified position"""
    if position is not None:
        ...

What’s wrong with if position:? Well, if someone wanted to insert into position 0, the function would act as if position hadn’t been set, since 0 evaluates to False. Note the use of is not: comparisons against None (a singleton in Python) should always use is or is not, not == (from PEP 8).
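The difference is easy to see in a runnable sketch. The version below takes the container as an explicit first argument, which is an assumption made for illustration, and buggy_insert_value is a hypothetical counterpart that relies on truthiness instead.

```python
def insert_value(container, value, position=None):
    """Insert value at position, or append when no position was given."""
    if position is not None:
        container.insert(position, value)
    else:
        container.append(value)
    return container


def buggy_insert_value(container, value, position=None):
    if position:   # 0 is falsy, so position 0 is treated as "not given"
        container.insert(position, value)
    else:
        container.append(value)
    return container


correct = insert_value(['b', 'c'], 'a', position=0)
wrong = buggy_insert_value(['b', 'c'], 'a', position=0)
```

Here correct ends up as ['a', 'b', 'c'], while wrong silently appends and ends up as ['b', 'c', 'a'].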

Just let Python’s “truthiness” do the work for you.

4.1.3.1 Harmful

def number_of_evil_robots_attacking():
    return 10

def should_raise_shields():
    # "We only raise Shields when one or more giant robots attack,
    # so I can just return that value..."
    return number_of_evil_robots_attacking()

if should_raise_shields() == True:
    raise_shields()
    print('Shields raised')
else:
    print('Safe! No giant robots attacking')

4.1.3.2 Idiomatic

def number_of_evil_robots_attacking():
    return 10

def should_raise_shields():
    # "We only raise Shields when one or more giant robots attack,
    # so I can just return that value..."
    return number_of_evil_robots_attacking()

if should_raise_shields():
    raise_shields()
    print('Shields raised')
else:
    print('Safe! No giant robots attacking')

4.2 For loops

4.2.1 Use the enumerate function in loops instead of creating an “index” variable

Programmers coming from other languages are used to explicitly declaring a variable to track the index of a container in a loop. For example, in C++:

for (int i=0; i < container.size(); ++i)
{
    // Do stuff
}

In Python, the enumerate built-in function handles this role.

4.2.1.1 Harmful

my_container = ['Larry', 'Moe', 'Curly']

index = 0
for element in my_container:
    print('{} {}'.format(index, element))
    index += 1

4.2.1.2 Idiomatic

my_container = ['Larry', 'Moe', 'Curly']

for index, element in enumerate(my_container):
    print('{} {}'.format(index, element))
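If you need the count to begin somewhere other than zero, enumerate also accepts a start argument. The sketch below collects the formatted strings in a list (the name numbered is only illustrative) rather than printing them.

```python
my_container = ['Larry', 'Moe', 'Curly']

# start=1 numbers the elements from 1 instead of the default 0.
numbered = ['{} {}'.format(position, element)
            for position, element in enumerate(my_container, start=1)]
```

This avoids the temptation to write index + 1 inside the loop body.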

4.2.2 Use the in keyword to iterate over an iterable

Programmers coming from languages lacking a for_each style construct are used to iterating over a container by accessing elements via index. Python’s in keyword handles this gracefully.

4.2.2.1 Harmful

my_list = ['Larry', 'Moe', 'Curly']

index = 0
while index < len(my_list):
    print(my_list[index])
    index += 1

4.2.2.2 Idiomatic

my_list = ['Larry', 'Moe', 'Curly']

for element in my_list:
    print(element)

4.2.3 Use else to execute code after a for loop concludes

One of the lesser known facts about Python’s for loop is that it can include an else clause. The else clause is executed after the iterator is exhausted, unless the loop was ended prematurely due to a break statement. This allows you to check for a condition in a for loop, break if the condition holds for an element, else take some action if the condition did not hold for any of the elements being looped over. This obviates the need for conditional flags in a loop solely used to determine if some condition held.

In the scenario below, we are running a report to check if any of the email addresses our users registered are malformed (users can register multiple addresses). The idiomatic version is more concise thanks to not having to deal with the has_malformed_email_address flag. What’s more, even if another programmer wasn’t familiar with the for ... else idiom, our code is clear enough to teach them.

4.2.3.1 Harmful

for user in get_all_users():
    has_malformed_email_address = False
    print('Checking {}'.format(user))
    for email_address in user.get_all_email_addresses():
        if email_is_malformed(email_address):
            has_malformed_email_address = True
            print('Has a malformed email address!')
            break
    if not has_malformed_email_address:
        print('All email addresses are valid!')

4.2.3.2 Idiomatic

for user in get_all_users():
    print('Checking {}'.format(user))
    for email_address in user.get_all_email_addresses():
        if email_is_malformed(email_address):
            print('Has a malformed email address!')
            break
    else:
        print('All email addresses are valid!')
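The examples above rely on helpers that are not shown. As a self-contained sketch of the same for ... else pattern, the code below supplies a deliberately naive stand-in for email_is_malformed and a hypothetical check_addresses wrapper that returns the message instead of printing it.

```python
def email_is_malformed(email_address):
    # Naive stand-in check for illustration; a real validator does far more.
    return '@' not in email_address


def check_addresses(email_addresses):
    for email_address in email_addresses:
        if email_is_malformed(email_address):
            result = 'Has a malformed email address!'
            break
    else:
        # Runs only when the loop finished without hitting break,
        # which includes the case where email_addresses is empty.
        result = 'All email addresses are valid!'
    return result


all_valid = check_addresses(['jeff@example.com', 'jk@example.org'])
one_bad = check_addresses(['jeff@example.com', 'not-an-address'])
no_addresses = check_addresses([])
```

Note the empty case: with nothing to iterate over there is no break, so the else clause still runs.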

4.3 Functions

4.3.1 Avoid using '', [], and {} as default parameters to functions

Though this is explicitly mentioned in the Python tutorial, it nevertheless surprises even experienced developers. In short: prefer names=None to names=[] for default parameters to functions. Below is the Python Tutorial’s treatment of the issue.

4.3.1.1 Harmful

# The default value [of a function] is evaluated only once.
# This makes a difference when the default is a mutable object
# such as a list, dictionary, or instances of most classes. For
# example, the following function accumulates the arguments
# passed to it on subsequent calls.
def f(a, L=[]):
    L.append(a)
    return L

print(f(1))
print(f(2))
print(f(3))

# This will print
# [1]
# [1, 2]
# [1, 2, 3]

4.3.1.2 Idiomatic

# If you don't want the default to be shared between subsequent
# calls, you can write the function like this instead:
def f(a, L=None):
    if L is None:
        L = []
    L.append(a)
    return L

print(f(1))
print(f(2))
print(f(3))

# This will print
# [1]
# [2]
# [3]
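If you want to see where the shared list lives, one way is to inspect the function's __defaults__ attribute. The sketch below reuses the harmful definition of f. The names defaults_before and defaults_after are only illustrative.

```python
def f(a, L=[]):
    L.append(a)
    return L


defaults_before = f.__defaults__[0][:]   # copy of the stored default: []
f(1)
f(2)
defaults_after = f.__defaults__[0]       # the same list object, now mutated
```

The default list is created once, when the def statement runs, and every call that omits L appends to that one object.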

4.3.2 Use *args and **kwargs to accept arbitrary arguments

Oftentimes, functions need to accept an arbitrary list of positional parameters and/or keyword parameters, use a subset of them, and forward the rest to another function. Using *args and **kwargs as parameters allows a function to accept an arbitrary list of positional and keyword arguments, respectively.

The idiom is also useful when maintaining backwards compatibility in an API. If our function accepts arbitrary arguments, we are free to add new arguments in a new version while not breaking existing code using fewer arguments. As long as everything is properly documented, the “actual” parameters of a function are not of much consequence.

4.3.2.1 Harmful

def make_api_call(foo, bar, baz):
    if baz in ('Unicorn', 'Oven', 'New York'):
        return foo(bar)
    else:
        return bar(foo)

# I need to add another parameter to `make_api_call`
# without breaking everyone's existing code.
# I have two options...

def so_many_options():
    # I can tack on new parameters, but only if I make
    # all of them optional...
    def make_api_call(foo, bar, baz, qux=None, foo_polarity=None,
                      baz_coefficient=None, quux_capacitor=None,
                      bar_has_hopped=None, true=None, false=None,
                      file_not_found=None):
        # ... and so on ad infinitum
        return file_not_found

def version_graveyard():
    # ... or I can create a new function each time the signature
    # changes.
    def make_api_call_v2(foo, bar, baz, qux):
        return make_api_call(foo, bar, baz) - qux

    def make_api_call_v3(foo, bar, baz, qux, foo_polarity):
        if foo_polarity != 'reversed':
            return make_api_call_v2(foo, bar, baz, qux)
        return None

    def make_api_call_v4(
            foo, bar, baz, qux, foo_polarity, baz_coefficient):
        return make_api_call_v3(
            foo, bar, baz, qux, foo_polarity) * baz_coefficient

    def make_api_call_v5(
            foo, bar, baz, qux, foo_polarity,
            baz_coefficient, quux_capacitor):
        # I don't need 'foo', 'bar', or 'baz' anymore, but I have to
        # keep supporting them...
        return baz_coefficient * quux_capacitor

    def make_api_call_v6(
            foo, bar, baz, qux, foo_polarity, baz_coefficient,
            quux_capacitor, bar_has_hopped):
        if bar_has_hopped:
            baz_coefficient *= -1
        return make_api_call_v5(foo, bar, baz, qux,
                                foo_polarity, baz_coefficient,
                                quux_capacitor)

    def make_api_call_v7(
            foo, bar, baz, qux, foo_polarity, baz_coefficient,
            quux_capacitor, bar_has_hopped, true):
        return true

    def make_api_call_v8(
            foo, bar, baz, qux, foo_polarity, baz_coefficient,
            quux_capacitor, bar_has_hopped, true, false):
        return false

    def make_api_call_v9(
            foo, bar, baz, qux, foo_polarity, baz_coefficient,
            quux_capacitor, bar_has_hopped,
            true, false, file_not_found):
        return file_not_found

4.3.2.2 Idiomatic

def make_api_call(foo, bar, baz):
    if baz in ('Unicorn', 'Oven', 'New York'):
        return foo(bar)
    else:
        return bar(foo)

# I need to add another parameter to `make_api_call`
# without breaking everyone's existing code.
# Easy...

def new_hotness():
    def make_api_call(foo, bar, baz, *args, **kwargs):
        # Now I can accept any type and number of arguments
        # without worrying about breaking existing code.
        baz_coefficient = kwargs['the_baz']

        # I can even forward my args to a different function without
        # knowing their contents!
        return baz_coefficient in new_function(args)
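For a version you can actually run, here is a rough sketch of the "use a subset, forward the rest" pattern. Both connect and logged_connect are hypothetical names invented for this illustration. The wrapper unpacks the collected arguments again with * and ** when forwarding them.

```python
def connect(host, port=80, timeout=10):
    """Hypothetical lower-level function that receives forwarded arguments."""
    return '{}:{} (timeout={})'.format(host, port, timeout)


def logged_connect(log, *args, **kwargs):
    # Use one argument locally, then forward everything else untouched.
    log.append('connect called with {} positional and {} keyword argument(s)'
               .format(len(args), len(kwargs)))
    return connect(*args, **kwargs)


log = []
result = logged_connect(log, 'example.com', timeout=3)
```

If connect later grows a new optional parameter, callers can pass it through logged_connect without the wrapper changing at all.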

4.3.3 Use the function-based version of print

In Python 3.0, print was changed from a special language construct to a normal built-in function. The reasons for the change are listed in PEP 3105, summarized in the list below:

· There is nothing about print that requires special language syntax

· Replacing print with an alternative implementation is often desirable, but difficult if print is language syntax

· Converting print calls to use a separator other than spaces is difficult at best.

Python 2.6 included a mechanism for using the new print() function by including it in Python’s standard mechanism for backporting changes from Python 3 to Python 2: the __future__ module. For both the reasons listed above and, more importantly, to make transitioning your code from Python 2 to 3 as easy as possible, using the function-based print is recommended.

4.3.3.1 Harmful

print 1, 'foo', __name__

4.3.3.2 Idiomatic

from __future__ import print_function

print(1, 'foo', __name__)
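Once print is a function, the separator and destination become ordinary keyword arguments. The sketch below assumes Python 3 (on Python 2.6+ the __future__ import shown above would be needed first) and writes into an in-memory buffer so the result can be inspected. The names buffer and captured are only illustrative.

```python
import io

# A separator other than a space is just a keyword argument.
print(1, 'foo', 'bar', sep=', ')

# The output can go to any file-like object instead of standard output.
buffer = io.StringIO()
print('written', 'elsewhere', sep='-', file=buffer)
captured = buffer.getvalue()
```

Here captured holds the text that would otherwise have gone to the terminal, including the trailing newline that print adds by default.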

4.4 Exceptions

4.4.1 Don’t be Afraid to Use Exceptions

In many languages, exceptions are reserved for truly exceptional cases. For example, a function that takes a file name as an argument and performs some calculations on the file’s contents probably shouldn’t throw an exception if the file is not found. That’s not too “exceptional”; it’s probably a reasonably common occurrence. If, however, the file system itself were unavailable, raising an exception makes sense.

Because (in other languages) deciding when to raise an exception is partly a matter of taste (and thus experience), novices tend to overuse them. This overuse of exceptions leads to a number of problems: the control flow of a program is more difficult to follow, they create a burden on calling code when allowed to propagate up a call chain, and in many languages they impose a stiff performance penalty. These facts have led to a general vilification of exceptions. Many organizations have explicit coding standards that forbid their use (see, for example, Google’s official C++ Style Guide).

Python takes a different view. Exceptions can be found in almost every popular third-party package, and the Python standard library makes liberal use of them. In fact, exceptions are built into fundamental parts of the language itself. For example, did you know that any time you use a for loop in Python, you’re using exceptions?

That may sound odd, but it’s true: exceptions are used for control flow throughout the Python language. Have you ever wondered how for loops know when to stop? For things like lists that have an easily determined length the question seems trivial. But what about generators, which could produce values ad infinitum?

Any time you use for to iterate over an iterable (basically, all sequence types and anything that defines __iter__() or __getitem__()), it needs to know when to stop iterating. Take a look at the code below:

words = ['exceptions', 'are', 'useful']

for word in words:
    print(word)

How does for know when it’s reached the last element in words and should stop trying to get more items? The answer may surprise you: the list’s iterator raises a StopIteration exception.

In fact, all iterables follow this pattern. When a for statement is first evaluated, it calls iter() on the object being iterated over. This creates an iterator for the object, capable of returning the contents of the object in sequence. For the call to iter() to succeed, the object must either support the iteration protocol (by defining __iter__()) or the sequence protocol (by defining __getitem__()).

As it happens, both protocols are required to raise an exception when the items to iterate over are exhausted. The iterator returned by __iter__() raises the StopIteration exception from its __next__() method (next() in Python 2), as discussed earlier, and __getitem__() raises the IndexError exception. This is how for knows when to stop.

So whenever you’re wondering if it’s OK to use exceptions in Python, just remember this: for all but the most trivial programs, you’re probably using them already.
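To see the exceptions at work, you can spell out by hand roughly what a for loop does. The first half of this sketch drives a list's iterator with next() until StopIteration appears. The second half uses a hypothetical Countdown class that supports only the sequence protocol and ends iteration by raising IndexError.

```python
words = ['exceptions', 'are', 'useful']

# Roughly what `for word in words:` does behind the scenes.
iterator = iter(words)
seen = []
while True:
    try:
        word = next(iterator)
    except StopIteration:
        break   # the iterator is exhausted, so leave the loop
    seen.append(word)


class Countdown(object):
    """Hypothetical class that supports only the sequence protocol."""

    def __init__(self, start):
        self.start = start

    def __getitem__(self, index):
        if index >= self.start:
            raise IndexError(index)   # tells `for` there is nothing left
        return self.start - index


counted = [number for number in Countdown(3)]
```

In both cases the exception never reaches your code during a normal for loop. The loop machinery catches it and simply stops.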

4.4.2 Use Exceptions to Write Code in an “EAFP” Style

Code that doesn’t use exceptions is always checking if it’s OK to do something. In the harmful code below, the function get_log_level seems meek and overly cautious. One can imagine it saying, “I want to return the default log level. Is there an ENABLE_LOGGING setting? Is it set to True? Is there a DEFAULT_LOG_LEVEL setting?”

Code written in this manner must ask a number of different questions before it is convinced it’s OK to do something. More importantly, once all questions have been answered to its satisfaction, the code assumes whatever it is about to do will succeed.

The if statements littered throughout the function give both the programmer and readers of the code a false sense of security. The programmer has checked everything she can think of that would prevent her code from working, so clearly nothing can go wrong, right? Someone reading the code would similarly assume that, with all those if statements, the function must handle all possible error conditions. Calling it shouldn’t require any error handling.

Code written in this style is said to be written in a “Look Before You Leap (LBYL)” style. Every (thought of) pre-condition is explicitly checked. There’s an obvious problem with this approach: if the code doesn’t ask all of the right questions, bad things happen. It’s rarely possible to anticipate everything that could go wrong. What’s more, as the Python documentation astutely points out, code written in this style can fail badly in a multi-threaded environment. A condition that was true in an if statement may be false by the next line.

Alternatively, code written according to the principle, “[It’s] Easier to Ask for Forgiveness than Permission (EAFP),” assumes things will go well and catches exceptions if they don’t. It puts the code’s true purpose front-and-center, increasing clarity. Rather than seeing a string of if statements and needing to remember what each checked before you even know what the code wants to do, EAFP-style code presents the end goal first. The error handling code that follows is easier to read; you already know the operation that could have failed.
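The race condition mentioned earlier is easiest to picture with files. In this sketch (Python 3 assumed; the function names and the temporary file paths are invented for illustration), the LBYL version checks that the file exists before opening it, leaving a window in which the file could disappear, while the EAFP version just tries the operation.

```python
import os
import tempfile


def read_lbyl(path):
    # Look before you leap: the file may vanish between the check and open().
    if os.path.exists(path):
        with open(path) as handle:
            return handle.read()
    return None


def read_eafp(path):
    # Ask forgiveness: attempt the operation and handle the failure.
    try:
        with open(path) as handle:
            return handle.read()
    except IOError:
        return None


demo_dir = tempfile.mkdtemp()
missing = os.path.join(demo_dir, 'missing.txt')
present = os.path.join(demo_dir, 'present.txt')
with open(present, 'w') as handle:
    handle.write('hello')
```

Both functions return 'hello' for present and None for missing, but only read_eafp behaves sensibly if another process deletes the file at the wrong moment.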

4.4.2.1 Harmful

def get_log_level(config_dict):
    if 'ENABLE_LOGGING' in config_dict:
        if config_dict['ENABLE_LOGGING'] != True:
            return None
        elif not 'DEFAULT_LOG_LEVEL' in config_dict:
            return None
        else:
            return config_dict['DEFAULT_LOG_LEVEL']
    else:
        return None

4.4.2.2 Idiomatic

def get_log_level(config_dict):
    try:
        if config_dict['ENABLE_LOGGING']:
            return config_dict['DEFAULT_LOG_LEVEL']
    except KeyError:
        # if either value wasn't present, a KeyError will be raised, so
        # return None
        return None

4.4.3 Avoid “Swallowing” Useful Exceptions With Bare Except Clauses

A common mistake made by novices when using exceptions is to feel compelled to catch any exception code could raise. This is especially common when writing code using third-party packages; programmers encapsulate all uses of the package in try blocks followed by an except clause that doesn’t specify an exception (also known as a “bare” except clause). A generic error message like, “something went wrong” may also be printed.

Exceptions have tracebacks and messages for a reason: to aid in debugging when something goes wrong. If you “swallow” an exception with a bare except clause, you suppress genuinely useful debugging information. If you need to know whenever an exception occurs but don’t intend to deal with it (say for logging purposes), add a bare raise to the end of your except block. The bare raise re-raises the exception that was caught. This way, your code runs and the user still gets useful information when something goes wrong.

Of course, there are valid reasons one would need to ensure some block of code never generates an exception. Almost none of the idioms described in this book are meant to be mechanically followed: use your head.

4.4.3.1 Harmful

import requests

def get_json_response(url):
    try:
        r = requests.get(url)
        return r.json()
    except:
        print('Oops, something went wrong!')
        return None

4.4.3.2 Idiomatic

import requests

def get_json_response(url):
    return requests.get(url).json()

# If we need to make note of the exception, we
# would write the function this way...
def alternate_get_json_response(url):
    try:
        r = requests.get(url)
        return r.json()
    except:
        # do some logging here, but don't handle the exception
        # ...
        raise
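Here is a sketch of the log-and-re-raise pattern that runs without any third-party package. The parse_port function is a hypothetical example, and it names the specific exception it expects (ValueError) rather than using a bare except, which is a slightly stricter variation on the code above.

```python
import logging

logger = logging.getLogger(__name__)


def parse_port(text):
    try:
        return int(text)
    except ValueError:
        # Record the failure, then let the original exception propagate
        # with its traceback intact.
        logger.exception('Could not parse port from %r', text)
        raise


try:
    parse_port('eighty')
except ValueError as error:
    message = str(error)
```

The caller still receives the original ValueError, so message contains the offending input, and the log has a record of the failure as well.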