Learning Python (2013)

Part VII. Exceptions and Tools

Chapter 36. Designing with Exceptions

This chapter rounds out this part of the book with a collection of exception design topics and common use case examples, followed by this part’s gotchas and exercises. Because this chapter also closes out the fundamentals portion of the book at large, it includes a brief overview of development tools as well to help you as you make the migration from Python beginner to Python application developer.

Nesting Exception Handlers

Most of our examples so far have used only a single try to catch exceptions, but what happens if one try is physically nested inside another? For that matter, what does it mean if a try calls a function that runs another try? Technically, try statements can nest, in terms of both syntax and the runtime control flow through your code. I’ve mentioned this briefly, but let’s clarify the idea here.

Both of these cases can be understood if you realize that Python stacks try statements at runtime. When an exception is raised, Python returns to the most recently entered try statement with a matching except clause. Because each try statement leaves a marker, Python can jump back to earlier trys by inspecting the stacked markers. This nesting of active handlers is what we mean when we talk about propagating exceptions up to “higher” handlers—such handlers are simply try statements entered earlier in the program’s execution flow.

Figure 36-1 illustrates what occurs when try statements with except clauses nest at runtime. The amount of code that goes into a try block can be substantial, and it may contain function calls that invoke other code watching for the same exceptions. When an exception is eventually raised, Python jumps back to the most recently entered try statement that names that exception, runs that statement’s except clause, and then resumes execution after that try.

Figure 36-1. Nested try/except statements: when an exception is raised (by you or by Python), control jumps back to the most recently entered try statement with a matching except clause, and the program resumes after that try statement. except clauses intercept and stop the exception—they are where you process and recover from exceptions.

Once the exception is caught, its life is over—control does not jump back to all matching trys that name the exception; only the first (i.e., most recent) one is given the opportunity to handle it. In Figure 36-1, for instance, the raise statement in the function func2 sends control back to the handler in func1, and then the program continues within func1.

By contrast, when try statements that contain only finally clauses are nested, each finally block is run in turn when an exception occurs: Python continues propagating the exception up to other trys, and eventually perhaps to the top-level default handler (the standard error message printer). As Figure 36-2 illustrates, the finally clauses do not kill the exception; they just specify code to be run on the way out of each try during the exception propagation process. If there are many try/finally clauses active when an exception occurs, they will all be run, unless a try/except catches the exception somewhere along the way.

Figure 36-2. Nested try/finally statements: when an exception is raised here, control returns to the most recently entered try to run its finally statement, but then the exception keeps propagating to all finallys in all active try statements and eventually reaches the default top-level handler, where an error message is printed. finally clauses intercept (but do not stop) an exception—they are for actions to be performed “on the way out.”

In other words, where the program goes when an exception is raised depends entirely upon where it has been—it’s a function of the runtime flow of control through the script, not just its syntax. The propagation of an exception essentially proceeds backward through time to try statements that have been entered but not yet exited. This propagation stops as soon as control is unwound to a matching except clause, but not as it passes through finally clauses on the way.
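The propagation rules just described can be sketched with three small functions. This is only an illustrative sketch: the names inner, middle, outer, and events are made up for this example and do not appear elsewhere in the chapter. Each finally runs on the way out, and the first matching except stops the exception:

```python
events = []                                  # Records the order in which clauses run

def inner():
    try:
        raise IndexError('demo')             # Raised in the most deeply nested call
    finally:
        events.append('inner finally')       # Runs, but does not stop the exception

def middle():
    try:
        inner()
    finally:
        events.append('middle finally')      # Also runs as the exception passes through

def outer():
    try:
        middle()
    except IndexError:
        events.append('outer except')        # Propagation stops at this matching except
    events.append('after outer try')         # Execution resumes after the try

outer()
print(events)
```

The list is filled in the order inner finally, middle finally, outer except, after outer try, which mirrors the backward walk through try statements that have been entered but not yet exited.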

Example: Control-Flow Nesting

Let’s turn to an example to make this nesting concept more concrete. The following module file, nestexc.py, defines two functions. action2 is coded to trigger an exception (you can’t add numbers and sequences), and action1 wraps a call to action2 in a try handler, to catch the exception:

    def action2():
        print(1 + [])                # Generate TypeError

    def action1():
        try:
            action2()
        except TypeError:            # Most recent matching try
            print('inner try')

    try:
        action1()
    except TypeError:                # Here, only if action1 re-raises
        print('outer try')

    % python nestexc.py
    inner try

Notice, though, that the top-level module code at the bottom of the file wraps a call to action1 in a try handler, too. When action2 triggers the TypeError exception, there will be two active try statements—the one in action1, and the one at the top level of the module file. Python picks and runs just the most recent try with a matching except—which in this case is the try inside action1.
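The comment on the outer handler says it runs only if action1 re-raises. As a hedged variation on nestexc.py (the log list is an addition for this sketch, and action2 returns instead of printing), a bare raise in the inner handler passes the same exception on to the next active try:

```python
log = []                             # Hypothetical helper: records which handlers ran

def action2():
    return 1 + []                    # Triggers TypeError

def action1():
    try:
        action2()
    except TypeError:
        log.append('inner try')
        raise                        # Re-raise the exception currently being handled

try:
    action1()
except TypeError:
    log.append('outer try')          # Runs only because action1 re-raised

print(log)
```

Without the raise statement, only the inner handler would run, exactly as in the original file.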

Again, the place where an exception winds up jumping to depends on the control flow through the program at runtime. Because of this, to know where you will go, you need to know where you’ve been. In this case, where exceptions are handled is more a function of control flow than of statement syntax. However, we can also nest exception handlers syntactically—an equivalent case we turn to next.

Example: Syntactic Nesting

As I mentioned when we looked at the new unified try/except/finally statement in Chapter 34, it is possible to nest try statements syntactically by their position in your source code:

    try:
        try:
            action2()
        except TypeError:            # Most recent matching try
            print('inner try')
    except TypeError:                # Here, only if nested handler re-raises
        print('outer try')

Really, though, this code just sets up the same handler-nesting structure as (and behaves identically to) the prior example. In fact, syntactic nesting works just like the cases sketched in Figure 36-1 and Figure 36-2. The only difference is that the nested handlers are physically embedded in a try block, not coded elsewhere in functions that are called from the try block. For example, nested finally handlers all fire on an exception, whether they are nested syntactically or by means of the runtime flow through physically separated parts of your code:

    >>> try:
    ...     try:
    ...         raise IndexError
    ...     finally:
    ...         print('spam')
    ... finally:
    ...     print('SPAM')
    ...
    spam
    SPAM
    Traceback (most recent call last):
      File "<stdin>", line 3, in <module>
    IndexError

See Figure 36-2 for a graphic illustration of this code’s operation; the effect is the same, but the function logic has been inlined as nested statements here. For a more useful example of syntactic nesting at work, consider the following file, except-finally.py:

    def raise1():  raise IndexError
    def noraise(): return
    def raise2():  raise SyntaxError

    for func in (raise1, noraise, raise2):
        print('<%s>' % func.__name__)
        try:
            try:
                func()
            except IndexError:
                print('caught IndexError')
        finally:
            print('finally run')
        print('...')

This code catches an exception if one is raised and performs a finally termination-time action regardless of whether an exception occurs. This may take a few moments to digest, but the effect is the same as combining an except and a finally clause in a single try statement in Python 2.5 and later:

    % python except-finally.py
    <raise1>
    caught IndexError
    finally run
    ...
    <noraise>
    finally run
    ...
    <raise2>
    finally run
    Traceback (most recent call last):
      File "except-finally.py", line 9, in <module>
        func()
      File "except-finally.py", line 3, in raise2
        def raise2(): raise SyntaxError
    SyntaxError: None

As we saw in Chapter 34, as of Python 2.5, except and finally clauses can be mixed in the same try statement. This, along with multiple except clause support, makes some of the syntactic nesting described in this section unnecessary, though the equivalent runtime nesting is common in larger Python programs. Moreover, syntactic nesting still works today, may still appear in code written prior to Python 2.5 that you may encounter, can make the disjoint roles of except and finally more explicit, and can be used as a technique for implementing alternative exception-handling behaviors in general.
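For comparison, here is a rough sketch of the same logic written with the merged statement form. The run function, the steps list, and the results dictionary are assumptions added for this illustration; the three test functions are those of except-finally.py:

```python
def raise1():  raise IndexError
def noraise(): return
def raise2():  raise SyntaxError

def run(func, steps):
    try:
        func()
    except IndexError:                       # Same handler as the nested version
        steps.append('caught IndexError')
    finally:                                 # Same termination action, same statement
        steps.append('finally run')

results = {}
for func in (raise1, noraise, raise2):
    steps = []
    try:
        run(func, steps)
    except SyntaxError:                      # Not caught inside run: it propagates
        steps.append('propagated SyntaxError')
    results[func.__name__] = steps

print(results)
```

The recorded steps match the nested version's output: the finally action runs in all three cases, and only the IndexError is stopped inside run.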

Exception Idioms

We’ve seen the mechanics behind exceptions. Now let’s take a look at some of the other ways they are typically used.

Breaking Out of Multiple Nested Loops: “go to”

As mentioned at the start of this part of the book, exceptions can often be used to serve the same roles as other languages’ “go to” statements to implement more arbitrary control transfers. Exceptions, however, provide a more structured option that localizes the jump to a specific block of nested code.

In this role, raise is like “go to,” and except clauses and exception names take the place of program labels. You can jump only out of code wrapped in a try this way, but that’s a crucial feature—truly arbitrary “go to” statements can make code extraordinarily difficult to understand and maintain.

For example, Python’s break statement exits just the single closest enclosing loop, but we can always use exceptions to break out of more than one loop level if needed:

    >>> class Exitloop(Exception): pass
    ...
    >>> try:
    ...     while True:
    ...         while True:
    ...             for i in range(10):
    ...                 if i > 3: raise Exitloop       # break exits just one level
    ...                 print('loop3: %s' % i)
    ...             print('loop2')
    ...         print('loop1')
    ... except Exitloop:
    ...     print('continuing')                        # Or just pass, to move on
    ...
    loop3: 0
    loop3: 1
    loop3: 2
    loop3: 3
    continuing
    >>> i
    4

If you change the raise in this to break, you’ll get an infinite loop, because you’ll break only out of the most deeply nested for loop, and wind up in the second-level loop nesting. The code would then print “loop2” and start the for again.

Also notice that variable i still has its last value, 4, after the try statement exits. Variable assignments made in a try are not undone in general, though as we’ve seen, exception instance variables listed in except clause headers are localized to that clause in 3.X, and the local variables of any functions that are exited as a result of a raise are discarded. Technically, active functions’ local variables are popped off the call stack and the objects they reference may be garbage-collected as a result, but this is an automatic step.
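A minimal sketch of these scoping rules under Python 3.X follows. The names count, saved, and exc_removed are invented for the example: an ordinary assignment made in the try survives, while the as variable is removed when its except clause exits unless the object is saved under another name:

```python
try:
    count = 1                        # Assigned before the error: not undone
    count / 0                        # Raises ZeroDivisionError
except ZeroDivisionError as exc:
    saved = exc                      # Keep the instance under another name

try:
    exc                              # 3.X removes this name when the clause exits
except NameError:
    exc_removed = True
else:
    exc_removed = False

print(count, type(saved).__name__, exc_removed)
```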

Exceptions Aren’t Always Errors

In Python, all errors are exceptions, but not all exceptions are errors. For instance, we saw in Chapter 9 that file object read methods return an empty string at the end of a file. In contrast, the built-in input function (which we first met in Chapter 3, deployed in an interactive loop in Chapter 10, and learned is named raw_input in 2.X) reads a line of text from the standard input stream, sys.stdin, at each call and raises the built-in EOFError at end-of-file.

Unlike file methods, this function does not return an empty string; an empty string from input means an empty line. Despite its name, though, the EOFError exception is just a signal in this context, not an error. Because of this behavior, unless the end-of-file should terminate a script, input often appears wrapped in a try handler and nested in a loop, as in the following code:

    while True:
        try:
            line = input()           # Read line from stdin (raw_input in 2.X)
        except EOFError:
            break                    # Exit loop at end-of-file
        else:
            ...process next line here...

Several other built-in exceptions are similarly signals, not errors—for example, calling sys.exit() and pressing Ctrl-C on your keyboard raise SystemExit and KeyboardInterrupt, respectively.

Python also has a set of built-in exceptions that represent warnings rather than errors; some of these are used to signal use of deprecated (phased out) language features. See the standard library manual’s description of built-in exceptions for more information, and consult the warnings module’s documentation for more on exceptions raised as warnings.
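As a brief sketch of warnings at work, the following uses the standard warnings module. The old_api function and its message are hypothetical stand-ins for a deprecated tool. The first block records the warning without interrupting the call; the second asks the module to escalate warnings to real exceptions, which an ordinary except can then catch:

```python
import warnings

def old_api():
    # Hypothetical deprecated function, used only for illustration
    warnings.warn('old_api is deprecated', DeprecationWarning, stacklevel=2)
    return 42

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')          # Report every warning raised
    value = old_api()                        # Call still completes normally

with warnings.catch_warnings():
    warnings.simplefilter('error')           # Turn warnings into exceptions
    try:
        old_api()
    except DeprecationWarning as exc:
        escalated = str(exc)

print(value, caught[0].category.__name__, escalated)
```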

Functions Can Signal Conditions with raise

User-defined exceptions can also signal nonerror conditions. For instance, a search routine can be coded to raise an exception when a match is found instead of returning a status flag for the caller to interpret. In the following, the try/except/else exception handler does the work of an if/else return-value tester:

    class Found(Exception): pass

    def searcher():
        if ...success...:
            raise Found()            # Raise exceptions instead of returning flags
        else:
            return

    try:
        searcher()
    except Found:                    # Exception if item was found
        ...success...
    else:                            # else returned: not found
        ...failure...

More generally, such a coding structure may also be useful for any function that cannot return a sentinel value to designate success or failure. In a widely applicable function, for instance, if all objects are potentially valid return values, it’s impossible for any return value to signal a failure condition. Exceptions provide a way to signal results without a return value:

    class Failure(Exception): pass

    def searcher():
        if ...success...:
            return ...founditem...
        else:
            raise Failure()

    try:
        item = searcher()
    except Failure:
        ...not found...
    else:
        ...use item here...

Because Python is dynamically typed and polymorphic to the core, exceptions, rather than sentinel return values, are the generally preferred way to signal such conditions.
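To make the abstract searcher concrete, here is one possible sketch. The first_match function, its arguments, and the sample data are assumptions made for this example. Because the list may legitimately contain None, a None return value could not distinguish "found None" from "found nothing," so failure is signaled with an exception instead:

```python
class Failure(Exception): pass

def first_match(items, test):
    for item in items:
        if test(item):
            return item              # Any object, even None or 0, is a valid result
    raise Failure('no match')        # No return value is left to signal failure

def lookup(items, test):
    try:
        item = first_match(items, test)
    except Failure:
        return 'not found'
    else:
        return 'found %r' % (item,)

data = [3, None, 0, 'spam']
print(lookup(data, lambda x: x is None))
print(lookup(data, lambda x: x == 'eggs'))
```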

Closing Files and Server Connections

We encountered examples in this category in Chapter 34. As a summary, though, exception processing tools are also commonly used to ensure that system resources are finalized, regardless of whether an error occurs during processing or not.

For example, some servers require connections to be closed in order to terminate a session. Similarly, output files may require close calls to flush their buffers to disk for waiting consumers, and input files may consume file descriptors if not closed; although file objects are automatically closed when garbage-collected if still open, in some Pythons it may be difficult to be sure when that will occur.

As we saw in Chapter 34, the most general and explicit way to guarantee termination actions for a specific block of code is the try/finally statement:

    myfile = open(r'C:\code\textdata', 'w')
    try:
        ...process myfile...
    finally:
        myfile.close()

As we also saw, some objects make this potentially easier in Python 2.6, 3.0, and later by providing context managers that terminate or close the objects for us automatically when run by the with/as statement:

    with open(r'C:\code\textdata', 'w') as myfile:
        ...process myfile...

So which option is better here? As usual, it depends on your programs. Compared to the traditional try/finally, context managers are more implicit, which runs contrary to Python’s general design philosophy. Context managers are also arguably less general—they are available only for select objects, and writing user-defined context managers to handle general termination requirements is more complex than coding a try/finally.

On the other hand, using existing context managers requires less code than using try/finally, as shown by the preceding examples. Moreover, the context manager protocol supports entry actions in addition to exit actions. In fact, it can save a line of code when no exceptions are expected at all (albeit at the expense of further nesting and indenting file processing logic):

    myfile = open(filename, 'w')             # Traditional form
    ...process myfile...
    myfile.close()

    with open(filename, 'w') as myfile:      # Context manager form
        ...process myfile...

Still, the implicit exception processing of with makes it more directly comparable to the explicit exception handling of try/finally. Although try/finally is the more widely applicable technique, context managers may be preferable where they are already available, or where their extra complexity is warranted.
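For objects that do not come with a context manager, one way to write your own is the standard library's contextlib.contextmanager decorator, which wraps a try/finally in a generator. The following is a sketch only: session, session_log, and the 'db' name are hypothetical, and a class with __enter__ and __exit__ methods would serve as well:

```python
from contextlib import contextmanager

session_log = []                             # Hypothetical record of what happened

@contextmanager
def session(name):
    session_log.append('open ' + name)       # Entry action
    try:
        yield name                           # Value bound by "as" in the with
    finally:
        session_log.append('close ' + name)  # Exit action: runs even on exceptions

try:
    with session('db') as conn:
        session_log.append('use ' + conn)
        raise RuntimeError('failure during processing')
except RuntimeError:
    session_log.append('error handled')

print(session_log)
```

The exit action runs before the exception reaches the enclosing handler, just as a finally block would.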

Debugging with Outer try Statements

You can also make use of exception handlers to replace Python’s default top-level exception-handling behavior. By wrapping an entire program (or a call to it) in an outer try in your top-level code, you can catch any exception that may occur while your program runs, thereby subverting the default program termination.

In the following, the empty except clause catches any uncaught exception raised while the program runs. To get hold of the actual exception that occurred in this mode, fetch the sys.exc_info function call result from the built-in sys module; it returns a tuple whose first two items contain the current exception’s class and the instance object raised (more on sys.exc_info in a moment):

    try:
        ...run program...
    except:                          # All uncaught exceptions come here
        import sys
        print('uncaught!', sys.exc_info()[0], sys.exc_info()[1])

This structure is commonly used during development, to keep programs active even after errors occur—within a loop, it allows you to run additional tests without having to restart. It’s also used when testing other program code, as described in the next section.

NOTE

On a related note, for more about handling program shutdowns without recovery from them, see also Python’s atexit standard library module. It’s also possible to customize what the top-level exception handler does with sys.excepthook. These and other related tools are described in Python’s library manual.

Running In-Process Tests

Some of the coding patterns we’ve just looked at can be combined in a test-driver application that tests other code within the same process. The following partial code sketches the general model:

    import sys
    log = open('testlog', 'a')
    from testapi import moreTests, runNextTest, testName

    def testdriver():
        while moreTests():
            try:
                runNextTest()
            except:
                print('FAILED', testName(), sys.exc_info()[:2], file=log)
            else:
                print('PASSED', testName(), file=log)

    testdriver()

The testdriver function here cycles through a series of test calls (the module testapi is left abstract in this example). Because an uncaught exception in a test case would normally kill this test driver, you need to wrap test case calls in a try if you want to continue the testing process after a test fails. The empty except catches any uncaught exception generated by a test case as usual, and it uses sys.exc_info to log the exception to a file. The else clause is run when no exception occurs—the test success case.

Such boilerplate code is typical of systems that test functions, modules, and classes by running them in the same process as the test driver. In practice, however, testing can be much more sophisticated than this. For instance, to test external programs, you could instead check status codes or outputs generated by program-launching tools such as os.system and os.popen, used earlier in this book and covered in the standard library manual. Such tools do not generally raise exceptions for errors in the external programs—in fact, the test cases may run in parallel with the test driver.

At the end of this chapter, we’ll also briefly meet more complete testing frameworks provided by Python, such as doctest and PyUnit, which provide tools for comparing expected outputs with actual results.

More on sys.exc_info

The sys.exc_info result used in the last two sections allows an exception handler to gain access to the most recently raised exception generically. This is especially useful when using the empty except clause to catch everything blindly, to determine what was raised:

    try:
        ...
    except:
        # sys.exc_info()[0:2] are the exception class and instance

If no exception is being handled, this call returns a tuple containing three None values. Otherwise, the values returned are (type, value, traceback), where:

§ type is the exception class of the exception being handled.

§ value is the exception class instance that was raised.

§ traceback is a traceback object that represents the call stack at the point where the exception originally occurred, and is used by the traceback module to generate error messages.
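The three items can be unpacked directly inside a handler. In this small sketch, the describe function and the failing dictionary lookup are made up for illustration:

```python
import sys

def describe():
    try:
        {}['spam']                               # Raises KeyError
    except:
        exc_type, exc_value, exc_tb = sys.exc_info()
        return (exc_type.__name__,               # Class of the exception
                exc_value.args,                  # Data stored on the instance
                exc_value.__class__ is exc_type, # The instance is of that class
                exc_tb is not None)              # Traceback object is available

print(describe())
```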

As we saw in the prior chapter, sys.exc_info can also sometimes be useful to determine the specific exception type when catching exception category superclasses. As we’ve also learned, though, because in this case you can also get the exception type by fetching the __class__ attribute of the instance obtained with the as clause, sys.exc_info is often redundant apart from the empty except:

    try:
        ...
    except General as instance:
        # instance.__class__ is the exception class

As we’ve seen, using Exception for the General exception name here would catch all nonexit exceptions, similar to an empty except but less extreme, and still giving access to the exception instance and its class. Even so, using the instance object’s interfaces and polymorphism is often a better approach than testing exception types—exception methods can be defined per class and run generically:

    try:
        ...
    except General as instance:
        # instance.method() does the right thing for this instance

As usual, being too specific in Python can limit your code’s flexibility. A polymorphic approach like the last example here generally supports future evolution better than explicitly type-specific tests or actions.
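A possible sketch of the polymorphic approach follows. The report method and the Specific1 and Specific2 subclasses are hypothetical names chosen for this example. The handler names only the General category and lets each instance decide what to do:

```python
class General(Exception):
    def report(self):
        return 'general problem: %s' % self.args[0]

class Specific1(General):
    def report(self):                        # Customized per class
        return 'specific problem: %s' % self.args[0]

class Specific2(General): pass               # Inherits the default report

def handle(exc_class):
    try:
        raise exc_class('spam')
    except General as instance:              # No type tests needed here
        return instance.report()

for cls in (General, Specific1, Specific2):
    print(handle(cls))
```

Adding a new subclass with its own report later requires no change to the handler.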

Displaying Errors and Tracebacks

Finally, the exception traceback object available in the prior section’s sys.exc_info result is also used by the standard library’s traceback module to generate the standard error message and stack display manually. This module has a handful of interfaces that support wide customization, which we don’t have space to cover usefully here, but the basics are simple. Consider the following aptly named file, badly.py:

    import traceback

    def inverse(x):
        return 1 / x

    try:
        inverse(0)
    except Exception:
        with open('badly.exc', 'w') as excfile:      # Close (and flush) the file promptly
            traceback.print_exc(file=excfile)
    print('Bye')

This code uses the print_exc convenience function in the traceback module, which uses sys.exc_info data by default; when run, the script prints the error message to a file—handy in testing programs that need to catch errors but still record them in full:

    c:\code> python badly.py
    Bye

    c:\code> type badly.exc
    Traceback (most recent call last):
      File "badly.py", line 7, in <module>
        inverse(0)
      File "badly.py", line 4, in inverse
        return 1 / x
    ZeroDivisionError: division by zero

For much more on traceback objects, the traceback module that uses them, and related topics, consult other reference resources and manuals.
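When the error text is needed as a string rather than in a file (to add to a log record or display in a GUI, for instance), the traceback module's format_exc function returns what print_exc would have printed. A short sketch, reusing the inverse function; the text and last_line names are assumptions for the example:

```python
import traceback

def inverse(x):
    return 1 / x

try:
    inverse(0)
except Exception:
    text = traceback.format_exc()            # Same report as print_exc, as a string

last_line = text.strip().splitlines()[-1]    # The exception type and message
print(last_line)
```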

NOTE

Version skew note: In Python 2.X, the older tools sys.exc_type and sys.exc_value still work to fetch the most recent exception type and value, but they can manage only a single, global exception for the entire process. These two names have been removed in Python 3.X. The newer and preferred sys.exc_info() call available in both 2.X and 3.X instead keeps track of each thread’s exception information, and so is thread-specific. Of course, this distinction matters only when using multiple threads in Python programs (a subject beyond this book’s scope), but 3.X forces the issue. See other resources for more details.

Exception Design Tips and Gotchas

I’m lumping design tips and gotchas together in this chapter, because it turns out that the most common gotchas largely stem from design issues. By and large, exceptions are easy to use in Python. The real art behind them is in deciding how specific or general your except clauses should be and how much code to wrap up in try statements. Let’s address the second of these concerns first.

What Should Be Wrapped

In principle, you could wrap every statement in your script in its own try, but that would just be silly (the try statements would then need to be wrapped in try statements!). What to wrap is really a design issue that goes beyond the language itself, and it will become more apparent with use. But for now, here are a few rules of thumb:

§ Operations that commonly fail should generally be wrapped in try statements. For example, operations that interface with system state (file opens, socket calls, and the like) are prime candidates for try.

§ However, there are exceptions to the prior rule—in a simple script, you may want failures of such operations to kill your program instead of being caught and ignored. This is especially true if the failure is a showstopper. Failures in Python typically result in useful error messages (not hard crashes), and this is the best outcome some programs could hope for.

§ You should implement termination actions in try/finally statements to guarantee their execution, unless a context manager is available as a with/as option. The try/finally statement form allows you to run code whether exceptions occur or not in arbitrary scenarios.

§ It is sometimes more convenient to wrap the call to a large function in a single try statement, rather than littering the function itself with many try statements. That way, all exceptions in the function percolate up to the try around the call, and you reduce the amount of code within the function.

The types of programs you write will probably influence the amount of exception handling you code as well. Servers, for instance, must generally keep running persistently and so will likely require try statements to catch and recover from exceptions. In-process testing programs of the kind we saw in this chapter will probably handle exceptions as well. Simpler one-shot scripts, though, will often ignore exception handling completely because failure at any step requires script shutdown.

Catching Too Much: Avoid Empty except and Exception

As mentioned, exception handler generality is a key design choice. Python lets you pick and choose which exceptions to catch, but you sometimes have to be careful to not be too inclusive. For example, you’ve seen that an empty except clause catches every exception that might be raised while the code in the try block runs.

That’s easy to code, and sometimes desirable, but you may also wind up intercepting an error that’s expected by a try handler higher up in the exception nesting structure. For example, an exception handler such as the following catches and stops every exception that reaches it, regardless of whether another handler is waiting for it:

    def func():
        try:
            ...                      # IndexError is raised in here
        except:
            ...                      # But everything comes here and dies!

    try:
        func()
    except IndexError:               # Exception should be processed here
        ...

Perhaps worse, such code might also catch unrelated system exceptions. Even things like memory errors, genuine programming mistakes, iteration stops, keyboard interrupts, and system exits raise exceptions in Python. Unless you’re writing a debugger or similar tool, such exceptions should not usually be intercepted in your code.

For example, scripts normally exit when control falls off the end of the top-level file. However, Python also provides a built-in sys.exit(statuscode) call to allow early terminations. This actually works by raising a built-in SystemExit exception to end the program, so that try/finally handlers run on the way out and special types of programs can intercept the event.[71] Because of this, a try with an empty except might unknowingly prevent a crucial exit, as in the following file (exiter.py):

    import sys

    def bye():
        sys.exit(40)                 # Crucial error: abort now!

    try:
        bye()
    except:
        print('got it')              # Oops--we ignored the exit
    print('continuing...')

    % python exiter.py
    got it
    continuing...

You simply might not expect all the kinds of exceptions that could occur during an operation. Using the built-in exception classes of the prior chapter can help in this particular case, because the Exception superclass is not a superclass of SystemExit:

    try:
        bye()
    except Exception:                # Won't catch exits, but _will_ catch many others
        ...

In other cases, though, this scheme is no better than an empty except clause—because Exception is a superclass above all built-in exceptions except system-exit events, it still has the potential to catch exceptions meant for elsewhere in the program.
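The split between Exception and the system-exit events can be checked directly. In this sketch, the classify function and the three do_ functions are invented for the example; the BaseException clause is there only to report what the Exception clause lets through:

```python
import sys

def classify(action):
    try:
        action()
    except Exception as exc:                 # Most built-in errors stop here
        return 'Exception: ' + type(exc).__name__
    except BaseException as exc:             # Exit-related events skip the clause above
        return 'BaseException only: ' + type(exc).__name__
    return 'no exception'

def do_exit():      sys.exit(40)
def do_interrupt(): raise KeyboardInterrupt  # What Ctrl-C raises
def do_index():     return [][0]

results = [classify(func) for func in (do_exit, do_interrupt, do_index)]
print(results)
```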

Probably worst of all, both using an empty except and catching the Exception superclass will also catch genuine programming errors, which should be allowed to pass most of the time. In fact, these two techniques can effectively turn off Python’s error-reporting machinery, making it difficult to notice mistakes in your code. Consider this code, for example:

    mydictionary = {...}
    ...
    try:
        x = myditctionary['spam']    # Oops: misspelled
    except:
        x = None                     # Assume we got KeyError
    ...continue here with x...

The coder here assumes that the only sort of error that can happen when indexing a dictionary is a missing key error. But because the name myditctionary is misspelled (it should say mydictionary), Python raises a NameError instead for the undefined name reference, which the handler will silently catch and ignore. The handler will incorrectly fill in a None default for the dictionary access, masking the program error.

Moreover, catching Exception here will not help—it would have the exact same effect as an empty except, happily and silently filling in a default and masking a genuine program error you will probably want to know about. If this happens in code that is far removed from the place where the fetched values are used, it might make for a very interesting debugging task!

As a rule of thumb, be as specific in your handlers as you can be—empty except clauses and Exception catchers are handy, but potentially error-prone. In the last example, for instance, you would be better off saying except KeyError: to make your intentions explicit and avoid intercepting unrelated events. In simpler scripts, the potential for problems might not be significant enough to outweigh the convenience of a catchall, but in general, general handlers are generally trouble.
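Here is a sketch of the difference a specific handler makes. The function names and the dictionary content are assumptions for the example, and the misspelling is kept on purpose. With except KeyError, a missing key still yields the default, but the typo is reported instead of masked:

```python
mydictionary = {'eggs': 1}

def fetch(key):
    try:
        return mydictionary[key]
    except KeyError:                         # Only the expected failure is caught
        return None

def fetch_with_typo(key):
    try:
        return myditctionary[key]            # Deliberate misspelling
    except KeyError:
        return None

print(fetch('spam'))                         # Missing key: default is filled in
try:
    fetch_with_typo('spam')
except NameError as exc:
    typo_report = str(exc)                   # The mistake surfaces immediately
    print('NameError:', typo_report)
```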

Catching Too Little: Use Class-Based Categories

On the other hand, neither should handlers be too specific. When you list specific exceptions in a try, you catch only what you actually list. This isn’t necessarily a bad thing, but if a system evolves to raise other exceptions in the future, you may need to go back and add them to exception lists elsewhere in your code.

We saw this phenomenon at work in the prior chapter. For instance, the following handler is written to treat MyExcept1 and MyExcept2 as normal cases and everything else as an error. If you add a MyExcept3 in the future, though, it will be processed as an error unless you update the exception list:

    try:
        ...
    except (MyExcept1, MyExcept2):   # Breaks if you add a MyExcept3 later
        ...                          # Nonerrors
    else:
        ...                          # Assumed to be an error

Luckily, careful use of the class-based exceptions we discussed in the prior chapter can make this code maintenance trap go away completely. As we saw, if you catch a general superclass, you can add and raise more specific subclasses in the future without having to extend except clause lists manually; the superclass becomes an extensible exceptions category:

    try:
        ...
    except SuccessCategoryName:      # OK if you add a MyExcept3 subclass later
        ...                          # Nonerrors
    else:
        ...                          # Assumed to be an error

In other words, a little design goes a long way. The moral of the story is to be careful to be neither too general nor too specific in exception handlers, and to pick the granularity of your try statement wrappings wisely. Especially in larger systems, exception policies should be a part of the overall design.
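As a runnable sketch of the category idea, the following fills in the class definitions that the abstract listings leave out. The class bodies and the classify function are assumptions made for this illustration; only the names SuccessCategoryName and MyExcept1 through MyExcept3 come from the text above.

```python
class SuccessCategoryName(Exception):        # General superclass: the category
    pass

class MyExcept1(SuccessCategoryName): pass
class MyExcept2(SuccessCategoryName): pass
class MyExcept3(SuccessCategoryName): pass   # Added later: handlers unchanged

def classify(exc_class):
    """Raise an instance of exc_class and report how the handler treats it."""
    try:
        raise exc_class()
    except SuccessCategoryName:              # Matches superclass and subclasses
        return 'nonerror'
    except Exception:                        # Anything else is an error here
        return 'error'

for exc_class in (MyExcept1, MyExcept3, ValueError):
    print(exc_class.__name__, '=>', classify(exc_class))
```

Because MyExcept3 inherits from the category class, it is handled as a nonerror without any change to the try statement.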


[71] A related call, os._exit, also ends a program, but via an immediate termination—it skips cleanup actions, including any registered with the atexit module noted earlier, and cannot be intercepted with try/except or try/finally blocks. It is usually used only in spawned child processes, a topic beyond this book’s scope. See the library manual or follow-up texts for details.

Core Language Summary

Congratulations! This concludes your look at the fundamentals of the Python programming language. If you’ve gotten this far, you’ve become a fully operational Python programmer. There’s more optional reading in the advanced topics part ahead that I’ll describe in a moment. In terms of the essentials, though, the Python story—and this book’s main journey—is now complete.

Along the way, you’ve seen just about everything there is to see in the language itself, and in enough depth to apply to most of the code you are likely to encounter in the open source “wild.” You’ve studied built-in types, statements, and exceptions, as well as tools used to build up the larger program units of functions, modules, and classes. You’ve also explored important software design issues, the complete OOP paradigm, functional programming tools, program architecture concepts, alternative tool tradeoffs, and more, building a skill set now qualified to be turned loose on the task of developing real applications.

The Python Toolset

From this point forward, your future Python career will largely consist of becoming proficient with the toolset available for application-level Python programming. You’ll find this to be an ongoing task. The standard library, for example, contains hundreds of modules, and the public domain offers still more tools. It’s possible to spend decades seeking proficiency with all these tools, especially as new ones are constantly appearing to address new technologies (trust me on this—I’m at 20 years and counting!).

Speaking generally, Python provides a hierarchy of toolsets:

Built-ins

Built-in types like strings, lists, and dictionaries make it easy to write simple programs fast.

Python extensions

For more demanding tasks, you can extend Python by writing your own functions, modules, and classes.

Compiled extensions

Although we don’t cover this topic in this book, Python can also be extended with modules written in an external language like C or C++.

Because Python layers its toolsets, you can decide how deeply your programs need to delve into this hierarchy for any given task—you can use built-ins for simple scripts, add Python-coded extensions for larger systems, and code compiled extensions for advanced work. We’ve only covered the first two of these categories in this book, and that’s plenty to get you started doing substantial programming in Python.

Beyond this, there are tools, resources, or precedents for using Python in nearly any computer domain you can imagine. For pointers on where to go next, see Chapter 1’s overview of Python applications and users. You’ll likely find that with a powerful open source language like Python, common tasks are often much easier, and even enjoyable, than you might expect.

Development Tools for Larger Projects

Most of the examples in this book have been fairly small and self-contained. They were written that way on purpose, to help you master the basics. But now that you know all about the core language, it’s time to start learning how to use Python’s built-in and third-party interfaces to do real work.

In practice, Python programs can become substantially larger than the examples you’ve experimented with so far in this book. Even in Python, thousands of lines of code are not uncommon for nontrivial and useful programs, once you add up all the individual modules in the system. Though Python’s basic program-structuring tools, such as modules and classes, do much to manage this complexity, other tools can sometimes offer additional support.

For developing larger systems, you’ll find such support available in both Python and the public domain. You’ve seen some of these in action, and I’ve mentioned a few others. To help you on your next steps, here is a quick tour and summary of some of the most commonly used tools in this domain:

PyDoc and docstrings

PyDoc’s help function and HTML interfaces were introduced in Chapter 15. PyDoc provides a documentation system for your modules and objects, integrates with Python’s docstrings syntax, and is a standard part of the Python system. See Chapter 15 and Chapter 4 for more documentation source hints.
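As a brief reminder of how docstrings feed PyDoc, here is a small sketch. The square function is a made-up example; pydoc.render_doc produces the report as a string, which is roughly the text that help(square) would display interactively, and the plaintext renderer is used here to avoid terminal bold-face control characters.

```python
import pydoc

def square(x):
    """Return x multiplied by itself."""
    return x * x

# Build the documentation report as a plain string instead of paging it
report = pydoc.render_doc(square, renderer=pydoc.plaintext)
print(report)
```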

PyChecker and PyLint

Because Python is such a dynamic language, some programming errors are not reported until your program runs (even syntax errors are not caught until a file is run or imported). This isn’t a big drawback—as with most languages, it just means that you have to test your Python code before shipping it. At worst, with Python you essentially trade a compile phase for an initial testing phase. Furthermore, Python’s dynamic nature, automatic error messages, and exception model make it easier and quicker to find and fix errors than it is in some other languages. Unlike C, for example, Python does not crash completely on errors.

Still, tools can help here too. The PyChecker and PyLint systems provide support for catching common errors ahead of time, before your script runs. They serve similar roles to the lint program in C development. Some Python developers run their code through PyChecker prior to testing or delivery, to catch any lurking potential problems. In fact, it’s not a bad idea to try this when you’re first starting out—some of these tools’ warnings may help you learn to spot and avoid common Python mistakes. PyChecker and PyLint are third-party open source packages, available at the PyPI website or your friendly neighborhood web search engine. They may appear in IDE GUIs as well.

PyUnit (a.k.a. unittest)

In Chapter 25, we learned how to add self-test code to a Python file by using the __name__ == '__main__' trick at the bottom of the file, a simple unit-testing protocol. For more advanced testing purposes, Python comes with two testing support tools. The first, PyUnit (called unittest in the library manual), provides an object-oriented class framework for specifying and customizing test cases and expected results. It mimics the JUnit framework for Java. This is a sophisticated class-based unit testing system; see the Python library manual for details.
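To give a feel for the class-based style, here is a minimal sketch of a unittest test case. The average function and the AverageTests class are hypothetical names invented for this illustration; the loader and runner calls are standard unittest interfaces.

```python
import unittest

def average(values):                     # Hypothetical function under test
    return sum(values) / len(values)

class AverageTests(unittest.TestCase):
    def test_typical(self):
        self.assertEqual(average([1, 2, 3]), 2)

    def test_empty(self):
        # An empty list has no length to divide by
        with self.assertRaises(ZeroDivisionError):
            average([])

# Collect the test methods of one class and run them with a text report
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AverageTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print('passed' if result.wasSuccessful() else 'failed')
```

In a script file, it is more common to end with a call to unittest.main() under the __name__ == '__main__' test, which finds and runs every test case in the module; the explicit loader and runner are used here only to keep the sketch self-contained.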

doctest

The doctest standard library module provides a second and simpler approach to regression testing, based upon Python’s docstrings feature. Roughly, to use doctest, you cut and paste a log of an interactive testing session into the docstrings of your source files. doctest then extracts your docstrings, parses out the test cases and results, and reruns the tests to verify the expected results. doctest’s operation can be tailored in a variety of ways; see the library manual for more details.
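As a small sketch of the idea, the docstring below contains a pasted interactive session, and doctest reruns it to check the results. The double function is a made-up example; the finder and runner objects are standard doctest interfaces, used here so the check does not depend on which module the code happens to run in.

```python
import doctest

def double(x):
    """
    Return twice the argument.

    >>> double(4)
    8
    >>> double('ab')
    'abab'
    """
    return x * 2

# Extract the examples from the docstring and rerun them
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(double, name='double', globs={'double': double}):
    runner.run(test)

results = runner.summarize(verbose=False)
print(results)                           # Counts of failed and attempted tests
```

In everyday use, a call to doctest.testmod() at the bottom of a module file is the more typical way to run every docstring test in that module.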

IDEs

We discussed IDEs for Python in Chapter 3. IDEs such as IDLE provide a graphical environment for editing, running, debugging, and browsing your Python programs. Some advanced IDEs—such as Eclipse, Komodo, NetBeans, and others listed in Chapter 3—may support additional development tasks, including source control integration, code refactoring, project management tools, and more. See Chapter 3, the text editors page at http://www.python.org, and your favorite web search engine for more on available IDEs and GUI builders for Python.

Profilers

Because Python is so high-level and dynamic, intuitions about performance gleaned from experience with other languages usually don’t apply to Python code. To truly isolate performance bottlenecks in your code, you need to add timing logic with clock tools in the time or timeit modules, or run your code under the profile module. We saw an example of the timing modules at work when comparing the speed of iteration tools and Pythons in Chapter 21.
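For a quick reminder of the timing approach, the following sketch uses timeit to compare two codings of the same task. The two functions and the repetition count are arbitrary choices for illustration, and the times printed will vary by machine and Python version.

```python
import timeit

def with_loop(n=1000):
    result = []
    for i in range(n):
        result.append(i * 2)
    return result

def with_comprehension(n=1000):
    return [i * 2 for i in range(n)]

# timeit.timeit accepts a callable and returns total seconds for all calls
for func in (with_loop, with_comprehension):
    seconds = timeit.timeit(func, number=200)
    print('%-20s %.4f seconds' % (func.__name__, seconds))
```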

Profiling is usually your first optimization step—code for clarity, then profile to isolate bottlenecks, and then time alternative codings of the slow parts of your program. For the second of these steps, profile is a standard library module that implements a source code profiler for Python. It runs a string of code you provide (e.g., a script file import, or a call to a function) and then, by default, prints a report to the standard output stream that gives performance statistics—number of calls to each function, time spent in each function, and more.

The profile module can be run as a script or imported, and it may be customized in various ways; for example, it can save run statistics to a file to be analyzed later with the pstats module. To profile interactively, import the profile module and call profile.run('code'), passing in the code you wish to profile as a string (e.g., a call to a function, an import of a file, or code read from a file). To profile from a system shell command line, use a command of the form python -m profile main.py args (see Appendix A for more on this format). Also see Python’s standard library manuals for other profiling options; the cProfile module, for example, has identical interfaces to profile but runs with less overhead, so it may be better suited to profiling long-running programs.
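Here is a minimal sketch of profiling from within a program, using cProfile and pstats. The slow_sum and main functions are hypothetical stand-ins for real work. This version profiles a single call with the profiler object's runcall method rather than passing a code string to run, and sends the report to a string buffer instead of the standard output stream.

```python
import cProfile
import io
import pstats

def slow_sum(n):                         # Hypothetical workload
    total = 0
    for i in range(n):
        total += i * i
    return total

def main():
    return [slow_sum(10000) for _ in range(20)]

profiler = cProfile.Profile()
profiler.runcall(main)                   # Profile one call to main()

buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats('cumulative').print_stats(5)   # Top entries by cumulative time
report = buffer.getvalue()
print(report)
```

The report lists call counts and times per function, so a bottleneck such as slow_sum stands out near the top when sorted by cumulative time.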

Debuggers

We also discussed debugging options in Chapter 3 (see its sidebar Debugging Python Code). As a review, most development IDEs for Python support GUI-based debugging, and the Python standard library also includes a source code debugger module called pdb. This module provides a command-line interface and works much like common C language debuggers (e.g., dbx, gdb).

Much like the profiler, the pdb debugger can be run either interactively or from a command line and can be imported and called from a Python program. To use it interactively, import the module, start running code by calling a pdb function (e.g., pdb.run('main()')), and then type debugging commands from pdb’s interactive prompt. To launch pdb from a system shell command line, use a command of the form python -m pdb main.py args. pdb also includes a useful postmortem analysis call, pdb.pm(), which starts the debugger after an exception has been encountered, possibly in conjunction with Python’s -i flag. See Appendix A for more on these tools.
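As a rough sketch of postmortem debugging from within a program, the function below catches an exception and optionally hands its traceback to pdb.post_mortem. The run_with_postmortem name and its interactive flag are inventions for this illustration; the flag defaults to False so that the code can run unattended without stopping at a debugger prompt.

```python
import pdb
import sys

def run_with_postmortem(func, *args, interactive=False):
    """Call func(*args); on an exception, optionally open pdb at the failure."""
    try:
        return func(*args)
    except Exception:
        traceback_obj = sys.exc_info()[2]
        if interactive:
            pdb.post_mortem(traceback_obj)   # Opens the (Pdb) prompt
        return None

print(run_with_postmortem(len, 'spam'))       # Normal call: result returned
print(run_with_postmortem(lambda: 1 / 0))     # Failure: None, no prompt opened
```

Passing interactive=True at the interactive prompt would open the debugger at the point of the error, where commands such as p to print a variable and q to quit are available.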

Because IDEs such as IDLE also include point-and-click debugging interfaces, pdb isn’t as critical a tool today, except when a GUI isn’t available or when more control is desired. See Chapter 3 for tips on using IDLE’s debugging GUI interfaces. Really, neither pdb nor IDEs seem to be used much in practice—as noted in Chapter 3, most programmers either insert print statements or simply read Python’s error messages: perhaps not the most high-tech of approaches, but the practical tends to win the day in the Python world!

Shipping options

In Chapter 2, we introduced common tools for packaging Python programs. py2exe, PyInstaller, and others listed in that chapter can package byte code and the Python Virtual Machine into “frozen binary” standalone executables, which don’t require that Python be installed on the target machine and hide your system’s code. In addition, we learned in Chapter 2 that Python programs may be shipped in their source (.py) or byte code (.pyc) forms, and that import hooks support special packaging techniques such as automatic extraction of .zip files and byte code encryption.

We also briefly met the standard library’s distutils modules, which provide packaging options for Python modules and packages, and C-coded extensions; see the Python manuals for more details. The emerging Python “eggs” third-party packaging system provides another alternative that also accounts for dependencies; search the Web for more details.

Optimization options

When speed counts, there are a handful of options for optimizing your programs. The PyPy system described in Chapter 2 provides a just-in-time compiler for translating Python byte code to binary machine code, and Shed Skin offers a Python-to-C++ translator. You may also occasionally see .pyo optimized byte code files, generated and run with the -O Python command-line flag discussed in Chapter 22 and Chapter 34, and to be deployed in Chapter 39; because this provides a very modest performance boost, however, it is not commonly used except to remove debugging code.

As a last resort, you can also move parts of your program to a compiled language such as C to boost performance. See the book Programming Python and the Python standard manuals for more on C extensions. In general, Python’s speed tends to also improve over time, so upgrading to later releases may improve speed too—once you verify that they are faster for your code, that is (though largely repaired since, Python 3.0’s initial release was up to 1000X slower than 2.X on some IO operations!).

Other hints for larger projects

We’ve met a variety of core language features in this text that will also tend to become more useful once you start coding larger projects. These include module packages (Chapter 24), class-based exceptions (Chapter 34), class pseudoprivate attributes (Chapter 31), documentation strings (Chapter 15), module path configuration files (Chapter 22), hiding names from from * with __all__ lists and _X-style names (Chapter 25), adding self-test code with the __name__ == '__main__' trick (Chapter 25), using common design rules for functions and modules (Chapter 17, Chapter 19, and Chapter 25), using object-oriented design patterns (Chapter 31 and others), and so on.

To learn about other large-scale Python development tools available in the public domain, be sure to browse the pages at the PyPI website at http://www.python.org, and the Web at large. Applying Python is actually a larger topic than learning Python, and one we’ll have to delegate to follow-up resources here.

Chapter Summary

This chapter wrapped up the exceptions part of the book with a survey of design concepts, a look at common exception use cases, and a brief summary of commonly used development tools.

This chapter also wrapped up the core material of this book. At this point, you’ve been exposed to the full subset of Python that most programmers use—and probably more. In fact, if you have read this far, you should feel free to consider yourself an official Python programmer. Be sure to pick up a t-shirt or laptop sticker the next time you’re online (and don’t forget to add Python to your résumé the next time you dig it out).

The next and final part of this book is a collection of chapters dealing with topics that are advanced, but still in the core language category. These chapters are all optional reading, or at least deferrable reading, because not every Python programmer must delve into their subjects, and others can postpone these chapters’ topics until they are needed. Indeed, many of you can stop here and begin exploring Python’s roles in your application domains. Frankly, application libraries tend to be more important in practice than advanced—and to some, esoteric—language features.

On the other hand, if you do need to care about things like Unicode or binary data, have to deal with API-building tools such as descriptors, decorators, and metaclasses, or just want to dig a bit further in general, the next part of the book will help you get started. The larger examples in the final part will also give you a chance to see the concepts you’ve already learned being applied in more realistic ways.

As this is the end of the core material of this book, though, you get a break on the chapter quiz—just one question this time. As always, be sure to work through this part’s closing exercises to cement what you’ve learned in the past few chapters; because the next part is optional reading, this is the final end-of-part exercises session. If you want to see some examples of how what you’ve learned comes together in real scripts drawn from common applications, be sure to check out the “solution” to exercise 4 in Appendix D.

And if this is the end of your journey in this book, be sure to also see the “Bonus” section at the end of Chapter 41, the very last chapter in this book (for the sake of readers continuing on to the Advanced Topics part, I won’t spill the beans here).

Test Your Knowledge: Quiz

1. (This question is a repeat from the first quiz in Chapter 1—see, I told you it would be easy! :-) Why does “spam” show up in so many Python examples in books and on the Web?

Test Your Knowledge: Answers

1. Because Python is named after the British comedy group Monty Python (based on surveys I’ve conducted in classes, this is a much-too-well-kept secret in the Python world!). The spam reference comes from a Monty Python skit, set in a cafeteria whose menu items all seem to come with Spam. A couple trying to order food there keeps getting drowned out by a chorus of Vikings singing a song about Spam. No, really. And if I could insert an audio clip of that song here, I would...

Test Your Knowledge: Part VII Exercises

As we’ve reached the end of this part of the book, it’s time for a few exception exercises to give you a chance to practice the basics. Exceptions really are simple tools; if you get these, you’ve probably mastered the exceptions domain. See Part VII in Appendix D for the solutions.

1. try/except. Write a function called oops that explicitly raises an IndexError exception when called. Then write another function that calls oops inside a try/except statement to catch the error. What happens if you change oops to raise a KeyError instead of an IndexError? Where do the names KeyError and IndexError come from? (Hint: recall that all unqualified names generally come from one of four scopes.)

2. Exception objects and lists. Change the oops function you just wrote to raise an exception you define yourself, called MyError. Identify your exception with a class (unless you’re using Python 2.5 or earlier, you must). Then, extend the try statement in the catcher function to catch this exception and its instance in addition to IndexError, and print the instance you catch.

3. Error handling. Write a function called safe(func, *pargs, **kargs) that runs any function with any number of positional and/or keyword arguments by using the * arbitrary arguments header and call syntax, catches any exception raised while the function runs, and prints the exception using the exc_info call in the sys module. Then use your safe function to run your oops function from exercise 1 or 2. Put safe in a module file called exctools.py, and pass it the oops function interactively. What kind of error messages do you get? Finally, expand safe to also print a Python stack trace when an error occurs by calling the print_exc function in the standard traceback module; see earlier in this chapter, and consult the Python library reference manual for usage details. We could probably code safe as a function decorator using Chapter 32 techniques, but we’ll have to move on to the next part of the book to learn fully how (see the solutions for a preview).

4. Self-study examples. At the end of Appendix D, I’ve included a handful of example scripts developed as group exercises in live Python classes for you to study and run on your own in conjunction with Python’s standard manual set. These are not described, and they use tools in the Python standard library that you’ll have to research on your own. Still, for many readers, it helps to see how the concepts we’ve discussed in this book come together in real programs. If these whet your appetite for more, you can find a wealth of larger and more realistic application-level Python program examples in follow-up books like Programming Python and on the Web.