Learning Python (2013)

Part VI. Classes and OOP

Chapter 29. Class Coding Details

If you haven’t quite gotten all of Python OOP yet, don’t worry; now that we’ve had a first tour, we’re going to dig a bit deeper and study the concepts introduced earlier in further detail. In this and the following chapter, we’ll take another look at class mechanics. Here, we’re going to study classes, methods, and inheritance, formalizing and expanding on some of the coding ideas introduced in Chapter 27. Because the class is our last namespace tool, we’ll summarize Python’s namespace and scope concepts as well.

The next chapter continues this in-depth second pass over class mechanics by covering one specific aspect: operator overloading. Besides presenting additional details, this chapter and the next also give us an opportunity to explore some larger classes than those we have studied so far.

Content note: if you’ve been reading linearly, some of this chapter will be review and summary of topics introduced in the preceding chapter’s case study, revisited here by language topics with smaller and more self-contained examples for readers new to OOP. Others may be tempted to skip some of this chapter, but be sure to see the namespace coverage here, as it explains some subtleties in Python’s class model.

The class Statement

Although the Python class statement may seem similar to tools in other OOP languages on the surface, on closer inspection, it is quite different from what some programmers are used to. For example, as in C++, the class statement is Python’s main OOP tool, but unlike in C++, Python’s class is not a declaration. Like a def, a class statement is an object builder and an implicit assignment: when run, it generates a class object and stores a reference to it in the name used in the header. Also like a def, a class statement is true executable code: your class doesn’t exist until Python reaches and runs the class statement that defines it. This typically occurs while importing the module it is coded in, but not before.

General Form

class is a compound statement, with a body of statements, typically indented, appearing under the header. In the header, superclasses are listed in parentheses after the class name, separated by commas. Listing more than one superclass leads to multiple inheritance, which we’ll discuss more formally in Chapter 31. Here is the statement’s general form:

class name(superclass,...):        # Assign to name
    attr = value                   # Shared class data
    def method(self,...):          # Methods
        self.attr = value          # Per-instance data

Within the class statement, any assignments generate class attributes, and specially named methods overload operators; for instance, a function called __init__ is called at instance object construction time, if defined.

Example

As we’ve seen, classes are mostly just namespaces—that is, tools for defining names (i.e., attributes) that export data and logic to clients. A class statement effectively defines a namespace. Just as in a module file, the statements nested in a class statement body create its attributes. When Python executes a class statement (not a call to a class), it runs all the statements in its body, from top to bottom. Assignments that happen during this process create names in the class’s local scope, which become attributes in the associated class object. Because of this, classes resemble both modules and functions:

§ Like functions, class statements are local scopes where names created by nested assignments live.

§ Like names in a module, names assigned in a class statement become attributes in a class object.

The main distinction for classes is that their namespaces are also the basis of inheritance in Python; attributes that are referenced but not found in a class or instance object are fetched from other classes.

Because class is a compound statement, any sort of statement can be nested inside its body—print, assignments, if, def, and so on. All the statements inside the class statement run when the class statement itself runs (not when the class is later called to make an instance). Typically, assignment statements inside the class statement make data attributes, and nested defs make method attributes. In general, though, any type of name assignment at the top level of a class statement creates a same-named attribute of the resulting class object.
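To make the timing concrete, here is a minimal sketch of a class statement whose body contains more than assignments and defs. The names log, Demo, and flavor are hypothetical, invented only for this illustration; the point is simply that the body runs once, when the class statement runs, and not again when the class is called:

```python
import sys

log = []                               # Hypothetical list recording when code runs

class Demo:
    log.append('class body running')   # Runs when the class statement runs
    if sys.version_info[0] >= 3:       # Any statement may be nested here
        flavor = 'py3'
    else:
        flavor = 'py2'

    def method(self):                  # def assigns the name 'method' in the class
        return 'called'

log.append('class statement done')
obj = Demo()                           # Calling the class does not rerun the body
log.append('instance made')

print(log)
# Names assigned at the top level of the body became class attributes
print(sorted(n for n in vars(Demo) if not n.startswith('__')))
```

Both flavor and method show up in the class's namespace because both were assigned at the top level of the class statement, one by =, the other by def.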

For example, assignments of simple nonfunction objects to class attributes produce data attributes, shared by all instances:

>>> class SharedData:
        spam = 42                  # Generates a class data attribute

>>> x = SharedData()               # Make two instances
>>> y = SharedData()
>>> x.spam, y.spam                 # They inherit and share 'spam' (a.k.a. SharedData.spam)
(42, 42)

Here, because the name spam is assigned at the top level of a class statement, it is attached to the class and so will be shared by all instances. We can change it by going through the class name, and we can refer to it through either instances or the class:[57]

>>> SharedData.spam = 99
>>> x.spam, y.spam, SharedData.spam
(99, 99, 99)

Such class attributes can be used to manage information that spans all the instances—a counter of the number of instances generated, for example (we’ll expand on this idea by example in Chapter 32). Now, watch what happens if we assign the name spam through an instance instead of the class:

>>> x.spam = 88
>>> x.spam, y.spam, SharedData.spam
(88, 99, 99)

Assignments to instance attributes create or change the names in the instance, rather than in the shared class. More generally, inheritance searches occur only on attribute references, not on assignment: assigning to an object’s attribute always changes that object, and no other.[58] For example, y.spam is looked up in the class by inheritance, but the assignment to x.spam attaches a name to x itself.
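One way to see where each name actually lives is to inspect the namespace dictionaries of the objects involved. The following is a small sketch that repeats the SharedData class so it can run on its own; the use of __dict__ and del here is an addition for illustration, not part of the original session:

```python
class SharedData:
    spam = 42                          # Class attribute, shared by all instances

x = SharedData()
y = SharedData()

x.spam = 88                            # Assignment creates 'spam' in x only
print(x.__dict__)                      # {'spam': 88}
print(y.__dict__)                      # {}: y still inherits from the class
print('spam' in SharedData.__dict__)   # True

del x.spam                             # Remove the instance's own name
print(x.spam)                          # 42: the class attribute is visible again
```

Deleting the instance's name does not touch the class; it just uncovers the inherited version that the instance attribute was hiding.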

Here’s a more comprehensive example of this behavior that stores the same name in two places. Suppose we run the following class:

class MixedNames:                            # Define class
    data = 'spam'                            # Assign class attr
    def __init__(self, value):               # Assign method name
        self.data = value                    # Assign instance attr
    def display(self):
        print(self.data, MixedNames.data)    # Instance attr, class attr

This class contains two defs, which bind class attributes to method functions. It also contains an = assignment statement; because this assignment assigns the name data inside the class, it lives in the class’s local scope and becomes an attribute of the class object. Like all class attributes, this data is inherited and shared by all instances of the class that don’t have data attributes of their own.

When we make instances of this class, the name data is attached to those instances by the assignment to self.data in the constructor method:

>>> x = MixedNames(1)              # Make two instance objects
>>> y = MixedNames(2)              # Each has its own data
>>> x.display(); y.display()       # self.data differs, MixedNames.data is the same
1 spam
2 spam

The net result is that data lives in two places: in the instance objects (created by the self.data assignment in __init__), and in the class from which they inherit names (created by the data assignment in the class). The class’s display method prints both versions, by first qualifying the self instance, and then the class.

By using these techniques to store attributes in different objects, we determine their scope of visibility. When attached to classes, names are shared; in instances, names record per-instance data, not shared behavior or data. Although inheritance searches look up names for us, we can always get to an attribute anywhere in a tree by accessing the desired object directly.

In the preceding example, for instance, specifying x.data or self.data will return an instance name, which normally hides the same name in the class; however, MixedNames.data grabs the class’s version of the name explicitly. The next section describes one of the most common roles for such coding patterns, and explains more about the way we deployed it in the prior chapter.


[57] If you’ve used C++ you may recognize this as similar to the notion of C++’s “static” data members—members that are stored in the class, independent of instances. In Python, it’s nothing special: all class attributes are just names assigned in the class statement, whether they happen to reference functions (C++’s “methods”) or something else (C++’s “members”). In Chapter 32, we’ll also meet Python static methods (akin to those in C++), which are just self-less functions that usually process class attributes.

[58] Unless the class has redefined the attribute assignment operation to do something unique with the __setattr__ operator overloading method (discussed in Chapter 30), or uses advanced attribute tools such as properties and descriptors (discussed in Chapter 32 and Chapter 38). Much of this chapter presents the normal case, which suffices at this point in the book, but as we’ll see later, Python hooks allow programs to deviate from the norm often.

Methods

Because you already know about functions, you also know about methods in classes. Methods are just function objects created by def statements nested in a class statement’s body. From an abstract perspective, methods provide behavior for instance objects to inherit. From a programming perspective, methods work in exactly the same way as simple functions, with one crucial exception: a method’s first argument always receives the instance object that is the implied subject of the method call.

In other words, Python automatically maps instance method calls to a class’s method functions as follows. Method calls made through an instance, like this:

instance.method(args...)

are automatically translated to class method function calls of this form:

class.method(instance, args...)

where Python determines the class by locating the method name using the inheritance search procedure. In fact, both call forms are valid in Python.

Besides the normal inheritance of method attribute names, the special first argument is the only real magic behind method calls. In a class’s method, the first argument is usually called self by convention (technically, only its position is significant, not its name). This argument provides methods with a hook back to the instance that is the subject of the call—because classes generate many instance objects, they need to use this argument to manage data that varies per instance.

C++ programmers may recognize Python’s self argument as being similar to C++’s this pointer. In Python, though, self is always explicit in your code: methods must always go through self to fetch or change attributes of the instance being processed by the current method call. This explicit nature of self is by design—the presence of this name makes it obvious that you are using instance attribute names in your script, not names in the local or global scope.
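The mapping between the two call forms can be checked directly. In this sketch the Greeter class and its greet method are hypothetical names made up for the illustration; the __self__ attribute of a bound method is standard in Python 2.6+ and 3.X:

```python
class Greeter:                         # Hypothetical class for illustration
    def __init__(self, name):
        self.name = name
    def greet(self, punct):
        return 'Hello, ' + self.name + punct

g = Greeter('Bob')
via_instance = g.greet('!')            # Instance call: self is passed automatically
via_class = Greeter.greet(g, '!')      # Class call: instance passed manually
print(via_instance == via_class)       # True

bound = g.greet                        # Bound method: pairs the function with g
print(bound.__self__ is g)             # True
print(bound('?'))                      # Hello, Bob?
```

Fetching g.greet without calling it yields an object that remembers the instance, which is why only one argument needs to be passed when it is finally called.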

Method Example

To clarify these concepts, let’s turn to an example. Suppose we define the following class:

class NextClass:                   # Define class
    def printer(self, text):       # Define method
        self.message = text        # Change instance
        print(self.message)        # Access instance

The name printer references a function object; because it’s assigned in the class statement’s scope, it becomes a class object attribute and is inherited by every instance made from the class. Normally, because methods like printer are designed to process instances, we call them through instances:

>>> x = NextClass()                # Make instance
>>> x.printer('instance call')     # Call its method
instance call
>>> x.message                      # Instance changed
'instance call'

When we call the method by qualifying an instance like this, printer is first located by inheritance, and then its self argument is automatically assigned the instance object (x); the text argument gets the string passed at the call ('instance call'). Notice that because Python automatically passes the first argument to self for us, we only actually have to pass in one argument. Inside printer, the name self is used to access or set per-instance data because it refers back to the instance currently being processed.

As we’ve seen, though, methods may be called in one of two ways—through an instance, or through the class itself. For example, we can also call printer by going through the class name, provided we pass an instance to the self argument explicitly:

>>> NextClass.printer(x, 'class call')     # Direct class call
class call
>>> x.message                              # Instance changed again
'class call'

Calls routed through the instance and the class have the exact same effect, as long as we pass the same instance object ourselves in the class form. By default, in fact, you get an error message if you try to call a method without any instance:

>>> NextClass.printer('bad call')
TypeError: unbound method printer() must be called with NextClass instance...
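The error text shown above is the one Python 2.X prints. In 3.X the same call also fails with a TypeError, but for a slightly different reason: the string is simply passed to self, and Python then complains that the text argument is missing. A version-neutral way to observe this, sketched here with the class repeated so the code runs standalone, is to catch the exception rather than depend on its wording:

```python
class NextClass:
    def printer(self, text):
        self.message = text
        print(self.message)

error = None
try:
    NextClass.printer('bad call')      # No instance passed
except TypeError as exc:
    error = exc                        # Same exception class in 2.X and 3.X
    print('TypeError:', exc)           # Message text varies by version
```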

Calling Superclass Constructors

Methods are normally called through instances. Calls to methods through a class, though, do show up in a variety of special roles. One common scenario involves the constructor method. The __init__ method, like all attributes, is looked up by inheritance. This means that at construction time, Python locates and calls just one __init__. If subclass constructors need to guarantee that superclass construction-time logic runs, too, they generally must call the superclass’s __init__ method explicitly through the class:

class Super:
    def __init__(self, x):
        ...default code...

class Sub(Super):
    def __init__(self, x, y):
        Super.__init__(self, x)        # Run superclass __init__
        ...custom code...              # Do my init actions

I = Sub(1, 2)

This is one of the few contexts in which your code is likely to call an operator overloading method directly. Naturally, you should call the superclass constructor this way only if you really want it to run—without the call, the subclass replaces it completely. For a more realistic illustration of this technique in action, see the Manager class example in the prior chapter’s tutorial.[59]
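Because the preceding listing elides the method bodies, here is a runnable sketch of the same pattern. The attribute assignments standing in for the default and custom code, and the Forgetful subclass, are assumptions added purely for illustration:

```python
class Super:
    def __init__(self, x):
        self.x = x                     # Stand-in for superclass construction logic

class Sub(Super):
    def __init__(self, x, y):
        Super.__init__(self, x)        # Run superclass __init__ explicitly
        self.y = y                     # Then do subclass-specific setup

class Forgetful(Super):                # Hypothetical subclass that skips the call
    def __init__(self, x, y):
        self.y = y                     # Super.__init__ never runs

I = Sub(1, 2)
print(I.x, I.y)                        # 1 2

J = Forgetful(1, 2)
print(hasattr(J, 'x'))                 # False: x was never set
```

The second subclass shows the replacement case: its constructor is the only one Python runs, so whatever the superclass constructor would have set up is simply absent.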

Other Method Call Possibilities

This pattern of calling methods through a class is the general basis of extending—instead of completely replacing—inherited method behavior. It requires an explicit instance to be passed because all methods do by default. Technically, this is because methods are instance methods in the absence of any special code.

In Chapter 32, we’ll also meet a newer option added in Python 2.2, static methods, that allow you to code methods that do not expect instance objects in their first arguments. Such methods can act like simple instanceless functions, with names that are local to the classes in which they are coded, and may be used to manage class data. A related concept we’ll meet in the same chapter, the class method, receives a class when called instead of an instance and can be used to manage per-class data, and is implied in metaclasses.

These are both advanced and usually optional extensions, though. Normally, an instance must always be passed to a method—whether automatically when it is called through an instance, or manually when you call through a class.
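As a brief preview only, the following sketch contrasts the three kinds of methods using the built-in staticmethod and classmethod decorators. The Counter class and its method names are hypothetical, and the details are left to the later chapter:

```python
class Counter:                         # Hypothetical class for illustration
    count = 0                          # Per-class data

    def __init__(self):
        Counter.count += 1             # Instance method: receives an instance

    @staticmethod
    def describe():                    # No instance or class is passed in
        return 'instances made: %d' % Counter.count

    @classmethod
    def reset(cls):                    # Receives the class, not an instance
        cls.count = 0
        return cls

a, b = Counter(), Counter()
print(Counter.describe())              # instances made: 2
print(Counter.reset() is Counter)      # True
print(a.describe())                    # instances made: 0
```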

NOTE

Per the sidebar What About super? in Chapter 28, Python also has a super built-in function that allows calling back to a superclass’s methods more generically, but we’ll defer its presentation until Chapter 32 due to its downsides and complexities. See the aforementioned sidebar for more details; this call has well-known tradeoffs in basic usage, and an esoteric advanced use case that requires universal deployment to be most effective. Because of these issues, this book prefers to call superclasses by explicit name instead of super as a policy; if you’re new to Python, I recommend the same approach for now, especially for your first pass over OOP. Learn the simple way now, so you can compare it to others later.


[59] On a related note, you can also code multiple __init__ methods within the same class, but only the last definition will be used; see Chapter 31 for more details on multiple method definitions.

Inheritance

Of course, the whole point of the namespace created by the class statement is to support name inheritance. This section expands on some of the mechanisms and roles of attribute inheritance in Python.

As we’ve seen, in Python, inheritance happens when an object is qualified, and it involves searching an attribute definition tree—one or more namespaces. Every time you use an expression of the form object.attr where object is an instance or class object, Python searches the namespace tree from bottom to top, beginning with object, looking for the first attr it can find. This includes references to self attributes in your methods. Because lower definitions in the tree override higher ones, inheritance forms the basis of specialization.

Attribute Tree Construction

Figure 29-1 summarizes the way namespace trees are constructed and populated with names. Generally:

§ Instance attributes are generated by assignments to self attributes in methods.

§ Class attributes are created by statements (assignments) in class statements.

§ Superclass links are made by listing classes in parentheses in a class statement header.

The net result is a tree of attribute namespaces that leads from an instance, to the class it was generated from, to all the superclasses listed in the class header. Python searches upward in this tree, from instances to superclasses, each time you use qualification to fetch an attribute name from an instance object.[60]

Figure 29-1. Program code creates a tree of objects in memory to be searched by attribute inheritance. Calling a class creates a new instance that remembers its class, running a class statement creates a new class, and superclasses are listed in parentheses in the class statement header. Each attribute reference triggers a new bottom-up tree search—even references to self attributes within a class’s methods.
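The links that make up this tree are ordinary attributes that we can inspect: each instance has a __class__, each class has a __bases__ tuple, and each object's own names live in its __dict__. The sketch below uses them to mimic the bottom-up search. The classes Top, Middle, and Bottom and the find function are hypothetical, and the search is deliberately simplified: it ignores the ordering rules of multiple inheritance and the more advanced hooks covered later in the book.

```python
class Top(object):                     # Hypothetical classes for illustration
    a = 1

class Middle(Top):
    b = 2

class Bottom(Middle):
    def __init__(self):
        self.c = 3                     # Instance attribute

def find(obj, name):
    """Simplified bottom-up search: instance, its class, then superclasses."""
    if name in obj.__dict__:           # Look in the instance namespace first
        return 'instance', obj.__dict__[name]
    pending = [obj.__class__]          # Then climb the class links
    while pending:
        klass = pending.pop(0)
        if name in klass.__dict__:
            return klass.__name__, klass.__dict__[name]
        pending.extend(klass.__bases__)
    raise AttributeError(name)

x = Bottom()
print(x.__class__.__name__)            # Bottom
print(Bottom.__bases__ == (Middle,))   # True
for attr in ('c', 'b', 'a'):
    print(attr, find(x, attr))         # Shows where each name was found
```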

Specializing Inherited Methods

The tree-searching model of inheritance just described turns out to be a great way to specialize systems. Because inheritance finds names in subclasses before it checks superclasses, subclasses can replace default behavior by redefining their superclasses’ attributes. In fact, you can build entire systems as hierarchies of classes, which you extend by adding new external subclasses rather than changing existing logic in place.

The idea of redefining inherited names leads to a variety of specialization techniques. For instance, subclasses may replace inherited attributes completely, provide attributes that a superclass expects to find, and extend superclass methods by calling back to the superclass from an overridden method. We’ve already seen some of these patterns in action; here’s a self-contained example of extension at work:

>>> class Super:
        def method(self):
            print('in Super.method')

>>> class Sub(Super):
        def method(self):                    # Override method
            print('starting Sub.method')     # Add actions here
            Super.method(self)               # Run default action
            print('ending Sub.method')

Direct superclass method calls are the crux of the matter here. The Sub class replaces Super’s method function with its own specialized version, but within the replacement, Sub calls back to the version exported by Super to carry out the default behavior. In other words, Sub.method just extends Super.method’s behavior, rather than replacing it completely:

>>> x = Super()                    # Make a Super instance
>>> x.method()                     # Runs Super.method
in Super.method

>>> x = Sub()                      # Make a Sub instance
>>> x.method()                     # Runs Sub.method, calls Super.method
starting Sub.method
in Super.method
ending Sub.method

This extension coding pattern is also commonly used with constructors; see the section Methods for an example.

Class Interface Techniques

Extension is only one way to interface with a superclass. The file shown in this section, specialize.py, defines multiple classes that illustrate a variety of common techniques:

Super

Defines a method function and a delegate that expects an action in a subclass.

Inheritor

Doesn’t provide any new names, so it gets everything defined in Super.

Replacer

Overrides Super’s method with a version of its own.

Extender

Customizes Super’s method by overriding and calling back to run the default.

Provider

Implements the action method expected by Super’s delegate method.

Study each of these subclasses to get a feel for the various ways they customize their common superclass. Here’s the file:

class Super:
    def method(self):
        print('in Super.method')           # Default behavior
    def delegate(self):
        self.action()                      # Expected to be defined

class Inheritor(Super):                    # Inherit method verbatim
    pass

class Replacer(Super):                     # Replace method completely
    def method(self):
        print('in Replacer.method')

class Extender(Super):                     # Extend method behavior
    def method(self):
        print('starting Extender.method')
        Super.method(self)
        print('ending Extender.method')

class Provider(Super):                     # Fill in a required method
    def action(self):
        print('in Provider.action')

if __name__ == '__main__':
    for klass in (Inheritor, Replacer, Extender):
        print('\n' + klass.__name__ + '...')
        klass().method()
    print('\nProvider...')
    x = Provider()
    x.delegate()

A few things are worth pointing out here. First, notice how the self-test code at the end of this example creates instances of three different classes in a for loop. Because classes are objects, you can store them in a tuple and create instances generically with no extra syntax (more on this idea later). Classes also have the special __name__ attribute, like modules; it’s preset to a string containing the name in the class header. Here’s what happens when we run the file:

% python specialize.py

Inheritor...
in Super.method

Replacer...
in Replacer.method

Extender...
starting Extender.method
in Super.method
ending Extender.method

Provider...
in Provider.action

Abstract Superclasses

Of the prior example’s classes, Provider may be the most crucial to understand. When we call the delegate method through a Provider instance, two independent inheritance searches occur:

1. On the initial x.delegate call, Python finds the delegate method in Super by searching the Provider instance and above. The instance x is passed into the method’s self argument as usual.

2. Inside the Super.delegate method, self.action invokes a new, independent inheritance search of self and above. Because self references a Provider instance, the action method is located in the Provider subclass.

This “filling in the blanks” sort of coding structure is typical of OOP frameworks. In a more realistic context, the method filled in this way might handle an event in a GUI, provide data to be rendered as part of a web page, process a tag’s text in an XML file, and so on—your subclass provides specific actions, but the framework handles the rest of the overall job.

At least in terms of the delegate method, the superclass in this example is what is sometimes called an abstract superclass: a class that expects parts of its behavior to be provided by its subclasses. If an expected method is not defined in a subclass, Python raises an undefined name exception (an AttributeError) when the inheritance search fails.
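To see the failure case, we can call delegate on an object whose class tree has no action at all. This sketch trims Super down to the delegate method and repeats Provider so it runs on its own; catching the exception is an addition for illustration:

```python
class Super:
    def delegate(self):
        self.action()                  # Expected to be defined in a subclass

class Provider(Super):
    def action(self):
        print('in Provider.action')

Provider().delegate()                  # in Provider.action

missing = None
try:
    Super().delegate()                 # No action anywhere in the tree
except AttributeError as exc:
    missing = exc                      # The inheritance search failed
    print('AttributeError:', exc)
```

Note that the error surfaces only when delegate actually runs, not when the Super instance is created.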

Class coders sometimes make such subclass requirements more obvious with assert statements, or by raising the built-in NotImplementedError exception with raise statements. We’ll study statements that may trigger exceptions in depth in the next part of this book; as a quick preview, here’s the assert scheme in action:

class Super:
    def delegate(self):
        self.action()
    def action(self):
        assert False, 'action must be defined!'    # If this version is called

>>> X = Super()
>>> X.delegate()
AssertionError: action must be defined!

We’ll meet assert in Chapter 33 and Chapter 34; in short, if its first expression evaluates to false, it raises an exception with the provided error message. Here, the expression is always false so as to trigger an error message if a method is not redefined, and inheritance locates the version here. Alternatively, some classes simply raise a NotImplementedError exception directly in such method stubs to signal the mistake:

class Super:
    def delegate(self):
        self.action()
    def action(self):
        raise NotImplementedError('action must be defined!')

>>> X = Super()
>>> X.delegate()
NotImplementedError: action must be defined!

For instances of subclasses, we still get the exception unless the subclass provides the expected method to replace the default in the superclass:

>>> class Sub(Super): pass

>>> X = Sub()
>>> X.delegate()
NotImplementedError: action must be defined!

>>> class Sub(Super):
        def action(self): print('spam')

>>> X = Sub()
>>> X.delegate()
spam

For a somewhat more realistic example of this section’s concepts in action, see the “Zoo animal hierarchy” exercise (Exercise 8) at the end of Chapter 32, and its solution in “Part VI, Classes and OOP” in Appendix D. Such taxonomies are a traditional way to introduce OOP, but they’re a bit removed from most developers’ job descriptions (with apologies to any readers who happen to work at the zoo!).

Abstract superclasses in Python 3.X and 2.6+: Preview

As of Python 2.6 and 3.0, the prior section’s abstract superclasses (a.k.a. “abstract base classes”), which require methods to be filled in by subclasses, may also be implemented with special class syntax. The way we code this varies slightly depending on the version. In Python 3.X, we use a keyword argument in a class header, along with special @ decorator syntax, both of which we’ll study in detail later in this book:

from abc import ABCMeta, abstractmethod

class Super(metaclass=ABCMeta):
    @abstractmethod
    def method(self, ...):
        pass

But in Python 2.6 and 2.7, we use a class attribute instead:

class Super:
    __metaclass__ = ABCMeta
    @abstractmethod
    def method(self, ...):
        pass

Either way, the effect is the same—we can’t make an instance unless the method is defined lower in the class tree. In 3.X, for example, here is the special syntax equivalent of the prior section’s example:

>>> from abc import ABCMeta, abstractmethod
>>>
>>> class Super(metaclass=ABCMeta):
        def delegate(self):
            self.action()
        @abstractmethod
        def action(self):
            pass

>>> X = Super()
TypeError: Can't instantiate abstract class Super with abstract methods action

>>> class Sub(Super): pass

>>> X = Sub()
TypeError: Can't instantiate abstract class Sub with abstract methods action

>>> class Sub(Super):
        def action(self): print('spam')

>>> X = Sub()
>>> X.delegate()
spam

Coded this way, a class with an abstract method cannot be instantiated (that is, we cannot create an instance by calling it) unless all of its abstract methods have been defined in subclasses. Although this requires more code and extra knowledge, the potential advantage of this approach is that errors for missing methods are issued when we attempt to make an instance of the class, not later when we try to call a missing method. This feature may also be used to define an expected interface, automatically verified in client classes.
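For readers on newer Pythons, note that the abc module also provides a helper class, ABC (available in 3.4 and later, so newer than the releases this chapter otherwise targets), which can be inherited from instead of writing metaclass=ABCMeta. The following sketch assumes such a Python, and it also peeks at the __abstractmethods__ attribute that records which methods remain unfilled:

```python
from abc import ABC, abstractmethod    # abc.ABC requires Python 3.4 or later

class Super(ABC):                      # Same effect as metaclass=ABCMeta
    def delegate(self):
        self.action()

    @abstractmethod
    def action(self):
        pass

class Sub(Super):
    def action(self):
        print('spam')

blocked = False
try:
    Super()                            # Abstract: cannot be instantiated
except TypeError as exc:
    blocked = True
    print('TypeError:', exc)           # Message wording varies by version

Sub().delegate()                       # spam
print(sorted(Super.__abstractmethods__))   # ['action']
print(sorted(Sub.__abstractmethods__))     # []
```

As in the session above, the error appears at instance creation time, before any method is called.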

Unfortunately, this scheme also relies on two advanced language tools we have not met yet—function decorators, introduced in Chapter 32 and covered in depth in Chapter 39, as well as metaclass declarations, mentioned in Chapter 32 and covered in Chapter 40—so we will finesse other facets of this option here. See Python’s standard manuals for more on this, as well as precoded abstract superclasses Python provides.


[60] Two fine points here: first, this description isn’t 100% complete, because we can also create instance and class attributes by assigning them to objects outside class statements, but that’s a much less common and sometimes more error-prone approach (changes aren’t isolated to class statements). In Python, all attributes are always accessible by default. We’ll talk more about attribute name privacy in Chapter 30 when we study __setattr__, in Chapter 31 when we meet __X names, and again in Chapter 39, where we’ll implement it with a class decorator.

Second, as also noted in Chapter 27, the full inheritance story grows more convoluted when advanced topics such as metaclasses and descriptors are added to the mix—and we’re deferring a formal definition until Chapter 40 for this reason. In common usage, though, it’s simply a way to redefine, and hence customize, behavior coded in classes.

Namespaces: The Conclusion

Now that we’ve examined class and instance objects, the Python namespace story is complete. For reference, I’ll quickly summarize all the rules used to resolve names here. The first things you need to remember are that qualified and unqualified names are treated differently, and that some scopes serve to initialize object namespaces:

§ Unqualified names (e.g., X) deal with scopes.

§ Qualified attribute names (e.g., object.X) use object namespaces.

§ Some scopes initialize object namespaces (for modules and classes).

These concepts sometimes interact—in object.X, for example, object is looked up per scopes, and then X is looked up in the result objects. Since scopes and namespaces are essential to understanding Python code, let’s summarize the rules in more detail.

Simple Names: Global Unless Assigned

As we’ve learned, unqualified simple names follow the LEGB lexical scoping rule outlined when we explored functions in Chapter 17:

Assignment (X = value)

Makes names local by default: creates or changes the name X in the current local scope, unless declared global (or nonlocal in 3.X).

Reference (X)

Looks for the name X in the current local scope, then any and all enclosing functions, then the current global scope, then the built-in scope, per the LEGB rule. Enclosing classes are not searched: class names are fetched as object attributes instead.

Also per Chapter 17, some special-case constructs localize names further (e.g., variables in some comprehensions and try statement clauses), but the vast majority of names follow the LEGB rule.
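The rule that enclosing classes are not searched is easy to trip over, so here is a short sketch of it. The class C and its method names are hypothetical; the key contrast is between the unqualified X inside a method and the qualified self.X or C.X:

```python
X = 'global'

class C:
    X = 'class'                        # Class attribute, in the class's local scope
    Y = X + '!'                        # Class-level code sees X as a local name

    def simple(self):
        return X                       # Unqualified: skips class C, finds global X

    def qualified(self):
        return self.X, C.X             # Qualified: fetched as object attributes

c = C()
print(C.Y)                             # class!
print(c.simple())                      # global
print(c.qualified())                   # ('class', 'class')
```

Code at the top level of the class statement sees the class's names as locals, but a def nested in the class does not; from inside a method, class names must be reached through self or the class name.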

Attribute Names: Object Namespaces

We’ve also seen that qualified attribute names refer to attributes of specific objects and obey the rules for modules and classes. For class and instance objects, the reference rules are augmented to include the inheritance search procedure:

Assignment (object.X = value)

Creates or alters the attribute name X in the namespace of the object being qualified, and none other. Inheritance-tree climbing happens only on attribute reference, not on attribute assignment.

Reference (object.X)

For class-based objects, searches for the attribute name X in object, then in all accessible classes above it, using the inheritance search procedure. For nonclass objects such as modules, fetches X from object directly.

As noted earlier, the preceding captures the normal and typical case. These attribute rules can vary in classes that utilize more advanced tools, especially for new-style classes—an option in 2.X and the standard in 3.X, which we’ll explore in Chapter 32. For example, reference inheritance can be richer than implied here when metaclasses are deployed, and classes which leverage attribute management tools such as properties, descriptors, and __setattr__ can intercept and route attribute assignments arbitrarily.

In fact, some inheritance is run on assignment too, to locate descriptors with a __set__ method in new-style classes; such tools override the normal rules for both reference and assignment. We’ll explore attribute management tools in depth in Chapter 38, and formalize inheritance and its use of descriptors in Chapter 40. For now, most readers should focus on the normal rules given here, which cover most Python application code.
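As one small taste of such deviations, the sketch below uses the built-in property to intercept assignments to an attribute. The Account class and the _balance name are hypothetical choices for this illustration; the full story of properties and descriptors is deferred to the later chapters mentioned above.

```python
class Account(object):                 # Hypothetical class; object base needed in 2.X
    def __init__(self, balance):
        self.balance = balance         # Routed through the property setter below

    @property
    def balance(self):
        return self._balance

    @balance.setter
    def balance(self, value):
        if value < 0:
            raise ValueError('balance cannot be negative')
        self._balance = value          # Stored under a different instance name

acct = Account(100)
acct.balance = 50                      # Runs the setter found in the class
print(acct.balance)                    # 50
print(sorted(acct.__dict__))           # ['_balance']: no 'balance' in the instance

try:
    acct.balance = -1
except ValueError as exc:
    print('ValueError:', exc)
```

Here the assignment acct.balance = 50 does not create a balance name in the instance; Python finds the property in the class first and hands the assignment to it.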

The “Zen” of Namespaces: Assignments Classify Names

With distinct search procedures for qualified and unqualified names, and multiple lookup layers for both, it can sometimes be difficult to tell where a name will wind up going. In Python, the place where you assign a name is crucial—it fully determines the scope or object in which a name will reside. The file manynames.py illustrates how this principle translates to code and summarizes the namespace ideas we have seen throughout this book (sans obscure special-case scopes like comprehensions):

# File manynames.py

X = 11                       # Global (module) name/attribute (X, or manynames.X)

def f():
    print(X)                 # Access global X (11)

def g():
    X = 22                   # Local (function) variable (X, hides module X)
    print(X)

class C:
    X = 33                   # Class attribute (C.X)
    def m(self):
        X = 44               # Local variable in method (X)
        self.X = 55          # Instance attribute (instance.X)

This file assigns the same name, X, five times—illustrative, though not exactly best practice! Because this name is assigned in five different locations, though, all five Xs in this program are completely different variables. From top to bottom, the assignments to X here generate: a module attribute (11), a local variable in a function (22), a class attribute (33), a local variable in a method (44), and an instance attribute (55). Although all five are named X, the fact that they are all assigned at different places in the source code or to different objects makes all of these unique variables.

You should take the time to study this example carefully because it collects ideas we’ve been exploring throughout the last few parts of this book. When it makes sense to you, you will have achieved Python namespace enlightenment. Or, you can run the code and see what happens—here’s the remainder of this source file, which makes an instance and prints all the Xs that it can fetch:

# manynames.py, continued

if __name__ == '__main__':
    print(X)                 # 11: module (a.k.a. manynames.X outside file)
    f()                      # 11: global
    g()                      # 22: local
    print(X)                 # 11: module name unchanged

    obj = C()                # Make instance
    print(obj.X)             # 33: class name inherited by instance

    obj.m()                  # Attach attribute name X to instance now
    print(obj.X)             # 55: instance
    print(C.X)               # 33: class (a.k.a. obj.X if no X in instance)

    #print(C.m.X)            # FAILS: only visible in method
    #print(g.X)              # FAILS: only visible in function

The outputs that are printed when the file is run are noted in the comments in the code; trace through them to see which variable named X is being accessed each time. Notice in particular that we can go through the class to fetch its attribute (C.X), but we can never fetch local variables in functions or methods from outside their def statements. Locals are visible only to other code within the def, and in fact only live in memory while a call to the function or method is executing.

Some of the names defined by this file are visible outside the file to other modules too, but recall that we must always import before we can access names in another file—name segregation is the main point of modules, after all:

# otherfile.py

import manynames

X = 66

print(X) # 66: the global here

print(manynames.X) # 11: globals become attributes after imports

manynames.f() # 11: manynames's X, not the one here!

manynames.g() # 22: local in other file's function

print(manynames.C.X) # 33: attribute of class in other module

I = manynames.C()

print(I.X) # 33: still from class here

I.m()

print(I.X) # 55: now from instance!

Notice here how manynames.f() prints the X in manynames, not the X assigned in this file—scopes are always determined by the position of assignments in your source code (i.e., lexically) and are never influenced by what imports what or who imports whom. Also, notice that the instance’s own X is not created until we call I.m()—attributes, like all variables, spring into existence when assigned, and not before. Normally we create instance attributes by assigning them in class __init__ constructor methods, but this isn’t the only option.

Finally, as we learned in Chapter 17, it’s also possible for a function to change names outside itself, with global and (in Python 3.X) nonlocal statements—these statements provide write access, but also modify assignment’s namespace binding rules:

X = 11                       # Global in module

def g1():
    print(X)                 # Reference global in module (11)

def g2():
    global X
    X = 22                   # Change global in module

def h1():
    X = 33                   # Local in function
    def nested():
        print(X)             # Reference local in enclosing scope (33)

def h2():
    X = 33                   # Local in function
    def nested():
        nonlocal X           # Python 3.X statement
        X = 44               # Change local in enclosing scope

Of course, you generally shouldn’t use the same name for every variable in your script—but as this example demonstrates, even if you do, Python’s namespaces will work to keep names used in one context from accidentally clashing with those used in another.
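
The functions in the preceding listing are defined but never called. To watch global and nonlocal take effect, here is a minimal runnable sketch for Python 3.X; the function names set_global, make_counter, and bump are invented for this illustration:

```python
X = 11

def set_global():
    global X
    X = 22                      # Rebinds the module-level X

def make_counter():
    count = 0                   # Local in the enclosing function
    def bump():
        nonlocal count          # Python 3.X only
        count += 1              # Rebinds count in make_counter's scope
        return count
    return bump

set_global()
print(X)                        # 22

bump = make_counter()
print(bump(), bump())           # 1 2
```

Without the global and nonlocal declarations, both assignments would instead create new local names, per the normal rule that assignment classifies a name.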

Nested Classes: The LEGB Scopes Rule Revisited

The preceding example summarized the effect of nested functions on scopes, which we studied in Chapter 17. It turns out that classes can be nested too—a useful coding pattern in some types of programs, with scope implications that follow naturally from what you already know, but that may not be obvious on first encounter. This section illustrates the concept by example.

Though they are normally coded at the top level of a module, classes also sometimes appear nested in functions that generate them. This is a variation on the “factory function” (a.k.a. closure) theme in Chapter 17, with similar state retention roles. There we noted that class statements introduce new local scopes much like function def statements, and that code in these scopes follows the same LEGB lookup rule as code in function definitions.

This rule applies both to the top level of the class itself, as well as to the top level of method functions nested within it. Both form the L layer in this rule—they are normal local scopes, with access to their names, names in any enclosing functions, globals in the enclosing module, and built-ins. Like modules, the class’s local scope morphs into an attribute namespace after the class statement is run.

Although classes have access to enclosing functions’ scopes, they do not act as enclosing scopes to code nested within the class: Python searches enclosing functions for referenced names, but never any enclosing classes. That is, a class is a local scope and has access to enclosing local scopes, but it does not serve as an enclosing local scope to further nested code. Because the search for names used in method functions skips the enclosing class, class attributes must be fetched as object attributes using inheritance.

For example, in the following nester function, all references to X are routed to the global scope except the last, which picks up a local scope redefinition (the section’s code is in file classscope.py, and the output of each example is described in its last two comments):

X = 1

def nester():
    print(X)                 # Global: 1
    class C:
        print(X)             # Global: 1
        def method1(self):
            print(X)         # Global: 1
        def method2(self):
            X = 3            # Hides global
            print(X)         # Local: 3
    I = C()
    I.method1()
    I.method2()

print(X)                     # Global: 1
nester()                     # Rest: 1, 1, 1, 3
print('-'*40)

Watch what happens, though, when we reassign the same name in nested function layers: the redefinitions of X create locals that hide those in enclosing scopes, just as for simple nested functions; the enclosing class layer does not change this rule, and in fact is irrelevant to it:

X = 1

def nester():
    X = 2                    # Hides global
    print(X)                 # Local: 2
    class C:
        print(X)             # In enclosing def (nester): 2
        def method1(self):
            print(X)         # In enclosing def (nester): 2
        def method2(self):
            X = 3            # Hides enclosing (nester)
            print(X)         # Local: 3
    I = C()
    I.method1()
    I.method2()

print(X)                     # Global: 1
nester()                     # Rest: 2, 2, 2, 3
print('-'*40)

And here’s what happens when we reassign the same name at multiple stops along the way: assignments in the local scopes of both functions and classes hide globals or enclosing function locals of the same name, regardless of the nesting involved:

X = 1

def nester():
    X = 2                    # Hides global
    print(X)                 # Local: 2
    class C:
        X = 3                # Class local hides nester's: C.X or I.X (not scoped)
        print(X)             # Local: 3
        def method1(self):
            print(X)         # In enclosing def (not 3 in class!): 2
            print(self.X)    # Inherited class local: 3
        def method2(self):
            X = 4            # Hides enclosing (nester, not class)
            print(X)         # Local: 4
            self.X = 5       # Hides class
            print(self.X)    # Located in instance: 5
    I = C()
    I.method1()
    I.method2()

print(X)                     # Global: 1
nester()                     # Rest: 2, 3, 2, 3, 4, 5
print('-'*40)

Most importantly, the lookup rules for simple names like X never search enclosing class statements, just defs, modules, and built-ins (it’s the LEGB rule, not CLEGB!). In method1, for example, X is found in the local scope of nester, the def that encloses the class, even though the class assigns the same name in its own local scope. To get to names assigned in the class (e.g., methods), we must fetch them as class or instance object attributes, via self.X in this case.

Believe it or not, we’ll see use cases for this nested classes coding pattern later in this book, especially in some of Chapter 39’s decorators. In this role, the enclosing function usually both serves as a class factory and provides retained state for later use in the enclosed class or its methods.
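
Until we reach those later examples, the class factory role can be sketched in a few lines. This is only an illustration under assumed names (make_tagger, Tagger, and prefix are hypothetical), not code from the decorators chapter:

```python
def make_tagger(prefix):
    # prefix lives in the enclosing function's scope and is retained
    class Tagger:
        def tag(self, text):
            return prefix + text        # Found in make_tagger, per the LEGB rule
    return Tagger

Warn = make_tagger('WARNING: ')
Info = make_tagger('info: ')

print(Warn().tag('disk full'))          # WARNING: disk full
print(Info().tag('started'))            # info: started
print(Warn is Info)                     # False: each call builds a new class
```

Each call to the factory runs the nested class statement again, so each generated class remembers the prefix that was in effect when it was made.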

Namespace Dictionaries: Review

In Chapter 23, we learned that module namespaces have a concrete implementation as dictionaries, exposed with the built-in __dict__ attribute. In Chapter 27 and Chapter 28, we learned that the same holds true for class and instance objects—attribute qualification is mostly a dictionary indexing operation internally, and attribute inheritance is largely a matter of searching linked dictionaries. In fact, within Python, instance and class objects are mostly just dictionaries with links between them. Python exposes these dictionaries, as well as their links, for use in advanced roles (e.g., for coding tools).

We put some of these tools to work in the prior chapter, but to summarize and help you better understand how attributes work internally, let’s work through an interactive session that traces the way namespace dictionaries grow when classes are involved. Now that we know more about methods and superclasses, we can also embellish the coverage here for a better look. First, let’s define a superclass and a subclass with methods that will store data in their instances:

>>> class Super:
        def hello(self):
            self.data1 = 'spam'

>>> class Sub(Super):
        def hola(self):
            self.data2 = 'eggs'

When we make an instance of the subclass, the instance starts out with an empty namespace dictionary, but it has links back to the class for the inheritance search to follow. In fact, the inheritance tree is explicitly available in special attributes, which you can inspect. Instances have a __class__ attribute that links to their class, and classes have a __bases__ attribute that is a tuple containing links to higher superclasses (I’m running this on Python 3.3; your name formats, internal attributes, and key orders may vary):

>>> X = Sub()

>>> X.__dict__ # Instance namespace dict

{}

>>> X.__class__ # Class of instance

<class '__main__.Sub'>

>>> Sub.__bases__ # Superclasses of class

(<class '__main__.Super'>,)

>>> Super.__bases__ # () empty tuple in Python 2.X

(<class 'object'>,)

As classes assign to self attributes, they populate the instance objects—that is, attributes wind up in the instances’ attribute namespace dictionaries, not in the classes’. An instance object’s namespace records data that can vary from instance to instance, and self is a hook into that namespace:

>>> Y = Sub()

>>> X.hello()

>>> X.__dict__

{'data1': 'spam'}

>>> X.hola()

>>> X.__dict__

{'data2': 'eggs', 'data1': 'spam'}

>>> list(Sub.__dict__.keys())

['__qualname__', '__module__', '__doc__', 'hola']

>>> list(Super.__dict__.keys())

['__module__', 'hello', '__dict__', '__qualname__', '__doc__', '__weakref__']

>>> Y.__dict__

{}

Notice the extra underscore names in the class dictionaries; Python sets these automatically, and we can filter them out with the generator expressions we saw in Chapter 27 and Chapter 28 that we won’t repeat here. Most are not used in typical programs, but there are tools that use some of them (e.g., __doc__ holds the docstrings discussed in Chapter 15).

Also, observe that Y, a second instance made at the start of this series, still has an empty namespace dictionary at the end, even though X’s dictionary has been populated by assignments in methods. Again, each instance has an independent namespace dictionary, which starts out empty and can record completely different attributes than those recorded by the namespace dictionaries of other instances of the same class.

Because attributes are actually dictionary keys inside Python, there are really two ways to fetch and assign their values—by qualification, or by key indexing:

>>> X.data1, X.__dict__['data1']

('spam', 'spam')

>>> X.data3 = 'toast'

>>> X.__dict__

{'data2': 'eggs', 'data3': 'toast', 'data1': 'spam'}

>>> X.__dict__['data3'] = 'ham'

>>> X.data3

'ham'

This equivalence applies only to attributes actually attached to the instance, though. Because attribute fetch qualification also performs an inheritance search, it can access inherited attributes that namespace dictionary indexing cannot. The inherited attribute X.hello, for instance, cannot be accessed by X.__dict__['hello'].
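
The difference between qualification and key indexing is easy to verify. The following standalone sketch rebuilds a trimmed version of this section's classes (Sub has no methods of its own here) and assumes nothing beyond built-in tools:

```python
class Super:
    def hello(self):
        self.data1 = 'spam'

class Sub(Super):
    pass

X = Sub()
X.hello()

print('data1' in X.__dict__)            # True: attached to the instance
print('hello' in X.__dict__)            # False: lives in Super, not the instance
print('hello' in Super.__dict__)        # True

try:
    X.__dict__['hello']                 # Indexing does no inheritance search
except KeyError as exc:
    print('KeyError:', exc)

print(callable(getattr(X, 'hello')))    # True: getattr runs the full search
```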

Experiment with these special attributes on your own to get a better feel for how namespaces actually do their attribute business. Also try running these objects through the dir function we met in the prior two chapters—dir(X) is similar to X.__dict__.keys(), but dir sorts its list and includes some inherited and built-in attributes. Even if you will never use these in the kinds of programs you write, seeing that they are just normal dictionaries can help solidify namespaces in general.

NOTE

In Chapter 32, we’ll also learn about slots, a somewhat advanced new-style class feature that stores attributes in instances, but not in their namespace dictionaries. It’s tempting to treat these as class attributes, and indeed, they appear in class namespaces where they manage the per-instance values. As we’ll see, though, slots may prevent a __dict__ from being created in the instance entirely, a possibility that generic tools must sometimes account for by using storage-neutral tools such as dir and getattr.
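
As a brief preview only, the following sketch shows the effect described in this note. It assumes Python 3.X (in 2.X the class would need to derive from object), and the Point class is a made-up example:

```python
class Point:
    __slots__ = ('x', 'y')              # No per-instance __dict__ is created

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
print(hasattr(p, '__dict__'))           # False
print(getattr(p, 'x'))                  # 1: getattr works regardless of storage
print('x' in dir(p))                    # True
print('x' in Point.__dict__)            # True: the slot manager lives in the class
```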

Namespace Links: A Tree Climber

The prior section demonstrated the special __class__ and __bases__ instance and class attributes, without really explaining why you might care about them. In short, these attributes allow you to inspect inheritance hierarchies within your own code. For example, they can be used to display a class tree, as in the following Python 3.X and 2.X example:

#!python
"""
classtree.py: Climb inheritance trees using namespace links,
displaying higher superclasses with indentation for height
"""

def classtree(cls, indent):
    print('.' * indent + cls.__name__)       # Print class name here
    for supercls in cls.__bases__:           # Recur to all superclasses
        classtree(supercls, indent+3)        # May visit super > once

def instancetree(inst):
    print('Tree of %s' % inst)               # Show instance
    classtree(inst.__class__, 3)             # Climb to its class

def selftest():
    class A:      pass
    class B(A):   pass
    class C(A):   pass
    class D(B,C): pass
    class E:      pass
    class F(D,E): pass
    instancetree(B())
    instancetree(F())

if __name__ == '__main__': selftest()

The classtree function in this script is recursive: it prints a class’s name using __name__, then climbs up to the superclasses by calling itself. This allows the function to traverse arbitrarily shaped class trees; the recursion climbs to the top, and stops at root superclasses that have empty __bases__ attributes. When using recursion, each active level of a function gets its own copy of the local scope; here, this means that cls and indent are different at each classtree level.

Most of this file is self-test code. When run standalone in Python 2.X, it builds a tree of empty classes, makes two instances from it, and prints their class tree structures:

C:\code> c:\python27\python classtree.py

Tree of <__main__.B instance at 0x00000000022C3A88>

...B

......A

Tree of <__main__.F instance at 0x00000000022C3A88>

...F

......D

.........B

............A

.........C

............A

......E

When run by Python 3.X, the tree includes the implied object superclass that is automatically added above standalone root (i.e., topmost) classes, because all classes are “new style” in 3.X—more on this change in Chapter 32:

C:\code> c:\python33\python classtree.py

Tree of <__main__.selftest.<locals>.B object at 0x00000000029216A0>

...B

......A

.........object

Tree of <__main__.selftest.<locals>.F object at 0x00000000029216A0>

...F

......D

.........B

............A

...............object

.........C

............A

...............object

......E

.........object

Here, indentation marked by periods is used to denote class tree height. Of course, we could improve on this output format, and perhaps even sketch it in a GUI display. Even as is, though, we can import these functions anywhere we want a quick display of a physical class tree:

C:\code> c:\python33\python

>>> class Emp: pass

>>> class Person(Emp): pass

>>> bob = Person()

>>> import classtree

>>> classtree.instancetree(bob)

Tree of <__main__.Person object at 0x000000000298B6D8>

...Person

......Emp

.........object

Regardless of whether you will ever code or use such tools, this example demonstrates one of the many ways that you can make use of special attributes that expose interpreter internals. You’ll see another when we code the lister.py general-purpose class display tools in Chapter 31’s section “Multiple Inheritance: ‘Mix-in’ Classes.” There, we will extend this technique to also display attributes in each object in a class tree, and to serve as a common superclass.

In the last part of this book, we’ll revisit such tools in the context of Python tool building at large, to code tools that implement attribute privacy, argument validation, and more. While not in every Python programmer’s job description, access to internals enables powerful development tools.
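
As one possible variation on the tree climber, the sketch below returns its lines as a list instead of printing them, which makes the result easier to reuse or test. The tree_lines function is a hypothetical rewrite, not part of classtree.py, and the __mro__ comparison assumes Python 3.X, where all classes are new style:

```python
def tree_lines(cls, indent=0):
    # Same climb as classtree, but collects lines rather than printing
    lines = ['.' * indent + cls.__name__]
    for supercls in cls.__bases__:
        lines.extend(tree_lines(supercls, indent + 3))
    return lines

class A: pass
class B(A): pass
class C(A): pass
class D(B, C): pass

print('\n'.join(tree_lines(D)))
print([c.__name__ for c in D.__mro__])  # Python 3.X: ['D', 'B', 'C', 'A', 'object']
```

Note the difference between the two views: the recursive climb lists A and object once per path that reaches them, while __mro__ gives the flattened order in which inheritance actually searches these classes, with each class appearing once.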

Documentation Strings Revisited

The last section’s example includes a docstring for its module, but remember that docstrings can be used for class components as well. Docstrings, which we covered in detail in Chapter 15, are string literals that show up at the top of various structures and are automatically saved by Python in the corresponding objects’ __doc__ attributes. This works for module files, function defs, and classes and methods.

Now that we know more about classes and methods, the following file, docstr.py, provides a quick but comprehensive example that summarizes the places where docstrings can show up in your code. All of these can be triple-quoted blocks or simpler one-liner literals like those here:

"I am: docstr.__doc__"

def func(args):
    "I am: docstr.func.__doc__"
    pass

class spam:
    "I am: spam.__doc__ or docstr.spam.__doc__ or self.__doc__"
    def method(self):
        "I am: spam.method.__doc__ or self.method.__doc__"
        print(self.__doc__)
        print(self.method.__doc__)

The main advantage of documentation strings is that they stick around at runtime. Thus, if it’s been coded as a docstring, you can qualify an object with its __doc__ attribute to fetch its documentation (printing the result interprets line breaks if it’s a multiline string):

>>> import docstr

>>> docstr.__doc__

'I am: docstr.__doc__'

>>> docstr.func.__doc__

'I am: docstr.func.__doc__'

>>> docstr.spam.__doc__

'I am: spam.__doc__ or docstr.spam.__doc__ or self.__doc__'

>>> docstr.spam.method.__doc__

'I am: spam.method.__doc__ or self.method.__doc__'

>>> x = docstr.spam()

>>> x.method()

I am: spam.__doc__ or docstr.spam.__doc__ or self.__doc__

I am: spam.method.__doc__ or self.method.__doc__

A discussion of the PyDoc tool, which knows how to format all these strings in reports and web pages, appears in Chapter 15. Here it is running its help function on our code under Python 2.X (Python 3.X shows additional attributes inherited from the implied object superclass in the new-style class model—run this on your own to see the 3.X extras, and watch for more about this difference in Chapter 32):

>>> help(docstr)
Help on module docstr:

NAME
    docstr - I am: docstr.__doc__

FILE
    c:\code\docstr.py

CLASSES
    spam

    class spam
     |  I am: spam.__doc__ or docstr.spam.__doc__ or self.__doc__
     |
     |  Methods defined here:
     |
     |  method(self)
     |      I am: spam.method.__doc__ or self.method.__doc__

FUNCTIONS
    func(args)
        I am: docstr.func.__doc__

Documentation strings are available at runtime, but they are less flexible syntactically than # comments, which can appear anywhere in a program. Both forms are useful tools, and any program documentation is good (as long as it’s accurate, of course!). As stated before, the Python “best practice” rule of thumb is to use docstrings for functional documentation (what your objects do) and hash-mark comments for more micro-level documentation (how arcane bits of code work).
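
To make the runtime difference concrete, here is a minimal sketch that pairs a triple-quoted docstring with a hash-mark comment. The area function is invented for this example; inspect.getdoc is a standard library call that returns the docstring with its indentation cleaned up:

```python
import inspect

def area(width, height):
    """
    Return the area of a rectangle.

    Both arguments are assumed to be numbers.
    """
    # This comment documents how the code works; it is not saved at runtime
    return width * height

print(area.__doc__ is not None)         # True: docstring retained on the function
print(inspect.getdoc(area))             # Docstring text, indentation normalized
```

The comment is visible only to someone reading the source file, while the docstring travels with the function object and can be fetched by tools such as help.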

Classes Versus Modules

Finally, let’s wrap up this chapter by briefly comparing the topics of this book’s last two parts: modules and classes. Because they’re both about namespaces, the distinction can be confusing. In short:

§ Modules

    § Implement data/logic packages

    § Are created with Python files or other-language extensions

    § Are used by being imported

    § Form the top-level in Python program structure

§ Classes

    § Implement new full-featured objects

    § Are created with class statements

    § Are used by being called

    § Always live within a module

Classes also support extra features that modules don’t, such as operator overloading, multiple instance generation, and inheritance. Although both classes and modules are namespaces, you should be able to tell by now that they are very different things. We need to move ahead to see just how different classes can be.
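
A short sketch may help fix the contrast in mind. It uses the standard library's math module as a stand-in for any module, and a hypothetical Counter class to show instance generation and operator overloading:

```python
import types
import math
import math as math_again               # Importing again yields the same object

print(math is math_again)               # True: one module, one namespace

class Counter:
    def __init__(self, count=0):
        self.count = count
    def __add__(self, n):               # Operator overloading: classes only
        return Counter(self.count + n)

a = Counter()                           # Classes are used by being called
b = Counter()                           # Each call makes an independent instance
a = a + 5
print(a.count, b.count)                 # 5 0
print(isinstance(math, types.ModuleType))   # True
```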

Chapter Summary

This chapter took us on a second, more in-depth tour of the OOP mechanisms of the Python language. We learned more about classes, methods, and inheritance, and we wrapped up the namespaces and scopes story in Python by extending it to cover its application to classes. Along the way, we looked at some more advanced concepts, such as abstract superclasses, class data attributes, namespace dictionaries and links, and manual calls to superclass methods and constructors.

Now that we’ve learned all about the mechanics of coding classes in Python, Chapter 30 turns to a specific facet of those mechanics: operator overloading. After that we’ll explore common design patterns, looking at some of the ways that classes are commonly used and combined to optimize code reuse. Before you read ahead, though, be sure to work through the usual chapter quiz to review what we’ve covered here.

Test Your Knowledge: Quiz

1. What is an abstract superclass?

2. What happens when a simple assignment statement appears at the top level of a class statement?

3. Why might a class need to manually call the __init__ method in a superclass?

4. How can you augment, instead of completely replacing, an inherited method?

5. How does a class’s local scope differ from that of a function?

6. What...was the capital of Assyria?

Test Your Knowledge: Answers

1. An abstract superclass is a class that calls a method, but does not inherit or define it—it expects the method to be filled in by a subclass. This is often used as a way to generalize classes when behavior cannot be predicted until a more specific subclass is coded. OOP frameworks also use this as a way to dispatch to client-defined, customizable operations.

2. When a simple assignment statement (X = Y) appears at the top level of a class statement, it attaches a data attribute to the class (Class.X). Like all class attributes, this will be shared by all instances; data attributes are not callable method functions, though.

3. A class must manually call the __init__ method in a superclass if it defines an __init__ constructor of its own and still wants the superclass’s construction code to run. Python itself automatically runs just one constructor—the lowest one in the tree. Superclass constructors are usually called through the class name, passing in the self instance manually: Superclass.__init__(self, ...).

4. To augment instead of completely replacing an inherited method, redefine it in a subclass, but call back to the superclass’s version of the method manually from the new version of the method in the subclass. That is, pass the self instance to the superclass’s version of the method manually: Superclass.method(self, ...).

5. A class is a local scope and has access to enclosing local scopes, but it does not serve as an enclosing local scope to further nested code. Like modules, the class local scope morphs into an attribute namespace after the class statement is run.

6. Ashur (or Qalat Sherqat), Calah (or Nimrud), the short-lived Dur Sharrukin (or Khorsabad), and finally Nineveh.
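
Answers 1, 3, and 4 can be tied together in one short sketch. The Shape and Square classes here are hypothetical examples, not code from this chapter:

```python
class Shape:
    def __init__(self, name):
        self.name = name

    def describe(self):
        # Abstract superclass: calls area() but neither defines nor inherits it
        return '%s with area %s' % (self.name, self.area())

class Square(Shape):
    def __init__(self, side):
        Shape.__init__(self, 'square')  # Run superclass constructor manually
        self.side = side

    def area(self):                     # Fills in the expected method
        return self.side ** 2

    def describe(self):
        # Augment, rather than replace, the inherited method
        return Shape.describe(self) + ' (side %s)' % self.side

print(Square(3).describe())             # square with area 9 (side 3)
```

Calling describe on a plain Shape instance would fail with an AttributeError, because area is expected to be supplied lower in the tree.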