Learning Python (2013)

Part VI. Classes and OOP

Chapter 27. Class Coding Basics

Now that we’ve talked about OOP in the abstract, it’s time to see how this translates to actual code. This chapter begins to fill in the syntax details behind the class model in Python.

If you’ve never been exposed to OOP in the past, classes can seem somewhat complicated if taken in a single dose. To make class coding easier to absorb, we’ll begin our detailed exploration of OOP by taking a first look at some basic classes in action in this chapter. We’ll expand on the details introduced here in later chapters of this part of the book, but in their basic form, Python classes are easy to understand.

In fact, classes have just three primary distinctions. At a base level, they are mostly just namespaces, much like the modules we studied in Part V. Unlike modules, though, classes also have support for generating multiple objects, for namespace inheritance, and for operator overloading. Let’s begin our class statement tour by exploring each of these three distinctions in turn.

Classes Generate Multiple Instance Objects

To understand how the multiple objects idea works, you have to first understand that there are two kinds of objects in Python’s OOP model: class objects and instance objects. Class objects provide default behavior and serve as factories for instance objects. Instance objects are the real objects your programs process—each is a namespace in its own right, but inherits (i.e., has automatic access to) names in the class from which it was created. Class objects come from statements, and instances come from calls; each time you call a class, you get a new instance of that class.

This object-generation concept is very different from most of the other program constructs we’ve seen so far in this book. In effect, classes are essentially factories for generating multiple instances. By contrast, only one copy of each module is ever imported into a single program. In fact, this is why reload works as it does, updating a single-instance shared object in place. With classes, each instance can have its own, independent data, supporting multiple versions of the object that the class models.

In this role, class instances are similar to the per-call state of the closure (a.k.a. factory) functions of Chapter 17, but this is a natural part of the class model, and state in classes is explicit attributes instead of implicit scope references. Moreover, this is just part of what classes do—they also support customization by inheritance, operator overloading, and multiple behaviors via methods. Generally speaking, classes are a more complete programming tool, though OOP and function programming are not mutually exclusive paradigms. We may combine them by using functional tools in methods, by coding methods that are themselves generators, by writing user-defined iterators (as we’ll see in Chapter 30), and so on.

The following is a quick summary of the bare essentials of Python OOP in terms of its two object types. As you’ll see, Python classes are in some ways similar to both defs and modules, but they may be quite different from what you’re used to in other languages.

Class Objects Provide Default Behavior

When we run a class statement, we get a class object. Here’s a rundown of the main properties of Python classes:

§ The class statement creates a class object and assigns it a name. Just like the function def statement, the Python class statement is an executable statement. When reached and run, it generates a new class object and assigns it to the name in the class header. Also, like defs,class statements typically run when the files they are coded in are first imported.

§ Assignments inside class statements make class attributes. Just like in module files, top-level assignments within a class statement (not nested in a def) generate attributes in a class object. Technically, the class statement defines a local scope that morphs into the attribute namespace of the class object, just like a module’s global scope. After running a class statement, class attributes are accessed by name qualification: object.name.

§ Class attributes provide object state and behavior. Attributes of a class object record state information and behavior to be shared by all instances created from the class; function def statements nested inside a class generate methods, which process instances.

Instance Objects Are Concrete Items

When we call a class object, we get an instance object. Here’s an overview of the key points behind class instances:

§ Calling a class object like a function makes a new instance object. Each time a class is called, it creates and returns a new instance object. Instances represent concrete items in your program’s domain.

§ Each instance object inherits class attributes and gets its own namespace. Instance objects created from classes are new namespaces; they start out empty but inherit attributes that live in the class objects from which they were generated.

§ Assignments to attributes of self in methods make per-instance attributes. Inside a class’s method functions, the first argument (called self by convention) references the instance object being processed; assignments to attributes of self create or change data in the instance, not the class.

The end result is that classes define common, shared data and behavior, and generate instances. Instances reflect concrete application entities, and record per-instance data that may vary per object.

A First Example

Let’s turn to a real example to show how these ideas work in practice. To begin, let’s define a class named FirstClass by running a Python class statement interactively:

>>> class FirstClass: # Define a class object

def setdata(self, value): # Define class's methods

self.data = value # self is the instance

def display(self):

print(self.data) # self.data: per instance

We’re working interactively here, but typically, such a statement would be run when the module file it is coded in is imported. Like functions created with defs, this class won’t even exist until Python reaches and runs this statement.

Like all compound statements, the class starts with a header line that lists the class name, followed by a body of one or more nested and (usually) indented statements. Here, the nested statements are defs; they define functions that implement the behavior the class means to export.

As we learned in Part IV, def is really an assignment. Here, it assigns function objects to the names setdata and display in the class statement’s scope, and so generates attributes attached to the class—FirstClass.setdata and FirstClass.display. In fact, any name assigned at the top level of the class’s nested block becomes an attribute of the class.

Functions inside a class are usually called methods. They’re coded with normal defs, and they support everything we’ve learned about functions already (they can have defaults, return values, yield items on request, and so on). But in a method function, the first argument automatically receives an implied instance object when called—the subject of the call. We need to create a couple of instances to see how this works:

>>> x = FirstClass() # Make two instances

>>> y = FirstClass() # Each is a new namespace

By calling the class this way (notice the parentheses), we generate instance objects, which are just namespaces that have access to their classes’ attributes. Properly speaking, at this point, we have three objects: two instances and a class. Really, we have three linked namespaces, as sketched in Figure 27-1. In OOP terms, we say that x “is a” FirstClass, as is y—they both inherit names attached to the class.

Classes and instances are linked namespace objects in a class tree that is searched by inheritance. Here, the “data” attribute is found in instances, but “setdata” and “display” are in the class above them.

Figure 27-1. Classes and instances are linked namespace objects in a class tree that is searched by inheritance. Here, the “data” attribute is found in instances, but “setdata” and “display” are in the class above them.

The two instances start out empty but have links back to the class from which they were generated. If we qualify an instance with the name of an attribute that lives in the class object, Python fetches the name from the class by inheritance search (unless it also lives in the instance):

>>> x.setdata("King Arthur") # Call methods: self is x

>>> y.setdata(3.14159) # Runs: FirstClass.setdata(y, 3.14159)

Neither x nor y has a setdata attribute of its own, so to find it, Python follows the link from instance to class. And that’s about all there is to inheritance in Python: it happens at attribute qualification time, and it just involves looking up names in linked objects—here, by following the is-a links in Figure 27-1.

In the setdata function inside FirstClass, the value passed in is assigned to self.data. Within a method, self—the name given to the leftmost argument by convention—automatically refers to the instance being processed (x or y), so the assignments store values in the instances’ namespaces, not the class’s; that’s how the data names in Figure 27-1 are created.

Because classes can generate multiple instances, methods must go through the self argument to get to the instance to be processed. When we call the class’s display method to print self.data, we see that it’s different in each instance; on the other hand, the name display itself is the same in x and y, as it comes (is inherited) from the class:

>>> x.display() # self.data differs in each instance

King Arthur

>>> y.display() # Runs: FirstClass.display(y)

3.14159

Notice that we stored different object types in the data member in each instance—a string and a floating-point number. As with everything else in Python, there are no declarations for instance attributes (sometimes called members); they spring into existence the first time they are assigned values, just like simple variables. In fact, if we were to call display on one of our instances before calling setdata, we would trigger an undefined name error—the attribute named data doesn’t even exist in memory until it is assigned within the setdata method.

As another way to appreciate how dynamic this model is, consider that we can change instance attributes in the class itself, by assigning to self in methods, or outside the class, by assigning to an explicit instance object:

>>> x.data = "New value" # Can get/set attributes

>>> x.display() # Outside the class too

New value

Although less common, we could even generate an entirely new attribute in the instance’s namespace by assigning to its name outside the class’s method functions:

>>> x.anothername = "spam" # Can set new attributes here too!

This would attach a new attribute called anothername, which may or may not be used by any of the class’s methods, to the instance object x. Classes usually create all of the instance’s attributes by assignment to the self argument, but they don’t have to—programs can fetch, change, or create attributes on any objects to which they have references.

It usually doesn’t make sense to add data that the class cannot use, and it’s possible to prevent this with extra “privacy” code based on attribute access operator overloading, as we’ll discuss later in this book (see Chapter 30 and Chapter 39). Still, free attribute access translates to less syntax, and there are cases where it’s even useful—for example, in coding data records of the sort we’ll see later in this chapter.

Classes Are Customized by Inheritance

Let’s move on to the second major distinction of classes. Besides serving as factories for generating multiple instance objects, classes also allow us to make changes by introducing new components (called subclasses), instead of changing existing components in place.

As we’ve seen, instance objects generated from a class inherit the class’s attributes. Python also allows classes to inherit from other classes, opening the door to coding hierarchies of classes that specialize behavior—by redefining attributes in subclasses that appear lower in the hierarchy, we override the more general definitions of those attributes higher in the tree. In effect, the further down the hierarchy we go, the more specific the software becomes. Here, too, there is no parallel with modules, whose attributes live in a single, flat namespace that is not as amenable to customization.

In Python, instances inherit from classes, and classes inherit from superclasses. Here are the key ideas behind the machinery of attribute inheritance:

§ Superclasses are listed in parentheses in a class header. To make a class inherit attributes from another class, just list the other class in parentheses in the new class statement’s header line. The class that inherits is usually called a subclass, and the class that is inherited from is itssuperclass.

§ Classes inherit attributes from their superclasses. Just as instances inherit the attribute names defined in their classes, classes inherit all of the attribute names defined in their superclasses; Python finds them automatically when they’re accessed, if they don’t exist in the subclasses.

§ Instances inherit attributes from all accessible classes. Each instance gets names from the class it’s generated from, as well as all of that class’s superclasses. When looking for a name, Python checks the instance, then its class, then all superclasses.

§ Each object.attribute reference invokes a new, independent search. Python performs an independent search of the class tree for each attribute fetch expression. This includes references to instances and classes made outside class statements (e.g., X.attr), as well as references to attributes of the self instance argument in a class’s method functions. Each self.attr expression in a method invokes a new search for attr in self and above.

§ Logic changes are made by subclassing, not by changing superclasses. By redefining superclass names in subclasses lower in the hierarchy (class tree), subclasses replace and thus customize inherited behavior.

The net effect—and the main purpose of all this searching—is that classes support factoring and customization of code better than any other language tool we’ve seen so far. On the one hand, they allow us to minimize code redundancy (and so reduce maintenance costs) by factoring operations into a single, shared implementation; on the other, they allow us to program by customizing what already exists, rather than changing it in place or starting from scratch.

NOTE

Strictly speaking, Python’s inheritance is a bit richer than described here, when we factor in new-style descriptors and metaclasses—advanced topics we’ll study later—but we can safely restrict our scope to instances and their classes, both at this point in the book and in most Python application code. We’ll define inheritance formally in Chapter 40.

A Second Example

To illustrate the role of inheritance, this next example builds on the previous one. First, we’ll define a new class, SecondClass, that inherits all of FirstClass’s names and provides one of its own:

>>> class SecondClass(FirstClass): # Inherits setdata

def display(self): # Changes display

print('Current value = "%s"' % self.data)

SecondClass defines the display method to print with a different format. By defining an attribute with the same name as an attribute in FirstClass, SecondClass effectively replaces the display attribute in its superclass.

Recall that inheritance searches proceed upward from instances to subclasses to superclasses, stopping at the first appearance of the attribute name that it finds. In this case, since the display name in SecondClass will be found before the one in FirstClass, we say that SecondClassoverrides FirstClass’s display. Sometimes we call this act of replacing attributes by redefining them lower in the tree overloading.

The net effect here is that SecondClass specializes FirstClass by changing the behavior of the display method. On the other hand, SecondClass (and any instances created from it) still inherits the setdata method in FirstClass verbatim. Let’s make an instance to demonstrate:

>>> z = SecondClass()

>>> z.setdata(42) # Finds setdata in FirstClass

>>> z.display() # Finds overridden method in SecondClass

Current value = "42"

As before, we make a SecondClass instance object by calling it. The setdata call still runs the version in FirstClass, but this time the display attribute comes from SecondClass and prints a custom message. Figure 27-2 sketches the namespaces involved.

Specialization: overriding inherited names by redefining them in extensions lower in the class tree. Here, SecondClass redefines and so customizes the “display” method for its instances.

Figure 27-2. Specialization: overriding inherited names by redefining them in extensions lower in the class tree. Here, SecondClass redefines and so customizes the “display” method for its instances.

Now, here’s a crucial thing to notice about OOP: the specialization introduced in SecondClass is completely external to FirstClass. That is, it doesn’t affect existing or future FirstClass objects, like the x from the prior example:

>>> x.display() # x is still a FirstClass instance (old message)

New value

Rather than changing FirstClass, we customized it. Naturally, this is an artificial example, but as a rule, because inheritance allows us to make changes like this in external components (i.e., in subclasses), classes often support extension and reuse better than functions or modules can.

Classes Are Attributes in Modules

Before we move on, remember that there’s nothing magic about a class name. It’s just a variable assigned to an object when the class statement runs, and the object can be referenced with any normal expression. For instance, if our FirstClass were coded in a module file instead of being typed interactively, we could import it and use its name normally in a class header line:

from modulename import FirstClass # Copy name into my scope

class SecondClass(FirstClass): # Use class name directly

def display(self): ...

Or, equivalently:

import modulename # Access the whole module

class SecondClass(modulename.FirstClass): # Qualify to reference

def display(self): ...

Like everything else, class names always live within a module, so they must follow all the rules we studied in Part V. For example, more than one class can be coded in a single module file—like other statements in a module, class statements are run during imports to define names, and these names become distinct module attributes. More generally, each module may arbitrarily mix any number of variables, functions, and classes, and all names in a module behave the same way. The file food.py demonstrates:

# food.py

var = 1 # food.var

def func(): ... # food.func

class spam: ... # food.spam

class ham: ... # food.ham

class eggs: ... # food.eggs

This holds true even if the module and class happen to have the same name. For example, given the following file, person.py:

class person: ...

we need to go through the module to fetch the class as usual:

import person # Import module

x = person.person() # Class within module

Although this path may look redundant, it’s required: person.person refers to the person class inside the person module. Saying just person gets the module, not the class, unless the from statement is used:

from person import person # Get class from module

x = person() # Use class name

As with any other variable, we can never see a class in a file without first importing and somehow fetching it from its enclosing file. If this seems confusing, don’t use the same name for a module and a class within it. In fact, common convention in Python dictates that class names should begin with an uppercase letter, to help make them more distinct:

import person # Lowercase for modules

x = person.Person() # Uppercase for classes

Also, keep in mind that although classes and modules are both namespaces for attaching attributes, they correspond to very different source code structures: a module reflects an entire file, but a class is a statement within a file. We’ll say more about such distinctions later in this part of thebook.

Classes Can Intercept Python Operators

Let’s move on to the third and final major difference between classes and modules: operator overloading. In simple terms, operator overloading lets objects coded with classes intercept and respond to operations that work on built-in types: addition, slicing, printing, qualification, and so on. It’s mostly just an automatic dispatch mechanism—expressions and other built-in operations route control to implementations in classes. Here, too, there is nothing similar in modules: modules can implement function calls, but not the behavior of expressions.

Although we could implement all class behavior as method functions, operator overloading lets objects be more tightly integrated with Python’s object model. Moreover, because operator overloading makes our own objects act like built-ins, it tends to foster object interfaces that are more consistent and easier to learn, and it allows class-based objects to be processed by code written to expect a built-in type’s interface. Here is a quick rundown of the main ideas behind overloading operators:

§ Methods named with double underscores (__X__) are special hooks. In Python classes we implement operator overloading by providing specially named methods to intercept operations. The Python language defines a fixed and unchangeable mapping from each of these operations to a specially named method.

§ Such methods are called automatically when instances appear in built-in operations. For instance, if an instance object inherits an __add__ method, that method is called whenever the object appears in a + expression. The method’s return value becomes the result of the corresponding expression.

§ Classes may override most built-in type operations. There are dozens of special operator overloading method names for intercepting and implementing nearly every operation available for built-in types. This includes expressions, but also basic operations like printing and object creation.

§ There are no defaults for operator overloading methods, and none are required. If a class does not define or inherit an operator overloading method, it just means that the corresponding operation is not supported for the class’s instances. If there is no __add__, for example, +expressions raise exceptions.

§ New-style classes have some defaults, but not for common operations. In Python 3.X, and so-called “new style” classes in 2.X that we’ll define later, a root class named object does provide defaults for some __X__ methods, but not for many, and not for most commonly used operations.

§ Operators allow classes to integrate with Python’s object model. By overloading type operations, the user-defined objects we implement with classes can act just like built-ins, and so provide consistency as well as compatibility with expected interfaces.

Operator overloading is an optional feature; it’s used primarily by people developing tools for other Python programmers, not by application developers. And, candidly, you probably shouldn’t use it just because it seems clever or “cool.” Unless a class needs to mimic built-in type interfaces, it should usually stick to simpler named methods. Why would an employee database application support expressions like * and +, for example? Named methods like giveRaise and promote would usually make more sense.

Because of this, we won’t go into details on every operator overloading method available in Python in this book. Still, there is one operator overloading method you are likely to see in almost every realistic Python class: the __init__ method, which is known as the constructor method and is used to initialize objects’ state. You should pay special attention to this method, because __init__, along with the self argument, turns out to be a key requirement to reading and understanding most OOP code in Python.

A Third Example

On to another example. This time, we’ll define a subclass of the prior section’s SecondClass that implements three specially named attributes that Python will call automatically:

§ __init__ is run when a new instance object is created: self is the new ThirdClass object.^[53^]

§ __add__ is run when a ThirdClass instance appears in a + expression.

§ __str__ is run when an object is printed (technically, when it’s converted to its print string by the str built-in function or its Python internals equivalent).

Our new subclass also defines a normally named method called mul, which changes the instance object in place. Here’s the new subclass:

>>> class ThirdClass(SecondClass): # Inherit from SecondClass

def __init__(self, value): # On "ThirdClass(value)"

self.data = value

def __add__(self, other): # On "self + other"

return ThirdClass(self.data + other)

def __str__(self): # On "print(self)", "str()"

return '[ThirdClass: %s]' % self.data

def mul(self, other): # In-place change: named

self.data *= other

>>> a = ThirdClass('abc') # __init__ called

>>> a.display() # Inherited method called

Current value = "abc"

>>> print(a) # __str__: returns display string

[ThirdClass: abc]

>>> b = a + 'xyz' # __add__: makes a new instance

>>> b.display() # b has all ThirdClass methods

Current value = "abcxyz"

>>> print(b) # __str__: returns display string

[ThirdClass: abcxyz]

>>> a.mul(3) # mul: changes instance in place

>>> print(a)

[ThirdClass: abcabcabc]

ThirdClass “is a” SecondClass, so its instances inherit the customized display method from SecondClass of the preceding section. This time, though, ThirdClass creation calls pass an argument (e.g., “abc”). This argument is passed to the value argument in the __init__constructor and assigned to self.data there. The net effect is that ThirdClass arranges to set the data attribute automatically at construction time, instead of requiring setdata calls after the fact.

Further, ThirdClass objects can now show up in + expressions and print calls. For +, Python passes the instance object on the left to the self argument in __add__ and the value on the right to other, as illustrated in Figure 27-3; whatever __add__ returns becomes the result of the +expression (more on its result in a moment).

For print, Python passes the object being printed to self in __str__; whatever string this method returns is taken to be the print string for the object. With __str__ (or its more broadly relevant twin __repr__, which we’ll meet and use in the next chapter), we can use a normal printto display objects of this class, instead of calling the special display method.

In operator overloading, expression operators and other built-in operations performed on class instances are mapped back to specially named methods in the class. These special methods are optional and may be inherited as usual. Here, a + expression triggers the __add__ method.

Figure 27-3. In operator overloading, expression operators and other built-in operations performed on class instances are mapped back to specially named methods in the class. These special methods are optional and may be inherited as usual. Here, a + expression triggers the __add__ method.

Specially named methods such as __init__, __add__, and __str__ are inherited by subclasses and instances, just like any other names assigned in a class. If they’re not coded in a class, Python looks for such names in all its superclasses, as usual. Operator overloading method names are also not built-in or reserved words; they are just attributes that Python looks for when objects appear in various contexts. Python usually calls them automatically, but they may occasionally be called by your code as well. For example, the __init__ method is often called manually to trigger initialization steps in a superclass, as we’ll see in the next chapter.

Returning results, or not

Some operator overloading methods like __str__ require results, but others are more flexible. For example, notice how the __add__ method makes and returns a new instance object of its class, by calling ThirdClass with the result value—which in turn triggers __init__ to initialize the result. This is a common convention, and explains why b in the listing has a display method; it’s a ThirdClass object too, because that’s what + returns for this class’s objects. This essentially propagates the type.

By contrast, mul changes the current instance object in place, by reassigning the self attribute. We could overload the * expression to do the latter, but this would be too different from the behavior of * for built-in types such as numbers and strings, for which it always makes new objects. Common practice dictates that overloaded operators should work the same way that built-in operator implementations do. Because operator overloading is really just an expression-to-method dispatch mechanism, though, you can interpret operators any way you like in your own class objects.

Why Use Operator Overloading?

As a class designer, you can choose to use operator overloading or not. Your choice simply depends on how much you want your object to look and feel like built-in types. As mentioned earlier, if you omit an operator overloading method and do not inherit it from a superclass, the corresponding operation will not be supported for your instances; if it’s attempted, an exception will be raised (or, in some cases like printing, a standard default will be used).

Frankly, many operator overloading methods tend to be used only when you are implementing objects that are mathematical in nature; a vector or matrix class may overload the addition operator, for example, but an employee class likely would not. For simpler classes, you might not use overloading at all, and would rely instead on explicit method calls to implement your objects’ behavior.

On the other hand, you might decide to use operator overloading if you need to pass a user-defined object to a function that was coded to expect the operators available on a built-in type like a list or a dictionary. Implementing the same operator set in your class will ensure that your objects support the same expected object interface and so are compatible with the function. Although we won’t cover every operator overloading method in this book, we’ll survey additional common operator overloading techniques in action in Chapter 30.

One overloading method we will use often here is the __init__ constructor method, used to initialize newly created instance objects, and present in almost every realistic class. Because it allows classes to fill out the attributes in their new instances immediately, the constructor is useful for almost every kind of class you might code. In fact, even though instance attributes are not declared in Python, you can usually find out which attributes an instance will have by inspecting its class’s __init__ method.

Of course, there’s nothing wrong with experimenting with interesting language tools, but they don’t always translate to production code. With time and experience, you’ll find these programming patterns and guidelines to be natural and nearly automatic.

^[53^] Not to be confused with the __init__.py files in module packages! The method here is a class constructor function used to initialize the newly created instance, not a module package. See Chapter 24 for more details.

The World’s Simplest Python Class

We’ve begun studying class statement syntax in detail in this chapter, but I’d again like to remind you that the basic inheritance model that classes produce is very simple—all it really involves is searching for attributes in trees of linked objects. In fact, we can create a class with nothing in it at all. The following statement makes a class with no attributes attached, an empty namespace object:

>>> class rec: pass # Empty namespace object

We need the no-operation pass placeholder statement (discussed in Chapter 13) here because we don’t have any methods to code. After we make the class by running this statement interactively, we can start attaching attributes to the class by assigning names to it completely outside of the original class statement:

>>> rec.name = 'Bob' # Just objects with attributes

>>> rec.age = 40

And, after we’ve created these attributes by assignment, we can fetch them with the usual syntax. When used this way, a class is roughly similar to a “struct” in C, or a “record” in Pascal. It’s basically an object with field names attached to it (as we’ll see ahead, doing similar with dictionary keys requires extra characters):

>>> print(rec.name) # Like a C struct or a record

Bob

Notice that this works even though there are no instances of the class yet; classes are objects in their own right, even without instances. In fact, they are just self-contained namespaces; as long as we have a reference to a class, we can set or change its attributes anytime we wish. Watch what happens when we do create two instances, though:

>>> x = rec() # Instances inherit class names

>>> y = rec()

These instances begin their lives as completely empty namespace objects. Because they remember the class from which they were made, though, they will obtain the attributes we attached to the class by inheritance:

>>> x.name, y.name # name is stored on the class only

('Bob', 'Bob')

Really, these instances have no attributes of their own; they simply fetch the name attribute from the class object where it is stored. If we do assign an attribute to an instance, though, it creates (or changes) the attribute in that object, and no other—crucially, attribute references kick off inheritance searches, but attribute assignments affect only the objects in which the assignments are made. Here, this means that x gets its own name, but y still inherits the name attached to the class above it:

>>> x.name = 'Sue' # But assignment changes x only

>>> rec.name, x.name, y.name

('Bob', 'Sue', 'Bob')

In fact, as we’ll explore in more detail in Chapter 29, the attributes of a namespace object are usually implemented as dictionaries, and class inheritance trees are (generally speaking) just dictionaries with links to other dictionaries. If you know where to look, you can see this explicitly.

For example, the __dict__ attribute is the namespace dictionary for most class-based objects. Some classes may also (or instead) define attributes in __slots__, an advanced and seldom-used feature that we’ll note in Chapter 28, but largely postpone until Chapter 31 and Chapter 32. Normally, __dict__ literally is an instance’s attribute namespace.

To illustrate, the following was run in Python 3.3; the order of names and set of __X__ internal names present can vary from release to release, and we filter out built-ins with a generator expression as we’ve done before, but the names we assigned are present in all:

>>> list(rec.__dict__.keys())

['age', '__module__', '__qualname__', '__weakref__', 'name', '__dict__', '__doc__']

>>> list(name for name in rec.__dict__ if not name.startswith('__'))

['age', 'name']

>>> list(x.__dict__.keys())

['name']

>>> list(y.__dict__.keys()) # list() not required in Python 2.X

[]

Here, the class’s namespace dictionary shows the name and age attributes we assigned to it, x has its own name, and y is still empty. Because of this model, an attribute can often be fetched by either dictionary indexing or attribute notation, but only if it’s present on the object in question—attribute notation kicks off inheritance search, but indexing looks in the single object only (as we’ll see later, both have valid roles):

>>> x.name, x.__dict__['name'] # Attributes present here are dict keys

('Sue', 'Sue')

>>> x.age # But attribute fetch checks classes too

>>> x.__dict__['age'] # Indexing dict does not do inheritance

KeyError: 'age'

To facilitate inheritance search on attribute fetches, each instance has a link to its class that Python creates for us—it’s called __class__, if you want to inspect it:

>>> x.__class__ # Instance to class link

Classes also have a __bases__ attribute, which is a tuple of references to their superclass objects—in this example just the implied object root class in Python 3.X we’ll explore later (you’ll get an empty tuple in 2.X instead):

>>> rec.__bases__ # Class to superclasses link, () in 2.X

(<class 'object'>,)

These two attributes are how class trees are literally represented in memory by Python. Internal details like these are not required knowledge—class trees are implied by the code you run, and their search is normally automatic—but they can often help demystify the model.

The main point to take away from this look under the hood is that Python’s class model is extremely dynamic. Classes and instances are just namespace objects, with attributes created on the fly by assignment. Those assignments usually happen within the class statements you code, but they can occur anywhere you have a reference to one of the objects in the tree.

Even methods, normally created by a def nested in a class, can be created completely independently of any class object. The following, for example, defines a simple function outside of any class that takes one argument:

>>> def uppername(obj):

return obj.name.upper() # Still needs a self argument (obj)

There is nothing about a class here yet—it’s a simple function, and it can be called as such at this point, provided we pass in an object obj with a name attribute, whose value in turn has an upper method—our class instances happen to fit the expected interface, and kick off string uppercase conversion:

>>> uppername(x) # Call as a simple function

'SUE'

If we assign this simple function to an attribute of our class, though, it becomes a method, callable through any instance, as well as through the class name itself as long as we pass in an instance manually—a technique we’ll leverage further in the next chapter:^[54^]

>>> rec.method = uppername # Now it's a class's method!

>>> x.method() # Run method to process x

'SUE'

>>> y.method() # Same, but pass y to self

'BOB'

>>> rec.method(x) # Can call through instance or class

'SUE'

Normally, classes are filled out by class statements, and instance attributes are created by assignments to self attributes in method functions. The point again, though, is that they don’t have to be; OOP in Python really is mostly about looking up attributes in linked namespace objects.

Records Revisited: Classes Versus Dictionaries

Although the simple classes of the prior section are meant to illustrate class model basics, the techniques they employ can also be used for real work. For example, Chapter 8 and Chapter 9 showed how to use dictionaries, tuples, and lists to record properties of entities in our programs, generically called records. It turns out that classes can often serve better in this role—they package information like dictionaries, but can also bundle processing logic in the form of methods. For reference, here is an example for tuple- and dictionary-based records we used earlier in the book (using one of many dictionary coding techniques):

>>> rec = ('Bob', 40.5, ['dev', 'mgr']) # Tuple-based record

>>> print(rec[0])

Bob

>>> rec = {}

>>> rec['name'] = 'Bob' # Dictionary-based record

>>> rec['age'] = 40.5 # Or {...}, dict(n=v), etc.

>>> rec['jobs'] = ['dev', 'mgr']

>>>

>>> print(rec['name'])

Bob

This code emulates tools like records in other languages. As we just saw, though, there are also multiple ways to do the same with classes. Perhaps the simplest is this—trading keys for attributes:

>>> class rec: pass

>>> rec.name = 'Bob' # Class-based record

>>> rec.age = 40.5

>>> rec.jobs = ['dev', 'mgr']

>>>

>>> print(rec.name)

Bob

This code has substantially less syntax than the dictionary equivalent. It uses an empty class statement to generate an empty namespace object. Once we make the empty class, we fill it out by assigning class attributes over time, as before.

This works, but a new class statement will be required for each distinct record we will need. Perhaps more typically, we can instead generate instances of an empty class to represent each distinct entity:

>>> class rec: pass

>>> pers1 = rec() # Instance-based records

>>> pers1.name = 'Bob'

>>> pers1.jobs = ['dev', 'mgr']

>>> pers1.age = 40.5

>>>

>>> pers2 = rec()

>>> pers2.name = 'Sue'

>>> pers2.jobs = ['dev', 'cto']

>>>

>>> pers1.name, pers2.name

('Bob', 'Sue')

Here, we make two records from the same class. Instances start out life empty, just like classes. We then fill in the records by assigning to attributes. This time, though, there are two separate objects, and hence two separate name attributes. In fact, instances of the same class don’t even have to have the same set of attribute names; in this example, one has a unique age name. Instances really are distinct namespaces, so each has a distinct attribute dictionary. Although they are normally filled out consistently by a class’s methods, they are more flexible than you might expect.

Finally, we might instead code a more full-blown class to implement the record and its processing—something that data-oriented dictionaries do not directly support:

>>> class Person:

def __init__(self, name, jobs, age=None): # class = data + logic

self.name = name

self.jobs = jobs

self.age = age

def info(self):

return (self.name, self.jobs)

>>> rec1 = Person('Bob', ['dev', 'mgr'], 40.5) # Construction calls

>>> rec2 = Person('Sue', ['dev', 'cto'])

>>>

>>> rec1.jobs, rec2.info() # Attributes + methods

(['dev', 'mgr'], ('Sue', ['dev', 'cto']))

This scheme also makes multiple instances, but the class is not empty this time: we’ve added logic (methods) to initialize instances at construction time and collect attributes into a tuple on request. The constructor imposes some consistency on instances here by always setting the name, job, and age attributes, even though the latter can be omitted when an object is made. Together, the class’s methods and instance attributes create a package, which combines both data and logic.

We could further extend this code by adding logic to compute salaries, parse names, and so on. Ultimately, we might link the class into a larger hierarchy to inherit and customize an existing set of methods via the automatic attribute search of classes, or perhaps even store instances of the class in a file with Python object pickling to make them persistent. In fact, we will—in the next chapter, we’ll expand on this analogy between classes and records with a more realistic running example that demonstrates class basics in action.

To be fair to other tools, in this form, the two class construction calls above more closely resemble dictionaries made all at once, but still seem less cluttered and provide extra processing methods. In fact, the class’s construction calls more closely resemble Chapter 9’s named tuples—which makes sense, given that named tuples really are classes with extra logic to map attributes to tuple offsets:

>>> rec = dict(name='Bob', age=40.5, jobs=['dev', 'mgr']) # Dictionaries

>>> rec = {'name': 'Bob', 'age': 40.5, 'jobs': ['dev', 'mgr']}

>>> rec = Rec('Bob', 40.5, ['dev', 'mgr']) # Named tuples

In the end, although types like dictionaries and tuples are flexible, classes allow us to add behavior to objects in ways that built-in types and simple functions do not directly support. Although we can store functions in dictionaries, too, using them to process implied instances is nowhere near as natural and structured as it is in classes. To see this more clearly, let’s move ahead to the next chapter.

^[54^] In fact, this is one of the reasons the self argument must always be explicit in Python methods—because methods can be created as simple functions independent of a class, they need to make the implied instance argument explicit. They can be called as either functions or methods, and Python can neither guess nor assume that a simple function might eventually become a class’s method. The main reason for the explicit self argument, though, is to make the meanings of names more obvious: names not referenced through self are simple variables mapped to scopes, while names referenced through self with attribute notation are obviously instance attributes.

Chapter Summary

This chapter introduced the basics of coding classes in Python. We studied the syntax of the class statement, and we saw how to use it to build up a class inheritance tree. We also studied how Python automatically fills in the first argument in method functions, how attributes are attached to objects in a class tree by simple assignment, and how specially named operator overloading methods intercept and implement built-in operations for our instances (e.g., expressions and printing).

Now that we’ve learned all about the mechanics of coding classes in Python, the next chapter turns to a larger and more realistic example that ties together much of what we’ve learned about OOP so far, and introduces some new topics. After that, we’ll continue our look at class coding, taking a second pass over the model to fill in some of the details that were omitted here to keep things simple. First, though, let’s work through a quiz to review the basics we’ve covered so far.

Test Your Knowledge: Quiz

1. How are classes related to modules?

2. How are instances and classes created?

3. Where and how are class attributes created?

4. Where and how are instance attributes created?

5. What does self mean in a Python class?

6. How is operator overloading coded in a Python class?

7. When might you want to support operator overloading in your classes?

8. Which operator overloading method is most commonly used?

9. What are two key concepts required to understand Python OOP code?

Test Your Knowledge: Answers

1. Classes are always nested inside a module; they are attributes of a module object. Classes and modules are both namespaces, but classes correspond to statements (not entire files) and support the OOP notions of multiple instances, inheritance, and operator overloading (modules do not). In a sense, a module is like a single-instance class, without inheritance, which corresponds to an entire file of code.

2. Classes are made by running class statements; instances are created by calling a class as though it were a function.

3. Class attributes are created by assigning attributes to a class object. They are normally generated by top-level assignments nested in a class statement—each name assigned in the class statement block becomes an attribute of the class object (technically, the class statement’s local scope morphs into the class object’s attribute namespace, much like a module). Class attributes can also be created, though, by assigning attributes to the class anywhere a reference to the class object exists—even outside the class statement.

4. Instance attributes are created by assigning attributes to an instance object. They are normally created within a class’s method functions coded inside the class statement, by assigning attributes to the self argument (which is always the implied instance). Again, though, they may be created by assignment anywhere a reference to the instance appears, even outside the class statement. Normally, all instance attributes are initialized in the __init__ constructor method; that way, later method calls can assume the attributes already exist.

5. self is the name commonly given to the first (leftmost) argument in a class’s method function; Python automatically fills it in with the instance object that is the implied subject of the method call. This argument need not be called self (though this is a very strong convention); its position is what is significant. (Ex-C++ or Java programmers might prefer to call it this because in those languages that name reflects the same idea; in Python, though, this argument must always be explicit.)

6. Operator overloading is coded in a Python class with specially named methods; they all begin and end with double underscores to make them unique. These are not built-in or reserved names; Python just runs them automatically when an instance appears in the corresponding operation. Python itself defines the mappings from operations to special method names.

7. Operator overloading is useful to implement objects that resemble built-in types (e.g., sequences or numeric objects such as matrixes), and to mimic the built-in type interface expected by a piece of code. Mimicking built-in type interfaces enables you to pass in class instances that also have state information (i.e., attributes that remember data between operation calls). You shouldn’t use operator overloading when a simple named method will suffice, though.

8. The __init__ constructor method is the most commonly used; almost every class uses this method to set initial values for instance attributes and perform other startup tasks.

9. The special self argument in method functions and the __init__ constructor method are the two cornerstones of OOP code in Python; if you get these, you should be able to read the text of most OOP Python code—apart from these, it’s largely just packages of functions. The inheritance search matters too, of course, but self represents the automatic object argument, and __init__ is widespread.