Learning Python (2013)

Part VIII. Advanced Topics

Chapter 38. Managed Attributes

This chapter expands on the attribute interception techniques introduced earlier, introduces another, and employs them in a handful of larger examples. Like everything in this part of the book, this chapter is classified as an advanced topic and optional reading, because most applications programmers don’t need to care about the material discussed here—they can fetch and set attributes on objects without concern for attribute implementations.

Especially for tools builders, though, managing attribute access can be an important part of flexible APIs. Moreover, an understanding of the descriptor model covered here can make related tools such as slots and properties more tangible, and may even be required reading if it appears in code you must use.

Why Manage Attributes?

Object attributes are central to most Python programs—they are where we often store information about the entities our scripts process. Normally, attributes are simply names for objects; a person’s name attribute, for example, might be a simple string, fetched and set with basic attribute syntax:

person.name # Fetch attribute value

person.name = value # Change attribute value

In most cases, the attribute lives in the object itself, or is inherited from a class from which it derives. That basic model suffices for most programs you will write in your Python career.

Sometimes, though, more flexibility is required. Suppose you’ve written a program to use a name attribute directly, but then your requirements change—for example, you decide that names should be validated with logic when set or mutated in some way when fetched. It’s straightforward to code methods to manage access to the attribute’s value (valid and transform are abstract here):

class Person:
    def getName(self):
        if not valid():
            raise TypeError('cannot fetch name')
        else:
            return self.name.transform()

    def setName(self, value):
        if not valid(value):
            raise TypeError('cannot change name')
        else:
            self.name = transform(value)

person = Person()
person.getName()
person.setName('value')

However, this also requires changing all the places where names are used in the entire program, a possibly nontrivial task. Moreover, this approach requires the program to be aware of how values are exported: as simple names or called methods. If you begin with a method-based interface to data, clients are immune to changes; if you do not, later changes of this kind can break them.

This issue can crop up more often than you might expect. The value of a cell in a spreadsheet-like program, for instance, might begin its life as a simple discrete value, but later mutate into an arbitrary calculation. Since an object’s interface should be flexible enough to support such future changes without breaking existing code, switching to methods later is less than ideal.

Inserting Code to Run on Attribute Access

A better solution would allow you to run code automatically on attribute access, if needed. That’s one of the main roles of managed attributes—they provide ways to add attribute accessor logic after the fact. More generally, they support arbitrary attribute usage modes that go beyond simple data storage.
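As a rough preview of that idea, the following sketch keeps the plain person.name syntax while running validation code on assignment, using the property built-in covered later in this chapter. The validation rule (a nonempty string) and the transformation (stripping whitespace) are hypothetical stand-ins for the abstract valid and transform of the earlier listing:

```python
class Person:
    def __init__(self, name):
        self.name = name                  # Runs setName below

    def getName(self):
        return self._name

    def setName(self, value):
        # Hypothetical rule: accept only nonempty strings
        if not isinstance(value, str) or not value.strip():
            raise TypeError('cannot change name')
        self._name = value.strip()        # Hypothetical transform

    name = property(getName, setName)     # Clients still use plain .name

person = Person('  Bob Smith ')
print(person.name)                        # Bob Smith
person.name = 'Sue Jones'                 # Validated, then stored
```

Code that was written against a simple name attribute would not need to change here, because fetches and assignments keep the same syntax.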

At various points in this book, we’ve met Python tools that allow our scripts to dynamically compute attribute values when fetching them and validate or change attribute values when storing them. In this chapter, we’re going to expand on the tools already introduced, explore other available tools, and study some larger use-case examples in this domain. Specifically, this chapter presents four accessor techniques:

§ The __getattr__ and __setattr__ methods, for routing undefined attribute fetches and all attribute assignments to generic handler methods.

§ The __getattribute__ method, for routing all attribute fetches to a generic handler method.

§ The property built-in, for routing specific attribute access to get and set handler functions.

§ The descriptor protocol, for routing specific attribute accesses to instances of classes with arbitrary get and set handler methods, and the basis for other tools such as properties and slots.

The tools in the first of these bullets are available in all Pythons. The last three bullets’ tools are available in Python 3.X and new-style classes in 2.X—they first appeared in Python 2.2, along with many of the other advanced tools of Chapter 32 such as slots and super. We briefly met the first and third of these in Chapter 30 and Chapter 32, respectively; the second and fourth are largely new topics we’ll explore in full here.

As we’ll see, all four techniques share goals to some degree, and it’s usually possible to code a given problem using any one of them. They do differ in some important ways, though. For example, the last two techniques listed here apply to specific attributes, whereas the first two are generic enough to be used by delegation-based proxy classes that must route arbitrary attributes to wrapped objects. As we’ll see, all four schemes also differ in both complexity and aesthetics, in ways you must see in action to judge for yourself.

Besides studying the specifics behind the four attribute interception techniques listed in this section, this chapter also presents an opportunity to explore larger programs than we’ve seen elsewhere in this book. The CardHolder case study at the end, for example, should serve as a self-study example of larger classes in action. We’ll also be using some of the techniques outlined here in the next chapter to code decorators, so be sure you have at least a general understanding of these topics before you move on.

Properties

The property protocol allows us to route a specific attribute’s get, set, and delete operations to functions or methods we provide, enabling us to insert code to be run automatically on attribute access, intercept attribute deletions, and provide documentation for the attributes if desired.

Properties are created with the property built-in and are assigned to class attributes, just like method functions. Accordingly, they are inherited by subclasses and instances, like any other class attributes. Their access-interception functions are provided with the self instance argument, which grants access to state information and class attributes available on the subject instance.

A property manages a single, specific attribute; although it can’t catch all attribute accesses generically, it allows us to control both fetch and assignment accesses and enables us to change an attribute from simple data to a computation freely, without breaking existing code. As we’ll see, properties are strongly related to descriptors; in fact, they are essentially a restricted form of them.

The Basics

A property is created by assigning the result of a built-in function to a class attribute:

attribute = property(fget, fset, fdel, doc)

None of this built-in’s arguments are required, and all default to None if not passed. For the first three, this None means that the corresponding operation is not supported, and attempting it will raise an AttributeError exception automatically.
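For instance, a property created with only a fetch function is effectively read-only. In the following sketch (Circle and radius are made-up names for illustration), both assignment and deletion are rejected because fset and fdel were left at their None defaults:

```python
class Circle:
    def __init__(self, radius):
        self._radius = radius

    def getRadius(self):
        return self._radius

    radius = property(getRadius)          # fset and fdel default to None

c = Circle(2)
print(c.radius)                           # 2

try:
    c.radius = 3                          # No fset: assignment not supported
except AttributeError:
    print('cannot set')

try:
    del c.radius                          # No fdel: deletion not supported
except AttributeError:
    print('cannot delete')
```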

When these arguments are used, we pass fget a function for intercepting attribute fetches, fset a function for assignments, and fdel a function for attribute deletions. Technically, all three of these arguments accept any callable, including a class’s method, having a first argument to receive the instance being qualified. When later invoked, the fget function returns the computed attribute value, fset and fdel return nothing (really, None), and all three may raise exceptions to reject access requests.

The doc argument receives a documentation string for the attribute, if desired; otherwise, the property copies the docstring of the fget function, which as usual defaults to None.
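The docstring rules can be checked with a short experiment. The names below are arbitrary; the sketch shows a docstring copied from fget, the None default when neither source has one, and an explicit doc argument taking priority:

```python
def getName(self):
    "docs from fget"
    return self._name

def getAge(self):                         # No docstring here
    return self._age

class Person:
    name = property(getName)                        # Copies fget's docstring
    age = property(getAge)                          # No doc anywhere: None
    job = property(getName, doc='explicit docs')    # doc argument wins

print(Person.name.__doc__)                # docs from fget
print(Person.age.__doc__)                 # None
print(Person.job.__doc__)                 # explicit docs
```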

This built-in property call returns a property object, which we assign to the name of the attribute to be managed in the class scope, where it will be inherited by every instance.

A First Example

To demonstrate how this translates to working code, the following class uses a property to trace access to an attribute named name; the actual stored data is named _name so it does not clash with the property (if you’re working along with the book examples package, some filenames in this chapter are implied by the command lines that run them following their listings):

class Person:                       # Add (object) in 2.X
    def __init__(self, name):
        self._name = name
    def getName(self):
        print('fetch...')
        return self._name
    def setName(self, value):
        print('change...')
        self._name = value
    def delName(self):
        print('remove...')
        del self._name
    name = property(getName, setName, delName, "name property docs")

bob = Person('Bob Smith')           # bob has a managed attribute
print(bob.name)                     # Runs getName
bob.name = 'Robert Smith'           # Runs setName
print(bob.name)
del bob.name                        # Runs delName

print('-'*20)
sue = Person('Sue Jones')           # sue inherits property too
print(sue.name)
print(Person.name.__doc__)          # Or help(Person.name)

Properties are available in both 2.X and 3.X, but they require new-style object derivation in 2.X to work correctly for assignments—add object as a superclass here to run this in 2.X. You can list the superclass in 3.X too, but it’s implied and not required, and is sometimes omitted in this book to reduce clutter.

This particular property doesn’t do much—it simply intercepts and traces an attribute—but it serves to demonstrate the protocol. When this code is run, two instances inherit the property, just as they would any other attribute attached to their class. However, their attribute accesses are caught:

c:\code> py -3 prop-person.py
fetch...
Bob Smith
change...
fetch...
Robert Smith
remove...
--------------------
fetch...
Sue Jones
name property docs

Like all class attributes, properties are inherited by both instances and lower subclasses. If we change our example as follows, for instance:

class Super:
    ...the original Person class code...
    name = property(getName, setName, delName, 'name property docs')

class Person(Super):
    pass                            # Properties are inherited (class attrs)

bob = Person('Bob Smith')
...rest unchanged...

the output is the same—the Person subclass inherits the name property from Super, and the bob instance gets it from Person. In terms of inheritance, properties work the same as normal methods; because they have access to the self instance argument, they can access instance state information and methods irrespective of subclass depth, as the next section further demonstrates.

Computed Attributes

The example in the prior section simply traces attribute accesses. Usually, though, properties do much more—computing the value of an attribute dynamically when fetched, for example. The following example illustrates:

class PropSquare:
    def __init__(self, start):
        self.value = start
    def getX(self):                 # On attr fetch
        return self.value ** 2
    def setX(self, value):          # On attr assign
        self.value = value
    X = property(getX, setX)        # No delete or docs

P = PropSquare(3)                   # Two instances of class with property
Q = PropSquare(32)                  # Each has different state information

print(P.X)                          # 3 ** 2
P.X = 4
print(P.X)                          # 4 ** 2
print(Q.X)                          # 32 ** 2 (1024)

This class defines an attribute X that is accessed as though it were static data, but really runs code to compute its value when fetched. The effect is much like an implicit method call. When the code is run, the value is stored in the instance as state information, but each time we fetch it via the managed attribute, its value is automatically squared:

c:\code> py -3 prop-computed.py
9
16
1024

Notice that we’ve made two different instances—because property methods automatically receive a self argument, they have access to the state information stored in instances. In our case, this means the fetch computes the square of the subject instance’s own data.

Coding Properties with Decorators

Although we’re saving additional details until the next chapter, we introduced function decorator basics earlier, in Chapter 32. Recall that the function decorator syntax:

@decorator

def func(args): ...

is automatically translated to this equivalent by Python, to rebind the function name to the result of the decorator callable:

def func(args): ...

func = decorator(func)

Because of this mapping, it turns out that the property built-in can serve as a decorator, to define a function that will run automatically when an attribute is fetched:

class Person:
    @property
    def name(self): ...             # Rebinds: name = property(name)

When run, the decorated method is automatically passed to the first argument of the property built-in. This is really just alternative syntax for creating a property and rebinding the attribute name manually, but may be seen as more explicit in this role:

class Person:
    def name(self): ...
    name = property(name)

Setter and deleter decorators

As of Python 2.6 and 3.0, property objects also have getter, setter, and deleter methods that assign the corresponding property accessor methods and return a copy of the property itself. We can use these to specify components of properties by decorating normal methods too, though the getter component is usually filled in automatically by the act of creating the property itself:

class Person:
    def __init__(self, name):
        self._name = name

    @property
    def name(self):                 # name = property(name)
        "name property docs"
        print('fetch...')
        return self._name

    @name.setter
    def name(self, value):          # name = name.setter(name)
        print('change...')
        self._name = value

    @name.deleter
    def name(self):                 # name = name.deleter(name)
        print('remove...')
        del self._name

bob = Person('Bob Smith')           # bob has a managed attribute
print(bob.name)                     # Runs name getter (name 1)
bob.name = 'Robert Smith'           # Runs name setter (name 2)
print(bob.name)
del bob.name                        # Runs name deleter (name 3)

print('-'*20)
sue = Person('Sue Jones')           # sue inherits property too
print(sue.name)
print(Person.name.__doc__)          # Or help(Person.name)

In fact, this code is equivalent to the first example in this section—decoration is just an alternative way to code properties in this case. When it’s run, the results are the same:

c:\code> py -3 prop-person-deco.py
fetch...
Bob Smith
change...
fetch...
Robert Smith
remove...
--------------------
fetch...
Sue Jones
name property docs

Compared to manual assignment of property results, in this case using decorators to code properties requires just three extra lines of code—a seemingly negligible difference. As is so often the case with alternative tools, though, the choice between the two techniques is largely subjective.
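The claim that setter returns a copy of the property, rather than changing it in place, can be verified directly. This sketch calls setter as an ordinary method instead of using it as a decorator; the function names are placeholders, and fget and fset are the attributes a property object uses to record its accessors:

```python
def getName(self):
    return self._name

def setName(self, value):
    self._name = value

readonly = property(getName)              # Only fget is filled in
readwrite = readonly.setter(setName)      # New property: fget plus fset

print(readwrite is readonly)              # False
print(readonly.fset)                      # None
print(readwrite.fget is getName)          # True
print(readwrite.fset is setName)          # True
```

This is why each decorated method in the class above must reuse the name of the property: every step rebinds that name to the newest copy.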

Descriptors

Descriptors provide an alternative way to intercept attribute access; they are strongly related to the properties discussed in the prior section. Really, a property is a kind of descriptor—technically speaking, the property built-in is just a simplified way to create a specific type of descriptor that runs method functions on attribute accesses. In fact, descriptors are the underlying implementation mechanism for a variety of class tools, including both properties and slots.

Functionally speaking, the descriptor protocol allows us to route a specific attribute’s get, set, and delete operations to methods of a separate class’s instance object that we provide. This allows us to insert code to be run automatically on attribute fetches and assignments, intercept attribute deletions, and provide documentation for the attributes if desired.

Descriptors are created as independent classes, and they are assigned to class attributes just like method functions. Like any other class attribute, they are inherited by subclasses and instances. Their access-interception methods are provided with both a self for the descriptor instance itself, as well as the instance of the client class whose attribute references the descriptor object. Because of this, they can retain and use state information of their own, as well as state information of the subject instance. For example, a descriptor may call methods available in the client class, as well as descriptor-specific methods it defines.

Like a property, a descriptor manages a single, specific attribute; although it can’t catch all attribute accesses generically, it provides control over both fetch and assignment accesses and allows us to change an attribute name freely from simple data to a computation without breaking existing code. Properties really are just a convenient way to create a specific kind of descriptor, and as we shall see, they can be coded as descriptors directly.

Unlike properties, descriptors are broader in scope, and provide a more general tool. For instance, because they are coded as normal classes, descriptors have their own state, may participate in descriptor inheritance hierarchies, can use composition to aggregate objects, and provide a natural structure for coding internal methods and attribute documentation strings.
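As a preview of what that generality can buy, the following sketch codes a small descriptor inheritance hierarchy, using the __get__ and __set__ methods described in the next section. All names here (Typed, Integer, String, Employee, and the expected and storage attributes) are hypothetical, and the type check is only one of many possible policies:

```python
class Typed:
    expected = object                     # Subclasses customize this

    def __init__(self, name):
        self.storage = '_' + name         # Descriptor state: where data lives

    def __get__(self, instance, owner):
        if instance is None:              # Accessed through the class
            return self
        return getattr(instance, self.storage)

    def __set__(self, instance, value):
        if not isinstance(value, self.expected):
            raise TypeError('expected ' + self.expected.__name__)
        setattr(instance, self.storage, value)

class Integer(Typed):                     # Descriptor subclasses inherit
    expected = int                        # the get/set logic

class String(Typed):
    expected = str

class Employee:
    name = String('name')
    pay = Integer('pay')
    def __init__(self, name, pay):
        self.name = name                  # Both run Typed.__set__
        self.pay = pay

bob = Employee('Bob', 100)
print(bob.name, bob.pay)                  # Bob 100
```

The shared logic is written once in Typed; each subclass only states the type it expects, which would be harder to arrange with separate property functions per attribute.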

The Basics

As mentioned previously, descriptors are coded as separate classes and provide specially named accessor methods for the attribute access operations they wish to intercept—get, set, and deletion methods in the descriptor class are automatically run when the attribute assigned to the descriptor class instance is accessed in the corresponding way:

class Descriptor:
    "docstring goes here"
    def __get__(self, instance, owner): ...        # Return attr value
    def __set__(self, instance, value): ...        # Return nothing (None)
    def __delete__(self, instance): ...            # Return nothing (None)

Classes with any of these methods are considered descriptors, and their methods are special when one of their instances is assigned to another class’s attribute—when the attribute is accessed, they are automatically invoked. If any of these methods are absent, it generally means that the corresponding type of access is not supported. Unlike properties, however, omitting a __set__ allows the descriptor attribute’s name to be assigned and thus redefined in an instance, thereby hiding the descriptor—to make an attribute read-only, you must define __set__ to catch assignments and raise an exception.

Descriptors with __set__ methods also have some special-case implications for inheritance that we’ll largely defer until Chapter 40’s coverage of metaclasses and the complete inheritance specification. In short, a descriptor with a __set__ is known formally as a data descriptor, and is given precedence over other names located by normal inheritance rules. The inherited descriptor for name __class__, for example, overrides the same name in an instance’s namespace dictionary. This also works to ensure that data descriptors you code in your own classes take precedence over others.
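The precedence rule is easy to observe by storing values directly in an instance’s namespace dictionary, which sidesteps the descriptor’s __set__. In this sketch (class names are arbitrary), the descriptor without __set__ is hidden by the instance entry, while the data descriptor still wins:

```python
class NonData:
    def __get__(self, instance, owner):
        return 'from non-data descriptor'

class Data:
    def __get__(self, instance, owner):
        return 'from data descriptor'
    def __set__(self, instance, value):
        raise AttributeError('read-only')

class C:
    a = NonData()
    b = Data()

x = C()
x.__dict__['a'] = 'instance value'        # Bypasses normal assignment
x.__dict__['b'] = 'instance value'

print(x.a)                                # instance value
print(x.b)                                # from data descriptor
```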

Descriptor method arguments

Before we code anything realistic, let’s take a brief look at some fundamentals. All three descriptor methods outlined in the prior section are passed both the descriptor class instance (self), and the instance of the client class to which the descriptor instance is attached (instance).

The __get__ access method additionally receives an owner argument, specifying the class to which the descriptor instance is attached. Its instance argument is either the instance through which the attribute was accessed (for instance.attr), or None when the attribute is accessed through the owner class directly (for class.attr). The former of these generally computes a value for instance access, and the latter usually returns self if descriptor object access is supported.

For example, in the following 3.X session, when X.attr is fetched, Python automatically runs the __get__ method of the Descriptor class instance to which the Subject.attr class attribute is assigned. In 2.X, use the print statement equivalent, and derive both classes here from object, as descriptors are a new-style class tool; in 3.X this derivation is implied and can be omitted, but doesn’t hurt:

>>> class Descriptor:                     # Add "(object)" in 2.X
        def __get__(self, instance, owner):
            print(self, instance, owner, sep='\n')

>>> class Subject:                        # Add "(object)" in 2.X
        attr = Descriptor()               # Descriptor instance is class attr

>>> X = Subject()
>>> X.attr
<__main__.Descriptor object at 0x0281E690>
<__main__.Subject object at 0x028289B0>
<class '__main__.Subject'>

>>> Subject.attr
<__main__.Descriptor object at 0x0281E690>
None
<class '__main__.Subject'>

Notice the arguments automatically passed in to the __get__ method in the first attribute fetch—when X.attr is fetched, it’s as though the following translation occurs (though the Subject.attr here doesn’t invoke __get__ again):

X.attr -> Descriptor.__get__(Subject.attr, X, Subject)

The descriptor knows it is being accessed directly when its instance argument is None.
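A common way to act on this, sketched below with the same class names as the session above, is to test instance in __get__ and return the descriptor object itself for class-level access. The string returned for instance access is just a placeholder for a real computation:

```python
class Descriptor:
    def __get__(self, instance, owner):
        if instance is None:              # Subject.attr: class access
            return self
        return 'value for a %s instance' % owner.__name__

class Subject:
    attr = Descriptor()

print(Subject().attr)                     # value for a Subject instance
print(isinstance(Subject.attr, Descriptor))    # True
```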

Read-only descriptors

As mentioned earlier, unlike properties, simply omitting the __set__ method in a descriptor isn’t enough to make an attribute read-only, because the descriptor name can be assigned to an instance. In the following, the attribute assignment to X.a stores a in the instance object X, thereby hiding the descriptor stored in class C:

>>> class D:
        def __get__(*args): print('get')

>>> class C:
        a = D()                           # Attribute a is a descriptor instance

>>> X = C()
>>> X.a                                   # Runs inherited descriptor __get__
get
>>> C.a
get
>>> X.a = 99                              # Stored on X, hiding C.a!
>>> X.a
99
>>> list(X.__dict__.keys())
['a']
>>> Y = C()
>>> Y.a                                   # Y still inherits descriptor
get
>>> C.a
get

This is the way all instance attribute assignments work in Python, and it allows classes to selectively override class-level defaults in their instances. To make a descriptor-based attribute read-only, catch the assignment in the descriptor class and raise an exception to prevent attribute assignment—when assigning an attribute that is a descriptor, Python effectively bypasses the normal instance-level assignment behavior and routes the operation to the descriptor object:

>>> class D:
        def __get__(*args): print('get')
        def __set__(*args): raise AttributeError('cannot set')

>>> class C:
        a = D()

>>> X = C()
>>> X.a                                   # Routed to C.a.__get__
get
>>> X.a = 99                              # Routed to C.a.__set__
AttributeError: cannot set

NOTE

Also be careful not to confuse the descriptor __delete__ method with the general __del__ method. The former is called on attempts to delete the managed attribute name on an instance of the owner class; the latter is the general instance destructor method, run when an instance of any kind of class is about to be garbage-collected. __delete__ is more closely related to the __delattr__ generic attribute deletion method we’ll meet later in this chapter. See Chapter 30 for more on operator overloading methods.
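The difference can be seen in a few lines. In this sketch, Attr, Owner, and the calls list are illustrative names; note that exactly when __del__ runs depends on the Python implementation’s garbage collection, so only the __delete__ call is shown with its result:

```python
calls = []

class Attr:
    def __delete__(self, instance):       # Runs on: del instance.attr
        calls.append('__delete__')

class Owner:
    attr = Attr()                         # Descriptor is a class attribute
    def __del__(self):                    # Runs when an Owner is reclaimed
        calls.append('__del__')

x = Owner()
del x.attr                                # Attribute deletion: Attr.__delete__
print(calls)                              # ['__delete__']
del x                                     # Drops the reference; __del__ timing
                                          # is implementation-dependent
```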

A First Example

To see how this all comes together in more realistic code, let’s get started with the same first example we wrote for properties. The following defines a descriptor that intercepts access to an attribute named name in its clients. Its methods use their instance argument to access state information in the subject instance, where the name string is actually stored. Like properties, descriptors work properly only for new-style classes, so be sure to derive both classes in the following from object if you’re using 2.X—it’s not enough to derive just the descriptor, or just its client:

class Name:                         # Use (object) in 2.X
    "name descriptor docs"
    def __get__(self, instance, owner):
        print('fetch...')
        return instance._name
    def __set__(self, instance, value):
        print('change...')
        instance._name = value
    def __delete__(self, instance):
        print('remove...')
        del instance._name

class Person:                       # Use (object) in 2.X
    def __init__(self, name):
        self._name = name
    name = Name()                   # Assign descriptor to attr

bob = Person('Bob Smith')           # bob has a managed attribute
print(bob.name)                     # Runs Name.__get__
bob.name = 'Robert Smith'           # Runs Name.__set__
print(bob.name)
del bob.name                        # Runs Name.__delete__

print('-'*20)
sue = Person('Sue Jones')           # sue inherits descriptor too
print(sue.name)
print(Name.__doc__)                 # Or help(Name)

Notice in this code how we assign an instance of our descriptor class to a class attribute in the client class; because of this, it is inherited by all instances of the class, just like a class’s methods. Really, we must assign the descriptor to a class attribute like this; it won’t work if assigned to a self instance attribute instead. When the descriptor’s __get__ method is run, it is passed three objects to define its context:

§ self is the Name class instance.

§ instance is the Person class instance.

§ owner is the Person class.

When this code is run the descriptor’s methods intercept accesses to the attribute, much like the property version. In fact, the output is the same again:

c:\code> py -3 desc-person.py
fetch...
Bob Smith
change...
fetch...
Robert Smith
remove...
--------------------
fetch...
Sue Jones
name descriptor docs
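Returning to the requirement that the descriptor be assigned to a class attribute, the following sketch contrasts the two placements. The ClassLevel and InstanceLevel names are made up for this comparison; when the descriptor object is stored on the instance instead, fetching the attribute simply returns the descriptor object without running its __get__:

```python
class Name:
    def __get__(self, instance, owner):
        return 'computed'

class ClassLevel:
    name = Name()                         # Class attribute: __get__ runs

class InstanceLevel:
    def __init__(self):
        self.name = Name()                # Instance attribute: plain object

print(ClassLevel().name)                  # computed
print(isinstance(InstanceLevel().name, Name))    # True: no __get__ call
```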

Also like in the property example, our descriptor class instance is a class attribute and thus is inherited by all instances of the client class and any subclasses. If we change the Person class in our example to the following, for instance, the output of our script is the same:

...

class Super:
    def __init__(self, name):
        self._name = name
    name = Name()

class Person(Super):                # Descriptors are inherited (class attrs)
    pass

...

Also note that when a descriptor class is not useful outside the client class, it’s perfectly reasonable to embed the descriptor’s definition inside its client syntactically. Here’s what our example looks like if we use a nested class:

class Person:
    def __init__(self, name):
        self._name = name

    class Name:                     # Using a nested class
        "name descriptor docs"
        def __get__(self, instance, owner):
            print('fetch...')
            return instance._name
        def __set__(self, instance, value):
            print('change...')
            instance._name = value
        def __delete__(self, instance):
            print('remove...')
            del instance._name
    name = Name()

When coded this way, Name becomes a local variable in the scope of the Person class statement, such that it won’t clash with any names outside the class. This version works the same as the original—we’ve simply moved the descriptor class definition into the client class’s scope—but the last line of the testing code must change to fetch the docstring from its new location (per the example file desc-person-nested.py):

...

print(Person.Name.__doc__) # Differs: not Name.__doc__ outside class

Computed Attributes

As was the case when using properties, our first descriptor example of the prior section didn’t do much—it simply printed trace messages for attribute accesses. In practice, descriptors can also be used to compute attribute values each time they are fetched. The following illustrates—it’s a rehash of the same example we coded for properties, which uses a descriptor to automatically square an attribute’s value each time it is fetched:

class DescSquare:
    def __init__(self, start):                  # Each desc has own state
        self.value = start
    def __get__(self, instance, owner):         # On attr fetch
        return self.value ** 2
    def __set__(self, instance, value):         # On attr assign
        self.value = value                      # No delete or docs

class Client1:
    X = DescSquare(3)               # Assign descriptor instance to class attr

class Client2:
    X = DescSquare(32)              # Another instance in another client class
                                    # Could also code two instances in same class
c1 = Client1()
c2 = Client2()

print(c1.X)                         # 3 ** 2
c1.X = 4
print(c1.X)                         # 4 ** 2
print(c2.X)                         # 32 ** 2 (1024)

When run, the output of this example is the same as that of the original property-based version, but here a descriptor class object is intercepting the attribute accesses:

c:\code> py -3 desc-computed.py
9
16
1024

Using State Information in Descriptors

If you study the two descriptor examples we’ve written so far, you might notice that they get their information from different places: the first (the name attribute example) uses data stored on the client instance, and the second (the attribute squaring example) uses data attached to the descriptor object itself (a.k.a. self). In fact, descriptors can use both instance state and descriptor state, or any combination thereof:

§ Descriptor state is used to manage either data internal to the workings of the descriptor, or data that spans all instances. It can vary per attribute appearance (often, per client class).

§ Instance state records information related to and possibly created by the client class. It can vary per client class instance (that is, per application object).

In other words, descriptor state is per-descriptor data and instance state is per-client-instance data. As usual in OOP, you must choose state carefully. For instance, you would not normally use descriptor state to record employee names, since each client instance requires its own value—if stored in the descriptor, each client class instance will effectively share the same single copy. On the other hand, you would not usually use instance state to record data pertaining to descriptor implementation internals—if stored in each instance, there would be multiple varying copies.

Descriptor methods may use either state form, but descriptor state often makes it unnecessary to use special naming conventions to avoid name collisions in the instance for data that is not instance-specific. For example, the following descriptor attaches information to its own instance, so it doesn’t clash with that on the client class’s instance—but also shares that information between two client instances:

class DescState:                           # Use descriptor state, (object) in 2.X
    def __init__(self, value):
        self.value = value
    def __get__(self, instance, owner):    # On attr fetch
        print('DescState get')
        return self.value * 10
    def __set__(self, instance, value):    # On attr assign
        print('DescState set')
        self.value = value

# Client class
class CalcAttrs:
    X = DescState(2)                       # Descriptor class attr
    Y = 3                                  # Class attr
    def __init__(self):
        self.Z = 4                         # Instance attr

obj = CalcAttrs()
print(obj.X, obj.Y, obj.Z)                 # X is computed, others are not
obj.X = 5                                  # X assignment is intercepted
CalcAttrs.Y = 6                            # Y reassigned in class
obj.Z = 7                                  # Z assigned in instance
print(obj.X, obj.Y, obj.Z)

obj2 = CalcAttrs()                         # But X uses shared data, like Y!
print(obj2.X, obj2.Y, obj2.Z)

This code’s internal value information lives only in the descriptor, so there won’t be a collision if the same name is used in the client’s instance. Notice that only the descriptor attribute is managed here—get and set accesses to X are intercepted, but accesses to Y and Z are not (Y is attached to the client class and Z to the instance). When this code is run, X is computed when fetched, but its value is also the same for all client instances because it uses descriptor-level state:

c:\code> py -3 desc-state-desc.py
DescState get
20 3 4
DescState set
DescState get
50 6 7
DescState get
50 6 4

It’s also feasible for a descriptor to store or use an attribute attached to the client class’s instance, instead of itself. Crucially, unlike data stored in the descriptor itself, this allows for data that can vary per client class instance. The descriptor in the following example assumes the instance has an attribute _X attached by the client class, and uses it to compute the value of the attribute it represents:

class InstState:                           # Using instance state, (object) in 2.X
    def __get__(self, instance, owner):
        print('InstState get')             # Assume set by client class
        return instance._X * 10
    def __set__(self, instance, value):
        print('InstState set')
        instance._X = value

# Client class
class CalcAttrs:
    X = InstState()                        # Descriptor class attr
    Y = 3                                  # Class attr
    def __init__(self):
        self._X = 2                        # Instance attr
        self.Z = 4                         # Instance attr

obj = CalcAttrs()
print(obj.X, obj.Y, obj.Z)                 # X is computed, others are not
obj.X = 5                                  # X assignment is intercepted
CalcAttrs.Y = 6                            # Y reassigned in class
obj.Z = 7                                  # Z assigned in instance
print(obj.X, obj.Y, obj.Z)

obj2 = CalcAttrs()                         # But X differs now, like Z!
print(obj2.X, obj2.Y, obj2.Z)

Here, as before, X is assigned to a descriptor that manages accesses. The new descriptor, though, holds no information itself; instead, it uses an attribute assumed to exist in the instance. That attribute is named _X, to avoid collisions with the name of the descriptor itself. When this version is run, the results are similar, but the value of the descriptor attribute can vary per client instance due to the differing state policy:

c:\code> py -3 desc-state-inst.py
InstState get
20 3 4
InstState set
InstState get
50 6 7
InstState get
20 6 4

Both descriptor and instance state have roles. In fact, this is a general advantage that descriptors have over properties—because they have state of their own, they can easily retain data internally, without adding it to the namespace of the client instance object. As a summary, the following uses both state sources—its self.data retains per-attribute information, while its instance.data can vary per client instance:

>>> class DescBoth:
        def __init__(self, data):
            self.data = data
        def __get__(self, instance, owner):
            return '%s, %s' % (self.data, instance.data)
        def __set__(self, instance, value):
            instance.data = value

>>> class Client:
        def __init__(self, data):
            self.data = data
        managed = DescBoth('spam')

>>> I = Client('eggs')
>>> I.managed                              # Show both data sources
'spam, eggs'
>>> I.managed = 'SPAM'                     # Change instance data
>>> I.managed
'spam, SPAM'

We’ll revisit the implications of this choice in a larger case study later in this chapter. Before we move on, recall from Chapter 32’s coverage of slots that we can access “virtual” attributes like properties and descriptors with tools like dir and getattr, even though they don’t exist in the instance’s namespace dictionary. Whether you should access these this way probably varies per program—properties and descriptors may run arbitrary computation, and may be less obviously instance “data” than slots:

>>> I.__dict__
{'data': 'SPAM'}
>>> [x for x in dir(I) if not x.startswith('__')]
['data', 'managed']

>>> getattr(I, 'data')
'SPAM'
>>> getattr(I, 'managed')
'spam, SPAM'

>>> for attr in (x for x in dir(I) if not x.startswith('__')):
        print('%s => %s' % (attr, getattr(I, attr)))

data => SPAM
managed => spam, SPAM

The more generic __getattr__ and __getattribute__ tools we’ll meet later are not designed to support this functionality: because they have no class-level attributes, their “virtual” attribute names do not appear in dir results. In exchange, they are also not limited to the specific attribute names coded as properties or descriptors, two tools that share even more than this behavior, as the next section explains.
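The dir difference is easy to check with a short sketch. The two class names here, WithProperty and WithGetattr, are hypothetical and exist only for this illustration:

```python
class WithProperty:
    @property
    def managed(self):                  # Class-level attribute: visible to dir
        return 'computed'

class WithGetattr:
    def __getattr__(self, name):        # Runs only for undefined names
        if name == 'managed':
            return 'computed'
        raise AttributeError(name)

P, G = WithProperty(), WithGetattr()
print(P.managed, G.managed)             # Both fetches run code
print('managed' in dir(P))              # True: the property lives in the class
print('managed' in dir(G))              # False: the name exists only in method logic
```

Both objects respond to the same fetch, but only the property registers as a name that generic tools can discover.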

How Properties and Descriptors Relate

As mentioned earlier, properties and descriptors are strongly related—the property built-in is just a convenient way to create a descriptor. Now that you know how both work, you should also be able to see that it’s possible to simulate the property built-in with a descriptor class like the following:

class Property:
    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel                                  # Save unbound methods
        self.__doc__ = doc                                # or other callables

    def __get__(self, instance, instancetype=None):
        if instance is None:
            return self
        if self.fget is None:
            raise AttributeError("can't get attribute")
        return self.fget(instance)                        # Pass instance to self
                                                          # in property accessors
    def __set__(self, instance, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(instance, value)

    def __delete__(self, instance):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(instance)

class Person:
    def getName(self): print('getName...')
    def setName(self, value): print('setName...')
    name = Property(getName, setName)                     # Use like property()

x = Person()
x.name
x.name = 'Bob'
del x.name

This Property class catches attribute accesses with the descriptor protocol and routes requests to functions or methods passed in and saved in descriptor state when the class is created. Attribute fetches, for example, are routed from the Person class, to the Property class’s __get__ method, and back to the Person class’s getName. With descriptors, this “just works”:

c:\code> py -3 prop-desc-equiv.py
getName...
setName...
AttributeError: can't delete attribute

Note that this descriptor class equivalent only handles basic property usage, though; to use @ decorator syntax to also specify set and delete operations, we’d have to extend our Property class with setter and deleter methods, which would save the decorated accessor function and return the property object (self should suffice). Since the property built-in already does this, we’ll omit a formal coding of this extension here.
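For the curious, one way such an extension might look is sketched below. It repeats a compact copy of the Property class so that it runs on its own. The setter and deleter method names mirror the built-in, but returning self is a simplification assumed here; the built-in returns a new property object instead.

```python
class Property:
    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget, self.fset, self.fdel = fget, fset, fdel
        self.__doc__ = doc

    def __get__(self, instance, instancetype=None):
        if instance is None:
            return self
        if self.fget is None:
            raise AttributeError("can't get attribute")
        return self.fget(instance)

    def __set__(self, instance, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(instance, value)

    def __delete__(self, instance):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(instance)

    def setter(self, fset):             # Used as @name.setter
        self.fset = fset                # Save the decorated accessor
        return self                     # Rebind the name to this descriptor

    def deleter(self, fdel):            # Used as @name.deleter
        self.fdel = fdel
        return self

class Person:
    def __init__(self, name):
        self._name = name

    @Property                           # name = Property(name)
    def name(self):
        return self._name

    @name.setter                        # name = name.setter(name)
    def name(self, value):
        self._name = value.title()

    @name.deleter                       # name = name.deleter(name)
    def name(self):
        del self._name

bob = Person('bob smith')
bob.name = 'robert smith'               # Routed to the setter function
print(bob.name)                         # Robert Smith
del bob.name                            # Routed to the deleter function
```

Because each decorator returns the same descriptor, the name in the class stays bound to one Property object that accumulates its accessors as the class statement runs.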

Descriptors and slots and more

You can probably now also imagine, at least in part, how descriptors are used to implement Python’s slots extension: instance attribute dictionaries are avoided by creating class-level descriptors that intercept slot name access, and map those names to sequential storage space in the instance. Unlike the explicit property call, though, much of the magic behind slots is orchestrated at class creation time, both automatically and implicitly, when a __slots__ attribute is present in a class.

See Chapter 32 for more on slots (and why they’re not recommended except in pathological use cases). Descriptors are also used for other class tools, but we’ll omit further internal details here; see Python’s manuals and source code for more details.
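As a brief illustration of the slots connection, the following sketch inspects the class-level object created for a slot name. The Slotted class is hypothetical, and the member_descriptor type name is a CPython detail that other implementations may not share:

```python
class Slotted:
    __slots__ = ['a', 'b']              # No per-instance __dict__ is created

    def __init__(self):
        self.a, self.b = 1, 2

I = Slotted()
desc = Slotted.__dict__['a']            # Class-level object made for slot 'a'
print(type(desc).__name__)              # member_descriptor in CPython
print(hasattr(desc, '__get__'), hasattr(desc, '__set__'))
print(desc.__get__(I, Slotted))         # Same as I.a
desc.__set__(I, 99)                     # Same as I.a = 99
print(I.a, hasattr(I, '__dict__'))      # 99 False
```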

NOTE

In Chapter 39, we’ll also make use of descriptors to implement function decorators that apply to both functions and methods. As you’ll see there, because descriptors receive both descriptor and subject class instances they work well in this role, though nested functions are usually a conceptually much simpler solution. We’ll also deploy descriptors as one way to intercept built-in operation method fetches in Chapter 39.

Be sure to also see Chapter 40’s coverage of data descriptors’ precedence in the full inheritance model mentioned earlier: with a __set__, descriptors override other names, and are thus fairly binding—they cannot be hidden by names in instance dictionaries.
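The precedence rule is simple to demonstrate in isolation. In this sketch, whose class names are made up for the purpose, a same-named entry is written straight into the instance dictionary to see which source a fetch prefers:

```python
class DataDesc:                         # Has __set__: a data descriptor
    def __get__(self, instance, owner):
        return 'from descriptor'
    def __set__(self, instance, value):
        raise AttributeError('read-only')

class NonDataDesc:                      # __get__ only: a non-data descriptor
    def __get__(self, instance, owner):
        return 'from descriptor'

class Client:
    a = DataDesc()
    b = NonDataDesc()

I = Client()
I.__dict__['a'] = 'from instance'       # Bypass __set__ by writing the dict
I.__dict__['b'] = 'from instance'
print(I.a)                              # from descriptor: instance name ignored
print(I.b)                              # from instance: instance name wins
```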

__getattr__ and __getattribute__

So far, we’ve studied properties and descriptors—tools for managing specific attributes. The __getattr__ and __getattribute__ operator overloading methods provide still other ways to intercept attribute fetches for class instances. Like properties and descriptors, they allow us to insert code to be run automatically when attributes are accessed. As we’ll see, though, these two methods can also be used in more general ways. Because they intercept arbitrary names, they apply in broader roles such as delegation, but may also incur extra calls in some contexts, and are too dynamic to register in dir results.

Attribute fetch interception comes in two flavors, coded with two different methods:

§ __getattr__ is run for undefined attributes—because it is run only for attributes not stored on an instance or inherited from one of its classes, its use is straightforward.

§ __getattribute__ is run for every attribute—because it is all-inclusive, you must be cautious when using this method to avoid recursive loops by passing attribute accesses to a superclass.

We met the former of these in Chapter 30; it’s available for all Python versions. The latter of these is available for new-style classes in 2.X, and for all (implicitly new-style) classes in 3.X. These two methods are representatives of a set of attribute interception methods that also includes __setattr__ and __delattr__. Because these methods have similar roles, though, we will generally treat them all as a single topic here.

Unlike properties and descriptors, these methods are part of Python’s general operator overloading protocol—specially named methods of a class, inherited by subclasses, and run automatically when instances are used in the implied built-in operation. Like all normal methods of a class, they each receive a first self argument when called, giving access to any required instance state information as well as other methods of the class in which they appear.

The __getattr__ and __getattribute__ methods are also more generic than properties and descriptors: they can be used to intercept access to any (or even all) instance attribute fetches, not just a single specific name. Because of this, these two methods are well suited to general delegation-based coding patterns, where they can be used to implement wrapper (a.k.a. proxy) objects that manage all attribute accesses for an embedded object. By contrast, we must define one property or descriptor for every attribute we wish to intercept. As we’ll see ahead, this role is impaired somewhat in new-style classes for built-in operations, but still applies to all named methods in a wrapped object’s interface.

Finally, these two methods are more narrowly focused than the alternatives we considered earlier: they intercept attribute fetches only, not assignments. To also catch attribute changes by assignment, we must code a __setattr__ method, an operator overloading method run for every attribute assignment, which must take care to avoid recursive loops by routing attribute assignments through the instance namespace dictionary or a superclass method. Although less common, we can also code a __delattr__ overloading method (which must avoid looping in the same way) to intercept attribute deletions. By contrast, properties and descriptors catch get, set, and delete operations by design.

Most of these operator overloading methods were introduced earlier in the book; here, we’ll expand on their usage and study their roles in larger contexts.

The Basics

__getattr__ and __setattr__ were introduced in Chapter 30 and Chapter 32, and __getattribute__ was mentioned briefly in Chapter 32. In short, if a class defines or inherits the following methods, they will be run automatically when an instance is used in the context described by the comments to the right:

def __getattr__(self, name):          # On undefined attribute fetch [obj.name]
def __getattribute__(self, name):     # On all attribute fetch [obj.name]
def __setattr__(self, name, value):   # On all attribute assignment [obj.name=value]
def __delattr__(self, name):          # On all attribute deletion [del obj.name]

In all of these, self is the subject instance object as usual, name is the string name of the attribute being accessed, and value is the object being assigned to the attribute. The two get methods normally return an attribute’s value, and the other two return nothing (None). All can raise exceptions to signal prohibited access.

For example, to catch every attribute fetch, we can use either of the first two previous methods, and to catch every attribute assignment we can use the third. The following uses __getattr__ and works portably on both Python 2.X and 3.X, not requiring new-style object derivation in 2.X:

class Catcher:
    def __getattr__(self, name):
        print('Get: %s' % name)
    def __setattr__(self, name, value):
        print('Set: %s %s' % (name, value))

X = Catcher()
X.job                               # Prints "Get: job"
X.pay                               # Prints "Get: pay"
X.pay = 99                          # Prints "Set: pay 99"

Using __getattribute__ works exactly the same in this specific case, but requires object derivation in 2.X (only), and has subtle looping potential, which we’ll take up in the next section:

class Catcher(object):                        # Need (object) in 2.X only
    def __getattribute__(self, name):         # Works same as getattr here
        print('Get: %s' % name)               # But prone to loops on general
    ...rest unchanged...

Such a coding structure can be used to implement the delegation design pattern we met earlier, in Chapter 31. Because all attributes are routed to our interception methods generically, we can validate and pass them along to embedded, managed objects. The following class (borrowed from Chapter 31), for example, traces every attribute fetch made to another object passed to the wrapper (proxy) class:

class Wrapper:
    def __init__(self, object):
        self.wrapped = object                    # Save object
    def __getattr__(self, attrname):
        print('Trace: ' + attrname)              # Trace fetch
        return getattr(self.wrapped, attrname)   # Delegate fetch

X = Wrapper([1, 2, 3])
X.append(4)                                      # Prints "Trace: append"
print(X.wrapped)                                 # Prints "[1, 2, 3, 4]"

There is no such analog for properties and descriptors, short of coding accessors for every possible attribute in every possibly wrapped object. On the other hand, when such generality is not required, generic accessor methods may incur additional calls for assignments in some contexts—a tradeoff described in Chapter 30 and mentioned in the context of the case study example we’ll explore at the end of this chapter.

Avoiding loops in attribute interception methods

These methods are generally straightforward to use; their only substantially complex aspect is the potential for looping (a.k.a. recursing). Because __getattr__ is called for undefined attributes only, it can freely fetch other attributes within its own code. However, because __getattribute__ and __setattr__ are run for all attributes, their code needs to be careful when accessing other attributes to avoid calling themselves again and triggering a recursive loop.

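For example, another attribute fetch run inside a __getattribute__ method’s code will trigger __getattribute__ again, and the code will recurse until Python’s recursion limit is exceeded and an exception is raised (RecursionError in Python 3.5 and later, RuntimeError in earlier versions):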

def __getattribute__(self, name):
    x = self.other                                # LOOPS!

Technically, this method is even more loop-prone than this may imply—a self attribute reference run anywhere in a class that defines this method will trigger __getattribute__, and also has the potential to loop depending on the class’s logic. This is normally desired behavior—intercepting every attribute fetch is this method’s purpose, after all—but you should be aware that this method catches all attribute fetches wherever they are coded. When coded within __getattribute__ itself, this almost always causes a loop. To avoid this loop, route the fetch through a higher superclass instead to skip this level’s version—because the object class is always a new-style superclass, it serves well in this role:

def __getattribute__(self, name):
    x = object.__getattribute__(self, 'other')    # Force higher to avoid me

For __setattr__, the situation is similar, as summarized in Chapter 30—assigning any attribute inside this method triggers __setattr__ again and may create a similar loop:

def __setattr__(self, name, value):
    self.other = value                            # Recurs (and might LOOP!)

Here too, self attribute assignments anywhere in a class defining this method trigger __setattr__ as well, though the potential for looping is much stronger when they show up in __setattr__ itself. To work around this problem, you can assign the attribute as a key in the instance’s __dict__ namespace dictionary instead. This avoids direct attribute assignment:

def __setattr__(self, name, value):
    self.__dict__['other'] = value                # Use attr dict to avoid me

Although it’s a less traditional approach, __setattr__ can also pass its own attribute assignments to a higher superclass to avoid looping, just like __getattribute__ (and per the upcoming note, this scheme is sometimes preferred):

def __setattr__(self, name, value):
    object.__setattr__(self, 'other', value)      # Force higher to avoid me

By contrast, though, we cannot use the __dict__ trick to avoid loops in __getattribute__:

def __getattribute__(self, name):
    x = self.__dict__['other']                    # Loops!

Fetching the __dict__ attribute itself triggers __getattribute__ again, causing a recursive loop. Strange but true!
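To see this behavior live, the following sketch contrasts a looping __getattribute__ with one that fetches the namespace dictionary through the superclass. The class names are hypothetical, and the RecursionError name assumes Python 3.5 or later (earlier versions raise RuntimeError instead):

```python
class Loops:
    def __getattribute__(self, name):
        return self.__dict__[name]      # Fetching __dict__ reruns this method

class Safe:
    def __init__(self):
        self.other = 'spam'
    def __getattribute__(self, name):
        namespace = object.__getattribute__(self, '__dict__')   # Skips this level
        return namespace[name]          # Instance names only in this sketch

try:
    Loops().other
except RecursionError as exc:           # RuntimeError before Python 3.5
    print('loop detected:', type(exc).__name__)

print(Safe().other)                     # spam
```

Note that the Safe version is deliberately minimal: it resolves only names stored on the instance, so it is a demonstration of the routing, not a general-purpose fetcher.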

The __delattr__ method is less commonly used in practice, but when it is, it is called for every attribute deletion (just as __setattr__ is called for every attribute assignment). When using this method, you must take care to avoid loops when deleting attributes, by using the same techniques: namespace dictionary operations or superclass method calls.

NOTE

As noted in Chapter 30, attributes implemented with new-style class features such as slots and properties are not physically stored in the instance’s __dict__ namespace dictionary (and slots may even preclude its existence entirely). Because of this, code that wishes to support such attributes should code __setattr__ to assign with the object.__setattr__ scheme shown here, not by self.__dict__ indexing. Namespace __dict__ operations suffice for classes known to store data in instances, like this chapter’s self-contained examples; general tools, though, should prefer object.
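A small sketch makes the difference concrete. Both classes below are hypothetical; each declares a slot, so its instances have no __dict__ at all:

```python
class Slotted:
    __slots__ = ['name']                # No instance __dict__ here

    def __setattr__(self, attr, value):
        print('set: ' + attr)
        object.__setattr__(self, attr, value)    # Works for slots too

class Broken:
    __slots__ = ['name']

    def __setattr__(self, attr, value):
        self.__dict__[attr] = value     # Fails: slotted instance has no __dict__

ok = Slotted()
ok.name = 'Bob'
print(ok.name)                          # Bob

try:
    Broken().name = 'Sue'
except AttributeError as exc:
    print('failed:', exc)
```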

A First Example

Generic attribute management is not nearly as complicated as the prior section may have implied. To see how to put these ideas to work, here is the same first example we used for properties and descriptors in action again, this time implemented with attribute operator overloading methods. Because these methods are so generic, we test attribute names here to know when a managed attribute is being accessed; others are allowed to pass normally:

class Person:                               # Portable: 2.X or 3.X
    def __init__(self, name):               # On [Person()]
        self._name = name                   # Triggers __setattr__!

    def __getattr__(self, attr):            # On [obj.undefined]
        print('get: ' + attr)
        if attr == 'name':                  # Intercept name: not stored
            return self._name               # Does not loop: real attr
        else:                               # Others are errors
            raise AttributeError(attr)

    def __setattr__(self, attr, value):     # On [obj.any = value]
        print('set: ' + attr)
        if attr == 'name':
            attr = '_name'                  # Set internal name
        self.__dict__[attr] = value         # Avoid looping here

    def __delattr__(self, attr):            # On [del obj.any]
        print('del: ' + attr)
        if attr == 'name':
            attr = '_name'                  # Avoid looping here too
        del self.__dict__[attr]             # but much less common

bob = Person('Bob Smith')                   # bob has a managed attribute
print(bob.name)                             # Runs __getattr__
bob.name = 'Robert Smith'                   # Runs __setattr__
print(bob.name)
del bob.name                                # Runs __delattr__

print('-'*20)
sue = Person('Sue Jones')                   # sue has a managed attribute too
print(sue.name)
#print(Person.name.__doc__)                 # No equivalent here

Notice that the attribute assignment in the __init__ constructor triggers __setattr__ too—this method catches every attribute assignment, even those anywhere within the class itself. When this code is run, the same output is produced, but this time it’s the result of Python’s normal operator overloading mechanism and our attribute interception methods:

c:\code> py -3 getattr-person.py
set: _name
get: name
Bob Smith
set: name
get: name
Robert Smith
del: name
--------------------
set: _name
get: name
Sue Jones

Also note that, unlike with properties and descriptors, there’s no direct notion of specifying documentation for our attribute here; managed attributes exist within the code of our interception methods, not as distinct objects.

Using __getattribute__

To achieve exactly the same results with __getattribute__, replace __getattr__ in the example with the following; because it catches all attribute fetches, this version must be careful to avoid looping by passing new fetches to a superclass, and it can’t generally assume unknown names are errors:

# Replace __getattr__ with this

    def __getattribute__(self, attr):                 # On [obj.any]
        print('get: ' + attr)
        if attr == 'name':                            # Intercept all names
            attr = '_name'                            # Map to internal name
        return object.__getattribute__(self, attr)    # Avoid looping here

When run with this change, the output is similar, but we get an extra __getattribute__ call for the self.__dict__ fetch in __setattr__ (the first time originating in __init__) and again in __delattr__:

c:\code> py -3 getattribute-person.py
set: _name
get: __dict__
get: name
Bob Smith
set: name
get: __dict__
get: name
Robert Smith
del: name
get: __dict__
--------------------
set: _name
get: __dict__
get: name
Sue Jones

This example is equivalent to that coded for properties and descriptors, but it’s a bit artificial, and it doesn’t really highlight these tools’ assets. Because they are generic, __getattr__ and __getattribute__ are probably more commonly used in delegation-based code (as sketched earlier), where attribute access is validated and routed to an embedded object. Where just a single attribute must be managed, properties and descriptors might do as well or better.

Computed Attributes

As before, our prior example doesn’t really do anything but trace attribute fetches; it’s not much more work to compute an attribute’s value when fetched. As for properties and descriptors, the following creates a virtual attribute X that runs a calculation when fetched:

class AttrSquare:
    def __init__(self, start):
        self.value = start                            # Triggers __setattr__!

    def __getattr__(self, attr):                      # On undefined attr fetch
        if attr == 'X':
            return self.value ** 2                    # value is not undefined
        else:
            raise AttributeError(attr)

    def __setattr__(self, attr, value):               # On all attr assignments
        if attr == 'X':
            attr = 'value'
        self.__dict__[attr] = value

A = AttrSquare(3)                                     # 2 instances of class with overloading
B = AttrSquare(32)                                    # Each has different state information

print(A.X)                                            # 3 ** 2
A.X = 4
print(A.X)                                            # 4 ** 2
print(B.X)                                            # 32 ** 2 (1024)

Running this code results in the same output that we got earlier when using properties and descriptors, but this script’s mechanics are based on generic attribute interception methods:

c:\code> py -3 getattr-computed.py
9
16
1024

Using __getattribute__

As before, we can achieve the same effect with __getattribute__ instead of __getattr__; the following replaces the fetch method with a __getattribute__ and changes the __setattr__ assignment method to avoid looping by using direct superclass method calls instead of __dict__ keys:

class AttrSquare:                                     # Add (object) for 2.X
    def __init__(self, start):
        self.value = start                            # Triggers __setattr__!

    def __getattribute__(self, attr):                 # On all attr fetches
        if attr == 'X':
            return self.value ** 2                    # Triggers __getattribute__ again!
        else:
            return object.__getattribute__(self, attr)

    def __setattr__(self, attr, value):               # On all attr assignments
        if attr == 'X':
            attr = 'value'
        object.__setattr__(self, attr, value)

When this version, getattribute-computed.py, is run, the results are the same again. Notice, though, the implicit routing going on inside this class’s methods:

§ self.value=start inside the constructor triggers __setattr__

§ self.value inside __getattribute__ triggers __getattribute__ again

In fact, __getattribute__ is run twice each time we fetch attribute X. This doesn’t happen in the __getattr__ version, because the value attribute is not undefined. If you care about speed and want to avoid this, change __getattribute__ to use the superclass to fetch value as well:

    def __getattribute__(self, attr):
        if attr == 'X':
            return object.__getattribute__(self, 'value') ** 2
        else:                                         # Other names: as before
            return object.__getattribute__(self, attr)

Of course, this still incurs a call to the superclass method, but not an additional recursive call before we get there. Add print calls to these methods to trace how and when they run.
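As one way to run that experiment, the following sketch records each __getattribute__ call in a list instead of printing. The calls list and the Recursive and Direct class names are hypothetical stand-ins for the two codings just described:

```python
calls = []                                       # Hypothetical trace log

class Recursive:
    def __init__(self, start):
        self.value = start
    def __getattribute__(self, attr):
        calls.append(attr)
        if attr == 'X':
            return self.value ** 2               # Reruns __getattribute__
        else:
            return object.__getattribute__(self, attr)

class Direct(Recursive):
    def __getattribute__(self, attr):
        calls.append(attr)
        if attr == 'X':
            return object.__getattribute__(self, 'value') ** 2
        else:
            return object.__getattribute__(self, attr)

Recursive(3).X
print(calls)                                     # ['X', 'value']
del calls[:]
Direct(3).X
print(calls)                                     # ['X']
```

The first version logs two fetches for a single X access, while the second logs just one.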

__getattr__ and __getattribute__ Compared

To summarize the coding differences between __getattr__ and __getattribute__, the following example uses both to implement three attributes—attr1 is a class attribute, attr2 is an instance attribute, and attr3 is a virtual managed attribute computed when fetched:

class GetAttr:
    attr1 = 1
    def __init__(self):
        self.attr2 = 2
    def __getattr__(self, attr):            # On undefined attrs only
        print('get: ' + attr)               # Not on attr1: inherited from class
        if attr == 'attr3':                 # Not on attr2: stored on instance
            return 3
        else:
            raise AttributeError(attr)

X = GetAttr()
print(X.attr1)
print(X.attr2)
print(X.attr3)
print('-'*20)

class GetAttribute(object):                 # (object) needed in 2.X only
    attr1 = 1
    def __init__(self):
        self.attr2 = 2
    def __getattribute__(self, attr):       # On all attr fetches
        print('get: ' + attr)               # Use superclass to avoid looping here
        if attr == 'attr3':
            return 3
        else:
            return object.__getattribute__(self, attr)

X = GetAttribute()
print(X.attr1)
print(X.attr2)
print(X.attr3)

When run, the __getattr__ version intercepts only attr3 accesses, because it is undefined. The __getattribute__ version, on the other hand, intercepts all attribute fetches and must route those it does not manage to the superclass fetcher to avoid loops:

c:\code> py -3 getattr-v-getattr.py
1
2
get: attr3
3
--------------------
get: attr1
1
get: attr2
2
get: attr3
3

Although __getattribute__ can catch more attribute fetches than __getattr__, in practice they are often just variations on a theme—if attributes are not physically stored, the two have the same effect.

Management Techniques Compared

To summarize the coding differences in all four attribute management schemes we’ve seen in this chapter, let’s quickly step through a somewhat more comprehensive computed-attribute example using each technique, coded to run in either Python 3.X or 2.X. The following first version uses properties to intercept and calculate attributes named square and cube. Notice how their base values are stored in names that begin with an underscore, so they don’t clash with the names of the properties themselves:

# Two dynamically computed attributes with properties

class Powers(object):                       # Need (object) in 2.X only
    def __init__(self, square, cube):
        self._square = square               # _square is the base value
        self._cube = cube                   # square is the property name

    def getSquare(self):
        return self._square ** 2
    def setSquare(self, value):
        self._square = value
    square = property(getSquare, setSquare)

    def getCube(self):
        return self._cube ** 3
    cube = property(getCube)

X = Powers(3, 4)
print(X.square)                             # 3 ** 2 = 9
print(X.cube)                               # 4 ** 3 = 64
X.square = 5
print(X.square)                             # 5 ** 2 = 25

To do the same with descriptors, we define the attributes with complete classes. Note that these descriptors store base values as instance state, so they must use leading underscores again so as not to clash with the names of descriptors; as we’ll see in the final example of this chapter, we could avoid this renaming requirement by storing base values as descriptor state instead, but that doesn’t as directly address data that must vary per client class instance:

# Same, but with descriptors (per-instance state)

class DescSquare(object):
    def __get__(self, instance, owner):
        return instance._square ** 2
    def __set__(self, instance, value):
        instance._square = value

class DescCube(object):
    def __get__(self, instance, owner):
        return instance._cube ** 3

class Powers(object):                       # Need all (object) in 2.X only
    square = DescSquare()
    cube = DescCube()
    def __init__(self, square, cube):
        self._square = square               # "self.square = square" works too,
        self._cube = cube                   # because it triggers desc __set__!

X = Powers(3, 4)
print(X.square)                             # 3 ** 2 = 9
print(X.cube)                               # 4 ** 3 = 64
X.square = 5
print(X.square)                             # 5 ** 2 = 25

To achieve the same result with __getattr__ fetch interception, we again store base values with underscore-prefixed names so that accesses to managed names are undefined and thus invoke our method; we also need to code a __setattr__ to intercept assignments, and take care to avoid its potential for looping:

# Same, but with generic __getattr__ undefined attribute interception

class Powers:
    def __init__(self, square, cube):
        self._square = square
        self._cube = cube

    def __getattr__(self, name):
        if name == 'square':
            return self._square ** 2
        elif name == 'cube':
            return self._cube ** 3
        else:
            raise AttributeError('unknown attr:' + name)   # So hasattr() works

    def __setattr__(self, name, value):
        if name == 'square':
            self.__dict__['_square'] = value               # Or use object
        else:
            self.__dict__[name] = value

X = Powers(3, 4)
print(X.square)                             # 3 ** 2 = 9
print(X.cube)                               # 4 ** 3 = 64
X.square = 5
print(X.square)                             # 5 ** 2 = 25

The final option, coding this with __getattribute__, is similar to the prior version. Because we catch every attribute now, though, we must also route base value fetches to a superclass to avoid looping or extra calls; fetching self._square directly works too, but runs a second __getattribute__ call:

# Same, but with generic __getattribute__ all attribute interception

class Powers(object):                       # Need (object) in 2.X only
    def __init__(self, square, cube):
        self._square = square
        self._cube = cube

    def __getattribute__(self, name):
        if name == 'square':
            return object.__getattribute__(self, '_square') ** 2
        elif name == 'cube':
            return object.__getattribute__(self, '_cube') ** 3
        else:
            return object.__getattribute__(self, name)

    def __setattr__(self, name, value):
        if name == 'square':
            object.__setattr__(self, '_square', value)     # Or use __dict__
        else:
            object.__setattr__(self, name, value)

X = Powers(3, 4)
print(X.square)                             # 3 ** 2 = 9
print(X.cube)                               # 4 ** 3 = 64
X.square = 5
print(X.square)                             # 5 ** 2 = 25

As you can see, each technique takes a different form in code, but all four produce the same result when run:

9
64
25

For more on how these alternatives compare, and other coding options, stay tuned for a more realistic application of them in the attribute validation example in the section Example: Attribute Validations. First, though, we need to take a short side trip to study a new-style-class pitfall associated with two of these tools—the generic attribute interceptors presented in this section.

Intercepting Built-in Operation Attributes

If you’ve been reading this book linearly, some of this section is review and elaboration on material covered earlier, especially in Chapter 32. For others, this topic is presented in this chapter’s context here.

When I introduced __getattr__ and __getattribute__, I stated that they intercept undefined and all attribute fetches, respectively, which makes them ideal for delegation-based coding patterns. While this is true for both normally named and explicitly called attributes, their behavior needs some additional clarification: for method-name attributes implicitly fetched by built-in operations, these methods may not be run at all. This means that operator overloading method calls cannot be delegated to wrapped objects unless wrapper classes somehow redefine these methods themselves.

For example, attribute fetches for the __str__, __add__, and __getitem__ methods run implicitly by printing, + expressions, and indexing, respectively, are not routed to the generic attribute interception methods in 3.X. Specifically:

§ In Python 3.X, neither __getattr__ nor __getattribute__ is run for such attributes.

§ In Python 2.X classic classes, __getattr__ is run for such attributes if they are undefined in the class.

§ In Python 2.X, __getattribute__ is available for new-style classes only and works as it does in 3.X.

In other words, in all Python 3.X classes (and 2.X new-style classes), there is no direct way to generically intercept built-in operations like printing and addition. In Python 2.X’s default classic classes, the methods such operations invoke are looked up at runtime in instances, like all other attributes; in Python 3.X’s new-style classes such methods are looked up in classes instead. Since 3.X mandates new-style classes and 2.X defaults to classic, this is understandably attributed to 3.X, but it can happen in 2.X new-style code too. In 2.X, though, you at least have a way to avoid this change; in 3.X, you do not.
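The difference between explicit and implicit fetches can be observed with a few lines of code. This sketch assumes Python 3.X, and its Proxy class is a hypothetical stand-in for any delegation-based wrapper:

```python
class Proxy:
    def __init__(self, wrapped):
        self.wrapped = wrapped
    def __getattr__(self, name):        # Runs for undefined names only
        print('getattr: ' + name)
        return getattr(self.wrapped, name)

X = Proxy([1, 2, 3])
print(X.__len__())                      # Explicit fetch by name: intercepted
try:
    len(X)                              # Implicit fetch: skips the instance
except TypeError as exc:
    print('TypeError:', exc)
```

The explicit call is routed through __getattr__ to the wrapped list, but the built-in len looks for __len__ in the class only, finds nothing, and fails.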

Per Chapter 32, the official (though tersely documented) rationale for this change appears to revolve around metaclasses and optimization of built-in operations. Regardless, given that all attributes, both normally named and others, still dispatch generically through the instance and these methods when accessed explicitly by name, this does not seem meant to preclude delegation in general; it seems more an optimization step for built-in operations’ implicit behavior. This does, however, make delegation-based coding patterns more complex in 3.X, because object interface proxies cannot generically intercept operator overloading method calls and route them to an embedded object.

This is an inconvenience, but is not necessarily a showstopper—wrapper classes can work around this constraint by redefining all relevant operator overloading methods in the wrapper itself, in order to delegate calls. These extra methods can be added either manually, with tools, or by definition in and inheritance from common superclasses. This does, however, make object wrappers more work than they used to be when operator overloading methods are a part of a wrapped object’s interface.
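
As a rough illustration of the manual workaround just described, here is a minimal sketch of a wrapper that redefines a few operator overloading methods itself. The Wrapper class, its _wrapped attribute, and the particular operators chosen are assumptions made up for this sketch, not code from this chapter's examples:

```python
class Wrapper:
    """Hypothetical proxy: forwards named attributes and selected operators."""
    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __getattr__(self, name):              # Normally named, undefined attrs only
        return getattr(self._wrapped, name)

    # Built-in operations skip __getattr__ in 3.X: redefine to delegate
    def __str__(self):
        return str(self._wrapped)

    def __add__(self, other):
        return self._wrapped + other

    def __getitem__(self, index):
        return self._wrapped[index]

    def __len__(self):
        return len(self._wrapped)


proxy = Wrapper([1, 2, 3])
proxy.append(4)                               # Routed through __getattr__
print(proxy)                                  # Routed through Wrapper.__str__
print(proxy + [5], proxy[0], len(proxy))      # Routed through the other redefinitions
```

Each operator the wrapped object supports needs its own method here, which is exactly the extra work at issue.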

Keep in mind that this issue applies only to __getattr__ and __getattribute__. Because properties and descriptors are defined for specific attributes only, they don’t really apply to delegation-based classes at all—a single property or descriptor cannot be used to intercept arbitrary attributes. Moreover, a class that defines both operator overloading methods and attribute interception will work correctly, regardless of the type of attribute interception defined. Our concern here is only with classes that do not have operator overloading methods defined, but try to intercept them generically.

Consider the following example, the file getattr-builtins.py, which tests various attribute types and built-in operations on instances of classes containing __getattr__ and __getattribute__ methods:

class GetAttr:
    eggs = 88                        # eggs stored on class, spam on instance
    def __init__(self):
        self.spam = 77
    def __len__(self):               # len here, else __getattr__ called with __len__
        print('__len__: 42')
        return 42
    def __getattr__(self, attr):     # Provide __str__ if asked, else dummy func
        print('getattr: ' + attr)
        if attr == '__str__':
            return lambda *args: '[Getattr str]'
        else:
            return lambda *args: None

class GetAttribute(object):          # object required in 2.X, implied in 3.X
    eggs = 88                        # In 2.X all are isinstance(object) auto
    def __init__(self):              # But must derive to get new-style tools,
        self.spam = 77               # incl __getattribute__, some __X__ defaults
    def __len__(self):
        print('__len__: 42')
        return 42
    def __getattribute__(self, attr):
        print('getattribute: ' + attr)
        if attr == '__str__':
            return lambda *args: '[GetAttribute str]'
        else:
            return lambda *args: None

for Class in GetAttr, GetAttribute:
    print('\n' + Class.__name__.ljust(50, '='))
    X = Class()
    X.eggs                           # Class attr
    X.spam                           # Instance attr
    X.other                          # Missing attr
    len(X)                           # __len__ defined explicitly
    # New-styles must support [], +, call directly: redefine
    try:    X[0]                     # __getitem__?
    except: print('fail []')
    try:    X + 99                   # __add__?
    except: print('fail +')
    try:    X()                      # __call__? (implicit via built-in)
    except: print('fail ()')
    X.__call__()                     # __call__? (explicit, not inherited)
    print(X.__str__())               # __str__? (explicit, inherited from type)
    print(X)                         # __str__? (implicit via built-in)

When run under Python 2.X as coded, __getattr__ does receive a variety of implicit attribute fetches for built-in operations, because Python looks up such attributes in instances normally. Conversely, __getattribute__ is not run for any of the operator overloading names invoked by built-in operations, because such names are looked up in classes only in the new-style class model:

c:\code> py -2 getattr-builtins.py

GetAttr===========================================

getattr: other

__len__: 42

getattr: __getitem__

getattr: __coerce__

getattr: __add__

getattr: __call__

getattr: __call__

getattr: __str__

[Getattr str]

getattr: __str__

[Getattr str]

GetAttribute======================================

getattribute: eggs

getattribute: spam

getattribute: other

__len__: 42

fail []

fail +

fail ()

getattribute: __call__

getattribute: __str__

[GetAttribute str]

<__main__.GetAttribute object at 0x02287898>

Note how __getattr__ intercepts both implicit and explicit fetches of __call__ and __str__ in 2.X here. By contrast, __getattribute__ fails to catch implicit fetches of either attribute name for built-in operations.

Really, the __getattribute__ case is the same in 2.X as it is in 3.X, because in 2.X classes must be made new-style by deriving from object to use this method. This code’s object derivation is optional in 3.X because all classes are new-style.

When run under Python 3.X, though, results for __getattr__ differ—none of the implicitly run operator overloading methods trigger either attribute interception method when their attributes are fetched by built-in operations. Python 3.X (and new-style classes in general) skips the normal instance lookup mechanism when resolving such names, though normally named methods are still intercepted as before:

c:\code> py -3 getattr-builtins.py

GetAttr===========================================

getattr: other

__len__: 42

fail []

fail +

fail ()

getattr: __call__

<__main__.GetAttr object at 0x02987CC0>

<__main__.GetAttr object at 0x02987CC0>

GetAttribute======================================

getattribute: eggs

getattribute: spam

getattribute: other

__len__: 42

fail []

fail +

fail ()

getattribute: __call__

getattribute: __str__

[GetAttribute str]

<__main__.GetAttribute object at 0x02987CF8>

Trace these outputs back to prints in the script to see how this works. Some highlights:

§ __str__ access fails to be caught twice by __getattr__ in 3.X: once for the built-in print, and once for explicit fetches because a default is inherited from the class (really, from the built-in object, which is an automatic superclass to every class in 3.X).

§ __str__ fails to be caught only once by the __getattribute__ catchall, during the built-in print operation; explicit fetches bypass the inherited version.

§ __call__ fails to be caught in both schemes in 3.X for built-in call expressions, but it is intercepted by both when fetched explicitly; unlike __str__, there is no inherited __call__ default for object instances to defeat __getattr__.

§ __len__ is caught by both classes, simply because it is an explicitly defined method in the classes themselves; its name is not routed to either __getattr__ or __getattribute__ in 3.X if we delete the class’s __len__ methods.

§ All other built-in operations fail to be intercepted by both schemes in 3.X.

Again, the net effect is that operator overloading methods implicitly run by built-in operations are never routed through either attribute interception method in 3.X: Python 3.X’s new-style classes search for such attributes in classes and skip instance lookup entirely. Normally named attributes do not.
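
To see the class-only lookup in isolation, the following minimal sketch (assuming Python 3.X; the Box class is a made-up name) attaches __len__ first to an instance and then to its class. Only the class-level version is visible to the len built-in:

```python
class Box:
    pass

box = Box()
box.__len__ = lambda: 3              # Stored on the instance

print(box.__len__())                 # Explicit fetch by name: instance attr is found
try:
    len(box)                         # Implicit fetch: class searched, instance skipped
except TypeError as exc:
    print('len failed:', exc)

Box.__len__ = lambda self: 5         # Stored on the class
print(len(box))                      # Now the built-in finds the class's version
```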

This makes delegation-based wrapper classes more difficult to code in 3.X’s new-style classes—if wrapped classes may contain operator overloading methods, those methods must be redefined redundantly in the wrapper class in order to delegate to the wrapped object. In general delegation tools, this can add dozens of extra methods.

Of course, the addition of such methods can be partly automated by tools that augment classes with new methods (the class decorators and metaclasses of the next two chapters might help here). Moreover, a superclass might be able to define all these extra methods once, for inheritance in delegation-based classes. Still, delegation coding patterns require extra work in 3.X’s classes.
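
One way such automation might look is sketched below, assuming Python 3.X. The add_delegates helper, the Proxy class, and the _wrapped attribute name are all hypothetical names invented for this illustration; the tools in the next chapters are coded differently:

```python
def add_delegates(cls, names, target='_wrapped'):
    """Hypothetical helper: give cls methods that forward to an embedded object."""
    for name in names:
        def method(self, *args, _name=name):      # _name default pins the loop value
            wrapped = getattr(self, target)
            return getattr(wrapped, _name)(*args)
        method.__name__ = name
        setattr(cls, name, method)                # Attach to the class, not an instance
    return cls


class Proxy:
    def __init__(self, wrapped):
        self._wrapped = wrapped
    def __getattr__(self, name):                  # Normally named attrs
        return getattr(self._wrapped, name)

add_delegates(Proxy, ['__str__', '__len__', '__getitem__', '__add__'])

p = Proxy([1, 2])
print(p, len(p), p[1], p + [3])                   # All four reach the embedded list
```

Because the generated methods are stored on the class, built-in operations find them by the same class-level lookup that bypasses __getattr__.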

For a more realistic illustration of this phenomenon as well as its workaround, see the Private decorator example in the following chapter. There, we’ll explore alternatives for coding the operator methods required of proxies in 3.X’s classes—including reusable mix-in superclass models. We’ll also see there that it’s possible to insert a __getattribute__ in the client class to retain its original type, although this method still won’t be called for operator overloading methods; printing still runs a __str__ defined in such a class directly, for example, instead of routing the request through __getattribute__.

As a more realistic example of this, the next section resurrects our class tutorial example. Now that you understand how attribute interception works, I’ll be able to explain one of its stranger bits.

Delegation-based managers revisited

The object-oriented tutorial of Chapter 28 presented a Manager class that used object embedding and method delegation to customize its superclass, rather than inheritance. Here is the code again for reference, with some irrelevant testing removed:

class Person:
    def __init__(self, name, job=None, pay=0):
        self.name = name
        self.job = job
        self.pay = pay
    def lastName(self):
        return self.name.split()[-1]
    def giveRaise(self, percent):
        self.pay = int(self.pay * (1 + percent))
    def __repr__(self):
        return '[Person: %s, %s]' % (self.name, self.pay)

class Manager:
    def __init__(self, name, pay):
        self.person = Person(name, 'mgr', pay)      # Embed a Person object
    def giveRaise(self, percent, bonus=.10):
        self.person.giveRaise(percent + bonus)      # Intercept and delegate
    def __getattr__(self, attr):
        return getattr(self.person, attr)           # Delegate all other attrs
    def __repr__(self):
        return str(self.person)                     # Must overload again (in 3.X)

if __name__ == '__main__':
    sue = Person('Sue Jones', job='dev', pay=100000)
    print(sue.lastName())
    sue.giveRaise(.10)
    print(sue)
    tom = Manager('Tom Jones', 50000)    # Manager.__init__
    print(tom.lastName())                # Manager.__getattr__ -> Person.lastName
    tom.giveRaise(.10)                   # Manager.giveRaise -> Person.giveRaise
    print(tom)                           # Manager.__repr__ -> Person.__repr__

Comments at the end of this file show which methods are invoked for a line’s operation. In particular, notice how lastName calls are undefined in Manager, and thus are routed into the generic __getattr__ and from there on to the embedded Person object. Here is the script’s output—Sue receives a 10% raise from Person, but Tom gets 20% because giveRaise is customized in Manager:

c:\code> py -3 getattr-delegate.py

Jones

[Person: Sue Jones, 110000]

Jones

[Person: Tom Jones, 60000]

By contrast, though, notice what occurs when we print a Manager at the end of the script: the wrapper class’s __repr__ is invoked, and it delegates to the embedded Person object’s __repr__. With that in mind, watch what happens if we delete the Manager.__repr__ method in this code:

# Delete the Manager __repr__ method

class Manager:
    def __init__(self, name, pay):
        self.person = Person(name, 'mgr', pay)      # Embed a Person object
    def giveRaise(self, percent, bonus=.10):
        self.person.giveRaise(percent + bonus)      # Intercept and delegate
    def __getattr__(self, attr):
        return getattr(self.person, attr)           # Delegate all other attrs

Now printing does not route its attribute fetch through the generic __getattr__ interceptor under Python 3.X’s new-style classes for Manager objects. Instead, a default __repr__ display method inherited from the class’s implicit object superclass is looked up and run (sue still prints correctly, because Person has an explicit __repr__):

c:\code> py -3 getattr-delegate.py

Jones

[Person: Sue Jones, 110000]

Jones

<__main__.Manager object at 0x029E7B70>

As coded, running without a __repr__ like this does trigger __getattr__ in Python 2.X’s default classic classes, because operator overloading attributes are routed through this method, and such classes do not inherit a default for __repr__:

c:\code> py -2 getattr-delegate.py

Jones

[Person: Sue Jones, 110000]

Jones

[Person: Tom Jones, 60000]

Switching to __getattribute__ won’t help 3.X here either—like __getattr__, it is not run for operator overloading attributes implied by built-in operations in either Python 2.X or 3.X:

# Replace __getattr__ with __getattribute__

class Manager(object):                               # Use "(object)" in 2.X
    def __init__(self, name, pay):
        self.person = Person(name, 'mgr', pay)       # Embed a Person object
    def giveRaise(self, percent, bonus=.10):
        self.person.giveRaise(percent + bonus)       # Intercept and delegate
    def __getattribute__(self, attr):
        print('**', attr)
        if attr in ['person', 'giveRaise']:
            return object.__getattribute__(self, attr)   # Fetch my attrs
        else:
            return getattr(self.person, attr)            # Delegate all others

Regardless of which attribute interception method is used in 3.X, we still must include a redefined __repr__ in Manager (as shown previously) in order to intercept printing operations and route them to the embedded Person object:

c:\code> py -3 getattr-delegate.py

Jones

[Person: Sue Jones, 110000]

** lastName

** person

Jones

** giveRaise

** person

<__main__.Manager object at 0x028E0590>

Notice that __getattribute__ gets called twice here for methods—once for the method name, and again for the self.person embedded object fetch. We could avoid that with a different coding, but we would still have to redefine __repr__ to catch printing, albeit differently here (self.person would cause this __getattribute__ to fail):

# Code __getattribute__ differently to minimize extra calls

class Manager:
    def __init__(self, name, pay):
        self.person = Person(name, 'mgr', pay)
    def __getattribute__(self, attr):
        print('**', attr)
        person = object.__getattribute__(self, 'person')
        if attr == 'giveRaise':
            return lambda percent: person.giveRaise(percent+.10)
        else:
            return getattr(person, attr)
    def __repr__(self):
        person = object.__getattribute__(self, 'person')
        return str(person)

When this alternative runs, our object prints properly, but only because we’ve added an explicit __repr__ in the wrapper—this attribute is still not routed to our generic attribute interception method:

Jones

[Person: Sue Jones, 110000]

** lastName

Jones

** giveRaise

[Person: Tom Jones, 60000]

The short story here is that delegation-based classes like Manager must redefine some operator overloading methods (like __repr__ and __str__) to route them to embedded objects in Python 3.X, but not in Python 2.X unless new-style classes are used. Our only direct options seem to be using __getattr__ and Python 2.X, or redefining operator overloading methods in wrapper classes redundantly in 3.X.

Again, this isn’t an impossible task; many wrappers can predict the set of operator overloading methods required, and tools and superclasses can automate part of this task. In fact, we’ll study coding patterns that can fill this need in the next chapter. Moreover, not all classes use operator overloading methods (indeed, most application classes usually should not). It is, however, something to keep in mind for delegation coding models used in Python 3.X; when operator overloading methods are part of an object’s interface, wrappers must accommodate them portably by redefining them locally.

Example: Attribute Validations

To close out this chapter, let’s turn to a more realistic example, coded in all four of our attribute management schemes. The example we will use defines a CardHolder object with four attributes, three of which are managed. The managed attributes validate or transform values when fetched or stored. All four versions produce the same results for the same test code, but they implement their attributes in very different ways. The examples are included largely for self-study; although I won’t go through their code in detail, they all use concepts we’ve already explored in this chapter.

Using Properties to Validate

Our first coding in the file that follows uses properties to manage three attributes. As usual, we could use simple methods instead of managed attributes, but properties help if we have been using attributes in existing code already. Properties run code automatically on attribute access, but are focused on a specific set of attributes; they cannot be used to intercept all attributes generically.

To understand this code, it’s crucial to notice that the attribute assignments inside the __init__ constructor method trigger property setter methods too. When this method assigns to self.name, for example, it automatically invokes the setName method, which transforms the value and assigns it to an instance attribute called __name so it won’t clash with the property’s name.

This renaming (sometimes called name mangling) is necessary because properties use common instance state and have none of their own. Data is stored in an attribute called __name, and the attribute called name is always a property, not data. As we saw in Chapter 31, names like __name are known as pseudoprivate attributes, and are changed by Python to include the enclosing class’s name when stored in the instance’s namespace; here, this helps keep the implementation-specific attributes distinct from others, including that of the property that manages them.
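
The mangling itself is easy to observe. In this small sketch (the Holder class and its reveal method are made-up names for illustration), a __name assignment inside the class lands in the instance under a class-prefixed key:

```python
class Holder:
    def __init__(self, name):
        self.__name = name               # Stored as _Holder__name

    def reveal(self):
        return self.__name               # Mangled the same way inside the class

h = Holder('bob')
print(sorted(vars(h)))                   # ['_Holder__name']
print(h.reveal())                        # bob
print(hasattr(h, '__name'))              # False: no attribute by the unmangled name
```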

In the end, this class manages attributes called name, age, and acct; allows the attribute addr to be accessed directly; and provides a read-only attribute called remain that is entirely virtual and computed on demand. For comparison purposes, this property-based coding weighs in at 39 lines of code, not counting its two initial lines, and includes the object derivation required in 2.X but optional in 3.X:

# File validate_properties.py

class CardHolder(object):                       # Need "(object)" for setter in 2.X
    acctlen = 8                                 # Class data
    retireage = 59.5

    def __init__(self, acct, name, age, addr):
        self.acct = acct                        # Instance data
        self.name = name                        # These trigger prop setters too!
        self.age = age                          # __X mangled to have class name
        self.addr = addr                        # addr is not managed
                                                # remain has no data
    def getName(self):
        return self.__name
    def setName(self, value):
        value = value.lower().replace(' ', '_')
        self.__name = value
    name = property(getName, setName)

    def getAge(self):
        return self.__age
    def setAge(self, value):
        if value < 0 or value > 150:
            raise ValueError('invalid age')
        else:
            self.__age = value
    age = property(getAge, setAge)

    def getAcct(self):
        return self.__acct[:-3] + '***'
    def setAcct(self, value):
        value = value.replace('-', '')
        if len(value) != self.acctlen:
            raise TypeError('invalid acct number')
        else:
            self.__acct = value
    acct = property(getAcct, setAcct)

    def remainGet(self):                        # Could be a method, not attr
        return self.retireage - self.age        # Unless already using as attr
    remain = property(remainGet)

Testing code

The following code, validate_tester.py, tests our class; run this script with the name of the class’s module (sans “.py”) as a single command-line argument (you could also add most of its test code to the bottom of each file, or interactively import it from a module after importing the class). We’ll use this same testing code for all four versions of this example. When it runs, it makes two instances of our managed-attribute class and fetches and changes their various attributes. Operations expected to fail are wrapped in try statements, and identical behavior on 2.X is supported by enabling the 3.X print function:

# File validate_tester.py
from __future__ import print_function # 2.X

def loadclass():
    import sys, importlib
    modulename = sys.argv[1]                          # Module name in command line
    module = importlib.import_module(modulename)      # Import module by name string
    print('[Using: %s]' % module.CardHolder)          # No need for getattr() here
    return module.CardHolder

def printholder(who):
    print(who.acct, who.name, who.age, who.remain, who.addr, sep=' / ')

if __name__ == '__main__':
    CardHolder = loadclass()
    bob = CardHolder('1234-5678', 'Bob Smith', 40, '123 main st')
    printholder(bob)
    bob.name = 'Bob Q. Smith'
    bob.age = 50
    bob.acct = '23-45-67-89'
    printholder(bob)

    sue = CardHolder('5678-12-34', 'Sue Jones', 35, '124 main st')
    printholder(sue)
    try:
        sue.age = 200
    except:
        print('Bad age for Sue')

    try:
        sue.remain = 5
    except:
        print("Can't set sue.remain")

    try:
        sue.acct = '1234567'
    except:
        print('Bad acct for Sue')

Here is the output of our self-test code on both Python 3.X and 2.X; again, this is the same for all versions of this example, except for the tested class’s name. Trace through this code to see how the class’s methods are invoked; accounts are displayed with some digits hidden, names are converted to a standard format, and time remaining until retirement is computed when fetched using a class attribute cutoff:

c:\code> py -3 validate_tester.py validate_properties

[Using: <class 'validate_properties.CardHolder'>]

12345*** / bob_smith / 40 / 19.5 / 123 main st

23456*** / bob_q._smith / 50 / 9.5 / 123 main st

56781*** / sue_jones / 35 / 24.5 / 124 main st

Bad age for Sue

Can't set sue.remain

Bad acct for Sue

Using Descriptors to Validate

Now, let’s recode our example using descriptors instead of properties. As we’ve seen, descriptors are very similar to properties in terms of functionality and roles; in fact, properties are basically a restricted form of descriptor. Like properties, descriptors are designed to handle specific attributes, not generic attribute access. Unlike properties, descriptors can also have their own state, and are a more general scheme.

Option 1: Validating with shared descriptor instance state

To understand the following code, it’s again important to notice that the attribute assignments inside the __init__ constructor method trigger descriptor __set__ methods. When the constructor method assigns to self.name, for example, it automatically invokes the Name.__set__() method, which transforms the value and assigns it to a descriptor attribute called name.

In the end, this class implements the same attributes as the prior version: it manages attributes called name, age, and acct; allows the attribute addr to be accessed directly; and provides a read-only attribute called remain that is entirely virtual and computed on demand. Notice how we must catch assignments to the remain name in its descriptor and raise an exception; as we learned earlier, if we did not do this, assigning to this attribute of an instance would silently create an instance attribute that hides the class attribute descriptor.
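
The hiding effect mentioned above can be seen with a pair of toy descriptors, assuming Python 3.X (or new-style classes in 2.X). The class names here are hypothetical and exist only for this sketch; the version without __set__ is silently overridden by an instance attribute, while the one with __set__ blocks the assignment:

```python
class ComputedNoSet:
    def __get__(self, instance, owner):
        return 'computed'

class ComputedReadOnly:
    def __get__(self, instance, owner):
        return 'computed'
    def __set__(self, instance, value):
        raise TypeError('cannot set')

class Client:
    loose = ComputedNoSet()              # No __set__: instance attrs can hide it
    strict = ComputedReadOnly()          # Has __set__: assignments are intercepted

c = Client()
c.loose = 'overwritten'                  # Creates an instance attribute
print(c.loose)                           # overwritten
try:
    c.strict = 'overwritten'
except TypeError as exc:
    print('blocked:', exc)
print(c.strict)                          # computed
```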

For comparison purposes, this descriptor-based coding takes 45 lines of code; I’ve added the required object derivation to the main descriptor classes for 2.X compatibility (they can be omitted for code to be run in 3.X only, but don’t hurt in 3.X, and aid portability if present):

# File validate_descriptors1.py: using shared descriptor state

class CardHolder(object):                        # Need all "(object)" in 2.X only
    acctlen = 8                                  # Class data
    retireage = 59.5

    def __init__(self, acct, name, age, addr):
        self.acct = acct                         # Instance data
        self.name = name                         # These trigger __set__ calls too!
        self.age = age                           # __X not needed: in descriptor
        self.addr = addr                         # addr is not managed
                                                 # remain has no data
    class Name(object):
        def __get__(self, instance, owner):      # Class names: CardHolder locals
            return self.name
        def __set__(self, instance, value):
            value = value.lower().replace(' ', '_')
            self.name = value
    name = Name()

    class Age(object):
        def __get__(self, instance, owner):
            return self.age                      # Use descriptor data
        def __set__(self, instance, value):
            if value < 0 or value > 150:
                raise ValueError('invalid age')
            else:
                self.age = value
    age = Age()

    class Acct(object):
        def __get__(self, instance, owner):
            return self.acct[:-3] + '***'
        def __set__(self, instance, value):
            value = value.replace('-', '')
            if len(value) != instance.acctlen:   # Use instance class data
                raise TypeError('invalid acct number')
            else:
                self.acct = value
    acct = Acct()

    class Remain(object):
        def __get__(self, instance, owner):
            return instance.retireage - instance.age    # Triggers Age.__get__
        def __set__(self, instance, value):
            raise TypeError('cannot set remain')        # Else set allowed here
    remain = Remain()

When run with the prior testing script, all examples in this section produce the same output as shown for properties earlier, except that the name of the class in the first line varies:

C:\code> python validate_tester.py validate_descriptors1

...same output as properties, except class name...

Option 2: Validating with per-client-instance state

Unlike in the prior property-based variant, though, in this case the actual name value is attached to the descriptor object, not the client class instance. Although we could store this value in either instance or descriptor state, the latter avoids the need to mangle names with underscores to avoid collisions. In the CardHolder client class, the attribute called name is always a descriptor object, not data.

Importantly, the downside of this scheme is that state stored inside a descriptor itself is class-level data that is effectively shared by all client class instances, and so cannot vary between them. That is, storing state in the descriptor instance instead of the owner (client) class instance means that the state will be the same in all owner class instances. Descriptor state can vary only per attribute appearance.

To see this at work, in the preceding descriptor-based CardHolder example, try printing attributes of the bob instance after creating the second instance, sue. The values of sue’s managed attributes (name, age, and acct) overwrite those of the earlier object bob, because both share the same, single descriptor instance attached to their class:

# File validate_tester2.py

from __future__ import print_function # 2.X

from validate_tester import loadclass

CardHolder = loadclass()

bob = CardHolder('1234-5678', 'Bob Smith', 40, '123 main st')

print('bob:', bob.name, bob.acct, bob.age, bob.addr)

sue = CardHolder('5678-12-34', 'Sue Jones', 35, '124 main st')

print('sue:', sue.name, sue.acct, sue.age, sue.addr) # addr differs: client data

print('bob:', bob.name, bob.acct, bob.age, bob.addr) # name,acct,age overwritten?

The results confirm the suspicion—in terms of managed attributes, bob has morphed into sue!

c:\code> py -3 validate_tester2.py validate_descriptors1

[Using: <class 'validate_descriptors1.CardHolder'>]

bob: bob_smith 12345*** 40 123 main st

sue: sue_jones 56781*** 35 124 main st

bob: sue_jones 56781*** 35 123 main st

There are valid uses for descriptor state, of course (to manage descriptor implementation and data that spans all instances), and this code was implemented to illustrate the technique. Moreover, the state scope implications of class versus instance attributes should be more or less a given at this point in the book.

However, in this particular use case, attributes of CardHolder objects are probably better stored as per-instance data instead of descriptor instance data, perhaps using the same __X naming convention as the property-based equivalent to avoid name clashes in the instance—a more important factor this time, as the client is a different class with its own state attributes. Here are the required coding changes; it doesn’t change line counts (we’re still at 45):

# File validate_descriptors2.py: using per-client-instance state

class CardHolder(object):                        # Need all "(object)" in 2.X only
    acctlen = 8                                  # Class data
    retireage = 59.5

    def __init__(self, acct, name, age, addr):
        self.acct = acct                         # Client instance data
        self.name = name                         # These trigger __set__ calls too!
        self.age = age                           # __X needed: in client instance
        self.addr = addr                         # addr is not managed
                                                 # remain managed but has no data
    class Name(object):
        def __get__(self, instance, owner):      # Class names: CardHolder locals
            return instance.__name
        def __set__(self, instance, value):
            value = value.lower().replace(' ', '_')
            instance.__name = value
    name = Name()                                # class.name vs mangled attr

    class Age(object):
        def __get__(self, instance, owner):
            return instance.__age                # Use client instance data
        def __set__(self, instance, value):
            if value < 0 or value > 150:
                raise ValueError('invalid age')
            else:
                instance.__age = value
    age = Age()                                  # class.age vs mangled attr

    class Acct(object):
        def __get__(self, instance, owner):
            return instance.__acct[:-3] + '***'
        def __set__(self, instance, value):
            value = value.replace('-', '')
            if len(value) != instance.acctlen:   # Use instance class data
                raise TypeError('invalid acct number')
            else:
                instance.__acct = value
    acct = Acct()                                # class.acct vs mangled name

    class Remain(object):
        def __get__(self, instance, owner):
            return instance.retireage - instance.age    # Triggers Age.__get__
        def __set__(self, instance, value):
            raise TypeError('cannot set remain')        # Else set allowed here
    remain = Remain()

This supports per-instance data for the name, age, and acct managed fields as expected (bob remains bob), and other tests work as before:

c:\code> py -3 validate_tester2.py validate_descriptors2

[Using: <class 'validate_descriptors2.CardHolder'>]

bob: bob_smith 12345*** 40 123 main st

sue: sue_jones 56781*** 35 124 main st

bob: bob_smith 12345*** 40 123 main st

c:\code> py -3 validate_tester.py validate_descriptors2

...same output as properties, except class name...

One small caveat here: as coded, this version doesn’t support through-class descriptor access, because such access passes a None to the instance argument (also notice the attribute __X name mangling to _Name__name in the error message when the fetch attempt is made):

>>> from validate_descriptors1 import CardHolder

>>> bob = CardHolder('1234-5678', 'Bob Smith', 40, '123 main st')

>>> bob.name

'bob_smith'

>>> CardHolder.name

'bob_smith'

>>> from validate_descriptors2 import CardHolder

>>> bob = CardHolder('1234-5678', 'Bob Smith', 40, '123 main st')

>>> bob.name

'bob_smith'

>>> CardHolder.name

AttributeError: 'NoneType' object has no attribute '_Name__name'

We could detect this with a minor amount of additional code to trigger the error more explicitly, but there’s probably no point—because this version stores data in the client instance, there’s no meaning to its descriptors unless they’re accompanied by a client instance (much like a normal unbound instance method). In fact, that’s really the entire point of this version’s change!
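
For the curious, the "minor amount of additional code" might look something like the following sketch. It is a simplified stand-in rather than the chapter's listing: the descriptor is coded outside the client class, stores its data under a plain _name key instead of a mangled __X name, and the error message text is my own invention:

```python
class Name:
    def __get__(self, instance, owner):
        if instance is None:                          # Fetched through the class
            raise AttributeError('name is available on instances only')
        return instance._name                         # Per-client-instance state
    def __set__(self, instance, value):
        instance._name = value.lower().replace(' ', '_')

class CardHolder:
    name = Name()
    def __init__(self, name):
        self.name = name                              # Triggers Name.__set__

bob = CardHolder('Bob Smith')
print(bob.name)                                       # bob_smith
try:
    CardHolder.name                                   # instance argument is None
except AttributeError as exc:
    print('class access:', exc)
```

A common alternative is to return the descriptor object itself (return self) when instance is None, which is what the built-in property does for through-class fetches.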

Because they are classes, descriptors are a useful and powerful tool, but they present choices that can deeply impact a program’s behavior. As always in OOP, choose your state retention policies carefully.

Using __getattr__ to Validate

As we’ve seen, the __getattr__ method intercepts all undefined attributes, so it can be more generic than using properties or descriptors. For our example, we simply test the attribute name to know when a managed attribute is being fetched; others are stored physically on the instance and so never reach __getattr__. Although this approach is more general than using properties or descriptors, extra work may be required to imitate the specific-attribute focus of other tools. We need to check names at runtime, and we must code a __setattr__ in order to intercept and validate attribute assignments.

As for the property and descriptor versions of this example, it’s critical to notice that the attribute assignments inside the __init__ constructor method trigger the class’s __setattr__ method too. When this method assigns to self.name, for example, it automatically invokes the __setattr__ method, which transforms the value and assigns it to an instance attribute called name. By storing name on the instance, it ensures that future accesses will not trigger __getattr__. In contrast, acct is stored as _acct, so that later accesses to acct do invoke __getattr__.

In the end, this class, like the prior two, manages attributes called name, age, and acct; allows the attribute addr to be accessed directly; and provides a read-only attribute called remain that is entirely virtual and is computed on demand.

For comparison purposes, this alternative comes in at 32 lines of code—7 fewer than the property-based version, and 13 fewer than the version using descriptors. Clarity matters more than code size, of course, but extra code can sometimes imply extra development and maintenance work. Probably more important here are roles: generic tools like __getattr__ may be better suited to generic delegation, while properties and descriptors are more directly designed to manage specific attributes.

Also note that the code here incurs extra calls when setting unmanaged attributes (e.g., addr), although no extra calls are incurred for fetching unmanaged attributes, since they are defined. Though this will likely result in negligible overhead for most programs, the more narrowly focused properties and descriptors incur an extra call only when managed attributes are accessed, and also appear in dir results when needed by generic tools.
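
The dir difference is simple to verify. In this minimal sketch (both classes are made-up names, and the 59.5 cutoff is borrowed from the example's retireage), a property-based remain shows up in dir results, but a remain computed in __getattr__ does not, even though both can be fetched:

```python
class WithProperty:
    def __init__(self):
        self.age = 40
    remain = property(lambda self: 59.5 - self.age)

class WithGetattr:
    def __init__(self):
        self.age = 40
    def __getattr__(self, name):                      # Undefined attrs only
        if name == 'remain':
            return 59.5 - self.age
        raise AttributeError(name)

print('remain' in dir(WithProperty()))                # True: class attribute
print('remain' in dir(WithGetattr()))                 # False: computed on demand
print(WithProperty().remain, WithGetattr().remain)    # 19.5 19.5
```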

Here’s the __getattr__ version of our validations code:

# File validate_getattr.py

class CardHolder:
    acctlen = 8                                  # Class data
    retireage = 59.5

    def __init__(self, acct, name, age, addr):
        self.acct = acct                         # Instance data
        self.name = name                         # These trigger __setattr__ too
        self.age  = age                          # _acct not mangled: name tested
        self.addr = addr                         # addr is not managed
                                                 # remain has no data
    def __getattr__(self, name):
        if name == 'acct':                           # On undefined attr fetches
            return self._acct[:-3] + '***'           # name, age, addr are defined
        elif name == 'remain':
            return self.retireage - self.age         # Doesn't trigger __getattr__
        else:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        if name == 'name':                           # On all attr assignments
            value = value.lower().replace(' ', '_')  # addr stored directly
        elif name == 'age':                          # acct mangled to _acct
            if value < 0 or value > 150:
                raise ValueError('invalid age')
        elif name == 'acct':
            name  = '_acct'
            value = value.replace('-', '')
            if len(value) != self.acctlen:
                raise TypeError('invalid acct number')
        elif name == 'remain':
            raise TypeError('cannot set remain')
        self.__dict__[name] = value                  # Avoid looping (or via object)

When this code is run with either test script, it produces the same output (with a different class name):

c:\code> py -3 validate_tester.py validate_getattr
...same output as properties, except class name...

c:\code> py -3 validate_tester2.py validate_getattr
...same output as instance-state descriptors, except class name...

Using __getattribute__ to Validate

Our final variant uses the __getattribute__ catchall to intercept attribute fetches and manage them as needed. Every attribute fetch is caught here, so we test the attribute names to detect managed attributes and route all others to the superclass for normal fetch processing. This version uses the same __setattr__ to catch assignments as the prior version.

The code works very much like the __getattr__ version, so I won’t repeat the full description here. Note, though, that because every attribute fetch is routed to __getattribute__, we don’t need to mangle names to intercept them here (acct is stored as acct). On the other hand, this code must take care to route nonmanaged attribute fetches to a superclass to avoid looping or extra calls.
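To see why the routing matters, consider the following sketch. Both classes are hypothetical and manage only acct; the first fetches through self inside __getattribute__ and loops, while the second routes through object. The sketch assumes Python 3.5 or later, where the loop surfaces as a RecursionError.

```python
class Loops:
    """Hypothetical class whose __getattribute__ fetches through self."""
    def __init__(self):
        self.acct = '12345678'                # Default __setattr__: stored as acct

    def __getattribute__(self, name):
        if name == 'acct':
            return self.acct[:-3] + '***'     # self.acct re-enters this method
        return object.__getattribute__(self, name)


class Routes:
    """Hypothetical class that routes fetches to the superclass."""
    def __init__(self):
        self.acct = '12345678'

    def __getattribute__(self, name):
        superget = object.__getattribute__    # Skips this class's version
        if name == 'acct':
            return superget(self, 'acct')[:-3] + '***'
        return superget(self, name)


try:
    Loops().acct
except RecursionError as exc:
    print('looped:', type(exc).__name__)      # looped: RecursionError

print(Routes().acct)                          # 12345***
```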

Also notice that this version incurs extra calls for both setting and fetching unmanaged attributes (e.g., addr); if speed is paramount, this alternative may be the slowest of the bunch. For comparison purposes, this version amounts to 32 lines of code, just like the prior version, and includes the requisite object derivation for 2.X compatibility; like properties and descriptors, __getattribute__ is a new-style class tool:

# File validate_getattribute.py

class CardHolder(object):                        # Need "(object)" in 2.X only
    acctlen = 8                                  # Class data
    retireage = 59.5

    def __init__(self, acct, name, age, addr):
        self.acct = acct                         # Instance data
        self.name = name                         # These trigger __setattr__ too
        self.age  = age                          # acct not mangled: name tested
        self.addr = addr                         # addr is not managed
                                                 # remain has no data
    def __getattribute__(self, name):
        superget = object.__getattribute__             # Don't loop: one level up
        if name == 'acct':                             # On all attr fetches
            return superget(self, 'acct')[:-3] + '***'
        elif name == 'remain':
            return superget(self, 'retireage') - superget(self, 'age')
        else:
            return superget(self, name)                # name, age, addr: stored

    def __setattr__(self, name, value):
        if name == 'name':                             # On all attr assignments
            value = value.lower().replace(' ', '_')    # addr stored directly
        elif name == 'age':
            if value < 0 or value > 150:
                raise ValueError('invalid age')
        elif name == 'acct':
            value = value.replace('-', '')
            if len(value) != self.acctlen:
                raise TypeError('invalid acct number')
        elif name == 'remain':
            raise TypeError('cannot set remain')
        self.__dict__[name] = value                    # Avoid loops, orig names

Both the __getattr__ and __getattribute__ scripts work the same as the property and per-client-instance descriptor versions when run by both tester scripts on either 2.X or 3.X. That gives us four ways to achieve the same goal in Python, though they vary in structure, and are perhaps less redundant in some other roles. Be sure to study and run this section’s code on your own for more pointers on managed attribute coding techniques.
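As a compact recap, the four techniques can be set side by side for a single managed attribute. This is only a sketch: the clean and mask helpers and all four class names are hypothetical, the account length of 8 is hardcoded, and just the acct attribute is managed, so it is not a substitute for the full versions in this section.

```python
def clean(value):
    """Hypothetical helper: strip dashes and check the length."""
    value = value.replace('-', '')
    if len(value) != 8:
        raise TypeError('invalid acct number')
    return value

def mask(value):
    """Hypothetical helper: hide the last three characters."""
    return value[:-3] + '***'


class PropHolder:
    def __init__(self, acct):
        self.acct = acct                          # Runs the property setter

    @property
    def acct(self):
        return mask(self._acct)

    @acct.setter
    def acct(self, value):
        self._acct = clean(value)


class AcctDescriptor:
    def __get__(self, instance, owner):
        return self if instance is None else mask(instance._acct)

    def __set__(self, instance, value):
        instance._acct = clean(value)             # State kept on the client instance


class DescHolder:
    acct = AcctDescriptor()

    def __init__(self, acct):
        self.acct = acct                          # Runs AcctDescriptor.__set__


class GetattrHolder:
    def __init__(self, acct):
        self.acct = acct                          # Runs __setattr__

    def __getattr__(self, name):
        if name == 'acct':                        # acct itself is never stored
            return mask(self._acct)
        raise AttributeError(name)

    def __setattr__(self, name, value):
        if name == 'acct':
            name, value = '_acct', clean(value)
        self.__dict__[name] = value               # Avoid looping


class GetattributeHolder:
    def __init__(self, acct):
        self.acct = acct                          # Runs __setattr__

    def __getattribute__(self, name):
        value = object.__getattribute__(self, name)   # Route to superclass
        return mask(value) if name == 'acct' else value

    def __setattr__(self, name, value):
        if name == 'acct':
            value = clean(value)
        self.__dict__[name] = value               # Stored under its original name


for cls in (PropHolder, DescHolder, GetattrHolder, GetattributeHolder):
    obj = cls('1234-5678')
    print(cls.__name__, obj.acct)                 # Each prints 12345***
```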

Chapter Summary

This chapter covered the various techniques for managing access to attributes in Python, including the __getattr__ and __getattribute__ operator overloading methods, class properties, and class attribute descriptors. Along the way, it compared and contrasted these tools and presented a handful of use cases to demonstrate their behavior.

Chapter 39 continues our tool-building survey with a look at decorators—code run automatically at function and class creation time, rather than on attribute access. Before we continue, though, let’s work through a set of questions to review what we’ve covered here.

Test Your Knowledge: Quiz

1. How do __getattr__ and __getattribute__ differ?

2. How do properties and descriptors differ?

3. How are properties and decorators related?

4. What are the main functional differences between __getattr__ and __getattribute__ and properties and descriptors?

5. Isn’t all this feature comparison just a kind of argument?

Test Your Knowledge: Answers

1. The __getattr__ method is run for fetches of undefined attributes only (i.e., those not present on an instance and not inherited from any of its classes). By contrast, the __getattribute__ method is called for every attribute fetch, whether the attribute is defined or not. Because of this, code inside a __getattr__ can freely fetch other attributes if they are defined, whereas __getattribute__ must use special code for all such attribute fetches to avoid looping or extra calls (it must route fetches to a superclass to skip itself).

2. Properties serve a specific role, while descriptors are more general. Properties define get, set, and delete functions for a specific attribute; descriptors provide a class with methods for these actions, too, but they provide extra flexibility to support more arbitrary actions. In fact, properties are really a simple way to create a specific kind of descriptor—one that runs functions on attribute accesses. Coding differs too: a property is created with a built-in function, and a descriptor is coded with a class; thus, descriptors can leverage all the usual OOP features of classes, such as inheritance. Moreover, in addition to the instance’s state information, descriptors have local state of their own, so they can sometimes avoid name collisions in the instance.

3. Properties can be coded with decorator syntax. Because the property built-in accepts a single function argument, it can be used directly as a function decorator to define a fetch access property. Due to the name rebinding behavior of decorators, the name of the decorated function is assigned to a property whose get accessor is set to the original function decorated (name = property(name)). Property setter and deleter attributes allow us to further add set and delete accessors with decoration syntax—they set the accessor to the decorated function and return the augmented property.

4. The __getattr__ and __getattribute__ methods are more generic: they can be used to catch arbitrarily many attributes. In contrast, each property or descriptor provides access interception for only one specific attribute—we can’t catch every attribute fetch with a single property or descriptor. On the other hand, properties and descriptors handle both attribute fetch and assignment by design: __getattr__ and __getattribute__ handle fetches only; to intercept assignments as well, __setattr__ must also be coded. The implementation is also different: __getattr__ and __getattribute__ are operator overloading methods, whereas properties and descriptors are objects manually assigned to class attributes. Unlike the others, properties and descriptors can also sometimes avoid extra calls on assignment to unmanaged names, and show up in dir results automatically, but are also narrower in scope—they can’t address generic dispatch goals. In Python evolution, new features tend to offer alternatives, but do not fully subsume what came before.

5. No it isn’t. To quote from Python namesake Monty Python’s Flying Circus:

    An argument is a connected series of statements intended to establish a proposition.
    No it isn't.
    Yes it is! It's not just contradiction.
    Look, if I argue with you, I must take up a contrary position.
    Yes, but that's not just saying "No it isn't."
    Yes it is!
    No it isn't!
    Yes it is!
    No it isn't. Argument is an intellectual process. Contradiction is just the automatic gainsaying of any statement the other person makes.
    (short pause) No it isn't.
    It is.
    Not at all.
    Now look...