Monday: The Object Model - Metaprogramming Ruby - Metaprogramming Ruby 2: Program Like the Ruby Pros (2014)

Part 1. Metaprogramming Ruby

Chapter 2. Monday: The Object Model

Glance at any Ruby program, and you’ll see objects everywhere. Do a double take, and you’ll see that objects are citizens of a larger world that also includes other language constructs, such as classes, modules, and instance variables. Metaprogramming manipulates these language constructs, so you need to know a few things about them right off the bat.

You are about to dig into the first concept: all these constructs live together in a system called the object model. The object model is where you’ll find answers to questions such as “Which class does this method come from?” and “What happens when I include this module?” Delving into the object model, at the very heart of Ruby, you’ll learn some powerful techniques, and you’ll also learn how to steer clear of a few pitfalls.

Monday promises to be a full day, so silence your messaging app, grab a donut, and get ready to start.

Open Classes

Where you refactor some legacy code and learn a trick or two along the way.

Welcome to your new job as a Ruby programmer. After you’ve settled yourself at your new desk with a shiny, latest-generation computer and a cup of coffee, you can meet Bill, your mentor. Yes, you have your first assignment at your new company, a new language to work with, and a new pair-programming buddy.

You’ve only been using Ruby for a few weeks, but Bill is there to help you. He has plenty of Ruby experience, and he looks like a nice guy. You’re going to have a good time working with him—at least until your first petty fight over coding conventions.

The boss wants you and Bill to review the source of a small application called Bookworm. The company developed Bookworm to manage its large internal library of books. The program has slowly grown out of control as many different developers have added their pet features to the mix, from text previews to magazine management and the tracking of borrowed books. As a result, the Bookworm source code is a bit of a mess. You and Bill have been selected to go through the code and clean it up. The boss called it “just an easy refactoring job.”

You and Bill have been browsing through the Bookworm source code for a few minutes when you spot your first refactoring opportunity. Bookworm has a function that formats book titles for printing on old-fashioned tape labels. It strips all punctuation and special characters out of a string, leaving only alphanumeric characters and spaces:

object_model/alphanumeric.rb

def to_alphanumeric(s)
  s.gsub(/[^\w\s]/, '')
end

This method also comes with its own unit test (remember to gem install test-unit before you try to run it on Ruby 2.2 and later):

require 'test/unit'

class ToAlphanumericTest < Test::Unit::TestCase
  def test_strip_non_alphanumeric_characters
    assert_equal '3 the Magic Number', to_alphanumeric('#3, the *Magic, Number*?')
  end
end

“This to_alphanumeric method is not very object oriented, is it?” Bill says. “This is generic functionality that makes sense for all strings. It’d be better if we could ask a String to convert itself, rather than pass it through an external method.”

Even though you’re the new guy on the block, you can’t help but interrupt. “But this is just a regular String. To add methods to it, we’d have to write a whole new AlphanumericString class. I’m not sure it would be worth it.”

“I think I have a simpler solution to this problem,” Bill replies. He opens the String class and plants the to_alphanumeric method there:

class String
  def to_alphanumeric
    gsub(/[^\w\s]/, '')
  end
end

Bill also changes the callers to use String#to_alphanumeric. For example, the test becomes as follows:

require 'test/unit'

class StringExtensionsTest < Test::Unit::TestCase
  def test_strip_non_alphanumeric_characters
    assert_equal '3 the Magic Number', '#3, the *Magic, Number*?'.to_alphanumeric
  end
end

To understand the previous trick, you need to know a thing or two about Ruby classes. Bill is only too happy to teach you….

Inside Class Definitions

In Ruby, there is no real distinction between code that defines a class and code of any other kind. You can put any code you want in a class definition:

3.times do
  class C
    puts "Hello"
  end
end

<=

Hello

Hello

Hello

Ruby executed the code within the class just as it would execute any other code. Does that mean you defined three classes with the same name? The answer is no, as you can quickly find out yourself:

class D
  def x; 'x'; end
end

class D
  def y; 'y'; end
end

obj = D.new
obj.x # => "x"
obj.y # => "y"

When the previous code mentions class D for the first time, no class by that name exists yet. So, Ruby steps in and defines the class—and the x method. At the second mention, class D already exists, so Ruby doesn’t need to define it. Instead, it reopens the existing class and defines a method named y there.
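If you want to convince yourself that the second mention reopens the class rather than replacing it, a quick check like the following can help. It is only a sketch: Gadget is a hypothetical class name chosen to stay clear of the examples in this chapter.

```ruby
# Gadget is a hypothetical class name used only for this sketch.
class Gadget
  def x; 'x'; end
end

id_after_first_mention = Gadget.object_id

class Gadget            # second mention: reopens the existing class
  def y; 'y'; end
end

# The constant still points at the very same object...
same_class = (Gadget.object_id == id_after_first_mention)
# ...and that object now carries both methods.
own_methods = Gadget.instance_methods(false).sort

same_class   # => true
own_methods  # => [:x, :y]
```

The object id stays the same across both class blocks, which suggests there is one class object that simply gained a second method.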

In a sense, the class keyword in Ruby is more like a scope operator than a class declaration. Yes, it creates classes that don’t yet exist, but you might argue that it does this as a pleasant side effect. For class, the core job is to move you into the context of the class, where you can define methods.

This distinction about the class keyword is not an academic detail. It has an important practical consequence: you can always reopen existing classes—even standard library classes such as String or Array—and modify them on the fly. You can call this technique Spell: Open Class.

To see how people use Open Classes in practice, let’s look at a quick example from a real-life library.

The Money Example

You can find an example of Open Classes in the money gem, a set of utility classes for managing money and currencies. Here’s how you create a Money object:

object_model/money_example.rb

require "money"

bargain_price = Money.from_numeric(99, "USD")

bargain_price.format # => "$99.00"

As a shortcut, you can also convert any number to a Money object by calling Numeric#to_money:

object_model/money_example.rb

require "money"

standard_price = 100.to_money("USD")

standard_price.format # => "$100.00"

Since Numeric is a standard Ruby class, you might wonder where the method Numeric#to_money comes from. Look through the source of the Money gem, and you’ll find code that reopens Numeric and defines that method:

class Numeric
  def to_money(currency = nil)
    Money.from_numeric(self, currency || Money.default_currency)
  end
end

It’s quite common for libraries to use Open Classes this way.

As cool as they are, however, Open Classes have a dark side—one that you’re about to experience.

The Problem with Open Classes

You and Bill don’t have to look much further before you stumble upon another opportunity to use Open Classes. The Bookworm source contains a method that replaces elements in an array:

object_model/replace.rb

def replace(array, original, replacement)
  array.map {|e| e == original ? replacement : e }
end

Instead of focusing on the internal workings of replace, you can look at Bookworm’s unit tests to see how that method is supposed to be used:

def test_replace
  original = ['one', 'two', 'one', 'three']
  replaced = replace(original, 'one', 'zero')
  assert_equal ['zero', 'two', 'zero', 'three'], replaced
end

This time, you know what to do. You grab the keyboard (taking advantage of Bill’s slower reflexes) and move the method to the Array class:

class Array
  def replace(original, replacement)
    self.map {|e| e == original ? replacement : e }
  end
end

Then you change all calls to replace into calls to Array#replace. For example, the test becomes as follows:

def test_replace
  original = ['one', 'two', 'one', 'three']
  replaced = original.replace('one', 'zero')
  assert_equal ['zero', 'two', 'zero', 'three'], replaced
end

You save the test, you run Bookworm’s unit test suite, and...whoops! While test_replace does pass, other tests unexpectedly fail. To make things more perplexing, the failing tests seem to have nothing to do with the code you just edited. What gives?

“I think I know what happened,” Bill says. He fires up irb, the interactive Ruby interpreter, and gets a list of all methods in Ruby’s standard Array that begin with re:

[].methods.grep /^re/ # => [:reverse_each, :reverse, ..., :replace, ...]

In looking at the irb output, you spot the problem. Class Array already has a method named replace. When you defined your own replace method, you inadvertently overwrote the original replace, a method that some other part of Bookworm was relying on.

This is the dark side to Open Classes: if you casually add bits and pieces of functionality to classes, you can end up with bugs like the one you just encountered. Some people would frown upon this kind of reckless patching of classes, and they would refer to the previous code with a derogatory name: they’d call it a Spell: Monkeypatch.

Now that you know what the problem is, you and Bill rename your own version of Array#replace to Array#substitute and fix both the tests and the calling code. You learned a lesson the hard way, but that didn’t spoil your attitude. If anything, this incident piqued your curiosity about Ruby classes. It’s time for you to learn the truth about them.

Is Monkeypatching Evil?

In the previous section, you learned that Monkeypatch is a derogatory term. However, the same term is sometimes used in a positive sense, to refer to Open Classes (Open Class) in general. You might argue that there are two types of Monkeypatches (Monkeypatch). Some happen by mistake, like the one that you and Bill experienced, and they’re invariably evil. Others are applied on purpose, and they’re quite useful—especially when you want to bend an existing library to your needs.

Even when you think you’re in control, you should still Monkeypatch with care. Like any other global modification, Monkeypatches can be difficult to track in a large code base. To minimize the dangers of Monkeypatches, carefully check the existing methods in a class before you define your own methods. Also, be aware that some changes are riskier than others. For example, adding a new method is usually safer than modifying an existing one.
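One way to "carefully check the existing methods" is to ask the class before you patch it. The helper below is a minimal sketch, not part of Bookworm or of Ruby itself: safe_to_add? is a hypothetical name, and it relies only on Module#method_defined? and Module#private_method_defined?.

```ruby
# Hypothetical helper: true when klass has no public, protected, or
# private instance method called name (inherited methods included).
def safe_to_add?(klass, name)
  !klass.method_defined?(name) && !klass.private_method_defined?(name)
end

replace_is_free    = safe_to_add?(Array, :replace)     # Array#replace already exists
substitute_is_free = safe_to_add?(Array, :substitute)  # no such method on a stock Ruby

replace_is_free     # => false
substitute_is_free  # => true
```

A check like this would have flagged the clash with Array#replace before any test failed. It only sees methods that exist when it runs, so a library loaded later can still collide with your patch.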

You’ll see alternatives to Monkeypatching throughout the book. In particular, we will see soon that you can make Monkeypatches safer by using Refinements (Refinement). Unfortunately, Refinements are still a new feature, and there is no guarantee that they’ll ever completely replace traditional Monkeypatches.

Inside the Object Model

Where you learn surprising facts about objects, classes, and constants.

Your recent experience with Open Classes (Open Class) hints that there is more to Ruby classes than meets the eye. Much more, actually. Some of the truths about Ruby classes and the object model in general might even come as a bit of a shock when you first uncover them.

There is a lot to learn about the object model, but don’t let all this theory put you off. If you understand the truth about classes and objects, you’ll be well on your way to being a master of metaprogramming. Let’s start with the basics: objects.

What’s in an Object

Imagine running this code:

class MyClass
  def my_method
    @v = 1
  end
end

obj = MyClass.new
obj.class # => MyClass

Look at the obj object. If you could open the Ruby interpreter and look into obj, what would you see?

Instance Variables

Most importantly, objects contain instance variables. Even though you’re not really supposed to peek at them, you can do that anyway by calling Object#instance_variables. The object from the previous example has just a single instance variable:

obj.my_method

obj.instance_variables # => [:@v]

Unlike in Java or other static languages, in Ruby there is no connection between an object’s class and its instance variables. Instance variables just spring into existence when you assign them a value, so you can have objects of the same class that carry different instance variables. For example, if you hadn’t called obj.my_method, then obj would have no instance variable at all. You can think of the names and values of instance variables as keys and values in a hash. Both the keys and the values can be different for each object.
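To see two objects of the same class carrying different instance variables, you could try a sketch along these lines. It reuses MyClass from the example above; the variable name @w is made up for illustration.

```ruby
class MyClass
  def my_method
    @v = 1
  end
end

first  = MyClass.new
second = MyClass.new

first.my_method                        # assigns @v on first only
second.instance_variable_set(:@w, 2)   # gives second a different variable

first_vars  = first.instance_variables
second_vars = second.instance_variables

first_vars   # => [:@v]
second_vars  # => [:@w]
```

Both objects share the same class, yet each one holds its own, different set of variables.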

That’s all there is to know about instance variables. Let’s move on to methods.

Methods

Besides having instance variables, objects also have methods. You can get a list of an object’s methods by calling Object#methods. Most objects (including obj in the previous example) inherit a number of methods from Object, so this list of methods is usually quite long. You can use Array#grep to check that my_method is in obj’s list:

obj.methods.grep(/my/) # => [:my_method]

If you could pry open the Ruby interpreter and look into obj, you’d notice that this object doesn’t really carry a list of methods. An object contains its instance variables and a reference to a class (because every object belongs to a class, or—in OO speak—is an instance of a class)…but no methods. Where are the methods?

Your pair-programming buddy Bill walks over to the nearest whiteboard and starts scribbling all over it. “Think about it for a minute,” he says, drawing the following diagram. “Objects that share the same class also share the same methods, so the methods must be stored in the class, not the object.”

images/chp1_vars_and_methods.jpg


Figure 1. Instance variables live in objects; methods live in classes.

Before going on, you should be aware of one important distinction about methods. You can rightly say that “obj has a method called my_method,” meaning that you’re able to call obj.my_method(). By contrast, you shouldn’t say that “MyClass has a method named my_method.” That would be confusing, because it would imply that you’re able to call MyClass.my_method() as if it were a class method.

To remove the ambiguity, you should say that my_method is an instance method (not just “a method”) of MyClass, meaning that it’s defined in MyClass, and you actually need an object (or instance) of MyClass to call it. It’s the same method, but when you talk about the class, you call it an instance method, and when you talk about the object, you simply call it a method. Remember this distinction, and you won’t get confused when writing introspective code like this:

String.instance_methods == "abc".methods # => true

String.methods == "abc".methods # => false

Let’s wrap it all up: an object’s instance variables live in the object itself, and an object’s methods live in the object’s class. That’s why objects of the same class share methods but don’t share instance variables.
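The same distinction can be checked on MyClass itself. This short sketch uses only standard introspection calls:

```ruby
class MyClass
  def my_method
    @v = 1
  end
end

obj = MyClass.new

object_has_method   = obj.respond_to?(:my_method)      # obj has the method
defined_in_class    = MyClass.instance_methods(false)  # instance methods defined in MyClass
class_has_method    = MyClass.respond_to?(:my_method)  # it is not a class method

object_has_method  # => true
defined_in_class   # => [:my_method]
class_has_method   # => false
```

You can call my_method on obj, the method is listed among the instance methods of MyClass, and MyClass itself does not respond to it.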

That’s all you really have to know about objects, instance variables, and methods. But since we brought classes into the picture, we can also take a closer look at them.

The Truth About Classes

Here is possibly the most important thing you’ll ever learn about the Ruby object model: classes themselves are nothing but objects.

Because a class is an object, everything that applies to objects also applies to classes. Classes, like any object, have their own class, called—you guessed it—Class:

"hello".class # => String

String.class # => Class

You might be familiar with Class from other object-oriented languages. In languages such as Java, however, an instance of Class is just a read-only description of the class. By contrast, a Class in Ruby is quite literally the class itself, and you can manipulate it like you would manipulate any other object. For example, in Chapter 5, Thursday: Class Definitions, you’ll see that you can call Class.new to create new classes while your program is running. This flexibility is typical of Ruby’s metaprogramming: while other languages allow you to read class-related information, Ruby allows you to write that information at runtime.
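As a small taste of what "writing that information at runtime" can look like, here is a hedged sketch of Class.new. Greeter and greet are hypothetical names, not taken from Bookworm.

```ruby
# Class.new returns an anonymous class; assigning it to a constant
# gives it a name. Greeter is a hypothetical name for this sketch.
Greeter = Class.new do
  def greet
    'hello'
  end
end

greeter_class = Greeter.class      # the new class is an instance of Class
greeter_name  = Greeter.name       # named after the constant it was assigned to
greeting      = Greeter.new.greet  # and it can be instantiated like any class

greeter_class  # => Class
greeter_name   # => "Greeter"
greeting       # => "hello"
```

Nothing here uses the class keyword: the class is created by an ordinary method call while the program runs.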

Like any object, classes also have methods. Remember what you learned in What’s in an Object? The methods of an object are also the instance methods of its class. In turn, this means that the methods of a class are the instance methods of Class:

# The "false" argument here means: ignore inherited methods

Class.instance_methods(false) # => [:allocate, :new, :superclass]

You already know about new because you use it all the time to create objects. The allocate method plays a supporting role to new. Chances are, you’ll never need to care about it.

On the other hand, you’ll use the superclass method a lot. This method is related to a concept that you’re probably familiar with: inheritance. A Ruby class inherits from its superclass. Have a look at the following code:

Array.superclass # => Object

Object.superclass # => BasicObject

BasicObject.superclass # => nil

The Array class inherits from Object, which is the same as saying “an array is an object.” Object contains methods that are generally useful for any object—such as to_s, which converts an object to a string. In turn, Object inherits from BasicObject, the root of the Ruby class hierarchy, which contains only a few essential methods. (You will learn more about BasicObject later in the book.)
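If you’d rather not call superclass by hand, a few lines of Ruby can walk the chain for you. superclass_chain is a hypothetical helper written for this sketch, not a built-in method.

```ruby
# Hypothetical helper: follows superclass links until it hits nil,
# collecting each class along the way.
def superclass_chain(klass)
  chain = []
  while klass
    chain << klass
    klass = klass.superclass
  end
  chain
end

array_chain = superclass_chain(Array)
array_chain  # => [Array, Object, BasicObject]
```

The loop stops at BasicObject because BasicObject.superclass returns nil.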

While talking about superclasses, we can ask ourselves one more question: what is the superclass of Class?

Modules

Take a deep breath and check out the superclass of the Class class itself:

Class.superclass # => Module

The superclass of Class is Module—which is to say, every class is also a module. To be precise, a class is a module with three additional instance methods (new, allocate, and superclass) that allow you to create objects or arrange classes into hierarchies.

Indeed, classes and modules are so closely related that Ruby could easily get away with a single “thing” that plays both roles. The main reason for having a distinction between modules and classes is clarity: by carefully picking either a class or a module, you can make your code more explicit. Usually, you pick a module when you mean it to be included somewhere, and you pick a class when you mean it to be instantiated or inherited. So, although you can use classes and modules interchangeably in many situations, you’ll probably want to make your intentions clear by using them for different purposes.
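You can probe the class/module relationship from irb with a handful of standard calls. Treat this as a sketch: newer Ruby versions may list a few more instance methods on Class than the three named above, but the checks below should hold regardless.

```ruby
class_parent       = Class.superclass            # Class inherits from Module
string_is_module   = String.is_a?(Module)        # so every class is also a module
kernel_is_class    = Kernel.is_a?(Class)         # but a plain module is not a class

module_defines_new = Module.method_defined?(:new)  # modules get no new method...
class_defines_new  = Class.method_defined?(:new)   # ...classes do
kernel_has_new     = Kernel.respond_to?(:new)      # so a module can't be instantiated

class_parent        # => Module
string_is_module    # => true
kernel_is_class     # => false
module_defines_new  # => false
class_defines_new   # => true
kernel_has_new      # => false
```

This is the practical difference in a nutshell: both can hold methods and constants, but only a class can create instances and sit in a superclass chain.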

Putting It All Together

Bill concludes his lecture with a piece of code and a whiteboard diagram:

class MyClass; end

obj1 = MyClass.new

obj2 = MyClass.new

images/chp1_classes_1.jpg


Figure 2. Classes are just objects.

“See?” Bill asks, pointing at the previous diagram. “Classes and regular objects live together happily.”

There’s one more interesting detail in the “Classes are objects” theme: like you do with any other object, you hold onto a class with a reference. A variable can reference a class just like any other object:

my_class = MyClass

MyClass and my_class are both references to the same instance of Class—the only difference being that my_class is a variable, while MyClass is a constant. To put this differently, just as classes are nothing but objects, class names are nothing but constants. So let’s look more closely at constants.
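A short sketch makes the point concrete: the variable and the constant refer to one and the same object, and either can be used to create instances.

```ruby
class MyClass; end

my_class = MyClass

same_object     = my_class.equal?(MyClass)   # identical object, not a copy
instance_class  = my_class.new.class         # instances don't care which reference you used
looked_up       = Object.const_get(:MyClass) # a class name is a constant you can look up

same_object                 # => true
instance_class              # => MyClass
looked_up.equal?(my_class)  # => true
```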

Constants

Any reference that begins with an uppercase letter, including the names of classes and modules, is a constant. You might be surprised to learn that a Ruby constant is actually very similar to a variable—to the extent that you can change the value of a constant, although you will get a warning from the interpreter. (If you’re in a destructive mood, you can even break Ruby beyond repair by changing the value of the String class name.)

If you can change the value of a constant, how is a constant different from a variable? The one important difference has to do with their scope. The scope of constants follows its own special rules, as you can see in the example that follows.

module MyModule
  MyConstant = 'Outer constant'

  class MyClass
    MyConstant = 'Inner constant'
  end
end

Bill pulls a napkin from his shirt pocket and sketches out the constants in this code. You can see the result in the following figure.

images/chp1_constants.jpg

All the constants in a program are arranged in a tree similar to a file system, where modules (and classes) are directories and regular constants are files. Like in a file system, you can have multiple files with the same name, as long as they live in different directories. You can even refer to a constant by its path, as you’d do with a file. Let’s see how.

The Paths of Constants

You just learned that constants are nested like directories and files. Also like directories and files, constants are uniquely identified by their paths. Constants’ paths use a double colon as a separator (this is akin to the scope operator in C++):

module M
  class C
    X = 'a constant'
  end

  C::X # => "a constant"
end

M::C::X # => "a constant"

If you’re sitting deep inside the tree of constants, you can provide the absolute path to an outer constant by using a leading double colon as root:

Y = 'a root-level constant'

module M
  Y = 'a constant in M'
  Y # => "a constant in M"
  ::Y # => "a root-level constant"
end

The Module class also provides an instance method and a class method that, confusingly, are both called constants. Module#constants returns all constants in the current scope, like your file system’s ls command (or dir command, if you’re running Windows). Module.constants returns all the top-level constants in the current program, including class names:

M.constants # => [:C, :Y]

Module.constants.include? :Object # => true

Module.constants.include? :Module # => true

Finally, if you need the current path, check out Module.nesting:

module M
  class C
    module M2
      Module.nesting # => [M::C::M2, M::C, M]
    end
  end
end

The similarities between Ruby constants and files go even further: you can use modules to organize your constants, the same way that you use directories to organize your files. Let’s look at an example.

The Rake Example

The earliest versions of Rake, the popular Ruby build system, defined classes with obvious names, such as Task and FileTask. These names had a good chance of clashing with other class names from different libraries. To prevent clashes, Rake switched to defining those classes inside a Rake module:

gems/rake-0.9.2.2/lib/rake/task.rb

module Rake

class Task

# ...

Now the full name of the Task class is Rake::Task, which is unlikely to clash with someone else’s name. A module such as Rake, which only exists to be a container of constants, is called a Spell: Namespace.

This switch to Namespaces had a problem: if someone had an old Rake build file lying around—one that still referenced the earlier, non-Namespaced class names—that file wouldn’t work with an upgraded version of Rake. For this reason, Rake maintained compatibility with older build files for a while. It did so by providing a command-line option named classic-namespace that loaded an additional source file. This source file assigned the new, safer constant names to the old, unsafe ones:

gems/rake-0.9.2.2/lib/rake/classic_namespace.rb

Task = Rake::Task

FileTask = Rake::FileTask

FileCreationTask = Rake::FileCreationTask

# ...

When this file was loaded, both Task and Rake::Task ended up referencing the same instance of Class, so a build file could use either constant to refer to the class. A few versions afterwards, Rake assumed that all users had migrated their build file, and it removed the option.
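The aliasing trick is easy to reproduce in miniature. The sketch below does not use Rake at all: Builder and LegacyTask are hypothetical names that stand in for the namespaced class and its old top-level name.

```ruby
# Builder and LegacyTask are hypothetical names, not Rake's.
module Builder
  class Task; end
end

# The compatibility shim: an old, top-level name for the same class.
LegacyTask = Builder::Task

same_class   = LegacyTask.equal?(Builder::Task)  # one class, two constants
class_name   = LegacyTask.name                   # the class keeps its original name
made_via_old = LegacyTask.new.class              # objects built via the old name

same_class    # => true
class_name    # => "Builder::Task"
made_via_old  # => Builder::Task
```

Because a constant is just a reference, the second name costs nothing and creates no new class. The class still reports the name it received first.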

Enough digression on constants. Let’s go back to objects and classes, and wrap up what you’ve just learned.

Objects and Classes Wrap-Up

What’s an object? It’s a bunch of instance variables, plus a link to a class. The object’s methods don’t live in the object—they live in the object’s class, where they’re called the instance methods of the class.

What’s a class? It’s an object (an instance of Class), plus a list of instance methods and a link to a superclass. Class is a subclass of Module, so a class is also a module. If this is confusing, look back at Figure 2, Classes are just objects.

Like any object, a class has its own methods, such as new. These are instance methods of the Class class. Also like any object, classes must be accessed through references. You already have a constant reference to each class: the class’s name.

“That’s pretty much all there is to know about objects and classes,” Bill says. “If you can understand this, you’re well on your way to understanding metaprogramming. Now, let’s turn back to the code.”

Using Namespaces

It takes only a short while for you to get a chance to apply your newfound knowledge about classes. Sifting through the Bookworm source code, you stumble upon a class that represents a snippet of text out of a book:

class TEXT

# ...

Ruby class names are conventionally Pascal cased: words are concatenated, and the first letter of each word is capitalized, as in ThisTextIsPascalCased. So you rename the class to Text:

class Text

You change the name of the class everywhere it’s used, you run the unit tests, and—surprise!—the tests fail with a cryptic error message:

<=

TypeError: Text is not a class

“D’oh! Of course it is,” you exclaim. Bill is as puzzled as you are, so it takes the two of you some time to find the cause of the problem. As it turns out, the Bookworm application requires an old version of the popular Action Mailer library. Action Mailer, in turn, uses a text-formatting library that defines a module named—you guessed it—Text:

module Text

That’s where the problem lies: because Text is already the name of a module, Ruby complains that it can’t also be the name of a class at the same time.

In a sense, you were lucky that this name clash was readily apparent. If Action Mailer’s Text had been a class, you might have never noticed that this name already existed. Instead, you’d have inadvertently Monkeypatched (Monkeypatch) the existing Text class. At that point, only your unit tests would have protected you from potential bugs.

Fixing the clash between your Text class and Action Mailer’s Text module is as easy as wrapping your class in a Namespace (Namespace):

module Bookworm

class Text

You and Bill also change all references to Text into references to Bookworm::Text. It’s unlikely that an external library defines a class named Bookworm::Text, so you should be safe from clashes now.
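You can reproduce both the clash and the fix in a few lines. In this sketch, the top-level Text module merely stands in for the one defined by the text-formatting library; nothing here loads Action Mailer.

```ruby
module Text; end             # stands in for the library's module

# Same name, but now used with the class keyword: Ruby refuses.
error = begin
  class Text; end
rescue TypeError => e
  e
end

# Wrapping the class in a Namespace gives it a different full name.
module Bookworm
  class Text; end            # this is Bookworm::Text, so there is no clash
end

error.class                  # => TypeError
Bookworm::Text.is_a?(Class)  # => true
Text.is_a?(Class)            # => false, the top-level Text is still a module
```

The error message should begin with "Text is not a class"; some Ruby versions append the location of the earlier definition.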

That was a lot of learning in a single sitting. You deserve a break and a cup of coffee—and a little quiz.

Loading and Requiring

Speaking of Namespaces (Namespace), there is one interesting detail that involves Namespaces, constants, and Ruby’s load and require methods. Imagine finding a motd.rb file on the web that displays a “message of the day” on the console. You want to add this code to your latest program, so you load the file to execute it and display the message:

load('motd.rb')

Using load, however, has a side effect. The motd.rb file probably defines variables and classes. Although variables fall out of scope when the file has finished loading, constants don’t. As a result, motd.rb can pollute your program with the names of its own constants—in particular, class names.

You can force motd.rb to keep its constants to itself by passing a second, optional argument to load:

load('motd.rb', true)

If you load a file this way, Ruby creates an anonymous module, uses that module as a Namespace to contain all the constants from motd.rb, and then destroys the module.

The require method is quite similar to load, but it’s meant for a different purpose. You use load to execute code, and you use require to import libraries. That’s why require has no second argument: those leftover class names are probably the reason why you imported the file in the first place. Also, that’s why require tries only once to load each file, while load executes the file again every time you call it.
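To watch these differences yourself, you can write a throwaway motd.rb into a temporary directory and load it in each way. This is a sketch under stated assumptions: the file contents, the MOTD_TEXT constant, and the $motd_runs counter are all made up for the experiment.

```ruby
require 'tmpdir'
require 'fileutils'

dir  = Dir.mktmpdir
path = File.join(dir, 'motd.rb')

# A stand-in for motd.rb: one constant plus a counter of executions.
File.write(path, <<~RUBY)
  MOTD_TEXT = 'Have a nice day' unless defined?(MOTD_TEXT)
  $motd_runs = $motd_runs.to_i + 1
RUBY

load(path, true)    # wrapped: constants go into an anonymous module
leaked_after_wrapped_load = Object.const_defined?(:MOTD_TEXT)

load(path)          # plain load: the constant reaches the top level
leaked_after_plain_load = Object.const_defined?(:MOTD_TEXT)

first_require  = require(path)   # executes the file
second_require = require(path)   # already required, so it is skipped

FileUtils.remove_entry(dir)      # clean up the temporary directory

leaked_after_wrapped_load        # => false
leaked_after_plain_load          # => true
[first_require, second_require]  # => [true, false]
$motd_runs                       # => 3
```

The file ran three times in total: once for each load and once for the first require. The second require returned false without running anything.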

Quiz: Missing Lines

Where you find your way around the Ruby object model.

Back in The Truth About Classes, Bill showed you how objects and classes are related. As an example, he used a snippet of code and this whiteboard diagram:

images/chp1_classes_1.jpg

class MyClass; end

obj1 = MyClass.new

obj2 = MyClass.new

The diagram shows some of the connections between the program entities. Now it’s your turn to add more lines and boxes to the diagram and answer these questions:

· What’s the class of Object?

· What’s the superclass of Module?

· What’s the class of Class?

· Imagine that you execute this code:

obj3 = MyClass.new

obj3.instance_variable_set("@x", 10)

· Can you add obj3 to the diagram?

You can use irb and the Ruby documentation to find out the answers.

Quiz Solution

Your enhanced version of the original diagram is in Figure 3, Bill's diagram, enhanced by you.

images/chp1_classes_2.jpg


Figure 3. Bill’s diagram, enhanced by you

As you can easily check in irb, the superclass of Module is Object. You don’t even need irb to know what the class of Object is: because Object is a class, its class must be Class. This is true of all classes, meaning that the class of Class must be Class itself. Don’t you love self-referential logic?

Finally, calling instance_variable_set blesses obj3 with its own instance variable @x. If you find this concept surprising, remember that in a dynamic language such as Ruby, every object has its own list of instance variables, independent of other objects—even other objects of the same class.
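If you want to double-check the solution without the diagram, the answers can be read straight off the interpreter. A minimal sketch:

```ruby
class MyClass; end

class_of_object      = Object.class       # Object is a class, so its class is Class
superclass_of_module = Module.superclass  # Module inherits from Object
class_of_class       = Class.class        # Class is an instance of itself

obj3 = MyClass.new
obj3.instance_variable_set("@x", 10)

obj3_vars  = obj3.instance_variables         # only obj3 carries @x
other_vars = MyClass.new.instance_variables  # a fresh instance has none

class_of_object       # => Class
superclass_of_module  # => Object
class_of_class        # => Class
obj3_vars             # => [:@x]
other_vars            # => []
```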

What Happens When You Call a Method?

Where you learn that a humble method call requires a lot of work on Ruby’s part and you shed light on a twisted piece of code.

After some hours working on Bookworm, you and Bill already feel confident enough to fix some minor bugs here and there—but now, as your working day is drawing to a close, you find yourself stuck. Attempting to fix a long-standing bug, you’ve stumbled upon a tangle of classes, modules, and methods that you can’t make heads or tails of.

“Stop!” Bill shouts, startling you. “This code is too complicated. To understand it, you have to learn in detail what happens when you call a method.” And before you can react, he dives into yet another lecture.

When you call a method, Ruby does two things:

1. It finds the method. This is a process called method lookup.

2. It executes the method. To do that, Ruby needs something called self.

This process (find a method, then execute it) happens in every object-oriented language. In Ruby, however, you should understand the process in depth, because this knowledge will open the door to some powerful tricks. We’ll talk about method lookup first, and we’ll come around to self later.

Method Lookup

You already know about the simplest case of method lookup. Look back at Figure 1, Instance variables live in objects; methods live in classes. When you call a method, Ruby looks into the object’s class and finds the method there. Before you look at a more complicated example, though, you need to know about two new concepts: the receiver and the ancestors chain.

The receiver is the object that you call a method on. For example, if you write my_string.reverse(), then my_string is the receiver.

To understand the concept of an ancestors chain, look at any Ruby class. Then imagine moving from the class into its superclass, then into the superclass’s superclass, and so on, until you reach BasicObject, the root of the Ruby class hierarchy. The path of classes you just traversed is the ancestors chain of the class. (The ancestors chain also includes modules, but forget about them for now. We’ll get around to modules in a bit.)

Now that you know what a receiver is and what an ancestors chain is, you can sum up the process of method lookup in a single sentence: to find a method, Ruby goes into the receiver’s class, and from there it climbs the ancestors chain until it finds the method. Here’s an example:

object_model/lookup.rb

class MyClass
  def my_method; 'my_method()'; end
end

class MySubclass < MyClass
end

obj = MySubclass.new
obj.my_method() # => "my_method()"

Bill draws this diagram:

images/chp1_right_up.jpg

If you’re used to traditional class diagrams, this picture might look confusing to you. Why is obj, a humble object, hanging around in the same diagram with a class hierarchy? Don’t get confused—this is not a class diagram. Every box in the diagram is an object. It’s just that some of these objects happen to be classes, and classes are linked together through the superclass method.

When you call my_method, Ruby goes right from obj, the receiver, into MySubclass. Because it can’t find my_method there, Ruby continues its search by going up into MyClass, where it finally finds the method.

MyClass doesn’t specify a superclass, so it implicitly inherits from the default superclass: Object. If it hadn’t found the method in MyClass, Ruby would look for the method by climbing up the chain into Object and finally BasicObject.
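The walk just described can be sketched by hand. The helper below, find_defining_class, is a hypothetical name made up for this illustration (it is not part of Ruby). It follows the superclass links from the receiver’s class and, like the discussion so far, ignores modules. Method#owner is a real Ruby method that should report the same answer.

```ruby
class MyClass
  def my_method; 'my_method()'; end
end

class MySubclass < MyClass
end

# Hypothetical helper, for illustration only: start at the receiver's
# class and climb the superclass links until some class defines the
# method itself. Modules are deliberately ignored in this sketch.
def find_defining_class(obj, name)
  klass = obj.class
  until klass.nil?
    defined_here = klass.instance_methods(false).include?(name) ||
                   klass.private_instance_methods(false).include?(name)
    return klass if defined_here
    klass = klass.superclass   # BasicObject.superclass is nil, which ends the loop
  end
  nil
end

obj = MySubclass.new
find_defining_class(obj, :my_method)  # => MyClass
obj.method(:my_method).owner          # => MyClass
```

In real code you would ask Ruby directly with Method#owner. The hand-written loop is only there to make the “climb until found” step visible.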

Because of the way most people draw diagrams, this behavior is also called the “one step to the right, then up” rule: go one step to the right into the receiver’s class, and then go up the ancestors chain until you find the method. You can ask a class for its ancestors chain with the ancestors method:

MySubclass.ancestors # => [MySubclass, MyClass, Object, Kernel, BasicObject]

“Hey, what’s Kernel doing there in the ancestors chain?” you ask. “You told me about a chain of superclasses, but I’m pretty sure that Kernel is a module, not a class.”

“You’re right,” Bill admits. “I forgot to tell you about modules. They’re easy….”

Modules and Lookup

You learned that the ancestors chain goes from class to superclass. Actually, the ancestors chain also includes modules. When you include a module in a class (or even in another module), Ruby inserts the module in the ancestors chain, right above the including class itself:

object_model/modules_include.rb

module M1

def my_method

'M1#my_method()'

end

end

class C

include M1

end

class D < C; end

D.ancestors # => [D, C, M1, Object, Kernel, BasicObject]

Starting from Ruby 2.0, you also have a second way to insert a module in a class’s chain of ancestors: the prepend method. It works like include, but it inserts the module below the including class (sometimes called the includer), rather than above it:

module M2; end

class C2

prepend M2

end

class D2 < C2; end

D2.ancestors # => [D2, M2, C2, Object, Kernel, BasicObject]
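To see what the position in the chain means in practice, here is a minimal sketch in which a module and a class both define the same method. The names Loud, IncludesLoud, and PrependsLoud are invented for this example.

```ruby
module Loud
  def greet
    "Loud"
  end
end

class IncludesLoud
  include Loud      # Loud goes above the class in the chain

  def greet
    "class"
  end
end

class PrependsLoud
  prepend Loud      # Loud goes below the class in the chain

  def greet
    "class"
  end
end

IncludesLoud.ancestors.first(2)  # => [IncludesLoud, Loud]
PrependsLoud.ancestors.first(2)  # => [Loud, PrependsLoud]

IncludesLoud.new.greet  # => "class"  (lookup reaches the class first)
PrependsLoud.new.greet  # => "Loud"   (lookup reaches the module first)
```

Lookup stops at the first match, so whichever definition sits lower in the chain wins.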

Bill draws the following flowchart to show how include and prepend work.

images/chp1_lookup_modules.jpg


Figure 4. Method lookup with modules

Later in this book, you’ll see how to use prepend to your advantage. For now, it’s enough that you understand the previous diagram. There is one last corner case about include and prepend, however—one that is worth mentioning right away.

Multiple Inclusions

What happens if you try to include a module in the same chain of ancestors multiple times? Here is an example:

object_model/modules_multiple.rb

module M1; end

module M2

include M1

end

module M3

prepend M1

include M2

end

M3.ancestors # => [M1, M3, M2]

In the previous code, M3 prepends M1 and then includes M2. When M2 also includes M1, that include has no effect, because M1 is already in the chain of ancestors. This is true every time you include or prepend a module: if that module is already in the chain, Ruby silently ignores the second inclusion. As a result, a module can appear only once in the same chain of ancestors. This behavior might change in future Rubies—but don’t hold your breath.
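The same rule can be checked on a simpler case. In this sketch (Tagged, Item, and SpecialItem are made-up names), a module is included twice in one class and once more in a subclass, and it still shows up only once in each chain.

```ruby
module Tagged; end

class Item
  include Tagged
  include Tagged     # second inclusion in the same class: ignored
end

class SpecialItem < Item
  include Tagged     # already in the chain through Item: ignored too
end

Item.ancestors.count(Tagged)         # => 1
SpecialItem.ancestors.count(Tagged)  # => 1
SpecialItem.ancestors.first(3)       # => [SpecialItem, Item, Tagged]
```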

While we’re talking about modules, it’s worth taking a look at that Kernel module that keeps popping up everywhere.

The Kernel

Ruby includes some methods, such as print, that you can call from anywhere in your code. It looks like each and every object has the print method. Methods such as print are actually private instance methods of module Kernel:

Kernel.private_instance_methods.grep(/^pr/) # => [:printf, :print, :proc]

The trick here is that class Object includes Kernel, so Kernel gets into every object’s ancestors chain. Every line of Ruby is always executed inside an object, so you can call the instance methods in Kernel from anywhere. This gives you the illusion that print is a language keyword, when it’s actually a method. Neat, isn’t it?

You can take advantage of this mechanism yourself: if you add a method to Kernel, this Kernel Method will be available to all objects. To prove that Kernel Methods are actually useful, you can look at the way some Ruby libraries use them.
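As a minimal sketch of the idea, the code below reopens Kernel and adds a private method, mirroring how print is defined. The method name shout and the class Announcer are invented for this example.

```ruby
module Kernel
  private

  # Hypothetical Kernel Method, for illustration only.
  def shout(text)
    "#{text.to_s.upcase}!"
  end
end

class Announcer
  def announce
    shout("hi")   # implicit receiver: self is an Announcer
  end
end

shout("hello")          # => "HELLO!"  (called at the top level)
Announcer.new.announce  # => "HI!"     (called inside another object)

Kernel.private_instance_methods.include?(:shout)  # => true
```

Because the method is private, it can be called from anywhere without a receiver, but it does not clutter the public interface of every object.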

The Awesome Print Example

The awesome_print gem prints Ruby objects on the screen with indentation, color, and other niceties:

object_model/awesome_print_example.rb

require "awesome_print"

local_time = {:city => "Rome", :now => Time.now }

ap local_time, :indent => 2

This produces:

{

:city => "Rome",

:now => 2013-11-30 12:51:03 +0100

}

You can call ap from anywhere because it’s a Kernel Method, which you can verify by peeking into Awesome Print’s source code:

gems/awesome_print-1.1.0/lib/awesome_print/core_ext/kernel.rb

module Kernel

def ap(object, options = {})

# ...

end

end

After this foray into Ruby modules and the Kernel, you can finally learn how Ruby executes methods after finding them.

Method Execution

When you call a method, Ruby does two things: first, it finds the method, and second, it executes the method. Up to now, you focused on the finding part. Now you can finally look at the execution part.

Imagine being the Ruby interpreter. Somebody called a method named, say, my_method. You found the method by going one step to the right, then up, and it looks like this:

def my_method

temp = @x + 1

my_other_method(temp)

end

To execute this method, you need to answer two questions. First, what object does the instance variable @x belong to? And second, what object should you call my_other_method on?

Being a smart human being (as opposed to a dumb computer program), you can probably answer both questions intuitively: both @x and my_other_method belong to the receiver—the object that my_method was originally called upon. However, Ruby doesn’t have the luxury of intuition. When you call a method, it needs to tuck away a reference to the receiver. Thanks to this reference, it can remember who the receiver is as it executes the method.

That reference to the receiver can be useful for you as well—so it is worth exploring further.

The self Keyword

Every line of Ruby code is executed inside an object—the so-called current object. The current object is also known as self, because you can access it with the self keyword.

Only one object can take the role of self at a given time, but no object holds that role for a long time. In particular, when you call a method, the receiver becomes self. From that moment on, all instance variables are instance variables of self, and all methods called without an explicit receiver are called on self. As soon as your code explicitly calls a method on some other object, that other object becomes self.

Here is an artfully complicated example to show you self in action:

object_model/self.rb

class MyClass

def testing_self

@var = 10 # An instance variable of self

my_method() # Same as self.my_method()

self

end

def my_method

@var = @var + 1

end

end

obj = MyClass.new

obj.testing_self # => #<MyClass:0x007f93ab08a728 @var=11>

What private Really Means

Now that you know about self, you can cast a new light on Ruby’s private keyword. Private methods are governed by a single simple rule: you cannot call a private method with an explicit receiver. In other words, every time you call a private method, it must be on the implicit receiver—self. Let’s see a corner case:

class C

def public_method

self.private_method

end

private

def private_method; end

end

C.new.public_method

This produces:

NoMethodError: private method ‘private_method’ called [...]

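You can make this code work by removing the self keyword. (Ruby 2.7 and later relax the rule for this one case and accept a literal self as the receiver of a private method, so the error above appears only on older Rubies.)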

This contrived example shows that private methods come from two rules working together: first, you need an explicit receiver to call a method on an object that is not yourself, and second, private methods can be called only with an implicit receiver. Put these two rules together, and you’ll see that you can only call a private method on yourself. You can call this the “private rule.”

You might find Ruby’s private methods perplexing, especially if you come from Java or C#, where private behaves differently. When you’re in doubt, go back to the private rule, and everything will make sense. Can object x call a private method on object y if the two objects share the same class? The answer is no, because no matter which class you belong to, you still need an explicit receiver to call another object’s method. Can you call a private method that you inherited from a superclass? The answer is yes, because you don’t need an explicit receiver to call inherited methods on yourself.
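The two questions about private methods can be checked with a small sketch. Account, SavingsAccount, and their methods are hypothetical names chosen for this example.

```ruby
class Account
  def initialize(balance)
    @balance = balance
  end

  # Implicit receiver: allowed.
  def report
    "balance: #{secret_balance}"
  end

  # Explicit receiver that is a different object: not allowed,
  # even though other is an Account too.
  def peek(other)
    other.secret_balance
  end

  private

  def secret_balance
    @balance
  end
end

class SavingsAccount < Account
  # Inherited private method, called with the implicit receiver: allowed.
  def summary
    "savings: #{secret_balance}"
  end
end

a = Account.new(10)
b = Account.new(20)

a.report  # => "balance: 10"

error = begin
  a.peek(b)
  nil
rescue NoMethodError => e
  e
end
error.class  # => NoMethodError

SavingsAccount.new(5).summary  # => "savings: 5"
```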

Back to the testing_self example: as soon as you call testing_self, the receiver obj becomes self. Because of that, the instance variable @var is an instance variable of obj, and the method my_method is called on obj. As my_method is executed, obj is still self, so @var is still an instance variable of obj. Finally, testing_self returns a reference to self. (You can also check the output to verify that @var is now 11.)
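To watch self change hands between two different objects, here is a minimal sketch. Inner, Outer, and their methods are names invented for this example.

```ruby
class Inner
  def who
    self                  # while who runs, self is the Inner instance
  end
end

class Outer
  def initialize(inner)
    @inner = inner
  end

  def trace
    before = self         # the Outer instance
    during = @inner.who   # explicit receiver: @inner is self inside who
    after  = self         # who has returned, so self is the Outer instance again
    [before, during, after]
  end
end

inner = Inner.new
outer = Outer.new(inner)
before, during, after = outer.trace

before.equal?(outer)  # => true
during.equal?(inner)  # => true
after.equal?(outer)   # => true
```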

If you want to become a master of Ruby, you should always know which object has the role self at any given moment. In most cases, that’s easy: just track which object was the last method receiver. However, there are two important special cases that you should be aware of. Let’s look at them.

The Top Level

You just learned that every time you call a method on an object, that object becomes self. But then, who’s self if you haven’t called any method yet? You can run irb and ask Ruby itself for an answer:

self # => main

self.class # => Object

As soon as you start a Ruby program, you’re sitting within an object named main that the Ruby interpreter created for you. This object is sometimes called the top-level context, because it’s the object you’re in when you’re at the top level of the call stack: either you haven’t called any method yet or all the methods that you called have returned. (Oh, and in case you’re wondering, Ruby’s main has nothing to do with the main() functions in C and Java.)
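The same check works in a script file as well as in irb. This sketch assumes the code is run as a plain Ruby script; top_level_self and current_self are names made up for the example.

```ruby
top_level_self = self

top_level_self.to_s   # => "main"
top_level_self.class  # => Object

# A method defined at the top level and called without an explicit
# receiver still runs with main as self.
def current_self
  self
end

current_self.equal?(top_level_self)  # => true
```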

Class Definitions and self

In a class or module definition (and outside of any method), the role of self is taken by the class or module itself.

class MyClass

self # => MyClass

end

This last detail is not going to be useful right now, but it will become a central concept later in this book. For now, we can set it aside and go back to the main topic.

Everything that you’ve learned so far about method execution can be summed up in a few short sentences. When you call a method, Ruby looks up the method by following the “one step to the right, then up” rule and then executes the method with the receiver as self. There are some special cases in this procedure (for example, when you include a module), but there are no exceptions…except for one.

Refinements

Remember the first refactoring you coded today, in Open Classes? You and Bill used an Open Class to add a method to Strings:

object_model/alphanumeric.rb

class String

def to_alphanumeric

gsub(/[^\w\s]/, '')

end

end

The problem with modifying classes this way is that the changes are global: from the moment the previous code is executed, every String in the system gets the changes. If the change is an incompatible Monkeypatch, you might break some unrelated code, as happened to you and Bill when you inadvertently redefined Array#replace.

Starting with Ruby 2.0, you can deal with this problem using a Refinement. Begin by writing a module and calling refine inside the module definition:

object_model/refinements_in_file.rb

module StringExtensions

refine String do

def to_alphanumeric

gsub(/[^\w\s]/, '')

end

end

end

This code refines the String class with a new to_alphanumeric method. Unlike a regular Open Class, however, a Refinement is not active by default. If you try to call String#to_alphanumeric, you’ll get an error:

"my *1st* refinement!".to_alphanumeric

This produces:

NoMethodError: undefined method `to_alphanumeric' [...]

To activate the changes, you have to do so explicitly, with the using method:

using StringExtensions

From the moment you call using, all the code in that Ruby source file will see the changes:

"my *1st* refinement!".to_alphanumeric # => "my 1st refinement"

Starting from Ruby 2.1, you can even call using inside a module definition. The Refinement will be active until the end of the module definition. The code below patches the String#reverse method—but only for the code inside the definition of StringStuff:

object_model/refinements_in_module.rb

module StringExtensions

refine String do

def reverse

"esrever"

end

end

end

module StringStuff

using StringExtensions

"my_string".reverse # => "esrever"

end

"my_string".reverse # => "gnirts_ym"

Refinements are similar to Monkeypatches, but they’re not global. A Refinement is active in only two places: the refine block itself and the code starting from the place where you call using until the end of the module (if you’re in a module definition) or the end of the file (if you’re at the top level).

In the limited scope where it’s active, a Refinement is just as good as an Open Class or a Monkeypatch. It can define new methods, redefine existing methods, include or prepend modules, and generally do anything that a regular Open Class can do. Code in an active Refinement takes precedence over code in the refined class, and also over code in modules that are included or prepended by the class. Refining a class is like slapping a patch right onto the original code of the class.
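The precedence claim can be tried out with a small sketch. It assumes Ruby 2.1 or later, because it calls using inside a module definition. Host, Polite, HostRefinement, and the two wrapper modules are names invented for this example.

```ruby
module Polite
  def hello
    "prepended"
  end
end

class Host
  prepend Polite    # normally this beats the method defined in the class

  def hello
    "class"
  end
end

module HostRefinement
  refine Host do
    def hello
      "refined"
    end
  end
end

module WithoutRefinement
  RESULT = Host.new.hello   # => "prepended"
end

module WithRefinement
  using HostRefinement
  RESULT = Host.new.hello   # => "refined"
end
```

Where the Refinement is not active, the prepended module wins as usual. Where it is active, the refined method is found before both the prepended module and the class.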

On the other hand, because they’re not global, Refinements don’t have the issues that you experienced in The Problem with Open Classes. You can apply a Refinement to a few selected areas of your code, and the rest of your code will stick with the original unrefined class—so there aren’t many chances that you’ll break your system by inadvertently impacting unrelated code. However, this local quality of Refinements has the potential to surprise you, as you’re about to find out.

Refinement Gotchas

Look at this code:

object_model/refinements_gotcha.rb

class MyClass

def my_method

"original my_method()"

end

def another_method

my_method

end

end

module MyClassRefinement

refine MyClass do

def my_method

"refined my_method()"

end

end

end

using MyClassRefinement

MyClass.new.my_method # => "refined my_method()"

MyClass.new.another_method # => "original my_method()"

The call to my_method happens after the call to using, so you get the refined version of the method, just like you expect. However, the call to another_method could catch you off guard: even if you call another_method after using, the call to my_method itself happens before using—so it calls the original, unrefined version of the method.

Some people find the result above counterintuitive. The lesson here is to double-check your method calls when you use Refinements. Also keep in mind that Refinements are still an evolving feature, so much so that Ruby 2.0 issues a scary warning when your program uses Refinements for the first time:

warning: Refinements are experimental, and the

behavior may change in future versions of Ruby!

This warning has been removed in Ruby 2.1, but there are still a few corner cases where Refinements might not behave as you expect—and some of those corner cases might change in future Rubies. For example, you can call refine in a regular module, but you cannot call it in a class, even if a class is itself a module. Also, metaprogramming methods such as methods and ancestors ignore Refinements altogether. Behaviors such as these have sound technical justifications, but they could trip you up nonetheless. Refinements have the potential to eliminate dangerous Monkeypatches, but it will take some time for the Ruby community to understand how to use them best.
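Here is a minimal sketch of the introspection corner case, assuming Ruby 2.1 or later. StringShout, Demo, and the constant names are made up for this example.

```ruby
ANCESTORS_BEFORE = String.ancestors

module StringShout
  refine String do
    def shout
      upcase + "!"
    end
  end
end

module Demo
  using StringShout

  CALL_RESULT    = "hi".shout                           # => "HI!"
  LISTED         = "hi".methods.include?(:shout)        # => false
  SAME_ANCESTORS = String.ancestors == ANCESTORS_BEFORE # => true
end
```

The refined method can be called inside Demo, yet methods does not list it and ancestors shows the same chain as before the Refinement was defined.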

You’re still considering the power and responsibility of using Refinements when Bill decides to throw a quiz at you.

Quiz: Tangle of Modules

Where you untangle a twisted yarn of modules, classes, and objects.

You can finally go back to the problem that prompted Bill to launch into his discussion on method lookup, self, and Refinements. You’ve had trouble making sense of a complicated arrangement of classes and modules. Here’s the confusing part:

object_model/tangle.rb

module Printable

def print

# ...

end

def prepare_cover

# ...

end

end

module Document

def print_to_screen

prepare_cover

format_for_screen

print

end

def format_for_screen

# ...

end

def print

# ...

end

end

class Book

include Document

include Printable

# ...

end

Another source file creates a Book and calls print_to_screen:

b = Book.new

b.print_to_screen

According to the company’s bug management application, there is a problem with this code: print_to_screen is not calling the right print method. The bug report doesn’t provide any more details.

Can you guess which version of print gets called—the one in Printable or the one in Document? Try drawing the chain of ancestors on paper. How can you quickly fix the code so print_to_screen calls the other version of print instead?

Quiz Solution

You can ask Ruby itself for the ancestors chain of Book:

Book.ancestors # => [Book, Printable, Document, Object, Kernel, BasicObject]

If you draw this ancestors chain on your whiteboard, it will look like Figure 5, The ancestors chain of the Book class.

images/chp1_tangle_quiz.jpg


Figure 5. The ancestors chain of the Book class

Let’s see how Ruby builds the chain. Because Book doesn’t have an explicit superclass, it implicitly inherits from Object, which in turn includes Kernel and inherits from BasicObject. When Book includes Document, Ruby adds Document to Book’s ancestors chain right above Book itself. Immediately after that, Book includes Printable. Again, Ruby slips Printable in the chain right above Book, pushing up the rest of the chain—from Document upward.

When you call b.print_to_screen, the object referenced by b becomes self, and method lookup begins. Ruby finds the print_to_screen method in Document, and that method then calls other methods—including print. All methods called without an explicit receiver are called on self, so method lookup starts once again from Book (self’s class) and goes up until it finds a method named print. The lowest print in the chain is Printable#print, so that’s the one that gets called.

The bug report hints that the original author of the code intended to call Document#print instead. In real production code, you’d probably want to get rid of this confusion and rename one of the clashing print methods. However, if you just want to solve this quiz, the cheapest way to do it is to swap the order of inclusion of the modules in Book so that Document gets lower than Printable in the ancestors chain:

object_model/tangle_untwisted.rb

module Printable

# ...

end

module Document

# ...

end

class Book

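include Printable # now included first, so it ends up higher in the chain

include Document # now included last, so it sits right above Book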

ancestors # => [Book, Document, Printable, Object, Kernel, BasicObject]

end

The implicit receiver of ancestors in the previous code is Book, because in a class definition the role of self is taken by the class. The ancestors chain of Book also contains a third method named print—but Bill is not telling you where it is. If you’re curious, you’ll have to find it yourself, maybe with some help from your friend irb.
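To confirm the fix without touching Bookworm itself, you can try a cut-down sketch. The method bodies below are placeholders that return a label so that you can see which print runs, and BookBefore and BookAfter are names invented to compare the two inclusion orders side by side.

```ruby
module Printable
  def print
    "Printable#print"
  end
end

module Document
  def print_to_screen
    print               # implicit receiver, so lookup restarts from self's class
  end

  def print
    "Document#print"
  end
end

class BookBefore        # original order of inclusion
  include Document
  include Printable
end

class BookAfter         # swapped order of inclusion
  include Printable
  include Document
end

BookBefore.new.print_to_screen  # => "Printable#print"
BookAfter.new.print_to_screen   # => "Document#print"
```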

It’s almost time to go home after an exhausting but very satisfying day of work. But before you call it a day, Bill does a complete wrap-up of what you learned.

Wrap-Up

Here’s a checklist of what you learned today:

· An object is composed of a bunch of instance variables and a link to a class.

· The methods of an object live in the object’s class. (From the point of view of the class, they’re called instance methods.)

· The class itself is just an object of class Class. The name of the class is just a constant.

· Class is a subclass of Module. A module is basically a package of methods. In addition to that, a class can also be instantiated (with new) or arranged in a hierarchy (through its superclass).

· Constants are arranged in a tree similar to a file system, where the names of modules and classes play the part of directories and regular constants play the part of files.

· Each class has an ancestors chain, beginning with the class itself and going up to BasicObject.

· When you call a method, Ruby goes right into the class of the receiver and then up the ancestors chain, until it either finds the method or reaches the end of the chain.

· When you include a module in a class, the module is inserted in the ancestors chain right above the class itself. When you prepend the module, it is inserted in the ancestors chain right below the class.

· When you call a method, the receiver takes the role of self.

· When you’re defining a module (or a class), the module takes the role of self.

· Instance variables are always assumed to be instance variables of self.

· Any method called without an explicit receiver is assumed to be a method of self.

· Refinements are like pieces of code patched right over a class, and they override normal method lookup. On the other hand, a Refinement works in a limited area of the program: the lines of code between the call to using and the end of the file, or the end of the module definition.

Checked…checked…done! Now it’s time to go home before your brain explodes with all the information you crammed into it today.