Tuesday: Methods - Metaprogramming Ruby - Metaprogramming Ruby 2: Program Like the Ruby Pros (2014)

Part 1. Metaprogramming Ruby

Chapter 3. Tuesday: Methods

Yesterday you learned about the Ruby object model and how to make Ruby classes sing and dance for you. Today you’re holding all calls to focus on methods.

The objects in your code talk to each other all the time. Some languages—such as Java and C—feature a compiler that presides over this chatting. For every method call, the compiler checks to see that the receiving object has a matching method. This is called static type checking, and the languages that adopt it are called static languages. For example, if you call talk_simple on a Lawyer object that has no such method, the compiler protests loudly.

Dynamic languages, such as Python and Ruby, don’t have a compiler policing method calls. As a consequence, you can start a program that calls talk_simple on a Lawyer, and everything works just fine, at least until that specific line of code is executed. Only then does the Lawyer complain that it doesn’t understand that call.

That’s an important advantage of static type checking: the compiler can spot some of your mistakes before the code runs. This protectiveness, however, comes at a price. Static languages often require you to write lots of tedious, repetitive methods—the so-called boilerplate methods—just to make the compiler happy. (For example, get and set methods to access an object’s properties, or scores of methods that do nothing but delegate to some other object.)

In Ruby, boilerplate methods aren’t a problem, because you can easily avoid them with techniques that would be impractical or just plain impossible in a static language. In this chapter, we’ll focus on those techniques.

A Duplication Problem

Where you and Bill face a problem with duplicated code.

Today, your boss asked you to work on a program for the accounting department. They want a system that flags expenses greater than $99 for computer gear, so they can crack down on developers splurging with company money. (You read that right: $99. The purchasing department isn’t fooling around.)

Some other developers already took a stab at the project, coding a report that lists all the components of each computer in the company and how much each component costs. To date, they haven’t plugged in any real data. Here’s where you and Bill come in.

The Legacy System

Right from the start, you have a challenge on your hands: the data you need to load into the already established program is stored in a legacy system stuck behind an awkwardly coded class named DS (for “data source”):

methods/computer/data_source.rb

class DS

def initialize # connect to data source...

def get_cpu_info(workstation_id) # ...

def get_cpu_price(workstation_id) # ...

def get_mouse_info(workstation_id) # ...

def get_mouse_price(workstation_id) # ...

def get_keyboard_info(workstation_id) # ...

def get_keyboard_price(workstation_id) # ...

def get_display_info(workstation_id) # ...

def get_display_price(workstation_id) # ...

# ...and so on

DS#initialize connects to the data system when you create a new DS object. The other methods—and there are dozens of them—take a workstation identifier and return descriptions and prices for the computer’s components. With Bill standing by to offer moral support, you quickly try the class in irb:

ds = DS.new

ds.get_cpu_info(42) # => "2.9 Ghz quad-core"

ds.get_cpu_price(42) # => 120

ds.get_mouse_info(42) # => "Wireless Touch"

ds.get_mouse_price(42) # => 60

It looks like workstation number 42 has a 2.9GHz CPU and a luxurious $60 mouse. This is enough data to get you started.

Double, Treble… Trouble

You have to wrap DS into an object that fits the reporting application. This means each Computer must be an object. This object has a single method for each component, returning a string that describes both the component and its price. Remember that price limit set by the purchasing department? Keeping this requirement in mind, you know that if the component costs $100 or more, the string must begin with an asterisk to draw people’s attention.

You kick off development by writing the first three methods in the Computer class:

methods/computer/duplicated.rb

class Computer

def initialize(computer_id, data_source)

@id = computer_id

@data_source = data_source

end

def mouse

info = @data_source.get_mouse_info(@id)

price = @data_source.get_mouse_price(@id)

result = "Mouse: #{info} ($#{price})"

return "* #{result}" if price >= 100

result

end

def cpu

info = @data_source.get_cpu_info(@id)

price = @data_source.get_cpu_price(@id)

result = "Cpu: #{info} ($#{price})"

return "* #{result}" if price >= 100

result

end

def keyboard

info = @data_source.get_keyboard_info(@id)

price = @data_source.get_keyboard_price(@id)

result = "Keyboard: #{info} ($#{price})"

return "* #{result}" if price >= 100

result

end

# ...

end

At this point in the development of Computer, you find yourself bogged down in a swampland of repetitive copy and paste. You have a long list of methods left to deal with, and you should also write tests for each and every method, because it’s easy to make mistakes in duplicated code.

“I can think of two different ways to remove this duplication,” Bill says. “One is a spell called Dynamic Methods. The other is a special method called method_missing. We can try both solutions and decide which one we like better.” You agree to start with Dynamic Methods and get to method_missing after that.

Dynamic Methods

Where you learn how to call and define methods dynamically, and you remove the duplicated code.

“When I was a young developer learning C++,” Bill says, “my mentors told me that when you call a method, you’re actually sending a message to an object. It took me a while to get used to that concept. If I’d been using Ruby back then, that notion of sending messages would have come more naturally to me.”

Calling Methods Dynamically

When you call a method, you usually do so using the standard dot notation:

methods/dynamic_call.rb

class MyClass

def my_method(my_arg)

my_arg * 2

end

end

obj = MyClass.new

obj.my_method(3) # => 6

You also have an alternative: call MyClass#my_method using Object#send in place of the dot notation:

obj.send(:my_method, 3) # => 6

The previous code still calls my_method, but it does so through send. The first argument to send is the message that you’re sending to the object—that is, a symbol or a string representing the name of a method. (See Method Names and Symbols.) Any remaining arguments (and the block, if one exists) are simply passed on to the method.
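As a small sketch of how send hands extra arguments and a block through to the target method, consider the following. The Greeter class and its greet method are hypothetical names invented for this illustration.

```ruby
# Hypothetical class, used only to illustrate Object#send.
class Greeter
  def greet(name, punctuation)
    greeting = "Hello, #{name}#{punctuation}"
    # If the caller supplied a block, let it transform the greeting.
    block_given? ? yield(greeting) : greeting
  end
end

greeter = Greeter.new

# The method name is ordinary data, so it can be chosen while the code runs.
message = :greet

plain   = greeter.send(message, "Bill", "!")
shouted = greeter.send(message, "Bill", "!") { |text| text.upcase }

puts plain    # Hello, Bill!
puts shouted  # HELLO, BILL!
```

Both calls reach the same greet method. The only difference is that the second one also carries a block, which send forwards untouched.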

Why would you use send instead of the plain old dot notation? Because with send, the name of the method that you want to call becomes just a regular argument. You can wait literally until the very last moment to decide which method to call, while the code is running. This technique is called Dynamic Dispatch, and you’ll find it wildly useful. To help reveal its magic, let’s look at a couple of real-life examples.

Method Names and Symbols

People who are new to the language are sometimes confused by Ruby’s symbols. Symbols and strings belong to two separate and unrelated classes:

:x.class # => Symbol

"x".class # => String

Nevertheless, symbols are similar enough to strings that you might wonder what’s the point of having symbols at all. Can’t you just use regular strings everywhere?

There are a few different reasons to use symbols in place of regular strings, but in the end the choice boils down to conventions. In most cases, symbols are used as names of things—in particular, names of metaprogramming-related things such as methods. Symbols are a good fit for such names because they are immutable: you can change the characters inside a string, but you can’t do that for symbols. You wouldn’t expect the name of a method to change, so it makes sense to use a symbol when you refer to a method name.

For example, when you call Object#send, you need to pass it the name of a method as a first argument. Although send accepts this name as either a symbol or a string, symbols are usually considered more kosher:

# rather than: 1.send("+", 2)

1.send(:+, 2) # => 3

Regardless, you can easily convert from string to symbol and back:

"abc".to_sym #=> :abc

:abc.to_s #=> "abc"

The Pry Example

One example of Dynamic Dispatch comes from Pry. Pry is a popular alternative to irb, Ruby’s command-line interpreter. A Pry object stores the interpreter’s configuration into its own attributes, such as memory_size and quiet:

methods/pry_example.rb

require "pry"

pry = Pry.new

pry.memory_size = 101

pry.memory_size # => 101

pry.quiet = true

For each instance method like Pry#memory_size, there is a corresponding class method (Pry.memory_size) that returns the default value of the attribute:

Pry.memory_size # => 100

Let’s look a little deeper inside the Pry source code. To configure a Pry instance, you can call a method named Pry#refresh. This method takes a hash that maps attribute names to their new values:

pry.refresh(:memory_size => 99, :quiet => false)

pry.memory_size # => 99

pry.quiet # => false

Pry#refresh has a lot of work to do: it needs to go through each attribute (such as self.memory_size); initialize the attribute with its default value (such as Pry.memory_size); and finally check whether the hash argument contains a new value for the same attribute, and if it does, set the new value. Pry#refresh could do all of those steps with code like this:

def refresh(options={})

defaults[:memory_size] = Pry.memory_size

self.memory_size = options[:memory_size] if options[:memory_size]

defaults[:quiet] = Pry.quiet

self.quiet = options[:quiet] if options[:quiet]

# same for all the other attributes...

end

Those two lines of code would have to be repeated for each and every attribute. That’s a lot of duplicated code. Pry#refresh manages to avoid that duplication, and instead uses Dynamic Dispatch to set all the attributes with just a few lines of code:

gems/pry-0.9.12.2/lib/pry/pry_instance.rb

def refresh(options={})

defaults = {}

attributes = [ :input, :output, :commands, :print, :quiet,

:exception_handler, :hooks, :custom_completions,

:prompt, :memory_size, :extra_sticky_locals ]

attributes.each do |attribute|

defaults[attribute] = Pry.send attribute

end

# ...

defaults.merge!(options).each do |key, value|

send("#{key}=", value) if respond_to?("#{key}=")

end

true

end

The code above uses send to read the default attribute values into a hash, merges this hash with the options hash, and finally uses send again to call attribute accessors such as memory_size=. The Kernel#respond_to? method returns true if methods such as Pry#memory_size= actually exist, so that any key in options that doesn’t match an existing attribute will be ignored. Neat, huh?
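To try the same pattern without installing Pry, here is a minimal sketch. The Settings class, its two attributes, and its default values are assumptions made for this example; it imitates the shape of Pry#refresh but is not Pry’s code.

```ruby
# Hypothetical stand-in for a configurable object; not Pry's actual code.
class Settings
  ATTRIBUTES = [:memory_size, :quiet]

  attr_accessor(*ATTRIBUTES)

  # Class-level defaults, mirroring the idea of Pry.memory_size.
  def self.memory_size; 100; end
  def self.quiet; false; end

  def refresh(options = {})
    defaults = {}
    ATTRIBUTES.each do |attribute|
      # Dynamic Dispatch: read each default through the class method of the same name.
      defaults[attribute] = self.class.send(attribute)
    end

    defaults.merge(options).each do |key, value|
      setter = "#{key}="
      # Keys without a matching writer method are silently skipped.
      send(setter, value) if respond_to?(setter)
    end
    true
  end
end

settings = Settings.new
settings.refresh(memory_size: 99, unknown_option: 1)

p settings.memory_size  # 99
p settings.quiet        # false
```

Adding a third attribute to this sketch means adding one symbol to ATTRIBUTES and one default; refresh itself stays the same.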

Privacy Matters

Remember what Spider-Man’s uncle used to say? “With great power comes great responsibility.” The Object#send method is very powerful, perhaps too powerful. In particular, you can call any method with send, including private methods.

If that kind of breaching of encapsulation makes you uneasy, you can use public_send instead. It’s like send, but it makes a point of respecting the receiver’s privacy. Be prepared, however, for the fact that Ruby code in the wild rarely bothers with this concern. If anything, a lot of Ruby programmers use send exactly because it allows calling private methods, not in spite of that.
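The difference is easy to see in a short experiment. The Vault class below is a hypothetical example with one public and one private method.

```ruby
# Hypothetical class with one public and one private method.
class Vault
  def label
    "vault"
  end

  private

  def combination
    "12-34-56"
  end
end

vault = Vault.new

via_send   = vault.send(:combination)    # send ignores privacy
via_public = vault.public_send(:label)   # public methods work with either call

begin
  vault.public_send(:combination)        # private, so this raises
  blocked = false
rescue NoMethodError
  blocked = true
end

p via_send    # "12-34-56"
p via_public  # "vault"
p blocked     # true
```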

Now you know about send and Dynamic Dispatch—but there is more to Dynamic Methods than that. You’re not limited to calling methods dynamically. You can also define methods dynamically. It’s time to see how.

Defining Methods Dynamically

You can define a method on the spot with Module#define_method. You just need to provide a method name and a block, which becomes the method body:

methods/dynamic_definition.rb

class MyClass

define_method :my_method do |my_arg|

my_arg * 3

end

end

obj = MyClass.new

obj.my_method(2) # => 6

require_relative '../test/assertions'

assert_equals 6, obj.my_method(2)

define_method is executed within MyClass, so my_method is defined as an instance method of MyClass. This technique of defining a method at runtime is called a Dynamic Method.

There is one important reason to use define_method over the more familiar def keyword: define_method allows you to decide the name of the defined method at runtime. To see an example of this technique, look back at your original refactoring problem.
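Before returning to Computer, here is a small sketch of what “deciding the name at runtime” can look like. The Temperature class and its CONVERSIONS table are hypothetical; the point is only that the method names are built from data rather than typed into the source.

```ruby
# Hypothetical example: the method names come from data, not from source code.
class Temperature
  CONVERSIONS = {
    "kelvin"     => ->(c) { c + 273.15 },
    "fahrenheit" => ->(c) { c * 9.0 / 5 + 32 }
  }

  def initialize(celsius)
    @celsius = celsius
  end

  CONVERSIONS.each do |unit, formula|
    # Defines to_kelvin and to_fahrenheit while the class body is running.
    define_method("to_#{unit}") do
      formula.call(@celsius)
    end
  end
end

boiling = Temperature.new(100)
p boiling.to_fahrenheit             # 212.0
p boiling.respond_to?(:to_kelvin)   # true
```

The same thing written with def would need one hand-written method per entry in the table.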

Refactoring the Computer Class

Recall the code that pulled you and Bill into this dynamic discussion:

methods/computer/duplicated.rb

class Computer

def initialize(computer_id, data_source)

@id = computer_id

@data_source = data_source

end

def mouse

info = @data_source.get_mouse_info(@id)

price = @data_source.get_mouse_price(@id)

result = "Mouse: #{info} ($#{price})"

return "* #{result}" if price >= 100

result

end

def cpu

info = @data_source.get_cpu_info(@id)

price = @data_source.get_cpu_price(@id)

result = "Cpu: #{info} ($#{price})"

return "* #{result}" if price >= 100

result

end

def keyboard

info = @data_source.get_keyboard_info(@id)

price = @data_source.get_keyboard_price(@id)

result = "Keyboard: #{info} ($#{price})"

return "* #{result}" if price >= 100

result

end

# ...

end

In the previous pages you learned how to use Module#define_method in place of the def keyword to define a method, and how to use send in place of the dot notation to call a method. Now you can use these spells to refactor the Computer class. It’s time to remove some duplication.

Step 1: Adding Dynamic Dispatches

You and Bill start by extracting the duplicated code into its own message-sending method:

methods/computer/dynamic_dispatch.rb

class Computer

def initialize(computer_id, data_source)

@id = computer_id

@data_source = data_source

end

def mouse

component :mouse

end

def cpu

component :cpu

end

def keyboard

component :keyboard

end

def component(name)

info = @data_source.send "get_#{name}_info", @id

price = @data_source.send "get_#{name}_price", @id

result = "#{name.capitalize}: #{info} ($#{price})"

return "* #{result}" if price >= 100

result

end

end

A call to mouse is delegated to component, which in turn calls DS#get_mouse_info and DS#get_mouse_price. The call also writes the capitalized name of the component in the resulting string. You open an irb session and smoke-test the new Computer:

my_computer = Computer.new(42, DS.new)

my_computer.cpu # => * Cpu: 2.9 Ghz quad-core ($120)

This new version of Computer is a step forward because it contains far fewer duplicated lines—but you still have to write dozens of similar methods. To avoid writing all those methods, you can turn to define_method.

Step 2: Generating Methods Dynamically

You and Bill refactor Computer to use Dynamic Methods, as shown in the following code.

methods/computer/dynamic_methods.rb

class Computer

def initialize(computer_id, data_source)

@id = computer_id

@data_source = data_source

end

def self.define_component(name)

define_method(name) do

info = @data_source.send "get_#{name}_info", @id

price = @data_source.send "get_#{name}_price", @id

result = "#{name.capitalize}: #{info} ($#{price})"

return "* #{result}" if price >= 100

result

end

end

define_component :mouse

define_component :cpu

define_component :keyboard

end

Note that the three calls to define_component are executed inside the definition of Computer, where Computer is the implicit self. Because you’re calling define_component on Computer, you have to make it a class method.

You quickly test the slimmed-down Computer class in irb and discover that it still works. It’s time to move on to the next step.

Step 3: Sprinkling the Code with Introspection

The latest Computer contains minimal duplication, but you can push it even further and remove the duplication altogether. How? By getting rid of all those calls to define_component. You can do that by introspecting the data_source argument and extracting the names of all components:

methods/computer/more_dynamic_methods.rb

class Computer

def initialize(computer_id, data_source)

@id = computer_id

@data_source = data_source

data_source.methods.grep(/^get_(.*)_info$/) { Computer.define_component $1 }

end

def self.define_component(name)

define_method(name) do

# ...

end

end

end

The new line in initialize is where the magic happens. To understand it, you need to know a couple of things.

First, if you pass a block to Array#grep, the block is evaluated for each element that matches the regular expression. Second, the string matching the parenthesized part of the regular expression is stored in the global variable $1. So, if data_source has methods named get_cpu_info and get_mouse_info, this code ultimately calls Computer.define_component twice, with the strings "cpu" and "mouse". Note that define_method works equally well with a string or a symbol.
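You can watch grep and $1 cooperate in isolation. In this sketch the names array is a made-up stand-in for the list that data_source.methods would return.

```ruby
# Hypothetical method names, standing in for data_source.methods.
names = [:get_cpu_info, :get_cpu_price, :get_mouse_info, :to_s]

components = []
# The block runs only for matching elements; $1 holds the captured group.
names.grep(/^get_(.*)_info$/) { components << $1 }

p components  # ["cpu", "mouse"]
```

The price methods and to_s never reach the block, because they don’t match the pattern.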

The duplicated code is finally gone for good. As a bonus, you don’t even have to write or maintain the list of components. If someone adds a new component to DS, the Computer class will support it automatically. Isn’t that wonderful?
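If you don’t have the legacy system at hand, you can still run the finished class against a stand-in. The FakeDS class and its data below are assumptions made for this sketch; Computer is the class from the listing above with the elided method body filled in from Step 2.

```ruby
# Hypothetical stand-in for the legacy DS class, with made-up data.
class FakeDS
  def get_cpu_info(id);    "2.9 Ghz quad-core"; end
  def get_cpu_price(id);   120; end
  def get_mouse_info(id);  "Wireless Touch"; end
  def get_mouse_price(id); 60; end
end

class Computer
  def initialize(computer_id, data_source)
    @id = computer_id
    @data_source = data_source
    # Introspection: define one component method per get_*_info method.
    data_source.methods.grep(/^get_(.*)_info$/) { Computer.define_component $1 }
  end

  def self.define_component(name)
    define_method(name) do
      info = @data_source.send "get_#{name}_info", @id
      price = @data_source.send "get_#{name}_price", @id
      result = "#{name.capitalize}: #{info} ($#{price})"
      return "* #{result}" if price >= 100
      result
    end
  end
end

computer = Computer.new(42, FakeDS.new)
puts computer.cpu    # * Cpu: 2.9 Ghz quad-core ($120)
puts computer.mouse  # Mouse: Wireless Touch ($60)

# A component added to the data source later is picked up
# when the next Computer is created.
class FakeDS
  def get_keyboard_info(id);  "Standard US"; end
  def get_keyboard_price(id); 20; end
end

newer = Computer.new(42, FakeDS.new)
puts newer.keyboard  # Keyboard: Standard US ($20)
```

Note that the methods are defined on the Computer class itself, so they are redefined every time a new Computer is created.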

Let’s Try That Again

Your refactoring was a resounding success, but Bill is not willing to stop here. “We said that we were going to try two different solutions to this problem, remember? We’ve only found one, involving Dynamic Dispatch and Dynamic Methods. It did serve us well, but to be fair, we need to give the other solution a chance.”

For this second solution, you need to know about some strange methods that are not really methods and a very special method named method_missing.

method_missing

Where you listen to spooky stories about Ghost Methods and dynamic proxies and you try a second way to remove duplicated code.

With Ruby, there’s no compiler to enforce method calls. This means you can call a method that doesn’t exist. For example:

methods/method_missing.rb

class Lawyer; end

nick = Lawyer.new

nick.talk_simple

<=

NoMethodError: undefined method `talk_simple' for #<Lawyer:0x007f801aa81938>

Do you remember how method lookup works? When you call talk_simple, Ruby goes into nick’s class and browses its instance methods. If it can’t find talk_simple there, it searches up the ancestors chain into Object and eventually into BasicObject.

Because Ruby can’t find talk_simple anywhere, it admits defeat by calling a method named method_missing on nick, the original receiver. Ruby knows that method_missing is there, because it’s a private instance method of BasicObject that every object inherits.
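You can check both claims yourself: the ancestors chain ends at BasicObject, and method_missing is a private method defined there. This sketch uses only built-in reflection methods.

```ruby
class Lawyer; end

# The chain that method lookup walks for a Lawyer object.
chain = Lawyer.ancestors
p chain.first  # Lawyer
p chain.last   # BasicObject

# method_missing is defined in BasicObject, and it is private.
defined_privately = BasicObject.private_instance_methods.include?(:method_missing)
visible_publicly  = Lawyer.new.respond_to?(:method_missing)
visible_privately = Lawyer.new.respond_to?(:method_missing, true)

p defined_privately  # true
p visible_publicly   # false
p visible_privately  # true
```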

You can experiment by calling method_missing yourself. It’s a private method, but you can get to it through send:

nick.send :method_missing, :my_method

<=

NoMethodError: undefined method `my_method' for #<Lawyer:0x007f801b0f4978>

You have just done exactly what Ruby does. You told the object, “I tried to call a method named my_method on you, and you did not understand.” BasicObject#method_missing responded by raising a NoMethodError. In fact, this is what method_missing does for a living. It’s like an object’s dead-letter office, the place where unknown messages eventually end up (and the place where NoMethodErrors come from).

Overriding method_missing

Most likely, you will never need to call method_missing yourself. Instead, you can override it to intercept unknown messages. Each message landing on method_missing’s desk includes the name of the method that was called, plus any arguments and blocks associated with the call.

methods/more_method_missing.rb

class Lawyer

def method_missing(method, *args)

puts "You called: #{method}(#{args.join(', ')})"

puts "(You also passed it a block)" if block_given?

end

end

bob = Lawyer.new

bob.talk_simple('a', 'b') do

# a block

end

<=

You called: talk_simple(a, b)

(You also passed it a block)

Overriding method_missing allows you to call methods that don’t really exist. Let’s take a closer look at these weird creatures.

Ghost Methods

When you need to define many similar methods, you can spare yourself the definitions and just respond to calls through method_missing. This is like saying to the object, “If they ask you something and you don’t understand, do this.” From the caller’s side, a message that’s processed by method_missing looks like a regular call, but on the receiver’s side, it has no corresponding method. This trick is called a Ghost Method. Let’s look at some Ghost Method examples.

The Hashie Example

The Hashie gem contains a little bit of magic called Hashie::Mash. A Mash is a more powerful version of Ruby’s standard OpenStruct class: a hash-like object whose attributes work like Ruby variables. If you want a new attribute, just assign a value to the attribute, and it will spring into existence:

require 'hashie'

icecream = Hashie::Mash.new

icecream.flavor = "strawberry"

icecream.flavor # => "strawberry"

This works because Hashie::Mash is a subclass of Ruby’s Hash, and its attributes are actually Ghost Methods, as a quick look at Hashie::Mash#method_missing will confirm:

gems/hashie-1.2.0/lib/hashie/mash.rb

module Hashie

class Mash < Hashie::Hash

def method_missing(method_name, *args, &blk)

return self.[](method_name, &blk) if key?(method_name)

match = method_name.to_s.match(/(.*?)([?=!]?)$/)

case match[2]

when "="

self[match[1]] = args.first

# ...

else

default(method_name, *args, &blk)

end

end

# ...

end

end

If the name of the called method is the name of a key in the hash (such as flavor), then Hashie::Mash#method_missing simply calls the [] method to return the corresponding value. If the name ends with a "=", then method_missing chops off the "=" at the end to get the attribute name and then stores its value. If the name of the called method doesn’t match any of these cases, then method_missing just returns a default value. (Hashie::Mash also supports a few other special cases, such as methods ending in "?", that were omitted from the code above.)
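To experiment with the idea without the gem, you can write a much smaller imitation. The OpenRecord class below is a hypothetical sketch, not Hashie’s code: it handles only readers and writers, and unlike Mash it raises an error for unknown names instead of returning a default.

```ruby
# Hypothetical, much-simplified imitation of the idea; not Hashie's code.
class OpenRecord
  def initialize
    @attributes = {}
  end

  def method_missing(name, *args)
    key = name.to_s
    if key.end_with?("=")
      # Writer: strip the trailing "=" and store the value.
      @attributes[key.chomp("=")] = args.first
    elsif @attributes.key?(key)
      # Reader for an attribute that was assigned earlier.
      @attributes[key]
    else
      # Anything else is a genuine mistake, so fail in the usual way.
      super
    end
  end
end

icecream = OpenRecord.new
icecream.flavor = "strawberry"
p icecream.flavor  # "strawberry"

begin
  icecream.topping
  raised = false
rescue NoMethodError
  raised = true
end
p raised  # true
```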

Dynamic Proxies

Ghost Methods are usually icing on the cake, but some objects actually rely almost exclusively on them. These objects are often wrappers for something else: maybe another object, a web service, or code written in a different language. They collect method calls through method_missing and forward them to the wrapped object. Let’s look at a complex real-life example.

The Ghee Example

You probably know GitHub,[5] the wildly popular social coding service. A number of libraries give you easy access to GitHub’s HTTP APIs, including a Ruby gem called Ghee. Here is how you use Ghee to access a user’s “gist”—a snippet of code that can be published on GitHub:

methods/ghee_example.rb

require "ghee"

gh = Ghee.basic_auth("usr", "pwd") # Your GitHub username and password

all_gists = gh.users("nusco").gists

a_gist = all_gists[20]

a_gist.url # => "https://api.github.com/gists/535077"

a_gist.description # => "Spell: Dynamic Proxy"

a_gist.star

The code above connects to GitHub, looks up a specific user ("nusco"), and accesses that user’s list of gists. Then it selects one specific gist and reads that gist’s url and description. Finally, it “stars” the gist, to be notified of any future changes.

The GitHub APIs expose dozens of types of objects besides gists, and Ghee has to support all of those objects. However, Ghee’s source code is surprisingly concise, thanks to a smart use of Ghost Methods. Most of the magic happens in the Ghee::ResourceProxy class:

gems/ghee-0.9.8/lib/ghee/resource_proxy.rb

class Ghee

class ResourceProxy

# ...

def method_missing(message, *args, &block)

subject.send(message, *args, &block)

end

def subject

@subject ||= connection.get(path_prefix){|req| req.params.merge!params }.body

end

end

end

Before you understand this class, you need to see how Ghee uses it. For each type of GitHub object, such as gists or users, Ghee defines one subclass of Ghee::ResourceProxy. Here is the class for gists (the class for users is quite similar):

gems/ghee-0.9.8/lib/ghee/api/gists.rb

class Ghee

module API

module Gists

class Proxy < ::Ghee::ResourceProxy

def star

connection.put("#{path_prefix}/star").status == 204

end

# ...

end

end

end

When you call a method that changes the state of an object, such as Ghee::API::Gists#star, Ghee places an HTTP call to the corresponding GitHub URL. However, when you call a method that just reads from an attribute, such as url or description, that call ends up in Ghee::ResourceProxy#method_missing. In turn, method_missing forwards the call to the object returned by Ghee::ResourceProxy#subject. What kind of object is that?

If you dig into the implementation of ResourceProxy#subject, you’ll find that this method also makes an HTTP call to the GitHub API. The specific call depends on which subclass of Ghee::ResourceProxy we’re using. For example, Ghee::API::Gists::Proxy calls https://api.github.com/users/nusco/gists. ResourceProxy#subject receives the GitHub object in JSON format (in our example, all the gists of user nusco) and converts it to a hash-like object.

Dig a little deeper, and you’ll find that this hash-like object is actually a Hashie::Mash, the magic hash class that we talked about in The Hashie Example. This means that a method call such as my_gist.url is forwarded to Ghee::ResourceProxy#method_missing, and from there to Hashie::Mash#method_missing, which finally returns the value of the url attribute. Yes, that’s two calls to method_missing in a row.

Ghee’s design is elegant, but it uses so much metaprogramming that it might confuse you at first. Let’s wrap it up in just two points:

· Ghee stores GitHub objects as dynamic hashes. You can access the attributes of these hashes by calling their Ghost Methods, such as url and description.

· Ghee also wraps these hashes inside proxy objects that enrich them with additional methods. A proxy does two things. First, it implements methods that require specific code, such as star. Second, it forwards methods that just read data, such as url, to the wrapped hash.

Thanks to this two-level design, Ghee manages to keep its code very compact. It doesn’t need to define methods that just read data, because those methods are Ghost Methods. Instead, it can just define the methods that need specific code, like star.

This dynamic approach also has another advantage: Ghee can adapt automatically to some changes in the GitHub APIs. For example, if GitHub added a new field to gists (say, lines_count), Ghee would support calls to Ghee::API::Gists#lines_count without any changes to its source code, because lines_count is just a Ghost Method—actually a chain of two Ghost Methods.

An object such as Ghee::ResourceProxy, which catches Ghost Methods and forwards them to another object, is called a Dynamic Proxy.
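Stripped of HTTP calls and JSON, the core of a Dynamic Proxy fits in a few lines. The LoggingProxy class below is a hypothetical sketch that records each message and forwards it to the wrapped object. It uses public_send rather than send, which is a choice made for this example and not something Ghee does.

```ruby
# Hypothetical proxy that logs each call, then forwards it to the wrapped object.
class LoggingProxy
  attr_reader :calls

  def initialize(target)
    @target = target
    @calls = []
  end

  def method_missing(name, *args, &block)
    # Only forward messages the target understands; otherwise fail as usual.
    return super unless @target.respond_to?(name)
    @calls << name
    # public_send keeps the target's private methods out of reach.
    @target.public_send(name, *args, &block)
  end
end

proxy = LoggingProxy.new([3, 1, 2])
sorted  = proxy.sort                  # forwarded to Array#sort
doubled = proxy.map { |n| n * 2 }     # the block travels along with the call

p sorted       # [1, 2, 3]
p doubled      # [6, 2, 4]
p proxy.calls  # [:sort, :map]
```

The proxy defines no sort or map of its own. Both are Ghost Methods that exist only for as long as the wrapped array understands them.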

Refactoring the Computer Class (Again)

“Okay, you now know about method_missing,” Bill says. “Let’s go back to the Computer class and remove the duplication.”

Once again, here’s the original Computer class:

methods/computer/duplicated.rb

class Computer

def initialize(computer_id, data_source)

@id = computer_id

@data_source = data_source

end

def mouse

info = @data_source.get_mouse_info(@id)

price = @data_source.get_mouse_price(@id)

result = "Mouse: #{info} ($#{price})"

return "* #{result}" if price >= 100

result

end

def cpu

info = @data_source.get_cpu_info(@id)

price = @data_source.get_cpu_price(@id)

result = "Cpu: #{info} ($#{price})"

return "* #{result}" if price >= 100

result

end

def keyboard

info = @data_source.get_keyboard_info(@id)

price = @data_source.get_keyboard_price(@id)

result = "Keyboard: #{info} ($#{price})"

return "* #{result}" if price >= 100

result

end

# ...

end

Computer is just a wrapper that collects calls, tweaks them a bit, and routes them to a data source. To remove all those duplicated methods, you can turn Computer into a Dynamic Proxy. It only takes an override of method_missing to remove all the duplication from the Computer class.

methods/computer/method_missing.rb

class Computer

def initialize(computer_id, data_source)

@id = computer_id

@data_source = data_source

end

def method_missing(name)

super if !@data_source.respond_to?("get_#{name}_info")

info = @data_source.send("get_#{name}_info", @id)

price = @data_source.send("get_#{name}_price", @id)

result = "#{name.capitalize}: #{info} ($#{price})"

return "* #{result}" if price >= 100

result

end

end

What happens when you call a method such as Computer#mouse? The call gets routed to method_missing, which checks whether the wrapped data source has a get_mouse_info method. If it doesn’t have one, the call falls back to BasicObject#method_missing, which raises a NoMethodError. If the data source knows about the component, the original call is converted into two calls to DS#get_mouse_info and DS#get_mouse_price. The values returned from these calls are used to build the final result. You try the new class in irb:

my_computer = Computer.new(42, DS.new)

my_computer.cpu # => * Cpu: 2.9 Ghz quad-core ($120)

It worked. Bill, however, is concerned about one last detail.

respond_to_missing?

If you specifically ask a Computer whether it responds to a Ghost Method, it will flat-out lie:

cmp = Computer.new(0, DS.new)

cmp.respond_to?(:mouse) # => false

This behavior can be problematic, because respond_to? is a commonly used method. (If you need convincing, just note that the Computer itself is calling respond_to? on the data source.) Fortunately, Ruby provides a clean mechanism to make respond_to? aware of Ghost Methods.

respond_to? calls a method named respond_to_missing? that is supposed to return true if a method is a Ghost Method. (In your mind, you could rename respond_to_missing? to something like ghost_method?.) To prevent respond_to? from lying, override respond_to_missing? every time you override method_missing:

class Computer

# ...

def respond_to_missing?(method, include_private = false)

@data_source.respond_to?("get_#{method}_info") || super

end

end

The code in this respond_to_missing? is similar to the first line of method_missing: it finds out whether a method is a Ghost Method. If it is, it returns true. If it isn’t, it calls super. In this case, super is the default Object#respond_to_missing?, which always returns false.

Now respond_to? will learn about your Ghost Methods from respond_to_missing? and return the right result:

cmp.respond_to?(:mouse) # => true

Back in the day, Ruby coders used to override respond_to? directly. Now that respond_to_missing? is available, overriding respond_to? is considered somewhat dirty. Instead, the rule is now this: remember to override respond_to_missing? every time you override method_missing.
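Here is the rule applied to a class of its own, as a minimal sketch. The Finder class and its find_by_ naming scheme are hypothetical. The last line shows a side benefit: Object#method also consults respond_to_missing?, so you can get a Method object for a Ghost Method.

```ruby
# Hypothetical class whose Ghost Methods all start with "find_by_".
class Finder
  def method_missing(name, *args)
    return super unless name.to_s.start_with?("find_by_")
    "searching by #{name.to_s.sub('find_by_', '')}"
  end

  # Keeps respond_to? honest about the Ghost Methods above.
  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?("find_by_") || super
  end
end

finder = Finder.new
p finder.respond_to?(:find_by_name)  # true
p finder.respond_to?(:delete_all)    # false
p finder.find_by_name                # "searching by name"

# Object#method relies on respond_to_missing? too.
p finder.method(:find_by_name).call  # "searching by name"
```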

If you like BasicObject#method_missing, you should also take a look at Module#const_missing. Let’s check it out.

const_missing

Remember our discussion of Rake in The Rake Example? In that section we said that at one point in its history, Rake renamed classes like Task to names that are less likely to clash, such as Rake::Task. After renaming the classes, Rake went through an upgrade path: for a few versions, you could use either the new class names or the old, non-namespaced names. Rake allowed you to do that by Monkeypatching the Module#const_missing method:

gems/rake-0.9.2.2/lib/rake/ext/module.rb

class Module
  def const_missing(const_name)
    case const_name
    when :Task
      Rake.application.const_warning(const_name)
      Rake::Task
    when :FileTask
      Rake.application.const_warning(const_name)
      Rake::FileTask
    when :FileCreationTask
      # ...
    end
  end
end

When you reference a constant that doesn’t exist, Ruby passes the name of the constant to const_missing as a symbol. Class names are just constants, so a reference to an unknown Rake class such as Task was routed to Module#const_missing. In turn, const_missing warned you that you were using an obsolete class name:

methods/const_missing.rb

require 'rake'

task_class = Task

<=

WARNING: Deprecated reference to top-level constant 'Task' found [...]

Use --classic-namespace on rake command

or 'require "rake/classic_namespace"' in Rakefile

After the warning, you automatically got the new, Namespaced class name in place of the old one:

task_class # => Rake::Task
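If you want to try the same trick without Monkeypatching Module for the whole program, you can define const_missing on a single module instead. The sketch below is not Rake's code: Shop, Invoice, Bill, and RENAMED are hypothetical names invented for the example, and the warning text is made up.

```ruby
module Shop
  class Invoice; end

  # Hypothetical mapping from an old constant name to its replacement.
  RENAMED = { Bill: Invoice }.freeze

  # Called by Ruby when a constant is looked up on Shop and not found.
  def self.const_missing(const_name)
    if RENAMED.key?(const_name)
      warn "Shop::#{const_name} is deprecated; use #{RENAMED[const_name]}"
      RENAMED[const_name]
    else
      super # fall back to Module#const_missing, which raises NameError
    end
  end
end

Shop::Bill   # prints a warning on standard error, => Shop::Invoice

begin
  Shop::Nope
rescue NameError => e
  e.class    # => NameError
end
```

Scoping the override to one module keeps the rest of the program's constant lookups untouched, which is usually a safer default than reopening Module.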

Enough talking about magic methods. Let’s recap what you and Bill did today.

Refactoring Wrap-Up

Today you solved the same problem in two different ways. The first version of Computer introspects DS to get a list of methods to wrap and uses Dynamic Methods (Dynamic Method) and Dynamic Dispatches (Dynamic Dispatch), which delegate to the legacy system. The second version of Computer does the same with Ghost Methods (Ghost Method). Having to pick one of the two versions, you and Bill randomly select the method_missing-based one, send it to the folks in purchasing, and head out for a well-deserved lunch break…and an unexpected quiz.

Quiz: Bug Hunt

Where you discover that bugs in a method_missing can be difficult to squash.

Over lunch, Bill has a quiz for you. “My previous team followed a cruel office ritual,” he says. “Every morning, each team member picked a random number. Whoever got the smallest number had to take a trip to the nearby Starbucks and buy coffee for the whole team.”

Bill explains that the team even wrote a class that was supposed to provide a random number (and some Wheel of Fortune-style suspense) when you called the name of a team member. Here’s the class:

methods/roulette_failure.rb

class Roulette
  def method_missing(name, *args)
    person = name.to_s.capitalize
    3.times do
      number = rand(10) + 1
      puts "#{number}..."
    end
    "#{person} got a #{number}"
  end
end

You can use the Roulette like this:

number_of = Roulette.new

puts number_of.bob

puts number_of.frank

And here’s what the result is supposed to look like:

<=

5...

6...

10...

Bob got a 10

7...

4...

3...

Frank got a 3

“This code was clearly overdesigned,” Bill admits. “We could have just defined a regular method that took the person’s name as a string—but we’d just discovered method_missing, so we used Ghost Methods (Ghost Method) instead. That wasn’t a good idea; the code didn’t work as expected.”

Can you spot the problem with the Roulette class? If you can’t, try running it on your computer. Now can you explain what is happening?

Quiz Solution

The Roulette contains a bug that causes infinite recursion. It prints a long list of numbers and finally crashes with a stack overflow.

<=

2...

7...

1...

5...

(...more numbers here...)

roulette_failure.rb:7:in `method_missing': stack level too deep (SystemStackError)

This bug is nasty and difficult to spot. The variable number is defined within a block (the block that gets passed to times) and falls out of scope by the last line of method_missing. When Ruby executes that line, it can’t know that the number there is supposed to be a variable. As a default, it assumes that number must be a parentheses-less method call on self.

In normal circumstances, you would get an explicit NoMethodError that makes the problem obvious. But in this case you have a method_missing, and that’s where the call to number ends. The same chain of events happens again—and again and again—until the call stack overflows.
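You can watch this chain of events safely with a small recording class. The sketch below is a hypothetical variation on Roulette (Recorder and its calls list are invented for the example): it logs every name that reaches method_missing and stops as soon as it sees number, instead of recursing until the stack overflows.

```ruby
class Recorder
  attr_reader :calls

  def initialize
    @calls = []
  end

  def method_missing(name, *args)
    @calls << name
    return :stop if name == :number   # bail out instead of recursing forever

    3.times { number = rand(10) + 1 } # number only exists inside the block
    number                            # no such variable here: a method call on self
  end

  # This object accepts every call, so respond_to? should say so.
  def respond_to_missing?(name, include_private = false)
    true
  end
end

r = Recorder.new
r.bob     # => :stop
r.calls   # => [:bob, :number]
```

The second entry in calls is the giveaway: the stray number was dispatched as a method call and landed right back in method_missing.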

This is a common problem with Ghost Methods: because unknown calls become calls to method_missing, your object might accept a call that’s just plain wrong. Finding a bug like this one in a large program can be pretty painful.

To avoid this kind of trouble, take care not to introduce too many Ghost Methods. For example, Roulette might be better off if it simply accepted the names of people on Frank’s team. Also, remember to fall back on BasicObject#method_missing when you get a call you don’t know how to handle. Here’s a better Roulette that still uses method_missing:

methods/roulette_solution.rb

class Roulette
  def method_missing(name, *args)
    person = name.to_s.capitalize
    super unless %w[Bob Frank Bill].include? person
    number = 0
    3.times do
      number = rand(10) + 1
      puts "#{number}..."
    end
    "#{person} got a #{number}"
  end
end

You can also develop this code in bite-sized steps. Start by writing regular methods; then, when you’re confident that your code is working, refactor the methods to a method_missing. This way, you won’t inadvertently hide a bug behind a Ghost Method.

Blank Slates

Where you and Bill learn to avoid another common method_missing trap.

Once you get back from lunch, you find an unexpected problem waiting for you at the office. The developer who wrote the reporting application stumbled upon what he thinks is “the strangest bug ever”: the Computer class can’t retrieve information about the workstations’ displays. All the other methods work fine, but Computer#display doesn’t.

You try the display method in irb, and sure enough it fails:

my_computer = Computer.new(42, DS.new)

my_computer.display # => nil

Why does Computer#display return nil? You triple-check the code and the back-end data source, but everything seems to be fine. Bill has a sudden insight, and he lists the instance methods of Object that begin with a d:

Object.instance_methods.grep /^d/ # => [:dup, :display, :define_singleton_method]

It seems that Object defines a method named display (a seldom-used method that prints an object on a port and always returns nil). Computer inherits from Object, so it gets the display method. The call to Computer#display finds a real method by that name, so it never lands on method_missing. You’re calling a real, live method instead of a Ghost Method (Ghost Method).

This problem crops up with Dynamic Proxies (Dynamic Proxy). When the name of a Ghost Method clashes with the name of a real, inherited method, the latter wins.
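One way to spot such clashes before they bite is to ask the class which of the names you care about are already real methods. The sketch below uses a hypothetical GhostProxy class and a made-up list of component names; it is a quick check, not part of the chapter's Computer.

```ruby
# GhostProxy is a hypothetical proxy that answers every unknown call.
class GhostProxy
  def method_missing(name, *args)
    "ghost: #{name}"
  end

  def respond_to_missing?(name, include_private = false)
    true
  end
end

GhostProxy.new.mouse                  # => "ghost: mouse"
GhostProxy.method_defined?(:mouse)    # => false (a Ghost Method)
GhostProxy.method_defined?(:display)  # => true  (inherited, so it wins)

# Names from this list that would never reach method_missing:
wanted = %i[mouse keyboard display class]
wanted.select { |name| GhostProxy.method_defined?(name) }
# => [:display, :class]
```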

If you don’t need the inherited method, you can fix the problem by removing it. While you’re at it, you might want to remove most methods from the class, preventing such name clashes from ever happening again. A skinny class with a minimal number of methods is called a Blank Slate. As it turns out, Ruby has a ready-made Blank Slate for you to use.

BasicObject

The root of Ruby’s class hierarchy, BasicObject, has only a handful of instance methods:

methods/basic_object.rb

im = BasicObject.instance_methods

im # => [:==, :equal?, :!, :!=, :instance_eval, :instance_exec, :__send__, :__id__]

If you don’t specify a superclass, your classes inherit by default from Object, which is itself a subclass of BasicObject. If you want a Blank Slate (Blank Slate), you can inherit directly from BasicObject instead. For example, if Computer inherited directly from BasicObject, then it wouldn’t have a problematic display method.
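To see the difference, here is a small sketch with a hypothetical SlateProxy class that inherits from BasicObject. Calls such as display and class are no longer real methods, so they end up in method_missing like any other Ghost Method.

```ruby
# SlateProxy is a hypothetical name used only for this sketch.
class SlateProxy < BasicObject
  def method_missing(name, *args)
    "ghost: #{name}"
  end
end

proxy = SlateProxy.new
proxy.display                         # => "ghost: display"
proxy.class                           # => "ghost: class"
SlateProxy.method_defined?(:display)  # => false
```

One thing to keep in mind: BasicObject doesn't include Kernel, so inside such a class you can't rely on methods like puts, and you have to reach standard classes from the root (for example, ::String).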

Inheriting from BasicObject is the quicker way to define a Blank Slate in Ruby. In some cases, however, you might want to control exactly which methods to keep and which methods to remove from your class. Let’s see how you can remove a specific method from a class.

Removing Methods

You can remove a method from a class by using either Module#undef_method or Module#remove_method. The drastic undef_method removes any method, including the inherited ones. The kinder remove_method removes the method from the receiver, but it leaves inherited methods alone. Let’s look at a real-life library that uses undef_method to create a Blank Slate.
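Before that, a quick sketch of the difference between the two. Parent, Removed, and Undefined are hypothetical classes invented for the comparison.

```ruby
class Parent
  def greet
    "parent"
  end
end

class Removed < Parent
  def greet
    "child"
  end
  remove_method :greet  # deletes Removed#greet only; Parent#greet survives
end

class Undefined < Parent
  def greet
    "child"
  end
  undef_method :greet   # blocks greet entirely, inherited version included
end

Removed.new.greet                  # => "parent"
Undefined.new.respond_to?(:greet)  # => false

begin
  Undefined.new.greet
rescue NoMethodError => e
  e.class                          # => NoMethodError
end
```

Note that Parent itself is untouched in both cases: Parent.new.greet still returns "parent".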

The Builder Example

The Builder gem is an XML generator with a twist. You can generate XML tags by calling methods on Builder::XmlMarkup:

methods/builder_example_1.rb

require 'builder'
xml = Builder::XmlMarkup.new(:target=>STDOUT, :indent=>2)
xml.coder {
  xml.name 'Matsumoto', :nickname => 'Matz'
  xml.language 'Ruby'
}

This code produces the following snippet of XML:

<=

<coder>

<name nickname="Matz">Matsumoto</name>

<language>Ruby</language>

</coder>

Builder cleverly bends the syntax of Ruby to support nested tags, attributes, and other niceties. The core idea of Builder is simple: calls such as name and language are processed by XmlMarkup#method_missing, which generates an XML tag for every call.
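To make the idea concrete, here is a toy version of it. This is not Builder's implementation: TinyXml and output! are hypothetical names, and the sketch skips indentation, escaping, and everything else a real XML generator needs. It only shows how method_missing can turn each call into a tag.

```ruby
# TinyXml is a hypothetical, heavily simplified stand-in for an XML builder.
class TinyXml < BasicObject
  def initialize
    # BasicObject subclasses must reach standard classes from the root.
    @out = ::String.new
  end

  # Every unknown call becomes a tag named after the method.
  def method_missing(tag, content = nil, attrs = {}, &block)
    attr_text = attrs.map { |key, value| %( #{key}="#{value}") }.join
    @out << "<#{tag}#{attr_text}>"
    if block
      block.call            # nested calls append their own tags
    else
      @out << content.to_s
    end
    @out << "</#{tag}>"
    @out
  end

  # Hypothetical accessor for the generated markup.
  def output!
    @out
  end
end

xml = TinyXml.new
xml.coder {
  xml.name 'Matsumoto', :nickname => 'Matz'
  xml.language 'Ruby'
}
xml.output!
# => "<coder><name nickname=\"Matz\">Matsumoto</name><language>Ruby</language></coder>"
```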

Now pretend you have to generate a piece of XML describing a university course. It might look like this:

<=

<semester>

<class>Egyptology</class>

<class>Ornithology</class>

</semester>

So, you’d have to write code like this:

methods/builder_example_2.rb

xml.semester {
  xml.class 'Egyptology'
  xml.class 'Ornithology'
}

If XmlMarkup were a subclass of Object, then the calls to class would clash with Object’s class. To avoid that clash, XmlMarkup inherits from a Blank Slate (Blank Slate) that removes class and most other methods from Object. When Builder was written, BasicObject didn’t exist yet. (It was introduced in Ruby 1.9.) So Builder defines its own Blank Slate class:

gems/builder-3.2.2/lib/blankslate.rb

class BlankSlate
  # Hide the method named +name+ in the BlankSlate class. Don't
  # hide +instance_eval+ or any method beginning with "__".
  def self.hide(name)
    # ...
    if instance_methods.include?(name._blankslate_as_name) &&
        name !~ /^(__|instance_eval$)/
      undef_method name
    end
  end

  # ...
  instance_methods.each { |m| hide(m) }
end

Builder doesn’t go as far as removing each and every method from BlankSlate. It keeps instance_eval (a method that you’ll get to know in the next chapter) and all the “reserved methods”—methods that are used internally by Ruby, whose names conventionally begin with a double underscore. One example of a reserved method is BasicObject#__send__, which behaves the same as send but gives you a scary warning when you try to remove it. The case of instance_eval is more of a judgement call: you could choose to remove it, but Builder decided not to.

Now that you know about Blank Slates, you can finally fix the bug in the Computer class.

Fixing the Computer Class

To turn Computer into a Blank Slate (Blank Slate) and fix the display method bug, you and Bill make it a subclass of BasicObject:

class Computer < BasicObject
  # ...

There is one last improvement you can make to this class. BasicObject doesn’t have a respond_to? method. (respond_to? is a method of BasicObject’s subclass Object.) Because you don’t have respond_to?, you can delete the now pointless respond_to_missing? method that you and Bill added earlier, in the section on respond_to_missing?. Once you do that, you’re finally done with the method_missing-based implementation of Computer.

Wrap-Up

Let’s review today’s work. You and Bill started with a Computer class that contained lots of duplication. (The original class is in Double, Treble… Trouble.) You removed the duplication in two different ways.

Your first attempt relied on Dynamic Methods (Dynamic Method) and Dynamic Dispatch (Dynamic Dispatch):

methods/computer/more_dynamic_methods.rb

class Computer
  def initialize(computer_id, data_source)
    @id = computer_id
    @data_source = data_source
    data_source.methods.grep(/^get_(.*)_info$/) { Computer.define_component $1 }
  end

  def self.define_component(name)
    define_method(name) do
      info = @data_source.send "get_#{name}_info", @id
      price = @data_source.send "get_#{name}_price", @id
      result = "#{name.capitalize}: #{info} ($#{price})"
      return "* #{result}" if price >= 100
      result
    end
  end
end

Your second attempt centered around Ghost Methods (Ghost Method) (to be more precise, it used a Dynamic Proxy (Dynamic Proxy) that is also a Blank Slate (Blank Slate)):

methods/computer/blank_slate.rb

class Computer < BasicObject
  def initialize(computer_id, data_source)
    @id = computer_id
    @data_source = data_source
  end

  def method_missing(name, *args)
    super if !@data_source.respond_to?("get_#{name}_info")
    info = @data_source.send("get_#{name}_info", @id)
    price = @data_source.send("get_#{name}_price", @id)
    result = "#{name.capitalize}: #{info} ($#{price})"
    return "* #{result}" if price >= 100
    result
  end
end

Neither solution would be practical without Ruby’s dynamic capabilities. If you come from a static language, you’re probably accustomed to spotting and removing duplication inside your methods. In Ruby, you might want to look for duplication among methods as well. Then you can remove that duplication with some of the spells you’ve learned today.

You and Bill now have two solutions to consider. It’s time to make a choice. Which one do you like best?

Dynamic Methods vs. Ghost Methods

As you experienced yourself, Ghost Methods (Ghost Method) can be dangerous. You can avoid most of their problems by following a few basic recommendations (always call super, always redefine respond_to_missing?)—but even then, Ghost Methods can sometimes cause puzzling bugs.[6]

The problems with Ghost Methods boil down to the fact that they are not really methods; instead, they’re just a way to intercept method calls. Because of this, they behave differently from actual methods. For example, they don’t appear in the list of names returned by Object#methods. In contrast, Dynamic Methods are just regular methods that happened to be defined with define_method instead of def, and they behave the same as any other method.
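The difference is easy to check in irb. In this sketch, Dynamic and Ghost are hypothetical classes that expose the same two calls, one with define_method and the other with method_missing.

```ruby
class Dynamic
  %w[mouse keyboard].each do |name|
    define_method(name) { "#{name} info" }
  end
end

class Ghost
  NAMES = %i[mouse keyboard].freeze

  def method_missing(name, *args)
    return "#{name} info" if NAMES.include?(name)
    super
  end

  def respond_to_missing?(name, include_private = false)
    NAMES.include?(name) || super
  end
end

Dynamic.new.mouse                     # => "mouse info"
Ghost.new.mouse                       # => "mouse info"

Dynamic.new.methods.include?(:mouse)  # => true
Ghost.new.methods.include?(:mouse)    # => false
Ghost.new.respond_to?(:mouse)         # => true, thanks to respond_to_missing?
```

Both objects answer the call, but only the Dynamic Method shows up when you introspect the object.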

There are times when Ghost Methods are your only viable option. This usually happens when you have a large number of method calls, or when you don’t know what method calls you might need at runtime. For an example, look back at the Builder library in The Builder Example. Builder couldn’t define a Dynamic Method for each of the potentially infinite XML tags that you might want to generate, so it uses method_missing to intercept method calls instead.

All things considered, the choice between Dynamic and Ghost Methods depends on your experience and coding style, but you can follow a simple rule of thumb when in doubt: use Dynamic Methods if you can and Ghost Methods if you have to.

You and Bill decide to follow this rule, and you commit the define_method-based version of Computer to the project repository. Tomorrow is sure to be another day of coding challenges, so it’s time to head home and rest up.

Footnotes

[5]

http://www.github.com

[6]

A presentation on the perils of method_missing is at http://www.everytalk.tv/talks/1881-Madison-Ruby-The-Revenge-of-method-missing.