Metaprogramming Ruby 2: Program Like the Ruby Pros (2014)

Part 1. Metaprogramming Ruby

Chapter 1. The M Word

Metaprogramming is writing code that writes code.

We’ll get to a more precise definition soon, but this one will do for now. What do I mean by “code that writes code,” and how is that useful in your daily work? Before I answer those questions, let’s take a step back and look at programming languages themselves.

Ghost Towns and Marketplaces

Think of your source code as a world teeming with vibrant citizens: variables, classes, methods, and so on. If you want to get technical, you can call these citizens language constructs.

In many programming languages, language constructs behave more like ghosts than fleshed-out citizens: you can see them in your source code, but they disappear before the program runs. Take C++, for example. Once the compiler has finished its job, things like variables and methods have lost their concreteness; they are just locations in memory. You can’t ask a class for its instance methods, because by the time you ask the question, the class has faded away. In languages such as C++, runtime is an eerily quiet place—a ghost town.

In other languages, such as Ruby, runtime is more like a busy marketplace. Most language constructs are still there, buzzing all around. You can even walk up to a language construct and ask it questions about itself. This is called introspection.

Let’s watch introspection in action. Take a look at the following code.

the_m_word/introspection.rb
	class Greeting
	  def initialize(text)
	    @text = text
	  end

	  def welcome
	    @text
	  end
	end

	my_object = Greeting.new("Hello")

I defined a Greeting class and created a Greeting object. I can now turn to the language constructs and ask them questions.

my_object.class # => Greeting

I asked my_object about its class, and it replied in no uncertain terms: “I’m a Greeting.” Now I can ask the class for a list of its instance methods.

my_object.class.instance_methods(false) # => [:welcome]

The class answered with an array containing a single method name: welcome. (The false argument means, “List only the instance methods you defined yourself, not the ones you inherited.”) Let’s peek into the object itself, asking for its instance variables.

my_object.instance_variables # => [:@text]

Again, the object’s reply was loud and clear. Because objects and classes are first-class citizens in Ruby, you can get a lot of information from running code.

However, this is only half of the picture. Sure, you can read language constructs at runtime, but what about writing them? What if you want to add new instance methods to Greeting, alongside welcome, while the program is running? You might be wondering why on earth anyone would want to do that. Allow me to explain by telling a story.

The Story of Bob, Metaprogrammer

Bob, a newcomer to Ruby, has a grand plan: he’ll write the biggest Internet social network ever for movie buffs. To do that, he needs a database of movies and movie reviews. Bob makes it a practice to write reusable code, so he decides to build a simple library to persist objects in the database.

Bob’s First Attempt

Bob’s library maps each class to a database table and each object to a record. When Bob creates an object or accesses its attributes, the object generates a string of SQL and sends it to the database. All this functionality is wrapped in a class:

the_m_word/orm.rb
	class Entity
	  attr_reader :table, :ident

	  def initialize(table, ident)
	    @table = table
	    @ident = ident
	    Database.sql "INSERT INTO #{@table} (id) VALUES (#{@ident})"
	  end

	  def set(col, val)
	    Database.sql "UPDATE #{@table} SET #{col}='#{val}' WHERE id=#{@ident}"
	  end

	  def get(col)
	    # No space before the parenthesis, so [0][0] indexes the recordset, not the SQL string.
	    Database.sql("SELECT #{col} FROM #{@table} WHERE id=#{@ident}")[0][0]
	  end
	end

In Bob’s database, each table has an id column. Each Entity stores the content of this column and the name of the table to which it refers. When Bob creates an Entity, the Entity saves itself to the database. Entity#set generates SQL that updates the value of a column, and Entity#get generates SQL that returns the value of a column. (In case you care, Bob’s Database class returns recordsets as arrays of arrays.)

Bob can now subclass Entity to map to a specific table. For example, class Movie maps to a database table named movies:

	class Movie < Entity
	  def initialize(ident)
	    super "movies", ident
	  end

	  def title
	    get "title"
	  end

	  def title=(value)
	    set "title", value
	  end

	  def director
	    get "director"
	  end

	  def director=(value)
	    set "director", value
	  end
	end

A Movie has two methods for each attribute: a reader, such as Movie#title, and a writer, such as Movie#title=. Bob can now load a new movie into the database by firing up the Ruby interactive interpreter and typing the following:

	movie = Movie.new(1)
	movie.title = "Doctor Strangelove"
	movie.director = "Stanley Kubrick"

This code creates a new record in movies, which has values 1, Doctor Strangelove, and Stanley Kubrick for the columns id, title, and director, respectively. (Remember that in Ruby, movie.title = "Doctor Strangelove" is actually a disguised call to the method title=—the same as movie.title=("Doctor Strangelove").)
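If the disguised call seems hard to believe, here is a small sketch you can run without a database. The Film class is a made-up stand-in for Bob's Movie (it keeps the title in an instance variable instead of sending SQL), so treat it as an illustration rather than as part of Bob's library.

```ruby
# Film is a hypothetical, database-free stand-in for Bob's Movie class.
class Film
  def title
    @title
  end

  def title=(value)
    @title = value
  end
end

film = Film.new
film.title = "Doctor Strangelove"            # assignment syntax...
film.public_send(:title=, "Paths of Glory")  # ...is an ordinary call to title=
film.title                                   # => "Paths of Glory"

# Both the reader and the writer show up as regular instance methods.
Film.instance_methods(false).sort            # => [:title, :title=]
```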

Proud of himself, Bob shows the code to his older, more experienced colleague, Bill. Bill stares at the screen for a few seconds and proceeds to shatter Bob’s pride into tiny little pieces. “There’s a lot of duplication in this code,” Bill says. “You have a movies table with a title column in the database, and you have a Movie class with an @title field in the code. You also have a title method, a title= method, and two "title" string constants. You can solve this problem with way less code if you sprinkle some metaprogramming over it.”

Enter Metaprogramming

At the suggestion of his expert-coder friend, Bob looks for a metaprogramming-based solution. He finds that very thing in the Active Record library, a popular Ruby library that maps objects to database tables. After a short tutorial, Bob is able to write the Active Record version of the Movie class:

	class Movie < ActiveRecord::Base
	end

Yes, it’s as simple as that. Bob just subclassed the ActiveRecord::Base class. He didn’t have to specify a table to map Movies to. Even better, he didn’t have to write boring, almost identical methods such as title and director. It all just works:

	movie = Movie.create
	movie.title = "Doctor Strangelove"
	movie.title # => "Doctor Strangelove"

The previous code creates a Movie object that wraps a record in the movies table, then accesses the record’s title column by calling Movie#title and Movie#title=. But these methods are nowhere to be found in the source code. How can title and title= exist if they’re not defined anywhere? You can find out by looking at how Active Record works.

The table name part is straightforward: Active Record looks at the name of the class through introspection and then applies some simple conventions. Since the class is named Movie, Active Record maps it to a table named movies. (This library knows how to find plurals for English words.)
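To make the idea concrete, here is a deliberately naive sketch of convention-based table naming. The Record class, its table_name method, and the "just add an s" rule are assumptions made for this example; Active Record's real pluralization handles far more cases than this.

```ruby
# Record and its naming rule are hypothetical; this is not Active Record's code.
class Record
  def self.table_name
    # Introspection: ask the class for its own name, then apply a convention.
    name.downcase + "s"
  end
end

class Movie < Record; end
class Review < Record; end

Movie.table_name   # => "movies"
Review.table_name  # => "reviews"
```

Note that neither subclass mentions a table anywhere. The name comes entirely from asking the class about itself at runtime.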

What about methods such as title= and title, which access object attributes (accessor methods for short)? This is where metaprogramming comes in: Bob doesn’t have to write those methods. Active Record defines them automatically, after inferring their names from the database schema. ActiveRecord::Base reads the schema at runtime, discovers that the movies table has two columns named title and director, and defines accessor methods for two attributes of the same name. This means that Active Record defines methods such as Movie#title and Movie#director= out of thin air while the program runs.
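Here is one way the trick could work, as a minimal sketch. It assumes a hard-coded SCHEMA hash in place of a real database schema and an in-memory hash in place of a record, and the Model class and its map_to method are invented for the example. Active Record's actual mechanism is more involved.

```ruby
# SCHEMA is a hypothetical stand-in for what a library would read from the database.
SCHEMA = { "movies" => ["title", "director"] }

class Model
  def initialize
    @row = {}   # in-memory stand-in for a database record
  end

  # Define a reader and a writer for every column of the given table.
  def self.map_to(table)
    SCHEMA.fetch(table).each do |column|
      define_method(column) { @row[column] }
      define_method("#{column}=") { |value| @row[column] = value }
    end
  end
end

class Movie < Model
  map_to "movies"
end

movie = Movie.new
movie.title = "Doctor Strangelove"
movie.title                         # => "Doctor Strangelove"
Movie.instance_methods(false).sort  # => [:director, :director=, :title, :title=]
```

The four accessors never appear in the source of Movie. They come into existence when map_to runs, which is the same duplication Bill complained about, removed with a few lines of define_method.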

This is the “yang” to the introspection “yin”: rather than just reading from the language constructs, you’re writing into them. If you think this is an extremely powerful feature, you are right.

The “M” Word Again

Now you have a more formal definition of metaprogramming:

Metaprogramming is writing code that manipulates language constructs at runtime.

The authors of Active Record applied this concept. Instead of writing accessor methods for each class’s attributes, they wrote code that defines those methods at runtime for any class that inherits from ActiveRecord::Base. This is what I meant when I talked about “writing code that writes code.”
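The definition also answers the question I left hanging earlier: can you add a method to Greeting while the program is running? The following sketch suggests you can. The shout method is a name I made up for the occasion, and class_eval with define_method is only one of several ways to do this.

```ruby
class Greeting
  def initialize(text)
    @text = text
  end

  def welcome
    @text
  end
end

my_object = Greeting.new("Hello")
my_object.respond_to?(:shout)          # => false

# Write into the class at runtime: add a method that did not exist before.
Greeting.class_eval do
  define_method(:shout) { @text.upcase + "!" }
end

# The object created before the change picks up the new method too.
my_object.shout                        # => "HELLO!"
Greeting.instance_methods(false).sort  # => [:shout, :welcome]
```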

Code Generators and Compilers

In metaprogramming, you write code that writes code. But isn’t that what code generators and compilers do? For example, you can write annotated Java code and then use a code generator to output XML configuration files. In a broad sense, this XML generation is an example of metaprogramming. In fact, many people think about code generation when the “M” word comes up.

This particular brand of metaprogramming implies that you use a program to generate or otherwise manipulate a second, distinct program—and then you run the second program. After you run the code generator, you can actually read the generated code and (if you want to test your tolerance for pain) even modify it by hand before you finally run it. This is also what happens under the hood with C++ templates: the compiler turns your templates into a regular C++ program before compiling them, and then you run the compiled program.

In this book, I’ll stick to a different meaning of metaprogramming, focusing on code that manipulates itself at runtime. You can think of this as dynamic metaprogramming to distinguish it from the static metaprogramming of code generators and compilers. While you can do some amount of dynamic metaprogramming in many languages (for example, by using bytecode manipulation in Java), only a few languages allow you to do it seamlessly and elegantly, and Ruby is one of them.
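The two flavors can be put side by side in a short sketch. Everything named here (COLUMNS, generate_accessors, StaticMovie, DynamicMovie) is invented for the comparison. The first half behaves like a tiny code generator that emits the text of a second program, and the second half changes a class directly while the program runs.

```ruby
COLUMNS = ["title", "director"]   # hypothetical column list

# Static flavor: produce source text for a second program. You could save
# this text to a file, read it, edit it by hand, and run it later.
def generate_accessors(class_name, columns)
  lines = columns.map { |c| "  attr_accessor :#{c}" }.join("\n")
  "class #{class_name}\n#{lines}\nend\n"
end

puts generate_accessors("StaticMovie", COLUMNS)

# Dynamic flavor: the running program modifies its own class,
# with no intermediate source text to read or edit.
class DynamicMovie; end
COLUMNS.each { |c| DynamicMovie.class_eval { attr_accessor c } }

movie = DynamicMovie.new
movie.director = "Stanley Kubrick"
movie.director   # => "Stanley Kubrick"
```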

You might think that this is exotic, seldom-used stuff—but if you look at Ruby, as we’re about to do, you’ll see that it’s used frequently.

Metaprogramming and Ruby

Remember our earlier talk about ghost towns and marketplaces? If you want to manipulate language constructs, those constructs must exist at runtime. In this respect, some languages are better than others. Take a quick glance at a few languages and how much control they give you at runtime.

A program written in C spans two different worlds: compile time, where you have language constructs such as variables and functions, and runtime, where you just have a bunch of machine code. Because most information from compile time is lost at runtime, C doesn’t support metaprogramming or introspection. In C++, some language constructs do survive compilation, and that’s why you can ask a C++ object for its class. In Java, the distinction between compile time and runtime is even fuzzier. You have enough introspection at your disposal to list the methods of a class or climb up a chain of superclasses.

Ruby is a very metaprogramming-friendly language. It has no compile time at all, and most constructs in a Ruby program are available at runtime. You don’t come up against a brick wall dividing the code that you’re writing from the code that your computer executes when you run the program. There is just one world.
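As a quick illustration of that single world, the sketch below climbs a chain of superclasses from inside a running program. The superclass_chain helper is a name I introduced for the example; Class#superclass is the standard call doing the work.

```ruby
# Walk up the superclass chain of any class, entirely at runtime.
# superclass_chain is a hypothetical helper written for this example.
def superclass_chain(klass)
  chain = []
  while klass
    chain << klass
    klass = klass.superclass   # nil once you climb past BasicObject
  end
  chain
end

superclass_chain(Integer)  # => [Integer, Numeric, Object, BasicObject]
```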

In this one world, metaprogramming is everywhere. Ruby metaprogramming isn’t an obscure art reserved for gurus, and it’s not a bolt-on power feature that’s useful only for building something as sophisticated as Active Record. If you want to take the path to advanced Ruby coding, you’ll find metaprogramming at every step. Even if you’re happy with the amount of Ruby you already know and use, you’re still likely to stumble on metaprogramming in the source of popular frameworks, in your favorite library, and even in small examples from random blogs.

In fact, metaprogramming is so deeply entrenched in the Ruby language that it’s not even sharply separated from “regular” programming. You can’t look at a Ruby program and say, “This part here is metaprogramming, while this other part is not.” In a sense, metaprogramming is a routine part of every Ruby programmer’s job. Once you master it, you’ll be able to tap into the full power of the language.

There is also another less obvious reason why you might want to learn metaprogramming. As simple as Ruby looks at first, you can quickly become overwhelmed by its subtleties. Sooner or later, you’ll be asking yourself questions such as “Can an object call a private method on another object of the same class?” or “How can you define class methods by importing a module?” Ultimately, all of Ruby’s seemingly complicated behaviors derive from a few simple rules. Through metaprogramming, you can get an intimate look at the language, learn those rules, and get answers to your nagging questions.

Now that you know what metaprogramming is about, you’re ready to dive in.