Metaprogramming Ruby 2: Program Like the Ruby Pros (2014)

Part 2. Metaprogramming in Rails

Chapter 9. The Design of Active Record

Active Record is the library in Rails that maps Ruby objects to database records. This functionality is called object-relational mapping, and it allows you to get the best of both the relational database (used for persistence) and object-oriented programming (used for business logic).

In this chapter, as well as the next two, we’ll take a look at the high-level design of Active Record’s source code and how its pieces fit together. We are less interested in what Active Record does than in how it does it. All we need is a very short example of mapping a class to a database, just enough to kickstart our exploration of Active Record’s internals.

A Short Active Record Example

Assume that you already have a file-based SQLite database that follows Active Record’s conventions: this database contains a table called ducks, which has a field called name. You want to map the records in the ducks table to objects of class Duck in your code.

Let’s start by requiring Active Record and opening a connection to the database. (If you want to run this code on your system, you also need to install the SQLite database and the sqlite3 gem. But you can probably follow along fine by just reading the example, without running it.)

part2/ar_example.rb

require 'active_record'

ActiveRecord::Base.establish_connection :adapter => "sqlite3",
                                        :database => "dbfile"

Note that in a Rails application, you don’t need to worry about opening the connection; the application reads the names of the adapter and the database from a configuration file, and it calls establish_connection for you. We’re using Active Record on its own here, so we have to open the connection ourselves.

ActiveRecord::Base is the most important class in Active Record. Not only does it contain class methods that do important things, such as opening database connections, it’s also the superclass of all mapped classes, such as Duck:

class Duck < ActiveRecord::Base
  validate do
    errors.add(:base, "Illegal duck name.") unless name[0] == 'D'
  end
end

The validate method is a Class Macro that takes a block. You don’t have to worry about the details of the code in the block; just know that in this example, it ensures that a Duck’s name begins with a D. (Our company’s duck-naming policies demand that.) If you try to save a Duck with an illegal name to the database, the save! method will raise an exception, while the more discreet save will fail silently and return false.
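To make the idea of a block-taking Class Macro concrete, here is a minimal sketch in plain Ruby. It is not Active Record’s implementation: TinyModel, validation_blocks, and the array-based errors are hypothetical names invented for this illustration. The macro only stores the block, and valid? later runs it in the context of the instance.

```ruby
# Hypothetical miniature of a validate-style Class Macro (not Active Record's code).
class TinyModel
  # A class method that can be called from the body of a subclass.
  # It does not run the block; it only remembers it.
  def self.validate(&block)
    validation_blocks << block
  end

  # Blocks are stored per class, in a class-level instance variable.
  def self.validation_blocks
    @validation_blocks ||= []
  end

  def errors
    @errors ||= []
  end

  # Runs every stored block with self set to this instance,
  # so the block can call errors and name directly.
  def valid?
    errors.clear
    self.class.validation_blocks.each { |block| instance_eval(&block) }
    errors.empty?
  end
end

class Duck < TinyModel
  attr_accessor :name

  validate do
    errors << "Illegal duck name." unless name[0] == 'D'
  end
end

duck = Duck.new
duck.name = "Donald"
duck.valid?   # => true
duck.name = "Huey"
duck.valid?   # => false
```

The point of the sketch is only the shape of the technique: a class-level method that reads like a declaration, and a block that is evaluated later against each object.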

By convention, Active Record automatically maps Duck objects to the ducks table. By looking at the database schema, Active Record also finds out that Ducks have a name, and it defines a Ghost Method to access that field. Thanks to these conventions, you can use the Duck class right away:

my_duck = Duck.new
my_duck.name = "Donald"
my_duck.valid?         # => true
my_duck.save!

I’ve checked that my_duck is valid (its name begins with a D) and saved it to the database. Reading it back, you get this:

duck_from_database = Duck.first
duck_from_database.name         # => "Donald"
duck_from_database.delete
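Note that Duck never defines name or name= itself. As a rough illustration of how a Ghost Method can provide such accessors, the following sketch answers them through method_missing. It is an assumption-laden toy, not Active Record’s code: GhostRecord is a hypothetical class, and the hard-coded COLUMNS list stands in for what Active Record would read from the database schema.

```ruby
# Hypothetical sketch of Ghost Method accessors (not Active Record's code).
class GhostRecord
  # Assumption: pretend the schema lookup found a single column.
  COLUMNS = [:name].freeze

  def initialize
    @attributes = {}
  end

  # Called by Ruby when no regular method matches the call.
  def method_missing(method_name, *args)
    column = method_name.to_s.chomp('=').to_sym
    return super unless COLUMNS.include?(column)

    if method_name.to_s.end_with?('=')
      @attributes[column] = args.first   # writer, e.g. name=
    else
      @attributes[column]                # reader, e.g. name
    end
  end

  # Keeps respond_to? consistent with the ghost methods.
  def respond_to_missing?(method_name, include_private = false)
    COLUMNS.include?(method_name.to_s.chomp('=').to_sym) || super
  end
end

duck = GhostRecord.new
duck.name = "Donald"
duck.name               # => "Donald"
duck.respond_to?(:name) # => true
```

Calls for unknown names still fall through to super and raise NoMethodError, which keeps typos from being silently swallowed.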

That’s enough code for now to give you a sense of how Active Record is meant to be used. Now let’s see what’s happening under the hood.

How Active Record Is Put Together

The code in the previous example looks simple, but ActiveRecord::Base is capable of much more than that. Indeed, the more you use Active Record, the more the methods in Base seem to multiply. You might assume that Base is a huge class with thousands of lines of code that define methods such as save or validate.

Surprisingly, the source code of ActiveRecord::Base contains no trace of those methods. This is a common problem for newcomers to Rails: it’s often difficult to understand where a specific method comes from and how it gets into a class such as Base. The rest of this short chapter will look at how ActiveRecord::Base’s functionality is assembled.

Let’s start by taking a step back to the first line in our example: require 'active_record'.

The Autoloading Mechanism

Here’s the code in active_record.rb, the only Active Record file that you’re likely to require:

gems/activerecord-4.1.0/lib/active_record.rb

require 'active_support'
require 'active_model'
# ...

module ActiveRecord
  extend ActiveSupport::Autoload

  autoload :Base
  autoload :NoTouching
  autoload :Persistence
  autoload :QueryCache
  autoload :Querying
  autoload :Validations
  # ...

Active Record relies heavily on two other libraries that it loads straight away: Active Support and Active Model. We’ll get to Active Model soon, but one piece of Active Support is already used in this code: the ActiveSupport::Autoload module, which defines an autoload method. This method uses a naming convention to automatically find and require the source code of a module (or class) the first time you use the module’s name. The ActiveRecord module extends ActiveSupport::Autoload, so autoload becomes a class method on the ActiveRecord module itself. (If you’re confused by this mechanism, look back at the Class Extension spell.)

Active Record then uses autoload as a Class Macro to register dozens of modules, a few of which you can see in the code above. As a result, Active Record acts like a smart Namespace that automatically loads all the bits and pieces that make up the library. For example, when you use ActiveRecord::Base for the first time, autoload automatically requires the file active_record/base.rb, which in turn defines the class. Let’s take a look at this definition.
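Before moving on, the convention-based lookup described above can be sketched in plain Ruby. This is not the source of ActiveSupport::Autoload; ConventionAutoload, TinyRecord, and the underscore helper are hypothetical names made up for the illustration. The sketch derives a file path from the constant name and then defers to Ruby’s built-in Module#autoload, which delays the require until the constant is first used.

```ruby
require 'tmpdir'

# Hypothetical miniature of a convention-based autoload (not Active Support's code).
module ConventionAutoload
  # Used like a Class Macro: `autoload :QueryCache`.
  # With no explicit path, guess "tiny_record/query_cache" from the names,
  # then hand over to Ruby's built-in Module#autoload via super.
  def autoload(const_name, path = nil)
    path ||= File.join(underscore(name), underscore(const_name.to_s))
    super(const_name, path)
  end

  private

  # "QueryCache" -> "query_cache"
  def underscore(camel_cased)
    camel_cased.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
  end
end

# Set up a throwaway library on disk so the sketch runs as-is.
lib_dir = Dir.mktmpdir
Dir.mkdir(File.join(lib_dir, 'tiny_record'))
File.write(File.join(lib_dir, 'tiny_record', 'query_cache.rb'), <<~RUBY)
  module TinyRecord
    module QueryCache
    end
  end
RUBY
$LOAD_PATH.unshift(lib_dir)

module TinyRecord
  extend ConventionAutoload   # autoload becomes a method on TinyRecord itself

  autoload :QueryCache        # registers the constant; nothing is required yet
end

TinyRecord.autoload?(:QueryCache)   # => "tiny_record/query_cache" (still pending)
TinyRecord::QueryCache              # first use triggers the require
TinyRecord.autoload?(:QueryCache)   # => nil (already loaded)
```

Because the module is extended rather than included, the overriding autoload sits in front of Module#autoload in TinyRecord’s own method lookup, which is what lets super reach the built-in version.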

ActiveRecord::Base

Here is the entire definition of ActiveRecord::Base:

gems/activerecord-4.1.0/lib/active_record/base.rb

module ActiveRecord
  class Base
    extend ActiveModel::Naming
    extend ActiveSupport::Benchmarkable
    extend ActiveSupport::DescendantsTracker
    extend ConnectionHandling
    extend QueryCache::ClassMethods
    extend Querying
    extend Translation
    extend DynamicMatchers
    extend Explain
    extend Enum
    extend Delegation::DelegateCache

    include Core
    include Persistence
    include NoTouching
    include ReadonlyAttributes
    include ModelSchema
    include Inheritance
    include Scoping
    include Sanitization
    include AttributeAssignment
    include ActiveModel::Conversion
    include Integration
    include Validations
    include CounterCache
    include Locking::Optimistic
    include Locking::Pessimistic
    include AttributeMethods
    include Callbacks
    include Timestamp
    include Associations
    include ActiveModel::SecurePassword
    include AutosaveAssociation
    include NestedAttributes
    include Aggregations
    include Transactions
    include Reflection
    include Serialization
    include Store
    include Core
  end

  ActiveSupport.run_load_hooks(:active_record, Base)
end

It’s not uncommon to see a class that assembles its functionality out of modules, but ActiveRecord::Base does this on a large scale. The code above does nothing but extend or include dozens of modules. (Plus one additional line, the call to run_load_hooks, that allows some of those modules to run their own configuration code after they’ve been autoloaded.) As it turns out, many of the modules included by Base also include even more modules.

This is where the autoloading mechanism pays off. ActiveRecord::Base doesn’t need to require a module’s source code and then include the module. Instead, it just includes the module. Thanks to autoloading, classes such as Base can do lots of module inclusions with minimal code.
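As a reminder of what the two keywords in that listing do, here is a minimal sketch of a class assembled from modules. TinyBase, TinyQuerying, and TinyPersistence are hypothetical stand-ins, not Active Record modules: extend turns a module’s methods into class methods, while include turns them into instance methods.

```ruby
# Hypothetical stand-in for a module that a Base-like class extends.
module TinyQuerying
  def first
    new   # a real implementation would load a row; here we just build an object
  end
end

# Hypothetical stand-in for a module that a Base-like class includes.
module TinyPersistence
  def save
    "saved a #{self.class.name}"
  end
end

class TinyBase
  extend TinyQuerying      # methods become class methods: TinyBase.first
  include TinyPersistence  # methods become instance methods: TinyBase.new.save
end

TinyBase.first.save               # => "saved a TinyBase"
TinyBase.respond_to?(:first)      # => true
TinyBase.new.respond_to?(:first)  # => false
```

The class body itself contains no method definitions at all, which is the same shape as the Base listing above, only much smaller.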

In some cases, it’s not too hard to find which module a specific method in Base comes from. For example, persistence methods such as save come from ActiveRecord::Persistence:

gems/activerecord-4.1.0/lib/active_record/persistence.rb

module ActiveRecord
  module Persistence
    def save(*) # ...
    def save!(*) # ...
    def delete # ...

Other method definitions are harder to find. In A Short Active Record Example, you looked at validation methods such as valid? and validate. Let’s go hunting for them.
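Ruby’s reflection can do part of this hunting for you. The sketch below uses only standard Ruby calls (Module#instance_method, UnboundMethod#owner, UnboundMethod#source_location, and Module#ancestors) on hypothetical stand-in modules named Storage, Checks, and Model. The same calls can be tried on a real class, but what they report there depends on the installed library version, so no output for Active Record is claimed here.

```ruby
# Hypothetical stand-in modules; none of these are Active Record's.
module Storage
  def save
    :stored
  end
end

module Checks
  def valid?
    true
  end
end

module Model
  include Checks   # a module can pull in further modules
end

class Record
  include Storage
  include Model
end

# Which module actually defines each method?
Record.instance_method(:save).owner     # => Storage
Record.instance_method(:valid?).owner   # => Checks

# Where in the source is it? Returns a [file, line] pair.
Record.instance_method(:valid?).source_location

# The order in which Ruby searches for methods.
Record.ancestors.take(4)   # => [Record, Model, Checks, Storage]
```

Note that owner reports the first definition found along the ancestors chain, so when several modules define the same method name, it names the one that wins the lookup.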

The Validations Modules

Among the other modules, ActiveRecord::Base includes a module named ActiveRecord::Validations. This module looks like a good candidate to define methods such as valid? and validate. Indeed, if you look in ActiveRecord::Validations, you’ll find the definition of valid?—but no validate:

gems/activerecord-4.1.0/lib/active_record/validations.rb

module ActiveRecord
  module Validations
    include ActiveModel::Validations
    # ...
    def valid?(context = nil) # ...

Where is validate? We can look for the answer in ActiveModel::Validations, a module that ActiveRecord::Validations includes. This module comes from Active Model, a library that Active Record depends on. Sure enough, if you look into its source, you’ll find that validate is defined in ActiveModel::Validations.

A couple of puzzling details exist in this sequence of module inclusions. The first one is this: normally, a class gains instance methods by including a module. But validate is a class method on ActiveRecord::Base. How can Base gain class methods by including modules? This is the topic of the next chapter, Chapter 10, Active Support’s Concern Module, where we’ll also look at the metaprogramming treasure trove that hides behind this assembly of modules. For now, notice that the modules in Active Record are special. You gain both instance and class methods by including them.
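To show that the puzzle has at least one plain-Ruby answer, here is a sketch of a common idiom: the module’s included hook extends the includer with a nested module of class methods. This is an assumption for illustration, not a claim about how Active Model or Active Support do it. TinyValidations, ClassMethods, and Account are hypothetical names, and the blocks here return true or false rather than adding to an errors collection.

```ruby
# Hypothetical module that hands out both instance and class methods.
module TinyValidations
  # Ruby calls this hook whenever the module is included somewhere.
  def self.included(base)
    base.extend(ClassMethods)
  end

  # Methods in here end up as class methods on the includer.
  module ClassMethods
    def validate(&block)
      validations << block
    end

    def validations
      @validations ||= []
    end
  end

  # A regular instance method, gained through include as usual.
  # Each stored block runs with self set to the instance.
  def valid?
    self.class.validations.all? { |check| instance_exec(&check) }
  end
end

class Account
  include TinyValidations   # one include, two kinds of methods

  attr_accessor :password

  validate { password != '1234' }
end

account = Account.new
account.password = '12345'
account.valid?   # => true
account.password = '1234'
account.valid?   # => false
```

The includer writes a single include line and never sees the extend call, which is why the resulting class reads as if including a module were enough to gain class methods.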

You might also have this question: why does ActiveRecord::Base need both ActiveRecord::Validations and ActiveModel::Validations? There is a story behind these two similarly named modules: in earlier versions of Rails there was no Active Model library, and validate was indeed defined in ActiveRecord::Validations. As Active Record kept growing, its authors realized that it was doing two separate jobs. The first job was dealing with the database operations, such as saving and loading. The second job was dealing with the object model: maintaining an object’s attributes, or tracking which of those attributes were valid.

At this point, the authors of Active Record decided to split the library into two separate libraries, and thus was Active Model born. While the database-related operations stayed in Active Record, the model-related ones moved to Active Model. In particular, the valid? method has its own reasons to dabble with the database (it cares whether an object has already been saved to the database), so it stayed in ActiveRecord::Validations. By contrast, validate has no relationship to the database, and it only cares about the object’s attributes. So it moved to ActiveModel::Validations.
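The layering described above can be sketched with two stand-in modules. Everything here is hypothetical (ModelChecks, RecordChecks, Item, and the :create and :update context names are assumptions for the illustration, not quotations from the libraries). The model-only layer judges attributes, while the database-aware layer overrides valid?, adds what it knows about persistence, and delegates the rest with super.

```ruby
# Stand-in for the model-only layer: it knows about attributes, not databases.
module ModelChecks
  def valid?(context = nil)
    @last_context = context   # remembered only so the sketch can show it
    !name.to_s.empty?
  end
end

# Stand-in for the database-aware layer, built on top of the model layer.
module RecordChecks
  include ModelChecks

  def valid?(context = nil)
    # The only database-related decision: has this object been saved before?
    context ||= (new_record? ? :create : :update)
    super(context)   # the attribute checks stay in ModelChecks
  end
end

class Item
  include RecordChecks

  attr_accessor :name
  attr_reader :last_context

  def initialize
    @saved = false
  end

  def new_record?
    !@saved
  end

  def mark_saved   # pretend the object was written to the database
    @saved = true
  end
end

item = Item.new
item.name = "lamp"
item.valid?         # => true
item.last_context   # => :create
item.mark_saved
item.valid?         # => true
item.last_context   # => :update
```

Because RecordChecks includes ModelChecks, a class that needs no database can include ModelChecks alone and still get a working valid?.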

We could hunt for more method definitions, but by now you can see what Active Record’s high-level design boils down to: the most important class, ActiveRecord::Base, is an assembly of modules. Each module adds instance methods (and even class methods) to the Base mix. Some modules, such as Validations, in turn include more modules, sometimes from different libraries, bringing even more methods into Base.

Before looking deeper into Active Record’s structure, let’s see what this unusual design can teach us.

A Lesson Learned

By including so many modules, ActiveRecord::Base ends up being a very large class. In a plain-vanilla Rails installation, Base has more than 300 instance methods and a staggering 550 class methods. ActiveRecord::Base is the ultimate Open Class.

When I looked at Active Record for the first time, I’d been a Java programmer for years. Active Record’s source code left me shocked. No sane Java coder would ever write a library that consists almost exclusively of a single huge class with many hundreds of methods. Such a library would be madness—impossible to understand and maintain!

And yet, that’s exactly what Active Record’s design is like: most methods in the library ultimately get rolled inside one class. But wait, it gets worse. As we’ll discuss later, some of the modules that comprise Active Record don’t think twice about using metaprogramming to define even more methods on their includer. To add insult to injury, even additional libraries that work with Active Record often take the liberty of extending ActiveRecord::Base with modules and methods of their own. You might think that the result of this relentless piling up of methods would be a tangled mass of spaghetti. But it isn’t.

Consider the evidence: not only does Active Record get away with that design, it also proves easy to read and change. Many users modify and Monkeypatch Active Record for their own purposes, and hundreds of contributors have worked on the original source code. Still, the source code evolves so quickly that the poor authors of books such as this one need to rewrite most of their content with every new edition. Active Record manages to stay stable and reliable even as it changes, and most coders are happy using the latest version of the library in their production systems.

Here is the most important guideline I learned from Active Record’s design: design techniques are relative, and they depend on the language you’re using. In Ruby, you use idioms that are different from those of other languages you might be used to. It’s not that the good design rules of old suddenly grew obsolete. On the contrary, the basic tenets of design (decoupling, simplicity, no duplication) hold true in Ruby as much as they do in any other language. In Ruby, though, the techniques you wield to achieve those design goals can be surprisingly different.

Look at ActiveRecord::Base again. It’s a huge class, but this complex class doesn’t exist in the source code. Instead, it is composed at runtime by assembling loosely coupled, easy-to-test, easy-to-reuse modules. If you only need the validation features, you can include ActiveModel::Validations in your own class and happily ignore ActiveRecord::Base and all the other modules, as in the following code:

part2/validations.rb

require 'active_model'

class User
  include ActiveModel::Validations

  attr_accessor :password

  validate do
    errors.add(:base, "Don't let dad choose the password.") if password == '1234'
  end
end

user = User.new
user.password = '12345'
user.valid?               # => true

user.password = '1234'
user.valid?               # => false

Look at how well-decoupled the code above is. ActiveModel::Validations doesn’t force you to meddle with inheritance, to worry about database-related concerns, or to manage any other complicated dependency. Just by including it, you get a complete set of validation methods without adding complexity.

Speaking of ActiveModel::Validations, I promised that I’d show you how this module adds class methods such as validate to its includer. I’ll keep that promise in the next chapter.