Metaprogramming Ruby 2: Program Like the Ruby Pros (2014)

Part 1. Metaprogramming Ruby

Chapter 4. Wednesday: Blocks

Yesterday you learned a lot about methods and method calls. Today you will deal with blocks.

You’re probably already familiar with blocks—you can’t write much Ruby code without them. But what you might not know is that blocks are a powerful tool for controlling scope, meaning which variables and methods can be seen by which lines of code. In this chapter, you’ll discover how this control of scope makes blocks a cornerstone of Ruby metaprogramming.

Blocks are just one member of a larger family of “callable objects,” which include objects such as procs and lambdas. This chapter shows how you can use these and other callable objects to their greatest advantage—for example, to store a block and execute it later.

Just a short public service announcement before getting started: the previous chapters never strayed far from the usual object-oriented concepts, such as classes, objects, and methods. Blocks have a different heritage that can be traced back to functional programming languages, such as LISP. If you think in objects and classes, expect to deal with some novel concepts in this chapter. You’re likely to find these concepts strange and, at the same time, fascinating.

With that sneak peek into what this chapter is all about, it’s now time to step into the office.

The Day of the Blocks

Where you and Bill agree to put off today’s job, make a roadmap, and review the basics of blocks.

You’ve barely had time to check your mail, and Bill is already making his way to your desk, eager to get to work. “I talked with the boss about today’s job,” he says. “I won’t go into the details now, but I can tell you that we’re going to need blocks for today’s project.” Before the two of you jump into the fray, you need to understand the nuances of blocks. You agree to spend the morning talking about blocks, putting off today’s project until after lunch.

Today’s Roadmap

On a sheet of paper, Bill lists the things he wants to cover:

· A review of the basics of blocks

· An overview of scopes and how you can carry variables through scopes by using blocks as closures

· How you can further manipulate scopes by passing a block to instance_eval

· How you can convert blocks into callable objects that you can set aside and call later, such as Procs and lambdas

You start with the first point—a quick review of the basics. (If you already know the basics of Ruby blocks, you can skip straight to Blocks Are Closures.)

The Basics of Blocks

Do you remember how blocks work? Here is a simple example to refresh your memory:

blocks/basics_failure.rb

def a_method(a, b)

a + yield(a, b)

end

a_method(1, 2) {|x, y| (x + y) * 3 } # => 10

You can define a block with either curly braces or the do…end keywords. A common convention is to use curly braces for single-line blocks and do…end for multiline blocks.

You can define a block only when you call a method. The block is passed straight into the method, and the method can call back to the block with the yield keyword.

Optionally, a block can have arguments, like x and y in the previous example. When you yield to the block, you can provide values for its arguments, just like you do when you call a method. Also, like a method, a block returns the result of the last line of code it evaluates.

Within a method, you can ask Ruby whether the current call includes a block. You can do that with the Kernel#block_given? method:

def a_method

return yield if block_given?

'no block'

end

a_method # => "no block"

a_method { "here's a block!" } # => "here's a block!"

If you use yield when block_given? is false, you’ll get a runtime error.
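
As a quick check of that last point, here is a minimal sketch of what happens when a method yields without a block. The method name needs_block is made up for illustration, and the error is rescued only so that the snippet can keep running:

```ruby
# needs_block is a hypothetical method used only for illustration.
def needs_block
  yield
end

error_class =
  begin
    needs_block              # called without a block
  rescue LocalJumpError => e
    e.class
  end

error_class                  # => LocalJumpError
needs_block { :ok }          # => :ok
```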

Now you can apply what you know about blocks to a real-life scenario.

Quiz: Ruby#

Where you’re challenged to do something useful with blocks.

Bill shares a little secret: “You know, a few years ago I was making a living out of writing C# code. I must admit that C# did have a few nice features. Let me show you one of those.”

The using Keyword

Imagine that you’re writing a C# program that connects to a remote server and you have an object that represents the connection:

RemoteConnection conn = new RemoteConnection("my_server");

String stuff = conn.ReadStuff();

conn.Dispose(); // close the connection to avoid a leak

This code correctly disposes of the connection after using it. However, it doesn’t deal with exceptions. If ReadStuff throws an exception, then the last line is never executed, and conn is never disposed of. What the code should do is manage exceptions, disposing of the connection regardless of whether an exception is thrown. C# provides a keyword named using that goes through the whole process for you:

RemoteConnection conn = new RemoteConnection("some_remote_server");

using (conn)

{

conn.ReadData();

DoMoreStuff();

}

The using keyword expects that conn has a method named Dispose. This method is called automatically after the code in the curly braces, regardless of whether an exception is thrown.

The Challenge

To refresh the basics of blocks, Bill throws a challenge at you: write a Ruby version of using. Make sure it passes this test:

blocks/using_test.rb

require 'test/unit'

require_relative 'using'

class TestUsing < Test::Unit::TestCase

class Resource

def dispose

@disposed = true

end

def disposed?

@disposed

end

end

def test_disposes_of_resources

r = Resource.new

using(r) {}

assert r.disposed?

end

def test_disposes_of_resources_in_case_of_exception

r = Resource.new

assert_raises(Exception) {

using(r) {

raise Exception

}

}

assert r.disposed?

end

end

Quiz Solution

Take a look at this solution to the quiz:

blocks/using.rb

module Kernel

def using(resource)

begin

yield

ensure

resource.dispose

end

end

end

You can’t define a new keyword, but you can fake it with a Kernel Method. Kernel#using takes the managed resource as an argument. It also takes a block, which it executes. Regardless of whether the block completes normally, the ensure clause calls dispose on the resource to release it cleanly. There is no rescue clause, so any exception is still propagated to the code that calls Kernel#using.

Now that you’ve reviewed block basics, you can move to the second item on the list from Today’s Roadmap: closures.

Blocks Are Closures

Where you find there is more to blocks than meets the eye and you learn how to smuggle variables across scopes.

As Bill notes on a piece of scratch paper, a block is not just a floating piece of code. You can’t run code in a vacuum. When code runs, it needs an environment: local variables, instance variables, self….

Figure 6. Code that runs is actually made up of two things: the code itself and a set of bindings.

Because these entities are basically names bound to objects, you can call them the bindings for short. The main point about blocks is that they are all inclusive and come ready to run. They contain both the code and a set of bindings.

You’re probably wondering where the block picks up its bindings. When you define the block, it simply grabs the bindings that are there at that moment, and then it carries those bindings along when you pass the block into a method:

blocks/blocks_and_bindings.rb

def my_method

x = "Goodbye"

yield("cruel")

end

x = "Hello"

my_method {|y| "#{x}, #{y} world" } # => "Hello, cruel world"

When you create the block, you capture the local bindings, such as x. Then you pass the block to a method that has its own separate set of bindings. In the previous example, those bindings also include a variable named x. Still, the code in the block sees the x that was around when the block was defined, not the method’s x, which is not visible at all in the block.

You can also define additional bindings inside the block, but they disappear after the block ends:

blocks/block_local_vars_failure.rb

def just_yield

yield

end

top_level_variable = 1

just_yield do

top_level_variable += 1

local_to_block = 1

end

top_level_variable # => 2

local_to_block # => Error!

Because of the properties above, a computer scientist would say that a block is a closure. For the rest of us, this means a block captures the local bindings and carries them along with it.

So, how do you use closures in practice? To understand that, take a closer look at the place where all the bindings reside—the scope. Here you’ll learn to identify the spots where a program changes scope, and you’ll encounter a particular problem with changing scopes that can be solved with closures.

Scope

Imagine being a little debugger making your way through a Ruby program. You jump from statement to statement until you finally hit a breakpoint. Now catch your breath and look around. See the scenery around you? That’s your scope.

You can see bindings all over the scope. Look down at your feet, and you see a bunch of local variables. Raise your head, and you see that you’re standing within an object, with its own methods and instance variables; that’s the current object, also known as self. Farther away, you see the tree of constants so clear that you could mark your current position on a map. Squint your eyes, and you can even see a bunch of global variables off in the distance.

But what happens when you get tired of the scenery and decide to move on?

Changing Scope

This example shows how scope changes as your program runs, tracking the names of bindings with the Kernel#local_variables method:

blocks/scopes.rb

v1 = 1

class MyClass

v2 = 2

local_variables # => [:v2]

def my_method

v3 = 3

local_variables

end

local_variables # => [:v2]

end

obj = MyClass.new

obj.my_method # => [:v3]

obj.my_method # => [:v3]

local_variables # => [:v1, :obj]

Track the program as it moves through scopes. It starts within the top-level scope that you read about in The Top Level. After defining v1 in the top-level scope, the program enters the scope of MyClass’s definition. What happens then?

Global Variables and Top-Level Instance Variables

Global variables can be accessed by any scope:

def a_scope

$var = "some value"

end

def another_scope

$var

end

a_scope

another_scope # => "some value"

The problem with global variables is that every part of the system can change them, so in no time you’ll find it difficult to track who is changing what. For this reason, the general rule is this: when it comes to global variables, use them sparingly, if ever.

You can sometimes use a top-level instance variable in place of a global variable. These are the instance variables of the top-level main object, described in The Top Level:

@var = "The top-level @var"

def my_method

@var

end

my_method # => "The top-level @var"

You can access a top-level instance variable whenever main takes the role of self, as in the previous example. When any other object is self, the top-level instance variable is out of scope.

class MyClass

def my_method

@var = "This is not the top-level @var!"

end

end

Being less universally accessible, top-level instance variables are generally considered safer than global variables—but not by a wide margin.

Some languages, such as Java and C#, allow “inner scopes” to see variables from “outer scopes.” That kind of nested visibility doesn’t happen in Ruby, where scopes are sharply separated: as soon as you enter a new scope, the previous bindings are replaced by a new set of bindings. This means that when the program enters MyClass, v1 “falls out of scope” and is no longer visible.

In the scope of the definition of MyClass, the program defines v2 and a method. The code in the method isn’t executed yet, so the program never opens a new scope until the end of the class definition. There, the scope opened with the class keyword is closed, and the program gets back to the top-level scope.

What happens when the program creates a MyClass object and calls my_method twice? The first time the program enters my_method, it opens a new scope and defines a local variable, v3. Then the program exits the method, falling back to the top-level scope. At this point, the method’s scope is lost. When the program calls my_method a second time, it opens yet another new scope, and it defines a new v3 variable (unrelated to the previous v3, which is now lost). Finally, the program returns to the top-level scope, where you can see v1 and obj again. Phew!

Here is the example’s important point: “Whenever the program changes scope, some bindings are replaced by a new set of bindings.” Granted, this doesn’t happen to all the bindings each and every time. For example, if a method calls another method on the same object, instance variables stay in scope through the call. In general, though, bindings tend to fall out of scope when the scope changes. In particular, local variables change at every new scope. (That’s why they’re “local.”)
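
If you want to watch a local variable fall out of scope, a sketch like the following can help. ScopeDemo and its method are hypothetical names, and the snippet assumes it runs at the top level of a script:

```ruby
v1 = 1

# ScopeDemo is a hypothetical class used only for illustration.
class ScopeDemo
  # Inside the class definition, v1 is not among the local variables.
  SEES_V1_IN_CLASS = local_variables.include?(:v1)

  def try_to_read_v1
    v1                # not a local variable here, so Ruby looks for a method
  rescue NameError => e
    e.class
  end
end

ScopeDemo::SEES_V1_IN_CLASS     # => false
ScopeDemo.new.try_to_read_v1    # => NameError
local_variables.include?(:v1)   # => true
```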

As you can see, keeping track of scopes can be a tricky task. You can spot scopes more quickly if you learn about Scope Gates.

Scope Gates

There are exactly three places where a program leaves the previous scope behind and opens a new one:

· Class definitions

· Module definitions

· Methods

Scope changes whenever the program enters (or exits) a class or module definition or a method. These three borders are marked by the keywords class, module, and def, respectively. Each of these keywords acts like a Scope Gate.

For example, here is the previous example program again, with Scope Gates clearly marked by comments:

v1 = 1

class MyClass # SCOPE GATE: entering class

v2 = 2

local_variables # => [:v2]

def my_method # SCOPE GATE: entering def

v3 = 3

local_variables

end # SCOPE GATE: leaving def

local_variables # => [:v2]

end # SCOPE GATE: leaving class

obj = MyClass.new

obj.my_method # => [:v3]

local_variables # => [:v1, :obj]

Now it’s easy to see that this program opens three separate scopes: the top-level scope, one new scope when it enters MyClass, and one new scope when it calls my_method.

There is a subtle difference between class and module on one side and def on the other. The code in a class or module definition is executed immediately. Conversely, the code in a method definition is executed later, when you eventually call the method. However, as you write your program, you usually don’t care when it changes scope—you only care that it does.
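
The timing difference can be made visible with a short sketch. It records events in a constant, because a constant (unlike a local variable) stays visible across Scope Gates. EVENTS and Timing are hypothetical names:

```ruby
EVENTS = []   # hypothetical log of what has run so far

class Timing
  EVENTS << :class_body_ran      # runs as soon as the definition is read

  def a_method
    EVENTS << :method_body_ran   # runs only when the method is called
  end
end

after_definition = EVENTS.dup    # => [:class_body_ran]
Timing.new.a_method
EVENTS                           # => [:class_body_ran, :method_body_ran]
```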

Now you can pinpoint the places where your program changes scope—the spots marked by class, module, and def. But what if you want to pass a variable through one of these spots? This question takes you back to blocks.

Flattening the Scope

As you become more proficient in Ruby, you run into difficult situations where you want to pass bindings through a Scope Gate:

blocks/flat_scope_1.rb

my_var = "Success"

class MyClass

# We want to print my_var here...

def my_method

# ...and here

end

end

Scope Gates are quite a formidable barrier. As soon as you walk through one of them, local variables fall out of scope. So, how can you carry my_var across not one but two Scope Gates?

Look at the class Scope Gate first. You can’t pass my_var through it, but you can replace class with something else that is not a Scope Gate: a method call. If you could call a method instead of using the class keyword, you could capture my_var in a closure and pass that closure to the method. Can you think of a method that does the same thing that class does?

If you look at Ruby’s documentation, you’ll find the answer: Class.new is a perfect replacement for class. You can also define instance methods in the class if you pass a block to Class.new:

blocks/flat_scope_2.rb

my_var = "Success"

MyClass = Class.new do

# Now we can print my_var here...

puts "#{my_var} in the class definition!"

def my_method

# ...but how can we print it here?

end

end

Now, how can you pass my_var through the def Scope Gate? Once again, you have to replace the keyword with a method call. Think of the discussion about Dynamic Methods: instead of def, you can use Module#define_method:

blocks/flat_scope_3.rb

my_var = "Success"

MyClass = Class.new do

puts "#{my_var} in the class definition"

define_method :my_method do

"#{my_var} in the method"

end

end

puts MyClass.new.my_method

<=

Success in the class definition

Success in the method

If you replace Scope Gates with method calls, you allow one scope to see variables from another scope. Technically, this trick should be called nested lexical scopes, but many Ruby coders refer to it simply as “flattening the scope,” meaning that the two scopes share variables as if the scopes were squeezed together. For short, you can call this spell a Flat Scope.

Sharing the Scope

Once you know about Flat Scopes, you can do pretty much whatever you want with scopes. For example, assume that you want to share a variable among a few methods, and you don’t want anybody else to see that variable. You can do that by defining all the methods in the same Flat Scope as the variable:

blocks/shared_scope.rb

def define_methods

shared = 0

Kernel.send :define_method, :counter do

shared

end

Kernel.send :define_method, :inc do |x|

shared += x

end

end

define_methods

counter # => 0

inc(4)

counter # => 4

This example defines two Kernel Methods. (It also uses Dynamic Dispatch to access the private class method define_method on Kernel.) Both Kernel#counter and Kernel#inc can see the shared variable. No other method can see shared, because it’s protected by a Scope Gate, which is what the define_methods method is for. This smart way to control the sharing of variables is called a Shared Scope.
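
The same idea can be sketched without adding methods to Kernel. The variation below is not the book’s listing: it uses Object#define_singleton_method to attach the two methods to a fresh object, and build_counter is a hypothetical name:

```ruby
# build_counter is a hypothetical helper; its def acts as the Scope Gate
# that hides the shared variable from everything else.
def build_counter
  shared = 0
  counter = Object.new
  counter.define_singleton_method(:value) { shared }
  counter.define_singleton_method(:inc)   { |x| shared += x }
  counter
end

first = build_counter
first.inc(4)
first.value    # => 4

second = build_counter
second.value   # => 0, because each call creates a new shared variable
```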

Shared Scopes are not used much in practice, but they’re a powerful trick and a good example of the power of scopes. With a combination of Scope Gates, Flat Scopes, and Shared Scopes, you can twist and bend your scopes to see exactly the variables you need, from the place you want. Now that you wield this power, it’s time for a wrap-up of Ruby closures.

Closures Wrap-Up

Each Ruby scope contains a bunch of bindings, and the scopes are separated by Scope Gates: class, module, and def.

If you want to sneak a binding or two through a Scope Gate, you can use blocks. A block is a closure: when you define a block, it captures the bindings in the current environment and carries them around. So you can replace the Scope Gate with a method call, capture the current bindings in a closure, and pass the closure to the method.

You can replace class with Class.new, module with Module.new, and def with Module#define_method. This is a Flat Scope, the basic closure-related spell.
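
The earlier examples covered Class.new and define_method. As a rough sketch of the module case, the following uses Module.new in the same way. Greeter and Person are hypothetical names:

```ruby
greeting = "Hello"

# Module.new takes a block, so the local variable greeting stays visible.
Greeter = Module.new do
  define_method :greet do |name|
    "#{greeting}, #{name}!"
  end
end

class Person
  include Greeter
end

Person.new.greet("Bill")   # => "Hello, Bill!"
```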

If you define multiple methods in the same Flat Scope, maybe protected by a Scope Gate, all those methods can share bindings. That’s called a Shared Scope.

Bill glances at the road map he created. (See Today’s Roadmap.) “Now that you’ve gotten a taste of Flat Scopes, we should move on to something more advanced: instance_eval.”

instance_eval()

Where you learn another way to mix code and bindings at will.

The following program demonstrates BasicObject#instance_eval, which evaluates a block in the context of an object:

blocks/instance_eval.rb

class MyClass

def initialize

@v = 1

end

end

obj = MyClass.new

obj.instance_eval do

self # => #<MyClass:0x3340dc @v=1>

@v # => 1

end

The block is evaluated with the receiver as self, so it can access the receiver’s private methods and instance variables, such as @v. Even if instance_eval changes self, the block that you pass to instance_eval can still see the bindings from the place where it’s defined, like any other block:

v = 2

obj.instance_eval { @v = v }

obj.instance_eval { @v } # => 2

The three lines in the previous example are evaluated in the same Flat Scope, so they can all access the local variable v. At the same time, the blocks are evaluated with the object as self, so they can also access obj’s instance variable @v. In all these cases, you can call the block that you pass to instance_eval a Context Probe, because it’s like a snippet of code that you dip inside an object to do something in there.
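
The examples so far read and write instance variables. A Context Probe can reach private methods in the same way, as this sketch suggests. Vault and its secret method are hypothetical:

```ruby
# Vault is a hypothetical class with a private method.
class Vault
  def initialize
    @code = 1234
  end

  private

  def secret
    "the code is #{@code}"
  end
end

vault = Vault.new

called_directly =
  begin
    vault.secret                 # private, so an explicit receiver fails
  rescue NoMethodError => e
    e.class
  end
called_directly                  # => NoMethodError

prefix = "Probe says: "          # a local variable from the surrounding scope
probed = vault.instance_eval { prefix + secret }
probed                           # => "Probe says: the code is 1234"
```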

instance_exec()

instance_eval has a slightly more flexible twin brother named instance_exec that allows you to pass arguments to the block. This feature is useful in a few rare cases, such as the one in this artfully complicated example:

blocks/instance_exec.rb

class C

def initialize

@x = 1

end

end

class D

def twisted_method

@y = 2

C.new.instance_eval { "@x: #{@x}, @y: #{@y}" }

end

end

D.new.twisted_method # => "@x: 1, @y: "

You might assume that the block in D#twisted_method can access both the @x instance variable from C and the @y instance variable from D in the same Flat Scope. However, instance variables depend on self, so when instance_eval switches self to the receiver, all the instance variables in the caller fall out of scope. The code inside the block interprets @y as an instance variable of C that hasn’t been initialized, and as such is nil (and prints out as an empty string).

To merge @x and @y in the same scope, you can use instance_exec to pass @y’s value to the block:

class D

def twisted_method

@y = 2

C.new.instance_exec(@y) {|y| "@x: #{@x}, @y: #{y}" }

end

end

D.new.twisted_method # => "@x: 1, @y: 2"

Breaking Encapsulation

At this point, you might be horrified. With a Context Probe, you can wreak havoc on encapsulation! No data is private data anymore. Isn’t that a Very Bad Thing?

Pragmatically, there are some situations where encapsulation just gets in your way. For one, you might want to take a quick peek inside an object from an irb command line. In a case like this, breaking into the object with instance_eval is often the shortest route.

Another acceptable reason to break encapsulation is arguably testing. Here’s an example.

The Padrino Example

The Padrino web framework defines a Logger class that manages all the logging that a web application must deal with. The Logger stores its own configuration into instance variables. For example, @log_static is true if the application must log access to static files.

Padrino’s unit tests need to change the configuration of the application’s logger. Instead of going through the trouble of creating and configuring a new logger, the following tests (written with the RSpec test gem) just pry open the existing application logger and change its configuration with a Context Probe:

gems/padrino-core-0.11.3/test/test_logger.rb

describe "PadrinoLogger" do

context 'for logger functionality' do

context "static asset logging" do

should 'not log static assets by default' do

# ...

get "/images/something.png"

assert_equal "Foo", body

assert_match "", Padrino.logger.log.string

end

should 'allow turning on static assets logging' do

Padrino.logger.instance_eval{ @log_static = true }

# ...

get "/images/something.png"

assert_equal "Foo", body

assert_match /GET/, Padrino.logger.log.string

Padrino.logger.instance_eval{ @log_static = false }

end

end

# ...

The first test accesses a static file and checks that the logger doesn’t log anything. This is Padrino’s default behavior. The second test uses instance_eval to change the logger’s configuration and enable static file logging. Then it accesses the same URL as the first test and checks that the logger actually logged something. Before exiting, the second test resets static file logging to the default false state.

You can easily criticize these tests for being fragile: if the implementation of Logger changes and the @log_static instance variable disappears, then the test will break. Like many other things in Ruby, encapsulation is a flexible tool that you can choose to ignore, and it’s up to you to decide if and when to accept that risk. The authors of Padrino decided that a quick hack inside the logger object was an acceptable workaround in this case.

Clean Rooms

Sometimes you create an object just to evaluate blocks inside it. An object like that can be called a Clean Room:

blocks/clean_room.rb

class CleanRoom

def current_temperature

# ...

end

end

clean_room = CleanRoom.new

clean_room.instance_eval do

if current_temperature < 20

# TODO: wear jacket

end

end

A Clean Room is just an environment where you can evaluate your blocks. It can expose a few useful methods that the block can call, such as current_temperature in the example above. However, the ideal Clean Room doesn’t have many methods or instance variables, because the names of those methods and instance variables could clash with the names in the environment that the block comes from. For this reason, instances of BasicObject usually make for good Clean Rooms: they’re Blank Slates, so they have barely any methods at all.

(Interestingly, BasicObject is even cleaner than that: in a BasicObject, standard Ruby constants such as String are out of scope. If you want to reference a constant from a BasicObject, you have to use its absolute path, such as ::String.)
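
Here is a minimal sketch of a BasicObject-based Clean Room, including the constant lookup quirk just mentioned. BlankRoom, its fixed temperature, and the two constant-reading methods are hypothetical:

```ruby
# BlankRoom is a hypothetical Clean Room built on BasicObject.
class BlankRoom < BasicObject
  def current_temperature
    18                      # fixed value, just for illustration
  end

  def bare_constant
    String                  # Object is not an ancestor, so lookup fails
  rescue ::NameError => e
    e.class
  end

  def absolute_constant
    ::String                # the absolute path always works
  end
end

room = BlankRoom.new
too_cold = room.instance_eval { current_temperature < 20 }
too_cold                 # => true
room.bare_constant       # => NameError
room.absolute_constant   # => String
```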

You’ll find a practical example of a Clean Room in Quiz: A Better DSL.

That’s all you have to know about instance_eval. Now you can move on to the last topic in today’s roadmap: callable objects.

Callable Objects

Where you learn how blocks are just part of a larger family, and Bill shows you how to set code aside and execute it later.

If you get to the bottom of it, using a block is a two-step process. First, you set some code aside, and second, you call the block (with yield) to execute the code. This “package code first, call it later” mechanism is not exclusive to blocks. There are at least three other places in Ruby where you can package code:

· In a proc, which is basically a block turned object

· In a lambda, which is a slight variation on a proc

· In a method

Procs and lambdas are the big ones to talk about here. We’ll start with them and bring methods back into the picture later.

Proc Objects

Although most things in Ruby are objects, blocks are not. But why would you care about that? Imagine that you want to store a block and execute it later. To do that, you need an object.

To solve this problem, Ruby provides the built-in class Proc. A Proc is a block that has been turned into an object. You can create a Proc by passing the block to Proc.new. Later, you can evaluate the block-turned-object with Proc#call:

inc = Proc.new {|x| x + 1 }

# more code...

inc.call(2) # => 3

This technique is called Deferred Evaluation.

There are a few more ways to create Procs in Ruby. Ruby provides two Kernel Methods that convert a block to a Proc: lambda and proc. In a short while, you’ll see that there are subtle differences between creating Procs with lambda and creating them in any other way, but in most cases you can just use whichever one you like best:

dec = lambda {|x| x - 1 }

dec.class # => Proc

dec.call(2) # => 1

Also, you can create a lambda with the so-called “stabby lambda” operator:

p = ->(x) { x + 1 }

Notice the little arrow. The previous code is the same as the following:

p = lambda {|x| x + 1 }

So far, you have seen not one, but four different ways to convert a block to a Proc. There is also a fifth way, which deserves its own section.

The & Operator

A block is like an additional, anonymous argument to a method. In most cases, you execute the block right there in the method, using yield. In two cases, yield is not enough:

· You want to pass the block to another method (or even another block).

· You want to convert the block to a Proc.

In both cases, you need to point at the block and say, “I want to use this block”—to do that, you need a name. To attach a binding to the block, you can add one special argument to the method. This argument must be the last in the list of arguments and prefixed by an & sign. Here’s a method that passes the block to another method:

blocks/ampersand.rb

def math(a, b)

yield(a, b)

end

def do_math(a, b, &operation)

math(a, b, &operation)

end

do_math(2, 3) {|x, y| x * y} # => 6

If you call do_math without a block, the &operation argument is bound to nil, and the yield operation in math fails.
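
The following sketch checks both halves of that statement. It repeats math and do_math so that it runs on its own, and adds a hypothetical helper, block_argument_class, that only reports what the & argument is bound to:

```ruby
def math(a, b)
  yield(a, b)
end

def do_math(a, b, &operation)
  math(a, b, &operation)
end

# Hypothetical helper: shows what the & argument holds.
def block_argument_class(&operation)
  operation.class
end

with_block    = block_argument_class { }   # => Proc
without_block = block_argument_class       # => NilClass

failure =
  begin
    do_math(2, 3)                # no block, so math has nothing to yield to
  rescue LocalJumpError => e
    e.class
  end
failure                          # => LocalJumpError
```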

What if you want to convert the block to a Proc? As it turns out, if you referenced operation in the previous code, you’d already have a Proc object. The real meaning of the & is this: “I want to take the block that is passed to this method and turn it into a Proc.” Just drop the &, and you’ll be left with a Proc again:

def my_method(&the_proc)

the_proc

end

p = my_method {|name| "Hello, #{name}!" }

p.class # => Proc

p.call("Bill") # => "Hello, Bill!"

You now know a bunch of different ways to convert a block to a Proc. But what if you want to convert it back? Again, you can use the & operator to convert the Proc to a block:

blocks/proc_to_block.rb

def my_method(greeting)

"#{greeting}, #{yield}!"

end

my_proc = proc { "Bill" }

my_method("Hello", &my_proc) # => "Hello, Bill!"

When you call my_method, the & converts my_proc to a block and passes that block to the method.
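
The same conversion works with any method that expects a block, including the ones in Ruby’s core classes. As a small sketch with hypothetical variable names:

```ruby
double = lambda { |x| x * 2 }
even   = proc { |x| x.even? }

# The & turns each callable object back into a block for map and select.
doubled = [1, 2, 3].map(&double)       # => [2, 4, 6]
evens   = [1, 2, 3, 4].select(&even)   # => [2, 4]
```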

Now you know how to convert a block to a Proc and back again. Let’s look at a real-life example of a callable object that starts its life as a lambda and is then converted to a regular block.

The HighLine Example

The HighLine gem helps you automate console input and output. For example, you can tell HighLine to collect comma-separated user input and split it into an array, all in a single call. Here’s a Ruby program that lets you input a comma-separated list of friends:

blocks/highline_example.rb

require 'highline'

hl = HighLine.new

friends = hl.ask("Friends?", lambda {|s| s.split(',') })

puts "You're friends with: #{friends.inspect}"

<=

Friends?

=>

Ivana, Roberto, Olaf

<=

You're friends with: ["Ivana", " Roberto", " Olaf"]

You call HighLine#ask with a string (the question for the user) and a Proc that contains the post-processing code. (You might wonder why HighLine requires a Proc argument rather than a simple block. Actually, you can pass a block to ask, but that mechanism is reserved for a different HighLine feature.)

If you read the code of HighLine#ask, you’ll see that it passes the Proc to an object of class Question, which stores the Proc as an instance variable. Later, after collecting the user’s input, the Question passes the input to the stored Proc.
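
The mechanism can be sketched in a few lines. This is not HighLine’s actual code: SketchQuestion and its methods are hypothetical, and the user’s input is passed in as a plain string instead of being read from the console:

```ruby
# SketchQuestion is a hypothetical, much-simplified stand-in for the idea
# of storing a Proc now and calling it later.
class SketchQuestion
  def initialize(prompt, answer_filter)
    @prompt = prompt
    @answer_filter = answer_filter   # the Proc is stored, not called
  end

  def answer_with(raw_input)
    @answer_filter.call(raw_input)   # the Proc runs only at this point
  end
end

question = SketchQuestion.new("Friends?", lambda { |s| s.split(',') })
friends = question.answer_with("Ivana, Roberto, Olaf")
friends   # => ["Ivana", " Roberto", " Olaf"]
```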

If you want to do something else to the user’s input (say, capitalize it), you just create a different Proc:

name = hl.ask("Name?", lambda {|s| s.capitalize })

puts "Hello, #{name}"

<=

Name?

=>

bill

<=

Hello, Bill

This is an example of Deferred Evaluation.

Procs vs. Lambdas

You’ve learned a bunch of different ways to turn a block into a Proc: Proc.new, lambda, the & operator…. In all cases, the resulting object is a Proc.

Confusingly, though, Procs created with lambda actually differ in some respects from Procs created any other way. The differences are subtle but important enough that people refer to the two kinds of Procs by distinct names: Procs created with lambda are called lambdas, while the others are simply called procs. (You can use the Proc#lambda? method to check whether the Proc is a lambda.)
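
As a quick sketch, you can ask each kind of Proc whether it is a lambda. The helper block_to_proc is hypothetical and only returns the block it receives through the & operator:

```ruby
# Hypothetical helper: converts its block to a Proc with &.
def block_to_proc(&block)
  block
end

callables = [
  Proc.new { |x| x },         # proc
  proc { |x| x },             # proc
  lambda { |x| x },           # lambda
  ->(x) { x },                # lambda
  block_to_proc { |x| x }     # proc
]

classes = callables.map { |callable| callable.class }
classes.uniq      # => [Proc]

lambda_flags = callables.map { |callable| callable.lambda? }
lambda_flags      # => [false, false, true, true, false]
```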

One word of warning before you dive into this section: the difference between procs and lambdas is probably the most confusing feature of Ruby, with lots of special cases and arbitrary distinctions. There’s no need to go into all the gory details, but you need to know, at least roughly, the important differences.

There are two differences between procs and lambdas. One has to do with the return keyword, and the other concerns the checking of arguments. Let’s start with return.

Procs, Lambdas, and return

The first difference between lambdas and procs is that the return keyword means different things. In a lambda, return just returns from the lambda:

blocks/proc_vs_lambda.rb

def double(callable_object)

callable_object.call * 2

end

l = lambda { return 10 }

double(l) # => 20

In a proc, return behaves differently. Rather than return from the proc, it returns from the scope where the proc itself was defined:

def another_double

p = Proc.new { return 10 }

result = p.call

return result * 2 # unreachable code!

end

another_double # => 10

If you’re aware of this behavior, you can steer clear of buggy code like this:

def double(callable_object)

callable_object.call * 2

end

p = Proc.new { return 10 }

double(p) # => LocalJumpError

The previous program tries to return from the scope where p is defined. Because you can’t return from the top-level scope, the program fails. You can avoid this kind of mistake if you avoid using explicit returns:

p = Proc.new { 10 }

double(p) # => 20

Now on to the second important difference between procs and lambdas.

Procs, Lambdas, and Arity

The second difference between procs and lambdas concerns the way they check their arguments. For example, a particular proc or lambda might have an arity of two, meaning that it accepts two arguments:

p = Proc.new {|a, b| [a, b]}

p.arity # => 2

What happens if you call this callable object with three arguments, or one single argument? The long answer to this question is complicated and littered with special cases.[7] The short answer is that, in general, lambdas tend to be less tolerant than procs (and regular blocks) when it comes to arguments. Call a lambda with the wrong arity, and it fails with an ArgumentError. On the other hand, a proc fits the argument list to its own expectations:

p = Proc.new {|a, b| [a, b]}

p.call(1, 2, 3) # => [1, 2]

p.call(1) # => [1, nil]

If there are too many arguments, a proc drops the excess arguments. If there are too few arguments, it assigns nil to the missing arguments.
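
For comparison, here is the same two-argument block as a lambda. This sketch rescues the error only so that you can inspect it:

```ruby
l = lambda {|a, b| [a, b] }

l.arity        # => 2
l.call(1, 2)   # => [1, 2]

begin
  l.call(1, 2, 3)   # one argument too many
rescue ArgumentError => e
  error = e
end

error.class    # => ArgumentError
```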

Procs vs. Lambdas: The Verdict

You now know the differences between procs and lambdas. But you’re wondering which kind of Proc you should use in your own code.

Generally speaking, lambdas are more intuitive than procs because they’re more similar to methods. They’re pretty strict about arity, and they simply exit when you call return. For this reason, many Rubyists use lambdas as a first choice, unless they need the specific features of procs.

Method Objects

For the sake of completeness, you might want to take one more look at the last member of the callable objects’ family: methods. If you’re not convinced that methods, like lambdas, are just callable objects, look at this code:

blocks/methods.rb

class MyClass

def initialize(value)

@x = value

end

def my_method

@x

end

end

object = MyClass.new(1)

m = object.method :my_method

m.call # => 1

By calling Kernel#method, you get the method itself as a Method object, which you can later execute with Method#call. In Ruby 2.1, you also have Kernel#singleton_method, which converts the name of a Singleton Method to a Method object. (What are you saying? You don’t know what a Singleton Method is yet? Oh, you will, you will…)

A Method object is similar to a block or a lambda. Indeed, you can convert a Method to a Proc by calling Method#to_proc, and you can convert a block to a method with define_method. However, an important difference exists between lambdas and methods: a lambda is evaluated in the scope it’s defined in (it’s a closure, remember?), while a Method is evaluated in the scope of its object.
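
The following sketch puts those conversions side by side. The Counter class and the reset method are invented for this example. Note how the Method (and the Proc made from it) sees the object's @count, while the lambda sees the local variable step from the scope where it was defined:

```ruby
class Counter
  def initialize
    @count = 0
  end

  def increment
    @count += 1
  end
end

counter = Counter.new
m = counter.method(:increment)

# Method -> Proc: the Proc still runs in the scope of counter.
as_proc = m.to_proc
as_proc.call      # => 1
m.call            # => 2
as_proc.lambda?   # => true

# Block -> method: the block becomes the body of Counter#reset.
Counter.class_eval do
  define_method(:reset) { @count = 0 }
end
counter.reset     # => 0

# A lambda is a closure: it captures the local variable step.
step = 10
add_step = lambda {|n| n + step }
add_step.call(counter.increment)   # => 11
```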

Ruby has a second class that represents methods—one you might find perplexing. Let’s have a look at it first, and then we’ll see how it can be used.

Unbound Methods

UnboundMethods are like Methods that have been detached from their original class or module. You can turn a Method into an UnboundMethod by calling Method#unbind. You can also get an UnboundMethod directly by calling Module#instance_method, as in the following example:

blocks/unbound_methods.rb

module MyModule

def my_method

42

end

end

unbound = MyModule.instance_method(:my_method)

unbound.class # => UnboundMethod

You can’t call an UnboundMethod, but you can use it to generate a normal method that you can call. You do that by binding the UnboundMethod to an object with UnboundMethod#bind. UnboundMethods that come from a class can only be bound to objects of the same class (or a subclass), while UnboundMethods that come from a module have no such limitation from Ruby 2.0 onward. You can also bind an UnboundMethod by passing it to Module#define_method, as in the next example:

String.class_eval do

define_method :another_method, unbound

end

"abc".another_method # => 42
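
The other route, UnboundMethod#bind, looks like this. All the names in this sketch (Greeter, Parent, Child, Stranger) are made up for the illustration, and the module case assumes Ruby 2.0 or later:

```ruby
module Greeter
  def greet
    "Hello from #{self.class}"
  end
end

# An UnboundMethod from a module can be bound to any object (Ruby 2.0+).
unbound = Greeter.instance_method(:greet)
bound = unbound.bind("abc")
bound.class   # => Method
bound.call    # => "Hello from String"

class Parent
  def who
    "parent method on #{self.class}"
  end
end

class Child < Parent; end
class Stranger; end

# An UnboundMethod from a class only binds to that class or its subclasses.
parent_method = Parent.instance_method(:who)
parent_method.bind(Child.new).call   # => "parent method on Child"

begin
  parent_method.bind(Stranger.new)
rescue TypeError => e
  bind_error = e
end

bind_error.class   # => TypeError
```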

UnboundMethods are used only in very special cases. Let’s look at one of those.

The Active Support Example

The Active Support gem contains, among other utilities, a set of classes and modules that automatically load a Ruby file when you use a constant defined in that file. This “autoloading” system includes a module named Loadable that redefines the standard Kernel#load method. If a class includes Loadable, then Loadable#load gets lower than Kernel#load on its chain of ancestors—so a call to load will end up in Loadable#load.

In some cases, you might want to remove autoloading from a class that has already included Loadable. In other words, you want to stop using Loadable#load and go back to the plain vanilla Kernel#load. Ruby has no uninclude method, so you cannot remove Loadable from your ancestors once you have included it. Active Support works around this problem with a single line of code:

gems/activesupport-4.1.0/lib/active_support/dependencies.rb

module Loadable

def self.exclude_from(base)

base.class_eval { define_method(:load, Kernel.instance_method(:load)) }

end

# ...

Imagine that you have a MyClass class that includes Loadable. When you call Loadable.exclude_from(MyClass), the code above calls instance_method to get the original Kernel#load as an UnboundMethod. Then it uses that UnboundMethod to define a brand-new load method directly on MyClass. As a result, MyClass#load is actually the same method as Kernel#load, and it overrides the load method in Loadable. (If that sounds confusing, try drawing a picture of MyClass’s ancestors chain, and everything will be clear.)

This trick is an example of the power of UnboundMethods, but it’s also a contrived solution to a very specific problem—a solution that leaves you with a confusing chain of ancestors that contains three load methods, two of which are identical to each other (Kernel#load and MyClass#load), and two of which are never called (Kernel#load and Loadable#load). It’s probably good policy not to try this kind of class hacking at home.
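
If you'd rather experiment than draw, here is a scaled-down sketch of the same trick. It deliberately avoids the real load method: Original, Override, SketchClass, and fetch are hypothetical names that play the roles of Kernel, Loadable, MyClass, and load.

```ruby
module Original
  def fetch
    "original"
  end
end

module Override
  def fetch
    "override"
  end

  # Same shape as the Active Support one-liner shown earlier.
  def self.exclude_from(base)
    base.class_eval { define_method(:fetch, Original.instance_method(:fetch)) }
  end
end

class SketchClass
  include Original
  include Override
end

obj = SketchClass.new
obj.fetch                        # => "override"

Override.exclude_from(SketchClass)
obj.fetch                        # => "original"

# The ancestors are unchanged; the class just gained its own copy.
SketchClass.ancestors.take(3)                # => [SketchClass, Override, Original]
SketchClass.instance_method(:fetch).owner    # => SketchClass
```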

Callable Objects Wrap-Up

Callable objects are snippets of code that you can evaluate, and they carry their own scope along with them. They can be the following:

· Blocks (they aren’t really “objects,” but they are still “callable”): Evaluated in the scope in which they’re defined.

· Procs: Objects of class Proc. Like blocks, they are evaluated in the scope where they’re defined.

· Lambdas: Also objects of class Proc but subtly different from regular procs. They’re closures like blocks and procs, and as such they’re evaluated in the scope where they’re defined.

· Methods: Bound to an object, they are evaluated in that object’s scope. They can also be unbound from their scope and rebound to another object or class.

Different callable objects exhibit subtly different behaviors. In methods and lambdas, return returns from the callable object, while in procs and blocks, return returns from the callable object’s original context. Different callable objects also react differently to calls with the wrong arity. Methods are stricter, lambdas are almost as strict (save for some corner cases), and procs and blocks are more tolerant.

These differences notwithstanding, you can still convert from one callable object to another, such as by using Proc.new, Method#to_proc, or the & operator.
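
As a small illustration of the & operator working in both directions, consider this sketch (run_block and capture are helper names invented here):

```ruby
def run_block
  yield 5
end

def capture(&block)   # & turns the block into a Proc
  block
end

doubler = capture {|x| x * 2 }
doubler.class          # => Proc

# & turns the Proc back into a block.
run_block(&doubler)    # => 10

# & also accepts a Method, converting it with Method#to_proc.
add_three = 3.method(:+)
run_block(&add_three)        # => 8
[1, 2, 3].map(&add_three)    # => [4, 5, 6]
```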

Writing a Domain-Specific Language

Where you and Bill, at long last, write some code.

“Enough talking about blocks,” Bill says. “It’s time to focus on today’s job. Let’s call it the RedFlag project.”

RedFlag is a monitor utility for the people in the sales department. It should send the sales folks a message when an order is late, when total sales are too low…basically, whenever one of many different things happens. Sales wants to monitor dozens of different events, and the list is bound to change every week or so.

Luckily for you and Bill, sales has full-time programmers, so you don’t have to write the events yourselves. You can just write a simple Domain-Specific Language. (You can read about DSLs in Appendix 2, Domain-Specific Languages.) The sales guys can then use this DSL to define events, like this:

event "we're earning wads of money" do

recent_orders = ... # (read from database)

recent_orders > 1000

end

To define an event, you give it a description and a block of code. If the block returns true, then you get an alert via mail. If it returns false, then nothing happens. The system should check all the events every few minutes.

It’s time to write RedFlag 0.1.

Your First DSL

You and Bill put together a working RedFlag DSL in no time:

blocks/redflag_1/redflag.rb

def event(description)

puts "ALERT: #{description}" if yield

end

load 'events.rb'

The entire DSL is just one method and a line that executes a file named events.rb. The code in events.rb is supposed to call back into RedFlag’s event method. To test the DSL, you create a quick events file:

blocks/redflag_1/events.rb

event "an event that always happens" do

true

end

event "an event that never happens" do

false

end

You save both redflag.rb and events.rb in the same folder and run redflag.rb:

<=

ALERT: an event that always happens

“Success!” Bill exclaims. “If we schedule this program to run every few minutes, we have a functional first version of RedFlag. Let’s show it to the boss.”

Sharing Among Events

Your boss is amused by the simplicity of the RedFlag DSL, but she’s not completely convinced. “The people who write the events will want to share data among events,” she observes. “Can I do this with your DSL? For example, can two separate events access the same variable?” she asks the two of you.

“Of course they can,” Bill replies. “We have a Flat Scope.” To prove that, he whips up a new events file:

blocks/redflag_2/events.rb

def monthly_sales

110 # TODO: read the real number from the database

end

target_sales = 100

event "monthly sales are suspiciously high" do

monthly_sales > target_sales

end

event "monthly sales are abysmally low" do

monthly_sales < target_sales

end

The two events in this file share a method and a local variable. You run redflag.rb, and it prints what you expected:

<=

ALERT: monthly sales are suspiciously high

“Okay, this works,” the boss concedes. “But I don’t like the idea of variables and methods like monthly_sales and target_sales cluttering the top-level scope. Let me show you what I’d like the DSL to look like instead,” she says. Without further ado, the boss grabs the keyboard and starts churning out code like nobody’s business.

Quiz: A Better DSL

Where you’re unexpectedly left alone to develop a new version of the RedFlag DSL.

Your boss wants you to add a setup instruction to the RedFlag DSL, as shown in the following code.

blocks/redflag_3/events.rb

setup do

puts "Setting up sky"

@sky_height = 100

end

setup do

puts "Setting up mountains"

@mountains_height = 200

end

event "the sky is falling" do

@sky_height < 300

end

event "it's getting closer" do

@sky_height < @mountains_height

end

event "whoops... too late" do

@sky_height < 0

end

In this new version of the DSL, you’re free to mix events and setup blocks (setups for short). The DSL still checks events, and it also executes all the setups before each event. If you run redflag.rb on the previous test file, you expect this output:

<=

Setting up sky

Setting up mountains

ALERT: the sky is falling

Setting up sky

Setting up mountains

ALERT: it's getting closer

Setting up sky

Setting up mountains

RedFlag executes all the setups before each of the three events. The first two events generate an alert; the third doesn’t.

A setup can set variables by using variable names that begin with an @ sign, such as @sky_height and @mountains_height. Events can then read these variables. Your boss thinks that this feature will encourage programmers to write clean code: all shared variables are initialized together in a setup and then used in events, so it’s easy to keep track of variables.

Still impressed by your boss’ technical prowess, you and Bill get down to business.

Runaway Bill

You and Bill compare the current RedFlag DSL to the new version your boss has suggested. The current RedFlag executes blocks immediately. The new RedFlag should execute the setups and the events in a specific order. You start by rewriting the event method:

def event(description, &block)

@events << {:description => description, :condition => block}

end

@events = []

load 'events.rb'

The new event method converts the event condition from a block to a Proc. Then it wraps the event’s description and the Proc-ified condition in a hash and stores the hash in an array of events. The array is a top-level instance variable (like the ones you read about in Global Variables and Top-Level Instance Variables), so it can be initialized outside the event method. Finally, the last line loads the file that defines the events. Your plan is to write a setup method similar to the event method, and then write the code that executes events and setups in the correct sequence.

As you ponder your next step, Bill slaps his forehead, mutters something about his wife’s birthday party, and runs out the door. Now it’s up to you alone. Can you complete the new RedFlag DSL and get the expected output from the test file?

Quiz Solution

You can find many different solutions to this quiz. Here is one:

blocks/redflag_3/redflag.rb

def setup(&block)

@setups << block

end

def event(description, &block)

@events << {:description => description, :condition => block}

end

@setups = []

@events = []

load 'events.rb'

@events.each do |event|

@setups.each do |setup|

setup.call

end

puts "ALERT: #{event[:description]}" if event[:condition].call

end

Both setup and event convert the block to a proc and store away the proc, in @setups and @events, respectively. These two top-level instance variables are shared by setup, event, and the main code.

The main code initializes @setups and @events, then it loads events.rb. The code in the events file calls back into setup and event, adding elements to @setups and @events.

With all the events and setups loaded, your program iterates through the events. For each event, it calls all the setup blocks, and then it calls the event.

You can almost hear the voice of Bill in your head, sounding a bit like Obi-Wan Kenobi: “Those top-level instance variables, @events and @setups, are like global variables in disguise. Why don’t you get rid of them?”

Removing the “Global” Variables

To get rid of the global variables (and Bill’s voice in your head), you can use a Shared Scope:

blocks/redflag_4/redflag.rb

lambda {

setups = []

events = []

Kernel.send :define_method, :setup do |&block|

setups << block

end

Kernel.send :define_method, :event do |description, &block|

events << {:description => description, :condition => block}

end

Kernel.send :define_method, :each_setup do |&block|

setups.each do |setup|

block.call setup

end

end

Kernel.send :define_method, :each_event do |&block|

events.each do |event|

block.call event

end

end

}.call

load 'events.rb'

each_event do |event|

each_setup do |setup|

setup.call

end

puts "ALERT: #{event[:description]}" if event[:condition].call

end

The Shared Scope is contained in a lambda that is called immediately. The code in the lambda defines the RedFlag methods as Kernel Methods that share two variables: setups and events. Nobody else can see these two variables, because they’re local to the lambda. (Indeed, the only reason why we have a lambda here is that we want to make these variables invisible to anyone except the four Kernel Methods.) And yes, each call to Kernel.send is passing a block as an argument to another block.
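
If the four-method version is a lot to take in at once, the same idea fits in a few lines. This is a stripped-down sketch, not part of RedFlag; sketch_counter and sketch_inc are invented names:

```ruby
lambda {
  shared = 0   # local to the lambda, visible only to the two methods below

  Kernel.send :define_method, :sketch_counter do
    shared
  end

  Kernel.send :define_method, :sketch_inc do |x|
    shared += x
  end
}.call

sketch_counter     # => 0
sketch_inc(4)
sketch_counter     # => 4
defined?(shared)   # => nil (the variable is invisible out here)
```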

Now those ugly global variables are gone, but the RedFlag code is not as pleasantly simple as it used to be. It’s up to you to decide whether this change is an improvement or just an unwelcome obfuscation. While you decide that, there is one last change that is worth considering.

Adding a Clean Room

In the current version of RedFlag, events can change each other’s shared top-level instance variables:

event "define a shared variable" do

@x = 1

end

event "change the variable" do

@x = @x + 1

end

You want events to share variables with setups, but you don’t necessarily want events to share variables with each other. Once again, it’s up to you to decide whether this is a feature or a potential bug. If you decide that events should be as independent from each other as possible (like tests in a test suite), then you might want to execute events in a Clean Room:

blocks/redflag_5/redflag.rb

each_event do |event|

env = Object.new

each_setup do |setup|

env.instance_eval &setup

end

puts "ALERT: #{event[:description]}" if env.instance_eval &(event[:condition])

end

Now an event and its setups are evaluated in the context of an Object that acts as a Clean Room. The instance variables in the setups and events are instance variables of the Clean Room rather than top-level instance variables. Because each event runs in its own Clean Room, events cannot share instance variables.
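
You can verify that isolation with a sketch that leaves the DSL out entirely. The three procs below are stand-ins for one setup and two events; the names are made up for the example:

```ruby
setup        = Proc.new { @height = 100 }
first_event  = Proc.new { @height += 1 }   # modifies the variable
second_event = Proc.new { @height }        # only reads it

results = [first_event, second_event].map do |event|
  clean_room = Object.new            # a fresh Clean Room per event
  clean_room.instance_eval(&setup)
  clean_room.instance_eval(&event)   # instance_eval returns the block's value
end

results   # => [101, 100]
```

The second event still sees 100, because the change made by the first event stayed in the first Clean Room.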

You might think of using a BasicObject instead of an Object for your Clean Room. However, remember that BasicObject is also a Blank Slate, and as such it lacks some common methods, such as puts. So you should only use a BasicObject if you know that the code in the RedFlag events isn’t going to call puts or other Object methods. You grin and add a comment to the code, leaving this difficult decision to Bill.
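
Before you hand the decision over, a short sketch shows what the trade-off looks like in practice. Instance variables work fine in a BasicObject, but a call to puts does not:

```ruby
room = BasicObject.new

# instance_eval is one of the few methods a BasicObject has.
room.instance_eval { @x = 1 }   # => 1

begin
  room.instance_eval { puts "hi" }   # puts comes from Kernel, which is missing here
rescue NoMethodError => e
  missing = e.name
end

missing   # => :puts
```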

Wrap-Up

Here are a few spells and other interesting things that you learned today:

· What Scope Gates are and how Ruby manages scope in general

· How to make bindings visible through scopes with Flat Scopes and Shared Scopes

· How to execute code in an object’s scope (usually with instance_eval or instance_exec), or even in a Clean Room

· How to turn a block into an object (a Proc) and back

· How to turn a method into an object (a Method or an UnboundMethod) and back

· What the differences are between the different types of callable objects: blocks, Procs, lambdas, and plain old methods

· How to write your own little DSL

That was a lot of new stuff in a single day. As you sneak out of the office, however, you can’t shake the nagging feeling that you’ll learn some of Ruby’s best-kept secrets tomorrow.

Footnotes

[7] A program to explore those special cases, written by Paul Cantrell, is at http://innig.net/software/ruby/closures-in-ruby.rb.