
The Well-Grounded Rubyist, Second Edition (2014)

Part 1. Ruby foundations

The goal of this part of the book is to give you a broad but practical foundation layer on which to build, and to which to anchor, the further explorations of Ruby that follow in parts 2 and 3. We’ll start with a chapter on bootstrapping your Ruby literacy; after working through that first chapter, you’ll be able to run Ruby programs comfortably and have a good sense of the layout of a typical Ruby installation. Starting with chapter 2, we’ll get into the details of the Ruby language. Ruby is an object-oriented language, and the sooner you dive into how Ruby handles objects, the better. Accordingly, objects will serve both as a way to bootstrap the discussion of the language (and your knowledge of it) and as a golden thread leading us to further topics and techniques.

Objects are created by classes, and in chapter 3 you’ll learn how classes work. The discussion of classes is followed by a look at modules in chapter 4. Modules allow you to fine-tune classes and objects by splitting out some of the object design into separate, reusable units of code. To understand Ruby programs—both your own and others’—you need to know about Ruby’s notion of a current default object, known by the keyword self. Chapter 5 will take you deep into the concept of self, along with a treatment of Ruby’s handling of variable visibility and scope.

In chapter 6, the last in this part of the book, you’ll learn about control flow in Ruby programs—that is, how to steer the Ruby interpreter through conditional (if) logic, how to loop repeatedly through code, and even how to break away from normal program execution when an error occurs. By the end of chapter 6, you’ll be thinking along with Ruby as you write and develop your code.

The title of this part is “Ruby foundations,” which obviously suggests that what’s here is to be built on later. And that’s true. But it doesn’t mean that the material in part 1 isn’t important in itself. As you’ll see once you read them, these six chapters present you with real Ruby techniques, real code, and information you’ll use every time you write or execute a Ruby program. It’s not the “foundations” because you’ll learn it once and then ignore it, but because there’s so much more about Ruby yet to follow!

Chapter 1. Bootstrapping your Ruby literacy

This chapter covers

· A Ruby syntax survival kit

· A basic Ruby programming how-to: writing, saving, running, and error-checking programs

· A tour of the Ruby installation

· The mechanics of Ruby extensions

· Ruby’s out-of-the-box command-line tools, including the interactive Ruby interpreter (irb)

This book will give you a foundation in Ruby, and this chapter will give your foundation a foundation. The goal of the chapter is to bootstrap you into the study of Ruby with enough knowledge and skill to proceed comfortably into what lies beyond.

We’ll look at basic Ruby syntax and techniques and at how Ruby works: what you do when you write a program, how you get Ruby to run your program, and how you split a program into more than one file. You’ll learn several of the switches that alter how the Ruby interpreter (the program with the name ruby, to which you feed your program files for execution) acts, as well as how to use some important auxiliary tools designed to make your life as a Rubyist easier and more productive.

The chapter is based on a view of the whole Ruby landscape as being divided into three fundamental levels:

· Core language: design principles, syntax, and semantics

· Extensions and libraries that ship with Ruby, and the facilities for adding extensions of your own

· Command-line tools that come with Ruby, with which you run the interpreter and some other important utilities

It’s not always possible to talk about these three levels in isolation—after all, they’re interlocking parts of a single system—but we’ll discuss them separately as much as possible in this chapter. You can, in any case, use the three level descriptions as pegs to hang subtopics on, wherever they’re introduced.

Ruby, ruby, and ... RUBY?!

Ruby is a programming language. We talk about things like “learning Ruby,” and we ask questions like, “Do you know Ruby?” The lowercase version, ruby, is a computer program; specifically, it’s the Ruby interpreter, the program that reads your programs and runs them. You’ll see this name used in sentences like “I ran ruby on my file, but nothing happened,” or “What’s the full path to your ruby executable?” Finally, there’s RUBY—or, more precisely, there isn’t. Ruby isn’t an acronym, and it’s never correct to spell it in all capital letters. People do this, as they do (also wrongly) with Perl, perhaps because they’re used to seeing language names like BASIC and COBOL. Ruby isn’t such a language. It’s Ruby for the language, ruby for the interpreter.

Nor does this first chapter exist solely in the service of later chapters. It has content in its own right: you’ll learn real Ruby techniques and important points about the design of the language. The goal is to bootstrap or jumpstart you, but even that process will involve close examination of some key aspects of the Ruby language.

1.1. Basic Ruby language literacy

The goal of this section is to get you going with Ruby. It takes a breadth-first approach: we’ll walk through the whole cycle of learning some syntax, writing some code, and running some programs.

At this point, you need to have Ruby installed on your computer.[1] The examples in this book use Ruby 2.1.0. You also need a text editor (you can use any editor you like, as long as it’s a plain-text editor and not a word processor) and a directory (a.k.a. a folder) in which to store your Ruby program files. You might name that directory rubycode or rubysamples—any name is fine, as long as it’s separate from other work areas so that you can keep track of your practice program files.

[1] You can find full up-to-date instructions for installing Ruby at http://ruby-lang.org.

The interactive Ruby console program (irb), your new best friend

The irb utility ships with Ruby and is the most widely used Ruby command-line tool other than the interpreter itself. After starting irb, you type Ruby code into it, and it executes the code and prints out the resulting value.

Type irb at the command line and enter sample code as you encounter it in the text.

For example:

>> 100 + 32
=> 132

Having an open irb session means you can test Ruby snippets at any time and in any quantity. Most Ruby developers find irb indispensable, and you’ll see a few examples of its use as we proceed through this chapter.

The irb examples you’ll see in this book will use a command-line option that makes irb output easier to read:

irb --simple-prompt

If you want to see the effect of the --simple-prompt option, try starting irb with and without it. As you’ll see, the simple prompt keeps your screen a lot clearer. The default (nonsimple) prompt displays more information, such as a line-number count for your interactive session; but for the examples we’ll look at, the simple prompt is sufficient.

Because irb is one of the command-line tools that ship with Ruby, it’s not discussed in detail until section 1.4.2. Feel free to jump to that section and have a look; it’s pretty straightforward.

With Ruby installed and your work area created, let’s continue to bootstrap your Ruby literacy so we have a shared ground on which to continue building and exploring. One thing you’ll need is enough exposure to basic Ruby syntax to get you started.

1.1.1. A Ruby syntax survival kit

The following three tables summarize some Ruby features that you’ll find useful in understanding the examples in this chapter and in starting to experiment with Ruby. You don’t have to memorize them, but do look them over and refer back to them later as needed.

Table 1.1 contains some of Ruby’s basic operations. Table 1.2 covers retrieving basic input from the keyboard, sending output to the screen, and basic conditional statements. Table 1.3 briefly details Ruby’s special objects and syntax for comments.

Table 1.1. Basic operations in Ruby

Operation

Example(s)

Comments

Arithmetic

2 + 3 (addition)
2 - 3 (subtraction)
2 * 3 (multiplication)
2 / 3 (division)
10.3 + 20.25
103 - 202.5
32.9 * 10
100.0 / 0.23

All these operations work on integers or floating-point numbers (floats). Mixing integers and floats together, as some of the examples do, produces a floating-point result. Note that you need to write 0.23 rather than .23.

Assignment

x = 1 string = "Hello"

This operation binds a local variable (on the left) to an object (on the right). For now, you can think of an object as a value represented by the variable.

Compare two values

x == y

Note the two equal signs (not just one, as in assignment).

Convert a numeric string to a number

x = "100".to_i
s = "100"
x = s.to_i

To perform arithmetic, you have to make sure you have numbers rather than strings of characters. to_i performs string-to-integer conversion.
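To make the operations in table 1.1 concrete, here is a small sketch you could type into irb or save in a file. The variable names are made up for illustration. One detail the table doesn't spell out is that dividing one integer by another gives an integer result.

```ruby
# Integer-only arithmetic stays in integers; the fractional part is dropped.
int_quotient = 2 / 3          # 0

# Mixing an integer and a float produces a float.
mixed_difference = 103 - 202.5   # -99.5

# A numeric string has to be converted before you can do arithmetic with it.
s = "100"
x = s.to_i                    # 100
total = x + 32                # 132

# Two equal signs compare; one equal sign assigns.
same = (total == 132)         # true

puts int_quotient
puts mixed_difference
puts total
puts same
```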

Table 1.2. Basic input/output methods and flow control in Ruby

Operation

Example(s)

Comments

Print something to the screen

print "Hello"
puts "Hello"
x = "Hello"
puts x
x = "Hello"
print x
x = "Hello"
p x

puts adds a newline to the string it outputs if there isn’t one at the end already; print doesn’t. print prints exactly what it’s told to and leaves the cursor at the end. (Note: On some platforms, an extra line is automatically output at the end of a program.) p outputs an inspect string, which may contain extra information about what it’s printing.

Get a line of keyboard input

gets
string = gets

You can assign the input line directly to a variable (the variable string in the second example).

Conditional execution

if x == y
puts "Yes!"
else
puts "No!"
end

Conditional statements always end with the word end. More on these in chapter 6.
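If you'd like to see exactly which characters print and puts emit, one option is to send the output to an in-memory buffer instead of the screen. The sketch below assumes the StringIO class from Ruby's standard library, which the table itself doesn't mention. The last line uses inspect to imitate what p would display.

```ruby
require "stringio"

# An in-memory stand-in for the screen, so the output can be examined afterward.
buffer = StringIO.new

buffer.print "Hello"           # no newline added
buffer.puts "Hello"            # newline added
buffer.puts "Hello\n"          # already ends with a newline, so none is added
buffer.puts "Hello".inspect    # the inspect string that p would show

p buffer.string
```

The string displayed at the end shows every newline as \n, which makes the difference between the two methods easy to spot.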

Table 1.3. Ruby’s special objects and comments

Operation

Example(s)

Comments

Special value objects

true
false
nil

The objects true and false often serve as return values for conditional expressions. The object nil is a kind of “nonobject” indicating the absence of a value or result. false and nil cause a conditional expression to fail; all other objects (including true, of course, but also including 0 and empty strings) cause it to succeed. More on these in chapter 7.

Default object

self

The keyword self refers to the default object. Self is a role that different objects play, depending on the execution context. Method calls that don’t specify a calling object are called on self. More on this in chapter 5.

Put comments in code files

# A comment
x = 1 # A comment

Comments are ignored by the interpreter.
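The rule about which objects make a conditional succeed is easy to check for yourself. This minimal sketch (the variable names are illustrative) runs a handful of objects through an if test and records the outcome for each.

```ruby
# Only false and nil cause a condition to fail; everything else,
# including 0 and the empty string, causes it to succeed.
candidates = [true, false, nil, 0, ""]

results = candidates.map do |object|
  if object
    "succeeds"
  else
    "fails"
  end
end

candidates.each_with_index do |object, index|
  # inspect makes nil and the empty string visible in the output.
  puts "#{object.inspect} #{results[index]}"
end
```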

A few fundamental aspects of Ruby and Ruby syntax are too involved for summary in a table. You need to be able to recognize a handful of different Ruby identifiers and, above all, you need a sense of what an object is in Ruby and what a method call looks like. We’ll take a look at both of those aspects of the language next.

1.1.2. The variety of Ruby identifiers

Ruby has a small number of identifier types that you’ll want to be able to spot and differentiate from each other at a glance. The identifier family tree looks like this:

· Variables:

o Local

o Instance

o Class

o Global

· Constants

· Keywords

· Method names

It’s a small family, and easily learned. We’ll survey them here. Keep in mind that this section’s purpose is to teach you to recognize the various identifiers. You’ll also learn a lot more throughout the book about when and how to use them. This is just the first lesson in identifier literacy.

Variables

Local variables start with a lowercase letter or an underscore and consist of letters, underscores, and/or digits. x, string, abc, start_value, and firstName are all valid local variable names. Note, however, that the Ruby convention is to use underscores rather than camel case when composing local variable names from multiple words—for example, first_name rather than firstName.

Instance variables, which serve the purpose of storing information for individual objects, always start with a single at sign (@) and consist thereafter of the same character set as local variables—for example, @age and @last_name. Although a local variable can’t start with an uppercase letter, an instance variable can have one in the first position after the at sign (though it may not have a digit in this position). But usually the character after the at sign is a lowercase letter.

Class variables, which store information per class hierarchy (again, don’t worry about the semantics at this stage), follow the same rules as instance variables, except that they start with two at signs—for example, @@running_total.

Global variables are recognizable by their leading dollar sign ($)—for example, $population. The segment after the dollar sign doesn’t follow local-variable naming conventions; there are global variables called $:, $1, and $/, as well as $stdin and $LOAD_PATH. As long as it begins with a dollar sign, it’s a global variable. As for the nonalphanumeric ones, the only such identifiers you’re likely to see are predefined, so you don’t need to worry about which punctuation marks are legal and which aren’t.

Table 1.4 summarizes Ruby’s variable naming rules.

Table 1.4. Valid variable names in Ruby by variable type

Type      Ruby convention   Nonconventional
Local     first_name        firstName, _firstName, __firstName, name1
Instance  @first_name       @First_name, @firstName, @name1
Class     @@first_name      @@First_name, @@firstName, @@name1
Global    $FIRST_NAME       $first_name, $firstName, $name1
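Here is a short sketch that puts all four kinds of variables side by side so you can practice spotting them. The Town class and every name in it are hypothetical, and the class and def syntax is explained in later chapters; for now, focus on the leading characters of each variable.

```ruby
$population = 0                 # global variable: leading dollar sign

class Town
  @@town_count = 0              # class variable: two at signs

  def initialize(name)
    @name = name                # instance variable: one at sign
    @@town_count += 1
    $population += 1000
  end

  def description
    town_label = "Town of #{@name}"   # local variable: lowercase first letter
    town_label
  end

  def self.count
    @@town_count
  end
end

first_town = Town.new("Springfield")
puts first_town.description
puts Town.count
puts $population
```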

Constants

Constants begin with an uppercase letter. A, String, FirstName, and STDIN are all valid constant names. The Ruby convention is to use either camel case (FirstName) or underscore-separated all-uppercase words (FIRST_NAME) in composing constant names from multiple words.

Keywords

Ruby has numerous keywords: predefined, reserved terms associated with specific programming tasks and contexts. Keywords include def (for method definitions), class (for class definitions), if (conditional execution), and __FILE__ (the name of the file currently being executed). There are about 40 of them, and they’re generally short, single-word (as opposed to underscore-composed) identifiers.

Method names

Names of methods in Ruby follow the same rules and conventions as local variables (except that they can end with ?, !, or =, with significance you’ll see later). This is by design: methods don’t call attention to themselves as methods but rather blend into the texture of a program as, simply, expressions that provide a value. In some contexts you can’t tell just by looking at an expression whether you’re seeing a local variable or a method name—and that’s intentional.
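As a preview of those three endings, the following sketch uses two built-in array methods and one hypothetical class, Thermometer. By convention (not by any rule the interpreter enforces), a method ending in ? answers a yes-or-no question, and a method ending in ! often changes its receiver in place. A method ending in = lets a call look like an assignment.

```ruby
numbers = [3, 1, 2]

puts numbers.empty?           # false
sorted_copy = numbers.sort    # returns a new, sorted array
p numbers                     # [3, 1, 2] (the original is untouched)
numbers.sort!                 # sorts the array in place
p numbers                     # [1, 2, 3]

# A hypothetical class with a method whose name ends in =
class Thermometer
  def reading=(value)
    @reading = value
  end

  def reading
    @reading
  end
end

thermometer = Thermometer.new
thermometer.reading = 100     # calls the method named reading=
puts thermometer.reading      # 100
```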

Speaking of methods, now that you’ve got a roadmap to Ruby identifiers, let’s get back to some language semantics—in particular, the all-important role of the object and its methods.

1.1.3. Method calls, messages, and Ruby objects

Ruby sees all data structures and values—from simple scalar (atomic) values like integers and strings, to complex data structures like arrays—as objects. Every object is capable of understanding a certain set of messages. Each message that an object understands corresponds directly to a method—a named, executable routine whose execution the object has the ability to trigger.

Objects are represented either by literal constructors—like quotation marks for strings—or by variables to which they’ve been bound. Message sending is achieved via the special dot operator: the message to the right of the dot is sent to the object to the left of the dot. (There are other, more specialized ways to send messages to objects, but the dot is the most common and fundamental way.) Consider this example from table 1.1:

x = "100".to_i

The dot means that the message to_i is being sent to the string "100". The string "100" is called the receiver of the message. We can also say that the method to_i is being called on the string "100". The result of the method call—the integer 100—serves as the right-hand side of the assignment to the variable x.

Why the double terminology?

Why bother saying both “sending the message to_i” and “calling the method to_i”? Why have two ways of describing the same operation? Because they aren’t quite the same. Most of the time, you send a message to a receiving object, and the object executes the corresponding method. But sometimes there’s no corresponding method. You can put anything to the right of the dot, and there’s no guarantee that the receiver will have a method that matches the message you send.

If that sounds like chaos, it isn’t, because objects can intercept unknown messages and try to make sense of them. The Ruby on Rails web development framework, for example, makes heavy use of the technique of sending unknown messages to objects, intercepting those messages, and making sense of them on the fly based on dynamic conditions like the names of the columns in the tables of the current database.
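To give a feel for what intercepting an unknown message can look like, here is a toy sketch. It is not how Rails is implemented; the Record class and its column names are hypothetical. It relies on method_missing, the hook Ruby calls when an object receives a message for which it has no matching method.

```ruby
# A hypothetical record-like object that answers any message
# whose name matches one of its column names.
class Record
  def initialize(columns)
    @columns = columns
  end

  # Called by Ruby when no ordinary method matches the message.
  def method_missing(message, *arguments)
    key = message.to_s
    if @columns.key?(key)
      @columns[key]
    else
      super   # fall back to the usual "undefined method" error
    end
  end

  # Keeps respond_to? in agreement with method_missing.
  def respond_to_missing?(message, include_private = false)
    @columns.key?(message.to_s) || super
  end
end

record = Record.new({ "title" => "The Well-Grounded Rubyist", "year" => 2014 })
puts record.title
puts record.year
puts record.respond_to?(:title)
```

Messages that match no column still produce the normal error, so typos don't silently succeed.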

Methods can take arguments, which are also objects. (Almost everything in Ruby is an object, although some syntactic structures that help you create and manipulate objects aren’t themselves objects.) Here’s a method call with an argument:

x = "100".to_i(9)

Calling to_i on "100" with an argument of 9 generates a decimal integer equivalent to the base-nine number 100: x is now equal to 81 decimal.
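You can try a few more bases in irb to convince yourself of how the argument works. The sketch below also uses send, one of the more specialized ways of sending a message; the variable names are illustrative only.

```ruby
decimal = "100".to_i          # 100 (base 10 is the default)
base_nine = "100".to_i(9)     # 81  (1 * 81 + 0 * 9 + 0)
binary = "100".to_i(2)        # 4
hex = "ff".to_i(16)           # 255

# The same message, sent with send instead of the dot.
sent = "100".send(:to_i, 9)   # 81

puts decimal
puts base_nine
puts binary
puts hex
puts sent
```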

This example also shows the use of parentheses around method arguments. These parentheses are usually optional, but in more complex cases they may be required to clear up what may otherwise be ambiguities in the syntax. Many programmers use parentheses in most or all method calls, just to be safe.

The whole universe of a Ruby program consists of objects and the messages that are sent to them. As a Ruby programmer, you spend most of your time either specifying the things you want objects to be able to do (by defining methods) or asking the objects to do those things (by sending them messages).

We’ll explore all of this in much greater depth later in the book. Again, this brief sketch is just part of the process of bootstrapping your Ruby literacy. When you see a dot in what would otherwise be an inexplicable position, you should interpret it as a message (on the right) being sent to an object (on the left). Keep in mind, too, that some method calls take the form of bareword-style invocations, like the call to puts in this example:

puts "Hello."

Here, in spite of the lack of a message-sending dot and an explicit receiver for the message, we’re sending the message puts with the argument "Hello." to an object: the default object self. There’s always a self defined when your program is running, although which object is self changes, according to specific rules. You’ll learn much more about self in chapter 5. For now, take note of the fact that a bareword like puts can be a method call.
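A quick experiment at the top level of a program file can make the role of self a little less abstract. This sketch assumes it's run as an ordinary script, where self is an object that describes itself as main. The send and respond_to? calls are used here only for illustration.

```ruby
# At the top level of a program, self is an object that inspects as "main".
top_level_self = self.inspect
puts top_level_self

puts "Hello."              # bareword call: the receiver is self
send(:puts, "Hello.")      # the same message, sent to self explicitly

# The second argument asks respond_to? to include private methods such as puts.
can_puts = respond_to?(:puts, true)
puts can_puts
```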

The most important concept in Ruby is the concept of the object. Closely related, and playing an important supporting role, is the concept of the class.

The origin of objects in classes

Classes define clusters of behavior or functionality, and every object is an instance of exactly one class. Ruby provides a large number of built-in classes, representing important foundational data types (classes like String, Array, and Fixnum). Every time you create a string object, you’ve created an instance of the class String.

You can also write your own classes. You can even modify existing Ruby classes; if you don’t like the behavior of strings or arrays, you can change it. It’s almost always a bad idea to do so, but Ruby allows it. (We’ll look at the pros and cons of making changes to built-in classes in chapter 13.)

Although every Ruby object is an instance of a class, the concept of class is less important than the concept of object. That’s because objects can change, acquiring methods and behaviors that weren’t defined in their class. The class is responsible for launching the object into existence, a process known as instantiation; but the object, thereafter, has a life of its own.

The ability of objects to adopt behaviors that their class didn’t give them is one of the most central defining principles of the design of Ruby as a language. As you can surmise, we’ll come back to it frequently in a variety of contexts. For now, just be aware that although every object has a class, the class of an object isn’t the sole determinant of what the object can do.
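A minimal sketch of that principle: two objects come from the same class, and then one of them is given a method of its own. The names ticket, other_ticket, and price are invented for the example.

```ruby
ticket = Object.new
other_ticket = Object.new

# Define a method on this one object only, not on its class.
def ticket.price
  117
end

puts ticket.price                        # 117
puts ticket.class                        # Object
puts other_ticket.respond_to?(:price)    # false
```

Both objects are instances of Object, but only one of them understands the price message.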

Armed with some Ruby literacy (and some material to refer back to when in doubt), let’s walk through the steps involved in running a program.

1.1.4. Writing and saving a simple program

At this point, you can start creating program files in the Ruby sample code directory you created a little while back. Your first program will be a Celsius-to-Fahrenheit temperature converter.

Note

A real-world temperature converter would, of course, use floating-point numbers. We’re sticking to integers in the input and output to keep the focus on matters of program structure and execution.

We’ll work through this example several times, adding to it and modifying it as we go. Subsequent iterations will

· Tidy the program’s output

· Accept input via the keyboard from the user

· Read a value in from a file

· Write the result of the program to a file

The first version is simple; the focus is on the file-creation and program-running processes, rather than any elaborate program logic.

Creating a first program file

Using a plain-text editor, type the code from the following listing into a text file and save it under the filename c2f.rb in your sample code directory.

Listing 1.1. Simple, limited-purpose Celsius-to-Fahrenheit converter (c2f.rb)

celsius = 100
fahrenheit = (celsius * 9 / 5) + 32
puts "The result is: "
puts fahrenheit
puts "."

Note

Depending on your operating system, you may be able to run Ruby program files standalone—that is, with just the filename, or with a short name (like c2f) and no file extension. Keep in mind, though, that the .rb filename extension is mandatory in some cases, mainly involving programs that occupy more than one file (which you’ll learn about in detail later) and that need a mechanism for the files to find each other. In this book, all Ruby program filenames end in .rb to ensure that the examples work on as many platforms, and with as few administrative digressions, as possible.

You now have a complete (albeit tiny) Ruby program on your disk, and you can run it.

1.1.5. Feeding the program to Ruby

Running a Ruby program involves passing the program’s source file (or files) to the Ruby interpreter, which is called ruby. You’ll do that now...sort of. You’ll feed the program to ruby, but instead of asking Ruby to run the program, you’ll ask it to check the program code for syntax errors.

Checking for syntax errors

If you add 31 instead of 32 in your conversion formula, that’s a programming error. Ruby will still happily run your program and give you the flawed result. But if you accidentally leave out the closing parenthesis in the second line of the program, that’s a syntax error, and Ruby won’t run the program:

$ ruby broken_c2f.rb
broken_c2f.rb:5: syntax error, unexpected end-of-input, expecting ')'

(The error is reported on line 5—the last line of the program—because Ruby waits patiently to see whether you’re ever going to close the parenthesis before concluding that you’re not.)

Conveniently, the Ruby interpreter can check programs for syntax errors without running the programs. It reads through the file and tells you whether the syntax is okay. To run a syntax check on your file, do this:

$ ruby -cw c2f.rb

The -cw command-line flag is shorthand for two flags: -c and -w. The -c flag means check for syntax errors. The -w flag activates a higher level of warning: Ruby will fuss at you if you’ve done things that are legal Ruby but are questionable on grounds other than syntax.

Assuming you’ve typed the file correctly, you should see the message

Syntax OK

printed on your screen.

Running the program

To run the program, pass the file once more to the interpreter, but this time without the combined -c and -w flags:

$ ruby c2f.rb

If all goes well, you’ll see the output of the calculation:

The result is: 
212
.

The result of the calculation is correct, but the output spread over three lines looks bad.

Second converter iteration

The problem can be traced to the difference between the puts command and the print command. puts adds a newline to the end of the string it prints out, if the string doesn’t end with one already. print, on the other hand, prints out the string you ask it to and then stops; it doesn’t automatically jump to the next line.

To fix the problem, change the first two puts commands to print:

print "The result is "
print fahrenheit
puts "."

(Note the blank space after is, which ensures that a space appears between is and the number.) Now the output is

The result is 212.

puts is short for put (that is, print) string. Although put may not intuitively evoke the notion of skipping down to the next line, that’s what puts does: like print, it prints what you tell it to, but then it also automatically goes to the next line. If you ask puts to print a line that already ends with a newline, it doesn’t bother adding one.

If you’re used to print facilities in languages that don’t automatically add a newline, such as Perl’s print function, you may find yourself writing code like this in Ruby when you want to print a value followed by a newline:

print fahrenheit, "\n"

You almost never have to do this, though, because puts adds a newline for you. You’ll pick up the puts habit, along with other Ruby idioms and conventions, as you go along.

Warning

On some platforms (Windows, in particular), an extra newline character is printed out at the end of the run of a program. This means a print that should really be a puts will be hard to detect, because it will act like a puts. Being aware of the difference between the two and using the one you want based on the usual behavior should be sufficient to ensure you get the desired results.

Having looked a little at screen output, let’s widen the I/O field a bit to include keyboard input and file operations.

1.1.6. Keyboard and file I/O

Ruby offers lots of techniques for reading data during the course of program execution, both from the keyboard and from disk files. You’ll find uses for them—if not in the course of writing every application, then almost certainly while writing Ruby code to maintain, convert, housekeep, or otherwise manipulate the environment in which you work. We’ll look at some of these input techniques here; an expanded look at I/O operations can be found in chapter 12.

Keyboard input

A program that tells you over and over again that 100° Celsius equals 212° Fahrenheit has limited value. A more valuable program lets you specify a Celsius temperature and tells you the Fahrenheit equivalent.

Modifying the program to allow for this functionality involves adding a couple of steps and using one method each from tables 1.1 and 1.2: gets (get a line of keyboard input) and to_i (convert to an integer), one of which you’re familiar with already. Because this is a new program, not just a correction, put the version from the following listing in a new file: c2fi.rb (the i stands for interactive).

Listing 1.2. Interactive temperature converter (c2fi.rb)

print "Hello. Please enter a Celsius value: "
celsius = gets
fahrenheit = (celsius.to_i * 9 / 5) + 32

print "The Fahrenheit equivalent is "
print fahrenheit
puts "."

A couple of sample runs demonstrate the new program in action:

$ ruby c2fi.rb
Hello. Please enter a Celsius value: 100
The Fahrenheit equivalent is 212.
$ ruby c2fi.rb
Hello. Please enter a Celsius value: 23
The Fahrenheit equivalent is 73.
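The second run is a good illustration of why a real-world converter would use floating-point numbers. In this sketch (variable names are illustrative), the same formula is evaluated once with integers only and once with a float in the mix.

```ruby
celsius = 23

# Integers only: 23 * 9 is 207, and 207 / 5 is 41 (the .4 is dropped).
integer_result = (celsius * 9 / 5) + 32

# Writing 9.0 makes the whole calculation floating-point.
float_result = (celsius * 9.0 / 5) + 32

puts integer_result    # 73
puts float_result      # 73.4
```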

Shortening the code

You can shorten the code in listing 1.2 considerably by consolidating the operations of input, calculation, and output. A compressed rewrite looks like this:

print "Hello. Please enter a Celsius value: "
print "The Fahrenheit equivalent is ", gets.to_i * 9 / 5 + 32, ".\n"

This version economizes on variables—there aren’t any—but requires anyone reading it to follow a somewhat denser (but shorter!) set of expressions. Any given program usually has several or many spots where you have to decide between longer (but maybe clearer?) and shorter (but perhaps a bit cryptic). And sometimes, shorter can be clearer. It’s all part of developing a Ruby coding style.

We now have a generalized, if not terribly nuanced, solution to the problem of converting from Celsius to Fahrenheit. Let’s widen the circle to include file input.

Reading from a file

Reading a file from a Ruby program isn’t much more difficult, at least in many cases, than reading a line of keyboard input. The next version of our temperature converter will read one number from a file and convert it from Celsius to Fahrenheit.

First, create a new file called temp.dat (temperature data), containing one line with one number on it:

100

Now create a third program file, called c2fin.rb (in for [file] input), as shown in the next listing.

Listing 1.3. Temperature converter using file input (c2fin.rb)

puts "Reading Celsius temperature value from data file..."
num = File.read("temp.dat")
celsius = num.to_i
fahrenheit = (celsius * 9 / 5) + 32
puts "The number is " + num
print "Result: "
puts fahrenheit

This time, the sample run and its output look like this:

$ ruby c2fin.rb
Reading Celsius temperature value from data file...
The number is 100
Result: 212

Naturally, if you change the number in the file, the result will be different.
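It can be helpful to look at exactly what File.read hands back. The sketch below is self-contained: rather than relying on the temp.dat you created by hand, it writes a stand-in file to a temporary directory (using Dir.mktmpdir and File.write, which this chapter doesn't otherwise cover) and then reads it.

```ruby
require "tmpdir"

contents = Dir.mktmpdir do |dir|
  path = File.join(dir, "temp.dat")
  File.write(path, "100\n")    # stands in for the file created in your editor
  File.read(path)              # the block's value becomes contents
end

p contents          # "100\n" (the newline from the file is part of the string)
p contents.to_i     # 100     (to_i ignores the trailing newline)
p contents.chomp    # "100"   (chomp returns the string without it)
```

This is why listing 1.3 prints "The number is 100" on a single line: the string already ends with a newline, so puts doesn't add another.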

What about writing the result of the calculation to a file?

Writing to a file

The simplest file-writing operation is just a little more elaborate than the simplest file-reading operation. As you can see from the following listing, the main extra step when you write to a file is the specification of a file mode—in this case, w (for write). Save the version of the program from this listing to c2fout.rb and run it.

Listing 1.4. Temperature converter with file output (c2fout.rb)

print "Hello. Please enter a Celsius value: "
celsius = gets.to_i
fahrenheit = (celsius * 9 / 5) + 32
puts "Saving result to output file 'temp.out'"
fh = File.new("temp.out", "w")
fh.puts fahrenheit
fh.close

The method call fh.puts fahrenheit has the effect of printing the value of fahrenheit to the file for which fh is a write handle. If you inspect the file temp.out, you should see that it contains the Fahrenheit equivalent of whatever number you typed in.

As an exercise, you might try to combine the previous examples into a Ruby program that reads a number from a file and writes the Fahrenheit conversion to a different file. Meanwhile, with some basic Ruby syntax in place, our next stop will be an examination of the Ruby installation. This, in turn, will equip you for a look at how Ruby manages extensions and libraries.
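If you want to compare notes on that exercise, here is one possible sketch, not the only way to do it. The method name convert_file is hypothetical, and the example works in a temporary directory so that it doesn't touch your own files.

```ruby
require "tmpdir"

# Hypothetical helper: read a Celsius integer from one file and
# write the Fahrenheit equivalent to another.
def convert_file(input_path, output_path)
  celsius = File.read(input_path).to_i
  fahrenheit = (celsius * 9 / 5) + 32
  fh = File.new(output_path, "w")
  fh.puts fahrenheit
  fh.close
  fahrenheit
end

Dir.mktmpdir do |dir|
  input_path = File.join(dir, "temp.dat")
  output_path = File.join(dir, "temp.out")
  File.write(input_path, "100\n")    # stand-in for a hand-made temp.dat

  convert_file(input_path, output_path)
  puts File.read(output_path)        # 212
end
```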

1.2. Anatomy of the Ruby installation

Having Ruby installed on your system means having several disk directories’ worth of Ruby-language libraries and support files. Most of the time, Ruby knows how to find what it needs without being prompted. But knowing your way around the Ruby installation is part of a good Ruby grounding.

Looking at the Ruby source code

In addition to the Ruby installation directory tree, you may also have the Ruby source code tree on your machine; if not, you can download it from the Ruby homepage. The source code tree contains a lot of Ruby files that end up in the eventual installation and a lot of C-language files that get compiled into object files that are then installed. In addition, the source tree contains informational files like the ChangeLog and software licenses.

Ruby can tell you where its installation files are located. To get the information while in an irb session, you need to preload a Ruby library package called rbconfig into your irb session. rbconfig is an interface to a lot of compiled-in configuration information about your Ruby installation, and you can get irb to load it by using irb’s -r command-line flag and the name of the package:

$ irb --simple-prompt -rrbconfig

Now you can request information. For example, you can find out where the Ruby executable files (including ruby and irb) have been installed:

>> RbConfig::CONFIG["bindir"]

RbConfig::CONFIG is a constant referring to the hash (a kind of data structure) where Ruby keeps its configuration knowledge. The string "bindir" is a hash key. Querying the hash with the "bindir" key gives you the corresponding hash value, which is the name of the binary-file installation directory.

The rest of the configuration information is made available the same way: as values inside the configuration data structure that you can access with specific hash keys. To get additional installation information, you need to replace bindir in the irb command with other terms. But each time you use the same basic formula: RbConfig::CONFIG["term"]. Table 1.5 outlines the terms and the directories they refer to.

Table 1.5. Key Ruby directories and their RbConfig terms

| Term | Directory contents |
| --- | --- |
| rubylibdir | Ruby standard library |
| bindir | Ruby command-line tools |
| archdir | Architecture-specific extensions and libraries (compiled, binary files) |
| sitedir | Your own or third-party extensions and libraries (written in Ruby) |
| vendordir | Third-party extensions and libraries (written in Ruby) |
| sitelibdir | Your own Ruby language extensions (written in Ruby) |
| sitearchdir | Your own Ruby language extensions (written in C) |
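The same lookups work from a script as well as from irb. The following sketch loops over the terms in table 1.5 and prints whatever your installation reports for each one; the constant name TERMS is just a label chosen for this example.

```ruby
require "rbconfig"

# The terms listed in table 1.5.
TERMS = %w[rubylibdir bindir archdir sitedir vendordir sitelibdir sitearchdir]

TERMS.each do |term|
  # Each term is a key in the RbConfig::CONFIG hash.
  puts "#{term.ljust(12)} #{RbConfig::CONFIG[term]}"
end
```

The directory names printed will differ from one machine and Ruby version to another.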

Here’s a rundown of the major installation directories and what they contain. You don’t have to memorize them, but you should be aware of how to find them if you need them (or if you’re curious to look through them and check out some examples of Ruby code!).

1.2.1. The Ruby standard library subdirectory (RbConfig::CONFIG[rubylibdir])

In rubylibdir, you’ll find program files written in Ruby. These files provide standard library facilities, which you can require from your own programs if you need the functionality they provide.

Here’s a sampling of the files you’ll find in this directory:

· cgi.rb —Tools to facilitate CGI programming

· fileutils.rb —Utilities for manipulating files easily from Ruby programs

· tempfile.rb —A mechanism for automating the creation of temporary files

· drb.rb —A facility for distributed programming with Ruby

Some of the standard libraries, such as the drb library (the last item on the previous list), span more than one file; you’ll see both a drb.rb file and a whole drb subdirectory containing components of the drb library.

Browsing your rubylibdir directory will give you a good (if perhaps initially overwhelming) sense of the many tasks for which Ruby provides programming facilities. Most programmers use only a subset of these capabilities, but even a subset of such a large collection of programming libraries gives you a lot to work with.
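As a small illustration of requiring standard library files from your own program, the following sketch uses two of the libraries from the list above. The helper name save_note and the file name note.txt are assumptions made for the example.

```ruby
require "fileutils"   # file-manipulation utilities
require "tempfile"    # temporary-file creation
require "tmpdir"      # used only to get a scratch directory for the demo

# Hypothetical helper: stage text in a temporary file, then copy it into dir.
def save_note(text, dir)
  tmp = Tempfile.new("note")
  tmp.write(text)
  tmp.close                        # flush to disk; the file still exists
  dest = File.join(dir, "note.txt")
  FileUtils.cp(tmp.path, dest)
  tmp.unlink                       # remove the temporary file
  dest
end

Dir.mktmpdir do |dir|
  puts File.read(save_note("remember the milk\n", dir))
end
```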

1.2.2. The C extensions directory (RbConfig::CONFIG[archdir])

Usually located one level down from rubylibdir, archdir contains architecture-specific extensions and libraries. The files in this directory typically have names ending in .so, .dll, or .bundle (depending on your hardware and operating system). These files are C extensions: binary, runtime-loadable files generated from Ruby’s C-language extension code, compiled into binary form as part of the Ruby installation process.

Like the Ruby-language program files in rubylibdir, the files in archdir contain standard library components that you can load into your own programs. (Among others, you’ll see the file for the rbconfig extension—the extension you’re using with irb to uncover the directory names.) These files aren’t human-readable, but the Ruby interpreter knows how to load them when asked to do so. From the perspective of the Ruby programmer, all standard libraries are equally useable, whether written in Ruby or written in C and compiled to binary format.

The files installed in archdir vary from one installation to another, depending on which extensions were compiled—which, in turn, depends on a mixture of what the person doing the compiling asked for and which extensions Ruby was able to compile.

1.2.3. The site_ruby (RbConfig::CONFIG[sitedir]) and vendor_ruby (RbConfig::CONFIG[vendordir]) directories

Your Ruby installation includes a subdirectory called site_ruby, where you and/or your system administrator store third-party extensions and libraries. Some of these may be code you write, while others are tools you download from other people’s sites and archives of Ruby libraries.

The site_ruby directory parallels the main Ruby installation directory, in the sense that it has its own subdirectories for Ruby-language and C-language extensions (sitelibdir and sitearchdir, respectively, in RbConfig::CONFIG terms). When you require an extension, the Ruby interpreter checks for it in these subdirectories of site_ruby, as well as in both the main rubylibdir and the main archdir.

Alongside site_ruby you’ll find the directory vendor_ruby. Some third-party extensions install themselves here. The vendor_ruby directory was new as of Ruby 1.9, and standard practice as to which of the two areas gets which packages is still developing.

1.2.4. The gems directory

The RubyGems utility is the standard way to package and distribute Ruby libraries. When you install gems (as the packages are called), the unbundled library files land in the gems directory. This directory isn’t listed in the config data structure, but it’s usually at the same level as site_ruby; if you’ve found site_ruby, look at what else is installed next to it. You’ll learn more about using gems in section 1.4.5.

Let’s look now at the mechanics and semantics of how Ruby uses its own extensions as well as those you may write or install.

1.3. Ruby extensions and programming libraries

The first major point to take on board as you read this section is that it isn’t a Ruby standard library reference. As explained in the introduction, this book doesn’t aim to document the Ruby language; it aims to teach you the language and to confer Ruby citizenship upon you so that you can keep widening your horizons.

The purpose of this section, accordingly, is to show you how extensions work: how you get Ruby to run its extensions, the difference among techniques for doing so, and the extension architecture that lets you write your own extensions and libraries.

The extensions that ship with Ruby are usually referred to collectively as the standard library. The standard library includes extensions for a wide variety of projects and tasks: database management, networking, specialized mathematics, XML processing, and many more. The exact makeup of the standard library usually changes, at least a little, with every new release of Ruby. But most of the more widely used libraries tend to stay, once they’ve proven their worth.

The key to using extensions and libraries is the require method, along with its near relation load. These methods allow you to load extensions at runtime, including extensions you write yourself. We’ll look at them in their own right and then expand our scope to take in their use in loading built-in extensions.

1.3.1. Loading external files and extensions

Storing a program in a single file can be handy, but it starts to be a liability rather than an asset when you’ve got hundreds or thousands—or hundreds of thousands—of lines of code. Somewhere along the line, breaking your program into separate files starts to make lots of sense. Ruby facilitates this process with the require and load methods. We’ll start with load, which is the more simply engineered of the two.

Feature, extension, or library?

Things you load into your program at runtime get called by several different names. Feature is the most abstract and is rarely heard outside of the specialized usage “requiring a feature” (that is, with require). Library is more concrete and more common. It connotes the actual code as well as the basic fact that a set of programming facilities exists and can be loaded. Extension can refer to any loadable add-on library but is often used to mean a library for Ruby that has been written in the C programming language, rather than in Ruby. If you tell people you’ve written a Ruby extension, they’ll probably assume you mean that it’s in C.

To try the examples that follow, you’ll need a program that’s split over two files. The first file, loaddemo.rb, should contain the following Ruby code:

puts "This is the first (master) program file."
load "loadee.rb"
puts "And back again to the first file."

When it encounters the load method call, Ruby reads in the second file. That file, loadee.rb, should look like this:

puts "> This is the second file."

The two files should be in the same directory (presumably your sample code directory). When you run loaddemo.rb from the command line, you’ll see this output:

This is the first (master) program file.
> This is the second file.
And back again to the first file.

The output gives you a trace of which lines from which files are being executed, and in what order.

The call to load in loaddemo.rb provides a filename, loadee.rb, as its argument:

load "loadee.rb"

If the file you’re loading is in your current working directory, Ruby will be able to find it by name. If it isn’t, Ruby will look for it in the load path.

1.3.2. “Load”-ing a file in the default load path

The Ruby interpreter’s load path is a list of directories in which it searches for files you ask it to load. You can see the names of these directories by examining the contents of the special global variable $: (dollar-colon). What you see depends on what platform you’re on. A typical load-path inspection on Mac OS X looks like the following (an example that includes the .rvm directory, where the Ruby Version Manager keeps a selection of Ruby versions):

On your machine, the part to the left of “ruby-2.1.0” may say something different, like “/usr/local/lib/,” but the basic pattern of subdirectories will be the same. When you load a file, Ruby looks for it in each of the listed directories, in order from top to bottom.
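To inspect the load path on your own machine, you can print the variable from a short script. This sketch also uses $LOAD_PATH, a longer name for the same array that isn’t mentioned in the text above.

```ruby
# $: and $LOAD_PATH refer to the same array of directory names.
puts $:

puts "Same object? #{$:.equal?($LOAD_PATH)}"
puts "Number of directories: #{$:.size}"
```

The directories are printed one per line, in the order in which Ruby searches them.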

Note

The current working directory, usually represented by a single dot (.), is actually not included in the load path. The load command acts as if it is, but that’s a specially engineered case.

You can navigate relative directories in your load commands with the conventional double-dot “directory up” symbol:

load "../extras.rb"

Note that if you change the current directory during a program run, relative directory references will change, too.

Note

Keep in mind that load is a method, and it’s executed at the point where Ruby encounters it in your file. Ruby doesn’t search the whole file looking for load directives; it finds them when it finds them. This means you can load files whose names are determined dynamically during the run of the program. You can even wrap a load call in a conditional statement, in which case the call will be executed only if the condition is true.
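Because load is an ordinary method call, both ideas in the note (a filename computed at runtime, and a load wrapped in a condition) can be sketched in a few lines. The helper name load_if_present and the settings file are hypothetical.

```ruby
require "tmpdir"

# Hypothetical helper: build a filename at runtime and load it only if it exists.
def load_if_present(dir, name)
  path = File.join(dir, "#{name}.rb")     # name determined dynamically
  return false unless File.exist?(path)   # the load call is conditional
  load path                               # load returns true on success
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, "settings.rb"), "$settings_loaded = true\n")
  p load_if_present(dir, "settings")   # => true
  p load_if_present(dir, "missing")    # => false
end
```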

You can also force load to find a file, regardless of the contents of the load path, by giving it the fully qualified path to the file:

load "/home/users/dblack/book/code/loadee.rb"

This is, of course, less portable than the use of the load path or relative directories, but it can be useful, particularly if you have an absolute path stored as a string in a variable and want to load the file it represents.

A call to load always loads the file you ask for, whether you’ve loaded it already or not. If a file changes between loadings, then anything in the new version of the file that rewrites or overrides anything in the original version takes priority. This can be useful, especially if you’re in an irb session while you’re modifying a file in an editor at the same time and want to examine the effect of your changes immediately.

The other file-loading method, require, also searches the directories that lie in the default load path. But require has some features that load doesn’t have.

1.3.3. “Require”-ing a feature

One major difference between load and require is that require, if called more than once with the same arguments, doesn’t reload files it’s already loaded. Ruby keeps track of which files you’ve required and doesn’t duplicate the effort.
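The difference can be observed directly. In this sketch a throwaway file (the name counter_feature.rb and the global $load_count are made up for the example) increments a counter every time it’s executed, so you can count how often load and require actually run it.

```ruby
require "tmpdir"

# Count how many times a file is executed by two load calls
# followed by two require calls.
def count_executions
  $load_count = 0
  Dir.mktmpdir do |dir|
    path = File.join(dir, "counter_feature.rb")   # hypothetical file name
    File.write(path, "$load_count += 1\n")

    load path                  # runs the file
    load path                  # runs it again
    after_loads = $load_count

    first  = require(path)     # not yet required: runs the file, returns true
    second = require(path)     # already required: returns false, file not run
    [after_loads, $load_count, first, second]
  end
end

p count_executions   # => [2, 3, true, false]
```

The return value of require (true the first time, false afterward) is a handy way to tell whether a call actually loaded anything.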

require is more abstract than load. Strictly speaking, you don’t require a file; you require a feature. And typically, you do so without even specifying the extension on the filename. To see how this works, change this line in loaddemo.rb,

load "loadee.rb"

to this:

require "./loadee"

When you run loaddemo.rb, you get the same result as before, even though you haven’t supplied the full name of the file you want loaded.

By viewing loadee as a “feature” rather than a file, require allows you to treat extensions written in Ruby the same way you treat extensions written in C—or, to put it another way, to treat files ending in .rb the same way as files ending in .so, .dll, or .bundle.

Specifying the working directory

require doesn’t know about the current working directory (.). You can specify it explicitly

require "./loadee.rb"

or you can append it to the load path using the array append operator,

$: << "."

so you don’t need to specify it in calls to require:

require "loadee.rb"

You can also feed a fully qualified path to require, as you can to load, and it will pull in the file/feature. And you can mix and match; the following syntax works, for example, even though it mixes the static path specification with the more abstract syntax of the feature at the end of the path:

require "/home/users/dblack/book/code/loadee"

Although load is useful, particularly when you want to load a file more than once, require is the day-to-day technique you’ll use to load Ruby extensions and libraries—standard and otherwise. Loading standard library features isn’t any harder than loading loadee. You just require what you want. After you do, and of course depending on what the extension is, you’ll have new classes and methods available to you. Here’s a before-and-after example in an irb session:

The first call to scanf fails with an error. But after the require call, and with no further programmer intervention, string objects like "David Black" respond to the scanf message. (In this example, we’re asking for two consecutive strings to be extracted from the original string, with whitespace as an implied separator.)
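The same before-and-after pattern can be sketched without depending on which standard libraries a particular Ruby version ships. The example below writes a hypothetical one-method library, shout.rb, into a scratch directory; the library, its shout method, and the constant BEFORE_AFTER are all inventions for this illustration, standing in for a real extension.

```ruby
require "tmpdir"

# Returns [did strings respond to shout before the require?, result after it].
def before_and_after
  Dir.mktmpdir do |dir|
    # A hypothetical library that adds one method to String.
    File.write(File.join(dir, "shout.rb"),
               "class String\n  def shout\n    upcase + \"!\"\n  end\nend\n")

    before = "david".respond_to?(:shout)   # no such method yet
    $: << dir                              # make the directory searchable
    require "shout"                        # strings gain the new method
    after = "david".shout
    $:.delete(dir)
    [before, after]
  end
end

BEFORE_AFTER = before_and_after
p BEFORE_AFTER   # => [false, "DAVID!"]
```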

1.3.4. require_relative

There’s a third way to load files: require_relative. This command loads features by searching relative to the directory in which the file from which it’s called resides. Thus in the previous example you could do this

require_relative "loadee"

without manipulating the load path to include the current directory. require_relative is convenient when you want to navigate a local directory hierarchy—for example:

require_relative "lib/music/sonata"
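The following sketch sets up such a hierarchy in a scratch directory to show that the lookup is relative to the requiring file, not to the working directory. The file main.rb and the global $sonata_loaded are assumptions made for the demonstration.

```ruby
require "tmpdir"
require "fileutils"

def require_relative_demo
  Dir.mktmpdir do |dir|
    FileUtils.mkdir_p(File.join(dir, "lib", "music"))
    File.write(File.join(dir, "lib", "music", "sonata.rb"),
               "$sonata_loaded = true\n")
    # main.rb locates sonata.rb relative to its own directory.
    File.write(File.join(dir, "main.rb"),
               "require_relative \"lib/music/sonata\"\n")

    $sonata_loaded = false
    # Run main.rb from an unrelated working directory.
    Dir.chdir(Dir.tmpdir) { load File.join(dir, "main.rb") }
    $sonata_loaded
  end
end

p require_relative_demo   # => true
```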

We’ll conclude this chapter with an examination of the command-line tools that ship with Ruby.

1.4. Out-of-the-box Ruby tools and applications

When you install Ruby, you get a handful of important command-line tools, which are installed in whatever directory is configured as bindir, usually /usr/local/bin, /usr/bin, or the /opt equivalents. (You can require "rbconfig" and examine RbConfig::CONFIG["bindir"] to check.) These tools are

· ruby —The interpreter

· irb —The interactive Ruby interpreter

· rdoc and ri —Ruby documentation tools

· rake —Ruby make, a task-management utility

· gem —A Ruby library and application package-management utility

· erb —A templating system

· testrb —A high-level tool for use with the Ruby test framework

In this section we’ll look at all of these tools except erb and testrb. They’re both useful in certain situations but not an immediate priority as you get your bearings and grounding in Ruby.

You don’t need to memorize all the techniques in this section on the spot. Rather, read through it and get a sense of what’s here. You’ll use some of the material soon and often (especially some of the command-line switches and the ri utility) and some of it increasingly as you get more deeply into Ruby.

1.4.1. Interpreter command-line switches

When you start the Ruby interpreter from the command line, you can provide not only the name of a program file, but also one or more command-line switches, as you’ve already seen in the chapter. The switches you choose instruct the interpreter to behave in particular ways and/or take particular actions.

Ruby has more than 20 command-line switches. Some of them are used rarely, while others are used every day by many Ruby programmers. Table 1.6 summarizes the most commonly used ones.

Table 1.6. Summary of commonly used Ruby command-line switches

| Switch | Description | Example of usage |
| --- | --- | --- |
| -c | Check the syntax of a program file without executing the program | ruby -c c2f.rb |
| -w | Give warning messages during program execution | ruby -w c2f.rb |
| -e | Execute the code provided in quotation marks on the command line | ruby -e 'puts "Code demo!"' |
| -l | Line mode: print a newline after every line of output | ruby -le 'print "+ newline!"' |
| -rname | Require the named feature | ruby -rprofile |
| -v | Show Ruby version information and execute the program in verbose mode | ruby -v |
| --version | Show Ruby version information | ruby --version |
| -h | Show information about all command-line switches for the interpreter | ruby -h |

Let’s look at each of these switches in more detail.

Check syntax (-c)

The -c switch tells Ruby to check the code in one or more files for syntactical accuracy without executing the code. It’s usually used in conjunction with the -w flag.

Turn on warnings (-w)

Running your program with -w causes the interpreter to run in warning mode. This means you see more warnings printed to the screen than you otherwise would, drawing your attention to places in your program that, although not syntax errors, are stylistically or logically suspect. It’s Ruby’s way of saying, “What you’ve done is syntactically correct, but it’s weird. Are you sure you meant to do that?” Even without this switch, Ruby issues certain warnings, but fewer than it does in full warning mode.

Execute literal script (-e)

The -e switch tells the interpreter that the command line includes Ruby code in quotation marks, and that it should execute that actual code rather than execute the code contained in a file. This can be handy for quick scripting jobs where entering your code into a file and running ruby on the file may not be worth the trouble.

For example, let’s say you want to see your name printed out backward. Here’s how you can do this quickly in one command-line command, using the execute switch:

$ ruby -e 'puts "David A. Black".reverse'
kcalB .A divaD

What lies inside the single quotation marks is an entire (although short) Ruby program. If you want to feed a program with more than one line to the -e switch, you can use literal line breaks (press Enter) inside the mini-program:

$ ruby -e 'print "Enter a name: "
puts gets.reverse'
Enter a name: David A. Black

kcalB .A divaD

Or you can separate the lines with semicolons:

$ ruby -e 'print "Enter a name: "; print gets.reverse'

Note

Why is there a blank line between the program code and the output in the two-line reverse example? Because the line you enter on the keyboard ends with a newline character, so when you reverse the input, the new string starts with a newline! Ruby takes you very literally when you ask it to manipulate and print data.

Run in line mode (-l)

The -l switch produces the effect that every string output by the program is placed on a line of its own, even if it normally wouldn’t be. Usually this means that lines that are output using print, rather than puts, and that therefore don’t automatically end with a newline character, now end with a newline.

We made use of the print versus puts distinction to ensure that the temperature-converter programs didn’t insert extra newlines in the middle of their output (see section 1.1.5). You can use the -l switch to reverse the effect; it causes even printed lines to appear on a line of their own. Here’s the difference:

$ ruby c2f-2.rb
The result is 212.
$ ruby -l c2f-2.rb
The result is
212
.

The result with -l is, in this case, exactly what you don’t want. But the example illustrates the effect of the switch.

If a line ends with a newline character already, running it through -l adds another newline. In general, the -l switch isn’t commonly used or seen, largely because of the availability of puts to achieve the “add a newline only if needed” behavior, but it’s good to know -l is there and to be able to recognize it.
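To see exactly which characters -l adds, you can run tiny programs in a child interpreter and inspect the raw output. This sketch uses the open3 standard library and RbConfig.ruby (the path of the running interpreter); the helper name ruby_output is an assumption for the example.

```ruby
require "open3"
require "rbconfig"

# Run the current Ruby with the given arguments and return what it
# wrote to standard output.
def ruby_output(*args)
  out, _status = Open3.capture2(RbConfig.ruby, *args)
  out
end

p ruby_output("-e",  'print "a"; print "b"')   # => "ab"
p ruby_output("-le", 'print "a"; print "b"')   # => "a\nb\n"
p ruby_output("-le", 'print "a\n"')            # => "a\n\n"
p ruby_output("-le", 'puts "a"')               # => "a\n"
```

Using p rather than puts shows the newline characters explicitly, which makes the doubled newline in the third case easy to spot.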

Require named file or extension (-rname)

The -r switch calls require on its argument; ruby -rscanf will require scanf when the interpreter starts up. You can put more than one -r switch on a single command line.

Run in verbose mode (-v, --verbose)

Running with -v does two things: it prints out information about the version of Ruby you’re using, and then it turns on the same warning mechanism as the -w flag. The most common use of -v is to find out the Ruby version number:

$ ruby -v
ruby 2.1.0p0 (2013-12-25 revision 44422) [x86_64-darwin12.0]

In this case, we’re using Ruby 2.1.0 (patchlevel 0), released on December 25, 2013, and compiled for an x86_64-based machine running Mac OS X. Because there’s no program or code to run, Ruby exits as soon as it has printed the version information.
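The pieces of that version line are also available inside a running program as predefined constants. The sketch below prints them; these constants aren’t covered in the text above, and the values you see will reflect your own installation.

```ruby
puts RUBY_DESCRIPTION    # essentially the line that ruby -v prints
puts RUBY_VERSION        # for example, 2.1.0
puts RUBY_PATCHLEVEL     # for example, 0
puts RUBY_RELEASE_DATE   # for example, 2013-12-25
puts RUBY_PLATFORM       # for example, x86_64-darwin12.0
```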

Print Ruby version (--version)

This flag causes Ruby to print a version information string and then exit. It doesn’t execute any code, even if you provide code or a filename. You’ll recall that -v prints version information and then runs your code (if any) in verbose mode. You might say that -v is slyly standing for both version and verbose, whereas --version is just version.

Print some help information (-h, --help)

These switches give you a table listing all the command-line switches available to you, and summarizing what they do.

In addition to using single switches, you can also combine two or more in a single invocation of Ruby.

Combining switches (-cw)

You’ve already seen the -cw combination, which checks the syntax of the file without executing it, while also giving you warnings:

$ ruby -cw filename

Another combination of switches you’ll often see is -v and -e, which shows you the version of Ruby you’re running and then runs the code provided in quotation marks. You’ll see this combination a lot in discussions of Ruby, on mailing lists, and elsewhere; people use it to demonstrate how the same code might work differently in different versions of Ruby. For example, if you want to show clearly that a string method called start_with? wasn’t present in Ruby 1.8.6 but is present in Ruby 2.1.0, you can run a sample program using first one version of Ruby and then the other:

(Of course, you must have both versions of Ruby installed on your system.) The undefined method 'start_with?' message on the first run (the one using version 1.8.6) means that you’ve tried to perform a nonexistent named operation. But when you run the same Ruby snippet using Ruby 2.1.0, it works: Ruby prints true. This is a convenient way to share information and formulate questions about changes in Ruby’s behavior from one release to another.
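A script can report the same two facts, the version and the result, in a single line of output. In this sketch the method name start_with_report is made up, and respond_to? is used as one way of probing for the method without triggering the error.

```ruby
def start_with_report
  available = "abc".respond_to?(:start_with?)
  # Only call the method if this Ruby actually has it.
  result = available ? "abc".start_with?("a") : nil
  "Ruby #{RUBY_VERSION}: available=#{available}, result=#{result.inspect}"
end

puts start_with_report
```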

Specifying switches

You can feed Ruby the switches separately, like this

$ ruby -c -w

or

$ ruby -v -e "puts 'abc'.start_with?('a')"

But it’s common to type them together, as in the examples in the main text.

At this point, we’ll go back and look more closely at the interactive Ruby interpreter, irb. You may have looked at this section already, when it was mentioned near the beginning of the chapter. If not, you can take this opportunity to learn more about this exceptionally useful Ruby tool.

1.4.2. A closer look at interactive Ruby interpretation with irb

As you’ve seen, irb is an interactive Ruby interpreter, which means that instead of processing a file, it processes what you type in during a session. irb is a great tool for testing Ruby code and a great tool for learning Ruby.

To start an irb session, you use the command irb. irb prints out its prompt:

$ irb
2.1.0 :001 >

As you’ve seen, you can also use the --simple-prompt option to keep irb’s output shorter:

$ irb --simple-prompt
>>

Once irb starts, you can enter Ruby commands. You can even run a one-shot version of the Celsius-to-Fahrenheit conversion program. As you’ll see in this example, irb behaves like a pocket calculator: it evaluates whatever you type in and prints the result. You don’t have to use a print or puts command:

>> 100 * 9 / 5 + 32
=> 212

To find out how many minutes there are in a year (if you don’t have a CD of the relevant hit song from the musical Rent handy), type in the appropriate multiplication expression:

>> 365 * 24 * 60
=> 525600

irb will also, of course, process any Ruby instructions you enter. For example, if you want to assign the day, hour, and minute counts to variables, and then multiply those variables, you can do that in irb:

>> days = 365
=> 365
>> hours = 24
=> 24
>> minutes = 60
=> 60
>> days * hours * minutes
=> 525600

The last calculation is what you’d expect. But look at the first three lines of entry. When you type days = 365, irb responds by printing 365. Why?

The expression days = 365 is an assignment expression: you’re assigning the value 365 to a variable called days. The main business of an assignment expression is to assign a value to a variable so that you can use the variable later. But the assignment expression (the entire line days = 365) has a value. The value of an assignment expression is its right-hand side. When irb sees any expression, it prints out the value of that expression. So, when irb sees days = 365, it prints out 365. This may seem like overzealous printing, but it comes with the territory; it’s the same behavior that lets you type 2 + 2 into irb and see the result without having to use an explicit print statement.

Similarly, even a call to the puts method has a return value—namely, nil. If you type a puts statement in irb, irb will obediently execute it, and will also print out the return value of puts:

$ irb --simple-prompt
>> puts "Hello"
Hello
=> nil
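Both observations (an assignment has a value, and puts returns nil) can be checked in an ordinary script by capturing the values and inspecting them with p. The method name expression_values is just a label for this sketch.

```ruby
def expression_values
  value_of_assignment = (days = 365)   # the assignment itself evaluates to 365
  value_of_puts = puts("Hello")        # prints Hello; the call returns nil
  [days, value_of_assignment, value_of_puts]
end

p expression_values   # prints Hello, then [365, 365, nil]
```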

There’s a way to get irb not to be quite so talkative: the --noecho flag. Here’s how it works:

$ irb --simple-prompt --noecho
>> 2 + 2
>> puts "Hi"
Hi

Thanks to --noecho, the addition expression doesn’t report back its result. The puts command does get executed (so you see "Hi"), but the return value of puts (nil) is suppressed.

Interrupting irb

It’s possible to get stuck in a loop in irb, or for the session to feel like it’s not responding (which often means you’ve typed an opening quotation mark but not a closing one, or something along those lines). How you get control back is somewhat system-dependent. On most systems, Ctrl-C will do the trick. On others, you may need to use Ctrl-Z. It’s best to apply whatever general program-interrupting information you have about your system directly to irb. Of course, if things get really frozen, you can go to your process or task-management tools and kill the irb process.

To exit from irb normally, you can type exit. On many systems, Ctrl-D works too.

Occasionally, irb may blow up on you (that is, hit a fatal error and terminate itself). Most of the time, though, it catches its own errors and lets you continue.

Once you get the hang of irb’s approach to printing out the value of everything, and how to shut it up if you want to, you’ll find it an immensely useful tool (and toy).

Ruby’s source code is marked up in such a way as to provide for automatic generation of documentation; and the tools needed to interpret and display that documentation are ri and RDoc, which we’ll look at now.

1.4.3. ri and RDoc

ri (Ruby Index) and RDoc (Ruby Documentation), originally written by Dave Thomas, are a closely related pair of tools for providing documentation about Ruby programs. ri is a command-line tool; the RDoc system includes the command-line tool rdoc. ri and rdoc are standalone programs; you run them from the command line. (You can also use the facilities they provide from within your Ruby programs, although we’re not going to look at that aspect of them here.)

RDoc is a documentation system. If you put comments in your program files (Ruby or C) in the prescribed RDoc format, rdoc scans your files, extracts the comments, organizes them intelligently (indexed according to what they comment on), and creates nicely formatted documentation from them. You can see RDoc markup in many of the source files, in both Ruby and C, in the Ruby source tree, and in many of the Ruby files in the Ruby installation.
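As a rough illustration of what such comments look like, here is a small method with an RDoc-style comment block above it. The method c2f is hypothetical; the markup shown (plus signs around a word for code-style text, and an indented line for a verbatim example) follows common RDoc conventions.

```ruby
# Converts a Celsius temperature to Fahrenheit.
#
# +celsius+ is expected to be a number. Integer input is converted
# using integer arithmetic.
#
#   c2f(100)   # => 212
#
def c2f(celsius)
  (celsius * 9 / 5) + 32
end

puts c2f(100)   # prints 212
```

Running rdoc in the directory containing a file like this would pick up the comment and attach it to the method in the generated documentation.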

ri dovetails with RDoc: it gives you a way to view the information that RDoc has extracted and organized. Specifically (although not exclusively, if you customize it), ri is configured to display the RDoc information from the Ruby source files. Thus on any system that has Ruby fully installed, you can get detailed information about Ruby with a simple command-line invocation of ri.

For example, here’s how you request information about the upcase method of string objects:

$ ri String#upcase

And here’s what you get back:

= String#upcase

(from ruby core)
------------------------------------------------------------------------------
str.upcase -> new_str


------------------------------------------------------------------------------

Returns a copy of str with all lowercase letters replaced with their
uppercase counterparts. The operation is locale insensitive---only characters
``a'' to ``z'' are affected. Note: case replacement is effective only in
ASCII region.

"hEllO".upcase #=> "HELLO"

The hash mark (#) between String and upcase in the ri command indicates that you’re looking for an instance method, as distinct from a class method. In the case of a class method, you’d use the separator :: instead of #. We’ll get to the class method/instance method distinction in chapter 3. The main point for the moment is that you have lots of documentation at your disposal from the command line.

Tip

By default, ri runs its output through a pager (such as more on Unix). It may pause at the end of output, waiting for you to hit the spacebar or some other key to show the next screen of information or to exit entirely if all the information has been shown. Exactly what you have to press in this case varies from one operating system, and one pager, to another. Spacebar, Enter, Escape, Ctrl-C, Ctrl-D, and Ctrl-Z are all good bets. If you want ri to write the output without filtering it through a pager, you can use the -T command-line switch (ri -T topic).

Next among the Ruby command-line tools is rake.

1.4.4. The rake task-management utility

As its name suggests (it comes from “Ruby make”), rake is a make-inspired task-management utility. It was written by the late Jim Weirich. Like make, rake reads and executes tasks defined in a file—a Rakefile. Unlike make, however, rake uses Ruby syntax to define its tasks.

Listing 1.5 shows a Rakefile. If you save the listing as a file called Rakefile, you can then issue this command at the command line:

$ rake admin:clean_tmp

rake executes the clean_tmp task defined inside the admin namespace.

Listing 1.5. Rakefile defining clean_tmp tasks inside the admin namespace

The rake task defined here uses several Ruby techniques that you haven’t seen yet, but the basic algorithm is pretty simple:

1. Loop through each directory entry in the /tmp directory.

2. Skip the current loop iteration unless this entry is a file. Note that hidden files aren’t deleted either, because the directory listing operation doesn’t include them.

3. Prompt for the deletion of the file.

4. If the user types y (or anything beginning with y), delete the file.

5. If the user types q, break out of the loop; the task stops.
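The five steps above can be sketched as a Rakefile along the following lines. This is a reconstruction from the description, so details such as the prompt wording and the variable name answer are assumptions. The require line is also an addition: the rake command doesn’t need it, but it lets plain ruby load the file too.

```ruby
# Rakefile (reconstructed sketch; prompt text and variable names are assumptions)
require "rake"   # not needed under the rake command itself

namespace :admin do
  desc "Interactively delete all files in /tmp"
  task :clean_tmp do
    Dir["/tmp/*"].each do |f|        # 1. loop through the directory entries
      next unless File.file?(f)      # 2. skip anything that isn't a regular file
      print "Delete #{f}? "          # 3. prompt for deletion
      answer = $stdin.gets.to_s      #    to_s guards against nil at end of input
      case answer
      when /^y/                      # 4. y (or anything starting with y): delete
        File.unlink(f)
      when /^q/                      # 5. q: break out of the loop
        break
      end
    end
  end
end
```

The sketch reads from $stdin explicitly because a bare gets would try to read from a file named after the command-line argument (here, the task name) rather than from the keyboard.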

The main programming logic comes from looping through the list of directory entries (see the sidebar “Using each to loop through a collection”) and from the case statement, a conditional execution structure. (You’ll see both of these techniques in detail later in chapter 6.)

Using each to loop through a collection

The expression Dir["/tmp/*"].each do |f| is a call to the each method of the array of all the directory entry names. The entire block of code starting with do and ending with end (the end that lines up with Dir in the indentation) gets executed once for each item in the array. Each time through, the current item is bound to the parameter f; that’s the significance of the |f| part. You’ll see each several times in the coming chapters, and we’ll examine it in detail when we look at iterators (methods that automatically traverse collections) in chapter 9.
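Here is a small sketch of the same each pattern, separated from the file-deleting logic. The method name describe_entries and the sample filenames are hypothetical, chosen only for illustration. An ordinary array stands in for the one that Dir["/tmp/*"] would return.

```ruby
# Collect one line of text per entry, using each to walk the array.
def describe_entries(entries)
  lines = []
  entries.each do |f|
    # f is bound to the current item on each pass through the block.
    lines << "Looking at #{f}"
  end
  lines
end

puts describe_entries(["/tmp/a.txt", "/tmp/b.log"])
```

The block between do and end runs twice here, once with f set to each filename in turn.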

The desc command above the task definition provides a description of the task. This comes in handy not only when you’re perusing the file, but also if you want to see all the tasks that rake can execute at a given time. If you’re in the directory containing the Rakefile in listing 1.5 and you give the command

$ rake --tasks

you’ll see a listing of all defined tasks:

$ rake --tasks
(in /Users/ruby/hacking)
rake admin:clean_tmp # Interactively delete all files in /tmp

You can use any names you want for your rake namespaces and tasks. You don’t even need a namespace; you can define a task at the top-level namespace,

task :clean_tmp do
  # etc.
end

and then invoke it using the simple name:

$ rake clean_tmp

But namespacing your tasks is a good idea, particularly if and when the number of tasks you define grows significantly. You can namespace to any depth; this structure, for example, is legitimate:

namespace :admin do
  namespace :clean do
    task :tmp do
      # etc.
    end
  end
end

The task defined here is invoked like this:

$ rake admin:clean:tmp

As the directory-cleaning example shows, rake tasks don’t have to be confined to actions related to Ruby programming. With rake, you get the whole Ruby language at your disposal, for the purpose of writing whatever tasks you need.

The next tool on the tour is the gem command, which makes installation of third-party Ruby packages very easy.

1.4.5. Installing packages with the gem command

The RubyGems library and utility collection includes facilities for packaging and installing Ruby libraries and applications. We’re not going to cover gem creation here, but we’ll look at gem installation and usage.

Installing a Ruby gem can be, and usually is, as easy as issuing a simple install command:

$ gem install prawn

Such a command gives you output something like the following (depending on which gems you already have installed and which dependencies have to be met by installing new gems):

Fetching: Ascii85-1.0.2.gem (100%)
Fetching: ruby-rc4-0.1.5.gem (100%)
Fetching: hashery-2.1.0.gem (100%)
Fetching: ttfunk-1.0.3.gem (100%)
Fetching: afm-0.2.0.gem (100%)
Fetching: pdf-reader-1.3.3.gem (100%)
Fetching: prawn-0.12.0.gem (100%)
Successfully installed Ascii85-1.0.2
Successfully installed ruby-rc4-0.1.5
Successfully installed hashery-2.1.0
Successfully installed ttfunk-1.0.3
Successfully installed afm-0.2.0
Successfully installed pdf-reader-1.3.3
Successfully installed prawn-0.12.0
7 gems installed

These status reports are followed by several lines indicating that ri and RDoc documentation for the various gems is being installed. (The installation of the documentation involves processing the gem source files through RDoc, so be patient; this is often the longest phase of gem installation.)

During the gem installation process, gem downloads gem files as needed from rubygems.org (www.rubygems.org). Those files, which are in .gem format, are saved in the cache subdirectory of your gems directory. You can also install a gem from a gem file residing locally on your hard disk or other storage medium. Give the name of the file to the installer:

$ gem install /home/me/mygems/ruport-1.4.0.gem

If you name a gem without the entire filename (for example, ruport), gem looks for it in the current directory and in the local cache maintained by the RubyGems system. Local installations still search remotely for dependencies, unless you provide the -l (local) command-line flag to the gem command; that flag restricts all operations to the local domain. If you want only remote gems installed, including dependencies, then you can use the -r (remote) flag. In most cases, the simple gem install gemname command will give you what you need. (To uninstall a gem, use the gem uninstall gemname command.)

Once you’ve got a gem installed, you can use it via the require method.

Loading and using gems

While you won’t see gems in your initial load path ($:), you can still “require” them and they’ll load. Here’s how you’d require "hoe" (a utility that helps you package your own gems), assuming you’ve installed the Hoe gem:

>> require "hoe"
=> true

At this point, the relevant hoe directory will appear in the load path, as you can see if you print out the value of $: and grep (select by pattern match) for the pattern "hoe":

>> puts $:.grep(/hoe/)
/Users/dblack/.rvm/gems/ruby-2.1.0/gems/hoe-3.8.1/lib

If you have more than one gem installed for a particular library and want to force the use of a gem other than the most recent, you can do so using the gem method. (Note that this method isn’t the same as the command-line tool called gem.) Here, for example, is how you’d force the use of a not-quite-current version of Hoe:
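A minimal sketch of that technique follows. The version number 3.8.0 is an assumption (any older version you actually have installed will do), and the helper name activate_version is hypothetical. The gem method has to be called before the library is required, and it raises Gem::LoadError if the requested version isn’t installed or if a different version has already been activated.

```ruby
# Try to activate one specific version of a gem.
# Returns true on success, false if that version can't be activated.
def activate_version(name, version)
  gem name, version
  true
rescue Gem::LoadError
  false
end

# "3.8.0" is an assumed older version of Hoe, used only for illustration.
if activate_version("hoe", "3.8.0")
  require "hoe"
  puts $:.grep(/hoe/)   # the load path entry should now name the 3.8.0 directory
else
  puts "hoe 3.8.0 is not available here"
end
```

In an irb session you would more likely type gem "hoe", "3.8.0" directly, in a fresh session where Hoe hasn’t been required yet; the helper here exists only so that a missing version produces a message instead of an error.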

Most of the time, of course, you’ll want to use the most recent gems. But the gem system gives you tools for fine-tuning your gem usage, should you need to do so.

With the subject of RubyGems on the map, we’re now finished with our current business—the bin/ directory—and we’ll move next to a close study of the core language.

1.5. Summary

In this chapter, we’ve looked at a number of important foundational Ruby topics, including

· The difference between Ruby (the language) and ruby (the Ruby interpreter)

· The typography of Ruby variables (all of which you’ll meet again and study in more depth)

· Basic Ruby operators and built-in constructs

· Writing, storing, and running a Ruby program file

· Keyboard input and screen output

· Manipulating Ruby libraries with require and load

· The anatomy of the Ruby installation

· The command-line tools shipped with Ruby

You now have a good blueprint of how Ruby works and what tools the Ruby programming environment provides, and you’ve seen and practiced some important Ruby techniques. You’re now prepared to start exploring Ruby systematically.