Play Well with Others - Build Awesome Command-Line Applications in Ruby 2: Control Your Computer, Simplify Your Life (2013)

Chapter 4. Play Well with Others

In the previous two chapters, we learned how to write command-line applications that are easy to use and self-documenting. Such apps get their input via sophisticated and easy-to-create command-line interfaces and provide help text and documentation that let users get up to speed quickly. But the user at a terminal is only part of the story. Our app will be used by other apps, possibly even integrated into a sophisticated system. db_backup.rb is an example. We will probably want to run our database app nightly and we won’t want to log into our server at midnight to do it, so we’ll arrange for another app (such as cron) to run it for us. The fact is, we don’t know who will run our app or how it will be run. What can we do to make sure it plays well with others?

On the command line, an app that “plays well with others” is one that can be used easily by other apps, directly or indirectly, and that works alongside other apps. db_backup.rb is a great example: it uses mysqldump and gzip to back up and compress a database. As we’ll see, because both of these commands were designed to play well with others, it will be easy to make db_backup.rb a robust, easy-to-use app.

Much like making your app helpful and easy to use, making it play well with others is quite straightforward once you understand how command-line apps interact. Command-line apps communicate with each other over a small number of simple interfaces, three of which we’ll go over in this chapter. The first is exit codes, a simple messaging system between an app and an external command it invokes; it allows us to send valuable information back to the app that called our app. The second is the output streams of an app; two standard output streams are available to all apps, and by following some simple conventions, apps can play well with each other in a straightforward manner. We’ll also talk about what sorts of output our app should generate to work well in any situation. Along the way, we’ll see how Ruby helps us take advantage of these features, all with built-in methods (like exit) and standard libraries (like Open3). Finally, we’ll discuss a simple signaling mechanism that can be used to communicate with long-running or “daemon-style” apps.

First let’s learn about exit codes, which are one of the simplest ways that programs interact with the system.

4.1 Using Exit Codes to Report Success or Failure

Every time a process finishes, it has the opportunity to return a single number to the process that called it. This is referred to as its exit code or exit status. A status of zero indicates that the program succeeded in what it was asked to do. Any other value indicates there was some sort of problem and that the command failed in some way. The shell variable $? holds the exit code of the command we ran most recently. We can use this to verify that mysqldump, the command our database backup app uses to perform the actual backup, exits zero on success and nonzero otherwise, like so:

$ mysqldump non_existent_database
No such database 'non_existent_database'
$ echo $?
1
$ mysqldump existing_database > existing_database.sql
$ echo $?
0

The first time it was called, mysqldump failed to perform a backup and exited with a 1. In the second invocation, it successfully backed up a database, so mysqldump exited with 0. We’d like to take advantage of this. As you’ll recall, db_backup.rb uses Ruby’s built-in system method to run mysqldump as follows:

play_well/db_backup/bin/db_backup.rb

auth = ""
auth += "-u#{options[:user]} " if options[:user]
auth += "-p#{options[:password]} " if options[:password]
database_name = ARGV[0]
output_file = "#{database_name}.sql"
command = "mysqldump #{auth}#{database_name} > #{output_file}"
system(command)

This code is pretty straightforward; we build up a few helper variables using our parsed command-line options, assemble them into a command string, and use system to execute the command. What happens if we use a database name that doesn’t exist, as we did in our earlier example?

As we saw earlier, if we ask it to back up a nonexistent database, then mysqldump will fail and exit nonzero. Since the primary purpose of db_backup.rb is to back up a database, a failure to do so should result in a message to the user and a nonzero exit code. To make that happen, we need to check the exit code of mysqldump.

Accessing Exit Codes of Other Commands

Like bash, system sets the value of $? after a command is executed. Unlike bash, this value is not the exit code itself but an instance of Process::Status, a class that contains, among other things, the exit code. $? is intuitive if you’ve done a lot of bash programming, but it’s otherwise a pretty bad variable name. To make our code a bit more readable, we can use a built-in library called English that will allow us to access this variable via the more memorable $CHILD_STATUS. We require the English built-in library first and can then use $CHILD_STATUS to examine the results of our system call.

play_well/db_backup/bin/db_backup.rb

require 'English'
puts "Running '#{command}'"
system(command)
unless $CHILD_STATUS.exitstatus == 0
  puts "There was a problem running '#{command}'"
end

By using the exit code, we can now get better information about the commands we run, which in turn allows us to give better information to the user. The error message alone isn’t enough, however; our app needs to follow the exit code convention just like mysqldump does and exit nonzero when it can’t do what the user asked. Our current implementation always exits with zero, since that’s the default behavior of any Ruby program that finishes normally without explicitly setting its exit code. This means that no other app has any visibility into the success or failure of our backup.
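To get a feel for what Process::Status gives us, here is a minimal sketch that runs a tiny child Ruby process in place of mysqldump. The helper exit_code_of is a name made up for illustration, and RbConfig.ruby is used only to locate the current Ruby interpreter; neither is part of db_backup.rb.

```ruby
require 'English'
require 'rbconfig'

# Hypothetical helper: run a command and return its numeric exit code.
def exit_code_of(*command)
  system(*command)
  $CHILD_STATUS.exitstatus
end

ruby = RbConfig.ruby # path to the Ruby interpreter running this script

# The child calls exit explicitly, so we see that value.
puts exit_code_of(ruby, "-e", "exit 3") # prints 3

# The child never calls exit, so it finishes with the default of zero.
puts exit_code_of(ruby, "-e", "nil")    # prints 0
```

The second call shows the default described above: a child that finishes normally without calling exit reports zero to its caller.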

Sending Exit Codes to Calling Processes

Setting the exit code is very simple. Ruby provides a built-in method, exit, that takes an integer representing the exit status of our app. In our case, if the mysqldump command failed, we need to fail, so we simply add a call to exit right after we print an error message.

play_well/db_backup/bin/db_backup.rb

puts "Running '#{command}'"
system(command)
unless $CHILD_STATUS.exitstatus == 0
  puts "There was a problem running '#{command}'"
  exit 1
end

Are There Any Standard Exit Codes?

As we mentioned, any nonzero exit code is considered an error. We also saw that we can use different error codes to mean different failures in our app. You may be wondering whether there are any standard error codes, that is, whether users of our app expect certain failures to always be encoded as certain values. There is no common standard across all command-line apps; however, some operating systems do recommend standard codes.

FreeBSD[28] has a list of recommendations in the man page for sysexits, available at http://www.freebsd.org/cgi/man.cgi?query=sysexits&sektion=3 (or by running man 3 sysexits on any FreeBSD system). For example, it recommends using the value 64 for a problem parsing the command line and using 77 for a lack of permission to access a resource. The Ruby gem sysexits[29] provides an abstraction layer for your apps that maps logical names to these values.

The GNU Project[30] provides some less-specific recommendations, available at http://www.gnu.org/software/libc/manual/html_node/Exit-Status.html. Among their recommendations are to reserve the values 128 and above for special purposes and not to use the number of failures as the exit code.

Whether you choose to follow these conventions is up to you; however, it can’t hurt to follow them if you have no reason not to, especially if your app is specific to FreeBSD or relies heavily on GNU apps to work. In the end, what’s most important is that you clearly document the exit codes, regardless of which you use.
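If we did want to follow the sysexits recommendations without pulling in a gem, a minimal sketch might look like the following. The module name, constant names, and the exit_code_for helper are our own inventions for illustration; only the values 64 and 77 come from the recommendations mentioned above.

```ruby
# Hypothetical module: the names are ours, the values follow sysexits.
module ExitCodes
  SUCCESS       = 0
  GENERIC_ERROR = 1
  USAGE_ERROR   = 64 # problem parsing the command line
  NO_PERMISSION = 77 # lack of permission to access a resource
end

# Hypothetical helper mapping a logical error name to an exit code.
def exit_code_for(error)
  case error
  when :bad_command_line  then ExitCodes::USAGE_ERROR
  when :permission_denied then ExitCodes::NO_PERMISSION
  else ExitCodes::GENERIC_ERROR
  end
end

puts exit_code_for(:bad_command_line)  # prints 64
puts exit_code_for(:permission_denied) # prints 77
```

Naming the codes in one place also gives us a natural spot to document them.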

Now if mysqldump experiences a problem, that problem bubbles up to whoever called db_backup.rb. Is there anywhere else in the code that would be considered an error? In Chapter 3, Be Helpful, we added some code to check for a missing database name on the command line. When that happened, we displayed the help text as well as an error message. Although our app is being helpful by displaying both, it ultimately didn’t do what it was asked and thus should exit nonzero. Let’s add a call to exit to correct that omission:

play_well/db_backup/bin/db_backup.rb

option_parser.parse!
if ARGV.empty?
  puts "error: you must supply a database name"
  puts
  puts option_parser.help
  exit 2
end

You’ll notice that we used an exit code of 2 here, whereas we used 1 when mysqldump failed. Both of these values are nonzero and thus signal that our app failed, so what advantage do we get by using different values?

By associating a unique exit code with each known error condition, we give developers who use our app more information about failures, in turn giving them more flexibility in how they integrate our app into their system. For example, if another developer wanted to treat a failure to back up as a warning but a missing database name as an error, the following code could be used to implement this using db_backup.rb’s exit codes:

system("db_backup.rb #{database_name}")
if $?.exitstatus == 1
  puts "warn: couldn't back up #{database_name}"
elsif $?.exitstatus != 0
  puts "error: problem invoking db_backup"
  exit 1
end

Making the most of exit codes allows users of our app to use it in ways we haven’t imagined. This is part of the beauty and flexibility of the command line, and exit codes are a big part of making that happen. By having each error condition represented by a different exit code, invokers have maximum flexibility in integrating our app into their systems.

There is a small limitation in using exit codes this way; only one piece of information can be sent back. Suppose we could detect several errors and wanted to let the caller know exactly what errors did, and didn’t, occur?

Reporting Multiple Errors in the Exit Status

Although an exit code is only a single number, we can encode several bits of information in that number by treating it as a bitmask. In this strategy, each possible error is represented by a bit in the number we send back. For example, suppose that an invalid command-line option is represented by the bit in the 1 position and that the omission of the database name is represented by the bit in the 2 position. We could let the user know if they forgot the database and if they gave us a bad username like so:

exit_status = 0
begin
  option_parser.parse!
  if ARGV.empty?
    puts "error: you must supply a database name"
    puts
    puts option_parser.help
    exit_status |= 0b0010
  end
rescue OptionParser::InvalidArgument => ex
  puts ex.message
  puts option_parser
  exit_status |= 0b0001
end
exit exit_status unless exit_status == 0
# Proceed with the rest of the program

If it’s been a while since you’ve done any bitwise arithmetic, the |= operator performs a bitwise OR, which sets the given bits in the number while leaving the others untouched. We’re using Ruby’s binary literal syntax, so it’s easy to see which bits are getting set. As we’ll see in a minute, the & operator can be used to check whether a particular bit is set.

Also note that we must use the begin..rescue..end construct to detect an invalid command-line argument because option_parser.parse! will raise an OptionParser::InvalidArgument exception in that case. The message of that exception contains a reasonable error message explaining the problem. An invoker of our app could check for it like so:

system("db_backup.rb medium_client")
# 0 is truthy in Ruby, so compare the masked value against zero explicitly
if ($?.exitstatus & 0b0001) != 0
  puts "error: bad command-line options"
end
if ($?.exitstatus & 0b0010) != 0
  puts "error: forgot the database name"
end

Note that the exit code is only an 8-bit number, so only eight possible errors can be encoded. This is probably sufficient for most uses, but it’s a constraint to be aware of. The exit code strategy you use depends on the situation and the types of errors you can detect before exiting. Whichever method you decide on, be sure to document it!
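The bitmask strategy can be sketched end to end as follows. The ERROR_BITS table and the encode_errors and decode_errors helpers are hypothetical names used for illustration; the bit assignments match the example above. Because 0 is truthy in Ruby, the check compares the masked value against zero rather than using it directly as a condition.

```ruby
# Hypothetical bit assignments, one bit per detectable error.
ERROR_BITS = {
  bad_options:      0b0001,
  missing_database: 0b0010,
}

# Combine a list of error names into a single exit status.
def encode_errors(errors)
  errors.reduce(0) { |status, error| status | ERROR_BITS.fetch(error) }
end

# Recover the list of error names from an exit status.
def decode_errors(status)
  ERROR_BITS.select { |_name, bit| (status & bit) != 0 }.keys
end

status = encode_errors([:bad_options, :missing_database])
puts status               # prints 3
p decode_errors(status)   # prints [:bad_options, :missing_database]
p decode_errors(0b0010)   # prints [:missing_database]
p decode_errors(0)        # prints []
```

Keeping the table in one place makes it easy to see how many of the eight available bits are already in use.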

Exit codes are all about communicating a result to a calling program. Many command-line apps, however, produce more output than just a single value. Our database backup app produces an output file, as well as messages about what the app is doing. Our to-do list app produces output in the form of our to-do items (for example, when we execute todo list). As we’ll see, the form of our output can also communicate information to callers, as well as give them flexibility in integrating our apps into other processes.

4.2 Using the Standard Output and Error Streams Appropriately

In addition to the ability to return a single value to the calling program, all programs have the ability to provide output. The puts method is the primary way of creating output that we’ve seen thus far. We’ve used it to send messages to the terminal. A command line’s output mechanism is actually more sophisticated than this; it’s possible to send output to either of two standard output streams.

By convention, the default stream is called the standard output and is intended to receive whatever normal output comes out of your program. This is where puts sends its argument and where, for example, mysqldump sends the SQL statements that make up the database backup.[31]

The second output stream is called the standard error stream and is intended for error messages. The reason there are two different streams is so that the calling program can easily differentiate normal output from error messages. Consider how we use mysqldump in db_backup.rb:

play_well/db_backup/bin/db_backup.rb

command = "mysqldump #{auth}#{database_name} > #{output_file}"
system(command)
unless $CHILD_STATUS.exitstatus == 0
  puts "There was a problem running '#{command}'"
  exit 1
end

Currently, when our app exits with a nonzero status, it outputs a generic error message. This message doesn’t tell the user the nature of the problem, only that something went wrong. mysqldump actually produces a specific message on its standard error stream. We can see this by using the UNIX redirect operator (>) to send mysqldump’s standard output to a file, leaving the standard error as the only output to our terminal:

$ mysqldump some_nonexistent_database > backup.sql
mysqldump: Got error: 1049: Unknown database 'some_nonexistent_database' \
  when selecting the database

backup.sql contains the standard output that mysqldump generated, and we see the standard error in our terminal; it’s the message about an unknown database. If we could access this message and pass it along to the user, the user would know the actual problem.

Using Open3 to Access the Standard Output and Error Streams Separately

The combination of system and $CHILD_STATUS that we’ve used so far provides access only to the exit status of the application. We can get access to the standard output by using the built-in backtick operator (`) or the %x[] construct, as in stdout = %x[ls -l]. Unfortunately, neither of these constructs provides access to the standard error stream. To get access to both the standard output and the standard error independently, we need to use a module from the standard library called Open3.

Open3 has several useful methods, but the most straightforward is capture3. It’s so-named because it “captures” the standard output and error streams (each as a String), as well as the status of the process (as a Process::Status, the same type of variable as $CHILD_STATUS). We can use this method’s results to augment our generic error message with the contents of the standard error stream like so:

play_well/db_backup/bin/db_backup_2.rb

require 'open3'
puts "Running '#{command}'"
stdout_str, stderr_str, status = Open3.capture3(command)
unless status.exitstatus == 0
  puts "There was a problem running '#{command}'"
  puts stderr_str
  exit 1
end

The logic is exactly the same, except that we have much more information to give the user when something goes wrong. Since the standard error from mysqldump contains a useful error message, we’re now in a position to pass it along to the user:

$ db_backup.rb -u dave.c -p P@ss5word some_nonexistent_database
There was a problem running 'mysqldump -udave.c -pP@ss5word \
  some_nonexistent_database > some_nonexistent_database.sql'
mysqldump: Got error: 1049: Unknown database 'some_nonexistent_database' \
  when selecting the database

Our use of the standard error stream allows us to “handle” any error from mysqldump, such as bad login credentials, which also generates a useful error message:

$ db_backup.rb -u dave.c -p password some_nonexistent_database
There was a problem running 'mysqldump -udave.c -ppassword \
  some_nonexistent_database > some_nonexistent_database.sql'
mysqldump: Got error: 1044: Access denied for user 'dave.c'@'localhost' \
  to database 'some_nonexistent_database' when selecting the database

It’s always good practice to capture the output of the commands you run and either send it to your app’s output or store it in a log file for later reference (we’ll see later why you might not want to just send such output to your app’s output directly).
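To experiment with capture3 without a database at hand, here is a minimal sketch that uses a one-line child Ruby script as a stand-in for mysqldump. The CHILD_SCRIPT constant and its messages are assumptions made for illustration; Open3.capture3 and RbConfig.ruby are part of Ruby’s standard library.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical stand-in for mysqldump: it writes to both streams
# and then exits nonzero.
CHILD_SCRIPT =
  'STDOUT.puts "normal output"; STDERR.puts "something broke"; exit 2'

stdout_str, stderr_str, status =
  Open3.capture3(RbConfig.ruby, "-e", CHILD_SCRIPT)

puts "stdout: #{stdout_str}"
puts "stderr: #{stderr_str}"
puts "exit code: #{status.exitstatus}"
puts "success? #{status.success?}"
```

Each stream arrives as its own String, so we are free to log one and display the other.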

Now that we can read these output streams from programs we execute, we need to start writing to them as well. We just added new code to output an error message, but we used puts, which sends output to the standard output stream. We need to send our error messages to the right place.

Use STDOUT and STDERR to Send Output to the Correct Stream

Under the covers, puts sends output to STDOUT, which is a constant provided by Ruby that allows access to the standard output stream. It’s an instance of IO, and essentially the code puts "hello world" is equivalent to STDOUT.puts "hello world".

Ruby sets another constant, STDERR, to allow output to the standard error stream (see STDOUT and STDERR vs. $stdout and $stderr for another way to access these streams). Changing our app to use STDERR to send error messages to the standard error stream is trivial:

play_well/db_backup/bin/db_backup_3.rb

stdout_str, stderr_str, status = Open3.capture3(command)
unless status.success?
  STDERR.puts "There was a problem running '#{command}'"
  STDERR.puts stderr_str.gsub(/^mysqldump: /,'')
  exit 1
end

You could also use the method warn (provided by Kernel) to output messages to the standard error stream. Messages sent with warn can be disabled by the user, using the -W0 flag to ruby (or putting that in the environment variable RUBYOPT, which is read by Ruby before running any Ruby app). If you want to be sure the user sees the message, however, use STDERR.puts.

Users of our app can now use our standard error stream to get any error messages we might generate. In general, the standard error of apps we call should be sent to our standard error stream.

We now know how to read output from and write output to the appropriate stream, and we’ve started to get a sense of what messages go where. Error messages go to the standard error stream, and “everything else” goes to the standard output stream. How do we know what’s an “error message” and what’s not? And for our “normal” output, what format should we use to be most interoperable with other applications?

Use the standard error stream for any message that isn’t the proper, expected output of your application. We can take a cue from mysqldump here; mysqldump produces the database backup, as SQL, to its standard output. Everything else it produces goes to the standard error. It’s also important to produce something to the standard error if your app is going to exit nonzero; this is the only way to tell the user what went wrong.

The standard output, however, is a bit more complicated. You’ll notice that mysqldump produces a very specific format of output to the standard output (SQL). There’s a reason for this. Its output is designed to be handed off, directly, as input to another app. Achieving this is not nearly as straightforward as producing a human-readable error message, as we’ll see in the next section.

STDOUT and STDERR vs. $stdout and $stderr

In addition to assigning the constants STDOUT and STDERR to the standard output and error streams, respectively, Ruby also assigns the global variables $stdout and $stderr to these two streams (in fact, puts uses $stdout internally).

Deciding which one to use is mostly a matter of taste, but it’s worth noting that by using the variable forms, you can easily reassign the streams each represents. Although reassigning the value of a constant is possible in Ruby, it’s more straightforward to reassign the value of a variable. For example, you might want to reassign your output streams during testing to capture what’s going to the standard error or output streams.

We’ll use the constant forms in this book, because we want to think of the standard output and error streams as immutable. The caller of our app should decide whether these streams should be redirected elsewhere, and if we ever need to send output to one of the streams or another IO instance, we would abstract that out, rather than reassign $stdout or $stderr.
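For readers who do prefer the variable forms in their tests, here is a minimal sketch of capturing the standard error stream by temporarily reassigning $stderr. The helper name capture_stderr is our own, not something Ruby provides.

```ruby
require 'stringio'

# Hypothetical test helper: point $stderr at a StringIO for the
# duration of the block and return whatever was written to it.
def capture_stderr
  original = $stderr
  $stderr = StringIO.new
  yield
  $stderr.string
ensure
  # Always restore the real stream, even if the block raises.
  $stderr = original
end

message = capture_stderr do
  $stderr.puts "error: you must supply a database name"
end

puts "captured: #{message}"
```

Note that code writing to the constant STDERR would bypass this capture, which is one practical difference between the two forms.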

4.3 Formatting Output for Use As Input to Another Program

If you’ve used the UNIX command line for even five minutes, you’ve used ls. It shows you the names of files in a particular directory. Suppose you have a directory of images with numeric names. ls nicely formats it for you:

$ ls
10_image.jpg  2_image.jpg  5_image.jpg  8_image.jpg
11_image.jpg  3_image.jpg  6_image.jpg  9_image.jpg
1_image.jpg   4_image.jpg  7_image.jpg

Suppose we want to see these files in numeric order. ls has no ability to do this; it sorts lexicographically, even when we use the invocation ls -1, which produces one file per line, in “sorted” order:[32]

$ ls -1
10_image.jpg
11_image.jpg
1_image.jpg
2_image.jpg
3_image.jpg
4_image.jpg
5_image.jpg
6_image.jpg
7_image.jpg
8_image.jpg
9_image.jpg

The UNIX command sort, however, can sort lines of text numerically if we give it the -n switch. If we could connect the output of ls to the input of sort, we could see our files sorted numerically, just how we’d like. Fortunately, the command line provides a way to do this. We follow our invocation of ls -1 with the pipe symbol (|) and follow that with a call to sort -n. This tells sort to use, as input, the standard output that came from ls:

$ ls -1 | sort -n
1_image.jpg
2_image.jpg
3_image.jpg
4_image.jpg
5_image.jpg
6_image.jpg
7_image.jpg
8_image.jpg
9_image.jpg
10_image.jpg
11_image.jpg

If the creator of ls had not provided an output format that is one file per line, this would’ve been very difficult to do, and we would’ve had to write a custom program to parse the default output format of ls. The ability to connect these two commands is what makes the command line so powerful. You’ll find that most UNIX commands obey this convention of formatting their output so that it can be used as input to another program. This is often referred to as the “UNIX Way,” summed up neatly at faqs.org (http://www.faqs.org/docs/artu/ch01s06.html):

Expect the output of every program to become the input to another, as yet unknown, program. Don’t clutter output with extraneous information. Avoid stringently columnar or binary input formats.

How can we design our output to work as input to a program we know nothing about? It’s actually pretty simple, once you’re aware of a few conventions. Most command-line apps operate on one or more “things” that we can generically think of as records. As we’ll see, each record should be on its own line. We got a hint of how handy that is in our earlier sorting example, in which a record in ls is a file. By using the -1 option, we got one record (file) per line. Currently, the output format of our todo list app is a multiline “pretty-printed” format that looks like so:

$ todo list
1 - Clean kitchen
    Created: 2011-06-03 13:45
2 - Rake leaves
    Created: 2011-06-03 17:31
    Completed: 2011-06-03 18:34
3 - Take out garbage
    Created: 2011-06-02 15:48
4 - Clean bathroom
    Created: 2011-06-01 12:00

This formatting might be pleasing to a human eye, but it’s a nightmare as input to another program. Suppose we wanted to use our good friend sort to sort the list. As we’ve seen, sort sorts lines of text, so a naive attempt to sort our to-do list will lead to disastrous results:

$ todo list | sort
    Completed: 2011-06-03 18:34
    Created: 2011-06-01 12:00
    Created: 2011-06-02 15:48
    Created: 2011-06-03 13:45
    Created: 2011-06-03 17:31
1 - Clean kitchen
2 - Rake leaves
3 - Take out garbage
4 - Clean bathroom

If we could format each record (in our case, a task) on one line by itself, we could then use UNIX tools like sort and cut to manipulate the output for todo to get a properly sorted list. But beyond interoperability with standard UNIX tools, we want our app to be able to work with as many apps as possible, in ways we haven’t thought of. This means that users can get the most out of our app and won’t need to wait for us to add special features. The easiest way to make that happen is to spend some time thinking about how to format each record.

Format Output One Record per Line, Delimiting Fields

We can easily format our records for output one line at a time like so:

$ todo list
1 Clean kitchen Created 2011-06-03 13:45
2 Rake leaves Created 2011-06-03 17:31 Completed 2011-06-03 18:34
3 Take out garbage Created 2011-06-02 15:48
4 Clean bathroom Created 2011-06-01 12:00

This approach certainly follows our “one record per line” rule, but it’s not that useful. We can’t reliably tell where the task name stops and the created date begins. This makes it hard to use a command like cut to extract, say, just the task name. cut expects a single character to separate each field of our record. In our case, there is no such character; an app that wanted to extract the different fields from our tasks would need a lot of smarts to make it work.

If we format our output by separating each field with an uncommon character, such as a comma, parsing each record becomes a lot simpler; we just need to document which field is which so a user can use a command like cut or awk to split up each line into fields. We can demonstrate the power of this format by using a few UNIX commands to get a list of our task names, sorted alphabetically.

$ todo list
1,Clean kitchen,2011-06-03 13:45,
2,Rake leaves,2011-06-03 17:31,2011-06-03 18:34
3,Take out garbage,2011-06-02 15:48,
4,Clean bathroom,2011-06-01 12:00,
$ todo list | cut -d',' -f2
Clean kitchen
Rake leaves
Take out garbage
Clean bathroom
$ todo list | cut -d',' -f2 | sort
Clean bathroom
Clean kitchen
Rake leaves
Take out garbage

The code to do that is fairly trivial:

play_well/todo/bin/todo

printf("%d,%s,%s,%s\n",index,name,created,completed)

By formatting our output in a general and parseable way, it can serve as input to any other program, and our app is now a lot more useful to a lot more users. A user of our app can accomplish her goals by sending our app’s output to another app’s input. Users get to use our app in new ways, and we don’t have to add any new features!
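To illustrate how little a consumer needs to know, here is a minimal sketch of another Ruby program splitting such records into named fields. The field order (id, name, created, completed) and the names FIELDS and parse_record are assumptions made for this sketch.

```ruby
# Assumed field order for this sketch: id, name, created, completed.
FIELDS = [:id, :name, :created, :completed]

# Turn one line of output into a Hash keyed by field name.
def parse_record(line)
  # The -1 limit keeps a trailing empty field (a task not yet completed).
  values = line.chomp.split(",", -1)
  FIELDS.zip(values).to_h
end

lines = [
  "2,Rake leaves,2011-06-03 17:31,2011-06-03 18:34",
  "1,Clean kitchen,2011-06-03 13:45,",
]

# Equivalent in spirit to: todo list | cut -d',' -f2 | sort
puts lines.map { |line| parse_record(line)[:name] }.sort
```

A task name that itself contains a comma would break this simple split, which is why the choice of delimiter deserves some thought and should be documented.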

There’s one last problem, however. Suppose we wanted to use grep to filter out the tasks that have been completed. In our current format, a task is completed if it has a date in the fourth field. Identifying lines like this is a bit tricky, especially for simple tools like grep. If we add additional information in our output, however, we can make the job easier.

Add Additional Fields to Make Searching Easier

Even though users can see that a task is incomplete because the “completed date” field is omitted from its record, we can make life easier for them by making that information more explicit. To do that in our to-do app, we’ll add a new field to represent the status of a task, where the string “DONE” means the task has been completed and “INCOMPLETE” means it has not.

$ todo list
1,Clean kitchen,INCOMPLETE,2011-06-03 13:45,
2,Rake leaves,DONE,2011-06-03 17:31,2011-06-03 18:34
3,Take out garbage,INCOMPLETE,2011-06-02 15:48,
4,Clean bathroom,INCOMPLETE,2011-06-01 12:00,
$ todo list | grep ",INCOMPLETE,"
1,Clean kitchen,INCOMPLETE,2011-06-03 13:45,
3,Take out garbage,INCOMPLETE,2011-06-02 15:48,
4,Clean bathroom,INCOMPLETE,2011-06-01 12:00,

Note that we include the field delimiters in our string argument to grep so we can be sure what we are matching on; we don’t want to identify a field as incomplete because the word “INCOMPLETE” appears in the task name.
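The same concern applies to a caller written in Ruby: it is safer to test the status field itself than to search the whole line. The following is a minimal sketch; STATUS_FIELD, incomplete?, and the task numbered 5 are hypothetical and exist only to show a task name that contains the word “INCOMPLETE.”

```ruby
# Position of the status field: id, name, status, created, completed.
STATUS_FIELD = 2

# True only when the status field itself says INCOMPLETE.
def incomplete?(line)
  line.chomp.split(",", -1)[STATUS_FIELD] == "INCOMPLETE"
end

lines = [
  "1,Clean kitchen,INCOMPLETE,2011-06-03 13:45,",
  "2,Rake leaves,DONE,2011-06-03 17:31,2011-06-03 18:34",
  # Hypothetical task whose name contains the word INCOMPLETE.
  "5,Review INCOMPLETE forms,DONE,2011-06-04 09:00,2011-06-04 10:00",
]

# Only the first record is printed; a plain substring search for
# "INCOMPLETE" would wrongly include the third one as well.
puts lines.select { |line| incomplete?(line) }
```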

At this point, any user of our app can easily connect our output to another app and do things with our to-do list app we haven’t thought of. Our app is definitely playing well with others. The only problem is that machine-readable formats tend not to be very human readable. This wasn’t a problem with ls, whose records (files) have only one field (the name of the file). For complex apps like todo, where there are several fields per record, the output is a bit difficult to read.

A seasoned UNIX user would simply pipe our output into awk and format the list to their tastes. We can certainly leave it at that, but there’s a usability concern here. Our app is designed to be used by a user sitting at a terminal. We want to maintain the machine-readable format designed for interoperability with other apps but also want our app to interoperate with its users.

Provide a Pretty-Printing Option

The easiest way to provide both a machine-readable output format and a human-readable option is to create a command-line flag or switch to specify the format. We’ve seen how to do this before, but here’s the code we’d use in todo to provide this:

play_well/todo/bin/todo

desc 'List tasks'
command :list do |c|
  c.desc 'Format of the output'
  c.arg_name 'csv|pretty'
  c.default_value 'pretty'
  c.flag :format
  c.action do |global_options,options,args|
    if options[:format] == 'pretty'
      # Use the pretty-print format
    elsif options[:format] == 'csv'
      # Use the machine-readable CSV format
    end
  end
end

We’ve chosen to make the pretty-printed version the default since, as we’ve mentioned, our app is designed primarily for a human user. That might not be the case for every app, so use your best judgment as to what is appropriate.
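What the two branches might do can be sketched independently of the command-line parsing. Everything here (the Task struct, the FORMATTERS table, and format_task) is a hypothetical illustration rather than the actual todo implementation; the output shapes follow the examples shown earlier in this chapter.

```ruby
# Hypothetical in-memory representation of a task.
Task = Struct.new(:id, :name, :created, :completed)

# One formatter per supported value of the format flag.
FORMATTERS = {
  "csv" => lambda do |task|
    status = task.completed ? "DONE" : "INCOMPLETE"
    # nil becomes an empty field when joined.
    [task.id, task.name, status, task.created, task.completed].join(",")
  end,
  "pretty" => lambda do |task|
    lines = ["#{task.id} - #{task.name}", "    Created: #{task.created}"]
    lines << "    Completed: #{task.completed}" if task.completed
    lines.join("\n")
  end,
}

# Look up the formatter by name; an unknown format raises KeyError.
def format_task(task, format)
  FORMATTERS.fetch(format).call(task)
end

task = Task.new(1, "Clean kitchen", "2011-06-03 13:45", nil)
puts format_task(task, "csv")
puts format_task(task, "pretty")
```

Keeping the formatters in a table means adding a third format later would not require touching the code that chooses between them.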

Command-line options, exit codes, and output streams are great for apps that start up, do something, and exit. For long-running (or daemon) apps, we often need to communicate with the app without restarting it. This is done via an operating system feature called signals, which we’ll learn about next.

4.4 Trapping Signals Sent from Other Apps

Our two example apps, db_backup.rb and todo, are not long-running apps. They start up very quickly and exit when they’re done. While this is very common for command-line apps, there are occasions where we will need to write a long-running process. You may have a process that runs all the time, polling a message queue for work to do, or you may need to run a process that monitors other processes. Or, you might have a task that just takes a long time. In each case, you’ll want a way for a user to send information to your running app.

The most common example is to stop an app from running. We do this all the time by hitting Ctrl-C when an app is running in our terminal. This actually sends the app a signal, which is a rudimentary form of interprocess communication. By default, Ruby programs will exit immediately when they receive the signal sent by Ctrl-C. This may not be what you want, or you may want to cleanly shut down things before actually exiting. To allow for this, Ruby programs can trap these signals.

To trap a signal, the module Signal, which is built in to Ruby, provides the method trap. trap takes two arguments: the signal to trap and a block of code to run when that signal is received by the program.

The POSIX standard, which UNIX-like systems follow, provides a list of signals that can be trapped (Windows supports only a small subset of them). There are many different signals (see a complete list at http://en.wikipedia.org/wiki/Signal_(computing)), but the signal we’re generally interested in is SIGINT, which is sent by Ctrl-C and can also be sent with kill -INT. For a long-running process, you should also trap SIGTERM, which is what the kill command sends by default, as well as SIGABRT and SIGQUIT, since any of these signals could be used to attempt to shut down your app.

Suppose we wanted to enhance db_backup.rb to clean up the database dump whenever the user kills it. This scenario could happen, because a database dump takes a long time. The current implementation of db_backup.rb will, if killed, exit immediately, leaving a partial dump file in the current directory. Let’s fix that by trapping SIGINT, removing the database output file, and then exiting.

play_well/db_backup/bin/db_backup_3.rb

require 'fileutils'
Signal.trap("SIGINT") do
  # rm_f so that a dump file that was never created doesn't raise an error
  FileUtils.rm_f output_file
  exit 1
end

Cleaning up isn’t the only thing you can do by trapping signals. You can use the signal system to create a control interface to modify a running app’s behavior. For example, many daemon processes trap SIGHUP and reread their configuration. This allows such apps to reconfigure themselves without shutting down and restarting.
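Here is a minimal sketch of the SIGHUP idea, assuming a UNIX-like system. The load_config method stands in for rereading a real configuration file, and the process signals itself only so that the example is self-contained; normally another process would run kill -HUP with our process ID.

```ruby
# Hypothetical stand-in for rereading a configuration file.
def load_config
  { "poll_interval" => 5, "loaded_at" => Time.now }
end

config = load_config
$reload_requested = false

# Keep the handler tiny: just record that the signal arrived and let
# the main loop do the real work.
Signal.trap("HUP") { $reload_requested = true }

# Send ourselves the signal, for demonstration purposes only.
Process.kill("HUP", Process.pid)

# Stand-in for the app's main loop: wait up to a second for the flag.
20.times do
  break if $reload_requested
  sleep 0.05
end

if $reload_requested
  config = load_config
  puts "configuration reloaded, poll_interval=#{config['poll_interval']}"
end
```

Doing only a flag assignment inside the handler is a deliberate choice in this sketch, since the handler can interrupt the program at almost any point.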

4.5 Moving On

So far, we’ve learned the nuts and bolts of creating an easy-to-use, helpful, and flexible command-line application. In this chapter, we’ve seen how exit codes can communicate success or failure to the apps that call ours. We’ve seen the importance of sending error messages to the standard error stream, and we’ve seen the amazing power of formatting our standard output as if it were destined to be input to another program. We’ve also seen how long-running apps can receive signals from users or other apps. These lessons are truly what makes the command line so infinitely extensible.

If all you did was follow these rules and conventions, you’d be producing great command-line apps. But, we want to make awesome command-line apps, and these rules and conventions can take us only so far. There are still a lot of open questions about implementing your command-line app. Our discussion of “pretty-printed” vs. “machine readable” formats is just one example: how should you decide which default to use? What about files that our app creates or uses; where should they live? Should we always use a one-letter name for our command-line options? When should we use the long-form names? How do we choose the default values for flags?

The answers to these questions aren’t as clear-cut, but in the next chapter, we’ll try to answer them so that your apps will provide the best user experience possible.

Footnotes

[28] http://www.freebsd.org/

[29] https://github.com/ged/sysexits

[30] http://www.gnu.org/

[31] A “database backup” produced by mysqldump is a set of SQL statements that, when executed, re-create the backed-up database.

[32] Strictly speaking, the -1 is not required; we’ll talk more about that in Chapter 5, Delight Casual Users.