Be Easy to Maintain - Build Awesome Command-Line Applications in Ruby 2: Control Your Computer, Simplify Your Life (2013)

Chapter 9. Be Easy to Maintain

We know how to make an easy-to-use, helpful app that interacts with the system as easily as it does with its users, yet is highly flexible. In the previous chapter, we learned how to test such an app. We’re almost done with our journey through command-line app development. What we haven’t talked about is how to manage our apps in the face of increasing complexity. We imagined a version of our todo app that integrated with JIRA, a commercial issue-tracking system. What if we decided to open source that app and users wanted it to integrate with other issue-tracking systems? How do we make sure we can contain the complexity of a growing, popular app?

You’ve already learned the first step in writing a maintainable application: a good test suite. With a solid set of tests, we can be free to make changes to how our app works. These sorts of changes aren’t for the users (at least not directly); they’re for us. Reorganizing our code makes it easier for us to work with it, as well as for others to understand it so they can help us. This chapter deals with this problem in two parts. In the first, we’ll talk about where files should go and how they should generally be structured (we got a taste of this in the previous chapter when we extracted code to write unit tests). The second part will demonstrate a few techniques for reorganizing code by applying some simple design patterns to our running example apps. Ideally, seeing some refactoring in action will give you the confidence you need to do the same with your own apps.

9.1 Dividing Code into Multiple Files

In the previous chapter, in order to test the main logic of our task-management app todo, we created the file lib/todo/task.rb, created a class Task that was inside the module Todo, and put some logic into the add_task method. We glossed over the reasons for this, but what we did was a standard way of organizing code in Ruby. This section will explain the how and why of code organization.

You might wonder why we should even bother storing our code in multiple files; it does add a certain amount of complexity to our application because code is no longer in one place. Keeping the code all in our executable carries complexity as well. In the previous chapter, we saw that we couldn’t unit test code that was stored in there; if we were to require an executable into a test, the executable would run before we got a chance to set up our test. There is another reason, as well.

Keeping our code organized across several files will make it easier to find things once our app gets to a certain level of complexity. Understanding an app that is made up of a single 1,000+ line file can be hard. Even todo, which consists of about 120 lines of code, is already showing some problems. The vast majority of the file sets up the user interface. The actual logic of what our app does is buried several layers deep and spread throughout the file. If we wanted to get an understanding of what a task is and how the code manages the task list, we’d have to bounce all over the file to figure it out.

So, by organizing code in several files, we can make things much clearer. We’ll look in one file to understand the user interface, another for the logic of task management, another for writing output for the user, and so on. All it takes is following a few conventions. First we’ll talk about the mechanics of accessing files outside of our executable, and then we’ll learn some conventions for organizing code within those “outside” files.

Configuring Code to Find External Files

In Ruby, it’s customary to place most of your code in files in (or in a subdirectory of) the lib directory. (Some Rubyists prefer to put all code here.) We’ll keep user interface code in our executable but place all non-UI-specific code in files in lib. We’ll talk about how to organize the code inside there in a moment. The main issue you can run into when moving your code out of the main executable is being able to find it so it can be required at runtime. When you use require, Ruby searches the load path for that file. By default, the load path won’t include your application’s lib directory. Since we’re deploying with RubyGems (see Chapter 7, Distribute Painlessly), this problem is easily solved in the gemspec.
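The effect of the load path on require can be seen in a small, self-contained sketch. Everything named here (the load_path_demo feature, the LoadPathDemo module, and the demonstrate_load_path method) is hypothetical and exists only for illustration; the sketch builds a throwaway lib directory, shows that require fails while that directory is off the load path, and shows that it succeeds once the directory is added.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical feature name, chosen so it cannot clash with a real library.
FEATURE = 'load_path_demo/version'

def demonstrate_load_path
  Dir.mktmpdir do |project_root|
    lib_dir = File.join(project_root, 'lib')
    FileUtils.mkdir_p(File.join(lib_dir, 'load_path_demo'))
    File.write(File.join(lib_dir, 'load_path_demo', 'version.rb'),
               "module LoadPathDemo\n  VERSION = '0.0.1'\nend\n")

    # lib is not on the load path yet, so require cannot find the file.
    found_before = begin
      require FEATURE
      true
    rescue LoadError
      false
    end

    # With lib on the load path, the same require succeeds.
    $LOAD_PATH.unshift(lib_dir)
    found_after = require(FEATURE)
    $LOAD_PATH.delete(lib_dir)

    [found_before, found_after]
  end
end

FOUND_BEFORE, FOUND_AFTER = demonstrate_load_path
puts "found before: #{FOUND_BEFORE}, found after: #{FOUND_AFTER}"
```

The sketch edits $LOAD_PATH directly only to make the mechanism visible; in a packaged app, that job belongs to RubyGems.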

As you may recall, the gemspec includes the list of files to package with your application. The gemspec also includes an array of require paths, which are all the paths, relative to your project root, that should be added to the load path when your application starts. Setting this is straightforward:

be_easy_to_maintain/todo/todo.gemspec

s.files = %w(
  bin/todo
  lib/todo/version.rb
)
s.require_paths << 'lib'

When gem installs your RubyGem, it will make sure that the executable it puts in the installing user’s path will set up the Ruby environment so that your files in lib can be found. It also means that this line from bin/todo that sets up the load path is unnecessary. (However, this is not the case when using a non-RubyGems installation method. Further, removing this line will have some slight implications for running our app locally and for running our tests. See Developing Locally Without Modifying the Load Path.)

be_easy_to_maintain/todo/bin/todo

# This line can be removed
#$LOAD_PATH << File.expand_path(File.dirname(__FILE__) + '/../lib')
require 'gli'
require 'todo/version'
require 'todo/task'

Developing Locally Without Modifying the Load Path

Since RubyGems takes care of setting up our app’s load path at runtime, it’s considered bad practice to modify the load path (using the variable $LOAD_PATH) in application code. Avoiding this bad practice causes us a problem, however.

$ bin/todo help
custom_require.rb:36:in `gem_original_require': no such file to load -- todo_version (LoadError)
  from custom_require.rb:36:in `require'
  from bin/todo:8

When run locally, our app cannot find the files in lib. If we were to run our Cucumber features, we would have the same problem. The fix for both is the environment variable RUBYLIB. RUBYLIB is a delimited list of paths that are added to the load path. The delimiter is platform-specific (colon on UNIX and semicolon on Windows), so fixing it for Cucumber is a bit trickier than in our shell. We don’t want to require users to set it in their environment, so we apply the same technique we did when dealing with the user’s home directory; we modify RUBYLIB inside ENV.

be_easy_to_maintain/todo/features/support/env.rb

LIB_DIR = File.join(File.expand_path(File.dirname(__FILE__)),'..','..','lib')

Before do
  @original_rubylib = ENV['RUBYLIB']
  ENV['RUBYLIB'] = LIB_DIR + File::PATH_SEPARATOR + ENV['RUBYLIB'].to_s
end

After do
  ENV['RUBYLIB'] = @original_rubylib
end

To run locally, we could do the same thing like so:

$ RUBYLIB=lib bin/todo help
usage: todo [global options] command [command options]
# etc.

An alternative way, since we’re using Bundler, is to use bundle exec:

$ bundle exec bin/todo help
usage: todo [global options] command [command options]
# etc.

These two forms are slightly different; bundle exec will run our app with the exact versions of each gem as specified in our Gemfile.lock, which is likely what we want. It’s also easier to type bundle exec than RUBYLIB=lib on the command line.
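Returning to the Before hook shown earlier: it joins LIB_DIR to whatever RUBYLIB already holds, which leaves a trailing separator when the variable is unset. A minimal sketch of a helper that avoids this follows; the method name rubylib_with is an assumption made for illustration and is not part of the app.

```ruby
# Hypothetical helper: returns a RUBYLIB value with dir placed first,
# skipping the existing value when it is unset or empty.
def rubylib_with(dir, current = ENV['RUBYLIB'])
  [dir, current].compact.reject(&:empty?).join(File::PATH_SEPARATOR)
end

puts rubylib_with('lib', nil)
puts rubylib_with('lib', '/opt/shared/lib')
```

File::PATH_SEPARATOR supplies the platform-specific delimiter, so the same helper would work on UNIX and on Windows.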

Once we have our code in the lib directory and our gemspec is updated to add it to the load path, there’s now a question of where code goes in files.

Organizing Code Within Files

Now that we can access code from files in lib, we need to know the best way to organize those files. There are three conventions to follow:

· Each class should be namespaced inside a module named for the project.

· Each class should be in its own file, with the filename and path based on the class’s name and namespace.

· A single file inside lib, named for the project, should require all other files in lib.

There might be a few unfamiliar terms in there, but don’t worry, it’ll be clear in a moment. Let’s look at our first convention, which tells us how to namespace our classes.

Namespacing Classes

In Ruby, classes can be namespaced using modules. You may recall that, when we extracted our code from todo into lib/todo/task.rb, we placed the class Task inside the module Todo. This namespaced Task inside Todo, making the class’s complete name Todo::Task. Since Ruby has open classes, if someone else had a class named Task and we didn’t namespace our Task class, we’d be adding methods to the existing Task class, and things would likely not work properly.

To avoid this situation, all of your classes should be namespaced, and the module in which to place them should be named for your app, in our case Todo. Note that you should “camelize” the module name, so for our db_backup MySQL backup app, its module would be DbBackup.
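As a small sketch of why the namespace matters, the following defines a top-level Task (standing in for someone else’s class) alongside our Todo::Task and shows that they remain separate constants. The description methods and the camelize helper are hypothetical, written only for this illustration.

```ruby
# Stand-in for a Task class defined by some other library.
class Task
  def description
    'a task from another library'
  end
end

module Todo
  # Namespaced inside Todo, this is a different constant: Todo::Task.
  class Task
    def description
      'a todo item'
    end
  end
end

# Hypothetical helper: turns a snake_case app name into a module name.
def camelize(app_name)
  app_name.split('_').map(&:capitalize).join
end

puts Task.new.description
puts Todo::Task.new.description
puts camelize('db_backup')
```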

Naming Files According to Their Full Classname

Once a class is in a namespace, the path/name to the file containing that class’s source code should match the namespace. In our case, the class Todo::Task is stored in a file named todo/task.rb (relative to lib). If we created a new class named Todo::TicketSystem, its source would be stored in todo/ticket_system.rb (note how we use the underscore version of the classname). If we had a class named Todo::TicketSystem::JIRA, this would be stored in todo/ticket_system/jira.rb (again, relative to lib).
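The mapping from a class name to its file can be written down as a short function. This is only a sketch of the convention, and the method name path_for is an assumption for illustration; nothing in Ruby or RubyGems calls such a method for you.

```ruby
# Hypothetical helper: maps a fully qualified class name to its path
# relative to lib, following the naming convention described above.
def path_for(class_name)
  segments = class_name.split('::').map do |segment|
    # Separate an acronym from a following capitalized word.
    with_acronyms_split = segment.gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2')
    # Separate a lowercase letter or digit from a following capital.
    underscored = with_acronyms_split.gsub(/([a-z\d])([A-Z])/, '\1_\2')
    underscored.downcase
  end
  File.join(*segments) + '.rb'
end

puts path_for('Todo::Task')
puts path_for('Todo::TicketSystem')
puts path_for('Todo::TicketSystem::JIRA')
```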

This leads to a proliferation of files, and it might seem that our executable is going to have a lot of require statements at the top. Further, it seems that every time we add a new file, the executable has to change to include it. We can avoid this by using the third convention: having one file dedicated to requiring the correct files.

Requiring All Files from a Single File in lib

We’d like to write the following in the executable and get every file and class we needed:

require 'todo'

We can make this happen by including all of our require statements in lib/todo.rb:

require 'todo/task'
require 'todo/ticket_system'
require 'todo/ticket_system/jira'

We will have to maintain this file as we add new classes, but it does keep our executable clean, and our Cucumber features will instantly fail if we forget to update this file.

We’ve now learned the mechanics of organizing bits of code into external files and making sure our application can find them at runtime. In Ruby, those “bits” are classes and modules, and making sure that the right code goes into the right class or module is just as important as where our files are located. In the next section, we’ll briefly discuss why, and then we’ll see some examples of how to design the internals of an application using classes and modules effectively.

9.2 Designing Code for Maintainability

Everything we’ve learned so far about making our codebase maintainable has been encapsulated with clear and simple guidelines. The conventions we’ve just discussed are shared by Rubyists everywhere and will make it very easy for anyone, including you, to navigate your code. This means that bugs get fixed faster, and new features can be pushed out quickly, which means users win, the app’s developers win, and you win.

However, there’s more to maintainability than just an organized file structure. The internal design of our application is just as important. Since Ruby is an object-oriented language, the internal design of a Ruby application revolves around organizing code into classes and modules. Achieving good object-oriented design is well outside the scope of this book, but suffice it to say, the more sense the code makes, the easier it is to enhance, fix, and work with. Let’s look at the code for our task management app, todo.

be_easy_to_maintain/todo/bin/todo

file.readlines.each do |todo|
  name,created,completed = todo.chomp.split(/,/)
  if options[:format] == 'pretty'
    # Use the pretty-print format
    printf("%2d - %s\n",index,name)
    printf(" %-10s %s\n","Created:",created)
    printf(" %-10s %s\n","Completed:",completed) if completed
  elsif options[:format] == 'csv'
    # Use the machine-readable CSV format
    complete_flag = completed ? "C" : "U"
    printf("%d,%s,%s,%s,%s\n",index,name,complete_flag,created,completed)
  end
  index += 1
end

This code is doing a lot of things at once: it’s parsing the tasks from the external tasklist file, it’s formatting them (based on the user’s selection), and it’s iterating over all the tasks. Suppose we wanted to add a new command-line switch to allow the user to hide completed tasks. We’d have to add even more code to this block. Understanding this seemingly simple bit of code requires keeping our brains focused on many levels of the software: the user interface, the domain of “task management,” and the internals of how we store our tasks. Classes are just the thing to sort this out.

There are many, many books on object-oriented design, and there are many design patterns that help solve common problems (see the Gang of Four book [GHJV95] for an overview of several common patterns). What we’ll do here is apply some of those patterns to the problems in our code. This will demonstrate how to apply these patterns, and it will also give you a bit of direction when you decide that your code’s internal structure needs a cleanup.

We’ll start by encapsulating everything about tasks in our app by expanding our Task. Then, we’ll apply a pattern called Factory Method to allow us to easily construct our tasks from the task list file’s contents. Finally, we’ll abstract all of that formatting code by using the Strategy pattern. This way, the code inside list’s action block will be cleaner and easier to understand.

Encapsulating Data and Code into Classes

The remainder of this chapter depends on our data being stored in an object, not as “loose” variables. Right now, we have name, created, and completed in our code to represent a particular task, as well as its current state. If we can encapsulate these attributes inside our Task class, it will simplify things quite a bit. This change sets the stage for further changes, so its positive effect won’t be seen until the end of this section.

We already have a Task class from our previous refactoring from Chapter 8, Test, Test, Test. That class has only one method, and we created it to have a “unit” to test. Let’s expand this class by adding the attributes of a task to it.

be_easy_to_maintain/todo/lib/todo/task.rb

attr_reader :name,
            :created_date,
            :completed_date

def initialize(name,created_date,completed_date)
  @name = name
  @created_date = created_date
  @completed_date = completed_date
end

def completed?
  !completed_date.nil?
end

Now we have a single source of information about any task in our system. A task is formally defined as a name, a date of creation, and a date of completion. It also has a notion of being completed or not (the completed? method). This, in and of itself, will actually increase the size of our code as currently designed. As we said, this is setting the stage. The usefulness of this class will become apparent by the end of the section.
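To see the class in use, here is a self-contained sketch that wraps the excerpt above in its Todo namespace and creates two tasks. The task names and dates are made up for illustration.

```ruby
module Todo
  class Task
    attr_reader :name, :created_date, :completed_date

    def initialize(name, created_date, completed_date)
      @name = name
      @created_date = created_date
      @completed_date = completed_date
    end

    # A task counts as completed once it has a completion date.
    def completed?
      !completed_date.nil?
    end
  end
end

# Made-up example data.
open_task = Todo::Task.new('Buy milk', '2013-01-01', nil)
done_task = Todo::Task.new('File taxes', '2013-01-01', '2013-01-05')

puts "#{open_task.name}: completed? #{open_task.completed?}"
puts "#{done_task.name}: completed? #{done_task.completed?}"
```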

Using the Factory Method Pattern to Control How Objects Are Created

Currently, the way in which tasks are written to and read from the tasklist file is spread out across the executable. If we centralize how this is done, we can simplify our main code and provide another testable unit to ensure the quality of our application. We’ll add a class method to Task that, given an open file, reads it and returns a list of Task instances. Since this method creates new objects, it’s called a factory method. Here’s what it looks like:

be_easy_to_maintain/todo/lib/todo/task.rb

class Task
  def self.from_file(file)
    tasks = []
    file.readlines.each do |line|
      name,created,completed = line.chomp.split(/,/)
      tasks << Task.new(name,created,completed)
    end
    tasks
  end
end

With our factory method from_file vending instances of Task, our list command code is vastly simplified:

be_easy_to_maintain/todo/bin/todo

Todo::Task.from_file(tasklist).each do |task|
  # ... formatting code
end

Notice how the code just reads better. Say it out loud: “for each task from the file tasklist.” More importantly, we’ve now encapsulated the individual bits of a task inside the Task class, and the details of parsing the file are a class method of Task. If we were to add another attribute to a task (say, a priority), our code inside the list command wouldn’t have to change. We’re keeping the general outline of the list command’s algorithm separate from the details of how the tasks are stored in the task list file.
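Because from_file only needs an object that responds to readlines, it can be exercised without touching the filesystem. The sketch below repeats the Task class so that it runs on its own and feeds the factory method a StringIO; the task list contents are made up for illustration.

```ruby
require 'stringio'

module Todo
  class Task
    attr_reader :name, :created_date, :completed_date

    def initialize(name, created_date, completed_date)
      @name = name
      @created_date = created_date
      @completed_date = completed_date
    end

    def completed?
      !completed_date.nil?
    end

    # Factory method: builds one Task per line of the given IO-like object.
    def self.from_file(file)
      tasks = []
      file.readlines.each do |line|
        name, created, completed = line.chomp.split(/,/)
        tasks << Task.new(name, created, completed)
      end
      tasks
    end
  end
end

# Made-up task list: the first task has no completion date.
tasklist = StringIO.new("Buy milk,2013-01-01\nFile taxes,2013-01-01,2013-01-05\n")

Todo::Task.from_file(tasklist).each do |task|
  puts "#{task.name} (completed: #{task.completed?})"
end
```

A unit test could make assertions on the returned array in the same way, which is the “testable unit” benefit mentioned above.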

Of course, we still have all that code to deal with for formatting a task for output. Let’s simplify that next.

Organizing Multiple Ways of Doing Something with the Strategy Pattern

Mixing low-level details into our high-level list command code is one problem, but there’s another situation that will make this code even harder to maintain. Suppose we wanted to add JSON as an output format. To accommodate that, we’d have to update our documentation string, add another elsif block, and implement the format.

Whenever we have many ways of doing conceptually the same thing, the Strategy pattern can usually simplify the code. This pattern involves placing the code for each way of doing something (i.e., each strategy) in a different class, each of which has the same interface. We then locate the correct class at runtime and apply the strategy. Instead of seeing the actual strategy classes first, let’s look at the complete picture that we’re aiming for:

be_easy_to_maintain/todo/bin/todo

command :list do |c|
  output_formats = {
    'csv' => Todo::Format::CSV.new,
    'pretty' => Todo::Format::Pretty.new,
  }

  c.desc 'Format of the output (pretty for TTY, csv otherwise)'
  c.arg_name output_formats.keys.join('|')
  c.flag :format

  c.action do |global_options,options,args|
    formatter = output_formats[options[:format]]
    File.open(global_options[:filename]) do |tasklist|
      index = 1
      Todo::Task.from_file(tasklist).each do |task|
        formatter.format(index,task)
        index += 1
      end
    end
  end
end

The Hash named output_formats is a map of formatting strategies. The keys are the names the user will specify on the command line; we join them with pipes to generate the documentation string for arg_name. We then use the value of the --format flag to locate the correct strategy class. We assume that all strategy classes have a method format that takes the current index as well as the task to format.

Now, our list code is very clear and clean; we see that we’re reading the tasks from a file and then formatting each one for the user. Adding a new format is as simple as adding an entry to our output_formats hash; that’s it!

output_formats = {
  'csv' => Todo::Format::CSV.new,
  'pretty' => Todo::Format::Pretty.new,
  'json' => Todo::Format::JSON.new,
}
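One case the hash lookup in the list command does not cover is a format name that is not a key in output_formats: the lookup returns nil, and the failure only surfaces when format is called on it. A minimal sketch of a stricter lookup follows. The method name formatter_for and the symbol stand-ins for the strategy objects are assumptions for illustration, and GLI may offer its own way to restrict flag values, which is not shown here.

```ruby
# Stand-ins for the real strategy objects; anything that responds to
# format(index, task) could be stored here instead.
OUTPUT_FORMATS = {
  'csv'    => :csv_formatter,
  'pretty' => :pretty_formatter,
}

# Hypothetical helper: looks up a strategy by name and fails with a
# descriptive error when the name is not a known format.
def formatter_for(name, formats = OUTPUT_FORMATS)
  formats.fetch(name) do
    raise ArgumentError,
          "Unknown format '#{name}'; expected one of #{formats.keys.join('|')}"
  end
end

puts formatter_for('csv')
```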

Seeing the end state, it’s easy to think about how to implement our formatting strategies. Here’s how they ended up:

be_easy_to_maintain/todo/lib/todo/format/csv.rb

module Todo
  module Format
    class CSV
      def format(index,task)
        complete_flag = task.completed? ? "C" : "U"
        printf("%d,%s,%s,%s,%s\n",
               index,
               task.name,
               complete_flag,
               task.created_date,
               task.completed_date)
      end
    end
  end
end

be_easy_to_maintain/todo/lib/todo/format/pretty.rb

module Todo
  module Format
    class Pretty
      def format(index,task)
        printf("%2d - %s\n",index,task.name)
        printf(" %-10s %s\n","Created:",task.created_date)
        if task.completed?
          printf(" %-10s %s\n","Completed:",task.completed_date)
        end
      end
    end
  end
end
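The JSON strategy mentioned earlier is not shown in the listings above. What follows is one possible sketch of it, not the book’s implementation: the key names in the output, the to_json_line helper, and the ExampleTask stand-in are all assumptions made for illustration. By the file-naming convention it would live in lib/todo/format/json.rb.

```ruby
require 'json'

module Todo
  module Format
    class JSON
      # Same interface as the other strategies: format(index, task).
      def format(index, task)
        puts to_json_line(index, task)
      end

      # Hypothetical helper that builds the JSON text, kept separate
      # from printing so it can be unit tested.
      def to_json_line(index, task)
        attributes = {
          'index'          => index,
          'name'           => task.name,
          'completed'      => task.completed?,
          'created_date'   => task.created_date,
          'completed_date' => task.completed_date,
        }
        # Leading :: reaches the standard library's JSON module, since a
        # bare JSON inside this class refers to Todo::Format::JSON.
        ::JSON.generate(attributes)
      end
    end
  end
end

# Stand-in for Todo::Task so the sketch runs on its own.
ExampleTask = Struct.new(:name, :created_date, :completed_date) do
  def completed?
    !completed_date.nil?
  end
end

Todo::Format::JSON.new.format(1, ExampleTask.new('Buy milk', '2013-01-01', nil))
```

Because the class offers the same format method as CSV and Pretty, the list command’s action block would not need to change to use it.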

With our code redesigned to take advantage of some common design patterns, we’ve made the following programming tasks easier to do:

· Add new output formats

· Add new attributes of a task

· Isolate and fix bugs by unit testing

· Parse or format tasks in other parts of the codebase

All this adds up to fixing bugs and adding new features much more easily. In addition, since these patterns are well-known among developers, others will be able to more easily help you keep your apps working and up-to-date. These aren’t the only patterns that exist; there are countless ways of organizing your code to make it easier to work with. Ideally, this has given you some ideas about how to structure your code for maximum ease of development.

9.3 Moving On

We’ve learned two things in this chapter, both to help make our code easier to navigate, easier to understand, easier to enhance, and easier to work with. By putting code into the right place in our project, we can keep things organized and comprehensible. By applying design patterns to the structure of our code, we make it clear to ourselves and other developers how the application works. We’ve set ourselves up to quickly fix bugs and easily add new features.

You’re in the home stretch now and have learned everything you need to make awesome command-line applications. You’ve learned all the rules, so now it’s time to break some of them. Thus far, we’ve stuck to the “UNIX Way” of machine-parseable output and no interactive input. What if we want to do something fancier? We saw how Cucumber uses color to assist its users. You’ve likely used (and appreciated) a SQL client that formats its output in tabular form, as well as takes interactive input. How can we use these techniques in our apps? More importantly, should we?

In the next chapter, we’ll see how to add color, formatting, and interactive input to our apps, making them potentially more useful and even more fun! Of course, we’ll also learn when we should and when we should not apply these techniques.