Packaging and Distributing Code - THE RUBY WAY, Third Edition (2015)

Chapter 17. Packaging and Distributing Code

With aspirin leading the way, more and more products are coming out in fiercely protective packaging designed to prevent consumers from consuming them.

—Dave Barry

This chapter is all about being nice to developers who work on or use your code. If no one (including you) works on your code, or you don’t want to be nice to anyone (including yourself), feel free to skip this chapter.

Typical programmers don’t want to think about user documentation or installation procedures. I encourage you to think about both. Even if you are the only person who will ever see your code, just a few months is typically enough time to forget all the intricacies and details of the code you wrote. Always make things as easy as possible for other developers, because your future self will be one of them.

Being nice to other developers consists primarily of just a few things: making it easy to use the code, making it easy to work on the code, and making it easy to understand the code. You can use Rubygems, Bundler, and RDoc to do each of these things. We’ll look at how to use these tools in this chapter.

17.1 Libraries and Rubygems

Sharing Ruby code with other developers is amazingly easy: just give them a copy of the file with the code in it. They can require your file and start using your code. The problems arise when you want to update your code, or share it more broadly with the entire community. Rubygems (and rubygems.org) are the solution for those problems.

Rubygems has been included with Ruby itself since Ruby 1.9, and is central to Ruby code distribution. It allows Ruby developers to package their code up into a single compressed file, called a “gem.” Gems have filenames that end in the .gem extension, and they contain both the Ruby code that makes up the library in question and metadata about the library, including author and version information.

Rubygems.org is the community-funded host for public gems. This site creates a page for every gem that lists the gem’s authors and versions. If the gem’s author has provided them, links to the gem’s home page, source code, and online documentation will also be shown. Anyone can create a gem and host it on rubygems.org for others to use, for free.

17.1.1 Using Rubygems

The Rubygems executable is called gem. It has a set of subcommands, each with its own options. The most basic usage is quite simple. To install the rake gem, you would run gem install rake. Rubygems will look for a .gem file in the current directory, and then check rubygems.org. Once found, the gem will be installed so that any other Ruby code can use it. The latest version of the gem will be installed by default, but if you wanted to install rake version 10.0.1, you could run gem install rake -v 10.0.1. See gem help install for more options while installing gems.

Sometimes a gem will have dependencies on other gems. In that case, Rubygems will automatically search for and install those dependencies as well, leaving you with all the gems you need to use the gem you asked for.

How do you know what to ask for? The rubygems.org gem server lists every gem that has ever been created, and the names of all gems can be found there. Alternatively, you can look for gems on rubygems.org by name with the command gem list -r NAME.

However, neither of those options is helpful if you don’t already know the name of a gem. At ruby-toolbox.com, you can find an extremely handy directory of gems grouped together by what they do, rather than how they are named. The Ruby Toolbox also provides popularity and recent development information, which can be helpful when deciding between multiple gems with similar functionality.

After installing some gems, you can list all the gems you have installed with gem list. Predictably, you can also uninstall gems with gem uninstall NAME.
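The same information is available from inside Ruby. The following is a minimal sketch, assuming a helper named installed_versions (a name invented here for illustration) built on the Gem::Specification.find_all_by_name call that ships with Rubygems. What it prints depends entirely on which gems are installed on your machine.

```ruby
require "rubygems"

# Hypothetical helper: returns the installed versions of the named gem
# as strings, newest first. Returns an empty array if it is not installed.
def installed_versions(gem_name)
  Gem::Specification.find_all_by_name(gem_name)
                    .map(&:version)
                    .sort
                    .reverse
                    .map(&:to_s)
end

puts installed_versions("rake").inspect
```

This can be handy in a setup script that wants to warn about a missing gem before doing any real work.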

Now that you know how to find and install gems, let’s look at how to create your own.

17.1.2 Creating Gems

Creating a new gem simply requires creating a single file with the .gemspec extension and putting some specific code inside it. Here is a sample .gemspec file for a gem named drummer:

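# drummer.gemspec
Gem::Specification.new do |spec|
  # Gem names are conventionally lowercase; the name determines the
  # built filename, drummer-1.0.2.gem
  spec.name = "drummer"
  spec.version = "1.0.2"
  spec.authors = ["H. Thoreau"]
  spec.email = ["cabin@waldenpond.net"]
  spec.description = %q{A Ruby library for those who march to a different drummer.}
  spec.summary = %q{Drum different}
  spec.homepage = "http://waldenpond.com/drummer"
  spec.license = "MIT"

  # Files to package; executables are listed by name only, relative to bin/
  spec.files = Dir["./**/*"]
  spec.executables = Dir["./bin/*"].map { |path| File.basename(path) }
  spec.test_files = Dir["./spec/**/*"]
  spec.require_paths = ["lib"]

  # rake is needed only while developing the gem; activerecord is needed
  # whenever the gem is used
  spec.add_development_dependency "rake"
  spec.add_runtime_dependency "activerecord", "~> 4.1.0"
end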

Hopefully this .gemspec is more or less self-explanatory, but you don’t need to worry about the exact details just yet. After creating this .gemspec, it is possible to build a gem from it by running the command gem build drummer.gemspec. Then, to upload the gem to rubygems.org, just run gem push drummer-1.0.2.gem.
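Because a .gemspec is ordinary Ruby, you can build a specification object in memory and ask it questions before building anything. This is only a sketch: it repeats a few of the drummer fields and uses the full_name, file_name, and runtime_dependencies methods of Gem::Specification to show where the drummer-1.0.2.gem filename comes from.

```ruby
require "rubygems"

spec = Gem::Specification.new do |s|
  s.name    = "drummer"
  s.version = "1.0.2"
  s.summary = "Drum different"
  s.authors = ["H. Thoreau"]
  s.add_runtime_dependency "activerecord", "~> 4.1.0"
  s.add_development_dependency "rake"
end

# The gem's name and version joined together
puts spec.full_name
# The filename that gem build would write
puts spec.file_name
# Only the dependencies needed at runtime, not the development ones
spec.runtime_dependencies.each do |dep|
  puts "#{dep.name} (#{dep.requirement})"
end
```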

Covering Rubygems in detail would likely take an entire book, so we’ll stop here. For more information about Rubygems, including the gem command and .gemspec files, refer to the Rubygems guides at http://guides.rubygems.org.

17.2 Managing Dependencies with Bundler

Now that it’s easy to package and distribute individual libraries as gems, we have a new problem. Gems are designed to be small, focused libraries that are dedicated to a single task. Useful applications will probably need more than one gem. How do we keep track of all the gems needed for our larger project? What if we want to use a newer version of a gem we already use? The answer to these and other problems is Bundler, the Ruby language dependency manager.

Before Bundler, Ruby projects lacked a way to declare what gems they needed. Setting up a project on a new machine meant checking the readme for needed gems, hoping none were missing, and then hoping that those gems hadn’t changed to break anything since the readme was written. By keeping track of every gem and version used in your project, Bundler lets you install every gem the project needs with a single command.

It wasn’t just tracking gems that was hard before Bundler, though. Upgrading to new versions of gems might break your application if that gem turned out to depend on different versions of other gems that you already use. By analyzing every gem your project depends on at once, Bundler finds a set of versions for every gem that will all work together.
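To get a feel for the problem, consider two gems that both depend on the same third gem but ask for different versions of it. The sketch below is a toy, not Bundler's actual resolution algorithm: the version list and both requirements are made up, and compatible_versions is a helper invented for this illustration. It simply keeps the versions that satisfy every requirement at once.

```ruby
require "rubygems"

# Hypothetical data: the released versions of a shared dependency, and
# the requirement that each of two gems places on it.
available = %w[4.0.5 4.1.0 4.1.6 4.2.0].map { |v| Gem::Version.new(v) }
requirements = [
  Gem::Requirement.new("~> 4.1.0"), # wanted by one gem
  Gem::Requirement.new(">= 4.1.2")  # wanted by another
]

# Keep only the versions that every requirement accepts.
def compatible_versions(available, requirements)
  available.select do |version|
    requirements.all? { |req| req.satisfied_by?(version) }
  end
end

best = compatible_versions(available, requirements).max
puts best
```

A real project has many gems, each with its own requirements on many others, which is why doing this by hand stops being practical very quickly.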

Setting up a new project to use Bundler is very straightforward, and consists of creating a single file named Gemfile and listing needed gems inside it. A Gemfile must have at least one source call, though it may have more. Following that, there is typically a list of calls to the gem method. Here is a Gemfile containing made-up examples, illustrating the most common forms these calls may take:

source "https://rubygems.org" # Download gems from rubygems.org

gem 'red' # A dependency on a gem called "red"
gem 'green', '1.2.1' # Gem "green" - version 1.2.1 exactly
gem 'blue', '>= 1.0' # Version 1.0 or greater of gem "blue"
gem 'yellow', '~> 1.1' # "yellow" 1.1 or greater, less than 2.0
gem 'purple', '~> 1.1.1' # At least 1.1.1, but less than 1.2

The Gemfile exists to store a list of the gems required by your project, and which versions are allowed. In any project containing a Gemfile, simply run bundle install to install all the required gems.
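If you are ever unsure what a version requirement allows, you can ask Rubygems directly, because Bundler uses the same requirement syntax. This is a small sketch using Gem::Requirement and Gem::Version; the allows? helper is a name made up for this example.

```ruby
require "rubygems"

# Hypothetical helper: does this requirement string accept this version?
def allows?(requirement, version)
  Gem::Requirement.new(requirement).satisfied_by?(Gem::Version.new(version))
end

puts allows?("1.2.1", "1.2.1")    # an exact version
puts allows?(">= 1.0", "3.4")     # anything from 1.0 upward
puts allows?("~> 1.1", "1.9.2")   # still below 2.0
puts allows?("~> 1.1", "2.0")     # reaches 2.0
puts allows?("~> 1.1.1", "1.1.9") # still below 1.2
puts allows?("~> 1.1.1", "1.2.0") # reaches 1.2
```

The ~> operator lets only the last number given in the requirement grow, which is why "~> 1.1" and "~> 1.1.1" behave so differently.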

If it did not already exist, the Gemfile.lock file will also be created. In contrast to the Gemfile, the lock stores the names and exact installed versions of every required gem, including the dependencies of your gems, the dependencies of those dependencies, and so on.

Add both Gemfile and Gemfile.lock to your project’s source control repository. This allows other developers to install all your project dependencies at once, at exactly the versions you are using, by running bundle install.

Because each project has its own version of each gem, it is often the case that two projects will use different versions of common gems such as rake and rspec. Bundler can restrict your commands to only use the correct gem version for your current project, but it must load first in order to do that.

To run a gem command using the correct version for a given project, prepend the gem command with bundle exec. So, for example, instead of running rake spec, you would run bundle exec rake spec. This guarantees that the version of rake that is used will be the version that the project requires.

Because it quickly gets tedious to type bundle exec all the time, it can be helpful to add a shell alias to a shorter command, such as b. Alternatively, Bundler can create project-specific executables in bin, such as bin/rake. For more information, see Bundler’s help about binstubs. Another trick is simply to do bundle exec bash to start a new shell with the proper environment set up.

If your application is simply a Ruby file, instead of an executable, it will need to require Bundler before any other gems are required. The most straightforward way to do this is with a single line:

require 'bundler/setup'
# other libraries can be required here...

Rails applications also call Bundler.require, in the config/application.rb file (config/boot.rb only requires bundler/setup). That means that every gem listed in the Gemfile will also be required automatically. Requiring many gems can be very slow, and is a common reason for applications to take a long time to start. The authors of Bundler recommend avoiding Bundler.require and adding require statements only where they are actually needed instead.
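One way to require a library only where it is needed is to put the require inside the method that uses it. The sketch below assumes a made-up parse_settings method and uses the json standard library as a stand-in for a slower gem. Calling require more than once is harmless, because Ruby loads each file only the first time.

```ruby
# Hypothetical helper: JSON is loaded the first time this method runs,
# not when the application starts.
def parse_settings(text)
  require "json"
  JSON.parse(text)
end

settings = parse_settings('{"tempo": 120}')
puts settings["tempo"]
```

The trade-off is that the loading cost moves to the first call, so this suits code paths that many runs of the program never reach.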

For the latest and most detailed information about Bundler, refer to the Bundler documentation website at http://bundler.io.

17.2.1 Semantic Versioning

At this point, you might be wondering why the Gemfile allows you to specify a range of versions, such as < 2.5 or >= 3.1.4. Version requirements, as these are called, allow you to update gems without having to edit your Gemfile. In an application whose Gemfile contains the line gem "drummer", "~> 1.0", you can upgrade from drummer 1.0.2 to the newly released 1.0.3 by running bundle update drummer.

Updating gems can be dangerous, though—what if the new version of the gem works differently? Your application might break. To mitigate that problem, many gem developers adhere to a system for version numbers called “semantic versioning.” It sounds a bit complicated, but it boils down to three principles, one for each number in a gem version.

The first number, known as the “major” version, indicates that something fundamental has changed. If the major version increases, don’t expect to be able to keep using that gem without changing your own code. Likewise, if you make a change in your own gem that means current uses will break, always increment the major version.

The “minor” version is the second number in a gem version, and it indicates that new features have been added, but existing features have not changed. The gem will continue to provide exactly what it did before, and something new is also available that wasn’t there before.

Last is the “tiny” version (called the “patch” version in the semantic versioning specification), which comes third in a gem version. Sometimes it is omitted if it is zero, as in a version such as 2.2. When the tiny version increases, a bug has been fixed. Tiny updates should be the lowest-risk updates because they contain no new features and no breaking changes.
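The three principles can be turned into a quick check of how risky an upgrade is likely to be. This is a rough sketch, assuming a helper named change_kind (invented here) and plain numeric versions; it ignores prerelease versions such as 2.0.0.beta1, and it only tells you what the version numbers promise, not what the code actually does.

```ruby
require "rubygems"

# Hypothetical helper: reports which part of the version number changed
# first, reading left to right.
def change_kind(old_version, new_version)
  old_parts = Gem::Version.new(old_version).segments
  new_parts = Gem::Version.new(new_version).segments
  # Pad omitted trailing numbers with zero, so "2.2" is treated as 2.2.0.
  old_major, old_minor, old_tiny = (old_parts + [0, 0]).first(3)
  new_major, new_minor, new_tiny = (new_parts + [0, 0]).first(3)

  return :major if new_major != old_major
  return :minor if new_minor != old_minor
  return :tiny  if new_tiny != old_tiny
  :none
end

puts change_kind("1.0.2", "1.0.3")
puts change_kind("1.0.3", "1.1.0")
puts change_kind("1.4.0", "2.0")
```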

Semantic versioning is generally accepted as the most helpful system for version numbering. Even Ruby itself uses semantic versioning starting with 2.1.0. Keep in mind, however, that not all gems use this versioning scheme.

Although semantic versioning provides a strong indication of how much a gem has changed between versions, it isn’t a perfect or complete fix to the problem of upgrading gems. Always write tests for code that needs to work correctly, and always run the tests after updating gems. That said, careful use of version requirements and bundle update can make updating gems, well, not exactly painless, but a much smaller hassle.

17.2.2 Dependencies from Git

In addition to managing gems and their versions, Bundler allows you to use gems directly from the git repos containing their source code. Because git allows you to create your own copy of a repo, Bundler makes it trivial to fork a gem repo, fix a bug, and test out your fix. Then, you can use the fixed version from your git repo while waiting for the gem owner to accept your fix and release a new version to rubygems.org.

The following Gemfile tells Bundler that you need the gem named "rake" and that you want to use it directly from Jim Weirich’s git repository on GitHub:

gem "rake", git: "https://github.com/jimweirich/rake.git",
            branch: "master"

Running bundle install will clone the git repository from the given URL and check out the given branch or commit.

17.2.3 Creating Gems with Bundler

Although Bundler exists primarily to manage installing gems, it provides a very handy set of shortcuts that build on Rubygems to make creating and releasing gems as easy as possible.

To create our “drummer” gem with Bundler, we can simply run bundle gem drummer. This will create a drummer directory, with a ruby source file ready for our code, and a .gemspec file ready for our name and description. Releasing the gem is likewise made easier. The version of the gem is stored in the file lib/drummer/version.rb. After we update that file, releasing the gem to rubygems.org is as simple as running rake release.
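The version file is nothing more than a module holding a constant, which the .gemspec then reads so that the number lives in exactly one place. The following is a sketch of that arrangement, assuming the drummer gem from earlier in this chapter. Both pieces are shown in one listing so it can be run as is, whereas in a real gem the .gemspec would require the version file.

```ruby
require "rubygems"

# What lib/drummer/version.rb typically contains
module Drummer
  VERSION = "1.0.2"
end

# The .gemspec reads the constant instead of repeating the number
spec = Gem::Specification.new do |s|
  s.name    = "drummer"
  s.version = Drummer::VERSION
  s.summary = "Drum different"
  s.authors = ["H. Thoreau"]
end

puts spec.version
```

Releasing a new version then means editing one constant, with no risk of the gem's code and its packaging disagreeing about the version number.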

As you develop your own code, be aware of code that gets reused and be ready to extract it to a gem. It’s easy to create a gem this way and to start using it as a private gem (which we’ll talk about in the next section). After a while, once those private gems have proven themselves useful, release them to rubygems.org if you can. The number of open-source, freely available gems that exist today is one of the greatest strengths of the Ruby community.

17.2.4 Private Gems

Once a project or company has grown for long enough, it’s extremely likely that private, internal gems will emerge. These gems contain code that is shared across applications or services but is not for public consumption. Although it is better to extract reusable libraries from private code and open source them whenever possible, private gems are practically guaranteed on any project that is large enough.

Initially, private gems can be managed using git gems, because git repositories can easily be made private. Only users authorized for access to the repository will be able to install those gems. Once there are more than a few private gems, however, it’s time to use a private gem server. Services such as Gemnasium provide private gem servers for a few dollars a month, whereas projects such as Geminabox allow you to install your own gem server wherever you like.

Now that you know how to create and use gems, let’s look at how to make them understandable to programmers who look at them later. It’s time to talk about documentation.

17.3 Using RDoc

The RDoc tool, created by Dave Thomas and maintained by Eric Hodel, is included with Ruby. It takes Ruby code as input, and it produces a doc directory containing HTML that documents the Ruby classes, methods, constants, and so on from the source files.

One great thing about RDoc is that it tries to produce useful output even if there are no comments in the source. It does this by parsing the code and organizing information on all the classes, modules, constants, methods, and so on. Therefore, you can get reasonably useful HTML out of a program source that doesn’t even have any real internal documentation. If you haven’t done this before, I suggest you try it.

But it gets better. RDoc also tries to associate the comments it finds with specific parts of the program. The general rule is that a block comment preceding a definition (such as a class or method) will be taken as the description of that definition.

If you simply invoke RDoc on a Ruby source, it will create a doc directory with all the files under it. (The default template looks good, but there are also others.) Browse to index.html and take a look.

Listing 17.1 shows a simple (nearly empty) source file. All the methods are empty, and none of it really does anything. But RDoc will still take it and produce a pretty doc page (see Figure 17.1).

Listing 17.1 A Simple Source File


require 'foo'

# The outer class, MyClass
class MyClass
  CONST = 237

  # The inner MyClass::Alpha class...
  class Alpha

    # The MyClass::Alpha::Beta class...
    class Beta
      # Beta's mymeth1 method
      def mymeth1
      end
    end

    # Alpha's mymeth2 method
    def mymeth2
    end
  end

  # Initialize the object
  def initialize(a,b,c)
  end

  # Create an object with default values
  def self.create
  end

  # An instance method
  def do_something
  end
end


Figure 17.1 RDoc output from the source code in Listing 17.1

We’ll discuss two other useful features in this section. The documentation for every method can also display the source code of that method. This is an absolutely invaluable tool in learning a library; the API documentation is linked directly back to the code itself.

Also be aware that when RDoc recognizes a URL, it places a hyperlink in the output. The text of the link defaults to be the same as the URL, but you can change this. If you put descriptive text in braces followed by a URL in brackets (that is, {descriptive text}[myurl]), your descriptive text will be used in the link. If the text is a single word, the braces may be omitted.

17.3.1 Simple Markup

If you want to add more elaborate rich-text documentation, RDoc supports a simple markup format that allows formatting the HTML that it will generate. The markup is designed to be straightforward and easily readable when the code is being edited, but translate into HTML formatting in a clear-cut manner.

Listing 17.2 shows a few examples of markup capability; for more examples, consult Programming Ruby or the RDoc API documentation online. The output (bottom frame only) from Listing 17.2 is shown in Figure 17.2.

Listing 17.2 Examples of RDoc Markup


# This block comment will not appear in the output.
# Rdoc only processes a single block comment before
# each piece of code. The empty line after this
# block separates it from the block below.

# This block comment will be detected and
# included in the rdoc output.
#
# Here are some formatting tricks.
#
# Boldface, italics, and "code" (without spaces):
# This is *bold*, this is _italic_, and this is +code+.
#
# With spaces:
#
# This is <b>a bold phrase</b>. Have you read <i>Intruder
# in the Dust</i>? Don't forget to require <tt>thread</tt>
# at the top.
#
# = First level heading
# == Second level heading
# === Third level heading
#
# Here's a horizontal rule:
# ---
#
# Here's a list:
# - item one
# - item two
# - item three

class MarkupDocumentation
  # This block will not appear because the class after
  # it has been marked with a :nodoc: directive.
  class NotDocumented # :nodoc:
  end
end

# This block comment will not show up in the output.
# RDoc only processes blocks immediately before code,
# and this comment block is after the only code in this
# listing.


Figure 17.2 RDoc output from markup examples in Listing 17.2

Listing 17.2 outlines some of the rules RDoc uses to parse comments. Not all of these are necessarily intuitive. There is a strong tendency for blank lines to terminate a section of comments, even if the blank line is immediately followed by another block comment.
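When you are unsure how a piece of markup will come out, you can convert it without generating a whole doc directory. The sketch below assumes the rdoc library that ships with Ruby and uses its RDoc::Markup::ToHtml formatter. The sample text is made up, and the exact HTML tags produced can vary between RDoc versions.

```ruby
require "rdoc"

# A made-up snippet using a heading, inline styles, and a list
markup = <<~TEXT
  = Drummer

  This is *bold*, this is _italic_, and this is +code+.

  - item one
  - item two
TEXT

# Convert the markup text to an HTML fragment and print it
html = RDoc::Markup::ToHtml.new(RDoc::Options.new).convert(markup)
puts html
```

Note that the text passed in here has no leading # characters; RDoc strips those from comments before it interprets the markup.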

Note that inside a block comment starting with #, we can “turn off” the copying of text into the output with a #-- line (and turn it back on again with a #++ line). Not all comments are intended to go into the user docs, after all.

Finally, some tags can be used inside block comments to alter the RDoc HTML output. Here are some of them:

:include:—Used to include the contents of the specified file in the documentation. Indentation will be adjusted for consistency.

:title:—Used to set the title of the document.

:main:—Used to set the initial page displayed in the output.

17.3.2 Advanced Documentation with YARD

Although RDoc is wonderful (and included with Ruby itself), another tool provides more advanced documentation capabilities: Yay! A Ruby Documentation tool, also known as YARD. The main advantages of YARD over RDoc are that it provides a live preview of the documentation as you write it, and that it allows detailed documentation of the type and purpose of each argument and return value for every method. See Figure 17.3 for an example of YARD’s output when run against Listing 17.1. You can find out more about YARD at http://yardoc.org.
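YARD describes arguments and return values with tags written inside ordinary comments. Here is a brief sketch using the @param, @return, and @raise tags. The seconds_per_beat method is made up for this example, and the tags are only documentation: Ruby itself does not check the types they name.

```ruby
# Converts a tempo into the length of one beat.
#
# @param tempo [Integer] beats per minute; must be positive
# @return [Float] the number of seconds each beat lasts
# @raise [ArgumentError] if tempo is zero or negative
def seconds_per_beat(tempo)
  raise ArgumentError, "tempo must be positive" unless tempo.positive?
  60.0 / tempo
end

puts seconds_per_beat(120)
```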

Figure 17.3 YARD output from the source code in Listing 17.1

17.4 Conclusion

In this chapter, we’ve looked at the Ruby library system, Rubygems, and the Ruby dependency manager, Bundler. We’ve looked at finding gems, creating gems, and publishing gems for other developers to use. We’ve also discussed the basics of how to document a project using RDoc or YARD. In the next chapter, we’ll shift gears again and talk about an interesting and complex problem domain: network programming.