Deploying with Capistrano - Reliably Deploying Rails Applications: Hassle free provisioning, reliable deployment (2014)

15.0 - Deploying with Capistrano

Overview

In the first part of this book, we covered setting up a VPS ready to run a Rails application. In this part we’ll cover the process of deploying one or more apps to this VPS.

Capistrano

Capistrano is a Ruby gem which provides a framework for automating tasks related to deploying a Ruby-based application, in our case a Rails app, to a remote server. These include tasks like checking out the code from a git repository onto the remote server and integrating stage-specific configuration files into our app each time we deploy.

Capistrano 2 or 3

Choosing between them

Capistrano is the de facto standard for deploying Rails applications. The majority of tutorials and existing documents focus on Capistrano 2; in particular, there are several excellent Railscasts on using Capistrano 2 to deploy to a VPS, which many apps’ deployment processes are based on.

Capistrano 3 was recently released and, having recently migrated the deployment process for several large client applications from version 2 to version 3, I’m confident that the version 3 rewrite introduces some substantial improvements.

As a result, this book will focus on the use of Capistrano V3. If for any reason you need to use V2, the approach outlined in this blog post of mine: http://www.talkingquickly.co.uk/2013/11/deploying-multiple-rails-apps-to-a-single-vps/ serves as a starting point with sample code.

What’s new in V3

For full details see the release announcement, but the key bits I think make the upgrade worthwhile are:

· It uses the Rake DSL instead of a specialised Capistrano one; this makes writing Capistrano tasks exactly like writing rake tasks, something most Rails developers have some familiarity with.

· It uses SSHKit for lower level functions around connecting and interacting with remote machines. This makes writing convenience tasks, which do things like streaming logs or checking processes, much easier.

Stages

It’s normal for production applications to have several ‘stages’. In general, you would have, at a minimum, a staging environment and a production environment.

The staging environment is a copy of the production environment which uses dummy data (often copied from production) and can be used to test changes before they are deployed to production.

In this section we’ll assume you have one production and one staging configuration, each using a VPS configured using the instructions in the previous section.

Upgrading from V2

If you already have a Capistrano 2 configuration for the application to be deployed, I suggest you archive all of this off and start from scratch with Capistrano 3. In general this will mean renaming (for example by appending .old) all of the following:

Capfile
config/deploy.rb
config/deploy/

Adding Capistrano to an application

Add the following to your Gemfile:

gem 'capistrano', '~> 3.1.0'

# Rails specific Capistrano functions
gem 'capistrano-rails', '~> 1.1.0'

# integrate Bundler with Capistrano
gem 'capistrano-bundler'

# if you are using rbenv
gem 'capistrano-rbenv', "~> 2.0"

As you can see, Capistrano 3 splits out a lot of app specific functionality into separate gems. This increased focus on modularity is a theme throughout the version 3 rewrite.

Then run bundle install if you’re adding Capistrano 3 for the first time or bundle update capistrano if you’re upgrading. You may need to do some of the usual Gemfile juggling if you’re updating and there are dependency conflicts.

Installation

Assuming you’ve archived off any legacy Capistrano configurations, you can now run:

bundle exec cap install

Which generates the following files and directory structure:

├── Capfile
├── config
│   ├── deploy
│   │   ├── production.rb
│   │   └── staging.rb
│   └── deploy.rb
└── lib
    └── capistrano
        └── tasks

The source for the suggested starting configuration is available at https://github.com/TalkingQuickly/capistrano-3-rails-template. I suggest cloning this repository and copying these files into your project as you work through this section.

Capistrano 3 is Rake

This book provides a fully working Capistrano configuration which should work out of the box when used with the VPS configuration from the previous section. An understanding of how Capistrano is structured does however make a lot of operations much easier to understand so we’ll look at it in brief here.

Capistrano 3 is structured as a Rake Application. This means that in general, working with Capistrano is like working with Rake but with additional functionality specific to deployment work flows.

This is particularly interesting when we realise that the file Capfile generated in the root of the project is just a Rakefile. This makes understanding what Capistrano is doing under the hood much easier - and removes a lot of the magic feeling which makes me uncomfortable about many deploy scripts.

Therefore, when we talk about Capistrano tasks, we know they’re just Rake tasks with access to the Capistrano deployment-specific DSL. In practice, a lot of this deployment-specific DSL is made up of wrappers around SSHKit.
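Since Capistrano tasks are ordinary Rake tasks, the plain Rake DSL is a useful mental model. The following is a minimal sketch that uses only the rake gem, not Capistrano itself. The module name, the sketch namespace, the task names and the LOG array are all hypothetical and exist purely for illustration.

```ruby
require "rake"

# Hypothetical module, used only to keep the example self-contained.
module RakeSketch
  extend Rake::DSL

  # Records the order in which tasks run so it can be inspected afterwards.
  LOG = []

  namespace :sketch do
    desc "Pretend to check out the code"
    task :checkout do
      LOG << "sketch:checkout"
    end

    desc "Pretend to restart the app; depends on :checkout"
    task restart: :checkout do
      LOG << "sketch:restart"
    end
  end
end

# Invoking a task runs its prerequisites first, as it would from a Rakefile.
Rake::Task["sketch:restart"].invoke
puts RakeSketch::LOG.inspect
```

A task defined in a .cap file under lib/capistrano/tasks has the same namespace and task shape, with the Capistrano DSL available inside the task body.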

The Capfile

We discussed above that the Capfile is essentially just a Rakefile. The example configuration Capfile looks like this:

# Load DSL and Setup Up Stages
require 'capistrano/setup'

# Includes default deployment tasks
require 'capistrano/deploy'

# Includes tasks from other gems included in your Gemfile
#
# For documentation on these, see for example:
#
# https://github.com/capistrano/rvm
# https://github.com/capistrano/rbenv
# https://github.com/capistrano/chruby
# https://github.com/capistrano/bundler
# https://github.com/capistrano/rails/tree/master/assets
# https://github.com/capistrano/rails/tree/master/migrations
#
# require 'capistrano/rvm'
require 'capistrano/rbenv'
# require 'capistrano/chruby'
require 'capistrano/bundler'
# require 'sidekiq/capistrano'
# require 'capistrano/rails/assets'
require 'capistrano/rails/migrations'

# Loads custom tasks from `lib/capistrano/tasks' if you have any defined.
Dir.glob('lib/capistrano/tasks/*.cap').each { |r| import r }
Dir.glob('lib/capistrano/**/*.rb').each { |r| import r }

Knowing that this is just a kind of Rakefile we can see that it’s simply requiring task definitions, initially from Capistrano itself and then from other gems which are intended to add functionality.

It then goes on to include any application specific tasks defined in lib/capistrano/tasks.

The final line:

Dir.glob('lib/capistrano/**/*.rb').each { |r| import r }

is non-standard. This allows us to include arbitrary ruby files in the lib/capistrano/ directory which can be used to define helper methods for the tasks.

If we were using Sidekiq we could simply uncomment the Sidekiq require entry and this would include the tasks the Sidekiq developers include for starting and stopping workers. This is a common pattern; many gems which require specific actions on deployment will provide pre-defined Capistrano tasks we can simply include in this manner.

Common configuration

When the Capfile requires capistrano/setup, this:

· Iterates over the stages defined in config/deploy/

· For each stage, loads the configuration defined in config/deploy.rb

· For each stage, loads the stage specific configuration defined in config/deploy/stage_name.rb

The approach in this book is to keep as much common configuration in config/deploy.rb as possible, with only minimal stage specific configuration in the stage files (e.g. config/deploy/production.rb).

The deploy.rb from the sample configuration looks like this:

set :application, 'app_name'
set :deploy_user, 'deploy'

# setup repo details
set :scm, :git
set :repo_url, 'git@github.com:username/repo.git'

# setup rbenv.
set :rbenv_type, :system
set :rbenv_ruby, '2.1.1'
set :rbenv_prefix, "RBENV_ROOT=#{fetch(:rbenv_path)} RBENV_VERSION=#{fetch(:rbenv_ruby)} #{fetch(:rbenv_path)}/bin/rbenv exec"
set :rbenv_map_bins, %w{rake gem bundle ruby rails}

# how many old releases do we want to keep
set :keep_releases, 5

# files we want symlinking to specific entries in shared.
set :linked_files, %w{config/database.yml}

# dirs we want symlinking to shared
set :linked_dirs, %w{bin log tmp/pids tmp/cache tmp/sockets vendor/bundle public/system}

# what specs should be run before deployment is allowed to
# continue, see lib/capistrano/tasks/run_tests.cap
set :tests, []

# which config files should be copied by deploy:setup_config
# see documentation in lib/capistrano/tasks/setup_config.cap
# for details of operations
set(:config_files, %w(
  nginx.conf
  database.example.yml
  log_rotation
  monit
  unicorn.rb
  unicorn_init.sh
))

# which config files should be made executable after copying
# by deploy:setup_config
set(:executable_config_files, %w(
  unicorn_init.sh
))

# files which need to be symlinked to other parts of the
# filesystem. For example nginx virtualhosts, log rotation
# init scripts etc.
set(:symlinks, [
  {
    source: "nginx.conf",
    link: "/etc/nginx/sites-enabled/#{fetch(:full_app_name)}"
  },
  {
    source: "unicorn_init.sh",
    link: "/etc/init.d/unicorn_#{fetch(:full_app_name)}"
  },
  {
    source: "log_rotation",
    link: "/etc/logrotate.d/#{fetch(:full_app_name)}"
  },
  {
    source: "monit",
    link: "/etc/monit/conf.d/#{fetch(:full_app_name)}.conf"
  }
])

# this:
# http://www.capistranorb.com/documentation/getting-started/flow/
# is worth reading for a quick overview of what tasks are called
# and when for `cap stage deploy`

namespace :deploy do
  # make sure we're deploying what we think we're deploying
  before :deploy, "deploy:check_revision"
  # only allow a deploy with passing tests to be deployed
  before :deploy, "deploy:run_tests"
  # compile assets locally then rsync
  after 'deploy:symlink:shared', 'deploy:compile_assets_locally'
  after :finishing, 'deploy:cleanup'

  # remove the default nginx configuration as it will tend
  # to conflict with our configs.
  before 'deploy:setup_config', 'nginx:remove_default_vhost'

  # reload nginx so it will pick up any modified vhosts from
  # setup_config
  after 'deploy:setup_config', 'nginx:reload'

  # Restart monit so it will pick up any monit configurations
  # we've added
  after 'deploy:setup_config', 'monit:restart'

  # As of Capistrano 3.1, the `deploy:restart` task is not called
  # automatically.
  after 'deploy:publishing', 'deploy:restart'
end

When setting variables which are to be used across Capistrano tasks we use the set and fetch methods provided by Capistrano. Internally we’re setting and retrieving values in a hash maintained by Capistrano but in general we don’t need to worry about this, just that we set a configuration value in deploy.rb and in our stage files with:

set :key_name, "value"

And retrieve it with:

fetch :key_name

The key variables to set in deploy.rb are application, repo_url and rbenv_ruby. The Rbenv Ruby you set must match one installed with Rbenv on the machine you’re deploying to, otherwise the deploy will fail.
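To make the set and fetch idea concrete, here is a deliberately simplified stand-in for a hash-backed configuration store. It is not Capistrano's implementation; the SketchConfig class is hypothetical and only mirrors the behaviour described above, plus an optional default for unset keys.

```ruby
# A simplified, hypothetical stand-in for a hash-backed configuration store.
class SketchConfig
  def initialize
    @values = {}
  end

  # Store a value under a symbol key and return it.
  def set(key, value)
    @values[key] = value
  end

  # Retrieve a value, falling back to a default when the key is unset.
  def fetch(key, default = nil)
    @values.fetch(key, default)
  end
end

config = SketchConfig.new
config.set :application, "app_name"
config.set :rbenv_ruby, "2.1.1"

puts config.fetch(:application)
puts config.fetch(:missing_key, "fallback")
```

The useful property is that values set in deploy.rb or a stage file can be read back by any task under the same key.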

Running tests

When making small changes to an application, it’s easy to forget to run the test suite prior to deploying, only realising there’s a problem when some ‘unrelated’ feature doesn’t work. The lines below in deploy.rb allow you to select particular RSpec specs which must pass before a deploy will be allowed to continue.

# what specs should be run before deployment is allowed to
# continue, see lib/capistrano/tasks/run_tests.cap
set :tests, []

So, for example, if we were to add “spec” to the above array:

set :tests, ["spec"]

The command rspec spec would be run before deploying and the deploy would only be allowed to continue if there were no failures.

If you already have a full-blown continuous integration system set up (or don’t want to run specs at all), this can be left as an empty array.
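The gating logic can be sketched in a few lines of plain Ruby. This is an illustration of the idea rather than the contents of run_tests.cap: the method name run_required_specs and the runner parameter are hypothetical, and the runner exists only so the logic can be exercised without RSpec installed.

```ruby
# Runs `rspec <path>` for each entry and stops at the first failure.
# `runner` is a hypothetical injection point; by default it shells out
# with Kernel#system, which returns true only for a zero exit status.
def run_required_specs(tests, runner: ->(command) { system(command) })
  tests.each do |path|
    command = "rspec #{path}"
    return false unless runner.call(command)
  end
  true
end

# Example with a stubbed runner that records commands and always succeeds.
recorded = []
allowed = run_required_specs(["spec"], runner: ->(cmd) { recorded << cmd; true })
puts recorded.inspect
puts allowed
```

An empty array never calls the runner and returns true, which matches the behaviour described above for leaving :tests empty.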

Hooks

The final section of deploy.rb looks like this:

namespace :deploy do
  # make sure we're deploying what we think we're deploying
  before :deploy, "deploy:check_revision"
  # only allow a deploy with passing tests to be deployed
  before :deploy, "deploy:run_tests"
  # compile assets locally then rsync
  after 'deploy:symlink:shared', 'deploy:compile_assets_locally'
  after :finishing, 'deploy:cleanup'

  # remove the default nginx configuration as it will tend
  # to conflict with our configs.
  before 'deploy:setup_config', 'nginx:remove_default_vhost'

  # reload nginx so it will pick up any modified vhosts from
  # setup_config
  after 'deploy:setup_config', 'nginx:reload'

  # Restart monit so it will pick up any monit configurations
  # we've added
  after 'deploy:setup_config', 'monit:restart'

  # As of Capistrano 3.1, the `deploy:restart` task is not called
  # automatically.
  after 'deploy:publishing', 'deploy:restart'
end

Capistrano works by calling tasks in a particular sequence. These are usually a mixture of internally defined tasks (such as those which check out the source code from version control) and custom tasks such as the run tests task documented above.

If we want our custom tasks to be run automatically as part of a Capistrano work flow such as deploy then we use before and after hooks. So, for example, the following:

before :deploy, "deploy:run_tests"

tells Capistrano that before the task called deploy is invoked, it should invoke the task deploy:run_tests. Using this methodology we can completely automate all steps required to deploy our application. We’ll cover how to write custom tasks in section 15.1.

It’s worth taking a look at http://www.capistranorb.com/documentation/getting-started/flow/ to understand the internal task ordering for a typical deploy.
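The effect of before and after hooks can be reproduced with plain Rake, which may help when reasoning about ordering. The sketch below uses Rake's Task#enhance to add a prerequisite (a "before") and an extra action (an "after"). I believe Capistrano's helpers build on the same mechanism, but treat that as an assumption and check the Capistrano source if it matters. The HookSketch module, the hook_sketch namespace and the task names are hypothetical.

```ruby
require "rake"

# Hypothetical module, used only to keep the example self-contained.
module HookSketch
  extend Rake::DSL

  # Records the order in which task bodies run.
  ORDER = []

  namespace :hook_sketch do
    task(:run_tests) { ORDER << "run_tests" }
    task(:deploy)    { ORDER << "deploy" }
    task(:cleanup)   { ORDER << "cleanup" }
  end

  # "before": make run_tests a prerequisite of deploy.
  Rake::Task["hook_sketch:deploy"].enhance(["hook_sketch:run_tests"])

  # "after": append an action to deploy which invokes cleanup.
  Rake::Task["hook_sketch:deploy"].enhance do
    Rake::Task["hook_sketch:cleanup"].invoke
  end
end

Rake::Task["hook_sketch:deploy"].invoke
puts HookSketch::ORDER.inspect
```

Prerequisites run before the task's own actions, and added actions run after the original one, so the recorded order is run_tests, deploy, cleanup.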

Setting up stages

A stage is a single standalone environment that an application runs in. At a minimum, a production application will generally have a staging environment, for testing new changes, in addition to the main production environment.

These map - although not necessarily one to one - to the “environments” which Rails provides. In general, the only stage which will have its “environment” set to production is the live production configuration. All other remote environments generally use staging.

Ideally, the staging server would be an identical copy of the production one in order to minimise the chance of there being an error case which exists in production but does not show up in staging. In practice it’s often not cost effective to mirror the production environment completely; a common compromise is a lower spec’d VPS for staging, provisioned using exactly the same Chef configuration as production.

Stages are defined in config/deploy/. We invoke Capistrano tasks in the format:

cap stage_name task

Where stage_name is the name of a .rb file in config/deploy. This means we are not limited to just a staging and a production stage, we can define as many arbitrarily named stages as needed.
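The filename-to-stage mapping is easy to picture in plain Ruby. The sketch below is not how Capistrano discovers stages internally (treat the approach as an assumption); it only shows that each .rb file in a deploy directory implies one stage name. The stage_names method and the demo stage are hypothetical.

```ruby
require "tmpdir"
require "fileutils"

# Returns the stage names implied by the .rb files in a deploy directory.
def stage_names(deploy_dir)
  Dir.glob(File.join(deploy_dir, "*.rb"))
     .map { |file| File.basename(file, ".rb") }
     .sort
end

# Build a throwaway config/deploy directory with three stage files.
Dir.mktmpdir do |root|
  deploy_dir = File.join(root, "config", "deploy")
  FileUtils.mkdir_p(deploy_dir)
  %w[production staging demo].each do |stage|
    FileUtils.touch(File.join(deploy_dir, "#{stage}.rb"))
  end
  puts stage_names(deploy_dir).inspect
end
```

With a demo.rb stage file in place, tasks for that stage would be invoked as cap demo task.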

The Production Stage

In the sample configuration, the production stage (defined in production.rb) looks like this:

# this should match the filename. E.g. if this is production.rb,
# this should be :production
set :stage, :production
set :branch, "master"

# This is used in the Nginx VirtualHost to specify which domains
# the app should appear on. If you don't yet have DNS setup, you'll
# need to create entries in your local Hosts file for testing.
set :server_name, "www.example.com example.com"

# used in case we're deploying multiple versions of the same
# app side by side. Also provides quick sanity checks when looking
# at filepaths
set :full_app_name, "#{fetch(:application)}_#{fetch(:stage)}"

server 'example.com', user: 'deploy', roles: %w{web app db}, primary: true

set :deploy_to, "/home/#{fetch(:deploy_user)}/apps/#{fetch(:full_app_name)}"

# don't try and infer something as important as environment from
# stage name.
set :rails_env, :production

# number of unicorn workers, this will be reflected in
# the unicorn.rb and the Monit configurations
set :unicorn_worker_count, 5

# whether we're using SSL or not, used for building Nginx
# config file
set :enable_ssl, false

The most important values to update are the address of the server and the git branch to be deployed from.
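It can help to see what the interpolated settings evaluate to. The sketch below uses a plain Ruby hash in place of Capistrano's configuration (an assumption made so the example is self-contained), with the sample values of application, deploy_user and stage.

```ruby
# A plain hash standing in for the Capistrano configuration.
settings = {
  application: "app_name",
  deploy_user: "deploy",
  stage: :production
}

# Same interpolation as the sample production.rb; symbols interpolate
# as their names, so :production becomes "production".
settings[:full_app_name] = "#{settings[:application]}_#{settings[:stage]}"
settings[:deploy_to] = "/home/#{settings[:deploy_user]}/apps/#{settings[:full_app_name]}"

puts settings[:full_app_name] # app_name_production
puts settings[:deploy_to]     # /home/deploy/apps/app_name_production
```

Because the stage name is part of full_app_name, a staging and a production copy of the same app get separate directories under the deploy user's apps folder.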

We’ll look at Unicorn configuration in more detail in section 16. To begin with, I suggest setting unicorn_worker_count to two and then tuning it to suit your application once deployment is working smoothly.

To start, keep enable_ssl set to false; this is covered in section 17.

Generating Remote Configuration Files

Capistrano uses a folder called shared to manage files and directories that should persist across releases. The key folder is shared/config which should contain configuration files that should persist across deploys.

Let’s take, as an example, the traditional database.yml file that ActiveRecord uses to determine the database and credentials required for accessing the database for the current environment.

We do not want to keep this file in version control since our production database details would be available to anyone who had access to the repository.

With Capistrano 3 we create a database.yml file in shared/config and the following:

# files we want symlinking to specific entries in shared.
set :linked_files, %w{config/database.yml}

in deploy.rb means that, after every deploy, the files listed in the array (remember %w{items} is just shorthand for creating an array of string literals) will be automatically symlinked to corresponding files in shared.

Therefore, after our code is copied to the remote server the file config/database.yml will be changed to be a symlink which points to shared/config/database.yml.
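The resulting layout can be reproduced locally with Ruby's FileUtils. This is only an illustration of the symlink relationship, not Capistrano's own linking code. The link_to_shared method and the release directory name are hypothetical.

```ruby
require "tmpdir"
require "fileutils"

# Links <release>/<relative_path> to <shared>/<relative_path>, replacing
# any file already at that location in the release. Returns the link path.
def link_to_shared(release_path, shared_path, relative_path)
  target = File.join(shared_path, relative_path)
  link   = File.join(release_path, relative_path)
  FileUtils.mkdir_p(File.dirname(link))
  FileUtils.ln_sf(target, link)
  link
end

Dir.mktmpdir do |deploy_to|
  shared  = File.join(deploy_to, "shared")
  release = File.join(deploy_to, "releases", "20140101000000") # hypothetical release name

  FileUtils.mkdir_p(File.join(shared, "config"))
  File.write(File.join(shared, "config", "database.yml"), "production:\n  adapter: postgresql\n")

  link = link_to_shared(release, shared, "config/database.yml")
  puts File.symlink?(link)
  puts File.readlink(link)
end
```

Every new release gets a fresh link to the same shared file, which is why credentials placed in shared/config survive across deploys.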

One approach is to SSH into the remote machine and create files like shared/config/database.yml by hand. This, however, is inefficient, as a lot of the configuration will be the same across all our remote servers and can be automatically generated based on the contents of the stage files. The aim of this book is to avoid these manual, error-prone steps.

To address this, this section:

# which config files should be copied by deploy:setup_config
# see documentation in lib/capistrano/tasks/setup_config.cap
# for details of operations
set(:config_files, %w(
  nginx.conf
  database.example.yml
  log_rotation
  monit
  unicorn.rb
  unicorn_init.sh
))

# which config files should be made executable after copying
# by deploy:setup_config
set(:executable_config_files, %w(
  unicorn_init.sh
))

is a custom extension to the standard Capistrano 3 approach to configuration files, which makes the initial creation of these files easier by adding the task deploy:setup_config.

When this task is run, for each of the files defined in :config_files it first looks for a corresponding .erb file (so for nginx.conf it looks for nginx.conf.erb) in config/deploy/#{application}_#{rails_env}/. If the file is not found there, it looks in config/deploy/shared/. Once it finds the source file, it parses the ERB and copies the result to the config directory in your remote shared path.

This allows you to define your common config files, which will be used by all stages (staging & production for example), in shared while still allowing for some templates to differ between stages.
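The lookup order and the ERB step can be sketched with the standard library alone. This is an illustration of the behaviour described above, not the code in setup_config.cap. The find_template and render_template names are hypothetical, and result_with_hash assumes Ruby 2.5 or later.

```ruby
require "erb"
require "tmpdir"
require "fileutils"

# Returns the first template that exists, preferring the stage-specific
# directory over the shared one. Returns nil if neither exists.
def find_template(deploy_config_dir, stage_dir_name, file_name)
  candidates = [
    File.join(deploy_config_dir, stage_dir_name, "#{file_name}.erb"),
    File.join(deploy_config_dir, "shared", "#{file_name}.erb")
  ]
  candidates.find { |path| File.exist?(path) }
end

# Renders an ERB file, exposing the given hash keys as local variables.
def render_template(path, variables)
  ERB.new(File.read(path)).result_with_hash(variables)
end

Dir.mktmpdir do |dir|
  FileUtils.mkdir_p(File.join(dir, "shared"))
  File.write(File.join(dir, "shared", "nginx.conf.erb"), "server_name <%= server_name %>;")

  path = find_template(dir, "app_name_production", "nginx.conf")
  puts render_template(path, server_name: "www.example.com example.com")
end
```

Adding an nginx.conf.erb under a stage-specific directory would make that stage pick up its own template while the others keep using the shared one.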

Finally this section:

# remove the default nginx configuration as it will tend
# to conflict with our configs.
before 'deploy:setup_config', 'nginx:remove_default_vhost'

# reload nginx so it will pick up any modified vhosts from
# setup_config
after 'deploy:setup_config', 'nginx:reload'

# Restart monit so it will pick up any monit configurations
# we've added
after 'deploy:setup_config', 'monit:restart'

means that whenever deploy:setup_config is run, we:

· Delete the default Nginx virtual host before the task runs, to stop it overriding our virtual host

· Reload Nginx afterwards to pick up any changes to the virtual host

· Restart Monit afterwards to pick up any changes to the Monit configuration

Managing non-Rails configuration

The target of the approach outlined in this book is to ensure that all configuration relating to the Rails application being deployed is managed by the deployment process. This means that in addition to files such as database.yml, which are internal to Rails, the configuration files that need to be managed by the deployment process include:

· Unicorn Monit definitions

· Nginx Virtual hosts entries

· Init scripts for unicorn and any background workers

· Log rotation definitions

Subsequent sections cover the contents of these files in detail. In the context of the deployment process, they differ from files like database.yml because they need to sit outside of the Rails directory structure.

Take, as an example, the Nginx virtual host file. Recall from section 10.0 that these should be placed in /etc/nginx/sites-enabled/; one possibility is to create a Capistrano task which copies this file directly to that location.

A more elegant solution, however, is to use symlinks. This allows us to keep all of the app’s configuration within shared/config and have Capistrano create symlinks to the appropriate locations.

The section below outlines these symlinks, which are created by the deploy:setup_config task defined in lib/capistrano/tasks/setup_config.cap:

# files which need to be symlinked to other parts of the
# filesystem. For example nginx virtualhosts, log rotation
# init scripts etc.
set(:symlinks, [
  {
    source: "nginx.conf",
    link: "/etc/nginx/sites-enabled/#{fetch(:full_app_name)}"
  },
  {
    source: "unicorn_init.sh",
    link: "/etc/init.d/unicorn_#{fetch(:full_app_name)}"
  },
  {
    source: "log_rotation",
    link: "/etc/logrotate.d/#{fetch(:full_app_name)}"
  },
  {
    source: "monit",
    link: "/etc/monit/conf.d/#{fetch(:full_app_name)}.conf"
  }
])
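As a rough illustration of what a task might do with this list, the sketch below turns each entry into a shell command that links a file in shared/config to its system location. The exact command is an assumption (including the use of sudo and the ln flags), and the symlink_commands method is hypothetical; check lib/capistrano/tasks/setup_config.cap for what the sample configuration actually runs.

```ruby
# Builds one shell command per symlink entry. The `sudo ln -nfs` form is
# an assumption for illustration, not taken from the sample configuration.
def symlink_commands(symlinks, shared_config_path)
  symlinks.map do |entry|
    source = File.join(shared_config_path, entry[:source])
    "sudo ln -nfs #{source} #{entry[:link]}"
  end
end

full_app_name = "app_name_production"
symlinks = [
  { source: "nginx.conf", link: "/etc/nginx/sites-enabled/#{full_app_name}" },
  { source: "monit",      link: "/etc/monit/conf.d/#{full_app_name}.conf" }
]

shared_config = "/home/deploy/apps/#{full_app_name}/shared/config"
puts symlink_commands(symlinks, shared_config)
```

Keeping the real files in shared/config means a later change to a template only requires re-running the setup task; the links themselves stay valid.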

Once you’ve made any required changes to production.rb, deploy.rb and the configuration files, use the below command to copy the configuration files to the remote server.

cap production deploy:setup_config

Database Credentials

The database example yml file intentionally doesn’t include actual credentials as these should not be stored in version control.

Therefore, you need to SSH into your remote server and cd into shared/config, then create a database.yml from the example:

cp database.example.yml database.yml

Edit it with a text editor such as vim, e.g.:

vim database.yml

And enter the details of the database that the app should connect to. If you’re using Postgres or MySQL you’ll need to create this database using the instructions from Chapter 11.

Deploying

Now that we’ve run:

cap production deploy:setup_config

and created a database.yml file, we’re ready to deploy. Once we’ve committed any changes and pushed them to the remote we’ve chosen to deploy from, deploying is as simple as entering:

cap production deploy

And waiting. The first deploy can take a while as Gems are installed, so be patient.

Tip: Failed doesn’t mean Failed

Don’t panic if you see lots of lines with (failed) in brackets. This is generally because Capistrano V3 uses Net::SSH, which will output (failed) if any command produces a non-zero return code. Not all of the commands follow the “0 = success” convention, so we see some of these (failed) messages. If a command has actually failed, such that the deploy itself fails, Capistrano will halt with an exception.
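One plausible source of such messages is a command whose exit status answers a question rather than reporting an error. The sketch below runs a shell directory test from Ruby: a missing directory gives a non-zero status even though nothing has gone wrong. The directory_check method is hypothetical, and the example assumes a POSIX sh is available.

```ruby
require "tmpdir"

# Runs a shell test for a directory and returns [success, exit status].
# The path is passed as a positional argument so it is never interpolated
# into the shell command itself.
def directory_check(path)
  success = system("sh", "-c", '[ -d "$1" ]', "sh", path)
  [success, $?.exitstatus]
end

Dir.mktmpdir do |dir|
  p directory_check(dir)                       # existing directory
  p directory_check(File.join(dir, "missing")) # missing directory
end
```

A deploy script can legitimately use the second result to decide that a directory needs creating, which is why a non-zero status on its own is not a reason to worry.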

Conclusion

This configuration is based heavily on the vanilla Capistrano configuration, with some extra convenience tasks added in lib/capistrano/tasks/ to make it quick to set up workflows I’ve found to be efficient for big production configurations.

I strongly recommend forking my sample configuration and tailoring it to fit the kinds of applications you develop. I usually end up with a few different configurations, each of which is used either for a particular type of personal project or for all of a specific client’s applications.

In the following sections we’ll cover how to create custom Capistrano tasks and look at each of the configuration files from the sample configuration in detail. The remaining chapters provide a reference for the sample configuration, along with the information required to customise it, rather than step-by-step instructions for rebuilding it from scratch.