Roles: Scaling Up Your Playbooks - Ansible: Up and Running (2015)

Chapter 8. Roles: Scaling Up Your Playbooks

One of the things I like about Ansible is how it scales both up and down. I’m not referring to the number of hosts you’re managing, but rather the complexity of the jobs you’re trying to automate.

Ansible scales down well because simple tasks are easy to implement. It scales up well because it provides mechanisms for decomposing complex jobs into smaller pieces.

In Ansible, the role is the primary mechanism for breaking apart a playbook into multiple files. This simplifies writing complex playbooks, and it also makes them easier to reuse. Think of a role as something you assign to one or more hosts. For example, you’d assign a database role to the hosts that will act as database servers.

Basic Structure of a Role

An Ansible role has a name, such as “database.” Files associated with the database role go in the roles/database directory, which contains the following files and directories.

roles/database/tasks/main.yml

Tasks

roles/database/files/

Holds files to be uploaded to hosts

roles/database/templates/

Holds Jinja2 template files

roles/database/handlers/main.yml

Handlers

roles/database/vars/main.yml

Variables that shouldn’t be overridden

roles/database/defaults/main.yml

Default variables that can be overridden

roles/database/meta/main.yml

Dependency information about a role

Each individual file is optional; if your role doesn’t have any handlers, there’s no need to have an empty handlers/main.yml file.
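
As a rough sketch of this point, a role can consist of nothing but a tasks file. The ntp role name and the task below are hypothetical, chosen only for illustration, and the task assumes the play already runs with root privileges:

```yaml
# roles/ntp/tasks/main.yml: the only file in this hypothetical minimal role.
# No handlers/, vars/, defaults/, or meta/ directories are needed.
- name: install the ntp package
  apt: pkg=ntp
```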

WHERE DOES ANSIBLE LOOK FOR MY ROLES?

Ansible will look for roles in the roles directory alongside your playbooks. It will also look for systemwide roles in /etc/ansible/roles. You can customize the systemwide location of roles by setting the roles_path setting in the defaults section of your ansible.cfg file, as shown in Example 8-1.

EXAMPLE 8-1. ANSIBLE.CFG: OVERRIDING DEFAULT ROLES PATH

[defaults]
roles_path = ~/ansible_roles

You can also override this by setting the ANSIBLE_ROLES_PATH environment variable, as described in Appendix B.

Example: Database and Mezzanine Roles

Let’s take our Mezzanine playbook and implement it with Ansible roles. We could create a single role called “mezzanine,” but instead I’m going to break out the deployment of the Postgres database into a separate role called “database.” This will make it easier to eventually deploy the database on a host separate from the Mezzanine application.

Using Roles in Your Playbooks

Before we get into the details of how to define roles, let’s go over how to assign roles to hosts in a playbook.

Example 8-2 shows what our playbook looks like for deploying Mezzanine onto a single host, once we have database and Mezzanine roles defined.

Example 8-2. mezzanine-single-host.yml

- name: deploy mezzanine on vagrant
  hosts: web
  vars_files:
    - secrets.yml
  roles:
    - role: database
      database_name: "{{ mezzanine_proj_name }}"
      database_user: "{{ mezzanine_proj_name }}"

    - role: mezzanine
      live_hostname: 192.168.33.10.xip.io
      domains:
        - 192.168.33.10.xip.io
        - www.192.168.33.10.xip.io

When we use roles, we have a roles section in our playbook. The roles section expects a list of roles. In our example, our list contains two roles, database and mezzanine.

Note how we can pass in variables when invoking the roles. In our example, we pass the database_name and database_user variables for the database role. If these variables have already been defined in the role (either in vars/main.yml or defaults/main.yml), then the values will be overridden with the variables that were passed in.

If you aren’t passing in variables to roles, you can simply specify the names of the roles, like this:

roles:
  - database
  - mezzanine

With database and mezzanine roles defined, writing a playbook that deploys the web application and database services to multiple hosts becomes much simpler. Example 8-3 shows a playbook that deploys the database on the db host and the web service on the web host. Note that this playbook contains two separate plays.

Example 8-3. mezzanine-across-hosts.yml

- name: deploy postgres on vagrant
  hosts: db
  vars_files:
    - secrets.yml
  roles:
    - role: database
      database_name: "{{ mezzanine_proj_name }}"
      database_user: "{{ mezzanine_proj_name }}"

- name: deploy mezzanine on vagrant
  hosts: web
  vars_files:
    - secrets.yml
  roles:
    - role: mezzanine
      database_host: "{{ hostvars.db.ansible_eth1.ipv4.address }}"
      live_hostname: 192.168.33.10.xip.io
      domains:
        - 192.168.33.10.xip.io
        - www.192.168.33.10.xip.io

Pre-Tasks and Post-Tasks

Sometimes you want to run some tasks before or after you invoke your roles. Let’s say you wanted to update the apt cache before you deployed Mezzanine, and you wanted to send a notification to a Slack channel after you deployed.

Ansible allows you to define a list of tasks that execute before the roles with a pre_tasks section, and a list of tasks that execute after the roles with a post_tasks section. Example 8-4 shows these in action.

Example 8-4. Using pre-tasks and post-tasks

- name: deploy mezzanine on vagrant
  hosts: web
  vars_files:
    - secrets.yml
  pre_tasks:
    - name: update the apt cache
      apt: update_cache=yes
  roles:
    - role: mezzanine
      database_host: "{{ hostvars.db.ansible_eth1.ipv4.address }}"
      live_hostname: 192.168.33.10.xip.io
      domains:
        - 192.168.33.10.xip.io
        - www.192.168.33.10.xip.io
  post_tasks:
    - name: notify Slack that the servers have been updated
      local_action: >
        slack
        domain=acme.slack.com
        token={{ slack_token }}
        msg="web server {{ inventory_hostname }} configured"

But enough about using roles; let’s talk about writing them.

A “Database” Role for Deploying the Database

The job of our “database” role will be to install Postgres and create the required database and database user.

Our database role involves the following files:

- roles/database/tasks/main.yml
- roles/database/defaults/main.yml
- roles/database/handlers/main.yml
- roles/database/files/pg_hba.conf
- roles/database/files/postgresql.conf

This role includes two customized Postgres configuration files.

postgresql.conf

Modifies the default listen_addresses configuration option so that Postgres will accept connections on any network interface. The default for Postgres is to accept connections only from localhost, which doesn’t work for us if we want our database to run on a separate host from our web application.

pg_hba.conf

Configures Postgres to authenticate connections over the network using username and password.

NOTE

I don’t show these files here because they are quite large. You can find them in the code samples on GitHub in the ch08 directory.

Example 8-5 shows the tasks involved in deploying Postgres.

Example 8-5. roles/database/tasks/main.yml

- name: install apt packages
  apt: pkg={{ item }} update_cache=yes cache_valid_time=3600
  sudo: True
  with_items:
    - libpq-dev
    - postgresql
    - python-psycopg2

- name: copy configuration file
  copy: >
    src=postgresql.conf dest=/etc/postgresql/9.3/main/postgresql.conf
    owner=postgres group=postgres mode=0644
  sudo: True
  notify: restart postgres

- name: copy client authentication configuration file
  copy: >
    src=pg_hba.conf dest=/etc/postgresql/9.3/main/pg_hba.conf
    owner=postgres group=postgres mode=0640
  sudo: True
  notify: restart postgres

- name: create a user
  postgresql_user:
    name: "{{ database_user }}"
    password: "{{ db_pass }}"
  sudo: True
  sudo_user: postgres

- name: create the database
  postgresql_db:
    name: "{{ database_name }}"
    owner: "{{ database_user }}"
    encoding: UTF8
    lc_ctype: "{{ locale }}"
    lc_collate: "{{ locale }}"
    template: template0
  sudo: True
  sudo_user: postgres

Example 8-6 shows the handlers file.

Example 8-6. roles/database/handlers/main.yml

- name: restart postgres
  service: name=postgresql state=restarted
  sudo: True

The only default variable we are going to specify is the database port, shown in Example 8-7.

Example 8-7. roles/database/defaults/main.yml

database_port: 5432

Note that our list of tasks refers to several variables that we haven’t defined anywhere in the role:

- database_name
- database_user
- db_pass
- locale

In Example 8-2 and Example 8-3, we pass in database_name and database_user when we invoke the role. I’m assuming that db_pass is defined in the secrets.yml file, which is included in the vars_files section. The locale variable is likely something that would be the same for every host, and might be used by multiple roles or playbooks, so I defined it in the group_vars/all file in the code samples that accompany this book.
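
A minimal sketch of where those last two variables might live. The values are placeholders assumed for illustration, not taken from the book's code samples, and each YAML document below stands for a separate file:

```yaml
# secrets.yml: loaded through the vars_files section of the play.
# The value is a placeholder, not a real password.
db_pass: change-me
---
# group_vars/all: applies to every host in the inventory.
# en_US.UTF-8 is an assumed value for illustration.
locale: en_US.UTF-8
```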

WHY ARE THERE TWO WAYS TO DEFINE VARIABLES IN ROLES?

When Ansible first introduced support for roles, there was only one place to define role variables, in vars/main.yml. Variables defined in this location have a higher precedence than variables defined in the vars section of a play, which meant that you couldn’t override the variable unless you explicitly passed it as an argument to the role.

Ansible later introduced the notion of default role variables that go in defaults/main.yml. This type of variable is defined in a role, but has a low precedence, so it will be overridden if another variable with the same name is defined in the playbook.

If you think you might want to change the value of a variable in a role, use a default variable. If you don’t want it to change, then use a regular variable.
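
A minimal sketch of the two locations, assuming a hypothetical database_data_dir variable and a hypothetical port override that are not part of the role in this chapter. Each YAML document below stands for a separate file:

```yaml
# roles/database/defaults/main.yml: low precedence, meant to be overridden
database_port: 5432
---
# roles/database/vars/main.yml: high precedence
# (database_data_dir is a hypothetical variable, used only for illustration)
database_data_dir: /var/lib/postgresql
---
# Playbook: the play-level vars section overrides the default above,
# but it would not override database_data_dir.
- name: deploy postgres on a nonstandard port
  hosts: db
  vars:
    database_port: 5433
  roles:
    - database
```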

A “Mezzanine” Role for Deploying Mezzanine

The job of our “mezzanine” role will be to install Mezzanine. This includes installing nginx as the reverse proxy and supervisor as the process monitor.

Here are the files that are involved:

- roles/mezzanine/defaults/main.yml
- roles/mezzanine/handlers/main.yml
- roles/mezzanine/tasks/django.yml
- roles/mezzanine/tasks/main.yml
- roles/mezzanine/tasks/nginx.yml
- roles/mezzanine/templates/gunicorn.conf.py.j2
- roles/mezzanine/templates/local_settings.py.filters.j2
- roles/mezzanine/templates/local_settings.py.j2
- roles/mezzanine/templates/nginx.conf.j2
- roles/mezzanine/templates/supervisor.conf.j2
- roles/mezzanine/vars/main.yml

Example 8-8 shows the variables we’ve defined for this role. Note that we’ve changed the name of the variables so that they all start with mezzanine. It’s good practice to do this with role variables because Ansible doesn’t have any notion of namespace across roles. This means that variables that are defined in other roles, or elsewhere in a playbook, will be accessible everywhere. This can cause some unexpected behavior if you accidentally use the same variable name in two different roles.

Example 8-8. roles/mezzanine/vars/main.yml

# vars file for mezzanine
mezzanine_user: "{{ ansible_ssh_user }}"
mezzanine_venv_home: "{{ ansible_env.HOME }}"
mezzanine_venv_path: "{{ mezzanine_venv_home }}/{{ mezzanine_proj_name }}"
mezzanine_repo_url: git@github.com:lorin/mezzanine-example.git
mezzanine_proj_dirname: project
mezzanine_proj_path: "{{ mezzanine_venv_path }}/{{ mezzanine_proj_dirname }}"
mezzanine_reqs_path: requirements.txt
mezzanine_conf_path: /etc/nginx/conf
mezzanine_python: "{{ mezzanine_venv_path }}/bin/python"
mezzanine_manage: "{{ mezzanine_python }} {{ mezzanine_proj_path }}/manage.py"
mezzanine_gunicorn_port: 8000
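
The prefixing advice can be sketched with two hypothetical roles, web and cache, that each need a port setting. Neither role nor variable comes from this chapter's code; each YAML document below stands for a separate file:

```yaml
# roles/web/vars/main.yml (hypothetical role)
# An unprefixed name such as "port" could clash with the same name
# defined by another role, so the role name is used as a prefix.
web_port: 8000
---
# roles/cache/vars/main.yml (hypothetical role)
cache_port: 11211
```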

Example 8-9 shows the default variables defined on our mezzanine role. In this case, we have only a single variable. When I write default variables, I’m less likely to prefix them because I might intentionally want to override them elsewhere.

Example 8-9. roles/mezzanine/defaults/main.yml

tls_enabled: True

Because the task list is pretty long, I’ve decided to break it up across several files. Example 8-10 shows the top-level task file for the mezzanine role. It installs the apt packages, and then it uses include statements to invoke two other task files that are in the same directory, shown in Examples 8-11 and 8-12.

Example 8-10. roles/mezzanine/tasks/main.yml

- name: install apt packages
  apt: pkg={{ item }} update_cache=yes cache_valid_time=3600
  sudo: True
  with_items:
    - git
    - libjpeg-dev
    - libpq-dev
    - memcached
    - nginx
    - python-dev
    - python-pip
    - python-psycopg2
    - python-setuptools
    - python-virtualenv
    - supervisor

- include: django.yml

- include: nginx.yml

Example 8-11. roles/mezzanine/tasks/django.yml

- name: check out the repository on the host
  git:
    repo: "{{ mezzanine_repo_url }}"
    dest: "{{ mezzanine_proj_path }}"
    accept_hostkey: yes

- name: install required python packages
  pip: name={{ item }} virtualenv={{ mezzanine_venv_path }}
  with_items:
    - gunicorn
    - setproctitle
    - south
    - psycopg2
    - django-compressor
    - python-memcached

- name: install requirements.txt
  pip: >
    requirements={{ mezzanine_proj_path }}/{{ mezzanine_reqs_path }}
    virtualenv={{ mezzanine_venv_path }}

- name: generate the settings file
  template: src=local_settings.py.j2 dest={{ mezzanine_proj_path }}/local_settings.py

- name: sync the database, apply migrations, collect static content
  django_manage:
    command: "{{ item }}"
    app_path: "{{ mezzanine_proj_path }}"
    virtualenv: "{{ mezzanine_venv_path }}"
  with_items:
    - syncdb
    - migrate
    - collectstatic

- name: set the site id
  script: scripts/setsite.py
  environment:
    PATH: "{{ mezzanine_venv_path }}/bin"
    PROJECT_DIR: "{{ mezzanine_proj_path }}"
    WEBSITE_DOMAIN: "{{ live_hostname }}"

- name: set the admin password
  script: scripts/setadmin.py
  environment:
    PATH: "{{ mezzanine_venv_path }}/bin"
    PROJECT_DIR: "{{ mezzanine_proj_path }}"
    ADMIN_PASSWORD: "{{ admin_pass }}"

- name: set the gunicorn config file
  template: src=gunicorn.conf.py.j2 dest={{ mezzanine_proj_path }}/gunicorn.conf.py

- name: set the supervisor config file
  template: src=supervisor.conf.j2 dest=/etc/supervisor/conf.d/mezzanine.conf
  sudo: True
  notify: restart supervisor

- name: ensure config path exists
  file: path={{ mezzanine_conf_path }} state=directory
  sudo: True
  when: tls_enabled

- name: install poll twitter cron job
  cron: >
    name="poll twitter" minute="*/5" user={{ mezzanine_user }}
    job="{{ mezzanine_manage }} poll_twitter"

Example 8-12. roles/mezzanine/tasks/nginx.yml

- name: set the nginx config file
  template: src=nginx.conf.j2 dest=/etc/nginx/sites-available/mezzanine.conf
  notify: restart nginx
  sudo: True

- name: enable the nginx config file
  file:
    src: /etc/nginx/sites-available/mezzanine.conf
    dest: /etc/nginx/sites-enabled/mezzanine.conf
    state: link
  notify: restart nginx
  sudo: True

- name: remove the default nginx config file
  file: path=/etc/nginx/sites-enabled/default state=absent
  notify: restart nginx
  sudo: True

- name: create tls certificates
  command: >
    openssl req -new -x509 -nodes -out {{ mezzanine_proj_name }}.crt
    -keyout {{ mezzanine_proj_name }}.key -subj '/CN={{ domains[0] }}' -days 3650
    chdir={{ mezzanine_conf_path }}
    creates={{ mezzanine_conf_path }}/{{ mezzanine_proj_name }}.crt
  sudo: True
  when: tls_enabled
  notify: restart nginx

There’s one important difference between tasks defined in a role and tasks defined in a regular playbook: where the copy and template modules look for their source files.

When invoking copy in a task defined in a role, Ansible will first check the rolename/files/ directory for the location of the file to copy. Similarly, when invoking template in a task defined in a role, Ansible will first check the rolename/templates/ directory for the location of the template to use.

This means that a task that used to look like this in a playbook:

- name: set the nginx config file
  template: >
    src=templates/nginx.conf.j2
    dest=/etc/nginx/sites-available/mezzanine.conf

now looks like this when invoked from inside the role (note the change of the src parameter):

- name: set the nginx config file
  template: src=nginx.conf.j2 dest=/etc/nginx/sites-available/mezzanine.conf
  notify: restart nginx

Example 8-13 shows the handlers file.

Example 8-13. roles/mezzanine/handlers/main.yml

- name: restart supervisor
  supervisorctl: name=gunicorn_mezzanine state=restarted
  sudo: True

- name: restart nginx
  service: name=nginx state=restarted
  sudo: True

I won’t show the template files here, since they’re basically the same as in the previous chapter, although some of the variable names have changed. Check out the accompanying code samples for details.

Creating Role Files and Directories with ansible-galaxy

Ansible ships with another command-line tool we haven’t talked about yet, ansible-galaxy. Its primary purpose is to download roles that have been shared by the Ansible community (more on that later in the chapter). But it can also be used to generate scaffolding, an initial set of files and directories involved in a role:

$ ansible-galaxy init -p playbooks/roles web

The -p flag tells ansible-galaxy where your roles directory is. If you don’t specify it, then the role files will be created in your current directory.

Running the command creates the following files and directories:

- playbooks/roles/web/tasks/main.yml
- playbooks/roles/web/handlers/main.yml
- playbooks/roles/web/vars/main.yml
- playbooks/roles/web/defaults/main.yml
- playbooks/roles/web/meta/main.yml
- playbooks/roles/web/files/
- playbooks/roles/web/templates/
- playbooks/roles/web/README.md

Dependent Roles

Imagine that we had two roles, web and database, that both required an NTP[1] server to be installed on the host. We could specify the installation of the NTP server in both the web and database roles, but that would result in duplication. We could create a separate ntp role, but then we would have to remember that whenever we apply the web or database role to a host, we have to apply the ntp role as well. This would avoid the duplication, but it’s error-prone because we might forget to specify the ntp role. What we really want is to have an ntp role that is always applied to a host whenever we apply the web role or the database role.

Ansible supports a feature called dependent roles to deal with this scenario. When you define a role, you can specify that it depends on one or more other roles. Ansible will ensure that roles that are specified as dependencies are executed first.

Continuing with our example, let’s say that we created an ntp role that configures a host to synchronize its time with an NTP server. Ansible allows us to pass parameters to dependent roles, so let’s also assume that we can pass the NTP server as a parameter to that role.

We’d specify that the web role depends on the ntp role by creating a roles/web/meta/main.yml file and listing it as a role, with a parameter, as shown in Example 8-14.

Example 8-14. roles/web/meta/main.yml

dependencies:
  - { role: ntp, ntp_server: ntp.ubuntu.com }

We can also specify multiple dependent roles. For example, if we had a django role for setting up a Django web server, and we wanted to specify nginx and memcached as dependent roles, then the role metadata file might look like Example 8-15.

Example 8-15. roles/django/meta/main.yml

dependencies:
  - { role: nginx }
  - { role: memcached }

For details on how Ansible evaluates the role dependencies, check out the official Ansible documentation on role dependencies.
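
As a rough sketch of the ordering described earlier in this section, a playbook only needs to list the role it cares about. The play below is hypothetical and assumes that roles/web/meta/main.yml lists ntp under dependencies:

```yaml
# Hypothetical playbook: only the web role is listed explicitly.
- name: configure web servers
  hosts: web
  roles:
    - web
# Because ntp is declared as a dependency of web, the ntp role's tasks
# run on each host before the web role's tasks; ntp is never named here.
```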

Ansible Galaxy

If you need to deploy an open source software system onto your hosts, chances are somebody has already written an Ansible role to do it. Although Ansible does make it easier to write scripts for deploying software, some systems are just plain tricky to deploy.

Whether you want to reuse a role somebody has already written, or you just want to see how someone else solved the problem you’re working on, Ansible Galaxy can help you out. Ansible Galaxy is an open source repository of Ansible roles contributed by the Ansible community. The roles themselves are stored on GitHub.

Web Interface

You can explore the available roles on the Ansible Galaxy site. Galaxy supports freetext searching and browsing by category or contributor.

Command-Line Interface

The ansible-galaxy command-line tool allows you to download roles from Ansible Galaxy.

Installing a role

Let’s say I want to install the role named ntp, written by GitHub user bennojoy. This is a role that will configure a host to synchronize its clock with an NTP server.

Install the role with the install command.

$ ansible-galaxy install -p ./roles bennojoy.ntp

The ansible-galaxy program will install roles to your systemwide location by default (see “Where Does Ansible Look for My Roles?”), which we overrode in the preceding example with the -p flag.

The output should look like this:

downloading role 'ntp', owned by bennojoy
no version specified, installing master
- downloading role from https://github.com/bennojoy/ntp/archive/master.tar.gz
- extracting bennojoy.ntp to ./roles/bennojoy.ntp
write_galaxy_install_info!
bennojoy.ntp was installed successfully

The ansible-galaxy tool will install the role files to roles/bennojoy.ntp.

Ansible will install some metadata about the installation to the ./roles/bennojoy.ntp/meta/.galaxy_install_info file. On my machine, that file contains:

{install_date: 'Sat Oct 4 20:12:58 2014', version: master}

NOTE

The bennojoy.ntp role does not have a specific version number, so the version is simply listed as “master.” Some roles will have a specific version number, such as 1.2.
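
If you would rather record which roles and versions a project uses, ansible-galaxy can also read them from a YAML file passed with the -r flag. The following is a minimal sketch; the file name requirements.yml is only a convention assumed here, and the version shown is the one from the install above:

```yaml
# requirements.yml (assumed file name)
# Each entry names a role to install; version is optional.
- src: bennojoy.ntp
  version: master
```

With such a file in place, running ansible-galaxy install -p ./roles -r requirements.yml should install every role listed in it.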

List installed roles

You can list installed roles by doing:

$ ansible-galaxy list

Output should look like this:

bennojoy.ntp, master

Uninstall a role

Remove a role with the remove command:

$ ansible-galaxy remove bennojoy.ntp

Contributing Your Own Role

See “How To Share Roles You’ve Written” on the Ansible Galaxy website for details on how to contribute a role to the community. Because the roles are hosted on GitHub, you’ll need to have a GitHub account to contribute.

At this point, you should now have an understanding of how to use roles, how to write your own roles, and how to download roles written by others. Roles are a great way to organize your playbooks. I use them all the time, and I highly recommend them.

[1] NTP stands for Network Time Protocol, used for synchronizing clocks.