Ansible: Up and Running (2015)
Chapter 8. Roles: Scaling Up Your Playbooks
One of the things I like about Ansible is how it scales both up and down. I’m not referring to the number of hosts you’re managing, but rather the complexity of the jobs you’re trying to automate.
Ansible scales down well because simple tasks are easy to implement. It scales up well because it provides mechanisms for decomposing complex jobs into smaller pieces.
In Ansible, the role is the primary mechanism for breaking apart a playbook into multiple files. This simplifies writing complex playbooks, and it also makes them easier to reuse. Think of a role as something you assign to one or more hosts. For example, you’d assign a database role to the hosts that will act as database servers.
Basic Structure of a Role
An Ansible role has a name, such as “database.” Files associated with the database role go in the roles/database directory, which contains the following files and directories.
roles/database/tasks/main.yml
Tasks
roles/database/files/
Holds files to be uploaded to hosts
roles/database/templates/
Holds Jinja2 template files
roles/database/handlers/main.yml
Handlers
roles/database/vars/main.yml
Variables that shouldn’t be overridden
roles/database/defaults/main.yml
Default variables that can be overridden
roles/database/meta/main.yml
Dependency information about a role
Each individual file is optional; if your role doesn’t have any handlers, there’s no need to have an empty handlers/main.yml file.
WHERE DOES ANSIBLE LOOK FOR MY ROLES?
Ansible will look for roles in the roles directory alongside your playbooks. It will also look for system-wide roles in /etc/ansible/roles. You can customize the system-wide location of roles by setting the roles_path setting in the defaults section of your ansible.cfg file, as shown in Example 8-1.
Example 8-1. ansible.cfg: overriding default roles path
[defaults]
roles_path = ~/ansible_roles
You can also override this by setting the ANSIBLE_ROLES_PATH environment variable, as described in Appendix B.
Example: Database and Mezzanine Roles
Let’s take our Mezzanine playbook and implement it with Ansible roles. We could create a single role called “mezzanine,” but instead I’m going to break out the deployment of the Postgres database into a separate role called “database.” This will make it easier to eventually deploy the database on a host separate from the Mezzanine application.
Using Roles in Your Playbooks
Before we get into the details of how to define roles, let’s go over how to assign roles to hosts in a playbook.
Example 8-2 shows what our playbook looks like for deploying Mezzanine onto a single host, once we have database and Mezzanine roles defined.
Example 8-2. mezzanine-single-host.yml
- name: deploy mezzanine on vagrant
  hosts: web
  vars_files:
    - secrets.yml
  roles:
    - role: database
      database_name: "{{ mezzanine_proj_name }}"
      database_user: "{{ mezzanine_proj_name }}"
    - role: mezzanine
      live_hostname: 192.168.33.10.xip.io
      domains:
        - 192.168.33.10.xip.io
        - www.192.168.33.10.xip.io
When we use roles, we have a roles section in our playbook. The roles section expects a list of roles. In our example, our list contains two roles, database and mezzanine.
Note how we can pass in variables when invoking the roles. In our example, we pass the database_name and database_user variables for the database role. If these variables have already been defined in the role (either in vars/main.yml or defaults/main.yml), then the values will be overridden with the variables that were passed in.
If you aren’t passing in variables to roles, you can simply specify the names of the roles, like this:
roles:
  - database
  - mezzanine
With database and mezzanine roles defined, writing a playbook that deploys the web application and database services to multiple hosts becomes much simpler. Example 8-3 shows a playbook that deploys the database on the db host and the web service on the web host. Note that this playbook contains two separate plays.
Example 8-3. mezzanine-across-hosts.yml
- name: deploy postgres on vagrant
  hosts: db
  vars_files:
    - secrets.yml
  roles:
    - role: database
      database_name: "{{ mezzanine_proj_name }}"
      database_user: "{{ mezzanine_proj_name }}"
- name: deploy mezzanine on vagrant
  hosts: web
  vars_files:
    - secrets.yml
  roles:
    - role: mezzanine
      database_host: "{{ hostvars.db.ansible_eth1.ipv4.address }}"
      live_hostname: 192.168.33.10.xip.io
      domains:
        - 192.168.33.10.xip.io
        - www.192.168.33.10.xip.io
Pre-Tasks and Post-Tasks
Sometimes you want to run some tasks before or after you invoke your roles. Let’s say you wanted to update the apt cache before you deployed Mezzanine, and you wanted to send a notification to a Slack channel after you deployed.
Ansible allows you to define a list of tasks that execute before the roles with a pre_tasks section, and a list of tasks that execute after the roles with a post_tasks section. Example 8-4 shows an example of these in action.
Example 8-4. Using pre-tasks and post-tasks
- name: deploy mezzanine on vagrant
  hosts: web
  vars_files:
    - secrets.yml
  pre_tasks:
    - name: update the apt cache
      apt: update_cache=yes
  roles:
    - role: mezzanine
      database_host: "{{ hostvars.db.ansible_eth1.ipv4.address }}"
      live_hostname: 192.168.33.10.xip.io
      domains:
        - 192.168.33.10.xip.io
        - www.192.168.33.10.xip.io
  post_tasks:
    - name: notify Slack that the servers have been updated
      local_action: >
        slack
        domain=acme.slack.com
        token={{ slack_token }}
        msg="web server {{ inventory_hostname }} configured"
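To make the ordering explicit, here is a minimal sketch of a play that has all four sections. The debug tasks and their messages are placeholders of my own, not part of the Mezzanine deployment. Ansible runs pre_tasks first, then the roles, then any regular tasks, and finally post_tasks.

```yaml
- name: illustrate the order of execution around roles
  hosts: web
  pre_tasks:
    - name: runs first
      debug: msg="pre_tasks run before any role"
  roles:
    - mezzanine
  tasks:
    - name: runs after the roles
      debug: msg="tasks run after the roles have finished"
  post_tasks:
    - name: runs last
      debug: msg="post_tasks run after everything else"
```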
But enough about using roles; let’s talk about writing them.
A “Database” Role for Deploying the Database
The job of our “database” role will be to install Postgres and create the required database and database user.
Our database role involves the following files:
§ roles/database/tasks/main.yml
§ roles/database/defaults/main.yml
§ roles/database/handlers/main.yml
§ roles/database/files/pg_hba.conf
§ roles/database/files/postgresql.conf
This role includes two customized Postgres configuration files.
postgresql.conf
Modifies the default listen_addresses configuration option so that Postgres will accept connections on any network interface. The default for Postgres is to accept connections only from localhost, which doesn’t work for us if we want our database to run on a separate host from our web application.
pg_hba.conf
Configures Postgres to authenticate connections over the network using username and password.
NOTE
I don’t show these files here because they are quite large. You can find them in the code samples on GitHub in the ch08 directory.
Example 8-5 shows the tasks involved in deploying Postgres.
Example 8-5. roles/database/tasks/main.yml
- name: install apt packages
  apt: pkg={{ item }} update_cache=yes cache_valid_time=3600
  sudo: True
  with_items:
    - libpq-dev
    - postgresql
    - python-psycopg2
- name: copy configuration file
  copy: >
    src=postgresql.conf dest=/etc/postgresql/9.3/main/postgresql.conf
    owner=postgres group=postgres mode=0644
  sudo: True
  notify: restart postgres
- name: copy client authentication configuration file
  copy: >
    src=pg_hba.conf dest=/etc/postgresql/9.3/main/pg_hba.conf
    owner=postgres group=postgres mode=0640
  sudo: True
  notify: restart postgres
- name: create a user
  postgresql_user:
    name: "{{ database_user }}"
    password: "{{ db_pass }}"
  sudo: True
  sudo_user: postgres
- name: create the database
  postgresql_db:
    name: "{{ database_name }}"
    owner: "{{ database_user }}"
    encoding: UTF8
    lc_ctype: "{{ locale }}"
    lc_collate: "{{ locale }}"
    template: template0
  sudo: True
  sudo_user: postgres
Example 8-6 shows the handlers file.
Example 8-6. roles/database/handlers/main.yml
- name: restart postgres
  service: name=postgresql state=restarted
  sudo: True
The only default variable we are going to specify is the database port, shown in Example 8-7.
Example 8-7. roles/database/defaults/main.yml
database_port: 5432
Note that our list of tasks refers to several variables that we haven’t defined anywhere in the role:
§ database_name
§ database_user
§ db_pass
§ locale
In Example 8-2 and Example 8-3, we pass in database_name and database_user when we invoke the role. I’m assuming that db_pass is defined in the secrets.yml file, which is included in the vars_files section. The locale variable is likely something that would be the same for every host, and might be used by multiple roles or playbooks, so I defined it in the group_vars/all file in the code samples that accompany this book.
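As a rough sketch of where those last two variables might live, the two files could look something like the following. The values are placeholders I made up for illustration; the real files are in the code samples, and the real secrets.yml should stay out of version control.

```yaml
---
# secrets.yml (placeholder value, not the real password)
db_pass: CHANGE_ME
---
# group_vars/all (assumed locale; use whatever your hosts need)
locale: en_US.UTF-8
```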
WHY ARE THERE TWO WAYS TO DEFINE VARIABLES IN ROLES?
When Ansible first introduced support for roles, there was only one place to define role variables, in vars/main.yml. Variables defined in this location have a higher precedence than variables defined in the vars section of a play, which meant that you couldn’t override the variable unless you explicitly passed it as an argument to the role.
Ansible later introduced the notion of default role variables that go in defaults/main.yml. This type of variable is defined in a role, but has a low precedence, so it will be overridden if another variable with the same name is defined in the playbook.
If you think you might want to change the value of a variable in a role, use a default variable. If you don’t want it to change, then use a regular variable.
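To make the distinction concrete, here is a minimal sketch using a hypothetical role named app. None of these names or values come from the Mezzanine example. The play tries to set both variables in its vars section, but only the one backed by defaults/main.yml picks up the play’s value.

```yaml
---
# roles/app/defaults/main.yml (hypothetical): low precedence, meant to be overridden
app_port: 8000
---
# roles/app/vars/main.yml (hypothetical): high precedence, not meant to change
app_conf_dir: /etc/app
---
# playbook.yml
- name: show which play-level values take effect
  hosts: web
  vars:
    app_port: 9000          # wins over the role default
    app_conf_dir: /opt/app  # ignored: the value in the role's vars/main.yml wins
  roles:
    - app
```

If you really did need a different app_conf_dir, you’d have to pass it as an argument when invoking the role.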
A “Mezzanine” Role for Deploying Mezzanine
The job of our “mezzanine” role will be to install Mezzanine. This includes installing nginx as the reverse proxy and supervisor as the process monitor.
Here are the files that are involved:
§ roles/mezzanine/defaults/main.yml
§ roles/mezzanine/handlers/main.yml
§ roles/mezzanine/tasks/django.yml
§ roles/mezzanine/tasks/main.yml
§ roles/mezzanine/tasks/nginx.yml
§ roles/mezzanine/templates/gunicorn.conf.py.j2
§ roles/mezzanine/templates/local_settings.py.filters.j2
§ roles/mezzanine/templates/local_settings.py.j2
§ roles/mezzanine/templates/nginx.conf.j2
§ roles/mezzanine/templates/supervisor.conf.j2
§ roles/mezzanine/vars/main.yml
Example 8-8 shows the variables we’ve defined for this role. Note that we’ve changed the name of the variables so that they all start with mezzanine. It’s good practice to do this with role variables because Ansible doesn’t have any notion of namespace across roles. This means that variables that are defined in other roles, or elsewhere in a playbook, will be accessible everywhere. This can cause some unexpected behavior if you accidentally use the same variable name in two different roles.
Example 8-8. roles/mezzanine/vars/main.yml
# vars file for mezzanine
mezzanine_user: "{{ ansible_ssh_user }}"
mezzanine_venv_home: "{{ ansible_env.HOME }}"
mezzanine_venv_path: "{{ mezzanine_venv_home }}/{{ mezzanine_proj_name }}"
mezzanine_repo_url: git@github.com:lorin/mezzanine-example.git
mezzanine_proj_dirname: project
mezzanine_proj_path: "{{ mezzanine_venv_path }}/{{ mezzanine_proj_dirname }}"
mezzanine_reqs_path: requirements.txt
mezzanine_conf_path: /etc/nginx/conf
mezzanine_python: "{{ mezzanine_venv_path }}/bin/python"
mezzanine_manage: "{{ mezzanine_python }} {{ mezzanine_proj_path }}/manage.py"
mezzanine_gunicorn_port: 8000
Example 8-9 shows the default variables defined on our mezzanine role. In this case, we have only a single variable. When I write default variables, I’m less likely to prefix them because I might intentionally want to override them elsewhere.
Example 8-9. roles/mezzanine/defaults/main.yml
tls_enabled: True
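Because tls_enabled is a default variable, a playbook can switch it off when it invokes the role. Here is a short sketch, assuming a hypothetical play where you don’t want the self-signed certificate generated; the other arguments are copied from Example 8-2.

```yaml
- name: deploy mezzanine without TLS
  hosts: web
  vars_files:
    - secrets.yml
  roles:
    - role: mezzanine
      tls_enabled: False   # overrides the default in roles/mezzanine/defaults/main.yml
      live_hostname: 192.168.33.10.xip.io
      domains:
        - 192.168.33.10.xip.io
        - www.192.168.33.10.xip.io
```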
Because the task list is pretty long, I’ve decided to break it up across several files. Example 8-10 shows the top-level task file for the mezzanine role. It installs the apt packages, and then it uses include statements to invoke two other task files that are in the same directory, shown in Examples 8-11 and 8-12.
Example 8-10. roles/mezzanine/tasks/main.yml
- name: install apt packages
  apt: pkg={{ item }} update_cache=yes cache_valid_time=3600
  sudo: True
  with_items:
    - git
    - libjpeg-dev
    - libpq-dev
    - memcached
    - nginx
    - python-dev
    - python-pip
    - python-psycopg2
    - python-setuptools
    - python-virtualenv
    - supervisor
- include: django.yml
- include: nginx.yml
Example 8-11. roles/mezzanine/tasks/django.yml
- name: check out the repository on the host
  git:
    repo: "{{ mezzanine_repo_url }}"
    dest: "{{ mezzanine_proj_path }}"
    accept_hostkey: yes
- name: install required python packages
  pip: name={{ item }} virtualenv={{ mezzanine_venv_path }}
  with_items:
    - gunicorn
    - setproctitle
    - south
    - psycopg2
    - django-compressor
    - python-memcached
- name: install requirements.txt
  pip: >
    requirements={{ mezzanine_proj_path }}/{{ mezzanine_reqs_path }}
    virtualenv={{ mezzanine_venv_path }}
- name: generate the settings file
  template: src=local_settings.py.j2 dest={{ mezzanine_proj_path }}/local_settings.py
- name: sync the database, apply migrations, collect static content
  django_manage:
    command: "{{ item }}"
    app_path: "{{ mezzanine_proj_path }}"
    virtualenv: "{{ mezzanine_venv_path }}"
  with_items:
    - syncdb
    - migrate
    - collectstatic
- name: set the site id
  script: scripts/setsite.py
  environment:
    PATH: "{{ mezzanine_venv_path }}/bin"
    PROJECT_DIR: "{{ mezzanine_proj_path }}"
    WEBSITE_DOMAIN: "{{ live_hostname }}"
- name: set the admin password
  script: scripts/setadmin.py
  environment:
    PATH: "{{ mezzanine_venv_path }}/bin"
    PROJECT_DIR: "{{ mezzanine_proj_path }}"
    ADMIN_PASSWORD: "{{ admin_pass }}"
- name: set the gunicorn config file
  template: src=gunicorn.conf.py.j2 dest={{ mezzanine_proj_path }}/gunicorn.conf.py
- name: set the supervisor config file
  template: src=supervisor.conf.j2 dest=/etc/supervisor/conf.d/mezzanine.conf
  sudo: True
  notify: restart supervisor
- name: ensure config path exists
  file: path={{ mezzanine_conf_path }} state=directory
  sudo: True
  when: tls_enabled
- name: install poll twitter cron job
  cron: >
    name="poll twitter" minute="*/5" user={{ mezzanine_user }}
    job="{{ mezzanine_manage }} poll_twitter"
Example 8-12. roles/mezzanine/tasks/nginx.yml
- name: set the nginx config file
  template: src=nginx.conf.j2 dest=/etc/nginx/sites-available/mezzanine.conf
  notify: restart nginx
  sudo: True
- name: enable the nginx config file
  file:
    src: /etc/nginx/sites-available/mezzanine.conf
    dest: /etc/nginx/sites-enabled/mezzanine.conf
    state: link
  notify: restart nginx
  sudo: True
- name: remove the default nginx config file
  file: path=/etc/nginx/sites-enabled/default state=absent
  notify: restart nginx
  sudo: True
- name: create tls certificates
  command: >
    openssl req -new -x509 -nodes -out {{ mezzanine_proj_name }}.crt
    -keyout {{ mezzanine_proj_name }}.key -subj '/CN={{ domains[0] }}' -days 3650
    chdir={{ mezzanine_conf_path }}
    creates={{ mezzanine_conf_path }}/{{ mezzanine_proj_name }}.crt
  sudo: True
  when: tls_enabled
  notify: restart nginx
There’s one important difference between tasks defined in a role and tasks defined in a regular playbook: how the copy and template modules locate their source files.
When invoking copy in a task defined in a role, Ansible will first check the rolename/files/ directory for the location of the file to copy. Similarly, when invoking template in a task defined in a role, Ansible will first check the rolename/templates directory for the location of the template to use.
This means that a task that used to look like this in a playbook:
- name: set the nginx config file
  template: >
    src=templates/nginx.conf.j2
    dest=/etc/nginx/sites-available/mezzanine.conf
Now looks like this when invoked from inside the role (note the change of the src parameter):
- name: set the nginx config file
  template: src=nginx.conf.j2 dest=/etc/nginx/sites-available/mezzanine.conf
  notify: restart nginx
Example 8-13 shows the handlers file.
Example 8-13. roles/mezzanine/handlers/main.yml
- name: restart supervisor
  supervisorctl: name=gunicorn_mezzanine state=restarted
  sudo: True
- name: restart nginx
  service: name=nginx state=restarted
  sudo: True
I won’t show the template files here, since they’re basically the same as in the previous chapter, although some of the variable names have changed. Check out the accompanying code samples for details.
Creating Role Files and Directories with ansible-galaxy
Ansible ships with another command-line tool we haven’t talked about yet, ansible-galaxy. Its primary purpose is to download roles that have been shared by the Ansible community (more on that later in the chapter). But it can also be used to generate scaffolding, an initial set of files and directories involved in a role:
$ ansible-galaxy init -p playbooks/roles web
The -p flag tells ansible-galaxy where your roles directory is. If you don’t specify it, then the role files will be created in your current directory.
Running the command creates the following files and directories:
§ playbooks/roles/web/tasks/main.yml
§ playbooks/roles/web/handlers/main.yml
§ playbooks/roles/web/vars/main.yml
§ playbooks/roles/web/defaults/main.yml
§ playbooks/roles/web/meta/main.yml
§ playbooks/roles/web/files/
§ playbooks/roles/web/templates/
§ playbooks/roles/web/README.md
Dependent Roles
Imagine that we had two roles, web and database, that both required an NTP[1] server to be installed on the host. We could specify the installation of the NTP server in both the web and database roles, but that would result in duplication. We could create a separate ntp role, but then we would have to remember that whenever we apply the web or database role to a host, we have to apply the ntp role as well. This would avoid the duplication, but it’s error-prone because we might forget to specify the ntp role. What we really want is to have an ntp role that is always applied to a host whenever we apply the web role or the database role.
Ansible supports a feature called dependent roles to deal with this scenario. When you define a role, you can specify that it depends on one or more other roles. Ansible will ensure that roles that are specified as dependencies are executed first.
Continuing with our example, let’s say that we created an ntp role that configures a host to synchronize its time with an NTP server. Ansible allows us to pass parameters to dependent roles, so let’s also assume that we can pass the NTP server as a parameter to that role.
We’d specify that the web role depends on the ntp role by creating a roles/web/meta/main.yml file and listing it as a role, with a parameter, as shown in Example 8-14.
Example 8-14. roles/web/meta/main.yml
dependencies:
  - { role: ntp, ntp_server: ntp.ubuntu.com }
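The chapter doesn’t define the ntp role itself, so here is only a rough sketch of what such a role might contain, assuming an Ubuntu host where both the package and the service are named ntp. The ntp.conf.j2 template is hypothetical; I’m assuming it would contain a line such as server {{ ntp_server }} so that the parameter passed in the dependency takes effect.

```yaml
---
# roles/ntp/tasks/main.yml (hypothetical)
- name: install ntp
  apt: pkg=ntp update_cache=yes cache_valid_time=3600
  sudo: True
- name: point ntp at the requested server
  template: src=ntp.conf.j2 dest=/etc/ntp.conf
  sudo: True
  notify: restart ntp
---
# roles/ntp/handlers/main.yml (hypothetical)
- name: restart ntp
  service: name=ntp state=restarted
  sudo: True
```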
We can also specify multiple dependent roles. For example, if we had a django role for setting up a Django web server, and we wanted to specify web and memcached as dependent roles, then the role metadata file might look like Example 8-15.
Example 8-15. roles/django/meta/main.yml
dependencies:
  - { role: web }
  - { role: memcached }
For details on how Ansible evaluates the role dependencies, check out the official Ansible documentation on role dependencies.
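As a rough illustration of the ordering, assume the web role depends on ntp (Example 8-14) and the django role depends on web and memcached (Example 8-15). A play that lists only django would then, as I understand the dependency rules, run the roles in the order noted in the comments.

```yaml
- name: deploy a django server
  hosts: web
  roles:
    - django
# Expected role execution order, dependencies first:
#   1. ntp        (dependency of web)
#   2. web        (dependency of django)
#   3. memcached  (dependency of django)
#   4. django
```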
Ansible Galaxy
If you need to deploy an open source software system onto your hosts, chances are somebody has already written an Ansible role to do it. Although Ansible does make it easier to write scripts for deploying software, some systems are just plain tricky to deploy.
Whether you want to reuse a role somebody has already written, or you just want to see how someone else solved the problem you’re working on, Ansible Galaxy can help you out. Ansible Galaxy is an open source repository of Ansible roles contributed by the Ansible community. The roles themselves are stored on GitHub.
Web Interface
You can explore the available roles on the Ansible Galaxy site. Galaxy supports freetext searching and browsing by category or contributor.
Command-Line Interface
The ansible-galaxy command-line tool allows you to download roles from Ansible Galaxy.
Installing a role
Let’s say I want to install the role named ntp, written by GitHub user bennojoy. This is a role that will configure a host to synchronize its clock with an NTP server.
Install the role with the install command.
$ ansible-galaxy install -p ./roles bennojoy.ntp
The ansible-galaxy program will install roles to your systemwide location by default (see “Where Does Ansible Look for My Roles?”), which we overrode in the preceding example with the -p flag.
The output should look like this:
downloading role 'ntp', owned by bennojoy
no version specified, installing master
- downloading role from https://github.com/bennojoy/ntp/archive/master.tar.gz
- extracting bennojoy.ntp to ./roles/bennojoy.ntp
write_galaxy_install_info!
bennojoy.ntp was installed successfully
The ansible-galaxy tool will install the role files to roles/bennojoy.ntp.
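Once the role is in the roles directory alongside your playbook, you refer to it by its full name, including the owner prefix. A minimal sketch, assuming the web group from the earlier examples and that the role needs root privileges to install packages (I haven’t checked which variables this particular role accepts, so none are passed here):

```yaml
- name: keep clocks in sync on the web servers
  hosts: web
  sudo: True
  roles:
    - bennojoy.ntp
```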
The tool also writes some metadata about the installation to the ./roles/bennojoy.ntp/meta/.galaxy_install_info file. On my machine, that file contains:
{install_date: 'Sat Oct 4 20:12:58 2014', version: master}
NOTE
The bennojoy.ntp role does not have a specific version number, so the version is simply listed as “master.” Some roles will have a specific version number, such as 1.2.
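If you want to record which roles and versions a project depends on, ansible-galaxy can also read them from a file. The sketch below assumes a file named requirements.yml (the name is my choice) and a release of Ansible recent enough to accept the YAML format; you would install from it with ansible-galaxy install -r requirements.yml -p ./roles.

```yaml
# requirements.yml (hypothetical filename)
- src: bennojoy.ntp
  version: master   # pin to a release such as 1.2 when the role publishes one
```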
List installed roles
You can list installed roles by running:
$ ansible-galaxy list
Output should look like this:
bennojoy.ntp, master
Uninstall a role
Remove a role with the remove command:
$ ansible-galaxy remove bennojoy.ntp
Contributing Your Own Role
See “How To Share Roles You’ve Written” on the Ansible Galaxy website for details on how to contribute a role to the community. Because the roles are hosted on GitHub, you’ll need to have a GitHub account to contribute.
At this point, you should have an understanding of how to use roles, how to write your own roles, and how to download roles written by others. Roles are a great way to organize your playbooks. I use them all the time, and I highly recommend them.
[1] NTP stands for Network Time Protocol, used for synchronizing clocks.