Chapter 7. Complex Playbooks

In the last chapter, we went over a fully functional Ansible playbook for deploying the Mezzanine CMS. That example exercised a number of common Ansible features, but it didn’t cover all of them. This chapter touches on those additional features, which makes it a bit of a grab bag.

Running a Task on the Control Machine

Sometimes you want to run a particular task on the control machine instead of on the remote host. Ansible provides the local_action clause for tasks to support this.

Imagine that the server we wanted to install Mezzanine onto had just booted, so that if we ran our playbook too soon, it would error out because the server hadn’t fully started up yet.

We could start off our playbook by invoking the wait_for module to wait until the SSH server was ready to accept connections before we executed the rest of the playbook. In this case, we want this module to execute on our laptop, not on the remote host.

The first task of our playbook would have to start off like this:

- name: wait for ssh server to be running
  local_action: wait_for port=22 host="{{ inventory_hostname }}"
                search_regex=OpenSSH

Note how we’re referencing inventory_hostname in this task, which evaluates to the name of the remote host, not localhost. That’s because the scope of these variables is still the remote host, even though the task is executing locally.

NOTE

If your play involves multiple hosts, and you use local_action, the task will be executed multiple times, once for each host. You can restrict this by using run_once, as described in “Running Only Once”.

Running a Task on a Machine Other Than the Host

Sometimes you want to run a task that’s associated with a host, but you want to execute the task on a different server. You can use the delegate_to clause to run the task on a different host.

Two common use cases are:

- Enabling host-based alerts with an alerting system such as Nagios
- Adding a host to a load balancer such as HAProxy

For example, imagine we want to enable Nagios alerts for all of the hosts in our web group. Assume we have an entry in our inventory named nagios.example.com that is running Nagios. Example 7-1 shows an example that uses delegate_to.

Example 7-1. Using delegate_to with Nagios

- name: enable alerts for web servers
  hosts: web
  tasks:
    - name: enable alerts
      nagios: action=enable_alerts service=web host={{ inventory_hostname }}
      delegate_to: nagios.example.com

In this example, Ansible would execute the nagios task on nagios.example.com, but the inventory_hostname variable referenced in the play would evaluate to the web host.

For a more detailed example that uses delegate_to, see the lamp_haproxy/rolling_update.yml example in the Ansible project’s examples GitHub repo.

Manually Gathering Facts

Because the SSH server might not yet be running when we start our playbook, we need to turn off fact gathering; otherwise, Ansible will try to SSH to the host to gather facts before running the first task. Since we still need access to facts (recall that we use the ansible_env fact in our playbook), we can explicitly invoke the setup module to get Ansible to gather our facts, as shown in Example 7-2.

Example 7-2. Waiting for ssh server to come up

- name: Deploy mezzanine
  hosts: web
  gather_facts: False
  # vars & vars_files section not shown here
  tasks:
    - name: wait for ssh server to be running
      local_action: wait_for port=22 host="{{ inventory_hostname }}"
                    search_regex=OpenSSH

    - name: gather facts
      setup:

    # The rest of the tasks go here

Running on One Host at a Time

By default, Ansible runs each task in parallel across all hosts. Sometimes you want to run your task on one host at a time. The canonical example is when upgrading application servers that are behind a load balancer. Typically, you take the application server out of the load balancer, upgrade it, and put it back. But you don’t want to take all of your application servers out of the load balancer, or your service will become unavailable.

You can use the serial clause on a play to tell Ansible to restrict the number of hosts that a play runs on. Example 7-3 shows an example that removes hosts one at a time from an Amazon EC2 elastic load balancer, upgrades the system packages, and then puts them back into the load balancer. (We cover Amazon EC2 in more detail in Chapter 12.)

Example 7-3. Removing hosts from load balancer and upgrading packages

- name: upgrade packages on servers behind load balancer
  hosts: myhosts
  serial: 1
  tasks:
    - name: get the ec2 instance id and elastic load balancer id
      ec2_facts:

    - name: take the host out of the elastic load balancer
      local_action: ec2_elb
      args:
        instance_id: "{{ ansible_ec2_instance_id }}"
        state: absent

    - name: upgrade packages
      apt: update_cache=yes upgrade=yes

    - name: put the host back in the elastic load balancer
      local_action: ec2_elb
      args:
        instance_id: "{{ ansible_ec2_instance_id }}"
        state: present
        ec2_elbs: "{{ item }}"
      with_items: ec2_elbs

In our example, we passed 1 as the argument to the serial clause, telling Ansible to run on only one host at a time. If we had passed 2, then Ansible would have run two hosts at a time.

Normally, when a task fails, Ansible stops running tasks against the host that failed but continues to run against other hosts. In the load-balancing scenario, you might want Ansible to fail the entire play before all hosts have failed a task. Otherwise, you could end up taking each host out of the load balancer in turn, having each one fail its upgrade, and finishing with no hosts left inside your load balancer.

You can use a max_fail_percentage clause along with the serial clause to specify the maximum percentage of failed hosts before Ansible fails the entire play. For example, assume that we specify a maximum fail percentage of 25%, as shown here:

- name: upgrade packages on servers behind load balancer
  hosts: myhosts
  serial: 1
  max_fail_percentage: 25
  tasks:
    # tasks go here

If we had four hosts behind the load balancer, and one of the hosts failed a task, then Ansible would keep executing the play, because this would not exceed the 25% threshold. However, if a second host failed a task, Ansible would fail the entire play, because then 50% of the hosts would have failed a task, exceeding the 25% threshold. If you want Ansible to fail if any of the hosts fail a task, set the max_fail_percentage to 0.

Running Only Once

Sometimes you might want a task to run only once, even if there are multiple hosts. For example, perhaps you have multiple application servers running behind the load balancer, and you want to run a database migration, but you only need to run the migration on one application server.

You can use the run_once clause to tell Ansible to run the command only once.

- name: run the database migrations
  command: /opt/run_migrations
  run_once: true

Using run_once can be particularly useful when using local_action if your playbook involves multiple hosts, and you want to run the local task only once:

- name: run the task locally, only once
  local_action: command /opt/my-custom-command
  run_once: true

Dealing with Badly Behaved Commands: changed_when and failed_when

Recall that in Chapter 6, we avoided invoking the custom createdb manage.py command, shown in Example 7-4, because the call wasn’t idempotent.

Example 7-4. Calling django manage.py createdb

- name: initialize the database
  django_manage:
    command: createdb --noinput --nodata
    app_path: "{{ proj_path }}"
    virtualenv: "{{ venv_path }}"

We got around this problem by invoking several django manage.py commands that were idempotent, and that did the equivalent of createdb. But what if we didn’t have a module that could invoke equivalent commands? The answer is to use changed_when and failed_when clauses to change how Ansible identifies that a task has changed state or failed.

First, we need to understand what the output of this command is the first time it’s run, and what the output is when it’s run the second time.

Recall from Chapter 4 that to capture the output of a failed task, you add a register clause to save the output to a variable, and a failed_when: False clause so that execution doesn’t stop even if the module returns failure. Then add a debug task to print out the variable, and finally a fail clause so that the playbook stops executing, as shown in Example 7-5.

Example 7-5. Viewing the output of a task

- name: initialize the database
  django_manage:
    command: createdb --noinput --nodata
    app_path: "{{ proj_path }}"
    virtualenv: "{{ venv_path }}"
  failed_when: False
  register: result

- debug: var=result

- fail:

The output of the playbook when invoked the second time is shown in Example 7-6.

Example 7-6. Returned values when database has already been created

TASK: [debug var=result] ******************************************************
ok: [default] => {
    "result": {
        "cmd": "python manage.py createdb --noinput --nodata",
        "failed": false,
        "failed_when_result": false,
        "invocation": {
            "module_args": "",
            "module_name": "django_manage"
        },
        "msg": "\n:stderr: CommandError: Database already created, you probably
want the syncdb or migrate command\n",
        "path": "/home/vagrant/mezzanine_example/bin:/usr/local/sbin:
/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games",
        "state": "absent",
        "syspath": [
            "",
            "/usr/lib/python2.7",
            "/usr/lib/python2.7/plat-x86_64-linux-gnu",
            "/usr/lib/python2.7/lib-tk",
            "/usr/lib/python2.7/lib-old",
            "/usr/lib/python2.7/lib-dynload",
            "/usr/local/lib/python2.7/dist-packages",
            "/usr/lib/python2.7/dist-packages"
        ]
    }
}

This is what happens when the task has been run multiple times. To see what happens the first time, delete the database and then have the playbook recreate it. The simplest way to do that is to run an Ansible ad-hoc task that deletes the database:

$ ansible default --sudo --sudo-user postgres -m postgresql_db -a \
  "name=mezzanine_example state=absent"

Now when I run the playbook again, the output is shown in Example 7-7.

Example 7-7. Returned values when invoked the first time

TASK: [debug var=result] ******************************************************
ok: [default] => {
    "result": {
        "app_path": "/home/vagrant/mezzanine_example/project",
        "changed": false,
        "cmd": "python manage.py createdb --noinput --nodata",
        "failed": false,
        "failed_when_result": false,
        "invocation": {
            "module_args": "",
            "module_name": "django_manage"
        },
        "out": "Creating tables ...\nCreating table auth_permission\nCreating
table auth_group_permissions\nCreating table auth_group\nCreating table
auth_user_groups\nCreating table auth_user_user_permissions\nCreating table
auth_user\nCreating table django_content_type\nCreating table
django_redirect\nCreating table django_session\nCreating table
django_site\nCreating table conf_setting\nCreating table
core_sitepermission_sites\nCreating table core_sitepermission\nCreating table
generic_threadedcomment\nCreating table generic_keyword\nCreating table
generic_assignedkeyword\nCreating table generic_rating\nCreating table
blog_blogpost_related_posts\nCreating table blog_blogpost_categories\nCreating
table blog_blogpost\nCreating table blog_blogcategory\nCreating table
forms_form\nCreating table forms_field\nCreating table forms_formentry\nCreating
table forms_fieldentry\nCreating table pages_page\nCreating table
pages_richtextpage\nCreating table pages_link\nCreating table
galleries_gallery\nCreating table galleries_galleryimage\nCreating table
twitter_query\nCreating table twitter_tweet\nCreating table
south_migrationhistory\nCreating table django_admin_log\nCreating table
django_comments\nCreating table django_comment_flags\n\nCreating default site
record: vagrant-ubuntu-trusty-64 ... \n\nInstalled 2 object(s) from 1
fixture(s)\nInstalling custom SQL ...\nInstalling indexes ...\nInstalled 0
object(s) from 0 fixture(s)\n\nFaking initial migrations ...\n\n",
        "pythonpath": null,
        "settings": null,
        "virtualenv": "/home/vagrant/mezzanine_example"
    }
}

Note that changed is set to false even though it did, indeed, change the state of the database. That’s because the django_manage module always returns changed=false when it runs commands that the module doesn’t know about.

We can add a changed_when clause that looks for "Creating tables" in the out return value, as shown in Example 7-8.

Example 7-8. First attempt at adding changed_when

- name: initialize the database
  django_manage:
    command: createdb --noinput --nodata
    app_path: "{{ proj_path }}"
    virtualenv: "{{ venv_path }}"
  register: result
  changed_when: '"Creating tables" in result.out'

The problem with this approach is that, if we look back at Example 7-6, we see that there is no out variable. Instead, there’s a msg variable. This means that if we executed the playbook, we’d get the following (not terribly helpful) error the second time:

TASK: [initialize the database] ********************************************
fatal: [default] => error while evaluating conditional: "Creating tables" in
result.out

Instead, we need to ensure that Ansible evaluates result.out only if that variable is defined. One way is to explicitly check to see if the variable is defined:

changed_when: result.out is defined and "Creating tables" in result.out

Alternatively, we could also provide a default value for result.out if it doesn’t exist by using the Jinja2 default filter:

changed_when: '"Creating tables" not in result.out|default("")'

Or we could simply check whether failed is false, since result.out is present whenever the task didn’t fail:

changed_when: not result.failed and "Creating tables" in result.out

We also need to change the failure behavior, since we don’t want Ansible to consider the task as failed just because createdb has been invoked already:

failed_when: result.failed and "Database already created" not in result.msg

Here the failed check serves as a guard for the existence of the msg variable. The final idempotent task is shown in Example 7-9.

Example 7-9. Idempotent manage.py createdb

- name: initialize the database
  django_manage:
    command: createdb --noinput --nodata
    app_path: "{{ proj_path }}"
    virtualenv: "{{ venv_path }}"
  register: result
  changed_when: not result.failed and "Creating tables" in result.out
  failed_when: result.failed and "Database already created" not in result.msg

Retrieving the IP Address from the Host

In our playbook, several of the hostnames we use are derived from the IP address of the web server.

live_hostname: 192.168.33.10.xip.io
domains:
  - 192.168.33.10.xip.io
  - www.192.168.33.10.xip.io

What if we wanted to use the same scheme but not hardcode the IP addresses into the variables? That way, if the IP address of the web server changes, we wouldn’t have to modify our playbook.

Ansible retrieves the IP address of each host and stores it as a fact. Each network interface has an associated Ansible fact. For example, details about network interface eth0 are stored in the ansible_eth0 fact, an example of which is shown in Example 7-10.

Example 7-10. ansible_eth0 fact

"ansible_eth0": {

"active": true,

"device": "eth0",

"ipv4": {

"address": "10.0.2.15",

"netmask": "255.255.255.0",

"network": "10.0.2.0"

},

"ipv6": [

{

"address": "fe80::a00:27ff:fefe:1e4d",

"prefix": "64",

"scope": "link"

}

],

"macaddress": "08:00:27:fe:1e:4d",

"module": "e1000",

"mtu": 1500,

"promisc": false,

"type": "ether"

}

Our Vagrant box has two interfaces, eth0 and eth1. The eth0 interface is a private interface whose IP address (10.0.2.15) we cannot reach. The eth1 interface is the one that has the IP address we’ve assigned in our Vagrantfile (192.168.33.10).

We can define our variables like this:

live_hostname: "{{ ansible_eth1.ipv4.address }}.xip.io"

domains:

- ansible_eth1.ipv4.address.xip.io

- www.ansible_eth1.ipv4.address.xip.io

Encrypting Sensitive Data with Vault

Our Mezzanine playbook required access to some sensitive information, such as database and administrator passwords. We dealt with this in Chapter 6 by putting all of the sensitive information in a separate file called secrets.yml and making sure that we didn’t check this file into our version control repository.

Ansible provides an alternative solution: instead of keeping the secrets.yml file out of version control, we can commit an encrypted version. That way, even if our version control repository were compromised, the attacker would not have access to the contents of the secrets.yml file unless he also had the password used for the encryption.

The ansible-vault command-line tool allows you to create and edit an encrypted file that ansible-playbook will recognize and decrypt automatically, given the password.

We can encrypt an existing file like this:

$ ansible-vault encrypt secrets.yml

Alternately, we can create a new encrypted secrets.yml file by doing:

$ ansible-vault create secrets.yml

You will be prompted for a password, and then ansible-vault will launch a text editor so that you can populate the file. It launches the editor specified in the $EDITOR environment variable. If that variable is not defined, it defaults to vim.

Example 7-11 shows an example of the contents of a file encrypted using ansible-vault.

Example 7-11. Contents of file encrypted with ansible-vault

$ANSIBLE_VAULT;1.1;AES256
34306434353230663665633539363736353836333936383931316434343030316366653331363262
6630633366383135386266333030393634303664613662350a623837663462393031626233376232
31613735376632333231626661663766626239333738356532393162303863393033303666383530
...
62346633343464313330383832646531623338633438336465323166626335623639383363643438
64636665366538343038383031656461613665663265633066396438333165653436

You can use the vars_files section of a play to reference a file encrypted with ansible-vault the same way you would access a regular file: we would not need to modify Example 6-27 at all if we encrypted the secrets.yml file.

We do need to tell ansible-playbook to prompt us for the password of the encrypted file, or it will simply error out. Do so by using the --ask-vault-pass argument:

$ ansible-playbook mezzanine.yml --ask-vault-pass

You can also store the password in a text file and tell ansible-playbook the location of this password file using the --vault-password-file flag:

$ ansible-playbook mezzanine.yml --vault-password-file ~/password.txt

If the argument to --vault-password-file has the executable bit set, Ansible will execute it and use the contents of standard out as the vault password. This allows you to use a script to provide the password to Ansible.

Table 7-1 shows the available ansible-vault commands.

Table 7-1. ansible-vault commands

Command                           Description
ansible-vault encrypt file.yml    Encrypt the plaintext file.yml file
ansible-vault decrypt file.yml    Decrypt the encrypted file.yml file
ansible-vault view file.yml       Print the contents of the encrypted file.yml file
ansible-vault create file.yml     Create a new encrypted file.yml file
ansible-vault edit file.yml       Edit an encrypted file.yml file
ansible-vault rekey file.yml      Change the password on an encrypted file.yml file

Patterns for Specifying Hosts

So far, the hosts parameter in our plays has specified a single host or group, like this:

hosts: web

Instead of specifying a single host or group, you can specify a pattern. So far, we’ve seen the all pattern, which will run a play against all known hosts:

hosts: all

You can specify a union of two groups with a colon. To specify all dev and staging machines:

hosts: dev:staging

You can specify an intersection using a colon and ampersand. For example, to specify all of the database servers in your staging environment, you might do:

hosts: staging:&database

Table 7-2 shows the patterns that Ansible supports. Note that the regular expression pattern always starts with a tilde.

Table 7-2. Supported patterns

Action                     Example usage
All hosts                  all
All hosts                  *
Union                      dev:staging
Intersection               staging:&database
Exclusion                  dev:!queue
Wildcard                   *.example.com
Range of numbered servers  web[5:10]
Regular expression         ~web\d\.example\.(com|org)

Ansible supports multiple combinations of patterns — for example:

hosts: dev:staging:&database:!queue

Limiting Which Hosts Run

Use the -l hosts or --limit hosts flag to tell Ansible to restrict a playbook run to the specified hosts, as shown in Example 7-12.

Example 7-12. Limiting which hosts run

$ ansible-playbook -l hosts playbook.yml
$ ansible-playbook --limit hosts playbook.yml

You can use the pattern syntax just described to specify arbitrary combinations of hosts. For example:

$ ansible-playbook -l 'staging:&database' playbook.yml

Filters

Filters are a feature of the Jinja2 templating engine. Since Ansible uses Jinja2 for evaluating variables, as well as for templates, you can use filters inside of {{ braces }} in your playbooks, as well as inside of your template files. Using filters resembles using Unix pipes, where a variable is piped through a filter. Jinja2 ships with a set of built-in filters. In addition, Ansible ships with its own filters to augment the Jinja2 filters.

We’ll cover a few sample filters here, but check out the official Jinja2 and Ansible docs for a complete list of the available filters.

The Default Filter

The default filter is a useful one. Here’s an example of this filter in action:

"HOST": "{{ database_host | default('localhost') }}",

If the variable database_host is defined, then the braces will evaluate to the value of that variable. If the variable database_host is not defined, then the braces will evaluate to the string localhost. Some filters take arguments, and some don’t.
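
The filter works anywhere Jinja2 evaluates an expression, not just in templates. Here’s a minimal sketch of using it directly in a task (the database_host variable is assumed to be optionally defined elsewhere):

- name: show which database host we will connect to
  debug: msg="Connecting to {{ database_host | default('localhost') }}"

If database_host is undefined, the task prints "Connecting to localhost".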

Filters for Registered Variables

Let’s say we want to run a task and print out its output, even if the task fails. However, if the task did fail, we want Ansible to fail for that host after printing the output. Example 7-13 shows how we would use the failed filter in the argument to the failed_when clause.

Example 7-13. Using the failed filter

- name: Run myprog
  command: /opt/myprog
  register: result
  ignore_errors: True

- debug: var=result

- debug: msg="Stop running the playbook if myprog failed"
  failed_when: result|failed

# more tasks here

Table 7-3 shows a list of filters you can use on registered variables to check the status.

Table 7-3. Task return value filters

Name     Description
failed   True if a registered value is a task that failed
changed  True if a registered value is a task that changed
success  True if a registered value is a task that succeeded
skipped  True if a registered value is a task that was skipped
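
These filters also come in handy in when clauses. A short sketch, assuming the result variable registered in Example 7-13 and a hypothetical /opt/cleanup command:

- name: clean up after myprog, but only if it succeeded
  command: /opt/cleanup
  when: result|success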

Filters That Apply to File Paths

Table 7-4 shows a number of filters that are useful when a variable contains the path to a file on the control machine’s filesystem.

Table 7-4. File path filters

Name        Description
basename    Base name of file path
dirname     Directory of file path
expanduser  File path with ~ replaced by home directory
realpath    Canonical path of file path; resolves symbolic links

Consider this playbook fragment:

vars:
  homepage: /usr/share/nginx/html/index.html
tasks:
  - name: copy home page
    copy: src=files/index.html dest={{ homepage }}

Note how it references index.html twice, once in the definition of the homepage variable, and a second time to specify the path to the file on the control machine.

The basename filter will let us extract the index.html part of the filename from the full path, allowing us to write the playbook without repeating the filename:[1]

vars:
  homepage: /usr/share/nginx/html/index.html
tasks:
  - name: copy home page
    copy: src=files/{{ homepage | basename }} dest={{ homepage }}
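
The other file path filters work the same way. For instance, a short sketch that uses dirname to make sure the destination directory exists before copying (this task is illustrative, not part of the original playbook):

- name: ensure the home page's directory exists
  file: path={{ homepage | dirname }} state=directory

Here {{ homepage | dirname }} evaluates to /usr/share/nginx/html.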

Writing Your Own Filter

Recall that in our Mezzanine example, we generated the local_settings.py file from a template, where there is a line in the generated file that looks like what is shown in Example 7-14.

Example 7-14. Line from local_settings.py generated by template

ALLOWED_HOSTS = ["www.example.com", "example.com"]

We had a variable named domains that contained a list of the hostnames. We originally used a for loop in our template to generate this line, but a filter would be an even more elegant approach.

There is a built-in Jinja2 filter called join that will join a list of strings with a delimiter such as a comma. Unfortunately, it doesn’t quite give us what we want. If we did this in the template:

ALLOWED_HOSTS = [{{ domains|join(", ") }}]

Then we would end up with the strings unquoted in our file, as shown in Example 7-15.

Example 7-15. Strings incorrectly unquoted

ALLOWED_HOSTS = [www.example.com, example.com]

If we had a Jinja2 filter that quoted the strings in the list, as shown in Example 7-16, then the template would generate the output depicted in Example 7-14.

Example 7-16. Using a filter to quote the strings in the list

ALLOWED_HOSTS = [{{ domains|surround_by_quote|join(", ") }}]

Unfortunately, there’s no existing surround_by_quote filter that does what we want. However, we can write it ourselves. (In fact, Hanfei Sun on Stack Overflow covered this very topic.)

Ansible will look for custom filters in the filter_plugins directory, relative to the directory where your playbooks are.

Example 7-17 shows what the filter implementation looks like.

Example 7-17. filter_plugins/surround_by_quotes.py

# From http://stackoverflow.com/a/15515929/742
def surround_by_quote(a_list):
    return ['"%s"' % an_element for an_element in a_list]


class FilterModule(object):
    def filters(self):
        return {'surround_by_quote': surround_by_quote}

The surround_by_quote function defines the Jinja2 filter. The FilterModule class defines a filters method that returns a dictionary with the name of the filter function and the function itself. The FilterModule class is Ansible-specific code that makes the Jinja2 filter available to Ansible.

You can also place filter plug-ins in the /usr/share/ansible_plugins/filter_plugins directory, or you can specify the directory by setting the ANSIBLE_FILTER_PLUGINS environment variable to the directory where your plug-ins are located. These paths are also documented in Appendix B.

Lookups

In an ideal world, all of your configuration information would be stored as Ansible variables, in the various places that Ansible lets you define variables (e.g., the vars section of your playbooks, files loaded by vars_files, files in the host_vars or group_vars directory that we discussed in Chapter 3).

Alas, the world is a messy place, and sometimes a piece of configuration data you need lives somewhere else. Maybe it’s in a text file or a .csv file, and you don’t want to just copy the data into an Ansible variable file because now you have to maintain two copies of the same data, and you believe in the DRY[2] principle. Or maybe the data isn’t maintained as a file at all; it’s maintained in a key-value storage service such as etcd.[3] Ansible has a feature called lookups that allows you to read in configuration data from various sources and then use that data in your playbooks and templates.

Ansible supports a collection of lookups for retrieving data from different sources, as shown in Table 7-5.

Table 7-5. Lookups

Name      Description
file      Contents of a file
password  Randomly generate a password
pipe      Output of locally executed command
env       Environment variable
template  Jinja2 template after evaluation
csvfile   Entry in a .csv file
dnstxt    DNS TXT record
redis_kv  Redis key lookup
etcd      etcd key lookup

You invoke lookups by calling the lookup function with two arguments. The first is a string with the name of the lookup, and the second is a string that contains one or more arguments to pass to the lookup. For example, we call the file lookup like this:

lookup('file', '/path/to/file.txt')

You can invoke lookups in your playbooks in between {{ braces }}, or you can put them in templates.

In this section, I provided only a brief overview of the available lookups. The Ansible documentation provides more details on how to use them.

NOTE

All Ansible lookup plug-ins execute on the control machine, not the remote host.

file

Let’s say that you have a text file on your control machine that contains a public SSH key that you want to copy to a remote server. Example 7-18 shows how you can use the file lookup to read the contents of a file and pass that as a parameter to a module.

Example 7-18. Using the file lookup

- name: Add my public key as an EC2 key
  ec2_key: name=mykey key_material="{{ lookup('file', \
           '/Users/lorinhochstein/.ssh/id_rsa.pub') }}"

You can invoke lookups in templates as well. If we wanted to use the same technique to create an authorized_keys file that contained the contents of a public key file, we could create a Jinja2 template that invokes the lookup, as shown in Example 7-19, and then call the template module in our playbook, as shown in Example 7-20.

Example 7-19. authorized_keys.j2

{{ lookup('file', '/Users/lorinhochstein/.ssh/id_rsa.pub') }}

Example 7-20. Task to generate authorized_keys

- name: copy authorized_host file
  template: src=authorized_keys.j2 dest=/home/deploy/.ssh/authorized_keys

pipe

The pipe lookup invokes an external program on the control machine and evaluates to the program’s output on standard out.

For example, if our playbooks are version controlled using git, and we wanted to get the SHA-1 value of the most recent git commit,[4] we could use the pipe lookup:

- name: get SHA of most recent commit
  debug: msg="{{ lookup('pipe', 'git rev-parse HEAD') }}"

The output would look something like this:

TASK: [get SHA of most recent commit] *****************************************
ok: [myserver] => {
    "msg": "e7748af0f040d58d61de1917980a210df419eae9"
}

env

The env lookup retrieves the value of an environment variable set on the control machine. For example, we could use the lookup like this:

- name: get the current shell
  debug: msg="{{ lookup('env', 'SHELL') }}"

Since I use Zsh as my shell, the output looks like this when I run it:

TASK: [get the current shell] *************************************************
ok: [myserver] => {
    "msg": "/bin/zsh"
}

password

The password lookup evaluates to a random password, and it will also write the password to a file specified in the argument. For example, if we wanted to create a Postgres user named deploy with a random password and write that password to deploy-password.txt on the control machine, we could do:

- name: create deploy postgres user
  postgresql_user:
    name: deploy
    password: "{{ lookup('password', 'deploy-password.txt') }}"

template

The template lookup lets you specify a Jinja2 template file, and then returns the result of evaluating the template. If we had a template that looked like Example 7-21:

Example 7-21. message.j2

This host runs {{ ansible_distribution }}

And we defined a task like this:

- name: output message from template
  debug: msg="{{ lookup('template', 'message.j2') }}"

Then we’d see output that looks like this:

TASK: [output message from template] ******************************************
ok: [myserver] => {
    "msg": "This host runs Ubuntu\n"
}

csvfile

The csvfile lookup reads an entry from a .csv file. Assume we had a .csv file that looked like Example 7-22.

Example 7-22. users.csv

username,email
lorin,lorin@ansiblebook.com
john,john@example.com
sue,sue@example.org

If we wanted to extract Sue’s email address using the csvfile lookup plug-in, we would invoke the lookup plug-in like this:

lookup('csvfile', 'sue file=users.csv delimiter=, col=1')

The csvfile lookup is a good example of a lookup that takes multiple arguments. Here, there are four arguments being passed to the plug-in:

- sue
- file=users.csv
- delimiter=,
- col=1

You don’t specify a name for the first argument to a lookup plug-in, but you do specify names for the additional arguments. In the case of csvfile, the first argument is an entry that must appear exactly once in column 0 (the first column, 0-indexed) of the table.

The other arguments specify the name of the .csv file, the delimiter, and which column should be returned. In our example, we want to look in the file named users.csv and locate where the fields are delimited by commas; look up the row where the value in the first column is sue; and return the value in the second column (column 1, indexed by 0). This will evaluate to sue@example.org.
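
As with the other lookups, we can drop this invocation into a task to see the result; a quick sketch using debug:

- name: look up sue's email address
  debug: msg="{{ lookup('csvfile', 'sue file=users.csv delimiter=, col=1') }}"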

If the user name we wanted to look up was stored in a variable named username, we could construct the argument string by using the + sign to concatenate the username string with the rest of the argument string:

lookup('csvfile', username + ' file=users.csv delimiter=, col=1')

dnstxt

NOTE

The dnstxt module requires that you install the dnspython Python package on the control machine.

If you’re reading this book, you’re probably aware of what the domain name system (DNS) does, but just in case you aren’t, DNS is the service that translates hostnames such as ansiblebook.com to IP addresses such as 64.99.80.30.

DNS works by associating one or more records with a hostname. The most commonly used types of DNS records are A records and CNAME records, which associate a hostname with an IP address (A record) or specify that a hostname is an alias for another hostname (CNAME record).

The DNS protocol supports another type of record that you can associate with a hostname, called a TXT record. A TXT record is just an arbitrary string that you can attach to a hostname. Once you’ve associated a TXT record with a hostname, anybody can retrieve the text using a DNS client.

For example, I own the ansiblebook.com domain, so I can create TXT records associated with any hostnames in that domain.[5] I associated a TXT record with the ansiblebook.com hostname that contains the ISBN number for this book. You can look up the TXT record using the dig command-line tool, as shown in Example 7-23.

Example 7-23. Using the dig tool to look up a TXT record

$ dig +short ansiblebook.com TXT
"isbn=978-1491915325"

The dnstxt lookup queries the DNS server for the TXT record associated with the host. If we created a task like this in a playbook:

- name: look up TXT record
  debug: msg="{{ lookup('dnstxt', 'ansiblebook.com') }}"

The output would look like this:

TASK: [look up TXT record] ****************************************************
ok: [myserver] => {
    "msg": "isbn=978-1491915325"
}

If there are multiple TXT records associated with a host, then the module will concatenate them together, and it might do this in a different order each time it is called. For example, if there were a second TXT record on ansiblebook.com with the text:

author=lorin

Then the dnstxt lookup will randomly return one of the two:

- isbn=978-1491915325author=lorin
- author=lorinisbn=978-1491915325

redis_kv

NOTE

The redis_kv module requires that you install the redis Python package on the control machine.

Redis is a popular key-value store, commonly used as a cache, as well as a data store for job queue services such as Sidekiq. You can use the redis_kv lookup to retrieve the value of a key. The key must be a string, as the module does the equivalent of calling the Redis GET command.

For example, let’s say that we had a Redis server running on our control machine, and we had set the key weather to the value sunny, by doing something like this:

$ redis-cli SET weather sunny

If we defined a task in our playbook that invoked the Redis lookup:

- name: look up value in Redis
  debug: msg="{{ lookup('redis_kv', 'redis://localhost:6379,weather') }}"

The output would look like this:

TASK: [look up value in Redis] ************************************************
ok: [myserver] => {
    "msg": "sunny"
}

The module will default to redis://localhost:6379 if the URL isn’t specified, so we could have invoked the module like this instead (note the comma before the key):

lookup('redis_kv', ',weather')

etcd

Etcd is a distributed key-value store, commonly used for keeping configuration data and for implementing service discovery. You can use the etcd lookup to retrieve the value of a key.

For example, let’s say that we had an etcd server running on our control machine, and we had set the key weather to the value cloudy by doing something like this:

$ curl -L http://127.0.0.1:4001/v2/keys/weather -XPUT -d value=cloudy

If we defined a task in our playbook that invoked the etcd plug-in:

- name: look up value in etcd
  debug: msg="{{ lookup('etcd', 'weather') }}"

The output would look like this:

TASK: [look up value in etcd] *************************************************
ok: [localhost] => {
    "msg": "cloudy"
}

By default, the etcd lookup will look for the etcd server at http://127.0.0.1:4001, but you can change this by setting the ANSIBLE_ETCD_URL environment variable before invoking ansible-playbook.

Writing Your Own Lookup Plug-in

You can also write your own lookup plug-in if you need functionality not provided by the existing plug-ins. Writing a custom lookup plug-in is out of scope for this book, but if you’re really interested, I suggest that you take a look at the source code for the lookup plug-ins that ship with Ansible.

Once you’ve written your lookup plug-in, place it in one of the following directories:

- The lookup_plugins directory next to your playbook
- /usr/share/ansible_plugins/lookup_plugins
- The directory specified in your ANSIBLE_LOOKUP_PLUGINS environment variable

More Complicated Loops

Up until this point, whenever we’ve written a task that iterates over a list of items, we’ve used the with_items clause to specify a list of items. Although this is the most common way to do loops, Ansible supports other mechanisms for doing iteration. Table 7-6 provides a summary of the constructs that are available.

Table 7-6. Looping constructs

Name                      Input                 Looping strategy
with_items                list                  Loop over list elements
with_lines                command to execute    Loop over lines in command output
with_fileglob             glob                  Loop over filenames
with_first_found          list of paths         First file in input that exists
with_dict                 dictionary            Loop over dictionary elements
with_flattened            list of lists         Loop over flattened list
with_indexed_items        list                  Single iteration
with_nested               list                  Nested loop
with_random_choice        list                  Single iteration
with_sequence             sequence of integers  Loop over sequence
with_subelements          list of dictionaries  Nested loop
with_together             list of lists         Loop over zipped list
with_inventory_hostnames  host pattern          Loop over matching hosts

The official documentation covers these quite thoroughly, so I’ll just show examples from a few of them to give you a sense of how they work.
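
As a quick taste of one of the constructs not covered below, here is a short sketch that uses with_sequence to generate a range of integers (the /tmp/dir path is just for illustration):

- name: create several numbered directories
  file: path=/tmp/dir{{ item }} state=directory
  with_sequence: start=1 end=3

This would create /tmp/dir1, /tmp/dir2, and /tmp/dir3.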

with_lines

The with_lines looping construct lets you run an arbitrary command on your control machine and iterate over the output, one line at a time.

Imagine you have a file that contains a list of names, and you want to send a Slack message for each name, something like this:

Leslie Lamport
Silvio Micali
Shafi Goldwasser
Judea Pearl

Example 7-24 shows how you can use with_lines to read a file and iterate over its contents line by line.

Example 7-24. Using with_lines as a loop

- name: Send out a slack message
  slack:
    domain: example.slack.com
    token: "{{ slack_token }}"
    msg: "{{ item }} was in the list"
  with_lines:
    - cat files/turing.txt

with_fileglob

The with_fileglob construct is useful for iterating over a set of files on the control machine.

Example 7-25 shows how to iterate over files that end in .pub in the /var/keys directory, as well as a keys directory next to your playbook. It then uses the file lookup plug-in to extract the contents of the file, which are passed to the authorized_key module.

Example 7-25. Using with_fileglob to add keys

- name: add public keys to account
  authorized_key: user=deploy key="{{ lookup('file', item) }}"
  with_fileglob:
    - /var/keys/*.pub
    - keys/*.pub

with_dict

The with_dict construct lets you iterate over a dictionary instead of a list. When you use this looping construct, the item loop variable is a dictionary with two keys:

key
    One of the keys in the dictionary
value
    The value in the dictionary that corresponds to key

For example, if your host has an eth0 interface, then there will be an Ansible fact named ansible_eth0, with a key named ipv4 that contains a dictionary that looks something like this:

{
    "address": "10.0.2.15",
    "netmask": "255.255.255.0",
    "network": "10.0.2.0"
}

We could iterate over this dictionary and print out the entries one at a time by doing:

- name: iterate over ansible_eth0
  debug: msg={{ item.key }}={{ item.value }}
  with_dict: ansible_eth0.ipv4

The output would look like this:

TASK: [iterate over ansible_eth0] *********************************************
ok: [myserver] => (item={'key': u'netmask', 'value': u'255.255.255.0'}) => {
    "item": {
        "key": "netmask",
        "value": "255.255.255.0"
    },
    "msg": "netmask=255.255.255.0"
}
ok: [myserver] => (item={'key': u'network', 'value': u'10.0.2.0'}) => {
    "item": {
        "key": "network",
        "value": "10.0.2.0"
    },
    "msg": "network=10.0.2.0"
}
ok: [myserver] => (item={'key': u'address', 'value': u'10.0.2.15'}) => {
    "item": {
        "key": "address",
        "value": "10.0.2.15"
    },
    "msg": "address=10.0.2.15"
}

Looping Constructs as Lookup Plug-ins

Ansible implements looping constructs as lookup plug-ins. You just slap with_ onto the beginning of a lookup plug-in’s name to use it in its loop form. For example, we can rewrite Example 7-18 using the with_file form, as shown in Example 7-26.

Example 7-26. Using the file lookup as a loop

- name: Add my public key as an EC2 key
  ec2_key: name=mykey key_material="{{ item }}"
  with_file: /Users/lorinhochstein/.ssh/id_rsa.pub

Typically, you’d only use a lookup plug-in as a looping construct if it returns a list, which is how I was able to separate out the plug-ins into Table 7-5 (return strings) and Table 7-6 (return lists).

We covered a lot of ground in this chapter. In the next one, we’ll discuss roles, a convenient mechanism for organizing your playbooks.

[1] Thanks to John Jarvis for this tip.

[2] Don’t Repeat Yourself, a term popularized by The Pragmatic Programmer: From Journeyman to Master, which is a fantastic book.

[3] etcd is a distributed key-value store, and is maintained by the CoreOS project.

[4] If this sounds like gibberish, don’t worry about it; it’s just an example of running a command.

[5] DNS service providers typically have web interfaces to let you perform DNS-related tasks such as creating TXT records.