
Chapter 12. Amazon EC2

Ansible has a number of features that make working with infrastructure-as-a-service (IaaS) clouds much easier. This chapter focuses on Amazon EC2 because it’s the most popular IaaS cloud and the one I know best. However, many of the concepts should transfer to other clouds supported by Ansible.

The two ways Ansible supports EC2 are:

§ A dynamic inventory plug-in for automatically populating your Ansible inventory instead of manually specifying your servers

§ Modules that perform actions on EC2 such as creating new servers

In this chapter, we’ll discuss both the EC2 dynamic inventory plug-in and the EC2 modules.

WHAT IS AN IAAS CLOUD?

You’ve probably heard so many references to “the cloud” in the technical press that you’re suffering from buzzword overload.1 I’ll be precise about what I mean by an infrastructure-as-a-service (IaaS) cloud.

To start, here’s a typical user interaction with an IaaS cloud:

User: I want five new servers, each one with two CPUs, 4 GB of memory, and 100 GB of storage, running Ubuntu 14.04.

Service: Request received. Your request number is 432789.

User: What’s the current status of request 432789?

Service: Your servers are ready to go, at IP addresses 203.0.113.5, 203.0.113.13, 203.0.113.49, 203.0.113.124, 203.0.113.209.

User: I’m done with the servers associated with request 432789.

Service: Request received, the servers will be terminated.

An IaaS cloud is a service that enables a user to provision (create) new servers. All IaaS clouds are self-serve, meaning that the user interacts directly with a software service rather than, say, filing a ticket with the IT department. Most IaaS clouds offer three different types of interfaces to allow users to interact with the system:

§ Web interface

§ Command-line interface

§ REST API

In the case of EC2, the web interface is called the AWS Management Console, and the command-line interface is called (unimaginatively) the AWS Command-Line Interface. The REST API is documented at Amazon.

IaaS clouds typically use virtual machines to implement the servers, although you can build an IaaS cloud using bare metal servers (i.e., users run directly on the hardware rather than inside of a virtual machine) or containers. The SoftLayer and Rackspace clouds have bare metal offerings, and Amazon Elastic Compute Cloud, Google Compute Engine, and Joyent clouds offer containers.

Most IaaS clouds let you do more than just start up and tear down servers. In particular, they typically let you provision storage so that you can attach and detach disks to your servers. This type of storage is commonly referred to as block storage. They also provide networking features, so you can define network topologies that describe how your servers are interconnected, and you can define firewall rules that restrict network access to your servers.

Amazon EC2 is the most popular public IaaS cloud provider, but there are a number of other IaaS clouds out there. In addition to EC2, Ansible ships with modules for Microsoft Azure, Digital Ocean, Google Compute Engine, Linode, and Rackspace, as well as clouds built using OpenStack and VMWare vSphere.

Terminology

EC2 exposes many different concepts. I’ll explain these concepts as they come up in this chapter, but there are a few terms I’d like to cover up front.

Instance

EC2’s documentation uses the term instance to refer to a virtual machine, and I use that terminology in this chapter. Keep in mind that an EC2 instance is a host from Ansible’s perspective.

EC2 documentation interchangeably uses the terms creating instances, launching instances, and running instances to describe the process of bringing up a new instance. However, starting instances means something different — starting up an instance that had previously been put in the stopped state.

Amazon Machine Image

An Amazon machine image (AMI) is a virtual machine image, which contains a filesystem with an installed operating system on it. When you create an instance on EC2, you choose which operating system you want your instance to run by specifying the AMI that EC2 will use to create the instance.

Each AMI has an associated identifier string, called an AMI ID, which starts with “ami-” and then contains eight hexadecimal characters; for example, ami-12345abc.

Tags

EC2 lets you annotate your instances2 with custom metadata that it calls tags. Tags are just key-value pairs of strings. For example, we could annotate an instance with the following tags:

Name=Staging database

env=staging

type=database

If you’ve ever given your EC2 instance a name in the AWS Management Console, you’ve used tags without even knowing it. EC2 implements instance names as tags where the key is Name and the value is whatever name you gave the instance. Other than that, there’s nothing special about the Name tag, and you can configure the management console to show the value of other tags in addition to the Name tag.

Tags don’t have to be unique, so you can have 100 instances that all have the same tag. Because Ansible’s EC2 modules make heavy use of tags, they will come up several times in this chapter.

Specifying Credentials

When you make requests against Amazon EC2, you need to specify credentials. If you’ve used the Amazon web console, you’ve used your username and password to log in. However, all of the bits of Ansible that interact with EC2 talk to the EC2 API. The API does not use username and password for credentials. Instead, it uses two strings: an access key ID and a secret access key.

These strings typically look like this:

§ Sample EC2 access key ID: AKIAIOSFODNN7EXAMPLE

§ Sample EC2 secret access key: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

When you are calling EC2-related modules, you can pass these strings as module arguments. For the dynamic inventory plug-in, you can specify the credentials in the ec2.ini file (discussed in the next section). However, both the EC2 modules and the dynamic inventory plug-in also allow you to specify these credentials as environment variables. You can also use something called identity and access management (IAM) roles if your control machine is itself an Amazon EC2 instance, which is covered in Appendix C.
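For example, here is a minimal sketch of passing credentials directly as module arguments (most EC2-related modules accept aws_access_key and aws_secret_key parameters; the values shown are the sample credentials from above, and the rest of the parameters are placeholders):

- name: start an instance, passing credentials explicitly
  ec2:
    aws_access_key: AKIAIOSFODNN7EXAMPLE       # sample value; use your own
    aws_secret_key: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
    region: us-east-1
    image: ami-8caa1ce4
    instance_type: m3.medium
    key_name: mykey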

Environment Variables

Although Ansible does allow you to pass credentials explicitly as arguments to modules, it also supports setting EC2 credentials as environment variables. Example 12-1 shows how you would set these environment variables.

Example 12-1. Setting EC2 environment variables

# Don't forget to replace these values with your actual credentials!
export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
export AWS_REGION=us-east-1

NOTE

Not all of Ansible’s EC2 modules respect the AWS_REGION environment variable, so I recommend that you always explicitly pass the EC2 region as an argument when invoking your modules. All of the examples in this chapter explicitly pass the region as an argument.

I recommend using environment variables because it allows you to use EC2-related modules and inventory plug-ins without putting your credentials in any of your Ansible-related files. I put these in a dotfile that runs when my session starts. I use Zsh, so in my case that file is ~/.zshrc. If you’re running Bash, you might want to put it in your ~/.profile file.3 If you’re using a shell other than Bash or Zsh, you’re probably knowledgeable enough to know which dotfile to modify to set these environment variables.

Once you have set these credentials in your environment variables, you can invoke the Ansible EC2 modules on your control machine, as well as use the dynamic inventory.

Configuration Files

An alternative to using environment variables is to place your EC2 credentials in a configuration file. As discussed in the next section, Ansible uses the Python Boto library, so it supports Boto’s conventions for maintaining credentials in a Boto configuration file. I don’t cover the format here; for more information, check out the Boto config documentation.

Prerequisite: Boto Python Library

All of the Ansible EC2 functionality requires you to install the Python Boto library as a Python system package on the control machine. To do so:4

$ pip install boto

If you already have instances running on EC2, you can verify that Boto is installed properly and that your credentials are correct by interacting with the Python command line, as shown in Example 12-2.

Example 12-2. Testing out Boto and credentials

$ python
Python 2.7.6 (default, Sep 9 2014, 15:04:36)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.39)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import boto.ec2
>>> conn = boto.ec2.connect_to_region("us-east-1")
>>> statuses = conn.get_all_instance_status()
>>> statuses
[]

Dynamic Inventory

If your servers live on EC2, you don’t want to keep a separate copy of these servers in an Ansible inventory file, because that file is going to go stale as you spin up new servers and tear down old ones.

It’s much simpler to track your EC2 servers by taking advantage of Ansible’s support for dynamic inventory to pull information about hosts directly from EC2. Ansible ships with a dynamic inventory script for EC2, although I recommend you just grab the latest one from the Ansible GitHub repository.5

You need two files:

ec2.py

The actual inventory script

ec2.ini

The configuration file for the inventory script

Previously, we had a playbooks/hosts file, which served as our inventory. Now, we’re going to use a playbooks/inventory directory. We’ll place ec2.py and ec2.ini into that directory, and set ec2.py as executable. Example 12-3 shows one way to do that.

Example 12-3. Installing the EC2 dynamic inventory script

$ cd playbooks/inventory
$ wget https://raw.githubusercontent.com/ansible/ansible/devel/plugins/inventory/ec2.py
$ wget https://raw.githubusercontent.com/ansible/ansible/devel/plugins/inventory/ec2.ini
$ chmod +x ec2.py

Caution

If you are running Ansible on a Linux distribution that uses Python 3.x as the default Python (e.g., Arch Linux), then ec2.py will not work unmodified, because it is a Python 2.x script.

Make sure your system has Python 2.x installed and then modify the first line of ec2.py from this:

#!/usr/bin/env python

to this:

#!/usr/bin/env python2

If you’ve set up your environment variables as described in the previous section, you should be able to confirm that the script is working by running:

$ ./ec2.py --list

The script should output information about your various EC2 instances. The structure should look something like this:

{
  "_meta": {
    "hostvars": {
      "ec2-203-0-113-75.compute-1.amazonaws.com": {
        "ec2_id": "i-12345678",
        "ec2_instance_type": "c3.large",
        ...
      }
    }
  },
  "ec2": [
    "ec2-203-0-113-75.compute-1.amazonaws.com",
    ...
  ],
  "us-east-1": [
    "ec2-203-0-113-75.compute-1.amazonaws.com",
    ...
  ],
  "us-east-1a": [
    "ec2-203-0-113-75.compute-1.amazonaws.com",
    ...
  ],
  "i-12345678": [
    "ec2-203-0-113-75.compute-1.amazonaws.com"
  ],
  "key_mysshkeyname": [
    "ec2-203-0-113-75.compute-1.amazonaws.com",
    ...
  ],
  "security_group_ssh": [
    "ec2-203-0-113-75.compute-1.amazonaws.com",
    ...
  ],
  "tag_Name_my_cool_server": [
    "ec2-203-0-113-75.compute-1.amazonaws.com",
    ...
  ],
  "type_c3_large": [
    "ec2-203-0-113-75.compute-1.amazonaws.com",
    ...
  ]
}

Inventory Caching

When Ansible executes the EC2 dynamic inventory script, the script has to make requests against one or more EC2 endpoints to retrieve this information. Because this can take time, the script will cache the information the first time it is invoked by writing to the following files:

§ $HOME/.ansible/tmp/ansible-ec2.cache

§ $HOME/.ansible/tmp/ansible-ec2.index

On subsequent calls, the dynamic inventory script will use the cached information until the cache expires.

You can modify the behavior by editing the cache_max_age configuration option in the ec2.ini configuration file. It defaults to 300 seconds (5 minutes). If you don’t want caching at all, you can set it to 0:

[ec2]
...
cache_max_age = 0

You can also force the inventory script to refresh the cache by invoking it with the --refresh-cache flag:

$ ./ec2.py --refresh-cache

WARNING

If you create or destroy instances, the EC2 dynamic inventory script will not reflect these changes unless the cache expires, or you manually refresh the cache.

Other Configuration Options

The ec2.ini file includes a number of configuration options that control the behavior of the dynamic inventory script. Because the file itself is well-documented with comments, I won’t cover those options in detail here.

Auto-Generated Groups

The EC2 dynamic inventory script will create the following groups:

Type                 Example          Ansible group name
Instance             i-123456         i-123456
Instance type        c1.medium        type_c1_medium
Security group       ssh              security_group_ssh
Keypair              foo              key_foo
Region               us-east-1        us-east-1
Tag                  env=staging      tag_env_staging
Availability zone    us-east-1b       us-east-1b
VPC                  vpc-14dd1b70     vpc_id_vpc-14dd1b70
All EC2 instances    N/A              ec2

Table 12-1. Generated EC2 groups

The only legal characters in a group name are alphanumeric characters, hyphens, and underscores. The dynamic inventory script converts any other character into an underscore.

For example, if you had an instance with a tag:

Name=My cool server!

Ansible would generate the group name tag_Name_my_cool_server_.

Defining Dynamic Groups with Tags

Recall that the dynamic inventory script automatically creates groups based on things such as instance type, security group, keypair, and tags. EC2 tags are the most convenient way of creating Ansible groups because you can define them however you like.

For example, you could tag all of your webservers with:

type=web

Ansible will automatically create a group called tag_type_web that contains all of the servers tagged with a name of type and a value of web.

EC2 allows you to apply multiple tags to an instance. For example, if you have separate staging and production environments, you can tag your production web servers like this:

env=production

type=web

Now you can refer to production machines as tag_env_production and your webservers as tag_type_web. If you want to refer to your production webservers, use the Ansible intersection syntax, like this:

hosts: tag_env_production:&tag_type_web

Applying Tags to Existing Resources

Ideally, you’d tag your EC2 instances as soon as you create them. However, if you’re using Ansible to manage existing EC2 instances, you will likely already have a number of instances running that you need to tag. Ansible has an ec2_tag module that will allow you to add tags to your instances.

For example, if you wanted to tag an instance with env=production and type=web, you could do it in a simple playbook as shown in Example 12-4.

Example 12-4. Adding EC2 tags to instances

- name: Add tags to existing instances
  hosts: localhost
  vars:
    web_production:
      - i-123456
      - i-234567
    web_staging:
      - i-ABCDEF
      - i-333333
  tasks:
    - name: Tag production webservers
      ec2_tag: resource={{ item }} region=us-west-1
      args:
        tags: { type: web, env: production }
      with_items: web_production

    - name: Tag staging webservers
      ec2_tag: resource={{ item }} region=us-west-1
      args:
        tags: { type: web, env: staging }
      with_items: web_staging

This example uses the inline syntax for YAML dictionaries when specifying the tags ({ type: web, env: production }) in order to make the playbook more compact, but the regular YAML dictionary syntax would have worked as well:

tags:
  type: web
  env: production

Nicer Group Names

Personally, I don’t like the name tag_type_web for a group. I’d prefer to just call it web.

To do this, we need to add a new file to the playbooks/inventory directory that will have information about groups. This is just a traditional Ansible inventory file, which we’ll call playbooks/inventory/hosts (see Example 12-5).

Example 12-5. playbooks/inventory/hosts

[web:children]
tag_type_web

[tag_type_web]

Once you do this, you can refer to web as a group in your Ansible plays.
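For example, a play can now target the friendlier group name directly (a minimal sketch):

- name: do something to the webservers
  hosts: web
  tasks:
    - name: check that the webservers are reachable
      ping: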

WARNING

You must define the empty tag_type_web group in your static inventory file, even though the dynamic inventory script also defines this group. If you forget it, Ansible will fail with the error:

ERROR: child group is not defined: (tag_type_web)

EC2 Virtual Private Cloud (VPC) and EC2 Classic

When Amazon first launched EC2 back in 2006, all of the EC2 instances were effectively connected to the same flat network.6 Every EC2 instance had a private IP address and a public IP address.

In 2009, Amazon introduced a new feature called Virtual Private Cloud (VPC). VPC allows users to control how their instances are networked together, and whether they will be publicly accessible from the Internet or isolated. Amazon uses the term “VPC” to describe the virtual networks that users can create inside of EC2. Amazon uses the term “EC2-VPC” to refer to instances that are launched inside of VPCs, and “EC2-Classic” to refer to instances that are not launched inside of VPCs.

Amazon actively encourages users to use EC2-VPC. For example, some instance types, such as t2.micro, are only available on EC2-VPC. Depending on when your AWS account was created and which EC2 regions you’ve previously launched instances in, you might not have access to EC2-Classic at all. Table 12-2 describes which accounts have access to EC2-Classic.7

My account was created                          Access to EC2-Classic
Before March 18, 2013                           Yes, but only in regions you’ve used before
Between March 18, 2013, and December 4, 2013    Maybe, but only in regions you’ve used before
After December 4, 2013                          No

Table 12-2. Do I have access to EC2-Classic?

The main difference between having support for EC2-Classic versus only having access to EC2-VPC is what happens when you create a new EC2 instance and do not explicitly associate a VPC ID with that instance. If your account has EC2-Classic enabled, then the new instance is not associated with a VPC. If your account does not have EC2-Classic enabled, then the new instance is associated with the default VPC.

Here’s one reason why you should care about the distinction: in EC2-Classic, all instances are permitted to make outbound network connections to any host on the Internet. In EC2-VPC, instances are not permitted to make outbound network connections by default. If a VPC instance needs to make outbound connections, it must be associated with a security group that permits outbound connections.

For the purposes of this chapter, I’m going to assume EC2-VPC only, so I will associate instances with a security group that enables outbound connections.

Configuring ansible.cfg for Use with ec2

When I’m using Ansible to configure EC2 instances, I add the following lines in my ansible.cfg:

[defaults]
remote_user = ubuntu
host_key_checking = False

I always use Ubuntu images, and on those images you are supposed to SSH as the ubuntu user. I also turn off host key checking, since I don’t know in advance what the host keys are for new instances.8

Launching New Instances

The ec2 module allows you to launch new instances on EC2. It’s one of the most complex Ansible modules because it supports so many arguments.

Example 12-6 shows a simple playbook for launching an Ubuntu 14.04 EC2 instance.

Example 12-6. Simple playbook for creating an EC2 instance

- name: Create an ubuntu instance on Amazon EC2
  hosts: localhost
  tasks:
    - name: start the instance
      ec2:
        image: ami-8caa1ce4
        region: us-east-1
        instance_type: m3.medium
        key_name: mykey
        group: [web, ssh, outbound]
        instance_tags: { Name: ansiblebook, type: web, env: production }

Let’s go over what these parameters mean.

The image parameter refers to the Amazon Machine Image (AMI) ID, which you must always specify. As described earlier in the chapter, an image is basically a filesystem that contains an installed operating system. The example just used, ami-8caa1ce4, refers to an image that has the 64-bit version of Ubuntu 14.04 installed on it.

The region parameter specifies the geographical region where the instance will be launched.9

The instance_type parameter describes the number of CPU cores and the amount of memory and storage your instance will have. EC2 doesn’t let you choose arbitrary combinations of cores, memory, and storage. Instead, Amazon defines a collection of instance types.10 The preceding example uses the m3.medium instance type. This is a 64-bit instance type with 1 core, 3.75 GB of RAM, and 4 GB of SSD-based storage.

NOTE

Not all images are compatible with all instance types. I haven’t actually tested whether ami-8caa1ce4 works with m3.medium. Caveat lector!

The key_name parameter refers to an SSH key pair. Amazon uses SSH key pairs to provide users with access to their servers. Before you start your first server, you must either create a new SSH key pair, or upload the public key of a key pair that you have previously created. Regardless of whether you create a new key pair or you upload an existing one, you must give a name to your SSH key pair.

The group parameter refers to a list of security groups associated with an instance. These groups determine what kinds of inbound and outbound network connections are permitted.

The instance_tags parameter associates metadata with the instance in the form of EC2 tags, which are key-value pairs. In the preceding example, we set the following tags:

Name=ansiblebook

type=web

env=production

EC2 Key Pairs

In Example 12-6, we assumed that Amazon already knew about an SSH key pair named mykey. Let’s see how we can use Ansible to create new key pairs.

Creating a New Key

When you create a new key pair, Amazon generates a private key and the corresponding public key; then it sends you the private key. Amazon does not keep a copy of the private key, so you’ve got to make sure that you save it after you generate it. Here’s how you would create a new key with Ansible:

Example 12-7. Create a new SSH key pair

- name: create a new keypair
  hosts: localhost
  tasks:
    - name: create mykey
      ec2_key: name=mykey region=us-west-1
      register: keypair

    - name: write the key to a file
      copy:
        dest: files/mykey.pem
        content: "{{ keypair.key.private_key }}"
        mode: 0600
      when: keypair.changed

In Example 12-7, we invoke the ec2_key module to create a new key pair. We then use the copy module with the content parameter to save the SSH private key to a file.

If the module creates a new key pair, then the variable keypair that is registered will contain a value that looks like this:

"keypair": {

"changed": true,

"invocation": {

"module_args": "name=mykey",

"module_name": "ec2_key"

},

"key": {

"fingerprint": "c5:33:74:84:63:2b:01:29:6f:14:a6:1c:7b:27:65:69:61:f0:e8:b9",

"name": "mykey",

"private_key": "-----BEGIN RSA PRIVATE KEY-----\nMIIEowIBAAKCAQEAjAJpvhY3QGKh

...

0PkCRPl8ZHKtShKESIsG3WC\n-----END RSA PRIVATE KEY-----"

}

}

If the key pair already existed, then the variable keypair that is registered will contain a value that looks like this:

"keypair": {

"changed": false,

"invocation": {

"module_args": "name=mykey",

"module_name": "ec2_key"

},

"key": {

"fingerprint": "c5:33:74:84:63:2b:01:29:6f:14:a6:1c:7b:27:65:69:61:f0:e8:b9",

"name": "mykey"

}

}

Because the private_key value will not be present if the key already exists, we need to add a when clause to the copy invocation to make sure that we only write a private key file to disk if there is actually a private key file to write.

We add the line:

when: keypair.changed

to only write the file to disk if there was a change of state when ec2_key was invoked (i.e., that a new key was created). Another way we could have done it would be to check for the existence of the private_key value, like this:

- name: write the key to a file
  copy:
    dest: files/mykey.pem
    content: "{{ keypair.key.private_key }}"
    mode: 0600
  when: keypair.key.private_key is defined

We use the Jinja2 defined test11 to check if private_key is present.

Upload an Existing Key

If you already have an SSH public key, you can upload that to Amazon and associate it with a keypair:

- name: create a keypair based on my ssh key
  hosts: localhost
  tasks:
    - name: upload public key
      ec2_key: name=mykey key_material="{{ item }}"
      with_file: ~/.ssh/id_rsa.pub

Security Groups

Example 12-6 assumed that the web, SSH, and outbound security groups already existed. We can use the ec2_group module to ensure that these security groups have been created before we use them.

Security groups are similar to firewall rules: you specify rules about who is allowed to connect to the machine and how.

In Example 12-8, we specify the web group as allowing anybody on the Internet to connect to ports 80 and 443. For the SSH group, we allow anybody on the Internet to connect on port 22. For the outbound group, we allow outbound connections to anywhere on the Internet. We need outbound connections enabled in order to download packages from the Internet.

Example 12-8. Security groups

- name: web security group
  ec2_group:
    name: web
    description: allow http and https access
    rules:
      - proto: tcp
        from_port: 80
        to_port: 80
        cidr_ip: 0.0.0.0/0
      - proto: tcp
        from_port: 443
        to_port: 443
        cidr_ip: 0.0.0.0/0

- name: ssh security group
  ec2_group:
    name: ssh
    description: allow ssh access
    rules:
      - proto: tcp
        from_port: 22
        to_port: 22
        cidr_ip: 0.0.0.0/0

- name: outbound group
  ec2_group:
    name: outbound
    description: allow outbound connections to the internet
    region: "{{ region }}"
    rules_egress:
      - proto: all
        cidr_ip: 0.0.0.0/0

NOTE

If you are using EC2-Classic, you don’t need to specify the outbound group, since EC2-Classic does not restrict outbound connections on instances.

If you haven’t used security groups before, the parameters to the rules dictionary bear some explanation. Table 12-3 provides a quick summary of the parameters for security group connection rules.

Parameter    Description
proto        IP protocol (tcp, udp, icmp) or “all” to allow all protocols and ports
cidr_ip      Subnet of IP addresses that are allowed to connect, using CIDR notation
from_port    The first port in the range of permitted ports
to_port      The last port in the range of permitted ports

Table 12-3. Security group rule parameters

Permitted IP Addresses

Security groups allow you to restrict which IP addresses are permitted to connect to an instance. You specify a subnet using classless interdomain routing (CIDR) notation. An example of a subnet specified with CIDR notation is 203.0.113.0/24,12 which means that the first 24 bits of the IP address must match the first 24 bits of 203.0.113.0. People sometimes just say “/24” to refer to the size of a CIDR that ends in /24.

A /24 is a nice value13 because it corresponds to the first three octets of the address, namely 203.0.113. What this means is that any IP address that starts with 203.0.113 is in the subnet, meaning any IP address in the range 203.0.113.0 to 203.0.113.255.

If you specify 0.0.0.0/0, that means that any IP address is permitted to connect.
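For example, if you wanted to allow SSH access only from a single /24 subnet rather than from the whole Internet, a rule might look like this sketch (203.0.113.0/24 is just the documentation example range):

- proto: tcp
  from_port: 22
  to_port: 22
  cidr_ip: 203.0.113.0/24   # only hosts in this /24 may connect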

Security Group Ports

One of the things that I find confusing about EC2 security groups is the from port and to port notation. EC2 allows you to specify a range of ports that you are allowed to access. For example, you could indicate that you are allowing TCP connections on any port from 5900 to 5999 by specifying:

- proto: tcp
  from_port: 5900
  to_port: 5999
  cidr_ip: 0.0.0.0/0

However, I often find the from/to notation confusing, because I almost never specify a range of ports.14 Instead, I usually want to enable non-consecutive ports, such as 80 and 443. Therefore, in almost every case, the from_port and to_port parameters are going to be the same.

The ec2_group module has a number of other parameters, including specifying inbound rules using security group IDs, as well as specifying outbound connection rules. Check out the module’s documentation for more details.
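For example, rather than a CIDR block, a rule can name another security group as the permitted source, so that only instances in that group may connect. A sketch, assuming the rules list accepts a group_name parameter as described in the module documentation:

- proto: tcp
  from_port: 5432
  to_port: 5432
  group_name: web   # only instances in the web security group may connect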

Getting the Latest AMI

In Example 12-6, we explicitly specified the AMI like this:

image: ami-8caa1ce4

However, if you want to launch the latest Ubuntu 14.04 image, you don’t want to hardcode the AMI like this. That’s because Canonical15 frequently makes minor updates to Ubuntu, and every time it makes a minor update, it generates a new AMI. Just because ami-8caa1ce4 corresponds to the latest release of Ubuntu 14.04 yesterday doesn’t mean it will correspond to the latest release of Ubuntu 14.04 tomorrow.

Ansible ships with a nifty little module called ec2_ami_search (written by yours truly) that will retrieve the AMI that corresponds to a given operating system release. Example 12-9 shows this module in action:

Example 12-9. Retrieving the latest Ubuntu AMI

- name: Create an ubuntu instance on Amazon EC2
  hosts: localhost
  tasks:
    - name: Get the ubuntu trusty AMI
      ec2_ami_search: distro=ubuntu release=trusty region=us-west-1
      register: ubuntu_image

    - name: start the instance
      ec2:
        image: "{{ ubuntu_image.ami }}"
        instance_type: m3.medium
        key_name: mykey
        group: [web, ssh, outbound]
        instance_tags: { type: web, env: production }

Currently, the module only supports looking up Ubuntu AMIs.

Adding a New Instance to a Group

Sometimes I like to write a single playbook that launches an instance and then runs a playbook against that instance.

Unfortunately, before you’ve run the playbook, the host doesn’t exist yet. Disabling caching on the dynamic inventory script won’t help here, because Ansible only invokes the dynamic inventory script at the beginning of playbook execution, which is before the host exists.

You can add a task that uses the add_host module to add the instance to a group, as shown in Example 12-10.

Example 12-10. Adding an instance to groups

- name: Create an ubuntu instance on Amazon EC2
  hosts: localhost
  tasks:
    - name: start the instance
      ec2:
        image: ami-8caa1ce4
        instance_type: m3.medium
        key_name: mykey
        group: [web, ssh, outbound]
        instance_tags: { type: web, env: production }
      register: ec2

    - name: add the instance to web and production groups
      add_host: hostname={{ item.public_dns_name }} groups=web,production
      with_items: ec2.instances

- name: do something to production webservers
  hosts: web:&production
  tasks:
    - ...

RETURN TYPE OF THE EC2 MODULE

The ec2 module returns a dictionary with three fields, shown in Table 12-4.

Parameter           Description
instance_ids        List of instance IDs
instances           List of instance dicts
tagged_instances    List of instance dicts

Table 12-4. Return value of ec2 module

If the user passes the exact_count parameter to the ec2 module, then the module might not create new instances, as described in “Creating Instances the Idempotent Way”. In this case, the instance_ids and instances fields will be populated only if the module creates new instances. However, the tagged_instances field will contain instance dicts for all of the instances that match the tags, whether they were just created or already existed.
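A quick way to see the difference is to print both fields after registering the result; on a second run of a playbook that uses exact_count, instances will be empty while tagged_instances still lists every matching instance (a minimal sketch):

- name: show instances created on this run
  debug: var=ec2.instances

- name: show all matching instances, new or preexisting
  debug: var=ec2.tagged_instances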

An instance dict contains the fields shown in Table 12-5.

Parameter           Description
id                  Instance ID
ami_launch_index    Instance index within a reservation (between 0 and N-1) if N launched
private_ip          Internal IP address (not routable outside of EC2)
private_dns_name    Internal DNS name (not routable outside of EC2)
public_ip           Public IP address
public_dns_name     Public DNS name
state_code          Reason code for the state change
architecture        CPU architecture
image_id            AMI
key_name            Key pair name
placement           Location where the instance was launched
kernel              AKI (Amazon kernel image)
ramdisk             ARI (Amazon ramdisk image)
launch_time         Time instance was launched
instance_type       Instance type
root_device_type    Type of root device (ephemeral, EBS)
root_device_name    Name of root device
state               State of instance
hypervisor          Hypervisor type

Table 12-5. Contents of instance dicts

For more details on what these fields mean, check out the Boto16 documentation for the boto.ec2.instance.Instance class or the documentation for the output of the run-instances command of Amazon’s command-line tool.17

Waiting for the Server to Come Up

While IaaS clouds like EC2 are remarkable feats of technology, they still require a finite amount of time to create new instances. What this means is that you can’t run a playbook against an EC2 instance immediately after you’ve submitted a request to create it. Instead, you need to wait for the EC2 instance to come up.

The ec2 module supports a wait parameter. If it’s set to “yes,” then the ec2 task will not return until the instance has transitioned to the running state:

- name: start the instance
  ec2:
    image: ami-8caa1ce4
    instance_type: m3.medium
    key_name: mykey
    group: [web, ssh, outbound]
    instance_tags: { type: web, env: production }
    wait: yes
  register: ec2

Unfortunately, waiting for the instance to be in the running state isn’t enough to ensure that you can actually execute a playbook against a host. You still need to wait until the instance has advanced far enough in the boot process that the SSH server has started and is accepting incoming connections.

The wait_for module is designed for this kind of scenario. Here’s how you would use the ec2 and wait_for modules in concert to start an instance and then wait until the instance is ready to receive SSH connections:

- name: start the instance
  ec2:
    image: ami-8caa1ce4
    instance_type: m3.medium
    key_name: mykey
    group: [web, ssh, outbound]
    instance_tags: { type: web, env: production }
    wait: yes
  register: ec2

- name: wait for ssh server to be running
  wait_for: host={{ item.public_dns_name }} port=22 search_regex=OpenSSH
  with_items: ec2.instances

This invocation of wait_for uses the search_regex argument to look for the string OpenSSH after connecting to the host. This regex takes advantage of the fact that a fully functioning SSH server will return a string that looks something like Example 12-11 when an SSH client first connects.

Example 12-11. Initial response of an SSH server running on Ubuntu

SSH-2.0-OpenSSH_5.9p1 Debian-5ubuntu1.4

We could invoke the wait_for module to just check if port 22 is listening for incoming connections. However, sometimes an SSH server has gotten far enough along in the startup process that it is listening on port 22, but is not fully functional yet. Waiting for the initial response ensures that the wait_for module will only return when the SSH server has fully started up.
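For comparison, the port-only check would look like the following sketch; it returns as soon as anything is listening on port 22, whether or not the SSH server is fully functional:

- name: wait for port 22 to be listening (less robust)
  wait_for: host={{ item.public_dns_name }} port=22
  with_items: ec2.instances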

Creating Instances the Idempotent Way

Playbooks that invoke the ec2 module are not generally idempotent. If you were to execute Example 12-6 multiple times, EC2 would create multiple instances.

You can write idempotent playbooks with the ec2 module by using the count_tag and exact_count parameters.

Let’s say we want to write a playbook that starts three instances. We want this playbook to be idempotent, so if three instances are already running, we want the playbook to do nothing. Example 12-12 shows what it would look like:

Example 12-12. Idempotent instance creation

- name: start the instance
  ec2:
    image: ami-8caa1ce4
    instance_type: m3.medium
    key_name: mykey
    group: [web, ssh, outbound]
    instance_tags: { type: web, env: production }
    exact_count: 3
    count_tag: { type: web }

The exact_count: 3 parameter tells Ansible to ensure that exactly three instances are running that match the tags specified in count_tag. In our example, I only specified one tag for count_tag, but it does support multiple tags.

When running this playbook for the first time, Ansible will check how many instances are currently running that are tagged with type=web. Assuming there are no such instances, Ansible will create three new instances and tag them with type=web and env=production.

When running this playbook the next time, Ansible will check how many instances are currently running that are tagged with type=web. It will see that there are three instances running and will not start any new instances.
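Because count_tag supports multiple tags, and exact_count describes a desired total rather than an increment, the two compose naturally. Here’s a hedged sketch that counts against both tags and scales the group up; raising exact_count to 5 on a later run would create only the two missing instances and leave the existing ones untouched:

- name: ensure five production webservers are running
  ec2:
    image: ami-8caa1ce4
    instance_type: m3.medium
    key_name: mykey
    group: [web, ssh, outbound]
    instance_tags: { type: web, env: production }
    exact_count: 5
    count_tag: { type: web, env: production }   # count instances matching both tags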

Putting It All Together

Example 12-13 shows the playbook that creates three EC2 instances and configures them as web servers. The playbook is idempotent, so you can safely run it multiple times; it will create new instances only if they haven’t been created yet.

Note how we use the tagged_instances return value of the ec2 module, instead of the instances return value, for reasons described in “Return Type of the ec2 Module”.

Example 12-13. ec2-example.yml: Complete EC2 playbook

---
- name: launch webservers
  hosts: localhost
  vars:
    region: us-west-1
    instance_type: t2.micro
    count: 3
  tasks:
    - name: ec2 keypair
      ec2_key: name=mykey key_material="{{ item }}" region={{ region }}
      with_file: ~/.ssh/id_rsa.pub

    - name: web security group
      ec2_group:
        name: web
        description: allow http and https access
        region: "{{ region }}"
        rules:
          - proto: tcp
            from_port: 80
            to_port: 80
            cidr_ip: 0.0.0.0/0
          - proto: tcp
            from_port: 443
            to_port: 443
            cidr_ip: 0.0.0.0/0

    - name: ssh security group
      ec2_group:
        name: ssh
        description: allow ssh access
        region: "{{ region }}"
        rules:
          - proto: tcp
            from_port: 22
            to_port: 22
            cidr_ip: 0.0.0.0/0

    - name: outbound security group
      ec2_group:
        name: outbound
        description: allow outbound connections to the internet
        region: "{{ region }}"
        rules_egress:
          - proto: all
            cidr_ip: 0.0.0.0/0

    - name: Get the ubuntu trusty AMI
      ec2_ami_search: distro=ubuntu release=trusty virt=hvm region={{ region }}
      register: ubuntu_image

    - name: start the instances
      ec2:
        region: "{{ region }}"
        image: "{{ ubuntu_image.ami }}"
        instance_type: "{{ instance_type }}"
        key_name: mykey
        group: [web, ssh, outbound]
        instance_tags: { Name: ansiblebook, type: web, env: production }
        exact_count: "{{ count }}"
        count_tag: { type: web }
        wait: yes
      register: ec2

    - name: add the instance to web and production groups
      add_host: hostname={{ item.public_dns_name }} groups=web,production
      with_items: ec2.tagged_instances
      when: item.public_dns_name is defined

    - name: wait for ssh server to be running
      wait_for: host={{ item.public_dns_name }} port=22 search_regex=OpenSSH
      with_items: ec2.tagged_instances
      when: item.public_dns_name is defined

- name: configure webservers
  hosts: web:&production
  sudo: True
  roles:
    - web

Specifying a Virtual Private Cloud

So far, we’ve been launching our instances into the default virtual private cloud (VPC). Ansible also allows us to create new VPCs and launch instances into them.

WHAT IS A VPC?

Think of a VPC as an isolated network. When you create a VPC, you specify an IP address range. It must be a subset of one of the private address ranges (10.0.0.0/8, 172.16.0.0/12, or 192.168.0.0/16).

You carve up your VPC into subnets, which have IP ranges that are subsets of the IP range of your entire VPC. In Example 12-14, the VPC has the IP range 10.0.0.0/16, and we associate two subnets: 10.0.0.0/24 and 10.0.1.0/24.

When you launch an instance, you assign it to a subnet in a VPC. You can configure your subnets so that your instances get either public or private IP addresses. EC2 also allows you to define routing tables for routing traffic between your subnets and to create Internet gateways for routing traffic from your subnets to the Internet.

Configuring networking is a complex topic that’s (way) outside the scope of this book. For more info, check out Amazon’s EC2 documentation on VPC.

Example 12-14 shows how to create a VPC with two subnets.

Example 12-14. create-vpc.yml: Creating a vpc

- name: create a vpc
  ec2_vpc:
    region: us-west-1
    internet_gateway: True
    resource_tags: { Name: "Book example", env: production }
    cidr_block: 10.0.0.0/16
    subnets:
      - cidr: 10.0.0.0/24
        resource_tags:
          env: production
          tier: web
      - cidr: 10.0.1.0/24
        resource_tags:
          env: production
          tier: db
    route_tables:
      - subnets:
          - 10.0.0.0/24
          - 10.0.1.0/24
        routes:
          - dest: 0.0.0.0/0
            gw: igw

Creating a VPC is idempotent; Ansible uniquely identifies the VPC based on a combination of the resource_tags and the cidr_block parameters. Ansible will create a new VPC if no existing VPC matches the resource tags and CIDR block.18

Admittedly, Example 12-14 is a simple example from a networking perspective, as we’ve just defined two subnets that are both connected to the Internet. A more realistic example would have one subnet that’s routable to the Internet, and another subnet that’s not routable to the Internet, and we’d have some rules for routing traffic between the two subnets.

Example 12-15 shows a complete example of creating a VPC and launching instances into it.

Example 12-15. ec2-vpc-example.yml: Complete EC2 playbook that specifies a VPC

---
- name: launch webservers into a specific vpc
  hosts: localhost
  vars:
    instance_type: t2.micro
    count: 1
    region: us-west-1
  tasks:
    - name: create a vpc
      ec2_vpc:
        region: "{{ region }}"
        internet_gateway: True
        resource_tags: { Name: book, env: production }
        cidr_block: 10.0.0.0/16
        subnets:
          - cidr: 10.0.0.0/24
            resource_tags:
              env: production
              tier: web
          - cidr: 10.0.1.0/24
            resource_tags:
              env: production
              tier: db
        route_tables:
          - subnets:
              - 10.0.0.0/24
              - 10.0.1.0/24
            routes:
              - dest: 0.0.0.0/0
                gw: igw
      register: vpc

    - set_fact: vpc_id={{ vpc.vpc_id }}

    - name: set ec2 keypair
      ec2_key: name=mykey key_material="{{ item }}"
      with_file: ~/.ssh/id_rsa.pub

    - name: web security group
      ec2_group:
        name: vpc-web
        region: "{{ region }}"
        description: allow http and https access
        vpc_id: "{{ vpc_id }}"
        rules:
          - proto: tcp
            from_port: 80
            to_port: 80
            cidr_ip: 0.0.0.0/0
          - proto: tcp
            from_port: 443
            to_port: 443
            cidr_ip: 0.0.0.0/0

    - name: ssh security group
      ec2_group:
        name: vpc-ssh
        region: "{{ region }}"
        description: allow ssh access
        vpc_id: "{{ vpc_id }}"
        rules:
          - proto: tcp
            from_port: 22
            to_port: 22
            cidr_ip: 0.0.0.0/0

    - name: outbound security group
      ec2_group:
        name: vpc-outbound
        description: allow outbound connections to the internet
        region: "{{ region }}"
        vpc_id: "{{ vpc_id }}"
        rules_egress:
          - proto: all
            cidr_ip: 0.0.0.0/0

    - name: Get the ubuntu trusty AMI
      ec2_ami_search: distro=ubuntu release=trusty virt=hvm region={{ region }}
      register: ubuntu_image

    - name: start the instances
      ec2:
        image: "{{ ubuntu_image.ami }}"
        region: "{{ region }}"
        instance_type: "{{ instance_type }}"
        assign_public_ip: True
        key_name: mykey
        group: [vpc-web, vpc-ssh, vpc-outbound]
        instance_tags: { Name: book, type: web, env: production }
        exact_count: "{{ count }}"
        count_tag: { type: web }
        vpc_subnet_id: "{{ vpc.subnets[0].id }}"
        wait: yes
      register: ec2

    - name: add the instance to web and production groups
      add_host: hostname={{ item.public_dns_name }} groups=web,production
      with_items: ec2.tagged_instances
      when: item.public_dns_name is defined

    - name: wait for ssh server to be running
      wait_for: host={{ item.public_dns_name }} port=22 search_regex=OpenSSH
      with_items: ec2.tagged_instances
      when: item.public_dns_name is defined

- name: configure webservers
  hosts: web:&production
  sudo: True
  roles:
    - web

NOTE

Unfortunately, as of this writing, the Ansible ec2 module can’t handle the case where you have security groups with the same name in different VPCs. This means we can’t have an SSH security group defined in multiple VPCs, because the module will try to associate all of the SSH security groups when we launch an instance. In our example, I’ve used different names for these security groups. I’m hoping this will be fixed in a future version of the module.

Dynamic Inventory and VPC

Oftentimes, when using a VPC, you will place some instances inside of a private subnet that is not routable from the Internet. When you do this, there is no public IP address associated with the instance.

In this case, you might want to run Ansible from an instance inside of your VPC. The Ansible dynamic inventory script is smart enough that it will return internal IP addresses for VPC instances that don’t have public IP addresses.

See Appendix C for details on how you can use IAM roles to run Ansible inside of a VPC without needing to copy EC2 credentials to the instance.

Building AMIs

There are two approaches you can take to creating custom Amazon machine images (AMIs) with Ansible. You can use the ec2_ami module, or you can use a third-party tool called Packer that has support for Ansible.

With the ec2_ami Module

The ec2_ami module will take a running instance and snapshot it into an AMI. Example 12-16 shows this module in action.

Example 12-16. Creating an AMI with the ec2_ami module

- name: create an AMI
  hosts: localhost
  vars:
    instance_id: i-dac5473b
  tasks:
    - name: create the AMI
      ec2_ami:
        name: web-nginx
        description: Ubuntu 14.04 with nginx installed
        instance_id: "{{ instance_id }}"
        wait: yes
      register: ami

    - name: output AMI details
      debug: var=ami
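The registered variable can then be fed back into the ec2 module to launch instances from the freshly created image. A hedged sketch, assuming the ec2_ami return value includes an image_id field per the module documentation:

- name: launch an instance from the new AMI
  ec2:
    image: "{{ ami.image_id }}"   # field name per ec2_ami docs; verify in your version
    region: us-west-1
    instance_type: m3.medium
    key_name: mykey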

With Packer

The ec2_ami module works just fine, but you have to write some additional code to create and terminate the instance.

There’s an open source tool called Packer that will automate the creation and termination of an instance for you. Packer also happens to be written by Mitchell Hashimoto, the creator of Vagrant.

Packer can create different types of images and works with different configuration management tools. This chapter focuses on using Packer to create AMIs using Ansible, but you can also use Packer to create images for other IaaS clouds, such as Google Compute Engine, DigitalOcean, or OpenStack. It can even be used to create Vagrant boxes and Docker containers. It also supports other configuration management tools, such as Chef, Puppet, and Salt.

To use Packer, you create a configuration file in JSON format and then use the packer command-line tool to create the image using the configuration file.

Example 12-17 shows a sample Packer configuration file that uses Ansible to create an AMI with our web role.

Example 12-17. web.json

{
  "builders": [
    {
      "type": "amazon-ebs",
      "region": "us-west-1",
      "source_ami": "ami-50120b15",
      "instance_type": "t2.micro",
      "ssh_username": "ubuntu",
      "ami_name": "web-nginx-{{timestamp}}",
      "tags": {
        "Name": "web-nginx"
      }
    }
  ],
  "provisioners": [
    {
      "type": "shell",
      "inline": [
        "sleep 30",
        "sudo apt-get update",
        "sudo apt-get install -y ansible"
      ]
    },
    {
      "type": "ansible-local",
      "playbook_file": "web-ami.yml",
      "role_paths": [
        "/Users/lorinhochstein/dev/ansiblebook/ch12/playbooks/roles/web"
      ]
    }
  ]
}

Use the packer build command to create the AMI:

$ packer build web.json

The output looks like this:

==> amazon-ebs: Inspecting the source AMI...
==> amazon-ebs: Creating temporary keypair: packer 546919ba-cb97-4a9e-1c21-389633dc0779
==> amazon-ebs: Creating temporary security group for this instance...
...
==> amazon-ebs: Stopping the source instance...
==> amazon-ebs: Waiting for the instance to stop...
==> amazon-ebs: Creating the AMI: web-nginx-1416174010
    amazon-ebs: AMI: ami-963fa8fe
==> amazon-ebs: Waiting for AMI to become ready...
==> amazon-ebs: Adding tags to AMI (ami-963fa8fe)...
    amazon-ebs: Adding tag: "Name": "web-nginx"
==> amazon-ebs: Terminating the source AWS instance...
==> amazon-ebs: Deleting temporary security group...
==> amazon-ebs: Deleting temporary keypair...
Build 'amazon-ebs' finished.

==> Builds finished. The artifacts of successful builds are:
--> amazon-ebs: AMIs were created:

us-west-1: ami-963fa8fe

Example 12-17 has two sections: builders and provisioners. The builders section refers to the type of image being created. In our case, we are creating an Elastic Block Store–backed (EBS) Amazon Machine Image, so we use the amazon-ebs builder.

Packer needs to start a new instance to create an AMI, so you need to configure Packer with all of the information you typically need when creating an instance: EC2 region, AMI, and instance type. Packer doesn’t need to be configured with a security group because it will create a temporary security group automatically, and then delete that security group when it is finished. Like Ansible, Packer needs to be able to SSH to the created instance. Therefore, you need to specify the SSH username in the Packer configuration file.

You also need to tell Packer what to name your AMI, as well as any tags you want to apply to it. Because AMI names must be unique, we use the {{timestamp}} function to insert a Unix timestamp. A Unix timestamp encodes the date and time as the number of seconds since Jan. 1, 1970, UTC. See the Packer documentation for more information about the functions that Packer supports.

Because Packer needs to interact with EC2 to create the AMI, it needs access to your EC2 credentials. Like Ansible, Packer can read your EC2 credentials from environment variables, so you don’t need to specify them explicitly in the configuration file, although you can if you prefer.

The provisioners section refers to the tools used to configure the instance before it is captured as an image. Packer supports an Ansible local provisioner: it runs Ansible on the instance itself. That means that Packer needs to copy over all of the necessary Ansible playbooks and related files before it runs, and it also means that Ansible must be installed on the instance before it executes Ansible.

Packer supports a shell provisioner that lets you run arbitrary commands on the instance. Example 12-17 uses this provisioner to install Ansible as an Ubuntu apt package. To avoid a race situation with trying to install packages before the operating system is fully booted up, the shell provisioner in our example is configured to wait for 30 seconds before installing Ansible.

Example 12-18 shows the web-ami.yml playbook we use for configuring an instance. It’s a simple playbook that applies the web role to the local machine. Because it uses the web role, the configuration file must explicitly specify the location of the directory that contains the web role so that Packer can copy the web role’s files to the instance.

Example 12-18. web-ami.yml

- name: configure a webserver as an ami
  hosts: localhost
  sudo: True
  roles:
    - web

Instead of selectively copying over roles, we can also tell Packer to just copy our entire playbooks directory instead. In that case, the configuration file would look like Example 12-19.

Example 12-19. web-pb.json copying over the entire playbooks directory

{
  "builders": [
    {
      "type": "amazon-ebs",
      "region": "us-west-1",
      "source_ami": "ami-50120b15",
      "instance_type": "t2.micro",
      "ssh_username": "ubuntu",
      "ami_name": "web-nginx-{{timestamp}}",
      "tags": {
        "Name": "web-nginx"
      }
    }
  ],
  "provisioners": [
    {
      "type": "shell",
      "inline": [
        "sleep 30",
        "sudo apt-get update",
        "sudo apt-get install -y ansible"
      ]
    },
    {
      "type": "ansible-local",
      "playbook_file": "web-ami.yml",
      "playbook_dir": "/Users/lorinhochstein/dev/ansiblebook/ch12/playbooks"
    }
  ]
}

NOTE

As of this writing, Packer doesn’t support SSH agent forwarding. Check GitHub for the current status of this issue.

Packer has a lot more functionality than we can cover here. Check out its documentation for more details.

Other Modules

Ansible supports even more of EC2, as well as other AWS services. For example, you can use Ansible to launch CloudFormation stacks with the cloudformation module, put files into S3 with the s3 module, modify DNS records with the route53 module, create autoscaling groups with the ec2_asg module, create autoscaling configuration with the ec2_lc module, and more.
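For instance, uploading a file to an S3 bucket is a one-task affair. A minimal sketch (parameter names per the s3 module documentation; the bucket and file names are made up for illustration):

- name: upload a file to an S3 bucket
  s3: bucket=my-example-bucket object=/hello.txt src=files/hello.txt mode=put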

Using Ansible with EC2 is a large enough topic that you could write a whole book about it. In fact, Yan Kurniawan is writing a book on Ansible and AWS. After digesting this chapter, you should have enough knowledge under your belt to pick up these additional modules without difficulty.

1 The National Institute of Standards and Technology (NIST) has a pretty good definition of cloud computing: The NIST Definition of Cloud Computing.

2 You can add tags to entities other than instances, such as AMIs, volumes, and security groups.

3 Or maybe it’s ~/.bashrc? I’ve never figured out the difference between the various Bash dotfiles.

4 You might need to use sudo or activate a virtualenv to install this package, depending on how you installed Ansible.

5 And, to be honest, I have no idea where the package managers install this file.

6 Amazon’s internal network is divided up into subnets, but users do not have any control over how instances are allocated to subnets.

7 Go to Amazon for more details on VPC and whether you have access to EC2-Classic in a region.

8 It’s possible to retrieve the host key by querying EC2 for the instance console output, but I must admit that I never bother doing this because I’ve never gotten around to writing a proper script that parses out the host key from the console output.

9 Visit Amazon for a list of the regions that it supports.

10 There’s also a handy (unofficial) website that provides a single table with all of the available EC2 instance types.

11 For more information on Jinja2 tests, see the Jinja2 documentation page on built-in tests.

12 This example happens to correspond to a special IP address range named TEST-NET-3, which is reserved for examples. It’s the example.com of IP subnets.

13 Subnets that are /8, /16, and /24 make great examples because the math is much easier than, say, /17 or /23.

14 Astute observers might have noticed that ports 5900-5999 are commonly used by the VNC remote desktop protocol, one of the few applications where specifying a range of ports makes sense.

15 Canonical is the company that runs the Ubuntu project.

16 Boto is the Python library that Ansible uses to communicate with EC2.

17 The command-line tool is documented at http://aws.amazon.com/cli/.

18 As of this writing, a bug in Ansible causes it to incorrectly report a state of changed each time this module is invoked, even if it does not create a VPC.