
Chapter 12. Amazon EC2

Ansible has a number of features that make working with infrastructure-as-a-service (IaaS) clouds much easier. This chapter focuses on Amazon EC2 because it’s the most popular IaaS cloud and the one I know best. However, many of the concepts should transfer to other clouds supported by Ansible.

The two ways Ansible supports EC2 are:

§ A dynamic inventory plug-in for automatically populating your Ansible inventory instead of manually specifying your servers

§ Modules that perform actions on EC2 such as creating new servers

In this chapter, we’ll discuss both the EC2 dynamic inventory plug-in and the EC2 modules.

WHAT IS AN IAAS CLOUD?

You’ve probably heard so many references to “the cloud” in the technical press that you’re suffering from buzzword overload.1 I’ll be precise about what I mean by an infrastructure-as-a-service (IaaS) cloud.

To start, here’s a typical user interaction with an IaaS cloud:

User: I want five new servers, each one with two CPUs, 4 GB of memory, and 100 GB of storage, running Ubuntu 14.04.

Service: Request received. Your request number is 432789.

User: What’s the current status of request 432789?

Service: Your servers are ready to go, at IP addresses 203.0.113.5, 203.0.113.13, 203.0.113.49, 203.0.113.124, 203.0.113.209.

User: I’m done with the servers associated with request 432789.

Service: Request received, the servers will be terminated.

An IaaS cloud is a service that enables a user to provision (create) new servers. All IaaS clouds are self-serve, meaning that the user interacts directly with a software service rather than, say, filing a ticket with the IT department. Most IaaS clouds offer three different types of interfaces to allow users to interact with the system:

§ Web interface

§ Command-line interface

§ REST API

In the case of EC2, the web interface is called the AWS Management Console, and the command-line interface is called (unimaginatively) the AWS Command-Line Interface. The REST API is documented at Amazon.

IaaS clouds typically use virtual machines to implement the servers, although you can build an IaaS cloud using bare metal servers (i.e., users run directly on the hardware rather than inside of a virtual machine) or containers. The SoftLayer and Rackspace clouds have bare metal offerings, and Amazon Elastic Compute Cloud, Google Compute Engine, and Joyent clouds offer containers.

Most IaaS clouds let you do more than just start up and tear down servers. In particular, they typically let you provision storage so that you can attach and detach disks to your servers. This type of storage is commonly referred to as block storage. They also provide networking features, so you can define network topologies that describe how your servers are interconnected, and you can define firewall rules that restrict network access to your servers.

Amazon EC2 is the most popular public IaaS cloud provider, but there are a number of other IaaS clouds out there. In addition to EC2, Ansible ships with modules for Microsoft Azure, Digital Ocean, Google Compute Engine, Linode, and Rackspace, as well as clouds built using OpenStack and VMWare vSphere.

Terminology

EC2 exposes many different concepts. I’ll explain these concepts as they come up in this chapter, but there are a few terms I’d like to cover up front.

Instance

EC2’s documentation uses the term instance to refer to a virtual machine, and I use that terminology in this chapter. Keep in mind that an EC2 instance is a host from Ansible’s perspective.

EC2 documentation interchangeably uses the terms creating instances, launching instances, and running instances to describe the process of bringing up a new instance. However, starting instances means something different — starting up an instance that had previously been put in the stopped state.

Amazon Machine Image

An Amazon machine image (AMI) is a virtual machine image, which contains a filesystem with an installed operating system on it. When you create an instance on EC2, you choose which operating system you want your instance to run by specifying the AMI that EC2 will use to create the instance.

Each AMI has an associated identifier string, called an AMI ID, which starts with “ami-” and then contains eight hexadecimal characters; for example, ami-12345abc.

Tags

EC2 lets you annotate your instances2 with custom metadata that it calls tags. Tags are just key-value pairs of strings. For example, we could annotate an instance with the following tags:

Name=Staging database

env=staging

type=database

If you’ve ever given your EC2 instance a name in the AWS Management Console, you’ve used tags without even knowing it. EC2 implements instance names as tags where the key is Name and the value is whatever name you gave the instance. Other than that, there’s nothing special about the Name tag, and you can configure the management console to show the value of other tags in addition to the Name tag.

Tags don’t have to be unique, so you can have 100 instances that all have the same tag. Because Ansible’s EC2 modules make heavy use of tags, they will come up several times in this chapter.

Specifying Credentials

When you make requests against Amazon EC2, you need to specify credentials. If you’ve used the Amazon web console, you’ve used your username and password to log in. However, all of the bits of Ansible that interact with EC2 talk to the EC2 API. The API does not use username and password for credentials. Instead, it uses two strings: an access key ID and a secret access key.

These strings typically look like this:

§ Sample EC2 access key ID: AKIAIOSFODNN7EXAMPLE

§ Sample EC2 secret access key: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

When you are calling EC2-related modules, you can pass these strings as module arguments. For the dynamic inventory plug-in, you can specify the credentials in the ec2.ini file (discussed in the next section). However, both the EC2 modules and the dynamic inventory plug-in also allow you to specify these credentials as environment variables. You can also use something called identity and access management (IAM) roles if your control machine is itself an Amazon EC2 instance, which is covered in Appendix C.
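For example, here is a minimal sketch of passing credentials directly as module arguments (most EC2-related modules accept aws_access_key and aws_secret_key parameters; the values shown are the sample credentials from above, and the rest of the parameters are placeholders):

- name: start an instance, passing credentials explicitly
  ec2:
    aws_access_key: AKIAIOSFODNN7EXAMPLE       # sample value; use your own
    aws_secret_key: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
    region: us-east-1
    image: ami-8caa1ce4
    instance_type: m3.medium
    key_name: mykey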

Environment Variables

Although Ansible does allow you to pass credentials explicitly as arguments to modules, it also supports setting EC2 credentials as environment variables. Example 12-1 shows how you would set these environment variables.

Example 12-1. Setting EC2 environment variables

# Don't forget to replace these values with your actual credentials!
export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
export AWS_REGION=us-east-1

NOTE

Not all of Ansible’s EC2 modules respect the AWS_REGION environment variable, so I recommend that you always explicitly pass the EC2 region as an argument when invoking your modules. All of the examples in this chapter explicitly pass the region as an argument.

I recommend using environment variables because it allows you to use EC2-related modules and inventory plug-ins without putting your credentials in any of your Ansible-related files. I put these in a dotfile that runs when my session starts. I use Zsh, so in my case that file is ~/.zshrc. If you’re running Bash, you might want to put it in your ~/.profile file.3 If you’re using a shell other than Bash or Zsh, you’re probably knowledgeable enough to know which dotfile to modify to set these environment variables.

Once you have set these credentials in your environment variables, you can invoke the Ansible EC2 modules on your control machine, as well as use the dynamic inventory.

Configuration Files

An alternative to using environment variables is to place your EC2 credentials in a configuration file. As discussed in the next section, Ansible uses the Python Boto library, so it supports Boto’s conventions for maintaining credentials in a Boto configuration file. I don’t cover the format here; for more information, check out the Boto config documentation.

Prerequisite: Boto Python Library

All of the Ansible EC2 functionality requires you to install the Python Boto library as a Python system package on the control machine. To do so:4

$ pip install boto

If you already have instances running on EC2, you can verify that Boto is installed properly and that your credentials are correct by interacting with the Python command line, as shown in Example 12-2.

Example 12-2. Testing out Boto and credentials

$ python
Python 2.7.6 (default, Sep 9 2014, 15:04:36)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.39)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import boto.ec2
>>> conn = boto.ec2.connect_to_region("us-east-1")
>>> statuses = conn.get_all_instance_status()
>>> statuses
[]

Dynamic Inventory

If your servers live on EC2, you don’t want to keep a separate copy of these servers in an Ansible inventory file, because that file is going to go stale as you spin up new servers and tear down old ones.

It’s much simpler to track your EC2 servers by taking advantage of Ansible’s support for dynamic inventory to pull information about hosts directly from EC2. Ansible ships with a dynamic inventory script for EC2, although I recommend you just grab the latest one from the Ansible GitHub repository.5

You need two files:

ec2.py

The actual inventory script

ec2.ini

The configuration file for the inventory script

Previously, we had a playbooks/hosts file, which served as our inventory. Now, we’re going to use a playbooks/inventory directory. We’ll place ec2.py and ec2.ini into that directory, and set ec2.py as executable. Example 12-3 shows one way to do that.

Example 12-3. Installing the EC2 dynamic inventory script

$ cd playbooks/inventory
$ wget https://raw.githubusercontent.com/ansible/ansible/devel/plugins/inventory/ec2.py
$ wget https://raw.githubusercontent.com/ansible/ansible/devel/plugins/inventory/ec2.ini
$ chmod +x ec2.py

Caution

If you are running Ansible on a Linux distribution that uses Python 3.x as the default Python (e.g., Arch Linux), then ec2.py will not work unmodified, because it is a Python 2.x script.

Make sure your system has Python 2.x installed and then modify the first line of ec2.py from this:

#!/usr/bin/env python

to this:

#!/usr/bin/env python2

If you’ve set up your environment variables as described in the previous section, you should be able to confirm that the script is working by running:

$ ./ec2.py --list

The script should output information about your various EC2 instances. The structure should look something like this:

{
  "_meta": {
    "hostvars": {
      "ec2-203-0-113-75.compute-1.amazonaws.com": {
        "ec2_id": "i-12345678",
        "ec2_instance_type": "c3.large",
        ...
      }
    }
  },
  "ec2": [
    "ec2-203-0-113-75.compute-1.amazonaws.com",
    ...
  ],
  "us-east-1": [
    "ec2-203-0-113-75.compute-1.amazonaws.com",
    ...
  ],
  "us-east-1a": [
    "ec2-203-0-113-75.compute-1.amazonaws.com",
    ...
  ],
  "i-12345678": [
    "ec2-203-0-113-75.compute-1.amazonaws.com"
  ],
  "key_mysshkeyname": [
    "ec2-203-0-113-75.compute-1.amazonaws.com",
    ...
  ],
  "security_group_ssh": [
    "ec2-203-0-113-75.compute-1.amazonaws.com",
    ...
  ],
  "tag_Name_my_cool_server": [
    "ec2-203-0-113-75.compute-1.amazonaws.com",
    ...
  ],
  "type_c3_large": [
    "ec2-203-0-113-75.compute-1.amazonaws.com",
    ...
  ]
}

Inventory Caching

When Ansible executes the EC2 dynamic inventory script, the script has to make requests against one or more EC2 endpoints to retrieve this information. Because this can take time, the script will cache the information the first time it is invoked by writing to the following files:

§ $HOME/.ansible/tmp/ansible-ec2.cache

§ $HOME/.ansible/tmp/ansible-ec2.index

On subsequent calls, the dynamic inventory script will use the cached information until the cache expires.

You can modify the behavior by editing the cache_max_age configuration option in the ec2.ini configuration file. It defaults to 300 seconds (5 minutes). If you don’t want caching at all, you can set it to 0:

[ec2]
...
cache_max_age = 0

You can also force the inventory script to refresh the cache by invoking it with the --refresh-cache flag:

$ ./ec2.py --refresh-cache

WARNING

If you create or destroy instances, the EC2 dynamic inventory script will not reflect these changes unless the cache expires, or you manually refresh the cache.

Other Configuration Options

The ec2.ini file includes a number of configuration options that control the behavior of the dynamic inventory script. Because the file itself is well-documented with comments, I won’t cover those options in detail here.

Auto-Generated Groups

The EC2 dynamic inventory script will create the following groups:

Type                 Example          Ansible group name
Instance             i-123456         i-123456
Instance type        c1.medium        type_c1_medium
Security group       ssh              security_group_ssh
Keypair              foo              key_foo
Region               us-east-1        us-east-1
Tag                  env=staging      tag_env_staging
Availability zone    us-east-1b       us-east-1b
VPC                  vpc-14dd1b70     vpc_id_vpc-14dd1b70
All EC2 instances    N/A              ec2

Table 12-1. Generated EC2 groups

The only legal characters in a group name are alphanumeric characters, hyphens, and underscores. The dynamic inventory script converts any other character into an underscore.

For example, if you had an instance with a tag:

Name=My cool server!

Ansible would generate the group name tag_Name_my_cool_server_.

Defining Dynamic Groups with Tags

Recall that the dynamic inventory script automatically creates groups based on things such as instance type, security group, keypair, and tags. EC2 tags are the most convenient way of creating Ansible groups because you can define them however you like.

For example, you could tag all of your webservers with:

type=web

Ansible will automatically create a group called tag_type_web that contains all of the servers tagged with a name of type and a value of web.

EC2 allows you to apply multiple tags to an instance. For example, if you have separate staging and production environments, you can tag your production web servers like this:

env=production

type=web

Now you can refer to production machines as tag_env_production and your webservers as tag_type_web. If you want to refer to your production webservers, use the Ansible intersection syntax, like this:

hosts: tag_env_production:&tag_type_web

Applying Tags to Existing Resources

Ideally, you’d tag your EC2 instances as soon as you create them. However, if you’re using Ansible to manage existing EC2 instances, you will likely already have a number of instances running that you need to tag. Ansible has an ec2_tag module that will allow you to add tags to your instances.

For example, if you wanted to tag an instance with env=production and type=web, you could do it in a simple playbook as shown in Example 12-4.

Example 12-4. Adding EC2 tags to instances

- name: Add tags to existing instances
  hosts: localhost
  vars:
    web_production:
      - i-123456
      - i-234567
    web_staging:
      - i-ABCDEF
      - i-333333
  tasks:
    - name: Tag production webservers
      ec2_tag: resource={{ item }} region=us-west-1
      args:
        tags: { type: web, env: production }
      with_items: web_production

    - name: Tag staging webservers
      ec2_tag: resource={{ item }} region=us-west-1
      args:
        tags: { type: web, env: staging }
      with_items: web_staging

This example uses the inline syntax for YAML dictionaries when specifying the tags ({ type: web, env: production }) in order to make the playbook more compact, but the regular YAML dictionary syntax would have worked as well:

tags:
  type: web
  env: production

Nicer Group Names

Personally, I don’t like the name tag_type_web for a group. I’d prefer to just call it web.

To do this, we need to add a new file to the playbooks/inventory directory that will have information about groups. This is just a traditional Ansible inventory file, which we’ll call playbooks/inventory/hosts (see Example 12-5).

Example 12-5. playbooks/inventory/hosts

[web:children]
tag_type_web

[tag_type_web]

Once you do this, you can refer to web as a group in your Ansible plays.
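For example, a play can now target the friendlier group name directly (a minimal sketch):

- name: do something to the webservers
  hosts: web
  tasks:
    - name: check that the webservers are reachable
      ping: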

WARNING

You must define the empty tag_type_web group in your static inventory file, even though the dynamic inventory script also defines this group. If you forget it, Ansible will fail with the error:

ERROR: child group is not defined: (tag_type_web)

EC2 Virtual Private Cloud (VPC) and EC2 Classic

When Amazon first launched EC2 back in 2006, all of the EC2 instances were effectively connected to the same flat network.6 Every EC2 instance had a private IP address and a public IP address.

In 2009, Amazon introduced a new feature called Virtual Private Cloud (VPC). VPC allows users to control how their instances are networked together, and whether they will be publicly accessible from the Internet or isolated. Amazon uses the term “VPC” to describe the virtual networks that users can create inside of EC2. Amazon uses the term “EC2-VPC” to refer to instances that are launched inside of VPCs, and “EC2-Classic” to refer to instances that are not launched inside of VPCs.

Amazon actively encourages users to use EC2-VPC. For example, some instance types, such as t2.micro, are only available on EC2-VPC. Depending on when your AWS account was created and which EC2 regions you’ve previously launched instances in, you might not have access to EC2-Classic at all. Table 12-2 describes which accounts have access to EC2-Classic.7

My account was created                          Access to EC2-Classic
Before March 18, 2013                           Yes, but only in regions you’ve used before
Between March 18, 2013, and December 4, 2013    Maybe, but only in regions you’ve used before
After December 4, 2013                          No

Table 12-2. Do I have access to EC2-Classic?

The main difference between having support for EC2-Classic versus only having access to EC2-VPC is what happens when you create a new EC2 instance and do not explicitly associate a VPC ID with that instance. If your account has EC2-Classic enabled, then the new instance is not associated with a VPC. If your account does not have EC2-Classic enabled, then the new instance is associated with the default VPC.

Here’s one reason why you should care about the distinction: in EC2-Classic, all instances are permitted to make outbound network connections to any host on the Internet. In EC2-VPC, instances are not permitted to make outbound network connections by default. If a VPC instance needs to make outbound connections, it must be associated with a security group that permits outbound connections.

For the purposes of this chapter, I’m going to assume EC2-VPC only, so I will associate instances with a security group that enables outbound connections.

Configuring ansible.cfg for Use with ec2

When I’m using Ansible to configure EC2 instances, I add the following lines in my ansible.cfg:

[defaults]
remote_user = ubuntu
host_key_checking = False

I always use Ubuntu images, and on those images you are supposed to SSH as the ubuntu user. I also turn off host key checking, since I don’t know in advance what the host keys are for new instances.8

Launching New Instances

The ec2 module allows you to launch new instances on EC2. It’s one of the most complex Ansible modules because it supports so many arguments.

Example 12-6 shows a simple playbook for launching an Ubuntu 14.04 EC2 instance.

Example 12-6. Simple playbook for creating an EC2 instance

- name: Create an ubuntu instance on Amazon EC2
  hosts: localhost
  tasks:
    - name: start the instance
      ec2:
        image: ami-8caa1ce4
        region: us-east-1
        instance_type: m3.medium
        key_name: mykey
        group: [web, ssh, outbound]
        instance_tags: { Name: ansiblebook, type: web, env: production }

Let’s go over what these parameters mean.

The image parameter refers to the Amazon Machine Image (AMI) ID, which you must always specify. As described earlier in the chapter, an image is basically a filesystem that contains an installed operating system. The example just used, ami-8caa1ce4, refers to an image that has the 64-bit version of Ubuntu 14.04 installed on it.

The region parameter specifies the geographical region where the instance will be launched.9

The instance_type parameter describes the number of CPU cores and the amount of memory and storage your instance will have. EC2 doesn’t let you choose arbitrary combinations of cores, memory, and storage. Instead, Amazon defines a collection of instance types.10 The preceding example uses the m3.medium instance type. This is a 64-bit instance type with 1 core, 3.75 GB of RAM, and 4 GB of SSD-based storage.

NOTE

Not all images are compatible with all instance types. I haven’t actually tested whether ami-8caa1ce4 works with m3.medium. Caveat lector!

The key_name parameter refers to an SSH key pair. Amazon uses SSH key pairs to provide users with access to their servers. Before you start your first server, you must either create a new SSH key pair, or upload the public key of a key pair that you have previously created. Regardless of whether you create a new key pair or you upload an existing one, you must give a name to your SSH key pair.

The group parameter refers to a list of security groups associated with an instance. These groups determine what kinds of inbound and outbound network connections are permitted.

The instance_tags parameter associates metadata with the instance in the form of EC2 tags, which are key-value pairs. In the preceding example, we set the following tags:

Name=ansiblebook

type=web

env=production

EC2 Key Pairs

In Example 12-6, we assumed that Amazon already knew about an SSH key pair named mykey. Let’s see how we can use Ansible to create new key pairs.

Creating a New Key

When you create a new key pair, Amazon generates a private key and the corresponding public key; then it sends you the private key. Amazon does not keep a copy of the private key, so you’ve got to make sure that you save it after you generate it. Here’s how you would create a new key with Ansible:

Example 12-7. Create a new SSH key pair

- name: create a new keypair
  hosts: localhost
  tasks:
    - name: create mykey
      ec2_key: name=mykey region=us-west-1
      register: keypair

    - name: write the key to a file
      copy:
        dest: files/mykey.pem
        content: "{{ keypair.key.private_key }}"
        mode: 0600
      when: keypair.changed

In Example 12-7, we invoke the ec2_key module to create a new key pair. We then use the copy module with the content parameter to save the SSH private key to a file.

If the module creates a new key pair, then the variable keypair that is registered will contain a value that looks like this:

"keypair": {

"changed": true,

"invocation": {

"module_args": "name=mykey",

"module_name": "ec2_key"

},

"key": {

"fingerprint": "c5:33:74:84:63:2b:01:29:6f:14:a6:1c:7b:27:65:69:61:f0:e8:b9",

"name": "mykey",

"private_key": "-----BEGIN RSA PRIVATE KEY-----\nMIIEowIBAAKCAQEAjAJpvhY3QGKh

...

0PkCRPl8ZHKtShKESIsG3WC\n-----END RSA PRIVATE KEY-----"

}

}

If the key pair already existed, then the variable keypair that is registered will contain a value that looks like this:

"keypair": {

"changed": false,

"invocation": {

"module_args": "name=mykey",

"module_name": "ec2_key"

},

"key": {

"fingerprint": "c5:33:74:84:63:2b:01:29:6f:14:a6:1c:7b:27:65:69:61:f0:e8:b9",

"name": "mykey"

}

}

Because the private_key value will not be present if the key already exists, we need to add a when clause to the copy invocation to make sure that we only write a private key file to disk if there is actually a private key file to write.

We add the line:

when: keypair.changed

to only write the file to disk if there was a change of state when ec2_key was invoked (i.e., that a new key was created). Another way we could have done it would be to check for the existence of the private_key value, like this:

- name: write the key to a file
  copy:
    dest: files/mykey.pem
    content: "{{ keypair.key.private_key }}"
    mode: 0600
  when: keypair.key.private_key is defined

We use the Jinja2 defined test11 to check if private_key is present.

Upload an Existing Key

If you already have an SSH public key, you can upload that to Amazon and associate it with a keypair:

- name: create a keypair based on my ssh key
  hosts: localhost
  tasks:
    - name: upload public key
      ec2_key: name=mykey key_material="{{ item }}"
      with_file: ~/.ssh/id_rsa.pub

Security Groups

Example 12-6 assumed that the web, SSH, and outbound security groups already existed. We can use the ec2_group module to ensure that these security groups have been created before we use them.

Security groups are similar to firewall rules: you specify rules about who is allowed to connect to the machine and how.

In Example 12-8, we specify the web group as allowing anybody on the Internet to connect to ports 80 and 443. For the SSH group, we allow anybody on the Internet to connect on port 22. For the outbound group, we allow outbound connections to anywhere on the Internet. We need outbound connections enabled in order to download packages from the Internet.

Example 12-8. Security groups

- name: web security group
  ec2_group:
    name: web
    description: allow http and https access
    rules:
      - proto: tcp
        from_port: 80
        to_port: 80
        cidr_ip: 0.0.0.0/0
      - proto: tcp
        from_port: 443
        to_port: 443
        cidr_ip: 0.0.0.0/0

- name: ssh security group
  ec2_group:
    name: ssh
    description: allow ssh access
    rules:
      - proto: tcp
        from_port: 22
        to_port: 22
        cidr_ip: 0.0.0.0/0

- name: outbound group
  ec2_group:
    name: outbound
    description: allow outbound connections to the internet
    region: "{{ region }}"
    rules_egress:
      - proto: all
        cidr_ip: 0.0.0.0/0

NOTE

If you are using EC2-Classic, you don’t need to specify the outbound group, since EC2-Classic does not restrict outbound connections on instances.

If you haven’t used security groups before, the parameters to the rules dictionary bear some explanation. Table 12-3 provides a quick summary of the parameters for security group connection rules.

Parameter    Description
proto        IP protocol (tcp, udp, icmp) or “all” to allow all protocols and ports
cidr_ip      Subnet of IP addresses that are allowed to connect, using CIDR notation
from_port    The first port in the range of permitted ports
to_port      The last port in the range of permitted ports

Table 12-3. Security group rule parameters

Permitted IP Addresses

Security groups allow you to restrict which IP addresses are permitted to connect to an instance. You specify a subnet using classless interdomain routing (CIDR) notation. An example of a subnet specified with CIDR notation is 203.0.113.0/24,12 which means that the first 24 bits of the IP address must match the first 24 bits of 203.0.113.0. People sometimes just say “/24” to refer to the size of a CIDR that ends in /24.

A /24 is a nice value13 because it corresponds to the first three octets of the address, namely 203.0.113. What this means is that any IP address that starts with 203.0.113 is in the subnet, meaning any IP address in the range 203.0.113.0 to 203.0.113.255.

If you specify 0.0.0.0/0, that means that any IP address is permitted to connect.
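For example, if you wanted to allow SSH access only from a single /24 subnet rather than from the whole Internet, a rule might look like this sketch (203.0.113.0/24 is just the documentation example range):

- proto: tcp
  from_port: 22
  to_port: 22
  cidr_ip: 203.0.113.0/24   # only hosts in this /24 may connect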

Security Group Ports

One of the things that I find confusing about EC2 security groups is the from port and to port notation. EC2 allows you to specify a range of ports that you are allowed to access. For example, you could indicate that you are allowing TCP connections on any port from 5900 to 5999 by specifying:

- proto: tcp
  from_port: 5900
  to_port: 5999
  cidr_ip: 0.0.0.0/0

However, I often find the from/to notation confusing, because I almost never specify a range of ports.14 Instead, I usually want to enable non-consecutive ports, such as 80 and 443. Therefore, in almost every case, the from_port and to_port parameters are going to be the same.

The ec2_group module has a number of other parameters, including specifying inbound rules using security group IDs, as well as specifying outbound connection rules. Check out the module’s documentation for more details.
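For example, rather than a CIDR block, a rule can name another security group as the permitted source, so that only instances in that group may connect. A sketch, assuming the rules list accepts a group_name parameter as described in the module documentation:

- proto: tcp
  from_port: 5432
  to_port: 5432
  group_name: web   # only instances in the web security group may connect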

Getting the Latest AMI

In Example 12-6, we explicitly specified the AMI like this:

image: ami-8caa1ce4

However, if you want to launch the latest Ubuntu 14.04 image, you don’t want to hardcode the AMI like this. That’s because Canonical15 frequently makes minor updates to Ubuntu, and every time it makes a minor update, it generates a new AMI. Just because ami-8caa1ce4 corresponds to the latest release of Ubuntu 14.04 yesterday doesn’t mean it will correspond to the latest release of Ubuntu 14.04 tomorrow.

Ansible ships with a nifty little module called ec2_ami_search (written by yours truly) that will retrieve the AMI that corresponds to a given operating system release. Example 12-9 shows this module in action:

Example 12-9. Retrieving the latest Ubuntu AMI

- name: Create an ubuntu instance on Amazon EC2
  hosts: localhost
  tasks:
    - name: Get the ubuntu trusty AMI
      ec2_ami_search: distro=ubuntu release=trusty region=us-west-1
      register: ubuntu_image

    - name: start the instance
      ec2:
        image: "{{ ubuntu_image.ami }}"
        instance_type: m3.medium
        key_name: mykey
        group: [web, ssh, outbound]
        instance_tags: { type: web, env: production }

Currently, the module only supports looking up Ubuntu AMIs.

Adding a New Instance to a Group

Sometimes I like to write a single playbook that launches an instance and then runs a playbook against that instance.

Unfortunately, before you’ve run the playbook, the host doesn’t exist yet. Disabling caching on the dynamic inventory script won’t help here, because Ansible only invokes the dynamic inventory script at the beginning of playbook execution, which is before the host exists.

You can add a task that uses the add_host module to add the instance to a group, as shown in Example 12-10.

Example 12-10. Adding an instance to groups

- name: Create an ubuntu instance on Amazon EC2
  hosts: localhost
  tasks:
    - name: start the instance
      ec2:
        image: ami-8caa1ce4
        instance_type: m3.medium
        key_name: mykey
        group: [web, ssh, outbound]
        instance_tags: { type: web, env: production }
      register: ec2

    - name: add the instance to web and production groups
      add_host: hostname={{ item.public_dns_name }} groups=web,production
      with_items: ec2.instances

- name: do something to production webservers
  hosts: web:&production
  tasks:
    - ...

RETURN TYPE OF THE EC2 MODULE

The ec2 module returns a dictionary with three fields, shown in Table 12-4.

Parameter           Description
instance_ids        List of instance IDs
instances           List of instance dicts
tagged_instances    List of instance dicts

Table 12-4. Return value of ec2 module

If the user passes the exact_count parameter to the ec2 module, then the module might not create new instances, as described in “Creating Instances the Idempotent Way”. In this case, the instance_ids and instances fields will be populated only if the module creates new instances. However, the tagged_instances field will contain instance dicts for all of the instances that match the tags, whether they were just created or already existed.
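A quick way to see the difference is to print both fields after registering the result; on a second run of a playbook that uses exact_count, instances will be empty while tagged_instances still lists every matching instance (a minimal sketch):

- name: show instances created on this run
  debug: var=ec2.instances

- name: show all matching instances, new or preexisting
  debug: var=ec2.tagged_instances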

An instance dict contains the fields shown in Table 12-5.

Parameter           Description
id                  Instance ID
ami_launch_index    Instance index within a reservation (between 0 and N-1) if N launched
private_ip          Internal IP address (not routable outside of EC2)
private_dns_name    Internal DNS name (not routable outside of EC2)
public_ip           Public IP address
public_dns_name     Public DNS name
state_code          Reason code for the state change
architecture        CPU architecture
image_id            AMI
key_name            Key pair name
placement           Location where the instance was launched
kernel              AKI (Amazon kernel image)
ramdisk             ARI (Amazon ramdisk image)
launch_time         Time instance was launched
instance_type       Instance type
root_device_type    Type of root device (ephemeral, EBS)
root_device_name    Name of root device
state               State of instance
hypervisor          Hypervisor type

Table 12-5. Contents of instance dicts

For more details on what these fields mean, check out the Boto16 documentation for the boto.ec2.instance.Instance class or the documentation for the output of the run-instances command of Amazon’s command-line tool.17

Waiting for the Server to Come Up

While IaaS clouds like EC2 are remarkable feats of technology, they still require a finite amount of time to create new instances. What this means is that you can’t run a playbook against an EC2 instance immediately after you’ve submitted a request to create it. Instead, you need to wait for the EC2 instance to come up.

The ec2 module supports a wait parameter. If it’s set to “yes,” then the ec2 task will not return until the instance has transitioned to the running state:

- name: start the instance
  ec2:
    image: ami-8caa1ce4
    instance_type: m3.medium
    key_name: mykey
    group: [web, ssh, outbound]
    instance_tags: { type: web, env: production }
    wait: yes
  register: ec2

Unfortunately, waiting for the instance to be in the running state isn’t enough to ensure that you can actually execute a playbook against a host. You still need to wait until the instance has advanced far enough in the boot process that the SSH server has started and is accepting incoming connections.

The wait_for module is designed for this kind of scenario. Here’s how you would use the ec2 and wait_for modules in concert to start an instance and then wait until the instance is ready to receive SSH connections:

- name: start the instance
  ec2:
    image: ami-8caa1ce4
    instance_type: m3.medium
    key_name: mykey
    group: [web, ssh, outbound]
    instance_tags: { type: web, env: production }
    wait: yes
  register: ec2

- name: wait for ssh server to be running
  wait_for: host={{ item.public_dns_name }} port=22 search_regex=OpenSSH
  with_items: ec2.instances

This invocation of wait_for uses the search_regex argument to look for the string OpenSSH after connecting to the host. This regex takes advantage of the fact that a fully functioning SSH server will return a string that looks something like Example 12-11 when an SSH client first connects.

Example 12-11. Initial response of an SSH server running on Ubuntu

SSH-2.0-OpenSSH_5.9p1 Debian-5ubuntu1.4

We could invoke the wait_for module to just check if port 22 is listening for incoming connections. However, sometimes an SSH server has gotten far enough along in the startup process that it is listening on port 22, but is not fully functional yet. Waiting for the initial response ensures that the wait_for module will only return when the SSH server has fully started up.
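For comparison, the port-only check would look like the following sketch; it returns as soon as anything is listening on port 22, whether or not the SSH server is fully functional:

- name: wait for port 22 to be listening (less robust)
  wait_for: host={{ item.public_dns_name }} port=22
  with_items: ec2.instances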

Creating Instances the Idempotent Way

Playbooks that invoke the ec2 module are not generally idempotent. If you were to execute Example 12-6 multiple times, EC2 would create multiple instances.

You can write idempotent playbooks with the ec2 module by using the count_tag and exact_count parameters.

Let’s say we want to write a playbook that starts three instances. We want this playbook to be idempotent, so if three instances are already running, we want the playbook to do nothing. Example 12-12 shows what it would look like:

Example 12-12. Idempotent instance creation

- name: start the instance
  ec2:
    image: ami-8caa1ce4
    instance_type: m3.medium
    key_name: mykey
    group: [web, ssh, outbound]
    instance_tags: { type: web, env: production }
    exact_count: 3
    count_tag: { type: web }

The exact_count: 3 parameter tells Ansible to ensure that exactly three instances are running that match the tags specified in count_tag. In our example, I only specified one tag for count_tag, but it does support multiple tags.

When running this playbook for the first time, Ansible will check how many instances are currently running that are tagged with type=web. Assuming there are no such instances, Ansible will create three new instances and tag them with type=web and env=production.

When running this playbook the next time, Ansible will check how many instances are currently running that are tagged with type=web. It will see that there are three instances running and will not start any new instances.
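Because count_tag supports multiple tags, and exact_count describes a desired total rather than an increment, the two compose naturally. Here’s a hedged sketch that counts against both tags and scales the group up; raising exact_count to 5 on a later run would create only the two missing instances and leave the existing ones untouched:

- name: ensure five production webservers are running
  ec2:
    image: ami-8caa1ce4
    instance_type: m3.medium
    key_name: mykey
    group: [web, ssh, outbound]
    instance_tags: { type: web, env: production }
    exact_count: 5
    count_tag: { type: web, env: production }   # count instances matching both tags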

Putting It All Together

Example 12-13 shows the playbook that creates three EC2 instances and configures them as web servers. The playbook is idempotent, so you can safely run it multiple times; it will create new instances only if they haven’t been created yet.

Note how we use the tagged_instances return value of the ec2 module, instead of the instances return value, for reasons described in “Return Type of the ec2 Module”.

Example 12-13. ec2-example.yml: Complete EC2 playbook

---
- name: launch webservers
  hosts: localhost
  vars:
    region: us-west-1
    instance_type: t2.micro
    count: 3
  tasks:
    - name: ec2 keypair
      ec2_key: name=mykey key_material="{{ item }}" region={{ region }}
      with_file: ~/.ssh/id_rsa.pub

    - name: web security group
      ec2_group:
        name: web
        description: allow http and https access
        region: "{{ region }}"
        rules:
          - proto: tcp
            from_port: 80
            to_port: 80
            cidr_ip: 0.0.0.0/0
          - proto: tcp
            from_port: 443
            to_port: 443
            cidr_ip: 0.0.0.0/0

    - name: ssh security group
      ec2_group:
        name: ssh
        description: allow ssh access
        region: "{{ region }}"
        rules:
          - proto: tcp
            from_port: 22
            to_port: 22
            cidr_ip: 0.0.0.0/0

    - name: outbound security group
      ec2_group:
        name: outbound
        description: allow outbound connections to the internet
        region: "{{ region }}"
        rules_egress:
          - proto: all
            cidr_ip: 0.0.0.0/0

    - name: Get the ubuntu trusty AMI
      ec2_ami_search: distro=ubuntu release=trusty virt=hvm region={{ region }}
      register: ubuntu_image

    - name: start the instances
      ec2:
        region: "{{ region }}"
        image: "{{ ubuntu_image.ami }}"
        instance_type: "{{ instance_type }}"
        key_name: mykey
        group: [web, ssh, outbound]
        instance_tags: { Name: ansiblebook, type: web, env: production }
        exact_count: "{{ count }}"
        count_tag: { type: web }
        wait: yes
      register: ec2

    - name: add the instance to web and production groups
      add_host: hostname={{ item.public_dns_name }} groups=web,production
      with_items: ec2.tagged_instances
      when: item.public_dns_name is defined

    - name: wait for ssh server to be running
      wait_for: host={{ item.public_dns_name }} port=22 search_regex=OpenSSH
      with_items: ec2.tagged_instances
      when: item.public_dns_name is defined

- name: configure webservers
  hosts: web:&production
  sudo: True
  roles:
    - web

Specifying a Virtual Private Cloud

So far, we’ve been launching our instances into the default virtual private cloud (VPC). Ansible also allows us to create new VPCs and launch instances into them.

WHAT IS A VPC?

Think of a VPC as an isolated network. When you create a VPC, you specify an IP address range. It must be a subset of one of the private address ranges (10.0.0.0/8, 172.16.0.0/12, or 192.168.0.0/16).

You carve up your VPC into subnets, which have IP ranges that are subsets of the IP range of your entire VPC. In Example 12-14, the VPC has the IP range 10.0.0.0/16, and we associate two subnets: 10.0.0.0/24 and 10.0.1.0/24.

When you launch an instance, you assign it to a subnet in a VPC. You can configure your subnets so that your instances get either public or private IP addresses. EC2 also allows you to define routing tables for routing traffic between your subnets and to create Internet gateways for routing traffic from your subnets to the Internet.

Configuring networking is a complex topic that’s (way) outside the scope of this book. For more info, check out Amazon’s EC2 documentation on VPC.

Example 12-14 shows how to create a VPC with two subnets.

Example 12-14. create-vpc.yml: Creating a vpc

- name: create a vpc
  ec2_vpc:
    region: us-west-1
    internet_gateway: True
    resource_tags: { Name: "Book example", env: production }
    cidr_block: 10.0.0.0/16
    subnets:
      - cidr: 10.0.0.0/24
        resource_tags:
          env: production
          tier: web
      - cidr: 10.0.1.0/24
        resource_tags:
          env: production
          tier: db
    route_tables:
      - subnets:
          - 10.0.0.0/24
          - 10.0.1.0/24
        routes:
          - dest: 0.0.0.0/0
            gw: igw

Creating a VPC is idempotent; Ansible uniquely identifies the VPC based on a combination of the resource_tags and the cidr_block parameters. Ansible will create a new VPC if no existing VPC matches the resource tags and CIDR block.18

Admittedly, Example 12-14 is a simple example from a networking perspective, as we’ve just defined two subnets that are both connected to the Internet. A more realistic example would have one subnet that’s routable to the Internet, and another subnet that’s not routable to the Internet, and we’d have some rules for routing traffic between the two subnets.

Example 12-15 shows a complete example of creating a VPC and launching instances into it.

Example 12-15. ec2-vpc-example.yml: Complete EC2 playbook that specifies a VPC

---
- name: launch webservers into a specific vpc
  hosts: localhost
  vars:
    instance_type: t2.micro
    count: 1
    region: us-west-1
  tasks:
    - name: create a vpc
      ec2_vpc:
        region: "{{ region }}"
        internet_gateway: True
        resource_tags: { Name: book, env: production }
        cidr_block: 10.0.0.0/16
        subnets:
          - cidr: 10.0.0.0/24
            resource_tags:
              env: production
              tier: web
          - cidr: 10.0.1.0/24
            resource_tags:
              env: production
              tier: db
        route_tables:
          - subnets:
              - 10.0.0.0/24
              - 10.0.1.0/24
            routes:
              - dest: 0.0.0.0/0
                gw: igw
      register: vpc

    - set_fact: vpc_id={{ vpc.vpc_id }}

    - name: set ec2 keypair
      ec2_key: name=mykey key_material="{{ item }}"
      with_file: ~/.ssh/id_rsa.pub

    - name: web security group
      ec2_group:
        name: vpc-web
        region: "{{ region }}"
        description: allow http and https access
        vpc_id: "{{ vpc_id }}"
        rules:
          - proto: tcp
            from_port: 80
            to_port: 80
            cidr_ip: 0.0.0.0/0
          - proto: tcp
            from_port: 443
            to_port: 443
            cidr_ip: 0.0.0.0/0

    - name: ssh security group
      ec2_group:
        name: vpc-ssh
        region: "{{ region }}"
        description: allow ssh access
        vpc_id: "{{ vpc_id }}"
        rules:
          - proto: tcp
            from_port: 22
            to_port: 22
            cidr_ip: 0.0.0.0/0

    - name: outbound security group
      ec2_group:
        name: vpc-outbound
        description: allow outbound connections to the internet
        region: "{{ region }}"
        vpc_id: "{{ vpc_id }}"
        rules_egress:
          - proto: all
            cidr_ip: 0.0.0.0/0

    - name: Get the ubuntu trusty AMI
      ec2_ami_search: distro=ubuntu release=trusty virt=hvm region={{ region }}
      register: ubuntu_image

    - name: start the instances
      ec2:
        image: "{{ ubuntu_image.ami }}"
        region: "{{ region }}"
        instance_type: "{{ instance_type }}"
        assign_public_ip: True
        key_name: mykey
        group: [vpc-web, vpc-ssh, vpc-outbound]
        instance_tags: { Name: book, type: web, env: production }
        exact_count: "{{ count }}"
        count_tag: { type: web }
        vpc_subnet_id: "{{ vpc.subnets[0].id }}"
        wait: yes
      register: ec2

    - name: add the instance to web and production groups
      add_host: hostname={{ item.public_dns_name }} groups=web,production
      with_items: ec2.tagged_instances
      when: item.public_dns_name is defined

    - name: wait for ssh server to be running
      wait_for: host={{ item.public_dns_name }} port=22 search_regex=OpenSSH
      with_items: ec2.tagged_instances
      when: item.public_dns_name is defined

- name: configure webservers
  hosts: web:&production
  sudo: True
  roles:
    - web

NOTE

Unfortunately, as of this writing, the Ansible ec2 module can’t handle the case where you have security groups with the same name in different VPCs. This means we can’t have an SSH security group defined in multiple VPCs, because the module will try to associate all of the SSH security groups when we launch an instance. In our example, I’ve used different names for these security groups. I’m hoping this will be fixed in a future version of the module.

Dynamic Inventory and VPC

Oftentimes, when using a VPC, you will place some instances inside of a private subnet that is not routable from the Internet. When you do this, there is no public IP address associated with the instance.

In this case, you might want to run Ansible from an instance inside of your VPC. The Ansible dynamic inventory script is smart enough that it will return internal IP addresses for VPC instances that don’t have public IP addresses.

See Appendix C for details on how you can use IAM roles to run Ansible inside of a VPC without needing to copy EC2 credentials to the instance.

Building AMIs

There are two approaches you can take to creating custom Amazon machine images (AMIs) with Ansible. You can use the ec2_ami module, or you can use a third-party tool called Packer that has support for Ansible.

With the ec2_ami Module

The ec2_ami module will take a running instance and snapshot it into an AMI. Example 12-16 shows this module in action.

Example 12-16. Creating an AMI with the ec2_ami module

- name: create an AMI
  hosts: localhost
  vars:
    instance_id: i-dac5473b
  tasks:
    - name: create the AMI
      ec2_ami:
        name: web-nginx
        description: Ubuntu 14.04 with nginx installed
        instance_id: "{{ instance_id }}"
        wait: yes
      register: ami

    - name: output AMI details
      debug: var=ami
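The registered variable can then be fed back into the ec2 module to launch instances from the freshly created image. A hedged sketch, assuming the ec2_ami return value includes an image_id field per the module documentation:

- name: launch an instance from the new AMI
  ec2:
    image: "{{ ami.image_id }}"   # field name per ec2_ami docs; verify in your version
    region: us-west-1
    instance_type: m3.medium
    key_name: mykey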

With Packer

The ec2_ami module works just fine, but you have to write some additional code to create and terminate the instance.

There’s an open source tool called Packer that will automate the creation and termination of an instance for you. Packer also happens to be written by Mitchell Hashimoto, the creator of Vagrant.

Packer can create different types of images and works with different configuration management tools. This chapter focuses on using Packer to create AMIs using Ansible, but you can also use Packer to create images for other IaaS clouds, such as Google Compute Engine, DigitalOcean, or OpenStack. It can even be used to create Vagrant boxes and Docker containers. It also supports other configuration management tools, such as Chef, Puppet, and Salt.

To use Packer, you create a configuration file in JSON format and then use the packer command-line tool to create the image using the configuration file.

Example 12-17 shows a sample Packer configuration file that uses Ansible to create an AMI with our web role.

Example 12-17. web.json

{
  "builders": [
    {
      "type": "amazon-ebs",
      "region": "us-west-1",
      "source_ami": "ami-50120b15",
      "instance_type": "t2.micro",
      "ssh_username": "ubuntu",
      "ami_name": "web-nginx-{{timestamp}}",
      "tags": {
        "Name": "web-nginx"
      }
    }
  ],
  "provisioners": [
    {
      "type": "shell",
      "inline": [
        "sleep 30",
        "sudo apt-get update",
        "sudo apt-get install -y ansible"
      ]
    },
    {
      "type": "ansible-local",
      "playbook_file": "web-ami.yml",
      "role_paths": [
        "/Users/lorinhochstein/dev/ansiblebook/ch12/playbooks/roles/web"
      ]
    }
  ]
}

Use the packer build command to create the AMI:

$ packer build web.json

The output looks like this:

==> amazon-ebs: Inspecting the source AMI...
==> amazon-ebs: Creating temporary keypair: packer 546919ba-cb97-4a9e-1c21-389633dc0779
==> amazon-ebs: Creating temporary security group for this instance...
...
==> amazon-ebs: Stopping the source instance...
==> amazon-ebs: Waiting for the instance to stop...
==> amazon-ebs: Creating the AMI: web-nginx-1416174010
    amazon-ebs: AMI: ami-963fa8fe
==> amazon-ebs: Waiting for AMI to become ready...
==> amazon-ebs: Adding tags to AMI (ami-963fa8fe)...
    amazon-ebs: Adding tag: "Name": "web-nginx"
==> amazon-ebs: Terminating the source AWS instance...
==> amazon-ebs: Deleting temporary security group...
==> amazon-ebs: Deleting temporary keypair...
Build 'amazon-ebs' finished.

==> Builds finished. The artifacts of successful builds are:
--> amazon-ebs: AMIs were created:

us-west-1: ami-963fa8fe

Example 12-17 has two sections: builders and provisioners. The builders section refers to the type of image being created. In our case, we are creating an Elastic Block Store–backed (EBS) Amazon Machine Image, so we use the amazon-ebs builder.

Packer needs to start a new instance to create an AMI, so you need to configure Packer with all of the information you typically need when creating an instance: EC2 region, AMI, and instance type. Packer doesn’t need to be configured with a security group because it will create a temporary security group automatically, and then delete that security group when it is finished. Like Ansible, Packer needs to be able to SSH to the created instance. Therefore, you need to specify the SSH username in the Packer configuration file.

You also need to tell Packer what to name your AMI, as well as any tags you want to apply to it. Because AMI names must be unique, we use the {{timestamp}} function to insert a Unix timestamp. A Unix timestamp encodes the date and time as the number of seconds since Jan. 1, 1970, UTC. See the Packer documentation for more information about the functions that Packer supports.

Because Packer needs to interact with EC2 to create the AMI, it needs access to your EC2 credentials. Like Ansible, Packer can read your EC2 credentials from environment variables, so you don’t need to specify them explicitly in the configuration file, although you can if you prefer.

The provisioners section refers to the tools used to configure the instance before it is captured as an image. Packer supports an Ansible local provisioner: it runs Ansible on the instance itself. That means that Packer needs to copy over all of the necessary Ansible playbooks and related files before it runs, and it also means that Ansible must be installed on the instance before it executes Ansible.

Packer supports a shell provisioner that lets you run arbitrary commands on the instance. Example 12-17 uses this provisioner to install Ansible as an Ubuntu apt package. To avoid a race situation with trying to install packages before the operating system is fully booted up, the shell provisioner in our example is configured to wait for 30 seconds before installing Ansible.

Example 12-18 shows the web-ami.yml playbook we use for configuring an instance. It’s a simple playbook that applies the web role to the local machine. Because it uses the web role, the configuration file must explicitly specify the location of the directory that contains the web role so that Packer can copy the web role’s files to the instance.

Example 12-18. web-ami.yml

- name: configure a webserver as an ami
  hosts: localhost
  sudo: True
  roles:
    - web

Instead of selectively copying over roles, we can also tell Packer to just copy our entire playbooks directory instead. In that case, the configuration file would look like Example 12-19.

Example 12-19. web-pb.json copying over the entire playbooks directory

{
  "builders": [
    {
      "type": "amazon-ebs",
      "region": "us-west-1",
      "source_ami": "ami-50120b15",
      "instance_type": "t2.micro",
      "ssh_username": "ubuntu",
      "ami_name": "web-nginx-{{timestamp}}",
      "tags": {
        "Name": "web-nginx"
      }
    }
  ],
  "provisioners": [
    {
      "type": "shell",
      "inline": [
        "sleep 30",
        "sudo apt-get update",
        "sudo apt-get install -y ansible"
      ]
    },
    {
      "type": "ansible-local",
      "playbook_file": "web-ami.yml",
      "playbook_dir": "/Users/lorinhochstein/dev/ansiblebook/ch12/playbooks"
    }
  ]
}

NOTE

As of this writing, Packer doesn’t support SSH agent forwarding. Check GitHub for the current status of this issue.

Packer has a lot more functionality than we can cover here. Check out its documentation for more details.

Other Modules

Ansible supports even more of EC2, as well as other AWS services. For example, you can use Ansible to launch CloudFormation stacks with the cloudformation module, put files into S3 with the s3 module, modify DNS records with the route53 module, create autoscaling groups with the ec2_asg module, create autoscaling configuration with the ec2_lc module, and more.
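For instance, uploading a file to an S3 bucket is a one-task affair. A minimal sketch (parameter names per the s3 module documentation; the bucket and file names are made up for illustration):

- name: upload a file to an S3 bucket
  s3: bucket=my-example-bucket object=/hello.txt src=files/hello.txt mode=put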

Using Ansible with EC2 is a large enough topic that you could write a whole book about it. In fact, Yan Kurniawan is writing a book on Ansible and AWS. After digesting this chapter, you should have enough knowledge under your belt to pick up these additional modules without difficulty.

1 The National Institute of Standards and Technology (NIST) has a pretty good definition of cloud computing: The NIST Definition of Cloud Computing.

2 You can add tags to entities other than instances, such as AMIs, volumes, and security groups.

3 Or maybe it’s ~/.bashrc? I’ve never figured out the difference between the various Bash dotfiles.

4 You might need to use sudo or activate a virtualenv to install this package, depending on how you installed Ansible.

5 And, to be honest, I have no idea where the package managers install this file.

6 Amazon’s internal network is divided up into subnets, but users do not have any control over how instances are allocated to subnets.

7 Go to Amazon for more details on VPC and whether you have access to EC2-Classic in a region.

8 It’s possible to retrieve the host key by querying EC2 for the instance console output, but I must admit that I never bother doing this because I’ve never gotten around to writing a proper script that parses out the host key from the console output.

9 Visit Amazon for a list of the regions that it supports.

10 There’s also a handy (unofficial) website that provides a single table with all of the available EC2 instance types.

11 For more information on Jinja2 tests, see the Jinja2 documentation page on built-in tests.

12 This example happens to correspond to a special IP address range named TEST-NET-3, which is reserved for examples. It’s the example.com of IP subnets.

13 Subnets that are /8, /16, and /24 make great examples because the math is much easier than, say, /17 or /23.

14 Astute observers might have noticed that ports 5900-5999 are commonly used by the VNC remote desktop protocol, one of the few applications where specifying a range of ports makes sense.

15 Canonical is the company that runs the Ubuntu project.

16 Boto is the Python library that Ansible uses to communicate with EC2.

17 The command-line tool is documented at http://aws.amazon.com/cli/.

18 As of this writing, a bug in Ansible causes it to incorrectly report a state of changed each time this module is invoked, even if it does not create a VPC.