An Introduction to Chef - Test-Driven Infrastructure with Chef (2011)

Chapter 3. An Introduction to Chef

The best way to learn is to do. A lot of technical books, even ones aimed at beginners, take the form of a lengthy discursive preamble, followed by some abstract example for the reader to digest and understand. The trouble with this is it doesn’t map well onto how we learn technical skills. Learning a technical skill is like teaching a child to ride a bicycle. You can’t really teach someone the theory, and then show them a video of someone else cycling, and then expect them to just pick it up by themselves at some point in the future. A much better way is to go out there and then, with a bicycle, plonk them on, give them a push, and help them when they wobble.

Learning a technical skill or a programming language is very much about immersion. The learning process is reinforced by making mistakes, looking up documentation, asking more experienced people, and building up competence ourselves. So, to introduce the fundamental ideas of Chef, we’ll build some real infrastructure, which we’ll actually use later in the book. This chapter and the next are unashamedly influenced by the excellent series of books and courses by Zed Shaw found at Learn Code the Hard Way. That approach focuses on diving in and using real examples, and it has proven to be an excellent method for building confidence and expertise in a technical subject.

The approach, as explained on the website:

“…emphasizes precision, attention to detail, and persistence by requiring you to type each exercise (no copy-paste!) and make it run, as well as to read up on outside topics and to return to exercises and ideas that you don’t understand, and understand them.”

At the end of this chapter and the next, you’ll understand the basics of Chef, have hands-on experience writing cookbooks and recipes, and use community resources to frame your infrastructure as code. Once we’ve covered these fundamentals, we’ll go on to look at some of the tools we can use to start thinking about test-driven infrastructure development, and then look at a full example of using these tools in practice.

I’m making some broad assumptions about your ability and set-up. They are as follows:

§ You can type instructions into a command prompt.

§ You can edit text.

§ You have a computer and have administrative power over it.

§ Your computer was made some time in the last four or five years, and has about 2 GB of memory or more.

§ You have a connection to the Internet.

§ You are not behind a proxy server, or can easily disable it.[1]

Anything beyond this is a bonus. For example, if you have access to dedicated test hardware and several machines, that’s excellent. However, that’s not needed at all. If you don’t have administrative control over your computer, or have a very old computer with not much memory, you probably want to fix that before we continue. In the first edition, I made the assumption that people would have access to a public-cloud infrastructure, or would be prepared to pay for their own (minimal) use. In this edition I’ve moved toward the view that people are more likely to have adequate hardware, and want to work with local virtual machines rather than machines hosted with a public cloud provider. Most Chef users these days make heavy use of local virtualization in addition to the cloud, and so I’ve decided to include setting up such a capability as a fundamental task. If this is truly impossible for you, simply skim the sections in Chapter 4 where we install VirtualBox, and once we get to installing Vagrant, set it up to use the Rackspace cloud or EC2.

The basic format is that I will set an objective, or set of objectives, that you will be asked to achieve. The objectives will be the equivalent of acceptance criteria; you’ll know you’re done when those objectives have been met. I’ll then give you high-level directions on how to meet the objectives. They categorically are not instructions for you to follow, but rather an outline of the high-level steps you need to follow. My expectation is that you will be able to work out how to follow those directions by a combination of referring to other sections in the book, using your own knowledge and common sense, and using the main online resources for Chef:

§ http://docs.opscode.com

§ http://wiki.opscode.com

§ #chef and #learnchef on irc.freenode.net

§ The chef-users mailing list

I will follow the directions with a worked example. I ask explicitly that, if you’re reading this digitally, you don’t simply copy and paste the example into your own system; doing so contravenes the spirit of “the hard way.” Additionally, your system may be subtly different from mine. I suggest you use my worked example as guidance as you achieve the objectives yourself. If you want to use the material in the worked example, I ask that you type it out yourself. Try to solve the exercises yourself, and only once you’ve tried, move on to look at the worked example.

Finally, we’ll discuss the way we achieved the objectives, covering any interesting points that arose, and ensuring the way we achieved them is fully understood. Again, I would firmly ask that if you don’t understand the discussion, don’t carry on with the next set of objectives. Go back over the instructions and discussion, and if you’re still stuck, seek help via the online resources previously mentioned. This is for your sake—master the fundamentals and build on them.

The infrastructure we’re going to build over the next two chapters is a cookbook development and testing environment, including some useful tools, and setting up VirtualBox, Vagrant, and Test Kitchen. We’re going to imagine we’re in a position where we want to share this infrastructure with a few other users, and that we’re going to host it on a physical machine somewhere on the public Internet, so we can collaborate with our friends and colleagues in different locations and timezones.

Exercise 1: Install Chef

Objectives

After completing this exercise, you will have done the following:

§ Installed the latest version of the Chef client tools on your machine

§ Identified how to find help on your machine

§ Understood the purpose of each of the tools that ship with Chef

Directions

1. Search for the term “omnibus” on http://docs.opscode.com and read and understand how this helps us install Chef on our systems.

2. Install Chef on your computer using the Omnibus package for your platform.

3. Access the documentation installed on the computer for chef-apply, chef-solo, chef-client, chef-shell, and knife.

4. Search http://docs.opscode.com for each tool and read about what they do.

Worked Example

I set up two machines—one running Ubuntu 12.04, one running CentOS 6.4, both 64-bit. I then browsed to http://docs.opscode.com/search.html, and searched for the word “Omnibus”. The top link provided an overview of how to install Chef on a workstation. It contained more information than I needed, but I identified that I should visit the http://www.opscode.com/chef/install page, and that for Linux and Unix machines, the installation process was broadly to run an install script, piped through a shell, with super-user privileges.

I browsed to the install page, filled out the form, and followed the instructions, which on each machine amounted to me running the following command:

curl -L https://www.opscode.com/chef/install.sh | sudo bash

On my CentOS machine, sudo was not configured, so I changed to the root user, and ran the command without sudo.

During the writing process, I also had 32-bit Ubuntu 13.04 machines. I mention this because the installation process was a bit trickier, as there weren’t any 32-bit packages for 13.04. Instead, I selected 12.10, which did offer a 32-bit package, downloaded the package manually, and installed it with the following command:

$ sudo dpkg --install chef-11.4-4.2.ubuntu*.deb

I verified the installation on each machine by opening a terminal and running:

$ chef-client --version

To obtain help for each of the listed commands, I ran the command with the --help switch. I identified that chef-apply didn’t require a configuration file, but the others did. chef-solo and chef-shell seemed simpler than chef-client, which had considerably more option flags. Knife seemed to have much more information available, including a knife help subcommand. I ran knife help knife and knife help list, and skimmed the pages.

I then browsed to http://docs.opscode.com and searched for each command. I found that I needed to quote the commands in order to get appropriate results. I read the documentation on chef-solo and chef-client. chef-apply only had a single line, and chef-shell yielded only a result telling me that this was once called “Shef”. A search for “Shef” didn’t bring results either, so I tried http://wiki.opscode.com, where I found a page about Shef, http://wiki.opscode.com/display/chef/Shef, which I skimmed.

Discussion

As you can see, installing Chef is a breeze! Opscode provides a fully supported package install for most platforms, including Windows and commercial Unix operating systems. These packages vendor everything needed to run Chef into an isolated location (typically /opt)—this includes Ruby, OpenSSL, and other supporting tools and libraries.

When we ran the following code, it downloaded and executed a simple shell script that calculated the exact version of the native OS package required, downloaded the package, installed it, and added the vendored location of the Chef commands to your user’s path:

curl -L https://www.opscode.com/chef/install.sh | sudo bash

If you are worried about running arbitrary shell scripts on your machine with root privileges, you can always download the script, inspect it, and run it yourself. However, realistically, if you trust Opscode to develop an automation framework upon which you’re going to base the running of your entire infrastructure, I think you can probably risk running the shell script that installs it.

Having installed Chef, we saw that we had five new commands available on our system:

§ chef-apply

§ chef-shell

§ chef-solo

§ chef-client

§ knife

I asked you to make yourself familiar with the help available, both on your computer and on the Opscode documentation site. Naturally I don’t expect much of this to make sense right now, but it’s vital that you develop the impulse of using --help, help, and the Opscode documentation sites throughout the book. While I am “virtually” with you on this journey, in the real world, things won’t work as expected, and knowing where to look for help from the start is a great foundation. I’ll go on to explain what each of these tools is for, but first let’s cover, at a high level, what Chef actually is.

Chef is an open source tool and framework that provides system administrators and developers with a foundation of APIs and libraries, which makes this kind of workflow possible.

Chef allows us to effectively write programs that generate configuration directly on the machines we need to manage. We then keep these programs in version control, and use them to gain control of the complex systems we need to manage.

Navigating the labyrinth of resources that we need to provide an application infrastructure becomes achievable because, through its libraries and APIs, Chef presents a declarative interface to these resources. This allows us to define a policy and express the infrastructure requirements at a higher level—specifying what resources are required, but without specifying how.

Architecturally, machines managed by Chef pull configuration information rather than being passive receivers, which means that the infrastructure remains convergent: over time it will move into line with defined policy. A machine that was down for maintenance will pull its configuration as soon as it rejoins the network, rather than depending on a push that happens only if the administrator remembers that the machine missed the last update.
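To make the pull model concrete, here is a minimal sketch in plain Ruby. It is a toy simulation, not Chef code: the PolicyServer and ManagedNode classes and their methods are names I have invented for illustration. It shows a node that was offline when a policy was published catching up on its next pull, with nobody having to remember to push to it.

```ruby
# Toy model of pull-based convergence. Nothing here is Chef's API;
# PolicyServer and ManagedNode are hypothetical names for illustration.
class PolicyServer
  attr_reader :version, :policy

  def initialize
    @version = 0
    @policy = {}
  end

  # Publishing a new policy only bumps the version; no node is contacted.
  def publish(policy)
    @policy = policy
    @version += 1
  end
end

class ManagedNode
  attr_reader :name, :applied_version, :state
  attr_accessor :online

  def initialize(name)
    @name = name
    @applied_version = 0
    @state = {}
    @online = true
  end

  # Each node asks the server for the current policy on its own schedule.
  # Returns true only when the node actually had to change something.
  def pull(server)
    return false unless online
    return false if applied_version == server.version

    @state = server.policy.dup
    @applied_version = server.version
    true
  end
end

server = PolicyServer.new
web = ManagedNode.new("web1")
db = ManagedNode.new("db1")

db.online = false                        # db1 is down for maintenance
server.publish("ntp_server" => "pool.ntp.org")
[web, db].each { |n| n.pull(server) }    # only web1 converges

db.online = true                         # db1 rejoins the network
db.pull(server)                          # and catches up on its next pull

puts [web, db].map { |n| "#{n.name}: v#{n.applied_version}" }.join(", ")
```

Both nodes end up at the same policy version, even though the server never tracked which of them had missed the update.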

Therefore, Chef furnishes us with the power to build tools to help us manage infrastructure at scale. At the heart of the Chef approach is the recognition that the person who knows best how to run their own infrastructure is the person who lives with it on a day-to-day basis. Encapsulated in that daily experience is a wealth of domain experience, which leads to a clear understanding of the business and technology problems that are most pressing. Chef aims to furnish such a person with the ability to solve these problems in a creative, scalable, repeatable, maintainable, and shareable manner.

Let’s explore this a little further—Chef is a framework, a tool, and an API.

The Chef framework

As the discipline of software development has matured, frameworks have emerged with the aim of reducing development time by minimizing the overhead of having to implement or manage low-level details that support the development effort. This allows developers to concentrate on rapid delivery of software that meets customer requirements.

The common use of the word framework is to describe a supporting structure composed of parts fitted and joined together. The same is true in the software world. Frameworks tie together discrete components into a useful organic whole to provide structural support to the building of a software project. Frameworks also provide consistent and simple access to complex technologies by making wrappers available that simplify the interface between the programmer and underlying libraries.

Frameworks bring with them numerous benefits. In addition to increasing the speed of development, they can improve the quality of the software that is produced. Software frameworks provide conventions and design approaches that, if adhered to, encourage consistency across a team. Their modular design encourages code re-use and they frequently provide utilities to facilitate testing and debugging. By providing an extensive library of useful tools, frameworks reduce or eliminate the need for repetitive tasks and accord the developer a high degree of flexibility via abstraction.

Chef is a framework for infrastructure development—a supporting structure and package of associated benefits of direct relevance to framing one’s infrastructure as code. Chef provides an extensive library of primitives for managing just about every conceivable resource that is used in the process of building up an infrastructure within which we might deploy a software project. It also provides a powerful Ruby-based language for modeling infrastructure, and a consistent abstraction layer that allows developers and system administrators to design and build scalable environments without getting dragged into operating system and low-level implementation details. It also provides some design patterns and approaches for producing consistent, shareable, and reusable components.

The Chef tool

The use of tools is viewed by anthropologists as a hugely significant evolutionary milestone in the development of humans. Primitive tools enabled us to climb to the top of the food chain by allowing us to accomplish tasks that could not be carried out with our bodies alone. While tools have been available to system administrators and developers since the birth of computers, recent years have witnessed a further evolutionary leap, with the availability of network-enabled tools that can drive multiple services via a published API. These tools are frequently extensible, written in a modular fashion in powerful, flexible, high-level programming languages such as Python or Ruby.

Chef provides a number of such tools, built upon the framework:

Ohai

A system profiling tool that gathers large quantities of data about the system, from network and user data to software and kernel versions. Ohai is extendable—plug-ins can be written (usually in Ruby) that will furnish data in addition to the defaults. The collected data is emitted in a machine-parseable and readable format (JSON), and is used to build up a database of facts about each system that is managed by Chef.

chef-shell

An interactive debugging console that provides command-line access to the framework’s libraries, the API, and the local system’s data. This is an excellent tool for testing and exploring how Chef will behave under a variety of conditions. It allows the developer to run Chef within the Ruby interactive interpreter, IRB, and gives a read-eval-print loop ideal for debugging and exploring the data held on the Chef server.

chef-solo

A fully featured standalone configuration management tool that allows access to a subset of Chef’s features without using a Chef server; suitable for simple deployments.

chef-client

An agent that runs on systems being managed by Chef, and the primary mechanism by which such systems communicate with the Chef server. chef-client uses the framework’s library of primitives to configure resources on a system by talking to a central server API to retrieve data.

chef-apply

A lightweight tool for configuring a machine to perform a function with a single command, needing no configuration or Chef server.

knife

A multipurpose command-line tool that facilitates system automation, deployment, and integration. Knife provides command and control capabilities for managing physical, virtual, and cloud environments across a range of Linux, Unix, and Windows platforms. It is also the primary means by which the underlying model that makes up the Chef framework is managed. Knife is extensible and has a pluggable architecture, meaning that it is straightforward to create new functionality simply by writing custom Ruby scripts that include some of the Chef and Knife libraries. Used most frequently in conjunction with the client/server model, Knife assumes less significance if one’s primary Chef implementation is chef-solo.
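The Ohai entry above describes gathering facts about a system and emitting them as JSON. The following is a rough sketch of that idea using only Ruby’s standard library. It is not Ohai’s code, and the collect_facts method and the key names are my own choices for illustration; real Ohai output is far richer and structured differently.

```ruby
require "json"
require "etc"
require "socket"
require "rbconfig"

# A toy profiler in the spirit of Ohai; this is not Ohai's code or its
# output format. The key names below are chosen for illustration only.
def collect_facts
  {
    "hostname" => Socket.gethostname,
    "os" => RbConfig::CONFIG["host_os"],
    "cpu_arch" => RbConfig::CONFIG["host_cpu"],
    "ruby_version" => RUBY_VERSION,
    "current_user" => Etc.getlogin # may be nil when there is no login session
  }
end

facts = collect_facts
puts JSON.pretty_generate(facts)
```

Because the result is plain JSON, any other tool can parse it, which is what makes this kind of data useful as a database of facts about each system.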

The Chef API

In its most popular incarnation, Chef functions as a client/server web service.

The server component is written in Erlang and uses a JSON-oriented document datastore. The whole Chef framework is driven via a RESTful API, of which the Knife command-line tool is a client. We’ll drill into this API shortly, but the critical thing to understand is that in most cases, day-to-day use of the Chef framework translates directly to interfacing with the Chef server via its RESTful API.

The server is open sourced, under the Apache 2.0 license, and is considered a reference implementation of the Chef Server API. The API is also implemented as a hosted software-as-a-service offering. The hosted version, called Hosted Chef, offers a fully resilient, highly available, multitenant environment. The platform is free to use for fewer than five nodes, so it’s the ideal way to experiment with and gain experience with the framework, tool, and API. The pricing for the hosted platform is intended to be less than the cost of just the hardware resources to run a standalone server. For deployment in the enterprise, Opscode also provides a supported install on customer hardware, called Private Chef. This provides all the functionality of Hosted Chef, but behind the firewall with no multitenancy compromises.

The Chef server also provides an indexing service. All information gathered about the resources managed by Chef is indexed and searchable, meaning that Chef becomes a coordination point for dynamic, data-driven infrastructures. It is possible to issue queries for any combination of attributes—for example, VMware servers on VLAN 102 or MySQL slaves running CentOS 5. This opens up tremendously powerful capabilities—a simple example would be a dynamic load balancer configuration that automatically includes the web servers that match a given query to its pool of backend nodes.
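As a sketch of what a data-driven load balancer configuration might look like, the plain Ruby below filters a list of node attributes and turns the matches into backend entries. The node data, the search helper, and the output format are all hypothetical; this is not Chef’s search syntax or API, only the shape of the idea.

```ruby
# Hypothetical node data standing in for what the server's index holds.
NODES = [
  { "name" => "web1", "role" => "webserver",   "platform" => "centos", "ipaddress" => "10.0.0.11" },
  { "name" => "web2", "role" => "webserver",   "platform" => "ubuntu", "ipaddress" => "10.0.0.12" },
  { "name" => "db1",  "role" => "mysql_slave", "platform" => "centos", "ipaddress" => "10.0.0.21" }
].freeze

# Return every node whose attributes match all of the given criteria.
def search(nodes, criteria)
  nodes.select { |node| criteria.all? { |key, value| node[key] == value } }
end

# Build the backend pool from whatever currently matches the query.
webservers = search(NODES, "role" => "webserver")
backend_lines = webservers.map { |node| "server #{node['name']} #{node['ipaddress']}:80" }
puts backend_lines
```

Add a third web server to the data and it appears in the pool on the next run, with no change to the code that generates the configuration.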

The most important thing to understand is that the Chef server is fundamentally nothing more than a publishing platform with an API, an index, and a dependency solver. It does no heavy lifting. All interactions, without exception, are via the REST API.

The Chef community

Chef has a large and active community of users, with over 14,000 registered community members, over 700 individuals and companies as signed-up contributors, of which over 200 have committed code to the project. Opscode is a community-focused company. In the 55 releases that have been cut in the last four plus years, there have been 61 awards of most valuable person status (and another 24 for Ohai releases), for contributions to both the code and the community as a whole.

For a comparatively young product, uptake is very strong. Over a million known downloads of Chef have been recorded, with the real number being significantly larger. Adoption is on an exponential scale, from startups and small or medium enterprises (SMEs) through web operation poster-people such as Facebook, Etsy, 37signals, Rightscale, and Wikia to household names like Sony, Walt Disney, Turner, HP, and Adobe.

These companies all use Chef to automate the deployment of thousands of servers with a wide variety of applications and environments. Chef users can share their “recipes” for installing and configuring software with “cookbooks” on Opscode’s community website. Cookbooks exist for a large number of packages, with over 800 cookbooks available on the Opscode community site alone.

The cookbooks aspect of the community site can be thought of as akin to RubyGems—although the source of most of the cookbooks can be obtained at any time from GitHub, stable releases are made in the form of versioned cookbooks. Both the Chef project itself and many of the cookbooks from the opscode-cookbooks Git organization are consistently in GitHub’s list of the most popular watched repositories. In practice, these cookbooks are probably the most reusable IT artifacts I’ve encountered, partly due to the separation of data and behavior that the Chef framework encourages, and also due to the inherent power and flexibility accorded by the ability to configure and control complex systems with a mature 3GL programming language.

The community tends to gather around the mailing lists (one for users and one for developers), and the IRC channels on Freenode (again one for users, and one for developers). Chef users and developers tend to be highly experienced system administrators, developers, and architects, and are an outstanding source of advice and inspiration in general, as well as being friendly and approachable on the subject of Chef itself.

As the field of web operations has grown, the need to have a community of people who are solving hard problems, building tools, and sharing ideas has also expanded. Chef, as an expression of the concept of infrastructure as code, is precisely that: a sharing of minds, ideas, awesome-sauce, and expertise, in reusable, testable, auditable, and versionable code.

Exercise 2: Install a User

Objectives

After completing this exercise, you will have achieved the following:

§ Used Chef to create a user on your machine

§ Understood the principles behind Chef’s recipe DSL

§ Understood how to use chef-apply, and what its limitations are

Directions

1. Create a file called tdi.rb using your text editor.

2. Read the documentation for the “user” resource at http://docs.opscode.com/chef/resources.html#user.

3. Declare a resource in tdi.rb to create a user called “tdi”.

4. Create the user by running chef-apply.

5. Verify that the user has been created.

6. Add another resource of type dotfile to drop off a configuration file called .tdi with content parameter of “bogus”.

7. Run chef-apply again.

8. Observe the failure characteristics.

9. Replace the resource type “dotfile” with “file” and run chef-apply again.

10. Replace the “file” resource with a “template” resource, and change the “content” parameter to “source”.

11. Run chef-apply once more.

Worked Example

In my tdi.rb file I wrote the following:

user 'tdi' do
  action :create
  comment "Test Driven Infrastructure"
  home "/home/tdi"
  supports :manage_home => true
end

I saved the file and ran chef-apply. On my CentOS machine I was still using the root user, so I didn’t need to use sudo. On my Ubuntu machine I was logged in as my sns user, so I used sudo:

$ sudo chef-apply tdi.rb
Recipe: (chef-apply cookbook)::(chef-apply recipe)
  * user[tdi] action create
    - create user user[tdi]

I then verified the user existed:

sns@ubuntu:~$ getent passwd | grep tdi
tdi:x:1001:1001:Test Driven Infrastructure:/home/tdi:/bin/sh

[root@centos ~]# getent passwd | grep tdi
tdi:x:500:500:Test Driven Infrastructure:/home/tdi:/bin/bash

I noticed that on the Ubuntu machine, the new user’s default shell had not been set to Bash. Although this could easily be fixed by updating the recipe, I decided to fix it the quick and dirty way, with:

$ sudo chsh -s /bin/bash tdi

I added a bogus resource to tdi.rb as follows:

dotfile '/home/tdi/.tdi' do
  action :create
  content 'bogus'
end

When I ran Chef, I saw:

sns@ubuntu:~$ sudo chef-apply tdi.rb
[2013-06-26T20:09:10+01:00] FATAL: Stacktrace dumped to /var/chef/cache/chef-stacktrace.out
[2013-06-26T20:09:10+01:00] FATAL: NameError: Cannot find a resource for dotfile on ubuntu version 12.04

[root@centos ~]# chef-apply tdi.rb
[2013-06-26T19:28:11+01:00] FATAL: Stacktrace dumped to /var/chef/cache/chef-stacktrace.out
[2013-06-26T19:28:11+01:00] FATAL: NameError: Cannot find a resource for dotfile on centos version 6.4

Changing the resource to a “file” yielded the following:

Recipe: (chef-apply cookbook)::(chef-apply recipe)
  * user[tdi] action create (up to date)
  * file[/home/tdi/.tdi] action create
    - create new file /home/tdi/.tdi with content checksum 81f7e3
        --- /tmp/chef-tempfile20130528-13007-1cgpj8 2013-05-28 11:20:11.932272825 +0100
        +++ /tmp/chef-diff20130528-13007-ipe5ju 2013-05-28 11:20:11.932272825 +0100
        @@ -0,0 +1 @@
        +bogus

I altered my file resource as follows:

template '/home/tdi/.tdi' do
  action :create
  source 'bogus'
end

When I ran chef-apply, this time I saw:

chef-apply tdi.rb
Recipe: (chef-apply cookbook)::(chef-apply recipe)
  * user[tdi] action create (up to date)
  * template[/home/tdi/.tdi] action create
================================================================================
Error executing action `create` on resource 'template[/home/tdi/.tdi]'
================================================================================

NoMethodError
-------------
undefined method `preferred_filename_on_disk_location' for nil:NilClass

Resource Declaration:
---------------------
# In tdi.rb

6: template '/home/tdi/.tdi' do
7:   action :create
8:   source 'bogus'
9: end

Compiled Resource:
------------------
# Declared in tdi.rb:6:in `run_chef_recipe'

template("/home/tdi/.tdi") do
  provider Chef::Provider::Template
  action [:create]
  retries 0
  retry_delay 2
  path "/home/tdi/.tdi"
  backup 5
  source "bogus"
  cookbook_name "(chef-apply cookbook)"
  recipe_name "(chef-apply recipe)"
end

[2013-05-28T11:24:48+01:00] FATAL: Stacktrace dumped to /var/chef/cache/chef-stacktrace.out
[2013-05-28T11:24:48+01:00] FATAL: NoMethodError: template[/home/tdi/.tdi] ((chef-apply cookbook)::(chef-apply recipe) line 6) had an error: NoMethodError: undefined method `preferred_filename_on_disk_location' for nil:NilClass

Discussion

To use Chef to manage infrastructure is to insert a very powerful and flexible abstraction layer between the engineer and the system. Instead of the developer logging onto three different types of machines and typing commands into a terminal, or navigating a sequence of menus, he types in a text editor, commits to a version control system, and effectively deploys what was written to a series of machines. We are practicing the discipline of infrastructure as code.

In practical terms, the way we do this is by thinking about the abstract system components that we need to configure our systems as we want. For example, if I want to ensure the clock on my Linux computer is regularly synchronized with an NTP server, I might need to install the package that provides NTP client functionality, alter the configuration file according to my requirements, and ensure the NTP daemon is running, or that the client is run as a scheduled task. In Chef we call these low-level components that we can reason about and discuss “resources.”

Resources are the very essence of Chef—the atoms, if you like. When we talk about a complicated or even a simple infrastructure, that conversation takes place at a level of resources. For example, we might discuss a web server—what are the components of a web server? Well, we need to install Apache, we need to specify its configuration and perhaps some virtual hosts, and we need to ensure the Apache service is running. Immediately, we’ve identified some resources—a package, a file, and a service.

Managing infrastructure using Chef is a case of specifying what resources are needed and how they interact with one another. We call this setting the policy.

If resources are the fundamental configuration objects, nodes are the fundamental things that are configured. It’s possible to get a bit confused when the word “node” is used. For most engineers, a “node” is synonymous with a physical (or virtual) machine on a network. To an extent, this meaning carries over into Chef, and I used it that way just now: nodes are the things we’re configuring. However, most of the time, in Chef, the term “node” refers to the Chef node, which is ultimately a Ruby object representing the machine we’re configuring. This object behaves like a Hash: it has keys and values, getter and setter methods, and can be viewed, queried, and interacted with as JSON. With that caveat, a concise definition of what Chef does is this:

Chef manages resources on the node so they comply with policy.
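The Hash-like behavior of the node object can be sketched in a few lines of plain Ruby. The ToyNode class below is my own stand-in, not Chef’s node class, and the attribute names and values are made up; it only demonstrates the three properties mentioned above: keyed reads, keyed writes, and a JSON view.

```ruby
require "json"

# A stand-in for the idea of a node object; this is not Chef's own class.
class ToyNode
  def initialize(attributes = {})
    @attributes = attributes
  end

  # Getter: read an attribute by key.
  def [](key)
    @attributes[key.to_s]
  end

  # Setter: write an attribute by key.
  def []=(key, value)
    @attributes[key.to_s] = value
  end

  # The same data, viewed as JSON.
  def to_json(*args)
    @attributes.to_json(*args)
  end
end

node = ToyNode.new("platform" => "centos", "platform_version" => "6.4")
node["fqdn"] = "centos.example.com" # hypothetical hostname

puts node["platform"]
puts node.to_json
```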

It’s important to understand that when we talk about resources in Chef, we’re not talking about the actual resource that ends up on the box. Resources in Chef are an abstraction layer. If we were to write Chef code to install the korn shell package on a CentOS box, that would mean:

$ yum install ksh

This would be represented in Chef by:

package "ksh"

A resource in Chef can take action. Here again, note the difference: the user resource in Chef can create a user on a machine. It isn’t the user on the machine. Resources take action through providers. A provider is some library code that understands two things: first, how to determine the state of a resource; and second, how to translate the abstract requirement (install Apache) into the concrete action (run yum install httpd). Additionally, it understands that, depending upon the underlying operating system or distribution, the utilities or commands used to install a package will be different. For example, on a Debian system, the provider would use dpkg or apt rather than yum or rpm.

Determining the state of the resource is important in configuration management; we only want to take action if it is necessary. If the user has already been created or the package has already been installed, we don’t need to take action. This is the principle of idempotence. (See http://bit.ly/15M3qwJ for more on idempotency and its meaning in this context.) A provider knows how to check whether the user has already been created, and won’t take action if it has.

The mathematicians amongst you may complain about this appropriation of the term. Within the configuration management world, we understand that idempotence literally means that an operation will produce the same results if executed once or multiple times (i.e., repeated application of the same operation has no further effect). We take this principle, specifically the idea that all functions should be idempotent with the same data, and use this as a metaphor. Not taking action unless it’s necessary is an implementation detail designed to ensure idempotence.
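The check-then-act behavior of a provider can be sketched in plain Ruby. This is a simplified model, not Chef’s provider code: FakeSystem and ToyPackageProvider are names I have invented, the command table is an assumption, and the “system” only records what would have been run so that the example has no side effects.

```ruby
# Sketch of the check-then-act pattern; this is not Chef's provider code.
# FakeSystem stands in for a real machine so the example has no side effects.
class FakeSystem
  attr_reader :platform, :installed, :commands_run

  def initialize(platform)
    @platform = platform
    @installed = []
    @commands_run = []
  end

  # Record the command instead of executing it.
  def run(command, package)
    @commands_run << "#{command} #{package}"
    @installed << package
  end
end

class ToyPackageProvider
  # Which command installs a package depends on the platform.
  INSTALL_COMMANDS = {
    "centos" => "yum install -y",
    "ubuntu" => "apt-get install -y"
  }.freeze

  def initialize(system)
    @system = system
  end

  # Step 1: determine the current state. Step 2: act only if needed.
  def install(package)
    return :up_to_date if @system.installed.include?(package)

    @system.run(INSTALL_COMMANDS.fetch(@system.platform), package)
    :installed
  end
end

box = FakeSystem.new("centos")
provider = ToyPackageProvider.new(box)

puts provider.install("ksh")   # first run takes action
puts provider.install("ksh")   # second run finds nothing to do
puts box.commands_run.inspect  # only one command was ever issued
```

Running the same declaration twice leaves the system in the same state as running it once, which is the property the “(up to date)” lines in the chef-apply output reflect.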

Resources have data. If we were to write code to create a user, in addition to a default action, which all resources have (in the case of a package it’s to install the package; in the case of a user, it’s to create the user), we’d also probably want to specify some additional configuration for a user. For example, we might want to set a shell, or a comment:

user "melisande" do

comment "International Master Criminal"

shell "/bin/ksh"

home "/export/home/melisande"

supports :manage_home => true

action :create

end

Resources, then, have a name (in this case, melisande), a type (in this case, a user), data in the form of parameter attributes (in this case, comment, shell, home directory, and supports), and an action (in this case, we’re going to create the user).
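One way to picture a resource as "a type, a name, some data, and an action" is to capture a declaration as a plain Ruby object. The sketch below is an illustration only: ToyResource and the top-level user method are hypothetical stand-ins, the default action of :create is an assumption made for this example, and Chef's real resource classes are considerably richer.

```ruby
# A hypothetical stand-in for a Chef resource: it only records what was declared.
class ToyResource
  attr_reader :type, :name, :params

  def initialize(type, name, &block)
    @type = type
    @name = name
    @params = {}
    @action = :create # assumed default action for this sketch
    instance_eval(&block) if block
  end

  # With an argument this sets the action; without one it reads it.
  def action(value = nil)
    value.nil? ? @action : (@action = value)
  end

  # Any other word used with an argument inside the block is recorded
  # as a parameter attribute.
  def method_missing(attribute, *args)
    return super if args.empty?

    @params[attribute] = args.length == 1 ? args.first : args
  end
end

# Stand-in for the user keyword; in Chef this comes from the recipe DSL.
def user(name, &block)
  ToyResource.new(:user, name, &block)
end

melisande = user "melisande" do
  comment "International Master Criminal"
  shell "/bin/ksh"
  home "/export/home/melisande"
  supports :manage_home => true
  action :create
end

puts "type:   #{melisande.type}"
puts "name:   #{melisande.name}"
puts "params: #{melisande.params.keys.inspect}"
puts "action: #{melisande.action}"
```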

In our exercise we used the user resource to create a user called “tdi”. I asked you to review the documentation on the user resource on the docs site. Again, there is far more information there than you need now, but as you go on to build more complex infrastructure, you will refer to the resource documentation time and again. The most confusing aspect of the documentation (at the time of writing) is the idea of “supported features.” The resource has the attribute supports, with key/value pairs representing whether a given feature is supported by the underlying provider (for example, useradd on Solaris versus Linux). One such feature is manage_home. This flag is used to make explicit whether a home directory will be created at the same time as the user is created. The supports syntax is a bit cumbersome, so there’s a handy convenience method, manage_home, that can be set to true or false. It has the same effect, but looks a bit cleaner. I’ll draw your attention to one particular wart that could catch you if you’re a RHEL/CentOS user. The default behavior of the user resource is not to create the home directory. This is pretty much standard across Linux and Unix. However, an implementation detail of RHEL-family systems is that useradd does create a /home/user directory by default. The result is that you could get away with never declaring home or manage_home in your user resources on RHEL systems, but then get tripped up if you expected your code to work on other Linux systems. For this reason, I recommend explicitly specifying the home directory and setting manage_home to true in your user resource declarations.

You’ll notice that we called the file we wrote tdi.rb. It’s actually Ruby code (and if this is not familiar to you, don’t worry: Chapter 2 covers all the Ruby you need to know). We can prove this by adding some Ruby into the file, and running it again:

$ cat tdi.rb

10.times { puts "This is actually just Ruby" }

user 'tdi' do

action :create

comment "Test Driven Infrastructure"

home "/home/tdi"

supports :manage_home => true

end

template '/home/tdi/.tdi' do

action :create

source 'bogus'

end

# chef-apply tdi.rb

This is actually just Ruby

This is actually just Ruby

This is actually just Ruby

This is actually just Ruby

This is actually just Ruby

This is actually just Ruby

This is actually just Ruby

This is actually just Ruby

This is actually just Ruby

This is actually just Ruby

Recipe: (chef-apply cookbook)::(chef-apply recipe)

* user[tdi] action create (up to date)

* template[/home/tdi/.tdi] action create

================================================================================

Error executing action `create` on resource 'template[/home/tdi/.tdi]'

================================================================================

...

...

In Chef terms, the file that we wrote is called a recipe. It’s a set of instructions, a set of resources that we need to configure the machine in the way we want it. When we say that an infrastructure developer is writing Chef code, we are typically talking about using Chef’s “recipe” DSL. Let’s quickly explore the idea of a DSL.

A DSL, or domain-specific language, in practice means a way of encapsulating shared knowledge relating to a specific task or series of tasks, in a small, clearly defined set of words, with a small and clearly defined set of rules.

The example I like to give when I’m training people to use Chef is the game of blackjack. I used to take a ferry from the south of England to the north of France or Spain, every so often. Especially on the longer journeys, I used to sit in the ship’s casino and play cards. A popular game was blackjack. The passengers were frequently French, Spanish, English, Dutch, or German. However, everyone was able to play blackjack because there was an established DSL in place. Everyone knew that “card” means “give me a card.” Everyone knew that “stick” means “I don’t want another card.” Everyone knew that “split” means “separate my two cards into two piles of one, and deal one card to each pile.” There were rules around the usage of the terms: you can’t use the language when it’s someone else’s turn, and you can’t split if the cards aren’t of the same value. The same is true of all DSLs: they have a few meaningful keywords, and a few grammatical and syntactical rules.

Whenever we speak about a DSL, it naturally follows that we explain the purpose of the DSL. Thus if we were to say, “Gherkin is a DSL,” that doesn’t really tell us much. If, however, we were to say, “Gherkin is a DSL for translating stakeholder requirements to executable Ruby acceptance tests,” it makes much more sense. Similarly, as the old joke goes, Java is a DSL for producing stack traces.[2] It turns out that Chef has a DSL for several things: recipes, roles, environments, and the creation of custom resources and providers. We’ll cover most of the Chef DSLs in this book, but at a high level Chef provides DSLs for programmatically declaring which resources should be configured on a machine, for grouping related resources together and applying them to machines of the same sort, for isolating systems of a certain class and ensuring they remain in a defined state, and for several other powerful concepts. This allows us to bring services into being using code.

You’ll notice that when we tried to use a bogus resource in our recipe, Chef complained that it couldn’t find a resource of the type we declared:

[2013-05-28T11:11:59+01:00] FATAL: NameError: Cannot find a resource for dotfile on centos version 6.4

Why then did we have a problem when we tried to use a template resource? Here we hit upon the limitations of chef-apply. chef-apply is really only useful for a quick job, or (as we’ve seen) for instructional purposes. It doesn’t have any context outside the single Ruby file it is passed. Templates, by their very definition, have a source template that is populated with data. We don’t have any way of providing a source template to Chef when using chef-apply, and so we get an error. In our next exercise, we’ll graduate to using chef-solo, and explore some more resource types.

Exercise 3: Install an IRC Client

Objectives

After completing this exercise, you will:

§ Be familiar with the package, directory, and cookbook_file resources

§ Understand chef-solo, and how it is configured

§ Understand the ideas of a recipe, a cookbook, and a run list

Directions

1. Ensure you don’t still have the “This is actually just Ruby” code in your recipe.

2. Run chef-solo without any configuration options, and read the output.

3. Look at the knife help output for the cookbook subcommand, paying particular attention to cookbook path, and then create a cookbook called irc.

4. Verify that a skeleton cookbook has been created.

5. Read the package resource documentation at http://docs.opscode.com/resource_package.html

6. Read the cookbook_file resource documentation at http://docs.opscode.com/resource_cookbook_file.html

7. Read the directory resource documentation at http://docs.opscode.com/resource_directory.html

8. Open the default.rb recipe in your text editor, and copy the user resource into the file.

9. Add a resource to install the irssi package.

10. Add a resource to create a .irssi directory in the “tdi” user’s home directory, owned by the “tdi” user.

11. Add a resource to drop off an irssi config file at ~/.irssi/config, also owned by the “tdi” user. Use the irssi config at https://gist.github.com/Atalanta/5676662.

12. Create a solo.rb config file, and specify your cookbook path.

13. Search the docs site for “run list” to understand the high level concept.

14. Run chef-solo, telling it to converge the node with the default recipe from the irc cookbook.

15. Become the “tdi” user, and launch your IRC client, by typing irssi at the command prompt, and say “ohai!” in the ##tdi chat room!

Worked Example

I ran chef-solo on one of the machines, and read the output, noting that it was unable to find a configuration file, but would take its configuration from the command line, and that it failed to compile any cookbooks, having looked in two locations. It suggested I make sure my cookbook_path was set correctly:

$ sudo chef-solo

[sudo] password for stephen:

[2013-05-28T18:05:01+01:00] WARN: *****************************************

[2013-05-28T18:05:01+01:00] WARN: Did not find config file: /etc/chef/solo.rb, using command line options.

[2013-05-28T18:05:01+01:00] WARN: *****************************************

Starting Chef Client, version 11.4.4

Compiling Cookbooks...

[2013-05-28T18:05:03+01:00] FATAL: No cookbook found in ["/var/chef/cookbooks", "/var/chef/site-cookbooks"], make sure cookbook_path is set correctly.

[2013-05-28T18:05:03+01:00] FATAL: No cookbook found in ["/var/chef/cookbooks", "/var/chef/site-cookbooks"], make sure cookbook_path is set correctly.

[2013-05-28T18:05:03+01:00] ERROR: Running exception handlers

[2013-05-28T18:05:03+01:00] ERROR: Exception handlers complete

Chef Client failed. 0 resources updated

[2013-05-28T18:05:03+01:00] FATAL: Stacktrace dumped to /var/chef/cache/chef-stacktrace.out

[2013-05-28T18:05:03+01:00] FATAL: Chef::Exceptions::CookbookNotFound: No cookbook found in ["/var/chef/cookbooks", "/var/chef/site-cookbooks"], make sure cookbook_path is set correctly.

I looked at knife help cookbook and was given a choice of two pages to read:

$ knife help cookbook

WARNING: No knife configuration file found

Multiple help topics match your query. Pick one:

1. knife-cookbook-site

2. knife-cookbook

I selected the second, and read the documentation, discovering that I could set the cookbook path with the -o, --cookbook-path switch. I then created an irc cookbook as follows:

$ knife cookbook create irc -o .

WARNING: No knife configuration file found

** Creating cookbook irc

** Creating README for cookbook: irc

** Creating CHANGELOG for cookbook: irc

** Creating metadata for cookbook: irc

I verified the skeleton with the following:

$ ls -1F irc/

attributes/

CHANGELOG.md

definitions/

files/

libraries/

metadata.rb

providers/

README.md

recipes/

resources/

templates/

I read the documentation page for the package resource on the docs site, and concluded that I didn’t need to specify any particular attributes, and that Chef would work out the right thing to do on my platform.

I edited the recipe at irc/recipes/default.rb and added the user resource (with the neater manage_home syntax) and the following to ensure the user was created, and to install the irssi package:

user 'tdi' do

action :create

comment "Test Driven Infrastructure"

home "/home/tdi"

manage_home true

end

package 'irssi' do

action :install

end

I read the documentation for the directory resource, and added the following:

directory '/home/tdi/.irssi' do

owner 'tdi'

group 'tdi'

end

I read the documentation for the cookbook_file resource, and added the following resource:

cookbook_file '/home/tdi/.irssi/config' do

source 'irssi-config'

owner 'tdi'

group 'tdi'

end

I then created a file at files/default/irssi-config with the following content:

servers = (

{

address = "irc.freenode.net";

chatnet = "Freenode";

port = "6667";

autoconnect = "Yes";

}

);

chatnets = { Freenode = { type = "IRC"; }; };

settings = {

core = {

real_name = "Sir Edward Elgar";

nick = "elgar";

user_name = "elgar";

};

"fe-text" = { actlistort = "refnum"; };

};

channels = (

{ name = "#learnchef"; chatnet = "Freenode"; autojoin = "Yes"; },

{ name = "#chef"; chatnet = "Freenode"; autojoin = "Yes"; },

{ name = "##tdi"; chatnet = "Freenode"; autojoin = "Yes"; }

);

I searched the docs page for “run list” and read the first two hits (http://docs.opscode.com/essentials_node_object_run_lists.html and http://docs.opscode.com/essentials_cookbook_recipes_run_lists.html), which helped me to understand the idea of a run list. Having run chef-solo --help, I determined that I could pass the configuration file using the -c, --config option, and that I could specify a run list using -o, --override-runlist. Armed with this knowledge I created a solo.rb config file within a .chef directory, and then ran Chef (again as root on CentOS, and as sns with sudo on Ubuntu):

$ mkdir ~/.chef

$ cat ~/.chef/solo.rb

cookbook_path ENV['HOME']

$ sudo chef-solo --config ~/.chef/solo.rb --override-runlist 'recipe[irc]'

Starting Chef Client, version 11.4.4

[2013-05-30T10:46:23+01:00] WARN: Run List override has been provided.

[2013-05-30T10:46:23+01:00] WARN: Original Run List: []

[2013-05-30T10:46:23+01:00] WARN: Overridden Run List: [recipe[irc]]

Compiling Cookbooks...

Converging 4 resources

Recipe: irc::default

* user[tdi] action create (up to date)

* package[irssi] action install

- install version 0.8.15-5.el6 of package irssi

* directory[/home/tdi/.irssi] action create

- create new directory /home/tdi/.irssi

- change owner from '' to 'tdi'

- change group from '' to 'tdi'

* cookbook_file[/home/tdi/.irssi/config] action create

- create a new cookbook_file /home/tdi/.irssi/config

--- /tmp/chef-tempfile20130530-15376-1m5nhvp 2013-05-30 10:46:28.698288821 +0100

+++ /root/irc/files/default/irssi-config 2013-05-30 10:40:50.313288775 +0100

@@ -0,0 +1,25 @@

+servers = (

+ {

+ address = "irc.freenode.net";

+ chatnet = "Freenode";

+ port = "6667";

+ autoconnect = "Yes";

+ }

+);

+

+chatnets = { Freenode = { type = "IRC"; }; };

+

+settings = {

+ core = {

+ real_name = "Sir Edward Elgar";

+ nick = "elgar";

+ user_name = "elgar";

+ };

+ "fe-text" = { actlistort = "refnum"; };

+};

+

+channels = (

+ { name = "#learnchef"; chatnet = "Freenode"; autojoin = "Yes"; },

+ { name = "#chef"; chatnet = "Freenode"; autojoin = "Yes"; },

+ { name = "##tdi"; chatnet = "Freenode"; autojoin = "Yes"; }

+);

Chef Client finished, 3 resources updated

I su’d to tdi, ran irssi, found the ##tdi room, and said “Ohai”.

Discussion

When we ran chef-solo we learned three important things:

# chef-solo

[2013-03-12T16:41:53+00:00] WARN: *****************************************

[2013-03-12T16:41:53+00:00] WARN: Did not find config file: /etc/chef/solo.rb, using command line options.

[2013-03-12T16:41:53+00:00] WARN: *****************************************

Starting Chef Client, version 11.4.0

Compiling Cookbooks...

[2013-03-12T16:41:54+00:00] FATAL: No cookbook found in ["/var/chef/cookbooks", "/var/chef/site-cookbooks"], make sure cookbook_path is set correctly.

[2013-03-12T16:41:54+00:00] FATAL: No cookbook found in ["/var/chef/cookbooks", "/var/chef/site-cookbooks"], make sure cookbook_path is set correctly.

[2013-03-12T16:41:54+00:00] ERROR: Running exception handlers

[2013-03-12T16:41:54+00:00] ERROR: Exception handlers complete

Chef Client failed. 0 resources updated

[2013-03-12T16:41:54+00:00] FATAL: Stacktrace dumped to /var/chef/cache/chef-stacktrace.out

[2013-03-12T16:41:54+00:00] FATAL: Chef::Exceptions::CookbookNotFound: No cookbook found in ["/var/chef/cookbooks", "/var/chef/site-cookbooks"], make sure cookbook_path is set correctly.

1. Chef expects a configuration file, but will accept options on the command line, in lieu of a configuration file.

2. It expects the configuration file to reside in /etc/chef.

3. It looked for cookbooks in the /var/chef/cookbooks directory and the /var/chef/site-cookbooks directory, but failed to find any.

What are these cookbooks of which Chef speaks? My Chambers English Dictionary (highly recommended for budding cruciverbalists) defines a cookbook as:

A book of recipes for cooking dishes.

Well obviously we’re not cooking dishes, but the rest of the metaphor makes sense—cookbooks contain recipes. So what’s a recipe? Turning again to my trusty dictionary, I’m told that one definition of a recipe is:

A method laid down for achieving a desired end.

This is perfect! That’s exactly what a recipe is. It’s a method of achieving a desired outcome: the desired state of our infrastructure. That method might be fairly complex because, realistically speaking, our infrastructures are much more complicated than can be expressed in a single resource, or even a collection of independent resources. As infrastructure developers, the bulk of the code we write will be in the form of these recipes.

Recipes in Chef are written in a domain-specific language (DSL), which allows us to declare the state in which a node should be. Remember, a domain-specific language is a computer language designed to address a very specific problem space. It has grammar and syntax in the same way as any other language but is generally much simpler than a general purpose programming language. Ruby is a programming language particularly suited to the creation of DSLs. It’s very powerful, flexible, and expressive. As we already mentioned, DSLs are used in a number of places throughout the framework. However, a particularly important thing to understand about Chef is that not only do we have DSLs to address particular problem spaces, we also always have direct access to the entire Ruby programming language. This means that if at any stage you need to extend the DSL—or perform some calculation, transformation, or other task—you are never restricted by the DSL. This is one of the great advantages of Chef.
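As a small illustration of mixing ordinary Ruby with the recipe DSL, the sketch below uses an array and a loop to declare two packages. So that it runs as plain Ruby outside Chef, it defines a stand-in package method that merely records each declaration; that stand-in and the DECLARED constant are assumptions for this example, and in a real recipe the package keyword is supplied by Chef.

```ruby
# Stand-in for Chef's package resource so this sketch runs as plain Ruby.
# In a real recipe this method comes from Chef and should not be defined.
DECLARED = []

def package(name)
  # The block describing the resource is ignored here; we only note the name.
  DECLARED << "package[#{name}]"
end

# Plain Ruby (an array and a loop) sitting alongside the recipe DSL.
%w{irssi git}.each do |pkg|
  package pkg do
    action :install
  end
end

puts DECLARED.inspect
```

The loop is ordinary Ruby; only the package declaration inside it belongs to the DSL.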

In Chef, order is highly significant. Recipes are processed in the exact order in which they are written, every time. Recipes are processed in two stages—a compile stage and an execute stage. The compile stage consists of gathering all the resources that, when configured, will result in conformity with policy, and placing them in a kind of list called the resource collection. At the second stage, Chef processes this list in order, taking actions as specified. As you become more advanced in Chef recipe development, you will learn that there are ways to subvert this process, and when it is appropriate to do so. However, for the purposes of this book, it is sufficient to understand that recipes are processed in order, and actions taken.
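The two stages can be modeled in a few lines of plain Ruby. This is a toy, not Chef's implementation: ToyRun and its methods are hypothetical names, and "converging" here just writes a log line instead of changing a machine.

```ruby
# Toy model of a two-stage run: compile builds a list, converge walks it.
class ToyRun
  attr_reader :resource_collection, :log

  def initialize
    @resource_collection = []
    @log = []
  end

  # Compile stage: evaluating the recipe only records resources, in order.
  def compile(&recipe)
    instance_eval(&recipe)
    @log << "compiled #{@resource_collection.size} resources"
  end

  # Execute stage: walk the collection in the order it was built.
  def converge
    @resource_collection.each do |type, name|
      @log << "converging #{type}[#{name}]"
    end
  end

  # Stand-ins for three resource types used in this chapter.
  %i[user package directory].each do |type|
    define_method(type) { |name| @resource_collection << [type, name] }
  end
end

run = ToyRun.new
run.compile do
  user "tdi"
  package "irssi"
  directory "/home/tdi/.irssi"
end
run.converge
puts run.log
```

Nothing is "converged" until the whole recipe has been read, and the order of the log matches the order of the declarations.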

Recipes by themselves are frequently not much use. Many resources require additional data as part of their action—for example, the template resource will require, in addition to the resource block in the recipe, an Erubis template file. As you advance in your understanding and expertise, you may find that you need to extend Chef and provide your own custom resources and providers. For example, you might decide you want to write a resource to provide configuration file snippets for a certain service. Chef provides another DSL for specifically this purpose.

If recipes require supporting files and code, we need a way to package this up into a usable component. This is the purpose of a cookbook. Cookbooks can be thought of as package management for Chef recipes and code. They may contain a large number of different recipes and other components. Cookbooks have metadata associated with them, including version numbers, dependencies, license information, and attributes.

Cookbooks can be published and shared. This is another of Chef’s great strengths. Via the Opscode Chef community website, you can browse and download over 800 different cookbooks. The cookbooks are generally of very high quality, and a significant proportion of them are written by Opscode developers. Cookbooks can be rated and categorized on the community site, and users can elect to “follow” cookbooks to receive updates when new versions become available.

Knife provides a subcommand that will create a skeleton cookbook, ready to be used for modeling infrastructure. By default, Knife will attempt to create and populate a directory at /var/chef/cookbooks; this is the cookbook path, the place Knife looks for cookbooks:

$ knife cookbook create silly

WARNING: No knife configuration file found

** Creating cookbook silly

ERROR: Errno::EACCES: Permission denied - /var/chef/cookbooks

$ ls -ld /var/chef/*

drwxr-xr-x 2 root root 4096 May 28 18:05 /var/chef/cache

$ sudo knife cookbook create silly

[sudo] password for stephen:

WARNING: No knife configuration file found

** Creating cookbook silly

** Creating README for cookbook: silly

** Creating CHANGELOG for cookbook: silly

** Creating metadata for cookbook: silly

$ ls -ld /var/chef/*

drwxr-xr-x 2 root root 4096 May 28 18:05 /var/chef/cache

drwxr-xr-x 3 root root 4096 May 30 08:59 /var/chef/cookbooks

$ ls -ld /var/chef/cookbooks/*/*

drwxr-xr-x 2 root root 4096 May 30 08:59 /var/chef/cookbooks/silly/attributes

-rw-r--r-- 1 root root 409 May 30 08:59 /var/chef/cookbooks/silly/CHANGELOG.md

drwxr-xr-x 2 root root 4096 May 30 08:59 /var/chef/cookbooks/silly/definitions

drwxr-xr-x 3 root root 4096 May 30 08:59 /var/chef/cookbooks/silly/files

drwxr-xr-x 2 root root 4096 May 30 08:59 /var/chef/cookbooks/silly/libraries

-rw-r--r-- 1 root root 274 May 30 08:59 /var/chef/cookbooks/silly/metadata.rb

drwxr-xr-x 2 root root 4096 May 30 08:59 /var/chef/cookbooks/silly/providers

-rw-r--r-- 1 root root 1439 May 30 08:59 /var/chef/cookbooks/silly/README.md

drwxr-xr-x 2 root root 4096 May 30 08:59 /var/chef/cookbooks/silly/recipes

drwxr-xr-x 2 root root 4096 May 30 08:59 /var/chef/cookbooks/silly/resources

drwxr-xr-x 3 root root 4096 May 30 08:59 /var/chef/cookbooks/silly/templates

This configuration can be set in Knife’s own configuration file, which we’ll come to later. However, it can also be set on the command line with the -o, --cookbook-path option.

The cookbook generator will create a default recipe in the recipes directory of the cookbook. It was this file we opened in our text editor, to declare the package resource to install the IRC client. You’ll notice that at the top of the file, some boilerplate was generated:

#

# Cookbook Name:: irc

# Recipe:: default

#

# Copyright 2013, YOUR_COMPANY_NAME

#

# All rights reserved - Do Not Redistribute

#

The contents of this can be modified by making further changes in your Knife configuration file, which we’ll come to in the next exercise.

A WORD ABOUT TEXT EDITORS

The art of modeling infrastructure as code is a discipline that fits firmly within the software development world. We’re writing software that generates configuration dynamically on machines, in order to allow us to deploy and run applications that deliver business value. Software developers use full-featured text editors that remain open on the desktop at all times. They support syntax highlighting, may have the concept of a project drawer, may provide powerful search features, may be programmable, allow for multiple files to be edited at once and viewed side-by-side, and offer integration with source code management systems. Professional software developers use professional tools.

As an infrastructure developer, you’re now a professional software developer, and you should use the same quality of tools. If you already use an editor that provides these kinds of features, then this exhortation is not for you. However, if you use vanilla vi, nano, or Notepad: please stop. Different editors have their own fierce advocates. Personally, I’m a huge fan of Emacs. However, TextMate, Sublime Text, Vim, Emacs, or maybe even Eclipse would make a fine choice. If you’ve never used such a tool, I’d suggest starting with Sublime Text 2: it’s an excellent, modern editor with plug-in support for Chef development, and it works on Linux, OS X, and Windows. If you’re prepared to put in a few days on a somewhat steeper learning curve, I would wholeheartedly recommend Emacs. Whatever you do, pick an editor, make it part of your professional development to learn its features, and master it thoroughly.

So now that we have a recipe and a cookbook, how can we apply these to the machine we want to configure? We already know we can use chef-apply, but now that our recipe depends on a config file, we need something a bit more powerful. We placed the config file we wanted to drop off into the cookbook. Now we need to tell chef-solo where to find the cookbook. chef-solo takes a number of command-line options, but not one that tells it where to find the cookbooks. This gives us two options: either we put the cookbooks where chef-solo expects to find them, or we create a configuration file that tells chef-solo to find them where we want them to be. We already know that Chef looks for cookbooks in /var/chef/cookbooks, so that’s an option, but for my local machine, I prefer to keep them in my home directory and tell Chef how to find them. Hence, we set the cookbook path in the solo.rb file:

cookbook_path ENV['HOME']

This introduces a common pattern in Chef: configuration files are Ruby files so we can use whatever Ruby constructs we need.

Now we can tell Chef where to find its configuration file and consequently the cookbooks, but Chef doesn’t know which recipe to run. When we used chef-apply it was simple: we just told Chef exactly which recipe to run. Obviously this doesn’t scale beyond exceptionally simple cases. Chef, therefore, has the concept of a run list: a list of recipes to run on the node. The simplest way to provide one is to pass it as a command-line option to chef-solo. Recipes on the run list take the following form: recipe[cookbook_name::recipe_name]. The convention is that if the “default” recipe is run, there’s no need to specify it, and so the run list item will be recipe[cookbook_name].
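The naming convention can be expressed as a short function. This is a sketch of the convention as described here, not Chef's own parser: expand_run_list_item and RUN_LIST_ITEM are hypothetical names, and the pattern handles only recipe items.

```ruby
# Matches recipe[cookbook] or recipe[cookbook::recipe].
RUN_LIST_ITEM = /\Arecipe\[([^\[\]:]+)(?:::([^\[\]:]+))?\]\z/

# Expand a run list item to its explicit cookbook::recipe form.
def expand_run_list_item(item)
  match = RUN_LIST_ITEM.match(item)
  raise ArgumentError, "not a recipe run list item: #{item}" unless match

  cookbook = match[1]
  recipe = match[2] || "default" # no recipe named means the default recipe
  "#{cookbook}::#{recipe}"
end

expanded = ["recipe[irc]", "recipe[git::server]"].map { |item| expand_run_list_item(item) }
puts expanded.inspect
```

So recipe[irc] and recipe[irc::default] refer to the same recipe.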

When Chef runs, the resources in the recipes on the run list are evaluated and action is taken to bring the system into desired state.

Exercise 4: Install Git

Objectives

Upon completing this exercise, you should have:

§ Used community cookbooks to build infrastructure

§ Understood how Chef differentiates between platforms, taking appropriate action

§ A Git repository containing Chef code and supporting files, and be able to interact with it

§ Understood the basics of Chef node attributes

§ Understood the concept of dependencies in cookbooks

§ Understood the mechanism for including recipes from other cookbooks inside another recipe

Directions

1. Read the documentation for knife cookbook site.

2. Download the git cookbook from the Opscode community site, and extract it within your cookbook path.

3. Examine the metadata.rb file for the Git cookbook, and download the cookbooks upon which the Git cookbook depends.

4. Recurse through each downloaded cookbook, downloading each cookbook dependency.

5. Ensure all the cookbooks are on the cookbook path.

6. Search the documentation site for dna.json, and create a dna.json file containing a run list containing the default recipe from both the irc and the git cookbooks.

7. Run chef-solo with the appropriate arguments.

8. As the tdi user, find a convenient location in your filesystem, and clone the https://github.com/opscode/chef-repo.git repository.

9. Configure Git with your name and email address.

10. Sign up for a GitHub account (if you don’t have one already).

11. Create an ssh key pair, and upload the public portion to GitHub.

12. Create a repository called chef-repo, and set the remote origin of the cloned repository to this new repository.

13. Copy your cookbooks into the cookbooks directory of the chef-repo, add them, and push to GitHub.

Worked Example

I ran knife help cookbook site, and read the manual page. I noted an install option, which seemed to do some magic with Git. Being skeptical of magic, I read on, and found the section on downloading a cookbook. Having digested this, I ran the following:

$ cd

$ knife cookbook site download git

WARNING: No knife configuration file found

Downloading git from the cookbooks site at version 2.5.2 to /home/stephen/git-2.5.2.tar.gz

Cookbook saved: /home/stephen/git-2.5.2.tar.gz

I decided that if I were going to have multiple cookbooks, I might as well have a cookbooks directory, so I made one, and updated my solo.rb:

$ mkdir ~/cookbooks

$ cat ~/.chef/solo.rb

cookbook_path "#{ENV['HOME']}/cookbooks"

I moved my irc cookbook into the cookbooks directory, and then extracted the git cookbook:

$ mv ~/irc ~/cookbooks

$ tar xzvf git-2.5.2.tar.gz -C cookbooks/

git/

git/.gitignore

git/.kitchen.yml

git/attributes/

git/Berksfile

git/CHANGELOG.md

git/CONTRIBUTING

git/Gemfile

git/LICENSE

git/metadata.json

git/metadata.rb

git/README.md

git/recipes/

git/templates/

git/TESTING.md

git/templates/default/

git/templates/default/git-xinetd.d.erb

git/templates/default/sv-git-daemon-log-run.erb

git/templates/default/sv-git-daemon-run.erb

git/recipes/default.rb

git/recipes/server.rb

git/recipes/source.rb

git/recipes/windows.rb

git/attributes/default.rb

I looked at the metadata.rb of the cookbook and discovered this:

%w{ dmg build-essential yum windows }.each do |cookbook|

depends cookbook

end

depends "runit", ">= 1.0"
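Because metadata.rb is Ruby, the loop above amounts to four separate depends calls, which a plain text search of the file will not reveal. The sketch below evaluates the same lines against a stand-in object to list the cookbooks actually declared. ToyMetadata is a hypothetical class, and the ">= 0.0.0" default constraint is an assumption made for this example rather than a statement about Chef's internals.

```ruby
# Collects calls to depends; a hypothetical stand-in for the metadata DSL.
class ToyMetadata
  attr_reader :dependencies

  def initialize
    @dependencies = {}
  end

  # Record a dependency and its (optional) version constraint.
  def depends(cookbook, constraint = ">= 0.0.0")
    @dependencies[cookbook] = constraint
  end
end

metadata = ToyMetadata.new
metadata.instance_eval do
  %w{ dmg build-essential yum windows }.each do |cookbook|
    depends cookbook
  end

  depends "runit", ">= 1.0"
end

puts metadata.dependencies.keys.inspect
```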

I downloaded these dependencies with the following:

$ for dep in dmg build-essential yum windows runit; do knife cookbook site download $dep; tar xzvf $dep*gz -C cookbooks; done

I then recursed into these cookbooks as follows:

$ cd cookbooks

$ grep depends */metadata.rb

git/metadata.rb: depends cookbook

git/metadata.rb:depends "runit", ">= 1.0"

runit/metadata.rb:depends "build-essential"

runit/metadata.rb:depends "yum"

windows/metadata.rb:depends "chef_handler"

And downloaded the missing dependency:

$ knife cookbook site download chef_handler && tar xzvf chef_handler*gz -C cookbooks

I verified that this in turn didn’t have a dependency, by checking its metadata.rb file.
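Chasing dependencies by hand is a walk over a small graph, and it can be sketched in Ruby. The DIRECT table below simply restates what the metadata files in this worked example declared; all_dependencies is a hypothetical helper written for illustration, not a Knife or Chef feature.

```ruby
# Direct dependencies as read from each metadata.rb in the worked example.
DIRECT = {
  "git"             => %w[dmg build-essential yum windows runit],
  "runit"           => %w[build-essential yum],
  "windows"         => %w[chef_handler],
  "dmg"             => [],
  "build-essential" => [],
  "yum"             => [],
  "chef_handler"    => []
}.freeze

# Walk the graph depth-first, recording each cookbook the first time it is seen.
def all_dependencies(cookbook, graph, seen = [])
  graph.fetch(cookbook, []).each do |dep|
    next if seen.include?(dep)

    seen << dep
    all_dependencies(dep, graph, seen)
  end
  seen
end

needed = all_dependencies("git", DIRECT)
puts needed.sort.inspect
```

The result includes chef_handler, which git never mentions directly; it arrives by way of the windows cookbook.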

Having read about dna.json (http://bit.ly/1fQWLQE), I created a dna.json file in my .chef directory, with the following content:

{

"run_list": ["recipe[irc]", "recipe[git]"]

}
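The file above is ordinary JSON, and the run list is an array, so the order of its entries is preserved. As a quick check outside Chef, the following sketch parses the same content with Ruby's standard JSON library; DNA_JSON and run_list_from are hypothetical names used only for this example, and chef-solo reads the file itself without any such helper.

```ruby
require "json"

# The same content as the dna.json file in the worked example.
DNA_JSON = <<~DNA
  {
    "run_list": ["recipe[irc]", "recipe[git]"]
  }
DNA

# Return the run list array from a JSON attributes document,
# or an empty array when no run list is present.
def run_list_from(json_text)
  JSON.parse(json_text).fetch("run_list", [])
end

puts run_list_from(DNA_JSON).inspect
```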

Upon running Chef, the Git package was successfully installed:

$ sudo chef-solo --config ~/.chef/solo.rb --json-attributes ~/.chef/dna.json

Starting Chef Client, version 11.4.4

Compiling Cookbooks...

Converging 5 resources

Recipe: irc::default

* user[tdi] action create (up to date)

* package[irssi] action install (up to date)

* directory[/home/tdi/.irssi] action create (up to date)

* cookbook_file[/home/tdi/.irssi/config] action create (up to date)

Recipe: git::default

* package[git] action install

- install version 1:1.8.1.2-1 of package git

Chef Client finished, 1 resources updated

I switched to the tdi user (with sudo in the case of Ubuntu), and felt that the root of the tdi home directory would be an admirable place to clone the Opscode Git repository.

$ sudo su - tdi

$ git clone git://github.com/opscode/chef-repo.git

Cloning into 'chef-repo'...

remote: Counting objects: 209, done.

remote: Compressing objects: 100% (128/128), done.

remote: Total 209 (delta 74), reused 170 (delta 47)

Receiving objects: 100% (209/209), 36.40 KiB, done.

Resolving deltas: 100% (74/74), done.

I set up Git to use my name and email address, as shown:

$ git config --global color.ui "auto"

$ git config --global user.email "stephen@atalanta-systems.com"

$ git config --global user.name "Stephen Nelson-Smith"

I already have a GitHub account, so I simply created a key pair:

$ ssh-keygen -t dsa -f tdi-example

Generating public/private dsa key pair.

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

Your identification has been saved in tdi-example.

Your public key has been saved in tdi-example.pub.

The key fingerprint is:

98:fb:2d:c6:ff:66:76:ac:b0:da:1d:37:1e:92:ae:64 stephen@Stephens-MacBook-Air.local

The key's randomart image is:

+--[ DSA 1024]----+

| |

| |

| |

| o |

| o S |

| . . |

| .. E +.+ |

| .+= =+=oo |

| .o+**=o. |

+-----------------+

To add my key, I simply logged into GitHub, and navigated to https://github.com/settings/ssh. There I clicked “Add SSH Key,” gave a title, pasted the public key—which was created in my working directory—and clicked “Add Key”.

I then clicked the “Create a new repo” button, just to the right of my username, and created a repo called tdi-example. I gave it a description, and clicked the button to create the repository.

I then changed the remote URL for the repository I cloned to match the one I created:

$ cd ~/chef-repo

$ git remote set-url origin git@github.com:atalanta-cookbooks/tdi-example

I changed back to a user with appropriate privileges (root, or sns with sudo), returned to the directory where I had originally created my cookbooks directory, rsync’d its contents into the chef-repo/cookbooks directory, and gave ownership of the repository back to the tdi user:

$ cd

$ sudo rsync -Pvar cookbooks/ /home/tdi/chef-repo/cookbooks/

$ sudo chown -R tdi: ~tdi/chef-repo

To push the repo, I changed back to the tdi user, cached my ssh key, and then ran git add, git commit, and git push:

$ whoami

tdi

$ ssh-agent bash

$ ssh-add tdi-example

Identity added: tdi-example (tdi-example)

$ cd chef-repo/

$ git add cookbooks

$ git commit -m "Adding TDI cookbooks"

$ git push -u origin master

The authenticity of host 'github.com (204.232.175.90)' can't be established.

RSA key fingerprint is 16:27:ac:a5:76:28:2d:36:63:1b:56:4d:eb:df:a6:48.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'github.com,204.232.175.90' (RSA) to the list of known hosts.

Counting objects: 209, done.

Compressing objects: 100% (101/101), done.

Writing objects: 100% (209/209), 36.40 KiB, done.

Total 209 (delta 74), reused 209 (delta 74)

To git@github.com:atalanta-cookbooks/tdi-example

* [new branch] master -> master

Branch master set up to track remote branch master from origin.

Discussion

Within the Chef world, pretty much everything is addressable via an API. This extends to the community cookbook site, which hosts hundreds of cookbooks (perhaps now even more than a thousand). Knife provides an interface to the site, allowing the searching, downloading, and sharing of cookbooks. For the purposes of this series of exercises, our concern is to learn the fundamentals of Chef, so we’re writing recipes ourselves. A fairly standard workflow, however, would be to query the cookbooks site for a keyword, and then inspect or use an open source cookbook. To pick a random example, suppose I’d been discussing setting up some form of LDAP service. With one command, I immediately have a set of candidate cookbooks written using a framework I understand, in version control, rated and used by other infrastructure developers. Even if I decide, having reviewed the candidates, to write (and maybe share) my own cookbook, I have the work of other people to inspire, guide, and inform me.

$ knife cookbook site search ldap

ca_openldap:

cookbook: http://cookbooks.opscode.com/api/v1/cookbooks/ca_openldap

cookbook_description: Configures a node to be an OpenLDAP server or client.

cookbook_maintainer: carguel

cookbook_name: ca_openldap

ldap:

cookbook: http://cookbooks.opscode.com/api/v1/cookbooks/ldap

cookbook_description: Installs/Configures ldap

cookbook_maintainer: someara

cookbook_name: ldap

ldapknife:

cookbook: http://cookbooks.opscode.com/api/v1/cookbooks/ldapknife

cookbook_description: Installs ldapknife.pl to /usr/local/bin

cookbook_maintainer: jackl0phty

cookbook_name: ldapknife

opendj:

cookbook: http://cookbooks.opscode.com/api/v1/cookbooks/opendj

cookbook_description: Installs OpenDJ LDAP server

cookbook_maintainer: elliotkendall

cookbook_name: opendj

opendj-openam:

cookbook: http://cookbooks.opscode.com/api/v1/cookbooks/opendj-openam

cookbook_description: Installs/Configures opendj

cookbook_maintainer: thomasalrin

cookbook_name: opendj-openam

openldap:

cookbook: http://cookbooks.opscode.com/api/v1/cookbooks/openldap

cookbook_description: Configures a server to be an OpenLDAP master, replication slave or client for auth

cookbook_maintainer: opscode

cookbook_name: openldap

sssd_ldap:

cookbook: http://cookbooks.opscode.com/api/v1/cookbooks/sssd_ldap

cookbook_description: Installs/Configures LDAP on RHEL using SSSD

cookbook_maintainer: tas50

cookbook_name: sssd_ldap

zone2ldif:

cookbook: http://cookbooks.opscode.com/api/v1/cookbooks/zone2ldif

cookbook_description: Installs/Configures zone2ldif

cookbook_maintainer: jackl0phty

cookbook_name: zone2ldif

Once a cookbook has been identified as worthy of further investigation, it can be downloaded. As I alluded to in the worked example, Chef does provide a somewhat magical install subcommand, which will install upstream community cookbooks to a local Git repository. The steps it takes are as follows:

1. Create a fresh pristine copy branch for tracking upstream.

2. Remove any existing cookbook versions from the branch.

3. Download the cookbook tarball.

4. Extract the tarball and commit the contents to Git.

5. Merge pristine copy into master.

The idea is that upstream changes can be maintained as a patch and merged with local changes when needed. This pattern was pretty common in CVS and is called “vendor branching” (see http://bit.ly/19AU9os). I tend not to recommend this approach. First, I don’t much like magic. Git is complex. Blindly allowing branching and merging to happen without clearly understanding what is going on is a recipe for future pain. I also don’t tend to recommend keeping all your cookbooks in one repository for anything beyond learning the basics, and the site install mechanism expects you to be already within a Git repository when running the command. We’ll discuss this in more depth later, but for now I’d caution against using knife cookbook site install. The simplest approach is to use the download subcommand, which simply pulls down a tarball of the specified cookbook and version. You can then extract and work with it in your own way. Later we’ll discuss more powerful ways of approaching this issue, focused particularly on treating upstream cookbooks as dependencies and artifacts, rather than a grab bag of modifiable source code. However, for now, use knife cookbook site download.

Cookbooks are effectively the packaging system for infrastructure code. If you’ve ever worked with a packaging system before—RPM, dpkg, SVR4, pkgsrc, Rubygems—you will be aware that there is a known set of problems that need to be solved. These problems include how to express dependencies upon other cookbooks, how to handle versioning, how to handle potential conflicts, license information, discoverability, and so forth. The common component of all approaches to the solution is to maintain package metadata within each package.

Cookbook dependencies are a challenge you meet very quickly when you start to build on and use community cookbooks. Unsurprisingly, with a library of 1,000+ high-quality cookbooks, cookbook writers tend to use one another’s cookbooks to make life easy. For example, one cookbook may contain functionality for service management (runit or daemontools), and another for managing third-party upstream package repositories (apt or yum). A cookbook delivering a service that needs process management across platforms, and needs to configure upstream repositories, might well make use of the yum, apt, and runit cookbooks as a result. It gets slightly more involved when Windows and OSX are included. Additional primitives for managing Windows and OSX are provided in cookbooks, and not yet in core Chef. The Windows cookbook itself depends on functionality provided by the Chef Handler cookbook. All these dependencies must be expressed in the cookbook metadata. The metadata doesn’t support conditional logic so we can’t say, “If this machine is running on Linux, we don’t need the Windows or OSX-specific dependencies.” This means that in the case of a cross-platform community cookbook, you’ll find yourself depending on cookbooks for a platform you’ll never use. It’s a bit of a bore, but it’s not a solved problem; these challenges exist in all packaging solutions.
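To make the idea of chained dependencies concrete, here is a minimal sketch in plain Ruby. It is not how Chef itself solves dependencies. The my_service cookbook and the shape of the dependency map are invented for illustration; only the link from windows to chef_handler comes from the text.

```ruby
# Hypothetical dependency map: cookbook name => cookbooks named in its metadata.
# Only the windows -> chef_handler edge is taken from the chapter.
DEPENDENCIES = {
  'my_service'   => %w[apt yum runit windows],
  'windows'      => %w[chef_handler],
  'apt'          => [],
  'yum'          => [],
  'runit'        => [],
  'chef_handler' => []
}.freeze

# Walk the map depth-first and return every cookbook required,
# listing each dependency before the cookbook that needs it.
def resolve(cookbook, deps = DEPENDENCIES, resolved = [], visiting = [])
  return resolved if resolved.include?(cookbook)
  raise ArgumentError, "unknown cookbook: #{cookbook}" unless deps.key?(cookbook)
  raise ArgumentError, "circular dependency at #{cookbook}" if visiting.include?(cookbook)

  visiting << cookbook
  deps[cookbook].each { |dep| resolve(dep, deps, resolved, visiting) }
  visiting.delete(cookbook)
  resolved << cookbook
end

puts resolve('my_service').join(', ')
```

Note that the windows and chef_handler cookbooks end up in the list even if the node is a Linux machine, which is exactly the inconvenience described above.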

The cookbook metadata file is another Chef DSL. It’s a DSL for generating JSON that Chef uses for dependency solving and package management. An example metadata file would be:

name "windows"

maintainer "Opscode, Inc."

maintainer_email "cookbooks@opscode.com"

license "Apache 2.0"

description "Provides a set of useful Windows-specific primitives."

long_description IO.read(File.join(File.dirname(__FILE__), 'README.md'))

version "1.8.10"

supports "windows"

depends "chef_handler"

As a cookbook author, if your cookbook makes use of any recipes or library code from another cookbook, you must include this as a dependency in your metadata.

Obviously, this manual dependency solving is a bit of a pain. We’ll introduce tooling that removes this pain later, but this example serves to demonstrate how the dependencies work and makes their existence and importance explicit.
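The idea of a DSL that produces JSON can be sketched with a toy class. MiniMetadata is a hypothetical name, not Chef's own implementation, and the default version constraint used here is an assumption made for the sketch. It evaluates a few lines of metadata source and collects the result in a Hash that can be printed as JSON.

```ruby
require 'json'

# A toy stand-in for the metadata DSL; hypothetical, not Chef's own class.
class MiniMetadata
  FIELDS = %w[name maintainer maintainer_email license description version].freeze

  def initialize
    @data = { 'platforms' => {}, 'dependencies' => {} }
  end

  # Simple string fields, each written as `field "value"` in the source.
  FIELDS.each do |field|
    define_method(field) { |value| @data[field] = value }
  end

  # The default constraint is an assumption made for this sketch.
  def supports(platform, constraint = '>= 0.0.0')
    @data['platforms'][platform] = constraint
  end

  def depends(cookbook, constraint = '>= 0.0.0')
    @data['dependencies'][cookbook] = constraint
  end

  def to_h
    @data
  end

  # Evaluate DSL source text against a fresh instance.
  def self.from_source(source)
    new.tap { |metadata| metadata.instance_eval(source) }
  end
end

source = <<~METADATA
  name     "windows"
  version  "1.8.10"
  supports "windows"
  depends  "chef_handler"
METADATA

puts JSON.pretty_generate(MiniMetadata.from_source(source).to_h)
```

Each line of the metadata file is simply a Ruby method call, which is why the file reads like configuration while remaining ordinary Ruby.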

The dna.json file introduces an important new concept: node attributes. An attribute is that which inherently belongs to and can be predicated of anything. The sky has the attribute color: blue. A web server has the attribute listen_port: 80. A server has the attribute disks: 8. Attributes, therefore, are data associated with the node.

You’ll remember I defined a node as a Ruby object representing the machine we’re configuring. This object behaves like a Hash: it has keys and values, getter and setter methods, and can be viewed, queried, and interacted with as JSON. The keys and values are referred to as node attribute data.

Some of this data is collected automatically by Ohai, such as the hostname, IP address, and a large amount of other pieces of information. However, arbitrary data can be associated with the node as well. Here we see a significant implication of using chef-solo. With chef-solo, there is no server; there is no persistent state that records the attributes of the node. That state must be handed to Chef, in the form of a JSON file. In our simple case, the only attribute that we’re setting is the run_list attribute. However, we could provide any number of keys and values.
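As a rough illustration of handing state to Chef as JSON, the following sketch parses a dna.json-style document in plain Ruby and separates the run list from the other keys. The irc nickname attribute is invented for the example and is not read by any recipe in this chapter; the parsing shown is a simplification, not Chef's own code.

```ruby
require 'json'

# A hypothetical dna.json: the run list from the worked example plus one
# invented attribute (irc/nickname is illustrative only).
dna = <<~DNA
  {
    "run_list": ["recipe[irc]", "recipe[git]"],
    "irc": { "nickname": "tdi" }
  }
DNA

data = JSON.parse(dna)

# Separate the run list from the remaining keys, which act as node attributes.
run_list   = data.delete('run_list') || []
attributes = data

puts "run list:   #{run_list.inspect}"
puts "attributes: #{attributes.inspect}"
```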

Attributes allow sane defaults to be set for a cookbook. Rather than hardcoding implementation detail in a recipe, we can use an attribute like a variable. If you look at the Git cookbook we downloaded, you’ll see that a number of attributes are set in the attributes/default.rb file:

case node['platform_family']

when 'windows'

default['git']['version'] = "1.8.1.2-preview20130201"

default['git']['url'] = "https://msysgit.googlecode.com/files/Git-#{node['git']['version']}.exe"

default['git']['checksum'] = "796ac91f0c7456b53f2717a81f475075cc581af2f447573131013cac5b63bb2a"

default['git']['display_name'] = "Git version #{ node['git']['version'] }"

when "mac_os_x"

default['git']['osx_dmg']['app_name'] = "git-1.8.2-intel-universal-snow-leopard"

default['git']['osx_dmg']['volumes_dir'] = "Git 1.8.2 Snow Leopard Intel Universal"

default['git']['osx_dmg']['package_id'] = "GitOSX.Installer.git182.git.pkg"

default['git']['osx_dmg']['url'] = "https://git-osx-installer.googlecode.com/files/git-1.8.2-intel-universal-snow-leopard.dmg"

default['git']['osx_dmg']['checksum'] = "e1d0ec7a9d9d03b9e61f93652b63505137f31217908635cdf2f350d07cb33e15"

else

default['git']['prefix'] = "/usr/local"

default['git']['version'] = "1.8.2.1"

default['git']['url'] = "https://nodeload.github.com/git/git/tar.gz/v#{node['git']['version']}"

default['git']['checksum'] = "bdc1768f70ce3d8f3e4edcdcd99b2f85a7f8733fb684398aebe58dde3e6bcca2"

end

default['git']['server']['base_path'] = "/srv/git"

default['git']['server']['export_all'] = "true"

We’ll cover how these attributes function in more detail shortly, but for now, I’d draw your attention to the conditional logic. The node has an attribute platform_family. This comes from Ohai. Ohai is able to determine if a machine is, for example, of Debian flavor or Windows flavor. Based on that, we can make decisions in our cookbooks and recipes. In this case, we’re specifying which versions of Git to obtain from an upstream provider, and which checksums should be used to verify that we obtained the correct file. Returning briefly to the metadata, you’ll also note that the metadata specifies which platforms the cookbook supports:

$ grep -C1 supports chef-repo/cookbooks/git/metadata.rb

%w{ amazon arch centos debian fedora redhat scientific oracle amazon ubuntu windows }.each do |os|

supports os

end

supports "mac_os_x", ">= 10.6.0"

Again, as a cookbook author, you should specify which platforms you support. If you don’t, you’re implicitly stating your cookbook supports all platforms, which is almost certainly not true.
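To see the platform_family switch from the attributes file in isolation, here is a small sketch in plain Ruby. The git_defaults method is hypothetical and not part of the Git cookbook. It copies a few of the values quoted earlier and returns a different set of defaults depending on the platform family passed in.

```ruby
# Return an illustrative set of Git defaults for a given platform family.
# Values are copied from the attributes file quoted earlier; the method
# itself is hypothetical and does not exist in the Git cookbook.
def git_defaults(platform_family)
  defaults = {}
  case platform_family
  when 'windows'
    defaults['version'] = '1.8.1.2-preview20130201'
    defaults['display_name'] = "Git version #{defaults['version']}"
  when 'mac_os_x'
    defaults['osx_dmg'] = { 'app_name' => 'git-1.8.2-intel-universal-snow-leopard' }
  else
    defaults['prefix'] = '/usr/local'
    defaults['version'] = '1.8.2.1'
  end
  # Set regardless of platform, as in the original file.
  defaults['server'] = { 'base_path' => '/srv/git' }
  defaults
end

%w[windows mac_os_x debian].each do |family|
  puts "#{family}: #{git_defaults(family).inspect}"
end
```

In a real run, the platform family would come from Ohai rather than being passed in by hand.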

So we set the run list to be an array of recipes in order. First we said that the default irc recipe should be applied, and then the default git recipe. The result was that Git was installed on our machine.
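The way a run list entry maps onto a cookbook and recipe can be sketched in a few lines of Ruby. The parse_run_list_item method is a hypothetical helper, not Chef's parser, and it handles only recipe entries. An entry that names just a cookbook is taken to mean that cookbook's default recipe, which matches the irc::default and git::default lines in the Chef output shown earlier.

```ruby
# Matches entries of the form recipe[name] or recipe[cookbook::recipe].
RUN_LIST_ITEM = /\Arecipe\[([^\]]+)\]\z/

# Hypothetical helper: split a run list entry into cookbook and recipe names.
def parse_run_list_item(item)
  match = RUN_LIST_ITEM.match(item)
  raise ArgumentError, "unrecognised run list item: #{item}" unless match

  cookbook, recipe = match[1].split('::', 2)
  { cookbook: cookbook, recipe: recipe || 'default' }
end

# Entries are processed in the order given.
run_list = ['recipe[irc]', 'recipe[git]']
run_list.each do |item|
  parsed = parse_run_list_item(item)
  puts "#{parsed[:cookbook]}::#{parsed[:recipe]}"
end
```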

The example Opscode chef-repo contains all the directories you will need and work with as part of your regular workflow as an infrastructure developer. It also contains a Rakefile, which provides some useful tasks, such as creating self-signed SSL certificates. In practice, you’re unlikely to use rake, as knife will do more than 99% of the tasks you’ll find yourself needing to do. The chef-repo pattern is somewhat out of favor, and might even be considered an anti-pattern. The reason for this is that by putting absolutely everything in a single repository, we’re mixing temporal data—things that might change—with versioned artifacts. It also runs counter to the Git philosophy: have a repository for each software project and keep them light and mobile. Cookbooks are very much software projects, with independent versions, tags, development teams, and purposes. It just doesn’t make much sense to stick them all in one place. The emerging recommendation is to put temporal data in a chef-data repository and maintain a repository per cookbook. This makes it easy to track upstream by adding a remote, and pulling and merging when required. Note that this is without the magic of the knife cookbook site install command, and is a much more explicit procedure.

However, as a starting point, the monolithic Chef repository has its place. It gathers everything we need in one place. For users new to the idea of using version control at all, let alone something with the Byzantine reputation of Git, the learning curve of a single repository is pretty low. We can then refactor at the point of need—as soon as we start to feel the limitations of our approach, we should refactor—and move to the next level.

Of course, we could have used the GitHub fork mechanism within the web interface to simplify the process of having our own Chef repository, but I wanted to show the manual process, which also works with other Git hosts, such as an internally hosted Git server or an alternative public Git service such as Bitbucket.

In the next chapter, we’ll build on the work done here by installing some essential tools using Chef.


[1] Proxy support is provided in Chef and most auxiliary utilities, but it can be a bit fiddly. Improvements are ongoing, and there are frequent discussions on the Chef mailing lists, IRC, and GitHub issues that will be relevant. Basically, you will be able to get up and running if you do have a proxy server in your environment, but it would be better for me to direct you to the latest discussion and details rather than attempt to provide a guide here, which would almost certainly date rapidly.

[2] The original line is from Scott Bellware.