PostgreSQL - Reliably Deploying Rails Applications: Hassle free provisioning, reliable deployment (2014)

Reliably Deploying Rails Applications: Hassle free provisioning, reliable deployment (2014)

11.1 - PostgreSQL

Overview

In this chapter we’ll cover getting the PostgreSQL server packages installed, setting up authentication and managing databases with the console.

Installation

In our simple configuration, the only parameter which must be set is the postgres users password. It’s important that you note this as it will be required to gain access to the postgres user which will be needed for creating databases. This is generated using:

1 openssl passwd -1 "plaintextpassword".

And entered into the node definition:

1 "postgresql" : {

2 "password" : {

3 "postgres" : "openssl_output"

4 }

5 },

If, as happends very easily, this password is lost or forgotten, you can reset it with:

1 sudo passwd postgres

Which will then prompt you to enter and re-enter the plain text password.

The sample configuration contains a simple postgres-server role which automatically includes the PostgreSQL server recipe as well as a simple monit configuration:

1 {

2 "name": "postgres-server",

3 "description": "Postgres database server",

4 "default_attributes": {

5

6 },

7 "json_class": "Chef::Role",

8 "run_list": [

9 "postgresql::server",

10 "monit_configs-tlq::postgres"

11 ],

12 "chef_type": "role"

13 }

This can be included in a node definition by adding:

1 "role[postgres-server]"

to the run list.

Accessing the psql console

The PostgreSQL console can be accessed when logged into the server with ssh by first switching to the postgres user:

1 su postgres

And entering the password chosen above, then entering:

1 psql

You should then see something like:

1 postgres@precise64:/etc/postgresql/9.1/main$ psql

2 psql (9.1.9)

3 Type "help" for help.

4

5 postgres=#

This console will be the main tool used for creating databases and managing access to these databases.

You can exit the console by typing:

1 \q

Creating Databases

Access the psql console as above and then enter:

1 CREATE DATABASE database_name;

and press enter.

What this process actually does is create a copy of the default database ‘template1’ (which is created when PostgreSQL is first installed) with the name database_name. Advanced usage and manipulation of template databases is beyond the scope of this book but is well documented in the official PostgreSQL manual.

If, having created this database, you wanted to execute commands on it using the console, you would enter:

1 psql database_name

while logged in as the postgres user. psql expects the first none option (e.g. not - or --) to be the name of the database to be connected to. You can also specify a database with the -d flag, for example:

1 psql -d database_name

Adding users to Databases

At the moment only the postgres superuser can access the newly created database. In practice for security we want each database to have its own user which can access only one specific database and cannot create or modify other databases and users.

To create a new user, in the psql console enter:

1 CREATE USER my_user WITH PASSWORD 'my_password';

and press return.

You can then grant this user all privileges on a specific database with:

1 GRANT ALL PRIVILEGES ON DATABASE database_name to my_user;

You can now exit the psql console with \q and return to the deploy user by entering exit in the console.

If we now wanted to start a psql console specifically to execute commands in the database we have just created as the user we just created, there is no need to switch to the postgres user, we can simply use:

1 psql -h 127.0.0.1 database_name my_user

or

1 psql -h 127.0.0.1 -d database_name -U my_user

Note the addition of -h 127.0.0.1. This will force the connection to be established via tcp/ip rather than the default Unix-domain socket. We’ll see later in the Configuring Authentication section that in our default configuration, password based authentication is only enabled for tcp connections.

Listing all databases and permissions

An extremely useful command when in the psql console is the simple:

1 \l

Which, when run as the postgres user, will output a list of all databases and who has access to them.

Configuring Authentication

Postgres supports a variety of authentication methods but we’ll consider three key ones here; Peer, Ident, and md5.

pg_hba.conf

The authentication methods which are allowed on a PostgreSQL server are defined in a file traditionally called pg_hba.conf. You can find this in:

1 /etc/postgres/version/pg_hba.conf

Where version is the version of postgres you have installed.

Whilst the generation of this file will be taken care of by our Chef recipe, understanding how this file is structured and what it means will make troubleshooting connection issues and creating custom configurations further down the line much easier. This explanation is not exhaustive (the postgres manual is excellent for more details) but covers the basic structure and options relevant to our configuration.

Our default configuration will look something like this:

1 # This file was automatically generated and dropped off by Chef!

2

3 # PostgreSQL Client Authentication Configuration File

4 # ===================================================

5 #

6 # Refer to the "Client Authentication" section in the PostgreSQL

7 # documentation for a complete description of this file.

8

9 # TYPE DATABASE USER ADDRESS METHOD

10

11 ###########

12 # Other authentication configurations taken from chef node defaults:

13 ###########

14

15 local all postgres ident

16

17 local all all ident

18

19 host all all 127.0.0.1/32 md5

20

21 host all all ::1/128 md5

22

23 # "local" is for Unix domain socket connections only

24 local all all peer

Each line of the file represents a new record, with columns being separated by spaces or tabs.

The first column (type) specifies the type of connection the record applies to. Host means tcp/ip connections and local means unix-domain sockets.

Database and user refer to the database and user the rule applies to, our configuration will work on the basis that all access methods we define are available to all users on all databases so these will generally be all.

For our purposes the address field only applies to entries of type host and contains either a single ip address, an ip address range or a host name.

Method refers to one of the authentication methods supported by PostgreSQL, a selection of which are covered in more detail below.

Md5

This is the primary authentication method we will use for rails applications. Md5 is a type of password authentication allows us to authenticate a user by sending the md5 hash of the users password where the user is a PostgreSQL database user (created above with CREATE USER) as distinct from a system user.

In general the client will take care of creating the md5 hash and passing it to the server so whether entering details in a rails database.yml file or connecting to a psql console instance, we will never need to generate the hash ourselves. We provide the plain text password and the client will generate an md5 hash of it and send that to the server.

The benefit of md5 over simple password authentication is that the plain text password is never sent over the network making packet sniffing less of a concern.

We can now see why:

1 psql -d database_name -U my_user

would fail but:

1 psql -h 127.0.0.1 -d database_name -U my_user

Would prompt for a password and succeed (assuming the correct password is entered).

The entries to allow md5 auth have a type host and therefore are only available if a connection is made via tcp/ip. By default psql will attempt to connect via a unix-domain socket (type local) for which there are no password authentication methods available.

It’s worth noting that in our default configuration, md5 authentication is only available on the local loopback interface due to the host options being defined as the loopback IPV4 and IPV6 addresses.

Peer

Peer authentication queries the servers kernel for the username of the user who executed the command. Because it requires direct kernel access it can only be used for connections of type local.

The username of the user executing the command is taken as the database username. Therefore if your Unix username was deploy and the user deploy had been granted access to the database my_app_production then scripts running as this user would be able to access the database without any further authentication as would the psql console.

To grant your deploy user access to the the database database_name switch to the postgres user and load the psql console and enter:

1 CREATE USER deploy;

2 GRANT ALL PRIVILEGES ON DATABASE database_name to deploy;

3 \q

You can then exit back to your deploy user and enter:

1 psql database_name

And you will be able to enter the psql shell for the database database_name as user deploy without entering any password.

Note that PostgreSQL users are completely separate to system users therefore even though the unix user deploy already exists, we have to create a matching user in PostgreSQL.

Peer authentication looks at the current system user and then, if a PostgreSQL user with a matching name exists, authenticates as that user. It is also possible to create mapping tables which set which system user maps to which PostgreSQL user, the details of this are beyond the scope of this book but are covered in depth in the PostgreSQL manual.

Ident

Ident authentication is covered here only because it occurs often in documentation or tutorials about PostgreSQL and so an understanding of what it is and how it differs to Peer authentication is useful.

It is very similar to peer authentication except instead of querying the kernel for the username of the user executing the command it queries an ident server on tcp port 113.

An ident server answer questions such as “What user initiated the connection that originated your port X and connected to this server on port Y?”.

The obvious downside of this is that if the client machine is compromised or malicious, it can be set to return anything when port 113 is queried. As a result this authentication method is only suitable for use on networks where both client and server are entirely trusted.

It’s important to note that where the ident method is specified but the type is local, peer authentication will be used instead.

Allow External Access

By default our chef recipe will create the following line in postgresql.conf:

1 listen_addresses = 'localhost'

Which means that PostgreSQL will only accept connections from localhost.

To allow connections from other hosts, we can add the following to the postgres section of the node definition:

1 "config" :{

2 "listen_addresses" : "*"

3 }

which gives the following line in postgresql.conf:

1 listen_addresses = '*'

This will cause postgres to listen on all available network interfaces. I generally work on the basis that it is the job of the Firewall (in our case ufw + iptables) to manage incoming connections and so allowing the database to listen on all interfaces is acceptable.

Mangaging pg_hba.conf with chef

As with all of our configuration files, we should never modify them directly as changes will be overwritten by Chef. Instead, all changes should be made within our chef recipes.

The PostgreSQL cookbook we’re using provides an easy interface for manging entries in pg_hba/conf. The postgres section of our our node definition (in nodes/hostname.json accepts an array like this:

1 "postgres":

2 {"pg_hba" : [

3 {"comment" : "# Data Collection",

4 "type" : "host",

5 "db" : "the_database_name",

6 "user" : "the_user",

7 "addr" : "the_host",

8 "method" : "authentication method"}

9 ],

10 ....

11 }

The "pg_hba" key should contain an array of hashes, each of which will be converted into an entry in pg_hba.conf. So for example the following:

1 {"comment" : "# Data Collection",

2 "type" : "host",

3 "db" : "my_database_name",

4 "user" : "some_username",

5 "addr" : "0.0.0.0/0",

6 "method" : "md5"}

Would create the following entry in pg_hba.conf:

1 # Data Collection

2 host my_database_name some_username 0.0.0.0/0 md5

This would have the effect of allowing the user some_username to connect to my_database_name using md5 auth from any IP address. In practice this is very rarely a good idea and you’d want to specify individual IP’s or ranges who should be allowed.

Bare in mind that if you’re enabling access from external hosts, you’ll also need to add the configuring defined in “Allowing External Access” above.

Importing and Exporting Databases

PostgreSQL includes the utility pg_dump for generating an SQL dump of a database. Its options are very similar to the psql command. To generate a dump of the database test_db_1 authorising as the user test_user_1 we’d use the following command:

1 pg_dump -h 127.0.0.1 -f test1.sql -U test_user_1 test_db_1

Where the -f option is for specifying the filename to export to and the first none option argument is the database to be exported.

This file can then be imported using the following command:

1 psql -U test_user_1 -d database_to_import_to -f current.sql -h 127.0.0.1

This takes the contents of the file specified with the -f option and executes the sql contained in it on the database specified with the -d option. This can be used for anything from copying databases from production to development servers, migrating between servers or cloning production data to staging.

Monit

A monit configuration for postgresql is available in the monit_configs-tlq::postgres recipe:

1 check process postgresql with pidfile /var/run/postgresql/9.1-main.pid

2 start program = "/etc/init.d/postgresql start"

3 stop program = "/etc/init.d/postgresql stop"

4 if 15 restarts within 15 cycles then timeout

Note that the pid file name will change depending on the version of PostgreSQL being run.