
Chapter 6. Setting Up a Development Workflow

A few years ago, I joined a team to work on a web development project. On the first day, I got the following e-mail from the CTO:

Hi Juampy!

Welcome to the team. Here is how you can start working:

Clone the repository git@github.com:some-company/some-project.git

Download the database from http://intranet.some-company.com/some-project/db.sql.tar.gz

Set up your local environment and then open http://jira.some-company.com to start working on tickets.

Thanks and good luck!

I hope you can imagine how I felt when I read this e-mail. If you can't, let me tell you: the project was chaotic, there was no effort to maintain any level of quality, there were bugs everywhere, and it took me two days just to get the project's homepage loading in my local environment. This is definitely not a good welcome for a new developer. Here is an alternative e-mail that I got on a different team:

Hi Juampy!

Welcome to the project! I have just given you access to the project's repository. Please open https://github.com/some-company/some-project and follow the instructions there to get started.

Thanks and good luck!

The preceding URL presented the project's README.md file, with clear steps on how to set up a local environment, how to keep it updated, and which tools and resources were available to me. This experience was far more positive than the first one. The team had a development workflow. They understood that code travels from local environments to the development environment and finally to the production environment, while content streams back in the opposite direction, from production down to local environments.

Drush can help a team to standardize many of the common tasks that they encounter every day in a Drupal project. In this chapter, we will leverage Drush concepts that we studied previously to implement a development workflow for a team. Here are some of the topics that we will cover:

· Moving configuration, commands, and site aliases out of Drupal

· Configuring the development database for the team

· Running post sql-sync tasks in local environments

Moving configuration, commands, and site aliases out of Drupal

Drupal's .htaccess file does a good job of blocking the execution of command files because their extension is .drush.inc, but configuration files end in .php (drushrc.php); hence, they will be executed by the web server if someone types the full path into the browser. Let's test this on the command line. Our sample Drupal project has a few Drush commands and a configuration file at sites/all/drush:

$ curl -v http://example.local/sites/all/drush/policy.drush.inc
* Connected to example.local (127.0.0.1) port 80 (#0)
> GET /sites/all/drush/policy.drush.inc HTTP/1.1
> User-Agent: curl/7.35.0
> Host: example.local
> Accept: */*
>
< HTTP/1.1 403 Forbidden

We attempted to access our policy command file over HTTP and failed because Drupal's .htaccess file blocked access to it. Good! Now let's try the same with our main Drush configuration file:

$ curl -v http://example.local/sites/all/drush/drushrc.php
* Connected to example.local (127.0.0.1) port 80 (#0)
> GET /sites/all/drush/drushrc.php HTTP/1.1
> User-Agent: curl/7.35.0
> Host: example.local
> Accept: */*
>
< HTTP/1.1 200 OK

Gotcha! Drupal's .htaccess file allowed access to drushrc.php, so the web server executed the code in that file. There is no output, because drushrc.php simply sets a few array variables, but this could become dangerous if we add further logic to it. Drush command files and configuration are not meant to be served over the web. So why keep them under our project's document root? In this section, we will move all of our custom Drush configuration, commands, and site aliases one level up and then tell Drush how to find them.

Installing Drupal Boilerplate

In order to move Drush files out of Drupal, our codebase needs a directory above the document root. We will set up a directory structure where docroot contains our Drupal project and everything that does not need to be served by the web server lives outside it.

Drupal Boilerplate (https://github.com/Lullabot/drupal-boilerplate) is a GitHub project that we will use as a foundation to structure Drupal projects. It comes with the following file structure:

· docroot: This is an empty directory where we will place our example Drupal project.

· drush: This will host configuration, commands, and aliases for Drush.

· patches: This can be used to keep track of core and contrib patches.

· results: This stores automated test results. It is useful when you want third-party software to parse them.

· scripts: These are project scripts. For example, we can store shell scripts used by Jenkins jobs here.

· tests: These are automated test scripts: not SimpleTest scripts, but tests implemented with other testing tools such as CasperJS or Behat.

· .gitignore: This is the default set of files and patterns to be ignored by Git.

· README.md: This is the project's main documentation. It is meant to be adjusted for your project and to be the starting point for everyone new to the team.

Here is how we can download Drupal Boilerplate and then place Drupal into its docroot directory. We start by downloading Drupal Boilerplate into our temporary directory:

$ cd /tmp
$ wget https://github.com/Lullabot/drupal-boilerplate/archive/master.zip
HTTP request sent, awaiting response... 200 OK
Length: 40891 (40K) [application/zip]
Saving to: 'master.zip'

100%[======================================>] 40.891  160KB/s  in 0,2s

'master.zip' saved [40891/40891]

Drupal Boilerplate has been downloaded to /tmp/master.zip. Let's unzip its contents:

$ unzip master.zip
Archive:  master.zip
   creating: drupal-boilerplate-master/
  inflating: drupal-boilerplate-master/.gitignore
   creating: drupal-boilerplate-master/docroot/
  inflating: drupal-boilerplate-master/docroot/readme.md
   creating: drupal-boilerplate-master/drush/
   creating: drupal-boilerplate-master/drush/aliases/
...

Now, we will move our Drupal example project into docroot and then move Drupal Boilerplate to be the root of our project. Note that we are using rsync to copy the directory contents because a wildcard move (mv example/* docroot/) would leave hidden files such as .htaccess behind:

$ rsync -v -a /home/juampy/projects/example/ drupal-boilerplate-master/docroot/
sending incremental file list
./
.gitignore
.htaccess
CHANGELOG.txt
COPYRIGHT.txt
INSTALL.mysql.txt
INSTALL.pgsql.txt
INSTALL.sqlite.txt
...
sent 24,016,476 bytes  received 75,003 bytes  16,060,986.00 bytes/sec
total size is 23,692,782  speedup is 0.98

$ mv /tmp/drupal-boilerplate-master /home/juampy/projects/example

We need to adjust our local web server configuration so that the root of http://example.local now points to /home/juampy/projects/example/docroot. The same applies to the development and production environments. Furthermore, this directory change also affects our site alias definitions, which need to be updated. Here is what they look like after adjusting them at docroot/sites/all/drush/example.aliases.drushrc.php:

<?php
/**
 * @file
 *
 * Site alias definitions for Example project.
 */

// Development environment.
$aliases['dev'] = array(
  'root' => '/var/www/drupal-dev/docroot',
  'uri' => 'http://dev.example.com',
  'remote-host' => 'dev.example.com',
  'remote-user' => 'juampydev',
  'command-specific' => array(
    'sql-dump' => array(
      'structure-tables-key' => 'common',
    ),
  ),
);

// Production environment.
$aliases['prod'] = array(
  'root' => '/var/www/exampleprod/docroot',
  'uri' => 'http://www.example.com',
  'remote-host' => 'prod.example.com',
  'remote-user' => 'exampleprod',
);

We are done relocating Drupal within the new directory structure. Welcome to Drupal Boilerplate!

Relocating Drush files

Now that we have our new directory structure in place, we can move Drush configuration, commands, and site aliases from sites/all/drush to drush. Let's take a look at the contents of this directory for our sample Drupal project:

$ ls docroot/sites/all/drush/
drushrc.php
example.aliases.drushrc.php
policy.drush.inc
registry_rebuild
updatepath.drush.inc

We have a mix of a configuration file (drushrc.php), custom command files (policy.drush.inc and updatepath.drush.inc), a site alias file (example.aliases.drushrc.php), and a contributed project that contains a command file (registry_rebuild). We will reorganize them with the following commands:

Our Drush configuration file drushrc.php goes to drush:

$ mv docroot/sites/all/drush/drushrc.php drush/

Custom command files go to drush/commands:

$ mv docroot/sites/all/drush/*.drush.inc drush/commands/

Site aliases go to drush/aliases:

$ mv docroot/sites/all/drush/example.aliases.drushrc.php drush/aliases/

Let's remove Registry Rebuild from docroot/sites/all/drush because Drupal Boilerplate already has it at drush/commands:

$ rm -rf docroot/sites/all/drush/registry_rebuild

Now that we have relocated Drush files, how will they be discovered? Drush, while bootstrapping, is able to find resources at several locations in the system and the current Drupal project. On top of that, it can be provided with additional resources. We will add the following file at docroot/sites/all/drush/drushrc.php, which does a sanity check and then tells Drush where our configuration, commands, and site aliases are:

<?php
/**
 * @file
 * Drush configuration for Sample project.
 *
 * Loads configuration files located out of the document root.
 */

// Safety check. Only run in the command line.
if (php_sapi_name() != 'cli') {
  return;
}

// Load Drush configuration, commands and site alias files from docroot/../drush.
$drupal_dir = drush_get_context('DRUSH_SELECTED_DRUPAL_ROOT');
if ($drupal_dir) {
  include_once $drupal_dir . '/../drush/drushrc.php';
  $options['include'][] = $drupal_dir . '/../drush/commands';
  $options['alias-path'][] = $drupal_dir . '/../drush/aliases';
}

We have added a safety measure at the top of the file so that, if someone opens http://example.local/sites/all/drush/drushrc.php in a web browser, no code will be executed. Next, we obtain Drupal's root directory through Drush's context system and use it to load the configuration, commands, and aliases located in the parent directory.
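As a quick check (assuming the same example.local virtual host that we used earlier), requesting the file over HTTP again should now return a 200 response with an empty body, and, more importantly, none of the include logic below the safety check will have run:

$ curl -s -o /dev/null -w "%{http_code} %{size_download}\n" http://example.local/sites/all/drush/drushrc.php
200 0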

There are many ways to structure and load external configuration, commands, and site aliases in Drush. The drush topic docs-configuration command suggests a neat way of doing it through the project's Git repository. In this book, we did not choose this strategy because Git might not be available in the development and production environments.

Testing the new setup

Let's test that our new Drush setup works as expected. We will now run Drush's core-status command from the root of our Drupal project using the --verbose option to check where the configuration is being loaded from. We will analyze the command output piece by piece:

$ cd /home/juampy/projects/example/docroot
$ drush --verbose core-status
Include /home/juampy/projects/example/docroot/../drush/commands [notice]
Initialized Drupal 7.31 root directory at /home/juampy/projects/example/docroot [notice]

Gotcha! Very early in Drush's bootstrap, our new directory containing Drush commands has been included. Let's see the next chunk of the command's output:

Initialized Drupal site default at sites/default [notice]
 Drupal version      :  7.31
 Site URI            :  http://default
 Database driver     :  mysql
 Database hostname   :  localhost
 Database port       :
 Drush configuration :  /home/juampy/projects/example/docroot/sites/all/drush/drushrc.php
                        /home/juampy/.drush/drushrc.php

Drush loaded two configuration files: one from our $HOME path (which we defined in Chapter 5, Managing Local and Remote Environments) and another one inside our project at sites/all/drush/drushrc.php. Although we do not see drush/drushrc.php listed here, we know that it has been loaded by sites/all/drush/drushrc.php through an include_once statement. Let's inspect the rest of the command's output:

 Drush alias files             :  /home/juampy/.drush/aliases.drushrc.php
                                  /home/juampy/projects/example/docroot/../drush/aliases/example.aliases.drushrc.php
 Drupal root                   :  /home/juampy/projects/example/docroot
 Site path                     :  sites/default
 File directory path           :  sites/default/files
 Temporary file directory path :  /tmp
Command dispatch complete [notice]

Our project's site aliases were loaded from their new location. Note the /../, which we used at sites/all/drush/drushrc.php to access the parent directory of docroot. What about our custom shell aliases, which are now defined at drush/drushrc.php? Are they being loaded? Let's list the available shell aliases to verify it:

$ cd /home/juampy/projects/example/docroot
$ drush shell-alias
wipe      : cache-clear all
unsuck    : pm-disable -y overlay,dashboard
offline   : variable-set -y --always-set maintenance_mode 1
online    : variable-delete -y --exact maintenance_mode
pm-clone  : pm-download --gitusername=juampy@git.drupal.org --package-handler=git_drupalorg
syncdb    : --verbose --yes sql-sync @example.dev @self --create-db
syncfiles : --verbose --yes rsync @example.dev:%files/ @self:%files/

That's perfect. We can still use these shell aliases to download the database and files from the development environment. We are done! We have successfully moved our Drush configuration and commands out of Drupal.
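As a final sanity check, the aliases still expand to the same commands as before when run from the project root (the expansions shown as comments are taken from the shell-alias output above):

$ drush syncdb     # runs: drush --verbose --yes sql-sync @example.dev @self --create-db
$ drush syncfiles  # runs: drush --verbose --yes rsync @example.dev:%files/ @self:%files/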

Configuring the development database for the team

The development environment's database is the one that everyone on the team should download to work with. The production database should only be downloaded to a local environment on very specific occasions, when we need completely fresh data and are aware of the security implications of using it.

By working with the development environment's database, we gain the following benefits:

· The development environment's database might not need tables that are present in production, such as old migration tables.

· Sensitive data can be sanitized in the development environment after the database has been copied from production. Therefore, when developers download the development database into their local environments, they get a safe database to work with.

· Large tables containing data that is not needed for development can be trimmed down, reducing the size of the database and making the sql-sync command faster.

· E-mail submission can be short-circuited or forwarded to a logfile or a dummy address.

In the previous chapter, we made a few adjustments to the sql-sync command so that the team could download a copy of the development environment to their local environments. We will now fine-tune the other side of the coin: the job that periodically copies the production environment's database and files into development. You can set this up in many ways: a crontab in the development environment, a shell script in your local environment that logs in to the development environment, a Jenkins job with SSH access to the development server, and so on. In this book, we will use Jenkins to set up a job that runs periodically.
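For reference, if you prefer a plain crontab on the development server over Jenkins, a single entry along these lines would run the same sync script nightly (the script path and log file are illustrative and depend on where your project lives):

# Edit the crontab of the user that owns the development site (crontab -e)
# and run the production-to-development sync every night at 3 A.M.
0 3 * * * /bin/bash /var/www/drupal-dev/scripts/sync_prod_to_dev.sh >> /var/log/sync_prod_to_dev.log 2>&1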

Configuring Jenkins to sync production to development

Our example project has two remote site aliases referencing the development and production environments. We will now add the development server as a Jenkins node and then create a job where Jenkins will log in to development and run the sql-sync and core-rsync commands.

First of all, we need to create a jenkins user in development and give Jenkins SSH access to it. You can find instructions to accomplish this at http://www.caktusgroup.com/blog/2012/01/10/configuring-jenkins-slave.
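The linked article covers the details; as a rough sketch, and assuming a Debian-like development server, the setup boils down to creating the user and authorizing the SSH public key of the Jenkins master:

# On the development server: create the jenkins user and authorize the key.
sudo adduser --disabled-password --gecos "" jenkins
sudo mkdir -p /home/jenkins/.ssh
# Paste the Jenkins master's public key into authorized_keys (placeholder key below).
echo "ssh-rsa AAAA...your-jenkins-public-key..." | sudo tee -a /home/jenkins/.ssh/authorized_keys
sudo chown -R jenkins:jenkins /home/jenkins/.ssh
sudo chmod 700 /home/jenkins/.ssh
sudo chmod 600 /home/jenkins/.ssh/authorized_keys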

Once we have configured the jenkins user and SSH access, we can proceed to add the node by clicking on New Node at the Jenkins administration interface. We are then presented with the following screenshot:

[Screenshot: Configuring Jenkins to sync production to development]

At Node name, we will give a name to the Jenkins node. In this case, we will type in ExampleDev, as this node references the server that hosts http://dev.example.com. We will then choose Dumb Slave and click on OK. On the next page, we can configure the new node:

[Screenshot: Configuring Jenkins to sync production to development]

Here, we are specifying how to reach and access the server. We have set the Host field to dev.example.com and the Credentials field to use the default jenkins credentials (Go to Manage Jenkins | Manage Credentials if you need to change this). This setup will then translate to Jenkins running ssh jenkins@dev.example.com in order to run jobs at the development server.

Now that we have added the development environment as a Jenkins node, we can create a job that uses it. Let's click on New Item at the Jenkins administration homepage to add the job. Name it Sync Production to Development, select Freestyle Project, and click on Next. On the following screen, we can configure our new job. The first setting defines on which server this job should run; here, we will choose the ExampleDev node that we added in the previous step:

[Screenshot: Configuring Jenkins to sync production to development]

Next, at Build Triggers, we will make this job run nightly at 3 A.M. The following screenshot shows how we define this by typing H 3 * * *. This is a common way of defining time periods, used by crontab, Jenkins, and other systems. The question mark icon next to the text area contains useful examples for defining other schedules. Furthermore, Jenkins interprets what you type and explains it in a sentence, as you can see in the text right below the text area:

[Screenshot: Configuring Jenkins to sync production to development]
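For reference, the schedule uses the usual five cron fields (minute, hour, day of month, month, day of week); H is a Jenkins-specific token that picks a pseudo-random but stable value per job, spreading jobs out so they do not all fire at the same minute. A few illustrative variations:

# H 3 * * *      run once per night, at some stable minute within the 3 A.M. hour
# H/15 * * * *   run roughly every fifteen minutes
# H 3 * * 1-5    run nightly at 3 A.M., on weekdays only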

The next step is to add a Build step of type Execute shell. This shows a text area where we type the commands that we want Jenkins to run. We will enter the following statements to run a shell script from the scripts directory of our project, which we will implement in the next step:

[Screenshot: Configuring Jenkins to sync production to development]

The preceding step simply runs a shell script located at scripts/sync_prod_to_dev.sh. Here are the contents of the shell script:

#!/usr/bin/env bash
# Jenkins script to sync database and files from Production to Development.

cd /var/www/drupal-dev/docroot

# Sync database and files.
drush --verbose --yes sql-sync @example.prod @self
drush --verbose --yes core-rsync @example.prod:%files @self:%files

We could have pasted the preceding Drush commands directly into the Jenkins user interface, but what if the Jenkins server crashes and we lose all our jobs? By keeping shell scripts inside our project's repository, we keep track of changes to them, and they evolve along with the rest of the Drupal codebase.

Read and adjust the rest of the settings for this job to suit your needs and click on Save. You can test it by clicking on the Build Now link on the left navigation menu and then inspecting the Jenkins console output.

Congratulations! You have automated a job to run periodically. Now, Jenkins will take care of keeping the development environment's database and files up to date with production.

Fine-tuning the development database

Now that we have set up a job to periodically obtain a fresh database copy from production, it's time to add a few enhancements to it. The production environment's database contains a few things that we do not need in the development environment:

· It has personal information such as names, usernames, and passwords

· It might contain extra tables that do not need to be downloaded

· It might use modules that send e-mail notifications

· It has development and data modeling modules in disabled status

The following sections will iterate on the commands executed by the Jenkins job in order to fine-tune the development environment's database so that the team can work with it more comfortably.

Recreating the database on sql-sync

The sql-sync command has an option called --create-db. When it is used, Drush recreates the destination database prior to importing the database dump extracted from the source site (in this case, production). This option will save you more than one headache: if you do not recreate the database, tables that are no longer used in the project will never be dropped. Here are a couple of scenarios where not using --create-db can cause trouble:

· If a field is removed in production, its data and revision tables won't be dropped from development when you run sql-sync. If the field is later added back and exported to code through the Features module, then when you run the updatepath command in the development environment you will get an SQL error saying that Features attempted to create a field table that already exists in your database.

· If a module was uninstalled but not all of its tables were removed, the next time you install the module, the installation process will fail because it will try to create a table that already exists.

Long story short: use this setting every time you use the sql-sync command. Here is what our shell script at scripts/sync_prod_to_dev.sh looks like after adding this setting:

#!/usr/bin/env bash
# Jenkins script to sync database and files from Production to Development.

cd /var/www/drupal-dev/docroot

# Sync database and files.
drush --verbose --yes sql-sync @example.prod @self --create-db
drush --verbose --yes core-rsync @example.prod:%files @self:%files

Excluding table data from production

Just as we defined, in the previous chapter, a list of tables whose data is ignored by sql-sync when copying the development database into our local environment, we want to do the same when we copy the production environment's database into development. We already have the array of tables to exclude at drush/drushrc.php as $options['structure-tables']['common']. This array excludes cache tables, the search index, and other tables whose data we do not need to download. We can exclude the data of these tables by adjusting production's Drush site alias. Here is what it looks like after adjusting it at drush/aliases/example.aliases.drushrc.php:

// Production environment.
$aliases['prod'] = array(
  'root' => '/var/www/exampleprod/docroot',
  'uri' => 'http://www.example.com',
  'remote-host' => 'www.example.com',
  'remote-user' => 'exampleprod',
  'command-specific' => array(
    'sql-dump' => array(
      'structure-tables-key' => 'common',
    ),
  ),
);

That's it. Now, when Jenkins runs sql-sync, it will load production's site alias and therefore load the list of tables to exclude.
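If you want to double-check that the exclusion is applied, one quick spot check (a sketch; the exact table names depend on your site) is to run a dump through the production alias and confirm that the cache table comes through as structure only, with no data rows:

$ drush @example.prod sql-dump | grep -c "INSERT INTO \`cache\`"
0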

Ignoring tables from production

Let's suppose now that production has a few tables that we don't want to be created in the development environment. This scenario usually happens after you have run a data migration using the Migrate module (https://www.drupal.org/project/migrate) from a legacy site to Drupal.

The Migrate module uses a set of custom tables to track the status of the migration process. Once the migration has completed, these tables stay in the production environment. We do not need to download them from production to development, so we will use the skip-tables option to ignore them completely when running the sql-sync command. The Migrate module table names look like the following:

migrate_log
migrate_map_source_a
migrate_map_source_b
migrate_message_source_a
migrate_message_source_b
migrate_status

These tables might contain a considerable amount of data, depending on how much content was imported from the legacy website. We definitely do not need them in the development environment. Therefore, we will first add the $options['skip-tables']['common'] option to our Drush configuration file in order to match these table names, and then reference it in the site alias definition of the production environment. Here is our Drush configuration file at drush/drushrc.php after adding the list of tables to skip:

<?php
/**
 * @file
 * Drush configuration for Sample project.
 */

/**
 * List of tables whose *data* is skipped by the 'sql-dump' and 'sql-sync'
 * commands when the "--structure-tables-key=common" option is provided.
 */
$options['structure-tables']['common'] = array('cache', 'cache_*', 'history', 'search_*', 'sessions', 'watchdog');

/**
 * List of tables to be omitted entirely from SQL dumps made by the 'sql-dump'
 * and 'sql-sync' commands when the "--skip-tables-key=common" option is
 * provided on the command line. This is useful if your database contains
 * non-Drupal tables used by some other application or during a migration for
 * example. You may add new tables to the existing array or add a new element.
 */
$options['skip-tables']['common'] = array('migrate_*');

// Shell aliases.
$options['shell-aliases']['syncdb'] = '--verbose --yes sql-sync @example.dev @self --create-db';
$options['shell-aliases']['syncfiles'] = '--verbose --yes rsync @example.dev:%files/ @self:%files/';

The $options['skip-tables']['common'] setting accepts wildcards, so with a single pattern like migrate_*, we will exclude all the migration tables when running sql-sync. The last step is to reference this array in production's site alias at drush/aliases/example.aliases.drushrc.php:

// Production environment.
$aliases['prod'] = array(
  'root' => '/var/www/exampleprod/docroot',
  'uri' => 'http://www.example.com',
  'remote-host' => 'www.example.com',
  'remote-user' => 'exampleprod',
  'command-specific' => array(
    'sql-dump' => array(
      'structure-tables-key' => 'common',
      'skip-tables-key' => 'common',
    ),
  ),
);

Note that the setting is defined for the sql-dump command instead of the sql-sync command. The reason is that Drush uses sql-dump as a subcommand while running sql-sync in order to obtain a database dump. From now on, our Jenkins job will exclude migration tables from the resulting database dump to be installed in development. The sql-sync command will take less time to complete because the database dump to download will be smaller. As a consequence, when your team runs the Drush shell alias syncdb, it will download a smaller database from the development environment, making everyone happy.

Sanitizing data

So, now we have a Jenkins job that downloads the production environment's database into development, excluding the data of some tables and ignoring migration tables. However, we are not yet sanitizing sensitive data. I found a very good definition of data sanitization on Wikipedia:

"Sanitization is the process of removing sensitive information from a document or other medium, so that it may be distributed to a broader audience."

In our particular scenario, what we want to do is reset usernames, passwords, personal data, and privileged data in the development database so that when the team downloads it, they get safe data to work with. Fortunately, the sql-sync command has a --sanitize option that resets all user e-mails to user+%uid@localhost and all passwords to the literal string password. Additionally, it offers a hook for us to add extra sanitization operations.

Let's suppose that our project's users have a Full Name field that we also want to sanitize. We will now implement hook_drush_sql_sync_sanitize() at the bottom of our policy file located at drush/commands/policy.drush.inc with the following SQL statements, which will sanitize the field tables:

<?php
/**
 * @file
 * Policy rules for Example project.
 */

// ... some other Drush hooks that we implemented before go here.

/**
 * Implements hook_drush_sql_sync_sanitize().
 *
 * Custom sql-sync sanitization to alter user's Full name. It is used by Drush
 * when sql-sync is run with the --sanitize option.
 *
 * @see sql_drush_sql_sync_sanitize()
 */
function policy_drush_sql_sync_sanitize($source) {
  drush_sql_register_post_sync_op('policy-sanitize-full-name', dt('Reset the full name of all users.'),
    "UPDATE field_data_field_full_name
     SET field_full_name_value = CONCAT('user+', entity_id);");
  drush_sql_register_post_sync_op('policy-sanitize-full-name-revisions', dt('Reset the full name revisions of all users.'),
    "UPDATE field_revision_field_full_name
     SET field_full_name_value = CONCAT('user+', entity_id);");
}

The preceding hook resets the value of the Full Name field in the field_data_field_full_name and field_revision_field_full_name tables to something like user+1, with 1 being the user's ID. The first table contains the actual data for each user's full name, and the second table is used by Drupal to keep track of the different revisions of this field (when you change a user's full name and click on Save, a new revision is created). Let's now add the --sanitize option to our shell script that syncs production with development at scripts/sync_prod_to_dev.sh so that Drush will run sanitization tasks after completing sql-sync:

#!/usr/bin/env bash
# Jenkins script to sync database and files from Production to Development.

cd /var/www/drupal-dev/docroot

# Sync database and files.
drush --verbose --yes sql-sync @example.prod @self --create-db --sanitize
drush --verbose --yes core-rsync @example.prod:%files @self:%files

Now, let's force the Jenkins job to run immediately by clicking on Build Now in the Jenkins administration interface. Here is an excerpt of the output while sql-sync is running:

Starting to import dump file onto Destination database. [ok]
...
Starting to sanitize target database on Destination. [ok]
/usr/bin/php /usr/share/drush-head/drush.php --php=/usr/bin/php --backend=2 --verbose --strict=0 --root=/home/juampy/projects/example/docroot --uri=http://default sql-sanitize --create-db --sanitize 2>&1 [notice]

Drush's sql-sync command internally dispatches the sql-sanitize command on the destination database (in this case, the development environment) to run sanitization queries. The sql-sanitize command will invoke the hook that we implemented at drush/commands/policy.drush.inc, so our custom sanitization queries will run as well. Here is the last bit of the command's output:

Initialized Drupal site example.prod at sites/default [notice]
The following post-sync operations will be done on the destination:
* Reset the full name of all users.
* Reset the full name revisions of all users.
* Reset passwords and email addresses in users table
* Truncate Drupal's sessions table
Do you really want to sanitize the current database? (y/n): y
Command dispatch complete [notice]

Here is our confirmation: e-mails, passwords, and full names were sanitized. Additionally, the sessions table was truncated, which was not strictly necessary because we are not downloading that table's data, but this is how the sql-sanitize command behaves by default. Ta-da! Now you and your team can work safely with a database that does not contain sensitive data.
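Incidentally, if you ever need to re-run these sanitization queries on a database that has already been imported locally, recent Drush releases expose the same logic as a standalone command, which also invokes our custom hook (a quick sketch, run from the project root):

$ cd /home/juampy/projects/example/docroot
$ drush --yes sql-sanitize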

Preventing e-mails from being sent

The development database is now lean and safe, thanks to the optimizations that we did in previous sections. It's now time to run some extra tasks to reconfigure the development environment's database after it has synced with production. We will start by disabling e-mail submission.

By sanitizing user e-mails as we did in the previous section, we know that our users won't get any test e-mails. However, who knows? There might be some custom code that sends e-mails manually in a way that Drupal won't catch. Here are some of the options that we have to prevent this from happening:

· If we know our codebase, we can just ignore it and let e-mails be sent to dummy e-mail addresses. Not my preference, but still an option.

· There are a few modules on Drupal.org that alter e-mail submission, such as Reroute Email (https://www.drupal.org/project/reroute_email), which redirects e-mails to a given address, or Devel (https://www.drupal.org/project/devel), which writes them to a file.

· You can also reroute all outgoing e-mail to a log at the server level by following the instructions in this article from the Lullabot blog: https://www.lullabot.com/blog/article/oh-no-my-laptop-just-sent-notifications-10000-users.

If e-mail is important for your project, then you might want to log it to a file so that it can be debugged. If it is not, then redirecting it to a dummy account such as dummy@example.com should be enough. For our example project, we will go for the simplest solution, which consists of installing Reroute Email (http://drupal.org/project/reroute_email) and redirecting all mail to dummy@example.com.

Here is our Jenkins script to sync production with development after we add a few commands to reroute e-mail submission at scripts/sync_prod_to_dev.sh:

#!/usr/bin/env bash
# Jenkins script to sync database and files from Production to Development.

cd /var/www/drupal-dev/docroot

# Sync database and files.
drush --verbose --yes sql-sync @example.prod @self --create-db --sanitize
drush --verbose --yes core-rsync @example.prod:%files @self:%files

# Disable email submission.
drush --verbose --yes pm-enable reroute_email
drush --verbose --yes variable-set reroute_email_enable 1
drush --verbose --yes variable-set reroute_email_address 'dummy@example.com'
drush --verbose --yes variable-set reroute_email_enable_message 1

In the preceding script, we installed the Reroute Email module and then set a few variables that the module uses:

· reroute_email_enable: This is a flag to activate e-mail rerouting

· reroute_email_address: This is the address designated to receive e-mails

· reroute_email_enable_message: This is a flag that, when active, adds a piece of text to the message body stating that the e-mail was rerouted and where it would have been sent otherwise

E-mail submission won't be a problem anymore in our development environment or in the local environments of our team. Here is a chance for you to take a look at the project you are currently working on. Log in to your production environment and run drush pm-list --status=enabled. Inspect this list and ask yourself: do you need to disable any of those modules in development? Do you need to tweak any of their settings? If you do, then simply add your commands at the bottom of scripts/sync_prod_to_dev.sh.
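For example, if that audit turns up a module that should be off in development and a couple of performance variables worth relaxing, the extra statements are just more lines at the end of the script (the module and variable choices below are illustrative, not part of the example project):

# Additional development-environment tweaks (appended to scripts/sync_prod_to_dev.sh).
drush --verbose --yes pm-disable securepages
drush --verbose --yes variable-set preprocess_css 0
drush --verbose --yes variable-set preprocess_js 0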

The settings.php file is also a good place to override configuration variables per environment. If you have a specific settings.php file for each environment, or a way to identify the current environment in settings.php (for example, Acquia environments have a variable called $_ENV['AH_SITE_ENVIRONMENT']), then you can override the configuration variables there.

Running post sql-sync tasks in local environments

We have come a long way. So far, we have built a workflow from production to development and provided the team with two simple commands:

· syncdb: This is a command to download a copy of the development environment's database

· syncfiles: This is a command to download the contents of the files directory from the development environment

Just as we added extra tasks after syncing production to development, we would like to automatically adjust the database configuration of a local environment once sql-sync completes. Here are some of the things we will do:

· Enable user interface modules that are disabled in production and development, such as Views UI and Field UI

· Enable development modules such as Devel, Database Logging, and Stage File Proxy

· Disable production modules such as Update, Purge, or Memcache

· Adjust environment variables to fine-tune installed modules and disable caches

The preceding list covers what I consider the most common needs of a developer. Depending on your background and the nature of the project, you might want to adjust it further and add new items.

There are several ways to implement the requirements of the preceding list. Here are some of the alternatives:

· We could write a custom command and append it to the Drush shell alias syncdb so that it runs automatically

· We could implement drush_hook_post_sql_sync() in a command file and run Drush statements when the source alias is @example.dev and the target alias is @self

· We could install the Rebuild project (https://www.drupal.org/project/rebuild) and define the preceding list in a YAML file for the command to process it

· We could use the devify command, which ships with Drupal Boilerplate and is available for us at drush/commands

In our case, we will use the devify command due to its simplicity. Drupal Boilerplate now hosts our example project, so the devify command can be found at drush/commands/build.drush.inc. Let's jump to the command line and inspect its documentation:

$ cd /home/juampy/projects/example/docroot
$ drush help devify
Configures the current database for development.

Examples:
 drush devify                                Uses command default values to set up a database for development.
 drush devify --enable-modules=xhprof,devel  Enables XHProf and Devel modules.
 drush devify --reset-variables=site_mail=local@local.com,file_temporary_path=/tmp
                                             Resets site_mail and file_temporary_path variables.

Options:
 --delete-variables   A comma separated list of variables to delete.
 --reset-variables    A comma separated list of variables to reset with the format foo=var,hey=ho.
 --disable-modules    A comma separated list of modules to disable.
 --enable-modules     A comma separated list of modules to enable.

As we can see from the preceding output, the command accepts a list of variables to delete, a list of variables to reset, a list of modules to enable, and a list of modules to disable. The command invocation would be very long if we passed all the adjustments mentioned at the start of this section as options, so we will instead define them in our Drush configuration file at drush/drushrc.php. Here is the file after we add the list of options for the devify command to use:

<?php
/**
 * @file
 * Drush configuration for Sample project.
 */

/**
 * List of tables whose *data* is skipped by the 'sql-dump' and 'sql-sync'
 * commands when the "--structure-tables-key=common" option is provided.
 */
$options['structure-tables']['common'] = array('cache', 'cache_*', 'history', 'search_*', 'sessions', 'watchdog');

/**
 * List of tables to be omitted entirely from SQL dumps made by the 'sql-dump'
 * and 'sql-sync' commands when the "--skip-tables-key=common" option is
 * provided on the command line. This is useful if your database contains
 * non-Drupal tables used by some other application or during a migration for
 * example. You may add new tables to the existing array or add a new element.
 */
$options['skip-tables']['common'] = array('migrate_*');

// Shell aliases.
$options['shell-aliases']['syncdb'] = '!drush --verbose --yes sql-sync @example.dev @self --create-db && drush devify';
$options['shell-aliases']['syncfiles'] = '--verbose --yes rsync @example.dev:%files/ @self:%files/';

/**
 * Command options for devify command.
 *
 * @see build.drush.inc
 */
$command_specific['devify'] = array(
  'enable-modules' => array(
    'dblog',
    'devel',
    'field_ui',
    'reroute_email',
    'stage_file_proxy',
    'views_ui',
  ),
  'disable-modules' => array(
    'update',
    'purge',
  ),
  'reset-variables' => array(
    // File management settings.
    'file_temporary_path' => '/tmp/',
    // Cache settings.
    'cache' => FALSE,
    'block_cache' => FALSE,
    'preprocess_css' => FALSE,
    'preprocess_js' => FALSE,
    // Stage file proxy settings.
    'stage_file_proxy_origin' => 'http://dev.example.com',
    'stage_file_proxy_origin_dir' => 'sites/default/files',
    'stage_file_proxy_hotlink' => TRUE,
  ),
);

The preceding array of settings, named $command_specific['devify'], will be used by the devify command when we run it. It will enable the given list of modules, disable a couple of others, and reset some variables.

Within the list of modules to enable is the Stage File Proxy module (https://www.drupal.org/project/stage_file_proxy). I have found this module extremely helpful while working locally on large projects with a huge amount of media files in the files directory. The module uses what it calls a proxy origin, from which it fetches files whenever Drupal can't find a file in your local files directory. This frees you from having to download files from the development environment to your local environment in order to obtain, for example, the images attached to the latest content on the website. It is a great tool because it saves you both time and hard disk space.

The Stage File Proxy module needs a few variables to be defined after it is installed in order to work. Here they are, along with the values we have given them (a quick way to verify them follows the list):

· 'stage_file_proxy_origin' => 'http://dev.example.com': This is the source to fetch images from. We are using the development environment as the proxy because its files are in sync with our local database.

· 'stage_file_proxy_origin_dir' => 'sites/default/files': This is the directory where files are served in the development environment.

· 'stage_file_proxy_hotlink' => TRUE: This setting tells Stage File Proxy not to download files to our local environment, but instead to serve them directly from the development environment through a 301 redirect. This makes pages in your local environment load faster.
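Once devify has run, a quick way to confirm that these variables took effect is to list them with variable-get, which matches variable names by prefix; the output should look something like the following:

$ drush variable-get stage_file_proxy
stage_file_proxy_hotlink: 1
stage_file_proxy_origin: 'http://dev.example.com'
stage_file_proxy_origin_dir: 'sites/default/files'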

We have also altered the Drush shell alias syncdb, which now looks like the following code:

// Shell aliases.
$options['shell-aliases']['syncdb'] = '!drush --verbose --yes sql-sync @example.dev @self --create-db && drush devify';
$options['shell-aliases']['syncfiles'] = '--verbose --yes rsync @example.dev:%files/ @self:%files/';

The shell alias now starts with !drush. This tells Drush not to prepend drush when running the shell alias, which gives us the chance to append additional commands with &&. Now, after running drush syncdb, our team will not only get a lean and safe database to work with, but will also have everything they need to work comfortably. If any customizations have to be made, they can make them in their settings.php file or even define their own Drush shell aliases in $HOME/.drush/drushrc.php.

Summary

First of all, thanks! I am so glad that you made it to this point. This chapter was hands-on training in defining a development workflow. We used a good amount of what is available in Drush core: configuration, shell aliases, commands, and site aliases. Each feature served as a piece of the final puzzle.

We started the chapter by moving our example Drupal project into Drupal Boilerplate, a default directory structure for Drupal projects. We moved all our custom Drush code (configuration, commands, and site aliases) out of Drupal and then added a small piece of code for Drush to discover the new location.

We created a Jenkins job to periodically copy the database and files from the production environment into the development environment. Then, we optimized this process as much as possible by reducing the amount of data that gets downloaded, sanitizing sensitive data, and rerouting e-mail submission.

We closed the chapter by offering a way to automate extra tasks to run in local environments after obtaining a copy of the development environment's database. Things such as enabling development modules and disabling caches can be accomplished easily with the devify command.

Thank you again for reading this book. I hope that you enjoyed reading it as much as I did writing it. My head is empty now. This was all I could share with you to help you master Drupal development with Drush. See you in the issue queues!