Running Linux, 5th Edition (2009)

Part III. Programming

Chapter 23. Transporting and Handling Email Messages

Electronic mail (email) is one of the most desirable features of a computer system. You can send and receive email on your Linux system locally between users on the host and between hosts on a network. You have to set up three classes of software to provide email service. These are the mail user agent (MUA) or mailer, the mail transport agent (MTA), and the transport protocol.

The mailer provides the user interface for displaying mail, writing new messages, and filing messages. Linux offers you many choices for mailers. They are always being improved, and a particular mailer may provide certain features, such as the ability to serve as a newsreader or as a web browser.

Mailers tend to differ in terms of their MIME support. (MIME stands for Multipurpose Internet Mail Extensions. It is really not multimedia-specific, but more a general standard for describing the contents of email messages.) Some do it better than others. It's difficult to give a recommendation here, though, since all mailers are continually moving toward better MIME support. Also, the problem is often not with the mail software, but rather with the need to register MIME types with the right viewer/handler applications in your environment.

The mailer relies on the MTA to route mail from one user to another, whether locally or across systems. The MTA in turn uses a transport protocol, usually either Unix-to-Unix Copy (UUCP, a very old protocol that was once common and has almost died out in the Western world, but is still common in regions with slow and unreliable dial-up lines) or Simple Mail Transfer Protocol (SMTP), to provide the medium for mail transfer.

There are a number of possible scenarios for using email on a Linux system, and depending on those scenarios, you will have to install a different set of software packages. However, no matter which option you choose, you will always need a mailer.

The first scenario applies to dial-up access to the Internet via an Internet service provider (ISP). In this scenario, there is often only one user on the Linux machine, although this is not a requirement. The ISP accepts your mail from the Internet and stores it for you on its hard disks. You can then retrieve the mail whenever you want by using the common Post Office Protocol (POP3) or the newer Internet Message Access Protocol (IMAP). Outgoing mail in this scenario is almost exclusively sent via the SMTP protocol, which is universally used to transport mail over the Internet.

In the easiest case, you use your mailer both to retrieve the mail via POP3 or IMAP and to send it back via SMTP. When you do this, you do not even need to set up an MTA because the mailer handles everything. This is not terribly flexible, but if all you want is to access your mail easily, this might be an option for you. Mailers that support this include KMail from KDE and Mozilla's built-in mail program (both described later).

Browser-based email clients such as Gmail or GMX are yet another story. They need to operate on a mailbox that is stored on a server; this mailbox could be filled by either POP3 or IMAP, often automatically. These days, it is quite common for browser-based email clients to use IMAP.

If you want more flexibility (which comes at the price of more configuration and maintenance work), you can move to a second scenario and install an MTA such as Postfix, described in the next section. You will need a program that transports the mail from your provider's POP3 or IMAP server to your machine. This program fetches your mail when you ask it to and passes the messages on to the MTA running on your system, which then distributes the mail to the recipients' mail folders. One program that does exactly that is fetchmail, which we cover later in this chapter. Outgoing mail is again sent via SMTP, but with an MTA running on your machine, you can choose not to send the outgoing messages directly to your provider's SMTP server, but rather to your own server, which is provided by the MTA. The MTA then forwards the mail to your provider, which in turn sends it to the recipients. With this setup, you can instruct your MTA to send outgoing mail at certain intervals so that you do not always have to make a dial-up connection.

The third scenario is meant for machines that have a permanent connection to the Internet, either because they are in a network that has a gateway with a permanent connection, or because they are using a leased line to your Internet provider. In this case, you might want to receive mail messages as soon as they arrive at your provider and not have them stored there. This also requires setting up an MTA. Incoming mail will be directed to your SMTP server (i.e., your MTA). Your provider will have to set things up accordingly for this to work.

Of course, there are many more scenarios for using mail, and mixtures between the three mentioned are possible as well. If you are going to set up a mail service for a whole network, you will most certainly want to read the Linux Network Administrator's Guide (O'Reilly) as well as a book about your MTA.

You have a number of software choices for setting up email on a Linux host. We can't describe all the available email solutions, but we do describe some packages that are often used and quite suitable for their respective tasks. Mail programs for end users, such as KMail and Evolution, have already been described in detail in previous chapters. In this chapter we document what we think are the most popular Linux advanced tools at this time: the Postfix mail transport agent and the fetchmail implementation of the POP3 and IMAP protocols. These are relatively simple to configure but provide all the features most users need. In addition, with these tools, you can cover all the scenarios described earlier.

The Postfix MTA

Several MTAs are available for Linux. Historically, the most common MTA on Unix has been sendmail, which has been around for a long time. It is generally considered somewhat more difficult to use than the alternatives, but it is thoroughly documented in the book sendmail, by Bryan Costales with Eric Allman (O'Reilly).

Postfix is a newer MTA, developed by security guru Wietse Venema as a replacement for sendmail. It's designed to be compatible with sendmail but to provide a higher level of security and be easier to configure.

Postfix is a highly flexible and secure piece of software that contains multiple layers of protection against would-be attackers. Postfix was also written with performance in mind, and employs techniques to limit slower activities such as creating new processes and accessing the filesystem. It is one of the easier email packages to configure and administer because it uses straightforward configuration files and simple lookup tables for address rewriting. It is remarkable in that it is simple to use as a basic MTA, yet still able to handle much more complicated environments.

Many Linux distributions have Postfix built in, so you may already have it installed on your system. If not, you can find prebuilt packages or compile it yourself from the source code. The Postfix home page (http://www.postfix.org) contains links to download both the source code ("Download") and packages for different Linux distributions ("Packages and Ports").

Postfix has two different release tracks: official and experimental. The experimental releases contain all the latest patches and new features, although these might change before they are included in the official release. Don't be put off by the term "experimental"; these releases are very stable and have been tested thoroughly. If you are looking for a feature that is available only in the experimental release, you should feel more than comfortable using it. Read the release notes for both tracks to know what the current differences are.

A Word About DNS

Before setting up Postfix, you should understand that if your system is going to receive mail from others across the Internet, the DNS for your domain has to be configured correctly. DNS is discussed in Chapter 13.

Let's assume for this discussion that you are configuring a host called halo in the domain example.org and that you have a user account michael on your system. Regardless of how you want to receive mail, your host halo.example.org must have a DNS A record that maps its hostname to its IP address.

In this example your email address is going to be either michael@halo.example.org or michael@example.org. If you want to use the first form, configuring the DNS A record is enough for messages to reach you.

If your system is going to receive all mail for example.org (including michael@example.org), the domain should have a DNS MX record pointing to your host halo.example.org. If you are configuring the DNS for your domain yourself, make sure you read the documentation to understand how it works; otherwise, speak to your DNS administrator or ISP about routing mail to your system.

Postfix frequently uses DNS in its normal operation, and it uses the underlying Linux libraries to perform its DNS queries. Make sure your system is configured correctly to perform DNS lookups (see "Configuring DNS" in Chapter 13). Postfix usually has to find an MX record to make its deliveries. If Postfix reports a DNS problem with an address and you then find that the domain resolves correctly, don't conclude that delivery should have succeeded. If Postfix reports a problem, you can be almost certain there is a problem.
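To see which host the mail for a domain is directed to, you can look at the MX records yourself. The following is a minimal sketch rather than anything Postfix provides: the function name best_mx is made up for illustration, and it assumes the usual output of the host utility, with lines of the form "example.org mail is handled by 10 halo.example.org." It reads that output on standard input and prints the exchanger with the lowest preference number.

```shell
# best_mx: read `host -t MX <domain>` output on standard input and print
# the mail exchanger with the lowest preference number.
# (Hypothetical helper; assumes "... mail is handled by <pref> <host>." lines.)
best_mx() {
    awk '/mail is handled by/ { print $(NF-1), $NF }' |
        sort -n |
        head -n 1 |
        awk '{ sub(/\.$/, "", $2); print $2 }'
}

# Typical use (requires network access and the host utility):
#   host -t MX example.org | best_mx
```

If the function prints nothing, the input contained no MX lines, which is worth investigating before you blame Postfix.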

Installing Postfix

Although prepackaged distributions are available, you may want to build the package yourself if you want to use any of the add-on libraries or functions that are not included in your distribution. You might also want to get the latest version to obtain a new feature that has not yet been included in your distribution.

Before you install Postfix, be aware that it includes the three commands /usr/bin/newaliases, /usr/bin/mailq, and /usr/sbin/sendmail that are normally used by sendmail. Postfix provides replacements that work with the Postfix system rather than with sendmail. You should rename your existing sendmail commands so that the Postfix installation doesn't overwrite them in case you ever want to use the original sendmail binaries again:

# mv /usr/bin/newaliases /usr/bin/newaliases.orig

# mv /usr/bin/mailq /usr/bin/mailq.orig

# mv /usr/sbin/sendmail /usr/sbin/sendmail.orig

Postfix uses Unix database files to store its alias and lookup table information. You must, therefore, have the db libraries installed on your system before building Postfix. These libraries are contained within the db-devel RPM package or the Debian libdb4.3-dev package. If you are not using a package manager, you can obtain them directly from Sleepycat Software (http://www.sleepycat.com/). If you are using RPM, execute the following command to see if the necessary libraries have been installed on your system:

# rpm -qa | grep db-devel

db-devel-4.3.27-3

You should see a line similar to the second line in the preceding command that displays the db-devel package with a version number. If rpm returns nothing, you must install the libraries before installing Postfix.

On Debian, you can use dpkg to see if the libraries are installed:

# dpkg -l libdb4.3-dev

If you download a prepackaged Postfix, use your package manager (described in Chapter 12) to install it. If you download the source postfix-2.2.5.tar.gz, move that file to a suitable directory (such as your home directory) to unpack it. The numbers in the name of the file represent the version of this release. Your file may have different numbers depending on the current release when you download it.

Follow this basic procedure to build Postfix. Note that you'll have to be the root user to create the user and group and to install the package.

1. Rename your sendmail binaries as described earlier.

2. Create a user account called postfix and a group called postdrop. See "Managing User Accounts" in Chapter 11 for information on setting up accounts and groups.

3. Run gunzip on the compressed file to produce a file named postfix-2.2.5.tar.

4. Execute

tar -xvf postfix-2.2.5.tar

to unpack the source into a directory called postfix-2.2.5.

5. Move to the directory created when you unpacked the file. You'll find a file called INSTALL with detailed instructions about building your Postfix system. In most cases, building Postfix should be as simple as typing make in the directory.

6. If your build completes without any errors, type make install to install Postfix on your system. You should be able to accept all the defaults when prompted by the installation script.
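Steps 3 and 4 can also be combined so that the compressed file is left in place. The following sketch assumes the archive is named after the directory it contains, as postfix-2.2.5.tar.gz is; the function name unpack_source is made up for illustration.

```shell
# unpack_source: unpack a gzip-compressed tar archive into the current
# directory and print the name of the directory it is expected to create.
# (Hypothetical helper; assumes the archive is named <dir>.tar.gz.)
unpack_source() {
    archive=$1
    # gunzip -c writes the uncompressed data to standard output,
    # so the original .tar.gz file is not replaced.
    gunzip -c "$archive" | tar -xf - || return 1
    basename "$archive" .tar.gz
}

# Typical use, followed by the build itself:
#   cd "$(unpack_source postfix-2.2.5.tar.gz)" && make
#   make install        # as root
```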

After installation, you will have Postfix files in the following directories:

/usr/libexec/postfix

This directory contains the various Postfix daemons. Postfix uses a split architecture in which several discrete programs handle separate tasks. The master daemon is started first. It deals with starting other programs as they are needed. For the most part, you don't need to worry about any of the programs here. Stopping and starting Postfix is handled with the postfix command found in the /usr/sbin directory.

/etc/postfix

Typically this directory contains dozens of Postfix configuration files, but only master.cf and main.cf and a few lookup tables are used by Postfix. The rest of the files are examples that document the various parameters used for configuration.

The master.cf file controls the various Postfix processes. It includes a line for each component of Postfix. The layout of the file is described by comments in the file itself. Usually, you shouldn't have to make any changes to run a simple Postfix installation.

The main.cf file is the global SMTP configuration file. It includes a list of parameters set to one or more values using the format

parameter = value

Comments are marked with a hash mark (#) at the beginning of the line. You cannot put comments on the same line as parameters. Commented lines can begin with whitespace (spaces or tabs), but they must appear on lines by themselves.

Multiple values for parameters can be separated by either commas or whitespace (including newlines), but if you want to have more than one line for a parameter, start the second and subsequent lines with whitespace. Values can refer to other parameters by preceding the parameter name with a dollar sign ($).

Here's an example of an entry that includes comments, multiple lines, and a parameter reference:

# Here are all the systems I accept mail from.
mynetworks = $myhostname
    192.168.75.0/24
    10.110.12.15
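To make the continuation rule concrete, here is a small sketch that joins the physical lines of a main.cf-style file into logical lines, using only the rules just described: comment lines and blank lines are skipped, and a line that starts with whitespace continues the previous one. The function name main_cf_logical_lines is made up for illustration, and this is not how Postfix itself reads the file.

```shell
# main_cf_logical_lines: print one line per parameter, with continuation
# lines folded into the line they continue. Reads the named files, or
# standard input if none are given. (Hypothetical helper.)
main_cf_logical_lines() {
    awk '
        # Skip comment lines (optionally indented) and blank lines.
        /^[ \t]*#/ || /^[ \t]*$/ { next }
        # A line starting with whitespace continues the previous line.
        /^[ \t]/ { sub(/^[ \t]+/, ""); line = line " " $0; next }
        # Any other line starts a new parameter; flush the previous one.
        { if (line != "") print line; line = $0 }
        END { if (line != "") print line }
    ' "$@"
}

# Typical use:
#   main_cf_logical_lines /etc/postfix/main.cf
```

Applied to an entry like the one above, with the two addresses on indented continuation lines, it prints a single line beginning with mynetworks = $myhostname followed by both addresses.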

/usr/sbin

All the Postfix commands are located in /usr/sbin and have names starting with post. There are commands to create index files, manage the mail queue, and otherwise administer your Postfix system. The postfix command, which is used to stop and start Postfix (described later), is found here.

/var/spool/postfix

The Postfix queue manager is an important component of the Postfix system that accepts incoming email messages and arranges with other Postfix components to deliver them. It maintains its files under the /var/spool/postfix directory. The queues it maintains are shown next. Postfix provides several tools to manage the queues, such as postcat, postsuper, and mailq, but you might also use the usual Linux commands, such as find and cat, to inspect your queue.

/var/spool/postfix/incoming

All incoming messages, whether from over the network or sent locally.

/var/spool/postfix/active

Messages that the queue manager is delivering or preparing to deliver.

/var/spool/postfix/deferred

Messages that could not be delivered immediately. Postfix will attempt to deliver them again.

/var/spool/postfix/corrupt

Messages that are completely unreadable or otherwise damaged and not deliverable are stored here for you to look at if necessary to figure out the problem. This queue is rarely used.

/usr/local/man

Postfix installs documentation in the form of manpages on your system. The documentation includes information on command-line utilities, daemons, and configuration files.

As mentioned earlier, Postfix also installs replacements for /usr/bin/newaliases, /usr/bin/mailq, and /usr/sbin/sendmail.
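As an example of inspecting the queue directories with ordinary commands, the following sketch counts the files in each of the four queues listed above. The function name count_queue_files is made up for illustration, and on most systems you need to be root to read the spool directory.

```shell
# count_queue_files: print the number of files in each Postfix queue
# directory, or "-" if the directory does not exist or is not visible.
# The spool location defaults to /var/spool/postfix. (Hypothetical helper.)
count_queue_files() {
    spool=${1:-/var/spool/postfix}
    for q in incoming active deferred corrupt; do
        if [ -d "$spool/$q" ]; then
            # Messages may sit in subdirectories, so search recursively.
            n=$(find "$spool/$q" -type f | wc -l | tr -d ' ')
            echo "$q $n"
        else
            echo "$q -"
        fi
    done
}

# Typical use (as root):
#   count_queue_files
```

A deferred count that keeps growing is usually the first sign of a delivery problem worth looking up in the log.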

Postfix Configuration

Before you start Postfix for the first time, you have to make sure that the aliases table is formatted correctly and that a few of the critical configuration parameters are set correctly for your system.

Historically, sendmail has used the file /etc/aliases to map one local username to another. Postfix continues the tradition. The /etc/aliases file is a plain-text file that is used as input to create an indexed database file for faster lookups of aliases on your system. There are at least two important aliases on your system that must be set in your /etc/aliases file. If you have been running sendmail on your system, these aliases are probably already set correctly, but make sure your file has entries for root and postmaster pointing to a real account that receives mail on your system. Once you have verified the aliases, execute the command newaliases to rebuild the index file in the correct format for Postfix.
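A quick way to verify the two required aliases is sketched below. The function name check_aliases is made up for illustration; the entries shown in the comment use the michael account from the earlier example and are only one possible arrangement.

```shell
# check_aliases: report whether an aliases file defines both root and
# postmaster. Returns 0 if both are present, 1 otherwise.
# (Hypothetical helper.)
#
# Entries of the following form would satisfy it:
#   postmaster: root
#   root:       michael
check_aliases() {
    file=${1:-/etc/aliases}
    status=0
    for name in root postmaster; do
        if ! grep -Eq "^${name}[[:space:]]*:" "$file"; then
            echo "missing alias: $name"
            status=1
        fi
    done
    return $status
}

# Typical use, rebuilding the index only if the check passes:
#   check_aliases /etc/aliases && newaliases
```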

The /etc/postfix/main.cf file contains many parameters, but there are just a few important ones that you should verify before starting Postfix; we explain these in this section. If you installed Postfix from a prepackaged distribution, these parameters might already be set correctly. It's also possible that the Postfix defaults work for your system, but edit your /etc/postfix/main.cf file to make sure.

myhostname

This is the fully qualified hostname for your system. By default, Postfix uses the name returned by the gethostname function. If this value is not fully qualified, and you have not set this parameter, Postfix will not start. You can check it by executing the command hostname. It's probably a good idea to specify your fully qualified hostname here explicitly:

myhostname = halo.example.org

mydomain

Specifies the domain name for this system. This value is then used as the default in other places. If you do not set it explicitly, Postfix uses the domain portion of myhostname. If you have set myhostname as shown previously and example.org is correct for your system, you do not have to set this parameter.

mydestination

Specifies a list of domain names for which this system should accept mail. In other words, you should set the value of this parameter to the domain portions of email addresses for which you want to receive mail. By default, Postfix uses the value specified in myhostname. If you are setting up your system to accept mail for your entire domain, specify the domain name itself. You can use the variables $myhostname and $mydomain as the value for this parameter:

mydestination = $myhostname $mydomain

myorigin

This parameter is used to append a domain name to messages sent locally that do not already include one. For example, if a user on your system sends a message with only the local username in the From: address, Postfix appends this value to the local name. By default, Postfix uses myhostname, but if your system is handling mail for the entire domain, you might want to specify $mydomain instead:

myorigin = $mydomain
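Putting the four parameters together, the following sketch prints the settings used in this section for a given fully qualified hostname and refuses a name without a domain part. The function name suggest_main_cf is made up for illustration, and the output is a starting point for a host that handles mail for its whole domain, not a complete main.cf.

```shell
# suggest_main_cf: print the four main.cf settings discussed above for a
# fully qualified hostname such as halo.example.org.
# (Hypothetical helper.)
suggest_main_cf() {
    fqdn=$1
    case "$fqdn" in
        *.*) ;;
        *)  echo "error: '$fqdn' is not fully qualified" >&2
            return 1 ;;
    esac
    printf 'myhostname = %s\n' "$fqdn"
    # The domain is the hostname without its first component.
    printf 'mydomain = %s\n' "${fqdn#*.}"
    # Single quotes keep the $ references literal for Postfix to expand.
    printf '%s\n' 'mydestination = $myhostname $mydomain' \
                  'myorigin = $mydomain'
}

# Typical use:
#   suggest_main_cf "$(hostname)"
```

If the command fails for the output of hostname, that is the situation in which you should set myhostname explicitly.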

Some Linux distributions that already include Postfix configure it to use Procmail by default. Procmail is a separate mail delivery agent (MDA) that can filter and sort mail as it makes deliveries to individual users on your system. We describe Procmail in more detail later in this chapter. If you need the features it provides, you should study the Procmail documentation carefully to understand how it interacts with Postfix. For many systems that don't filter mail for users at the MTA level, Procmail is an unnecessary additional layer of complexity because Postfix can also make local deliveries and provide some of the same functions. Your distribution might be configured to use Procmail in either the mailbox_command or mailbox_transport parameters. If you want Postfix to handle local deliveries directly, you can safely comment out either of these parameters in your /etc/postfix/main.cf file.

Starting Postfix

Once you have verified the important configuration parameters described earlier and rebuilt your aliases index file, you are ready to start Postfix. As the superuser, execute:

postfix start

You can stop Postfix by executing:

postfix stop

Whenever you make changes to either of Postfix's configuration files, you must reload the running Postfix image by executing:

postfix reload

Once you have Postfix running, all the users on your system should be able to send and receive email messages.

Any of your applications that depend on sendmail should still work, and you can use the sendmail command as you always did. You can pipe messages to it from within scripts and execute sendmail -q to flush the queue. The native Postfix equivalent for flushing the queue is postfix flush. Options to sendmail that deal with it running as a daemon and setting queue delays do not work because those functions are not handled by the sendmail command in Postfix. All the Postfix options are set in its two configuration files. Many parameters deal with the Postfix queue. You can find them in the manpage for qmgr(8).

Postfix Logging

After starting or reloading Postfix, you should check the log to see if Postfix reports any problems. (Most Linux distributions use /var/log/maillog, but you can also check the file /etc/syslog.conf to be sure.) You can see Postfix's most recent messages by running the command tail /var/log/maillog. Since Postfix is a long-running process, it's a good idea to check the log periodically even if you haven't been restarting it. You can execute the following to see if Postfix has reported anything interesting while running:

egrep '(reject|warning|error|fatal|panic):' /var/log/maillog
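If the log is large, a count per keyword can be easier to scan than the matching lines themselves. The sketch below uses the same five keywords as the egrep command; the function name summarize_maillog is made up for illustration.

```shell
# summarize_maillog: count log lines per keyword (reject, warning, error,
# fatal, panic). Reads the named files, or standard input if none are
# given. (Hypothetical helper.)
summarize_maillog() {
    awk '
        # match() sets RSTART and RLENGTH for the first keyword found;
        # RLENGTH - 1 drops the trailing colon.
        match($0, /(reject|warning|error|fatal|panic):/) {
            count[substr($0, RSTART, RLENGTH - 1)]++
        }
        END { for (k in count) print k, count[k] }
    ' "$@" | sort
}

# Typical use:
#   summarize_maillog /var/log/maillog
```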

In general, Postfix keeps you informed of what is going on with your system by logging lots of good information to syslogd. On Linux, syslogd uses synchronous writes by default, which means that after every write to the logfile, there is also a sync to force everything in memory to be written to the disk. Therefore, the performance of Postfix (and other processes) can suffer. You can change this default by preceding the name of the logfile with a hyphen in /etc/syslog.conf. Your entry in syslog.conf for mail logging should look like the following:

mail.* -/var/log/maillog

Be sure to have syslogd reread its configuration file after you make any changes. You can execute killall -HUP syslogd to reinitialize it.

Running Postfix on System Startup

Because of Postfix's compatibility with sendmail, if you have your system configured to start sendmail at system initialization, more than likely Postfix will start correctly when your system boots. However, system shutdown will probably not work correctly. Most Linux distributions shut down sendmail by locating a process called sendmail and then killing that process. The Postfix processes, while in many ways compatible with sendmail, do not run under the name sendmail, so this shutdown fails.

If you would like your system to shut down cleanly, you should create your own rc script for Postfix, as described in "rc Files" in Chapter 17. The commands you need to include in your script to start and stop Postfix are the same as those you execute on the command line: postfix start and postfix stop. Here's an example of a basic script to get you started. You may want to review other rc scripts on your system to see if you should add more system checks or follow other conventions and then make your adjustments to this example:

#!/bin/sh
# Basic rc script to start, stop, and reload Postfix.

# Clear the search path; every command below is either a shell builtin
# or is called by its absolute path.
PATH=""
RETVAL=0

# Refuse to run if the postfix command is missing.
if [ ! -f /usr/sbin/postfix ] ; then
    echo "Unable to locate Postfix"
    exit 1
fi

# Refuse to run without the main configuration file.
if [ ! -f /etc/postfix/main.cf ] ; then
    echo "Unable to locate Postfix configuration"
    exit 1
fi

case "$1" in
    start)
        echo -n "Starting Postfix: "
        /usr/sbin/postfix start > /dev/null 2>&1
        RETVAL=$?
        echo
        ;;
    stop)
        echo -n "Stopping Postfix: "
        /usr/sbin/postfix stop > /dev/null 2>&1
        RETVAL=$?
        echo
        ;;
    restart)
        echo -n "Restarting Postfix: "
        /usr/sbin/postfix reload > /dev/null 2>&1
        RETVAL=$?
        echo
        ;;
    *)
        echo "Usage: $0 {start|stop|restart}"
        RETVAL=1
        ;;
esac

exit $RETVAL

Place this script in /etc/rc.d/init.d or /etc/init.d, depending on your Linux distribution. Then make the appropriate symbolic links in each of the rcN.d directories for each runlevel in which Postfix should start (see "init, inittab, and rc Files" in Chapter 17). For example, if you want to have Postfix start at runlevels 3 and 5 and stop at runlevels 0 and 6, create symbolic links like those that follow for Red Hat. For Debian, the rcN.d directories are directly below /etc.

# cd /etc/rc.d/rc3.d

# ln -s ../init.d/postfix S97postfix

# cd /etc/rc.d/rc5.d

# ln -s ../init.d/postfix S97postfix

# cd /etc/rc.d/rc0.d

# ln -s ../init.d/postfix K97postfix

# cd /etc/rc.d/rc6.d

# ln -s ../init.d/postfix K97postfix
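The same links can be created in a loop. This sketch assumes the Red Hat layout, in which init.d and the rcN.d directories sit side by side under /etc/rc.d, so that the relative target ../init.d/postfix resolves from inside each rcN.d directory. The function name link_postfix_rc is made up for illustration.

```shell
# link_postfix_rc: create start links for runlevels 3 and 5 and kill
# links for runlevels 0 and 6, relative to a base directory that holds
# init.d and the rcN.d directories. (Hypothetical helper.)
link_postfix_rc() {
    base=${1:-/etc/rc.d}
    for level in 3 5; do
        ln -sf ../init.d/postfix "$base/rc$level.d/S97postfix" || return 1
    done
    for level in 0 6; do
        ln -sf ../init.d/postfix "$base/rc$level.d/K97postfix" || return 1
    done
}

# Typical use (as root); on Debian the base directory would be /etc:
#   link_postfix_rc /etc/rc.d
```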

If you create a Postfix rc script, you should configure your system not to start sendmail at startup.

Postfix Relay Control

The default installation allows any system on the same subnet as yours to relay mail through your mail server. If you want to override the default, you can set the parameter mynetworks to be a list of hosts or networks that you trust to relay mail through your system. You can specify a list of IP addresses or network/netmask patterns, and any connecting SMTP client that matches will be allowed to relay mail. You can list network or IP addresses that reside anywhere. So, for example, if you want to be able to relay mail through your home Postfix system from your work machine, you can specify the IP address of your machine at work in your home Postfix configuration.

Here's an example that allows mail from the local subnet (192.168.75.0/28) and a single host located elsewhere:

mynetworks = 192.168.75.0/28 10.150.134.15
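If you are unsure which addresses a network/netmask pattern covers, the arithmetic is easy to check in the shell. The following sketch is an illustration of CIDR matching for IPv4, not Postfix's own code, and the function names ip_to_int and in_cidr are made up. A /28 leaves four host bits, so 192.168.75.0/28 covers 192.168.75.0 through 192.168.75.15 only.

```shell
# ip_to_int: convert a dotted-quad IPv4 address to a single integer.
# The subshell keeps the changed IFS and positional parameters local.
ip_to_int() {
    ( IFS=.; set -- $1
      echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 )) )
}

# in_cidr: succeed if address $1 falls inside pattern $2, where $2 is
# either network/bits or a bare address (treated as a single host).
# (Hypothetical helpers; variables are global, as plain sh has no locals.)
in_cidr() {
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    [ "$bits" = "$2" ] && bits=32
    # Build a mask with the top $bits bits set.
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# Typical use:
#   in_cidr 192.168.75.10 192.168.75.0/28 && echo "may relay"
```

With the example above, 192.168.75.10 matches the first pattern, while 192.168.75.20 matches neither and would not be allowed to relay.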

If you want to allow relaying for mobile users who do not have static IP addresses, you have to use some kind of SMTP authentication mechanism. Postfix can work with SASL Authentication (which requires that Postfix be compiled with additional libraries, and that users' client software be specially configured) and pop-before-smtp (which requires a POP server running on the same system to first authenticate users).

It is important not to open relay access to anyone except users you trust. In the early days of the Internet, open relays were commonplace. Unfortunately, the current prevalence of spam has precluded that kind of freedom. If your MTA is not protected, you leave yourself and other Internet systems vulnerable to abuse. Spammers constantly scan for open relays, and if you place one on the network, it is only a matter of time before it will be found. Fortunately, the default Postfix installation behaves correctly. However, if you make lots of changes to your Postfix configuration (especially in setting up antispam controls, ironically), you may inadvertently open yourself up to relay abusers. There are some online antispam initiatives that offer to test if your server is configured to correctly deny relaying; try, for example, http://www.abuse.net/relay.html.

If you want your own Postfix installation to relay mail through another MTA, specify the hostname or IP address of the relay server using the relayhost parameter. Postfix normally figures out where to deliver messages on its own, based on the destination address. However, if your system is behind a firewall, for example, you may want Postfix to hand off all messages to another mail server to make the actual delivery. When you specify a relay server, Postfix normally performs a DNS query to obtain the mail exchanger (MX) address for that system. You can override this DNS lookup by putting the hostname in square brackets:

relayhost = [mail.example.org]

Additional Configurations

The configuration described here creates a simple Postfix installation to send and receive messages for users on your system. But Postfix is an extremely flexible MTA with many more configuration options, such as hosting multiple virtual domains, maintaining mailing lists, blocking spam, and scanning for viruses. The manpages, HTML files, and sample configuration files that come with Postfix contain a lot of information to guide you in the more advanced configurations.

Procmail

Being a celebrity on the Internet means that you get a lot of attention, just as celebrities do in the real world. The good news is that anyone can become a celebrity: simply join a few public mailing lists, get yourself a home page, and you are all set. The bad news is that the attention comes from spammers, who send you an enormous number of suggestions about how you can become richer, extend certain body parts, and take most of their wealth if you want to help them get it out of Iraq.

The virtual bodyguards of your mail are a couple called Procmail and SpamAssassin. Procmail is a general-purpose mail filter, while SpamAssassin is a dedicated mail filter for fighting spam and the like (worms, viruses, etc.). This section discusses Procmail, and the next section is devoted to SpamAssassin.

Procmail Concepts

To understand Procmail, we need to start by looking at how it is invoked. The usual sequence is that mail arrives at your account, and your MTA calls Procmail, passing it the message on standard input. The terms filter or rule, in many mail filtering programs, refer to both a set of conditions to check messages for and an action to perform on the messages that meet those conditions (such as putting them in a particular folder). Procmail refers to this set as a recipe, a term we will use throughout this section to describe each set of paired conditions and actions. Procmail goes through each of its recipes until one marks the mail as delivered. If no recipe blocks the mail, it is delivered in your inbox as if Procmail had never been in the picture.

Each recipe consists of two things: a set of conditions and a set of actions. The actions of a recipe are executed if all its conditions are met. In addition, a recipe may mark mail as delivered as described earlier.

The conditions may include the following:

§ The letter comes from president@whitehouse.gov.

§ The subject contains the text KimDaBa.

§ The body of the message contains the text The KDE Image Database.

§ All of the above.

The actions may include the following:

§ Reply to the sender that you are on holiday.

§ Forward the letter to another person.

§ Save the letter to a file.

§ Change some part of the letter (e.g., add a new header field, add some text to it, etc.).
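To give a first impression of what such a recipe looks like, the following sketch writes one to a scratch file. It combines the three example conditions (sender, subject, and body text) with the action of saving the message to a file. The function name write_sample_recipe and the folder name kimdaba are made up for illustration; the recipe follows the syntax documented in the procmailrc manpage, where a condition prefixed with B ?? is matched against the body rather than the header.

```shell
# write_sample_recipe: write an example Procmail recipe to the given file.
# (Hypothetical helper; the folder name "kimdaba" is an assumption.)
write_sample_recipe() {
    cat > "$1" <<'EOF'
# All three conditions must match; the action saves the message
# to the folder kimdaba, using a lockfile (the second colon).
:0:
* ^From:.*president@whitehouse\.gov
* ^Subject:.*KimDaBa
* B ?? The KDE Image Database
kimdaba
EOF
}

# Typical use, writing to a scratch file rather than your real .procmailrc:
#   write_sample_recipe /tmp/sample-recipe.rc
```

Each line starting with an asterisk is one condition, and the single line after the conditions is the action.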

Before you dig too much into the details of this section, you should ask yourself if you really want to use Procmail at all. Many mail clients allow you to sort mail, and if we take KMail as an example, then it is much easier to use than Procmail. The following is a list of reasons why you may still want to use Procmail:

§ You are using a number of different mail clients, not always the same. For example, when you are on the road you use a web mail interface, but when you are home you use a normal mail client such as KMail or mutt.

§ You want to filter your mail the second it arrives, not at a later point when your mail client downloads it—an example of this may be out-of-the-office replies.

§ The amount of mail coming to your account is so big that you want filtering to be done before mail is loaded into your client (your client may be slow at filtering mail).

Preparing Procmail for Use

Procmail comes with most modern Linux systems nowadays, but should it not be available for your system, then you should have a look at http://www.procmail.org. At this site you will also find a large collection of sample recipes.

When you have ensured that Procmail is on your system, it is time to check whether it is invoked by your MTA. The easiest way to do so is to place the following .procmailrc file in your home directory and send yourself an email.

SHELL=/bin/sh
MAILDIR=${HOME}/Mail
LOGFILE=${MAILDIR}/procmail.log
LOG="--- Logging ${LOGFILE} for ${LOGNAME}, "

If the ~/Mail directory does not exist, then you need to create it for this script to work. If you store your email elsewhere, replace ${HOME}/Mail with the alternative location. Also please check that /bin/sh exists (it's quite likely that it does); otherwise, adapt the script.

If Procmail is invoked by default, then the file just shown should give you ~/Mail/procmail.log, with content similar to the following:

--- Logging /home/test/Mail/procmail.log for test, From blackie@blackie.dk
Fri Mar 18 12:25:23 2005
Subject: Fri Mar 18 12:25:22 CET 2005
Folder: /var/spool/mail/test

If this file didn't come into existence by sending yourself an email, don't panic. All you need to do is to add the following line to the ~/.forward file:

"|IFS=' ' && exec /usr/bin/procmail || exit 75 #myid"

Replace /usr/bin/procmail with the path to your system's Procmail binary, and replace myid with your login name. (The login name is necessary to avoid problems with MTAs trying to optimize mail delivery.)

Now send yourself a mail again, and check whether it works this time. If you still do not see the file, your system's permissions may be too restrictive. Check that the .procmailrc and .forward files are readable by others and writable only by yourself. You may also need to add the x flag to the attributes of your home directory, which you can do with this command:

chmod go+x ~/

If things still do not work, then it is time to panic—or at least consult the vendor of your Linux system.
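Before giving up, it can help to confirm the permission advice above mechanically. The following is a minimal sketch, not part of Procmail itself; the function names mode_of and safe_perms are made up for this illustration. It inspects the mode string printed by ls and reports whether a file is writable by anyone other than its owner.

```shell
#!/bin/sh
# Sketch: detect files that group or others may write to.
# The function names are illustrative, not part of Procmail.

# Print the 10-character mode string (for example "-rw-r--r--") of a path.
mode_of() {
    ls -ld "$1" | cut -c1-10
}

# Succeed only if the path exists and neither group nor others can write to it.
safe_perms() {
    [ -e "$1" ] || return 1
    case "$(mode_of "$1")" in
        ?????w????|????????w?) return 1 ;;   # group- or world-writable
        *) return 0 ;;
    esac
}
```

Calling safe_perms ~/.procmailrc and safe_perms ~/.forward and checking the exit status shows quickly whether either file needs a chmod go-w.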

Setting up a sandbox

Are you ready to lose your email while playing with Procmail? If not, then it might be a good idea to create a sandbox for your tests. To do so, create a Test directory, copy your .procmailrc file into that directory, and name the copy proctest.rc. Now edit this file instead of your real .procmailrc file when testing things. In the Test directory, create this shell script:

#!/bin/sh
# The executable file named "proctest"
#
# You need a test directory.
TESTDIR=~/Test
if [ ! -d "${TESTDIR}" ] ; then
    echo "Directory ${TESTDIR} does not exist; first create it"
    exit 1
fi
# Feed the saved message to Procmail, using the test rc file.
procmail "${TESTDIR}/proctest.rc" < mail.msg

You may wish to adjust the LOGFILE line of the proctest.rc file so it doesn't write to your existing logfile, but instead writes to a logfile in the Test subdirectory. You may also want to add the following lines to get improved debugging output from Procmail:

VERBOSE=yes
LOGABSTRACT=all

Finally you are ready to run the tests, which you do by placing a mail message in the file mail.msg, and running the script proctest. Most email programs allow you to save just one email to a file. Alternatively, send yourself an email, and copy it out of your /var/spool/mail/your-login file.
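If you have no suitable message at hand, you can also write one yourself. The sketch below assumes nothing beyond a POSIX shell; the function name make_sample_mail, the addresses, and the message text are invented for testing purposes.

```shell
#!/bin/sh
# Sketch: write a small mbox-style test message to the path given as $1.
# All addresses and text are made up for the sandbox.
make_sample_mail() {
    cat > "$1" <<'EOF'
From sender@example.org Fri Mar 18 12:25:23 2005
Return-Path: <sender@example.org>
From: sender@example.org
To: test@localhost
Subject: SMS test message
Date: Fri, 18 Mar 2005 12:25:22 +0100

This is the body of the test message.
EOF
}
```

Running make_sample_mail ~/Test/mail.msg and then starting proctest from within the Test directory feeds this message to your test recipes. Edit the header lines to exercise different conditions.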

Recipe Syntax

With all the preparation done, we may now start looking at recipes. Recipes all follow this style:

:0 [flags] [ : [locallockfile] ]
<0 or more conditions (one per line)>
<exactly one action line>

Conditions start with a leading *. Everything after that character is passed on to the internal egrep literally, except for leading and trailing whitespace.

The action line may take several forms:

§ If it starts with a !, then the rest of the line is considered an email address to forward to.

§ If it starts with a |, then the rest of the line is considered a shell command to be executed.

§ If it starts with a {, then everything until the matching } is considered a nested block. Nested blocks consist of a number of recipes.

§ Anything else is considered a mailbox name.
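The four cases above amount to a dispatch on the first character of the action line. The following shell function is only an illustration of that rule (Procmail does this internally, and the name action_kind is made up here):

```shell
#!/bin/sh
# Sketch: classify a recipe's action line by its first character,
# mirroring the four cases listed above. Illustration only.
action_kind() {
    case "$1" in
        '!'*) echo "forward" ;;   # forward to an email address
        '|'*) echo "pipe" ;;      # run a shell command
        '{'*) echo "block" ;;     # start of a nested block
        *)    echo "mailbox" ;;   # anything else names a mailbox
    esac
}
```

For instance, action_kind 'kde-devel' falls through to the last case, because the line starts with none of the three special characters.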

The flags are a combination of a number of one-letter flags. The flags are described in Table 23-1 (taken from the procmailrc manpage). There is no need to read the table in detail now; instead, simply look back to it as we show examples in the following sections.

Table 23-1. Procmail flags

Flag  Function
H     Perform an extended regular expression search on the header (default).
B     Perform an extended regular expression search on the body.
D     Check against the regular expression in a case-sensitive manner (default is case-insensitive).
A     Execute the recipe only if there was a match on the most recent recipe without the A or a flag in the current block nesting level.
a     Same as A, but the preceding recipe must have completed successfully.
E     Execute the recipe only if the immediately preceding recipe was not executed. Execution of this recipe also disables any immediately following recipes with the E flag. This allows you to specify else if actions.
e     Execute the recipe only if the immediately preceding recipe was executed but did not complete successfully.
h     Send contents of header to the pipe, file, or mail destination (default).
b     Send contents of body to the pipe, file, or mail destination (default).
c     Create a copy of the mail message so it can be further processed by a later recipe, or delivered.
w     Wait for processing program and check its exit code.
W     Same as w, but suppresses any Program failure message.
i     Ignore any write errors (usually due to a closed pipe).
r     Raw mode. Do not ensure that message ends with an empty line.

Conditions are generally regular expressions found in the header or body of the email. Regular expressions are covered in Chapter 19. But some other special conditions can be used. To select them, the condition must start with one of the flags shown in Table 23-2.

Table 23-2. Procmail condition flags

Condition    Function
!            Act only if the specified condition is false.
$            Evaluate the rest of this condition according to shell substitution rules inside double quotes.
?            Use the exit code of the specified program.
<            Run recipe on messages with a total length less than the following number.
>            Run recipe on messages with a total length greater than the following number.
variable ??  Match the following text against the value of this variable, which can be an environment variable or a combination of B for body and H for header.
\            Escapes (leaves as a plain character) any of the entries in this table when it should start the line as a plain character without special meaning.

Examples

Procmail recipes are most easily understood through a number of examples, so the rest of this section will show examples of normal Procmail usage. See the manpage procmailex for more examples.

Each of the examples is a simple recipe, not a complete Procmail script, so you still need the initial content shown in "Preparing Procmail for Use."

Finally, when playing with recipes, remember that Procmail processes them in order. Thus, once a recipe marks mail as delivered, later recipes never see that mail.

Making a backup of all incoming mail

When you are playing around with Procmail, there is a risk that you might develop a recipe that throws away messages that should not have been thrown away. It is therefore a very good idea to use the following recipe, which should be the very first recipe in your Procmail setup:

:0c:
backup

The first line of this recipe has the flag c, which says that even though this recipe matches (which it always will, as the recipe has no conditions), the mail should continue on to other recipes, rather than be stopped here.

After the flag, there is a colon, indicating that the recipe should use a local lock. Thus, before the actions of this recipe can execute, the lock must first be obtained; while the action is executing, the lock will be in place.

The final part of the recipe is the text backup, which indicates that the mail will end up in the mailbox named backup. If $MAILDIR/backup is a directory, the mail will be put in a uniquely named file in that directory. (Writing the name with a trailing slash, backup/, selects the maildir format.) Alternatively, if $MAILDIR/backup is a file, the mail will be appended to that file (this is known as mbox storage).

Storing mail from a mailing list in a special mailbox

The next recipe might be what you most often do with Procmail — namely, to save mail from a mailing list into a dedicated mailbox. This is done with a recipe looking like this:

:0:
* Return-Path:.*kde-devel-bounces
kde-devel

Notice that this time we do not use the c flag, because we want mail from this mailing list to stay in the kde-devel mailbox, and not get to our inbox.

The line starting with an asterisk is the condition that must be met for this recipe to be triggered. This line is a regular expression that says that the header of the mail must contain the text Return-Path:, then any text (the regular expression .*), and then the text kde-devel-bounces. We got the idea for this regular expression by looking in an email from the mailing list. The trick is always to find a regular expression that will match any mail from the mailing list, but not match any other mail.
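When hunting for such a regular expression, it is convenient to try candidates against saved messages before putting them into a recipe. The sketch below approximates a header condition with sed and grep. It is an approximation only: Procmail uses its own regular expression engine, which differs from grep -E in some details, and the function names mail_header and header_matches are invented for this example.

```shell
#!/bin/sh
# Sketch: approximate a Procmail header condition outside Procmail.

# Print only the header of the message in file $1
# (everything up to and including the first empty line).
mail_header() {
    sed '/^$/q' "$1"
}

# Succeed if the extended regular expression $1 matches the header of
# the message in file $2, ignoring case as Procmail does by default.
header_matches() {
    mail_header "$2" | grep -E -i -q -e "$1"
}
```

Running header_matches 'Return-Path:.*kde-devel-bounces' over a few messages from the list, and over a few from elsewhere, shows whether the expression is both broad enough and narrow enough.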

Forward messages as SMS

The following recipe forwards a message with a subject starting with the text SMS to a mobile phone in the form of a text message through an imaginary email-to-SMS gateway.

:0
* < 1000
* ^Subject: SMS
! 12345678@smsgateway.com

This recipe contains two conditions: the first is that the overall size of the letter be less than 1000 bytes, and the second is that the subject should start with SMS.

The action of this recipe starts with an exclamation mark, which indicates that the message is forwarded to the address following the exclamation mark.

Sending an out-of-office reply

The final example we show is how to send an out-of-office reply. Many systems provide a program named vacation that does this in a fairly robust way, but we provide something more customizable here so you can vary the message in any way your scripting skills allow. The basic recipe looks like this:

:0c
* !^FROM_DAEMON
* !^X-Loop: your@own.mail.address
{
  SUBJECT=`formail -zx subject:`
  :0
  | (formail -r -I"Precedence: junk" \
       -A"X-Loop: your@own.mail.address" ; \
     echo "I received the mail with the subject \"$SUBJECT.\""; \
     echo "I'm out of the office and will answer it as soon as possible") | $SENDMAIL -t
}

Starting with the conditions again, this recipe sends an out-of-office reply only if (1) the mail is not from a mailer daemon, and (2) the mail does not contain the header line X-Loop: your@own.mail.address (this should, of course, be replaced with your actual email address). The first condition ensures we do not send out-of-office replies to mailing lists, and the second condition ensures we do not end up in a mail loop with someone else's out-of-office filter.

The action to take when these two conditions are met is a block of recipes. Whatever it says in between the braces is interpreted as if it were a normal Procmail script. If execution makes it to the end of the block (i.e., the mail has not yet been delivered), it will continue execution outside the block. This is, however, not the case in our setup.

The first line of the block is an assignment to the variable SUBJECT. The value comes from the standard output of the formail command. This is a binary shipped with Procmail; its purpose is either to manipulate emails or to extract parts of them.

The second part of the block is the part that does the core work. It composes an answer and mails it back to the person who originally sent you an email. Let's take it bit by bit.

First we call formail -r to create an auto-reply header from the incoming mail. That means it throws away headers that you do not want in the reply. We also hand the command-line options -I"Precedence: junk" and -A"X-Loop: your@own.mail.address" to formail. These two switches add new headers to the mail: the first sets the precedence of the mail, and the second adds the line that our condition checks for in order to avoid mail loops.

So far we have echoed the header of the reply mail to standard output. On standard output, we next print the out-of-office reply (i.e., the body part of the email). The whole mail is finally sent to sendmail. The -t option tells sendmail to look into the mail to figure out who it is meant for.
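Because the body is simply whatever gets printed to standard output, the text can come from a small shell function or script of your own instead of fixed echo commands. The following is a minimal sketch of that idea; the function name reply_body and the optional return date argument are our own invention, not something Procmail or formail provides.

```shell
#!/bin/sh
# Sketch: print the body of an out-of-office reply.
# $1 is the subject of the incoming mail; $2 (optional) is a return date.
reply_body() {
    printf 'I received the mail with the subject "%s".\n' "$1"
    if [ -n "${2:-}" ]; then
        printf 'I am out of the office until %s and will answer after that.\n' "$2"
    else
        printf 'I am out of the office and will answer it as soon as possible.\n'
    fi
}
```

Saved as a script, such a function could take the place of the echo commands in the pipe, with $SUBJECT passed as the first argument.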

Filtering Spam

The constant flood of so-called spam (more precisely, unsolicited commercial email) has decreased the usefulness of email as a communication medium considerably. Luckily, there are tools that can help us with that as well. These are called spam filters, and what they do is to attempt to categorize each incoming message according to a large number of rules to determine whether it is spam. The filters then mark up the message with either certain additional header lines or a changed subject line. It is then your task (or your mail user agent's task) to sort the messages according to these criteria into separate folders (or, quite dangerously, into the trash can directly). At the end of the day, you decide how aggressively you want to handle spam. You need to make up your mind what is more important to you: to filter out as much spam as possible, or to ensure that no important message (such as a request from a potential customer) will ever get filtered out.

There are two different ways of using a spam filter: either directly on the mail server, or in your email client. Filtering directly on the mail server is advantageous if the mail server serves more than one mail client, because then the same set of filtering rules can be applied and maintained for all users connected to this mail server, and a message coming in to several users on this server only needs to pass the spam filter once, which saves processing time. On the other hand, filtering on the client side allows you to define your own rules and tailor spam filtering completely to your own preferences.

The best-known spam filter in the Linux world (even though it is by no means Linux-dependent) is a tool called SpamAssassin. You can find lots of information about SpamAssassin at its home page, http://spamassassin.apache.org. SpamAssassin can work both on the server and on the client; we'll leave it to you to read the ample documentation available on the web site for installing SpamAssassin on a Postfix (or other) mail server.

When SpamAssassin is run on a server, the best way to use it is to let it run in client/server mode. That way, the large tables that SpamAssassin needs do not have to be reread for each message. Instead, SpamAssassin runs as a daemon process called spamd, which is accessed for each message by a frontend command called spamc.

If you want to configure your email client to use SpamAssassin, you need to pipe every incoming email through the command spamassassin (you can even use the spamc/spamd combo on the client, of course). spamassassin will accept the incoming message on standard input, analyze it, and write the changed message to standard output. Most modern mail user agents have facilities for piping all (or just some) incoming messages through an external command, so you should almost always find a way to hook up spamassassin somehow.

If SpamAssassin classifies your message as spam, it will add the header line:

X-Spam-Status: Yes

to your message. It is then up to you to configure the filters in your email client to do to this message whatever you want to be done to spam (sort into a separate folder, move directly to the trash can, etc.). If you want to do more detailed filtering, you can also look at the header line starting with:

X-Spam-Level:

This marker is followed by a number of stars; the more stars there are, the more likely the message is spam.
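SpamAssassin writes the stars into the X-Spam-Level header, so a graded decision comes down to counting them. The sketch below does this in plain shell for a message that has already been tagged. The function names, the folder names, and the thresholds of 5 and 10 stars are arbitrary examples, not SpamAssassin defaults.

```shell
#!/bin/sh
# Sketch: grade a tagged message by the stars in its X-Spam-Level header.

# Print the number of stars in the X-Spam-Level header of file $1,
# or 0 if the header is absent. Only the header part is examined.
spam_level() {
    sed '/^$/q' "$1" |
        sed -n 's/^X-Spam-Level: *\(\**\).*$/\1/p' |
        head -n 1 |
        awk '{ n = length($0) } END { print n + 0 }'
}

# Print a folder name for file $1; thresholds are example values.
spam_folder() {
    level=$(spam_level "$1")
    if [ "$level" -ge 10 ]; then
        echo "spam"
    elif [ "$level" -ge 5 ]; then
        echo "maybe-spam"
    else
        echo "inbox"
    fi
}
```

Two thresholds give you one folder for near-certain spam and another for doubtful cases that deserve a glance before deletion.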

Before we look at one email client in more detail, to sum up, you need to do two things in order to set up SpamAssassin on the client:

§ Configure your mail client to pipe each incoming message through the spamassassin command.

§ According to the header lines added by SpamAssassin, filter the message per your personal requirements.

You can even use the procmail command that we covered in the previous section to pass the email messages through spamassassin. http://wiki.apache.org/spamassassin/UsedViaProcmail has ample information about how to do this.
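As a rough idea of what such a setup looks like, the following sketch appends two recipes to a Procmail rc file of your choice (the sandbox file proctest.rc is a sensible first target). It assumes that the spamassassin command is on your PATH; the function name add_spam_recipes and the mailbox name spam are our own choices. The first recipe uses the f flag, which is not listed in Table 23-1: it tells Procmail to treat the pipe as a filter whose output replaces the message.

```shell
#!/bin/sh
# Sketch: append SpamAssassin recipes to the Procmail rc file given as $1.
# The mailbox name "spam" is an arbitrary example.
add_spam_recipes() {
    cat >> "$1" <<'EOF'

# Pass every message through SpamAssassin and keep the rewritten copy.
:0fw
| spamassassin

# File anything that SpamAssassin flagged into the "spam" mailbox.
:0:
* ^X-Spam-Status: Yes
spam
EOF
}
```

Calling add_spam_recipes ~/Test/proctest.rc lets you try the recipes in the sandbox first. Since the second recipe has no c flag, flagged mail stops there and never reaches your inbox.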

As an example of how you can set up an email client to support SpamAssassin, we will look at KMail, the KDE email client. KMail allows you to perform the steps just mentioned, of course. But it can also automate the procedure by means of the anti-spam wizard. You can invoke it from Tools → Anti-Spam Wizard. This tool first scans for the available anti-spam tools on your system (searching for a couple more than just SpamAssassin), and then lets you select those that you want KMail to use. It is not a good idea to just select all available tools here, because each additional filter slows down the processing of incoming email messages.

On the next page of the wizard, you will be given a number of options for what to do with spam. You should check at least "Classify messages using the anti-spam tools" and "Move detected messages to the selected folder." Then select a target folder for messages that are almost certainly spam, and a target folder for messages where it is a bit less certain. Once you click Finish, KMail sets up all the necessary filter rules for you, and on your next email download, you can watch the spam folders filling. Your inbox should be, if not completely spam-free, then still a lot more free from spam than previously.

SpamAssassin has a lot of functionality that we have not covered at all here. For example, it contains a Bayes filter that operates on statistical data. When a spam message that comes into the system is not marked as spam, you can teach SpamAssassin to recognize similar messages as spam in the future. Likewise, if a message is erroneously recognized as spam, you can teach SpamAssassin to not consider messages like it as spam in the future (but rather ham, as the opposite of spam is often called). Please see the SpamAssassin documentation on how to set this up.
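The training itself is done with the sa-learn command that ships with SpamAssassin. The sketch below wraps it in two small functions, on the assumption that you collect misclassified messages in mbox files; the function names and the SA_LEARN variable (which lets you substitute another command for a dry run) are illustrative.

```shell
#!/bin/sh
# Sketch: train SpamAssassin's Bayes filter from mbox folders.
# SA_LEARN can be overridden, for example with "echo", for a dry run.
SA_LEARN=${SA_LEARN:-sa-learn}

# $1 is an mbox file of spam that slipped through unmarked.
learn_spam() {
    "$SA_LEARN" --spam --mbox "$1"
}

# $1 is an mbox file of legitimate mail that was wrongly tagged as spam.
learn_ham() {
    "$SA_LEARN" --ham --mbox "$1"
}
```

With folders named, say, missed-spam and false-positives in your mail directory, you would call learn_spam and learn_ham on them from time to time and then empty the folders.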

We have now discussed a number of options that you have when setting up your email system. Our advice is to start slowly, setting up one piece at a time and making sure that everything works after each step; trying to perform the whole setup in one go can be quite challenging.