Integrate Spamassassin into Postfix/Dovecot

As I stated before, I really like Christoph Haas’ ISPMail setup for Debian-based mailservers. I got by without any server-side spam filtering until now, but the amount of spam in my inboxes kept growing, so I went looking for a decent and simple way to filter out all that bullshit which distracts me day after day.

I clearly wanted to go with Spamassassin (SA), as I have had good experiences with it in the past and it’s more or less the standard spam filter on Linux-based mailservers. The most common ways to integrate SA into a Postfix-based mailserver are the following:

  • Using amavisd-new
  • Using Postfix’ content_filter

I don’t really like either of them. Amavis is quite heavy for pure spam filtering, and the content filter checks both incoming and outgoing mail by default, which is obviously not what I want. Amavis avoids checking outgoing mail just by checking whether the sender domain is managed by the same system, but spammers can bypass this quite easily by faking the sender’s address to be the same as the recipient’s (which is done quite often). There’s a discussion about this on the ISPMail page, so head there for more information. All this can be improved by using multiple Postfix instances and different ports (e.g. using 587/submission for authenticated clients and 25/smtp for normal SMTP traffic), but I want my mailserver to be as interoperable as possible without the need for any special setup on the client side.

So I was looking for another solution. I read some tutorials where people used procmail in user scripts to pass incoming mail to spamc before delivering it to the mailbox. I like this approach, as the MTA isn’t involved in the spam filtering process, outgoing mail isn’t touched and you don’t need any complicated setup on the MTA side. All alias and transport definitions work fine and the final mail is checked right before being delivered to the user’s inbox.

First I thought about Sieve, which is already running through Dovecot’s Sieve implementation, until I noticed that Sieve is not able to call any external programs (correct me if I’m wrong). Then I had a look at spamc and Postfix’ master.cf. spamc is capable of piping its output to another program, and in the ISPMail setup Postfix passes the mail directly to Dovecot’s deliver, so why not just let Spamassassin check the mail right before it gets passed to Dovecot? I gave it a try and it seems to work fine. I still need some automation for training the SA databases (might follow in a later post), but the plain SA checking is working reliably and mails can easily be filtered with Sieve afterwards.

So much for the backstory, let’s get our hands dirty. Note: I’m running Debian Lenny.

Installing and configuring Spamassassin

First of all, let’s install our magic little helpers.

$ aptitude install spamassassin pyzor razor

DCC

Additionally, I want to use DCC, which is not in the Lenny repositories. First, let’s create a user for DCC.

$ groupadd dcc
$ useradd -g dcc -s /bin/false -d /var/dcc dcc

Then download and build it manually (you might need some additional Debian packages like build-essential). Just ignore the sendmail warnings during configure. The DCC version may vary.

$ mkdir ~/build
$ cd ~/build
$ wget http://www.dcc-servers.net/dcc/source/dcc-dccproc.tar.Z
$ tar xzvf dcc-dccproc.tar.Z
$ cd dcc-dccproc-1.3.116
$ ./configure --with-uid=dcc
$ make
$ make install
$ chown -R dcc:dcc /var/dcc
$ ln -s /var/dcc/libexec/dccifd /usr/local/bin/dccifd

Configuring Spamassassin

We make use of spamd to let SA run as a daemon. To do this, we need a user for SA.

$ groupadd spamd
$ useradd -g spamd -s /bin/false -d /var/lib/spamassassin spamd

Then, edit /etc/default/spamassassin to look like the following listing (changed ENABLED to 1, added SAHOME and edited OPTIONS). The virtual-config-dir allows us to have separate user preferences and bayes databases for each virtual user. An improvement would be to save this data directly to the virtual user’s “home” directory in /var/vmail, but for now I’ve left it like this. The bayes database for user@example.org would therefore be stored in /var/lib/spamassassin/users/example.org/user/.

[...]

# Spamassassin home
SAHOME="/var/lib/spamassassin"

# Change to one to enable spamd
ENABLED=1

# Options
# See man spamd for possible options. The -d option is automatically added.

# SpamAssassin uses a preforking model, so be careful! You need to
# make sure --max-children is not set to anything higher than 5,
# unless you know what you're doing.

OPTIONS="--create-prefs -x --max-children 3 --username spamd --helper-home-dir ${SAHOME} -s ${SAHOME}/spamd.log --virtual-config-dir=${SAHOME}/users/%d/%l"

[...]

The home directory:

$ mkdir -p /var/lib/spamassassin/users
$ chown -R spamd:spamd /var/lib/spamassassin
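To make the %d/%l mapping of the virtual-config-dir a bit more tangible, here is a small sketch that computes the directory spamd should end up using for a given address. The function name sa_user_dir is made up for this example, and it assumes the SAHOME and path layout from the listing above.

```shell
#!/bin/sh
# Map a mail address to the per-user SpamAssassin directory implied by
# --virtual-config-dir=${SAHOME}/users/%d/%l
# (%d = domain part, %l = local part of the user name sent by spamc).
# sa_user_dir is a hypothetical helper, not part of SpamAssassin.
SAHOME="${SAHOME:-/var/lib/spamassassin}"

sa_user_dir() {
    address=$1
    case $address in
        ?*@?*) ;;   # need something on both sides of the @
        *) echo "not a mail address: $address" >&2; return 1 ;;
    esac
    local_part=${address%@*}    # everything before the last @
    domain=${address##*@}       # everything after the last @
    printf '%s/users/%s/%s\n' "$SAHOME" "$domain" "$local_part"
}
```

Calling sa_user_dir user@example.org prints the directory where the bayes files and user_prefs of that user should show up once spamd has processed a mail for them.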

Now, let’s do some configuring. You can find all relevant config files in /etc/spamassassin. First of all, local.cf:

[...]

#   Save spam messages as a message/rfc822 MIME attachment instead of
#   modifying the original message (0: off, 2: use text/plain instead)
#
report_safe 0

[...]

use_dcc 1
dcc_path /usr/local/bin/dccproc

use_pyzor 1
pyzor_path /usr/bin/pyzor

use_razor2 1
razor_config /etc/razor/razor-agent.conf

Afterwards, edit v310.pre and check that the DCC, Razor and Pyzor plugins are enabled (DCC is disabled by default).
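If you don’t want to eyeball the file, a quick check could look like the following sketch. It only looks for uncommented loadplugin lines for the three plugins; the function name check_sa_plugins is made up, and the default path assumes the Debian layout used in this post.

```shell
#!/bin/sh
# Report which of the three network-check plugins have an active
# (uncommented) loadplugin line in a SpamAssassin .pre file.
# check_sa_plugins is a hypothetical helper for this post.
PRE_FILE="${PRE_FILE:-/etc/spamassassin/v310.pre}"

check_sa_plugins() {
    file=${1:-$PRE_FILE}
    rc=0
    for plugin in DCC Pyzor Razor2; do
        if grep -Eq "^[[:space:]]*loadplugin[[:space:]]+Mail::SpamAssassin::Plugin::${plugin}([[:space:]]|\$)" "$file"; then
            echo "$plugin: enabled"
        else
            echo "$plugin: NOT enabled"
            rc=1
        fi
    done
    return $rc
}
```

Run check_sa_plugins without arguments to test the default file. A non-zero exit status means at least one of the plugins is still commented out or missing.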

You can check your Spamassassin configuration with lint (add the -D flag for debug output):

$ spamassassin --lint

And you can update SA’s rules with sa-update:

$ sa-update --no-gpg
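If you want to run the rule update from cron, something like the following sketch might do. It relies on sa-update’s documented exit codes (0: new rules installed, 1: no fresh updates, anything else: some kind of error) and only restarts spamd when new rules actually arrived. The function name and the two override variables are made up for this example.

```shell
#!/bin/sh
# Cron-friendly rule update: restart spamd only when sa-update
# really installed new rules.
# SA_UPDATE and SPAMD_INIT are hypothetical override hooks so the
# logic can be dry-run without touching the real system.
SA_UPDATE="${SA_UPDATE:-sa-update}"
SPAMD_INIT="${SPAMD_INIT:-/etc/init.d/spamassassin}"

update_sa_rules() {
    rc=0
    "$SA_UPDATE" --no-gpg || rc=$?
    case $rc in
        0)  # new rules were downloaded and installed
            "$SPAMD_INIT" restart ;;
        1)  # no fresh updates available, nothing to do
            return 0 ;;
        *)  echo "sa-update failed with exit code $rc" >&2
            return "$rc" ;;
    esac
}
```

Put the function into a script that calls update_sa_rules at the end and run it from root’s crontab at whatever interval suits you.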

Now we are ready to start the SA daemon (will be automatically started at boot time):

$ /etc/init.d/spamassassin start

Configure Postfix

This step is quite easy. First, edit /etc/postfix/master.cf, copy the existing dovecot transport and edit it to look as follows (you can change the name ;)):

dovecot-spamass   unix  -       n       n       -       -       pipe
    flags=DRhu user=vmail:vmail argv=/usr/bin/spamc -u ${recipient} -e /usr/lib/dovecot/deliver -d ${recipient}

This transport will pass the mail to spamc which will pass it to deliver after checking.

Now just change the transport in /etc/postfix/main.cf. I used a second transport to stay flexible: in case anything goes wrong with Spamassassin, you just need to change the transport back in main.cf and you get your mails without the SA step.

virtual_transport = dovecot-spamass

That’s all you need on the Postfix side. Just restart it to use the new transport.

$ /etc/init.d/postfix restart
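The fallback idea (switching back to the plain transport if SA misbehaves) can be scripted with postconf, roughly like this. This is only a sketch: it assumes the original transport in master.cf is called dovecot, as in the ISPMail setup, and the function name set_virtual_transport is made up.

```shell
#!/bin/sh
# Switch virtual_transport between the plain and the SA-checked
# transport. The two names are the ones assumed in this post; adjust
# them to whatever your master.cf defines.
set_virtual_transport() {
    case $1 in
        dovecot|dovecot-spamass) ;;
        *)  echo "usage: set_virtual_transport dovecot|dovecot-spamass" >&2
            return 2 ;;
    esac
    # postconf -e rewrites the setting in main.cf, reload makes
    # Postfix pick it up.
    postconf -e "virtual_transport = $1" && postfix reload
}
```

So set_virtual_transport dovecot takes SA out of the delivery path, and set_virtual_transport dovecot-spamass puts it back in.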

Finishing touches

Now let’s do some testing. Send yourself a mail and take a look at the mail headers. There should be quite a lot of new mail headers which get injected by Spamassassin (you can configure the headers in Spamassassin’s configuration). A normal non-spam mail could look like this:

X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on mail.example.com
X-Spam-Level: *
X-Spam-Status: No, score=1.2 required=5.0 tests=ALL_TRUSTED,AWL,
    TVD_SPACE_RATIO autolearn=no version=3.2.5

Now cross your fingers and hope that a friendly spammer sends you new crap or just use the GTUBE test (include that line in your mail). SA should recognize it as junk and give detailed information about it in its headers:

X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on mail.example.com
X-Spam-Level: **************************************************
X-Spam-Status: Yes, score=1000.0 required=5.0 tests=ALL_TRUSTED,AWL,GTUBE
    autolearn=no version=3.2.5
X-Spam-Report: =?ISO-8859-1?Q?
    * 1000 GTUBE BODY: Test zur Pr=fcfung von Anti-Spam-Software
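If you don’t want to paste the GTUBE line into your mail client, a small helper like the following can build a test message for you. The function name and the default addresses are placeholders; the only thing that matters is the GTUBE string in the body.

```shell
#!/bin/sh
# Print a minimal test message whose body contains the GTUBE string.
# make_gtube_message and the default addresses are placeholders.
make_gtube_message() {
    from=${1:-sender@example.com}
    to=${2:-user@example.org}
    printf 'From: %s\n' "$from"
    printf 'To: %s\n' "$to"
    printf 'Subject: GTUBE test\n'
    printf '\n'
    printf '%s\n' 'XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X'
}
```

Pipe the output into spamc -u user@example.org to see the rewritten message directly, or into sendmail user@example.org to send it through the whole delivery path including the new transport.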

Another example, this time for a real spam mail:

X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on mail.example.com
X-Spam-Level: *******************
X-Spam-Status: Yes, score=19.6 required=5.0 tests=BAD_ENC_HEADER,DCC_CHECK,
    DIGEST_MULTIPLE,HELO_LOCALHOST,HTML_MESSAGE,PYZOR_CHECK,
    RAZOR2_CF_RANGE_51_100,RAZOR2_CF_RANGE_E8_51_100,RAZOR2_CHECK,RDNS_NONE,
    URIBL_RHS_DOB,URIBL_SBL,URIBL_WS_SURBL autolearn=spam version=3.2.5
X-Spam-Report: =?ISO-8859-1?Q?
    *  0.9 URIBL_RHS_DOB Contains an URI of a new domain (Day Old Bread)
    *      [URIs: spammersdomain.com]
    *  2.1 URIBL_WS_SURBL Enth=e4lt URL in WS-Liste (www.surbl.org)
    *      [URIs: spammersdomain.com]
    *  4.5 HELO_LOCALHOST HELO_LOCALHOST
    *  2.9 BAD_ENC_HEADER Message has bad MIME encoding in the header
    *  0.0 HTML_MESSAGE BODY: Nachricht enth=e4lt HTML
    *  1.5 RAZOR2_CF_RANGE_E8_51_100 Razor2 gives engine 8 confidence level
    *      above 50%
    *      [cf:  58]
    *  0.5 RAZOR2_CHECK Gelistet im "Razor2"-System (http://razor.sf.net/)
    *  0.5 RAZOR2_CF_RANGE_51_100 Razor2 Spam-Bewertung liegt zwischen 51 und
    *      100
    *      [cf:  58]
    *  2.8 PYZOR_CHECK Gelistet im Pyzor-System (http://pyzor.sf.net/)
    *  1.4 DCC_CHECK Gelistet im DCC-System
    *      (http://rhyolite.com/anti-spam/dcc/)
    *  2.5 URIBL_SBL Enth=e4lt URL in SBL-Liste (http://www.spamhaus.org/sbl/)
    *      [URIs: spammersdomain.com]
    *  0.0 DIGEST_MULTIPLE Mehrere Internettests (Razor, DCC, Pyzor, etc.)
    *      treffen zu
    *  0.1 RDNS_NONE Delivered to trusted network by a host with no rDNS?=

If you are wondering about the autolearn flag in the mail headers: SA triggers autolearning only once the score passes a certain threshold. For more information see Spamassassin Wiki/AutolearningNotWorking.
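To see how a given mail relates to those thresholds, it helps to pull the score and the autolearn flag out of the X-Spam-Status header. The following is just a sketch with a made-up function name; as far as I know, the thresholds themselves are controlled by the bayes_auto_learn_threshold_spam and bayes_auto_learn_threshold_nonspam settings, so check the SA documentation for your version.

```shell
#!/bin/sh
# Read one mail on stdin and print "<score> <autolearn>" taken from
# its X-Spam-Status header ("? ?" if the header is missing).
# spam_status_summary is a hypothetical helper for this post.
spam_status_summary() {
    awk '
        /^$/ { exit }                    # end of headers, skip to END
        /^[ \t]/ {                       # folded continuation line
            if (in_status) status = status $0
            next
        }
        { in_status = 0 }                # any new header ends the old one
        /^X-Spam-Status:/ { in_status = 1; status = $0 }
        END {
            score = "?"; learn = "?"
            n = split(status, parts, /[ \t,]+/)
            for (i = 1; i <= n; i++) {
                if (parts[i] ~ /^score=/)     score = substr(parts[i], 7)
                if (parts[i] ~ /^autolearn=/) learn = substr(parts[i], 11)
            }
            print score, learn
        }'
}
```

Feed it a single message file from the mailbox, e.g. spam_status_summary < some-mail-file, and compare the printed score with your thresholds.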

Filtering

Using SA’s mail headers, you can easily filter your mails. A simple Sieve filter could look like this:

require "fileinto";

if header :contains "X-Spam-Flag" ["YES"] {
  fileinto "Spam";
  stop;
}

If you are using Thunderbird, you can even configure it to filter mail using SA’s headers (in your account settings); however, this shifts the filtering a bit too far to the client side for my taste.

Per-user preferences

As we use SA’s virtual-config-dir, we have a separate bayes db for each virtual user and can additionally specify user preferences (like blacklists) on a per-user basis. To make use of this, create a file called user_prefs in the virtual user directory for SA and add individual rules there.

Example: add a blacklist entry for jane@example.com that applies to all mail received for joe@example.com (assuming joe has received mail before, so that spamd has already created the needed directories):

$ echo "blacklist_from jane@example.com" > /var/lib/spamassassin/users/example.com/joe/user_prefs
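Note that the echo with > replaces any existing user_prefs. If you want to add entries over time, a small helper like the following sketch appends instead and skips duplicates. The function name and the SA_USERS_DIR variable are made up; the path layout is the one from the virtual-config-dir option (domain first, then local part).

```shell
#!/bin/sh
# Append a blacklist_from line to a virtual user's user_prefs,
# creating the directory if needed and skipping duplicates.
# add_blacklist_entry and SA_USERS_DIR are hypothetical names.
SA_USERS_DIR="${SA_USERS_DIR:-/var/lib/spamassassin/users}"

add_blacklist_entry() {
    user=$1      # mailbox owner, e.g. joe@example.com
    sender=$2    # address to blacklist, e.g. jane@example.com
    case $user in
        ?*@?*) ;;
        *) echo "not a mail address: $user" >&2; return 1 ;;
    esac
    dir="$SA_USERS_DIR/${user##*@}/${user%@*}"
    mkdir -p "$dir" || return 1
    prefs="$dir/user_prefs"
    # Nothing to do if the exact line is already there.
    if [ -f "$prefs" ] && grep -qxF "blacklist_from $sender" "$prefs"; then
        return 0
    fi
    printf 'blacklist_from %s\n' "$sender" >> "$prefs"
}
```

If you run this as root, remember that spamd runs as the spamd user, so hand the files over afterwards (chown -R spamd:spamd on the users directory) or run the helper as spamd right away.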

You can try adding yourself to your blacklist and sending yourself a mail. If all is working fine, the mail should be classified as spam. Example:

X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on mail.example.com
X-Spam-Level: **************************************************
X-Spam-Status: Yes, score=101.4 required=5.0 tests=ALL_TRUSTED,AWL,
    TVD_SPACE_RATIO,USER_IN_BLACKLIST autolearn=no version=3.2.5
X-Spam-Report: =?ISO-8859-1?Q?
    *  100 USER_IN_BLACKLIST From: address is in the user's black-list
[...]

Conclusion

This setup seems quite straightforward and lightweight to me, but it still needs some improvements like automated bayes database training (take a look at the first source) or moving the users’ preferences to the database. However, the stock Spamassassin installation does its work quite reliably and is working fine at the moment.

Sources