(Note as of June 2011: This document is no longer actively maintained, but hopefully remains useful to qmail users and administrators in thinking about spam prevention/reduction strategies.)
Table of Contents
- I. Introduction
- II. Who Should Be Reading This Document
- III. General Issues of Spam Prevention Policy
- IV. Specific Issues of Spam Prevention Policy
- V. Basic Things You Can do to Prevent/Reduce Spam
- VI. Commonly Held Views About Spam
- VII. Options for Individual Users
- VIII. Options for System Administrators
- IX. Other Resources
- X. History
- XI. Comments/Feedback
I. Introduction
This document discusses anti-spam philosophies from a variety of perspectives and provides information about available options for dealing with spam. If you're sick of spam and want it to go away now, you can jump straight to some practical options.
If you're interested in professional consulting services related to qmail configuration and spam prevention, consult the commercial support section of qmail.org. If you're interested in having someone speak to your organization about spam and qmail issues (or related topics), see my information about speaking requests.
II. Who Should Be Reading This Document
This document could be useful for
- UNIX system administrators
- qmail administrators
- System Use/Abuse Policy Makers
- Casual users of e-mail concerned about spam
III. General Issues
Spam is defined here as unsolicited commercial e-mail, usually sent in bulk. In other words, spam is simply electronic junk mail. Dealing with spam is, at best, a very difficult task, mostly because spammers have a wide array of tools and circumstances available to them that make it easy for them to send you mail but difficult for you to communicate back with them or with any authority over them. Spam is also difficult to deal with because it almost always arrives under the guise of a normal e-mail message. No amount of technology can automatically decide what content is undesirable to you, but there are many ways to use technology to reduce the amount of unwanted e-mail you or your users receive.
For a more thorough explanation of the issues here, I recommend reading the IETF Anti Spam Recommendations. David E. Sorkin has produced an excellent document called Technical and Legal Approaches to Unsolicited Electronic Mail. SpamHelp.org also has a wide range of useful articles on spam.
IV. Specific Issues of Policy
Anyone dealing with spam prevention will have to make definite, consequential decisions about the following issues:
- Is the prevention of spam worth the time and resources required to reach a given level of spam reduction?
- Is the prevention of spam the responsibility of a system administrator or the responsibility of the end user, or some combination of those?
- Should e-mail identified as potential spam be flatly rejected, or just tagged as spam and routed accordingly?
- Should system administrators (yours or anyone else's) who have misconfigured their systems be held responsible for any problems that result?
- Should you reject e-mail messages that are legitimate in content but that do not conform to known and accepted standards?
- Should you accept for delivery mail that does not have valid reply information (either in the envelope or From address)?
- What criteria should be met before an individual or ISP is justifiably classified as "spam-friendly"?
V. Basic Things You Can do to Prevent/Reduce Spam
- Avoid publishing your private e-mail address: Putting your e-mail address on a web page is often the fastest way to generate spam in your mailbox. If you have to publish your address, try to "spamproof it" (e.g. "chris at summersault dot com") or set up a "throw away" account that you can use for short periods of time and stop using later.
- Don't respond to spam: No matter how tempted you might be to respond to a spammer and tell them off / ask to be removed from their list, don't do it. This only serves to A) validate that your e-mail address is active and thus a good target for further messages, B) contribute to their "response rate", C) further waste your time and resources.
- Don't use a dot-qmail-default file in a lazy way: The dot-qmail-default file (e.g. '.qmail-default') makes any mailbox name on your system valid, whether or not it actually exists. While it can be useful as a catch-all for setups where there is a possibility that the sender would misspell the address, it often leads to an increased volume of spam from spammers who use made up aliases. Typical examples of this are "firstname.lastname@example.org", "email@example.com", and so on. Note that the dot-qmail-default file can actually be useful in fighting spam, if you set it up to process mail sent to invalid addresses and then reject the mail with useful error messages. If you use vpopmail, encourage your users to leave the catch-all address set to "bounce", instead of having it deliver to a regular mailbox.
- Report any spam that you do get to the sender's ISP: Services like SpamCop make this straightforward and efficient. See the resources section below for links.
- Educate: Educate your users, friends, and family about spam, why it is (or is not) worth fighting, and what they can do. Spammers are often successful in getting mail through because the typical end user doesn't understand the technical issues enough to make decisions about how to respond and fight back.
- Make sure your system is properly configured and secured: Much spam is propagated because of improperly configured mail servers (or even desktop computers). All of the hosts on your system should have proper hostnames and corresponding IP addresses. Your mailing software should be standards compliant and issue standard error and warning messages when appropriate, and you should be aware of its configuration regarding rejection of messages. Use anti-virus and firewall products where appropriate.
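The advice about making a catch-all reject mail with a useful error message can be sketched as a small shell helper. This is a minimal sketch, assuming a stock qmail install where the bouncesaying program is available to deliveries; the function name write_bounce_default and the message text are hypothetical and only for illustration.

```sh
#!/bin/sh
# Hypothetical helper: make the catch-all in DIR bounce mail for unknown
# addresses instead of delivering it to a mailbox.
# Usage: write_bounce_default DIR "message shown to the sender"
write_bounce_default() {
    dir=$1
    msg=$2
    # The .qmail command line is run by sh, so a single quote inside the
    # message would break the quoting below; refuse such messages.
    case $msg in
        *"'"*) echo "message must not contain single quotes" >&2; return 1 ;;
    esac
    # bouncesaying with no program argument always bounces with this text.
    printf "|bouncesaying '%s'\n" "$msg" > "$dir/.qmail-default"
}
```

Calling write_bounce_default "$HOME" "No such mailbox here." would leave a one-line .qmail-default that bounces every otherwise unmatched address with that text.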
VI. Commonly Held Views About Spam
The following are some commonly held views about spam. They are presented here to provide you with some perspective about the issues involved in preventing spam. As you will note, the disparity between the different views makes it even more difficult to develop a commonly acceptable solution to preventing and reducing spam.
This view holds that, because of the seemingly impossible task of accurately identifying e-mail messages as spam, and the difficulty in holding spammers responsible for their actions once they have been identified, efforts at stopping spam are a waste of time and resources, and can even result in losing legitimate mail.
People holding this view usually think that the tolerable ratio of lost legitimate e-mail to rejected spam is zero.
This view holds that, because the identification of spam is so difficult, the reduction of spam is the responsibility of the end user. This view is usually held by system administrators because of the liability issues in rejecting any e-mail on a system-wide basis. Many also think that user-level spam prevention is more effective because it allows the user to have more control over the kinds of messages that should be automatically blocked or should automatically get through. People holding this view sometimes acknowledge that, believe it or not, some people actually like receiving commercial e-mail, and tend to believe that rejecting any e-mail without obtaining the permission of the intended recipient is an invasion of privacy.
People holding this view usually think that the tolerable ratio of lost legitimate e-mail to rejected spam is a matter of personal opinion to be determined by the user affected.
This view holds that, because there are some tools available for rejecting spam at the system level, and because the prevention of spam is a worthwhile effort in the name of conserving system resources, spam prevention is the responsibility of the person or entity maintaining the mail server. This view is often held by administrative bodies in the context of reducing the amount of time their staff (both sysadmins and end-users) spend dealing with unwanted spam. It is also often held by system administrators who have a particular dislike for spammers who are, as they perceive it, misusing or abusing their systems for unwanted commercial e-mail.
This view also affects system administrators who maintain systems that are configured in such a manner that spammers can more easily use that system for spamming. The most common example of this is when a system is classified as an "open relay", allowing mail to pass through it that is neither to nor from a user on the system. Systems like this are sometimes seen as facilitating spam in a significant manner, and holders of this view believe it is the responsibility of the system administrators to alter that configuration in the name of fighting spam.
People holding this view usually think that the tolerable ratio of lost legitimate e-mail to rejected spam is acceptable at low levels. They are usually willing to accept that some legitimate e-mails will be rejected because of a misconfiguration or error on the part of another system administrator.
Within each of the above views, there are many variations. Often the variations involve decisions about the issues stated above in "Specific Issues". Some common variations:
- Some people believe that any messages from senders who have been listed in one of the various "black hole lists" (see resources section for links) should be rejected without exception. Others believe that these black hole lists are not always fair or just in their criteria for inclusion, and that depending on these lists would result in too many legitimate e-mails being rejected. Still others disagree with the use of the black hole lists because of grievances with the methodology the owners use to develop and maintain the lists.
- Some people believe that messages not conforming to known standards for mail delivery should be rejected or identified as potential spam. The most common example of this involves "From" headers or "envelope" addresses. These addresses are necessary to handle the "bouncing" of messages and for any sort of reply. For a variety of reasons, many spam messages do not have valid From or envelope headers. So, while some believe that any such message should be rejected, others hold that there are too many exceptions where these headers might be invalid but the content or intent of the message is legitimate.
VII. Options for Individual Users
This section discusses spam prevention options for individual users. It assumes that you have the access necessary to modify your personal mail setup and that you have basic knowledge of software configuration procedures. If either of the above is not true, you should ask your system administrator for assistance.
The spam-prevention options for individual users at first seem limited, given that if an e-mail has made its way to you, it has been successfully accepted for delivery by your system, and is now your responsibility for filtering. However, filtering mail (also sometimes called "rule-based delivery") can be a powerful tool (in spam-prevention and other areas).
Many modern mail reading software packages (Apple's Mail.app, and Mozilla's mail component for example) are starting to include built-in features that allow you to automatically detect and filter spam. They use a variety of techniques to achieve this detection, the most common of which is Bayesian filtering (or some variation thereof), in which you train the software to recognize undesirable content based on your personal mail reading habits.
Filtering within the mail client may be the most convenient and straightforward for the average end user because it is integrated into the other processes involved in retrieving and reading mail -- a point and click approach -- and is customizable without much technical knowledge. Filtering at this level can also be combined with other methods below for an even more spamproof mailbox.
Other Basic Filtering Tools
The most commonly used program for filtering mail under UN*X is procmail, a recipe-based set of scripts that will route, reject, forward, and modify your mail based on criteria you specify. Tools for filtering mail under other platforms (Windows, Mac, etc) are numerous, and are best chosen by considering your system setup, mail reading processes, and choice of mail client. Modern versions of Eudora, Outlook, Netscape Mail, etc. all have filtering capabilities (e.g. "If the subject line contains the word 'enlargement', put this message in the Trash.").
Third-party black-hole lists are databases of computers/mail senders on the Internet that have been identified in some way as "spam-friendly"; they're open relays, repeat offenders, innocent bystanders infected by a virus, etc. These lists are the source of much controversy because the criteria for being "listed" can vary so widely depending on the mission (and sometimes, personal preferences) of the people or organization maintaining them. Users are encouraged to find blacklists that are in line with their views about spam, and to review those choices regularly. Further, it's often preferable just to tag and filter a message that has senders in a blacklist as potential spam, rather than discard the message unseen.
A common unix program called rblcheck can be used to integrate checking against black-hole lists with your mail setup. See the rblcheck documentation, particularly the sections on setting it up with procmail and qmail, to get started. Note that if several users on your system are using this setup, you may want to ask your system administrator to install qqrbl, which does the RBL check as the message comes into the system, eliminating the need for each user to set up rblcheck.
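For readers curious what a black-hole list check does on the wire, here is a minimal sketch. It relies on the usual DNSBL convention that the octets of the sender's IPv4 address are reversed and looked up under the list's DNS zone; the function name dnsbl_query_name and the zone bl.example.org are hypothetical, and this is not rblcheck's own code.

```sh
#!/bin/sh
# Build the DNS name a black-hole list client would look up for an IPv4
# address: the four octets reversed, followed by the list's zone.
# Usage: dnsbl_query_name IPV4 ZONE
dnsbl_query_name() {
    printf '%s\n' "$1" | awk -F. -v zone="$2" '
        NF == 4 { printf "%s.%s.%s.%s.%s\n", $4, $3, $2, $1, zone; ok = 1 }
        END     { exit !ok }   # non-zero status if the input was not a dotted quad
    '
}
```

The resulting name can be handed to any DNS lookup tool. By convention, a listed address resolves to an A record in 127.0.0.0/8 and an unlisted one does not resolve at all.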
An increasingly common approach to dealing with spam is to only allow messages through that are from known "good" senders. This method exploits a spammer's assumption that they have unrestricted access to your mailbox. With a whitelist-centric strategy, an initial list of acceptable senders is established (friends, coworkers, etc.). When these senders send a message, it goes through with no problem. Unknown senders must confirm the legitimacy of their message to you before it gets through. Various aspects of this approach can be used to minimize the percentage of senders who are asked to confirm their message.
The current favored implementation of this strategy is Tagged Message Delivery Agent (TMDA).
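The first step of a whitelist-centric setup, deciding whether a sender is already known, can be sketched as follows. This is only an illustration under assumptions: the function name is_whitelisted and the one-address-per-line file format are hypothetical, and TMDA itself does far more (including sending the confirmation requests).

```sh
#!/bin/sh
# Return 0 if SENDER appears in the whitelist file (one address per line,
# compared case-insensitively as a whole line), non-zero otherwise.
# Usage: is_whitelisted SENDER WHITELIST_FILE
is_whitelisted() {
    sender=$1
    list=$2
    [ -n "$sender" ] && [ -r "$list" ] || return 1
    # -F: fixed string, -x: whole line, -i: ignore case, -q: status only
    grep -F -x -i -q -- "$sender" "$list"
}
```

Mail from a sender for whom this returns non-zero would be held pending confirmation rather than discarded.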
As mentioned above, another increasingly popular technique is Bayesian filtering (or some variation thereof), in which you train the software to recognize undesirable content based on your personal mail reading habits. This is appealing because you're no longer constrained in your filtering technique by what some other person thinks is or isn't spam, and your actual filtering is based on statistical precedent (i.e. knowledge about what you've considered legitimate and spam in the past), instead of abstract guessing about what might be spam.
There are a number of filters available that use the Bayesian technique; Paul Graham, author of the articles "A Plan for Spam" and "Better Bayesian Filtering", has a great list of options, many of which were inspired by his clarity on the matter.
Bogofilter is one of these that can easily integrate with qmail via a user's dot-qmail file, or more complex recipes (e.g. using procmail). Chris Wilkes said: "It's simple to use, just put something like this into your .qmail files:
| condredirect myname-spam /usr/local/bin/bogofilter -u -2
./Maildir/
which says to send your email through bogofilter and update your database files accordingly. If the email is spam it exits with a return code of 0, meaning that condredirect will bounce it over to myname-spam, which is my spam folder. If it isn't spam it continues on through the .qmail file, in this case being delivered to ./Maildir/"
As you can probably tell by now, there are a lot of options for detecting and filtering spam. Many of the options require detailed technical knowledge, or the willingness to spend lots of time tweaking and refining the methods employed so that they stay current with known spamming strategies, available blacklists, etc.
A few options exist that try to combine and balance all of these methods, and remove the technical complexities that are often barriers for end users.
- SpamAssassin uses a wide range of heuristic tests on mail headers and body text to identify spam. Once identified, the mail can then be optionally tagged as spam for later filtering using the user's own mail user-agent application.
- SpamBouncer, an extensive set of recipes for procmail designed for the novice procmail user. To use SpamBouncer, just follow the instructions provided with the software. After SpamBouncer is installed, you can modify the lists of good and bad senders to meet your needs.
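Once mail has been tagged, acting on the tag can be simple. The sketch below assumes messages have already passed through SpamAssassin, which marks spam with an "X-Spam-Flag: YES" header; the function name is_tagged_spam is hypothetical.

```sh
#!/bin/sh
# Exit 0 if the message on stdin carries "X-Spam-Flag: YES" in its header,
# non-zero otherwise. Only the header (everything before the first blank
# line) is examined, so body text that quotes the header does not count.
is_tagged_spam() {
    awk '
        /^$/ { exit }                                     # end of header
        tolower($0) ~ /^x-spam-flag:[ \t]*yes/ { found = 1; exit }
        END  { exit !found }
    '
}
```

Saved as a standalone script, a check like this could be the program argument to condredirect in a .qmail file, since condredirect forwards the message when the program exits 0 and otherwise lets delivery continue with the following lines.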
VIII. Options for qmail Administrators
This section discusses options for system administrators who want to implement anti-spam mechanisms at the system-wide level. Please note that you should resolve the specific issues listed above before implementing any of these solutions, and that you should always notify your users of any changes to the system that affect the mail they do or don't receive.
It is becoming common for the default installation of many Unix operating systems like FreeBSD and Linux to include a mechanism to block network traffic based on certain criteria, commonly referred to as "host-based access control" and commonly implemented using the tcp_wrappers package. In some of these installations, network traffic from hostnames that do not map to valid IP addresses is blocked. While not an e-mail specific measure, this is one way to cut down on e-mail from hosts that have misconfigured their DNS, and therefore are thought by some to be more likely to be spam-friendly.
If you're using inetd, an example line in a FreeBSD /etc/hosts.allow is here:
ALL : PARANOID : RFC931 20 : deny
One can also achieve this using the ucspi-tcp package's tcpserver (now the recommended alternative to inetd), by enabling the "-p" option, for paranoid, e.g. in /service/qmail-smtpd/run, you might have:
#!/bin/sh
QMAILDUID=`id -u qmaild`
NOFILESGID=`id -g qmaild`
exec softlimit -m 3000000 \
tcpserver -v -p -x /etc/qmail/tcp.smtp.cdb \
-u $QMAILDUID -g $NOFILESGID 0 smtp \
sh -c 'test -z "$TCPREMOTEHOST" && echo "451 bad reverse DNS" || exec /var/qmail/bin/qmail-smtpd' 2>&1
This basically tells tcpserver to remove the environment variable "TCPREMOTEHOST" if it can't resolve the reverse DNS, and then not to run qmail-smtpd if TCPREMOTEHOST isn't populated. (Thanks to Mike Jimenez for noting that only using tcpserver -p isn't enough, and to Gerrit Pape for suggesting the above code snippet. Thanks to Jerry Amundson for updating it to include a useful error message for the connecting SMTP server.)
David Crawshaw notes that using $TCPREMOTEHOST in a shell script is not secure, and points us to rhost-check, his 10-line C program that takes care of that issue. If you use his program, the resulting run file might look like this:
#!/bin/sh
QMAILDUID=`id -u qmaild`
NOFILESGID=`id -g qmaild`
exec softlimit -m 3000000 \
tcpserver -v -p -x /etc/qmail/tcp.smtp.cdb \
-u $QMAILDUID -g $NOFILESGID 0 smtp \
rhost-check rblsmtpd qmail-smtpd 2>&1
The ucspi-tcp package includes rblsmtpd, a program that runs in front of the usual qmail-smtpd and blocks mail from hosts listed in an RBL; it works with any SMTP server that runs under tcpserver. (If you want to "flag" instead of "reject", see the variations section below. I've found qqrbl to be a great solution for ISPs and web hosting companies.)
So, if you follow the Life With qmail Installation guide, and then update your supervise scripts accordingly, your /var/qmail/supervise/qmail-smtpd/run script looks something like this:
#!/bin/sh
QMAILDUID=`id -u qmaild`
NOFILESGID=`id -g qmaild`
MAXSMTPD=`cat /var/qmail/control/concurrencyincoming`
exec /usr/local/bin/softlimit -m 2000000 \
/usr/local/bin/tcpserver -v -p -x /etc/tcp.smtp.cdb -c "$MAXSMTPD" \
-u "$QMAILDUID" -g "$NOFILESGID" 0 smtp \
/usr/local/bin/rblsmtpd /var/qmail/bin/qmail-smtpd 2>&1
Note that the above is an updated version of the call to rblsmtpd; the previous version was only correct for older versions of daemontools, and is now deprecated. The upgrade is worth it. Also note that I can't always keep the above syntax up to date with the recommended version in Life With qmail - check there for the latest and greatest.
If you're still using inetd (which isn't recommended for qmail), you can patch qmail to do about the same thing.
If you want to use other databases in addition to the RBL, you can modify your rblsmtpd configuration with the "-r" option to do so.
rblsmtpd -r bl.spamcop.net -r blackholes.mail-abuse.org
Mike Silbersack says "If you wish to use the *.mail-abuse.org black-hole lists, you'll have to apply the patch to make rblsmtpd work with A records."
Ask Bjørn Hansen wrote qpsmtpd, an SMTP server written in Perl with filtering tools.
Hermes Antispam Proxy is an SMTP proxy that supports banner delay, throttling, greylisting, and DNSBL lookups.
Greylisting is a spam control method that works by returning a temporary SMTP error to the first delivery attempt. Most spam is sent from bulk mailers which don't retry, so these are blocked. qgreylist is one qmail greylisting tool. Greylite is another that combines natively with qmail and works as a proxy for any SMTP server.
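The greylisting decision itself can be sketched in a few lines of shell. This is a toy illustration under stated assumptions, not how qgreylist or Greylite are implemented: the function name greylist_check, the one-file-per-triplet state directory, and the 300-second default delay are all hypothetical.

```sh
#!/bin/sh
# Toy greylist decision. State is one file per (IP, sender, recipient)
# triplet under GREYDIR, holding the time the triplet was first seen.
# Prints an SMTP-style verdict: 451 (try again later) or 250 (accept).
# Usage: greylist_check GREYDIR IP SENDER RECIPIENT NOW_EPOCH [MIN_DELAY_SECS]
greylist_check() {
    dir=$1 ip=$2 from=$3 to=$4 now=$5 delay=${6:-300}
    # Checksum the triplet so raw addresses never end up in a file name.
    key=$(printf '%s\n%s\n%s\n' "$ip" "$from" "$to" | cksum | awk '{ print $1 "-" $2 }')
    state="$dir/$key"
    if [ ! -f "$state" ]; then
        echo "$now" > "$state"    # first attempt: remember it and defer
        echo 451
        return 0
    fi
    first=$(cat "$state")
    if [ $(( $now - $first )) -ge "$delay" ]; then
        echo 250                  # sender retried after the delay: accept
    else
        echo 451                  # sender retried too soon: keep deferring
    fi
}
```

The current time is passed in as an argument so the behavior is easy to follow: a first attempt at time 1000 is deferred, a retry at 1100 is still deferred, and a retry at 1300 is accepted. A real deployment would also need to expire old state files.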
This area is a little blurry right now; it is the author's hope that readers will contribute their experiences here to improve the recommended options.
There are several patches out there that claim to make qmail reject messages with bad envelopes or From headers (i.e. if the envelope is blank or if the hostname in it doesn't have a valid DNS entry) or otherwise deal with suspected SPAM mail. Note that most of these patches are not featured on the qmail site and are therefore assumed to be "nonstandard".
- Nagy Balazs wrote a patch (mfcheck) to ensure that the domain name on the envelope sender is a valid DNS name. This ensures that you do not receive email which you cannot bounce, should that prove necessary.
- Erwin Hoffmann has written a patch for qmail-smtpd called SPAMCONTROL which improves qmail's filtering abilities and makes it RFC 2505 compliant. He and Noel Mistula also produced some scripts for filtering attachments and subject lines (something you can also do with procmail).
- The folks at flame.org wrote another patch that performs various header checks and bounce/flagging functionality
- Will Harris wrote a patch that allows you to use a new control file to specify Perl regular expressions to be used when checking the validity of the envelope sender.
- qregex lets you match the envelope sender against a regex and accept or reject the mail accordingly.
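To illustrate the general idea behind regex-based envelope checks, here is a minimal sketch. It does not reproduce the control-file format or code of any patch listed above; the function name sender_is_blocked and the pattern file layout (one extended regular expression per line) are assumptions made for the example.

```sh
#!/bin/sh
# Return 0 if the envelope sender matches any extended regular expression
# in PATTERN_FILE (one pattern per line), non-zero otherwise.
# Usage: sender_is_blocked ENVELOPE_SENDER PATTERN_FILE
sender_is_blocked() {
    sender=$1
    patterns=$2
    [ -s "$patterns" ] || return 1    # missing or empty file: block nothing
    # -E: extended regex, -i: ignore case, -q: status only, -f: read patterns
    printf '%s\n' "$sender" | grep -E -i -q -f "$patterns"
}
```

Note that an empty line in the pattern file matches every sender, so such a file should be kept free of blank lines.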
It should also be noted here that messages with recipient addresses in the form "user%hosta@hostb" are not going to be relayed through your system unless you have misconfigured something. See the qmail.faqts page on this issue for further details.
LinuxMagic also has their magic-smtpd daemon, which does valid user checking at the SMTP level, as well as allowing for other integrated anti-spam methods.
There are a variety of ways to make it difficult for your users to create spam. This is an important effort; while most of this document focuses on avoiding incoming spam, don't forget that a lot of incoming spam is generated because of overly lax mail sending policies.
- Chris Johnson has written a patch for qmail called tarpit. Tarpitting is "the practice of inserting a small sleep in an SMTP session for each RCPT TO after some set number of RCPT TOs." This discourages a user from using a given system as a relay.
- Jonathan McDowell has written an X-Spam-Warning header patch that adds warning headers for messages from senders in ORBS, RSS, RBL and DUL without the use of any external programs. This is useful if you want to allow your users to decide how to handle SPAM while tagging it as such at the system level.
- Jay Soffian has written qqrbl, a script that also adds warning headers, but uses the existing QMAILQUEUE patch instead of patching qmail source itself.
- Chris Johnson wrote a patch to log attempted relay attempts
- Dale Woolridge, James Law, and Moto Kawasaki have created spam throttle, a qmail-smtpd patch which inserts a sleep after the DATA command when a client's throughput is too high.
- Russell Nelson has a patch to reject relay probes. These relay probes have '!', '%' and '@' in the local (username) part of the address. (Note that rejecting these probes may get your mail server listed on a few RBL lists, which are listed as resources elsewhere in this document.)
- SpamAssassin is a flexible, extendible spam filtering system that works with a variety of mail systems.
- Blackhole is another anti-spam/virus scanning package that works with qmail
- spamdyke is a filter for monitoring and intercepting SMTP connections between a remote host and a qmail server
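The relay-probe test mentioned in the list above is easy to express on its own. The following is a minimal sketch of the check as described (a '!', '%' or '@' inside the local part), not Russell Nelson's patch; the function name is_relay_probe is hypothetical.

```sh
#!/bin/sh
# Return 0 if the local part of ADDRESS (everything before the last "@")
# contains "!", "%" or "@", the characters seen in relay probes.
# Usage: is_relay_probe ADDRESS
is_relay_probe() {
    addr=$1
    case $addr in
        *@*) local_part=${addr%@*} ;;   # strip the final @domain
        *)   local_part=$addr ;;
    esac
    case $local_part in
        *[%@!]*) return 0 ;;            # "!" is not first, so it is literal
        *)       return 1 ;;
    esac
}
```

With this check, an address such as user%hosta@hostb is flagged as a probe, while an ordinary address such as alice@example.org is not.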
IX. Other Resources
Real-time Third-Party Blocking Solutions
I've previously had a list of known blacklist/blocking list services here, but they are constantly out of date; for a more complete and accurate list of these, check out Jeff Makey's comparison page or browse openrbl.org's list.
Third Party Spam Reporting Services
General Related Mail and System Tools
- qmail (anti-spam section)
- RBLCheck script
- Tagged Message Delivery Agent
- magic-smtpd daemon
Writings on Spam
- Graham, Paul, "A Plan for Spam" (August 2002), "Better Bayesian Filtering" (January 2003)
- Sorkin, David E. Technical and Legal Approaches to Unsolicited Electronic Mail (April 2001) PDF
- Kinnard, Shannon, Marketing With Email : A Spam-Free Guide (December 1999)
- Loshin, Pete et al., Essential E-Mail Standards : Rfcs and Protocols Made Practical (November, 1999)
- Mulligan, Geoff, Removing the Spam : Email Processing and Filtering (April 1999)
- Schwartz, Alan et al., Stopping Spam (October 1998)
- Wyman, Carolyn, Spam: A Biography (July 1999) 🙂
Anti-Spam Manifestos and Organizations
- The IETF Anti Spam Recommendations
- Fight Spam on the Internet! (from abuse.net)
- The Coalition Against Unsolicited Commercial Email
- SpamCon Law Foundation Center (formerly the SueSpammers project)
X. History
- January 20, 2008 - v0.44; added greylite and hermes links
- September 1, 2006 - v0.43; added magic-smtpd, qgreylist links
- August 10, 2005 - v0.42; updated various links, added rhost-check info, stopped trying to maintain list of RBLs, general minor cleanup
- March 26, 2003 - v0.41; added bogofilter syntax
- February 16, 2003 - v0.40; major updates to user options, basic techniques, list of resources, and more
- June 2, 2002 - minor updates
- March 29, 2002 - v0.38; added SpamAssassin links
- March 14, 2002 - v0.37; fixed syntax of paranoid mode for tcpserver
- February 15, 2002 - v0.36; freshened up some links, added some "Variations" in sys-level options, removed NetMind links
- January 5, 2002 - v0.35, added and freshened links, updated sys-level options, added some clarifications based on user feedback
- July 17, 2001 - v0.33, updated references to the various realtime blocking services, as they're changing quite a bit now that ORBS is gone
- April 28, 2001 - v0.31, added some new misc links
- April 21, 2001 - version 0.30; added a new section of resources for administrators, updated some links, updated daemontools syntax to be current, added misc links
- January 31, 2001 - minor update regarding -r syntax for rblsmtpd
- December 16, 2000 - added section numbering; removed links to contributor's e-mail addresses (too much potential irony); added some new patches and software; expanded user section to include non-Unix-centric perspectives; other minor fixes and updates
- October 7, 2000 - added IETF recommendations, submitted by Robert Dalton
- October 5, 2000 - updated some links, reviewed content
- July 14, 2000 - added rblsmtpd syntax that includes logging; added more spam-related resources.
- July 3, 2000 - freshened up some links
- May 26, 2000 - updated rblsmtpd ORBS syntax with corrections provided by Konstantin Riabitsev
- April 30, 2000 - added new links
- April 4, 2000 - version 0.2, implemented some suggested changes from Dave Sill
- April 3, 2000 - Implemented suggested changes from Mikko Hanninen, Peter van Dijk
- April 3, 2000 - version 0.1 created by Chris Hardie
XI. Comments/Feedback
If you have specific suggestions for changes, corrections, and additions to this HOWTO, please send them to Chris Hardie and they will be integrated into the document as appropriate. If you're interested in professional consulting services related to qmail configuration and spam prevention, consult the commercial support section of qmail.org.