on the edge

computers & technology, books & writing, civilisation & society, cars & stuff


Greg Black

gjb at gbch dot net


If you’re not living life on the edge, you’re taking up too much space.








Worthy organisations

Amnesty International Australia — global defenders of human rights



Médecins Sans Frontières — help us save lives around the world



Electronic Frontiers Australia — protecting and promoting on-line civil liberties in Australia







Software resources


GNU Emacs


blosxom


The FreeBSD Project

Tue, 23 Nov 2004

Progress with SpamAssassin

I’ve been using SpamAssassin for two years now, but I only recently switched to a version with Bayesian code, mainly because installing the required version of Perl was more trouble than it was worth until the mail machine was upgraded. The results with SpamAssassin 2.63 have been pleasantly surprising and have convinced me of the value of the Bayesian approach.

With versions 2.20 and 2.44 (each with dozens of custom rules), things had deteriorated to the point where SpamAssassin was detecting less than 50% of the spam addressed to my inbox. In the first week with 2.63, that improved to 88%; after two weeks, it was 93%; and now, after seven weeks, it’s over 96%. So, instead of getting 150 or more spam messages in my inbox each day, I’m now down to 12 or 13. That’s still more than I’d like, but it’s low enough that email is once again useful to me. And it’s done without any custom rules.
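The numbers above hang together if you work them through. Here is a rough sketch in Python, assuming a steady flow of about 300 spam messages a day. That total is an inferred round figure (150 or so missed at roughly 50% detection), not something measured, and the function name is made up for illustration.

```python
def missed_per_day(total_spam: float, detection_rate: float) -> float:
    """Spam messages that still reach the inbox each day."""
    return total_spam * (1.0 - detection_rate)


# Assumption: roughly 300 spam messages arrive per day (inferred, not measured).
DAILY_SPAM = 300

# Detection rates quoted in the post; "about 50%" stands in for "less than 50%".
stages = [
    ("2.20/2.44, custom rules", 0.50),
    ("2.63, first week", 0.88),
    ("2.63, two weeks", 0.93),
    ("2.63, seven weeks", 0.96),
]

for label, rate in stages:
    print(f"{label}: about {missed_per_day(DAILY_SPAM, rate):.0f} missed per day")
```

At 96% the sketch gives about 12 a day, which matches what I see in the inbox.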

There is still a downside with the way I’m handling it. Because I’m using qmail, I can’t make it refuse the email during the SMTP transaction; and I certainly can’t set it up to bounce the stuff I want to reject because of the prevalence of forged sender addresses in spam. This means I have to take a few minutes each day to manually scan the sender and subject of all the suspect messages to make sure that there aren’t any false positives lurking in there.

However, I plan to put that behind me fairly soon. I’m going to switch from SpamAssassin to Dspam (to get the performance of a C library instead of Perl code) and I’m going to patch qmail to reject mail during the SMTP transaction when the spam assessment is positive or when the recipient address is either a spam trap or a non-existent address. The SMTP failure notice will point people at a web page that explains the particular reason for the failure and, for legitimate senders, gives instructions for sneaking past the filters.

Of course, people with some Microsoft-based systems won’t ever see the failure reasons but will instead be confused by ridiculous made-up reasons inserted by utterly broken software. I suppose I’ll just have to put an easy-to-find page up that explains that, but I’m not all that fussed if Microsoft-using people can’t email me. My customers are not allowed to use Microsoft software and they know how to contact me; my family can always ring me; people who use my free software know how to contact me; if I lose a tiny amount of email, I think I can live with that.
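The rejection rules described above boil down to a small decision function. What follows is only a sketch of that logic in Python, not the qmail patch itself (which would be written in C). The URL, the addresses and the name `smtp_verdict` are all hypothetical; the 550 reply code and the 5.1.1 and 5.7.1 enhanced status codes are the standard SMTP ones for permanent failures.

```python
from typing import Tuple

# Hypothetical values, for illustration only.
HELP_URL = "http://example.invalid/rejected"
SPAM_TRAPS = {"trap@example.invalid"}
VALID_RECIPIENTS = {"me@example.invalid"}


def smtp_verdict(recipient: str, is_spam: bool) -> Tuple[int, str]:
    """Return the SMTP reply (code, text) for one recipient of one message."""
    rcpt = recipient.strip().lower()
    if rcpt in SPAM_TRAPS:
        return 550, f"5.7.1 Mail refused, see {HELP_URL}#trap"
    if rcpt not in VALID_RECIPIENTS:
        return 550, f"5.1.1 No such address, see {HELP_URL}#unknown"
    if is_spam:
        return 550, f"5.7.1 Message looks like spam, see {HELP_URL}#spam"
    return 250, "2.0.0 OK"


if __name__ == "__main__":
    samples = [
        ("trap@example.invalid", False),
        ("nobody@example.invalid", False),
        ("me@example.invalid", True),
        ("me@example.invalid", False),
    ]
    for rcpt, spam in samples:
        code, text = smtp_verdict(rcpt, spam)
        print(rcpt, "->", code, text)
```

In a real SMTP session the two address checks could run as soon as the recipient is named, while the spam verdict has to wait until the message body has arrived; the sketch folds both into one call to keep it short.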