Spam filter effectiveness
A couple of days ago I saved up two days' worth of incoming spam, then looked it over to see how good a job Pair's spam filter was doing.
Over the course of 48 hours, I received 375 pieces of spam total, or roughly 187 a day. (Which is pretty remarkable given that less than a year ago I was still getting only about 5 pieces of spam a day.)
Pair's filter, based on SpamAssassin, correctly marked 347 of those as spam.
My own filters caught an additional 15 pieces of spam, mostly via the "everything addressed to Alex is spam" rule, which I implemented a few months back after months of verifying by hand that it was true.
Pair's filter incorrectly marked 1 non-spam message as spam, but it was a special case: a comment from my journal that quoted my previous posting about the spam filter, and so contained keywords the filter treats as signs of spam. Which means that non-spam mail about spam is likely to be falsely identified as spam, but I don't receive much of that.
In addition, during that period I received 13 pieces of spam that neither Pair's filters nor mine identified as spam. Fortunately, Pair's filters tag not-quite-spam with abbreviated information about which criteria the message did match; if, as Irilyth suggested the other day, I can customize the values that Pair assigns to each of those criteria, I should be able to get most of those items over the threshold. Some, though (mostly the plain-text spam with a few links in it), got lower spam scores than some of my regular non-spam mail, so there'll always be a few pieces of spam that Pair isn't going to catch. Still, even if I just leave the filter exactly as it is, it's correctly catching 347 of 375, or about 92.5% of all spam (about 96.5% once my own filters are added in), and the one false positive was an anomaly. (Though I did receive a couple of false positives on more ordinary messages over the previous few days.)
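For anyone who wants to check my arithmetic, here's a quick sketch in Python. The counts are the ones from this post; the variable names are just mine.

```python
# Counts from the 48-hour sample described in this post.
total_spam = 375
caught_by_pair = 347       # marked as spam by Pair's SpamAssassin-based filter
caught_by_own_rules = 15   # caught by my own filters
missed = 13                # caught by neither

# Sanity check: the three buckets should account for every piece of spam.
assert caught_by_pair + caught_by_own_rules + missed == total_spam

pair_rate = caught_by_pair / total_spam
combined_rate = (caught_by_pair + caught_by_own_rules) / total_spam
per_day = total_spam / 2   # 48 hours = 2 days

print(f"Pair alone:       {pair_rate:.1%}")
print(f"Pair + own rules: {combined_rate:.1%}")
print(f"Spam per day:     {per_day:.1f}")
```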
I meant to look through my non-spam mail from that period to see how much of it came close to triggering Pair's spam filter, but neglected to. I might be able to just lower the counts-as-spam threshold from 4 to 3, but that might increase the false-positive rate. Not sure. Further experimentation is clearly called for.
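The experiment I have in mind would look something like the sketch below. Apart from the 37.1, every score in it is made up purely for illustration (I haven't collected the real ones yet), and `evaluate` is a hypothetical helper of my own, not anything Pair or SpamAssassin provides. It assumes a message counts as spam when its score is at or above the threshold.

```python
def evaluate(threshold, spam_scores, ham_scores):
    """Return (missed_spam, false_positives) at the given threshold.

    Assumes a message is marked as spam when score >= threshold.
    """
    missed = sum(1 for s in spam_scores if s < threshold)
    false_pos = sum(1 for s in ham_scores if s >= threshold)
    return missed, false_pos


# Hypothetical scores for illustration only; not real measurements.
spam_scores = [37.1, 12.4, 6.0, 4.2, 3.5, 3.1, 1.8]
ham_scores = [0.0, 0.4, 1.2, 2.9, 3.3]

for threshold in (4, 3):
    missed, false_pos = evaluate(threshold, spam_scores, ham_scores)
    print(f"threshold {threshold}: {missed} spam missed, "
          f"{false_pos} false positives")
```

With these invented numbers, dropping the threshold catches two more pieces of spam at the cost of one false positive, which is exactly the tradeoff I'd want to measure on real mail before changing anything.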
Oh, and in case anyone cares, the highest-scoring piece of spam I got during that period scored 37.1 (where 4 is enough to mark it as spam). Impressive.