
Lobster thermidor aux crevettes with a Mornay sauce garnished with truffle paté and brandy, with a fried egg on top

One of the many nice things about this hand-tooled blog is that comment-spammers had never really thought of it as a blog. It doesn’t look like a blog—or, rather, it doesn’t look like a blog to a machine. Plus, of course, there’s the cozy nature of its readership, which allows me to eschew exclusivity while still being able to have the whole readership sit on the front porch, if the weather’s nice. Not a lot of incoming links, not a lot of awards, not a lot of spam.

Of course, there is some spam, now and then, and my host’s hand-tooled audio-vibratory molecular spam-blocking device takes care of most of it. In the whole month of February, for instance, there were seventy-four spam comments, of which twenty-five were blocked by Jed’s code, leaving less than two a day that I had to actually delete by the tedious effort of three mouse clicks. Not so bad. And February was not an outlier. January had only 50 spam comments, and December had 123. November, in fact, was the highest total ever seen on this blog, with 188.

Until now. I’m edging very close to the thousand mark for March. In fact, I think I’ll just hold on to this post until I hit a thousand, which will likely be before I finish writing it anyway. Woof.

Jed’s widget has held up pretty well, by the way, catching well over nine hundred of the phony comments without any work on my part at all. In fact, the fivefold increase in spam has not actually resulted in any necessary new work on my part. I’ve done some new work, looking at the subject lines of the comments and wondering why a spam comment meant to push wagering sites would attempt to disguise itself as a spam comment pushing porn sites, but then any work there is my own fault for bothering.

I can’t help, even knowing that these things are essentially costless and therefore need no deep and elaborate game theory to account for them, wondering why the sudden onrush. I mean, did they just figure out that this was a blog? Is this Tohu Bohu getting links here, there and everywhere? My Mad Google Skillz were able to come up with a not-particularly-spurious-looking search site that appears to now include at least some of my posts, but nothing else that looks particularly suspicious. Although, at some point somebody I don’t know quoted from one of my posts (which had been quoted by people I know), so that was kind of cool.

The other thing that happened was that in the three days or so when the spam attack was at its (so far) height, we had some really nice and interesting discussion here, on two very different topics. At least two.

... YHB was interrupted in writing this, and coming back to it after an hour and a half or so, discovered that the current tally lies at one thousand, one hundred and seven.

chazak, chazak, v’nitchazek,


I get spam comments on various of my web site's order forms, which are not anything like blogs. And the volume is increasing, though far behind yours. (Odd, because I'm light-years ahead of most people in volume of e-mail spam.) It keeps evolving, and I wouldn't be surprised if some of the spam generation is at this point evolving via neural nets and such, without programmers behind it. At which point trying to understand it from a human's perspective will become much more difficult, because it really will be alien (in the non-human sense, not the from-Mars sense -- Mars is still sending very little spam, mostly because shipping Martian hoodia is so bloody expensive).

And as of Jim Infantino's November 2004 Passim concert, there is now napkin spam. Coming soon to a napkin poetry performance near you!

The more time goes on, the more I think that the main comment-spam issue has to do with visibility. My MT blog is pinging three major blogging sites every time I post an entry. It's great for my PageRank -- a lot of my entries are on the first page of Google results for various queries -- but it attracts a whole lot of spam. I suspect that if yours were to ping those sites, you would get significantly more than you do -- and conversely, if you were to move to MT but not ping on post, I suspect you wouldn't get significantly more spam than you do now. But I could be totally wrong. Maybe I should try turning off the ping-on-post on my blog for a while and see if spam volume goes down.

It's certainly true that the spammers know what MT is and seem to focus on it. But note that my Neology blog, which doesn't ping-on-post (and which isn't nearly as widely linked to, which may be the more important factor), has gotten a total of 31 pieces of comment spam in a month and a half. Whereas my Lorem Ipsum blog is getting about 45 pieces of comment spam per hour lately. (It's usually more like 10 to 20 an hour, but volume's been very high the past few days.)

The good side is that MT has a much, much better set of spam-handling capabilities than my homebrew system (the one you use) does. The bad side is that even with that much better system, something like 20 to 50 pieces of comment spam slip through per day. I'm pretty vigilant about it, and I update the filters regularly as I see new keywords come up in the spam, but even so it's pretty bad. Also, unfortunately, the interface for MT is a little slow and a little clunky; not as bad as the nonexistent interface for my system, of course, but not as good as I had hoped.

It's also worth noting (I'm sorry if all this is repeating things that we've talked about before; it's late and I'm probably rambling) that MT lets you do things like disallow comments from non-registered users. I haven't done that yet because I find value in allowing anonymous comments, and because too many of my readers are irregular commenters who probably wouldn't bother registering. On the other hand, once a month or so I get a really viciously nasty personal attack from, apparently, some random stranger who's happened across one of my entries and feels the need to yell at me. So sometimes the registration-required option is pretty tempting.

Anyway, my point there was meant to be that you, V, have a smallish but loyal band of regular commenters, who I suspect would probably be happy to register, so you probably wouldn't lose much by going with the registration-required option. (Once you've registered, most browsers make it extremely easy to log in.)

...Of course, none of this is really meant to suggest that you should switch. It probably ain't really currently broke; I suspect that you're right in expecting that the current flood will go away soon. And luckily, my extremely primitive spam filtering system still seems generally adequate for the usual levels of spam you get here.

Registration for comments makes it appear that a comment is traceable to a particular person. That increases the perceived accuracy and utility of the military's effort to datamine for anti-American bloggers (you know who you are).

If you just want to block comment spam through registration, there's no need to require the registration id and the posted name to be related. If you want to increase the privacy and possibly safety of your commenters, all logs should be routinely and quickly deleted so that IP addresses cannot be retrieved at a much later date.

I don't post on sites requiring registration, and I don't read anything on sites requiring registration, with very rare exceptions. My browser deletes cookies regularly, so I have to log in every time I go to a cookie-based site like Salon. My browser also forgets passwords on plenty of sites, so I don't read the New York Times. Just my dos centimos.

As it turns out, the massive increase in spam has been in spam that Jed's blocker catches, so there's been no real extra work for YHB. We're nearly at 2,000 in the last week, which seems like a lot, and I suspect that most times some Gentle Reader has come to the page, the Most Recent Comments have been by Gentle Readers, not by spammers. So it's cool.

I am pleased to see that some Gentle Reader is throwing in his or her tuppence on the issue, since my inclination was to agree with Jed that my Gentle Readers wouldn't mind having some sort of Typekey or OpenID or something. Not that I'm planning on doing anything about it in the near future, but it's good to keep in mind that Unregistered and like-minded GRs would find registration unfriendly if not hostile.

As for logs, one of the nice things about being Jed's serverguest is that I have no idea what logs are being kept, being destroyed, being statistically analyzed, or being inspected in warrantless searches. I just post here.


