I was pinged on IRC earlier today by someone who was having an e-mail discussion with Charles Arthur of the Guardian, in response to this article on Six steps to stopping spam. Since I spend a lot of my day job doing anti-spam engineering for a large organisation, Robbie thought that I might have some useful comment.
I've fired an e-mail off, which I reproduce below, in the hope that it might be useful to a wider audience.
I see Robbie's introduced me. In my $dayjob I run the anti-spam engineering for a very large (200,000+ users) financial institution.
There are a few things that I'd like to comment on in your article, http://technology.guardian.co.uk/weekly/story/0,,1948268,00.html. I realise some of these are quotes from other people.
Oh, and in the spam-fighting community there's the notion of the FUSSP -- that's the Final Ultimate Solution to the Spam Problem. So called because periodically people pop up proclaiming that spam is now a solved problem, if only everybody would adopt their scheme. There's a taxonomy of these at
On "challenge-response" systems. These are a phenomenally bad idea. Primarily this is because on most spam the From: address is forged. So you receive a spam, and end up sending a challenge to the wrong person. In effect, you are spamming them with your challenges. This doesn't help.
Even if spam didn't forge addresses, by using a C-R system you are essentially out-sourcing your anti-spam operation to a myriad of third parties. Do you trust all those third parties to reliably accept the challenge? I know a number of people who will automatically junk any challenges they receive, and, similarly, a number of people who will automatically reply to any challenge they receive, irrespective of whether or not they sent the original message.
Oh, and there's also the issue of C-R deadlock. Suppose you and I both run a C-R system. I send you a message, and don't whitelist you. You send me a challenge. I send you a challenge in response to your challenge... and neither one is ever responded to.
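To make the deadlock concrete, here is a toy simulation. It is only a sketch: the `CRMailbox` class, the `exchange` helper and the addresses are all made up for illustration, and no real C-R product is being modelled. Each mailbox holds mail from unknown senders and automatically issues a challenge, which is exactly the behaviour that causes the problem.

```python
from dataclasses import dataclass, field


@dataclass
class CRMailbox:
    """Toy challenge-response mailbox (hypothetical model, not a real product)."""
    address: str
    whitelist: set = field(default_factory=set)
    held: list = field(default_factory=list)   # mail waiting on a challenge reply
    inbox: list = field(default_factory=list)  # mail a human would actually see

    def receive(self, sender, body):
        """Deliver the message, or hold it and return a challenge to send back."""
        if sender in self.whitelist:
            self.inbox.append((sender, body))
            return None
        self.held.append((sender, body))
        return f"CHALLENGE from {self.address}"


def exchange(a, b, body, max_hops=10):
    """A mails B; follow the automatic challenges and count messages sent.

    max_hops only exists to stop this toy loop from running forever.
    """
    sender, recipient, message, hops = a, b, body, 0
    while message is not None and hops < max_hops:
        message = recipient.receive(sender.address, message)
        sender, recipient = recipient, sender
        hops += 1
    return hops


me = CRMailbox("me@example.org")
you = CRMailbox("you@example.net")
print(exchange(me, you, "hello"))   # hits the hop cap
print(me.inbox, you.inbox)          # both empty: nobody ever sees a challenge
```

If I had whitelisted you before sending, your challenge would land in my inbox and a human could answer it. Without that, every message ends up in a held queue.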
On ISPs blocking outbound port 25 mail -- this is a very appropriate action to carry out. However, it needs to be carried out in conjunction with a second step -- that is, accepting mail submission on port 587. This is a standard port assignment, designed for exactly this situation. Submission on ports 25 and 587 is very similar -- the difference is that port 587 submission is *designed* to require authentication.
So the big, ISP to ISP type communication continues on port 25, and individuals, such as Philip Parker, would configure their mail programs to send mail to their university using port 587, authenticating with the university's mail servers.
This has still to catch on in a big way, although some of the larger ISPs are moving to support it (in the same way that port 25 blocking is still in relative infancy).
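For anyone who wants to see what authenticated submission looks like from the client side, here is a minimal sketch using Python's standard `smtplib`. The host name, account name and addresses are placeholders I've invented, and I'm assuming the server offers STARTTLS on port 587, which is the usual arrangement but is not guaranteed.

```python
import smtplib
import ssl
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Assemble a plain-text message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def submit(msg, host, username, password, port=587):
    """Submit a message on the submission port, authenticating first.

    Assumes the server supports STARTTLS; the login fails (and nothing is
    sent) if the credentials are rejected.
    """
    context = ssl.create_default_context()
    with smtplib.SMTP(host, port, timeout=30) as smtp:
        smtp.starttls(context=context)   # encrypt before sending credentials
        smtp.login(username, password)   # the step port 587 is designed around
        smtp.send_message(msg)


# Hypothetical usage (placeholder host and account, not real ones):
# msg = build_message("student@example.ac.uk", "friend@example.com",
#                     "Hello", "Sent via port 587.")
# submit(msg, "smtp.example.ac.uk", "student", "app-password")
```

The point is that the mail still goes to the user's own provider, just on a port where the provider knows who is sending it.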
On limiting e-mails sent per day -- your correspondents are correct that no single limit is going to be appropriate for everybody. However, two modifications make this more acceptable.
First, provide a mechanism for customers to change their limit, possibly subject to some review. Start them off with a low limit, let them increase it to some capped value if necessary.
Second, change the granularity, so that instead of messages per day it becomes messages per hour, or similar.
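Both modifications fit in a few lines. This is a sketch under assumptions of my own: the `SendLimiter` name, the starting limit of 50, the cap of 500 and the one-hour sliding window are illustrative numbers, not recommendations.

```python
import time
from collections import deque


class SendLimiter:
    """Per-customer sliding-window limit on messages sent (illustrative only)."""

    def __init__(self, limit=50, cap=500, window=3600.0):
        self.limit = limit      # current allowance per window
        self.cap = cap          # ceiling the customer can never exceed
        self.window = window    # window length in seconds (one hour here)
        self._sent = deque()    # timestamps of recently accepted messages

    def raise_limit(self, requested):
        """Customer asks for a higher limit; grant it up to the cap."""
        self.limit = min(requested, self.cap)
        return self.limit

    def allow(self, now=None):
        """Return True and record the send if the customer is under the limit."""
        now = time.time() if now is None else now
        # Forget sends that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) >= self.limit:
            return False
        self._sent.append(now)
        return True
```

A per-hour window means a legitimate burst only blocks the customer for a short while, whereas a compromised machine is throttled all day long.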
On writing a worm that kills botnets. Hopefully the legal ramifications of this are obvious, and don't need to be stated.
What is missing, though, is the notion that ISPs might hold their customers to greater account. If your contract with the ISP allowed them to levy a charge for any damage to their network or reputation caused by your failure to keep your computer botnet-free, then customers might start to realise the true cost, and invest more time and attention in keeping their systems clean.
The "Adopt IPv6" comments seem misguided. IPv6 helps solve many problems (and Marshall's comments in the article are quite correct) but it will not significantly help in the fight against spam.
Something that wasn't covered is better detection of abuse by ISPs, with more help for cleaning customers who have become infected.
Botnet traffic patterns are relatively obvious (e.g., a customer host that's gone from sending 20 e-mails a day to 20,000). When this is detected the majority of the customer's internet access should be shut down. Attempts by them to browse any web site should redirect them to a site at the ISP that explains why their access has been curtailed, and allows them to download tools that can help them clean their PC.
Obviously, this needs to be backed up with a mechanism to rapidly unlock customers who are caught because their traffic patterns have changed legitimately, but those should be few and far between.
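As a rough illustration of how simple the first-pass detection can be, here is a sketch that compares today's message count for a customer against their recent average. The function name, the multiplier of 20 and the floor of 500 messages are assumptions I've picked for the example; a real ISP would tune these against its own traffic.

```python
from statistics import mean


def looks_compromised(daily_counts, today, factor=20, floor=500):
    """Flag a customer whose outbound mail volume has jumped sharply.

    daily_counts: messages sent on each recent day (the baseline).
    today:        messages sent so far today.
    factor/floor: hypothetical thresholds, chosen only for illustration.
    """
    if not daily_counts:
        return False  # no history yet, so nothing to compare against
    baseline = max(mean(daily_counts), 1)
    # Require both a large absolute volume and a large relative jump,
    # so a quiet customer sending 60 messages one day is not flagged.
    return today >= floor and today > factor * baseline


history = [18, 22, 20, 19, 21]              # roughly 20 e-mails a day
print(looks_compromised(history, 20000))    # True
print(looks_compromised(history, 60))       # False
```

A flag like this would only be the trigger for the redirect-and-clean-up process, with the rapid unlock path as the safety net for false positives.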
Finally, to respond to your comment to Robbie:
Charles Arthur wrote:
>> thanks for your letter. Interesting, though it seems to me that
>> increasing the time taken for mail transfer from a few milliseconds
>> to several hours would have very serious effects; and you'd have to
>> be certain you were tarpitting the right IPs.
>> Most of all, it wouldn't stop spam from botnets, which are
>> individual PCs.
I assume Robbie's talking about delaying all inbound connections.
Broadly, there are three ways to do this.
If you're a public-spirited individual, and you have a lot of resources at your disposal, you may use various public lists of spamming IP addresses, and wait for connections from those IP addresses. Then you just respond to them v-e-r-y s-l-o-w-l-y.
Surprisingly, that does tend to work, at least a little, as it results in the spammers needing larger and larger botnets to send the same amount of spam. Sadly, it seems that acquiring larger and larger botnets is not proving to be too much of a problem, so this is, at best, a delaying tactic.
You may choose to adopt what's called "greylisting". Essentially, you have your servers maintain a record of each IP address/sender address pair that you see.
If you receive a connection from an IP address and sender that you've not previously heard from, you temporarily reject the message, and note that you've done so.
If you receive a connection from an IP address and sender that you've previously heard from, you accept the message (or, at least, allow it to go through to the next stage of the spam filtering).
The thinking here is that most spam-sending software, when it receives a temporary rejection, will treat that as a permanent failure, and move on.
Well-behaved, non-spam sending software will, on the other hand, hang on to the message, and try to send it to you again at some point in the near future.
This is, as you noted, introducing a deliberate delay into the mail flow, which may not be acceptable. Note, though, that you only do this once per IP/sender address combination, so it's only the first message that is ever delayed.
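The greylisting logic described above can be sketched in a few lines. The `Greylist` class is hypothetical, and the `min_delay` parameter is my own addition: it makes a retry wait a few minutes before it counts, so that software which immediately hammers the server again is still deferred. An in-memory dictionary stands in for whatever persistent store a real server would use.

```python
import time


class Greylist:
    """Minimal greylist keyed on (IP address, sender address)."""

    def __init__(self, min_delay=300.0):
        self.min_delay = min_delay   # assumed: seconds before a retry counts
        self._first_seen = {}        # (ip, sender) -> time of first attempt

    def check(self, ip, sender, now=None):
        """Return "defer" (temporary 4xx rejection) or "accept"."""
        now = time.time() if now is None else now
        key = (ip, sender.lower())
        first = self._first_seen.get(key)
        if first is None:
            # Never heard from this pair: note it and ask them to retry.
            self._first_seen[key] = now
            return "defer"
        if now - first < self.min_delay:
            return "defer"           # retried too quickly to count
        return "accept"              # passes on to the next filtering stage


g = Greylist()
print(g.check("192.0.2.10", "alice@example.com", now=0))     # defer
print(g.check("192.0.2.10", "alice@example.com", now=900))   # accept
```

Once a pair has been accepted, every later message from it goes straight through, which is why only the first message is ever delayed.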
The third way to do this is to insert delays at the protocol level, taking advantage of the fact that much spam sending software tries to get spam out as quickly as possible, and will ignore any deliberate delays you introduce. These delays, which may be of the order of a few seconds, or tens of seconds, don't introduce significant delay into the mail flow for legitimate senders, but are much more damaging to botnets.
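One common form of this is to pause before sending the SMTP greeting and see whether the client starts talking anyway. Here is a sketch of that single step, assuming an already-accepted blocking socket; the function name, the five-second default, the banner and the rejection text are all placeholders of mine, and a production server would do this without tying up a thread per connection.

```python
import select
import socket


def greet(conn, delay=5.0, banner=b"220 mail.example.org ESMTP\r\n"):
    """Delay the greeting; reject clients that speak before being greeted.

    Returns True if the client waited politely and was sent the banner,
    False if it sent data (or hung up) during the delay.
    """
    # Wait up to `delay` seconds to see whether the client sends anything.
    readable, _, _ = select.select([conn], [], [], delay)
    if readable:
        try:
            conn.sendall(b"554 5.5.1 Protocol violation\r\n")
        except OSError:
            pass  # the peer has already gone away
        return False
    conn.sendall(banner)
    return True


# Demonstration with a connected socket pair standing in for a real client.
server, client = socket.socketpair()
client.sendall(b"HELO impatient.example\r\n")   # talks before the greeting
print(greet(server, delay=0.2))                 # False
server.close()
client.close()
```

A well-behaved sender loses only the few seconds of delay per connection, while software that fires off commands without waiting gives itself away immediately.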
I hope all that's helpful. Please don't hesitate to get in touch if you've got any other spam related (or general e-mail) queries.