Unsolicited Commercial E-mail, commonly called “spam,”[1] is not a security problem. Or is it? I chose to discuss it in this month’s [2] letter for 3 reasons. 1) The tactics of spammers are even more insidious and resourceful than those of many attackers. 2) Many spam messages are sent with real but “stolen” e-mail addresses in the From: header (identity theft, authentication issues). 3) I’ve been dealing with it myself and am writing a class about dealing with it, so it has been on my mind. I assume everyone reading this knows spam when he or she sees it. I will give some background and then look at 3 methods used for countering spam, then at combination systems, and finally at what I do at avolio.com.
Spammers “harvest” e-mail addresses from web pages, chat rooms, USENET, and bulletin board postings: anywhere they can find addresses on the Internet. This is why you probably get spam but your grandmother never does. They sell the lists to marketers, some reputable but naïve (why else would firms in China send me information about valves and pipe fittings?) and some disreputable. Spammers make use of open e-mail relays, systems that accept e-mail from anywhere to anyone. They also make use of “free” e-mail accounts or legitimate accounts to send tens of thousands of spams through their ISP’s relays until the ISP shuts them down.
We could define an RBL (Realtime Blackhole List) as “A list of servers which send out spam or are known to be open relays.”[3] The “list provider” provides a list of known spammers or open relay sites. A subscriber may then choose to reject connections from any site on that list. Rejecting this way does not differentiate between spam and non-spam. I believe that this is the main problem with RBLs. An IP address of a large ISP could be on such a list because someone believes the ISP is not as aggressive as it should be with spammers. But if a server knows that a connecting client appears on an RBL, it can use that information as part of rating whether a particular message might be spam.
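The last point, using a listing as one input to a rating rather than as grounds for outright rejection, can be sketched as follows. This is only an illustration under stated assumptions: the listed networks are made-up documentation addresses, and the names `LISTED_NETWORKS`, `is_listed`, `rbl_score`, and the weight of 2.0 are hypothetical, not taken from any real list provider or product.

```python
import ipaddress

# Hypothetical list entries; a real list would come from the list provider.
LISTED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.17/32"),
]


def is_listed(client_ip):
    """Return True if the connecting client's address falls in a listed network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in LISTED_NETWORKS)


def rbl_score(client_ip, weight=2.0):
    """Return points to add to a message's spam rating instead of rejecting it.

    The weight is an arbitrary illustrative value.
    """
    return weight if is_listed(client_ip) else 0.0
```

A server using this approach would add `rbl_score(client_ip)` to whatever other evidence it has about the message, so a listing alone does not cost a legitimate sender at a large ISP his or her mail.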
This is very similar to using RBLs, except it is usually manual and sometimes less broad than using RBLs. For example, one can decide to reject connections from “real-big-noses.com.” (It was really hard to keep this tasteful and not hit an already established domain name for an example.) You wouldn’t do this, necessarily, if you were an ISP, but you might if you run your own small e-mail server (as I do), or if you were certain that there could be no business reason for someone at real-big-noses.com to send e-mail to someone in your organization.
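A hand-maintained list like this can be as small as a set of domain names. The sketch below assumes the check is made against a sender address or a client host name; the names `BLOCKED_DOMAINS`, `domain_of`, and `is_blocked` are hypothetical and do not come from any particular mail server.

```python
# Hand-maintained list; the single entry is the made-up domain from the text.
BLOCKED_DOMAINS = {"real-big-noses.com"}


def domain_of(sender):
    """Extract the domain from an address, or pass a bare host name through."""
    return sender.rsplit("@", 1)[-1].strip().lower().rstrip(".")


def is_blocked(sender):
    """Return True if the domain, or any parent domain of it, is on the list."""
    domain = domain_of(sender)
    return any(
        domain == blocked or domain.endswith("." + blocked)
        for blocked in BLOCKED_DOMAINS
    )
```

Matching on a leading dot means that mail.real-big-noses.com is caught, while an unrelated domain that merely ends in the same letters is not.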
One can use patterns to look for certain key words in the headers or body of the e-mail message. For example, if, as your server collects the e-mail, the system sees “Subject: Earn Big Money With E-mail!!!” you can set it to drop the connection, not collecting the rest of the e-mail. Or, you can quarantine e-mail if, for example, the subject says, “REQUEST FOR URGENT BUSINESS.” It *could* be the Nigerian Scam[4], but it could be someone having urgent business.
You do have to be careful to get the regular expressions right. I set up one I thought would be fairly thorough in rejecting e-mail that was an offer to increase the size of my … nose, we’ll say. (I’d get one a day, and my nose is big enough, thank you, as you can tell from my photo on my web page.) It did stop such spam as well as two legitimate messages having nothing to do with my nose. You have to get it right.
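Here is a minimal sketch of pattern rules like the ones described above, with one action per rule. The rule table, the action names, and the function `classify` are hypothetical. The `LOOSE` and `TIGHT` patterns show one plausible way an expression can over-match; this is an illustration, not a reconstruction of the expression I actually used.

```python
import re

# Each rule pairs a pattern with an action. Anchoring at the start of a
# line (re.MULTILINE) keeps a rule from firing on text quoted elsewhere.
RULES = [
    (re.compile(r"^Subject:\s*Earn Big Money With E-mail",
                re.IGNORECASE | re.MULTILINE), "reject"),
    (re.compile(r"^Subject:\s*REQUEST FOR URGENT BUSINESS",
                re.IGNORECASE | re.MULTILINE), "quarantine"),
]


def classify(headers):
    """Return the action of the first matching rule, or "accept"."""
    for pattern, action in RULES:
        if pattern.search(headers):
            return action
    return "accept"


# A loose pattern matches inside other words; word boundaries tighten it.
LOOSE = re.compile(r"nose", re.IGNORECASE)
TIGHT = re.compile(r"\bnose\b", re.IGNORECASE)
```

With these definitions, `LOOSE` matches the innocent sentence "Can you diagnose the server problem?" because "diagnose" contains "nose", while `TIGHT` does not. Testing a new pattern against a folder of saved legitimate mail before turning it on is a cheap way to catch this kind of mistake.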
Some systems use statistical analysis to check the “spamminess” of a message. Sure, it is easy to decide that anything with “larger nose and nostrils” is spam. But spammers learn to avoid those words in various ways (as mentioned above). Statistical analysis can allow a system to “learn” characteristics of spam and “ham” (non-spam e-mail).
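To make “learning” concrete, here is a toy word-frequency classifier in the naive Bayes style. It is a sketch of the general idea only and is not how any particular product works; the class name `TinyBayes`, the tokenizer, the add-one smoothing, and the clamp value are all assumptions chosen for brevity.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Split text into lowercase words (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


class TinyBayes:
    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def learn(self, text, label):
        """Record the words of one message known to be "spam" or "ham"."""
        self.counts[label].update(tokens(text))
        self.messages[label] += 1

    def spam_probability(self, text):
        """Estimate how spam-like a message is, from 0.0 to 1.0."""
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        # Start from the ratio of spam to ham messages seen so far.
        log_odds = math.log(
            (self.messages["spam"] + 1) / (self.messages["ham"] + 1)
        )
        for word in tokens(text):
            for label, sign in (("spam", 1), ("ham", -1)):
                total = sum(self.counts[label].values())
                # Add-one smoothing so unseen words never give zero.
                p = (self.counts[label][word] + 1) / (total + len(vocab) + 1)
                log_odds += sign * math.log(p)
        # Clamp so the exponential below cannot overflow on long messages.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1 / (1 + math.exp(-log_odds))
```

After calling `learn` on a few examples of each kind, `spam_probability` returns a value above 0.5 for messages that share more words with the spam examples and below 0.5 for messages that look more like the ham. Because the system keeps counting, it adapts when spammers change their vocabulary, which a fixed pattern cannot do.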
There are commercial servers that provide content-filtering, including spam protection, using a combination of many of these methods. In January 2003, I wrote a review of five such “e-mail firewalls” for Information Security Magazine (www.infosecuritymag.com/2003/feb/gatewayguardians.shtml).
There are also services that screen e-mail for spam. The service will analyze e-mail routed through it and tag e-mail as spam. Examples include Brightmail (www.brightmail.com) — which my ISP uses — and Postini (www.postini.com).
A friend uses a paid service called Spamarrest (spamarrest.com). The first time someone sends him an e-mail, the sender receives a reply with a URL pointing to a verification screen containing a word obscured by lines. It asks the sender to type the word he sees. It is verifying that the sender is a person. If the sender does it correctly, it adds his address to an “allow” list for mail to that individual.
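The flow just described can be sketched in a few lines. This illustrates the general challenge-and-response idea and says nothing about how Spamarrest itself is built; the class `ChallengeGate`, its method names, and the use of a random hex string as the word are all hypothetical.

```python
import secrets


class ChallengeGate:
    def __init__(self):
        self.allowed = set()   # senders who have proven they are people
        self.pending = {}      # sender -> word he or she must type back

    def receive(self, sender):
        """Decide what to do with mail from this sender."""
        sender = sender.lower()
        if sender in self.allowed:
            return "deliver"
        if sender not in self.pending:
            # A real service would show this word as an obscured image.
            self.pending[sender] = secrets.token_hex(3)
        return "challenge"

    def verify(self, sender, typed_word):
        """Add the sender to the allow list if the typed word is correct."""
        sender = sender.lower()
        expected = self.pending.get(sender)
        if expected is None or typed_word != expected:
            return False
        del self.pending[sender]
        self.allowed.add(sender)
        return True
```

A first message from a new address gets "challenge"; once `verify` succeeds, later mail from the same address gets "deliver" without further questions.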
I never see rejected e-mail, so I have to be careful about this. Right now I reject all e-mail connections from ISPs in China (see *Wired* article www.wired.com/news/politics/0,1283,50455-2,00.html). I want to reject anything I am certain is spam, so it never makes it to my system at all, or is never delivered for me to download. (I am not interested in spam solutions that only work on my PC.)
I also use Spamassassin (spamassassin.org) for its statistical analysis and ability to “learn.” I configured it to analyze suspected spam and tag it with the subject “[Maybe Spam]”, and I set up my e-mail client to put mail with “Maybe Spam” in a folder for review at a later time. Finally, e-mail then comes from my domain server to my local ISP, which further sanitizes it through Brightmail.
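The client-side half of that arrangement is just a rule keyed on the subject tag. As a rough sketch, assuming the tag text “[Maybe Spam]” and two hypothetical folder names, the same rule could be written like this (my e-mail client does the equivalent through its own filter settings, not through this code):

```python
from email import message_from_string

TAG = "[Maybe Spam]"  # the subject tag added upstream


def choose_folder(raw_message):
    """Return the folder for a message: "Maybe Spam" if tagged, else "Inbox"."""
    msg = message_from_string(raw_message)
    subject = msg.get("Subject", "")
    return "Maybe Spam" if TAG in subject else "Inbox"
```

Because tagged mail is filed rather than deleted, a false positive costs only the time it takes to look through the review folder.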
I am happy with the results. I have not yet “taught” Spamassassin, but will do so by loading it with any spam it misses and a whole bunch of “ham.” Just like security, this cannot be 100%. But it is getting closer.
I’m available for consultations in the area of e-mail configurations and spam control.
I wrote a security review of SSL-based VPNs for Aventail (www.aventail.com). It is at www.avolio.com/papers/SSLVPN_SecWP.pdf. I wrote DNS 101 (www.avolio.com/columns/DNS101.html) for WatchGuard. My May 2003 “Just the Basics” column in Information Security Magazine is “A Firewall for All Occasions” and is at www.infosecuritymag.com/2003/may/justthebasics.shtml.
[1] SPAM, in all capital letters, is a trademark of Hormel. UCE (a term that never caught on the way “spam” did) is called spam probably in reference to the “Monty Python’s Flying Circus” sketch in which “SPAM” (yes, referring to the Hormel product) gets mentioned over and over again until it seems like there is no end to it. SPAM, the product, is quite tasty and is still a big seller for Hormel. Spam, the UCE, is not. You can find more on the web. You know what to do.
[2] I use the term “month” so you won’t notice it is not always monthly.
[3] theory.whirlycott.com/~phil/antispam/rbl-bad/rbl-bad.html
[4] www.ustreas.gov/usss/financial_crimes.shtml#Nigerian
[5] www.wired.com/news/politics/0,1283,50455,00.html
mail-abuse.org