RE: notes on SMTP-time spamassassin rejections

impressive! :-) he he he

ok, but if a message comes in addressed to both a bogus address and a
legit address, that should also lead you to believe it is a bogus email.
for example:

to: [email protected], [email protected],
[email protected]
from: [email protected]
subject: penis vagina penis vagina

crap crap crap crap


now say spamcity.bastards.org is a legit domain with MX records and
such. The To address is legit also, but because it's accompanied by invalid
addresses you own, shouldn't this also be rejected? Is that what the
honeypots do? Do they look at the entire SMTP and BSMTP headers?
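
A rough sketch of the check I have in mind, in Python rather than real
exim config. Everything here is made up for illustration: the function
name, the example.org addresses, and the idea that the full recipient
list is available in one place. It is not how exim or the honeypots in
this thread actually work.

```python
def has_honeypot_recipient(recipients, honeypots):
    """Return True if any envelope recipient is a known honeypot address.

    Both arguments are iterables of address strings; comparison is
    case-insensitive. (Hypothetical helper, not part of exim or SA.)
    """
    wanted = {h.strip().lower() for h in honeypots}
    seen = {r.strip().lower() for r in recipients}
    return not seen.isdisjoint(wanted)


# Hypothetical honeypot list and recipient lists.
HONEYPOTS = ["rald@example.org", "ald@example.org"]

mixed = ["gerald@example.org", "Rald@example.org"]
clean = ["gerald@example.org"]
```

With these values, has_honeypot_recipient(mixed, HONEYPOTS) is true and
has_honeypot_recipient(clean, HONEYPOTS) is false, so only the first
message would be treated as bogus.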

Cheers,

David.


-----Original Message-----
From: [email protected] [mailto:[email protected]]On
Behalf Of Gerald Oskoboiny
Sent: Tuesday, June 08, 2004 11:15 AM
To: [email protected]
Subject: Re: notes on SMTP-time spamassassin rejections


* Gerald Oskoboiny <[email protected]> [2004-06-01 13:01-0400]

> The spamassassin that runs at SMTP time is a generic one that
> doesn't learn over time because it doesn't have a bayes DB that
> it can write to (because it runs as user nobody), so it is much
> less effective than it could be.
>
> I had planned to figure out how to set up a bogus user with a
> bayes DB that I could train over time, but it seems tricky to do
> that with exiscan-acl so maybe I should just configure SA on
> mr-burns to use my personal bayes DB.

I did this, and started training spamassassin on any spam it
misses (maybe 5-10/day), set up a few honeypots (notes below),
and wow, what a huge improvement.

My spam intake has dropped to 1998 levels. It's actually eerily
quiet. I'm worried that I must be rejecting too much stuff, but
can't find any evidence of legit mail being blocked.

To set up honeypot addresses, I checked for the most common
unrouteable addresses in exim's rejectlog (somehow a bunch of
bogus addrs got onto spammers' lists, usually truncated versions
of real addresses, e.g. [email protected]) and turned those
into aliases for a new user I created:

   # spam honeypots (most common unrouteable addrs in rejectlog)
   rald:   spam-honeypot
   ald:    spam-honeypot
   ...
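
Finding the most common unrouteable addresses can be done with a short
script along these lines. This is only a sketch: the rejectlog line
shape in the comment is an assumption (it varies with exim version and
configuration), and the function name and sample addresses are invented
for the example.

```python
import re
from collections import Counter

# Assumed rejectlog line shape (not taken from the original post):
#   2004-06-01 13:01:00 H=host [192.0.2.1] F=<x@example.net> \
#       rejected RCPT <rald@example.org>: Unrouteable address
RCPT_RE = re.compile(
    r"rejected RCPT <([^>]+)>.*unrouteable address", re.IGNORECASE
)


def top_unrouteable(lines, n=10):
    """Count rejected recipient addresses and return the n most common."""
    counts = Counter()
    for line in lines:
        match = RCPT_RE.search(line)
        if match:
            # Lowercase so that Rald@ and rald@ are counted together.
            counts[match.group(1).lower()] += 1
    return counts.most_common(n)
```

Feeding it the lines of the rejectlog, e.g.
top_unrouteable(open(path_to_rejectlog)), gives (address, count) pairs
to pick honeypot candidates from; path_to_rejectlog is a placeholder
for wherever the log lives on a given system.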

(I could have also just created a bunch of fake addrs and put
those on my web site to be crawled by email harvesting bots, but
might as well use addresses that were already known to spammers.)

The 'spam-honeypot' user has the same uid as gerald so it can
write to my bayes DB, and it feeds all its non-daemon mail into
sa-learn using a procmailrc like this:

   # procmailrc for spam-honeypot user: feed all mail into sa-learn --spam

   PATH=$HOME/bin:/usr/bin:/bin:/usr/local/bin

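   # set aside anything that looks like it came from a daemon (bounces etc.)
   :0:
   * ^FROM_DAEMON
   from-daemon

   # pipe a copy of everything else into sa-learn as spam
   :0c
   | sa-learn --spam

   # and keep the original in a mailbox
   :0:
   sa-learned-spam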

and ~spam-honeypot/.spamassassin is symlinked to ~gerald/.spamassassin

(spam-honeypot etc above are actually called something else; I
didn't put the real names here because I don't want spammers
finding out the names of my honeypots and poisoning them with
legitimate mail.)

--
Gerald Oskoboiny <[email protected]>
http://impressive.net/people/gerald/

HURL: fogo mailing list archives, maintained by Gerald Oskoboiny