Re: Spam filters

* Hugo Haas <[email protected]> [2001-01-27 19:49-0500]
> I have finally switched from Junkfilter[1] to whitelist based
> filtering.
[..]

* Hugo Haas <[email protected]> [2002-04-03 11:10-0500]
[..]
> It seems that there is no immediate nor easy technological answer, and
> no easy legal action either.

I have changed my spam filtering techniques taking into account the
new type of spam. I talked to Max who started using SpamAssassin[2]
and was happy about it. I had a look and found it cool. But I didn't
want to abandon my whitelist filtering.

I am therefore using three different folders:
- emails identified as spam.
- emails not identified as spam from people I know (who are on my
 whitelist).
- emails not identified as spam from people I don't know.

SpamAssassin works with a scoring system. I use my whitelist to
decrease the score when somebody is on my whitelist. It is therefore
easier to be considered as a spammer if the address is not on my
whitelist.
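
The three-way split above can be sketched in a few lines of Python. This
is only an illustration of the decision order, not the actual setup
(which is done with procmail); the function name and the folder names
returned here are assumptions made for the example.

```python
def choose_folder(is_spam: bool, on_whitelist: bool) -> str:
    """Pick a destination folder for one message.

    Folder names are illustrative placeholders, not taken from any tool.
    """
    # Spam is filed first, even if the sender is on the whitelist:
    # the whitelist only lowers the score, it does not bypass the check.
    if is_spam:
        return "spam"
    # Not spam, and the sender is known.
    if on_whitelist:
        return "inbox"
    # Not spam, but the sender is unknown.
    return "unknown"
```

The spam test comes first on purpose: a whitelisted address that still
scores above the threshold ends up in the spam folder.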

I have also enabled Vipul's Razor[3] to increase my detection
accuracy. When I detect spam which isn't registered in Razor, I
register it.

Here is what it looks like:

-*- Procmailrc
==============

Whitelist detection:

 # Whitelist-based filtering
 WHITELIST_DIR=$HOME/whitelist
 WHITELIST=$WHITELIST_DIR/whitelist
 ffield=`formail -XFrom: | formail -r -xTo: | tr -d ' '`
 :0fhw
 * ? grep -F -i -x -q "$ffield" $WHITELIST
 | formail -i "X-HH-Whitelist: YES"

 :0Efhw
 | formail -i "X-HH-Whitelist: NO"
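
For readers who do not use procmail, here is a rough Python sketch of
what the whitelist lookup above does: take the address from the From:
header, strip spaces, and look for a case-insensitive whole-line match
in the whitelist (the grep -F -i -x part). The function names and the
example.org addresses are hypothetical, and the whitelist is assumed to
hold one bare address per line.

```python
from email import message_from_string
from email.utils import parseaddr


def sender_address(raw_message: str) -> str:
    """Return the bare address from the From: header, without spaces."""
    msg = message_from_string(raw_message)
    _name, addr = parseaddr(msg.get("From", ""))
    return addr.replace(" ", "")


def is_whitelisted(address: str, whitelist_lines: list) -> bool:
    """Case-insensitive, whole-line comparison against the whitelist."""
    # Assumption for this sketch: a message with no usable From: address
    # is never treated as whitelisted.
    if not address:
        return False
    wanted = address.lower()
    return any(line.rstrip("\n").lower() == wanted for line in whitelist_lines)


raw = "From: Alice Example <Alice@Example.org>\nSubject: hi\n\nbody\n"
whitelist = ["bob@example.org\n", "alice@example.org\n"]
header_value = "YES" if is_whitelisted(sender_address(raw), whitelist) else "NO"
```

The result would then be recorded in an X-HH-Whitelist header, as the
two recipes do with formail -i.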

Spam filtering:

 SPAMASSASSINDIR=$HOME/spam/spamassassin

 :0fw
 | $SPAMASSASSINDIR/spamassassin -c $SPAMASSASSINDIR/rules -P

 :0:
 * ^X-Spam-Flag: YES
   spam

If something hasn't been classified as spam, see if I know the guy:

 INCLUDERC=$HOME/.procmail/spamfiltering

 :0:
 * ^X-HH-Whitelist: NO
   unknown

-*- SpamAssassin
================

Here is how I use my whitelist:

 # Whitelist filtering
 header          ON_WHITELIST    X-HH-Whitelist  =~      /^YES$/
 describe        ON_WHITELIST    Sender whitelisted
 score           ON_WHITELIST    -5.0

I have a few other unrelated settings:

 # Don't rewrite the subject
 rewrite_subject 0

 # Leave the content-type alone
 defang_mime 0

 # Report in the header
 report_header 1
 use_terse_report 1

-*- Muttrc
==========

A few things that I configured to make my life easier:

 # Spam stuff
 # Show spam headers
 unignore X-Spam-Status X-Spam-Report
 # Highlight spam
 #ifndef USE_IMAP
 color index     red     default "~h '^X-Spam-Flag: YES'"
 color index     red     blue "~h '^X-Spam-Flag: YES' ! ~h '^X-Spam-Status: .*RAZOR_CHECK'"
 #endif
 # How to report spam
 #define REPORT_BULK_SPAM ";|formail -s spamassassin -r -D\n"
 macro index \eR "T! ~s '\^[[]Moderator Action[]] ' ~h '\^X-Spam-Flag: YES' ! ~h '\^X-Spam-Status: .*RAZOR_CHECK'\n"
 macro index \eS REPORT_BULK_SPAM
 macro pager \eS REPORT_BULK_SPAM

Note that there are cpp commands because I preprocess my muttrc[4].

I am going to test that extensively and tweak it if necessary.

 2. http://spamassassin.taint.org/
 3. http://razor.sf.net/
 4. http://larve.net/people/hugo/2002/04/mutt-cpp
--
Hugo Haas <[email protected]> - http://larve.net/people/hugo/
Perhaps, but let's not get bogged down in semantics. -- Homer J.
Simpson

Re: Spam filters

On Mon, Apr 15, 2002 at 06:13:01PM -0400, Hugo Haas wrote:
> * Hugo Haas <[email protected]> [2001-01-27 19:49-0500]
> > I have finally switched from Junkfilter[1] to whitelist based
> > filtering.
> [..]

> I have changed my spam filtering techniques taking into account the
> new type of spam. I talked to Max who started using SpamAssassin[2]
> and was happy about it. I had a look and found it cool. But I didn't
> want to abandon my whitelist filtering.

SpamAssassin looks excellent from what I have seen. I understand
it has some kind of automatic whitelist feature: every time you
receive non-spam from someone, their whitelist score increases?
(or something like that)

Thanks for the docs on your setup, I'm sure that will be useful.

> Note that there are [cpp] commands because I preprocess my muttrc[4].

>   2. http://spamassassin.taint.org/
>   4. http://larve.net/people/hugo/2002/04/mutt-cpp

I'm curious why you need to use cpp; I have most of my settings
in my .muttrc [5], and use a couple extra files [6] for other stuff
that is specific to a certain environment (personal or w3c mail)

For my w3c mail, I invoke mutt with "w3cmutt", which is aliased to:
   zot "w3c mail"; localsuffix="-w3c" mutt

(zot just changes the rxvt title bar; it's called zot because that's
what it was called when I got it from a friend 10 years ago)

Hmm... I guess you tried something like that before switching to
cpp; I'm just wondering what it was you finally needed cpp for.

[5] http://impressive.net/people/gerald/misc/dotfiles/muttrc
[6] http://impressive.net/people/gerald/misc/dotfiles/muttrc-local-devo
   http://impressive.net/people/gerald/misc/dotfiles/muttrc-local-w3c

(I don't need different configs for local/remote, since I always
store all my mail on my laptop.)

--
Gerald Oskoboiny <[email protected]>
http://impressive.net/people/gerald/

Re: Spam filters

* Gerald Oskoboiny <[email protected]> [2002-04-15 23:07-0400]
> SpamAssassin looks excellent from what I have seen. I understand
> it has some kind of automatic whitelist feature: every time you
> receive non-spam from someone, their whitelist score increases?
> (or something like that)
[..]

Yes, there is an auto-whitelist feature. I haven't tried it yet. I
wasn't sure about how to leverage my existing whitelist to bootstrap
it, so I preferred to try and integrate my whitelist another way, and
maybe I will play with the auto-whitelist later.

I was somewhat worried that the whitelist would let everything
through. By default, you need 5 points to be declared as spam. An
email from the 'EMail-IT' True Stealth System that I was complaining
about[7] scores as follows:

 X-Spam-Report:   13.4 hits, 5 required;
   * -0.3 -- Cc: contains similar domains at least 10 times
   *  1.7 -- BODY: Includes a link to send a mail with a subject
   * -0.2 -- BODY: Includes a URL link to send an email
   *  3.5 -- BODY: Link to a URL containing "remove"
   *  3.0 -- Listed in Razor, see http://razor.sourceforge.net/
   *  4.5 -- HTML-only mail, with no text version
   *  0.2 -- From and To the same address
   *  1.0 -- Received via a relay in orbs.dorkslayers.com
     [RBL check: found 150.82.130.139.orbs.dorkslayers.com.]

With my whitelist 5-point bonus, it scores 8.4 and is still recognized
as spam. From what I have seen, their whitelist has a 100-point bonus,
which seems far too much[8]:

 header   From: address is in the user's white-list   USER_IN_WHITELIST   -100.0

There must be something I haven't understood about it yet.
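
The arithmetic can be checked with a short Python sketch. The per-test
scores are copied from the X-Spam-Report above; the function names are
made up for the example, and it assumes a message counts as spam when
its total reaches the required value.

```python
# Per-test scores copied from the X-Spam-Report above.
REPORT_SCORES = [-0.3, 1.7, -0.2, 3.5, 3.0, 4.5, 0.2, 1.0]


def total_score(scores, whitelist_bonus=0.0):
    """Sum the test scores and subtract the whitelist bonus, if any."""
    # Round to one decimal to hide floating-point noise in the sum.
    return round(sum(scores) - whitelist_bonus, 1)


def is_spam(score, required=5.0):
    """Assumed rule: spam when the score reaches the required value."""
    return score >= required


plain = total_score(REPORT_SCORES)               # no whitelist bonus
with_my_bonus = total_score(REPORT_SCORES, 5.0)  # the 5-point bonus
with_default = total_score(REPORT_SCORES, 100.0) # the 100-point bonus
```

With the 5-point bonus the message stays above the threshold; with a
100-point bonus any message from a whitelisted address would pass,
whatever the other tests say.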

> > Note that there are [cpp] commands because I preprocess my muttrc[4].
>
> >   2. http://spamassassin.taint.org/
> >   4. http://larve.net/people/hugo/2002/04/mutt-cpp
>
> I'm curious why you need to use cpp; I have most of my settings
> in my .muttrc [5], and use a couple extra files [6] for other stuff
> that is specific to a certain environment (personal or w3c mail)
>
> For my w3c mail, I invoke mutt with "w3cmutt", which is aliased to:
>     zot "w3c mail"; localsuffix="-w3c" mutt
>
> (zot just changes the rxvt title bar; it's called zot because that's
> what it was called when I got it from a friend 10 years ago)
>
> Hmm... I guess you tried something like that before switching to
> cpp; I'm just wondering what it was you finally needed cpp for.

Indeed I tried something like that, but it rapidly got very complex.
On my laptop, depending on whether I read my private mail or my work
mail, and whether I use isync to read my IMAP folders locally or read
them remotely, I have 4 aliases:

 imutt='my_mutt -DWORK_CONF -DON_LAPTOP -DUSE_IMAP --'
 imuttp='my_mutt -DON_LAPTOP -DUSE_IMAP --'
 mutt='my_mutt -DWORK_CONF -DON_LAPTOP --'
 muttp='my_mutt -DON_LAPTOP --'

and at work, I have:

 mutt='my_mutt -DWORK_CONF --'
 muttp='my_mutt --'

My configuration is fairly complex because I have lots of different
settings for each of them. Here is an example:

#ifdef WORK_CONF
  #ifdef USE_IMAP
    #define FOLDER "{localhost:1430}mail"
  #else
    #define FOLDER "~/mail"
  #endif
#else
  #ifdef USE_IMAP
    #define FOLDER "{localhost:1430}private-mail"
  #else
    #define FOLDER "~/private-mail"
  #endif
#endif

set folder=FOLDER

and another one:

#ifndef USE_IMAP
  #ifndef ON_LAPTOP
    # Hide the IMAP server messages
    folder-hook . "push \"<limit> ! (~s 'DELETE THIS MESSAGE -- FOLDER INTERNAL DATA' ~f MAILER-DAEMON)\n\""
  #endif
#endif

I used your technique for a long time, but it just became too complex
to manage so many configurations.
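
As an illustration, the FOLDER example above boils down to a two-flag
decision table. The Python sketch below restates it; the function and
parameter names are invented for the example (they mirror the WORK_CONF
and USE_IMAP defines), and the real configuration is generated by cpp,
not Python.

```python
def mail_folder(work_conf: bool, use_imap: bool) -> str:
    """Return the value the cpp example assigns to FOLDER."""
    # WORK_CONF selects work mail over private mail.
    base = "mail" if work_conf else "private-mail"
    # USE_IMAP selects the IMAP path over the local directory.
    prefix = "{localhost:1430}" if use_imap else "~/"
    return prefix + base


# One entry per combination of the two flags.
table = {
    (work, imap): mail_folder(work, imap)
    for work in (True, False)
    for imap in (True, False)
}
```

Each extra flag (such as ON_LAPTOP) doubles the number of combinations,
which is where separate per-environment files become hard to keep in
sync.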

 7. http://impressive.net/archives/fogo/[email protected]
 8. http://spamassassin.taint.org/tests.html
--
Hugo Haas <[email protected]> - http://larve.net/people/hugo/
Kids, your mother's under a lot of pressure, why don't we let her clear
the table in peace? -- Homer J. Simpson

Re: Spam filters

* Hugo Haas <[email protected]> [2002-04-15 18:13-0400]
> I have changed my spam filtering techniques taking into account the
> new type of spam. I talked to Max who started using SpamAssassin[2]
> and was happy about it. I had a look and found it cool. But I didn't
> want to abandon my whitelist filtering.
>
> I therefore am using 3 different folders:
> - emails identified as spam.
> - emails not identified as spam from people I know (who are on my
>   whitelist).
> - emails not identified as spam from people I don't know.
>
> SpamAssassin works with a scoring system. I use my whitelist to
> decrease the score when somebody is on my whitelist. It is therefore
> easier to be considered as a spammer if the address is not on my
> whitelist.
>
> I have also enabled Vipul's Razor[3] for increasing my detection
> accuracy. When I detect spam which isn't registered in Razor, I do so.

I wanted to give an update on my spam filtering system. With the new
version of SpamAssassin (2.20) and Razor (1.20), my spam filter
catches about 98% of the spam (I lowered the threshold to 3.6 hits and
tweaked a couple of other rules). The 2% of spam that got through
went into my unknown sender folder.

The only non-spam emails I saw it catch were bounces from mailing
lists.

In order to make sure that I improve my (and everybody else's) spam
filtering, I systematically bounce spam that went through to
spamassassin-sightings[4] (ESC-B in my Mutt session) and register all
confirmed spam with Razor[5] (ESC-R ; ESC-Z in my Mutt session). This
is easy enough that it just takes a few seconds every day or two.

Basically, I am *very* happy about this new system, and would
encourage people to use it: the more people use Razor and report spam
to it, the less spam we will see.

 4. http://lists.sourceforge.net/lists/listinfo/spamassassin-sightings
 5. http://razor.sourceforge.net/
--
Hugo Haas <[email protected]> - http://larve.net/people/hugo/
Mais alors, tout se recoupe !

HURL: fogo mailing list archives, maintained by Gerald Oskoboiny