Whitelist-based spam filtering

At 10:51 1/18/2001 -0500, Gerald Oskoboiny wrote:
>>[1] http://impressive.net/people/gerald/2000/12/spam-filtering.html

Sounds pretty nifty! However, for users of Eudora (not as cool as pine/mutt),
I thought I'd provide some hints on getting fairly similar whitelist
functionality. (I'm using GNU tools, but windoze users could use whatever
scripting they are comfortable with.)

CREATING the whitelist:

1. Create a nickname "me" with your email addresses.
2. Create a nickname "whitelist" containing:
2a. those you've sent email to
 gnu: cat History.lst >> whitelist.txt
2b. those in your address book
 gnu: grep alias nndbase.txt | awk '{print $NF}' >> whitelist.txt
2c. those you've received email from and whose mail sits in clean mailboxes
 gnu: grep From: *.mbx | gawk 'BEGIN \
    { RS = "[.@]*[^-0-9A-Za-z_.@]+[.@]*" } /@/' | sort | uniq \
    >> whitelist.txt
 gnu: cat *.mbx | formail -XFrom: | formail -r -xTo: | tr -d ' ' \
    >> whitelist.txt
  /* formail is better, but doesn't always work on eudora files */
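
Since steps 2a-2c all append to the same whitelist.txt, the file will
probably end up with duplicates and mixed-case variants of the same address.
Here is a rough sketch of a cleanup pass; the function name
normalize_whitelist is my own invention, and it assumes one address per line
and that lowercasing addresses is acceptable for matching.

```shell
#!/bin/sh
# normalize_whitelist (hypothetical helper): read addresses on stdin, one per
# line, and write a lowercased, de-duplicated, sorted list on stdout.
normalize_whitelist() {
    tr -d ' \t\r' |      # drop stray spaces, tabs and DOS carriage returns
    tr 'A-Z' 'a-z' |     # fold case so Bob@X.org and bob@x.org collapse
    grep '@' |           # keep only lines that look like addresses
    sort -u              # sort and remove duplicates
}
```

Something like "normalize_whitelist < whitelist.txt > whitelist.clean" would
then give a list that is easier to paste into the nickname.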

FILTERING the email:

1. Put the message in the spam mailbox.
2. If any recipient intersects-nickname "me" and From contains @w3.org, put
in Inbox.
3. If any recipient intersects-nickname "me" and From intersects-nickname
"whitelist", put in Inbox.
4. Run mailing list filters to sort everything else into WG boxes. Anything
else stays behind in the spam mbox.
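
For anyone who wants to try the same decision outside Eudora, steps 1-3 can
be sketched as a small shell function. This is only an approximation under
my own assumptions: classify is a made-up name, it looks at the first From:
line only, it skips the recipient check against "me", and it leaves out the
mailing list sorting of step 4. It relies on the GNU grep options -m and -o.

```shell
#!/bin/sh
# classify MESSAGE WHITELIST (hypothetical helper)
# Prints "inbox" if the sender is at w3.org or listed in WHITELIST,
# otherwise "spam" (the default from step 1).
classify() {
    msg=$1
    whitelist=$2
    # Take the first From: header line and pull out the address token.
    from=$(grep -i -m 1 '^From:' "$msg" |
           grep -E -o '[-0-9A-Za-z_.+]+@[-0-9A-Za-z_.]+' |
           head -n 1 | tr 'A-Z' 'a-z')
    case $from in
        '')       echo spam ;;    # no sender address found
        *@w3.org) echo inbox ;;   # step 2: trusted domain
        *)
            # step 3: whole-line, case-insensitive match against the whitelist
            if grep -F -x -i -q -- "$from" "$whitelist"; then
                echo inbox
            else
                echo spam
            fi ;;
    esac
}
```

Usage would be along the lines of "classify message.txt whitelist.txt", with
whitelist.txt holding one address per line.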

ADDING addresses:
A. Run the process again.
B. Make your first filter a manual filter that notifies an application with
the message. Select the messages whose senders you want added and hit
control-J. (If you have other manual filters, you should probably put a stop
in after this one, which you'll have to disable later if you want them to
work.)
B1. Have the filter notify an application, and pass it the message '%6',
which is actually the location of a temp file:
  D:\cygwin\bin\bash.exe "D:\Documents and Settings\reagle\bin\atw" %6
where atw is
  grep From: "$1" | gawk 'BEGIN { RS = "[.@]*[^-0-9A-Za-z_.@]+[.@]*" } /@/' \
  >> /tmp/new-entries
Then add those to your whitelist nickname. You can also have these entries
automatically appended to that nickname in the eudora nickname file
(nndbase.txt), but you get the point.
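
If you keep a plain whitelist.txt alongside the nickname, folding
/tmp/new-entries into it without piling up duplicates might look like the
sketch below. The name merge_entries is hypothetical, and this deliberately
does not touch nndbase.txt, since I'm not describing that file's format here.

```shell
#!/bin/sh
# merge_entries NEW WHITELIST (hypothetical helper)
# Appends to WHITELIST every address in NEW that is not already listed.
merge_entries() {
    new=$1
    whitelist=$2
    touch "$whitelist"                    # make sure the whitelist exists
    merge_tmp=$(mktemp) || return 1
    # Lowercase and de-duplicate the new entries, then keep only the lines
    # that match no whole line of the whitelist (-v -F -x, ignoring case).
    # grep exits 1 when nothing is new; that is not an error here.
    tr 'A-Z' 'a-z' < "$new" | sort -u |
        grep -v -F -x -i -f "$whitelist" > "$merge_tmp" || true
    cat "$merge_tmp" >> "$whitelist"
    rm -f "$merge_tmp"
}
```

Running "merge_entries /tmp/new-entries whitelist.txt" twice in a row should
leave the whitelist unchanged the second time.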
__

While I've tested it, I don't actually use this exact method myself. I use
bits of it to easily whitelist email from friends (by enumeration) and most
colleagues (by domain and enumeration) into inbox. Everything else goes into
spam and runs through about 20 filters that are header- and content-based
and quite effective. If one of these matches, the process stops. If not, my
mailing list filters kick in to sort the remaining messages, those neither
immediately whitelisted into inbox nor immediately blacklisted (kept in
spam), into their appropriate mailboxes. Whatever is left over stays in spam
too.

__
Regards,          http://www.mit.edu/~reagle/
Joseph Reagle     E0 D5 B2 05 B6 12 DA 65  BE 4D E3 C1 6A 66 25 4E
MIT LCS Research Engineer at the World Wide Web Consortium.

* This email is from an independent academic account and is
not necessarily representative of my affiliations.

HURL: fogo mailing list archives, maintained by Gerald Oskoboiny