[Evolution] Spam Filter
Steve Murphy
murf@e-tools.com
09 Jul 2002 16:15:24 -0600
I'm using both spambouncer (1.5) and spamassassin (2.40) via procmail,
which serves as the local delivery agent. Both are set up to just tag
the messages.
I have filters set up in evolution to move the spam tagged messages into
folders, to mark them as important, to mark them as read, and to cease
further processing. Right now, I just act on spambouncer tags alone,
while I compare the two.
If you want to run both, like me, you have to run spamassassin first in
your procmail file; then, after spamassassin adds its tags, let
spambouncer do its thing. Use terse reports in the headers, don't defang
the MIME, and don't let spamassassin modify the subject, or spambouncer
won't work right.
http://www.spambouncer.org
http://www.spamassassin.taint.org
http://www.procmail.org
And, of course, you'll need perl for spamassassin. Spambouncer is purely
procmail-based scripts.
And again, of course, you may want to install Vipul's Razor and the DCC
stuff, as they are pretty efficient now at spotting spam.
I've added some updates to spambouncer that make it better at catching
the base64 stuff, better at spotting letters that claim to be from
whitelisted individuals or yourself but really aren't, and better at
finding out if the sender is using a free email account.
Personally, it's almost a tie between the two; spambouncer is a bit
better percentage-wise when spamassassin has a threshold of 5. I
haven't had the time to crank down the threshold for spamassassin until
it has about equal percentages with spambouncer; my guess is that would
result in a good number of false positives, which would be odious. Right
now the false positives are fairly low, maybe 2 or 3 a week, out of
maybe 200 or so junk emails spotted in the same period. I've gotten
spambouncer to where I hardly EVER get a false negative.
With spamassassin, the threshold of 5 is letting a few spams through
unchallenged. At least they both unerringly catch the
Korean/Chinese spam, which seems to be around 80% of the junk I get
personally.
Tips and hints:
1. With spambouncer, define all program paths explicitly in the
.procmailrc; don't count on any of them being in the path.
2. With evolution, and only because I'd like to contribute
to dsbl, razor, dcc, and ordb, I add a recipe to the .procmailrc which
saves a copy of each incoming email into a directory, with the (hopefully
unique) message-id as its file name. This way, I have a copy of the
ORIGINAL letter on hand, which I can feed to razor-report, dccproc, etc.
The following are the very last lines in my .procmailrc file:
...
...
...
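# Directory that holds the untouched copies
LETTERDIR=<path to some lonely directory>
# Fallback name in case the message has no Message-Id header
MESSID=XXXX
# Capture the Message-Id header value into MESSID
:0:
* ^Message-Id: \/.+
{
MESSID=${MATCH}
}
# Save a copy (c flag) of the original message, named after its message-id
:0c
${LETTERDIR}/${MESSID}
# Filter the message through spamassassin so it adds its tags
:0fw
| /usr/local/bin/spamassassin -P -a
# Then let spambouncer do its tagging
INCLUDERC=${SBDIR}/sb.rc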
3. My advice is to mark the letter as read when you file it as spam. This
is purely for psychological reasons. You can check in the "spam" folder
as often as you please, but at least your mail reader isn't going to be
calling to you to read the stupid stuff.
4. I wrote a little program that searches through stdin for Message-Id
lines and, for each one found, uses the message-id field to find the
copy of the original message and feeds that letter to razor-report, the
dsbl spamtrap, and DCC. I could also submit it to ORDB as well... I set
up an alias so that mail sent to the alias gets run through this program.
spamassassin has a "report" option as well, to submit to dcc and razor,
which I could have used just as well. For all the spam you wish to report,
just highlight all the appropriate messages in the spam folder and hit
control-j. Give the spamtrap address, and away it goes. This will probably
be even easier with 1.1 of evolution, so you don't have to address the
message to your spamtrap alias every time. Setuid the executable of such
sendmail-invoked programs.
5. I add a crontab command to limit the saved messages to maybe the
last 3 days or so.
6. Take ALL your regular mailing lists (I'm on maybe 65 or so), and
prefilter them out in your .procmailrc, so neither spamassassin nor
spambouncer sees them, and you don't save any copies. Why waste cpu cycles
on stuff you know won't be spam?
7. To keep down false positives, set up a whitelist in spambouncer. It
seems less important in spamassassin (at least, if you stick with the
default threshold of 5!). Filter your addressbook into the .nobounce file.
And if you put short domain names in there, like "sun.com", say "@sun.com"
instead.
8. I mark spam letters as important, because they end up in different
folders, based on the tags from spambouncer. All I have to do is glance
into the "Important mail" VFolder to see them all at once. Nice.
9. Do NOT automatically report all mail the filters tag as spam to razor.
You could do this with dcc, if you are careful that the count increment is
"1". In general, I'd advise that a human eyeball the message and classify
it as spam before submitting. It will make for much better databases. And
you can tell dcc that it's spam with an "infinite" count that way.
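
For anyone who wants to roll their own version of the lookup program from
tip 4, here is a rough Python sketch of just the lookup step. It is not the
original program (that isn't shown here); the function names are made up for
illustration, and it assumes the saved copies are named exactly after the
message-id, as in the procmail recipe in tip 2. The check that skips ids
containing "/" is my own assumption, added so a hostile message-id can't
point outside the directory.

```python
import re
from pathlib import Path

# Matches a Message-Id line and captures the value after the colon,
# roughly mirroring the procmail condition "^Message-Id: \/.+".
MESSAGE_ID_RE = re.compile(
    r"^Message-Id:[ \t]*(.+?)[ \t\r]*$", re.IGNORECASE | re.MULTILINE
)


def find_message_ids(text):
    """Return the Message-Id values found in text, in order, without duplicates."""
    seen = []
    for match in MESSAGE_ID_RE.finditer(text):
        value = match.group(1)
        if value not in seen:
            seen.append(value)
    return seen


def locate_originals(text, letter_dir):
    """Pair each Message-Id in text with its saved copy; ids with no file are skipped."""
    found = []
    for message_id in find_message_ids(text):
        # Hypothetical safety check: refuse ids that would escape letter_dir.
        if "/" in message_id or message_id in (".", ".."):
            continue
        candidate = Path(letter_dir) / message_id
        if candidate.is_file():
            found.append((message_id, candidate))
    return found
```

Call locate_originals(sys.stdin.read(), letter_dir) from a small wrapper and
hand each returned path to whatever reporting tools you use. How each tool
wants to be invoked is deliberately left out; check its own documentation.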
It would be real cool if evolution would couple more tightly with the
spam filters, and have some features to help in spam submission to razor
and dcc, etc. But it munges the incoming letter, which may be bad for
checksum databases like dcc and razor (but then, maybe not...! Rather than
take a chance, I keep an original around).
What could evolution do to better couple with spam checking software? How
about:
1. Have a special attribute, similar to "importance", for "spam".
2. Provide a default "spam" folder in evolution, which is just part of the
default installation.
3. Default filters that file stuff in the spam folder if X-SBClass: is not
"OK", or "X-Spam-Flag: YES" is set. This'd give a new user a "fast start".
And they could set the "spam" attribute.
4. If the configure script spots a working installation of razor-report or
dccproc, it'd be nice to turn on options that would allow you to submit
selected letters to them. If everybody feels that evolution-munged letters
yield the same checksums as the original, that is!
5. Auto-whitelist update for spambouncer and spamassassin, based on the
addressbook. Maybe even options in the addressbook to mark (checkbox?)
those you don't want to add to the whitelists, with general options
for which whitelists to generate and where they would be located.
Spamassassin and spambouncer don't use the exact same format. Maybe a
button to write the whitelists from the addressbook... or somesuch.
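
To make the whitelist idea in item 5 a bit more concrete, here is a small
Python sketch of what such a generator might do with a list of addressbook
entries. Everything in it is an assumption for illustration: the function
names are made up, the spambouncer output is assumed to be one entry per line
for the .nobounce file, and the spamassassin output is assumed to be
whitelist_from lines for the user preferences file, with a domain-only entry
written as a *@domain pattern. Check both formats against the versions you
actually run.

```python
def normalize_entry(entry):
    """Lower-case an entry; turn a bare domain such as 'sun.com' into '@sun.com'."""
    entry = entry.strip().lower()
    if entry and "@" not in entry:
        entry = "@" + entry
    return entry


def build_whitelists(addresses, excluded=()):
    """Return (nobounce_text, spamassassin_text) for the given addressbook entries.

    Entries listed in excluded play the role of the "don't whitelist" checkbox.
    """
    skip = {normalize_entry(e) for e in excluded}
    entries = sorted({normalize_entry(a) for a in addresses} - skip - {""})

    nobounce_lines = []
    spamassassin_lines = []
    for entry in entries:
        # Assumed .nobounce format: one address or @domain per line.
        nobounce_lines.append(entry + "\n")
        # Assumed spamassassin format: a domain entry becomes the pattern *@domain.
        pattern = "*" + entry if entry.startswith("@") else entry
        spamassassin_lines.append("whitelist_from " + pattern + "\n")
    return "".join(nobounce_lines), "".join(spamassassin_lines)
```

The two strings can then be written wherever each filter expects its
whitelist, which is the "where they would be located" option mentioned above.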
murf
On Tue, 2002-07-09 at 12:22, evolution-admin@ximian.com wrote:
> Message: 14
> From: "Patrick J. Doland" <pjdoland@pjdoland.com>
> To: evolution@ximian.com
> Date: 09 Jul 2002 14:33:11 -0400
> Subject: [Evolution] Spam Filter
>
> Howdy-
>
> Does anyone have a relatively effective set of filter rules to reduce
> spam with Evolution?