Lightweight spam scoring in procmail
on 12.01.2007 01:39:43 by Jem Berkes
I've created SpamTestBuddy, a very small scoring tool (a compiled C program
of a few KB) that can be called from procmail the way SpamAssassin is. It
might be the simplest way to use DNSBLs/RBLs in procmail, and it is flexible too.
http://www.pc-tools.net/unix/spamtestbuddy/
SpamTestBuddy is a simple, lightweight, multiple-input spam scoring tool.
It is standalone and can be used with simple procmail rules, without root
access or daemons. It has built-in support for simple DNS checks,
including DNSBL (DNS-based blocklist) queries, and can scan the headers added
by filters you already use, such as SpamProbe, QSF, or DSPAM.
The configuration lives in a human-readable file that is flexible and easy
to edit. Each test can add to or subtract from the total score.
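To illustrate the idea of tests adding to or subtracting from a total, here is a rough sketch in Python. It is not SpamTestBuddy's code or its configuration syntax. The header names, the weights, and the function name total_score are all made up for the example, and it assumes each filter writes a plain number into its header.

```python
from email import message_from_string

# Hypothetical rule table: (header name, weight applied to its numeric value).
# These names and weights are illustrative only, not SpamTestBuddy settings.
RULES = [
    ("X-FilterA-Score", 1.0),     # counted in full
    ("X-FilterB-Score", 0.5),     # counted at half weight
    ("X-Whitelist-Score", -1.0),  # subtracts from the total
]


def total_score(raw_message, rules=RULES):
    """Sum the weighted numeric values of the configured headers."""
    msg = message_from_string(raw_message)
    total = 0.0
    for header, weight in rules:
        value = msg.get(header)
        if value is None:
            continue  # this filter did not tag the message
        try:
            total += weight * float(value.strip())
        except ValueError:
            continue  # header present but not a plain number; ignore it
    return total
```

A delivery decision would then compare the total against a threshold of your choosing. Combining several inputs like this is what lets one weak signal be outvoted by the others.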
(I am not trying to re-invent SpamAssassin. That is a very powerful piece
of software with all the features you need. It is also somewhat large and
harder to install. SpamTestBuddy just helps you bring together existing
scores from filters you already use, with a few useful extra tests thrown
in. For an all-in-one solution, try SpamAssassin.)
Tested under Linux, FreeBSD, and NetBSD.
--
Jem Berkes
www.sysdesign.ca
Re: Lightweight spam scoring in procmail
on 12.01.2007 03:47:44 by Jem Berkes
> I've created SpamTestBuddy which is a very small (few KB, compiled C
> program) scoring tool which can be used from procmail like
> SpamAssassin. It might be the simplest way to use DNSBL/RBLs in
> procmail, flexible too.
>
> http://www.pc-tools.net/unix/spamtestbuddy/
I also wanted to post some common uses for this program:
1) Combining more than one external filter:
If you have several filters that output a numeric probability or
score, you can combine them using +TestHeaderFloat and make a
decision on the total (floating point) score. Examples of filters which
integrate seamlessly are SpamProbe, QSF, DSPAM, and CRM114.
2) Querying DNSBLs (DNS-based blocklists) or real-time lists:
Typically this is done by mail servers at the time of mail receipt.
However, there are advantages to doing these "RBL" lookups later.
SpamTestBuddy will let you query multiple real-time lists for fresh data
on known spam sources, abusive networks, etc. You can combine the results
with those of your statistical, body-reading filters.
3) Reducing false positives from other filters:
You can use SpamTestBuddy as a secondary filter to interpret existing
scores differently. Alternatively, you can make a more conservative
configuration by combining scores. This may be helpful for revisiting
classification errors.
4) Parsing the IP address of the sender:
The IP address of the SMTP server that relayed the mail is always visible in
the Received headers, but parsing and extracting the correct IP address
is hard to do reliably with procmail recipes alone. SpamTestBuddy can
pick out the correct address by applying SkipReceived, a list of networks
you define to consider local and ignore. The resulting IP address is
conveniently displayed in the new X-SpamTestBuddy header, simplifying
your procmail recipes.
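
For uses 2) and 4), the two underlying ideas can be sketched in Python. This is only an illustration under simplifying assumptions, not how SpamTestBuddy is implemented: it looks only at IPv4 addresses written in square brackets, the function names are invented, and the skip list here merely plays the role that SkipReceived is described as playing above. The zone "dnsbl.example" is a placeholder, not a real list.

```python
import ipaddress
import re
import socket

# Simplification: only consider IPv4 addresses written as [a.b.c.d].
IPV4_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def first_external_ip(received_headers, skip_networks):
    """Return the first bracketed IP that is not in a skipped network.

    received_headers is ordered as in the message, newest hop first.
    """
    skip = [ipaddress.ip_network(n) for n in skip_networks]
    for header in received_headers:
        for text in IPV4_IN_BRACKETS.findall(header):
            try:
                addr = ipaddress.ip_address(text)
            except ValueError:
                continue  # e.g. [999.1.1.1] is not a valid address
            if any(addr in net for net in skip):
                continue  # local hop, keep looking
            return str(addr)
    return None


def dnsbl_query_name(ip, zone):
    """Build the DNSBL lookup name: reversed octets followed by the zone."""
    return ".".join(reversed(ip.split("."))) + "." + zone


def is_listed(ip, zone):
    """True if the DNSBL name resolves; any lookup failure counts as not listed."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        return False
```

With a skip list of 10.0.0.0/8, a message whose newest Received line shows [10.0.0.5] and whose next one shows [203.0.113.7] yields 203.0.113.7, and the lookup name for the placeholder zone is 7.113.0.203.dnsbl.example. Treating every DNS failure as "not listed" is a deliberately cautious choice for the sketch; a real setup may want to tell timeouts apart from genuine misses.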
--
Jem Berkes
www.sysdesign.ca