Question for ProcMail Guys about tracking and scoring.

on 28.12.2006 18:22:34 by ihatzi

Ok guys, thanks in advance for helping.

What I am trying to do is catch certain spam keywords in incoming mail
and move those messages to our Spam folder.

This is what I have and it works fine:

# Keywords to Move to Spam
:0 HB
* 1^1 .*0em software.*
* 1^1 .*ambien.*
* 1^1 .*antibiotics.*
* 1^1 .*anti.*
* 1^1 .*antidepressants.*
.Spam/

BUT!!!!

If I happen to have a non Spam email (ham) get caught up in the filter,
I want to know what keyword was responsible for scoring the message.

In the example above, the .*anti.* keyword would probably catch ham
along with spam emails.

How do I mark the email to show which keyword triggered the match?
Our actual keyword list has hundreds of entries and some are catching
ham, and we have no way to tweak the list as we don't know which ones
are catching the ham.

HELP!

Ion

Re: Question for ProcMail Guys about tracking and scoring.

on 28.12.2006 18:42:21 by keeling

ihatzi@hotmail.com :
> Ok guys, thanks in advance for helping.
>
> What I am trying to do is catch certain spam keywords in incoming mail
> and move those messages to our Spam folder.
>
> This is what I have and it works fine:
>
> # Keywords to Move to Spam
> :0 HB
> * 1^1 .*0em software.*
> * 1^1 .*ambien.*
> * 1^1 .*antibiotics.*
> * 1^1 .*anti.*
> * 1^1 .*antidepressants.*
> .Spam/
>
> [snip]
> How do I mark the email indicating this was the one that was triggered!

* 1^1 ()\/.*Oem software.*

The "\/" thing is the MATCH operator. Whatever matches the regexp
that follows it goes into MATCH.

# Keywords to Move to Spam
#
* 1^1 ()\/.*Oem software.*

...

:0
{
LOG="${MATCH}"
:0:
.Spam/
}
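
For reference, here is a minimal sketch of a complete recipe built around the MATCH operator. It makes two assumptions that are not in the posts above: each keyword gets its own unscored recipe (I am not certain how MATCH behaves inside weighted conditions, so the sketch avoids combining the two), and the keyword is written without the surrounding .* so that MATCH holds only the keyword text. The "keyword hit:" log prefix is made up for illustration.

```procmail
# Sketch: one recipe per keyword, so MATCH always names the keyword that hit.
# ".Spam/" is the maildir from the original post.
:0 HB
* ()\/0em software
{
  # Append the captured text to the procmail log; the quoted string
  # deliberately ends with a real newline.
  LOG="keyword hit: $MATCH
"

  # Maildir delivery, so no lockfile is requested.
  :0
  .Spam/
}
```

The empty () in front of \/ is there because procmail treats a backslash at the very start of a condition as a quoting character.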


--
Any technology distinguishable from magic is insufficiently advanced.
(*) http://www.spots.ab.ca/~keeling Linux Counter #80292
- - http://www.faqs.org/rfcs/rfc1855.html Please, don't Cc: me.
Spammers! http://www.spots.ab.ca/~keeling/emails.html

Re: Question for ProcMail Guys about tracking and scoring.

on 28.12.2006 20:32:17 by ihatzi

Ok, that's a little closer than I was.

The MATCH expression seems to return the entire content of what was
matched.

i.e. if the text was found in the body of the email, MATCH seems to
return the entire body. This doesn't necessarily tell me which keyword
match had a hit.

Ion

Re: Question for ProcMail Guys about tracking and scoring.

on 28.12.2006 21:52:26 by Alan Clifford

On Thu, 28 Dec 2006 ihatzi@hotmail.com wrote:

>
> If I happen to have a non Spam email (ham) get caught up in the filter,
> I want to know what keyword was responsible for scoring the message.
>
> In the example above, the .*anti.* keyword would probably catch ham
> along with spam emails.
>

I do it this way to put all the reasons as headers into the mail.

# variables #
SPAMREASON_HEADER="X-Mundungus-Spam-Reason: "
NL="
"
# Note there really is a newline in the NL variable

###### Add reason header lines for selecting as spam #####
# remove the variable
SPAMREASON

:0
* ^FROM_MAILER
* ^(subject|assunto):.*\$.*[0-9][0-9][0-9][0-9]
{
nl
nl=${SPAMREASON+"$NL"}
SPAMREASON="${SPAMREASON}${nl}${SPAMREASON_HEADER}#105: dollar numbers"
}

# Mell
# intercept if the same long line has been found
:0
* ^X-MELL-Status: Yes
{
nl
nl=${SPAMREASON+"$NL"}
SPAMREASON="${SPAMREASON}${nl}${SPAMREASON_HEADER}#121 MELL threshold"
}


# continue with recipes.

# then

# Add reason headers

# Add the spam reason headers
:0 fhw
* ! SPAMREASON ?? ^^^^
| formail -A "${SPAMREASON}"

# deliver spam as required, eg:

# intercept if the same long line has been found
:0 :
* ^X-MELL-Status: Yes
IN.zzzz.mell


# deliver spam
:0:
* ! SPAMREASON ?? ^^^^
IN.zzzz.spam


The last recipe is the key one. The penultimate one lets me put spam
caught by a program of my own into its own spam mailbox.
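
A hedged sketch of how the same idea might be applied to the keyword list from the original question. It assumes SPAMREASON_HEADER and NL are set as in the variables section above and that SPAMREASON has been unset once before the first keyword recipe. The "keyword:" label and the one-recipe-per-keyword layout are illustrative choices, not something taken from the original list.

```procmail
# Sketch only: one block per keyword, each appending its own reason line.
:0 HB
* ()\/ambien
{
  nl
  nl=${SPAMREASON+"$NL"}
  SPAMREASON="${SPAMREASON}${nl}${SPAMREASON_HEADER}keyword: $MATCH"
}

:0 HB
* ()\/antibiotics
{
  nl
  nl=${SPAMREASON+"$NL"}
  SPAMREASON="${SPAMREASON}${nl}${SPAMREASON_HEADER}keyword: $MATCH"
}
```

The "Add the spam reason headers" and "deliver spam" recipes above then work unchanged, and a ham message that lands in the spam mailbox carries one reason header per keyword that hit.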

You should subscribe to the procmail mailing list. Lots of help there.
http://mailman.rwth-aachen.de/mailman/listinfo/procmail

--
Alan

( If replying by mail, please note that all "sardines" are canned.
There is also a password autoresponder but, unless this a very
old message, a "tuna" will swim right through. )

Re: Question for ProcMail Guys about tracking and scoring.

on 28.12.2006 22:02:27 by Alan Clifford

On Thu, 28 Dec 2006, ihatzi wrote:

i> Ok, that's a little closer than I was.
i>
i> The MATCH expression seems to return the entire content of what was
i> matched.
i>
i> i.e. if the text was found in the body of the email, MATCH seems to
i> return the entire body. This doesn't necessarily tell me which keyword
i> match had a hit.
i>
i>

Take out all the .*


--
Alan

Re: Question for ProcMail Guys about tracking and scoring.

on 28.12.2006 22:22:23 by Allodoxaphobia

On 28 Dec 2006 11:32:17 -0800, ihatzi wrote:
> Ok, that's a little closer than I was.
>
> The MATCH expression seems to return the entire content of what was
> matched.
>
> i.e. if the text was found in the body of the email, MATCH seems to
> return the entire body. This doesn't necessarily tell me which keyword
> match had a hit.

If you're going to insist on using googlegroups on usenet, learn how to
do it right -- firstly by quoting previous, pertinent material.

Earlier you showed us:

> * 1^1 .*0em software.*
> * 1^1 .*ambien.*
> * 1^1 .*antibiotics.*
> * 1^1 .*anti.*
> * 1^1 .*antidepressants.*

Sure, if you are trying to apply MATCH ("\/") to .*0em software.*,
$MATCH _will_ contain your regex match -- WHICH IS THE ENTIRE FREAKING
BODY OF THE EMAIL _if_ it contains "0em software".

Why don't you search just for "0em software"?

* 1^1 ()\/0em software


Your application of procmail recipes is naive.

Just for your first test you need to look for
0em software
oem software
03m softwar3
oem software -- spelled with the upper ASCII-encoded a's e's and o's
o.e.m. s.o.f.t.w.a.r.e.
oem sofftware
oem softtware
oem sotfware
... yaa-daa, yaa-daa, yaa-daa.
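
A few of those spellings can be folded into a single regex. The following is only a sketch of that idea and not a complete answer to obfuscation. The character classes and repeat counts are my own guesses at a reasonable pattern, and the "keyword hit:" log prefix is made up for illustration.

```procmail
# Sketch: one regex covering some of the spellings listed above.
# Matches: 0em software, oem software, 03m softwar3,
#          oem sofftware, oem softtware.
# Does NOT match the dotted, transposed ("sotfware") or
# high-ASCII spellings.
:0 HB
* ()\/[o0][e3]m +sof+t+war[e3]
{
  # Log the exact spelling that was seen.
  LOG="keyword hit: $MATCH
"

  :0
  .Spam/
}
```

Procmail matches case-insensitively unless the D flag is given, so upper-case variants are covered without extra work.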

And, _this_ test of yours is DANGEROUS:

> * 1^1 .*anti.*

You will be tagging as spam any email containing any one of at least
3,210 possible (English) words:
:
|$ grep -i anti /usr/share/dict/words | wc -l
|3210
:
Even this one is not too smart:

> * 1^1 .*ambien.*
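
(The bare pattern also hits "ambient" and "ambience".) One hedged way to narrow it, assuming I am reading procmailrc(5) correctly that \< and \> each match the non-word character on either side of a word:

```procmail
# Sketch: require a non-word character on both sides of the keyword,
# so "ambien" no longer fires on "ambient" or "ambience".
# The leading () keeps procmail from treating the first backslash
# as a start-of-condition quoting character.
:0 HB
* ()\<ambien\>
.Spam/
```

Because \> consumes the character after the word, combining it with \/ would most likely leave that extra character at the end of MATCH.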


There are PLENTY of examples of procmail scripts on the web -- as well
as a number of complete .procmailrc's that some folks have posted.
Spend some time studying them.


Oh, and learn to use a real usenet newsreader. Get shed of that
googlegroups abomination.

Jonesy
--
Marvin L Jones | jonz | W3DHJ | linux
38.24N 104.55W | @ config.com | Jonesy | OS/2
*** Killfiling google posts: