big access_db?

on 16.10.2007 08:25:11 by pilsl

How big can access_db get without affecting overall sendmail performance?

The background to my question is that, like everyone else, my system receives a
lot of spam addressed to users that don't even exist. Nevertheless, each of these
mails is spam-checked and virus-scanned before it gets discarded or dumped
(depending on the spam status, a "no such user" mail is sent or not). So most of
my CPU cycles for spam fighting are wasted on addresses that don't even exist :(

Now I would like to collect the top 50 nonexistent users every day and add them
to access_db as
"TO:nosuchuser@mydomain.com 550 no spam"


I guess a few hundred entries will not harm access_db, but in fact I don't have
any clue how this feature scales to bigger numbers.
I use the blacklist_recipients feature too, of course.
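
A minimal sketch of the daily "top 50" step, assuming the rejected recipient addresses have already been extracted from the logs by some other means (the log parsing is not shown, and the function name `access_entries` is made up for this illustration). It only counts addresses and formats access-map lines in the style quoted above:

```python
from collections import Counter
from typing import Iterable, List


def access_entries(rejected: Iterable[str], top_n: int = 50,
                   reply: str = "550 no spam") -> List[str]:
    """Return access-map lines for the most frequently hit unknown recipients.

    `rejected` is assumed to be one recipient address per rejected delivery.
    """
    # Normalise case and whitespace so Ghost@x and ghost@x count as one address.
    counts = Counter(addr.strip().lower() for addr in rejected if addr.strip())
    # Key and value are separated by a tab, as in a typical access source file.
    return [f"To:{addr}\t{reply}" for addr, _count in counts.most_common(top_n)]


if __name__ == "__main__":
    seen = ["ghost@mydomain.com", "Ghost@mydomain.com", "nobody@mydomain.com"]
    for line in access_entries(seen):
        print(line)
```

The output would still have to be merged into the access source file and rebuilt with makemap; that part is left out here.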

thnx,
peter

Re: big access_db?

on 16.10.2007 12:30:31 by Kees Theunissen

peter pilsl wrote:
> How big can access_db get without affecting overall sendmail performance?
>
> The background to my question is that, like everyone else, my system receives a
> lot of spam addressed to users that don't even exist. Nevertheless, each of these
> mails is spam-checked and virus-scanned before it gets discarded or dumped
> (depending on the spam status, a "no such user" mail is sent or not). So most of
> my CPU cycles for spam fighting are wasted on addresses that don't even exist :(
>
> Now I would like to collect the top 50 nonexistent users every day and add them
> to access_db as
> "TO:nosuchuser@mydomain.com 550 no spam"

How about only accepting messages for users/aliases that *do* exist?

Regards,

Kees.

--
Kees Theunissen.

Re: big access_db?

on 16.10.2007 16:03:46 by pilsl

Kees Theunissen wrote:
>
> How about only accepting messages for users/aliases that *do* exist?
>

While this has clear advantages, it also has some disadvantages:

1) Some domains like to have default mailboxes, where every mail that does not
belong to some explicitly defined mailbox goes to this default mailbox.
Additionally, the server works as a backup MX for other domains which it does not
host itself, and it can never know which users exist for those domains.

2) Mail to an "unknown" user should get an RFC-compliant bounce message unless it
has a high spam score, in which case it is silently dropped.

3) sendmail does not know which users actually exist. The users are stored in a
postgres database, and changes should appear in sendmail in no time.


I guess there is a way around 1 + 3, but what about 2? My idea of blocking
the most notorious "false To: addresses" would drop these addresses without
needing to spam-check them, and save system resources.

Additionally, I'm afraid of connecting sendmail directly to postgres. I didn't
find any up-to-date tools or docs on how to do that. There are some projects at
freshmeat that are more than 3 years old, and neither in newsgroups nor on the
web is there much reference to people actually using them.

thnx for any further information,
peter

Re: big access_db?

on 16.10.2007 16:23:32 by hume.spamfilter

peter pilsl wrote:
> 2) Mail to an "unknown" user should get an RFC-compliant bounce message unless it
> has a high spam score, in which case it is silently dropped.

That's fine, but there's no real reason why YOU should necessarily need to
be the one generating that message for a remote person.

The problem occurs when you accept an email, process it, find out the user
doesn't exist, and generate a bounce... to some innocent person who had their
address spoofed at the beginning. This is known as backscatter, and to
many people is nearly as bad as generating raw spam yourself.

As someone who is trying to dig himself and his servers out from under a
long-standing legacy architecture that generates unacceptable amounts of
backscatter, I can honestly recommend that you stop as much mail at your
borders as possible, rather than taking it in and generating a bounce
yourself. If the remote end is legit, *their* mail server will generate
the bounce and nobody is harmed.

If you can block an email that would bounce anyway right at the RCPT command,
you'll save far more in spam-processing time.

> I guess there is a way around 1 + 3, but what about 2? My idea of blocking
> the most notorious "false To: addresses" would drop these addresses without
> needing to spam-check them, and save system resources.

You can either write a milter that will check against the postgres database
as needed (depending on your mail traffic, this may be severe)... or, you can
generate a job that runs - say, hourly - and builds an acceptable user
list, perhaps using Sendmail's virtusertable mechanism, and uses that.

I wouldn't file this under "trivial", but it's not something that requires
ninja-like skills, just a proper amount of development and testing.

Your blacklist is simple, and can be quickly implemented. But I'd only call
that a stopgap measure, good enough to buy you some time but not enough to
live on. With spam as it is, you'll probably find that other administrators,
and their users, are just far too hostile to unsolicited bounces.

--
Brandon Hume - hume -> BOFH.Ca, http://WWW.BOFH.Ca/

Re: big access_db?

on 16.10.2007 16:42:58 by Clemens Zauner

hume.spamfilter@bofh.ca wrote:
>> I guess there is a way around 1 + 3, but what about 2? My idea of blocking
>> the most notorious "false To: addresses" would drop these addresses without
>> needing to spam-check them, and save system resources.
>
> You can either write a milter that will check against the postgres database
> as needed (depending on your mail traffic, this may be severe)... or, you can
> generate a job that runs - say, hourly - and builds an acceptable user
> list, perhaps using Sendmail's virtusertable mechanism, and uses that.

Or he can use postgres as a backend to LDAP, interfacing sendmail->LDAP->postgres,
solving all of points 1), 2) and 3).

cu
Clemens.
--
/"\ http://czauner.onlineloop.com/
\ / ASCII RIBBON CAMPAIGN
X AGAINST HTML MAIL
/ \ AND POSTINGS

Re: big access_db?

on 16.10.2007 23:30:13 by jeff

In article <1ed91$47145948$557f53f6$1556@news.inode.at>,
peter pilsl wrote:
>How big can access_db get without affecting overall sendmail performance?

My access file has more than 15000 entries in it. I have no reason to
think that this size has any significant effect on performance.

:: Jeff Makey
jeff@sdsc.edu

Department of Tautological Pleonasms and Superfluous Redundancies Department

Re: big access_db?

on 17.10.2007 03:56:01 by Richard Rognlie

Jeff Makey wrote:
> In article <1ed91$47145948$557f53f6$1556@news.inode.at>,
> peter pilsl wrote:
>> How big can access_db get without affecting overall sendmail performance?
>
> My access file has more than 15000 entries in it. I have no reason to
> think that this size has any significant effect on performance.

As a DB, you're right. 500, 5000, 500000 entries will not impact the
performance of the MTA much. It takes the "key", performs some
operations on it, and the delta between lookups for a small DB vs. a
very large one is negligible.

However, there is a cost associated with building that DB. It can take
a non-trivial amount of time to run the makemap command to convert the
plaintext file into the DB.

But there are tricks to work around it.

e.g.

makemap hash filename < filename
temporarily LOCKS filename.db and prevents sendmail from doing lookups.

makemap hash tmp.filename < filename && mv tmp.filename.db filename.db
avoids that issue. The DB change becomes atomic.
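
The same build-elsewhere-then-rename idea, sketched in Python for the plain-text source file. This is only an illustration of the pattern: `atomic_write` is a made-up helper name, and the makemap run itself is not shown. It relies on the rename being atomic when the temporary file lives in the same directory as the target:

```python
import os
import tempfile


def atomic_write(path: str, text: str) -> None:
    """Replace `path` with `text` so readers never see a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must be on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp.")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        # Swap the finished file into place in one step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Note that mkstemp creates the file with owner-only permissions, so a real job would probably need to adjust the mode before the swap.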

--
/ \__ | Richard Rognlie / Sendmail Ninja / Gamerz.NET Lackey
\__/ \ | http://www.gamerz.net/~rrognlie
/ \__/ | Creator of pbmserv@gamerz.net
\__/ | Helping reduce world productivity since 1994

how to tell sendmail which emailadresses are valid on my system? (was Re: big access_db?)

on 17.10.2007 16:12:25 by pilsl

hume.spamfilter@bofh.ca wrote:
>
> You can either write a milter that will check against the postgres database
> as needed (depending on your mail traffic, this may be severe)... or, you can
> generate a job that runs - say, hourly - and builds an acceptable user
> list, perhaps using Sendmail's virtusertable mechanism, and uses that.
>

ok - that's what I tried now, because your whole mail was very convincing to me :)

I want my sendmail to accept mail only for email addresses that actually exist
on my system.

I created a big access db that lists every single email address on my system:
To: email@adress.com OK

but this somehow completely messed up virtually everything :)
Some mails to valid addresses got the response "proper authentication required",
other mails to invalid addresses got through without causing any trouble.

So the access db is not the proper place to tell sendmail which users I have on
my system.

This leads me to the question: how can I tell sendmail which users actually
exist on my system?

At the moment, the path of a mail through my system is like this:

sendmail gets it, controlled by the access db, and then uses mailertable to
decide what to do with the mail. In mailertable, each domain hosted on my system
has an entry like:

mydomain.at procmail:/etc/procmail_rc/mail.ext

and the procmail script does the spam check and calls a perl script that finally
looks up the mailbox in the postgres database, performs a lot of other stuff
(including simple mail rules and filtering), and finally connects to the IMAP
server via LMTP and delivers the mail. If there is any error (i.e. no such user,
temporary error, whatever), this error is passed back from the perl script to
procmail and to sendmail.

virtusertable (as you recommended) is not part of my mail delivery strategy. I
would not know how to use it to tell sendmail about valid addresses on my system.

thnx for any idea,
peter


> I wouldn't file this under "trivial", but it's not something that requires
> ninja-like skills, just a proper amount of development and testing.
>
> Your blacklist is simple, and can be quickly implemented. But I'd only call
> that a stopgap measure, good enough to buy you some time but not enough to
> live on. With spam as it is, you'll probably find that other administrators,
> and their users, are just far too hostile to unsolicited bounces.
>

Re: how to tell sendmail which emailadresses are valid on my system? (was Re: big access_db?)

on 17.10.2007 17:02:30 by hume.spamfilter

peter pilsl wrote:
> virtusertable (as you recommended) is not part of my mail delivery strategy. I
> would not know how to use it to tell sendmail about valid addresses on my system.

virtusers is described in some detail at:

http://www.sendmail.org/m4/features.html

On my machines, I use virtusertables backed by LDAP. But originally it
used a regular hash db, generated from a static file pulled hourly from
lists of users on several machines.

In my sendmail.mc, I had something like:

define(`_VIRTUSER_STOP_ONE_LEVEL_RECURSION_')dnl
VIRTUSER_DOMAIN_FILE(`/etc/mail/virtdomains.txt')dnl
FEATURE(virtusertable, `hash /etc/mail/virtusers')dnl


"virtdomains.txt" contains a list of domains that should be checked
against virtusers. virtusers is a map just like your access db (although
perhaps you're using a dbm or something).

_VIRTUSER_STOP_ONE_LEVEL_RECURSION_ lets me do something like:

hume.spamfilter@bofh.ca hume.spamfilter@bofh.ca

... without causing "excessive recursion" errors on lookup.

virtusers is just like an aliases file, except it aliases entire domains.
Using what I've shown above, it won't do anything. Until, of course, I
add a line like this:

@bofh.ca error:5.7.1:550 No such bofh.ca address.

If it can't find any "user@domain" entries in the file, it'll try one last
time for "@domain". And that's where you put your "go away" error. You
could also theoretically put your own email address there and then you'd
receive every single email to *@mydomain.at that didn't explicitly go to
someone else. You'd have to really be into pain to turn something like
that on, though.

So, with an /etc/mail/virtusers file that looks like:

whoever@bofh.ca hume.spamfilter@bofh.ca
hume.spamfilter@bofh.ca hume.spamfilter@bofh.ca
@bofh.ca error:5.7.1:550 No such bofh.ca address.


... mail to whoever@bofh.ca would be aliased to hume.spamfilter. Mail
to hume.spamfilter would go through. Mail to anything else @bofh.ca
would be told to go away.

This is only applied to the addresses. Your mailertable file is used the
way it always was.
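
A rough sketch of how the hourly job could produce such a file, assuming the list of valid addresses has already been pulled from wherever the users live (the database query is not shown, and `virtusertable_lines` is a name made up for this example). Each address maps to itself, which relies on _VIRTUSER_STOP_ONE_LEVEL_RECURSION_ as described above, and every domain gets a catch-all error line:

```python
from typing import Dict, Iterable, List, Set


def virtusertable_lines(addresses: Iterable[str],
                        reject_text: str = "No such address.") -> List[str]:
    """Build virtusertable source lines: one self-mapping per valid address,
    followed by an @domain catch-all that rejects everything else."""
    by_domain: Dict[str, Set[str]] = {}
    for address in addresses:
        address = address.strip().lower()
        local, sep, domain = address.rpartition("@")
        if not (local and sep and domain):
            continue  # skip anything that is not user@domain
        by_domain.setdefault(domain, set()).add(address)

    lines: List[str] = []
    for domain in sorted(by_domain):
        for address in sorted(by_domain[domain]):
            lines.append(f"{address}\t{address}")
        lines.append(f"@{domain}\terror:5.7.1:550 {reject_text}")
    return lines


if __name__ == "__main__":
    print("\n".join(virtusertable_lines(["whoever@bofh.ca", "real1@bofh.ca"])))
```

A domain with no valid addresses at all never appears in the output, so a real job would want to decide explicitly what should happen in that case.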


This is the way I'm using virtusers to "whitelist" addresses on my
front-line servers (which feed to the antispam/antivirus appliances,
which then hand off to the internal mail relays). It works quite well,
and without it, we'd probably have to double the coin spent on the AV
machines. Most complaints stem from the lag-time between an admin on an
inner-server adding a user and the time before the outer mail machines
are willing to accept mail for the address, but that's something I'm
cutting down with the move to LDAP (the AV boxes complicate the deployment
of something like milter-ahead).

Other admins on this group might have useful suggestions as well.

--
Brandon Hume - hume -> BOFH.Ca, http://WWW.BOFH.Ca/

Re: big access_db?

on 17.10.2007 17:23:14 by Michael Heiming

In comp.mail.sendmail peter pilsl :
> Kees Theunissen wrote:

>> How about only accepting messages for users/aliases that *do* exist?

> While this has clear advantages, it also has some disadvantages:

> 1) Some domains like to have default mailboxes, where every mail that does not
> belong to some explicitly defined mailbox goes to this default mailbox.
> Additionally, the server works as a backup MX for other domains which it does not
> host itself, and it can never know which users exist for those domains.

> 2) Mail to an "unknown" user should get an RFC-compliant bounce message unless it
> has a high spam score, in which case it is silently dropped.

No, they shouldn't. They should get a bounce from their
own MTA, because you rejected the mail at SMTP time, in an ideal world.

> 3) sendmail does not know which users actually exist. The users are stored in a
> postgres database, and changes should appear in sendmail in no time.

> I guess there is a way around 1 + 3, but what about 2? My idea of blocking
> the most notorious "false To: addresses" would drop these addresses without
> needing to spam-check them, and save system resources.

What about starting to check for likely spam during SMTP connect,
before the greeting even happens? Though I dunno off-hand how to do
this with sendmail, I suppose it isn't impossible. That has
lowered the need to fire off SA by orders of magnitude. SA was
taking serious amounts of memory already and started screaming
for a more clever solution or simply tons of additional memory.

Actually, sending bounces is quite bad. I was looking into using a
spamtrap domain to automatically pick up candidates for a local
RBL. But alas, quite a few valid MTAs are sending backscatter to
this domain, so it couldn't be used at all. Lately there is quite
a lot of backscatter from .ru MTAs, for whatever reason.

[..]

--
Michael Heiming (X-PGP-Sig > GPG-Key ID: EDD27B94)
mail: echo zvpunry@urvzvat.qr | perl -pe 'y/a-z/n-za-m/'
#bofh excuse 296: The hardware bus needs a new token.

Re: big access_db?

on 18.10.2007 09:47:06 by pilsl

hume.spamfilter@bofh.ca wrote:
>
> As someone who is trying to dig himself and his servers out from under a
> long-standing legacy architecture that generates unacceptable amounts of
> backscatter, I can honestly recommend that you stop as much mail at your
> borders as possible, rather than taking it in and generating a bounce
> yourself. If the remote end is legit, *their* mail server will generate
> the bounce and nobody is harmed.
>
> If you can block an email that would bounce anyway right at the RCPT command,
> you'll save far more in spam-processing time.
>

After playing around with access db and virtusertable and everything else, I came
to the conclusion that I need to address the problem at a different level.

I use my own mailer that processes and delivers the mail. This mailer should by
definition be the instance that checks whether a user exists or not.

So I now wrote a small test mailer (named goldfisch) in perl that's invoked by
mailertable for a test domain and leads to a perl script.

mailer.goldfisch.at goldfisch:standard:67

The last argument is the code my test mailer should return: in this case it's 67,
NO SUCH USER


It works - a mail sent to this domain is handed to my mailer and in the logs
the final result is:

Oct 18 09:31:18 goldfisch sendmail[1517]: l9I7VHa1001514:
to=, delay=00:00:00, xdelay=00:00:00,
mailer=goldfisch, pri=31768, relay=standard:67, dsn=5.1.1, stat=User unknown

Nevertheless a backscatter mail is created!

My question now is:
* can I avoid creating a backscatter mail using the above strategy?
* or is my strategy flawed, because the mailer is far too late to stop the mail at
the ENTRANCE? Meaning: as soon as the mail reaches the mailer itself, it will
produce backscatter anyway?

In the second case the only possibility would be to split my mailer into two parts:

1) a milter that checks whether the user exists and whether it's spam or a virus,
and then rejects the mail - which would then hopefully NOT create a backscatter mail
2) the final mailer, which simply delivers the mail and doesn't need to do any
spam check ... because the milter has already done this.

I wonder if all this would be easier with a different mailer like procmail or
qmail. No offense meant - I have used sendmail for many years now and it has done
a wonderful job for me, but all these new demands from spam are hard to meet if
one actually has work to do besides that ;)

thnx
peter

Re: big access_db?

on 18.10.2007 12:25:23 by hume.spamfilter

peter pilsl wrote:
> After playing around with access db and virtusertable and everything else, I came
> to the conclusion that I need to address the problem at a different level.

Are you able to post your .mc file?

--
Brandon Hume - hume -> BOFH.Ca, http://WWW.BOFH.Ca/

avoiding backscatter (was: big access_db?)

on 18.10.2007 16:53:35 by Tilman Schmidt

peter pilsl wrote:
>
> After playing around with access db and virtusertable and everything else, I came
> to the conclusion that I need to address the problem at a different level.
>
> I use my own mailer that processes and delivers the mail. This mailer should by
> definition be the instance that checks whether a user exists or not.

That's too late. The mailer only gets the mail after the SMTP session has
completed, so you won't be able to send the "User unknown" reply directly
and have to generate a bounce message instead -> backscatter.

> It works - a mail sent to this domain is handed to my mailer and in the logs
> the final result is:
>
> Oct 18 09:31:18 goldfisch sendmail[1517]: l9I7VHa1001514:
> to=, delay=00:00:00, xdelay=00:00:00,
> mailer=goldfisch, pri=31768, relay=standard:67, dsn=5.1.1, stat=User unknown

Yes, but it's too late. The "to=" in this log entry shows that by the time
it is generated, Sendmail has already accepted the mail and is now trying
to pass it on.

> * or is my strategy flawed, because the mailer is far too late to stop the mail at
> the ENTRANCE? Meaning: as soon as the mail reaches the mailer itself, it will
> produce backscatter anyway?

Exactly.

> In the second case the only possibility would be to split my mailer into two parts:
>
> 1) a milter that checks whether the user exists and whether it's spam or a virus,
> and then rejects the mail - which would then hopefully NOT create a backscatter mail
> 2) the final mailer, which simply delivers the mail and doesn't need to do any
> spam check ... because the milter has already done this.

That would be a better solution, yes. Indeed, if you reject mail in a milter,
no backscatter is generated (except possibly by the sending server, but that's
beyond your influence anyway). In fact, the first half of step 1 (checking if
the user exists) needn't even be done in a milter. The oft-cited "LDAP routing
without LDAP" solution comes to mind. It all depends on where your user
database resides and how you can query it in real time.

Another popular solution is to check for non-existing users in step 1 (during
the SMTP connection) but for spam only in step 2 (after the mail is accepted).
That way you avoid backscatter for unknown recipients, but still have the
problem of what to do with detected spam. Typical solutions involve
quarantining it in a "spam suspect" folder.
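
The two-stage split can be sketched as two independent decisions. This is only an illustration: the reply strings, the folder names and the score threshold of 5.0 are assumptions for the example, not values taken from this thread or from any particular filter:

```python
def rcpt_reply(recipient_exists: bool) -> str:
    """Stage 1, during the SMTP session: answer the RCPT command directly,
    so the sending server (not we) has to deal with the failure."""
    if recipient_exists:
        return "250 2.1.5 Recipient ok"
    return "550 5.1.1 User unknown"


def delivery_folder(spam_score: float, threshold: float = 5.0) -> str:
    """Stage 2, after the mail was accepted: never bounce, only choose
    where the message is filed."""
    return "spam-suspect" if spam_score >= threshold else "INBOX"


if __name__ == "__main__":
    print(rcpt_reply(False))
    print(delivery_folder(7.2))
```

The point of the split is that stage 2 has no outcome that generates a message to the (possibly forged) sender.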

> I wonder if all this would be easier with a different mailer like procmail or
> qmail.

You are using the word "mailer" in a very ambiguous way. There are two
distinct functions involved:
- the Mail Transfer Agent (MTA), currently Sendmail; possible replacements
for this are qmail (although I wouldn't recommend it), Postfix, Exim etc.
- the Mail Delivery Agent (MDA), currently "your own mailer", which could
be replaced by Procmail.
Back to your question: I do not think your task would become any easier
by switching to a different MTA and/or MDA.

--
Please excuse my bad English/German/French/Greek/Cantonese/Klingon/...

Re: big access_db?

on 19.10.2007 11:20:06 by pilsl

hume.spamfilter@bofh.ca wrote:
> peter pilsl wrote:
>> After playing around with access db and virtusertable and everything else, I came
>> to the conclusion that I need to address the problem at a different level.
>
> Are you able to post your .mc file?
>

here you are. thnx
peter


######################
#
# peter 2007
#
# sendmailfile version goldfisch v7
#
#################


VERSIONID(`peter.goldfisch v7.0')
OSTYPE(linux)

define(`STATUS_FILE', `/etc/mail/statistics')

FEATURE(`no_default_msa')
DAEMON_OPTIONS(`Port=smtp, Name=MTA_l, Address=127.0.0.1')
DAEMON_OPTIONS(`Port=smtp, Name=MTA_138, Address=62.99.149.138')
DAEMON_OPTIONS(`Port=smtp, Name=MTA_140, Address=62.99.149.140')
DAEMON_OPTIONS(`Port=587, Name=MSA_lo, M=E, Address=127.0.0.1')

# aliasing
define(`ALIAS_FILE',`/etc/mail/aliases')

# class{G}
#
GENERICS_DOMAIN_FILE(/etc/mail/genericsdomain)

# alter sender name/domain that is in class{G}
# example : root@goldfisch.at sepp@jans.it
FEATURE(genericstable, hash /etc/mail/genericstable)

# mailertable allows handling of mails using different mailers on a
# per-domain basis
FEATURE(`mailertable', hash /etc/mail/mailertable)

# allows different handling of mails based on email address
FEATURE(`virtusertable',hash /etc/mail/virtusertable)

# allows defining permissions on a per-net or per-host basis, mainly for relaying
FEATURE(access_db, hash -T /etc/mail/access)

FEATURE(`blacklist_recipients')

# this makes sendmail use local-host-names, which defines all domains that should
# be delivered locally
# (virtusertable is only for locally delivered mails !!)
# these domains and all local IPs and their reverse-resolved hostnames form class{w}
#
FEATURE(`use_cw_file')

# this adds all users in /etc/mail/trusted-users to the trusted-user "group"
# that can send mails under a different
# sender name (-f flag in sendmail)
FEATURE(`use_ct_file')

FEATURE(local_procmail, /usr/bin/procmail)

FEATURE(`dnsbl')

define(`confDONT_PROBE_INTERFACES',`True')


TRUST_AUTH_MECH(`PLAIN LOGIN DIGEST-MD5 CRAM-MD5')
define(`confAUTH_MECHANISMS', `PLAIN LOGIN CRAM-MD5 GSSAPI DIGEST-MD5')

define(`confCACERT_PATH', `/data/ssl/peter')
define(`confCACERT', `/data/ssl/peter/ca.crt')
define(`confSERVER_CERT', `/data/ssl/peter/smtp.goldfisch.at.crt')
define(`confSERVER_KEY', `/data/ssl/peter/smtp.goldfisch.at.key')


INPUT_MAIL_FILTER(`clmilter',`S=local:/var/run/clamav/clamdmilter.sock,F=, T=S:4m;R:4m')
INPUT_MAIL_FILTER(`goldmilter',`S=local:/var/run/goldmilter.sock,F=, T=S:4m;R:4m')
define(`confINPUT_MAIL_FILTERS', `clmilter,goldmilter')

MAILER(smtp)
MAILER(cyrus)
MAILER(local)
MAILER(procmail)
MAILER(goldfisch)

Use milter only for incoming mails (Re: avoiding backscatter)

on 19.10.2007 11:36:52 by pilsl

Tilman Schmidt wrote:

>
> Another popular solution is to check for non-existing users in step 1 (during
> the SMTP connection) but for spam only in step 2 (after the mail is accepted).
> That way you avoid backscatter for unknown recipients, but still have the
> problem of what to do with detected spam. Typical solutions involve
> quarantining it in a "spam suspect" folder.
>

ok - I tried this one by writing a small milter which does exactly this. It was
actually big fun to write my first milter and I actually succeeded - somehow :)
The milter checks in every envrcpt callback whether the recipient exists on the
system. This works fine for incoming mails, but the milter also checks outgoing
mails and relayed mails, which makes it impossible to send mails out or to use the
server as an SMTP server for the clients.

I don't know now whether this is a minor problem that can easily be solved if one
knows how, or whether this is really a big thing I should have thought about before.

I checked the milter config in sendmail.mc, but there is no way to tell sendmail
that it should only check incoming mails. I also didn't find a hint in the
milter API docs on how to determine whether a mail is incoming, outgoing or relayed.
According to http://www.milter.org/archives/000647.php (discussion in 2003)
this problem is not trivial !!

So I'm somewhat stuck at this point. I actually also don't know if
this newsgroup is the correct place for milter-specific questions (or if I
should at least finally start a new thread about this topic).
On the other hand, this is one of the most interesting usenet conversations I've
had in years, because I'm getting a lot of very useful information and insights at
a high level from all of you. Thnx a lot for helping me here and sharing your
knowledge with me.

I actually wonder why there is not more information about milters.

peter

Re: Use milter only for incoming mails (Re: avoiding backscatter)

on 19.10.2007 13:35:53 by pilsl

peter pilsl wrote:
>
> I checked the milter config in sendmail.mc, but there is no way to tell sendmail
> that it should only check incoming mails. I also didn't find a hint in the
> milter API docs on how to determine whether a mail is incoming, outgoing or relayed.
> According to http://www.milter.org/archives/000647.php (discussion in 2003)
> this problem is not trivial !!
>
>

It seems there are ways to determine the type of mail:

the milter API provides a getsymval method to get the values of several macros.
Some of them seem to be reliable and not open to manipulation by spammers.

* envfrom provides AUTH_TYPE to detect clients relaying via SMTP AUTH

* helo provides DAEMON_NAME to distinguish mails coming in on "inside interfaces"
(lo, lan) from those coming from outside (wan) (assuming one has defined different
daemons in sendmail.mc as I do - see below).

* envrcpt provides rcpt_mailer and rcpt_host. If one uses mailertable and
homemade mailers, these two values are unique to the local system and may be used
to identify incoming mails.


i.e.: (process number: callback: macro name : VALUE)

30953: envrcpt: rcpt_mailer : procmail
30953: envrcpt: rcpt_host : /etc/procmail_rc/mail.ext
or
30957: envrcpt: rcpt_mailer : goldfisch
30957: envrcpt: rcpt_host : standard:67

vs
30958: envrcpt: rcpt_mailer : esmtp
30958: envrcpt: rcpt_host : gmail.com

IMHO rcpt_mailer and rcpt_host are unfakeable values that identify incoming
mails 100%. Is this assumption correct?

Additionally, daemon_name and auth_type are unfakeable values to identify
trustworthy and untrustworthy emails (assuming all my users are
trustworthy), because emails with SMTP AUTH or from a local/inside interface are
100% from my users. Is this assumption correct as well?
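
The classification idea could be sketched like this. It is a hedged illustration only: the `macros` dict stands in for values a milter would have collected via getsymval in its callbacks, the function name `classify` and the category labels are made up, and whether these macros are really trustworthy is exactly the open question above:

```python
from typing import Iterable, Mapping


def classify(macros: Mapping[str, str],
             local_mailers: Iterable[str],
             inside_daemons: Iterable[str]) -> str:
    """Guess the direction of a message from macro values seen so far."""
    # A non-empty auth_type means the client used SMTP AUTH.
    if macros.get("auth_type"):
        return "authenticated"
    # Connections accepted by a daemon bound to lo/lan count as inside traffic.
    if macros.get("daemon_name") in set(inside_daemons):
        return "inside"
    # Recipients routed to a local delivery mailer are incoming mail.
    if macros.get("rcpt_mailer") in set(local_mailers):
        return "incoming"
    return "outgoing-or-relay"


if __name__ == "__main__":
    local = ["procmail", "goldfisch"]
    inside = ["MTA_l", "MSA_lo"]
    print(classify({"daemon_name": "MTA_138", "rcpt_mailer": "procmail"}, local, inside))
    print(classify({"daemon_name": "MTA_138", "rcpt_mailer": "esmtp"}, local, inside))
```

With a split like this, the recipient check would only be applied to the "incoming" case, leaving authenticated and inside clients free to send anywhere.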


thnx for any comment,
peter

ps:
the daemons in my sendmail.mc:

FEATURE(`no_default_msa')
DAEMON_OPTIONS(`Port=smtp, Name=MTA_l, Address=127.0.0.1')
DAEMON_OPTIONS(`Port=smtp, Name=MTA_138, Address=62.99.149.138')
DAEMON_OPTIONS(`Port=smtp, Name=MTA_140, Address=62.99.149.140')
DAEMON_OPTIONS(`Port=587, Name=MSA_lo, M=E, Address=127.0.0.1')

The daemon for the LAN is not activated at the moment, but it would read like:
DAEMON_OPTIONS(`Port=smtp, Name=MTA_lan, Address=10.21.0.1')

Re: big access_db?

on 19.10.2007 14:29:46 by hume.spamfilter

peter pilsl wrote:
> FEATURE(`virtusertable',hash /etc/mail/virtusertable)

I'd throw an explicit:

define(`_VIRTUSER_STOP_ONE_LEVEL_RECURSION_')dnl
VIRTUSER_DOMAIN_FILE(`/etc/mail/virtdomains.txt')dnl

(I like things spelled out...)

But overall I don't see anything blatantly wrong with that. The
virtuser stuff should happen long before procmail enters the picture.

Keep in mind that you don't need to break email for your users while
testing. You can generate a sendmail-tmp.cf, and test it from the
command line using:

sendmail -bt -C ./sendmail-tmp.cf

Then you can do stuff like:

ADDRESS TEST MODE (ruleset 3 NOT automatically invoked)
Enter <ruleset> <address>


> 3,0 fakeaddress@bofh.ca
canonify input: fakeaddress @ bofh . ca
Canonify2 input: fakeaddress < @ bofh . ca >
Canonify2 returns: fakeaddress < @ bofh . ca . >
canonify returns: fakeaddress < @ bofh . ca . >
parse input: fakeaddress < @ bofh . ca . >
Parse0 input: fakeaddress < @ bofh . ca . >
Parse0 returns: fakeaddress < @ bofh . ca . >
ParseLocal input: fakeaddress < @ bofh . ca . >
ParseLocal returns: fakeaddress < @ bofh . ca . >
Parse1 input: fakeaddress < @ bofh . ca . >
final input: fakeaddress < @ bofh . ca . >
final returns: fakeaddress @ bofh . ca
Parse1 returns: $# error $@ 5 . 7 . 1 $: 550 No such bofh . ca address .
parse returns: $# error $@ 5 . 7 . 1 $: 550 No such bofh . ca address .

... whereas a "real" address should get something like:

> 3,0 realaddr@bofh.ca
canonify input: realaddr @ bofh . ca
Canonify2 input: realaddr < @ bofh . ca >
Canonify2 returns: realaddr < @ bofh . ca . >
canonify returns: realaddr < @ bofh . ca . >
parse input: realaddr < @ bofh . ca . >
Parse0 input: realaddr < @ bofh . ca . >
Parse0 returns: realaddr < @ bofh . ca . >
ParseLocal input: realaddr < @ bofh . ca . >
ParseLocal returns: realaddr < @ bofh . ca . >
Parse1 input: realaddr < @ bofh . ca . >
final input: realaddr < @ bofh . ca . >
final returns: realaddr @ bofh . ca
canonify input: realaddr @ bofh . ca
Canonify2 input: realaddr < @ bofh . ca >
Canonify2 returns: realaddr < @ bofh . ca . >
canonify returns: realaddr < @ bofh . ca . >
Parse0 input: realaddr < @ bofh . ca . >
Parse0 returns: realaddr < @ bofh . ca . >
ParseLocal input: realaddr < @ bofh . ca . >
ParseLocal returns: realaddr < @ bofh . ca . >
Mailertable input: < bofh . ca > realaddr < @ bofh . ca . >
Mailertable input: bofh . < ca > realaddr < @ bofh . ca . >
Mailertable returns: realaddr < @ bofh . ca . >
Mailertable returns: realaddr < @ bofh . ca . >
MailerToTriple input: < kil-av-3 . ucis . dal . ca > realaddr < @ bofh . ca . >
MailerToTriple returns: $# relay $@ kil-av-3 . ucis . dal . ca $: realaddr < @ bofh . ca . >
Parse1 returns: $# relay $@ kil-av-3 . ucis . dal . ca $: realaddr < @ bofh . ca . >
parse returns: $# relay $@ kil-av-3 . ucis . dal . ca $: realaddr < @ bofh . ca . >

--
Brandon Hume - hume -> BOFH.Ca, http://WWW.BOFH.Ca/

Re: big access_db?

on 19.10.2007 14:37:27 by hume.spamfilter

I dug up an old FEATURE I wrote long ago. It builds a simple whitelist
used during check_rcpt. You'll probably have to adjust some paths in
there, and I'll be the first to admit that this probably isn't classy rule
writing. It's extremely cautious... if the whitelist db isn't available for
any reason, it'll allow everything through.

You can test this (as I mentioned before) in sendmail -bt with:
check_rcpt some@address

/local/mail/whitelist_addresses is a hash file built from just a plain
list of valid addresses, i.e.:

real1@bofh.ca
real2@bofh.ca
....

And /local/mail/whitelist_domains.txt is just a list of domains to
whitelist, ie:

bofh.ca
bofh2.ca

The domains file, being a class, is only read at sendmail restart.

#---------------------------------------------------------------------------

LOCAL_CONFIG

F{whitelist}/local/mail/whitelist_domains.txt
Kwhiteaddrs hash -o -T /local/mail/whitelist_addresses
Kwlsyslog syslog

LOCAL_RULESETS

############################################################################
# Whitelist check_rcpt
# Arguments:
# Returns: OK, REJECT, or DISCARD. We don't bother with ERROR.

SLocal_check_rcpt
R$* $: <$1>
R<<$*>> <$1>
R<$*@$+> $1<@$2> # Separate out the domain
R$+<@$+> $: <$1@$2>$2
R<$*>$={whitelist} $: <$1>.DOIT
R<$*>.DOIT $: <$1> . $>"Do_Whitelisting" $1
R<$*>.$@ $: <$1> $(wlsyslog "Local_check_rcpt: Do_Whitelisting returned nothing!" $)
R<$*>.REJECT $#error $@ 5.7.1 $: "550 Invalid recipient"
R<$*>.DISCARD $#discard $: discard
R<$*>.OK $: <$1>

###################################################
# Helper function;
# Arguments: user@host
# Returns: OK, REJECT, or DISCARD.
# We have to be careful of certain types of email addresses,
# like '+' addrs; user+detail@dal.ca, etc.

SDo_Whitelisting
R$* $: < $(whiteaddrs $1 $: ? $) > <$1>
R<$* <TMPF>> <$*> $@ OK # OK on map failure
R<$*> $: $1 # Mark an "unfound" error.
R<$+><$*> $@ $1 # We got a value from the map, use it.
R$* $: <><$1>
R<$*><$+ + $+ @ $+> <$1 + $3><$2 @ $4> # Strip out details
R<><$+> $@ REJECT # This is done so we don't needlessly recurse
R<$*><$*> $@ $>"Do_Whitelisting" $2 # Recurse with the "detail"-less address.
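For readers who don't speak ruleset, here is a rough Python model of what the rules above appear to do. It is only a sketch and not the poster's code: the function names are made up, the map is modeled as a dict whose values are the verdicts (OK or DISCARD), an unavailable map is modeled as None, and keys are assumed to be matched case-insensitively.

```python
from typing import Mapping, Optional, Set


def strip_detail(address: str) -> Optional[str]:
    """Return user@domain for user+detail@domain, or None if there is no detail."""
    local, at, domain = address.rpartition("@")
    if not at:
        return None
    user, plus, detail = local.partition("+")
    # Mirror the "$+ + $+ @ $+" pattern: user and detail must both be non-empty.
    if not plus or not user or not detail:
        return None
    return f"{user}@{domain}"


def do_whitelisting(address: str, whiteaddrs: Mapping[str, str]) -> str:
    """Look the address up; on a miss, retry without '+detail', else REJECT."""
    addr = address.lower()
    while True:
        verdict = whiteaddrs.get(addr)
        if verdict:
            return verdict  # the map supplied a verdict, use it
        stripped = strip_detail(addr)
        if stripped is None:
            return "REJECT"  # nothing left to strip, unknown recipient
        addr = stripped


def local_check_rcpt(
    address: str,
    whitelist_domains: Set[str],
    whiteaddrs: Optional[Mapping[str, str]],
) -> Optional[str]:
    """Return OK, REJECT or DISCARD, or None when the domain is not ours to check."""
    domain = address.rpartition("@")[2].lower()
    if domain not in whitelist_domains:
        return None  # no decision, the other checks carry on
    if whiteaddrs is None:
        return "OK"  # map unavailable: be cautious and let it through
    return do_whitelisting(address, whiteaddrs)
```

With a map holding only real1@bofh.ca, both real1@bofh.ca and real1+lists@bofh.ca come back OK, while nosuch@bofh.ca is rejected and anything outside the whitelisted domains is left undecided.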

Re: how to tell sendmail which emailadresses are valid on my system? (was Re: big access_db?)

am 20.10.2007 01:16:23 von per

In article peter pilsl
writes:
>hume.spamfilter@bofh.ca wrote:
>>
>> You can either write a milter that will check against the postgres database
>> as needed (depending on your mail traffic, this may be severe)... or, you can
>> generate a job that runs - say, hourly - and builds an acceptable user
>> list, perhaps using Sendmail's virtusertable mechanism, and uses that.
>>
>
>ok - thats what I tried now, cause your whole mail was very convincing to me :)
>
>I want only mail accepted by my sendmail to emailadresses that actually exists
>on my system.
>
>I created a big access-db that lists every single emailadress on my system:
>To: email@adress.com OK

There's nothing really wrong with any of the other suggestions discussed
in this thread, but I'd say that you were on the right track *for your
situation* with access db in the first place. However an entry like that
has no function other than overriding something like 'To:address.com
REJECT', which in turn has no effect unless you use blacklist_recipients
(which you do). The major problem is that now you are not allowing
*relaying* to the users in the address.com domain, which you need due to
the mailertable->procmail setup - even if the delivery is actually
local, sendmail has no knowledge of that and considers the delivery to
be relaying.

But there is a very nice fix for that - as of sendmail 8.14.0, you can
allow relaying *per user* in access db - i.e. you put in entries like

To:email@address.com RELAY

- with nothing at all for 'address.com' itself, and blacklist_recipients
is irrelevant here (if you don't use it for other purposes, take it out
since it causes extra lookups). *However*, this requires that you add an
option to the FEATURE line for access_db, see cf/README:

`relaytofulladdress' enable entries of the form
To:user@example.com RELAY
to allow relaying to just a specific
e-mail address instead of an entire domain.

(this should be the third argument, after your filename spec). If you're
not running 8.14.x yet and don't have time to upgrade at the moment,
this is actually available as an undocumented feature since 8.13.0, but
then you instead need to use

define(`_RELAY_FULL_ADDR_', `1')

in your .mc file (and this method happens to work in 8.14.x too so far,
but shouldn't really be relied on there either).
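Since the valid addresses live in a database anyway, the access entries could be generated by a periodic job. A minimal sketch, assuming the addresses arrive as plain strings; the function name access_entries is hypothetical and the only format taken from above is the 'To:address RELAY' line:

```python
from typing import Iterable, List


def access_entries(addresses: Iterable[str], action: str = "RELAY") -> List[str]:
    """Build sorted, de-duplicated 'To:user@domain<TAB>ACTION' lines."""
    seen = set()
    for raw in addresses:
        addr = raw.strip().lower()
        # Skip blanks, comment lines and anything that is not a full address.
        if not addr or addr.startswith("#") or "@" not in addr:
            continue
        seen.add(addr)
    return [f"To:{addr}\t{action}" for addr in sorted(seen)]


if __name__ == "__main__":
    for line in access_entries(["Email@Address.com", "other@address.com"]):
        print(line)
```

The output would be merged into the access source file and the map rebuilt with makemap as usual.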

>but this somehow completely messed up virtually everything :)
>Some mails to valid adresses got the response "proper authentication required",
>other mails to invalid adresses got through without causing any troubles.

The reason for the former is above, I'm not sure off-hand about the
reason for the latter, but maybe it's not worth spending too much time
figuring that out, since the method can't work anyway.

>So the access-db is not the proper place to tell sendmail which users I have on
>my system.

Sometimes it is, at least nowadays... Or rather, it can be the way to
tell sendmail which users you have that *aren't* on your system, at
least as far as sendmail can see :-) With a "traditional" setup with
users in the passwd file, sendmail of course "knows" without being told
anything. The above method also works well for the case where the
users *really* aren't on your system, and sendmail is just relaying
their mail to other hosts.

--Per Hedeland
per@hedeland.org

Re: big access_db?

am 20.10.2007 02:09:01 von per

In article hume.spamfilter@bofh.ca writes:
>peter pilsl wrote:
>> FEATURE(`virtusertable',hash /etc/mail/virtusertable)
>
>I'd throw an explicit:
>
>define(`_VIRTUSER_STOP_ONE_LEVEL_RECURSION_')dnl
>VIRTUSER_DOMAIN_FILE(`/etc/mail/virtdomains.txt')dnl
>
>(I like things spelled out...)

Hm, that isn't "spelling out", those entries make significant semantic
changes to the functionality (the second line only if you actually have
something in the file, of course).

--Per Hedeland
per@hedeland.org

Re: big access_db?

am 23.10.2007 18:46:10 von pilsl

peter pilsl wrote:
>
> Background of my question is, that - like everyone else - my system suffers many
> spammails to users that dont even exist. Nevertheless each of this mails is
> spamchecked and virustested before it gets discarded or dumped. (depending on
> the spamstatus a "no such user"-mail is sent or not). So most of my cpu-cycles
> for spamfighting is wasted in adresses that doesnt even exist :(
>

everyone here for the insights and ideas. I finally managed to create a milter
that does exactly what I wanted: it checks whether the recipient exists on the
envrcpt-call and returns a nice 550 if the user doesn't. Backscatter is therefore
banned from my system.
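The decision such a milter makes per recipient can be sketched in a few lines. This is not the Perl code mentioned here, just a Python illustration; the function name rcpt_reply, the reply text and the set of valid addresses are assumptions, and the real thing would make this check inside the milter's envrcpt callback.

```python
from typing import Optional, Set, Tuple


def rcpt_reply(rcpt: str, valid: Set[str]) -> Optional[Tuple[str, str, str]]:
    """Return None to accept the recipient, or an SMTP reply triple to reject it."""
    # Recipients usually arrive wrapped in angle brackets, e.g. "<user@example.com>".
    addr = rcpt.strip().strip("<>").lower()
    if addr in valid:
        return None
    # Rejecting at RCPT time means no bounce is generated later (no backscatter).
    return ("550", "5.1.1", "User unknown")
```

Rejecting during the SMTP dialogue leaves the bounce to the sending server, so nothing is spam-checked or virus-scanned for addresses that don't exist.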

If someone finds this thread via google later: I wrote the milter in perl using
Sendmail::PMilter and I hopefully will post more details soon on my homepage.
www.goldfisch.at

thnx,
peter

Re: big access_db?

am 23.10.2007 19:45:57 von pilsl

peter pilsl wrote:
>
> everyone here for the insights and ideas. I finally managed to create a milter

^^^^

it should read "great thnx to everyone here" :)

p