Emails bypassing MX priorities?

on 11.04.2008 21:25:14 by CGI-Carl

Hi,

We recently ran into a problem in our email alert system. We are
currently using Sendmail as a relay to deliver to outside domains
and, while it does so reliably, the emails sent to one particular
domain sometimes time out during delivery and show up 30
minutes later, something that we can't afford.

Here's the result of my nslookup to see the MX servers related to the
recipient's domain:

[root@sendmail /]# nslookup
> set type=MX
> pager.bellmobility.ca
Server: 192.168.248.5
Address: 192.168.248.5#53

Non-authoritative answer:
pager.bellmobility.ca mail exchanger = 5 bm.srvr.bell.ca.
pager.bellmobility.ca mail exchanger = 10 mail.txt.bell.ca.

Authoritative answers can be found from:
bellmobility.ca nameserver = ns1.mobility.com.
bellmobility.ca nameserver = ns1.bellglobal.com.
bellmobility.ca nameserver = ns2.mobility.com.
bm.srvr.bell.ca internet address = 206.47.199.69
bm.srvr.bell.ca internet address = 209.226.175.252
mail.txt.bell.ca internet address = 206.47.78.138
ns1.bellglobal.com internet address = 198.235.216.1


Now here's the part of the maillog related to a specific message that
arrived 30 minutes too late.

Apr 10 04:47:47 sendmailserver01 sendmail[29898]: m3A8jlrv029893:
to=, delay=00:02:00, xdelay=00:02:00,
mailer=esmtp, pri=120881, relay=mail.txt.bell.ca. [206.47.78.138],
dsn=4.0.0, stat=Deferred: Connection timed out with mail.txt.bell.ca.
Apr 10 05:21:53 sendmailserver01 sendmail[30730]: m3A8jlrv029893:
to=, delay=00:36:06, xdelay=00:00:16,
mailer=esmtp, pri=210881, relay=bm.srvr.bell.ca. [206.47.199.69],
dsn=2.0.0, stat=Sent (ok: Message 197821071 accepted)


What could cause Sendmail to fall back to the higher-cost MX without
even trying the primary? We are not using any domain routing, the
mailertable is empty, and our Sendmail version is 8.13.1/8.13.1
running on CentOS release 4.4. I did a fair bit of research and came
up empty-handed, so any lead would be appreciated.

Re: Emails bypassing MX priorities?

on 14.04.2008 05:23:46 by Bill Cole

In article
<86f928d1-b9c5-43f7-be0e-2d2821901a44@t54g2000hsg.googlegroups.com>,
CGI-Carl wrote:

> Hi,
>
> We recently ran into a problem in our email alert system. We are
> currently using Sendmail as a relay to deliver to outside domains
> and, while it does so reliably, the emails sent to one particular
> domain sometimes time out during delivery and show up 30
> minutes later, something that we can't afford.

Then you are mistaken about the nature of email and its applicability to
the task you have given it. Perhaps you would be better off with a
direct pager or SMS gateway of your own rather than Internet email.


> Here's the result of my nslookup to see the MX servers related to the
> recipient's domain:
>
> [root@sendmail /]# nslookup
> > set type=MX
> > pager.bellmobility.ca
> Server: 192.168.248.5
> Address: 192.168.248.5#53
>
> Non-authoritative answer:
> pager.bellmobility.ca mail exchanger = 5 bm.srvr.bell.ca.
> pager.bellmobility.ca mail exchanger = 10 mail.txt.bell.ca.
>
> Authoritative answers can be found from:
> bellmobility.ca nameserver = ns1.mobility.com.
> bellmobility.ca nameserver = ns1.bellglobal.com.
> bellmobility.ca nameserver = ns2.mobility.com.
> bm.srvr.bell.ca internet address = 206.47.199.69
> bm.srvr.bell.ca internet address = 209.226.175.252
> mail.txt.bell.ca internet address = 206.47.78.138
> ns1.bellglobal.com internet address = 198.235.216.1
>
>
> Now here's the part of the maillog related to a specific message that
> arrived 30 minutes too late.
>
> Apr 10 04:47:47 sendmailserver01 sendmail[29898]: m3A8jlrv029893:
> to=, delay=00:02:00, xdelay=00:02:00,
> mailer=esmtp, pri=120881, relay=mail.txt.bell.ca. [206.47.78.138],
> dsn=4.0.0, stat=Deferred: Connection timed out with mail.txt.bell.ca.
> Apr 10 05:21:53 sendmailserver01 sendmail[30730]: m3A8jlrv029893:
> to=, delay=00:36:06, xdelay=00:00:16,
> mailer=esmtp, pri=210881, relay=bm.srvr.bell.ca. [206.47.199.69],
> dsn=2.0.0, stat=Sent (ok: Message 197821071 accepted)
>
>
> What could cause Sendmail to fall back to the higher-cost MX without
> even trying the primary?

A cached host status could do that. If you are using host status
caching, you probably want to switch it off if immediacy is important to
you.

> We are not using any domain routing, the
> mailertable is empty, and our Sendmail version is 8.13.1/8.13.1
> running on CentOS release 4.4. I did a fair bit of research and came
> up empty-handed, so any lead would be appreciated.

--
Now where did I hide that website...

Re: Emails bypassing MX priorities?

on 14.04.2008 21:14:58 by gtaylor

On 04/13/08 22:23, Bill Cole wrote:
> A cached host status could do that. If you are using host status
> caching, you probably want to switch it off if immediacy is important
> to you.

I would also suggest that you look into using the DELIVERBY option so
you will get a notice elsewhere if there is a problem delivering the
message(s) within the time specified.



Grant. . . .

Re: Emails bypassing MX priorities?

on 14.04.2008 22:35:13 by CGI-Carl

On 04/13/08 22:23, Bill Cole wrote:
>
> > A cached host status could do that. If you are using host status
> > caching, you probably want to switch it off if immediacy is important
> > to you.

Wouldn't that cause every email directed to that domain to bounce back
if it was due to the cache? We don't use nscd on our Sendmail server.
Tracking the problem is hard since I can't reproduce the error manually:
it is a random event, and maybe 1 in 30 emails goes directly to
the secondary MX server. As for the fact that we're using email as a
warning method, unfortunately, I have to work with it for now.

The question: what other option or config could cause mail to bypass
the MX priorities and head for the secondary at random? If
the Sendmail server were contacting the primary every time, we wouldn't
be in this mess.

> I would also suggest that you look into using the DELIVERBY option so
> you will get a notice elsewhere if there is a problem delivering the
> message(s) within the time specified.

Could you please elaborate?

Re: Emails bypassing MX priorities?

on 15.04.2008 00:06:18 by gtaylor

On 04/14/08 15:35, CGI-Carl wrote:
> Could you please elaborate?

http://www.rfc-editor.org/rfc/rfc2852.txt

If memory serves, DELIVERBY is an option supported by some servers
(Sendmail and others) that instructs the system either to deliver the
message (as it normally would) within the time specified or to return
an error (DSN) to the (envelope) sender of the message indicating that
the message could not be delivered in time.

So what I was referring to was that you could send your messages from a
known email address using the DELIVERBY option, so that this email
address would receive notifications if the emails could not be
delivered within the time specified. This would at least let you know that
the message did not make it to your intended recipient(s). Hopefully
this knowledge would help you.



Grant. . . .

Re: Emails bypassing MX priorities?

on 15.04.2008 04:21:06 by Bill Cole

In article
<9c343378-9528-4acf-b1e7-3d8b4132de15@a23g2000hsc.googlegroups.com>,
CGI-Carl wrote:

> On 04/13/08 22:23, Bill Cole wrote:
> >
> > > A cached host status could do that. If you are using host status
> > > caching, you probably want to switch it off if immediacy is important
> > > to you.
>
> Wouldn't that cause every email directed to that domain to bounce back
> if it was due to the cache?

No. If Sendmail has cached the fact that a *host* is unreachable, it
will skip that host and try the secondary MX hosts for the domain, if
any exist. This applies to subsequent messages in the same queue run and
to subsequent queue runs, until that cached *host* status expires.

> We don't use nscd on our Sendmail server.

Not relevant. Host status caching isn't about DNS, it is about host
responsiveness, and it is done by Sendmail itself. If you have
HostStatusDirectory set in your sendmail.cf, you are caching host status
across queue runs. The default cache entry lifetime is 30 minutes. If
you don't have HostStatusDirectory set, individual queue runners will
still track host status for that run unless you have ForkEachJob set
(which is not generally good for performance.)
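
To make the effect concrete, here is a minimal sketch in Python of how a
host status cache can push deliveries to a secondary MX. It is a toy
model only, not Sendmail's actual code: the class and function names are
hypothetical, and the only figures taken from this thread are the MX
preferences and the 30 minute default lifetime.

```python
# Toy model (an assumption, not Sendmail's implementation): a host marked
# down is skipped until its cache entry is older than the lifetime.
CACHE_LIFETIME = 30 * 60  # seconds; the 30 minute default mentioned above


class HostStatusCache:
    def __init__(self, lifetime=CACHE_LIFETIME):
        self.lifetime = lifetime
        self._down_since = {}  # host name -> time the failure was recorded

    def mark_down(self, host, now):
        self._down_since[host] = now

    def is_down(self, host, now):
        since = self._down_since.get(host)
        if since is None:
            return False
        if now - since >= self.lifetime:
            del self._down_since[host]  # entry expired; host is eligible again
            return False
        return True


def pick_mx(mx_records, cache, now):
    """Return the lowest-preference MX host not currently cached as down.

    mx_records is a list of (preference, host) tuples. Returns None when
    every host is cached as down.
    """
    for _pref, host in sorted(mx_records):
        if not cache.is_down(host, now):
            return host
    return None


if __name__ == "__main__":
    mx = [(10, "mail.txt.bell.ca"), (5, "bm.srvr.bell.ca")]
    cache = HostStatusCache()
    print(pick_mx(mx, cache, now=0))      # primary is chosen
    cache.mark_down("bm.srvr.bell.ca", now=0)
    print(pick_mx(mx, cache, now=60))     # primary skipped, secondary chosen
    print(pick_mx(mx, cache, now=1800))   # entry expired, primary again
```

In this model a single earlier timeout against the primary is enough to
send later messages straight to the secondary for the whole lifetime of
the entry, which would look "random" from the point of view of any one
message.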

> Tracking the problem is hard since I can't reproduce the error manually:
> it is a random event, and maybe 1 in 30 emails goes directly to
> the secondary MX server. As for the fact that we're using email as a
> warning method, unfortunately, I have to work with it for now.

Look in the logs for reachability or timeout problems with the primary
MX host ahead of the problems. Those could result in a host status cache
entry, and even if you are not using the persistent cache, the status is
remembered by an individual queue run. (see the docs regarding
ForkEachJob)

> The question: what other option or config could cause mail to bypass
> the MX priorities and head for the secondary at random? If
> the Sendmail server were contacting the primary every time, we wouldn't
> be in this mess.

Nothing is random.

Beyond looking for the queue ID of a delayed message in the log, you
might also want to look for other lines mentioning the primary MX host.
Capturing a few lines before and after all of the clearly relevant lines
might illuminate the event as well.
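
A small script can do that kind of log search. The following is a rough
sketch in Python, assuming each maillog entry sits on one physical line
(the excerpts quoted in this thread are wrapped) and that fields are
comma-separated "name=value" pairs as in those excerpts. The function
names and the log path in the usage note are hypothetical.

```python
import re
from collections import deque

# "name=value" pairs as they appear in the quoted log lines. A value that
# itself contains a comma would be cut short; this is a known limitation.
FIELD_RE = re.compile(r"(\w+)=([^,]*)")


def parse_fields(line):
    """Return the name=value fields of one maillog line as a dict."""
    return {name: value.strip() for name, value in FIELD_RE.findall(line)}


def lines_with_context(lines, needle, before=3, after=3):
    """Yield (line_number, line) for lines containing needle, plus context.

    Works like grep -B/-A: up to `before` lines preceding each match and
    up to `after` lines following it are yielded as well, without repeats.
    """
    recent = deque(maxlen=before)  # lines seen but not yet emitted
    remaining_after = 0
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if needle in line:
            yield from recent      # flush the pending "before" context
            recent.clear()
            yield number, line
            remaining_after = after
        elif remaining_after > 0:
            yield number, line
            remaining_after -= 1
        else:
            recent.append((number, line))
```

For example, iterating over lines_with_context(open("/var/log/maillog"),
"bm.srvr.bell.ca") and printing parse_fields(line).get("stat") for each
hit would list every status recorded around the primary MX, including
any timeout that preceded a delayed message.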

To reduce the impact of these events, you might get some help by
shortening the time between queue runs, so that even if a message fails
on the first try and gets queued for retry (an event which is perfectly
possible without any mysterious behavior by Sendmail) it will be retried
sooner rather than later.


> > I would also suggest that you look into using the DELIVERBY option so
> > you will get a notice elsewhere if there is a problem delivering the
> > message(s) within the time specified.
>
> Could you please elaborate?

It is an SMTP extension. See RFC2852. Using it would require:

1. Modifying the originators of the messages so that they use the
extension on initial submission.
2. Ensuring that every SMTP hop on the path between the message
origin(s) and your outer border supports the extension.

There is no documented way and no way I can see from a quick search of
the Sendmail source to tell Sendmail to use DELIVERBY except via SMTP,
i.e. no command line or sendmail.cf interface. I don't think it is
likely to be your best option.
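
For illustration only, here is a minimal sketch of what requirement 1
could look like if the originating application were written in Python
and submitted over SMTP with the standard smtplib module. The helper
names, addresses, and host in the usage note are hypothetical; the
"BY=<seconds>;R" parameter follows the MAIL FROM syntax in RFC 2852,
where mode R asks for the message to be returned if the deadline cannot
be met.

```python
import smtplib


def deliverby_option(seconds, mode="R"):
    """Build the RFC 2852 MAIL FROM parameter, e.g. 'BY=120;R'.

    mode 'R' requests return of the message if it cannot be delivered in
    time; mode 'N' requests a "delayed" notification instead.
    """
    if mode not in ("R", "N"):
        raise ValueError("mode must be 'R' (return) or 'N' (notify)")
    return f"BY={int(seconds)};{mode}"


def send_with_deadline(smtp, sender, recipients, message, seconds):
    """Send through an already-connected smtplib.SMTP, requesting DELIVERBY.

    Raises RuntimeError if the server does not advertise the extension,
    since the parameter would otherwise be rejected.
    """
    smtp.ehlo_or_helo_if_needed()
    if not smtp.has_extn("deliverby"):
        raise RuntimeError("server does not advertise DELIVERBY")
    return smtp.sendmail(sender, recipients, message,
                         mail_options=[deliverby_option(seconds)])
```

A caller would open a connection with something like
smtplib.SMTP("localhost", 25) and pass it to send_with_deadline along
with the envelope sender that should receive the DSN. This only covers
the first hop; every later relay must still honour the extension.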

--
Now where did I hide that website...