"SYSERR(root): timeout writing message" after upgrade to 8.13.8

on 08.10.2006 01:46:14 by mark

Hello,

I would like to bring the following issue to your attention. I posted it
on comp.mail.sendmail a few days back, but I received no relevant answer.
But I cannot just stop upgrading sendmail as of now. So I would really
like an answer.

When I do a google on:

"timeout writing message" "comp.mail.sendmail"

I see that this seems to be a persistent issue, which cropped up as of
8.13.6 (with same-version sendmails talking to each other, it seems).


-----------------
The other day I upgraded from sendmail 8.13.5 to 8.13.8. And all of a sudden I
get messages like these:

Oct 5 11:36:19 asarian-host sendmail[6648]: k959PFdw006637: SYSERR(root):
timeout writing message to mail.asarian-host.net: Resource temporarily
unavailable

Or:

Oct 5 11:29:23 asarian-host sendmail[6522]: k959OMY8006511: SYSERR(root):
timeout writing message to mail.asarian-host.net: Broken pipe

(This is one sendmail talking to another; same version, same machine, over
a UNIX domain socket).

I wonder what could suddenly cause this. I did some googling, and there
was some mention of setting "O EightBitMode=m", but I already have that.

The "Broken pipe" is not good. Seems to clobber over itself, perhaps?
Anyway, it never did this on 8.13.5. So, if I cannot solve it, I may have
to revert. Pending that drastic decision, if anyone has any clues, I
would appreciate it if they shared them with me.

- Mark


P.S. Here is one such logged error:


Oct 5 11:25:19 asarian-host sendmail[6637]: k959PFdw006637: --- 235 2.0.0
OK Authenticated
Oct 5 11:25:19 asarian-host sendmail[6637]: k959PFdw006637: <-- MAIL
FROM:
Oct 5 11:25:19 asarian-host sendmail[6637]: k959PFdw006637: dns
asarian-host.net => mail.asarian-host.net
Oct 5 11:25:19 asarian-host sendmail[6637]: k959PFdw006637: --- 250 2.1.0
... Sender ok
Oct 5 11:25:19 asarian-host sendmail[6637]: k959PFdw006637: <-- RCPT
TO:
Oct 5 11:25:19 asarian-host sendmail[6637]: k959PFdw006637: --- 250 2.1.5
... Recipient ok
Oct 5 11:25:19 asarian-host sendmail[6637]: k959PFdw006637: <-- DATA
Oct 5 11:25:19 asarian-host sendmail[6637]: k959PFdw006637: --- 354 Enter
mail, end with "." on a line by itself
Oct 5 11:25:19 asarian-host sendmail[6637]: k959PFdw006637:
from=, size=954848, class=0, nrcpts=1,
msgid=<200610050921.k959KXbD006294@asarian-host.net>, proto=ESMTP,
daemon=client-SMTP, relay=localhost [127.0.0.1]
Oct 5 11:25:19 asarian-host sendmail[6637]: k959PFdw006637: --- 250 2.0.0
k959PFdw006637 Message accepted for delivery
Oct 5 11:25:20 asarian-host sendmail[6648]: k959PFdw006637: --- 050
... Connecting to mx1.ex.eclipse.net.uk. via
esmtp...
Oct 5 11:25:20 asarian-host sendmail[6648]: k959PFdw006637: SMTP outgoing
connect on asarian-host.net
Oct 5 11:36:19 asarian-host sendmail[6648]: k959PFdw006637: --- 451 4.4.1
timeout writing message to mail.asarian-host.net: Resource temporarily
unavailable (hold)
Oct 5 11:36:19 asarian-host sendmail[6648]: k959PFdw006637: SYSERR(root):
timeout writing message to mail.asarian-host.net: Resource temporarily
unavailable
Oct 5 11:36:19 asarian-host sendmail[6648]: k959PFdw006637: --- 050
... Connecting to mx2.ex.eclipse.net.uk. via
esmtp...
Oct 5 11:36:19 asarian-host sendmail[6648]: k959PFdw006637: SMTP outgoing
connect on asarian-host.net
Oct 5 11:36:19 asarian-host sendmail[6648]: k959PFdw006637: --- 050
... Closing connection to
mx1.ex.eclipse.net.uk.
Oct 5 11:36:35 asarian-host sendmail[6648]: k959PFdw006637: --- 050
... Sent (Ok: queued as 88963C0E4F)
Oct 5 11:36:35 asarian-host sendmail[6648]: k959PFdw006637:
to=, ctladdr=
(1813/1813), delay=00:11:16, xdelay=00:11:16, mailer=esmtp, pri=984848,
relay=mx2.ex.eclipse.net.uk. [82.153.251.2], dsn=2.0.0, stat=Sent (Ok:
queued as 88963C0E4F)
Oct 5 11:36:35 asarian-host sendmail[6648]: k959PFdw006637: done;
delay=00:11:16, ntries=1

Re: "SYSERR(root): timeout writing message" after upgrade to 8.13.8

on 08.10.2006 23:20:39 by per

In article "Mark"
writes:
>
>I would like to bring the following issue to your attention. I posted it
>on comp.mail.sendmail a few days back, but I received no relevant answer.
>But I cannot just stop upgrading sendmail as of now. So I would really
>like an answer.
>
>When I do a google on:
>
> "timeout writing message" "comp.mail.sendmail"
>
>I see that this seems to be a persistent issue, which cropped up as of
>8.13.6 (with same-version sendmails talking to each other, it seems).

I see only one report matching that description (besides yours) - there
were quite a few messages posted in the thread though, but unfortunately
the OP dropped off without reporting a resolution or even answers to
some of the questions asked.

>-----------------
>The other day I upgraded from sendmail 8.13.5 to 8.13.8. And all of a sudden I
>get messages like these:
>
>Oct 5 11:36:19 asarian-host sendmail[6648]: k959PFdw006637: SYSERR(root):
>timeout writing message to mail.asarian-host.net: Resource temporarily
>unavailable
>
>Or:
>
>Oct 5 11:29:23 asarian-host sendmail[6522]: k959OMY8006511: SYSERR(root):
>timeout writing message to mail.asarian-host.net: Broken pipe
>
>(This is one sendmail talking to another; same version, same machine, over
>a UNIX domain socket).

Huh? Using a UNIX domain socket for sendmail-sendmail SMTP would be an
extremely unusual setup - are you really doing this, and if so why?
Furthermore, your log excerpt shows (when the problem occurs)
mail.asarian-host.net connecting to mx1.ex.eclipse.net.uk, surely those
are not the same host? And mx1.ex.eclipse.net.uk doesn't seem to be
running sendmail, FWIW.

>I wonder what could suddenly cause this. I did some googling, and there
>was some mention of setting "O EightBitMode=m", but I already have that.

Not likely to be relevant. Assuming the message actually is going over
the network, did you try all the suggestions in the other thread? In
particular doing a packet trace could give useful info.

>The "Broken pipe" is not good. Seems to clobber over itself, perhaps?

This is the standard message related to the EPIPE error, which is what
you get when trying to write to a pipe or a socket when there is no
listener on the other end. I.e. in the standard SMTP case of a TCP
socket, the other end has already closed.
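
As an illustration of the EPIPE case described above, here is a minimal
Python sketch. It is not taken from sendmail, and the function name
write_after_peer_close is made up for this example. It creates a connected
Unix domain socket pair, closes one end, and then writes to the other. With
a TCP socket the first write after the peer closes may still appear to
succeed, so the pair here only shows the simplest form of the error.

```python
import errno
import socket


def write_after_peer_close() -> int:
    """Write to a stream socket whose peer has already closed; return the errno."""
    writer, reader = socket.socketpair()  # connected stream sockets (AF_UNIX on Unix)
    reader.close()                        # the "other end" goes away first
    try:
        writer.sendall(b"DATA that nobody will read\r\n")
    except OSError as exc:
        # Python ignores SIGPIPE by default, so the failure surfaces as an exception.
        return exc.errno
    finally:
        writer.close()
    return 0  # reached only if the write unexpectedly succeeded


if __name__ == "__main__":
    code = write_after_peer_close()
    print(code == errno.EPIPE, errno.errorcode.get(code))
```

On a Unix-like system the returned value should be errno.EPIPE, which is
the error that the "Broken pipe" text stands for.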

--Per Hedeland
per@hedeland.org

Re: "SYSERR(root): timeout writing message" after upgrade to 8.13.8

on 09.10.2006 10:47:45 by mark

"Per Hedeland" wrote in message
news:...

> > When I do a google on:
> >
> > "timeout writing message" "comp.mail.sendmail"
> >
> > I see that this seems to be a persistent issue, which cropped up as of
> > 8.13.6 (with same-version sendmails talking to each other, it seems).
>
> I see only one report matching that description (besides yours) - there
> were quite a few messages posted in the thread though, but unfortunately
> the OP dropped off without reporting a resolution or even answers to
> some of the questions asked.

Thank you for answering. I truly appreciate it (more below).

> > Oct 5 11:29:23 asarian-host sendmail[6522]: k959OMY8006511:
> > SYSERR(root):
> > timeout writing message to mail.asarian-host.net: Broken pipe
> >
> > (This is one sendmail talking to another; same version, same machine,
> > over a UNIX domain socket).
>
> Huh? Using a UNIX domain socket for sendmail-sendmail SMTP would be an
> extremely unusual setup - are you really doing this, and if so why?

I'm sorry; I expressed myself poorly. I meant: my first sendmail (at the
gate) uses a domain socket to talk to an SMTP agent; like so:

Mesmtp, P=[IPC], F=SDFMuXa, S=EnvFromSMTP/HdrFromSMTP, R=EnvToSMTP, ...
        T=DNS/RFC822/X-Unix,
        A=FILE /var/run/smtpd.sock

> Furthermore, your log excerpt shows (when the problem occurs)
> mail.asarian-host.net connecting to mx1.ex.eclipse.net.uk, surely those
> are not the same host?

No, they are not. The above-mentioned SMTP agent, after processing the
mail some, starts a regular SMTP connection to a second sendmail, on the
same host. This second sendmail actually delivers to the outside world:

> Oct 5 11:25:19 asarian-host sendmail[6637]: k959PFdw006637: --- 250
> 2.0.0 k959PFdw006637 Message accepted for delivery
> Oct 5 11:25:20 asarian-host sendmail[6648]: k959PFdw006637: --- 050
> ... Connecting to mx1.ex.eclipse.net.uk.
> via esmtp...

The first process (pid = 6637) has accepted the message. The second
sendmail (pid = 6648) tries outside delivery to mx1.ex.eclipse.net.uk. But
the weird part is, it actually succeeds in doing so:

> Oct 5 11:36:35 asarian-host sendmail[6648]: k959PFdw006637: --- 050
> ... Sent (Ok: queued as 88963C0E4F)

The "Broken pipe" in the log must, I presume, refer to the communication
between the first sendmail and its domain socket SMTP agent. But why that
pipe would suddenly break continues to baffle me, as does why the email
gets delivered nonetheless.

> ... did you try all the suggestions in the other thread? In
> particular doing a packet trace could give useful info.

I will try and get a packet trace off of it. The "Broken pipe" only
happens a few times a day, but I'm sure I'll catch it again. :)

At any rate, thanks again.

- Mark

Re: "SYSERR(root): timeout writing message" after upgrade to 8.13.8

on 09.10.2006 13:23:34 by Kees Theunissen

Mark wrote:
> The first process (pid = 6637) has accepted the message. The second
> sendmail (pid = 6648) tries outside delivery to mx1.ex.eclipse.net.uk. But
> the weird part is, it actually succeeds in doing so:
>
>>Oct 5 11:36:35 asarian-host sendmail[6648]: k959PFdw006637: --- 050
>>... Sent (Ok: queued as 88963C0E4F)

That's not how I read your logs.
Delivery to mx2.ex.eclipse.net.uk succeeded, not mx1.ex.eclipse.net.uk.

Oct 5 11:36:19 asarian-host sendmail[6648]: k959PFdw006637: --- 050
... Connecting to mx2.ex.eclipse.net.uk. via
esmtp...
Oct 5 11:36:19 asarian-host sendmail[6648]: k959PFdw006637: SMTP outgoing
connect on asarian-host.net
Oct 5 11:36:19 asarian-host sendmail[6648]: k959PFdw006637: --- 050
... Closing connection to
mx1.ex.eclipse.net.uk.
Oct 5 11:36:35 asarian-host sendmail[6648]: k959PFdw006637: --- 050
... Sent (Ok: queued as 88963C0E4F)
Oct 5 11:36:35 asarian-host sendmail[6648]: k959PFdw006637:
to=, ctladdr=
(1813/1813), delay=00:11:16, xdelay=00:11:16, mailer=esmtp, pri=984848,
relay=mx2.ex.eclipse.net.uk. [82.153.251.2], dsn=2.0.0, stat=Sent (Ok:
queued as 88963C0E4F)
Oct 5 11:36:35 asarian-host sendmail[6648]: k959PFdw006637: done;
delay=00:11:16, ntries=1

Re: "SYSERR(root): timeout writing message" after upgrade to 8.13.8

on 09.10.2006 13:52:46 by mark

"Kees Theunissen" wrote in message
news:8e01e$452a3138$c02a7def$22588@news1.tudelft.nl...

> That's not how I read your logs.
> Delivery to mx2.ex.eclipse.net.uk succeeded, not mx1.ex.eclipse.net.uk.

Hmm, you're right. :) Thanks.

The "broken pipe" actually came from a different part of the log. Here the
second sendmail gives up after 10 minutes (which is consistent with the
timeout I set for delivery):

Oct 5 11:25:20 asarian-host sendmail[6648]: k959PFdw006637: SMTP outgoing
connect on asarian-host.net
Oct 5 11:36:19 asarian-host sendmail[6648]: k959PFdw006637: --- 451 4.4.1
timeout writing message to mail.asarian-host.net: Resource temporarily
unavailable (hold)
Oct 5 11:36:19 asarian-host sendmail[6648]: k959PFdw006637: SYSERR(root):
timeout writing message to mail.asarian-host.net: Resource temporarily
unavailable

Why it would generate a "SYSERR(root)" for the event is still unclear,
though.
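
For reference, "Resource temporarily unavailable" is the usual system text
for the EAGAIN error, which a non-blocking write returns when the socket
buffer is full because the other end is not reading. The Python sketch
below only reproduces that errno. Whether sendmail's write path hits it in
exactly this way is an assumption, and fill_until_would_block is a made-up
name.

```python
import errno
import socket


def fill_until_would_block() -> int:
    """Write to a non-blocking socket nobody reads from; return the errno hit."""
    writer, reader = socket.socketpair()
    writer.setblocking(False)
    chunk = b"x" * 65536
    try:
        while True:
            # The reader never calls recv(), so the kernel buffers fill up
            # and send() eventually refuses to accept more data.
            writer.send(chunk)
    except BlockingIOError as exc:
        return exc.errno
    finally:
        writer.close()
        reader.close()


if __name__ == "__main__":
    code = fill_until_would_block()
    print(code in (errno.EAGAIN, errno.EWOULDBLOCK), errno.errorcode.get(code))
```

A sender that keeps retrying such a write until a timer runs out would end
up reporting a timeout together with this error text.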

I'm ktrace-ing the relevant processes now. Soon as one bums out again,
I'll hopefully have more to tell.

- Mark

Re: "SYSERR(root): timeout writing message" after upgrade to 8.13.8

on 09.10.2006 21:31:45 by per

In article "Mark"
writes:
>"Per Hedeland" wrote in message
>news:...
>
> ["Mark" wrote:]
>> > Oct 5 11:29:23 asarian-host sendmail[6522]: k959OMY8006511:
>> > SYSERR(root):
>> > timeout writing message to mail.asarian-host.net: Broken pipe
>> >
>> > (This is one sendmail talking to another; same version, same machine,
>> > over a UNIX domain socket).
>>
>> Huh? Using a UNIX domain socket for sendmail-sendmail SMTP would be an
>> extremely unusual setup - are you really doing this, and if so why?
>
>I'm sorry; I expressed myself poorly. I meant: my first sendmail (at the
>gate) uses a domain socket to talk to an SMTP agent; like so:
>
>Mesmtp, P=[IPC], F=SDFMuXa, S=EnvFromSMTP/HdrFromSMTP, R=EnvToSMTP, ...
> T=DNS/RFC822/X-Unix,
> A=FILE /var/run/smtpd.sock

OK, so the thing accepting connections on the Unix domain socket is not
sendmail? (There's no support for that in the program itself, but maybe
you could cobble something up with a wrapper and 'sendmail -bs'.)

>> Furthermore, your log excerpt shows (when the problem occurs)
>> mail.asarian-host.net connecting to mx1.ex.eclipse.net.uk, surely those
>> are not the same host?
>
>No, they are not. The above-mentioned SMTP agent, after processing the
>mail some, starts a regular SMTP connection to a second sendmail, on the
>same host.

Again, why are you doing this messy thing?:-)

> This second sendmail actually delivers to the outside world:
>
>> Oct 5 11:25:19 asarian-host sendmail[6637]: k959PFdw006637: --- 250
>> 2.0.0 k959PFdw006637 Message accepted for delivery
>> Oct 5 11:25:20 asarian-host sendmail[6648]: k959PFdw006637: --- 050
>> ... Connecting to mx1.ex.eclipse.net.uk.
>> via esmtp...

No, 6648 is just the child forked by 6637 to do the "outgoing" SMTP
delivery, i.e. to your "SMTP agent" in this case. It "thinks" it will
connect to the mx host, and would pass the host name to your custom
esmtp mailer if it could, but there is no $h in the Argv, so that host
name is effectively thrown away.

I don't think you have any log entries from the "second" sendmail in the
excerpt you posted; it would be using a different queue ID. If your
"SMTP agent" doesn't accept the complete message and finish the SMTP
dialogue before connecting to the "second" sendmail, but rather tries to
pass the message "on the fly" from one sendmail to the other, those log
entries could be very relevant. E.g. some problems with the "secondary"
SMTP session could cause the "primary" one to time out.

Maybe you could provide some more detail on this setup, I suspect that
this is where your problem lies. If you are using sendmail in a
violently non-standard way, you can certainly expect it to break without
warning in a minor upgrade (could be timing issues or whatever that made
it just "happen" to work before).

Funny thing that sendmail's "connection" to the secondary MX, which in
this setup was just another try to talk to your "SMTP agent",
succeeded. Of course it would presumably have been a different instance
of the agent, since you'd better fork one off for each connection to the
Unix domain socket (you do, don't you?) - but it's certainly another
indication that the problem is with your agent.
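
A minimal Python sketch of the "one handler per connection" idea mentioned
above. It assumes nothing about the actual agent: the class names, the
temporary socket path, and the toy replies are all hypothetical, and it
uses one thread per connection (ThreadingMixIn) where the text says fork.

```python
import os
import socket
import socketserver
import tempfile
import threading


class AgentHandler(socketserver.StreamRequestHandler):
    """Hypothetical handler: greets the client, then acknowledges each line."""

    def handle(self) -> None:
        self.wfile.write(b"220 agent ready\r\n")
        for line in self.rfile:
            if line.strip().upper() == b"QUIT":
                self.wfile.write(b"221 bye\r\n")
                break
            self.wfile.write(b"250 ok\r\n")


class ThreadedUnixServer(socketserver.ThreadingMixIn, socketserver.UnixStreamServer):
    """Unix domain socket server that starts a new thread per connection."""

    daemon_threads = True


def greet_two_clients() -> list:
    """Open two connections at once and return the greeting each one receives."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "agent.sock")
        with ThreadedUnixServer(path, AgentHandler) as server:
            threading.Thread(target=server.serve_forever, daemon=True).start()
            clients = [socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
                       for _ in range(2)]
            for c in clients:
                c.connect(path)
            # Both greetings arrive even though the first session is still open.
            greetings = [c.makefile("rb").readline() for c in clients]
            for c in clients:
                c.sendall(b"QUIT\r\n")
                c.recv(64)  # wait for the 221 reply before closing
                c.close()
            server.shutdown()
    return greetings


if __name__ == "__main__":
    print(greet_two_clients())
```

With a server that handled one connection at a time, the second client
would wait for its greeting until the first session ended, which is the
kind of stall that could make a sender time out.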

--Per Hedeland
per@hedeland.org