hanging apache processes (1.3.29 + mod_ssl 2.8.9)
on 24.06.2002 08:24:50 by Alex Kotov
We have a strange problem with our Apache+mod_ssl server
(Apache/1.3.26 (Unix) mod_perl/1.22 mod_ssl/2.8.9 OpenSSL/0.9.6,
on Linux 2.2.19).
After a while the server processes become stuck waiting for
data from a socket. Timeout is set to 300 in httpd.conf,
but the processes happily wait for data for about an hour before
timing out. If the load on the server is high enough, all process
slots eventually fill up and the server stops serving.
The interesting aspect is that most of the time processes get stuck
when the request comes from one particular IP, although they don't get
stuck on every request from that IP. A DoS attack is very unlikely,
judging by the activity.
Has anybody seen this before? Is there a fix or a workaround? strace and
cipher_log results are below.
Thanks in advance,
- Alex
wslab@aracnet.com
Running strace on a hung process produces
read(5,
for a long time, eventually followed by
read(5, 0x959d2d8, 11) = -1 ETIMEDOUT (Connection timed out)
The connection takes about 3600 seconds to time out.
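For contrast, the behaviour one would expect from a 300-second Timeout is a read that gives up after a bounded wait when the peer has connected but sends nothing. The following is only a minimal Python sketch of that idea, not mod_ssl or Apache code; `read_with_timeout` is a hypothetical helper, and the 0.2 second value is chosen only to keep the demo short.

```python
import socket


def read_with_timeout(sock, nbytes, timeout_s):
    """Try to read up to nbytes; return None if the peer stays silent too long."""
    sock.settimeout(timeout_s)      # bound every blocking call on this socket
    try:
        return sock.recv(nbytes)
    except socket.timeout:          # raised when timeout_s elapses with no data
        return None


if __name__ == "__main__":
    # A connected pair stands in for server and client; the client never writes.
    server_side, silent_client = socket.socketpair()
    try:
        print(read_with_timeout(server_side, 11, 0.2))
    finally:
        server_side.close()
        silent_client.close()
```

In the strace output above there is no such bound: the read only returns once the kernel itself reports ETIMEDOUT.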
cipher_log contains this for a "normal" connection (dumps removed):
[23/Jun/2002 17:02:01 08719] [info] Connection to child 4 established (server xxx.xx:443, client xxx.xxx.xxx.xxx)
[23/Jun/2002 17:02:01 08719] [info] Seeding PRNG with 23177 bytes of entropy
[23/Jun/2002 17:02:01 08719] [trace] OpenSSL: Handshake: start
[23/Jun/2002 17:02:01 08719] [trace] OpenSSL: Loop: before/accept initialization
[23/Jun/2002 17:02:01 08719] [debug] OpenSSL: read 11/11 bytes from BIO#09327750 [mem: 092E2FD8] (BIO dump follows)
[23/Jun/2002 17:02:01 08719] [debug] OpenSSL: read 91/91 bytes from BIO#09327750 [mem: 092E2FE3] (BIO dump follows)
[23/Jun/2002 17:02:01 08719] [trace] OpenSSL: Loop: SSLv3 read client hello A
[23/Jun/2002 17:02:01 08719] [trace] OpenSSL: Loop: SSLv3 write server hello A
[23/Jun/2002 17:02:01 08719] [trace] OpenSSL: Loop: SSLv3 write change cipher spec A
etc.
For a stuck connection, cipher_log contains
[23/Jun/2002 17:02:04 08719] [info] Connection to child 4 established (server xxx.xxx:443, client xxx.xxx.xxx.xxx)
[23/Jun/2002 17:02:04 08719] [info] Seeding PRNG with 23177 bytes of entropy
[23/Jun/2002 17:02:04 08719] [trace] OpenSSL: Handshake: start
[23/Jun/2002 17:02:04 08719] [trace] OpenSSL: Loop: before/accept initialization
with nothing else for this PID for a long time.
It seems that the process is trying to start an SSL handshake, but blocks
on the first read and does not respect the Timeout setting in the
configuration file.
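One way to spot such connections in bulk is to scan the log for processes whose most recent entry is still the "before/accept initialization" state. The following is a rough Python sketch under stated assumptions: each log entry sits on a single line in the "[date time pid] [level] message" layout of the excerpts above, and `find_stalled_pids` is a hypothetical helper name.

```python
import re

# Matches entries such as "[23/Jun/2002 17:02:04 08719] [trace] OpenSSL: ...".
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]*?) (?P<pid>\d+)\] \[(?P<level>\w+)\] (?P<msg>.*)$"
)

# State logged just before the server waits for the client's first bytes.
STALL_MARKER = "before/accept initialization"


def find_stalled_pids(lines):
    """Return PIDs whose most recent log entry is the pre-ClientHello state."""
    last_msg = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            last_msg[m.group("pid")] = m.group("msg")
    return sorted(pid for pid, msg in last_msg.items() if STALL_MARKER in msg)


if __name__ == "__main__":
    # PID 08720 is invented for the demo; the excerpts above only show 08719.
    sample = [
        "[23/Jun/2002 17:02:01 08719] [trace] OpenSSL: Loop: before/accept initialization",
        "[23/Jun/2002 17:02:01 08719] [trace] OpenSSL: Loop: SSLv3 read client hello A",
        "[23/Jun/2002 17:02:04 08720] [trace] OpenSSL: Loop: before/accept initialization",
    ]
    print(find_stalled_pids(sample))
```

The sketch ignores timestamps, so a connection that has only just started would also be reported; comparing the last timestamp against the current time would narrow it down.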
______________________________________________________________________
Apache Interface to OpenSSL (mod_ssl)                   www.modssl.org
User Support Mailing List                      modssl-users@modssl.org
Automated List Manager                            majordomo@modssl.org
Re: hanging apache processes (1.3.29 + mod_ssl 2.8.9)
on 24.06.2002 08:27:08 by Cliff Woolley
On Sun, 23 Jun 2002, Alex Kotov wrote:
> After a while the server processes become stuck while waiting for
> the data from a socket.
> Running strace on a hung process produces
> read(5,
> for a long time, eventually followed by
> read(5, 0x959d2d8, 11) = -1 ETIMEDOUT (Connection timed out)
Are you sure that file descriptor 5 is the connection to the client?
What SSLRandomSeed are you using? This sounds like one of those
/dev/random not-enough-entropy problems to me.
--Cliff
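On Linux, the first question can be checked through /proc, where each open descriptor of a process appears as a symbolic link. The following is a small Python sketch, assuming a kernel that shows sockets as "socket:[inode]" and a user allowed to inspect the target process (same user or root); `describe_fd` and `is_socket_fd` are hypothetical helper names.

```python
import os
import socket


def describe_fd(pid, fd):
    """Return what /proc says descriptor fd of process pid points to (Linux only)."""
    return os.readlink("/proc/%d/fd/%d" % (pid, fd))


def is_socket_fd(pid, fd):
    # Socket descriptors are listed as "socket:[<inode>]" under /proc/<pid>/fd.
    return describe_fd(pid, fd).startswith("socket:")


if __name__ == "__main__":
    # Demo on our own process: a fresh socket versus an ordinary file.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    with open(os.devnull) as f:
        print(describe_fd(os.getpid(), s.fileno()), is_socket_fd(os.getpid(), s.fileno()))
        print(describe_fd(os.getpid(), f.fileno()), is_socket_fd(os.getpid(), f.fileno()))
    s.close()
```

For a hung httpd child, the PID from strace and descriptor 5 would be passed instead. This only tells a socket apart from a file or device; matching the socket to a particular client still needs a tool such as netstat or lsof.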
Re: hanging apache processes (1.3.29 + mod_ssl 2.8.9)
on 24.06.2002 19:08:30 by Alex Kotov
Hi Cliff,
Thanks for your response.
I'm using
SSLRandomSeed startup builtin
SSLRandomSeed connect builtin
and 5 is definitely the file descriptor for the network connection.
Is there anything else I should check?
Thanks,
- Alex
Re: hanging apache processes (1.3.29 + mod_ssl 2.8.9)
on 24.06.2002 19:24:10 by Andy Osborne
I've seen this happen sometimes on our SSL servers (which do
quite a lot of traffic). A quick search of the logs for
recent connections from the same address always shows the
client as IE5.0 - which is known to be broken. The connections
seem to stall in the SSL negotiation and get killed off
by our rather intolerant TCP keepalive settings. I've never
found a real answer to the problem.
Andy
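The message does not say how those keepalive settings were applied; on a server they are often system-wide kernel parameters. As an illustration only, the Python sketch below sets comparable values on a single socket. `enable_aggressive_keepalive` and its default numbers are hypothetical, and the three TCP_* options are Linux per-socket options that tcp(7) lists as available since Linux 2.4, so they may not exist on older kernels or other platforms.

```python
import socket


def enable_aggressive_keepalive(sock, idle_s=60, interval_s=10, probes=3):
    """Turn on TCP keepalive and, where supported, shorten its timers."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket timer options; skipped silently where the platform lacks them.
    for name, value in (
        ("TCP_KEEPIDLE", idle_s),        # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),   # seconds between unanswered probes
        ("TCP_KEEPCNT", probes),         # unanswered probes before giving up
    ):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_aggressive_keepalive(s)
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

With these example values a dead peer would be detected after roughly idle_s + interval_s * probes = 90 seconds. Keepalive only detects a peer that has stopped answering at the TCP level; a client that is alive but never sends its hello would still hold the slot.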
--
Andy Osborne **************** "Vertical B2B Communities"
Senior Internet Engineer
Sift Group 100 Victoria Street, Bristol BS1 6HZ
tel:+44 117 915 9600 fax:+44 117 915 9630 http://www.sift.co.uk
Re: hanging apache processes (1.3.29 + mod_ssl 2.8.9)
on 24.06.2002 20:18:25 by Alex Kotov
I've seen strange problems with IE5, too, but these connections have
"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; DigExt)" as
User-Agent. Unfortunately, changing the TCP keepalive settings is not an
option for us.
I don't know all the intricacies of the SSL handshake, but it looks like it
starts with the server trying to read 11 bytes from the client, and this is
where mod_ssl may wait for a long time without checking for a timeout.
Could someone point me to the place in the code where this read happens? I
would hate to switch to Stronghold :(
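A possible explanation for the number 11: it equals an SSLv3/TLS record header (5 bytes) plus a handshake header (4 bytes) plus the 2-byte client version that opens the ClientHello, which is presumably enough for the server to tell which protocol the client speaks. The Python sketch below decodes such a prefix. It illustrates the wire format only and is not the OpenSSL code; `parse_hello_prefix` is a hypothetical name, and SSLv2-style hellos, which use a different layout, are not handled.

```python
import struct


def parse_hello_prefix(prefix):
    """Decode the first 11 bytes of an SSLv3/TLS ClientHello record."""
    if len(prefix) < 11:
        raise ValueError("need 11 bytes, got %d" % len(prefix))
    # Record header: content type (1), version (2), body length (2).
    content_type, rec_major, rec_minor, rec_len = struct.unpack("!BBBH", prefix[:5])
    # Handshake header: message type (1), 24-bit body length (3).
    hs_type = prefix[5]
    hs_len = int.from_bytes(prefix[6:9], "big")
    # First field of the ClientHello body: the client's protocol version (2).
    cli_major, cli_minor = prefix[9], prefix[10]
    return {
        "is_handshake": content_type == 22,
        "is_client_hello": hs_type == 1,
        "record_version": (rec_major, rec_minor),
        "client_version": (cli_major, cli_minor),
        "record_length": rec_len,
        "handshake_length": hs_len,
        # Bytes of this record still unread after the 11-byte prefix
        # (6 of the 11 bytes belong to the record body).
        "remaining": rec_len - 6,
    }


if __name__ == "__main__":
    # Invented example: SSL 3.0 record with a 97-byte body holding a ClientHello.
    example = bytes([22, 3, 0, 0, 97, 1, 0, 0, 93, 3, 0])
    print(parse_hello_prefix(example))
```

The "normal" log excerpt is consistent with this reading: 11 + 91 = 102 bytes, which is a 5-byte record header followed by a 97-byte record body. In the stuck case the client apparently never sends these first bytes, so the initial read never completes.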
Thanks,
- Alex
Re: hanging apache processes (1.3.29 + mod_ssl 2.8.9)
on 24.06.2002 20:36:55 by Peter Viertel
Perhaps if you watch the session with Eric Rescorla's excellent ssldump
tool you may get to the bottom of it....
http://www.rtfm.com/ssldump/
Or another possibility altogether... I had a problem which looked
similar to this, caused by a Solaris-specific mutex bug which meant
that child processes did not get released properly after certain types
of SSL connections. This was fixed only with rev 1.3.24, and also by
adding 'AcceptMutex pthread' to the config file.