Apache Processes Hung "Sending Reply"

on 01.02.2010 21:34:16 by Tom Ritter

I have 40 or so apache processes suspended in "Sending Reply". My hypothesis
is that MySQL had a problem, and either apache or php somehow got gummed up
and isn't cleaning up for some reason. I'm hoping the list can give me more
ideas for debugging or point me in the right direction.



Here is the output of http://localhost/server-status:

Server uptime: 1 day 6 hours 57 minutes 9 seconds
Total accesses: 47613 - Total Traffic: 498.2 MB
CPU Usage: u1446.77 s548.53 cu6.26 cs0 - 1.8% CPU load
.427 requests/sec - 4688 B/second - 10.7 kB/request
41 requests currently being processed, 8 idle workers
WW_WWW_WWWWW_WWWWWWWWWW_W_WWWW__WWW.WWWWWWWW_WWWWW

Examining the logs confirms that the last request on each pid was quite a while
ago, and they are just hanging out doing nothing.
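
For readers who want to tally a scoreboard string like the one above, here is a minimal Python sketch. The key-to-label mapping follows the legend that mod_status prints under the scoreboard; the names `summarize_scoreboard` and `EXAMPLE_BOARD` are made up for illustration and are not part of Apache.

```python
from collections import Counter

# Scoreboard keys as listed in the legend on the mod_status page.
SCOREBOARD_KEYS = {
    "_": "Waiting for Connection",
    "S": "Starting up",
    "R": "Reading Request",
    "W": "Sending Reply",
    "K": "Keepalive (read)",
    "D": "DNS Lookup",
    "C": "Closing connection",
    "L": "Logging",
    "G": "Gracefully finishing",
    "I": "Idle cleanup of worker",
    ".": "Open slot with no current process",
}

# The scoreboard line quoted in this message.
EXAMPLE_BOARD = "WW_WWW_WWWWW_WWWWWWWWWW_W_WWWW__WWW.WWWWWWWW_WWWWW"


def summarize_scoreboard(scoreboard: str) -> dict:
    """Count how many worker slots are in each state."""
    counts = Counter(scoreboard.strip())
    return {
        SCOREBOARD_KEYS.get(key, "Unknown (%s)" % key): n
        for key, n in counts.items()
    }


if __name__ == "__main__":
    summary = summarize_scoreboard(EXAMPLE_BOARD)
    # Print the most common states first.
    for state, n in sorted(summary.items(), key=lambda kv: -kv[1]):
        print(f"{n:3d}  {state}")
```

The tallies can be checked against the summary line mod_status itself prints ("41 requests currently being processed, 8 idle workers") and against ServerLimit for the total number of slots.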

The server:
- RHEL
$uname -a
Linux xxx 2.6.18-164.6.1.el5 #1 SMP Tue Oct 27 11:30:06 EDT 2009
i686 i686 i386 GNU/Linux
- Apache:
Server version: Apache/2.2.3
Server built: Nov 10 2009 09:06:57
- PHP:
$php -v
PHP 5.1.6 (cli) (built: Feb 26 2009 07:01:10)
Zend Engine v2.1.0
- Runs Wordpress (not my choice)
- Receives mostly search crawler traffic at a steady rate
- has a lot of "(32)Broken pipe: core_output_filter: writing data to the
network" and "(104)Connection reset by peer: core_output_filter: writing
data to the network" messages
- stopped reporting to rrdtool/cacti between 18:50 and 21:30 last night
- Had a child process die with the error /usr/sbin/httpd: free():
invalid pointer: 0x0a2044a4
however this was about 20 minutes *after* the problem began
- had some "database error MySQL server has gone away for query" errors around
18:50 last night
- is behind an F5 device that proxies all connections - so every connection to
the server comes from the same IP address

Relevant config:

Timeout 40
KeepAlive On
MaxKeepAliveRequests 200
KeepAliveTimeout 5
StartServers 3
MinSpareServers 2
MaxSpareServers 10
ServerLimit 50
MaxClients 50
MaxRequestsPerChild 1000


I've only been able to find one person who had a similar problem, and his was
caused by "dodgy sql": http://marc.info/?l=tomcat-user&m=106319217331935&w=2
(His was also involving tomcat which I do not have.)

The biggest issue is that the processes should time out and clean up after
themselves, right? But they're not - instead they're just sitting consuming
RAM. (Not entirely sure about that - in some stacktraces I see
followed by "zend_timeout ()".)

My hypothesis is that MySQL had a problem, and either apache or php somehow
got gummed up and isn't cleaning up for some reason.

I'm sure a httpd restart will clean everything up, but I wanted to debug this
as best I could. I gdb-ed a stacktrace for 8 of the hung threads, but it's
not compiled in debug mode. The stacktraces, and other relevant data, are here:
http://ritter.vg/misc/apache-debug/
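
One way to collect backtraces like these from several hung children in one pass is sketched below. This is an assumption about a workable procedure, not the exact commands used for the traces linked above. It relies on gdb's `-p`, `-batch` and `-ex` options; the function names are hypothetical, and attaching normally requires root or the same user as the httpd children.

```python
import subprocess
import sys


def gdb_backtrace_command(pid: int) -> list:
    """Build a gdb command that attaches to pid, prints a backtrace, and exits."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "bt"]


def collect_backtrace(pid: int, timeout: float = 30.0) -> str:
    """Run gdb against one process and return whatever it printed."""
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout


if __name__ == "__main__":
    # Usage (hypothetical script name): python collect_bt.py 1234 1235 ...
    for arg in sys.argv[1:]:
        print(f"=== pid {arg} ===")
        print(collect_backtrace(int(arg)))
```

Without debug symbols the frames will mostly show function names only, which is still enough to see whether the children are all blocked in the same place.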

If anyone can suggest further things to try to debug this, or any additional
info, I'd appreciate it.

-tom

------------------------------------------------------------ ---------
The official User-To-User support forum of the Apache HTTP Server Project.
See for more info.
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
" from the digest: users-digest-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org

Re: Apache Processes Hung "Sending Reply"

on 02.02.2010 16:33:56 by Jeff Trawick

On Mon, Feb 1, 2010 at 3:34 PM, Tom Ritter wrote:
> I have 40 or so apache processes suspended in "Sending Reply". My hypothesis
> is that MySQL had a problem, and either apache or php somehow got gummed up
> and isn't cleaning up for some reason. I'm hoping the list can give me more
> ideas for debugging or point me in the right direction.
>
>
>
> Here is the output of http://localhost/server-status:
>
>         Server uptime: 1 day 6 hours 57 minutes 9 seconds
>         Total accesses: 47613 - Total Traffic: 498.2 MB
>         CPU Usage: u1446.77 s548.53 cu6.26 cs0 - 1.8% CPU load
>         .427 requests/sec - 4688 B/second - 10.7 kB/request
>         41 requests currently being processed, 8 idle workers
>         WW_WWW_WWWWW_WWWWWWWWWW_W_WWWW__WWW.WWWWWWWW_WWWWW
>
> Examining the logs confirms that the last request on each pid was quite a while
> ago, and they are just hanging out doing nothing.
>
> The server:
>  - RHEL
>                 $uname -a
>                 Linux xxx 2.6.18-164.6.1.el5 #1 SMP Tue Oct 27 11:30:06 EDT 2009
> i686 i686 i386 GNU/Linux
>  - Apache:
>                 Server version: Apache/2.2.3
>                 Server built:   Nov 10 2009 09:06:57
>  - PHP:
>                 $php -v
>                 PHP 5.1.6 (cli) (built: Feb 26 2009 07:01:10)
>                 Zend Engine v2.1.0
>  - Runs Wordpress (not my choice)
>  - Receives mostly search crawler traffic at a steady rate
>  - has a lot of "(32)Broken pipe: core_output_filter: writing data to the
>     network" and "(104)Connection reset by peer: core_output_filter: writing
>     data to the network" messages
>  - stopping reporting to rrdtool/cacti between 18:50 and 21:30 last night
>  - Had a child process die with the error /usr/sbin/httpd: free():
> invalid pointer: 0x0a2044a4
>     however this was about 20 minutes *after* the problem began
>  - had some "database error MySQL server has gone away for query" errors around
>     18:50 last night
>  - is behind an F5 device that proxies all connections - so every connection to
>     the server comes from the same IP address
>
> Relevant config:
>
>         Timeout 40
>         KeepAlive On
>         MaxKeepAliveRequests 200
>         KeepAliveTimeout 5
>         StartServers       3
>         MinSpareServers    2
>         MaxSpareServers   10
>         ServerLimit       50
>         MaxClients        50
>         MaxRequestsPerChild  1000
>
>
> I've only been able to find one person who had a similar problem, and his was
> caused by "dodgy sql": http://marc.info/?l=tomcat-user&m=106319217331935&w=2
> (His was also involving tomcat which I do not have.)

Apache processes hung in W/"Sending Reply" is a huge class of problems
with endless root causes. The aspect common to most of these is that
application code running inside Apache (e.g., mod_php) or outside
Apache (e.g., Tomcat or anything Apache proxies to) has hung.

> The biggest issue is that the processes should time out and clean up after
> themselves, right? But they're not - instead they're just sitting consuming
> RAM. (Not entirely sure about that - in some stacktraces I see
> followed by "zend_timeout ()".)

Apache hands the request over to mod_php to be processed synchronously
on the calling thread. It is up to mod_php to decide what to do,
whether to time out any anticipated conditions, etc. Apache isn't
monitoring it.
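
As an illustration of the point above, here is a toy Python model (not Apache or mod_php code) of a worker that calls its handler synchronously. The scoreboard slot stays at "W" for as long as the handler is blocked, because nothing outside the handler is watching it. All names here are hypothetical.

```python
import threading


def run_worker(slot, scoreboard, handler):
    """Mark the slot busy, call the handler synchronously, then mark it idle."""
    scoreboard[slot] = "W"   # busy for as long as the handler runs
    handler()                # synchronous call: no watchdog around it
    scoreboard[slot] = "_"   # only reached if the handler returns


def demo():
    scoreboard = ["_"] * 3
    entered = threading.Event()
    release = threading.Event()

    def quick_handler():
        pass

    def stuck_handler():
        entered.set()
        release.wait()       # stands in for a hang inside application code

    quick = threading.Thread(target=run_worker, args=(0, scoreboard, quick_handler))
    stuck = threading.Thread(target=run_worker, args=(1, scoreboard, stuck_handler))
    quick.start()
    stuck.start()

    quick.join()             # slot 0 is back to "_"
    entered.wait()           # slot 1 is now inside its handler
    during = "".join(scoreboard)

    release.set()            # let the stuck handler return so the demo ends
    stuck.join()
    after = "".join(scoreboard)
    return during, after


if __name__ == "__main__":
    print(demo())
```

In the model, slot 1 only returns to "_" once the handler itself returns; in the real server the equivalent is the application code (here mod_php) giving control back to Apache.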

> My hypothesis is that MySQL had a problem, and either apache or php somehow
> got gummed up and isn't cleaning up for some reason.

mod_php never returned to Apache; no thoughts here on what event
triggered whatever bug you encountered.

>
> I'm sure a httpd restart will clean everything up, but I wanted to debug this
> as best I could. I gdb-ed a stacktrace for 8 of the hung threads, but it's
> not compiled in debug mode. The stacktraces, and other relevant data, is here:
> http://ritter.vg/misc/apache-debug/
>
> If anyone can suggest further things to try to debug this, or any additional
> info, I'd appreciate it.

You got the right sort of information to start with. Theoretically
some glibc heap experts could tell you what it means to block in that
spot, but I anticipate that the answer would be the rather vague
"memory overlays or other invalid use of the heap by the application."
As for which component did it, I'd wager that it isn't Apache.

I wonder if your PHP/extensions/related libraries have all appropriate
fixes for memory corruption or heap library misuse. (I guess these
are all RedHat-patched binaries.)


Re: Apache Processes Hung "Sending Reply"

on 07.03.2010 18:18:16 by Richard_vK

I had the same problem and spent hours trying to figure out what the cause
was.

We use Munin to monitor our system status (server runs FreeBSD) and noticed
that whenever Apache processes started hanging, there was a corresponding
decrease in the number of established TCP connections to the machine. This
led us to suspect a problem with broken connections on our PIX firewall. The
firewall was rebooted and (touch wood) we have had no more problems since
then (several days now).
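
For anyone without Munin graphs, a rough way to watch the same signal is to count TCP connections by state from `netstat -an` output. The sketch below assumes the plain column layout (protocol first, state last, no extra columns such as those added by `-p`); the sample text uses made-up documentation addresses, and `count_tcp_states` is a hypothetical helper name.

```python
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connections per state in `netstat -an` style output."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows start with "tcp", "tcp4", "tcp6", ...; the state is the
        # last column. Header rows and UDP rows are skipped.
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[-1]] += 1
    return states


# Hypothetical sample output, for illustration only.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN
tcp        0      0 192.0.2.10:80           198.51.100.7:51234      ESTABLISHED
tcp        0      0 192.0.2.10:80           198.51.100.7:51240      ESTABLISHED
tcp        1      0 192.0.2.10:80           198.51.100.7:51199      CLOSE_WAIT
udp        0      0 0.0.0.0:68              0.0.0.0:*
"""

if __name__ == "__main__":
    for state, n in count_tcp_states(SAMPLE).most_common():
        print(f"{n:4d}  {state}")
```

Sampling this periodically and comparing the ESTABLISHED count with the number of busy Apache workers would show the kind of divergence described above.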

Note that even while the problem was active I could still make requests to
the server. Responses were very slow, but connectivity to the website
was not entirely down. It did reach MaxClients a few times, which was a
problem. The only remedy while the problem was active was to restart Apache,
which flushed the hung connections. Connections did not time out on their own,
even after more than 20 minutes of 'hanging'.

It may also be important to note that my database (MySQL) was almost
entirely idle while the problem was active; clearly there were no Apache
processes making any new database connections.

Hope this helps someone!


