print throwing intermittent Segfaults

on 21.11.2009 10:43:19 by Denis Banovic

Hi Everybody,

I'm having big problems with mod_perl throwing intermittent segmentation faults on our production machines running RHEL 4 and 5.
To be able to produce a core dump for these segfaults, I've installed mod_dumpcore following this tutorial:
http://mituzas.lt/2009/09/26/getting-apache-core-dumps-in-linux/

gdb /usr/sbin/httpd core.1 produces the following output:

#0 0x00b29f4b in XS_Apache__RequestRec_content_type () from /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi/auto/Apache/RequestRec/RequestRec.so

and very rarely (1 in 15):

#0 0x003830b9 in apr_palloc () from /usr/lib/libapr-0.so.0

The content type is set by
$r->content_type("text/html; charset=iso-8859-1") but this is not what is causing it to segfault...

By trial and error I've figured out that the segfault happens when I do a
$r->print($mypagecontent);

I've even tried to do a
unless ($r->connection->aborted) {
    $r->print($mypagecontent);
}
but this didn't help either.

The segfault happens randomly, after between 30 and 250 mod_perl requests. There is no specific request URL or script that causes it to segfault; it just happens after some time.
More load on the server means more segfaults.

From my Apache Config:

<IfModule prefork.c>
StartServers 8
MinSpareServers 5
MaxSpareServers 20
ServerLimit 256
MaxClients 200
MaxRequestsPerChild 15
</IfModule>

There are some additional Perl modules that I've built from CPAN:
Compress-Zlib-2.004
Digest-MD5-2.39
Email-MIME-1.861
Email-MIME-ContentType-1.014
Email-MIME-Encodings-1.311
Email-Simple-2.004
Encode-Detect-1.01
ExtUtils-CBuilder-0.23
File-Slurp-9999.12
IO-Compress-Zlib-2.004
MIME-Base64-3.07
MIME-Types-1.24
Module-Build-0.2808
Pod-Escapes-1.04
Pod-Simple-3.07
String-Similarity-1.03
Template-Plugin-XML-Escape-0.02
Test-Pod-1.26
Test-Simple-0.80

Does anyone have a hint on where to start looking and what to do next to figure out why this segfault is happening?

Thanks

Denis

Re: print throwing intermittent Segfaults

on 21.11.2009 19:27:41 by William T

This is the list of stuff I usually start with when I get a problem
that doesn't seem to be tied to a particular code path.

* code path - perhaps a particular code path is only being exercised
rarely, and it has a bug
* forking - when a child dies, all open descriptors in its namespace
also get closed
* eval - always a good thing to look at when weird things happen
* persistence - globals, closures, persistent objects,
serialized/restored objects, shared memory, shared objects (between
processes)

-wjt

AW: print throwing intermittent Segfaults

on 23.11.2009 09:27:22 by Denis Banovic

Hi William,

Thanks for your checklist. I've run through it, but the segfaults are still there...
Right now it takes less than a minute from an Apache restart to the first segfault.
This is from the error_log on the Red Hat 5 production machine:

Apache2::RequestIO::print: (103) Software caused connection abort at

The guys from Rackspace are saying that I should recompile all the Perl modules I installed directly from CPAN (see above). Do you think this would help?
Or does someone have another hint?

Thanks

Denis

Re: AW: print throwing intermittent Segfaults

on 23.11.2009 09:46:03 by aw

Denis Banovic wrote:
> Hi William,
>
> Thanks for your checklist, I've run through it, segfaults still there...
> Right now it takes less than a minute from apache restart to the first segfault.
> This is from the error_log from the RedHat 5 Production machine:
>
> Apache2::RequestIO::print: (103) Software caused connection abort at
>
> The guys from rackspace are saying that I should recompile all my perl modules installed directly from CPAN ( see above ) , do you think this would help?
> Or has someone another hint?
>
Just my two cents: in my own experience, 99% of the "segfault"
cases I have encountered happened when Apache or Perl tried to run a piece
of code not meant for this machine (such as a library built for another
machine or another OS version).
Maybe one of the modules you are using installed the wrong library?
In that sense, the guys from Rackspace may be right, although I believe
that CPAN modules don't generally ship object-code libraries; if they
need any, they compile them at installation.
So maybe it is a library from the RHEL repository that is wrong.

RE: AW: print throwing intermittent Segfaults

on 23.11.2009 10:15:33 by morten.bjornsvik

Hi

I've had a similar error. I "fixed" it by adding an eval block around the
offending code, which was tracked back to Mason.

http://rt.cpan.org/Public/Bug/Display.html?id=49031

We compile everything from scratch (Apache, Perl, mod_perl, Mason, all modules)
with an automated build script.

Earlier, when we ran on mod_perl 1.99 and the Red Hat stack, it worked fine.
But then we had other worries :-)

--
Morten Bjoernsvik, Developer, Decision Analytics


AW: print throwing intermittent Segfaults [ solved ]

on 23.11.2009 11:24:11 by Denis Banovic

Hi Morten,

Thanks a lot,

By putting an eval around the code I found out that the segfault was produced by the next request to the same child after the $r->print failed.
$r->print is still failing from time to time, but it's not producing segfaults anymore!

Thanks

Denis


Re: AW: print throwing intermittent Segfaults

on 11.12.2009 20:08:44 by Kurt Hansen

Hello,

I started to see the following when I set up a new server on CentOS 5.4
and installed Perl modules from CPAN on Dec. 4:

Denis Banovic wrote:
> This is from the error_log from the RedHat 5 Production machine:
>
> Apache2::RequestIO::print: (103) Software caused connection abort at
>
> The guys from rackspace are saying that I should recompile all my perl modules installed directly from CPAN ( see above ) , do you think this would help?
> Or has someone another hint?
>
>
I do not see this on another server with CentOS 5.2 and perl modules
primarily built Feb 09.

I think something has changed in either CentOS, Apache, or the perl
modules I use in the interim.

I think it has something to do with Apache children being killed either
because of size or request limit reached.

So, Denis, have you tried adjusting these:

> From my Apache Config:
>
> StartServers 8
> MinSpareServers 5
> MaxSpareServers 20
> ServerLimit 256
> MaxClients 200
> MaxRequestsPerChild 15
>
MaxRequestsPerChild especially seems awfully small.

Plus, you say you see the problem happening every 30 to 250 requests.
(StartServers) 8 * (MaxRequestsPerChild) 15 = 120. This sounds like
roughly the point at which you'd expect Apache children to die.
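
The same back-of-the-envelope estimate as a small Perl sketch. The two values come from the posted configuration; the helper name `requests_until_recycle` is made up for illustration, and the even spread of requests across the initial children is a simplifying assumption, not how Apache actually schedules them.

```perl
use strict;
use warnings;

# Values taken from the posted prefork configuration.
my %prefork = (
    StartServers        => 8,
    MaxRequestsPerChild => 15,
);

# Hypothetical helper: number of requests served before the initial set
# of children has been recycled, assuming requests are spread evenly
# over those children (a simplification).
sub requests_until_recycle {
    my ($cfg) = @_;
    return $cfg->{StartServers} * $cfg->{MaxRequestsPerChild};
}

printf "children start dying after roughly %d requests\n",
    requests_until_recycle(\%prefork);
```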

I had MaxRequestsPerChild set at 2000 and saw the problem intermittently. I'm
going to set MaxRequestsPerChild to 0 to turn it off and see if it goes away.

I wonder if something has changed in Apache recently? Or in mod_perl?

Both servers are running Apache 2.2.3. One is running mod_perl 2.0.2 and
the other one is 2.0.4. Judging by the version numbers... could this be a
bug in 2.0.4?

Take care,

Kurt Hansen