tracking a coredump problem

on 25.01.2009 07:54:39 by Carl Brewer

Hello,
I'm running Apache 2.2.11 with mod_perl 2.0.4 (mp2) and libapreq 2.0.8, or
at least I think I am. It's on CentOS/RHEL 5.2.

We've been having some intermittent segfaults which are proving
difficult to track down, and so I seek the help of the list.

At startup, httpd reports as follows :

Apache/2.2.11 (Unix) mod_ssl/2.2.11 OpenSSL/0.9.8b
mod_apreq2-20051231/2.6.0 mod_perl/2.0.4 Perl/v5.8.8 configured

I'm wondering where it's getting mod_apreq2-20051231/2.6.0 from?
I built libapreq2 from source and I think I got it right:

$ ./configure \
    --with-apr-config=/usr/local/apache/2.2.11/bin/apr-1-config \
    --prefix=/usr/local/apache/2.2.11 --enable-perl-glue \
    --with-apache2-apxs=/usr/local/apache/2.2.11/bin/apxs


mp2 was built by hand as well, but I'm not sure how to report on the
build arguments (there is no config.[log|status] file).

I've used strace to follow all the open() calls when starting httpd and it
seems to be looking in the right places:

grep libapr trace
open("/usr/local/apache/2.2.11/lib/libaprutil-1.so.0", O_RDONLY) = 3
open("/usr/local/apache/2.2.11/lib/libapr-1.so.0", O_RDONLY) = 3
open("/usr/local/apache/2.2.11/lib/libapreq2.so.3", O_RDONLY) = 4
open("/usr/local/apache/2.2.11/lib/libapreq2.so.3", O_RDONLY) = 8


grep Apache2 trace
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/Apache2/PerlSections.pm", O_RDONLY|O_LARGEFILE) = 7
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/Apache2/CmdParms.pm", O_RDONLY|O_LARGEFILE) = 8
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/Apache2/XSLoader.pm", O_RDONLY|O_LARGEFILE) = 9
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Apache2/CmdParms/CmdParms.so", O_RDONLY) = 8
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/Apache2/Directive.pm", O_RDONLY|O_LARGEFILE) = 8
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Apache2/Directive/Directive.so", O_RDONLY) = 8
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/Apache2/ServerRec.pm", O_RDONLY|O_LARGEFILE) = 8
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/Apache2/Log.pm", O_RDONLY|O_LARGEFILE) = 9
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Apache2/Log/Log.so", O_RDONLY) = 9
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Apache2/ServerRec/ServerRec.so", O_RDONLY) = 8
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/Apache2/ServerUtil.pm", O_RDONLY|O_LARGEFILE) = 8
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Apache2/ServerUtil/ServerUtil.so", O_RDONLY) = 8
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/Apache2/Const.pm", O_RDONLY|O_LARGEFILE) = 8
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Apache2/Const/Const.so", O_RDONLY) = 8
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/Apache2/Module.pm", O_RDONLY|O_LARGEFILE) = 8
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Apache2/Module/Module.so", O_RDONLY) = 8
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/Apache2/RequestUtil.pm", O_RDONLY|O_LARGEFILE) = 8
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Apache2/RequestUtil/RequestUtil.so", O_RDONLY) = 8
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/Apache2/Cookie.pm", O_RDONLY|O_LARGEFILE) = 8
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/APR/Request/Apache2.pm", O_RDONLY|O_LARGEFILE) = 9
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/APR/Request/Apache2/Apache2.so", O_RDONLY) = 9
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/Apache2/RequestRec.pm", O_RDONLY|O_LARGEFILE) = 9
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Apache2/RequestRec/RequestRec.so", O_RDONLY) = 9
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/Apache2/Request.pm", O_RDONLY|O_LARGEFILE) = 8
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/Apache2/Util.pm", O_RDONLY|O_LARGEFILE) = 8
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Apache2/Util/Util.so", O_RDONLY) = 8
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/Apache2/URI.pm", O_RDONLY|O_LARGEFILE) = 8
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Apache2/URI/URI.so", O_RDONLY) = 8
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/Apache2/Response.pm", O_RDONLY|O_LARGEFILE) = 11
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Apache2/Response/Response.so", O_RDONLY) = 11
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/Apache2/RequestIO.pm", O_RDONLY|O_LARGEFILE) = 11
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Apache2/RequestIO/RequestIO.so", O_RDONLY) = 11
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/Apache2/Access.pm", O_RDONLY|O_LARGEFILE) = 11
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Apache2/Access/Access.so", O_RDONLY) = 11
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/Apache2/Filter.pm", O_RDONLY|O_LARGEFILE) = 10
open("/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Apache2/Filter/Filter.so", O_RDONLY) = 10


Any clues? Am I looking in the right place for the cause of the segfaults?

Thank you

Carl

Re: tracking a coredump problem

on 27.01.2009 21:55:12 by Carl Brewer

Following up on this, I'm still stuck, but maybe someone here can help
with a debugging step?

I'm trying to get httpd to dump cores, as we're seeing this a lot :

[Sun Jan 25 18:14:17 2009] [notice] child pid 20822 exit signal
Segmentation fault (11)


and I figure a core might just help. So, in httpd-perl's config :

CoreDumpDirectory /var/cores

and /var/cores is 777

And in the (RHEL/CentOS format) init.d script
start() {
    echo -n $"Starting $prog: "
    ulimit -S -c unlimited >/dev/null 2>&1
    LANG=$HTTPD_LANG daemon $httpd $OPTIONS -f $conf
    RETVAL=$?
    echo
    ulimit -c
    [ $RETVAL = 0 ] && touch ${lockfile}
    return $RETVAL
}


I think I got just about everything needed to get a coredump, but
nothing shows up in /var/cores, despite many segfaults in the logs.

Anything I've missed?

Thank you

Carl

Re: tracking a coredump problem

on 28.01.2009 04:59:31 by gozer

On 27/1/09 15:55, Carl Brewer wrote:
> Following up on this, I'm still stuck, but maybe someone here can help
> with a debugging step?
>
> I'm trying to get httpd to dump cores, as we're seeing this a lot :
>
> [Sun Jan 25 18:14:17 2009] [notice] child pid 20822 exit signal
> Segmentation fault (11)
>
>
> and I figure a core might just help. So, in httpd-perl's config :
>
> CoreDumpDirectory /var/cores
>
> and /var/cores is 777
>
> And in the (RHEL/CentOS format) init.d script
> start() {
> echo -n $"Starting $prog: "
> ulimit -S -c unlimited >/dev/null 2>&1
> LANG=$HTTPD_LANG daemon $httpd $OPTIONS -f $conf
> RETVAL=$?
> echo
> ulimit -c
> [ $RETVAL = 0 ] && touch ${lockfile}
> return $RETVAL
> }
>
>
> I think I got just about everything needed to get a coredump, but
> nothing shows up in /var/cores, despite many segfaults in the logs.
>
> Anything I've missed?

Selinux enabled ?

-- 
Philippe M. Chiasson GPG: F9BFE0C2480E7680 1AE53631CB32A107 88C3A5A5
http://gozer.ectoplasm.org/ m/gozer\@(apache|cpan|ectoplasm)\.org/


Re: tracking a coredump problem

on 28.01.2009 13:35:45 by Carl Brewer

Philippe M. Chiasson wrote:

> Selinux enabled ?

Good question, I don't think so, but will double-check.



Re: tracking a coredump problem

on 28.01.2009 13:42:54 by Carl Brewer

Carl Brewer wrote:
> Philippe M. Chiasson wrote:
>
>> Selinux enabled ?
>
> Good question, I don't think so, but will double-check.

Nope, SELinux is disabled in /etc/selinux/config
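
For what it's worth, /etc/selinux/config only records the mode applied at boot, while the mode the kernel is running in right now is what `getenforce` reports. A minimal sketch for checking both follows; the function names `selinux_config_mode` and `selinux_runtime_mode` are made up for illustration, and it assumes the usual `SELINUX=` line format.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from any package).

# Print the value of the SELINUX= line in a config file.
# Comment lines and the SELINUXTYPE= line do not match the pattern.
selinux_config_mode() {
    [ -r "$1" ] || { echo "cannot read $1" >&2; return 1; }
    sed -n 's/^[[:space:]]*SELINUX=[[:space:]]*\([A-Za-z]*\).*/\1/p' "$1" | tail -n 1
}

# Print the mode currently in effect, if the getenforce tool is installed.
selinux_runtime_mode() {
    if command -v getenforce >/dev/null 2>&1; then
        getenforce
    else
        echo "unknown (getenforce not installed)"
    fi
}
```

Calling `selinux_config_mode /etc/selinux/config` and then `selinux_runtime_mode` should show whether the two agree; if they differ, the running mode is the one that matters for core files.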



Re: tracking a coredump problem

on 28.01.2009 15:15:27 by Adam Prime

Carl Brewer wrote:
> And in the (RHEL/CentOS format) init.d script
> start() {
> echo -n $"Starting $prog: "
> ulimit -S -c unlimited >/dev/null 2>&1
> LANG=$HTTPD_LANG daemon $httpd $OPTIONS -f $conf
> RETVAL=$?
> echo
> ulimit -c
> [ $RETVAL = 0 ] && touch ${lockfile}
> return $RETVAL
> }
>
>

Why are there two ulimit lines in here? Does the second one undo what
the first one did, causing cores to not get generated?

When I needed to get cores generated I did exactly what it said here:

http://httpd.apache.org/dev/debugging.html#crashes

namely, that I ran ulimit in the shell, then in that same shell started
apache with apachectl. I was able to generate cores that way.
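
Roughly, the sequence looks like the sketch below. The apachectl path is an assumption taken from the install prefix mentioned earlier in the thread, and `start_with_cores` is just an illustrative name.

```shell
#!/bin/sh
# Assumed location of apachectl (taken from the prefix used in this thread);
# override by setting APACHECTL before calling the function.
APACHECTL=${APACHECTL:-/usr/local/apache/2.2.11/bin/apachectl}

# Raise the soft core-file limit in the current shell, report it,
# then start httpd from that same shell so it inherits the limit.
start_with_cores() {
    # This can fail if the hard limit is lower than "unlimited".
    ulimit -S -c unlimited 2>/dev/null ||
        echo "warning: could not raise the core limit" >&2
    echo "core limit: $(ulimit -S -c)"
    if [ -x "$APACHECTL" ]; then
        "$APACHECTL" start
    else
        echo "apachectl not found at $APACHECTL; not starting"
    fi
}
```

The point is only that the limit is set in the same process that launches httpd, with nothing in between that could reset it.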

Adam

Re: tracking a coredump problem

on 28.01.2009 20:31:45 by Carl Brewer

Adam Prime wrote:
> Carl Brewer wrote:
>> And in the (RHEL/CentOS format) init.d script
>> start() {
>> echo -n $"Starting $prog: "
>> ulimit -S -c unlimited >/dev/null 2>&1
>> LANG=$HTTPD_LANG daemon $httpd $OPTIONS -f $conf
>> RETVAL=$?
>> echo
>> ulimit -c
>> [ $RETVAL = 0 ] && touch ${lockfile}
>> return $RETVAL
>> }
>>
>>
>
> Why are there two ulimit lines in here? Does the second one undo what
> the first one did, causing cores to not get generated?

The second one is a debugging thing, it just prints to STDOUT what the
ulimit has been set to.

>
> When I needed to get cores generated I did exactly what it said here:
>
> http://httpd.apache.org/dev/debugging.html#crashes
>
> namely, that I ran ulimit in the shell, then in that same shell started
> apache with apachectl. I was able to generate cores that way.

Sure, but I'm not sure why doing that in a startup script doesn't do the
same thing?
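
One possible explanation, offered as an assumption and not something confirmed in this thread: on Red Hat style systems the `daemon` helper sourced from /etc/init.d/functions may set its own soft core limit (commonly through a `DAEMON_COREFILE_LIMIT` variable) just before launching the program, which would override the ulimit line in start(). A small sketch to look for that; the function name `find_core_limit_resets` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list lines in an init function library that touch
# the core-file limit. The default path is the usual Red Hat location and
# is an assumption about this system.
find_core_limit_resets() {
    file=${1:-/etc/init.d/functions}
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 2; }
    # Matches "ulimit ... -c" on the same line, or the variable name itself.
    # grep exits non-zero when nothing matches.
    grep -n -E 'ulimit[^#]*-c|DAEMON_COREFILE_LIMIT' "$file"
}
```

If it prints a line that applies a default of 0, setting that variable to `unlimited` before the `daemon` call would be the thing to try.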



RE: tracking a coredump problem

on 12.02.2009 15:37:50 by eric.berg

Carl,

I may have missed it, but did you say at what point you were seeing the
segfault? I assume you mean at startup, but can you confirm?

E

> -----Original Message-----
> From: Carl Brewer [mailto:carl@bl.echidna.id.au]
> Sent: Wednesday, January 28, 2009 7:43 AM
> To: Philippe M. Chiasson
> Cc: modperl@perl.apache.org
> Subject: Re: tracking a coredump problem
>
> Carl Brewer wrote:
> > Philippe M. Chiasson wrote:
> >
> >> Selinux enabled ?
> >
> > Good question, I don't think so, but will double-check.
>
> Nope, selinux is disabled on /etc/selinux/config

Re: tracking a coredump problem

on 12.02.2009 15:43:32 by JPengCA

In a message dated 2009-2-12 22:39:48, eric.berg@barclayscapital.com writes:
> Nope, selinux is disabled on /etc/selinux/config


Or you may take a look at this article:

http://blog.modsecurity.org/2009/01/building-qa-test-cases-from-waf-data.html

J.

Re: tracking a coredump problem

on 03.03.2009 06:44:10 by Carl Brewer

eric.berg@barclayscapital.com wrote:
> Carl,
>
> I may have missed it, but did you say at what point you were seeing the
> segfault? I assume you mean at startup, but can you confirm?

Not at startup, it happens after 'a while'. It's very hard to track,
and I am stumped at trying to get a core file to see where it's really
breaking.

My init script (CentOS 5.2):

start() {
    echo -n $"Starting $prog: "
    ulimit -S -c unlimited >/dev/null 2>&1
    LANG=$HTTPD_LANG daemon $httpd $OPTIONS -f $conf
    RETVAL=$?
    echo
    ulimit -c
    [ $RETVAL = 0 ] && touch ${lockfile}
    return $RETVAL
}


I must be missing something, I get segfaults but no core dumps and I've
finally got to do something about it.

Re: tracking a coredump problem

on 03.03.2009 09:17:20 by Carl Brewer

I've got some coredumps now, after starting the mod_perl Apache with
apachectl rather than the init script.

gdb shows this :

Core was generated by `/usr/local/apache/2.2.11/bin/httpd -f /etc/httpd/conf/httpd-2.2.11-perl.conf -k'.
Program terminated with signal 11, Segmentation fault.
#0 0x0030a039 in Perl_pp_rv2cv () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so


bt full shows a load of info but nothing I can immediately identify as a
failure.

This is its output :

#0 0x0030a039 in Perl_pp_rv2cv () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
No symbol table info available.
#1 0x002dd88f in Perl_runops_standard () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
No symbol table info available.
#2 0x0027dffe in Perl_magicname () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
No symbol table info available.
#3 0x00282806 in Perl_call_sv () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
No symbol table info available.
#4 0x00604a7f in modperl_callback (my_perl=0x8eac5f8, handler=0x89b93b8, p=0x95efc50, r=0x95efc90, s=0x94d9ec8, args=0x9688e94) at modperl_callback.c:101
items =
cv =
sp = (SV **) 0x970ac98
status = 0
#5 0x0060517a in modperl_callback_run_handlers (idx=6, type=4, r=0x95efc90, c=0x0, s=0x94d9ec8, pconf=0x0, plog=0x0, ptemp=0x0, run_mode=MP_HOOK_RUN_FIRST) at modperl_callback.c:262
my_perl = (PerlInterpreter *) 0x8eac5f8
interp = (modperl_interp_t *) 0x931fe18
scfg = (modperl_config_srv_t *) 0x94dc5e0
dcfg = (modperl_config_dir_t *) 0x94dd4f0
rcfg = (modperl_config_req_t *) 0x95f0a28
handlers = (modperl_handler_t **) 0x89b93e8
p = (apr_pool_t *) 0x95efc50
av = (MpAV *) 0x89b93d0
avp =
i = 0
status = 0
desc = 0x61ccd3 "PerlResponseHandler"
av_args = (AV *) 0x9688e94
#6 0x0060580a in modperl_callback_per_dir (idx=6, r=0x95efc90, run_mode=MP_HOOK_RUN_FIRST) at modperl_callback.c:369
No locals.
#7 0x005fe79f in modperl_response_handler_run (r=0x95efc90, finish=0) at mod_perl.c:1000
retval =
#8 0x005fe96b in modperl_response_handler_cgi (r=0x95efc90) at mod_perl.c:1100
dcfg = (modperl_config_dir_t *) 0x94dd4f0
h_stdin = (GV *) 0x9688f18
h_stdout = (GV *) 0x9689134
retval = -1
rc =
rcfg = (modperl_config_req_t *) 0x95f0a28
my_perl = (PerlInterpreter *) 0x8eac5f8
interp = (modperl_interp_t *) 0x931fe18
#9 0x080821e9 in ap_run_handler ()
No symbol table info available.
#10 0x08082933 in ap_invoke_handler ()
No symbol table info available.
#11 0x080e0153 in ap_process_request ()
No symbol table info available.
#12 0x080dcb8f in ap_process_http_connection ()
No symbol table info available.
#13 0x0808a873 in ap_run_process_connection ()
No symbol table info available.
#14 0x0808ac86 in ap_process_connection ()
No symbol table info available.
#15 0x08102a73 in child_main ()
No symbol table info available.
#16 0x08102c5e in make_child ()
No symbol table info available.
#17 0x08102eaf in perform_idle_server_maintenance ()
No symbol table info available.
#18 0x081033d9 in ap_mpm_run ()
No symbol table info available.
#19 0x0806b61c in main ()
No symbol table info available.



Is there anything there that helps find where the problem is?
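
In case it helps anyone reproduce this, the same backtrace can be pulled out of a core non-interactively so it can be saved to a file and posted. This is only a sketch: `core_backtrace` is a made-up name, and `DRY_RUN` is a hypothetical switch that prints the command without running it. The `-batch` and `-ex` options are standard gdb options.

```shell
#!/bin/sh
# Hypothetical helper: run "bt full" against a core file in batch mode.
# Usage: core_backtrace /path/to/httpd /path/to/core > trace.txt
core_backtrace() {
    binary=$1
    core=$2
    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Only show the command that would be executed.
        printf 'gdb -batch -ex "bt full" %s %s\n' "$binary" "$core"
        return 0
    fi
    gdb -batch -ex "bt full" "$binary" "$core"
}
```

The frames that say "No symbol table info available" come from binaries without debugging symbols, so the Perl-level location will not be visible from this output alone.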