intermittent segfaults, ssl?

on 12.01.2010 19:24:17 by Mark Copper

Hi,

I have a server like this:
Server Version: Apache/2.2.9 (Debian) mod_ssl/2.2.9 OpenSSL/0.9.8g
mod_apreq2-20051231/2.6.0 mod_perl/2.0.4 Perl/v5.10.0
I'm also using HTML::Mason

I've been getting intermittent segfaults like this:
child pid 10142 exit signal Segmentation fault (11)
ever since I installed Apache2 in March.

I am going to try to debug this, but I thought I would ask if anyone
might have a suggestion based on this behavior:
- no segfaults occur within 8 hours of an Apache restart; the first fault
can be 8 to 48 hours after restart, the 2nd may occur within seconds;
there have never been more than 5 in a day.

- I have observed these faults *only* for port 443 (ssl) requests.

- The simplest failing case was a plain HTML page served through
mod_perl and Mason, i.e. with no database call.

I know I probably shouldn't be imposing on list readers' time without doing
more work, but I just wonder if it isn't something really elementary
that I'm missing.

Mark

Re: intermittent segfaults, ssl?

on 12.01.2010 19:52:15 by Alex Aminoff

We had a very similar problem. After a lot of effort trying to track down
the source, we believe we isolated it to a line of Perl code that
incorrectly set $/ without localizing it.
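
For readers unfamiliar with the pattern, here is a minimal sketch of the
difference between assigning to $/ globally and localizing it. It is not the
code from our application; the helper names slurp_unsafe and slurp_safe are
hypothetical and exist only for illustration. Under mod_perl the Perl
interpreter persists across requests, so a global change to $/ stays in effect
for later code running in the same child process.

```perl
use strict;
use warnings;

# Hypothetical helper (illustration only): read a whole file at once.
# PROBLEM: assigns to the global $/ and never restores it, so every
# later readline in this interpreter also runs in "slurp" mode.
sub slurp_unsafe {
    my ($path) = @_;
    $/ = undef;                      # global change, outlives this sub
    open my $fh, '<', $path or die "cannot open input: $!";
    my $content = <$fh>;
    close $fh;
    return $content;
}

# Hypothetical helper (illustration only): same behaviour, but the
# change to $/ is confined to this sub's dynamic scope.
sub slurp_safe {
    my ($path) = @_;
    local $/;                        # undef here, restored on scope exit
    open my $fh, '<', $path or die "cannot open input: $!";
    my $content = <$fh>;
    close $fh;
    return $content;
}
```

After slurp_safe returns, $/ holds whatever value it had before the call
(normally "\n"); after slurp_unsafe returns, $/ remains undefined until
something sets it again.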

mod_backtrace showed that the problem seemed to be happening in

modperl_perl_global_request_save

Here is the thread that suggested the fix that worked for us:

http://www.gossamer-threads.com/lists/modperl/modperl/100066

Here is the thread we posted, not actually any more informative:

http://www.gossamer-threads.com/lists/modperl/modperl/100842

The proof is in the pudding. After fixing the line with $/, we went from
dozens of seg faults per hour to none.

- Alex Aminoff
BaseSpace.net
National Bureau of Economic Research (nber.org)

Re: intermittent segfaults, ssl?

on 14.01.2010 05:41:45 by Mark Copper

Despite what I said, this seems to be CPAN bug 37027 or, in my case,
Debian bug #520406, involving the DBD-mysql module.

Mark
