mod_perl "instability" on Ubuntu 10.04 server platform

on 28.10.2010 12:22:05 by Vanja Hrustic

Hello. Not sure if this issue is related to mod_perl itself, but I
wanted to post it and see if anyone might have some ideas (since it
does, indeed, affect mod_perl :). Also, if it turns out there is an
issue on Ubuntu 10.04, people might want to know about it and not
deploy it.

I have been working on a mod_perl application for a while. I was using
Debian 5 as a testing platform, and everything worked fine.

It has been decided that Ubuntu 10.04 server will be deployed on
production machines, so I have moved my test environment to Ubuntu
10.04 server as well.

After moving to the Ubuntu 10.04 environment, load tests against the
application (using ApacheBench) started showing failed requests: worker
threads were stuck in the 'W' state ("Sending Reply") and hung, so some
requests timed out. I started investigating, trying to find the bug in
the application.

No matter what changes I made to the app, it was still failing. I
pretty much ended up returning from the application immediately after
invocation, but I would still end up with failed requests.

So I completely disabled all the mod_perl apps I had configured and
removed all of the mod_perl configuration (so no extra modules are
loaded, etc.) until only the defaults remained, and ended up testing a
very basic application:

==============================================
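package MyPackage::ltest;

use strict;
use warnings;

use Apache2::RequestRec;
use Apache2::RequestIO;
use Apache2::Const -compile => qw(OK);

# Minimal response handler: send a plain-text "OK" and return.
sub handler
{
    my $r = shift;
    $r->content_type("text/plain");

    $r->puts("OK\n");
    $r->rflush;    # push the buffered output to the client right away

    return Apache2::Const::OK;    # same value as a plain "return 0"
}

1;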
==============================================

I configured Apache and ran the tests, but I got failed requests
again. It makes no sense for something this simple to end up
hanging. Testing looks like this:

[code]
dev@dev:~> ab -c 20 -n 10000 "http://192.168.1.8/ltest"
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 192.168.1.8 (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests

apr_poll: The timeout specified has expired (70007)
Total of 9993 requests completed
[/code]

This is the handler configuration:

==============================================
<Location /ltest>
SetHandler perl-script
PerlResponseHandler MyPackage::ltest
</Location>
==============================================

On some occasions the test goes through and I get no errors. But
very often (especially when I increase the total number of requests
beyond 1,000) I end up with up to 20 requests timing out (and
Apache worker threads hanging), which is not acceptable for something
this simple.
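
A minimal sketch for putting a number on those timeouts, assuming a POSIX
shell. When ab aborts on a timeout it prints a "Total of N requests
completed" line, as in the run above; the helper below subtracts N from the
number requested with -n. The name `missing_requests` is hypothetical, and
treating a run without that line as complete is an assumption made for
illustration.

```shell
# Print how many requests are unaccounted for in an ApacheBench run.
# $1 = number of requests asked for with -n; ab's output is read from stdin.
missing_requests() {
    requested=$1
    # Take the first "Total of N requests completed" line and keep only N.
    completed=$(sed -n '/^Total of [0-9]* requests completed/{s/[^0-9]//g;p;q;}')
    if [ -z "$completed" ]; then
        # No abort line found: assume ab ran to completion (assumption).
        echo 0
    else
        echo $((requested - completed))
    fi
}
```

It could be used as `ab -c 20 -n 10000 "http://192.168.1.8/ltest" 2>&1 | missing_requests 10000`, which would print 7 for the run shown earlier.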

So, I just wanted to point out that there might be an issue with
mod_perl on Ubuntu 10.04 server.

I have tested this on 3 different setups:

1) Ubuntu 10.04 64-bit server, running on Virtual Box
2) Ubuntu 10.04 64-bit server, running on real server hardware
3) Ubuntu 10.04 32-bit server, running on Virtual Box

I have observed the same type of behavior on all 3.

As I mentioned before, I moved to Ubuntu 10.04 server from a Debian 5
test environment. I had no problems on Debian before, so I have tested
this on Debian as well before posting, to make sure it really is
an Ubuntu-specific issue.

So, I just made 1 million requests to a Debian box, and had no
problems and no failed requests. I did a few hundred thousand more
requests in batches of 100,000, just to make sure, and again had no
issues whatsoever. Altogether, I made almost 4 million requests
against Debian (both 32-bit and 64-bit) servers without a single
failure.

I then ran 1 million requests against the Ubuntu 10.04 server, and ended
up with failed requests again:

dev@dev:~> ab -c 20 -n 1000000 "http://192.168.1.8/ltest"
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 192.168.1.8 (be patient)
Completed 100000 requests
Completed 200000 requests
Completed 300000 requests
Completed 400000 requests
Completed 500000 requests
Completed 600000 requests
Completed 700000 requests
Completed 800000 requests
Completed 900000 requests
apr_poll: The timeout specified has expired (70007)
Total of 999981 requests completed

I ended up with 19 worker threads hanging indefinitely. Apache
server-status confirmed that: 19 of them were still in the 'W' state
even after half an hour. There are another 25 or so in the 'G' state
("Gracefully finishing"), but I am not sure if those can also be
considered 'dead'.
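
A small sketch for tracking those counts over time rather than reading the
status page by eye. It assumes mod_status is enabled and that its
machine-readable view (the `?auto` query string, which includes a
`Scoreboard:` line) is reachable; the `/server-status` path in the usage
note and the helper name `count_state` are assumptions.

```shell
# Count how many scoreboard slots are in a given state.
# $1 = state letter (e.g. W or G); mod_status "?auto" output is read from stdin.
count_state() {
    state=$1
    # Keep only the scoreboard string, delete every character that is not
    # the requested state letter, and count what is left.
    sed -n 's/^Scoreboard: //p' | tr -cd "$state" | wc -c | tr -d ' '
}
```

For example, `curl -s "http://192.168.1.8/server-status?auto" | count_state W` prints the number of slots currently in the 'W' state. Running it a few minutes apart shows whether the same number of workers stays stuck.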

Oh yes, forgot to mention: I'm using the apache2-mpm-worker Apache setup.

Since I don't expect I'll get any response on Ubuntu forums (where I
also posted), I am wondering if anyone would be able to tell me what I
could possibly do to find out what is causing this issue.

I suspect this is not a fault with mod_perl itself, but rather with
Apache or other packages/libraries, but I am not sure what would be
the easiest way (short of compiling everything from source on Ubuntu
and trying to see if it can be reproduced with everything compiled
from source) to find the source of the problem.

Any ideas or comments are most welcome :)

Thanks.

Vanja

Re: mod_perl "instability" on Ubuntu 10.04 server platform

on 28.10.2010 13:35:09 by Cosimo Streppone

On Thu, 28 Oct 2010 12:22:05 +0200, Vanja Hrustic wrote:

> No matter what changes I made to the app, it was still failing. I
> pretty much ended up returning from the application immediately after
> invocation, but I would still end up with failed requests.

Are the Debian 5 and Ubuntu 10.04 servers
on different networks/switches?

Did you try making requests from localhost
or from your workstation?

Sometimes a faulty or overloaded network switch can cause
packet loss, and thus slow response times.

--
Cosimo

Re: mod_perl "instability" on Ubuntu 10.04 server platform

on 28.10.2010 13:49:04 by Vanja Hrustic

I tried physical boxes on the same LAN (Gigabit switch), virtual machines
(issuing requests from the same box where VBox is running, so no
switches involved, and also from another box on the LAN), and colocated
boxes (those are Ubuntu; I was issuing requests from one dedicated
server to another), etc.

I just did a few million more requests with Debian, no problems. I did
10,000 with a completely clean Ubuntu 10.04 64-bit server setup and got 4
dead workers.

I will be going back to Debian for sure, but I kind of wanted to see
if I could possibly give some more useful bug description to Ubuntu
people, rather than "Hey, this doesn't work - fix it" ;)

Unfortunately, it seems like I'd have to dig deep into Apache or
mod_perl to hunt this down, and I do not have the knowledge (nor the
'intuition' :) to do this without some guidance.

Basically, if I end up with these unresponsive threads, is there
anything I can do to figure out what caused them to hang? Would gdb be
of any use, would I be able to attach to these threads and see any
useful details (never debugged threaded apps, so no idea how that
would work)?

Thanks.

Vanja

Re: mod_perl "instability" on Ubuntu 10.04 server platform

on 28.10.2010 13:53:42 by Dave Hodgkinson

On 28 Oct 2010, at 12:49, Vanja Hrustic wrote:
>
> Unfortunately, it seems like I'd have to dig deep into Apache or
> mod_perl to hunt this down, and I do not have knowledge (nor
> 'intuition' :) to do this without some guidance.

I have a basic mistrust of shipped packages. I'm in the process of building
perl, httpd and modperl from scratch for a client. I think for serious
use, it's the only way to go.


>
> Basically, if I end up with these unresponsive threads, is there
> anything I can do to figure out what caused them to hang? Would gdb be
> of any use, would I be able to attach to these threads and see any
> useful details (never debugged threaded apps, so no idea how that
> would work)?


Yes, you can do a gdb attach and a stack trace. You should be able to
see from that if it's in apache or perl that the problem is happening.
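
A minimal sketch of that gdb session, assuming gdb is installed and the
script runs with enough privileges to attach (typically root). With the
worker MPM the hung threads live inside the Apache child processes, so
`thread apply all bt` is used to print a backtrace for every thread in each
process. The function name `dump_thread_backtraces` and the `GDB` override
variable are hypothetical conveniences, not part of gdb or Apache.

```shell
# Dump a backtrace of every thread in each given process.
# Arguments: one or more PIDs of Apache child processes.
# Set GDB to another command to preview or stub the invocation.
dump_thread_backtraces() {
    for pid in "$@"; do
        echo "=== PID $pid ==="
        # -p attaches to the running process; -batch makes gdb exit after
        # the -ex command, and gdb detaches from an attached process on exit.
        "${GDB:-gdb}" -p "$pid" -batch -ex "thread apply all bt"
    done
}
```

On Ubuntu the server binary is named apache2, so something like `dump_thread_backtraces $(pgrep apache2) > /tmp/apache-bt.txt` would collect the traces. Frames with mod_perl or libperl symbols in the stuck threads would point at perl, while frames only in httpd or APR code would point at Apache; without debug symbols installed, the frames may be hard to read.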


Re: mod_perl "instability" on Ubuntu 10.04 server platform

on 29.10.2010 03:00:10 by Max Kanat-Alexander

On 10/28/2010 03:22 AM, Vanja Hrustic wrote:
> So, I just made 1 million requests to a Debian box, and had no
> problems and no failed requests.

Are both the Debian and the Ubuntu box using a worker MPM, or is one of
them using prefork and the other using worker?
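
One way to check this on each box, sketched under the assumption that the
packaged `apache2ctl -V` is available and prints a "Server MPM:" line, as
Apache 2.2 builds do. The helper name `server_mpm` is hypothetical.

```shell
# Print the MPM name (lowercased) found in "apache2ctl -V" style output
# read from stdin.
server_mpm() {
    sed -n 's/^Server MPM:[[:space:]]*//p' | tr 'A-Z' 'a-z'
}
```

Running `apache2ctl -V | server_mpm` on the Debian and the Ubuntu machine and comparing the two results (for example "prefork" versus "worker") would answer the question directly.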

-Max
--
http://www.everythingsolved.com/
Competent, Friendly Bugzilla and Perl Services. Everything Else, too.