mod_perl regex conundrum

on 15.07.2004 15:24:37 by Simon Miner

Hello,

My company has just upgraded our Apache/mod_perl and supporting software
versions, and we are now seeing a strange phenomenon. A piece of code which
has always worked speedily up to this point, now occasionally, but
predictably, takes 10 times longer to execute. The code fragment in
question is this.

-----

my $quoted_link = qr{
    ((href|action)\s*=\s*)    # matching a link in an href or action attribute ($1)
    (["'])                    # starting link delimiter (' or ") ($3)
    (
        (https?://\w+?\.chr(istian)?book\.com)?    # optional domain name (old or new) ($4)
        /                     # / separating domain from path
        (?![^\3\#\?>]+?\.(exe|sit|pdf|ra?m|mp3|wax|css|js))    # skip non-HTML file types
        [^\3\#\?>]+?          # anything that can be in a path (no anchors, query strings, etc.)
    )                         # everything before the ticket
    (
        [\?\#]                # start of a query string or anchor reference
        [^\3]+?               # query string or anchor reference
    )?                        # everything after the ticket ($8)
    \3                        # ending link delimiter
}six;

....

$$text_ref =~ s{$quoted_link}{$1$3$4/$ticket$8$3}gx;

---

The purpose of this code is to tag each URL on a web page with a session ID.
The $text_ref variable is a scalar reference to the page content, and the
$ticket variable contains the ID.
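
The tagging step described above can be sketched with a much smaller pattern. This is only an illustration under stated assumptions: the name $simple_link, the helper tag_links, and the sample HTML are hypothetical, and the pattern handles only site-relative links rather than everything the original regex covers. It spells out the quote characters inside the bracketed classes, because Perl does not treat \3 as a backreference inside a character class.

```perl
use strict;
use warnings;

# Simplified, hypothetical pattern for illustration only; it is not the
# original regex and handles only site-relative links.
my $simple_link = qr{
    ( (?:href|action) \s* = \s* )   # attribute name and equals sign
    ( ["'] )                        # opening quote
    ( / [^"'\#?>]+? )               # path, up to a query string or anchor
    ( [?\#] [^"']+? )?              # optional query string or anchor
    \2                              # closing quote, same as the opening one
}six;

# Insert "/<ticket>" between the path and any query string or anchor.
sub tag_links {
    my ( $text_ref, $ticket ) = @_;
    $$text_ref =~ s{$simple_link}{ $1 . $2 . $3 . "/$ticket" . ( defined $4 ? $4 : '' ) . $2 }ge;
    return;
}

my $html = q{<a href="/books/index.html?page=2">Books</a> <a href='/cart'>Cart</a>};
tag_links( \$html, 'T123' );
print "$html\n";
```

The /e replacement guards against an undefined optional group, which keeps the substitution quiet under "use warnings" when a link has no query string.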

In an effort to debug why the slow-down was occurring, I replaced the
substitution in the snippet above with the following.

---

use Time::HiRes qw(time);

my $start = time;
my $elapsed;
my $count = 0;

print "<p>Text length = " . length( $$text_ref ) . "</p>";

while ( $$text_ref =~ /$quoted_link/g ) {
    $count += 1;
    $elapsed = time - $start;
    print "<p>link_filter2: $count, $& $elapsed</p>";
}

---

This code prints out the cumulative time the regex is taking to match as it
encounters links on a web page. The problem continued to show itself with
this change, but I still don't know how to pinpoint why it's happening.

Here are some other noteworthy items pertaining to this problem.

* This problem only occurs under Apache/mod_perl. I saved the HTML of a
normal page and a slow web page to files, and copied the code above into a
small command line script. When I executed the script against the files,
the slow page took about the right amount of time to process relative to
the normal page (as opposed to 10 times as long). (BTW: the slow page is
actually smaller than the normal page, so it should be processed more
quickly, which it was in the command line script.)

* Our infrastructure upgrade involved the following software updates.

-- Before upgrade: Apache 1.3.29, mod_perl 1.27, Perl 5.6.1, Solaris 5.8

-- After upgrade: Apache 1.3.31, mod_perl 1.29, Perl 5.8.3, Solaris 5.9

So, I am wondering...

* Can anyone suggest reasons why this code might be executing so slowly?

* Can anyone suggest potential improvements to the regex so it will execute
faster?

* Does anyone know of changes between the software versions mentioned above
that could lead to this behavior?

Thanks in advance for your help.

- Simon

-----------------------------------------------

Simon Miner

Applications Engineer

Christianbook.com

E: sminer@christianbook.com

T: (978) 573-2233

F: (978) 573-8233

-----------------------------------------------


Re: mod_perl regex conundrum

on 15.07.2004 18:55:19 by Stas Bekman

Simon Miner wrote:

> My company has just upgraded our Apache/mod_perl and supporting software
> versions, and we are now seeing a strange phenomenon. A piece of code which
> has always worked speedily up to this point, now occasionally, but
> predictably, takes 10 times longer to execute. The code fragment in
> question is this.

> my $quoted_link = qr{ ... }

The first thing I'd do is check whether any code uses $`, $&, or $',
which are known to cause this kind of slowdown. See:
http://search.cpan.org/dist/Devel-SawAmpersand/lib/Devel/SawAmpersand.pm
It may happen under mod_perl since you usually end up loading quite a
few modules into the same interpreters, including those that you aren't
using for this particular code in question.
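
As a rough sketch of how a loop like the debugging one in the original post could report each match without $&, the matched text can be rebuilt from the offset arrays @- and @+. The helper name matches_without_ampersand, the sample HTML, and the short pattern are assumptions made up for this illustration.

```perl
use strict;
use warnings;

# Return every match of $re in $text without touching the match variables
# that carry the performance penalty. The arrays @- and @+ hold the start
# and end offsets of the last successful match.
sub matches_without_ampersand {
    my ( $text, $re ) = @_;
    my @found;
    while ( $text =~ /$re/g ) {
        push @found, substr( $text, $-[0], $+[0] - $-[0] );
    }
    return @found;
}

# Hypothetical sample input and pattern.
my @links = matches_without_ampersand(
    q{<a href="/a.html">A</a> <a href='/b.html'>B</a>},
    qr{href\s*=\s*(["'])[^"']+\1}i,
);
print "$_\n" for @links;
```

Wrapping the whole pattern in one extra capture group and using $1 would work just as well; the offset arrays simply avoid renumbering existing groups.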

--
__________________________________________________________________
Stas Bekman JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/ mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org http://ticketmaster.com

--
Report problems: http://perl.apache.org/bugs/
Mail list info: http://perl.apache.org/maillist/modperl.html
List etiquette: http://perl.apache.org/maillist/email-etiquette.html

RE: mod_perl regex conundrum

on 15.07.2004 20:31:38 by Simon Miner

Thanks for the suggestion!

Between sawampersand and grep'ing my code, I did find an instance of $&. I
removed it, but I am still seeing the problem.

I have also noticed that the size of our mod_perl processes has doubled
since the upgrade (from ~50M to ~100M). Did Perl, Apache, and mod_perl
really get that much bigger between the versions I mentioned in my last
email?

Any other suggestions on diagnosing or improving the regex would be greatly
appreciated.

Thanks again.

- Simon

Re: mod_perl regex conundrum

on 15.07.2004 20:53:49 by Stas Bekman

Simon Miner wrote:
> Thanks for the suggestion!
>
> Between sawampersand and grep'ing my code, I did find an instance of $&. I
> removed it, but I am still seeing the problem.

Did you actually use Devel::SawAmpersand to test it? There are other
modules that pull those in, e.g. if you do 'use English'.

> I have also noticed that the size of our mod_perl processes has doubled
> since the upgrade (from ~50M to ~100M). Did Perl, Apache, and mod_perl
> really get that much bigger between the versions I mentioned in my last
> email?

Perl is getting bigger all the time but definitely not by this amount.
Use Apache::Status coupled with all the goodies it invokes (B::Size etc)
to figure out who eats your memory. If you have your perl built with
ithreads (to check run: perl -V:useithreads), recompile it to not enable
those (unless you plan to use them). You will find quite a few other
performance/memory usage related tips in the "Practical mod_perl" book [1].
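
The same ithreads check can also be made from inside a running interpreter through the core Config module, which is handy when the perl on the command line may not be the one mod_perl was built against. This is a minimal sketch; the variable name $ithreads is arbitrary.

```perl
use strict;
use warnings;
use Config;

# $Config{useithreads} is set to 'define' on a threaded perl and is
# undefined otherwise.
my $ithreads = $Config{useithreads} ? 'enabled' : 'disabled';

print "perl version: $Config{version}\n";
print "ithreads:     $ithreads\n";
```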

Also in your original report, the example of using Time::HiRes is highly
unreliable. You need to count CPU clocks, not wallclocks. Use
Benchmark.pm instead.
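
A minimal sketch of timing a match loop with Benchmark.pm follows. The generated page and the short pattern are hypothetical stand-ins for the real page content and $quoted_link, and the iteration count of 50 is arbitrary.

```perl
use strict;
use warnings;
use Benchmark qw(timeit timestr);

# Hypothetical stand-ins for the real page content and pattern.
my $page = join '', map { qq{<a href="/page$_.html">link $_</a>\n} } 1 .. 200;
my $re   = qr{href\s*=\s*(["'])[^"']+\1}i;

# Count the matches once so the timed code has a visible result.
my $matches = 0;
$matches++ while $page =~ /$re/g;

# timeit() runs the code the given number of times and returns a Benchmark
# object holding wallclock, user CPU and system CPU times.
my $t = timeit( 50, sub { my $n = 0; $n++ while $page =~ /$re/g; } );

print "matches per pass: $matches\n";
print "50 passes: ", timestr($t), "\n";
printf "CPU seconds (user + system): %.3f\n", $t->cpu_p;
```

Running the same script against a saved fast page and a saved slow page gives CPU figures that are less sensitive to server load than elapsed wallclock time.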

Also have you tried using some special purpose CPAN module to do the
parsing for you? e.g. I remember Randal's WebTechniques articles [2]
have plenty of examples of using modules like HTML::Filter, HTML::Tree, etc.

[1] http://modperlbook.org/
[2] http://www.stonehenge.com/merlyn/WebTechniques/

[OT] Re: mod_perl regex conundrum

on 16.07.2004 00:27:20 by Larry Leszczynski

> > Between sawampersand and grep'ing my code, I did find an instance of $&.
> > I removed it, but I am still seeing the problem.
>
> Did you actually use Devel::SawAmpersand to test it? There are other
> modules that pull those in, e.g. if you do 'use English'.

On a related note, it's possible to 'use English' *without* pulling in the
problematic regex match variables, like so:

use English qw( -no_match_vars ); # Avoids regex performance penalty

Larry

Re: [OT] Re: mod_perl regex conundrum

on 16.07.2004 00:41:15 by Stas Bekman

Larry Leszczynski wrote:
>>>Between sawampersand and grep'ing my code, I did find an instance of $&.
>>>I removed it, but I am still seeing the problem.
>>
>>Did you actually use Devel::SawAmpersand to test it? There are other
>>modules that pull those in, e.g. if you do 'use English'.
>
>
> On a related note, it's possible to 'use English' *without* pulling in the
> problematic regex match variables, like so:
>
> use English qw( -no_match_vars ); # Avoids regex performance penalty

Thanks Larry, the Devel::SawAmpersand manpage documents that:
http://search.cpan.org/dist/Devel-SawAmpersand/lib/Devel/SawAmpersand.pm

RE: mod_perl regex conundrum

on 22.07.2004 16:40:28 by Simon Miner

Hi again,

Yes, I actually did a use Devel::SawAmpersand and it didn't see any evil
variables. (Currently, our code isn't using the English package.)

I ended up modifying our code so that it skips the regex I sent in my
previous message on most requests. This circumvents the biggest part of our
slow down, but it isn't completely solving the problem. We're still seeing
code sluggishness on certain pages of our web app. The strange thing is
that, in most cases, the same code runs as quickly as it did before our
upgrade. It's just on a few pages of the web app that it slows down.

We tried building Perl using Perl's malloc. Initial tests showed that this
alleviated some of the memory bloating, but when we rebuilt some of our
production web servers, the memory savings weren't realized. In fact, the
Perls with -Dusemymalloc were using more memory. This, coupled with a
warning in the Perl 5.8.3 INSTALL.Solaris file that said to never use
-Dusemymalloc with gcc on Solaris after Perl 5.7 made us uncomfortable with
this change, so we rolled it back.

Devel::DProf and dprofpp were showing that some calls to CGI.pm were taking
a good deal of time, so we're working on replacing them. This is not
producing much of a speedup, however.

We are also working to install the GTop module in our development
environment. However, as we use Solaris, this is proving tricky because
several prerequisite libraries are not present. (Thanks for helping my
coworker, Peter Wood, with this, Stas.) Has anyone else successfully
installed and used GTop on Solaris?

We found most of these approaches by looking in Practical mod_perl and the
mod_perl Developer's Cookbook. Are there any other suggestions that folks
on the mailing list can offer us as we continue to troubleshoot this issue?

Thanks again for your assistance.

- Simon

RE: mod_perl regex conundrum

on 22.07.2004 19:17:49 by Perrin Harkins

On Thu, 2004-07-22 at 10:40, Simon Miner wrote:
> I ended up modifying our code so that it skips the regex I sent in my
> previous message on most requests. This circumvents the biggest part of our
> slow down, but it isn't completely solving the problem. We're still seeing
> code sluggishness on certain pages of our web app. The strange thing is
> that, in most cases, the same code runs as quickly as it did before our
> upgrade. It's just on a few pages of the web app that it slows down.
>
> We tried building Perl using Perl's malloc.

Why? Was it going into swap?

> Devel::DProf and dprofpp were showing that some calls to CGI.pm were taking
> a good deal of time, so we're working on replacing them. This is not
> producing much of a speedup, however.

Ironically, CGI.pm slows down CGI programs a lot (all of that code to
compile), but doesn't hurt mod_perl programs much.

> We are also working to install the GTop module in our development
> environment.

Why? Are you trying to get memory stats? You can probably just read
from /proc, like Apache::SizeLimit does.

> We found most of these approaches by looking in Practical mod_perl and the
> mod_perl Developer's Cookbook. Are there any other suggestions that folks
> on the mailing list can offer us as we continue to troubleshoot this issue?

First, find out if you are running out of memory and going into swap.
If you are, fix that with the techniques shown in the books. Next,
figure out if there is something different about a slow request. You
may need to install some extra logging so that you can tell which
requests were slow and reproduce them later. If you can find a specific
request that is always fast and another that is always slow, this should
lead you to the source of the problem. You may have to debug in -X mode
to solve this, since it could involve something about a request that
leaves the server in a bad state for the next request.
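
The extra logging suggested above could look roughly like the sketch below. It is not mod_perl-specific: the helper timed_request, the in-memory @slow_log, the labels, and the thresholds are all hypothetical, and a real handler would write to the server error log instead.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical in-memory log of slow requests.
my @slow_log;

# Run a piece of request code and record it if the wallclock time reaches
# the threshold. CPU time is recorded too, which helps tell slow code apart
# from a process that was waiting or swapping.
sub timed_request {
    my ( $label, $threshold, $code ) = @_;
    my $t0     = [gettimeofday];
    my @cpu0   = times;
    my $result = $code->();
    my $wall   = tv_interval($t0);
    my @cpu1   = times;
    my $cpu    = ( $cpu1[0] - $cpu0[0] ) + ( $cpu1[1] - $cpu0[1] );
    push @slow_log, sprintf( '%s wall=%.3fs cpu=%.3fs', $label, $wall, $cpu )
        if $wall >= $threshold;
    return $result;
}

timed_request( '/fast', 10, sub { 1 } );                       # far below threshold
timed_request( '/slow', 0,  sub { my $x = 0; $x += $_ for 1 .. 10_000; $x } );
print "$_\n" for @slow_log;
```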

- Perrin

Re: mod_perl regex conundrum

on 22.07.2004 19:28:33 by Stas Bekman

Simon Miner wrote:
[...]

> We found most of these approaches by looking in Practical mod_perl and the
> mod_perl Developer's Cookbook. Are there any other suggestions that folks
> on the mailing list can offer us as we continue to troubleshoot this issue?

If you can pinpoint the chunks of code that you find slow, I can try to
look at those to optimize them, e.g. the regex that you've mentioned.
It's usually possible to rewrite a regex to make it faster. If you can
set up a self-contained package with Benchmark.pm using Geoff's
Apache-Test skeleton (http://apache.org/~geoff/), that will save me a
lot of time trying to reproduce the environment. I'm working on Linux,
but hopefully that shouldn't make much difference.

RE: mod_perl regex conundrum

on 02.08.2004 20:27:25 by Simon Miner

Thanks for your offer to examine our code. We were finally able to track
down the problem. Our code is using a third party API (which was also
upgraded), and that began emitting small pieces of Unicode, unbeknownst to
us. This Unicode was causing the regex slowdown. When we decoded the
Unicode, it began working as before.
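
A small sketch of how such data can be spotted: a string that carries Perl's internal UTF-8 flag passes that flag on to anything it is concatenated with, so one small piece of Unicode can change how a whole page is stored. The sample strings and the helper describe are assumptions for illustration, and encoding to UTF-8 octets is shown as one possible remedy, not necessarily the fix applied here.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical fragments: one plain byte string, and one containing a
# character above 0xFF, which Perl stores internally as UTF-8.
my $bytes = '<p>plain ASCII content</p>';
my $chars = "price \x{20AC}5";    # contains U+20AC EURO SIGN

sub describe {
    my ($s) = @_;
    return utf8::is_utf8($s) ? 'character string (UTF-8 flagged)' : 'byte string';
}

# Concatenation upgrades the whole page, not just the Unicode fragment.
my $page = $bytes . $chars;

# Encoding to octets yields a byte string again.
my $octets = encode( 'UTF-8', $page );

print 'bytes:  ', describe($bytes),  "\n";
print 'page:   ', describe($page),   "\n";
print 'octets: ', describe($octets), "\n";
```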

Also, I talked to some folks at OSCON who said that some regex optimizations
were removed in Perl 5.8 because they were buggy. This could also have
contributed to the slowdown.

Thanks again.
