FAQ 6.16 How do I efficiently match many regular expressions at once?

on 01.01.2008 03:03:02 by PerlFAQ Server

This is an excerpt from the latest version of perlfaq6.pod, which
comes with the standard Perl distribution. These postings aim to
reduce the number of repeated questions as well as allow the community
to review and update the answers. The latest version of the complete
perlfaq is at http://faq.perl.org .

--------------------------------------------------------------------

6.16: How do I efficiently match many regular expressions at once?



( contributed by brian d foy )

Avoid asking Perl to compile a regular expression every time you want to
match it. In this example, perl must recompile the regular expression
for every iteration of the foreach() loop since it has no way to know
what $pattern will be.

@patterns = qw( foo bar baz );

LINE: while( <> )
{
foreach $pattern ( @patterns )
{
if( /\b$pattern\b/i )
{
print;
next LINE;
}
}
}

The qr// operator showed up in perl 5.005. It compiles a regular
expression, but doesn't apply it. When you use the pre-compiled version
of the regex, perl does less work. In this example, I inserted a map()
to turn each pattern into its pre-compiled form. The rest of the script
is the same, but faster.

@patterns = map { qr/\b$_\b/i } qw( foo bar baz );

LINE: while( <> )
{
foreach $pattern ( @patterns )
{
print if /$pattern/i;
next LINE;
}
}
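
As a rough, self-contained sketch of the precompiled approach, the
following wraps the inner loop in a subroutine so it can run without
input on STDIN. The name matches_any() and the sample lines are made up
for illustration and are not part of the FAQ text.

```perl
use strict;
use warnings;

# Compile each pattern once, up front, with qr//.
my @patterns = map { qr/\b$_\b/i } qw( foo bar baz );

# Hypothetical helper: true if any precompiled pattern matches the line.
sub matches_any {
    my ($line) = @_;
    foreach my $pattern (@patterns) {
        # A qr// object can be used directly on the right of =~.
        return 1 if $line =~ $pattern;    # stop at the first match
    }
    return 0;
}

# Made-up sample input standing in for lines read from <>.
my @sample = ( "a Foo here", "food only", "nothing" );
my @hits   = grep { matches_any($_) } @sample;

print "$_\n" for @hits;
```

Only the first sample line should print, since \b keeps "foo" from
matching inside "food".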

In some cases, you may be able to make several patterns into a single
regular expression. Beware of situations that require backtracking
though.

$regex = join '|', qw( foo bar baz );

LINE: while( <> )
{
print if /\b(?:$regex)\b/i;
}
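
A possible variation, shown here only as a sketch: escape each word
with quotemeta() so it matches literally, and compile the combined
alternation once with qr// outside the loop. The sample lines are
invented for illustration.

```perl
use strict;
use warnings;

my @words = qw( foo bar baz );

# quotemeta() escapes regex metacharacters so each word matches literally.
my $alternation = join '|', map { quotemeta($_) } @words;

# Compile the combined pattern once, before any matching happens.
my $regex = qr/\b(?:$alternation)\b/i;

# Made-up sample input standing in for lines read from <>.
my @sample = ( "Bar none", "rebar", "a.baz!" );
my @hits   = grep { $_ =~ $regex } @sample;

print "$_\n" for @hits;
```

Here "rebar" should not print, because there is no word boundary
before "bar" in that line.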

For more details on regular expression efficiency, see Mastering Regular
Expressions by Jeffrey Friedl. He explains how regular expression
engines work and why some patterns are surprisingly inefficient. Once you
understand how perl applies regular expressions, you can tune them for
individual situations.



--------------------------------------------------------------------

The perlfaq-workers, a group of volunteers, maintain the perlfaq. They
are not necessarily experts in every domain where Perl might show up,
so please include as much information as possible and relevant in any
corrections. The perlfaq-workers also don't have access to every
operating system or platform, so please include relevant details for
corrections to examples that do not work on particular platforms.
Working code is greatly appreciated.

If you'd like to help maintain the perlfaq, see the details in
perlfaq.pod.

Re: FAQ 6.16 How do I efficiently match many regular expressions at once?

on 01.01.2008 09:57:01 by xueweizhong

> @patterns = map { qr/\b$_\b/i } qw( foo bar baz );
>
> LINE: while( <> )
> {
> foreach $pattern ( @patterns )
> {
> print if /$pattern/i;
> next LINE;
> }
> }
>

This code's execution flow is too tricky to understand. Is
that intentional?

As I understand it, the code snippet should be changed to


@patterns = map { qr/\b$_\b/i } qw( foo bar baz );

LINE: while( <> )
{
foreach $pattern ( @patterns )
{
if (/$pattern/) {
print;
next LINE;
}
}
}


-Todd

Re: FAQ 6.16 How do I efficiently match many regular expressions at once?

on 01.01.2008 18:43:14 by brian d foy

Todd wrote:

> > @patterns = map { qr/\b$_\b/i } qw( foo bar baz );
> >
> > LINE: while( <> )
> > {
> > foreach $pattern ( @patterns )
> > {
> > print if /$pattern/i;
> > next LINE;
> > }
> > }
> >
>
> This code's execution flow is too tricky to understand. Is
> that intentional?

It looks like that part of the code was supposed to be the same as the
previous code.

Before you start assigning malice, just ask. :)