Using (?{}) code blocks and $^R

on 20.09.2007 01:06:14 by Clint Olsen

Hi:

I've been writing some behemoth regular expressions and having some good
luck relying on $^R to pass results from the RE. However, I have a
non-intuitive result coming from Perl:

#!/usr/bin/perl

use strict;
use warnings;
#use re 'debug';


my $multiline_comment = qr@/\*(?{ print "starting multi-line\n"; [ 0, 2 ] })
(?:(.)+?
(?{ [ $^R->[0], $^R->[1] + length $^N ] })
| (?: (\n+) (?{ print "found newline in multi\n"; [ length $^N, 1 ] }))
)*?
\*/ (?{ print "finished comment\n"; [1, 1]; })
@x;


my $foo = "/* foo
bar */";

while ($foo =~ m/$multiline_comment/g) {
print "@{$^R}\n";

}

When I run this example, I get:

# ./test
starting multi-line
found newline in multi
finished comment
0 2

I expected to see $^R contain the results of the final code block, not the
block above. I assume there's some weird scoping that I didn't anticipate,
but I'm not sure why. I figured the code would evaluate left->right.

Thanks,

-Clint
--
Clint Olsen
clint at NULlsen dot net
"I am Dick Lexic of Borg. Prepare to be ass-laminated."
-- Styx Allum

Re: Using (?{}) code blocks and $^R

on 21.09.2007 09:00:50 by nobull67

On Sep 20, 12:06 am, Clint Olsen wrote:

> I've been writing some behemoth regular expressions and having some good
> luck relying on $^R to pass results from the RE. However, I have a
> non-intuitive result coming from Perl:

Wow, this is a "Gem"!

> #!/usr/bin/perl
>
> use strict;
> use warnings;
> #use re 'debug';
>
> my $multiline_comment = qr@/\*(?{ print "starting multi-line\n"; [ 0, 2 ] })
> (?:(.)+?
> (?{ [ $^R->[0], $^R->[1] + length $^N ] })
> | (?: (\n+) (?{ print "found newline in multi\n"; [ length $^N, 1 ] }))
> )*?
> \*/ (?{ print "finished comment\n"; [1, 1]; })
> @x;
>
> my $foo = "/* foo
> bar */";
>
> while ($foo =~ m/$multiline_comment/g) {
> print "@{$^R}\n";
>
> }
>
> When I run this example, I get:
>
> # ./test
> starting multi-line
> found newline in multi
> finished comment
> 0 2
>
> I expected to see $^R contain the results of the final code block, not the
> block above. I assume there's some weird scoping that I didn't anticipate,
> but I'm not sure why.

IMNSO $^R was a special variable too far. Code in (?{}) is already
able to set dynamically scoped (package) variables, and that really
should be enough to propagate information from one block to another. It's
certainly much clearer what's going on with explicit variables.

What's happening is that all but the first (?{}) does, in effect, an
implicit local($^R). What is, IMHO, confusing here is the fact that
the first (?{}) does not local($^R).

Consider:

#!/usr/bin/perl
use strict;
use warnings;

$_='X';


/ (?{1}) (?{2}) X (?{ print $^R }) /x and print "$^R\n";
#1 '21'
/ (?{1}) (?: (?{2}) X )? (?{ print $^R }) /x and print "$^R\n";
#2 '21'
/ (?{1}) (?: (?{2}) Y )? (?{ print $^R }) /x and print "$^R\n";
#3 '11'

our $r;
/ (?{$r=1}) (?{local $r=2}) X (?{ print $r }) /x and print "$r\n";
#1a '21'
/ (?{$r=1})((?{local $r=2}) X)?(?{ print $r }) /x and print "$r\n";
#2a '21'
/ (?{$r=1})((?{local $r=2}) Y)?(?{ print $r }) /x and print "$r\n";
#3a '11'
/ (?{$r=1})((?{ $r=2}) Y)?(?{ print $r }) /x and print "$r\n";
#3b '22'

__END__

All the above matches succeed. Examples 'a' show what's going on in
terms of an ordinary variable. (I only changed (?:) to () to avoid
line-wrap in the example - the value of $1 is not in discussion
here).

In examples 1 and 1a the local is kinda redundant, as it was in your code,
but in examples 2 and 3 it is needed so that $^R can be popped when the
character does not match. Example 3 shows that when the pattern after
the second (?{}) fails, the stack is popped back so that
local()izations from that block are undone.

Example 3b shows what would go wrong if there were no implicit
local($^R). Even though the $r=2 happens in a branch of the pattern
match that was subsequently backtracked, its effect is not undone.

All the above may, in fact, be a simplification of the truth but in
conclusion I think that perlre's description of $^R should say that
the state of $^R should be considered indeterminate after completion
of the match.
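
A minimal sketch of the explicit-variable approach, applied to a cut-down
version of the comment pattern. The names $newlines and $found are
hypothetical (they are not in the original post), and the pattern only
counts newlines instead of tracking line and column. The running count is
localized so that backtracking undoes it, and the last block copies it to a
plain package variable, since the localized value does not survive the end
of the match:

```perl
#!/usr/bin/perl
use strict;
use warnings;

our $newlines;    # running count, localized inside the pattern (hypothetical name)
our $found;       # result copied out by the last code block (hypothetical name)

my $comment = qr{
    /\*  (?{ $newlines = 0 })                        # reset at the comment opener
    (?:
        \n  (?{ local $newlines = $newlines + 1 })   # undone if this branch is backtracked
      | [^\n]
    )*?
    \*/  (?{ $found = $newlines })                   # copy out before the locals are popped
}x;

my $text = "/* foo\nbar */";

if ($text =~ $comment) {
    print "newlines in comment: $found\n";
}
```

After the match the caller reads $found and never touches $^R, so the
result does not depend on what state $^R is left in.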

Re: Using (?{}) code blocks and $^R

on 24.09.2007 23:18:34 by Clint Olsen

On 2007-09-21, Brian McCauley wrote:
> IMNSO $^R was a special variable too far. Code in (?{}) is already
> able to set dynamically scoped (package) variables, and that really
> should be enough to propagate information from one block to another. It's
> certainly much clearer what's going on with explicit variables.

I had tried that at first, but doing this forces me to use lexically scoped
'globals' in my code or I have to move my behemoth patterns into the
subroutine scope, and I wasn't entirely sure whether Perl would in fact
build up and tear down these REs every time the subroutine was called. I
never got a clarification on that one, so I marched onward since the
results of $^R and $^N were sane after my other patterns. It also just
/seemed/ weird to have to set variables explicitly. The localization
examples in the manpage weren't all that clear to me. It looked to me as
if, even when the pattern matched, you'd pop the stack and lose those
intermediate values anyway, which is why I tried to use $^R. In essence,
it appeared that info could be passed top down but results couldn't be
handed back (bottom up)?
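
For what it's worth, the counter example in perlre seems to address the
bottom-up case: the working value is localized, and a final code block
copies it into a variable that is not localized, so it survives the match.
The sketch below follows that idiom. The helper count_prefix is hypothetical,
and the pattern is built once at file scope so the sub only uses it:

```perl
#!/usr/bin/perl
use strict;
use warnings;

our ($cnt, $res);    # $cnt is the working value, $res receives the final answer

# Built once at file scope; the sub below only uses it.
my $re = qr{
    (?{ $cnt = 0 })                          # initialize
    (?: a (?{ local $cnt = $cnt + 1 }) )*    # count each a, backtracking unwinds the count
    aaaa
    (?{ $res = $cnt })                       # on success, copy to a non-localized variable
}x;

# Hypothetical helper: how many a's the starred group kept, or undef on failure.
sub count_prefix {
    my ($s) = @_;
    return $s =~ $re ? $res : undef;
}

print count_prefix('a' x 8), "\n";    # counts up to 8, then unwinds to 4
print count_prefix('a' x 6), "\n";    # unwinds to 2
```

The starred group first grabs every a, then gives some back so that the
literal aaaa can match. Each step back undoes one local, which is why $res
ends up holding the unwound count and not the peak.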

Thanks,

-Clint

Re: Using (?{}) code blocks and $^R

on 18.10.2007 02:32:10 by dkcombs

Man, am I out of date (or maybe just blind to the doc!).

Really briefly, just a quick overview, what are these two
things, and where do they seem to be useful?

Also, in what version did each of them first appear?


THANKS!

David

Re: Using (?{}) code blocks and $^R

on 18.10.2007 17:38:14 by glex_no-spam

David Combs wrote:
> Man, am I out of date (or maybe just blind to the doc!).
>
> Really briefly, just a quick overview, what are these two
> things, and where do they seem to be useful?

perldoc perlre
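
As a rough taste of what that document describes: (?{ code }) runs Perl
code at that point in the match, $^R holds the value of the most recently
executed code block, and $^N holds the text of the most recently closed
capture group. A minimal sketch, where $doubled is a hypothetical variable
used only to carry the result out of the match:

```perl
#!/usr/bin/perl
use strict;
use warnings;

our $doubled;    # hypothetical variable that receives the result

# First block computes a value from the last closed group ($^N);
# second block reads that value back through $^R and stores it.
if ("foo=42" =~ / (\d+) (?{ $^N * 2 }) (?{ $doubled = $^R }) /x) {
    print "matched $1, doubled: $doubled\n";
}
```

The value is read from $^R inside the pattern and stored in an ordinary
variable, in line with the advice earlier in this thread not to rely on
$^R once the match has finished.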

>
> Also, in what version did each of them first appear?