more elegant way to say ($1, $2, $3, $4, ...)?

on 09.08.2007 20:04:38 by Larry

I'm using a /g regex in a while loop to capture parenthesized matches
to meaningful variable names like this:

while (/ (...) ... (...) ... (...)/g) {
    my ($foo, $bar, $baz) = ($1, $2, $3);
    ...
}

The ($1, $2, $3) part seems inelegant ... is there a more elegant way?

BTW, don't suggest:

while (my ($foo, $bar, $baz) = / (...) ... (...) ... (...)/g) {
    ...
}

That will cause the regex to evaluate in a list context, which changes
the behavior of /g to match against all of $_ at once; the assignment
then keeps only the first match's captures and throws away the rest.
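
The difference between the two contexts can be seen in a small sketch. The sample string and variable names below are made up for illustration, since the thread never shows the real input. Note also that list-context /g starts over from the beginning of the string on every evaluation, so a while loop written that way would most likely never terminate once there is a match.

```perl
use strict;
use warnings;

# Hypothetical sample data; the original post does not show its input.
my $text = 'a=1 b=2 c=3';

# Scalar context: each evaluation finds the next match and sets $1, $2.
my @pairs_scalar;
while ($text =~ /(\w)=(\d)/g) {
    push @pairs_scalar, "$1:$2";
}

# List context: a single evaluation returns the captures of every match.
my @flat = $text =~ /(\w)=(\d)/g;

# Assigning that list to two variables keeps only the first two captures.
my ($key, $value) = $text =~ /(\w)=(\d)/g;

print "scalar: @pairs_scalar\n";
print "list:   @flat\n";
print "first:  $key $value\n";
```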

Re: more elegant way to say ($1, $2, $3, $4, ...)?

on 09.08.2007 20:29:13 by nobull67

On Aug 9, 7:04 pm, Larry wrote:
> I'm using a /g regex in a while loop to capture parenthesized matches
> to meaningful variable names like this:
>
> while (/ (...) ... (...) ... (...)/g) {
> my ($foo, $bar, $baz) = ($1, $2, $3);
> ...
>
> }
>
> The ($1, $2, $3) part seems inelegant ... is there a more elegant way?

Not that I know of in Perl5 - and believe me I've looked.

You could write a function that returns them

sub matches { no strict 'refs'; map $$_ , 1 .. $#- }

while (/ (...) ... (...) ... (...)/g) {
    my ($foo, $bar, $baz) = matches;
}

But this is hardly more elegant.

Re: more elegant way to say ($1, $2, $3, $4, ...)?

on 09.08.2007 20:52:55 by Larry

On Aug 9, 2:29 pm, Brian McCauley wrote:
> On Aug 9, 7:04 pm, Larry wrote:
>
> > I'm using a /g regex in a while loop to capture parenthesized matches
> > to meaningful variable names like this:
>
> > while (/ (...) ... (...) ... (...)/g) {
> > my ($foo, $bar, $baz) = ($1, $2, $3);
> > ...
>
> > }
>
> > The ($1, $2, $3) part seems inelegant ... is there a more elegant way?
>
> Not that I know of in Perl5 - and believe me I've looked.
>
> You could write a function that returns them
>
> sub matches { no strict 'refs'; map $$_ , 1 .. $#- }
>
> while (/ (...) ... (...) ... (...)/g) {
> my ($foo, $bar, $baz) = matches;
>
> }
>
> But this is hardly more elegant.

Not elegant?! It's awesome! Thanks!

BTW, I just learned 2 new things:

-- map can take an expr as the first param, not just a block (had to
look that up to see what was going on exactly!)

-- that there is a variable called @- and what it does

Thanks!
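
For anyone else meeting these two features for the first time, here is a minimal sketch. The input string and variable names are invented for illustration. It shows that map accepts either an expression followed by a comma or a block, and that @- holds the start offsets of the last successful match, with index 0 for the whole match and 1, 2, ... for the groups.

```perl
use strict;
use warnings;

my $line = 'key=value';   # hypothetical input
$line =~ /(\w+)=(\w+)/ or die "no match";

# $#- is the index of the last group that took part in the match.
my $last_group = $#-;

# map EXPR, LIST and map BLOCK LIST are two spellings of the same thing.
my @starts_expr  = map $-[$_], 1 .. $#-;
my @starts_block = map { $-[$_] } 1 .. $#-;

print "last group:    $last_group\n";
print "starts (expr):  @starts_expr\n";
print "starts (block): @starts_block\n";
```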

Re: more elegant way to say ($1, $2, $3, $4, ...)?

on 09.08.2007 21:08:14 by Uri Guttman

>>>>> "L" == Larry writes:

L> I'm using a /g regex in a while loop to capture parenthesized matches
L> to meaningful variable names like this:

L> while (/ (...) ... (...) ... (...)/g) {
L> my ($foo, $bar, $baz) = ($1, $2, $3);
L> ...
L> }

L> The ($1, $2, $3) part seems inelegant ... is there a more elegant way?

you can look at using @+, @- to get the strings via substr. a map call
on 1 .. $#+ will do it but it is ugly too

perldoc perlvar says this:
$1 is the same as "substr($var, $-[1], $+[1] - $-[1])"

so this should work (untested):

my ($foo, $bar, $baz) =
    map substr($var, $-[$_], $+[$_] - $-[$_]), 1 .. $#+ ;

and that map stuff could be put into a sub to clean it up. just pass in
$var and the @+ and @- globals should still be set. something like this:

sub matches {
    map substr($_[0], $-[$_], $+[$_] - $-[$_]), 1 .. $#+ ;
}

my ($foo, $bar, $baz) = matches( $var ) ;

but i would just stick with the assignment of $1, $2 ... as it is the
cleanest.

uri

--
Uri Guttman ------ uri@stemsystems.com -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
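
A runnable sketch of the substr idea from the post above, under a few assumptions: the input string and the names $name and $number are hypothetical, the range starts at 1 because index 0 of @- and @+ describes the whole match rather than the first group, and groups that did not take part in the match are returned as undef so that they behave like $1, $2, ... would.

```perl
use strict;
use warnings;

# Rebuild the captures of the last successful match against $str
# from the offsets stored in @- and @+.
sub matches {
    my ($str) = @_;
    my @captures;
    for my $i (1 .. $#+) {
        push @captures,
            defined $-[$i] ? substr($str, $-[$i], $+[$i] - $-[$i]) : undef;
    }
    return @captures;
}

my $var = 'x=10, y=20';   # hypothetical input
my @seen;
while ($var =~ /(\w)=(\d+)/g) {
    my ($name, $number) = matches($var);
    push @seen, "$name=$number";
}
print "@seen\n";
```

This only gives the right answer while $var still holds the string the regex was matched against.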

Re: more elegant way to say ($1, $2, $3, $4, ...)?

on 09.08.2007 21:10:17 by Uri Guttman

>>>>> "L" == Larry writes:

L> On Aug 9, 2:29 pm, Brian McCauley wrote:
>> On Aug 9, 7:04 pm, Larry wrote:

>> You could write a function that returns them
>>
>> sub matches { no strict 'refs'; map $$_ , 1 .. $#- }
>>
>> while (/ (...) ... (...) ... (...)/g) {
>> my ($foo, $bar, $baz) = matches;
>>
>> }
>>
>> But this is hardly more elegant.

L> Not elegant?! It's awesome! Thanks!

it is not elegant as it uses symrefs which is evil. see my other post
for a solution without symrefs.

L> -- that there is a variable called @- and what it does

and see @+ and how perlvar says to use them. my other post shows a full
example without symrefs.

uri


Re: more elegant way to say ($1, $2, $3, $4, ...)?

on 09.08.2007 22:11:25 by Mirco Wahab

Re: more elegant way to say ($1, $2, $3, $4, ...)?

on 10.08.2007 08:39:02 by nobull67

On Aug 9, 8:10 pm, Uri Guttman wrote:
> >>>>> "L" == Larry writes:
>
> L> On Aug 9, 2:29 pm, Brian McCauley wrote:
> >> On Aug 9, 7:04 pm, Larry wrote:
>
> >> You could write a function that returns them
> >>
> >> sub matches { no strict 'refs'; map $$_ , 1 .. $#- }
> >>
> >> while (/ (...) ... (...) ... (...)/g) {
> >> my ($foo, $bar, $baz) = matches;
> >>
> >> }
> >>
> >> But this is hardly more elegant.
>
> L> Not elegant?! It's awesome! Thanks!
>
> it is not elegant as it uses symrefs which is evil.

Using multiple named variables to implement what is logically a
composite data structure (array or hash) is evil.

The only way to access such an evil structure is to use symrefs or eval().
(Of the two, symrefs are the lesser evil.)

In this case $1... are already, in effect, such a structure.

The 'evil' here is not in my code but in the underlying design
decision in early versions of Perl.

An alternative approach using substr($_...) would avoid symrefs but
the evil is still there. The fact that we choose to avert our eyes
does not reduce the evil.

See also:

http://groups.google.co.uk/group/comp.lang.perl.misc/browse_thread/thread/1ebb17826a236940/1a323f2e1968a83f

> see my other post
> for a solution without symrefs.

Your post does not appear to have propagated, could you re-post it
please.

Re: more elegant way to say ($1, $2, $3, $4, ...)?

on 10.08.2007 10:29:33 by Uri Guttman

>>>>> "BM" == Brian McCauley writes:

BM> On Aug 9, 8:10 pm, Uri Guttman wrote:

>>
>> it is not elegant as it uses symrefs which is evil.

BM> Using multiple named variables to implement what is logically a
BM> composite data structure (array or hash) is evil.

i would rather put the blame on the text being parsed! :)
the OP never showed any real text to parse. i have done scalar m//g
loops too but rarely with more than a few grabs so i don't mind the $1
style. if there are too many i would break up the text first into
sections and then parse out the grabs and assign them to a list of
scalars or a hash slice (which is the best way).
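
A hash slice version might look like the following sketch. The record format, the field names and the @records array are all hypothetical, since no real text to parse was ever posted.

```perl
use strict;
use warnings;

my $text   = 'alice:30:NY bob:25:LA';   # hypothetical records
my @fields = qw(name age city);         # hypothetical field names

my @records;
while ($text =~ /(\w+):(\d+):(\w+)/g) {
    my %rec;
    @rec{@fields} = ($1, $2, $3);       # hash slice assignment
    push @records, \%rec;
}

print "$_->{name} is $_->{age} in $_->{city}\n" for @records;
```

The captures still have to be listed once, but the names live in one place and each record travels as a single hash.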

BM> The 'evil' here is not in my code but in the underlying design
BM> decision in early versions of Perl.

perl6 solves this problem as usual by allowing m//g loops but only
grabbing what is in the regex and allowing assignment to hash elements
among many other things.

BM> An alternative approach using substr($_...) would avoid symrefs but
BM> the evil is still there. The fact that we choose to avert our eyes
BM> does not reduce the evil.

but it looks so much neater with substr. :)

>> see my other post
>> for a solution without symrefs.

BM> Your post does not appear to have propagated, could you re-post it
BM> please.

not sure why as i saw it. let it rest as it was just a slight mod of
what is in perlvar about using substr and @- and @+.

uri


Re: more elegant way to say ($1, $2, $3, $4, ...)?

on 02.10.2007 10:13:54 by demerphq

On Aug 9, 9:08 pm, Uri Guttman wrote:
> >>>>> "L" == Larry writes:
>
> L> I'm using a /g regex in a while loop to capture parenthesized matches
> L> to meaningful variable names like this:
>
> L> while (/ (...) ... (...) ... (...)/g) {
> L> my ($foo, $bar, $baz) = ($1, $2, $3);
> L> ...
> L> }
>
> L> The ($1, $2, $3) part seems inelegant ... is there a more elegant way?
>
> you can look at using @+, @- to get the strings via substr. a map call
> on 0 .. $#+ will do it but it is ugly too
>
> perldoc perlvar says this:
> $1 is the same as "substr($var, $-[1], $+[1] - $-[1])"
>
> so this should work (untested):
>
> my ($foo, $bar, $baz) =
> map substr($var, $-[$_], $+[$_] - $-[$_]), 0 .. $#+ ;
>
> and that map stuff could be put into a sub to clean it up. just pass in
> $var and the @+ and @- globals should still be set. something like this:
>
> sub matches {
> map substr($_[0], $-[$_], $+[$_] - $-[$_]), 0 .. $#+ ;
> }
>
> my ($foo, $bar, $baz) = matches( $var ) ;
>
> but i would just stick with the assignment of $1, $2 ... as it is the
> cleanest.

The problem with this approach is that it requires you to know what
string @- and @+ are operating on, which is actually somewhere between
difficult and impossible in the case of s///.

One solution that avoids this problem is the following, somewhat
crufty code:

sub matches { eval 'sub { \@_ }->(' . join(", ", map "\$$_", 1 .. $#+ ) . ')' }

Now you can say

my $array=matches();

and have it do the right thing always*, even if we didn't make a copy
of the original string before we used s///.

*Of course the array returned for matches() is only "good" for the
results of a given match.

What we (perl5porters) really should do is provide a special magic
variable that returns the entire string that $1 and friends reference,
so then using @- and @+ would be safe. Unfortunately it's too late for
that to make it into 5.10, although it's possible for 5.10.1 i guess.

Yves
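
The s/// problem described in the post above can be reproduced with a short sketch. The input string and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $var = 'foo=1';   # hypothetical input
$var =~ s/(\w+)=(\d+)/$2 is the value of $1/;

# $1 still refers to the text that was matched ...
my $from_dollar1 = $1;

# ... but @- and @+ are offsets into the string as it was before the
# substitution, so applying them to the modified $var picks up other text.
my $from_substr = substr($var, $-[1], $+[1] - $-[1]);

print "\$1:     $from_dollar1\n";
print "substr: $from_substr\n";
```

Unless a copy of the original string was kept, the offsets have nothing correct to index into, which is why the eval-based matches() reads $1, $2, ... directly.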

Re: more elegant way to say ($1, $2, $3, $4, ...)?

on 02.10.2007 10:14:40 by demerphq

On Aug 9, 10:11 pm, Mirco Wahab wrote:
> Larry wrote:
> > I'm using a /g regex in a while loop to capture parenthesized matches
> > to meaningful variable names like this:
>
> > while (/ (...) ... (...) ... (...)/g) {
> > my ($foo, $bar, $baz) = ($1, $2, $3);
> > ...
> > }
> > The ($1, $2, $3) part seems inelegant ... is there a more elegant way?
>
> The $n is an idiomatic expression which is
> not that bad in my opinion.
>
> You could fake 'named captures' like this:

Or use 5.10 when it comes out and make use of its real named
captures. :-)

Yves
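
Perl 5.10 did ship with named captures, written (?<name>...) and read back through the %+ hash. A minimal sketch, assuming a made-up input string and the hypothetical group names key and value:

```perl
use strict;
use warnings;
use 5.010;   # named captures need Perl 5.10 or later

my $text = 'a=1 b=2 c=3';   # hypothetical input
my @seen;
while ($text =~ /(?<key>\w)=(?<value>\d)/g) {
    # %+ maps each capture name to the text that group matched.
    push @seen, $+{key} . '=' . $+{value};
}
print "$_\n" for @seen;
```

The names sit next to the groups they describe, so adding or reordering groups does not silently shift what ends up in each variable.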

Re: more elegant way to say ($1, $2, $3, $4, ...)?

on 02.10.2007 17:19:08 by William James

On Aug 9, 1:04 pm, Larry wrote:
> I'm using a /g regex in a while loop to capture parenthesized matches
> to meaningful variable names like this:
>
> while (/ (...) ... (...) ... (...)/g) {
> my ($foo, $bar, $baz) = ($1, $2, $3);
> ...
>
> }
>
> The ($1, $2, $3) part seems inelegant ... is there a more elegant way?
>
> BTW, don't suggest:
>
> while (my ($foo, $bar, $baz) = / (...) ... (...) ... (...)/g) {
> ...
>
> }
>
> That will cause the regex to evaluate in a list context, which changes
> the behavior of /g to parse all of $_ at once, only returning the
> first match and throwing away the rest.

Ruby

scan( / (...) ... (...) ... (...)/ ){ |foo, bar, baz|
...
}

Re: more elegant way to say ($1, $2, $3, $4, ...)?

on 10.10.2007 08:37:11 by szr

Larry wrote:
> I'm using a /g regex in a while loop to capture parenthesized matches
> to meaningful variable names like this:
>
> while (/ (...) ... (...) ... (...)/g) {
> my ($foo, $bar, $baz) = ($1, $2, $3);
> ...
> }
>
> The ($1, $2, $3) part seems inelegant ... is there a more elegant way?
>
> BTW, don't suggest:
>
> while (my ($foo, $bar, $baz) = / (...) ... (...) ... (...)/g) {
> ...
> }
>
> That will cause the regex to evaluate in a list context, which changes
> the behavior of /g to parse all of $_ at once, only returning the
> first match and throwing away the rest.

Why not just do something like the following?

my $s = 'A1Z B2Y C3X D4W E5V';

### Inelegant - have to know the number of captures per loop iteration
while ($s =~ /(\w)(\d)(\w)/g) {
    my ($foo, $bar, $baz) = ($1, $2, $3);
    print "'$foo' '$bar' '$baz'\n";
}

print "\n";

### More elegant - all captures for each iteration go into an array
while (my @matches = $s =~ /\G.*?(\w)(\d)(\w)/) {
    pos($s) = $+[0];
    print "'", join("' '", @matches), "'\n";
}

___OUTPUT___
'A' '1' 'Z'
'B' '2' 'Y'
'C' '3' 'X'
'D' '4' 'W'
'E' '5' 'V'

'A' '1' 'Z'
'B' '2' 'Y'
'C' '3' 'X'
'D' '4' 'W'
'E' '5' 'V'

All you have to do is add \G.*? to the beginning of the regex and
remove the g from the end of the regex (the modifier list). Other than
that, you just need to have pos($s) = $+[0]; at the beginning of your
loop (or at least before the end of the loop, though it seems safest to
keep it at the beginning, especially if you do any tests on pos($s)).

:-)

--
szr
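
A different technique from the \G loop above, offered only as a sketch: let list-context /g collect every capture in one go, then peel the flat list apart three at a time with splice. It reuses the sample string from the post; @all and @rows are hypothetical names.

```perl
use strict;
use warnings;

my $s = 'A1Z B2Y C3X D4W E5V';

# List context: one evaluation returns the captures of every match.
my @all = $s =~ /(\w)(\d)(\w)/g;

my @rows;
# splice removes and returns the next three captures; the loop ends
# when @all is empty and the list assignment yields zero elements.
while (my ($foo, $bar, $baz) = splice @all, 0, 3) {
    push @rows, join ' ', $foo, $bar, $baz;
}
print "$_\n" for @rows;
```

This keeps the variable names in the loop header, at the cost of holding all captures in memory before the loop body runs.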