Variable interpolation and m in regular expression matching

on 22.01.2008 14:42:30 by Josef Moellers

We were just discussing this and weren't able to resolve:

Imagine I have a variable $var and I'd like to use the m modifier on a
regular expression to match a line ending with "word" and the next
beginning with "var":

if (/word$var/m) ...

how does perl treat this: as "word" anywhere on a line followed by
whatever is in $var or as I described above?

Josef
--
These are my personal views and not those of Fujitsu Siemens Computers!
Josef Möllers (Pinguinpfleger bei FSC)
If failure had no penalty success would not be a prize (T. Pratchett)
Company Details: http://www.fujitsu-siemens.com/imprint.html

Re: Variable interpolation and m in regular expression matching

on 22.01.2008 14:59:59 by Gunnar Hjalmarsson

Josef Moellers wrote:
> Imagine I have a variable $var and I'd like to use the m modifier on a
> regular expression to match a line ending with "word" and the next
> beginning with "var":
>
> if (/word$var/m) ...
>
> how does perl treat this: as "word" anywhere on a line followed by
> whatever is in $var or as I described above?

What happened when you tried it?

Possibly this is what you are looking for:

if ( /word\nvar/ )

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

Re: Variable interpolation and m in regular expression matching

on 22.01.2008 15:24:34 by Abigail

_
Josef Moellers (josef.moellers@fujitsu-siemens.com) wrote on VCCLVII
September MCMXCIII in :
== We were just discussing this and weren't able to resolve:
==
== Imagine I have a variable $var and I'd like to use the m modifier on a
== regular expression to match a line ending with "word" and the next
== beginning with "var":
==
== if (/word$var/m) ...
==
== how does perl treat this: as "word" anywhere on a line followed by
== whatever is in $var or as I described above?


No, and not only because $var is interpolated. '$' will match a line *END*,
but not the end of line character.

For that, you need:

if (/word\nvar/)

No /m modifier is needed. But you might want to use an /x modifier to
make it more readable:

if (/word \n var/x)

And if you may have line endings from another system, you could use:

if (/word \R var/x)
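
A minimal sketch of the points above, assuming a made-up two-line sample in $text and the literal value 'var' in $var (both names and values are only for illustration). It also shows one way to keep the '$' anchor next to an interpolated variable: under /x, a '$' followed by whitespace is not taken as the start of a variable name.

```perl
use strict;
use warnings;

my $text = "first word\nvar second\n";   # hypothetical sample input
my $var  = 'var';                        # hypothetical variable content

# $var is interpolated, so this pattern is really /wordvar/m.
my $interpolated = $text =~ /word$var/m ? 1 : 0;

# An explicit \n matches the newline character itself; no /m needed.
my $explicit_newline = $text =~ /word\n$var/ ? 1 : 0;

# With /m and /x: "word" at a line end, the newline, then the quoted
# content of $var at the start of the next line.
my $anchored = $text =~ /word $ \n ^ \Q$var\E/mx ? 1 : 0;

print "interpolated:     $interpolated\n";       # prints 0
print "explicit newline: $explicit_newline\n";   # prints 1
print "anchored:         $anchored\n";           # prints 1
```

The \Q...\E pair is a choice of this sketch: it makes the content of $var match literally rather than as a regular expression.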


Abigail
--
perl5.004 -wMMath::BigInt -e'$^V=Math::BigInt->new(qq]$^F$^W783$[$%9889$^F47]
..qq]$|88768$^W596577669$%$^W5$^F3364$[$^W$^F$|838747$[88897 39$%$|$^F673$%$^W]
..qq]98$^F76777$=56]);$^U=substr($]=>$|=>5)*(q.25..($^W=@^V) )=>do{print+chr$^V
%$^U;$^V/=$^U}while$^V!=$^W'

Re: Variable interpolation and m in regular expression matching

on 22.01.2008 15:47:43 by jurgenex

Josef Moellers wrote:
>We were just discussing this and weren't able to resolve:
>
>Imagine I have a variable $var and I'd like to use the m modifier on a
>regular expression to match a line ending with "word" and the next
>beginning with "var":
>
>if (/word$var/m) ...
>
>how does perl treat this: as "word" anywhere on a line followed by
>whatever is in $var or as I described above?

I think there is quite some confusion about end-of-line, line end, the
dollar sign, and multi-line matches.

1: the dollar sign in a RE is only special if it is the last character in
the RE. In that case and only in that case it anchors(!) the RE to the end
of the string. It does not(!) match the newline character.
2: if you want to match a newline character then you will have to say so by
including the newline character in the RE: /word\n$var/
2: the m modifier allows an RE to expand across multiple lines within a
single string. However the given RE does not contain a \n or any wild card
that could match a \n. Therefore the m modifier is of no use in this RE.

So the RE matches the text 'word' followed by the content of $var
interpreted as a RE.

jue

Re: Variable interpolation and m in regular expression matching

on 22.01.2008 16:05:17 by Josef Moellers

Jürgen Exner wrote:
> Josef Moellers wrote:
>
>>We were just discussing this and weren't able to resolve:
>>
>>Imagine I have a variable $var and I'd like to use the m modifier on a
>>regular expression to match a line ending with "word" and the next
>>beginning with "var":
>>
>>if (/word$var/m) ...
>>
>>how does perl treat this: as "word" anywhere on a line followed by
>>whatever is in $var or as I described above?
>
>
> I think there is quite some confusion about end-of-line, line end, the
> dollar sign, and multi-line matches.
>
> 1: the dollar sign in a RE is only special if it is the last character in
> the RE. In that case and only in that case it anchors(!) the RE to the end
> of the string. It does not(!) match the newline character.
> 2: if you want to match a newline character then you will have to say so by
> including the newline character in the RE: /word\n$var/
> 2: the m modifier allows an RE to expand across multiple lines within a
> single string. However the given RE does not contain a \n or any wild card
> that could match a \n. Therefore the m modifier is of no use in this RE.

Why then does "perldoc perlre" tell me this:

sion inside are listed below. Modifiers that alter the
way a regular expression is used by Perl are detailed in
"Regexp Quote-Like Operators" in perlop and "Gory details
of parsing quoted constructs" in perlop.
..
m Treat string as multiple lines. That is, change "^"
and "$" from matching the start or end of the string
to matching the start or end of any line anywhere
within the string.
..
does in any double-quoted string.) The "\A" and "\Z" are
just like "^" and "$", except that they won't match multiple
times when the "/m" modifier is used, while "^" and
"$" will match at every internal line boundary. To match
the actual end of the string and not ignore an optional
trailing newline, use "\z".
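
The perlre excerpt above can be checked with a small sketch that counts how often each end anchor matches after a word character under /m. The sample string in $text and the variable names are assumptions made for illustration only.

```perl
use strict;
use warnings;

my $text = "one\ntwo\n";   # hypothetical two-line sample

# The "() =" idiom forces list context, so each scalar gets the match count.
my $dollar  = () = $text =~ /\w$/mg;    # $ matches before every newline
my $big_z   = () = $text =~ /\w\Z/mg;   # \Z only at the end, before a final newline
my $small_z = () = $text =~ /\w\z/mg;   # \z only at the very end of the string

print "\$:  $dollar\n";    # prints 2
print "\\Z: $big_z\n";     # prints 1
print "\\z: $small_z\n";   # prints 0
```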

You may be right in that a $ is only special at the end of the RE,
though, so I won't be able to match across line boundaries.

> So the RE matches the text 'word' followed by the content of $var
> interpreted as a RE.

Thanks,

Josef


--
These are my personal views and not those of Fujitsu Siemens Computers!
Josef Möllers (Pinguinpfleger bei FSC)
If failure had no penalty success would not be a prize (T. Pratchett)
Company Details: http://www.fujitsu-siemens.com/imprint.html

Re: Variable interpolation and m in regular expression matching

on 22.01.2008 16:18:23 by Abigail

_
Josef Moellers (josef.moellers@fujitsu-siemens.com) wrote on VCCLVII
September MCMXCIII in :
**
** You may be right in that a $ is only special at the end of the RE,
** though, so I won't be able to match across line boundaries.


But it's not:

$ perl -wE 'say "foo\nbar" =~ /o $ . b/smx'
1
$
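
Why that one-liner prints 1 can be seen by dropping one modifier at a time. This is only a sketch; the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $text = "foo\nbar";

# /m lets $ match before the embedded newline, /s lets . match the newline.
my $with_sm = $text =~ /o $ . b/smx ? 1 : 0;

# Without /s, the . cannot consume the newline.
my $m_only = $text =~ /o $ . b/mx ? 1 : 0;

# Without /m, $ only matches at the end of the string
# (or before a newline that ends the string).
my $s_only = $text =~ /o $ . b/sx ? 1 : 0;

print "smx: $with_sm\n";   # prints 1
print "mx:  $m_only\n";    # prints 0
print "sx:  $s_only\n";    # prints 0
```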



Abigail
--
perl -wle 'print "Prime" if (0 x shift) !~ m 0^\0?$|^(\0\0+?)\1+$0'

Re: Variable interpolation and m in regular expression matching

on 22.01.2008 16:38:10 by Josef Moellers

Abigail wrote:
> _
> Josef Moellers (josef.moellers@fujitsu-siemens.com) wrote on VCCLVII
> September MCMXCIII in :

What a strange date ... would you care to explain? No, wait ... MCMXCIII
is 1993 and ... what does the bar on top of the V mean? Ah, I see: If my
date calculator is right, today would be the 5257th of September 1993,
so the V with the bar is actually MMMMM! Nice.
We have a colleague who always uses some date format from the French
Revolution (no, not the Spanish Inquisition, no-one expects the Spanish
Inquisition ;-): "Décade I, Duodi de Pluviôse de l'Année 216 de la
Révolution" (that was yesterday, btw).

> **
> ** You may be right in that a $ is only special at the end of the RE,
> ** though, so I won't be able to match across line boundaries.
>
>
> But it's not:
>
> $ perl -wE 'say "foo\nbar" =~ /o $ . b/smx'
> 1
> $

IOW: There could be confusion, but there won't be, because the two
cases are distinct: I could never match the line end followed
*immediately* by a word anyway.

Josef
--
These are my personal views and not those of Fujitsu Siemens Computers!
Josef Möllers (Pinguinpfleger bei FSC)
If failure had no penalty success would not be a prize (T. Pratchett)
Company Details: http://www.fujitsu-siemens.com/imprint.html

Re: Variable interpolation and m in regular expression matching

on 22.01.2008 16:42:23 by Peter Makholm

Josef Moellers writes:

> Abigail wrote:
>> _
>> Josef Moellers (josef.moellers@fujitsu-siemens.com) wrote on VCCLVII
>> September MCMXCIII in :
>
> What a strange date ... would you care to explain? No, wait
> ... MCMXCIII is 1993 and ... what does the bar on top of the V mean?
> Ah, I see: If my date calculator is right, today would be the 5257th
> of September 1993, so the V with the bar is actually MMMMM! Nice.

http://en.wikipedia.org/wiki/September_that_never_ended

//Makholm

Re: Variable interpolation and m in regular expression matching

on 22.01.2008 16:58:52 by Gunnar Hjalmarsson

Jürgen Exner wrote:
> 2: the m modifier allows an RE to expand across multiple lines within a
> single string. However the given RE does not contain a \n or any wild card
> that could match a \n. Therefore the m modifier is of no use in this RE.

Why would you need things that match newlines in order for the /m
modifier to make a difference?

C:\home>type test.pl
$_ = "alpha one\nbeta two\ngamma three\n";
@lastword = /(\w+)$/g;
@allnums = /(\w+)$/gm;
print "\@lastword: @lastword\n";
print "\@allnums: @allnums\n";

C:\home>test.pl
@lastword: three
@allnums: one two three

C:\home>

Neither do you need the /m modifier to make e.g. \s+ expand across
multiple lines, do you?

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

Re: Variable interpolation and m in regular expression matching

on 22.01.2008 19:59:16 by jurgenex

Gunnar Hjalmarsson wrote:
>Jürgen Exner wrote:
>> 2: the m modifier allows an RE to expand across multiple lines within a
>> single string. However the given RE does not contain a \n or any wild card
>> that could match a \n. Therefore the m modifier is of no use in this RE.
>
>Why would you need things that match newlines in order for the /m
>modifier to make a difference?
>
>C:\home>type test.pl
>$_ = "alpha one\nbeta two\ngamma three\n";
>@lastword = /(\w+)$/g;
>@allnums = /(\w+)$/gm;
>print "\@lastword: @lastword\n";
>print "\@allnums: @allnums\n";
>
>C:\home>test.pl
>@lastword: three
>@allnums: one two three

Interesting example!
Thank you for sharing.

jue

Re: Variable interpolation and m in regular expression matching

on 22.01.2008 20:02:45 by jurgenex

Josef Moellers wrote:
>Jürgen Exner wrote:
[...]
>> 1: the dollar sign in a RE is only special if it is the last character in
>> the RE. In that case and only in that case it anchors(!) the RE to the end
>> of the string. It does not(!) match the newline character.
[...]
> m Treat string as multiple lines. That is, change "^"
> and "$" from matching the start or end of the string
> to matching the start or end of any line anywhere
> within the string.

Ooops! Somehow I must have gotten something badly mixed up.

>You may be right in that a $ is only special at the end of the RE,
>though, so I won't be able to match across line boundaries.
>
> > So the RE matches the text 'word' followed by the content of $var
> > interpreted as a RE.

You may want to ignore this :-((

jue

Re: Variable interpolation and m in regular expression matching

on 23.01.2008 09:13:26 by Josef Moellers

Jürgen Exner wrote:

>>>So the RE matches the text 'word' followed by the content of $var
>>>interpreted as a RE.
>
>
> You may want to ignore this :-((

Ignore what? There's white space above your reply ;-)

--
These are my personal views and not those of Fujitsu Siemens Computers!
Josef Möllers (Pinguinpfleger bei FSC)
If failure had no penalty success would not be a prize (T. Pratchett)
Company Details: http://www.fujitsu-siemens.com/imprint.html

Re: Variable interpolation and m in regular expression matching

on 23.01.2008 13:13:43 by Florian Kaufmann

> 1: the dollar sign in a RE is only special if it is the last character in
> the RE. In that case and only in that case it anchors(!) the RE to the end

Are you really sure about this rule? I think the $ anchor can also be
used right before the alternation operator | and before a closing
parenthesis ).

$ perl -e 'print "match:".(<> =~ /\d$|x/)."\n"' <<< 23f
match:
$ perl -e 'print "match:".(<> =~ /\d$|x/)."\n"' <<< 23
match:1
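
The closing-parenthesis case can be sketched the same way. The three input strings and the array name are made up for illustration; the pattern is the one from the shell example with a group added around the anchored part.

```perl
use strict;
use warnings;

# Hypothetical inputs: a trailing digit, a trailing letter, and a string with an x.
my @inputs = ("23", "23f", "x1");

# $ directly before ")" and before "|" still acts as the end anchor.
my @hits = grep { /(\d$)|x/ } @inputs;

print "@hits\n";   # prints "23 x1"
```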

Flo