Regex confusion...

on 27.09.2007 22:42:22 by guthrie

sorry for the beginner question, but...

With this code
my $img = "0-12345-abc";
print " Match.1 ", (defined $img);
print " Match.2 ", ($img =~ /\S/);
print " Matched::", $1;
print " Match.3 ", ($img =~ /^(\d)-(\d+)-(\w)$/);
print " Matched::", $1, ", ", $2, ", ", $3;
I expected:
true, true, "0-12345-abc", true,
and then: 0, 12345, abc

Instead I get:
true true "" false
"" "" ""

Actually: img= (0-52557-wind)
Match.1 1 Match.2 1 Matched::
Match.3 Matched::, ,

Seems simple enough, so what am I missing?
- Why doesn't the first match against /\S/ return the full string?
- Why doesn't the second (extracting) match work?

The actual code I'm trying for is:
if(defined $img and $img =~ /\S/) {
if ($img =~ /^(\d)-(\d+)-(\w)$/)
{ my ($t, $zip, $type) = ($1, $2, $3); }
else { die "ERROR: invalid URL arguments :: ${img}\n";}
print "Debug: (", $t, ", ", $zip, ", ", $type, ")\n"; ## Debug...

Thanks for any hints. Sorry for the confusion!
Greg

Re: Regex confusion...

on 27.09.2007 22:59:20 by guthrie

Oops, small correction;

> With this code
> my $img = "0-12345-abc";
> print " Match.1 ", (defined $img);
> print " Match.2 ", ($img =~ /\S/);
> print " Matched::", $&;
print " pre-Matched::", $`, "\n";
print " post-Matched::", $', "\n";
I get::
Match.1 1 Match.2 1 Matched::0
pre-Matched::
post-Matched::-52557-wind

I expected:
pre="", match="0-12345-abc", post=""

Re: Regex confusion...

on 27.09.2007 23:02:41 by Narthring

On Sep 27, 3:42 pm, guthrie wrote:
> sorry for the beginner question, but...
>
> With this code
> my $img = "0-12345-abc";
> print " Match.1 ", (defined $img);
> print " Match.2 ", ($img =~ /\S/);
> print " Matched::", $1;
> print " Match.3 ", ($img =~ /^(\d)-(\d+)-(\w)$/);
> print " Matched::", $1, ", ", $2, ", ", $3;
> I expected:
> true, true, "012345-abc", true,
> and then: 0, 12345, abc
>
> Instead I get:
> true true "" false
> "" "" ""
>
> Actually: img= (0-52557-wind)
> Match.1 1 Match.2 1 Matched::
> Match.3 Matched::, ,
>
> Seems simple enough, what am I missing!
> - why doesn't the full string first match against /\S/ return the
> string
> - why doesn't the second (extracting) match work.
>
> The actual code I'm trying for is:
> if(defined $img and $img =~ /\S/) {
> if ($img =~ /^(\d)-(\d+)-(\w)$/)
> { my ($t, $zip, $type) = ($1, $2, $3); }
> else { die "ERROR: invalid URL arguments :: ${img}\n";}
> print "Debug: (", $t, ", ", $zip, ", ", $type, ")\n"; ##
> Debug...
>
> Thanks for any hints. Sorry for the confusion!
> Greg

$img =~ /^(\d)-(\d+)-(\w+)$/

\w matches a single 'word' character, not an entire word.
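
To see the difference concretely, here is a minimal sketch using the string from the original post. The variable names $single and $plus are made up for illustration only.

```perl
use strict;
use warnings;

my $img = "0-52557-wind";

# \w is exactly one word character, so "-(\w)$" requires the string
# to end one character after the last hyphen.
my $single = ($img =~ /^(\d)-(\d+)-(\w)$/) ? "match" : "no match";

# \w+ is one or more word characters, so it can take all of "wind".
my $plus = ($img =~ /^(\d)-(\d+)-(\w+)$/) ? "match: $1, $2, $3" : "no match";

print "with \\w : $single\n";   # with \w : no match
print "with \\w+: $plus\n";     # with \w+: match: 0, 52557, wind
```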

Re: Regex confusion...

on 27.09.2007 23:05:24 by Dummy

guthrie wrote:
> sorry for the beginner question, but...
>
> With this code
> my $img = "0-12345-abc";
> print " Match.1 ", (defined $img);
> print " Match.2 ", ($img =~ /\S/);
> print " Matched::", $1;
> print " Match.3 ", ($img =~ /^(\d)-(\d+)-(\w)$/);
> print " Matched::", $1, ", ", $2, ", ", $3;
> I expected:
> true, true, "012345-abc", true,
> and then: 0, 12345, abc
>
> Instead I get:
> true true "" false
> "" "" ""
>
> Actually: img= (0-52557-wind)
> Match.1 1 Match.2 1 Matched::
> Match.3 Matched::, ,
>
> Seems simple enough, what am I missing!
> - why doesn't the full string first match against /\S/ return the
> string

The character class \S matches a single character so it can't match the full
string. The expression ($img =~ /\S/) will only return "true" or "false"
because you don't use the /g global option and/or you don't have any capturing
parentheses in the pattern.
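
As a rough illustration of those return values, the sketch below runs three variants of the match in list context. The array names are assumptions chosen for the example, and \S+ is used in the second variant because it is the form that can cover the whole string.

```perl
use strict;
use warnings;

my $img = "0-52557-wind";

# No capturing parentheses and no /g: in list context a successful
# match returns the one-element list (1).
my @plain = ($img =~ /\S/);

# Capturing parentheses: the list holds what each group captured.
# \S+ means one or more non-space characters.
my @captured = ($img =~ /(\S+)/);

# /g without parentheses: the list holds every match of the pattern,
# here one element per non-space character.
my @all = ($img =~ /\S/g);

print "plain:    @plain\n";                       # plain:    1
print "captured: @captured\n";                    # captured: 0-52557-wind
print "all:      ", scalar(@all), " matches\n";   # all:      12 matches
```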


> - why doesn't the second (extracting) match work.

Because the pattern /^(\d)-(\d+)-(\w)$/ doesn't match the string
'0-52557-wind'. -(\w)$ will only match one character between a hyphen and
the end of the line but your string has four characters (wind) between the
hyphen and the end of the line.



John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order. -- Larry Wall

Re: Regex confusion...

on 27.09.2007 23:40:32 by guthrie

-- Many thanks, very silly of me.

I thought these were word & space matches, not just a single
character.
I did (mis-) read the documentation several times! :-)

Thanks again.
Greg

Re: Regex confusion...

on 28.09.2007 01:10:42 by Ben Morrow

Quoth guthrie :
> sorry for the beginner question, but...
>
> With this code
> my $img = "0-12345-abc";
> print " Match.1 ", (defined $img);
> print " Match.2 ", ($img =~ /\S/);
> print " Matched::", $1;

You should never use the $N variables without checking the match
succeeded. In any case, your pattern has no capturing parens, so $1 will
be empty.

Others have already noted that \S and \w only match single characters.
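
One reason for checking first is that a failed match leaves the capture variables as they were, so $1 can still hold a value from an earlier successful match. A small sketch, with made-up strings:

```perl
use strict;
use warnings;

my $first  = "abc-123";
my $second = "no digits here";

# A successful match sets $1.
$first =~ /(\d+)/;
my $after_first = $1;
print "after first match:  $after_first\n";    # 123

# This match fails. A failed match does not reset $1, so it still
# holds the value from the earlier successful match.
$second =~ /(\d+)/;
my $after_failed = $1;
print "after failed match: $after_failed\n";   # 123 (stale)

# Checking the result first avoids reading a stale capture.
my $checked = ($second =~ /(\d+)/) ? "digits: $1" : "no digits found";
print "$checked\n";                            # no digits found
```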

> The actual code I'm trying for is:
> if(defined $img and $img =~ /\S/) {
> if ($img =~ /^(\d)-(\d+)-(\w)$/)
> { my ($t, $zip, $type) = ($1, $2, $3); }

This can be simplified to

if (
my ($t, $zip, $type) =
$img =~ /^(\d)-(\d+)-(\w+)$/
) {

which avoids the need to use the $N variables altogether.
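
For completeness, here is one way the fragment above might be fleshed out into something runnable. The sub name parse_img is hypothetical, and the error message is borrowed from the original post; treat it as a sketch rather than the one right way to do it.

```perl
use strict;
use warnings;

# Hypothetical helper, named here only for illustration.
sub parse_img {
    my ($img) = @_;
    die "ERROR: missing URL arguments\n" unless defined $img;

    # In list context the match returns the three captures, or an
    # empty list on failure. A list assignment used as a condition
    # yields the number of elements on its right-hand side, so the
    # test is true only when the pattern matched.
    if (my ($t, $zip, $type) = $img =~ /^(\d)-(\d+)-(\w+)$/) {
        return ($t, $zip, $type);
    }
    die "ERROR: invalid URL arguments :: $img\n";
}

my ($t, $zip, $type) = parse_img("0-52557-wind");
print "Debug: ($t, $zip, $type)\n";   # Debug: (0, 52557, wind)
```

Because the variables are declared in the condition, they are in scope for the whole if/else, which also sidesteps the scoping problem in the original code where my was used inside the inner block.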

Ben

Re: Regex confusion...

on 28.09.2007 03:43:01 by sln

On Fri, 28 Sep 2007 00:10:42 +0100, Ben Morrow wrote:

>
>Quoth guthrie :
>> sorry for the beginner question, but...
>>
>> With this code
>> my $img = "0-12345-abc";
>> print " Match.1 ", (defined $img);
>> print " Match.2 ", ($img =~ /\S/);
>> print " Matched::", $1;
>
>You should never use the $N variables without checking the match
>succeeded. In any case, your pattern has no capturing parens, so $1 will
>be empty.
>
>Others have already noted that \S and \w only match single characters.
>
>> The actual code I'm trying for is:
>> if(defined $img and $img =~ /\S/) {
>> if ($img =~ /^(\d)-(\d+)-(\w)$/)
>> { my ($t, $zip, $type) = ($1, $2, $3); }
>
>This can be simplified to
>
> if (
> my ($t, $zip, $type) =
> $img =~ /^(\d)-(\d+)-(\w+)$/
> ) {
>
>which avoids the need to use the $N variables altogether.
>
>Ben

It might be quicker to check for success first and then do the assignment:

$_ = "......";
if ( /^(\d)-(\d+)-(\w+)$/ )
{
# use $1, $2, $3 directly, or assign them
($t, $zip, $type) = ($1, $2, $3);
}

Re: Regex confusion...

on 28.09.2007 07:04:40 by Charles DeRykus

On Sep 27, 6:43 pm, s...@netherlands.co wrote:
> On Fri, 28 Sep 2007 00:10:42 +0100, Ben Morrow wrote:
>
> ...
>
> >This can be simplified to
>
> > if (
> > my ($t, $zip, $type) =
> > $img =~ /^(\d)-(\d+)-(\w+)$/
> > ) {
>
> >which avoids the need to use the $N variables altogether.
>
> >Ben
>
> It might be quicker to check for success first then do the assignment
>
> $_ = "......";
> if ( /^(\d)-(\d+)-(\w+)$/ )
> {
> #use $1,2,3 or assign
> ($t, $zip, $type) = ($1, $2, $3);
^^^^^^^^^^^^^^^^^^^^^^^^^^

Are you sure? Wouldn't your solution require an extra copy from the
$N variables?

--
Charles DeRykus

Re: Regex confusion...

on 28.09.2007 09:28:56 by Abigail

sln@netherlands.co (sln@netherlands.co) wrote on VCXLI September MCMXCIII
in :
-- On Fri, 28 Sep 2007 00:10:42 +0100, Ben Morrow wrote:
--
-- >
-- >This can be simplified to
-- >
-- > if (
-- > my ($t, $zip, $type) =
-- > $img =~ /^(\d)-(\d+)-(\w+)$/
-- > ) {
-- >
-- >which avoids the need to use the $N variables altogether.
-- >
--
-- It might be quicker to check for success first then do the assignment
--
-- $_ = "......";
-- if ( /^(\d)-(\d+)-(\w+)$/ )
-- {
-- #use $1,2,3 or assign
-- ($t, $zip, $type) = ($1, $2, $3);
-- }
--

Do you have some figures to back up this claim? I get a marginal
difference, with the direct assignment being faster if the pattern
matches, and the delayed assignment being faster if the pattern doesn't
match. Considering that the difference is less than 1 microsecond,
I wouldn't choose between the two on the basis of speed.
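
For anyone who wants to measure this on their own machine, a comparison along these lines can be set up with the core Benchmark module. This is only a sketch and not the benchmark used for the figures above; the labels "direct" and "delayed" and the test string are assumptions, and the numbers will vary by Perl version and hardware.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $img = "0-52557-wind";   # swap in a non-matching string to test the failure case

my %candidates = (
    # Assign the captures straight from the match in list context.
    direct => sub {
        if (my ($t, $zip, $type) = $img =~ /^(\d)-(\d+)-(\w+)$/) {
            return "$t $zip $type";
        }
        return;
    },
    # Test the match first, then copy from $1, $2, $3.
    delayed => sub {
        if ($img =~ /^(\d)-(\d+)-(\w+)$/) {
            my ($t, $zip, $type) = ($1, $2, $3);
            return "$t $zip $type";
        }
        return;
    },
);

# A negative count means: run each candidate for at least that many CPU seconds.
cmpthese(-1, \%candidates);
```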


Abigail