Regexp to search over several lines in one string

on 27.01.2008 13:18:53 by d99alu

Hi!

I have a string, and I want to remove everything behind the ">"
character. The string contains new line characters that I don't want
to remove.

my $string = "line1
line2>
line3";

Why don't I get a match and replacement with this?

$string =~ s/^([^>]*>)/$1/;

I would expect the string to contain:

"line1
line2>"

But it still contains "line3"!!!

Why is this?
Any suggestions for how to do this in another (working) manner?

Best Regards,
Andreas - Sweden

Re: Regexp to search over several lines in one string

on 27.01.2008 13:39:14 by rvtol+news

d99alu@efd.lth.se wrote:

> I have a string, and I want to remove everything behind the ">"
> character. The string contains new line characters that I don't want
> to remove.

s/(?:<=>).*//s;

See perldoc perlre, search for "look-behind".

--
Affijn, Ruud

"Gewoon is een tijger."

Re: Regexp to search over several lines in one string

on 27.01.2008 14:11:22 by Gunnar Hjalmarsson

d99alu@efd.lth.se wrote:
> I have a string, and I want to remove everything behind the ">"
> character. The string contains new line characters that I don't want
> to remove.
>
> my $string = "line1
> line2>
> line3";
>
> Why don't I get a match and replacement with this?
>
> $string =~ s/^([^>]*>)/$1/;

It does match, but since you capture everything, and insert the captured
string using $1, nothing gets changed.
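
One way to see this is to compare the string before and after the substitution. The sketch below uses the string from the original post; the names $before, $after and $count are chosen only for this illustration.

```perl
use strict;
use warnings;

my $string = "line1\nline2>\nline3";
my $before = $string;

# The pattern matches "line1\nline2>" and replaces it with $1,
# which is exactly the text that was just matched.
my $count = ( $string =~ s/^([^>]*>)/$1/ );
my $after = $string;

print "substitutions made: $count\n";
print "string unchanged: ", ( $before eq $after ? "yes" : "no" ), "\n";
```

The substitution count is 1, so the regex did match, yet the string is identical afterwards because "line3" was never part of the match.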

> I would expect the string to contain:
>
> "line1
> line2>"
>
> But it still contains "line3"!!!
>
> Why is this?

Because your regex does not match the "line3" portion of the string.

> Any suggestions for how to do this in another (working) manner?

One way to remove everything after the '>' character would be:

$string =~ s/[^>]+$//;

However, that removes the newline between "line2>" and "line3" as well...

This removes everything after '>' but newlines:

$string =~ s{([^>]+)$}{
    my $rm = $1;
    $rm =~ s/.+//g;
    $rm;
}e;
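
As a rough check of that last snippet against the string from the original post (the final print line is only for illustration):

```perl
use strict;
use warnings;

my $string = "line1\nline2>\nline3";

# [^>]+$ matches "\nline3". The replacement block then strips
# everything except newlines from that match, because "." does
# not match "\n" without the /s modifier.
$string =~ s{([^>]+)$}{
    my $rm = $1;
    $rm =~ s/.+//g;
    $rm;
}e;

print $string eq "line1\nline2>\n" ? "newline kept\n" : "unexpected result\n";
```

Here the text "line3" is removed while the newline in front of it survives.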

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

Re: Regexp to search over several lines in one string

on 27.01.2008 16:06:03 by someone

Dr.Ruud wrote:
> d99alu@efd.lth.se wrote:
>
>> I have a string, and I want to remove everything behind the ">"
>> character. The string contains new line characters that I don't want
>> to remove.
>
> s/(?:<=>).*//s;

ITYM: s/(?<=>).*//s;
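
To illustrate the corrected form: (?<=>) is a zero-width look-behind that requires a '>' immediately before the match, and the /s modifier lets .* run across newlines. A minimal sketch, assuming the string from the original post:

```perl
use strict;
use warnings;

my $string = "line1\nline2>\nline3";

# Delete everything that follows the first '>' in the string.
# The '>' itself is kept because the look-behind consumes nothing.
$string =~ s/(?<=>).*//s;

print $string eq "line1\nline2>" ? "ok\n" : "unexpected result\n";
```

Without /g this cuts after the first '>' only, which is enough here because .* with /s already runs to the end of the string.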


John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order. -- Larry Wall

Re: Regexp to search over several lines in one string

on 27.01.2008 16:39:03 by Petr Vileta

d99alu@efd.lth.se wrote:
> Hi!
>
> I have a string, and I want to remove everything behind the ">"
> character. The string contains new line characters that I don't want
> to remove.
>
> my $string = "line1
> line2>
> line3";
>
> Why don't I get a match and replacement with this?
>
> $string =~ s/^([^>]*>)/$1/;
>
> I would expect the string to contain:
>
> "line1
> line2>"
>

$string =~ s/^([^>]*>).*$/$1/s;

line1
line2>

--
Petr Vileta, Czech republic
(My server rejects all messages from Yahoo and Hotmail. Send me your
mail from another non-spammer site please.)

Please reply to

Re: Regexp to search over several lines in one string

on 27.01.2008 16:59:40 by rvtol+news

John W. Krahn wrote:
> Dr.Ruud:
>> d99alu:

>>> I have a string, and I want to remove everything behind the ">"
>>> character. The string contains new line characters that I don't want
>>> to remove.
>>
>> s/(?:<=>).*//s;
>
> ITYM: s/(?<=>).*//s;

Yes. (aaargh, oops again)

--
Affijn, Ruud

"Gewoon is een tijger."

Re: Regexp to search over several lines in one string

on 27.01.2008 17:11:00 by rvtol+news

Dr.Ruud wrote:
> d99alu:

>> I have a string, and I want to remove everything behind the ">"
>> character. The string contains new line characters that I don't want
>> to remove.
>
> s/(?:<=>).*//s;
>
> See perldoc perlre, search for "look-behind".

I also forgot the newline. Maybe this does what you need:

s/(?<=>).*/\n/s;

(doesn't keep any of the original newlines; even adds one when none was
there)
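
Both caveats can be seen in a small sketch. The helper cut_after_gt and the sample inputs are hypothetical, introduced only for this example:

```perl
use strict;
use warnings;

# Hypothetical helper: replace everything after the first '>'
# with a single newline.
sub cut_after_gt {
    my ($s) = @_;
    $s =~ s/(?<=>).*/\n/s;
    return $s;
}

# The original newline after '>' is consumed by .* and one "\n" is put back.
my $with_newline = cut_after_gt("line1\nline2>\nline3");

# No newline followed '>' here, but the result still ends in one.
my $without_newline = cut_after_gt("a>b");

print $with_newline eq "line1\nline2>\n" ? "ok\n" : "unexpected\n";
print $without_newline eq "a>\n"         ? "ok\n" : "unexpected\n";
```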

--
Affijn, Ruud

"Gewoon is een tijger."

Re: Regexp to search over several lines in one string

on 27.01.2008 19:02:49 by Gunnar Hjalmarsson

Petr Vileta wrote:
>
> $string =~ s/^([^>]*>).*$/$1/s;

The '$' character is redundant after .*

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

Re: Regexp to search over several lines in one string

on 27.01.2008 22:48:13 by Gunnar Hjalmarsson

Gunnar Hjalmarsson wrote:
> d99alu@efd.lth.se wrote:
>> I have a string, and I want to remove everything behind the ">"
>> character. The string contains new line characters that I don't want
>> to remove.
>>
>> my $string = "line1
>> line2>
>> line3";
>>
>> Why don't I get a match and replacement with this?
>>
>> $string =~ s/^([^>]*>)/$1/;
>
> It does match, but since you capture everything, and insert the captured
> string using $1, nothing gets changed.

I have a feeling that the code above actually is an attempt to do:

if ( $string =~ /^([^>]*>)/ ) {
    $string = $1;
}

That replaces the content of _$string_ with what was captured in the
regex. However, it's accomplished via the m// operator, while you were
using the s/// operator.

I recommend that you read up on both those operators in "perldoc perlop".

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

Re: Regexp to search over several lines in one string

on 28.01.2008 19:31:53 by rvtol+news

Gunnar Hjalmarsson wrote:
> Petr Vileta:

>> $string =~ s/^([^>]*>).*$/$1/s;
>
> The '$' character is redundant after .*

Yes, in this case (because of the s-modifier) it is.

$ echo "abcd" |perl -pe 's/(.*)$/"<".++$i."=$1:".length($1).">\n"/ge'
<1=abcd:4>
<2=:0>

<3=:0>


$ echo "abcd" |perl -pe 's/(.*)$/"<".++$i."=$1:".length($1).">\n"/sge'
<1=abcd
:5>
<2=:0>

--
Affijn, Ruud

"Gewoon is een tijger."