#1: Multiline preg_match

Posted on 2008-04-16 23:04:14 by Paul Lautman

According to the manual, the default for preg_match is to treat the subject
string as consisting of a single "line" of characters (even if it actually
contains several newlines).

I want to match the "string" below to extract everything from <strong> to
</div> (not inclusive).
However my attempt at preg_match('/<strong(.*)/',$fc,$match) returns only
<strong>RCR002</strong>


Any suggestions welcome.

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>Untitled Document</title>
</head>

<body>
<div align="center"><img
src="../../../images/stories/rhinestone_stock/crowns/RCR002.gif" width="200"
height="200" />


<strong>RCR002</strong>

2mm &amp; 3mm

~5.25&rdquo;W x 3.25&rdquo;H</div>
</body>
</html>


#2: Re: Multiline preg_match

Posted on 2008-04-16 23:39:00 by Mike Camden

On Apr 16, 2:04 pm, "Paul Lautman" <paul.laut...@btinternet.com>
wrote:
> According to the manual, the default for preg_match is to treat the subject
> string as consisting of a single "line" of characters (even if it actually
> contains several newlines).
>
> I want to match the "string" below to extract everything from <strong> to
> </div> (not inclusive).
> However my attempt at preg_match('/<strong(.*)/',$fc,$match) returns only
> <strong>RCR002</strong>

>
> Any suggestions welcome.
>
> <html xmlns="http://www.w3.org/1999/xhtml">
> <head>
> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
> <title>Untitled Document</title>
> </head>
>
> <body>
> <div align="center"><img
> src="../../../images/stories/rhinestone_stock/crowns/RCR002.gif" width="200"
> height="200" />
>

> <strong>RCR002</strong>

> 2mm &amp; 3mm

> ~5.25&rdquo;W x 3.25&rdquo;H</div>
> </body>
> </html>

Try,

$arr = explode("\n", $string); // Where $string is your block of text
foreach ($arr as $key => $value) {
    preg_match('pattern', $value, $matches[$key]);
}


#3: Re: Multiline preg_match

Posted on 2008-04-16 23:55:07 by Paul Lautman

Mike Camden wrote:
> On Apr 16, 2:04 pm, "Paul Lautman" <paul.laut...@btinternet.com>
> wrote:
>> According to the manual, the default for preg_match is to treat the
>> subject string as consisting of a single "line" of characters (even
>> if it actually contains several newlines).
>>
>> I want to match the "string" below to extract everything from
>> <strong> to </div> (not inclusive).
>> However my attempt at preg_match('/<strong(.*)/',$fc,$match) returns
>> only <strong>RCR002</strong>

>>
>> Any suggestions welcome.
>>
>> <html xmlns="http://www.w3.org/1999/xhtml">
>> <head>
>> <meta http-equiv="Content-Type" content="text/html;
>> charset=iso-8859-1" /> <title>Untitled Document</title>
>> </head>
>>
>> <body>
>> <div align="center"><img
>> src="../../../images/stories/rhinestone_stock/crowns/RCR002.gif"
>> width="200" height="200" />
>>

>> <strong>RCR002</strong>

>> 2mm &amp; 3mm

>> ~5.25&rdquo;W x 3.25&rdquo;H</div>
>> </body>
>> </html>
>
> Try,
>
> $arr = explode("\n", $string); // Where $string is your block of text
> foreach($arr as $key => $value) {
> preg_match('pattern', $value, $matches[$key]);
> }

I have managed to do it using str_replace to change all the newlines to
spaces.
However I'd really like to understand why preg_match does not behave as the
manual suggests.
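
A minimal sketch of that workaround, assuming $fc holds the file contents (the string below is a shortened, hypothetical stand-in for the page in the first post). Note that it changes the whitespace of the extracted text and still matches through to the end of the string:

```php
<?php
// Hypothetical stand-in for the contents of $fc.
$fc = "<div><strong>RCR002</strong>\n\n2mm &amp; 3mm\n\n~5.25&rdquo;W x 3.25&rdquo;H</div>\n</body>";

// Without any modifier, the dot stops at the first newline.
preg_match('/<strong(.*)/', $fc, $before);

// Workaround: replace every newline variant with a space,
// so the dot has nothing to stop at.
$flat = str_replace(["\r\n", "\n", "\r"], ' ', $fc);
preg_match('/<strong(.*)/', $flat, $after);

echo $before[0], "\n"; // only the first line of the chunk
echo $after[0], "\n";  // everything from <strong to the end of the string
```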


#4: Re: Multiline preg_match

Posted on 2008-04-17 03:03:31 by luiheidsgoeroe

On Wed, 16 Apr 2008 23:04:14 +0200, Paul Lautman
<paul.lautman@btinternet.com> wrote:

> According to the manual, the default for preg_match is to treat the
> subject string as consisting of a single "line" of characters (even if it
> actually contains several newlines).
>
> I want to match the "string" below to extract everything from <strong> to
> </div> (not inclusive).
> However my attempt at preg_match('/<strong(.*)/',$fc,$match) returns only
> <strong>RCR002</strong>


From the preg match portion of the manual:
http://nl2.php.net/manual/en/regexp.reference.php
. = match any character _except_newline_ (by default)

Solution:
http://nl2.php.net/manual/en/reference.pcre.pattern.modifiers.php
s (PCRE_DOTALL)
If this modifier is set, a dot metacharacter in the pattern
matches all characters, including newlines. Without it,
newlines are excluded.
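
A small sketch of that difference, using a made-up two-line subject rather than the HTML from the original post:

```php
<?php
// Made-up subject string containing one newline.
$subject = "first line\nsecond line";

// Default: the dot does not match "\n", so .* stops at the end of line one.
preg_match('/first.*/', $subject, $default);

// With the s modifier (PCRE_DOTALL), the dot also matches "\n",
// so .* runs to the end of the whole string.
preg_match('/first.*/s', $subject, $dotall);

var_dump($default[0] === 'first line');              // bool(true)
var_dump($dotall[0] === "first line\nsecond line");  // bool(true)
```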

In conclusion (giving you the bonus 'until </div> non-inclusive'):
preg_match('%<strong(.*)(?=</div>)%s',$fc,$match);
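
As a sketch of how the pattern %<strong(.*)(?=</div>)%s behaves, here it is run against a trimmed copy of the sample page (the shortened img path is an assumption made for brevity):

```php
<?php
// Trimmed, hypothetical copy of the sample page from the first post.
$fc = <<<'EOT'
<div align="center"><img src="RCR002.gif" width="200" height="200" />

<strong>RCR002</strong>

2mm &amp; 3mm

~5.25&rdquo;W x 3.25&rdquo;H</div>
</body>
EOT;

// % is the delimiter, so the / in </div> needs no escaping.
// (?=</div>) is a lookahead: </div> must follow, but it is not part of the match.
// The s modifier lets the dot cross the newlines inside the chunk.
if (preg_match('%<strong(.*)(?=</div>)%s', $fc, $match)) {
    echo $match[0], "\n"; // from <strong up to, but not including, </div>
}
```

Because .* is greedy, a subject with several </div> tags would match up to the last one; .*? would stop at the first.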

Which leaves me to say that while I'm a fan of regexes, I've given up
using them on HTML, because a parser does a far more reliable, clearer,
and, most importantly, more robust job.
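
For comparison, a rough parser-based sketch. It assumes PHP's DOM extension is enabled and uses a trimmed, hypothetical copy of the sample page; the exact entity output of saveHTML() can vary between libxml versions:

```php
<?php
// Trimmed, hypothetical copy of the sample page.
$fc = <<<'EOT'
<html><body>
<div align="center"><img src="RCR002.gif" width="200" height="200" />

<strong>RCR002</strong>

2mm &amp; 3mm

~5.25&rdquo;W x 3.25&rdquo;H</div>
</body></html>
EOT;

// Collect libxml warnings about imperfect markup instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($fc);
libxml_clear_errors();

$chunk = '';
$strong = $doc->getElementsByTagName('strong')->item(0);
// Serialize <strong> and each following sibling; the loop ends where
// the parent <div> closes, so </div> itself is never included.
for ($node = $strong; $node !== null; $node = $node->nextSibling) {
    $chunk .= $doc->saveHTML($node);
}
echo $chunk, "\n";
```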
--
Rik Wasmus


#5: Re: Multiline preg_match

Posted on 2008-04-17 11:42:16 by Captain Paralytic

On 17 Apr, 02:03, "Rik Wasmus" <luiheidsgoe...@hotmail.com> wrote:
> On Wed, 16 Apr 2008 23:04:14 +0200, Paul Lautman
>
> <paul.laut...@btinternet.com> wrote:
> > According to the manual, the default for preg_match is to treat the
> > subject
> > string as consisting of a single "line" of characters (even if it
> > actually
> > contains several newlines).
>
> > I want to match the "string" below to extract everything from <strong> to
> > </div> (not inclusive).
> > However my attempt at preg_match('/<strong(.*)/',$fc,$match) returns only
> > <strong>RCR002</strong>

>
> From the preg match portion of the manual: http://nl2.php.net/manual/en/regexp.reference.php
> . = match any character _except_newline_ (by default)
>
> Solution: http://nl2.php.net/manual/en/reference.pcre.pattern.modifiers.php
> s (PCRE_DOTALL)
> If this modifier is set, a dot metacharacter in the pattern
> matches all characters, including newlines. Without it,
> newlines are excluded.
>
> In conclusion (giving you the bonus 'until </div> non-inclusive'):
> preg_match('%<strong(.*)(?=</div>)%s',$fc,$match);
>
> Which leaves me to say that while I'm a fan of regexes, I've given up
> using them on HTML, because a parser does a far more reliable, clearer,
> and, most importantly, more robust job.
> --
> Rik Wasmus

Thanks Rik.

In this case I don't want to parse the HTML. I want to extract a
particular chunk from many files.
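
Roughly what that could look like, as a sketch only: extractChunk() and extractFromFiles() are hypothetical helper names, the "pages" directory is an assumption, and the pattern uses a lazy .*? (a small change from the greedy .* suggested above) so that it stops at the first </div>:

```php
<?php
// Hypothetical helper: returns the chunk from <strong up to (not including)
// the first following </div>, or null if the pattern does not match.
function extractChunk(string $html): ?string
{
    // s lets the dot cross newlines; .*? is lazy, so the lookahead
    // is satisfied at the first </div> after <strong.
    return preg_match('%<strong.*?(?=</div>)%s', $html, $m) ? $m[0] : null;
}

// Hypothetical helper: maps each readable file path to its extracted chunk.
function extractFromFiles(array $paths): array
{
    $chunks = [];
    foreach ($paths as $path) {
        $html = file_get_contents($path);
        if ($html !== false) {
            $chunks[$path] = extractChunk($html);
        }
    }
    return $chunks;
}

// Usage, assuming the pages live in a directory named "pages":
// print_r(extractFromFiles(glob('pages/*.html')));
```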


#6: Re: Multiline preg_match

Posted on 2008-04-18 10:03:41 by Alexey Kulentsov

Paul Lautman wrote:

>>> However my attempt at preg_match('/<strong(.*)/',$fc,$match) returns
>>> only <strong>RCR002</strong>

>>>
.....
>>> <strong>RCR002</strong>

>>> 2mm &amp; 3mm

>>> ~5.25&rdquo;W x 3.25&rdquo;H</div>

> I have managed to do it using str_replace to change all the newlines to
> spaces.
> However I'd really like to understand why preg_match does not behave as the
> manual suggests.
Please read the manual about the 's' modifier:
http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php

Try '/<strong[^>]*>(.*)<\/div>/s' to extract this part
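
A short sketch of this approach on a made-up sample string. One caveat: with / as the delimiter, the / inside </div> has to be escaped (or another delimiter chosen), otherwise PHP reports an unknown modifier:

```php
<?php
// Made-up sample; the class attribute is only there to show what [^>]* skips.
$fc = "<div><strong class=\"code\">RCR002</strong>\n\n2mm &amp; 3mm\n\n~5.25&rdquo;W x 3.25&rdquo;H</div>\n</body>";

// [^>]*> skips any attributes and the closing > of the <strong> tag.
// <\/div> escapes the slash because / is the pattern delimiter.
// The s modifier lets (.*) span the newlines.
if (preg_match('/<strong[^>]*>(.*)<\/div>/s', $fc, $match)) {
    echo $match[1], "\n"; // group 1: text after the <strong ...> tag, before </div>
}
```

Unlike a match that starts at <strong, group 1 here does not include the opening <strong> tag itself.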


#7: Re: Multiline preg_match

Posted on 2008-04-18 14:19:08 by Captain Paralytic

On 18 Apr, 08:03, Alexey Kulentsov <a...@inbox.ru> wrote:
> Paul Lautman wrote:
> >>> However my attempt at preg_match('/<strong(.*)/',$fc,$match) returns
> >>> only <strong>RCR002</strong>

>
> ....
> >>> <strong>RCR002</strong>

> >>> 2mm &amp; 3mm

> >>> ~5.25&rdquo;W x 3.25&rdquo;H</div>
> > I have managed to do it using str_replace to change all the newlines to
> > spaces.
> > However I'd really like to understand why preg_match does not behave as the
> > manual suggests.
>
> Please read manual about 's' modifier: http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php
>
> Try '/<strong[^>]*>(.*)<\/div>/s' to extract this part

Thanks, Rik pointed out that one. It was the paragraph right below the
one that I read which suggested the opposite!
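
For anyone else who mixes up the two paragraphs, a small sketch with a made-up subject: the "single line" default is about where ^ and $ match (changed by the m modifier), while the dot's handling of newlines is controlled separately by s.

```php
<?php
// Made-up two-line subject.
$subject = "alpha\nbeta";

// Default: the subject counts as one "line", so ^ matches only at the very start.
$default = preg_match_all('/^\w+/', $subject, $a);    // finds "alpha" only

// m (PCRE_MULTILINE): ^ and $ also match at each internal line boundary.
$multiline = preg_match_all('/^\w+/m', $subject, $b); // finds "alpha" and "beta"

// m does not change the dot; only s (PCRE_DOTALL) lets it cross "\n".
preg_match('/alpha.*/m', $subject, $c); // matches "alpha"
preg_match('/alpha.*/s', $subject, $d); // matches "alpha\nbeta"
```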
