Regular Expression to extract all .JPG and .PNG URL's
on 12.10.2007 17:12:40 by kosanovic
Hello,
I'm bad at regular expressions. Would somebody help me?
I need to extract all URLs of .jpg and .png pictures from a string
containing an HTML file (DOM wouldn't work well for what I need).
I've tried:
preg_match_all("/.jpg$|.png$/", $htmlfile, $Matches);
foreach ($Matches as $match)
{
echo $match."\n";
}
without much success. Can anybody correct this to make it work?
Re: Regular Expression to extract all .JPG and .PNG URL's
on 12.10.2007 17:27:59 by Steve
wrote in message
news:1192201960.982311.106940@i13g2000prf.googlegroups.com...
> Hello,
> I'm bad at regular expressions. Would somebody help me:
> I need to extract all URL to .jpg and .png pictures from a string
> containing an HTML file (DOM wouldn't work well in what I need).
>
> I've tried:
>
> preg_match_all("/.jpg$|.png$/", $htmlfile, $Matches);
> foreach ($Matches as $match)
> {
> echo $match."\n";
> }
\.(jpe?g|png)\b+$?
that should cure what ails you. this is freehand, but after eyeballing
regex for years, i'm pretty sure it will work 'outta the box'.
cheers
Re: Regular Expression to extract all .JPG and .PNG URL's
on 12.10.2007 17:28:03 by Captain Paralytic
On 12 Oct, 16:12, kosano...@gmail.com wrote:
> Hello,
> I'm bad at regular expressions. Would somebody help me:
> I need to extract all URL to .jpg and .png pictures from a string
> containing an HTML file (DOM wouldn't work well in what I need).
>
> I've tried:
>
> preg_match_all("/.jpg$|.png$/", $htmlfile, $Matches);
> foreach ($Matches as $match)
> {
> echo $match."\n";
> }
>
> without much success. Anybody can correct this to make it work?
Try:
preg_match_all("/.*(jpg|png)$/", $htmlfile, $Matches);
Re: Regular Expression to extract all .JPG and .PNG URL's
on 12.10.2007 17:51:51 by Steve
"Steve" wrote in message
news:ZnMPi.30$3d2.24@newsfe06.lga...
>
> wrote in message
> news:1192201960.982311.106940@i13g2000prf.googlegroups.com...
>> Hello,
>> I'm bad at regular expressions. Would somebody help me:
>> I need to extract all URL to .jpg and .png pictures from a string
>> containing an HTML file (DOM wouldn't work well in what I need).
>>
>> I've tried:
>>
>> preg_match_all("/.jpg$|.png$/", $htmlfile, $Matches);
>> foreach ($Matches as $match)
>> {
>> echo $match."\n";
>> }
>
> \.(jpe?g|png)\b+$?
sorry...if you're screen scraping, then the image may be in a src tag
attribute and will no doubt be quoted or tic'd. the above will only handle
unquoted/untic'd values where the src value (the image link) is not ended by the
closing tag. all that said, see if you can tell how the changes below meet
the constraints/problems not covered by the former...
=(["'])?(([^\.]*\.)*(jpe?|pn)g)\1[^>]*?>
again, i haven't tested it...but the raw image (including the path, i.e.
http://www.example.com/images/image.png) should be the second match captured
by preg_match_all. to make sure of that, do this:
preg_match_all($pattern, $search, $matches);
echo '<pre>' . print_r($matches, true) . '</pre>';
hth,
me
Re: Regular Expression to extract all .JPG and .PNG URL's
on 12.10.2007 18:00:57 by Steve
"Captain Paralytic" wrote in message
news:1192202883.609210.175920@k35g2000prh.googlegroups.com...
> On 12 Oct, 16:12, kosano...@gmail.com wrote:
>> Hello,
>> I'm bad at regular expressions. Would somebody help me:
>> I need to extract all URL to .jpg and .png pictures from a string
>> containing an HTML file (DOM wouldn't work well in what I need).
>>
>> I've tried:
>>
>> preg_match_all("/.jpg$|.png$/", $htmlfile, $Matches);
>> foreach ($Matches as $match)
>> {
>> echo $match."\n";
>> }
>>
>> without much success. Anybody can correct this to make it work?
>
> Try:
> preg_match_all("/.*(jpg|png)$/", $htmlfile, $Matches);
nah...that would mean:
ghijpgklmno
or
mnopngqrst
or any text having those letters within it would be captured. however,
neither his nor yours work unless the string ends at the end of the line
(eof or \r or \n). both don't escape the dot...so that means 'any character
followed by jpg or png'. more marginal success would be had just doing,
"/\.jpg|\.png/", but even that doesn't cut it...since either jpg or png can
be followed by anything and still be captured. not to mention we've left out
jpeg's.
make sense?
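The point about the unescaped dot can be checked with a small sketch. The sample strings and the `$loose` / `$strict` variable names are only illustrative, and the stricter pattern is one possible tightening rather than anything posted in the thread:

```php
<?php
// Unescaped dot: "." matches any character, so ordinary words that merely
// contain "jpg" or "png" are accepted.
$loose = '/.jpg|.png/';

// Escaped dot, optional "e" for .jpeg, a word boundary after the extension,
// and the "i" modifier so that .JPG and .PNG are accepted as well.
$strict = '/\.(?:jpe?g|png)\b/i';

$samples = ['ghijpgklmno', 'mnopngqrst', 'photo.jpg', 'scan.JPEG', 'logo.png'];

foreach ($samples as $sample) {
    printf(
        "%-12s loose=%d strict=%d\n",
        $sample,
        preg_match($loose, $sample),
        preg_match($strict, $sample)
    );
}
```

The first two samples should be accepted only by the loose pattern, which is the false-positive case described above.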
Re: Regular Expression to extract all .JPG and .PNG URL's
on 12.10.2007 18:02:13 by kosanovic
Everybody, thank you very much for your help. However, I couldn't make
any of the examples work. The output I get all the time is "Array" and
that's it.
Steve, image paths are not always in the src value but they are always
quoted.
So I thought to make it find .jpg or .png and grab the string to the
left of it, back to the " sign. How would that be in reg exp?
Re: Regular Expression to extract all .JPG and .PNG URL's
on 12.10.2007 18:06:20 by Lars Eighner
In our last episode,
<1192204933.239597.226170@v29g2000prd.googlegroups.com>,
the lovely and talented kosanovic@gmail.com
broadcast on comp.lang.php:
> Everybody, thank you very much for your help. However, I couldn't make
> any of the examples work. The output I get all the time is "Array" and
> that's it.
Did you even read the manual to find out what preg_match_all actually
does?
> Steve, image paths are not always in the src value but they are always
> quoted.
> So I thought to make it find .jpg or .png and grab the string to the
> left of it, back to the " sign. How would that be in reg exp?
--
Lars Eighner
Countdown: 465 days to go.
What do you do when you're debranded?
Re: Regular Expression to extract all .JPG and .PNG URL's
on 12.10.2007 18:07:16 by Captain Paralytic
On 12 Oct, 17:02, kosano...@gmail.com wrote:
> Everybody, thank you very much for your help. However, I couldn't make
> any of the examples work. The output I get all the time is "Array" and
> that's it.
>
> Steve, image paths are not always in the src value but they are always
> quoted.
> So I thought to make it find .jpg or .png and grab the string to the
> left of it, back to the " sign. How would that be in reg exp?
Try print_r($Matches);
instead of the foreach
Re: Regular Expression to extract all .JPG and .PNG URL's
on 12.10.2007 18:41:11 by Steve
wrote in message
news:1192204933.239597.226170@v29g2000prd.googlegroups.com...
> Everybody, thank you very much for your help. However, I couldn't make
> any of the examples work. The output I get all the time is "Array" and
> that's it.
>
> Steve, image paths are not always in the src value but they are always
> quoted.
> So I thought to make it find .jpg or .png and grab the string to the
> left of it, back to the " sign. How would that be in reg exp?
yes...just take the equal sign out of the pattern i gave AND the trailing >.
now, your pattern looks like this:
(["'])?(([^\.]*\.)*(jpe?|pn)g)\1
that will catch a jpg, jpeg, or png where it appears in between a set of tics
(') or quotes (").
REMEMBER !!! to debug this and find out where your match will appear within
$matches, do this:
echo '<pre>' . print_r($matches, true) . '</pre>';
please post your results here, so we can either continue to help or, see
that it worked for you.
thx,
me
Re: Regular Expression to extract all .JPG and .PNG URL's
on 12.10.2007 18:44:32 by kosanovic
<?php
$htmlc=" \"http://example.com/file1.jpg\" kjkjskfj \"http://blabla.com/image2.png\" dsgdg";
preg_match_all("/.*(jpg|png)$/", $htmlc, $matches);
echo '<pre>' . print_r($matches, true) . '</pre>';
?>
outputs:
Array
(
[0] => Array
(
)
[1] => Array
(
)
)
I need to extract:
http://example.com/file1.jpg
http://blabla.com/image2.png
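For reference, a minimal sketch of one way to get exactly those two URLs out of the sample string. It assumes, as stated earlier in the thread, that the image paths are always quoted. The pattern and the `~` delimiter are illustrative choices, not the pattern any poster supplied, and URLs with a query string after the extension would need a looser pattern:

```php
<?php
// Sample input modelled on the post above (the wrapped URL is joined back up).
$htmlc = ' "http://example.com/file1.jpg" kjkjskfj "http://blabla.com/image2.png" dsgdg';

// An opening quote or tic, then a run of characters that are neither quotes
// nor whitespace and that ends in .jpg, .jpeg or .png, then the same quote
// character again (\1). The "i" modifier also accepts .JPG and .PNG.
$pattern = '~(["\'])([^"\'\s]+\.(?:jpe?g|png))\1~i';

preg_match_all($pattern, $htmlc, $matches);

// $matches[2] is the second capture group: the URL without its quotes.
$urls = $matches[2];

foreach ($urls as $url) {
    echo $url, "\n";
}
```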
Re: Regular Expression to extract all .JPG and .PNG URL's
on 12.10.2007 19:00:53 by Steve
wrote in message
news:1192207472.865090.209120@q3g2000prf.googlegroups.com...
>
>
> $htmlc=" \"http://example.com/file1.jpg\" kjkjskfj \"http://blabla.com/image2.png\" dsgdg";
> preg_match_all("/.*(jpg|png)$/", $htmlc, $matches);
> echo '<pre>' . print_r($matches, true) . '</pre>';
>
> ?>
>
> outputs:
> Array
> (
> [0] => Array
> (
> )
>
> [1] => Array
> (
> )
>
> )
look up preg_match_all in the docs or at php.net. regardless of matching or
not having any results that do match, it will always fill $matches with an
array. typically, array[0] will have your exact matches (as another array).
array[1 - n] contains sub-matches...or partial matches. so,
foreach ($matches[0] as $match)
{
echo $match . "\n";
}
will most likely be what you need. make sense?
this is all aside from the fact that the pattern used above doesn't resemble
anything you've described as your goal...and i'm not even preg. ;^)
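The layout of `$matches` described above can be seen with a tiny sketch. The input string and pattern here are made up for illustration; `PREG_PATTERN_ORDER` (the default) and `PREG_SET_ORDER` are the two standard orderings `preg_match_all` offers:

```php
<?php
$text = 'a.png b.jpg';
$pattern = '/(\w+)\.(png|jpg)/';

// Default ordering (PREG_PATTERN_ORDER): $m[0] holds the full matches,
// $m[1] and $m[2] hold one array per capture group.
preg_match_all($pattern, $text, $m);
print_r($m[0]); // full matches
print_r($m[2]); // second capture group (the extensions)

// PREG_SET_ORDER groups the results per match instead: $sets[0] is the
// first match with all of its groups, $sets[1] the second, and so on.
preg_match_all($pattern, $text, $sets, PREG_SET_ORDER);
foreach ($sets as $set) {
    echo $set[0], ' has extension ', $set[2], "\n";
}
```

Either way the top-level elements are arrays themselves, which is why echoing them directly in a foreach prints the literal word "Array".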
Re: Regular Expression to extract all .JPG and .PNG URL's
on 12.10.2007 19:26:37 by kosanovic
Steve, the arrays are empty. I don't care if it's arrays or not, I
need data.
On Oct 12, 7:00 pm, "Steve" wrote:
> wrote in message
>
> news:1192207472.865090.209120@q3g2000prf.googlegroups.com...
>
>
>
> >
>
> > $htmlc=" \"http://example.com/file1.jpg\" kjkjskfj \"http://blabla.com/image2.png\" dsgdg";
> > preg_match_all("/.*(jpg|png)$/", $htmlc, $matches);
> > echo '<pre>' . print_r($matches, true) . '</pre>';
>
> > ?>
>
> > outputs:
> > Array
> > (
> > [0] => Array
> > (
> > )
>
> > [1] => Array
> > (
> > )
>
> > )
>
> look up preg_match_all in the docs or at php.net. regardless of matching or
> not having any results that do match, it will always return an array as a
> result. typically, array[0] will have your exact matches (as another array).
> array[1 - n] contains sub-matches...or partial matches. so,
>
> foreach ($matches[0] as $match)
> {
> echo $match . "\n";
>
> }
>
> will most likely be what you need. make sense?
>
> this is all aside from the fact that the pattern used above doesn't resemble
> anything you've described as your goal...and i'm not even preg. ;^)
Re: Regular Expression to extract all .JPG and .PNG URL's
on 12.10.2007 19:51:57 by Steve
wrote in message
news:1192209997.314517.154870@e34g2000pro.googlegroups.com...
> Steve, the arrays are empty. I don't care if it's arrays or not, I
> need data.
pardon my frustration, but, no shit! however, if you don't understand the
output of preg_match_all, how then, will you get the data?
let me turn you into less than a thinker since you apparently need
spoon-feeding. copy and paste the following...then quit wasting everyone's
time.
$html = '"http://www.example.com/file1.jpg" kjkjskfj ';
$html .= '"http://www.example.com/fil1.jpeg" kjkjskfj';
$html .= '"http://www.example.com/fil1.png" kjkjskfj';
$pattern = '/(["\'])?(([^\.]*\.)*?(jpe?|pn)g)\1/';
preg_match_all($pattern, $html, $matches);
$images = $matches[2]; // well holy mother of christ! right where i guessed!!!
foreach ($images as $image)
{
echo $image . "\n";
}
now, either go read the manual and do it all yourself (preferable), be
polite when consuming someone else's time...especially when they're trying to
help you, or foad!
either way, you may run along now!
Re: Regular Expression to extract all .JPG and .PNG URL's
on 12.10.2007 20:02:25 by kosanovic
Steve thank you very much for your time.
On Oct 12, 7:51 pm, "Steve" wrote:
> wrote in message
>
> news:1192209997.314517.154870@e34g2000pro.googlegroups.com...
>
> > Steve, the arrays are empty. I don't care if it's arrays or not, I
> > need data.
>
> pardon my frustration, but, no shit! however, if you don't understand the
> output of preg_match_all, how then, will you get the data?
>
> let me turn you into less than a thinker since you apparently need
> spoon-feeding. copy and paste the following...then quit wasting everyone's
> time.
>
> $html = '"http://www.example.com/file1.jpg" kjkjskfj ';
> $html .= '"http://www.example.com/fil1.jpeg" kjkjskfj';
> $html .= '"http://www.example.com/fil1.png" kjkjskfj';
> $pattern = '/(["\'])?(([^\.]*\.)*?(jpe?|pn)g)\1/';
> preg_match_all($pattern, $html, $matches);
> $images = $matches[2]; // well holy mother of christ! right where i guessed!!!
> foreach ($images as $image)
> {
> echo $image . "\n";
>
> }
>
> now, either go read the manual and do it all yourself (preferable), be
> polite when consuming someone else's time...especially when they're trying to
> help you, or foad!
>
> either way, you may run along now!
Re: Regular Expression to extract all .JPG and .PNG URL's
on 12.10.2007 21:03:52 by kosanovic
Steve
for this: (["'])?(([^\.]*\.)*(jpe?|pn)g)\1
I get: Unknown modifier
Re: Regular Expression to extract all .JPG and .PNG URL's
on 12.10.2007 22:14:23 by kosanovic
Steve
for this: (["'])?(([^\.]*\.)*(jpe?|pn)g)\1
I get: Unknown modifier
Re: Regular Expression to extract all .JPG and .PNG URL's
on 12.10.2007 22:15:18 by Steve
wrote in message
news:1192208441.918752.186650@v29g2000prd.googlegroups.com...
> Steve
> for this: (["'])?(([^\.]*\.)*(jpe?|pn)g)\1
> I get: Unknown modifier
because that is the core of the pattern. you have to encase it:
'//'
make sense? just implement the copy/paste version i sent in the other post.
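A short sketch of what "encasing" the pattern means in practice. The likely cause of the "Unknown modifier" warning is that PHP treats the first character of the pattern string as its delimiter; with the bare pattern, the leading `(` and the first `)` are taken as a bracket-style delimiter pair, and the `?` that follows is then read as a modifier. The `~` delimiter and the `$core` / `$subject` names below are illustrative choices:

```php
<?php
// The core of the pattern from the thread (lazy version), without delimiters.
$core = '(["\'])?(([^\.]*\.)*?(jpe?|pn)g)\1';

// Wrap it in delimiters. "~" does not occur in the pattern, so nothing
// inside needs extra escaping; "/" works as well for this pattern.
$pattern = '~' . $core . '~';

$subject = '"http://www.example.com/file1.jpg" kjkjskfj';

$count = preg_match_all($pattern, $subject, $matches);

echo $count, "\n";
print_r($matches[2]); // second capture group: the URL without its quotes
```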
Re: Regular Expression to extract all .JPG and .PNG URL's
on 12.10.2007 22:42:16 by Michael Fesser
.oO(Steve)
> wrote in message
>news:1192209997.314517.154870@e34g2000pro.googlegroups.com...
>> Steve, the arrays are empty. I don't care if it's arrays or not, I
>> need data.
>
>pardon my frustration, but, no shit! however, if you don't understand the
>output of preg_match_all, how then, will you get the data?
Calm down. In his example he used a pattern with a '$' at the end, which
can't work in this case. That's why he got an empty result array.
Micha
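The effect of the trailing `$` can be reproduced with the sample string from the earlier post. This is only a sketch; the variable names are illustrative:

```php
<?php
$htmlc = ' "http://example.com/file1.jpg" kjkjskfj "http://blabla.com/image2.png" dsgdg';

// "$" only matches at the very end of the subject (or just before a final
// newline). This string ends in "dsgdg", so nothing can match.
$anchored = preg_match_all('/.*(jpg|png)$/', $htmlc, $m1);

// Without the anchor there is a match, but the greedy ".*" swallows
// everything up to the last "png" in one piece, so this is still not a
// list of URLs.
$unanchored = preg_match_all('/.*(jpg|png)/', $htmlc, $m2);

echo $anchored, "\n";
echo $unanchored, "\n";
echo $m2[0][0], "\n";
```

So removing the `$` alone explains the empty arrays but does not solve the extraction; the pattern also has to stop at the surrounding quotes.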
Re: Regular Expression to extract all .JPG and .PNG URL's
on 12.10.2007 22:44:58 by Steve
wrote in message
news:1192208671.979837.289450@i13g2000prf.googlegroups.com...
> Steve
> for this: (["'])?(([^\.]*\.)*(jpe?|pn)g)\1
> I get: Unknown modifier
what is up with your newsreader!
you've posted this twice now. i gave you working code three posts ago, to
which you said thank you for your time. i cannot fathom that you're still
trying to work out the half-coded stuff of earlier threads when i gave you
the working code already! that's why i have to close my eyes and ignore the
fact it isn't latency in usenet message arrival (based on the time stamp),
and blame your newsreader.
you have a very poor newsreader. lol (noticing it's google).
Re: Regular Expression to extract all .JPG and .PNG URL's
on 12.10.2007 23:02:15 by Steve
"Michael Fesser" wrote in message
news:opmvg3t6j4s0dha38tg864n9nfqg94pv3g@4ax.com...
> .oO(Steve)
>
>> wrote in message
>>news:1192209997.314517.154870@e34g2000pro.googlegroups.com...
>>> Steve, the arrays are empty. I don't care if it's arrays or not, I
>>> need data.
>>
>>pardon my frustration, but, no shit! however, if you don't understand the
>>output of preg_match_all, how then, will you get the data?
>
> Calm down. In his example he used a pattern with a '$' at the end, which
> can't work in this case. That's why he got an empty result array.
i know that, and in that same thread, i told him that. the problem is that
he posts outside of threads and it becomes very difficult to see which
examples, of the 3 sets, he is referring to. this is yet another reason to
avoid using google groups as an interface into usenet.
btw, even in that post, i was calm. nothing that goes on in usenet is worth
getting riled about. ;^)
Re: Regular Expression to extract all .JPG and .PNG URL's
on 13.10.2007 13:43:55 by Paul Lautman
Steve wrote:
>> preg_match_all("/.*(jpg|png)$/", $htmlfile, $Matches);
>
> nah...that would mean:
>
> ghijpgklmno
>
> or
>
> mnopngqrst
>
> or any text having those letters within it would be captured. however,
> neither his nor yours work unless the string ends at the end of the
> line (eof or \r or \n). both don't escape the dot...so that means
> 'any character followed by jpg or png'. more marginal success would
> be had just doing, "/\.jpg|\.png/", but even that doesn't cut
> it...since either jpg or png can be followed by anything and still be
> captured. not to mention we've left out jpeg's.
>
> make sense?
Well it seems to be a bit of a contradiction. My one was intended to only
find png or jpg at the end of the line as that was what the OP had intimated
was required.
Now you can hardly say that it would match ghijpgklmno or mnopngqrst at the
same time as saying that it'll only match with png or jpg at the end of the
line!