Needed Regular Expression
on 28.02.2006 12:39:31 by vspriya05
Hi,
I am new to Perl. Can anybody help me search for a particular
string?
For example, I have a string like:
$input='
href="http://microsoft.com/account/cart/cart.jsp">Shopping Cart
|
href="https://microsoft.com/account/auth/account.jsp">My Account
|
href="https://microsoft.com/ws/eBayISAPI.dll?RegisterEnterInfo&siteid=20&co_partnerid=2&usage=1106&ru=default">Register
|
|
href="https://microsoft.com/account/auth/wishlist/wishlist.jsp?refresh">Wish
List'
From this string I need to extract all the href links, like:
http://microsoft.com/account/cart/cart.jsp
https://microsoft.com/account/auth/account.jsp
https://microsoft.com/ws/eBayISAPI.dll?RegisterEnterInfo&siteid=20&co_partnerid=2&usage=1106&ru=default
http://microsoft.com/help/sell.cfm
The input string may differ; that is, it may contain one or more
hrefs.
I have tried
$input =~ m/.*(HREF=\".*?\")/i;
print $1 . "\n";
but it displays only the first href, that is:
href="http://microsoft.com/account/cart/cart.jsp"
Can anybody help me extract all the hrefs?
Re: Needed Regular Expression
on 28.02.2006 12:53:33 by Michael Greb
In article <1141126771.230743.183300@v46g2000cwv.googlegroups.com>,
vspriya05@gmail.com wrote:
> Hi,
>
> I am new to Perl. Can anybody help me search for a particular
> string?
> For example, I have a string like:
>
> The input string may differ; that is, it may contain one or more
> hrefs.
>
> I have tried
> $input =~ m/.*(HREF=\".*?\")/i;
> print $1 . "\n";
> but it displays only the first href, that is:
> href="http://microsoft.com/account/cart/cart.jsp"
>
> Can anybody help me extract all the hrefs?
You can get multiple matches like this:
while ($input =~ m/HREF="(.*?)"/ig) {
    print $1, "\n";
}
This prints just the link itself; you could move the capturing
parentheses back to the outside to get output like you have now. You
will probably want a more flexible regex, though: for example,
what if someone uses single quotes? Also, the double quotes in your
regex don't need to be escaped.
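
As a rough sketch of the "more flexible regex" idea above, the loop below accepts either single or double quotes around the value and collects the links into a list instead of printing them one by one. The sample input and the helper name extract_hrefs are made up for illustration, and the pattern still assumes reasonably tidy HTML (quoted attribute values, no href-like text inside comments or scripts).

```perl
use strict;
use warnings;

# Hypothetical sample input, shortened from the original post.
my $input = <<'HTML';
<a href="http://microsoft.com/account/cart/cart.jsp">Shopping Cart</a> |
<a HREF='https://microsoft.com/account/auth/account.jsp'>My Account</a>
HTML

# Return every href value in $html, accepting single or double quotes.
# extract_hrefs is an illustrative name, not something from the thread.
sub extract_hrefs {
    my ($html) = @_;
    my @links;
    # (["']) captures the opening quote; \1 requires the same closing quote.
    # /g walks through all matches, /i ignores case, /s lets . span newlines.
    while ($html =~ /href\s*=\s*(["'])(.*?)\1/gis) {
        push @links, $2;
    }
    return @links;
}

print "$_\n" for extract_hrefs($input);
```

Because the opening quote is captured and matched again with \1, a double quote inside a single-quoted value (or the reverse) does not end the match early.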
--
Michael
michael@thegrebs.com
Re: Needed Regular Expression
on 28.02.2006 19:03:07 by Tintin
<vspriya05@gmail.com> wrote in message
news:1141126771.230743.183300@v46g2000cwv.googlegroups.com...
> Hi,
>
> I am new to Perl. Can anybody help me search for a particular
> string?
> For example, I have a string like:
>
> $input='
> href="http://microsoft.com/account/cart/cart.jsp">Shopping Cart
> |
> href="https://microsoft.com/account/auth/account.jsp">My Account
> |
> href="https://microsoft.com/ws/eBayISAPI.dll?RegisterEnterInfo&siteid=20&co_partnerid=2&usage=1106&ru=default">Register
> |
> |
> href="https://microsoft.com/account/auth/wishlist/wishlist.jsp?refresh">Wish
> List'
>
> From this string I need to extract all the href links, like:
> http://microsoft.com/account/cart/cart.jsp
> https://microsoft.com/account/auth/account.jsp
> https://microsoft.com/ws/eBayISAPI.dll?RegisterEnterInfo&siteid=20&co_partnerid=2&usage=1106&ru=default
> http://microsoft.com/help/sell.cfm
>
> The input string may differ; that is, it may contain one or more
> hrefs.
>
> I have tried
> $input =~ m/.*(HREF=\".*?\")/i;
> print $1 . "\n";
> but it displays only the first href, that is:
> href="http://microsoft.com/account/cart/cart.jsp"
>
> Can anybody help me extract all the hrefs?
Unless you can 100% guarantee the format of your HTML, you should use an
HTML parser.
For your case, see:
http://search.cpan.org/~bdfoy/HTML-SimpleLinkExtor-1.12/SimpleLinkExtor.pm
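
A minimal sketch of the parser approach, assuming HTML::SimpleLinkExtor has been installed from CPAN (it is not part of core Perl). The sample input is made up for illustration, and the calls shown (new, parse, links) are the ones I understand the module's synopsis to document; check the documentation of your installed version.

```perl
use strict;
use warnings;
use HTML::SimpleLinkExtor;   # CPAN module, must be installed separately

# Hypothetical sample input, shortened from the original post.
my $html = <<'HTML';
<a href="http://microsoft.com/account/cart/cart.jsp">Shopping Cart</a> |
<a href='https://microsoft.com/account/auth/account.jsp'>My Account</a>
HTML

my $extor = HTML::SimpleLinkExtor->new;
$extor->parse($html);

# links() returns the URLs from every link-bearing tag the parser
# recognises (a, img, and so on), not only from <a href>.
my @links = $extor->links;
print "$_\n" for @links;
```

Unlike the regex, the parser copes with either quote style, unquoted values, and attributes in any order without extra work.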