#1: WWW::Mechanize doesn't always follow_link(text

Posted on 2008-04-20 19:45:13 by mikaelb

I'm using WWW::Mechanize 1.34 and have a problem.
This doesn't work:
$agent->follow_link(text => 'Edit&nbsp;Librarians', n => 1);
It doesn't work in the sense that the link isn't followed and the $agent
is still on the same page. Is there a bug in my code, or is there a known
bug in WWW::Mechanize? I've tried changing &nbsp; to a space, but that
didn't work.

This works:
$agent->follow_link(url_regex => qr/librarians/, n => 1);

The corresponding XHTML code is:
<a href="mkbAdmin?func=librarians&amp;lang=en">Edit&nbsp;Librarians</a>

I want it to work since I use HTTP::Recorder to generate the code
automatically as I surf using a proxy and it generates code of the type
that doesn't work.

This works:
$agent->follow_link(text => 'Logout', n => 1);

By the way HTTP::Recorder actually generates:
$agent->follow_link(text => 'Edit&nbsp;Librarians', n => '1');


#2: Re: WWW::Mechanize doesn't always follow_link(text

Posted on 2008-04-21 19:34:49 by John Bokma

"M.O.B. i L." <mikaelb@df.lth.se> wrote:

> I'm using WWW::Mechanize 1.34 and have a problem.
> This doesn't work:
> $agent->follow_link(text => 'Edit&nbsp;Librarians', n => 1);
> It doesn't work in the sense that the link isn't followed and the $agent
> is still on the same page. Is there a bug in my code, or is there a known
> bug in WWW::Mechanize? I've tried changing &nbsp; to a space, but that
> didn't work.
>
> This works:
> $agent->follow_link(url_regex => qr/librarians/, n => 1);
>
> The corresponding XHTML code is:
> <a href="mkbAdmin?func=librarians&amp;lang=en">Edit&nbsp;Librarians</a>
>
> I want it to work since I use HTTP::Recorder to generate the code
> automatically as I surf using a proxy and it generates code of the type
> that doesn't work.
>
> This works:
> $agent->follow_link(text => 'Logout', n => 1);
>
> By the way HTTP::Recorder actually generates:
> $agent->follow_link(text => 'Edit&nbsp;Librarians', n => '1');

HTML::TreeBuilder, or a module it's using, returns &nbsp; as a single
character, so you might have to use the character code instead.

Comment on http://johnbokma.com/perl/search-term-suggestion-tool.html
says: (&nbsp;, stored as char 225)

So you might want to try: "Edit\xe1Librarians".

Wild guess.

--
John

Arachnids near Coyolillo
http://johnbokma.com/perl/


#3: Re: WWW::Mechanize doesn't always follow_link(text

Posted on 2008-04-23 14:09:04 by mikaelb

John Bokma wrote:
> "M.O.B. i L." <mikaelb@df.lth.se> wrote:
>
>> I'm using WWW::Mechanize 1.34 and have a problem.
>> This doesn't work:
>> $agent->follow_link(text => 'Edit&nbsp;Librarians', n => 1);
>> It doesn't work in the sense that the link isn't followed and the $agent
>> is still on the same page. Is there a bug in my code, or is there a known
>> bug in WWW::Mechanize? I've tried changing &nbsp; to a space, but that
>> didn't work.
>>
>> This works:
>> $agent->follow_link(url_regex => qr/librarians/, n => 1);
>>
>> The corresponding XHTML code is:
>> <a href="mkbAdmin?func=librarians&amp;lang=en">Edit&nbsp;Librarians</a>
>>
>> I want it to work since I use HTTP::Recorder to generate the code
>> automatically as I surf using a proxy and it generates code of the type
>> that doesn't work.
>>
>> This works:
>> $agent->follow_link(text => 'Logout', n => 1);
>>
>> By the way HTTP::Recorder actually generates:
>> $agent->follow_link(text => 'Edit&nbsp;Librarians', n => '1');
>
> HTML::TreeBuilder, or a module it's using, returns &nbsp; as a single
> character, so you might have to use the character code instead.
>
> Comment on http://johnbokma.com/perl/search-term-suggestion-tool.html
> says: (&nbsp;, stored as char 225)
>
> So you might want to try: "Edit\xe1Librarians".
>
> Wild guess.
>
Thanks! But it should be \xa0. First I tried matching with regular
expressions and that worked using . (dot) for the unknown character. I
then found this page about &nbsp;
<http://www.w3.org/International/questions/qa-escapes> where it says:
"An example of an ambiguous character is 00A0: NO-BREAK SPACE. This type
of space prevents line breaking, but it looks just like any other space
when used as a character. Using &nbsp; (or &#xA0;) makes it quite clear
where such spaces appear in the text."

So this works:
$agent->follow_link(text => "Edit\xa0Librarians", n => 1);
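
To see why the dot pattern and "\xa0" succeed where the literal entity and a plain space fail, here is a minimal sketch in core Perl. It assumes, as the working call above suggests, that the link text is compared after &nbsp; has been decoded to the single character U+00A0. The %candidates table and its labels are made up for illustration only.

```perl
use strict;
use warnings;

# Assumption: this is the link text as seen after entity decoding,
# i.e. &nbsp; has become the single character U+00A0 (NO-BREAK SPACE).
my $link_text = "Edit\xa0Librarians";

# Hypothetical candidate patterns, one per attempt discussed in the thread.
my %candidates = (
    'literal entity' => qr/^Edit&nbsp;Librarians$/,
    'plain space'    => qr/^Edit Librarians$/,
    'dot wildcard'   => qr/^Edit.Librarians$/,
    'explicit \xa0'  => qr/^Edit\xa0Librarians$/,
    'space or \xa0'  => qr/^Edit[ \xa0]Librarians$/,
);

# Report which patterns match the decoded link text.
for my $name (sort keys %candidates) {
    my $result = $link_text =~ $candidates{$name} ? 'matches' : 'no match';
    printf "%-15s %s\n", $name, $result;
}

# Show the numeric code of the character between the two words.
my $code = ord(substr($link_text, 4, 1));
printf "separator is char %d (0x%02x)\n", $code, $code;
```

If a script has to cope with both kinds of space, a pattern like the last one could probably be passed to follow_link through its text_regex argument instead of text.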


#4: Re: WWW::Mechanize doesn't always follow_link(text

Posted on 2008-04-23 18:07:27 by John Bokma

"M.O.B. i L." <mikaelb@df.lth.se> wrote:

> John Bokma wrote:

[..]

>> HTML::TreeBuilder, or a module it's using, returns &nbsp; as a single
>> character, so you might have to use the character code instead.
>>
>> Comment on http://johnbokma.com/perl/search-term-suggestion-tool.html
>> says: (&nbsp;, stored as char 225)
>>
>> So you might want to try: "Edit\xe1Librarians".
>>
>> Wild guess.
>>
> Thanks! But it should be \xa0.

Yeah, but HTML::TreeBuilder returns it as 225 :-D.

[..]

> So this works:
> $agent->follow_link(text => "Edit\xa0Librarians", n => 1);

Glad my post was able to help you in the right way.

--
John

http://johnbokma.com/perl/


#5: Re: WWW::Mechanize doesn't always follow_link(text

Posted on 2008-04-24 19:17:55 by mikaelb

M.O.B. i L. wrote:
> Thanks! But it should be \xa0. First I tried matching with regular
> expressions and that worked using . (dot) for the unknown character. I
> then found this page about &nbsp;
> <http://www.w3.org/International/questions/qa-escapes> where it says:
> "An example of an ambiguous character is 00A0: NO-BREAK SPACE. This type
> of space prevents line breaking, but it looks just like any other space
> when used as a character. Using &nbsp; (or &#xA0;) makes it quite clear
> where such spaces appear in the text."
>
> So this works:
> $agent->follow_link(text => "Edit\xa0Librarians", n => 1);

I'll add that I have developed these sed command lines to convert back and forth
between the &nbsp; form that HTTP::Recorder generates and the \xa0 form that works:
sed -i '/&nbsp;/s/&nbsp;/\\xa0/g;/\\xa0/s/'\''/"/g' MKBTest.pl
sed -i '/\\xa0/s/\\xa0/\&nbsp;/g;/&nbsp;/s/"/'\''/g' MKBTest.pl
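
The same two-way conversion could also be done in Perl itself, which may be handier where sed -i is not available. This is only a sketch mirroring the two sed commands above. The function names entity_to_escape and escape_to_entity are invented here. Like the sed version, it swaps every quote on an affected line, so a line whose strings contain $ or @ would need checking by hand once it is double-quoted.

```perl
use strict;
use warnings;

# Rewrite &nbsp; as the escape \xa0, then switch the line to double
# quotes so that Perl interprets the escape (hypothetical helper).
sub entity_to_escape {
    my ($line) = @_;
    $line =~ s/&nbsp;/\\xa0/g;
    $line =~ tr/'/"/ if $line =~ /\\xa0/;
    return $line;
}

# Reverse direction: \xa0 back to &nbsp; and double quotes back to
# single quotes (hypothetical helper).
sub escape_to_entity {
    my ($line) = @_;
    $line =~ s/\\xa0/&nbsp;/g;
    $line =~ tr/"/'/ if $line =~ /&nbsp;/;
    return $line;
}

# The line HTTP::Recorder generated, as quoted earlier in the thread.
my $recorded = q{$agent->follow_link(text => 'Edit&nbsp;Librarians', n => '1');};

my $converted = entity_to_escape($recorded);
my $restored  = escape_to_entity($converted);

print "$converted\n";
print "$restored\n";
```

Lines that contain neither &nbsp; nor \xa0 pass through both functions unchanged, which matches the address guards in the sed commands.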
