WWW::Mechanize doesn't always follow_link(text

on 20.04.2008 19:45:13 by mikaelb

I'm using WWW::Mechanize 1.34 and have a problem.
This doesn't work:
$agent->follow_link(text => 'Edit&nbsp;Librarians', n => 1);
It doesn't work in the sense that the link isn't followed and the $agent
is still on the same page. Is there a bug in my code or is there a known
bug in WWW::Mechanize? I've tried to change &nbsp; to a space but that
didn't work.

This works:
$agent->follow_link(url_regex => qr/librarians/, n => 1);

The corresponding XHTML code is:


I want it to work since I use HTTP::Recorder to generate the code
automatically as I surf using a proxy and it generates code of the type
that doesn't work.

This works:
$agent->follow_link(text => 'Logout', n => 1);

By the way HTTP::Recorder actually generates:
$agent->follow_link(text => 'Edit&nbsp;Librarians', n => '1');

Re: WWW::Mechanize doesn't always follow_link(text

on 21.04.2008 19:34:49 by John Bokma

"M.O.B. i L." wrote:

> I'm using WWW::Mechanize 1.34 and have a problem.
> This doesn't work:
> $agent->follow_link(text => 'Edit&nbsp;Librarians', n => 1);
> It doesn't work in the sense that the link isn't followed and the $agent
> is still on the same page. Is there a bug in my code or is there a known
> bug in WWW::Mechanize? I've tried to change &nbsp; to a space but that
> didn't work.
>
> This works:
> $agent->follow_link(url_regex => qr/librarians/, n => 1);
>
> The corresponding XHTML code is:
>
>
> I want it to work since I use HTTP::Recorder to generate the code
> automatically as I surf using a proxy and it generates code of the type
> that doesn't work.
>
> This works:
> $agent->follow_link(text => 'Logout', n => 1);
>
> By the way HTTP::Recorder actually generates:
> $agent->follow_link(text => 'Edit&nbsp;Librarians', n => '1');

HTML::TreeBuilder, or a module it's using, returns &nbsp; as a single
character, so it might be that you have to use the character code instead.

Comment on http://johnbokma.com/perl/search-term-suggestion-tool.html
says: (&nbsp;, stored as char 225)

So you might want to try: "Edit\xe1Librarians".

Wild guess.

--
John

Arachnids near Coyolillo
http://johnbokma.com/perl/

Re: WWW::Mechanize doesn't always follow_link(text

on 23.04.2008 14:09:04 by mikaelb

John Bokma wrote:
> "M.O.B. i L." wrote:
>
>> I'm using WWW::Mechanize 1.34 and have a problem.
>> This doesn't work:
>> $agent->follow_link(text => 'Edit&nbsp;Librarians', n => 1);
>> It doesn't work in the sense that the link isn't followed and the $agent
>> is still on the same page. Is there a bug in my code or is there a known
>> bug in WWW::Mechanize? I've tried to change &nbsp; to a space but that
>> didn't work.
>>
>> This works:
>> $agent->follow_link(url_regex => qr/librarians/, n => 1);
>>
>> The corresponding XHTML code is:
>>
>>
>> I want it to work since I use HTTP::Recorder to generate the code
>> automatically as I surf using a proxy and it generates code of the type
>> that doesn't work.
>>
>> This works:
>> $agent->follow_link(text => 'Logout', n => 1);
>>
>> By the way HTTP::Recorder actually generates:
>> $agent->follow_link(text => 'Edit&nbsp;Librarians', n => '1');
>
> HTML::TreeBuilder, or a module it's using, returns &nbsp; as a single
> character, so it might be that you have to use the character code instead.
>
> Comment on http://johnbokma.com/perl/search-term-suggestion-tool.html
> says: (&nbsp;, stored as char 225)
>
> So you might want to try: "Edit\xe1Librarians".
>
> Wild guess.
>
Thanks! But it should be \xa0. First I tried matching with regular
expressions and that worked using . (dot) for the unknown character. I
then found this page about &nbsp;
where it says:
"An example of an ambiguous character is 00A0: NO-BREAK SPACE. This type
of space prevents line breaking, but it looks just like any other space
when used as a character. Using &nbsp; (or &#xA0;) makes it quite clear
where such spaces appear in the text.".

So this works:
$agent->follow_link(text => "Edit\xa0Librarians", n => 1);
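
A script that should survive either kind of space could match the link text with a regex instead of a literal string. The following is only a sketch: link_text_regex is a hypothetical helper, not part of WWW::Mechanize, and it assumes the label's words are separated by either an ordinary space (0x20) or a no-break space (0xA0). The text_regex parameter itself is one of the documented link criteria.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::Mechanize): build a regex that
# matches a link label whether its words are separated by an ordinary
# space (0x20) or a no-break space (0xA0).
sub link_text_regex {
    my ($label) = @_;
    my $pattern = join '[ \xa0]', map { quotemeta } split /[ \xa0]/, $label;
    return qr/\A$pattern\z/;
}

my $re = link_text_regex('Edit Librarians');
print "matches no-break space\n" if "Edit\xa0Librarians" =~ $re;
print "matches plain space\n"    if 'Edit Librarians'    =~ $re;

# With a WWW::Mechanize object in $agent this would be used as:
#   $agent->follow_link(text_regex => $re, n => 1);
```

The trade-off is that generated code has to be edited to call the helper, whereas the "\xa0" literal above only needs the string itself changed.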

Re: WWW::Mechanize doesn't always follow_link(text

on 23.04.2008 18:07:27 by John Bokma

"M.O.B. i L." wrote:

> John Bokma wrote:

[..]

>> HTML::TreeBuilder, or a module it's using, returns &nbsp; as a single
>> character, so it might be that you have to use the character code instead.
>>
>> Comment on http://johnbokma.com/perl/search-term-suggestion-tool.html
>> says: (&nbsp;, stored as char 225)
>>
>> So you might want to try: "Edit\xe1Librarians".
>>
>> Wild guess.
>>
> Thanks! But it should be \xa0.

Yeah, but HTML::TreeBuilder returns it as 225 :-D.
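
One way to settle which character code a given link text really contains is to print it with every non-printable or non-ASCII character escaped. This is a minimal sketch; show_codes is a hypothetical helper name, and the commented usage assumes $agent is a WWW::Mechanize object whose links method returns link objects with a text method.

```perl
use strict;
use warnings;

# Hypothetical helper: make invisible or non-ASCII characters visible by
# replacing each one with a \x{..} escape showing its code in hex.
sub show_codes {
    my ($text) = @_;
    $text =~ s/([^\x20-\x7e])/sprintf('\\x{%02x}', ord $1)/ge;
    return $text;
}

print show_codes("Edit\xa0Librarians"), "\n";   # prints Edit\x{a0}Librarians

# With a WWW::Mechanize object in $agent, every link text on the current
# page could be inspected like this:
#   print show_codes($_->text // ''), "\n" for $agent->links;
```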

[..]

> So this works:
> $agent->follow_link(text => "Edit\xa0Librarians", n => 1);

Glad my post was able to help you in the right way.

--
John

http://johnbokma.com/perl/

Re: WWW::Mechanize doesn't always follow_link(text

on 24.04.2008 19:17:55 by mikaelb

M.O.B. i L. wrote:
> Thanks! But it should be \xa0. First I tried matching with regular
> expressions and that worked using . (dot) for the unknown character. I
> then found this page about &nbsp;
> where it says:
> "An example of an ambiguous character is 00A0: NO-BREAK SPACE. This type
> of space prevents line breaking, but it looks just like any other space
> when used as a character. Using &nbsp; (or &#xA0;) makes it quite clear
> where such spaces appear in the text.".
>
> So this works:
> $agent->follow_link(text => "Edit\xa0Librarians", n => 1);

I'll add that I have written these command lines to convert back and forth:
sed -i '/&nbsp;/s/&nbsp;/\\xa0/g;/\\xa0/s/'\''/"/g' MKBTest.pl
sed -i '/\\xa0/s/\\xa0/\&nbsp;/g;/&nbsp;/s/"/'\''/g' MKBTest.pl
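
For anyone who would rather do the same conversion in Perl, the two sed commands can be sketched as a pair of functions that work on one line of the generated script at a time. The names entity_to_escape and escape_to_entity are made up for this example. Like the sed versions, they swap every quote on an affected line, so n => '1' becomes n => "1" as well.

```perl
use strict;
use warnings;

# Forward: replace &nbsp; with the \xa0 escape and switch the line to double
# quotes, because Perl only interprets \xa0 inside double-quoted strings.
sub entity_to_escape {
    my ($line) = @_;
    if ($line =~ s/&nbsp;/\\xa0/g) {
        $line =~ tr/'/"/;
    }
    return $line;
}

# Reverse: restore &nbsp; and single quotes on lines that contain \xa0.
sub escape_to_entity {
    my ($line) = @_;
    if ($line =~ s/\\xa0/&nbsp;/g) {
        $line =~ tr/"/'/;
    }
    return $line;
}

my $recorded = q{$agent->follow_link(text => 'Edit&nbsp;Librarians', n => '1');};
my $fixed    = entity_to_escape($recorded);
print "$fixed\n";
print escape_to_entity($fixed), "\n";
```

Lines without &nbsp; (or without \xa0 in the reverse direction) are returned unchanged.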