HTTP::Response decoded_content is undefined

on 28.03.2007 14:04:37 by timlegge

Hi

I had a script that parsed the forms in an HTML page from the
response's decoded_content. A recent update to the page broke it: the
call
    @forms = HTML::Form->parse($response->decoded_content, $response->base);
no longer finds any forms in the web page. After much research I found
that decoded_content was empty, whereas
    HTML::Form->parse($response->content, $response->base);
still works.

It looks like the issue may have been caused by the addition of a meta
tag to the HTML page:



So far I have been unable to prove that, as the page is generated via
compiled JavaScript and is painful to change.

Any idea whether this meta tag would cause an issue with
decoded_content, and whether there might be a workaround?

Tim

Re: HTTP::Response decoded_content is undefined

on 30.03.2007 02:47:31 by jzhang

On Mar 28, 8:04 pm, "timlegge" wrote:
> Hi
>
> I had a script that was able to parse the decoded_content for the
> forms in a html page. However, a recent update to the page broke the
> script ( @forms = HTML::Form->parse($response->decoded_content,
> $response->base);) was unable to find the forms in the web page.
> After much research I found that the decoded_content was empty but the
> call to parse seemed happy with HTML::Form->parse($response->content,
> $response->base); instead.
>
> It looks like the issue may have been caused by the addition of a meta
> tag to the html page:
>
>
>
> so far, I have been unable to prove that as the page is generated via
> compiled javascript and is painful to change.
>
> Any idea whether this meta tag would cause an issue with
> decoded_content and whether there might be a work around...
>
> Tim

I have run into the same kind of error when processing Chinese web
pages. It looks like decoded_content() failed to recognize the charset
of your web page.
You can try passing a default charset:
    $response->decoded_content('default_charset' => 'utf8');
Or you can hack the decoded_content function in the HTTP::Message
module to make its charset detection more sophisticated.

Zhang Jun
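
A minimal sketch of the workaround discussed above, assuming the failure really is a charset that decoded_content() cannot decode (the thread never confirms this). The helper name decoded_or_raw, the hand-built response, the made-up charset name and the example.com base URL are all hypothetical and only there to make the snippet run without a network connection. It asks decoded_content() to raise an error rather than silently return undef, then falls back to the undecoded text.

```perl
use strict;
use warnings;
use HTTP::Response;
use HTML::Form;

# Hypothetical helper: return HTML text suitable for HTML::Form->parse,
# falling back step by step when charset decoding fails.
sub decoded_or_raw {
    my ($response) = @_;

    # raise_error => 1 makes decoded_content() die with the reason
    # instead of quietly returning undef.
    my $html = eval {
        $response->decoded_content(
            default_charset => 'UTF-8',
            raise_error     => 1,
        );
    };
    return $html if defined $html;
    warn "decoded_content failed: $@" if $@;

    # charset => 'none' still undoes any Content-Encoding (e.g. gzip)
    # but skips the charset decoding step.
    $html = $response->decoded_content(charset => 'none');

    # Last resort: the raw body, as in the original workaround.
    return defined $html ? $html : $response->content;
}

# Stand-in for the real page: a response that declares a charset
# Encode does not know (assumed here to trigger the failure).
my $response = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/html; charset=x-no-such-charset' ],
    '<html><body><form action="/login">'
      . '<input type="text" name="user"></form></body></html>',
);

my @forms = HTML::Form->parse(decoded_or_raw($response), 'http://example.com/');
print "found ", scalar(@forms), " form(s)\n";
```

The warning printed on the first failure names the charset that could not be handled, which may help confirm or rule out the meta tag as the cause.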