"Internal Server Error" when GETing with WWW::Mechanize?
on 18.01.2005 12:35:26 by James
Hi All,
I'm having trouble using WWW::Mechanize to parse ParcelForce's website
(for tracking parcels). I've broken my script down to the simplest case
possible (below) but still can't get past the first step (i.e.
retrieving the page). The idea is that I will then fill out and submit
the retrieved form.
Code:
#!/usr/bin/perl -w
use strict;
use WWW::Mechanize;
my $agent = WWW::Mechanize->new( autocheck => 1 );
$agent->get("http://www.parcelforce.com:80/portal/pw/track") ;
The error I get is...
"Error GETing http://www.parcelforce.com:80/portal/pw/track: Internal
Server Error at track.pl line 5"
I tried simplifying the URL down to http://www.parcelforce.com/ and
still have the same problem.
Using WWW::Mechanize on other URLs (e.g. perl.com) works fine and the
page is retrievable using wget, so I don't think my network is at
fault.
Can anyone suggest any other ideas?
TIA
Re: "Internal Server Error" when GETing with WWW::Mechanize?
on 18.01.2005 13:42:16 by gisle
James Turnbull writes:
> The error I get is...
> "Error GETing http://www.parcelforce.com:80/portal/pw/track: Internal
> Server Error at track.pl line 5"
The server is confused by something in the request that LWP sends.
This is a trace I get with 'lwp-request http://www.parcelforce.com:80/portal/pw/track':
GET /portal/pw/track HTTP/1.1
TE: deflate,gzip;q=0.3
Connection: TE, close
Host: www.parcelforce.com:80
User-Agent: lwp-request/2.06
HTTP/1.1 500 Internal Server Error
Content-language: en-US
Content-length: 0
Content-type: text/html; charset=ISO-8859-1
Date: Tue, 18 Jan 2005 12:37:25 GMT
Server: Netscape-Enterprise/6.0
Set-Cookie: FGNCLIID=42b0olsqf5khpzen020dycwtbh27;expires=Thu, 18 Jan 2007 12:37:26 GMT;path=/
Connection: Close
--Gisle
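A similar trace can be produced from inside a Perl script. The following is a minimal sketch, assuming a libwww-perl version recent enough to provide `add_handler` (it is newer than the release discussed in this thread); the helper name `tracing_agent` is made up for illustration. Headers that LWP adds at the protocol level, such as TE and Connection, may not show up in the request dump.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Return a user agent that prints every request and response it handles.
# (tracing_agent is a hypothetical helper name, not part of LWP.)
sub tracing_agent {
    my $ua = LWP::UserAgent->new;
    # Returning an empty list tells LWP to carry on with the request.
    $ua->add_handler( request_send  => sub { shift->dump; return } );
    $ua->add_handler( response_done => sub { shift->dump; return } );
    return $ua;
}

my $ua = tracing_agent();

# Only touch the network when a URL is passed on the command line.
$ua->get( $ARGV[0] ) if @ARGV;
```

Run it with the URL as the only argument to see which headers go out and what the server answers.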
Re: "Internal Server Error" when GETing with WWW::Mechanize?
on 18.01.2005 14:04:40 by gisle
Gisle Aas writes:
> James Turnbull writes:
>
> > The error I get is...
> > "Error GETing http://www.parcelforce.com:80/portal/pw/track: Internal
> > Server Error at track.pl line 5"
>
> The server is confused by something in the request that LWP sends.
This is a buggy server that crashes unless the request sent has an
"Accept" header. It does not appear to matter what you put in it, as
demonstrated by running:
$ lwp-request -H Accept:foo http://www.parcelforce.com:80/portal/pw/track
In your app you can work around this problem by telling LWP to always
send an Accept header using code like:
$agent->default_header(Accept => "text/*");
(The default_header method was introduced in LWP-5.800).
Regards,
Gisle
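Put together with the original test script, the workaround might look like the sketch below. It relies on WWW::Mechanize being a subclass of LWP::UserAgent, so `default_header` is inherited; the helper name `make_agent` and the command-line handling are illustrative assumptions, not part of either module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# Build an agent that sends an Accept header with every request.
# (make_agent is a hypothetical helper name.)
sub make_agent {
    my $agent = WWW::Mechanize->new( autocheck => 1 );
    $agent->default_header( Accept => 'text/*' );
    return $agent;
}

my $agent = make_agent();

# Fetch only when a URL is given on the command line, e.g.
#   perl track.pl http://www.parcelforce.com:80/portal/pw/track
if (@ARGV) {
    $agent->get( $ARGV[0] );
    print $agent->status, "\n";
}
```

Setting the header once on the agent means every later `get` or form submission carries it, which should matter here because the form submission goes to the same server.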
Re: "Internal Server Error" when GETing with WWW::Mechanize?
on 18.01.2005 16:17:18 by James
On 2005-01-18 13:04:40 +0000, gisle@ActiveState.com (Gisle Aas) said:
> In your app you can work around this problem by telling LWP to always
> send an Accept header using code like:
> $agent->default_header(Accept => "text/*");
Many thanks for your help Gisle - this change did allow me to grab the page.
Sadly however it's all in vain as I am unable to persuade the
parcelforce website to accept the tracking number.
I recorded all the submitted form data using HTTP::Recorder, yet when
replaying it using WWW::Mechanize I get an "Error in tracking number"
page back. Various attempts at fiddling with cookies and JavaScript
sessions have got me nowhere, so I'm going to have to give in for
just now.
I've posted the code below in case someone with more Mechanize
experience can track down the problem but I'm pretty much stumped.
We'll just have to copy and paste the tracking numbers for the time
being :-/
James Turnbull
#!/usr/bin/perl -w
use strict;
use WWW::Mechanize;
my $agent = WWW::Mechanize->new( autocheck => 1 );
$agent->add_header( Accept => "text/*" );
$agent->add_header( Referer => "http://www.parcelforce.com/portal/pw" );
$agent->get("http://www.parcelforce.com/portal/pw/track?catId=7500082");
$agent->form(3);
$agent->field("_D:/rmg/track/TrackFormHandler.value.incomingInternationalParcel"
=> " ");
$agent->field("_D:/rmg/track/TrackFormHandler.value.invalidInputUrl" => " ");
$agent->field("/rmg/track/TrackFormHandler.value.year" => "-1");
$agent->field("/rmg/track/TrackFormHandler.value.incomingInternationalParcel"
=> "");
$agent->field("/rmg/track/TrackFormHandler.value.month" => "-1");
$agent->field("_D:/rmg/track/TrackFormHandler.value.year" => " ");
$agent->field("trackConsigniaPage" => "track");
$agent->field("_D:/rmg/track/TrackFormHandler.value.searchWaitUrl" => " ");
$agent->field("/rmg/track/TrackFormHandler.value.day" => "-1");
$agent->field("/rmg/track/TrackFormHandler.track" => "");
$agent->field("/rmg/track/TrackFormHandler.value.searchCompleteUrl" =>
"/portal/pw/track?catId=7500082&pageId=trt_resultspage");
$agent->field("/rmg/track/TrackFormHandler.value.trackingNumber" =>
"ZH353428810GB");
$agent->field("_dyncharset" => "ISO-8859-1");
$agent->field("_D:/rmg/track/TrackFormHandler.value.searchCompleteUrl" => " ");
$agent->field("/rmg/track/TrackFormHandler.value.invalidInputUrl" =>
"/portal/pw/track?catId=7500082&pageId=trt_resultspage");
$agent->field("_D:/rmg/track/TrackFormHandler.value.day" => " ");
$agent->field("_D:/rmg/track/TrackFormHandler.track" => " ");
$agent->field("_D:/rmg/track/TrackFormHandler.value.month" => " ");
$agent->field("_DARGS" =>
"/portal/rmgroup/apps/templates/html/trackAndTraceForm.jsp");
$agent->field("/rmg/track/TrackFormHandler.value.searchWaitUrl" =>
"/portal/pw/track?catId=7500082&timeout=true&pageId=trt_timeoutpage");
$agent->field("_D:/rmg/track/TrackFormHandler.value.trackingNumber" => " ");
$agent->submit_form(form_name => "TrackFormHandler");
print $agent->content();
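One thing that may be worth checking is which values the parsed form already carries, since HTML::Form (which WWW::Mechanize uses underneath) keeps the hidden inputs from the page and only the visible ones need filling in. The sketch below works offline on made-up markup: the HTML is hypothetical and only loosely modelled on the field names above, and whether this approach would cure the "Error in tracking number" response is not known.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical markup; the real ParcelForce form has many more fields.
my $html = <<'HTML';
<form name="TrackFormHandler" method="post" action="/portal/pw/track">
  <input type="hidden" name="_dyncharset" value="ISO-8859-1">
  <input type="hidden" name="trackConsigniaPage" value="track">
  <input type="text" name="/rmg/track/TrackFormHandler.value.trackingNumber" value="">
  <input type="submit" name="/rmg/track/TrackFormHandler.track" value="Track">
</form>
HTML

# Parse the page relative to its base URL and take the first form.
my ($form) = HTML::Form->parse( $html, 'http://www.parcelforce.com/portal/pw/track' );

# Fill in only the visible field; hidden values come from the markup.
$form->value( '/rmg/track/TrackFormHandler.value.trackingNumber' => 'ZH353428810GB' );

# List the name/value pairs the form would submit.
my %sent = $form->form;
printf "%s = %s\n", $_, $sent{$_} for sort keys %sent;
```

Comparing such a listing for the live page against the data captured by HTTP::Recorder could show which fields differ between the browser's submission and the replay.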