using cookie in google scholar from perl

on 09.11.2007 17:16:17 by francois.rappaz

I would like to use BibTeX data from Google Scholar in a Perl script.
Whether these data are shown is controlled by Google Scholar's preferences
panel, which uses cookies to remember the user's settings. BibTeX links are
not displayed with the default settings.

Using the Perl script, I receive an empty page for the following URL:
http://scholar.google.com/scholar.bib?num=100&hl=en&lr=&q=info:9Y4Rq3zllPUJ:scholar.google.com/&output=citation&oe=ASCII&oi=citation
even though I try to send the same cookie that my browser sets
when "display bibTex links" is on.
The same URL gives what I want if I request it in my browser with the
correct preferences.

My code goes like this:
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies::Netscape;
use HTTP::Headers;
use HTTP::Request;

my $url1 = "http://scholar.google.com/scholar.bib?"
         . "num=100&hl=en&lr=&q=info:9Y4Rq3zllPUJ:scholar.google.com/"
         . "&output=citation&oe=ASCII&oi=citation";
my $ua = LWP::UserAgent->new;
$ua->agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.8.1.9) Gecko/20071025 Firefox/2.0.0.9');
# Load the cookies from a Netscape-format cookies file
my $c = HTTP::Cookies::Netscape->new(file => "cookies.txt", autosave => 1);
# Extra request headers; Host is set to www.unifr.ch here
my $h = HTTP::Headers->new(
    Accept => "text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5",
    Host   => "www.unifr.ch",
);
my $req = HTTP::Request->new(GET => $url1, $h);
# Add matching cookies from the jar to the request, then send it
$c->add_cookie_header($req);
$ua->cookie_jar($c);
my $res = $ua->request($req);
die($res->status_line) unless $res->is_success;
print $res->as_string;

The output is
HTTP/1.1 200 OK
Cache-Control: private
Connection: Close
Date: Fri, 09 Nov 2007 16:04:42 GMT
Server: GWS/2.1
Content-Length: 0
Content-Type: text/plain; charset=ISO-8859-1
Client-Date: Fri, 09 Nov 2007 16:04:44 GMT
Client-Peer: 66.102.1.99:80
Client-Response-Num: 1
Set-Cookie: PREF=ID=8d4ee7c002f1dc72:TM=1194624282:LM=1194624282:S=vBeATmmCJPdAHCQy; expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.google.com
Set-Cookie: GSP=ID=8d4ee7c002f1dc72; expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.scholar.unifr.ch

The relevant line in the cookies file used by the script is
.scholar.google.com TRUE / FALSE 2147368448 GSP ID=c77566a14ed11a16:IN=7e6cc990821af63+b8acc395c41ea61f:CF=4

The cookie data from my browser are
name: GSP
Value: ID=c77566a14ed11a16:IN=7e6cc990821af63+b8acc395c41ea61f:CF=4
Domain: .scholar.google.com
Path: /
...

Well, the cookie in the script output seems to be different from the
cookie I want to send. This may be the reason for my empty page, but I
can't understand why that cookie is changed ...

Thanks for any help!

Francois

Re: using cookie in google scholar from perl

on 14.11.2007 15:11:08 by francois.rappaz

Well, one needs luck sometimes: if I replace the domain
scholar.google.com with the domain my PC is in (.unifr.ch), the script
works ... I have read some more about cookies but can't understand how
the server at Google Scholar can find the right cookie when the domain
is changed...
Francois
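
One possible explanation, offered as a sketch rather than a confirmed diagnosis: as far as I can tell from the HTTP::Cookies source, add_cookie_header takes the host name from the request's Host header when one is present and only falls back to the host in the URL otherwise. The script sets Host to www.unifr.ch, so the jar would be searched for cookies stored under unifr.ch domains and not under google.com ones. The snippet below tries to show this offline, without contacting Google. The cookie value is a placeholder and the helper cookie_for is hypothetical.

```perl
use strict;
use warnings;
use HTTP::Cookies;
use HTTP::Request;

# In-memory jar with one version-0 cookie stored under .unifr.ch.
# The value is a placeholder, not a real Google Scholar preference cookie.
my $jar = HTTP::Cookies->new;
$jar->set_cookie(0, 'GSP', 'ID=placeholder:CF=4', '/', '.unifr.ch',
                 undef, 0, 0, 3600, 0);

# Hypothetical helper: build a request for the Scholar URL, optionally with
# extra headers, and return the Cookie header the jar attaches (or '').
sub cookie_for {
    my (@headers) = @_;
    my $req = HTTP::Request->new(GET => 'http://scholar.google.com/scholar.bib');
    $req->header(@headers) if @headers;
    $jar->add_cookie_header($req);
    my $cookie = $req->header('Cookie');
    return defined $cookie ? $cookie : '';
}

my $plain    = cookie_for();                        # host taken from the URL
my $override = cookie_for(Host => 'www.unifr.ch');  # host taken from the Host header

print "without Host header: '$plain'\n";
print "with Host header:    '$override'\n";
```

If this reading is right, dropping the Host line from HTTP::Headers->new should make the jar look up cookies for the host in the URL instead. It might also explain the domain=.scholar.unifr.ch in the response, if the server builds that domain from the Host header it received, though that part is a guess.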