GoogleBot

on 08.04.2009 07:58:46 by jmcaricand

Hi,

I use a PerlTransHandler on my server to fetch a file via proxy:

sub handler {
    ....

    # Dots are escaped so they match literally rather than any character.
    if ( $uri =~ m{^/sitemap\.xml\.gz$} ) {
        my $real_url = $r->unparsed_uri;

        $real_url = '/static' . $real_url;

        # Hand the request over to mod_proxy.
        $r->proxyreq(1);
        $r->uri($real_url);
        $r->filename(sprintf "proxy:http://xxx.xxx.xxx.xxx%s", $real_url);
        $r->handler('proxy-server');

        return Apache2::Const::OK;
    }

    ....

    return Apache2::Const::DECLINED;
}

When I use Firefox to get sitemap.xml.gz, everything works fine:

xxx.xxx.xxx.xxx - - [07/Apr/2009:14:59:35 +0200] "GET /sitemap.xml.gz
HTTP/1.1" 200 2924 "-" "Mozilla/5.0 (X11; U; Linux i686; fr; rv:1.9.0.6)
Gecko/2009020409 Iceweasel/3.0.6 (Debian-3.0.6-1)"

But when Googlebot downloads the file, I see these log entries:

66.249.65.228 - - [07/Apr/2009:15:01:41 +0200] "HEAD /sitemap.xml.gz
HTTP/1.1" 302 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1;
+http://www.google.com/bot.html)"
66.249.65.228 - - [07/Apr/2009:15:01:44 +0200] "GET /sitemap.xml.gz
HTTP/1.1" 302 451 "-" "Mozilla/5.0 (compatible; Googlebot/2.1;
+http://www.google.com/bot.html)"

Why are there two requests, and why does my server return status 302 instead of 200?

Thanks.

Re: GoogleBot

am 08.04.2009 14:52:59 von mpeters

jmcaricand@greta-besancon.com wrote:

> 66.249.65.228 - - [07/Apr/2009:15:01:41 +0200] "HEAD /sitemap.xml.gz
> HTTP/1.1" 302 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1;
> +http://www.google.com/bot.html)"
> 66.249.65.228 - - [07/Apr/2009:15:01:44 +0200] "GET /sitemap.xml.gz
> HTTP/1.1" 302 451 "-" "Mozilla/5.0 (compatible; Googlebot/2.1;
> +http://www.google.com/bot.html)"
>
> Why are there two requests

The first is a HEAD request
(http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol#Request_methods)

> and why does my server return status 302 instead of 200?

It seems you have some sort of auth handler in front that issues a redirect (which
is what a 302 is). To find out why, try requesting that resource with your browser
pretending to be Googlebot. If you're using Firefox, look at the User Agent
Switcher add-on.
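
If a browser add-on is not convenient, the same check can be scripted. The following is only a rough sketch, not something taken from your setup: it uses the core HTTP::Tiny module to send a HEAD and then a GET with Googlebot's User-Agent string (copied from your log), and it does not follow redirects, so the 302 and its Location header show up directly. The helper names summarise and probe are made up for illustration, and the URL is whatever you pass on the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Tiny;

# Googlebot's User-Agent string, copied from the access log in this thread.
my $GOOGLEBOT_UA =
    'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';

# Hypothetical helper: turn an HTTP::Tiny response hash into
# "STATUS REASON", plus " -> LOCATION" when a Location header is present.
sub summarise {
    my ($res) = @_;
    my $line     = "$res->{status} $res->{reason}";
    my $location = $res->{headers}{location};
    $line .= " -> $location" if defined $location;
    return $line;
}

# Hypothetical helper: send HEAD then GET, the same sequence seen in the log.
sub probe {
    my ($url) = @_;
    my $http = HTTP::Tiny->new(
        agent        => $GOOGLEBOT_UA,
        max_redirect => 0,    # report the redirect itself, do not follow it
    );
    for my $method (qw(HEAD GET)) {
        my $res = $http->request($method, $url);
        print "$method $url: ", summarise($res), "\n";
    }
}

# Usage: perl probe.pl http://your.server/sitemap.xml.gz
probe($ARGV[0]) if @ARGV;
```

If the output shows a 302 with a Location header only when this User-Agent is sent, the redirect target should point at whichever handler or rule is treating Googlebot differently.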

--
Michael Peters
Plus Three, LP