Redirect

on 05.08.2007 21:38:31 by blvstk

I'm getting hammered with... "GET /guestbook/gbook.php?a=sign HTTP/1.1"
I got rid of the guestbook months ago. The directory does not exist.
I have these lines in .htaccess:
"Redirect /guestbook http://www.[somewhere else].com" and
"Redirect /guestbook/ http://www.[somewhere else].com"
These work as long as there are no parameters after "guestbook".
But the bots are adding the "gbook.php?a=sign"
Here's one that appears to have come from a googlebot....
"crawl-66-249-65-100.googlebot.com - - [01/Aug/2007:17:03:05 -0400] "GET /guestbook/gbook.php?a=delete&num=2?
Why would Google try to mess with my (nonexistent) guestbook?
I have a multitude of "deny from" lines but they don't even make a dent.

What's the best way to handle this situation?
Or do I just have to live with it?
I'm running Apache/1.3.12 under Windows XP

Re: Redirect

on 05.08.2007 22:19:17 by patpro

In article ,
"Lil' Abner" wrote:

> I'm getting hammered with... "GET /guestbook/gbook.php?a=sign HTTP/1.1"
> I got rid of the guestbook months ago. The directory does not exist.
> I have these lines in .htaccess:
> "Redirect /guestbook http://www.[somewhere else].com" and
> "Redirect /guestbook/ http://www.[somewhere else].com"
> These work as long as there are no parameters after "guestbook".
> But the bots are adding the "gbook.php?a=sign"
> Here's one that appears to have come from a googlebot....
> "crawl-66-249-65-100.googlebot.com - - [01/Aug/2007:17:03:05 -0400] "GET
> /guestbook/gbook.php?a=delete&num=2?
> Why would Google try to mess with my (nonexistent) guestbook?
> I have a multitude of "deny from" lines but they don't even make a dent.
>
> What's the best way to handle this situation?

Teach the bot: return proper error codes, don't send them elsewhere
unless it's related (did the guestbook move?), and read the docs for the
Redirect directive to handle the changed URL.

patpro

--
http://www.patpro.net/

Re: Redirect

on 05.08.2007 22:55:23 by blvstk

patpro ~ patrick proniewski wrote in
news:patpro-4C8C4F.22191705082007@news-1.proxad.net:

> In article ,
> "Lil' Abner" wrote:
>
>> I'm getting hammered with... "GET /guestbook/gbook.php?a=sign
>> HTTP/1.1" I got rid of the guestbook months ago. The directory does
>> not exist. I have lines in .htaccess....
>> "Redirect /guestbook http://www.[somewhere else].com" and
>> "Redirect /guestbook/ http://www.[somewhere else].com
>> These work as long as there are no parameters after "guestbook".
>> But the bots are adding the "gbook.php?a=sign"
>> Here's one that appears to have come from a googlebot....
>> "crawl-66-249-65-100.googlebot.com - - [01/Aug/2007:17:03:05 -0400]
>> "GET /guestbook/gbook.php?a=delete&num=2?
>> Why would Google try to mess with my (nonexistent) guestbook?
>> I have a multitude of "deny from" lines but they don't even make a
>> dent.
>>
>> What's the best way to handle this situation?
>
> Teach the bot: return proper error codes, don't send them elsewhere
> unless it's related (did the guestbook move?), and read the docs for the
> Redirect directive to handle the changed URL.

OK. I studied the documentation and came up with
"Redirect gone /guestbook"

And it seems to be working! Even with parameters.

"Gone
The requested resource /guestbook/gbook.php is no longer available on
this server and there is no forwarding address. Please remove all
references to this resource."

I guess that's about the best I can do. Do the bots even see what is
returned, or will they just keep plugging away forever?

Thanks for your suggestion. We'll see what happens now.

Lil' Abner

Re: Redirect

on 05.08.2007 23:34:50 by patpro

In article ,
"Lil' Abner" wrote:

> > Teach the bot: return proper error codes, don't send them elsewhere
> > unless it's related (did the guestbook move?), and read the docs for the
> > Redirect directive to handle the changed URL.
>
> OK. I studied the documentation and came up with
> "Redirect gone /guestbook"

You got it. I usually use the numeric status codes (404, 410, ...), but
"gone" works the same as 410.

> I guess that's about the best I can do. Do the bots even see what is
> returned, or will they just keep plugging away forever?

They don't care about the text, but they should care about the HTTP
status code. In this case they will receive a 410 instead of your
previous redirect. If they are well programmed, they should gradually
stop coming (it can take a while).
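
For reference, a minimal sketch of the .htaccess rule in both forms,
assuming mod_alias is available and /guestbook is the retired path from
this thread. The numeric line is commented out because only one of the
two is needed:

```apache
# Answer 410 Gone for /guestbook and everything below it.
# No target URL is given: there is no forwarding address.
Redirect gone /guestbook

# Numeric form of the same rule (use one or the other):
# Redirect 410 /guestbook
```

Redirect matches on the URL path prefix, so requests such as
/guestbook/gbook.php?a=sign should be covered as well.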

But I'm afraid it won't change anything if some external sites still
link to these URLs.

patpro

--
http://www.patpro.net/