Best practices for returning 404/file-not-found pages inside and outside of mod_perl
am 15.04.2008 21:19:09 von Mark Stosberg
It seems that using CGI, it is too late to return a true 404 once the
script is processing the request. It's still possible to send output
that returns "page not found" text, but the HTTP status code will be 200.
More recently, I learned that with mod_perl I can get
the system to return a true 404, so I updated my CGI::Application logic
to do that when possible:
if (exists $ENV{MOD_PERL}) {
    $self->header_add( -status => 404 );
    return '';
}
else {
    return $self->error(title => 'Page not found');
}
However, I don't think I'm doing the ideal thing in mod_perl, because it
behaves strangely in some cases. Two specific cases:
If I use GET on the command line, instead of 404, I'll get back this:
"500 EOF when chunk header expected"
Unless I fallback to HTTP 1.0:
PERL_LWP_USE_HTTP_10=1 GET ...
But for some reason, setting this environment variable was not working
with Test::WWW::Mechanize.
More troubling is the behavior I see in the browser: The first time I
access the script that would throw this 404 in mod_perl, it works.
Then attempts 2 through 6 return internal server errors complaining
about "can't locate modules". Starting on load 7, the pages are returned
reliably with the 404 error. WTF?
( This is with Apache 1.3x and mod_perl 1.x )
The approach of CGI::Application::Dispatch is to also return the "404"
code, but it also returns body content along with it. In my case, I'm
hoping to trigger the internal ErrorDocument 404 page instead of
re-inventing that wheel.
What am I missing?
Thanks!
Mark
Re: Best practices for returning 404/file-not-found pages inside and outside of mod_perl
am 15.04.2008 22:28:32 von David Nicol
On Tue, Apr 15, 2008 at 2:19 PM, Mark Stosberg wrote:
>
> return a true 404
Since MP already replaces the exit function, it shouldn't be too tricky to
abstract 404 and other error codes by letting exit take arguments -- then
you could do what you want with an exit call that carries a 404 status, for instance.
I don't know how hard that feature would be to add; adding it to mod-perl might
drive similar features added to other CGI systems, for instance the apache
project could add a mapping of specific non-zero exit codes from CGI programs
to things other than "500 internal server error."
Of course, there's always redirecting to a location that really doesn't exist,
but that isn't "true."
Dave the idea guy
Re: Best practices for returning 404/file-not-found pages inside and outside of mod_perl
am 16.04.2008 04:53:59 von Perrin Harkins
On Tue, Apr 15, 2008 at 3:19 PM, Mark Stosberg wrote:
> It seems that using CGI, it is too late to return a true 404 once the script
> is processing the request.
I thought mod_cgi would handle this, actually. It parses your header
output. Apache::Registry has trouble emulating that, as discussed on
this list in the past.
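As a minimal sketch of the header parsing mentioned above (assuming a plain CGI script run under mod_cgi; the helper name not_found_response is made up for illustration), a script can ask for a real 404 by emitting a "Status" header before any body:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: builds a complete CGI response whose "Status"
# header asks the web server to answer with 404 instead of 200.
sub not_found_response {
    my ($message) = @_;
    return join '',
        "Status: 404 Not Found\r\n",      # parsed by the server, not shown to the user
        "Content-Type: text/plain\r\n",
        "\r\n",                           # blank line ends the header block
        "$message\n";
}

print not_found_response('Page not found');
```

This sends its own body along with the status, so it does not by itself hand the request over to the server's ErrorDocument page.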
> However, I don't think I'm doing the ideal thing in mod_perl, because it
> behaves strangely in some cases. Two specific cases:
>
> If I use GET on the command line, instead of 404, I'll get back this:
> "500 EOF when chunk header expected"
You're not using Registry here, right? Is it possible that something
is using your status header as a return code from a mod_perl handler?
Those don't always match.
The best source for examples of how to do this correctly is probably
the mod_perl Developer's Cookbook. I don't have mine handy, but
that's where I'd look first if you have it.
> More troubling is the behavior I see in the browser: The first time I
> access the script that would throw this 404 in mod_perl, it works.
> Then attempts 2 through 6 return internal server errors complaining
> about "can't locate modules". Starting on load 7, the pages are returned
> reliably with the 404 error. WTF?
I'm not familiar with that one. What's the full text of the error message?
> In my case, I'm
> hoping to trigger the internal ErrorDocument 404 page instead of
> re-inventing that wheel.
I'm not sure you can do that. I know you can set the ErrorDocument
for a specific block ($r->custom_response), but I don't think you can
just hand off to ErrorDocument because it's tied into the default
handler. I don't remember this well, so checking one of the books or
the list archive is your best bet.
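A rough sketch of the $r->custom_response idea, assuming a mod_perl 1.x content handler rather than a Registry script. The handler name and the error page path are hypothetical, and the constant is defined locally only so the sketch compiles without Apache; under mod_perl it would normally come from Apache::Constants:

```perl
use strict;
use warnings;

# Under mod_perl 1.x this would normally be imported with
#   use Apache::Constants qw(NOT_FOUND);
# It is defined here (assumption) with the same numeric value.
use constant NOT_FOUND => 404;

# Hypothetical handler; $r is the Apache request object.
sub handler {
    my ($r) = @_;

    # Point 404 responses for this request at a hypothetical error page.
    $r->custom_response(NOT_FOUND, '/errors/not-found.html');

    # Send no content ourselves; returning the status code leaves it to
    # Apache to build the error response.
    return NOT_FOUND;
}
```

Because the handler prints nothing before returning, there is no already-sent content competing with the error page.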
- Perrin
Re: Best practices for returning 404/file-not-found pages inside and outside of mod_perl
am 22.04.2008 20:27:16 von Mark Stosberg
I've come to understand my 404 handling case better. Here's what I know:
A. If I just set "status => 404" with CGI.pm / Apache::Registry and
return nothing, it works the first time, and then after that I
get a lot of these errors:
"[Tue Apr 22 13:47:07 2008] [error] Can't locate SAP/QuickSearch.pm
in @INC" And indeed, the path that should be set via PerlSetEnv is
missing.
B. If I set "status => 404" *and* send content, the result is that
two pages are displayed: One is the ErrorDocument for 404, and the
other is the content I sent.
C. If I don't set the status code but just send "file not found"
content, that looks right to users, but the "200" code is returned,
which is inaccurate for any automated tools using the site.
###
At this point, I'm ready to go with "C" as being "good enough"; however, I'm
interested to know why the environment variable would be missing when
the 404 page is called, especially when it works on the first load.
Thanks!
Mark
--
. . . . . . . . . . . . . . . . . . . . . . . . . . .
Mark Stosberg Principal Developer
mark@summersault.com Summersault, LLC
765-939-9301 ext 202 database driven websites
. . . . . http://www.summersault.com/ . . . . . . . .
Re: Best practices for returning 404/file-not-found pages inside and outside of mod_perl
am 22.04.2008 20:42:44 von Perrin Harkins
On Tue, Apr 22, 2008 at 2:27 PM, Mark Stosberg wrote:
> A. If I just set "status => 404" with CGI.pm / Apache::Registry and
> return nothing, it works the first time, and then after that I
> get a lot of these errors:
>
> "[Tue Apr 22 13:47:07 2008] [error] Can't locate SAP/QuickSearch.pm
> in @INC" And indeed, the path that should be set via PerlSetEnv is
> missing.
Hmm, PerlSetEnv depends on being inside a specific
Location/File/Directory block. Are you sure that apache has resolved
to the block you think it's in when you have this problem?
> B. If I set "status => 404" *and* send content, the result is that
> two pages are displayed: One is the ErrorDocument for 404, and the
> other is the content I sent.
Yeah, it can't recall content you've already sent. I think mod_cgi
avoids this issue by buffering everything.
- Perrin
Re: Best practices for returning 404/file-not-found pages inside and outside of mod_perl
am 24.04.2008 22:16:44 von Mark Stosberg
> On Tue, Apr 22, 2008 at 2:27 PM, Mark Stosberg
> wrote:
> > A. If I just set "status => 404" with CGI.pm / Apache::Registry
> > and return nothing, it works the first time, and then after that I
> > get a lot of these errors:
> >
> > "[Tue Apr 22 13:47:07 2008] [error] Can't locate
> > SAP/QuickSearch.pm in @INC" And indeed, the path that should be set
> > via PerlSetEnv is missing.
>
> Hmm, PerlSetEnv depends on being inside a specific
> Location/File/Directory block. Are you sure that apache has resolved
> to the block you think it's in when you have this problem?
No, I'm not. Are there debugging techniques that help me confirm
this?
Thanks again for your help, Perrin.
Mark
Re: Best practices for returning 404/file-not-found pages inside and outside of mod_perl
am 25.04.2008 06:53:54 von Perrin Harkins
On Thu, Apr 24, 2008 at 4:16 PM, Mark Stosberg wrote:
> No, I'm not. Are there debugging techniques that help me confirm
> this?
Sorry Mark, I misread your mail. I thought you were using PerlSetVar.
What are you doing exactly? PerlSetEnv PERL5LIB?
- Perrin
Re: Best practices for returning 404/file-not-found pages inside and outside of mod_perl
am 25.04.2008 15:14:37 von Mark Stosberg
> Sorry Mark, I misread your mail. I thought you were using PerlSetVar.
> What are you doing exactly? PerlSetEnv PERL5LIB?
Exactly.
Mark
Re: Best practices for returning 404/file-not-found pages inside and outside of mod_perl
am 25.04.2008 16:38:36 von aw
Mark Stosberg wrote:
>> Sorry Mark, I misread your mail. I thought you were using PerlSetVar.
>> What are you doing exactly? PerlSetEnv PERL5LIB?
>
> Exactly.
>
Hi guys,
sorry to butt in, particularly since much higher-grade specialists have
been in this thread before, but ..
I seem to recall an earlier thread talking about the same kind of thing.
Isn't it too late, once the Apache server has started and initialised
Perl interpreters and so on, to set the environment var. PERL5LIB via
PerlSetEnv ?
Would that not explain the curious behaviour seen ?
André
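A small sketch of the timing issue raised here, in plain Perl outside of mod_perl (the directory name is made up). PERL5LIB is read when the interpreter starts, so assigning it later does not change @INC; whether mod_perl gives PerlSetEnv PERL5LIB any special treatment is not something this sketch settles:

```perl
use strict;
use warnings;

my $late_dir = '/hypothetical/late/lib';   # made-up path, for illustration only

# Changing the environment of an already running interpreter...
$ENV{PERL5LIB} = $late_dir;

# ...does not touch @INC, which was built at interpreter start-up.
my $seen_after_env = grep { $_ eq $late_dir } @INC;
print 'in @INC after setting PERL5LIB: ', ($seen_after_env ? 'yes' : 'no'), "\n";

# Modifying @INC directly is what takes effect at this point.
unshift @INC, $late_dir;
my $seen_after_unshift = grep { $_ eq $late_dir } @INC;
print 'in @INC after unshift: ', ($seen_after_unshift ? 'yes' : 'no'), "\n";
```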
Re: Best practices for returning 404/file-not-found pages inside and outside of mod_perl
am 29.04.2008 23:23:42 von Mark Stosberg
On Fri, 25 Apr 2008 16:38:36 +0200
André Warnier wrote:
> Mark Stosberg wrote:
> >> Sorry Mark, I misread your mail. I thought you were using
> >> PerlSetVar. What are you doing exactly? PerlSetEnv PERL5LIB?
> >
> > Exactly.
> >
> Hi guys,
> sorry to butt in, particularly since much higher-grade specialists
> have been in this thread before, but ..
> I seem to recall an earlier thread talking about the same kind of
> thing. Isn't it too late, once the Apache server has started and
> initialised Perl interpreters and so on, to set the environment var.
> PERL5LIB via PerlSetEnv ?
> Would that not explain the curious behaviour seen ?
Possibly. As I reviewed other environments that are working for us,
I see that we use "SetEnv" there. I just introduced "PerlSetEnv" recently,
as part of trying to debug mod_perl problems.
I did find and change this section in my startup script, which seems like
it could be related:
- chdir dirname(__FILE__);
- use lib '../config', '../perllib';
+ my $dir = dirname(__FILE__);
+ use lib $dir.'/../config';
+ use lib $dir.'/../perllib';
####
So, I was adding relative paths to @INC before, and now I'm adding absolute
ones.
Mark
Re: Best practices for returning 404/file-not-found pages inside and outside of mod_perl
am 01.05.2008 15:58:16 von Mark Stosberg
> + my $dir = dirname(__FILE__);
> + use lib $dir.'/../config';
> + use lib $dir.'/../perllib';
Actually, for some reason that syntax didn't work either, but this did work on my modperl-startup.pl:
use lib dirname(__FILE__).'/../config';
use lib dirname(__FILE__).'/../perllib';
Mark
Re: Best practices for returning 404/file-not-found pages inside and outside of mod_perl
am 01.05.2008 16:06:28 von aw
Mark Stosberg wrote:
>> + my $dir = dirname(__FILE__);
>> + use lib $dir.'/../config';
>> + use lib $dir.'/../perllib';
>
> Actually, for some reason that syntax didn't work either, but this did work on my modperl-startup.pl:
>
> use lib dirname(__FILE__).'/../config';
> use lib dirname(__FILE__).'/../perllib';
>
> Mark
>
>
This is a question for the Perl gurus here:
In the first part above (what does not work), is it not because the "use
lib" instructions are actually "executed" at Perl *compile* time, at
which time the $dir variable does not have any value yet?
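The compile-time ordering asked about here can be seen in a small standalone sketch (the path is invented). A BEGIN block runs as soon as it is compiled, the same way a "use" statement does, while an ordinary assignment only runs later at run time:

```perl
use strict;
use warnings;

my @order;

# "my" declares the variable at compile time, but the assignment
# itself is deferred until run time.
my $dir = '/srv/app';    # invented path

# Like "use", a BEGIN block executes during compilation, so at this
# point $dir exists but has not been given its value yet.
BEGIN { push @order, 'BEGIN sees: ' . (defined $dir ? $dir : 'undef') }

push @order, "run time sees: $dir";

print "$_\n" for @order;
# prints:
#   BEGIN sees: undef
#   run time sees: /srv/app
```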
Re: Best practices for returning 404/file-not-found pages inside and outside of mod_perl
am 03.05.2008 06:34:20 von Graham TerMarsch
On Thursday 01 May 2008 7:06 am, André Warnier wrote:
> Mark Stosberg wrote:
> >> + my $dir = dirname(__FILE__);
> >> + use lib $dir.'/../config';
> >> + use lib $dir.'/../perllib';
> >
> > Actually, for some reason that syntax didn't work either, but this did
> > work on my modperl-startup.pl:
> >
> > use lib dirname(__FILE__).'/../config';
> > use lib dirname(__FILE__).'/../perllib';
> >
> > Mark
>
> this is a question to the perl gurus here :
>
> In the first part above (what does not work), is it not because the "use
> lib" instructions are actually "executed" at the perl *compile* time, at
> which time the $dir variable does not have any value yet ?
Yes.
Crazier yet, if you do it as:
my $dir;
BEGIN { $dir = dirname(__FILE__) };
use lib $dir.'/../config';
use lib $dir.'/../perllib';
and put the assignment in a BEGIN block, it still doesn't work (at least, I
haven't been able to get it to work).
I cheated, and when I needed this idiom I created a module that exported the
value, and which assigned it in its "import()" method. Gave me something
like:
use FindMe qw($ME);
use lib $ME.'/../config';
use lib $ME.'/../perllib';
and that worked fine. The "import()" routine gets called and sets the value
before it's used in the following lines.
Now... why the BEGIN block doesn't do it, I've no idea and didn't poke too
much farther into it to figure out why.
--
Graham TerMarsch
Howling Frog Internet Development, Inc.
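For comparison, a minimal sketch of the BEGIN-block variant, assuming File::Basename is loaded with "use" ahead of the BEGIN block (an assumption; the thread does not show how dirname was imported in the failing case). Under that assumption the lexical is assigned at compile time, before the "use lib" lines are processed, and in a plain perl the paths do reach @INC. It does not explain the failure reported above, which may have had a different cause:

```perl
use strict;
use warnings;
use File::Basename qw(dirname);   # must be compiled before the BEGIN block runs

my $dir;
BEGIN { $dir = dirname(__FILE__) }   # assigned during compilation

# "use lib" also runs during compilation, after the BEGIN block above.
use lib $dir . '/../config';
use lib $dir . '/../perllib';

# Collect the entries this file added, to confirm they are present.
my @added = grep { $_ eq "$dir/../config" || $_ eq "$dir/../perllib" } @INC;
print "$_\n" for @added;
```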