requests and sub-requests

on 12.10.2008 12:54:32 by aw

Hi.

I am writing a new PerlAuthHandler module.
It is working fine in most cases, but..

In an attempt at being clever, I put the following code in the handler :

unless ($r->is_initial_req) {
    if (defined $r->prev) {
        # we are in a subrequest. Just copy user from main request.
        $r->user( $r->prev->user );
    }
    # Also disable authorization phase
    $r->set_handlers(PerlAuthzHandler => undef);
    return OK;
}

The idea being that if we are in a sub-request, there is no point in
authenticating/authorizing it again, since the main request should
already do that, right ? Optimisation..

Now the above works very nicely, except in the case where, before this
handler gets called, there is an intervention by mod_rewrite.
It seems as if mod_rewrite makes the above fail, even when the rewrite
condition does not apply and the URL is considered as a "pass-through".

I suspect that it is because mod_rewrite, no matter what, invokes the
original (or modified) URL as a sub-request of the original request.
This would cause the above to fail, because in such a case, the above
conditional code would be invoked, but there is no $r->prev->user to be
copied.

So,
1) is my suspicion above correct ?
2) is there a way to modify the above code to still allow some
optimisation in that case ?

Thanks.

Re: requests and sub-requests

on 12.10.2008 14:56:45 by torsten.foertsch

On Sun 12 Oct 2008, André Warnier wrote:
> In an attempt at being clever, I put the following code in the
> handler :
>
>      unless ($r->is_initial_req) {
>          if (defined $r->prev) {
>              # we are in a subrequest.  Just copy user from main request.
>              $r->user( $r->prev->user );
>          }
>          # Also disable authorization phase
>          $r->set_handlers(PerlAuthzHandler => undef);
>          return OK;
>      }

You have to distinguish between subrequests and internal redirects.
Subrequests result from $r->lookup_uri, $r->lookup_file or similar
(there are a few more such functions in the C API). Internal redirects
result from $r->internal_redirect. (internal_fast_redirect() is not,
as the name suggests, an internal redirect; it simply overrides the
current request.) Subrequests are used for example by mod_rewrite,
mod_include and mod_negotiation to look for some characteristics of a
document and perhaps pull it in (run() it). Internal redirects are used
in mod_cgi when the CGI output indicates a status 200 (HTTP_OK) but
also contains a Location header. But the main usage of internal
redirects is the ErrorDocument.

Now, is_initial_req() checks if the current $r is the result of a
subrequest or the result of an internal redirect and returns false if
so. prev() returns the parent request if the current $r is the result
of an internal redirect and main() returns the main request if the
current $r is a subrequest. So, your code handles only internal
redirects (ErrorDocument), not subrequests.
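
To make the distinction concrete, here is a minimal sketch of how a handler could tell the three cases apart. The helper name request_kind() and the MockRequest class are hypothetical, added only so the snippet runs outside Apache; under mod_perl the real $r provides main() and prev() itself.

```perl
use strict;
use warnings;

# Hypothetical helper: classify a request by how it came to exist.
# It only relies on main() and prev(), which Apache2::RequestRec provides.
sub request_kind {
    my ($r) = @_;
    return 'subrequest'        if defined $r->main;  # e.g. from lookup_uri()
    return 'internal redirect' if defined $r->prev;  # e.g. ErrorDocument
    return 'initial request';
}

# Minimal stand-in for the request object (an assumption, not mod_perl API).
{
    package MockRequest;
    sub new  { my ($class, %args) = @_; return bless {%args}, $class }
    sub main { $_[0]{main} }
    sub prev { $_[0]{prev} }
}

my $initial  = MockRequest->new;
my $subreq   = MockRequest->new(main => $initial);
my $redirect = MockRequest->new(prev => $initial);

print request_kind($_), "\n" for $initial, $subreq, $redirect;
```

A handler that tests only $r->prev, like the one quoted above, takes the 'internal redirect' branch and never sees the main request of a subrequest.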

Now, have a look at httpd-2.x.y/server/request.c around line 170. You'll
see this piece of code:

    /* Skip authn/authz if the parent or prior request passed the
     * authn/authz, and that configuration didn't change (this requires
     * optimized _walk() functions in map_to_storage that use the same
     * merge results given identical input.)  If the config changes,
     * we must re-auth.
     */
    if (r->main && (r->main->per_dir_config == r->per_dir_config)) {
        r->user = r->main->user;
        r->ap_auth_type = r->main->ap_auth_type;
    }
    else if (r->prev && (r->prev->per_dir_config == r->per_dir_config)) {
        r->user = r->prev->user;
        r->ap_auth_type = r->prev->ap_auth_type;
    }
    else {
        switch (ap_satisfies(r)) {
        case SATISFY_ALL:
        case SATISFY_NOSPEC:
            if ((access_status = ap_run_access_checker(r)) != 0) {
                return decl_die(access_status, "check access", r);
    ...

You see, you are not the first who has had the idea of reusing an
established identity. If your subreq or internal redirect hits the same
Location or Directory container, the AAA phases are completely skipped.
Maybe this is enough optimization if you shift a few directives around
in your httpd.conf.

If not, the code above shows you how to do it. But you must ask yourself
if it really is valid to reuse the identity. I believe you can safely
inherit the identity from $r->main or $r->prev, but you must not skip
the other 2 A's. If you can't, it would mean you have one realm of
identities for the main request and another for the subreq. That, I'd
say, is a configuration error.
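
As a rough Perl counterpart of the C code, the following sketch inherits only the identity and leaves access control and authorization alone. The names inherit_user(), authen_handler() and MockRequest are hypothetical, and OK is defined locally as a stand-in for Apache2::Const::OK so the snippet runs outside Apache.

```perl
use strict;
use warnings;

use constant OK => 0;   # stand-in for Apache2::Const::OK (assumption for this sketch)

# Hypothetical helper: copy the user from the main request (subrequest
# case) or from the previous request (internal redirect case).
# Returns true if an identity was inherited.
sub inherit_user {
    my ($r) = @_;
    for my $parent ($r->main, $r->prev) {
        next unless defined $parent && defined $parent->user;
        $r->user($parent->user);
        return 1;
    }
    return 0;
}

# Hypothetical authentication handler: only the identity is reused.
# No other phase is disabled, so the other 2 A's still run as configured.
sub authen_handler {
    my ($r) = @_;
    return OK if inherit_user($r);
    # ... the full (expensive) authentication would go here ...
    return OK;
}

# Minimal stand-in for the request object (not mod_perl API).
{
    package MockRequest;
    sub new  { my ($class, %args) = @_; return bless {%args}, $class }
    sub main { $_[0]{main} }
    sub prev { $_[0]{prev} }
    sub user { my $s = shift; $s->{user} = shift if @_; return $s->{user} }
}

my $main = MockRequest->new(user => 'alice');
my $sub  = MockRequest->new(main => $main);
authen_handler($sub);
print "subrequest user: ", $sub->user, "\n";
```

Unlike the handler quoted at the top, this version does not call set_handlers(PerlAuthzHandler => undef), so a subrequest into a differently protected Location is still checked against its own Require rules.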

> The idea being that if we are in a sub-request, there is no point in
> authenticating/authorizing it again, since the main request should
> already do that, right ?  Optimisation..
>
> Now the above works very nicely, except in the case where, before
> this handler gets called, there is an intervention by mod_rewrite. It
> seems as if mod_rewrite makes the above fail, even when the rewrite
> condition does not apply and the URL is considered as a
> "pass-through".
>
> I suspect that it is because mod_rewrite, no matter what, invokes
> the original (or modified) URL as a sub-request of the original
> request. This would cause the above to fail, because in such a case,
> the above conditional code would be invoked, but there is no
> $r->prev->user to be copied.

mod_rewrite doesn't make subrequests if not asked to. I know only of 2
ways to have mod_rewrite perform a subreq: %{LA-U:variable}
and %{LA-F:variable} in a RewriteCond.

Torsten

--
Need professional mod_perl support?
Just hire me: torsten.foertsch@gmx.net

Re: requests and sub-requests

on 12.10.2008 19:09:58 by aw

Torsten,

Many thanks for the excellent information, I will ponder that.

More below, but one more question here :
Where does $r->internal_redirect "live" (in which package) ?
I am having trouble finding it.

Torsten Foertsch wrote:
> On Sun 12 Oct 2008, André Warnier wrote:
>> In an attempt at being clever, I put the following code in the
>> handler :
>>
>> unless ($r->is_initial_req) {
>>     if (defined $r->prev) {
>>         # we are in a subrequest. Just copy user from main request.
>>         $r->user( $r->prev->user );
>>     }
>>     # Also disable authorization phase
>>     $r->set_handlers(PerlAuthzHandler => undef);
>>     return OK;
>> }
>
> You have to distinguish between subrequests and internal redirects. The
> former result from $r->lookup_uri, $r->lookup_file or similar (there
> are a few more such functions in the C API) and internal redirects that
> result from $r->internal_redirect (internal_fast_redirect() is not as
> the name suggests an internal redirect but simply overrides the current
> request). Subrequests are used for example by mod_rewrite, mod_include,
> mod_negotiation to look for some characteristics of a document and
> perhaps pull it in (run() it). Internal redirects are used in mod_cgi
> when the CGI output indicates a status 200 (HTTP_OK) but also contains
> a Location header. But the main usage of internal redirects is the
> ErrorDocument.
>
> Now, is_initial_req() checks if the current $r is the result of a
> subrequest or the result of an internal redirect and returns false if
> so. prev() returns the parent request if the current $r is the result
> of an internal redirect and main() returns the main request if the
> current $r is a subrequest. So, your code checks only for internal
> redirects (ErrorDocument).
>
> Now, have a look at httpd-2.x.y/server/request.c around line 170. You'll
> see this piece of code:
>
> /* Skip authn/authz if the parent or prior request passed the
> * authn/authz,
> * and that configuration didn't change (this requires
> * optimized _walk()
> * functions in map_to_storage that use the same merge results given
> * identical input.) If the config changes, we must re-auth.
> */
> if (r->main && (r->main->per_dir_config == r->per_dir_config)) {
> r->user = r->main->user;
> r->ap_auth_type = r->main->ap_auth_type;
> }
> else if (r->prev && (r->prev->per_dir_config == r->per_dir_config))
> {
> r->user = r->prev->user;
> r->ap_auth_type = r->prev->ap_auth_type;
> }
> else {
> switch (ap_satisfies(r)) {
> case SATISFY_ALL:
> case SATISFY_NOSPEC:
> if ((access_status = ap_run_access_checker(r)) != 0) {
> return decl_die(access_status, "check access", r);
> ...
>
Ok, I get it.

I have a little question related to the above, but not very urgent : why
the check on the configuration change ? what can change between a
request and a sub-request (or internal redirect) ?

> You see, you are not the first who had had the idea of reusing an
> established identity.
I did not think I would be.

> If your subreq or internal redirect hits the same
> Location or Directory container the AAA phases are completely skipped.
> Maybe this is enough optimization if you shift a few directives around
> in your httpd.conf.
I don't think so, because this is a really specific authentication
method, for a special case.
And I don't think that Apache will skip the mod_perl AAA phases, will it ?

>
> If not, the code above shows you how to do it. But you must ask yourself
> if it really is valid to reuse the identity. I believe, you can safely
> inherit the identity from $r->main or $r->prev but you must not skip
> the other 2 A's. If you can't it would mean you have one realm of
> identities for the main request and another for the subreq. That, I'd
> say, is a configuration error.
>
As a first stage of the AAA, for some Locations, there is a filtering on
the remote IP of the caller. Some IP's get an "automatic" user-id,
which can vary according to the IP. In some cases, this is authoritative
(no access unless you have the right IP), in some cases not (you get a
second chance). Some Locations don't have the IP filter, they always
get the second chance below. This IP filter is implemented as a
PerlAccessHandler. This is the main reason for trying to optimise,
because it is expensive : the IP of the caller must be compared to
several ranges of IP, not necessarily matching regular subnets.
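
One way to keep such a lookup cheap, sketched under assumptions: convert each IPv4 address to an integer once, keep the ranges sorted, and binary-search them per request. The range table, the user-ids and the function names below are invented for illustration, and the sketch assumes the ranges do not overlap.

```perl
use strict;
use warnings;
use Socket qw(inet_aton);

# Convert a dotted-quad IPv4 address to an unsigned 32-bit integer.
sub ip_to_int {
    my ($ip) = @_;
    my $packed = inet_aton($ip);
    die "not an IPv4 address: $ip\n" unless defined $packed;
    return unpack 'N', $packed;
}

# Hypothetical table: [first address, last address, automatic user-id],
# converted to integers and sorted by first address once at startup.
my @RANGES = sort { $a->[0] <=> $b->[0] }
    map { [ ip_to_int($_->[0]), ip_to_int($_->[1]), $_->[2] ] } (
        [ '10.0.0.5',     '10.0.0.37',     'branch-a'  ],
        [ '192.168.1.10', '192.168.1.200', 'branch-b'  ],
        [ '172.16.4.1',   '172.16.9.254',  'warehouse' ],
    );

# Binary search over the sorted, non-overlapping ranges.
# Returns the user-id for a matching range, or nothing if no range matches.
sub user_for_ip {
    my ($ip) = @_;
    my $n = ip_to_int($ip);
    my ($lo, $hi) = (0, $#RANGES);
    while ($lo <= $hi) {
        my $mid = int(($lo + $hi) / 2);
        my ($first, $last, $user) = @{ $RANGES[$mid] };
        if    ($n < $first) { $hi = $mid - 1 }
        elsif ($n > $last)  { $lo = $mid + 1 }
        else                { return $user }
    }
    return;
}

for my $ip ('10.0.0.20', '10.0.0.38', '172.16.9.254') {
    my $user = user_for_ip($ip);
    print "$ip => ", (defined $user ? $user : '(no automatic user)'), "\n";
}
```

With the table built once per child process, each request costs a handful of integer comparisons regardless of how irregular the ranges are.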

The second step is a PerlAuthenHandler, which can redirect to a login
page.
Then there is a PerlAuthzHandler to check if this user is allowed to
access that resource.
It also combines with SSO, with some URL rewriting, and with trying to
control access to a Tomcat application behind the Apache.

The back-end for the authentication is a special DB system, whose access
for that is rather heavy, but required.
On the positive side, this is for a limited range of well-known
applications, for a limited public and for a reasonable number of
expected transactions/s.
So I am trying to wring out the optimisations I can, without going too far.
I started this module wanting to keep it "clean and lean and mean", but
as I discover more and more twists, it is getting to look like the
classical spaghetti bowl..

I am also, but on a separate thread, looking at tying this AAA stuff to
the $r->connection (with notes()).

I'm also having fun doing this, it's interesting.

>> The idea being that if we are in a sub-request, there is no point in
>> authenticating/authorizing it again, since the main request should
>> already do that, right ? Optimisation..
>>
>> Now the above works very nicely, except in the case where, before
>> this handler gets called, there is an intervention by mod_rewrite. It
>> seems as if mod_rewrite makes the above fail, even when the rewrite
>> condition does not apply and the URL is considered as a
>> "pass-through".
>>
>> I suspect that it is because mod_rewrite, no matter what, invokes
>> the original (or modified) URL as a sub-request of the original
>> request. This would cause the above to fail, because in such a case,
>> the above conditional code would be invoked, but there is no
>> $r->prev->user to be copied.
>
> mod_rewrite doesn't make subrequests if not asked to. I know only of 2
> ways to have mod_rewrite perform a subreq: %{LA-U:variable}
> and %{LA-F:variable} in a RewriteCond.
>

This :
http://perl.apache.org/docs/2.0/api/Apache2/RequestUtil.html#C_is_initial_req_
may be missing ".. or an internal redirect" in a couple of places.

Re: requests and sub-requests

on 12.10.2008 20:39:50 by Adam Prime

André Warnier wrote:
> Torsten,
>
> Many thanks for the excellent information, I will ponder that.
>
> More below, but one more question here :
> Where does $r->internal_redirect "live" (in which package) ?
> I am having trouble finding it.

http://perl.apache.org/docs/2.0/api/Apache2/SubRequest.html#C_internal_redirect_

> As a first stage of the AAA, for some Locations, there is a filtering on
> the remote IP of the caller. Some IP's get an "automatic" user-id,
> which can vary according to the IP. In some cases, this is authoritative
> (no access unless you have the right IP), in some cases not (you get a
> second chance). Some Locations don't have the IP filter, they always
> get the second chance below. This IP filter is implemented as a
> PerlAccessHandler. This is the main reason for trying to optimise,
> because it is expensive : the IP of the caller must be compared to
> several ranges of IP, not necessarily matching regular subnets.

It seems odd to me to set $r->user in an AccessHandler. It's probably
not a problem, but it seems (at least to me) that that would make more
sense as a part of the Authen code. You can then control all your
'second chance' stuff with normal state checking within your Authen
Handler instead of doing funky stuff with set_handlers (which seems to
be what you're doing)

Adam

Re: requests and sub-requests

on 12.10.2008 21:24:43 by aw

Adam Prime wrote:

>
> http://perl.apache.org/docs/2.0/api/Apache2/SubRequest.html#C_internal_redirect_
Thanks.
Although considering Torsten's previous answer and explanation, that's
kind of an odd place, no ? ;-)

[...]

>
> It seems odd to me to set $r->user in an AccessHandler. It's probably
> not a problem, but it seems (at least to me) that that would make more
> sense as a part of the Authen code. You can then control all your
> 'second chance' stuff with normal state checking within your Authen
> Handler instead of doing funky stuff with set_handlers (which seems to
> be what you're doing)
>

Well, yeah, I guess it can be debated, on the philosophical level.

Access Control is supposed to be allowing or denying access, based on
criteria other than the user identity.
This handler does 2 things :
It checks the remote IP of the caller, and can deny access if it is not
in its list. That's clearly part of Access.
Now in addition, if the IP is in its list, it can have a "standard"
user-id associated with it. So, as long as we're there, we might as
well pick it up and set it in $r->user.

The Authentication runs right after it, and that's another handler.
This one is supposed to check that we know who the user is.
It does that, by checking first $r->user.
If it finds it there, it means that something put it there before, and
in this case that can only be the Access handler. If the Access module
did not put it there, then this handler would have to re-read the IP
list, and look again for the caller IP, which does not seem to make sense.
Now if there is no $r->user, it means the user needs to login.
For the login, we want to send back a nice-looking form. And after the
user fills it in, we would like - if the user-id is ok - to proceed
back to the original page that the user tried to access. For that we
need to fill-in some parameters in the login form, so that when the user
posts it, we know where to go (back) to.

That's where the set_handlers kicks in.

Instead of sending an external redirect to the browser toward a static
login form (which I could then not fill in), or to a server location
where they would get the login form (which would mean another <Location>
section in the configuration, and another browser round-trip), I thought
it was more elegant and mod_perl-ish and not so funky, to use
set_handlers() to install a ResponseHandler for the current call, and
then return OK from the current Authen handler.

The next phase is the response (I'm skipping the AuthzHandler for
clarity).
Now we have two cases :
- either the access/authentication/authorization was ok, and we have the
default Apache handler sending the normal response
- or the AAA was not ok, and we have the custom previously-installed
perl ResponseHandler, who will send the login form, while filling it in
on the way based on information stored in $r->pnotes previously.
Without a browser round-trip.
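
A minimal sketch of that flow, under stated assumptions: the handler names, the pnotes key 'login_target', the form action '/login' and the MockRequest class are all hypothetical, and OK is defined locally as a stand-in for Apache2::Const::OK. Under mod_perl, handler(), set_handlers(), pnotes(), content_type() and print() would be called on the real $r. As in the description above, the authorization phase is left out.

```perl
use strict;
use warnings;

use constant OK => 0;   # stand-in for Apache2::Const::OK (assumption for this sketch)

# Escape text before putting it into an HTML attribute.
sub escape_html {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

# Hypothetical response handler: sends the login form, filled in with the
# URI the user originally asked for.
sub login_form_handler {
    my ($r) = @_;
    my $target = escape_html($r->pnotes('login_target'));
    $r->content_type('text/html');
    $r->print(qq{<form method="post" action="/login">\n},
              qq{  <input type="hidden" name="target" value="$target">\n},
              qq{</form>\n});
    return OK;
}

# Hypothetical authentication handler: if nobody is logged in, remember
# the target and install the login form as this request's response handler.
sub authen_handler {
    my ($r) = @_;
    return OK if defined $r->user;             # identity already known
    $r->pnotes(login_target => $r->uri);       # where to go after login
    $r->handler('modperl');                    # let mod_perl produce the response
    $r->set_handlers(PerlResponseHandler => \&login_form_handler);
    return OK;                                 # carry on to the response phase
}

# Minimal stand-in for the request object (not mod_perl API).
{
    package MockRequest;
    sub new {
        my ($class, %args) = @_;
        return bless { pnotes => {}, handlers => {}, out => '', %args }, $class;
    }
    sub uri  { $_[0]{uri} }
    sub user { my $s = shift; $s->{user} = shift if @_; return $s->{user} }
    sub pnotes {
        my ($s, $key, @val) = @_;
        $s->{pnotes}{$key} = $val[0] if @val;
        return $s->{pnotes}{$key};
    }
    sub handler      { my $s = shift; $s->{handler} = shift if @_; return $s->{handler} }
    sub set_handlers { my ($s, $phase, $code) = @_; $s->{handlers}{$phase} = $code; return 1 }
    sub content_type { my $s = shift; $s->{content_type} = shift if @_; return $s->{content_type} }
    sub print        { my $s = shift; $s->{out} .= join '', @_; return 1 }
}

my $r = MockRequest->new(uri => '/reports/q3');
authen_handler($r);
$r->{handlers}{PerlResponseHandler}->($r);     # simulate the response phase
print STDOUT $r->{out};
```

The mock only records what was installed; it cannot reproduce how Apache chooses between this handler and one already configured for the Location.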

I personally find this rather elegant, and not too bizarre.
It also works very nicely, in most cases.

The problem is that, if the location desired by the user originally
already has a non-default handler, then things don't happen exactly as I
wanted : despite installing modperl as the handler and my
ResponseHandler under that, the original Apache-level handler is still
being called.

But that's another thread on this same list.

Re: requests and sub-requests

on 12.10.2008 21:51:28 by torsten.foertsch

On Sun 12 Oct 2008, André Warnier wrote:
> I have a little question related to the above, but not very urgent :
> why the check on the configuration change ? what can change between a
> request and a sub-request (or internal redirect) ?

Suppose this:

<Location /mainreq>
    Require group foo
</Location>

<Location /subreq>
    Require group bar
</Location>

If during the processing of /mainreq a subreq is issued to /subreq the
other "Require group" must be respected. Of course there also can be
another AuthUserFile or so, so that the identity could not be
established and the subreq results in a 401. I'd consider that as a
configuration error.

The configuration change mentioned above means the different Location
containers. They are represented as $r->per_dir_config.

> I don't think so, because this is a really specific authentication
> method, for a special case.
> And I don't think that Apache will skip the mod_perl AAA phases, will
> it ?

Yes, mod_perl handlers are called inside these ap_run_... functions.
ap_run_access_checker() is the first of the 3 A's. A PerlAccessHandler
is called from this function.

Torsten


Re: requests and sub-requests

on 12.10.2008 22:54:46 by aw

Torsten Foertsch wrote:
[...]
>> And I don't think that Apache will skip the mod_perl AAA phases, will
>> it ?
>
> Yes, mod_perl handlers are called inside these ap_run_... functions.
> ap_run_access_checker() is the first of the 3 A's. A PerlAccessHandler
> is called from this function.
>
Thanks again for another valuable bit of information.
That removes the incentive for many of my tentative optimisations.
It seems that I can just let Apache deal with this then.
I have to think some more about it, but at the moment I don't see a
downside of that.
I do not use lookup_uri in this module, at least not explicitly.

Would you by any chance like to have a look too at the "sethandlers
question" thread a bit higher up ?