Cache Module that Caches EVERYTHING

on 12.11.2003 23:55:46 by Eli Marmor

Hi,

I'm curious to know if there is any module that does the following:

Caches EVERYTHING, including dynamic pages and GET/POST requests with
parameters (i.e. if http://domain.com/cgi.exe?key=valA returns fooA and
http://domain.com/cgi.exe?key=valB returns fooB, then the next call to
http://domain.com/cgi.exe?key=valA will return fooA without even
accessing the backend server and http://domain.com/cgi.exe?key=valB
will return fooB without accessing the backend server).

In other words, I'm looking for a special version of mod_cache that
handles situations of off-line browsing.

Is there anything?

Nick? Graham? Anybody?

Thanks,
--
Eli Marmor
marmor@netmask.it
CTO, Founder
Netmask (El-Mar) Internet Technologies Ltd.
__________________________________________________________
Tel.: +972-9-766-1020 8 Yad-Harutzim St.
Fax.: +972-9-766-1314 P.O.B. 7004
Mobile: +972-50-23-7338 Kfar-Saba 44641, Israel

Re: Cache Module that Caches EVERYTHING

on 13.11.2003 00:20:55 by Ian Holsman

I *believe* that the cache-disk/memcache in apache 2
could do this.
you would need to override the key generation (via the optional hook)
to make the queryargs part of the name.

On 13/11/2003, at 9:55 AM, Eli Marmor wrote:

> Hi,
>
> I'm curious to know if there is any module that does the following:
>
> Caches EVERYTHING, including dynamic pages and GET/POST requests with
> parameters (i.e. if http://domain.com/cgi.exe?key=valA returns fooA and
> http://domain.com/cgi.exe?key=valB returns fooB, then the next call to
> http://domain.com/cgi.exe?key=valA will return fooA without even
> accessing the backend server and http://domain.com/cgi.exe?key=valB
> will return fooB without accessing the backend server).
>
> In other words, I'm looking for a special version of mod_cache that
> handles situations of off-line browsing.
>
> Is there anything?
>
> Nick? Graham? Anybody?
>
> Thanks,
> --
> Eli Marmor
> marmor@netmask.it
> CTO, Founder
> Netmask (El-Mar) Internet Technologies Ltd.
> __________________________________________________________
> Tel.: +972-9-766-1020 8 Yad-Harutzim St.
> Fax.: +972-9-766-1314 P.O.B. 7004
> Mobile: +972-50-23-7338 Kfar-Saba 44641, Israel
>
--
Ian Holsman
Director
Network Management Systems
CNET Networks
PH: (61) 3-9857-3742 (Australia)/ 415-344-2608 (USA)

Re: Cache Module that Caches EVERYTHING

on 13.11.2003 00:46:03 by Eli Marmor

Ian Holsman wrote:

> I *believe* that the cache-disk/memcache in apache 2
> could do this.
> you would need to override the key generation (via the optional hook)
> to make the queryargs part of the name.

Wow.

My plan was that if there is not such a module, I would patch mod_cache
and/or its sub parts (mem/disk/etc.) to do exactly what you wrote.

But I haven't thought about this hook. Overriding it looks simpler and
more elegant.

Your answer is so simple but so brilliant...

And it's the third SUCCESSIVE time that a question of mine is answered by
you (maybe it's finally the time to take a look at your wish list ;-)


In any case, if anybody else knows a module which does EXACTLY what I
need, he is still welcome to tell about it.

Thanks,
--
Eli Marmor
marmor@netmask.it
CTO, Founder
Netmask (El-Mar) Internet Technologies Ltd.
__________________________________________________________
Tel.: +972-9-766-1020 8 Yad-Harutzim St.
Fax.: +972-9-766-1314 P.O.B. 7004
Mobile: +972-50-23-7338 Kfar-Saba 44641, Israel

Re: Cache Module that Caches EVERYTHING

on 13.11.2003 01:03:08 by Ian Holsman

this is the hook in question.
http://lxr.webperf.org/source.cgi/modules/experimental/mod_cache.h#338

there may be some other things you need to do to the cache to make it
actually want to cache the content.
this just makes it differentiate based on query args.
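
A minimal sketch of what "making the query args part of the key" could look like, written as plain C so it compiles on its own. It is not the real mod_cache hook: the function name build_cache_key and its parameters are hypothetical, and a real override would work on the request record and pool rather than on bare strings.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, NOT the mod_cache key-generation hook itself.
 * Builds "host/path?args" into buf so that two requests differing only
 * in their query string get different cache keys.
 * Returns 0 on success, -1 if the key would not fit in buf.
 */
int build_cache_key(const char *host, const char *path, const char *args,
                    char *buf, size_t buflen)
{
    int n;

    if (args != NULL && args[0] != '\0') {
        /* Keep the query string: ?key=valA and ?key=valB must differ. */
        n = snprintf(buf, buflen, "%s%s?%s", host, path, args);
    } else {
        n = snprintf(buf, buflen, "%s%s", host, path);
    }
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

With this, "domain.com", "/cgi.exe" and "key=valA" yield the key "domain.com/cgi.exe?key=valA", which is distinct from the key for "key=valB".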

On 13/11/2003, at 10:46 AM, Eli Marmor wrote:

> Ian Holsman wrote:
>
>> I *believe* that the cache-disk/memcache in apache 2
>> could do this.
>> you would need to override the key generation (via the optional hook)
>> to make the queryargs part of the name.
>
> Wow.
>
> My plan was that if there is not such a module, I would patch mod_cache
> and/or its sub parts (mem/disk/etc.) to do exactly what you wrote.
>
> But I haven't thought about this hook. Overriding it looks simpler and
> more elegant.
>
> Your answer is so simple but so brilliant...
>
> And it's the third SUCCESSIVE time that a question of mine is answered by
> you (maybe it's finally the time to take a look at your wish list ;-)
>
>
> In any case, if anybody else knows a module which does EXACTLY what I
> need, he is still welcome to tell about it.
>
> Thanks,
> --
> Eli Marmor
> marmor@netmask.it
> CTO, Founder
> Netmask (El-Mar) Internet Technologies Ltd.
> __________________________________________________________
> Tel.: +972-9-766-1020 8 Yad-Harutzim St.
> Fax.: +972-9-766-1314 P.O.B. 7004
> Mobile: +972-50-23-7338 Kfar-Saba 44641, Israel
>
--
Ian Holsman
Director
Network Management Systems
CNET Networks
PH: (61) 3-9857-3742 (Australia)/ 415-344-2608 (USA)

Re: Cache Module that Caches EVERYTHING

on 13.11.2003 01:08:49 by Eli Marmor

Ian Holsman wrote:
>
> this is the hook in question.
> http://lxr.webperf.org/source.cgi/modules/experimental/mod_cache.h#338
>
> there may be some other things you need to do to the cache to make it
> actually want to cache the content.
> this just makes it differentiate based on query args.

Ha...
I just found the cache_generate_key hook a minute before reading your
message... (next time I should wait for your messages before digging
into Apache's source? ;-)

It seems that there is already a partial support for arguments in the
default function.

The main fix is going to be, as you noted, "convincing" mod_cache to
cache EVERYTHING, no matter if it's static, dynamic, uncacheable, etc.

Maybe I'll submit something (if it turns out to be useful) to the
developers list.

Thanks again,
--
Eli Marmor
marmor@netmask.it
CTO, Founder
Netmask (El-Mar) Internet Technologies Ltd.
__________________________________________________________
Tel.: +972-9-766-1020 8 Yad-Harutzim St.
Fax.: +972-9-766-1314 P.O.B. 7004
Mobile: +972-50-23-7338 Kfar-Saba 44641, Israel

Re: [apache-modules] Re: Cache Module that Caches EVERYTHING

on 13.11.2003 04:29:25 by Nick Kew

On Thu, 13 Nov 2003, Eli Marmor wrote:

> Ian Holsman wrote:

Ian seems to have written off-list; neither my inbox nor the web
archive at marc.theaimsgroup.com has it. I am doing some work
that involves smart cacheing, but I haven't figured out how I
could use mod_cache for it, so this discussion is something I'd
like to see!

(Actually I can see a basic approach to using mod_cache for my app:
an apr_http_client implementation that could hook in things like
mod_cache is on the wishlist, but I've not tried to tackle it).

--
Nick Kew

Re: Cache Module that Caches EVERYTHING

on 13.11.2003 15:01:07 by Mike Collins

Check out oscache at www.opensymphony.com

It has a servlet filter that does what you are requesting.


----- Original Message -----
From: "Eli Marmor"
To: ;
Sent: Wednesday, November 12, 2003 5:55 PM
Subject: Cache Module that Caches EVERYTHING


Hi,

I'm curious to know if there is any module that does the following:

Caches EVERYTHING, including dynamic pages and GET/POST requests with
parameters (i.e. if http://domain.com/cgi.exe?key=valA returns fooA and
http://domain.com/cgi.exe?key=valB returns fooB, then the next call to
http://domain.com/cgi.exe?key=valA will return fooA without even
accessing the backend server and http://domain.com/cgi.exe?key=valB
will return fooB without accessing the backend server).

In other words, I'm looking for a special version of mod_cache that
handles situations of off-line browsing.

Is there anything?

Nick? Graham? Anybody?

Thanks,
--
Eli Marmor
marmor@netmask.it
CTO, Founder
Netmask (El-Mar) Internet Technologies Ltd.
__________________________________________________________
Tel.: +972-9-766-1020 8 Yad-Harutzim St.
Fax.: +972-9-766-1314 P.O.B. 7004
Mobile: +972-50-23-7338 Kfar-Saba 44641, Israel

Re: [apache-modules] Re: Cache Module that Caches EVERYTHING

on 14.11.2003 10:03:55 by Eli Marmor

Nick Kew wrote:
>
> On Thu, 13 Nov 2003, Eli Marmor wrote:
>
> > Ian Holsman wrote:
>
> Ian seems to have written off-list; neither my inbox nor the web
> archive at marc.theaimsgroup.com has it.

Thanks for this note.

Ian wrote TO the lists.
I just replied, without re-adding the lists to the "To:" field.
The point is that he is subscribed only to one of the two lists that
were included in the "To:" field, so the readers of that list
(including the archives) received his message immediately, while the
other list received it after a delay (or hasn't received it at all).

Everybody is welcome to read messages that he missed in the archive
of the other list. In my messages, I'll continue to quote everything.

> I am doing some work
> that involves smart cacheing, but I haven't figured out how I
> could use mod_cache for it, so this discussion is something I'd
> like to see!
>
> (Actually I can see a basic approach to using mod_cache for my app:
> an apr_http_client implementation that could hook in things like
> mod_cache is on the wishlist, but I've not tried to tackle it).

It's simple:

First, set mod_cache directives to the most liberal value:

CacheIgnoreCacheControl On
CacheIgnoreNoLastMod On
CacheExpiryCheck Off
CacheMaxStreamingBuffer 1000000

(the last one might become unnecessary in the latest versions of Apache).

Next, patch the function "cache_in_filter()" so it will cache more
things. This mainly involves commenting out "if" conditions such as the
checks for "no-store", "private", etc. There is also a check for the
existence of "args" when there is no "Expiration"; it should be
commented out too (by the way: most of the things that I mention should
already be handled by the current directives, at least according to the
DOCs. So either the code should be fixed, or the DOCs should be
"fixed"...).

After everything is working, new directives should be added to control
when those checks are performed (at least for the "private" check; the
rest should be controlled by existing directives, as I mentioned above).
I used the words "comment out" above only as a quick and dirty way to
make mod_cache work in a different way. Once it's working, it should of
course be configured using directives, so that the original behavior
remains the default and the offline/mirror behavior is activated by
changing the default values of the directives.
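
To illustrate the idea of putting those checks behind switches instead of commenting them out, here is a rough standalone sketch. Everything in it is an assumption for illustration: the struct cache_policy, its fields and should_cache() do not exist in mod_cache, and the substring matching on the Cache-Control value is a simplification (it is case-sensitive and does not tokenize the header).

```c
#include <string.h>

/* Hypothetical per-server switches; none of these are existing directives. */
struct cache_policy {
    int ignore_no_store;           /* store despite "no-store" */
    int ignore_private;            /* store despite "private" */
    int cache_args_without_expiry; /* store query-string URLs lacking an expiry */
};

/*
 * Returns 1 if the response should be stored, 0 otherwise.
 * cache_control may be NULL when the header is absent.
 */
int should_cache(const struct cache_policy *p, const char *cache_control,
                 int has_args, int has_expiry)
{
    if (cache_control != NULL) {
        if (!p->ignore_no_store && strstr(cache_control, "no-store") != NULL)
            return 0;
        if (!p->ignore_private && strstr(cache_control, "private") != NULL)
            return 0;
    }
    /* The check for "args" without any expiration information. */
    if (has_args && !has_expiry && !p->cache_args_without_expiry)
        return 0;
    return 1;
}
```

With all fields set to 0 the original, conservative behavior stays the default; setting them to 1 gives the offline/mirror behavior.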

Then you reach the most demanding task: POST requests. Contrary to GET
requests, which are cached correctly (after making the patches mentioned
above), the arguments of POST are contained in the body, which is not
accessible at the beginning of the caching. The only way to resolve
this is to write an input filter (which will parse the arguments or just
save them for future concatenation to the key after the "?").

Alternatively, it is possible to use apreq-2 for this purpose (but it
will require you to include apreq-2 in your future builds of Apache. I
would love to see it included as a standard APR module...).
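
Whichever way the body is obtained, the concatenation step itself could be sketched as below. This is a standalone illustration only: append_post_args() is a hypothetical name, it assumes the body has already been captured as a URL-encoded string, and it does not show the input filter or any apreq-2 call.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: appends an already-captured, URL-encoded POST body
 * to a base cache key. Uses '?' if the key has no query string yet and
 * '&' otherwise, so the result has the same shape as a GET key.
 * Returns 0 on success, -1 if the result would not fit in buf.
 */
int append_post_args(const char *base_key, const char *body,
                     char *buf, size_t buflen)
{
    int n;

    if (body == NULL || body[0] == '\0') {
        /* Nothing to append: the key is just the base key. */
        n = snprintf(buf, buflen, "%s", base_key);
    } else {
        char sep = (strchr(base_key, '?') != NULL) ? '&' : '?';
        n = snprintf(buf, buflen, "%s%c%s", base_key, sep, body);
    }
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

A POST to domain.com/cgi.exe with the body key=valA then gets the key "domain.com/cgi.exe?key=valA", the same as the equivalent GET request.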

Support for POST requests will not be complete without changing the
check for M_GET method in cache_url_handler(), which should check for
M_POST too (again, this should be controlled by directives and not done
automatically for everybody).
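
As a small sketch of that method check, kept standalone: the enum below only stands in for httpd's M_GET and M_POST constants, and the cache_post flag is a hypothetical directive value that is off by default.

```c
/* Stand-ins for httpd's method constants, so the sketch compiles alone. */
enum sketch_method { SK_GET, SK_POST, SK_OTHER };

/*
 * Returns 1 if a request with this method may be served from or stored
 * in the cache. cache_post is a hypothetical switch; when it is 0 the
 * function behaves like a GET-only check.
 */
int method_is_cacheable(enum sketch_method m, int cache_post)
{
    if (m == SK_GET)
        return 1;
    if (m == SK_POST && cache_post)
        return 1;
    return 0;
}
```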

The last thing is to force subsequent accesses to the cached resource to
use it, even though the code believes that it is not cacheable. And the
same hack for POST should be done here too (I'm not sure if it is
needed to distinguish between GET and POST; we may cache both in the
same syntax/format, and ignore rare cases when the same request with
the same parameters but a different method is used. And even in such
rare cases, I'm not sure it may hurt anything if we refer to both
requests as the same one).


Now comes the biggest question:

Isn't it better to find an existing module that already does it?
I know there are such modules for squid; why aren't there any for Apache2?
Or are there and I am not aware of them?

--
Eli Marmor
marmor@netmask.it
CTO, Founder
Netmask (El-Mar) Internet Technologies Ltd.
__________________________________________________________
Tel.: +972-9-766-1020 8 Yad-Harutzim St.
Fax.: +972-9-766-1314 P.O.B. 7004
Mobile: +972-50-23-7338 Kfar-Saba 44641, Israel