Dollar Self Storage (aka mod_perl children sharing/updating data on a schedule)

on 03.01.2011 22:48:53 by Jeff Anderson

Greetings,

I am looking to set up a mod_perl handler which keeps track of the
count of requests coming in. Each child process will store this data
in local memory and after 5-10 minutes have passed, each child process
will merge its data into a central database, the goal being that each
child will not have to hit a database for every request.

I have a handler that contains data in $self for each child and when
a REQUEST comes through, a check is made to see if the interval has
passed and if so, the child will merge its data with the database.
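
For reference, here is a minimal sketch of that pattern (the package
name, table and DSN are made up for illustration; the real handler keeps
the data in $self on an object, but a package-level hash per child shows
the same idea):

# ---- Source from: a hypothetical My/CountHandler.pm ----

package My::CountHandler;
use strict;
use warnings;
use Apache2::RequestRec ();
use Apache2::Const -compile => qw(OK);
use DBI ();

my $FLUSH_INTERVAL = 300;      # 5 minutes
my %counts;                    # lives in this child's memory only
my $last_flush = time;

sub handler {
    my $r = shift;
    $counts{ $r->uri }++;      # count the request locally, no DB hit

    if ( time - $last_flush >= $FLUSH_INTERVAL ) {
        merge_to_db();         # only now do we touch the database
        %counts     = ();
        $last_flush = time;
    }
    return Apache2::Const::OK;
}

sub merge_to_db {
    my $dbh = DBI->connect( 'dbi:mysql:stats', 'user', 'pass',
                            { RaiseError => 0, PrintError => 1 } )
        or return;
    while ( my ($uri, $n) = each %counts ) {
        $dbh->do( 'UPDATE hits SET n = n + ? WHERE uri = ?', undef, $n, $uri );
    }
    $dbh->disconnect;
}

1;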

The problem is --- how do i additionally have each child merge its
data on a schedule -- that is, without relying only on an incoming
request to "hit" that specific child process? I have tried 2 attempted
solutions with no luck. (Keep in mind that as long as requests are
coming in, the children will eventually merge their data within a good
degree of accuracy, but only if requests are coming in.)

Attempt #1 --- configure a signal handler and send a signal to each
child process - this didn't seem to work but i am about to try some
more tests. I have read in the docs, however, that sending direct
signals to mod_perl children is not recommended.

Attempt #2 --- register a Clean Up hook. This doesn't seem to work for
me because, as i understand so far, assigning a reference to a sub via
PerlCleanupHandler is not the same as calling the object's method.
Hence ... i do not have access to $self nor the local memory. So, the
sub is called via the Clean Up phase, but the sub is meant to be
called as a method (and i can't use $self as a hash ref unless called
as a method).

Other considerations:

- Perhaps each child process will need to use its own SQLite or similar cache?
- Perhaps there is another hook that i do not know about that better
suits such needs?
- Perhaps my mistake is obvious -- configuring the Clean Up hook incorrectly, etc.

Any information will be greatly appreciated. I hope everyone had a
Happy New Year. On a side note -- there is a storage facility in LA
called "Dollar Self Storage" ... :)


--
jeffa

Re: Dollar Self Storage (aka mod_perl children sharing/updating data on a schedule)

on 07.01.2011 15:57:57 by Cosimo Streppone

On Mon, 03 Jan 2011 22:48:53 +0100, Jeff Anderson
wrote:

> I am looking to set up a mod_perl handler which keeps track of the
> count of requests coming in. Each child process will store this data
> in local memory and after 5-10 minutes have passed, each child process
> will merge its data into a central database, the goal being that each
> child will not have to hit a database for every request.

Hi Jeff,

we usually do that with a local memcached server and
counters (the Cache::Memcached incr() function).

I'm looking into using Cache::FastMmap as an alternative.

I'm not sure about the volume of requests you have,
but using memcached, we got as far as 500 req/s
without any problem or slowdown at all.
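
A minimal sketch of that approach (the server address and key scheme are
just placeholders, and $r is assumed to be the current request):

use Cache::Memcached ();

my $memd = Cache::Memcached->new({ servers => ['127.0.0.1:11211'] });

# in the response handler: bump a shared counter, no database involved
my $key = 'hits:' . $r->uri;
$memd->add($key, 0);     # create the counter if it doesn't exist yet
$memd->incr($key);       # atomic increment, shared by all children

# a cron job (or any external process) can later read and reset it
my $count = $memd->get($key);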

--
Cosimo

Re: Dollar Self Storage (aka mod_perl children sharing/updating data on a schedule)

on 13.01.2011 09:17:42 by torsten.foertsch

On Monday, January 03, 2011 22:48:53 Jeff Anderson wrote:
> I am looking to set up a mod_perl handler which keeps track of the
> count of requests coming in. Each child process will store this data
> in local memory and after 5-10 minutes have passed, each child process
> will merge its data into a central database, the goal being that each
> child will not have to hit a database for every request.
>
> I have a handler that contains data in $self for each child and when
> a REQUEST comes through, a check is made to see if the interval has
> passed and if so, the child will merge its data with the database.
>
> The problem is --- how do i additionally have each child merge its
> data on a schedule -- that is, without relying only on an incoming
> request to "hit" that specific child process? I have tried two
> approaches with no luck. (Keep in mind that as long as requests are
> coming in, the children will eventually merge their data within a good
> degree of accuracy, but only if requests are coming in.)
>
> Attempt #1 --- configure a signal handler and send a signal to each
> child process - this didn't seem to work but i am about to try some
> more tests. I have read in the docs, however, that sending direct
> signals to mod_perl children is not recommended.
>
> Attempt #2 --- register a Clean Up hook. This doesn't seem to work for
> me because, as i understand so far, assigning a reference to a sub via
> PerlCleanupHandler is not the same as calling the object's method.
> Hence ... i do not have access to $self nor the local memory. So, the
> sub is called via the Clean Up phase, but the sub is meant to be
> called as a method (and i can't use $self as a hash ref unless called
> as a method).
>
> Other considerations:
>
> - Perhaps each child process will need to use its own SQLite or similar cache?
> - Perhaps there is another hook that i do not know about that better
> suits such needs?
> - Perhaps my mistake is obvious -- configuring the Clean Up hook incorrectly, etc.
>
> Any information will be greatly appreciated. I hope everyone had a
> Happy New Year. On a side note -- there is a storage facility in LA
> called "Dollar Self Storage" ... :)

I can think of 2 solutions.

1) Perhaps the Apache scoreboard already has all the information you need.
Then you don't need any special hook. Just configure a scoreboard file on
disk. I normally do that on a tmpfs filesystem (Linux), so it really
exists only in RAM. Then write an external daemon that uses
Apache2::ScoreBoardFile to read the information on a regular basis.

2) Quite similar, but if the required information is not available in the
scoreboard you can establish your own using either File::Map or
IPC::ScoreBoard.

In both cases you have to deal with the fact that Apache starts up additional
children on demand and also terminates them when the load goes down. The
Apache scoreboard does contain the necessary information to do that.
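
As a variation on option 2, here is a rough sketch using Cache::FastMmap
(mentioned elsewhere in this thread) in place of File::Map or
IPC::ScoreBoard; the share_file path and key name are placeholders:

use Cache::FastMmap ();

# every child and the external daemon map the same file (tmpfs, so RAM only)
my $cache = Cache::FastMmap->new(
    share_file => '/dev/shm/request_counts',
    init_file  => 0,     # don't wipe what other processes already wrote
);

# in each child, per request: an atomic read-modify-write
$cache->get_and_set( 'hits', sub { ( $_[1] || 0 ) + 1 } );

# in the external daemon, every few minutes: grab the total and reset it
my $since_last_flush;
$cache->get_and_set( 'hits', sub { $since_last_flush = $_[1] || 0; 0 } );
# ... write $since_last_flush to the database here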

Torsten Förtsch

--
Need professional modperl support? Hire me! (http://foertsch.name)

Like fantasy? http://kabatinte.net

Re: Dollar Self Storage (aka mod_perl children sharing/updating data on a schedule)

on 13.01.2011 14:59:36 by Ryan Gies

On 01/03/2011 04:48 PM, Jeff Anderson wrote:
> the goal being that each child will not have to hit a database for every request.
>
If the reason for not hitting the database on each request is that you
don't want to impact your page-response times, note that the Cleanup
phase happens after the response has been sent to the client.

The demonstration below assigns a cleanup handler which sleeps for 3
seconds. You will notice that your pages still load before the snooze
handler is run.

# ---- Source from: .../httpd.conf ----

PerlModule Apache2::Testing
PerlCleanupHandler Apache2::Testing->snooze_handler

# ---- Source from: .../Apache2/Testing.pm ----

package Apache2::Testing;
use strict;
use ModPerl::Util;

our $Seconds = 3;

sub snooze_handler {
    my $phase = ModPerl::Util::current_callback();
    warn sprintf("[%d] %s: is sleeping for %d seconds.\n", $$, $phase, $Seconds);
    sleep $Seconds;
    warn sprintf("[%d] %s: is now awake.\n", $$, $phase);
}

1;

Re: Dollar Self Storage (aka mod_perl children sharing/updating data on a schedule)

on 13.01.2011 16:12:52 by Perrin Harkins

Hi Jeff,

> I am looking to set up a mod_perl handler which keeps track of the
> count of requests coming in. Each child process will store this data
> in local memory and after 5-10 minutes have passed, each child process
> will merge its data into a central database, the goal being that each
> child will not have to hit a database for every request.

I agree with the people saying that memcached/Cache::FastMmap or an
in-memory file is probably fast enough to hit on every request. In
general though, storing and dumping things to a db now and then is not
a bad way to go for non-critical data.

> The problem is --- how do i additionally have each child merge its
> data on a schedule -- that is, without relying only on an incoming
> request to "hit" that specific child process?

You can't. The nature of apache is that it responds to network
events. Cleanup handlers are also only going to fire after a request.
You could rig up a cron to hit your server regularly and if the data
was shared between the children then whatever child picked that up
could write it to the db, but that seems a lot harder than
alternatives already suggested.

> Attempt #2 --- register a Clean Up hook. This doesn't seem to work for
> me because, as i understand so far, assigning a reference to a sub via
> PerlCleanupHandler is not the same as calling the object's method.

You could just store this data in a $My::Global::Counter and read it
from anywhere. Each child has its own variable storage, so this is
safe.

Second, you should be able to make a cleanup handler call your sub as a method.
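
Roughly like this, assuming you already have $self in scope inside your
response handler (however you are creating it) -- the closure keeps a
reference to $self, so the cleanup code runs as a real method call:

use Apache2::RequestUtil ();               # provides push_handlers()
use Apache2::Const -compile => qw(OK);

sub handle_request {
    my ($self, $r) = @_;

    $self->{counts}{ $r->uri }++;

    # register a per-request cleanup that still sees $self
    $r->push_handlers( PerlCleanupHandler => sub {
        $self->maybe_merge;                # hypothetical method: interval check + DB write
        return Apache2::Const::OK;
    });

    return Apache2::Const::OK;
}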

> - Perhaps each child process will need to use its own SQLite or similar cache?

SQLite may well be slower than your real database, so I wouldn't do
that without testing.

BTW, how are you configuring a handler to create a $self that lasts
across multiple requests?

- Perrin

Re: Dollar Self Storage (aka mod_perl children sharing/updating data on a schedule)

on 26.01.2011 00:17:06 by Jeff Anderson


Greetings,

First, a big thank you to everyone for these great suggestions and
corrections. The idea to send a HUP signal to the parent process works fine
under the pre-forked server model, but does not work at all under the
threaded worker model. So signals are right out.

I probably forgot to mention that we are also dealing with multiple servers,
so a central data store will be required.

We finally decided to go with a convention -- sending as many "internal"
specialized requests to each server as there are child processes, which
seems to be working well enough. We are not interested in precision, just
pretty darned good accuracy.
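
Roughly, the idea looks like this (a sketch, not our actual script; the
URL and the child count are placeholders, the count needs to be at least
the number of running children, and even then there is no hard guarantee
that every child gets hit):

use strict;
use warnings;
use LWP::UserAgent ();

my $children = 20;                                  # roughly MaxClients
my $url      = 'http://localhost/internal/flush';   # placeholder internal URL

my $ua = LWP::UserAgent->new;   # no keep-alive, so each request is a new connection
for ( 1 .. $children ) {
    my $res = $ua->get($url);
    warn 'flush request failed: ' . $res->status_line . "\n"
        unless $res->is_success;
}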

Please feel free to keep this discussion open, ask questions or make further
suggestions.

Thank you all once again!
jeffa



On Thu, Jan 13, 2011 at 7:12 AM, Perrin Harkins wrote:

> Hi Jeff,
>
> > I am looking to set up a mod_perl handler which keeps track of the
> > count of requests coming in. Each child process will store this data
> > in local memory and after 5-10 minutes have passed, each child process
> > will merge its data into a central database, the goal being that each
> > child will not have to hit a database for every request.
>
> I agree with the people saying that memcached/Cache::FastMmap or an
> in-memory file is probably fast enough to hit on every request. In
> general though, storing and dumping things to a db now and then is not
> a bad way to go for non-critical data.
>
> > The problem is --- how do i additionally have each child merge its
> > data on a schedule -- that is, without relying only on an incoming
> > request to "hit" that specific child process?
>
> You can't. The nature of apache is that it responds to network
> events. Cleanup handlers are also only going to fire after a request.
> You could rig up a cron to hit your server regularly and if the data
> was shared between the children then whatever child picked that up
> could write it to the db, but that seems a lot harder than
> alternatives already suggested.
>
> > Attempt #2 --- register a Clean Up hook. This doesn't seem to work for
> > me because, as i understand so far, assigning a reference to a sub via
> > PerlCleanupHandler is not the same as calling the object's method.
>
> You could just store this data in a $My::Global::Counter and read it
> from anywhere. Each child has its own variable storage, so this is
> safe.
>
> Second, you should be able to make a cleanup handler call your sub as a
> method
>
> > - Perhaps each child process will need to use its own SQLite or similar
> cache?
>
> SQLite may well be slower than your real database, so I wouldn't do
> that without testing.
>
> BTW, how are you configuring a handler to create a $self that lasts
> across multiple requests?
>
> - Perrin
>



--
jeffa
