non-stop generational modperl config update strategies?

on 21.04.2010 02:18:33 by Jeff McCarrell

Hello modperl-folk:

In a nutshell: I would like to ask the community for pointers on how to
evolve our successfully deployed application,
which is restarted once an hour to reload configuration state,
to a model where it runs continuously.

Background:
We have a long-successfully-deployed mp2 application that has to scale well.
Our clusters are currently fronted by hardware load balancers.
To effect a configuration change, which is currently done hourly,
a process removes a cluster slave host from the hardware load balancer's
rotation,
pushes the new config state down to the slave,
and restarts apache2.
After apache2 restarts, the mgmt process puts that host back into the load
balancer rotation.

We are running the prefork MPM with 32-512 httpd children on fairly beefy
Linux boxes with 4-8 cores each
using the standard Unix copy-on-write model:
load the config state into perl data structures in the parent once via a
PerlPostConfigHandler,
then fork as many children as the prefork MPM decides it needs to handle the
load.
The configuration pushes are scheduled by wall-clock time;
currently it is deterministic when a configuration change will be fully
propagated.
All configuration changes are pushed to all slaves.
There are currently a few hundred slaves covering most of North America.
This model has worked, and is still working, fairly well for us to date.

However, the clockwork nature of our scheme ultimately limits our
scalability.
And in thinking about what it would take to overcome our current
limitations,
I would like to be able to reload the configuration state from new data
without an apache restart.
Put another way, I would like to be able to load next-generation
configuration state into new httpd children,
then kill off the previous generation as they complete their current
requests,
and start using the new instances, all while servicing requests,
albeit at perhaps a reduced rate while the configuration state is being
swapped.

I don't know how to do this in the prefork MPM,
and what I am proposing more-or-less breaks the load-parent / fork model.

The configuration data is not too large: let's say 10 MB or so,
but large enough that I prefer models that share this state among all httpd
children.

At this point, I'd be willing to consider other MPMs if they help me get
there.
I am willing to trade memory and CPU to achieve a non-stop apache instance.

So modperl experts: any pointers on prior art here?
I'd love to hear about strategies that have been shown to work in real life.


TIA,

-- jeff



Re: non-stop generational modperl config update strategies?

on 21.04.2010 04:09:14 by Perrin Harkins

Hi Jeff,

On Tue, Apr 20, 2010 at 8:18 PM, Jeff McCarrell wrote:
> I would like to be able to reload the configuration state from new data
> without an apache restart.

Given that you say it's only 10 MB, my best advice to you is to stop
worrying about sharing, even if it means buying a few thousand dollars'
worth of extra RAM. It's the simplest thing by far and might save you a lot of
debugging time. Then you can just have child processes check the
timestamp on your config file (or however you push it out) and reload
the data as needed in a cleanup handler.
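
A minimal sketch of that timestamp check, in plain Perl so it runs outside
Apache. The names load_config, reload_if_changed and config, and the
"key=value" file format, are hypothetical placeholders and not part of any
module mentioned in this thread.

```perl
use strict;
use warnings;

# Hypothetical per-process cache; nothing here is shared between children.
my %cache = (mtime => 0, data => {});

# Hypothetical loader: parses "key=value" lines. A real application would
# substitute its own config parser here.
sub load_config {
    my ($path) = @_;
    open my $fh, '<', $path or die "cannot open $path: $!";
    my %conf;
    while (my $line = <$fh>) {
        chomp $line;
        next unless $line =~ /^([^=\s]+)\s*=\s*(.*)$/;
        $conf{$1} = $2;
    }
    close $fh;
    return \%conf;
}

# Reload only when the file's mtime differs from the one last seen.
# Returns 1 if a reload happened, 0 otherwise.
sub reload_if_changed {
    my ($path) = @_;
    my $mtime = (stat($path))[9];
    return 0 unless defined $mtime;        # file missing: keep the old data
    return 0 if $mtime == $cache{mtime};   # unchanged since last check
    $cache{data}  = load_config($path);
    $cache{mtime} = $mtime;
    return 1;
}

# Accessor for the data currently held by this process.
sub config { return $cache{data} }
```

Under mod_perl 2 a function like reload_if_changed could be called from a
handler registered with PerlCleanupHandler, so the reload cost falls after
the response has been sent. Note that mtime has one-second resolution here,
so two pushes within the same second would not be told apart.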

If you're determined to share it, I don't think you'll find anything
significantly better than what you have. There's no way to share
configuration via IPC without eventually copying it into local perl
memory. If you try to use the threaded MPM you may succeed in sharing
this structure but you will have totally lost copy-on-write and will
probably end up using more RAM in the end. It's not that hard to try
though, so maybe you'd like to give it a shot.

Alternatively, you may decide this data is not all needed all the time
and you can store it in something very fast like BerkeleyDB and only
read the pieces you need when you need them from the apache children.
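
A rough sketch of the "read only the pieces you need" idea. It assumes
SDBM_File (bundled with perl) as a stand-in for BerkeleyDB, purely so the
example runs without extra modules; the function names and the key names are
hypothetical.

```perl
use strict;
use warnings;
use Fcntl;                       # O_RDWR, O_CREAT, O_RDONLY
use SDBM_File;
use File::Temp qw(tempdir);

# Build step (done offline): write every config entry into a DBM file.
sub build_db {
    my ($base, $entries) = @_;
    tie my %db, 'SDBM_File', $base, O_RDWR | O_CREAT, 0644
        or die "cannot create $base: $!";
    $db{$_} = $entries->{$_} for keys %$entries;
    untie %db;
}

# Request time: open read-only and fetch just the keys this request needs.
sub lookup {
    my ($base, @keys) = @_;
    tie my %db, 'SDBM_File', $base, O_RDONLY, 0644
        or die "cannot open $base: $!";
    my %got;
    for my $key (@keys) {
        my $value = $db{$key};
        $got{$key} = $value if defined $value;
    }
    untie %db;
    return \%got;
}

my $dir  = tempdir(CLEANUP => 1);
my $base = "$dir/config";
build_db($base, {
    'site.a.limit' => 10,
    'site.b.limit' => 20,
    'site.c.limit' => 30,
});
my $conf = lookup($base, 'site.b.limit');
print "site.b.limit = $conf->{'site.b.limit'}\n";
```

In a long-running child one would probably keep the tie open between
requests instead of reopening it on every lookup; it is reopened here only
to keep the sketch short.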

- Perrin

Re: non-stop generational modperl config update strategies?

on 21.04.2010 10:27:29 by torsten.foertsch

On Wednesday 21 April 2010 02:18:33 Jeff McCarrell wrote:
> I am willing to trade memory and CPU to achieve a non-stop apache instance.
>
> So modperl experts: any pointers on prior art here?
> I'd love to hear about strategies that have been shown to work in real
> life.
>
I use Apache2::Translation (AT) + MMapDB. Of course it depends upon your app.
AT can configure anything that can be configured at runtime. MMapDB stores
data in shared mem. As long as you don't need to reload perl modules, create
new IP-based VHosts, open additional logfiles or similar you probably don't
need to restart.

The mmapdb databases (single files) can be prepared off-line and then be
copied to the destination. Then the old database (still mapped by its users)
is invalidated by setting a flag. At the time of the next access the new
version will be mmapped. MMapDB is completely lock-free. So, deadlocks as with
BerkeleyDB cannot occur. It even saves the UTF8 bit of strings.

Torsten Förtsch

--
Need professional modperl support? Hire me! (http://foertsch.name)

Like fantasy? http://kabatinte.net

Re: non-stop generational modperl config update strategies?

on 21.04.2010 16:59:01 by Perrin Harkins

2010/4/21 Torsten Förtsch:
> The mmapdb databases (single files) can be prepared off-line and then be
> copied to the destination. Then the old database (still mapped by its users)
> is invalidated by setting a flag. At the time of the next access the new
> version will be mmapped. MMapDB is completely lock-free. So, deadlocks as
> with BerkeleyDB cannot occur.

Actually, this is the same approach I would use with BerkeleyDB:
build the database offline, watch for a newer file in a directory,
switch to the new one when it arrives. This is a read-only
application, so there's no need for locks at all.
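
A small sketch of that build-offline / switch-on-arrival pattern, assuming
a hypothetical naming scheme of config.NNNNNN.db (zero-padded generation
number) in one directory. The publisher writes under a temporary name and
rename()s it into place, so a reader never picks up a half-written file.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Publisher side: write the new generation under a temporary name, then
# rename() it into place. rename() within one filesystem is atomic.
sub publish {
    my ($dir, $generation, $content) = @_;
    my $final = sprintf '%s/config.%06d.db', $dir, $generation;
    my $tmp   = "$final.tmp";
    open my $fh, '>', $tmp or die "cannot write $tmp: $!";
    print {$fh} $content;
    close $fh or die "close failed: $!";
    rename $tmp, $final or die "rename failed: $!";
    return $final;
}

# Reader side: the newest complete generation is the highest-numbered file.
# Zero padding makes the lexical sort agree with numeric order.
sub newest_generation {
    my ($dir) = @_;
    opendir my $dh, $dir or die "cannot read $dir: $!";
    my @files = grep { /^config\.\d{6}\.db$/ } readdir $dh;
    closedir $dh;
    @files = sort @files;
    return @files ? "$dir/$files[-1]" : undef;
}

my $dir = tempdir(CLEANUP => 1);
publish($dir, 1, "generation one");
publish($dir, 2, "generation two");
print newest_generation($dir), "\n";
```

Each child would remember which file it currently has open and reopen only
when newest_generation returns a different name. Old generations can be
removed once no process still has them open.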

In both cases you have the same drawback: it's impossible to read
anything from the shared data without copying the data you read into
perl variables. A shared database only saves memory if you don't need
all of the data to handle a request.

- Perrin

Re: non-stop generational modperl config update strategies?

on 21.04.2010 17:10:00 by torsten.foertsch

On Wednesday 21 April 2010 16:59:01 Perrin Harkins wrote:
> In both cases you have the same drawback: it's impossible to read
> anything from the shared data without copying the data you read into
> perl variables. A shared database only saves memory if you don't need
> all of the data to handle a request.
>
No, MMapDB creates read-only variables that reference the mmapped block. It
manipulates SvPVX directly:

sv = newSV(0);
SvUPGRADE(sv, SVt_PV);
SvPOK_only(sv);
SvPV_set(sv, pointer);   /* point the string buffer at the mmapped block */
SvLEN_set(sv, 0);        /* this makes sure perl won't try to free() the space */
SvCUR_set(sv, length);
SvREADONLY_on(sv);

You can then pass around references to that variable and nothing will be
copied.

Torsten Förtsch


Re: non-stop generational modperl config update strategies?

on 21.04.2010 17:19:22 by Perrin Harkins

2010/4/21 Torsten Förtsch:
> no, MMapDB creates read-only variables that reference the mmapped block.
> It manipulates SvPVX directly

Very interesting! I'll have to try it out as a storage backend for CHI.

- Perrin

Re: non-stop generational modperl config update strategies?

on 21.04.2010 17:36:17 by Cosimo Streppone

On Wed, 21 Apr 2010 17:10:00 +0200, Torsten Förtsch
wrote:

> On Wednesday 21 April 2010 16:59:01 Perrin Harkins wrote:
>> In both cases you have the same drawback: it's impossible to read
>> anything from the shared data without copying the data you read into
>> perl variables.
>> [...]
>>
> no, MMapDB creates read-only variables that reference the mmapped block.
> It manipulates SvPVX directly:
>
> SvPV_set(sv, pointer);
> SvLEN_set(sv, 0); # this makes sure perl won't try to free() the
> space
> [...]
> You can then pass around references to that variable and nothing will be
> copied.

Cool!

So, if I understand correctly: using something like Cache::FastMmap
creates a copy of your strings/values/... in your process memory.
See the "fc_read" function. Is this correct?

http://cpansearch.perl.org/src/ROBM/Cache-FastMmap-1.35/Cache-FastMmap-CImpl/CImpl.xs

I guess pretty much anything else works that way, not just Cache::FastMmap.

--
Cosimo

Re: non-stop generational modperl config update strategies?

on 21.04.2010 17:55:13 by torsten.foertsch

On Wednesday 21 April 2010 17:19:22 Perrin Harkins wrote:
> 2010/4/21 Torsten Förtsch:
> > no, MMapDB creates read-only variables that reference the mmapped block.
> > It manipulates SvPVX directly
>
> Very interesting! I'll have to try it out as a storage backend for CHI.

One really cool thing I have just recently thought about would be an interface
to get the file offsets of a value. Then one could create file buckets with
$r->sendfile($filename, $offset, $len);. Even better would be a perl interface
to ap_send_fd to circumvent the filename=>filedescriptor race condition.

I'm itching to implement that, but ENOTIME.

Torsten Förtsch


Re: non-stop generational modperl config update strategies?

on 26.04.2010 23:57:25 by Jeff McCarrell

Thanks to Perrin and Torsten for their input.

I will be investigating Torsten's MMapDB, and will report results in a few
weeks.

-- jeff