Refreshing stored data at administrator's signal
on 13.01.2008 20:30:55 by Colin Wetherbee
Greetings.
I have an application that accesses some relatively static database
tables to create drop-down
Colin Wetherbee wrote:
> At the moment (and not in a production environment), every time the
> drop-down list is generated for a web page, the script queries the
> database to retrieve the entire list of aircraft. I would prefer to
> retrieve the list of aircraft when each Perl interpreter starts and
> then not retrieve it again until the administrator sends a signal.
> For this particular table, the signal would only occur when new
> aircraft hit the market, like the Boeing 787 will (hopefully) in
> December of this year.
>
> The most UNIX-ish way to do this, I guess, would be to send SIGHUP to
> each running perl process, causing it to reload its configuration,
> update its stored lists, and so forth. I'd rather do this in a more
> Perl-ish or Apache-ish way, though, and I'd also rather be specific
> about which list it should update.
Wouldn't a simpler approach be to just restart Apache when you want to
update the lists? You could even have the 'add to list' function send
SIGUSR1 to the parent Apache, causing a graceful restart.
Having said that, if running 20 DB queries returning a few hundred
records is causing you a speed problem, are you sure the DB is running
efficiently? Is this a very high traffic site? Is there a requirement
for ultra-fast page generation? I've got pages that make dozens and
dozens of DB queries returning hundreds of records and do lots of
post-processing, and I can generate pages in under a second much of the
time.
cheers
John
John ORourke wrote:
> Colin Wetherbee wrote:
>> At the moment (and not in a production environment), every time the
>> drop-down list is generated for a web page, the script queries the
>> database to retrieve the entire list of aircraft. I would prefer to
>> retrieve the list of aircraft when each Perl interpreter starts and
>> then not retrieve it again until the administrator sends a signal.
>> For this particular table, the signal would only occur when new
>> aircraft hit the market, like the Boeing 787 will (hopefully) in
>> December of this year.
>>
>> The most UNIX-ish way to do this, I guess, would be to send SIGHUP to
>> each running perl process, causing it to reload its configuration,
>> update its stored lists, and so forth. I'd rather do this in a more
>> Perl-ish or Apache-ish way, though, and I'd also rather be specific
>> about which list it should update.
>
> Wouldn't a simpler approach be to just restart Apache when you want to
> update the lists? You could even have the 'add to list' function send
> SIGUSR1 to the parent Apache, causing a graceful restart.
I'm trying to avoid restarting Apache altogether, although I admit it
would be a pretty simple solution.
> Having said that, if running 20 DB queries returning a few hundred
> records is causing you a speed problem, are you sure the DB is running
> efficiently? Is this a very high traffic site? Is there a requirement
> for ultra-fast page generation? I've got pages that make dozens and
> dozens of DB queries returning hundreds of records and do lots of
> post-processing, and I can generate pages in under a second much of the
> time.
The point is more like "well, this isn't really super-dynamic data, so
running a query every time I need it seems like a waste of processor
time and disk activity."
It's not causing any slow-down right now, though when the site goes
live, it certainly could.
Colin
Colin Wetherbee wrote:
> John ORourke wrote:
>> Colin Wetherbee wrote:
>> Wouldn't a simpler approach be to just restart Apache when you want
>> to update the lists? You could even have the 'add to list' function
>> send SIGUSR1 to the parent Apache, causing a graceful restart.
>
> I'm trying to avoid restarting Apache altogether, although I admit it
> would be a pretty simple solution.
I'd seriously consider it - it's simple and clean and only takes a few
seconds, and it happens every night when you rotate your logs anyway.
If you really really don't want to restart Apache, you could get your
'add data' function to create a file called 'need_restart' somewhere on
the disk, and after processing each request your mod_perl handler could
check for the file and call $r->child_terminate if it finds it. You'd
have to have some method of stopping it from constantly restarting....
could get complicated.
The cynic in me suspects you'll spend too many hours on this
not-really-a-problem, when there may be other parts of the system that
would benefit from more attention!
cheers
John
John ORourke wrote:
> Colin Wetherbee wrote:
>> John ORourke wrote:
>>> Colin Wetherbee wrote:
>>> Wouldn't a simpler approach be to just restart Apache when you want
>>> to update the lists? You could even have the 'add to list' function
>>> send SIGUSR1 to the parent Apache, causing a graceful restart.
>>
>> I'm trying to avoid restarting Apache altogether, although I admit it
>> would be a pretty simple solution.
>
> I'd seriously consider it - it's simple and clean and only takes a few
> seconds, and it happens every night when you rotate your logs anyway.
> If you really really don't want to restart Apache, you could get your
> 'add data' function to create a file called 'need_restart' somewhere on
> the disk, and after processing each request your mod_perl handler could
> check for the file and call $r->child_terminate if it finds it. You'd
> have to have some method of stopping it from constantly restarting....
> could get complicated.
I thought about the file thing... if the file exists, check its last
modified timestamp; if that timestamp is greater than the stored
timestamp, then update the data from the database. It seems like
unnecessary disk access, though. Then again, this whole problem is
riddled with unnecessary disk access. :)
> The cynic in me suspects you'll spend too many hours on this
> not-really-a-problem, when there may be other parts of the system that
> would benefit from more attention!
Well, you're probably right about that. ;)
Perhaps I'll set up a restart-based system and then worry about it later
if it becomes an "actual" problem.
Thanks for your input.
Colin
On Jan 13, 2008 4:19 PM, Colin Wetherbee wrote:
> I thought about the file thing... if the file exists, check its last
> modified timestamp; if that timestamp is greater than the stored
> timestamp, then update the data from the database. It seems like
> unnecessary disk access, though. Then again, this whole problem is
> riddled with unnecessary disk access. :)
Using a "touch file" is the classic solution to this problem. You
check the mod time on a file (it's okay for it to always be there -- it's
just the mod time we care about) and compare that to the last update
time that you keep in a global. It's dirt simple, avoids messy
problems with signals, and it should end up in your operating system's
disk cache so it really won't do any physical disk reads.
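
A minimal sketch of the touch-file check described above, under some
assumptions: the path in $TOUCH_FILE, the load_aircraft() stand-in for the
real database query, and the $loads counter are all hypothetical names made
up for illustration. Each process keeps the last mod time it saw in a global
and reloads only when the file's mod time is newer.

```perl
use strict;
use warnings;

# Assumed path; the administrator runs `touch` on it after changing the table.
our $TOUCH_FILE = '/tmp/aircraft.touch';
our $loads      = 0;     # counts reloads, only so the behaviour is observable

my $last_loaded;         # mod time seen at the last reload (per-process global)
my @aircraft;            # the cached list

# Stand-in for the real database query.
sub load_aircraft {
    $loads++;
    return ('Boeing 747', 'Boeing 787');
}

sub get_aircraft {
    # stat() returns an empty list if the file is missing; treat that as 0.
    my $mtime = (stat $TOUCH_FILE)[9] || 0;

    # Reload on the first call, or when the file was touched since the last load.
    if (!defined $last_loaded || $mtime > $last_loaded) {
        @aircraft    = load_aircraft();
        $last_loaded = $mtime;
    }
    return @aircraft;
}
```

Calling get_aircraft() on every request then costs one stat() in the common
case, and the database is only queried after someone touches the file.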
- Perrin
Scott Gifford wrote:
> Colin Wetherbee
>
> [...]
>
>> At the moment (and not in a production environment), every time the
>> drop-down list is generated for a web page, the script queries the
>> database to retrieve the entire list of aircraft. I would prefer to
>> retrieve the list of aircraft when each Perl interpreter starts and
>> then not retrieve it again until the administrator sends a signal.
>> For this particular table, the signal would only occur when new
>> aircraft hit the market, like the Boeing 787 will (hopefully) in
>> December of this year.
>
> Essentially what you want is an in-memory cache of a possibly slow
> database query. There are several modules on CPAN that do this;
> search for "cache".
I'm not sure what you're suggesting. The first few pages of "cache" on
CPAN have some modules for caching data in memory and on disk and so
forth, but I don't see how they relate to my problem.
Which is that of notifying all of my application's perl processes when
an update has been performed on a table in a database, without having
them access the database to determine this on their own.
Thanks.
Colin
> I'm not sure what you're suggesting. The first few pages of "cache" on
> CPAN have some modules for caching data in memory and on disk and so
> forth, but I don't see how they relate to my problem.
>
> Which is that of notifying all of my application's perl processes when
> an update has been performed on a table in a database, without having
> them access the database to determine this on their own.
There are two ways of achieving your task:
- active: forcing all the apache processes to update their list of
aircraft
- passive: having each apache child check on whether the copy of the
list they already have is still up to date
By far the simplest way of achieving the first option is by having the
parent load and cache the list (which means that memory is shared by all
the child processes) and restarting your apache processes when the list
changes.
For the passive route, each apache child has to perform some kind of
check to see whether their version is up to date. This requires some
kind of check somewhere, eg:
- checking the last modified time of a file
- loading the list from a cache
- loading the list from the database
Your intention is to reduce the number of database hits. That's fine,
but it needs to be weighed against the cost of inflexibility, or the
cost of checking and rebuilding the cache.
For data that almost never changes, I would go the active route.
For data that changes more regularly, but has a certain time-to-live, I
would go the caching route. For data that changes by the second, get it
directly from the DB.
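
As a rough illustration of the time-to-live case, here is a small in-process
sketch. make_ttl_cache() is a hypothetical helper, not a CPAN module, and the
anonymous sub stands in for the real query. The CPAN cache modules offer the
same idea with shared storage behind it.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure that caches the result of
# $fetch->() and only calls it again once $ttl seconds have passed.
sub make_ttl_cache {
    my ($ttl, $fetch) = @_;
    my ($value, $expires);
    return sub {
        my $now = time;
        if (!defined $expires || $now >= $expires) {
            $value   = $fetch->();     # e.g. the real DB query
            $expires = $now + $ttl;
        }
        return $value;
    };
}

# Usage: the list is fetched at most once every 300 seconds per process.
my $aircraft = make_ttl_cache(300, sub { return ['Boeing 747', 'Boeing 787'] });
my $list     = $aircraft->();
```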
So searching for 'cache' on CPAN does indeed give you a number of very
useful modules that ease your path to reducing the number of DB hits
that you have.
My personal favourite is Cache::Memcached, but that's only relevant if
you have more than one web server. If not, the file based caches are
the fastest (or you could try looking at SQLite or Cache::BerkeleyDB or
even a memory table in MySQL, but on a different DB server)
regards
Clint
>
> Thanks.
>
> Colin
>
Clinton Gormley wrote:
>> I'm not sure what you're suggesting. The first few pages of "cache" on
>> CPAN have some modules for caching data in memory and on disk and so
>> forth, but I don't see how they relate to my problem.
>>
>> Which is that of notifying all of my application's perl processes when
>> an update has been performed on a table in a database, without having
>> them access the database to determine this on their own.
>
> My personal favourite is Cache::Memcached, but that's only relevant if
> you have more than one web server. If not, the file based caches are
> the fastest (or you could try looking at SQLite or Cache::BerkeleyDB or
> even a memory table in MySQL, but on a different DB server)
Memcached sounds like a good idea. I could cache the update timestamps
and compare them on each run.
I guess I wasn't thinking about "cache" the right way around.
Thanks!
Colin
Hi. The touch file will definitely work, and I've used that myself, but
in this case its inelegance bothers me. It's also another touch point
for administration. What I would probably do is put the state
information in the database itself. The script would keep track of the
age of its data, and every 5 minutes or so it would check the state
information in the course of its normal operation. So when a user hit
causes the script to execute, the last thing it does is see if its state
data is more than 5 minutes old and, if so, refresh it. If the state
information has changed, it would reload everything indicated right
there. You want to do this at the very tail end of the script so the
refresh doesn't delay the page draw for the user. This way you've avoided
expanding your administrative footprint.
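
Roughly, that could look like the sketch below. make_refresher() and its
interval, get_version and reload arguments are hypothetical names; get_version
stands for a cheap query against a small state table, and reload for the code
that re-reads the lists. The optional time argument exists only so the logic
can be exercised without waiting.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure to call at the tail end of each
# request. It checks the state at most once per interval, and reloads only
# when the state value has changed since the last check.
sub make_refresher {
    my (%arg) = @_;                      # interval, get_version, reload
    my ($checked_at, $version) = (0, undef);
    return sub {
        my $now = @_ ? shift : time;
        return 0 if $now - $checked_at < $arg{interval};   # checked recently
        $checked_at = $now;
        my $current = $arg{get_version}->();
        return 0 if defined $version && $current eq $version;   # unchanged
        $version = $current;
        $arg{reload}->();
        return 1;                        # a reload happened
    };
}
```

The first call always reloads, since the process has no state value yet.
After that, most requests pay only for a subtraction, and the state query
runs once per interval.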
Colin Wetherbee wrote:
>
> Greetings.
>
> I have an application that accesses some relatively static database
> tables to create drop-down