looks like one database from the outside, but it"s two inside

am 19.07.2007 07:23:52 von shixilun

Overview: I'm trying to create the situation in the subject line:
from the outside, the database handle thinks it's just one database,
but inside it's really two databases containing tables with the same
schema but not necessarily the same data.

Specifics: Where I work we have a web site with content determined
jointly by user interaction and by data in various tables in a very
large database. I would like to test what would happen if I add other
data to the database. Typically what I have to do is copy the
relevant parts of the very large database to my box, add in the new
data, and test locally. (The very large database is far too large to
copy between machines.)

There are a number of things that make this setup difficult:

1) Determining what is "relevant" is not always an easy task and is
often fraught with errors.

2) The very large database can be modified by many, many other people
at will. So, each time I perform a test, I have to go back to the
very large database and copy the relevant parts to my box.

3) Handles to the database are created through an intermediary module.
So, the call XXX::DBH->new('YYY') (where XXX::DBH is a subclass of
DBI::db) returns a $dbh to the correct database (names have been
changed to protect the innocent). This module relies on a
configuration file to turn 'YYY' into the correct parameters for
DBI->connect(); that means I'm always fiddling with the configuration
file, depending if I want to use the original very large database or
my local one.

4) I'd like to fiddle with the original code as little as possible,
including the intermediary module.

My thoughts have turned to something like DBD::Proxy, because then I
could just change the environment variable DBI_PROXY when needed to
switch between servers. But DBI::ProxyServer doesn't seem very easily
adaptable to this situation; it seems restricted to the rather
specific task of acting as a proxy for another driver.

I have also thought about writing a driver from scratch, using
SQL::Statement to help out, but that breaks the first rule of driver
writing ("Don't!"), and since this is for my job, the second rule
("Don't -- get someone else to do it for you!") isn't really an
option.

I'm not afraid of putting in the time to do this right, and I'm not
trying to get something for free. But I'd like some opinions about
what might be the correct approaches to this problem, especially if
there's something obvious that I've overlooked. (So OK, I'm trying to
get some advice for free :)

Thanks.

Re: looks like one database from the outside, but it"s two inside

am 19.07.2007 12:13:37 von Tim.Bunce

On Wed, Jul 18, 2007 at 10:23:52PM -0700, Theron Stanford wrote:
> Overview: I'm trying to create the situation in the subject line:
> from the outside, the database handle thinks it's just one database,
> but inside it's really two databases containing tables with the same
> schema but not necessarily the same data.

It's not really clear what you're trying to do. From what you say below
it seems that you just want to "just change the env var to switch
between servers". That's relatively simple but doesn't seem to match
what you said above.

> My thoughts have turned to something like DBD::Proxy, because then I
> could just change the environment variable DBI_PROXY when needed to
> switch between servers.

Perhaps you could just use the DBI_AUTOPROXY env var to force a
different DSN without using DBD::Proxy.

> I have also thought about writing a driver from scratch, using
> SQL::Statement to help out, but that breaks the first rule of driver
> writing ("Don't!"), and since this is for my job, the second rule
> ("Don't -- get someone else to do it for you!") isn't really an
> option.
>
> I'm not afraid of putting in the time to do this right, and I'm not
> trying to get something for free. But I'd like some opinions about
> what might be the correct approaches to this problem, especially if
> there's something obvious that I've overlooked. (So OK, I'm trying to
> get some advice for free :)

DBD::Multiplex may be worth exploring. (I have a modified and more
flexible version I worked on a couple of years ago that never quite got
polished up and released. It should probably be renamed as it's
slightly incompatible with the old one. Volunteers welcome.)

Tim.

Re: looks like one database from the outside, but it"s two inside

am 19.07.2007 16:32:34 von mnhan

Hi,

What about a round robin dns for the database server? Connect to the
server via a dns name that can be round robin to 2 ips. One per database
server. The outside host wouldn't know the difference and would simply
connect to the dns name and external ip that it was given and your inside
host handles which internal ip it belonging to a server it was actually
talking to. Just a thought.

Michael

On Wed, 18 Jul 2007,
Theron Stanford wrote:

> Date: Wed, 18 Jul 2007 22:23:52 -0700
> From: Theron Stanford
> To: dbi-users@perl.org
> Subject: looks like one database from the outside, but it's two inside
>
> Overview: I'm trying to create the situation in the subject line:
> from the outside, the database handle thinks it's just one database,
> but inside it's really two databases containing tables with the same
> schema but not necessarily the same data.
>
> Specifics: Where I work we have a web site with content determined
> jointly by user interaction and by data in various tables in a very
> large database. I would like to test what would happen if I add other
> data to the database. Typically what I have to do is copy the
> relevant parts of the very large database to my box, add in the new
> data, and test locally. (The very large database is far too large to
> copy between machines.)
>
> There are a number of things that make this setup difficult:
>
> 1) Determining what is "relevant" is not always an easy task and is
> often fraught with errors.
>
> 2) The very large database can be modified by many, many other people
> at will. So, each time I perform a test, I have to go back to the
> very large database and copy the relevant parts to my box.
>
> 3) Handles to the database are created through an intermediary module.
> So, the call XXX::DBH->new('YYY') (where XXX::DBH is a subclass of
> DBI::db) returns a $dbh to the correct database (names have been
> changed to protect the innocent). This module relies on a
> configuration file to turn 'YYY' into the correct parameters for
> DBI->connect(); that means I'm always fiddling with the configuration
> file, depending if I want to use the original very large database or
> my local one.
>
> 4) I'd like to fiddle with the original code as little as possible,
> including the intermediary module.
>
> My thoughts have turned to something like DBD::Proxy, because then I
> could just change the environment variable DBI_PROXY when needed to
> switch between servers. But DBI::ProxyServer doesn't seem very easily
> adaptable to this situation; it seems restricted to the rather
> specific task of acting as a proxy for another driver.
>
> I have also thought about writing a driver from scratch, using
> SQL::Statement to help out, but that breaks the first rule of driver
> writing ("Don't!"), and since this is for my job, the second rule
> ("Don't -- get someone else to do it for you!") isn't really an
> option.
>
> I'm not afraid of putting in the time to do this right, and I'm not
> trying to get something for free. But I'd like some opinions about
> what might be the correct approaches to this problem, especially if
> there's something obvious that I've overlooked. (So OK, I'm trying to
> get some advice for free :)
>
> Thanks.
>
>

--
---//---
"He who learns must suffer. And even in our sleep,
pain that cannot forget, falls drop by drop upon the heart;
and, in our own despair, against our will,
comes wisdom to us by the awful grace of God."
--- Aeschylus

Re: looks like one database from the outside, but it"s two inside

am 19.07.2007 19:01:56 von shixilun

Thanks to all, and apologies to all.

I should have waited another day to flesh this out before sending, but
I was trying to take advantage of living on the west coast of the USA
to send this out last night and come to work this morning to find my
inbox teeming with lots of interesting ideas. Dumb, dumb, dumb (and
selfish).

On 7/19/07, Tim Bunce wrote:
>
> It's not really clear what you're trying to do.

Suppose I have a database with a table with two columns, 'key' and
'value'. (This is a simplification, of course.) The contents of this
database determine how my software interacts. The database is
read-only, as far as the software is concerned; this should make the
implementation simpler.

Now, suppose I'd like to test out the result of adding or updating
some new key-value pairs in this database. What I'd like to do is
have a local database table to put these temporary key-value pairs in,
and have my software obtain a $dbh that makes it transparent to the
software that it's really looking at two tables, mixing their data
together as one.

By the way, when I say "my software", I mean software written by the
company that isn't really amenable to changing, which is why I thought
of the DBI_AUTOPROXY approach (see below). But I could hack a local
copy of it if necessary.

> From what you say below
> it seems that you just want to "just change the env var to switch
> between servers". That's relatively simple but doesn't seem to match
> what you said above.

What I intended was that I could use the environment variable to
switch between "normal", using just the one database, or "test",
adding in the data from the new database as well. Since I usually
have more than one "test" database, having a switch would really help.

> > My thoughts have turned to something like DBD::Proxy, because then I
> > could just change the environment variable DBI_PROXY when needed to
> > switch between servers.
>
> Perhaps you could just use the DBI_AUTOPROXY env var to force a
> different DSN without using DBD::Proxy.

Although I got the name of the environment variable wrong, I'm glad to
see you caught what I had in mind and seem to think it's a
possibility.

The trouble is determining what this different DSN would be. I don't
know of any DBDs that act the way I have envisioned above.

> DBD::Multiplex may be worth exploring. (I have a modified and more
> flexible version I worked on a couple of years ago that never quite got
> polished up and released. It should probably be renamed as it's
> slightly incompatible with the old one. Volunteers welcome.)

I did take a look at DBD::Multiplex, but it struck me that it was
intended for situations in which there are multiple servers with
duplicate contents: each database has the same data in it, so you can
read any server you want, and a write operation propagates that change
to each of the servers, keeping them in sync. The situation I
envision is of multiple servers whose contents are not the same.

But I'll take another look at the module if you think it warrants it.

Theron

Re: looks like one database from the outside, but it"s two inside

am 19.07.2007 19:09:45 von ngdvakigyotuia

I would consider setting up the test database to consist of a database
link to the "real" database, along with views for all the tables.
Then on those tables where I wanted to interject some data, I would
create test-data tables and alter the views to merge the data.

Then switching from "test" to "real" is a matter of changing where you
connect to. Alternately, you could change the views to only
incorporate test-data if a certain flag is set in yet another test
table.

On 7/19/07, Theron Stanford wrote:
> Thanks to all, and apologies to all.
>
> I should have waited another day to flesh this out before sending, but
> I was trying to take advantage of living on the west coast of the USA
> to send this out last night and come to work this morning to find my
> inbox teeming with lots of interesting ideas. Dumb, dumb, dumb (and
> selfish).
>
> On 7/19/07, Tim Bunce wrote:
> >
> > It's not really clear what you're trying to do.
>
> Suppose I have a database with a table with two columns, 'key' and
> 'value'. (This is a simplification, of course.) The contents of this
> database determine how my software interacts. The database is
> read-only, as far as the software is concerned; this should make the
> implementation simpler.
>
> Now, suppose I'd like to test out the result of adding or updating
> some new key-value pairs in this database. What I'd like to do is
> have a local database table to put these temporary key-value pairs in,
> and have my software obtain a $dbh that makes it transparent to the
> software that it's really looking at two tables, mixing their data
> together as one.
>
> By the way, when I say "my software", I mean software written by the
> company that isn't really amenable to changing, which is why I thought
> of the DBI_AUTOPROXY approach (see below). But I could hack a local
> copy of it if necessary.
>
> > From what you say below
> > it seems that you just want to "just change the env var to switch
> > between servers". That's relatively simple but doesn't seem to match
> > what you said above.
>
> What I intended was that I could use the environment variable to
> switch between "normal", using just the one database, or "test",
> adding in the data from the new database as well. Since I usually
> have more than one "test" database, having a switch would really help.
>
> > > My thoughts have turned to something like DBD::Proxy, because then I
> > > could just change the environment variable DBI_PROXY when needed to
> > > switch between servers.
> >
> > Perhaps you could just use the DBI_AUTOPROXY env var to force a
> > different DSN without using DBD::Proxy.
>
> Although I got the name of the environment variable wrong, I'm glad to
> see you caught what I had in mind and seem to think it's a
> possibility.
>
> The trouble is determining what this different DSN would be. I don't
> know of any DBDs that act the way I have envisioned above.
>
> > DBD::Multiplex may be worth exploring. (I have a modified and more
> > flexible version I worked on a couple of years ago that never quite got
> > polished up and released. It should probably be renamed as it's
> > slightly incompatible with the old one. Volunteers welcome.)
>
> I did take a look at DBD::Multiplex, but it struck me that it was
> intended for situations in which there are multiple servers with
> duplicate contents: each database has the same data in it, so you can
> read any server you want, and a write operation propagates that change
> to each of the servers, keeping them in sync. The situation I
> envision is of multiple servers whose contents are not the same.
>
> But I'll take another look at the module if you think it warrants it.
>
> Theron
>

Re: looks like one database from the outside, but it"s two inside

am 19.07.2007 19:22:00 von shixilun

This sounds like a good idea. I guess I just don't know enough about
databases yet to know how to do this linking. Mostly I just know
about how to get DBI to interact with MySQL.

I should mention that I'd also like to be able to make some rules in
the "real" database temporarily disappear while testing, but I fear
this will make things even more complicated.

On 7/19/07, Sven Miller wrote:
> I would consider setting up the test database to consist of a database
> link to the "real" database, along with views for all the tables.
> Then on those tables where I wanted to interject some data, I would
> create test-data tables and alter the views to merge the data.
>
> Then switching from "test" to "real" is a matter of changing where you
> connect to. Alternately, you could change the views to only
> incorporate test-data if a certain flag is set in yet another test
> table.
>

Re: looks like one database from the outside, but it"s two inside

am 19.07.2007 19:27:22 von ngdvakigyotuia

And unfortunately I don't know enough about MySQL to be more helpful.
Database links may be an Oracle-only thing.

On 7/19/07, Theron Stanford wrote:
> This sounds like a good idea. I guess I just don't know enough about
> databases yet to know how to do this linking. Mostly I just know
> about how to get DBI to interact with MySQL.
>
> I should mention that I'd also like to be able to make some rules in
> the "real" database temporarily disappear while testing, but I fear
> this will make things even more complicated.
>
> On 7/19/07, Sven Miller wrote:
> > I would consider setting up the test database to consist of a database
> > link to the "real" database, along with views for all the tables.
> > Then on those tables where I wanted to interject some data, I would
> > create test-data tables and alter the views to merge the data.
> >
> > Then switching from "test" to "real" is a matter of changing where you
> > connect to. Alternately, you could change the views to only
> > incorporate test-data if a certain flag is set in yet another test
> > table.
> >
>

Re: looks like one database from the outside, but it"s two inside

am 20.07.2007 11:20:22 von clive

Sven Miller wrote:
> And unfortunately I don't know enough about MySQL to be more helpful.
> Database links may be an Oracle-only thing.
>
> On 7/19/07, Theron Stanford wrote:
>> This sounds like a good idea. I guess I just don't know enough about
>> databases yet to know how to do this linking. Mostly I just know
>> about how to get DBI to interact with MySQL.
>

http://dev.mysql.com/doc/refman/5.0/en/create-view.html

create a view thus (from a quick scan on the doc)

create view my_agregate_table
as
select * from maintable
union
select * from tempdatatable

either or both of the selects can have where clauses that restrict the
data 'viewed'

--
This message has been scanned for viruses and
dangerous content by OpenProtect(http://www.openprotect.com), and is
believed to be clean.