table type and server load

am 11.04.2006 18:06:51 von Joelle Tegwen

I'm currently working on an application that looks like it's going to
have (we're still finalizing the database design) 12 lookup tables
(tables of mostly static data) and 7 high entry data tables (very
denormalized). However, relative to other applications, the whole thing
is pretty static. It's a reporting application and reporting happens no
more than once a year (maybe once every 2 years).

Over the next 5 years we intend to add on to this application 3 other
reporting groups that will (for arguments sake) add 5 lookup tables and
7 data tables each. (I really don't know I'm just kind of guessing
here). Many of the original 12 lookup tables will be reused in the other
reports.

To some extent the reports would be accessed separately, but eventually
they want to be able to see the data from the different reports combined.

So I have a few questions.
1) The data is very relationtional, but would you use MyISAM anyway
because it's such a read heavy application?

2) How does the use of federated tables affect table access efficiency?
i.e. If I put the "data entry" tables each in their own schemas and the
lookup tables in another and used federated tables to connect them does
that improve the lookup speed? Or does it really not matter since it's
mostly lookup anyway?

3) Are there other ways to make this more efficient?

Thanks
Joelle

--
MySQL Windows Mailing List
For list archives: http://lists.mysql.com/win32
To unsubscribe: http://lists.mysql.com/win32?unsub=gcdmw-win32@m.gmane.org

Re: table type and server load

am 11.04.2006 19:44:08 von Shawn Green

Responses intermingled ---

--- Joelle Tegwen wrote:

> I'm currently working on an application that looks like it's going to
>
> have (we're still finalizing the database design) 12 lookup tables
> (tables of mostly static data) and 7 high entry data tables (very
> denormalized). However, relative to other applications, the whole
> thing
> is pretty static.

If they are relatively static, how often are they upated? Hourly?
Daily? Weekly?

> It's a reporting application and reporting happens
> no
> more than once a year (maybe once every 2 years).
>
> Over the next 5 years we intend to add on to this application 3 other
>
> reporting groups that will (for arguments sake) add 5 lookup tables
> and
> 7 data tables each. (I really don't know I'm just kind of guessing
> here). Many of the original 12 lookup tables will be reused in the
> other
> reports.
>
> To some extent the reports would be accessed separately, but
> eventually
> they want to be able to see the data from the different reports
> combined.
>

I thought you were storing data and building the reports from the data.
Are you just storing the results of an existing report from some other
system?

> So I have a few questions.
> 1) The data is very relationtional, but would you use MyISAM anyway
> because it's such a read heavy application?
>

How relational a set of data is shouldn't affect your choice of which
engine (unless you NEED foreign key constraints. I that case you must
use InnoDB). MyISAM is the fastest engine for mostly-read type usage.

> 2) How does the use of federated tables affect table access
> efficiency?

Federating a table requires an extra relay of your query from one
server to another, spooling the results on the server to which you
originally posted the query, then transmittal of those results to you.
What you may be able to gain in performance (one report per federating
server?) you have to balance against the networking overhead involved
in distributing the query. Depending on the setup and the query traffic
you will have, FEDERAT-ing a table can either help or hurt your
performance.

> i.e. If I put the "data entry" tables each in their own schemas and
> the
> lookup tables in another and used federated tables to connect them
> does
> that improve the lookup speed? Or does it really not matter since
> it's
> mostly lookup anyway?
>

Schemas are generally all on the same server. You do not need to
federate tables in order to perform cross-database queries. You only
need FEDERATED tables if you want to do cross-server queries.

> 3) Are there other ways to make this more efficient?

Creating indexes of the appropriate types on the appropriate columns
will always help. If you are worried about space (as this seems to be
more of a reporting archive application) you may consider using the
ARCHIVE engine.

> Thanks
> Joelle
>

No problem,

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

--
MySQL Windows Mailing List
For list archives: http://lists.mysql.com/win32
To unsubscribe: http://lists.mysql.com/win32?unsub=gcdmw-win32@m.gmane.org

Re: table type and server load

am 11.04.2006 21:25:37 von Joelle Tegwen

more intermingling...

Shawn Green wrote:
>> I'm currently working on an application that looks like it's going to
>>
>> have (we're still finalizing the database design) 12 lookup tables
>> (tables of mostly static data) and 7 high entry data tables (very
>> denormalized). However, relative to other applications, the whole
>> thing
>> is pretty static.
>>
>
> If they are relatively static, how often are they upated? Hourly?
> Daily? Weekly?
>
The "data entry" tables will get new information every year (or maybe 2
years). Each time a new set of data is entered it's a new "reporting
cycle" The data is all entered, verified, (updated etc) and then when
it's finalized the new information all rolls out to be "public" and
reportable. In theory, once it's finalized that's it, but I'm told that
is not always reality. It looks like the people we are reporting data on
have about a month to dispute information post "finalization" The lookup
tables might have information added to them each reporting cycle, but
it's on the scale of 0-10 additions vs 10,000 in the data entry tables.
>> It's a reporting application and reporting happens
>> no
>> more than once a year (maybe once every 2 years).
>>
>> Over the next 5 years we intend to add on to this application 3 other
>>
>> reporting groups that will (for arguments sake) add 5 lookup tables
>> and
>> 7 data tables each. (I really don't know I'm just kind of guessing
>> here). Many of the original 12 lookup tables will be reused in the
>> other
>> reports.
>>
>> To some extent the reports would be accessed separately, but
>> eventually
>> they want to be able to see the data from the different reports
>> combined.
>>
>>
>
> I thought you were storing data and building the reports from the data.
> Are you just storing the results of an existing report from some other
> system?
>
>
We are storing data and building reports from them. Right now there's 4
reports that are on similar topics but the data is presented in a very
silo-ized kind of way. Eventually we hope to be able to integrate the
data. I can go into more detail with examples if it's still not clear.
>> So I have a few questions.
>> 1) The data is very relationtional, but would you use MyISAM anyway
>> because it's such a read heavy application?
>>
>>
>
> How relational a set of data is shouldn't affect your choice of which
> engine (unless you NEED foreign key constraints. I that case you must
> use InnoDB). MyISAM is the fastest engine for mostly-read type usage.
>
How do I know if I NEED foreign key constraints?
>
>> 2) How does the use of federated tables affect table access
>> efficiency?
>>
>
> Federating a table requires an extra relay of your query from one
> server to another, spooling the results on the server to which you
> originally posted the query, then transmittal of those results to you.
> What you may be able to gain in performance (one report per federating
> server?) you have to balance against the networking overhead involved
> in distributing the query. Depending on the setup and the query traffic
> you will have, FEDERAT-ing a table can either help or hurt your
> performance.
>
>
>> i.e. If I put the "data entry" tables each in their own schemas and
>> the
>> lookup tables in another and used federated tables to connect them
>> does
>> that improve the lookup speed? Or does it really not matter since
>> it's
>> mostly lookup anyway?
>>
>>
>
> Schemas are generally all on the same server. You do not need to
> federate tables in order to perform cross-database queries. You only
> need FEDERATED tables if you want to do cross-server queries.
>
Sorry, I'm still stuck in Access mode I guess :) I get it now.

So separating out the lookup tables from the data tables is just an
architectural choice. How do I decide whether or not to separate them or
make one big schema for the whole thing (or something else entirely)?
Maybe this is beyond the scope of this list?
>
>> 3) Are there other ways to make this more efficient?
>>
>
> Creating indexes of the appropriate types on the appropriate columns
> will always help. If you are worried about space (as this seems to be
> more of a reporting archive application) you may consider using the
> ARCHIVE engine.
>
>
Maybe after the data is "finally final" we could use ARCHIVE.I was
definitely considering it for the data from previous reporting cycles.

Thanks much!
Joelle
>

--
MySQL Windows Mailing List
For list archives: http://lists.mysql.com/win32
To unsubscribe: http://lists.mysql.com/win32?unsub=gcdmw-win32@m.gmane.org

Re: table type and server load

am 11.04.2006 21:57:42 von Shawn Green

.... some snipping and more mingling ...

--- Joelle Tegwen wrote:

> more intermingling...
>
> Shawn Green wrote:

> >
> > How relational a set of data is shouldn't affect your choice of
> which
> > engine (unless you NEED foreign key constraints. I that case you
> must
> > use InnoDB). MyISAM is the fastest engine for mostly-read type
> usage.
> >
> How do I know if I NEED foreign key constraints?

You need them only if you want the database to verify that the values
in certain columns already exist in other tables (eg: verify that the
lookup_id of one record is actually a valid value because it really
does exist in the lookup table). The way you describe your situation,
you can probably do that validation in your code for as few records
that you may need to add or change.

> > Schemas are generally all on the same server. You do not need to
> > federate tables in order to perform cross-database queries. You
> only
> > need FEDERATED tables if you want to do cross-server queries.
> >
> Sorry, I'm still stuck in Access mode I guess :) I get it now.
>
> So separating out the lookup tables from the data tables is just an
> architectural choice. How do I decide whether or not to separate them
> or
> make one big schema for the whole thing (or something else entirely)?
>
> Maybe this is beyond the scope of this list?

Maybe the details are beyond the scope but the theory isn't. Some good
reasons to put data common to several sets of other data in it's own
container (database):

a) security - you can grant read-only permissions to most of the data
access accounts to your lookup tables. That way you don't possibly get
five dozen people trying to muck with the most important lists.
Contrary to what several managers might think: Access Control is a good
thing.

b) organization - If it feels "natural" to organize the data in a
hub-and-spoke pattern, do it. It's a form of normalization to eliminate
duplicate tables of information in separate databases. Combining them
into a centralized database is logical under many circumstances.

c) memory optimization - you could preload that database into memory
(or operate that database off of a RAM disk, if you have the space).
Because so many other queries need to get data from this one central
location, you can get some serious throughput boosts by improving the
storage of these central facts.

There are probably a dozen other reasons to split or not to split your
data but those are the ones I could think of quickly.

>
> Thanks much!
> Joelle
> >
>
No problem

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

--
MySQL Windows Mailing List
For list archives: http://lists.mysql.com/win32
To unsubscribe: http://lists.mysql.com/win32?unsub=gcdmw-win32@m.gmane.org

RE: table type and server load

am 11.04.2006 22:33:08 von Leif Johnston

I would agree that this is close to a static project and the db requirements
at this point are way overkill...

-----Original Message-----
From: Shawn Green [mailto:sgreen_mysql@yahoo.com]
Sent: Tuesday, April 11, 2006 3:58 PM
To: Joelle Tegwen; MySQL - Win32
Cc: sgreen_mysql@yahoo.com
Subject: Re: table type and server load

.... some snipping and more mingling ...

--- Joelle Tegwen wrote:

> more intermingling...
>
> Shawn Green wrote:

> >
> > How relational a set of data is shouldn't affect your choice of
> which
> > engine (unless you NEED foreign key constraints. I that case you
> must
> > use InnoDB). MyISAM is the fastest engine for mostly-read type
> usage.
> >
> How do I know if I NEED foreign key constraints?

You need them only if you want the database to verify that the values
in certain columns already exist in other tables (eg: verify that the
lookup_id of one record is actually a valid value because it really
does exist in the lookup table). The way you describe your situation,
you can probably do that validation in your code for as few records
that you may need to add or change.

> > Schemas are generally all on the same server. You do not need to
> > federate tables in order to perform cross-database queries. You
> only
> > need FEDERATED tables if you want to do cross-server queries.
> >
> Sorry, I'm still stuck in Access mode I guess :) I get it now.
>
> So separating out the lookup tables from the data tables is just an
> architectural choice. How do I decide whether or not to separate them
> or
> make one big schema for the whole thing (or something else entirely)?
>
> Maybe this is beyond the scope of this list?

Maybe the details are beyond the scope but the theory isn't. Some good
reasons to put data common to several sets of other data in it's own
container (database):

a) security - you can grant read-only permissions to most of the data
access accounts to your lookup tables. That way you don't possibly get
five dozen people trying to muck with the most important lists.
Contrary to what several managers might think: Access Control is a good
thing.

b) organization - If it feels "natural" to organize the data in a
hub-and-spoke pattern, do it. It's a form of normalization to eliminate
duplicate tables of information in separate databases. Combining them
into a centralized database is logical under many circumstances.

c) memory optimization - you could preload that database into memory
(or operate that database off of a RAM disk, if you have the space).
Because so many other queries need to get data from this one central
location, you can get some serious throughput boosts by improving the
storage of these central facts.

There are probably a dozen other reasons to split or not to split your
data but those are the ones I could think of quickly.

>
> Thanks much!
> Joelle
> >
>
No problem

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

--
MySQL Windows Mailing List
For list archives: http://lists.mysql.com/win32
To unsubscribe:
http://lists.mysql.com/win32?unsub=leif@technologycatalyst.c om

--
MySQL Windows Mailing List
For list archives: http://lists.mysql.com/win32
To unsubscribe: http://lists.mysql.com/win32?unsub=gcdmw-win32@m.gmane.org