Performance impact of large number of columns
on 04.08.2009 23:05:10 by Jeremy Jongsma
I have a table in a stock analysis database that has 284 columns. All
columns are int, double, or datetime, except the primary key
(varchar(12)). By my calculation this makes each row 1779 bytes.
CREATE TABLE technicals (
  symbol varchar(12) primary key,
  last double,
  open double,
  high double,
  -- ... etc ...
  ma_5day double,
  ma_20day double
  -- ... etc ...
);
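As a rough sanity check on scale (using only the figures stated above, and ignoring per-row and page overhead), the whole table is on the order of 100 MB:

```sql
-- Back-of-the-envelope scan volume from the numbers in the post:
-- 60,000 rows * 1,779 bytes/row = 106,740,000 bytes (~102 MiB),
-- small enough to sit entirely in the OS cache on typical hardware,
-- so even repeated full scans read a modest amount of data.
SELECT 60000 * 1779 AS approx_table_bytes;  -- 106740000
```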
I've been looking into splitting it into multiple tables, e.g.:
-- Basic info
CREATE TABLE technicals_basic (
  symbol varchar(12) primary key,
  last double,
  open double,
  high double
);
-- Moving averages
CREATE TABLE technicals_ma (
  symbol varchar(12),
  ma_5day double,
  ma_20day double,
  FOREIGN KEY (symbol) REFERENCES technicals_basic(symbol)
);
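Under a split like this, one of the 2-3-join queries mentioned below would look roughly as follows (column names are taken from the sketch above; actual performance depends on the engine and available indexes):

```sql
-- Hypothetical query after the split: sort on a moving-average column
-- from one table while returning quote columns from another.
SELECT b.symbol, b.last, m.ma_20day
FROM technicals_basic b
JOIN technicals_ma m ON m.symbol = b.symbol
ORDER BY m.ma_20day DESC
LIMIT 20;
```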
Will splitting this into multiple tables provide any performance
benefit? Queries will never ask for all 284 columns, so under the new
system, even if split into 20 tables, there would be a maximum of 2-3
joins per query. The question is whether the row size reduction
provides enough benefit to introduce table joins.
I need to be able to search and sort by ANY of the 284 fields at times,
so my concern is that the current large row size will affect the speed
of sequential scans of the table (60,000 rows). Is that worry
justified?
-j
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org
Re: Performance impact of large number of columns
on 05.08.2009 23:26:08 by Jeremy Jongsma
The problem is that I need to search/sort by ANY of the 284 fields at
times - 284 indexes is a bit silly, so there will be a lot of sequential
scans (table has 60,000 rows). Given that criteria, will fewer columns
in more tables provide a performance benefit?
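A possible middle ground between 284 indexes and none is to index only the columns that actually show up in frequent WHERE/ORDER BY clauses. Which columns those are isn't stated in the thread, so the choices here are purely hypothetical:

```sql
-- Hypothetical: index a handful of hot columns instead of all 284,
-- letting the rarely-used fields fall back to sequential scans.
CREATE INDEX idx_ma_20day ON technicals (ma_20day);
CREATE INDEX idx_last ON technicals (last);
```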
-j
On Tue, 2009-08-04 at 16:03 -0700, Howard Hart wrote:
> If your key is good, your limitation is really in searching through the
> physical index file, not the data files which contain the actual
> records. Result is you're only limited by the disk seek time to read
> through the index files to get the physical (offset) location of the
> record in the data file. Whether that row holds 10 fields or 1000
> fields shouldn't make much difference.
>
> Searching through multiple index files after splitting using complex
> joins may actually drop your performance by an order of magnitude. As
> usual, YMMV depending on your database engine, schema, indexing
> efficiency, etc....
>
> Howard Hart
> Ooma
>