Possible tricks to ALTER on huge tables?

Possible tricks to ALTER on huge tables?

am 06.08.2010 02:07:51 von Daevid Vincent

Driving to work today, I had an epiphany thought, but wanted to see if
anyone could prove my theory or not.

We currently have some tables that are approaching 1 BILLION rows (real
Billion, with nine zeros, not that silly six zero version). Trying to do an
"ALTER" on them to add a column can sometimes take hours.

I'm wondering if we had the foresight to create the tables, and then tack
on extra "dormant" columns of various common types, such as:

future_uint int(10) unsigned null,
future_int int(10) signed null,
future_var varchar(255) null,
Etc.

So basically they'd be unused, then when we wanted a "new" column of that
type, we would just "rename" the dormant one.

I'm not sure if mySQL is smart enough to realize that if the schema
definition for a column is identical, then it's just a simple rename, or if
it treats any change the same and will still take hours to complete (if so,
perhaps there's an optimization for you mysql developer team)

Another option I considered, was is it possible to just go in with a hex
editor and rename the field in the .frm file? Is there some kind of .frm
editor available anywhere?

root@mypse:/var/lib/mysql/agis_core# hexdump -C country.frm
00001000 01 01 00 00 0a 00 00 00 02 00 01 00 00 00 01 80
|................|
00001010 02 00 00 12 00 02 00 ff 50 52 49 4d 41 52 59 ff
|........PRIMARY.|
00001020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
|................|
*
000013f0 00 00 00 00 00 00 00 00 00 00 00 00 06 00 49 6e
|..............In|
00001400 6e 6f 44 42 00 00 00 00 00 00 00 00 00 00 00 00
|noDB............|
00001410 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
|................|
*
00002150 04 00 0b 69 64 5f 63 6f 75 6e 74 72 79 00 05 00
|...id_country...|
00002160 05 69 73 6f 32 00 06 00 05 69 73 6f 33 00 07 00
|.iso2....iso3...|
00002170 0d 63 6f 75 6e 74 72 79 5f 6e 61 6d 65 00 04 0b
|.country_name...|
00002180 05 05 00 02 00 00 12 00 0f 00 00 02 c0 00 00 05
|................|
00002190 05 06 06 00 04 00 00 00 80 00 00 00 fe c0 00 00
|................|
000021a0 06 05 09 09 00 0a 00 00 00 80 00 00 00 fe c0 00
|................|
000021b0 00 07 0d 42 fd 02 13 00 00 00 00 00 00 00 0f c0
|...B............|
000021c0 00 00 ff 69 64 5f 63 6f 75 6e 74 72 79 ff 69 73
|...future_var...|
000021d0 6f 32 ff 69 73 6f 33 ff 63 6f 75 6e 74 72 79 5f
|...future_int...|


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org

Re: Possible tricks to ALTER on huge tables?

am 06.08.2010 02:10:23 von Buford Tannen

Daevid Vincent wrote:
> I'm wondering if we had the foresight to create the tables, and then tack
> on extra "dormant" columns of various common types, such as:

Nothing beats empirical evidence. Why don't you try it and find
out (and report back)!

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org

Re: Possible tricks to ALTER on huge tables?

am 06.08.2010 03:04:41 von Rob Wultsch

On Thu, Aug 5, 2010 at 5:07 PM, Daevid Vincent wrote:
> Driving to work today, I had an epiphany thought, but wanted to see if
> anyone could prove my theory or not.
>
> We currently have some tables that are approaching 1 BILLION rows (real
> Billion, with nine zeros, not that silly six zero version). Trying to do =
an
> "ALTER" on them to add a column can sometimes take hours.
>
> I'm wondering if we had the foresight to create the tables, and then tack
> on extra "dormant" columns of various common types, such as:
>
> future_uint int(10) unsigned null,
> future_int int(10) signed null,
> future_var varchar(255) null,
> Etc.
>
> So basically they'd be unused, then when we wanted a "new" column of that
> type, we would just "rename" the dormant one.
>
> I'm not sure if mySQL is smart enough to realize that if the schema
> definition for a column is identical, then it's just a simple rename, or =
if
> it treats any change the same and will still take hours to complete (if s=
o,
> perhaps there's an optimization for you mysql developer team)
>
> Another option I considered, was is it possible to just go in with a hex
> editor and rename the field in the .frm file? Is there some kind of .frm
> editor available anywhere?
>
> root@mypse:/var/lib/mysql/agis_core# hexdump -C country.frm
> 00001000 =A001 01 00 00 0a 00 00 00 =A002 00 01 00 00 00 01 80
> |................|
> 00001010 =A002 00 00 12 00 02 00 ff =A050 52 49 4d 41 52 59 ff
> |........PRIMARY.|
> 00001020 =A000 00 00 00 00 00 00 00 =A000 00 00 00 00 00 00 00
> |................|
> *
> 000013f0 =A000 00 00 00 00 00 00 00 =A000 00 00 00 06 00 49 6e
> |..............In|
> 00001400 =A06e 6f 44 42 00 00 00 00 =A000 00 00 00 00 00 00 00
> |noDB............|
> 00001410 =A000 00 00 00 00 00 00 00 =A000 00 00 00 00 00 00 00
> |................|
> *
> 00002150 =A004 00 0b 69 64 5f 63 6f =A075 6e 74 72 79 00 05 00
> |...id_country...|
> 00002160 =A005 69 73 6f 32 00 06 00 =A005 69 73 6f 33 00 07 00
> |.iso2....iso3...|
> 00002170 =A00d 63 6f 75 6e 74 72 79 =A05f 6e 61 6d 65 00 04 0b
> |.country_name...|
> 00002180 =A005 05 00 02 00 00 12 00 =A00f 00 00 02 c0 00 00 05
> |................|
> 00002190 =A005 06 06 00 04 00 00 00 =A080 00 00 00 fe c0 00 00
> |................|
> 000021a0 =A006 05 09 09 00 0a 00 00 =A000 80 00 00 00 fe c0 00
> |................|
> 000021b0 =A000 07 0d 42 fd 02 13 00 =A000 00 00 00 00 00 0f c0
> |...B............|
> 000021c0 =A000 00 ff 69 64 5f 63 6f =A075 6e 74 72 79 ff 69 73
> |...future_var...|
> 000021d0 =A06f 32 ff 69 73 6f 33 ff =A063 6f 75 6e 74 72 79 5f
> |...future_int...|

Having significant amount of overhead for unused columns will without
doubt harm performance significantly for certain operations.

Altering .frm files should is always be tried on a non-prod box before
even considering using it on prod. There are some well known use cases
(adding enums values, enlarging varchar columns) where altering a .frm
is useful, but it should always be considered very dangerous.

You could of course consider using PostgreSQL which would only need a
very brief exclusive lock for adding a default null column...

--=20
Rob Wultsch
wultsch@gmail.com

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=3Dgcdmg-mysql-2@m.gmane.o rg

Re: Possible tricks to ALTER on huge tables?

am 06.08.2010 09:47:01 von Jigal van Hemert

Daevid Vincent wrote:
> We currently have some tables that are approaching 1 BILLION rows (real
> Billion, with nine zeros, not that silly six zero version). Trying to do an
> "ALTER" on them to add a column can sometimes take hours.

A few years ago I have tested possible table structures for an
application which had to store at least a million profiles of persons.
Because we expected that properties would be added (and/or removed) from
the database quite often I also tested a structure where the properties
of a single profile were stored in tables based on the data type.

So we had tables with integers, strings, dates, etc. and used a record
for each property; columns were like: id, property name, value, and a
few other relevant things to handle and display the data.

Most select queries were about as fast as they would be with a single
table. Database size was approximately the same because not all profiles
used all properties, so we only needed to store the properties a certain
profile would use.
The only limitation at that time was 31 joins, but I don't think we've
ever hit that limit.

Adding properties was easy, just adding them to the configuration of the
application was enough.

It really depends on the situation of your application which table
structure is the most suitable. Test the performance of all kinds of
operations you need to do with realistic data and various amounts of
data to see how it scales.

--
Jigal van Hemert

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org

Re: Possible tricks to ALTER on huge tables?

am 06.08.2010 17:55:44 von Andy Wallace

I had to do this trick with a few million rows in the table, and what
I did was to create a new table with the required structure, then did
"insert into select from", starting with the newest data first, cause
that made sense for my application. Then, renamed the old table and the
new.

YMMV
andu

Jigal van Hemert wrote:
> Daevid Vincent wrote:
>> We currently have some tables that are approaching 1 BILLION rows (real
>> Billion, with nine zeros, not that silly six zero version). Trying to
>> do an
>> "ALTER" on them to add a column can sometimes take hours.
>

--
Andy Wallace
iHOUSEweb, Inc.
awallace@ihouseweb.com
(866) 645-7700 ext 219
--
There are two ways to build software:
Make it so simple that there are obviously no bugs,
or make it so complex that there are no obvious bugs.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org

RE: Possible tricks to ALTER on huge tables?

am 06.08.2010 19:56:19 von Daevid Vincent

> -----Original Message-----
> From: Rob Wultsch [mailto:wultsch@gmail.com]
> Sent: Thursday, August 05, 2010 6:05 PM
> To: Daevid Vincent
> Cc: MySQL List
> Subject: Re: Possible tricks to ALTER on huge tables?
>
> Having significant amount of overhead for unused columns will without
> doubt harm performance significantly for certain operations.

I don't see this as "significant overhead". A few "null" columns that are
dormant are not going to impact performance in any measurable way I don't
think.

> Altering .frm files should is always be tried on a non-prod box before
> even considering using it on prod.

Thanks Capt'n Obvious. ;-p

Isn't that the case for ANY and ALL DB operations?

> There are some well known use cases
> (adding enums values, enlarging varchar columns) where altering a .frm
> is useful, but it should always be considered very dangerous.

Of course.

> You could of course consider using PostgreSQL which would only need a
> very brief exclusive lock for adding a default null column...

....an interesting thought.


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org