Importing large databases faster

Importing large databases faster

am 16.12.2009 21:56:00 von Madison Kelly

Hi all,

I've got a fairly large set of databases I'm backing up each Friday. The
dump takes about 12.5h to finish, generating a ~172 GB file. When I try
to load it though, *after* manually dumping the old databases, it takes
1.5~2 days to load the same databases. I am guessing this is, at least
in part, due to indexing.

My question is; Given an empty target DB and a dump file generated via:

ssh root@server "mysqldump --all-databases -psecret" > /path/to/backup.sql

How can I go about efficiently loading it into a new database?
Specifically, can I disable triggers, indexes and what not until after
load finishes? I can only imagine that a single "ok, go create your
indexes now" at the end would be faster. Perhaps some way to hold off
commits from happening as often? The target server has 32Gb of RAM, so I
suspect I should be able to hold things in memory and commit to disk
relatively rarely.

I am currently loading via this command:

mysql -psecret < /path/to/backup.sql

The source and destination MySQL versions are:

Source:
mysql Ver 14.13 Distrib 5.1.19-beta, for unknown-linux-gnu (x86_64)
using readline 5.0

Dest:
mysql Ver 14.12 Distrib 5.0.77, for redhat-linux-gnu (x86_64) using
readline 5.1

The reason for the discrepancy is that the old server was setup from
source on CentOS 4.x by a previous tech and the destination server is
the stock version from CentOS 5.x. The source server will be phased out
soon, so no real attempt at maintaining matching versions was done.

Thanks!

Madi

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org

RE: Importing large databases faster

am 16.12.2009 22:50:56 von Gavin Towey

VGhlcmUgYXJlIHNjcmlwdHMgb3V0IHRoZXJlIHN1Y2ggYXQgdGhlIE1hYXRr aXQgbWstcGFyYWxs
ZWwtZHVtcC9yZXN0b3JlIHRoYXQgY2FuIHNwZWVkIHVwIHRoaXMgcHJvY2Vz cyBieSBydW5uaW5n
IGluIHBhcmFsbGVsLg0KDQpIb3dldmVyIGlmIHlvdSdyZSBkb2luZyB0aGlz IGV2ZXJ5IHdlZWsg
b24gdGhhdCBsYXJnZSBvZiBhIGRhdGFzZXQsIEknZCBqdXN0IHVzZSBmaWxl c3lzdGVtIHNuYXBz
aG90cy4gIFlvdSdyZSBiYWNrdXAvcmVzdG9yZSB3b3VsZCB0aGVuIG9ubHkg dGFrZSBhcyBsb25n
IGFzIGl0IHRha2VzIGZvciB5b3UgdG8gc2NwIHRoZSBkYXRhYmFzZSBmcm9t IG9uZSBtYWNoaW5l
IHRvIGFub3RoZXIuDQoNClJlZ2FyZHMsDQpHYXZpbiBUb3dleQ0KDQoNCi0t LS0tT3JpZ2luYWwg
TWVzc2FnZS0tLS0tDQpGcm9tOiBNYWRpc29uIEtlbGx5IFttYWlsdG86bGlu dXhAYWx0ZWV2ZS5j
b21dDQpTZW50OiBXZWRuZXNkYXksIERlY2VtYmVyIDE2LCAyMDA5IDEyOjU2 IFBNDQpUbzogbXlz
cWxAbGlzdHMubXlzcWwuY29tDQpTdWJqZWN0OiBJbXBvcnRpbmcgbGFyZ2Ug ZGF0YWJhc2VzIGZh
c3Rlcg0KDQpIaSBhbGwsDQoNCkkndmUgZ290IGEgZmFpcmx5IGxhcmdlIHNl dCBvZiBkYXRhYmFz
ZXMgSSdtIGJhY2tpbmcgdXAgZWFjaCBGcmlkYXkuIFRoZQ0KZHVtcCB0YWtl cyBhYm91dCAxMi41
aCB0byBmaW5pc2gsIGdlbmVyYXRpbmcgYSB+MTcyIEdCIGZpbGUuIFdoZW4g SSB0cnkNCnRvIGxv
YWQgaXQgdGhvdWdoLCAqYWZ0ZXIqIG1hbnVhbGx5IGR1bXBpbmcgdGhlIG9s ZCBkYXRhYmFzZXMs
IGl0IHRha2VzDQoxLjV+MiBkYXlzIHRvIGxvYWQgdGhlIHNhbWUgZGF0YWJh c2VzLiBJIGFtIGd1
ZXNzaW5nIHRoaXMgaXMsIGF0IGxlYXN0DQppbiBwYXJ0LCBkdWUgdG8gaW5k ZXhpbmcuDQoNCk15
IHF1ZXN0aW9uIGlzOyBHaXZlbiBhbiBlbXB0eSB0YXJnZXQgREIgYW5kIGEg ZHVtcCBmaWxlIGdl
bmVyYXRlZCB2aWE6DQoNCnNzaCByb290QHNlcnZlciAibXlzcWxkdW1wIC0t YWxsLWRhdGFiYXNl
cyAtcHNlY3JldCIgPiAvcGF0aC90by9iYWNrdXAuc3FsDQoNCkhvdyBjYW4g SSBnbyBhYm91dCBl
ZmZpY2llbnRseSBsb2FkaW5nIGl0IGludG8gYSBuZXcgZGF0YWJhc2U/DQpT cGVjaWZpY2FsbHks
IGNhbiBJIGRpc2FibGUgdHJpZ2dlcnMsIGluZGV4ZXMgYW5kIHdoYXQgbm90 IHVudGlsIGFmdGVy
DQpsb2FkIGZpbmlzaGVzPyBJIGNhbiBvbmx5IGltYWdpbmUgdGhhdCBhIHNp bmdsZSAib2ssIGdv
IGNyZWF0ZSB5b3VyDQppbmRleGVzIG5vdyIgYXQgdGhlIGVuZCB3b3VsZCBi ZSBmYXN0ZXIuIFBl
cmhhcHMgc29tZSB3YXkgdG8gaG9sZCBvZmYNCmNvbW1pdHMgZnJvbSBoYXBw ZW5pbmcgYXMgb2Z0
ZW4/IFRoZSB0YXJnZXQgc2VydmVyIGhhcyAzMkdiIG9mIFJBTSwgc28gSQ0K c3VzcGVjdCBJIHNo
b3VsZCBiZSBhYmxlIHRvIGhvbGQgdGhpbmdzIGluIG1lbW9yeSBhbmQgY29t bWl0IHRvIGRpc2sN
CnJlbGF0aXZlbHkgcmFyZWx5Lg0KDQpJIGFtIGN1cnJlbnRseSBsb2FkaW5n IHZpYSB0aGlzIGNv
bW1hbmQ6DQoNCm15c3FsIC1wc2VjcmV0IDwgL3BhdGgvdG8vYmFja3VwLnNx bA0KDQpUaGUgc291
cmNlIGFuZCBkZXN0aW5hdGlvbiBNeVNRTCB2ZXJzaW9ucyBhcmU6DQoNClNv dXJjZToNCm15c3Fs
ICBWZXIgMTQuMTMgRGlzdHJpYiA1LjEuMTktYmV0YSwgZm9yIHVua25vd24t bGludXgtZ251ICh4
ODZfNjQpDQp1c2luZyByZWFkbGluZSA1LjANCg0KRGVzdDoNCm15c3FsICBW ZXIgMTQuMTIgRGlz
dHJpYiA1LjAuNzcsIGZvciByZWRoYXQtbGludXgtZ251ICh4ODZfNjQpIHVz aW5nDQpyZWFkbGlu
ZSA1LjENCg0KVGhlIHJlYXNvbiBmb3IgdGhlIGRpc2NyZXBhbmN5IGlzIHRo YXQgdGhlIG9sZCBz
ZXJ2ZXIgd2FzIHNldHVwIGZyb20NCnNvdXJjZSBvbiBDZW50T1MgNC54IGJ5 IGEgcHJldmlvdXMg
dGVjaCBhbmQgdGhlIGRlc3RpbmF0aW9uIHNlcnZlciBpcw0KdGhlIHN0b2Nr IHZlcnNpb24gZnJv
bSBDZW50T1MgNS54LiBUaGUgc291cmNlIHNlcnZlciB3aWxsIGJlIHBoYXNl ZCBvdXQNCnNvb24s
IHNvIG5vIHJlYWwgYXR0ZW1wdCBhdCBtYWludGFpbmluZyBtYXRjaGluZyB2 ZXJzaW9ucyB3YXMg
ZG9uZS4NCg0KVGhhbmtzIQ0KDQpNYWRpDQoNCi0tDQpNeVNRTCBHZW5lcmFs IE1haWxpbmcgTGlz
dA0KRm9yIGxpc3QgYXJjaGl2ZXM6IGh0dHA6Ly9saXN0cy5teXNxbC5jb20v bXlzcWwNClRvIHVu
c3Vic2NyaWJlOiAgICBodHRwOi8vbGlzdHMubXlzcWwuY29tL215c3FsP3Vu c3ViPWd0b3dleUBm
Zm4uY29tDQoNCg0KVGhpcyBtZXNzYWdlIGNvbnRhaW5zIGNvbmZpZGVudGlh bCBpbmZvcm1hdGlv
biBhbmQgaXMgaW50ZW5kZWQgb25seSBmb3IgdGhlIGluZGl2aWR1YWwgbmFt ZWQuICBJZiB5b3Ug
YXJlIG5vdCB0aGUgbmFtZWQgYWRkcmVzc2VlLCB5b3UgYXJlIG5vdGlmaWVk IHRoYXQgcmV2aWV3
aW5nLCBkaXNzZW1pbmF0aW5nLCBkaXNjbG9zaW5nLCBjb3B5aW5nIG9yIGRp c3RyaWJ1dGluZyB0
aGlzIGUtbWFpbCBpcyBzdHJpY3RseSBwcm9oaWJpdGVkLiAgUGxlYXNlIG5v dGlmeSB0aGUgc2Vu
ZGVyIGltbWVkaWF0ZWx5IGJ5IGUtbWFpbCBpZiB5b3UgaGF2ZSByZWNlaXZl ZCB0aGlzIGUtbWFp
bCBieSBtaXN0YWtlIGFuZCBkZWxldGUgdGhpcyBlLW1haWwgZnJvbSB5b3Vy IHN5c3RlbS4gRS1t
YWlsIHRyYW5zbWlzc2lvbiBjYW5ub3QgYmUgZ3VhcmFudGVlZCB0byBiZSBz ZWN1cmUgb3IgZXJy
b3ItZnJlZSBhcyBpbmZvcm1hdGlvbiBjb3VsZCBiZSBpbnRlcmNlcHRlZCwg Y29ycnVwdGVkLCBs
b3N0LCBkZXN0cm95ZWQsIGFycml2ZSBsYXRlIG9yIGluY29tcGxldGUsIG9y IGNvbnRhaW4gdmly
dXNlcy4gVGhlIHNlbmRlciB0aGVyZWZvcmUgZG9lcyBub3QgYWNjZXB0IGxp YWJpbGl0eSBmb3Ig
YW55IGxvc3Mgb3IgZGFtYWdlIGNhdXNlZCBieSB2aXJ1c2VzIG9yIGVycm9y cyBvciBvbWlzc2lv
bnMgaW4gdGhlIGNvbnRlbnRzIG9mIHRoaXMgbWVzc2FnZSwgd2hpY2ggYXJp c2UgYXMgYSByZXN1
bHQgb2YgZS1tYWlsIHRyYW5zbWlzc2lvbi4gW0ZyaWVuZEZpbmRlciBOZXR3 b3JrcywgSW5jLiwg
MjIwIEh1bWJvbHQgY291cnQsIFN1bm55dmFsZSwgQ0EgOTQwODksIFVTQSwg RnJpZW5kRmluZGVy
LmNvbQ0K

Re: Importing large databases faster

am 17.12.2009 01:34:53 von Madison Kelly

Gavin Towey wrote:
> There are scripts out there such at the Maatkit mk-parallel-dump/restore that can speed up this process by running in parallel.
>
> However if you're doing this every week on that large of a dataset, I'd just use filesystem snapshots. You're backup/restore would then only take as long as it takes for you to scp the database from one machine to another.
>
> Regards,
> Gavin Towey

Thanks! Will the Maatkit script work on a simple --all-databases dump?

As for the copy, it's a temporary thing. This is just being done weekly
while we test out the new server. Once it's live, the new server will
indeed be backed up via LVM snapshots. :)

Madi

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org

RE: Importing large databases faster

am 17.12.2009 02:01:18 von Gavin Towey

SSBkb24ndCB0aGluayBzbywgSSdtIHByZXR0eSBzdXJlIHlvdSBoYXZlIHRv IHVzZSBtay1wYXJh
bGxlbC1kdW1wIHRvIGdldCB0aGUgZGF0YSBpbiBhIGZvcm1hdCBpdCB3YW50 cy4gIFRoZSBkb2Nz
IGFyZSBvbmxpbmUgdGhvdWdoLg0KDQpSZWdhcmRzLA0KR2F2aW4gVG93ZXkN Cg0KLS0tLS1Pcmln
aW5hbCBNZXNzYWdlLS0tLS0NCkZyb206IE1hZGlzb24gS2VsbHkgW21haWx0 bzpsaW51eEBhbHRl
ZXZlLmNvbV0NClNlbnQ6IFdlZG5lc2RheSwgRGVjZW1iZXIgMTYsIDIwMDkg NDozNSBQTQ0KVG86
IEdhdmluIFRvd2V5DQpDYzogbXlzcWxAbGlzdHMubXlzcWwuY29tDQpTdWJq ZWN0OiBSZTogSW1w
b3J0aW5nIGxhcmdlIGRhdGFiYXNlcyBmYXN0ZXINCg0KR2F2aW4gVG93ZXkg d3JvdGU6DQo+IFRo
ZXJlIGFyZSBzY3JpcHRzIG91dCB0aGVyZSBzdWNoIGF0IHRoZSBNYWF0a2l0 IG1rLXBhcmFsbGVs
LWR1bXAvcmVzdG9yZSB0aGF0IGNhbiBzcGVlZCB1cCB0aGlzIHByb2Nlc3Mg YnkgcnVubmluZyBp
biBwYXJhbGxlbC4NCj4NCj4gSG93ZXZlciBpZiB5b3UncmUgZG9pbmcgdGhp cyBldmVyeSB3ZWVr
IG9uIHRoYXQgbGFyZ2Ugb2YgYSBkYXRhc2V0LCBJJ2QganVzdCB1c2UgZmls ZXN5c3RlbSBzbmFw
c2hvdHMuICBZb3UncmUgYmFja3VwL3Jlc3RvcmUgd291bGQgdGhlbiBvbmx5 IHRha2UgYXMgbG9u
ZyBhcyBpdCB0YWtlcyBmb3IgeW91IHRvIHNjcCB0aGUgZGF0YWJhc2UgZnJv bSBvbmUgbWFjaGlu
ZSB0byBhbm90aGVyLg0KPg0KPiBSZWdhcmRzLA0KPiBHYXZpbiBUb3dleQ0K DQpUaGFua3MhIFdp
bGwgdGhlIE1hYXRraXQgc2NyaXB0IHdvcmsgb24gYSBzaW1wbGUgLS1hbGwt ZGF0YWJhc2VzIGR1
bXA/DQoNCkFzIGZvciB0aGUgY29weSwgaXQncyBhIHRlbXBvcmFyeSB0aGlu Zy4gVGhpcyBpcyBq
dXN0IGJlaW5nIGRvbmUgd2Vla2x5DQp3aGlsZSB3ZSB0ZXN0IG91dCB0aGUg bmV3IHNlcnZlci4g
T25jZSBpdCdzIGxpdmUsIHRoZSBuZXcgc2VydmVyIHdpbGwNCmluZGVlZCBi ZSBiYWNrZWQgdXAg
dmlhIExWTSBzbmFwc2hvdHMuIDopDQoNCk1hZGkNCg0KLS0NCk15U1FMIEdl bmVyYWwgTWFpbGlu
ZyBMaXN0DQpGb3IgbGlzdCBhcmNoaXZlczogaHR0cDovL2xpc3RzLm15c3Fs LmNvbS9teXNxbA0K
VG8gdW5zdWJzY3JpYmU6ICAgIGh0dHA6Ly9saXN0cy5teXNxbC5jb20vbXlz cWw/dW5zdWI9Z3Rv
d2V5QGZmbi5jb20NCg0KDQpUaGlzIG1lc3NhZ2UgY29udGFpbnMgY29uZmlk ZW50aWFsIGluZm9y
bWF0aW9uIGFuZCBpcyBpbnRlbmRlZCBvbmx5IGZvciB0aGUgaW5kaXZpZHVh bCBuYW1lZC4gIElm
IHlvdSBhcmUgbm90IHRoZSBuYW1lZCBhZGRyZXNzZWUsIHlvdSBhcmUgbm90 aWZpZWQgdGhhdCBy
ZXZpZXdpbmcsIGRpc3NlbWluYXRpbmcsIGRpc2Nsb3NpbmcsIGNvcHlpbmcg b3IgZGlzdHJpYnV0
aW5nIHRoaXMgZS1tYWlsIGlzIHN0cmljdGx5IHByb2hpYml0ZWQuICBQbGVh c2Ugbm90aWZ5IHRo
ZSBzZW5kZXIgaW1tZWRpYXRlbHkgYnkgZS1tYWlsIGlmIHlvdSBoYXZlIHJl Y2VpdmVkIHRoaXMg
ZS1tYWlsIGJ5IG1pc3Rha2UgYW5kIGRlbGV0ZSB0aGlzIGUtbWFpbCBmcm9t IHlvdXIgc3lzdGVt
LiBFLW1haWwgdHJhbnNtaXNzaW9uIGNhbm5vdCBiZSBndWFyYW50ZWVkIHRv IGJlIHNlY3VyZSBv
ciBlcnJvci1mcmVlIGFzIGluZm9ybWF0aW9uIGNvdWxkIGJlIGludGVyY2Vw dGVkLCBjb3JydXB0
ZWQsIGxvc3QsIGRlc3Ryb3llZCwgYXJyaXZlIGxhdGUgb3IgaW5jb21wbGV0 ZSwgb3IgY29udGFp
biB2aXJ1c2VzLiBUaGUgc2VuZGVyIHRoZXJlZm9yZSBkb2VzIG5vdCBhY2Nl cHQgbGlhYmlsaXR5
IGZvciBhbnkgbG9zcyBvciBkYW1hZ2UgY2F1c2VkIGJ5IHZpcnVzZXMgb3Ig ZXJyb3JzIG9yIG9t
aXNzaW9ucyBpbiB0aGUgY29udGVudHMgb2YgdGhpcyBtZXNzYWdlLCB3aGlj aCBhcmlzZSBhcyBh
IHJlc3VsdCBvZiBlLW1haWwgdHJhbnNtaXNzaW9uLiBbRnJpZW5kRmluZGVy IE5ldHdvcmtzLCBJ
bmMuLCAyMjAgSHVtYm9sdCBjb3VydCwgU3Vubnl2YWxlLCBDQSA5NDA4OSwg VVNBLCBGcmllbmRG
aW5kZXIuY29tDQo=

Re: Importing large databases faster

am 17.12.2009 07:09:22 von Shawn Green

Madison Kelly wrote:
> Hi all,
>
> I've got a fairly large set of databases I'm backing up each Friday. The
> dump takes about 12.5h to finish, generating a ~172 GB file. When I try
> to load it though, *after* manually dumping the old databases, it takes
> 1.5~2 days to load the same databases. I am guessing this is, at least
> in part, due to indexing.
>
> My question is; Given an empty target DB and a dump file generated via:
>
> ssh root@server "mysqldump --all-databases -psecret" > /path/to/backup.sql
>
> How can I go about efficiently loading it into a new database?
> Specifically, can I disable triggers, indexes and what not until after
> load finishes? I can only imagine that a single "ok, go create your
> indexes now" at the end would be faster. Perhaps some way to hold off
> commits from happening as often? The target server has 32Gb of RAM, so I
> suspect I should be able to hold things in memory and commit to disk
> relatively rarely.
>
> I am currently loading via this command:
>
> mysql -psecret < /path/to/backup.sql
>

For that kind of dump, that kind of restore is what you get. Your
current dump is generating GB of INSERT statements that need to be
parsed then processed.

To get a faster restore, use a different sort of dump. I suggest you
compare your current process to one that uses the --tab option of
mysqldump (to save the data) then uses LOAD DATA INFILE ... to
repopulate your server. This is the fastest known method to populate a
table other than a direct file copy.
--
Shawn Green, MySQL Senior Support Engineer
Sun Microsystems, Inc.
Office: Blountville, TN



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org

Re: Importing large databases faster

am 17.12.2009 10:59:07 von Jay Ess

Madison Kelly wrote:
> Hi all,
>
> I've got a fairly large set of databases I'm backing up each Friday.
> The dump takes about 12.5h to finish, generating a ~172 GB file. When
> I try to load it though, *after* manually dumping the old databases,
> it takes 1.5~2 days to load the same databases. I am guessing this is,
> at least in part, due to indexing.
>
> My question is; Given an empty target DB and a dump file generated via:
>
> ssh root@server "mysqldump --all-databases -psecret" >
> /path/to/backup.sql
I use the "-e -v -f -q -Q -K" parameters for the mysqldump on large
tables/databases. It does what you are asking for. Disables the key
generation until all of the data is inserted. It also uses multi insert
statements and not individual insert statement for every row which
speeds up things considerable.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org

Re: Importing large databases faster

am 17.12.2009 16:46:49 von mos

At 03:59 AM 12/17/2009, you wrote:
>Madison Kelly wrote:
>>Hi all,
>>
>>I've got a fairly large set of databases I'm backing up each Friday. The
>>dump takes about 12.5h to finish, generating a ~172 GB file. When I try
>>to load it though, *after* manually dumping the old databases, it takes
>>1.5~2 days to load the same databases. I am guessing this is, at least in
>>part, due to indexing.
>>
>>My question is; Given an empty target DB and a dump file generated via:
>>
>>ssh root@server "mysqldump --all-databases -psecret" > /path/to/backup.sql
>I use the "-e -v -f -q -Q -K" parameters for the mysqldump on large
>tables/databases. It does what you are asking for. Disables the key
>generation until all of the data is inserted. It also uses multi insert
>statements and not individual insert statement for every row which speeds
>up things considerable.


"Load Data ..." is still going to be much faster.

Mike


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org

Re: Importing large databases faster

am 18.12.2009 11:27:08 von Brent Clark

On 17/12/2009 17:46, mos wrote:
> "Load Data ..." is still going to be much faster.
>
> Mike
>

Hiya

If you using on Linux and using LVM, look at mylvmbackup.

HTH

Brent Clark

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org