Speeding up series of complex INSERTS

on 13.05.2005 17:27:26 by KEVIN ZEMBOWER

This is my first experience working with a large database. I'm trying to insert over 320,000 records of a bibliographic database of medical articles. If anyone would like to test my system, the program is available at http://www.jhuccp.org/temp/KZ/pop2sql.pl, the SQL statements to set up the database are at http://www.jhuccp.org/temp/KZ/popline.sql, and a file consisting of a header record and 1000 data records is at http://www.jhuccp.org/temp/KZ/popline.dmp.gz (it must be decompressed before use). The program is run by specifying the name of the data file to be processed on the command line, such as "./pop2sql.pl popline.dmp". The only unusual module used by the program is Text::xSV.

When testing this program with 1000 data records, I was processing about 570 records per minute. This would have processed all 320,000 records in about 9 hours, which was acceptable to me. However, I started it running yesterday at about 15:40 and it's still running now, almost 20 hours later. Right now, this is the condition of the database:
Database popline running on localhost

Table                     Records    Type    Size
abstract                  220,038    MyISAM  517.9 MB
abstractword                    0    MyISAM  1.0 KB
abstractword_article            0    MyISAM  1.0 KB
article                   243,817    MyISAM  66.5 MB
author                    164,213    MyISAM  7.2 MB
author_article            418,227    MyISAM  3.6 MB
clientgroup                     0    MyISAM  1.0 KB
community                       0    MyISAM  1.0 KB
corpname                    2,750    MyISAM  376.7 KB
corpname_article           20,477    MyISAM  181.0 KB
country                         0    MyISAM  1.0 KB
findingstype                    0    MyISAM  1.0 KB
findingstype_article            0    MyISAM  1.0 KB
journal                    45,032    MyISAM  5.2 MB
journal_article           149,919    MyISAM  1.3 MB
keyword                     2,109    MyISAM  73.8 KB
keyword_article         2,393,673    MyISAM  20.5 MB
languguage                      3    MyISAM  4.1 KB
popreporttopic                  0    MyISAM  1.0 KB
popreporttopic_article          0    MyISAM  1.0 KB
20 table(s)             3,660,258    --      622.8 MB

As you can see, only 243,817 articles (records) have been inserted. It's currently inserting records at the rate of 104 records per minute. At this rate, it'll take another 13 hours to complete this run.

I read the article in the reference manual at 7.2.14 on the Speed of INSERT Statements. I only use INSERT and SELECT statements. I don't know which of these techniques I can apply to my program. I don't think I can insert multiple rows at a time; I need to process the input file one row at a time. Would INSERT DELAYED help here? Would it be worthwhile to rewrite my program completely to generate a series of input files that I could use LOAD DATA INFILE with? I'm guessing that the most probable speed-up might occur if I disable the indexes, load the data, then recreate the indexes with myisamchk (I can't use the DISABLE/ENABLE KEYS of MySQL 4.0 as I'm running 3.23.49). Would it be worthwhile to rewrite my program to use more than one thread at a time?
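[Editor's note: one option the post rules out may still be workable. Even while the input file is read one row at a time, rows can be buffered in memory and flushed as multi-row (extended) INSERT statements, which MySQL 3.23's MyISAM tables already support. A minimal sketch in Python (the original program is Perl); the helper name and the naive quoting are illustrative only — a real program should use the driver's own quoting, e.g. DBI's $dbh->quote:]

```python
# Sketch: buffer single rows and emit multi-row (extended) INSERT
# statements. Hypothetical helper, not taken from pop2sql.pl; the
# backslash-escaping here is deliberately naive.

def batch_insert_sql(table, columns, rows, batch_size=100):
    """Return a list of INSERT statements, batch_size rows per statement."""
    def quote(value):
        if value is None:
            return "NULL"
        escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
        return "'" + escaped + "'"

    statements = []
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        values = ",".join(
            "(" + ",".join(quote(v) for v in row) + ")" for row in chunk
        )
        statements.append(
            "INSERT INTO %s (%s) VALUES %s" % (table, ",".join(columns), values)
        )
    return statements
```

Each statement then carries e.g. 100 rows, so the per-statement parse and index-flush overhead is paid once per batch instead of once per row.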

Any suggestions on what I can try? Any advice on diagnostics I can run to determine the bottleneck?

Thanks for all your help and suggestions on my problem.

-Kevin Zembower

-----
E. Kevin Zembower
Internet Systems Group manager
Johns Hopkins University
Bloomberg School of Public Health
Center for Communications Programs
111 Market Place, Suite 310
Baltimore, MD 21202
410-659-6139

--
MySQL Perl Mailing List
For list archives: http://lists.mysql.com/perl
To unsubscribe: http://lists.mysql.com/perl?unsub=gcdmp-msql-mysql-modules@m.gmane.org

Re: Speeding up series of complex INSERTS

on 13.05.2005 17:47:26 by Rudy Lippan

On Fri, 13 May 2005, KEVIN ZEMBOWER wrote:

> Would INSERT DELAYED help here? Would it be worthwhile to rewrite my program
> completely to generate a series of input files that I could use LOAD DATA
> INFILE with? I'm guessing that the most probable speed-up might occur if I
> disable the indexes, load the data, then recreate the indexes with myisamchk

This is probably going to give you the best performance. And don't forget to run OPTIMIZE TABLE between loads.

Rudy
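[Editor's note: on 3.23, where ALTER TABLE ... DISABLE KEYS is unavailable, the disable/load/rebuild cycle Kevin describes and Rudy endorses follows the manual's bulk-insert recipe. A rough ops sketch; the datadir path and the choice of table are assumptions for this setup:]

```shell
# 1. Tell MyISAM to stop maintaining the indexes (server table closed first).
mysqladmin flush-tables
myisamchk --keys-used=0 -rq /var/lib/mysql/popline/keyword_article

# 2. Run the bulk load (pop2sql.pl, or LOAD DATA INFILE).
./pop2sql.pl popline.dmp

# 3. Rebuild the indexes by sort, then make mysqld re-open the table.
myisamchk -rq /var/lib/mysql/popline/keyword_article
mysqladmin flush-tables
```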



Re: Speeding up series of complex INSERTS

on 30.05.2005 12:04:16 by Felix Geerinckx

On 13/05/2005, KEVIN ZEMBOWER wrote:

> Any suggestions on what I can try? Any advice on diagnostics I can
> run to determine the bottleneck?
>
> Thanks for all your help and suggestions on my problem.

Just had a very quick look. It seems that your keyword table is not indexed on name. Perhaps that's slowing things down as more and more records are entered?

--
felix
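[Editor's note: if Felix is right, each keyword lookup is a full table scan that gets slower as the table grows. The fix would be a one-line DDL change; the column name `name` is an assumption — check popline.sql for the actual definition:]

```sql
-- Assumes the lookup column is `name`; adjust to the real schema.
ALTER TABLE keyword ADD INDEX idx_keyword_name (name);
```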
