Question about simple databases in Perl

Question about simple databases in Perl

am 16.10.2007 16:30:59 von January Weiner

Hello,

I have the following problem:

I have a very simple relation keyword -- number. I need to access it
very quickly for huge data sets.

For now, I was using NDBM, tying a hash with an indexed database. However,
this approach has drawbacks: large sizes of the index files,
platform-dependent index files, hard to store and retrieve additional
information etc. However, for someone like me it was a very easy and
straightforward approach.

I have now tested the sqlite which gives me the power of SQL. However, it
turned out to be roughly 100-500 times slower than NDBM, and the speed here
is of utter importance. First, a code snippet showing how I populated the
database:

my $dbh = DBI->connect('dbi:SQLite:dbname=blah.foo.sql','','');
$sth = $dbh->prepare( 'PRAGMA synchronous = OFF' ) ;
$sth->execute( ) ;
$sth = $dbh->prepare( 'create table t ( name TEXT, cont TEXT )' ) ;
$sth->execute( ) ;

$sth = $dbh->prepare( "insert into t ( name, cont ) values ( ?, ? )" ) ;

while( keys %records ) {
$sth->execute( $_, $records{$_} ) ;
}


And here is a code snippet showing how get the records:

my $dbh = DBI->connect('dbi:SQLite:dbname=blah.foo.sql','','');
my $sth = $dbh->prepare( "select * from t where name=?" ) ;

my $ntests = 10000 ; # number of tests
my $nrec = 700000 ; # number of records ;
# @keys hold all the keys, don't worry where I got it from

while( $ntests > 0 ) {

$i = int(rand($nrec)) ;
$key = $keys[ $i ] ;

# $t0 = [ gettimeofday ] ;
$sth->execute( $key ) ;
$all = $sth->fetchall_arrayref( ) ;
# $dt = tv_interval( $t0, [ gettimeofday ] ) ;

$ntests-- ;

}

Now, my questions are:

1) I don't know much about SQL or sqlite, maybe NDBM is expected to be so
much faster?
2) if not -- what do I do wrong? Should I ask in an SQL newsgroup for help
on optimizing my queries / database structure?

3) what other means could I use to speed up the access to these simple
records?

Best regards,

January


--

Re: Question about simple databases in Perl

am 16.10.2007 16:44:25 von Paul Lalli

On Oct 16, 10:30 am, January Weiner wrote:

> while( keys %records ) {
> $sth->execute( $_, $records{$_} ) ;
> }

There's no way that works. It simply repeatedly tests to see if there
are any elements in the hash, and if so, executes that SQL. It does
not assign any elements to $_, and seeing as nothing in the block
changes %records, it's an infinite loop.

Paul Lalli

Re: Question about simple databases in Perl

am 16.10.2007 19:44:54 von xhoster

January Weiner wrote:
> Hello,
>
> I have the following problem:
>
> I have a very simple relation keyword -- number. I need to access it
> very quickly for huge data sets.
>
> For now, I was using NDBM, tying a hash with an indexed database.
> However, this approach has drawbacks: large sizes of the index files,
> platform-dependent index files, hard to store and retrieve additional
> information etc. However, for someone like me it was a very easy and
> straightforward approach.
>
> I have now tested the sqlite which gives me the power of SQL. However,
> it turned out to be roughly 100-500 times slower than NDBM, and the speed
> here is of utter importance. First, a code snippet showing how I
> populated the database:
>
> my $dbh = DBI->connect('dbi:SQLite:dbname=blah.foo.sql','','');
> $sth = $dbh->prepare( 'PRAGMA synchronous = OFF' ) ;
> $sth->execute( ) ;
> $sth = $dbh->prepare( 'create table t ( name TEXT, cont TEXT )' ) ;
.....
> my $sth = $dbh->prepare( "select * from t where name=?" ) ;



> Now, my questions are:
>
> 1) I don't know much about SQL or sqlite, maybe NDBM is expected to be so
> much faster?
> 2) if not -- what do I do wrong? Should I ask in an SQL newsgroup for
> help on optimizing my queries / database structure?

Other than the infinite loop mentions by someone else, you didn't build an
index on the "name" column or table "t". If you don't build an index,
then the program will have to scan the whole dataset to find what you
are looking for.

$dbh->do('create index asdfasdf on t(name)');

Xho

--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.

Re: Question about simple databases in Perl

am 17.10.2007 07:09:23 von January Weiner

Paul Lalli wrote:
> There's no way that works. It simply repeatedly tests to see if there
> are any elements in the hash, and if so, executes that SQL. It does
> not assign any elements to $_, and seeing as nothing in the block
> changes %records, it's an infinite loop.

Whoups, my error while simplifying the code snippet. It should have been

for( keys %... ) {


}

j.

--