J
January Weiner
Hello,
I have the following problem:
I have a very simple relation keyword -- number. I need to access it
very quickly for huge data sets.
For now, I was using NDBM, tying a hash with an indexed database. However,
this approach has drawbacks: large sizes of the index files,
platform-dependent index files, hard to store and retrieve additional
information etc. However, for someone like me it was a very easy and
straightforward approach.
I have now tested the sqlite which gives me the power of SQL. However, it
turned out to be roughly 100-500 times slower than NDBM, and the speed here
is of utter importance. First, a code snippet showing how I populated the
database:
my $dbh = DBI->connect('dbi:SQLite:dbname=blah.foo.sql','','');
$sth = $dbh->prepare( 'PRAGMA synchronous = OFF' ) ;
$sth->execute( ) ;
$sth = $dbh->prepare( 'create table t ( name TEXT, cont TEXT )' ) ;
$sth->execute( ) ;
$sth = $dbh->prepare( "insert into t ( name, cont ) values ( ?, ? )" ) ;
while( keys %records ) {
$sth->execute( $_, $records{$_} ) ;
}
And here is a code snippet showing how get the records:
my $dbh = DBI->connect('dbi:SQLite:dbname=blah.foo.sql','','');
my $sth = $dbh->prepare( "select * from t where name=?" ) ;
my $ntests = 10000 ; # number of tests
my $nrec = 700000 ; # number of records ;
# @keys hold all the keys, don't worry where I got it from
while( $ntests > 0 ) {
$i = int(rand($nrec)) ;
$key = $keys[ $i ] ;
# $t0 = [ gettimeofday ] ;
$sth->execute( $key ) ;
$all = $sth->fetchall_arrayref( ) ;
# $dt = tv_interval( $t0, [ gettimeofday ] ) ;
$ntests-- ;
}
Now, my questions are:
1) I don't know much about SQL or sqlite, maybe NDBM is expected to be so
much faster?
2) if not -- what do I do wrong? Should I ask in an SQL newsgroup for help
on optimizing my queries / database structure?
3) what other means could I use to speed up the access to these simple
records?
Best regards,
January
--
I have the following problem:
I have a very simple relation keyword -- number. I need to access it
very quickly for huge data sets.
For now, I was using NDBM, tying a hash with an indexed database. However,
this approach has drawbacks: large sizes of the index files,
platform-dependent index files, hard to store and retrieve additional
information etc. However, for someone like me it was a very easy and
straightforward approach.
I have now tested the sqlite which gives me the power of SQL. However, it
turned out to be roughly 100-500 times slower than NDBM, and the speed here
is of utter importance. First, a code snippet showing how I populated the
database:
my $dbh = DBI->connect('dbi:SQLite:dbname=blah.foo.sql','','');
$sth = $dbh->prepare( 'PRAGMA synchronous = OFF' ) ;
$sth->execute( ) ;
$sth = $dbh->prepare( 'create table t ( name TEXT, cont TEXT )' ) ;
$sth->execute( ) ;
$sth = $dbh->prepare( "insert into t ( name, cont ) values ( ?, ? )" ) ;
while( keys %records ) {
$sth->execute( $_, $records{$_} ) ;
}
And here is a code snippet showing how get the records:
my $dbh = DBI->connect('dbi:SQLite:dbname=blah.foo.sql','','');
my $sth = $dbh->prepare( "select * from t where name=?" ) ;
my $ntests = 10000 ; # number of tests
my $nrec = 700000 ; # number of records ;
# @keys hold all the keys, don't worry where I got it from
while( $ntests > 0 ) {
$i = int(rand($nrec)) ;
$key = $keys[ $i ] ;
# $t0 = [ gettimeofday ] ;
$sth->execute( $key ) ;
$all = $sth->fetchall_arrayref( ) ;
# $dt = tv_interval( $t0, [ gettimeofday ] ) ;
$ntests-- ;
}
Now, my questions are:
1) I don't know much about SQL or sqlite, maybe NDBM is expected to be so
much faster?
2) if not -- what do I do wrong? Should I ask in an SQL newsgroup for help
on optimizing my queries / database structure?
3) what other means could I use to speed up the access to these simple
records?
Best regards,
January
--