using 'DB_File' versus just plain tie() ?


dan baker

Can anyone shed some light on the pros and cons of tie()ing to a hash
using 'DB_File' rather than just a plain tie? What I have been playing
with works great and seems very fast, but creates a binary file that I
can't move between a UNIX server and a PC test environment... I am
wondering how I can get the tie() to create a plain ASCII file that
could be moved between platforms.

Also wondering about other tradeoffs such as speed, and at what point
in terms of number of records it would even be noticeable. I have a
fairly simple value string, and can use delimiters to pack and unpack
multiple fields, so that's a problem I don't mind dealing with.

My current code uses something like this:

use DB_File;
tie %tempHash, 'DB_File', $dbfile;
$tempHash{$key} = $ValueString;

and I am wondering what I'd have to do to get the tie()d file
to be written as plain ASCII text?

thanx,

d
 

Perusion Hostmaster

Can anyone shed some light on the pros and cons of tie()ing to a hash
using 'DB_File' rather than just a plain tie? What I have been playing
with works great and seems very fast, but creates a binary file that I
can't move between a UNIX server and a PC test environment... I am
wondering how I can get the tie() to create a plain ASCII file that
could be moved between platforms.

AFAIK, DB_File is portable between machines and even big-endian
and little-endian architectures (unlike GDBM or NDBM).

You may be running into a Berkeley DB version mismatch -- many DB_File
builds are still linked against version 1.x, while a lot have moved to 2.x/3.x.

Also wondering about other tradeoffs such as speed, and at what point
in terms of number of records it would even be noticeable. I have a
fairly simple value string, and can use delimiters to pack and unpack
multiple fields, so that's a problem I don't mind dealing with.

My current code uses something like this:

use DB_File;
tie %tempHash, 'DB_File', $dbfile;
$tempHash{$key} = $ValueString;

and I am wondering what I'd have to do to get the tie()d file
to be written as plain ASCII text?

Some of this depends on your keys and data. If you are going to
have binary data, or data that can exceed a few hundred bytes, you
really want to use DB_File.

You also might try something like Tie::TextDir, which creates a directory
with files named for the keys. Performance will not be nearly as good,
of course, but it might suit. A sample script follows.


#!/usr/bin/perl -w

use Tie::TextDir;

my $DIR = '/tmp/testhash';

unless (-d $DIR) {
    mkdir $DIR, 0777
        or die "mkdir $DIR: $!\n";
}

my %hash;

tie %hash, 'Tie::TextDir', $DIR, 'rw';  # Open in read/write mode

print "Current keys in hash:\n\n";

for (keys %hash) {
    print "\t$_\n";
}

print "\n";

## Create new file based on date

my $now  = time();
my $date = scalar(localtime);

$hash{$now} = "File created by me on $date\n";

untie %hash;

chdir $DIR
    or die "chdir $DIR: $!\n";

my $dir = `ls -l`;
print "Directory $DIR:\n$dir\n";
 

Paul Marquess

dan baker said:
Perusion Hostmaster <[email protected]> wrote in message
--------------------

huh, I wonder how I can tell if this is the issue? All I know is that
if I FTP the tie()d file from the unix host to my pc, the app can't
read it when run on my localhost...

There are two main reasons why people have difficulty moving DB_File
database files between platforms.

Firstly, if ftp was used, and the transfer was done in ASCII mode, the
database file will be completely trashed. The transfer must be done in ftp
binary mode.
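
One quick way to rule out the ASCII-mode problem is to script the transfer so
binary mode is explicit. Here is a minimal Net::FTP sketch; the host, login and
file names are invented:

#!/usr/bin/perl -w
use strict;
use Net::FTP;

# Hypothetical host, login and path -- substitute your own.
my $ftp = Net::FTP->new('unix.example.com')
    or die "connect: $@\n";
$ftp->login('user', 'password')
    or die 'login: ', $ftp->message;
$ftp->binary;                       # ASCII mode would corrupt the database file
$ftp->get('data/temphash.db')
    or die 'get: ', $ftp->message;
$ftp->quit;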

The second, and most likely, is that the version of Berkeley DB used to
build DB_File was different on each machine. The most recent version of
Berkeley DB is 4.1.25 and there have been quite a few changes made to the
database file format between version 1 and version 4. If the versions of
Berkeley DB are compatible, it is possible to copy database files between
platforms without any problem.

You can determine which versions of Berkeley DB were used to build DB_File
on your systems by running this one-liner on each platform.

perl -e "use DB_File; print qq{Berkeley DB ver $DB_File::db_ver\n}"

If the versions are identical, the database file can be used on either
platform.

Assuming the versions are different, the next thing you want to do is run
the dbinfo script that comes with DB_File against the database file you want
to copy. It will tell you what versions of Berkeley DB are capable of
building it, and therefore what versions can read it.

If you draw a blank with that, and you still want to use DB_File, the only
options you have are to rebuild DB_File on one of the platforms using the
version of Berkeley DB from the other, or to use the db_dump/db_load
utilities that come with Berkeley DB to dump and rebuild the database.

I was hoping there would be a
simple (although maybe slower) alternative that would save the info in
a plain ASCII file rather than a binary b-tree or however they do it
internally.

DB_File has a Recno mode that uses a text file as its backing store. The Perl
interface is via a tied array rather than a hash, though. Have a look at the
DB_File documentation for more details.
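
A minimal Recno sketch (the file name here is made up); each element of the
tied array corresponds to one line of the text file:

use strict;
use DB_File;
use Fcntl;

my $file = 'records.txt';            # plain text file, one record per line

tie my @records, 'DB_File', $file, O_RDWR|O_CREAT, 0666, $DB_RECNO
    or die "tie $file: $!\n";

$records[0] = 'first record';        # becomes line 1 of records.txt
push @records, 'appended record';    # adds a new line at the end

untie @records;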

cheers
Paul
 

Michele Dondi

huh, I wonder how I can tell if this is the issue? All I know is that
if I FTP the tie()d file from the unix host to my pc, the app can't
read it when run on my localhost... I was hoping there would be a

Despite what you've been told, in my *experience* DB_File databases
are definitely not portable across platforms. Not that I'm a big
expert either, so I asked here, and I got the following answer:

SH>The DB library uses platform specifics in the database. Things like
SH>the endianness of the machine, and the size of ints. That means a
SH>database created on one machine isn't necessarily readable on
SH>another.
[...]
SH>One reason *DBM isn't made cross platform is that the performance
SH>trade off (no longer being able to directly copy data from the file
SH>to memory) isn't considered worth the benefit. After all you can
SH>just read the data and write it out in a transport format (text for
SH>example) and create a *DBM database on the target machine. Well,
SH>I think, anyway; not being the designer/author I'm just guessing.

To which another valuable contributor to this ng added:

BL>That's right. You can blame C, which is used to compile the
BL>database engine, for that. C tends to use pretty abstract names for
BL>its data types, like int, which are physically stored in whatever
BL>way is most efficient for the machine, but with no guarantees at
BL>all about portability. So all bets on endianness, or number of
BL>bits per integer, are off.
BL>
BL>It is often possible to compile a database engine to use a portable
BL>file format, but with a bad impact on efficiency. Speed could
BL>easily halve.

[Both quotations have been slightly edited for clarity]
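
You can see the native byte order from Perl itself: packing the same integer
with the native 'l' template gives different bytes on the two architectures,
while a fixed big-endian template like 'N' (or a plain text dump) comes out
the same everywhere:

perl -le 'print unpack "H*", pack "l", 1'   # 01000000 little-endian, 00000001 big-endian
perl -le 'print unpack "H*", pack "N", 1'   # 00000001 on either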


Michele
 

dan baker

Paul Marquess said:
You can determine which versions of Berkeley DB were used to build DB_File
on your systems by running this one-liner on each platform.

perl -e "use DB_File; print qq{Berkeley DB ver $DB_File::db_ver\n}"
----------------

good info... they are indeed different.
so, I am faced with either making an intermediate "dumptoascii" and
then another script to reload... or considering a non-DB_File solution.
shucks.
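
For what it's worth, the dump and reload steps can each be only a few lines of
Perl, assuming the keys and values are plain strings with no embedded tabs or
newlines (the file names below are invented):

#!/usr/bin/perl -w
# dumptoascii -- write the DB_File database out as tab-separated text
use strict;
use DB_File;

my ($dbfile, $textfile) = ('data.db', 'data.txt');

tie my %hash, 'DB_File', $dbfile or die "tie $dbfile: $!\n";
open my $out, '>', $textfile or die "open $textfile: $!\n";
while (my ($key, $value) = each %hash) {
    print $out "$key\t$value\n";
}
close $out;
untie %hash;

And the matching reload on the other machine:

#!/usr/bin/perl -w
# reload -- rebuild the DB_File database from the tab-separated text
use strict;
use DB_File;

my ($textfile, $dbfile) = ('data.txt', 'data.db');

tie my %hash, 'DB_File', $dbfile or die "tie $dbfile: $!\n";
open my $in, '<', $textfile or die "open $textfile: $!\n";
while (my $line = <$in>) {
    chomp $line;
    my ($key, $value) = split /\t/, $line, 2;
    $hash{$key} = $value;
}
close $in;
untie %hash;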

thanks for help,

d
 
