denis.papathanasiou
But what about the records that aren't properly terminated? Won't
that throw off your count?
If the read fails, the exception handling returns null for the entire
array of n bytes.
Conversely, if the read succeeds, the array is valid, i.e. all lines
within the data block are of the same size (and the routine that picks
out lines from the array does further checking).
So it brings up a trade-off in sizing the array for reads: too large
and you skip uncorrupted parts of the file (but traverse the file
quickly), versus too small and it takes forever to traverse the file
(but you lose less uncorrupted data).
It's not perfect, and I'll keep thinking up possible improvements over
time (fortunately, it's not a problem which happens very often).
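For anyone following along, here is a minimal Perl sketch of the block-read idea described above (it is not the CL code itself). The names read_block and split_records are made up for illustration, and the "record ends in a newline" test is only an assumed stand-in for the further checking mentioned earlier.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# Read $n bytes starting at $offset. If the seek or the read fails, give
# up on the whole block and return undef (the "null for the entire array"
# rule). Hypothetical helper, not part of any library.
sub read_block {
    my ($fh, $offset, $n) = @_;
    defined sysseek($fh, $offset, SEEK_SET) or return;
    my $got = sysread($fh, my $buf, $n);
    return unless defined $got;
    return $buf;    # may be shorter than $n at end of file
}

# Split a block into fixed-size records and keep only the ones that are
# full length and end in a newline (an assumed validity check).
sub split_records {
    my ($block, $recsz) = @_;
    my @records = unpack "(a$recsz)*", $block;
    return grep { length($_) == $recsz && /\n\z/ } @records;
}
```

With a block of, say, 100 records, one failed read throws away up to 100 records; with a block of one record it throws away only one, at the cost of many more seeks and reads.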
You might try something along the following lines:
#! /usr/bin/perl
use strict;
use warnings;
use Fcntl qw/ SEEK_SET /;

my $RECORDSZ = 20;
my $IN_FILE  = $0;    # demo input: this script itself

open IN, "<:raw", $IN_FILE or die "$0: open: $!";

my $nrec = 0;
# seek to each record by number, so a failed read does not stop the loop
while (sysseek IN, $nrec * $RECORDSZ, SEEK_SET) {
    my $nread = sysread IN, my($buf), $RECORDSZ;
    if (defined $nread) {
        if ($nread == 0) {
            exit 0; # eof
        }
        else {
            # show non-printable bytes as <XX> hex escapes
            $buf =~ s{([^[:graph:] ])} {
                "<" . sprintf("%02X", ord $1) . ">"
            }ge;
            print "$nrec: $buf\n";
        }
    }
    else {
        # read error: report it and move on to the next record
        warn "$0: $IN_FILE:$nrec: sysread: $!";
    }
    ++$nrec;
}

# only reached if sysseek itself fails
die "$0: sysseek: $!";
Thanks for suggesting it; I'll definitely give it a try tomorrow.
My first (quick) impression is that the while loop should not be tied
to the filehandle, because Perl seems to close or invalidate the
filehandle at the first sign of I/O trouble.
So it might be better to read the byte size of the file with stat(),
and use that value to iterate (read) n bytes at a time (that's what I
do in CL).
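A rough sketch of that size-driven loop in Perl, assuming the file size reported by stat() can be trusted even when parts of the file are unreadable. The name read_blocks_by_size is hypothetical, and collecting every block in a list is only for illustration; a real run over a large file would process each block as it goes.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# Visit a file in $n-byte blocks, driven by its size from stat() rather
# than by the state of the filehandle. Returns one entry per block: the
# bytes read, or undef where the seek or the read failed.
sub read_blocks_by_size {
    my ($path, $n) = @_;
    my $size = (stat $path)[7];
    defined $size or die "$path: stat: $!";
    open my $fh, '<:raw', $path or die "$path: open: $!";
    my @blocks;
    for (my $offset = 0; $offset < $size; $offset += $n) {
        my $buf;
        my $ok = defined sysseek($fh, $offset, SEEK_SET)
              && defined sysread($fh, $buf, $n);
        warn "$path: offset $offset: $!" unless $ok;
        push @blocks, $ok ? $buf : undef;
    }
    close $fh;
    return @blocks;
}
```

Because the loop counter comes from the size and not from the last read, a failed block only costs that block; the next iteration seeks to the next offset regardless.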
The other potential problem is catching exceptions when reading the
corrupted section; I think "eval{ }; warn;" is supposed to do that
in Perl, but I've not had success in getting it to work the way an
exception handler does in CL.
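One likely reason eval alone does not behave like a handler here: sysread does not raise an exception on failure; it returns undef and sets $!. A minimal sketch, assuming that is the cause, is to die explicitly inside the eval block and inspect $@ afterwards. The names read_or_die and try_read are invented for this example.

```perl
use strict;
use warnings;

# sysread signals failure by returning undef and setting $!; it does not
# die on its own, so raise the exception explicitly for eval to catch.
sub read_or_die {
    my ($fh, $n) = @_;
    my $got = sysread($fh, my $buf, $n);
    die "sysread: $!\n" unless defined $got;
    return $buf;
}

# Returns the bytes read, or undef (after a warning) if the read failed.
# Either way, control comes back to the caller.
sub try_read {
    my ($fh, $n) = @_;
    my $buf;
    my $ok = eval { $buf = read_or_die($fh, $n); 1 };
    unless ($ok) {
        warn "read failed, continuing: $@";
        return;
    }
    return $buf;
}
```

The trailing 1 in the eval block makes the success test independent of whatever $@ happens to hold, which is the usual way to keep the check reliable.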
So regardless of how I wind up iterating through the file, if I can't
handle the bad read and maintain control, it won't work.
But that's just a guess based on a quick read; I'll experiment with it
and find out what really happens.