capturing [A-Z]+ in binary file - how? - EMASTER_Z (0/1)

Steve D

$file_name = 'EMASTER_Z';

if (-B $file_name) {
    print "$file_name - is a binary file\n";
}
if (-T $file_name) {
    print "$file_name - is a text file\n";
}
open (MASTER_IN, "<$file_name") ||
    die "unable to open $file_name file $!";
binmode(STDOUT);
binmode(MASTER_IN);
print "\n----------------- 0 ----------------------\n";
### lets me see the file as it is
print STDOUT <MASTER_IN>;

print "\n----------------- 1 ----------------------\n";
### place data into variable or array
### goal is to capture all text data but no matter what I do nothing works
### have tried unpack but I do not fully understand it
@master_array = <MASTER_IN>;
chomp @master_array;

$var = pop(@master_array);
print STDOUT $var;
print "--------- done -----------\n";

## what am I doing wrong, or what do I not understand?
# thanks in advance
 
Gunnar Hjalmarsson

Steve said:
open (MASTER_IN, "<$file_name") ||
die "unable to open $file_name file $!";

### place data into variable or array
### goal is to capture all text data but no matter what I do nothing works
### have tried unpack but I do not fully understand it

One way:

print map "$_\n", /([[:alpha:]]+)/g while <MASTER_IN>;
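
A slightly longer sketch of the same idea, narrowed to the [A-Z]+ case from the subject line. The sub name upper_runs and its minimum-length parameter are made up for illustration and do not come from the thread, and the sample bytes merely stand in for the real file's contents.

```perl
use strict;
use warnings;

# Hypothetical helper: return every run of ASCII capitals in $data that is
# at least $min_len characters long (default 1).
sub upper_runs {
    my ($data, $min_len) = @_;
    $min_len = 1 unless defined $min_len;
    return $data =~ /([A-Z]{$min_len,})/g;   # list context: all matches
}

# Made-up sample bytes standing in for the binary file's contents.
my $sample = "\x00\x01EMASTER\xff\x02Z\x00ab\x10RECORD\n";
print "$_\n" for upper_runs($sample, 2);
```

Raising the minimum length is a cheap way to drop the one- and two-letter runs that random binary bytes tend to produce.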
 
Uri Guttman

SD> open (MASTER_IN, "<$file_name") ||
SD> die "unable to open $file_name file $!";
SD> binmode(STDOUT);
SD> binmode(MASTER_IN);

SD> print STDOUT <MASTER_IN>;

that will read in all of the file and print it.

SD> @master_array = <MASTER_IN>;

what do you think is happening with that line? you have already read in
the file. handles can't figure out where you want to read from. you have
to open the file again or seek to the beginning to read it all in.
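
a small sketch of the rewind, using seek. it reads from an in-memory string instead of EMASTER_Z so that it runs on its own; the sample data and variable names are made up.

```perl
use strict;
use warnings;

# Made-up sample data; an in-memory handle stands in for the real file.
my $data = "first\nsecond\nthird\n";
open my $fh, '<', \$data or die "cannot open in-memory handle: $!";

my @first_pass  = <$fh>;    # reads everything; the handle is now at EOF
my @second_pass = <$fh>;    # nothing left to read, so this is empty

seek $fh, 0, 0 or die "seek failed: $!";   # rewind to byte 0
my @third_pass = <$fh>;     # the whole file again

printf "%d %d %d\n",
    scalar @first_pass, scalar @second_pass, scalar @third_pass;
```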

better yet, why do you want to read it all in twice? just read it into
an array and then print it or mung it. you have it backwards, printing
directly from the handle and then trying to read it in again.

and assuming a file is binary and then reading it in as lines makes
little sense. lines are normally found in text files. binary files can
have newlines in them but no guarantees of where and how many. i think
you need to rethink your whole solution.

uri
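
On the point that a binary file has no meaningful lines: one option is to read the whole file as a single string of bytes and work on that. A rough sketch, assuming the file is small enough to hold in memory; slurp_binary is a made-up helper name, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: return the whole file as one string of bytes.
sub slurp_binary {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "unable to open $path: $!";
    local $/;               # undef record separator: read to EOF in one go
    my $bytes = <$fh>;
    close $fh;
    return defined $bytes ? $bytes : '';   # empty file yields ''
}

my $file_name = 'EMASTER_Z';
if (-e $file_name) {
    my $bytes = slurp_binary($file_name);
    printf "%s: %d bytes\n", $file_name, length $bytes;
}
```

The ':raw' layer does the same job as the binmode call in the original post, and local keeps the change to $/ from leaking into the rest of the program.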
 
Joe Smith

Steve said:
print STDOUT <MASTER_IN>;

@master_array = <MASTER_IN>;

## what am I doing wrong, or what do I not understand?

The thing that you do not understand is that <> in list context
will read in the _entire_ file all at once, and print() supplies
list context.

print STDOUT <MASTER_IN>; # Read and write entire file
@master_array = <MASTER_IN>; # Reads nothing, file at EOF already
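
The difference between the two contexts shows up in a small sketch. The sample data is made up and is read through an in-memory handle so that the example is self-contained.

```perl
use strict;
use warnings;

my $data = "one\ntwo\nthree\n";    # made-up sample data
open my $fh, '<', \$data or die "cannot open in-memory handle: $!";

my $first = <$fh>;    # scalar context: one line only
my @rest  = <$fh>;    # list context: all remaining lines
my @again = <$fh>;    # already at EOF: empty list

print "first: $first";
print "rest:  ", scalar(@rest), " lines\n";
print "again: ", scalar(@again), " lines, eof=", (eof($fh) ? 1 : 0), "\n";
```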

Binary files may or may not have any linefeeds, therefore you should
read in fixed-length records.

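my $record_size = 8192;
open my $in, '<', $file_name or die "Cannot read $file_name: $!\n";
binmode $in;                 # no newline translation on binary data
$/ = \$record_size;          # <$in> now returns up to 8192 bytes per read
while (<$in>) {
    # do something with $_
}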

-Joe
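
One caveat with fixed-length records is that a run of letters can be cut in two at a record boundary. A sketch of one way to handle that for the [A-Z]+ case, carrying an unfinished run over to the next record; upper_runs_chunked is a made-up name, and the sample bytes stand in for the real file.

```perl
use strict;
use warnings;

# Hypothetical helper: scan a handle in fixed-size records and return every
# [A-Z]+ run, including runs that straddle a record boundary.
sub upper_runs_chunked {
    my ($in, $record_size) = @_;
    my @runs;
    my $carry = '';                  # unfinished run from the previous record
    local $/ = \$record_size;        # fixed-length reads, restored on return
    while (my $chunk = <$in>) {
        my $buf = $carry . $chunk;
        # A run touching the end of the buffer may continue in the next
        # record, so hold it back instead of reporting it now.
        $carry = $buf =~ s/([A-Z]+)\z// ? $1 : '';
        push @runs, $buf =~ /([A-Z]+)/g;
    }
    push @runs, $carry if length $carry;   # run that ended exactly at EOF
    return @runs;
}

# Made-up sample bytes; "EMASTER" is split across two 6-byte records.
my $sample = "\x00\x01EMASTER\x02Z\x00";
open my $in, '<', \$sample or die "cannot open in-memory handle: $!";
binmode $in;
print "$_\n" for upper_runs_chunked($in, 6);
```

The carried text grows only while the data is an unbroken run of capitals, so for typical binary files memory use stays close to one record.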
 
Steve D

Thank you, this helps.