filehandle to a member of a zip archive

scottmf

I am trying to write a modified file open function that can take an
ascii text file, a .gz file or a .zip file and create a filehandle for
reading or writing using the appropriate module. I have a routine
working for ascii text or .gz files but I cannot figure out how to get
it working for zip files. The goal is to replace the standard open
command in several of my scripts with my modified open, but to not have
to change any other code and be able to read from or write to any file
type simply based on the filename I use.

Here is what I have:

my $test_fh_in;
my $test_fh_out;
mod_open($test_fh_in, "test_in.zip");
mod_open($test_fh_out, ">test_out.zip");
while (<$test_fh_in>) {
    print "$_";
    print $test_fh_out "$_";
}

sub mod_open {
    ## Libraries needed to do zipped file reading/writing
    use File::Basename;
    use IO::File;
    use IO::Zlib;
    use Archive::Zip;
    my $FH;
    my $file_name = $_[1];
    my $read;    ## boolean: 1 = open for reading, 0 = open for writing
    ## Need to look into appending as well.....
    if ($file_name =~ /^\>/) {    ## a leading ">" means open for writing
        $read = 0;
        ## drop the read/write identifier so the string is a valid filename
        $file_name =~ s/^\>//;
    }
    else {
        $read = 1;
    }
    ## define a list of zip suffixes (fileparse treats these as patterns,
    ## so the dot has to be escaped)
    my @suffixlist = (qr/\.gz/, qr/\.gzip/, qr/\.zip/);
    # check for filename extension - if .gz use Zlib, else do not...
    my ($file_base, $file_path, $file_type) = fileparse($file_name, @suffixlist);

    if ($read) {    ## the file is supposed to be opened for reading
        if ($file_type eq "") {
            print "The input file $file_name is standard ASCII - uncompressed\n";
            $FH = IO::File->new($file_name, "r");
        }
        elsif ($file_type =~ /gz(ip)?/) {
            print "The input file $file_name is compressed using gzip\n";
            $FH = IO::Zlib->new($file_name, "rb");
        }
        elsif ($file_type =~ /zip/) {
            print "The input file $file_name is compressed using winzip\n";
            my $zp = Archive::Zip->new($file_name);     ## open the zip file
            my $numMembers = $zp->numberOfMembers();    ## how many files are in the zip file
            if ($numMembers > 1) {
                die "This routine only supports zip archives with one file\n";
            }
            ## need to get a filehandle for the compressed file
        }
    }
    elsif (!$read) {    ## the file is supposed to be opened for writing
        if ($file_type eq "") {
            print "The output file $file_name is standard ASCII - uncompressed\n";
            $FH = IO::File->new($file_name, "w");
        }
        elsif ($file_type =~ /gz(ip)?/) {
            print "The output file $file_name is compressed\n";
            $FH = IO::Zlib->new($file_name, "wb");
        }
        elsif ($file_type =~ /zip/) {
            print "The output file $file_name is compressed using winzip\n";
            my $zp = Archive::Zip->new($file_name);    ## create the zip file
            ## need to create a filehandle to a compressed file
        }
    }
    ## make sure the file got opened or created
    if (!defined $FH) {
        die "Cannot open file: $file_name $!\n";
    }
    ## now pass the file handle back to the user for reading or writing
    $_[0] = $FH;
}
 
Paul Marquess

scottmf said:
I am trying to write a modified file open function that can take an
ascii text file, a .gz file or a .zip file and create a filehandle for
reading or writing using the appropriate module. I have a routine
working for ascii text or .gz files but I cannot figure out how to get
it working for zip files. The goal is to replace the standard open
command in several of my scripts with my modified open, but to not have
to change any other code and be able to read from or write to any file
type simply based on the filename I use.

For the reading interface in particular, you might want to have a look at
IO::Uncompress::AnyUncompress - this will auto-detect a number of
compression formats, including gzip and zip. It also has a feature to work
in a passthrough mode if the data isn't compressed. Assuming you have the
zlib module (IO::Compress::Zlib) installed, this is all you need to open for
reading all of the file formats you are interested in:

$FH = IO::Uncompress::AnyUncompress->new($file_name, Transparent => 1);

If you also have IO::Compress::Bzip2 and/or IO::Compress::Lzop installed,
you can add bzip2 and lzop compressed files to the list of formats that
AnyUncompress can handle.

If you don't want to go down the auto-detection path, this will create a
filehandle that will read the first element from a zip file.

$FH = IO::Uncompress::Unzip->new($file_name)

For writing to zip files, I can't comment on Archive::Zip because I don't
know it that well, but I can comment on IO::Compress::Zip, because I wrote
it. This will create a filehandle to allow writing to a zip file

$FH = IO::Compress::Zip->new($file_name, Name => "whatever")
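
Putting the three calls above together, here is a minimal sketch of a single open routine. The name open_any, the choice to return the handle, and the member name stored in the zip archive (the archive name minus ".zip") are assumptions for illustration, not part of the original mod_open. IO::Compress::Gzip is used for gzip output in place of IO::Zlib.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Temp qw(tempdir);
use IO::File;
use IO::Compress::Gzip qw($GzipError);
use IO::Compress::Zip qw($ZipError);
use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

# open_any is a hypothetical name. A leading ">" selects write mode,
# mirroring mod_open; the handle is returned rather than stored in $_[0].
sub open_any {
    my ($spec) = @_;
    my $write = ($spec =~ s/^>//);

    if (!$write) {
        # Auto-detects gzip and zip; Transparent passes plain text through.
        return IO::Uncompress::AnyUncompress->new($spec, Transparent => 1)
            || die "Cannot read $spec: $AnyUncompressError\n";
    }
    if ($spec =~ /\.zip$/) {
        # Assumption: name the single member after the archive, minus ".zip".
        my $member = basename($spec, '.zip');
        return IO::Compress::Zip->new($spec, Name => $member)
            || die "Cannot create $spec: $ZipError\n";
    }
    if ($spec =~ /\.gz(?:ip)?$/) {
        return IO::Compress::Gzip->new($spec)
            || die "Cannot create $spec: $GzipError\n";
    }
    return IO::File->new($spec, "w") || die "Cannot create $spec: $!\n";
}

# Round trip through a zip archive in a temporary directory.
my $dir = tempdir(CLEANUP => 1);
my $out = open_any(">$dir/test_out.zip");
print $out "line one\nline two\n";
$out->close;

my $in = open_any("$dir/test_out.zip");
my @lines = <$in>;
$in->close;
print @lines;
```

The read branch never looks at the file extension, so a plain text file, a .gz file and a single-member .zip file all go through the same call.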

Paul
 
scottmf

Thanks, I'll look into that when I get a chance. Also, is there any
way I can make my subroutine take in a bareword operator for the
filehandle? I would like to be able to easily implement this in old
code I have by just replacing my open(FH, "filename") with mod_open(FH,
"filename") instead of having to use $FH and changing all the
references throughout the script from FH to $FH.

Thanks for the help,
~Scott
 
Tad McClellan

scottmf said:
Thanks, I'll look into that when I get a chance.


Look into what?

Please quote some context in followups like everybody else does.

Also, is there any
way I can make my subroutine take in a bareword operator for the
filehandle.


No.

There is no such thing as a "bareword operator", so I guess
you meant "bareword filehandle" instead.

I would like to be able to easily implement this in old
code I have by just replacing my open(FH, "filename") with mod_open(FH,
"filename") instead of having to use $FH

s/open\(/mod_open(/;

is a lot easier than

s/open\(/mod_open(\$/;

??

and changing all the
references throughout the script from FH to $FH


That could be cumbersome.

(Seems like:
s/<FH>/<\$FH>/g;
s/(printf? )FH/$1\$FH/g;
would come pretty darn close though.)

You can pass a type glob, or a reference to a type glob as
described in the answer to your Frequently Asked Question:

perldoc -q filehandle

How can I make a filehandle local to a subroutine? How do I pass file-
handles between subroutines? How do I make an array of filehandles?
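
As a minimal sketch of what that FAQ entry describes, the snippet below passes a reference to a typeglob into a subroutine. The helper name count_lines and the in-memory test data are assumptions made for illustration.

```perl
use strict;
use warnings;

# count_lines is a hypothetical helper; it accepts any filehandle reference.
sub count_lines {
    my ($fh) = @_;
    my $count = 0;
    while (my $line = <$fh>) {
        $count++;
    }
    return $count;
}

# Open a bareword handle on an in-memory string so the example needs no files.
my $text = "alpha\nbeta\ngamma\n";
open(FH, '<', \$text) or die "Cannot open in-memory file: $!\n";

my $n = count_lines(\*FH);    # pass a reference to the typeglob
close(FH);
print "$n\n";
```

The caller keeps using the bareword FH; only the call site needs the \* prefix.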
 
DJ Stunks

scottmf said:
Thanks, I'll look into that when I get a chance. Also, is there any
way I can make my subroutine take in a bareword operator for the
filehandle. I would like to be able to easily implement this in old
code I have by just replacing my open(FH, "filename") with mod_open(FH,
"filename") instead of having to use $FH and changing all the
references throughout the script from FH to $FH

Thanks for the help,
~Scott

perl -pi~ -e " s'FH'$FH' " source.pl

-jp
 
Paul Marquess

DJ Stunks said:
perl -pi~ -e " s'FH'$FH' " source.pl

You could try something like this

mod_open(FH, "filename");

sub mod_open{
...
$FH = IO::File->new($file_name, "r") ;
...
## now pass the file handle back to the user for reading or writing
my $href = \*{ $_[0] };
$$href = $FH;
}
 
scottmf

s/open\(/mod_open(/;
is a lot easier than

s/open\(/mod_open(\$/;

(Seems like:
s/<FH>/<\$FH>/g;
s/(printf? )FH/$1\$FH/g;
would come pretty darn close though.)

While I realize I could go through all of the scripts I need to update
and change every filehandle reference, it seems like there *should* be
a way to have my modified open function simply take a bareword
filehandle the same as open() does (and still "use strict;"). After
reading perldoc -q filehandle I have had some success passing the
typeglob by reference, but I cannot get this to work with IO::File or
IO::Zlib; see the code below for an example. I can do exactly what I
want with the open() function, but I cannot get it to work with the
IO::File->new or IO::Zlib->new functions.

Thanks,
~Scott

#!/usr/local/bin/perl
#
use strict;
use warnings;
use IO::File;
use IO::Zlib;

## Pass type glob by reference
new_open(\*FH_IN, "test.txt") || die "Can't open the file\n";

## read the file without any modification to the filehandle
while (<FH_IN>) {
    print "$_";
}
close(FH_IN);

sub new_open {
    my $FH = $_[0];
    my $file_name = $_[1];
    #$FH = IO::File->new($file_name, "r");     ## This also does not work
    #$FH = IO::Zlib->new($file_name, "rb");    ## this does not work if I have a .gz file
    open($FH, "$file_name") || die "Can't open $!\n";    ## This method works
}
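
A likely reason the two commented-out lines fail: assigning to the lexical $FH only replaces the subroutine's private copy of the reference, so the caller's FH_IN glob is never touched. One possible workaround, sketched here for the IO::File case only, is to alias the caller's glob to the new handle with a glob assignment. The temporary file and its contents are assumptions made to keep the example self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use IO::File;

sub new_open {
    my ($glob_ref, $file_name) = @_;
    my $handle = IO::File->new($file_name, "r") or return;
    # Glob assignment: the caller's glob now shares the new handle's IO slot.
    *{$glob_ref} = *{$handle};
    return 1;
}

# Create a small input file to read back.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "first\nsecond\n";
close($tmp_fh);

new_open(\*FH_IN, $tmp_name) or die "Can't open $tmp_name: $!\n";
my @lines = <FH_IN>;    # bareword handle used unchanged in the caller
close(FH_IN);
print @lines;
```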
 
scottmf

Okay, I think I finally got things working the way I want (See code
below). Thanks for all the help. If anyone has any ideas on how to
improve this function it would still be greatly appreciated.

Thanks,
~Scott


#!/usr/local/bin/perl
#
use strict;
use warnings;
use IO::Zlib;

## Pass typeglob by reference
new_open(\*FH_IN, "test.txt.gz") || die "Can't open the file\n";

while (<FH_IN>) {    ## No change to bareword filehandle throughout code
    print "$_";
}
close(FH_IN);

sub new_open {
    my $FH = $_[0];
    my $file_name = $_[1];
    if ($file_name =~ /\.gz$/) {    ## check to see if it is a gzip file or not
        return tie *$FH, 'IO::Zlib', "$file_name", "rb";
    }
    else {
        ## IO::File provides no TIEHANDLE method, so open plain files directly
        return open($FH, "<", $file_name);
    }
}
 
Ben Morrow

scottmf said:
While I realize I could go through all of the scripts I need to update
and change every filehandle reference, it seems like there *should* be
a way to have my modified open function simply take a bareword
filehandle the same as open() does (and still "use strict;")

Use prototypes. This is what they are for.

~% perl -le'print prototype "CORE::open"'
*;$@

The fact the second arg is optional is probably not something you want
to emulate, so

use strict;
use Symbol;

sub my_open (*$;@) {

    my $FH = defined $_[0] ?
        Symbol::qualify_to_ref $_[0], caller :
        ($_[0] = Symbol::gensym);
    shift;

    my $op = shift;

    return open $FH, $op, @_;
}

will emulate open.
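
As a usage sketch, the same function can be called with either a bareword or an undefined lexical. The sub is repeated here with explicit parentheses so the block runs on its own; the temporary file and its contents are assumptions for illustration.

```perl
use strict;
use warnings;
use Symbol ();
use File::Temp qw(tempfile);

# The * prototype lets callers pass a bareword handle under "use strict".
sub my_open (*$;@) {
    my $FH = defined $_[0]
        ? Symbol::qualify_to_ref($_[0], scalar caller)
        : ($_[0] = Symbol::gensym());
    shift;
    my $op = shift;
    return open $FH, $op, @_;
}

# Create a small input file to read back.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "hello\n";
close($tmp_fh);

# Bareword handle, as in old-style code.
my_open(FH, '<', $tmp_name) or die "Can't open $tmp_name: $!\n";
my $from_bareword = <FH>;
close(FH);

# Undefined lexical: my_open fills it in, like the builtin open does.
my $fh;
my_open($fh, '<', $tmp_name) or die "Can't open $tmp_name: $!\n";
my $from_lexical = <$fh>;
close($fh);

print $from_bareword, $from_lexical;
```

The prototype only takes effect for calls compiled after the sub is declared, so the definition (or a forward declaration) has to come before the first call.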

However, a much better idea if you're writing your own function is to
write it to return an open FH, rather than opening onto one passed in.
It makes the implementation much simpler, and, as you can see, makes
replacing the function with another later much easier.

Ben
 
