.gz and .bz2 files on the Perl command line

Ted Zlatanov

Say I have this program call

go.pl A B C.gz D.bz2

I'd like to just use <> in go.pl and have the compressed files
automatically extracted. Is there a way to do that without writing my
own logic, as a module I can bring in? This is a read-only task. I
couldn't find anything on CPAN.

Thanks
Ted
 
Jens Thoms Toerring

Ted Zlatanov said:
Say I have this program call
go.pl A B C.gz D.bz2
I'd like to just use <> in go.pl and have the compressed files
automatically extracted. Is there a way to do that without writing my
own logic, as a module I can bring in? This is a read-only task. I
couldn't find anything on CPAN.

There's the PerlIO::gzip module, to be used like this:

use PerlIO::gzip;
open FOO, "<:gzip", "file.gz" or die $!;
print while <FOO>;

And there's PerlIO::via::Bzip2, to be used like this:

use PerlIO::via::Bzip2;
open my $fh, "<:via(Bzip2)", "file.bz2" or die $!;
print while <$fh>;

I haven't used either of them myself; I just found them on CPAN by
searching for 'gzip' and 'bzip'.

Regards, Jens
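
The two modules above can be combined by choosing the open mode from the file extension. This is only a minimal sketch, assuming PerlIO::gzip and PerlIO::via::Bzip2 are installed from CPAN; mode_for and open_by_extension are hypothetical helper names, not part of either module:

```perl
use strict;
use warnings;

# Hypothetical helper: pick the open mode (including the PerlIO layer)
# from the file extension. Plain files get an ordinary read mode.
sub mode_for {
    my ($name) = @_;
    return '<:gzip'       if $name =~ /\.gz\z/;
    return '<:via(Bzip2)' if $name =~ /\.bz2\z/;
    return '<';
}

# Hypothetical helper: open a file for reading, loading the layer
# module only when it is actually needed.
sub open_by_extension {
    my ($name) = @_;
    my $mode = mode_for($name);
    require PerlIO::gzip       if $mode eq '<:gzip';
    require PerlIO::via::Bzip2 if $mode eq '<:via(Bzip2)';
    open my $fh, $mode, $name or die "Cannot open $name: $!\n";
    return $fh;
}

# Usage: still an explicit loop over @ARGV rather than a bare <>.
for my $name (@ARGV) {
    my $fh = open_by_extension($name);
    print while <$fh>;
    close $fh;
}
```

This still needs a loop and a filehandle variable, so it does not give the bare <> behaviour asked for in the original question.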
 
Ben Morrow

Quoth Ted Zlatanov said:
Say I have this program call

go.pl A B C.gz D.bz2

I'd like to just use <> in go.pl and have the compressed files
automatically extracted. Is there a way to do that without writing my
own logic, as a module I can bring in? This is a read-only task. I
couldn't find anything on CPAN.

Heh :). See the recent thread on p5p... The logic is trivial, at least
if your files have sane names:

BEGIN {
    for (@ARGV) {
        /\.gz$/  and $_ = "gzip -dc $_ |";
        /\.bz2$/ and $_ = "bzip2 -dc $_ |";
    }
}

since <> uses 2-arg open.

Ben
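
This works because <> opens each element of @ARGV with 2-arg open, and a trailing | there means "run this command and read its output". The same idea can be made somewhat more tolerant of odd file names by quoting them for the shell. A minimal sketch, assuming gzip and bzip2 are on the PATH and a POSIX-style shell is in use; pipe_for is a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: turn a file name into something 2-arg open can read.
# Compressed names become "command 'name' |"; other names are returned
# unchanged.
sub pipe_for {
    my ($name) = @_;
    my %tool = (gz => 'gzip -dc', bz2 => 'bzip2 -dc');
    my ($ext) = $name =~ /\.(gz|bz2)\z/ or return $name;
    # Single-quote the name for the shell, escaping embedded single quotes.
    (my $quoted = $name) =~ s/'/'\\''/g;
    return "$tool{$ext} '$quoted' |";
}

# Rewrite @ARGV in place; this must happen before the first <> is evaluated.
@ARGV = map { pipe_for($_) } @ARGV;
```

Uncompressed names are passed through untouched, so they remain subject to the usual 2-arg open magic.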
 
Paul Marquess

Ted said:
Say I have this program call

go.pl A B C.gz D.bz2

I'd like to just use <> in go.pl and have the compressed files
automatically extracted. Is there a way to do that without writing my
own logic, as a module I can bring in? This is a read-only task. I
couldn't find anything on CPAN.

Thanks
Ted

If you have IO::Uncompress::Gunzip & IO::Uncompress::Bunzip2, then the code
below will allow you to read .gz, .bz2, .zip etc. plus non-compressed files
as well.


use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

for my $file (@ARGV)
{
    my $fh = IO::Uncompress::AnyUncompress->new($file, Transparent => 1)
        or die "Cannot open $file: $AnyUncompressError\n";

    while (<$fh>)
    {
        # whatever
    }
}

The trick is to include the 'Transparent' option when using
IO::Uncompress::AnyUncompress. That will get it to read uncompressed files
if it doesn't detect any compression.

Paul
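
To get closer to the feel of <> without shelling out, the loop above can be hidden behind a small iterator that walks all the files in order. A minimal sketch, assuming IO::Uncompress::AnyUncompress and the decompressors it relies on are installed; line_reader is a hypothetical helper name, not part of the module:

```perl
use strict;
use warnings;
use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

# Hypothetical helper: return a closure that yields one line at a time
# from all of the given files in order, decompressing where needed.
sub line_reader {
    my @files = @_;
    my $fh;
    return sub {
        while (1) {
            if (!$fh) {
                my $file = shift @files;
                return unless defined $file;    # no files left
                $fh = IO::Uncompress::AnyUncompress->new($file, Transparent => 1)
                    or die "Cannot open $file: $AnyUncompressError\n";
            }
            my $line = $fh->getline;
            return $line if defined $line;
            $fh->close;                         # end of this file
            undef $fh;                          # move on to the next one
        }
    };
}

# Usage: one loop over every line of every file named on the command line.
my $next = line_reader(@ARGV);
while (defined(my $line = $next->())) {
    print $line;
}
```

Unlike <>, this does not fall back to reading STDIN when @ARGV is empty.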
 
xhoster

Ben Morrow said:
Heh :). See the recent thread on p5p... The logic is trivial, at least
if your files have sane names:

BEGIN {
    for (@ARGV) {
        /\.gz$/  and $_ = "gzip -dc $_ |";
        /\.bz2$/ and $_ = "bzip2 -dc $_ |";
    }
}

since <> uses 2-arg open.


cat AutoZip.pm
use strict;
use warnings;
for (@ARGV) {
    /\.gz$/  and $_ = "gzip -dc $_ |";
    /\.bz2$/ and $_ = "bzip2 -dc $_ |";
}
1;
__END__

perl -I. -MAutoZip -lne 'print' thing.gz

Yep, I get uncompressed lines out. It is a thing of beauty. Thanks
Ben.


Xho

--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.
 
Ted Zlatanov

PM> If you have IO::Uncompress::Gunzip & IO::Uncompress::Bunzip2, then the code
PM> below will allow you to read .gz, .bz2, .zip etc. plus non-compressed files
PM> as well.

PM> use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

PM> for my $file (@ARGV)
PM> {
PM>     my $fh = IO::Uncompress::AnyUncompress->new($file, Transparent => 1)
PM>         or die "Cannot open $file: $AnyUncompressError\n";

PM>     while (<$fh>)
PM>     {
PM>         # whatever
PM>     }
PM> }

I was trying to avoid doing that work; I just wanted to use <> without
creating variables. This was my second choice :)

PM> The trick is to include the 'Transparent' option when using
PM> IO::Uncompress::AnyUncompress. That will get it to read uncompressed files
PM> if it doesn't detect any compression.

Cool, I will remember that.


BM> The logic is trivial, at least if your files have sane names:

BM> BEGIN {
BM>     for (@ARGV) {
BM>         /\.gz$/  and $_ = "gzip -dc $_ |";
BM>         /\.bz2$/ and $_ = "bzip2 -dc $_ |";
BM>     }
BM> }

BM> since <> uses 2-arg open.

Nice, I think that's perfect. It would be even better if a module could
do this rewrite more magically and in a more "proper" way (not depending
on gzip and bzip2 being available), but this is very nice.

On 5 Aug 2008 21:35:50 GMT, Jens Thoms Toerring wrote:

JTT> There's the PerlIO::gzip module, to be used like this:

JTT> use PerlIO::gzip;
JTT> open FOO, "<:gzip", "file.gz" or die $!;
JTT> print while <FOO>;

JTT> And there's PerlIO::via::Bzip2, to be used like this

JTT> use PerlIO::via::Bzip2;
JTT> open my $fh, "<:via(Bzip2)", "file.bz2" or die $!;
JTT> print while <$fh>;

JTT> I haven't used either of them myself; I just found them on CPAN by
JTT> searching for 'gzip' and 'bzip'.

As with Paul Marquess' suggestion, this is a good solution but I wanted
something more magical ;)

Thanks
Ted
 
