Getting files out of a .tar.gz archive

celldee

Hi,

I have a .tar.gz file containing some xml files. I need to locate
particular files in the archive and process them. I'm looking for a
pure Ruby way to do this without resorting to external system
commands.

I found Archive::Tar::Minitar that allows me to process my files once
I have expanded the .tar.gz file to a .tar file. So I can do this :-

open(@tarfile, "rb") do |f|
  Archive::Tar::Minitar::Reader.open(f).each do |entry|
    fpl = StringIO.new(entry.read) if entry.name =~ /#{@date}#{channel}_pl/
    fpi = StringIO.new(entry.read) if entry.name =~ /#{@date}#{channel}_pi/
  end
end

However, in order to get a tar file, I have to call gunzip to expand
my .tar.gz file. Does anybody know of a way for me to replace the
gunzip call with a Ruby library call of some sort? Or does anyone have
suggestions for an alternative way to do this?

Cheers,

Chris
http://smuby.org
 
Todd Benson

There's Zlib::Gzip* classes. I've never used them, though.

Todd
 
celldee

Hi Todd,

I've tried to use Zlib::GzipReader, but that just gives me a
continuous stream of text or a series of strings that do not resemble
the actual file structure, unless I've missed something.

Thanks,

Chris
http://smuby.org
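
For anyone puzzled by the same output: what Zlib::GzipReader hands back is the raw tar byte stream, so the "file structure" is still encoded in 512-byte header blocks that a tar reader has to interpret. Below is a minimal sketch of that layout, assuming a POSIX (ustar) archive. The sample file name and contents are invented, and Gem::Package::TarWriter from the standard library is used only to build test data in memory.

```ruby
require 'zlib'
require 'stringio'
require 'rubygems/package' # stdlib tar writer, used here only to build sample data

# Build a tiny .tar.gz in memory (hypothetical file name and contents).
tar_io = StringIO.new("".b)
Gem::Package::TarWriter.new(tar_io) do |tar|
  tar.add_file_simple('20080101bbc_pl.xml', 0o644, 6) { |f| f.write("<pl/>\n") }
end
tgz_bytes = Zlib.gzip(tar_io.string)

# Decompressing yields tar bytes, not a list of files.
gz = Zlib::GzipReader.new(StringIO.new(tgz_bytes))
header = gz.read(512)                   # every entry starts with a 512-byte header
name   = header[0, 100].delete("\0")    # bytes 0..99: NUL-padded file name
size   = header[124, 12].to_i(8)        # bytes 124..135: size as octal text
magic  = header[257, 5]                 # bytes 257..261: "ustar" for POSIX tar
body   = gz.read(size)                  # the file data follows the header
gz.close
```

Parsing headers by hand like this is only meant to show what the stream contains; a tar reader such as Minitar::Reader does the same bookkeeping for you.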
 
Daniel Brumbaugh Keeney

Use the docs.

From Minitar's readme:
tgz = Zlib::GzipReader.new(File.open('test.tgz', 'rb'))
# Warning: tgz and the file will be closed.
Minitar.unpack(tgz, 'x')

For GZip and the rest of the standard library,
http://www.ruby-doc.org/stdlib/

Daniel Brumbaugh Keeney
 
celldee

Hi Daniel,

I saw that, but I don't want to expand the .tar.gz any more than I
have to. The code that I put up earlier is getting what I want out of
the .tar file using Minitar::Reader which I quite like, I'm just
looking to eliminate the gunzip step. Minitar.unpack expands
the .tar.gz and writes the files to disk, which means that I'll have
to mess around in the filesystem more than I need to.

Thanks for your reply,

Chris
http://smuby.org
 
Joachim Glauche

celldee said:
I saw that, but I don't want to expand the .tar.gz any more than I
have to. The code that I put up earlier is getting what I want out of
the .tar file using Minitar::Reader which I quite like, I'm just
looking to eliminate the gunzip step.

If the archive is packed with Gzip, you always have to unpack it in order
to read the content.

Tar is the container and Gzip does the compression. The files in a
tar.gz file are first put into a container and then compressed. So in
order to read anything inside the tar, you have to decompress the file
to get access to the tarball that holds your file.
 
Daniel Brumbaugh Keeney

My apologies for failing to understand the issue. GZip and Minitar
both provide incremental readers, although I have not used them. I
believe the correct combination for what you're asking is this:

tgz = Zlib::GzipReader.new(File.open('test.tgz', 'rb'))
# Warning: tgz and the file will be closed.
reader = Minitar::Reader.new(tgz)
reader.each_entry do |file|
  # do something with each file, and break if you like
end
reader.close # does this do anything?
tgz.close

Daniel Brumbaugh Keeney
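
A self-contained sketch of the same idea, for anyone who wants to try it without installing anything: it swaps Minitar::Reader for Gem::Package::TarReader, which ships with Ruby, so it is not the Minitar API itself. The helper names (sample_tgz, matching_entries), the file names, and the contents are all made up for illustration. Nothing is written to disk, and the gzip data is decompressed as the tar reader pulls bytes from it.

```ruby
require 'zlib'
require 'stringio'
require 'rubygems/package' # stdlib tar support; a stand-in for Minitar here

# Build a sample .tar.gz in memory (hypothetical names and contents).
def sample_tgz
  files = { '20080101bbc_pl.xml' => '<pl/>',
            '20080101bbc_pi.xml' => '<pi/>',
            'readme.txt'         => 'ignored' }
  tar_io = StringIO.new("".b)
  Gem::Package::TarWriter.new(tar_io) do |tar|
    files.each do |name, data|
      tar.add_file_simple(name, 0o644, data.bytesize) { |f| f.write(data) }
    end
  end
  Zlib.gzip(tar_io.string)
end

# Return { entry name => StringIO } for every file entry matching pattern.
def matching_entries(tgz_io, pattern)
  found = {}
  gz = Zlib::GzipReader.new(tgz_io)
  begin
    Gem::Package::TarReader.new(gz).each do |entry|
      next unless entry.file? && entry.full_name =~ pattern
      found[entry.full_name] = StringIO.new(entry.read.to_s)
    end
  ensure
    gz.close # closes the gzip reader and the IO it wraps
  end
  found
end

date, channel = '20080101', 'bbc'
docs = matching_entries(StringIO.new(sample_tgz), /#{date}#{channel}_p[li]/)
```

With a real archive you would pass File.open(path, 'rb') instead of the StringIO; the explicit close in the ensure clause then releases the file handle as well.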
 
Daniel Brumbaugh Keeney

tgz = Zlib::GzipReader.new(File.open('test.tgz', 'rb'))
# Warning: tgz and the file will be closed.

I seem to have failed to remove the former comment. It looks like
Minitar::Reader at no point closes tgz or the file, and therefore tgz
needs to be closed manually afterward, as my code correctly
demonstrated, though the comment did not.

Daniel Brumbaugh Keeney
 
Paul Mckibbin

You can use Zlib::GzipReader.open with a block. (Since I set Chris the
original problem, and this isn't the hard part, I'll show the code that
is used in the app, with a couple of minor mods to make it look like
Chris's example.)

Zlib::GzipReader.open(@tarfile) { |tgz|
  Archive::Tar::Minitar::Reader.open(tgz).each do |entry|

    # Chris's code
    # fpl = StringIO.new(entry.read) if entry.name =~ /#{@date}#{channel}_pl/
    # fpi = StringIO.new(entry.read) if entry.name =~ /#{@date}#{channel}_pi/

    # Or test the verification by using the XML document directly
    # fpl = REXML::Document.new(StringIO.new(entry.read)) if entry.name =~ /#{date}#{channel}_pl/
    # fpi = REXML::Document.new(StringIO.new(entry.read)) if entry.name =~ /#{date}#{channel}_pi/

  end
}

Mac
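
A small sketch of why the block form settles the earlier "who closes tgz?" question: Zlib::GzipReader.open yields the reader and closes it when the block finishes, so no manual close is needed. The temporary file and its contents are sample data, and the variable names (tmp, yielded_reader, text) are only for illustration.

```ruby
require 'zlib'
require 'tempfile'

# Write a small gzip file to a temporary path (arbitrary sample contents).
tmp = Tempfile.new(['sample', '.gz'])
tmp.binmode
tmp.write(Zlib.gzip('hello'))
tmp.close

# Block form: the reader is closed when the block returns,
# and the call returns the block's value.
yielded_reader = nil
text = Zlib::GzipReader.open(tmp.path) do |gz|
  yielded_reader = gz   # kept only so we can inspect it afterwards
  gz.read
end

closed_after_block = yielded_reader.closed?
tmp.unlink
```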
 
