Info regarding Zlib::GzipReader

J-H Johansen

Hi,

I'm trying to parse through a gzip'ed proxy access log with
Zlib::GzipReader and I'm having some difficulties.

require 'zlib'

f = File.open(file, "rb")
gz = Zlib::GzipReader.new(f)
gz.readlines.each do |block|
  puts block
end

This piece of code reads the first 6 lines of the proxy log before it
reaches (what it believes to be) the end of the file. These few lines
happen to be the info header, which contains:

#Software: ......
#Version: ......
#Start-date: ......
#Date: ......
#Fields: ....................
#Remark: ........

The access log contains a wee bit more than that though (980796 lines).
By just using File.open(file) it seems I can read the whole file.

I'm speculating here, but I think the gzip file may have been written
in stages, i.e. the first 6 lines were gzip'ed and then the rest of the
file was gzip'ed and appended to it afterwards.
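To check that guess without touching a real log, here is a minimal sketch that builds such a two-part gzip stream in memory and reads it the same way. The header and body strings are made-up placeholders, not real log content, and it assumes a Ruby recent enough to have Zlib.gzip.

```ruby
require 'zlib'
require 'stringio'

# Two independent gzip members, concatenated the way a log writer that
# compresses the header and the body separately might produce them.
# The contents are placeholders for illustration only.
header = Zlib.gzip("#Software: example\n#Version: example\n")
body   = Zlib.gzip("line 1\nline 2\n")

gz = Zlib::GzipReader.new(StringIO.new(header + body))
lines = gz.readlines   # only the lines from the first member
leftover = gz.unused   # bytes already read that lie past the first member
gz.close

puts "lines read: #{lines.length}"
puts "bytes left unread: #{leftover ? leftover.bytesize : 0}"
```

If the reader stops after the header lines and reports leftover bytes, the file is most likely a concatenation of several gzip members rather than a truncated or corrupt archive.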

One way of fixing the problem is to gunzip the file and then gzip the
output into a new file. Problem solved (sort of).

Do any of you know of any other way to do this without actually
modifying the access logs?

I'm thinking of something along the lines of breaking up the file into
smaller file handles which in turn can be used by GzipReader, but I
don't know how this is done.
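One possible way to do this, sketched under the assumption that the log really is several gzip members appended to each other: open a new GzipReader for each member on the same file handle, and use GzipReader#unused to rewind the handle over any bytes the previous reader pulled in but did not consume. The method name each_line_in_all_members is a hypothetical helper, not part of Zlib.

```ruby
require 'zlib'

# Hypothetical helper: yields every line from every gzip member in `io`.
# `io` must support read, pos, pos= and eof? (File and StringIO both do).
def each_line_in_all_members(io)
  until io.eof?
    gz = Zlib::GzipReader.new(io)
    gz.each_line { |line| yield line }

    # Bytes the reader took from `io` beyond the end of this member,
    # or nil if there were none.
    unused = gz.unused

    # finish closes the GzipReader but leaves `io` open.
    gz.finish

    # Step back so the next reader starts at the next member's header.
    io.pos -= unused.bytesize if unused
  end
end

# Usage, with `file` being the path from the snippet earlier in the post:
#
#   File.open(file, "rb") do |f|
#     each_line_in_all_members(f) { |line| puts line }
#   end
```

This reads line by line rather than via readlines, so a log of close to a million lines is not held in memory at once. It does not handle trailing padding or garbage after the last member.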

Does anyone know how this can be done, or if there are any better ways of doing it?


Thanks
 
