Zip Inflator

Tony O'Bryan

I'm trying to read a Zip file, and am having problems. First, here is my
process:

1) Create a ZipFile object based on a filename.
2) Get a ZipEntry object based on a filename in the archive.
3) Create an Inflater object with nowrap set to true. Removing true generates
an "unknown compression type" error.
4) Get an InputStream object from the ZipFile object, based on the ZipEntry.
5) Read the compressed data via the InputStream object.
6) Set the Inflater object's input to the compressed data.
7) Call the Inflater's inflate method to get decompressed data.

Steps 1 through 6 work fine, but my program fails at step 7. I always get
"java.util.zip.DataFormatException: invalid block type" at the call to
inflate. Here's the source code to go along with the problem:

========================================================

ZipFile zipFile = new ZipFile(args[0]);
ZipEntry entry = zipFile.getEntry(args[1]);
Inflater inflater = new Inflater(true);
InputStream input = zipFile.getInputStream(entry);
long lCompressedSize = entry.getCompressedSize();
byte [] baCompressed = new byte[(int)lCompressedSize];
byte [] baUncompressed = new byte[(int)entry.getSize()];
int nBytesRead;

nBytesRead = input.read(baCompressed);
inflater.setInput(baCompressed,0,nBytesRead);
inflater.inflate(baUncompressed);

=========================================================

I have confirmed results at each stage, up to and including reading the
compressed data into baCompressed.

Is there some gotcha in Java's Zip handling that I'm not aware of?
 
Chris Uppal

Tony said:
1) Create a ZipFile object based on a filename.
2) Get a ZipEntry object based on a filename in the archive.
3) Create an Inflater object with nowrap true. Removing true generates an
"unknown compression type" error.

Why create an Inflater object? ZipFile/ZipEntry decompress the data (where
necessary) for you.
4) Get an InputStream object from the ZipFile object, based on the
ZipEntry.

Specifically, the contents of this stream are already decompressed.

-- chris
 
Tony O'Bryan

Chris said:
Why create an Inflater object? ZipFile/ZipEntry decompress the data
(where necessary) for you.
Specifically, the contents of this stream are already decompressed.

That's what I hoped at first, but I didn't see any way to get access to the
decompressed data from ZipEntry without (presumably) using the InputStream.

When I did use the InputStream, though, the read method returned the exact
number of bytes for the compressed data. The returned data also didn't
match the original data. I therefore assumed that it wasn't yet
decompressed.

How do I get access to the decompressed data from the ZipEntry?
 
Chris Uppal

Tony said:
When I did use the InputStream, though, the read method returned the exact
number of bytes for the compressed data. The returned data also didn't
match the original data. I therefore assumed that it wasn't yet
decompressed.

Hmm. That seems strange. The number of bytes returned by each read() means
nothing much in itself, of course, or do you mean that the /total/ number of
bytes read is the same as the compressed size ?

Anyway, here is some code pulled from an example that used to work for me
(not re-tested):

ZipFile zipFile = new ZipFile("xxx");
ZipEntry entry = zipFile.getEntry("yyy");

byte[] buffer = new byte[1024 * 8];
InputStream stream = zipFile.getInputStream(entry);
int count;
while ((count = stream.read(buffer)) >= 0)
{
    // do something with the first 'count' bytes of the buffer
}
stream.close();
zipFile.close();

Each buffer's worth of data is already decompressed. Please remember that a
call to read() is /not/ required to fill the supplied buffer before returning,
even if there is enough data available to do so.
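Since read() may return fewer bytes than the buffer holds, one way to collect a whole entry is to append only the bytes each call actually returned. The following is a minimal sketch of that idea, not code from the thread: the class name ReadZipEntry and the helper readEntry are hypothetical, and it assumes Java 7 or later for try-with-resources.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class ReadZipEntry {
    // Hypothetical helper: returns the decompressed bytes of one entry.
    static byte[] readEntry(String zipPath, String entryName) throws IOException {
        try (ZipFile zipFile = new ZipFile(zipPath)) {
            ZipEntry entry = zipFile.getEntry(entryName);
            if (entry == null) {
                throw new IOException("No such entry: " + entryName);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8 * 1024];
            try (InputStream stream = zipFile.getInputStream(entry)) {
                int n;
                // read() returns -1 at end of stream; otherwise only the
                // first n bytes of the buffer are valid for this pass.
                while ((n = stream.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = readEntry(args[0], args[1]);
        System.out.println("Decompressed bytes: " + data.length);
    }
}
```

The total length of the returned array should match entry.getSize() when the archive records that size.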

-- chris
 
Roedy Green

5) Read the compressed data via the InputStream object.

That should be sufficient. The InputStream automatically
decompresses. There is no need to fool around with Inflaters.

File zf = new File(... );
ZipFile zip = new ZipFile( zf );
// for each element in the zip
// can't use for:each
// only works with Iterable, not Enumeration.
for ( Enumeration e = zip.entries(); e.hasMoreElements(); )
    {
    ZipEntry entry = (ZipEntry)e.nextElement();
    String elementName = entry.getName();
    // getInputStream returns a plain InputStream of decompressed data
    InputStream in = zip.getInputStream( entry );
    ...
    } // end for each element in the zip
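For comparison, here is a small sketch of the same enumeration written with generics, which removes the cast. The class name ListZipEntries and the helper entryNames are made up for illustration, and it assumes Java 7 or later.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class ListZipEntries {
    // Hypothetical helper: collects the name of every entry in the archive.
    static List<String> entryNames(String zipPath) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(zipPath)) {
            // entries() is declared as Enumeration<? extends ZipEntry>,
            // so nextElement() needs no cast.
            Enumeration<? extends ZipEntry> e = zip.entries();
            while (e.hasMoreElements()) {
                names.add(e.nextElement().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        for (String name : entryNames(args[0])) {
            System.out.println(name);
        }
    }
}
```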
 
Tony O'Bryan

Chris said:
Hmm. That seems strange. The number of bytes returned by each read()
means nothing much in itself, of course, or do you mean that the /total/
number of bytes read is the same as the compressed size ?

I meant the total size. However, there was a bug in my code. I was reading
into a buffer equal in size to that of the compressed data, so naturally I
was receiving only that many bytes.
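To illustrate the buffer-size bug described above, here is a hedged sketch that sizes the buffer from the uncompressed size, entry.getSize(), rather than entry.getCompressedSize(), and uses DataInputStream.readFully to keep reading until the buffer is full. The class name ReadEntryBySize and the helper readEntry are hypothetical. The sketch assumes the archive records the uncompressed size; getSize() returns -1 when it is unknown.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class ReadEntryBySize {
    // Hypothetical helper: reads one entry into an exactly sized array.
    static byte[] readEntry(ZipFile zipFile, ZipEntry entry) throws IOException {
        long size = entry.getSize(); // uncompressed size, or -1 if unknown
        if (size < 0 || size > Integer.MAX_VALUE) {
            throw new IOException("Unknown or oversized entry: " + entry.getName());
        }
        byte[] data = new byte[(int) size];
        try (DataInputStream in = new DataInputStream(zipFile.getInputStream(entry))) {
            // readFully loops internally until the array is full,
            // or throws EOFException if the stream ends early.
            in.readFully(data);
        }
        return data;
    }

    public static void main(String[] args) throws IOException {
        try (ZipFile zipFile = new ZipFile(args[0])) {
            ZipEntry entry = zipFile.getEntry(args[1]);
            byte[] data = readEntry(zipFile, entry);
            System.out.println("Compressed size:   " + entry.getCompressedSize());
            System.out.println("Uncompressed size: " + data.length);
        }
    }
}
```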

[my keyboard died right in the middle of typing this. I had to go out to
the store for a new one]

You are correct that ZipEntry has the decompressed data via its InputStream.
I wrote the ASCII value of each to stdout one byte at a time, and each byte
is correct. I am also getting the correct number of bytes.

My problem is that my FileWriter is writing a lot of extra bytes to the file
for some reason. It is also somehow altering the bytes that it writes,
which is why I initially thought the Zip objects weren't correct. It seems
at first glance to be masking out bit 5 of every byte for some reason.

My test JPEG is 2986 bytes uncompressed, but the FileWriter is outputting
5598 bytes. This is my decompression loop code:

while ( (nBytesRead = input.read(baUncompressed)) > -1)
{
    nTotal += nBytesRead;
    System.out.println("Bytes read: " + nBytesRead);
    for (int i = 0;i < nBytesRead;i++)
    {
        System.out.println(i + "=[" + baUncompressed[i] + "]");
        writerUncompressed.write(baUncompressed[i]);
    }
}
writerUncompressed.close();
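A likely cause, though not confirmed in the thread, is that FileWriter is a character stream: each value written is treated as a character and passed through a charset encoder, which can change and multiply the bytes that reach the file. For binary data such as a JPEG, a byte stream avoids that step. The sketch below assumes that diagnosis; the class name CopyRawBytes and the helper copy are hypothetical.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class CopyRawBytes {
    // Hypothetical helper: copies bytes unchanged and returns the count.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8 * 1024];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            // Write only the n bytes returned by this read, with no
            // character encoding involved.
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // args: zip file, entry name, output file
        try (ZipFile zipFile = new ZipFile(args[0])) {
            ZipEntry entry = zipFile.getEntry(args[1]);
            try (InputStream in = zipFile.getInputStream(entry);
                 OutputStream out = new FileOutputStream(args[2])) {
                System.out.println("Bytes written: " + copy(in, out));
            }
        }
    }
}
```

With this approach the output file should have the same length as the uncompressed entry.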
 
Roedy Green

[my keyboard died right in the middle of typing this.

You can often revive them by popping keys and cleaning out with a
Q-tip soaked in alcohol. Bits of grit or hair in the keyboard can
block a key from contacting properly.

Sometimes all you have to do is hit the alt, ctrl and shift keys to get
the software back in sync. And don't forget to check that keypad mode is
off.
 
Tony O'Bryan

Roedy said:
that is because the bytes are perfect as they are. Just write them out
as raw bytes. See http://mindprod.com/applets/fileio.html
for how.

Writing them raw was what I was trying to do. Being away from Java for
almost two years has taken its toll on my memory of the class libraries.

Actually, writing them to disk was just supposed to be a technique to verify
the accuracy of the Zip read. My application doesn't actually need to
write anything it unzips. I just don't like it when something doesn't
work, even a throwaway code path, so I had to figure it out -grin-.
 
