Query about compression and decompression in Java using java.util.zip package

yogesh

I am compressing/decompressing objects and sending them over HTTP, as
described in the article at the URL below:
http://java.sun.com/developer/technicalArticles/Programming/compression/

However I have a query regarding some code in which the following
sequence is depicted.

import java.io.*;
import java.util.zip.*;

public class SaveEmployee {
    public static void main(String argv[]) throws Exception {
        // create some objects
        Employee sarah = new Employee("S. Jordan", 28, 56000);
        Employee sam = new Employee("S. McDonald", 29, 58000);
        // serialize the objects sarah and sam
        FileOutputStream fos = new FileOutputStream("db");
        GZIPOutputStream gz = new GZIPOutputStream(fos);
        ObjectOutputStream oos = new ObjectOutputStream(gz);
        oos.writeObject(sarah);
        oos.writeObject(sam);
        oos.flush();
        oos.close();
        fos.close();
    }
}

I wanted to know what is the logical explanation of the sequence of
the statements marked below

GZIPOutputStream gz = new GZIPOutputStream(fos);
ObjectOutputStream oos = new ObjectOutputStream(gz);
oos.writeObject(sarah);


It seems to me that at the time the GZIPOutputStream is created, it is
empty, and then the empty stream is passed on to the ObjectOutputStream
and only then are objects written into the stream.

It seems to me that the logical sequence should be

ObjectOutputStream oos = new ObjectOutputStream(fos);
oos.writeObject(sarah);
oos.writeObject(sam);
GZIPOutputStream gz = new GZIPOutputStream(oos);
gz.flush();
gz.close();
oos.close();

i.e. first the ObjectOutputStream is created, then objects are written
onto the stream, and afterwards the stream is zipped using
GZIPOutputStream.


In my code, if I try to reverse the order (the second case), I get the
error
java.io.EOFException
at java.util.zip.GZIPInputStream.readUByte(GZIPInputStream.java:200)
at java.util.zip.GZIPInputStream.readUShort(GZIPInputStream.java:190)
at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:130)
at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:58)
at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:68)

Why doesn't this work?


Any help would be greatly appreciated
 
Filip Larsen

yogesh wrote
[...]
I wanted to know what is the logical explanation of the sequence of
the statements marked below

GZIPOutputStream gz = new GZIPOutputStream(fos);
ObjectOutputStream oos = new ObjectOutputStream(gz);
oos.writeObject(sarah);

With standard Java I/O, one or more streams are connected into a kind of
processing pipeline. You put data into one end and, after processing
(typically including buffering and modification), it appears at the
other end in a file, on a network socket, in a byte array buffer, or
similar.

In your case, you want the object stream data to be compressed before it
is placed in a file, hence you must have the pipeline:
ObjectOutputStream -> GZIPOutputStream -> FileOutputStream. When you
write an object to the ObjectOutputStream, it emits a sequence of bytes
that are compressed by the GZIPOutputStream which, when enough bytes
have been received, emits a sequence of compressed bytes to the
FileOutputStream. When you close the ObjectOutputStream at the end,
every stream in the pipeline writes out any data it still holds. Note
that flush() alone is not enough here: GZIPOutputStream only writes its
final compressed block and trailer on close() or finish().
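As a minimal sketch of the pipeline described above (the class name
WritePipelineSketch and the use of a ByteArrayOutputStream in place of
the FileOutputStream are assumptions made to keep the example
self-contained), the streams are wrapped from the sink outwards and only
the outermost one is written to and closed:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.zip.GZIPOutputStream;

class WritePipelineSketch {
    /** Serializes a value, gzips the serialized bytes, and returns the result. */
    static byte[] compress(Object value) throws IOException {
        // Stands in for new FileOutputStream("db") so the sketch needs no file.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Closing the outermost stream finishes the gzip data and closes the chain.
        try (ObjectOutputStream oos = new ObjectOutputStream(new GZIPOutputStream(sink))) {
            oos.writeObject(value); // object -> serialized bytes -> gzip -> sink
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] out = compress("S. Jordan");
        // Every gzip stream begins with the magic bytes 0x1f 0x8b.
        System.out.printf("%d bytes, first two: %02x %02x%n",
                out.length, out[0] & 0xff, out[1] & 0xff);
    }
}
```

The order of construction is the reverse of the order in which the data
flows: the stream created last is the one the data enters first.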

It seems to me that at the time the GZIPOutputStream is created, it is
empty, and then the empty stream is passed on to the ObjectOutputStream
and only then are objects written into the stream.

Think of it as a pipeline where data you put in may appear right away in
the other end. The OutputStreams are meant to process data, not to store
it as such. Of course, some streams have to buffer a bit of data in
order to work or perform better, but in principle they do not store
data.
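To illustrate that a stream only processes bytes that are written
through it, here is a small sketch of the reversed order from the
question (ReversedOrderSketch and its method names are made up for this
example, and an in-memory sink replaces the file). The objects reach the
sink before the GZIPOutputStream exists, so they are never compressed
and the result is not a gzip stream:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class ReversedOrderSketch {
    /** Writes the object first and wraps the stream in gzip afterwards. */
    static byte[] reversedOrder(Object value) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(sink);
        oos.writeObject(value);                          // not compressed
        GZIPOutputStream gz = new GZIPOutputStream(oos); // sees only later writes to gz
        gz.close();                                      // also closes oos
        return sink.toByteArray();
    }

    /** Returns true if a GZIPInputStream can at least be opened on the data. */
    static boolean opensAsGzip(byte[] data) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            return true;
        } catch (IOException e) { // ZipException and EOFException are IOExceptions
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = reversedOrder("S. Jordan");
        // The data starts with the serialization magic 0xACED, not gzip's 0x1f8b.
        System.out.printf("first two bytes: %02x %02x%n", data[0] & 0xff, data[1] & 0xff);
        System.out.println("opens as gzip: " + opensAsGzip(data));
        // An empty input fails while the gzip header is being read.
        System.out.println("empty opens as gzip: " + opensAsGzip(new byte[0]));
    }
}
```

An EOFException raised from readHeader, as in the stack trace in the
question, is what GZIPInputStream throws when its input ends before a
complete gzip header has been read, for example when the input is empty.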


Regards,
 
yogesh

Hi Filip.
Thanks for your reply. However, it has not answered my question
completely.
What seems to work is

FileOutputStream->GZIPOutputStream ->ObjectOutputStream->
ObjectOutputStream.writeObject() (output end)
FileInputStream->GZIPInputStream ->ObjectInputStream->
ObjectInputStream.readObject() (input end)

i.e. the stream is first zipped and then sent as objects.
Here the stream is first zipped at the output end before the actual
object is written. So how come the objects come out zipped if they are
written into the stream later?

What should work (but does not) is

FileOutputStream->ObjectOutputStream->
ObjectOutputStream.writeObject()->GZIPOutputStream (output end)
(the objects are written first and then zipped and sent)


FileInputStream->GZIPInputStream->ObjectInputStream->
ObjectInputStream.readObject() (input end)
(at input they are unzipped and then read)

Any light on this would be appreciated.

Thanks
yogesh.

 
Michael Borgwardt

yogesh said:
Thanks for your reply. However, it has not answered my question
completely.
What seems to work is

FileOutputStream->GZIPOutputStream ->ObjectOutputStream->
ObjectOutputStream.writeObject() (output end)
FileInputStream->GZIPInputStream ->ObjectInputStream->
ObjectInputStream.readObject() (input end)

i.e the stream is first zipped and then sent as objects.
No.

Here the stream is first zipped at the output end before the actual
object is written. So how come the objects come out zipped if they are
written into the stream later?

You seem to have problems understanding what method calls are.
A method call has parameters and a return value. In the "writeObject"
case only the parameters are relevant, but in the "readObject" case
only the return value is relevant. And they are processed in the
opposite order.

The call writeObject() does the following, in that order:

- The ObjectOutputStream encodes (serializes) the object passed
as parameter into a sequence of bytes and passes it to the
GZIPOutputStream.
- The GZIPOutputStream compresses the sequence of bytes to a
(probably) shorter sequence of bytes and passes it to the
FileOutputStream.
- The FileOutputStream writes the resulting bytes to a file.

This is simple because there are no return values (actually there
are at the lower levels, but it's not relevant for understanding
what happens).

readObject() on the other hand does this:

- The ObjectInputStream asks the GZIPInputStream to supply bytes
that can be decoded into an object.
- The GZIPInputStream asks the FileInputStream for bytes to
decompress.
- The FileInputStream reads the bytes from the file and returns
them to the GZIPInputStream.
- The GZIPInputStream decompresses the bytes and returns the resulting
longer byte sequence to the ObjectInputStream.
- The ObjectInputStream decodes (deserializes) the byte sequence and
returns the resulting object to the calling method.
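The two call sequences above can be sketched as one round trip. The
Employee class below is a hypothetical stand-in (the article's own
Employee is not shown in this thread), and a byte array replaces the
file so the example is self-contained:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class RoundTripSketch {
    /** Hypothetical stand-in for the article's Employee class. */
    static class Employee implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int age;
        final int salary;

        Employee(String name, int age, int salary) {
            this.name = name;
            this.age = age;
            this.salary = salary;
        }
    }

    /** writeObject(): serialize, then compress, then store. */
    static byte[] write(List<Employee> staff) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(new GZIPOutputStream(sink))) {
            for (Employee e : staff) {
                oos.writeObject(e);
            }
        }
        return sink.toByteArray();
    }

    /** readObject(): fetch stored bytes, decompress, then deserialize. */
    static List<Employee> read(byte[] data, int count) throws IOException, ClassNotFoundException {
        List<Employee> staff = new ArrayList<>();
        try (ObjectInputStream ois = new ObjectInputStream(
                new GZIPInputStream(new ByteArrayInputStream(data)))) {
            for (int i = 0; i < count; i++) {
                staff.add((Employee) ois.readObject());
            }
        }
        return staff;
    }

    public static void main(String[] args) throws Exception {
        List<Employee> in = new ArrayList<>();
        in.add(new Employee("S. Jordan", 28, 56000));
        in.add(new Employee("S. McDonald", 29, 58000));
        for (Employee e : read(write(in), in.size())) {
            System.out.println(e.name + " " + e.age + " " + e.salary);
        }
    }
}
```

On both sides the ObjectXxxStream is the outermost wrapper, because it
is the only stream that deals in objects; the gzip and file streams
beneath it only ever see bytes.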
 
Query On DataCompression using java.util.zip.

I have used the following code to write into a file using object stream.

public void writeData() {
    try {
        // oos and i are fields of the enclosing class (not shown)
        oos = new ObjectOutputStream(new GZIPOutputStream(new FileOutputStream(new File("D:/xyz.dat"), true)));
        DataContainer oContainer = new DataContainer(i++, i);
        oos.writeObject(oContainer);
        oos.flush();
        oos.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
}

public class DataContainer implements Serializable {
    private int id;
    private int data;

    public DataContainer(int id, int data) {
        this.id = id;
        this.data = data;
    }

    public void printData() {
        System.out.println("ID->>" + this.id + "\tDATA->>" + this.data);
    }
}

And I try to read with the following code
public void readaData() {
    try {
        ObjectInputStream ois = new ObjectInputStream(new GZIPInputStream(new FileInputStream("D:/xyz.dat")));
        while (true) {
            try {
                DataContainer oDataContainer = (DataContainer) ois.readObject();
                oDataContainer.printData();
            } catch (Exception e1) {
                e1.printStackTrace();
                break;
            }
        }
        ois.close();
        Thread.sleep(1000);
    } catch (Exception e) {
        e.printStackTrace();
    }
}


On reading, I get an EOFException before all the expected objects have been read.
I observed that if I call oos.close() in writeData() and then append to the file again, an end-of-stream marker seems to be placed in the file after the first write, which prevents me from meeting my requirement.


My requirement is to append object data to a file in compressed format and to read it back in full.


Can anyone help me in this regard?
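
One possible approach, offered only as a sketch under stated assumptions
and not as a confirmed fix: each call to the write method above appends
a complete gzip member that contains its own serialization stream,
header included. A single ObjectInputStream reads only one such header,
so the reader below opens a fresh ObjectInputStream for every appended
segment on top of one shared GZIPInputStream. It assumes a JDK whose
GZIPInputStream reads concatenated gzip members, and that every segment
holds exactly one object. AppendSketch, append and readAll are names
invented for this example.

```java
import java.io.BufferedInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class AppendSketch {
    /** Appends one object as its own gzip member with its own stream header. */
    static void append(Path file, Serializable value) throws IOException {
        try (ObjectOutputStream oos = new ObjectOutputStream(new GZIPOutputStream(
                Files.newOutputStream(file, StandardOpenOption.CREATE, StandardOpenOption.APPEND)))) {
            oos.writeObject(value);
        }
    }

    /** Reads back every appended object; assumes the file is not empty. */
    static List<Object> readAll(Path file) throws IOException, ClassNotFoundException {
        List<Object> result = new ArrayList<>();
        try (GZIPInputStream gz = new GZIPInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            while (true) {
                ObjectInputStream ois;
                try {
                    // Each segment starts with a new serialization header.
                    ois = new ObjectInputStream(gz);
                } catch (EOFException endOfData) {
                    break; // no further segment: all data has been read
                }
                result.add(ois.readObject()); // one object per segment
            }
        } // closing gz also closes the underlying file stream
        return result;
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("append-sketch", ".dat");
        try {
            append(file, "first");
            append(file, "second");
            append(file, Integer.valueOf(3));
            System.out.println(readAll(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

An alternative that avoids relying on concatenated gzip members is to
keep one ObjectOutputStream open for the whole session, or to read the
existing objects and rewrite the file whenever new data is added.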
 
