Easy-to-use compression for log files without latency?

xyzzy12

We need to compress log files while they are being written out. We put
in the GZIP output stream classes, and at first glance they work fine,
BUT you have to wait a good amount of time before they start to write
anything out (the file size stays small). Once the stream has churned
through a bunch of data it starts writing, but it's behind. That's
pretty annoying for troubleshooting, and troubleshooting is what we
have the logs for! Is there a compression stream we can use that is
optimized for regular text and doesn't need to buffer so much? Can we
"preseed" the gzip class so it will start to write out sooner?

Any suggestions?
 
Roedy Green

Can we "preseed" the gzip class so it will start to
write out sooner?

You can try calling the flush() method.

You can use unbuffered I/O.

You can use a smaller buffer. The default is 8 KB.

I am puzzled about why you are running a gzip stream while debugging.
I would just take off the gzip layer for debugging so you can see
what is going down the pipe.

See http://mindprod.com/applets/fileio.html
for how to do buffered/unbuffered/compressed/uncompressed I/O.
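
Something along these lines, assuming plain java.util.zip (the file
name is invented, and the three-argument syncFlush constructor only
exists on JDK 7 and later, where it makes flush() actually push
pending compressed data out):

import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

public class FlushingGzipLog {
    public static void main(String[] args) throws IOException {
        // Small internal buffer (512 bytes) plus syncFlush=true (JDK 7+):
        // flush() then forces the deflater to emit what it has so far.
        GZIPOutputStream out = new GZIPOutputStream(
                new FileOutputStream("app.log.gz"), 512, true);

        out.write("first log entry\n".getBytes("UTF-8"));
        out.flush(); // compressed bytes reach the file now, not much later

        out.close();
    }
}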
 
Hemal Pandya

Is there a compression stream we can use that is
optimized for regular text and doesn't need to buffer so much?

Consider rotating logs and compressing previously rotated logs.
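
Something like this, for instance (the helper name is hypothetical):
write the active log uncompressed, and gzip each file only after it
has been rotated out:

import java.io.*;
import java.util.zip.GZIPOutputStream;

public class RotateAndCompress {

    // Hypothetical helper: gzip an already-rotated log file,
    // then delete the uncompressed original.
    static void compressRotated(File rotated) throws IOException {
        File gz = new File(rotated.getPath() + ".gz");
        InputStream in = new FileInputStream(rotated);
        OutputStream out = new GZIPOutputStream(new FileOutputStream(gz));
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        out.close(); // finishes the gzip trailer
        in.close();
        rotated.delete();
    }
}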
 
Roedy Green

Consider rotating logs and compressing previously rotated logs.

That way, a flush after each log entry would guarantee you would not
lose data in a crash, though it would slow you down.
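
A minimal sketch of that idea, assuming the active log is written
uncompressed (the class name and path handling are mine):

import java.io.*;

public class CrashSafeLogger {

    private final Writer out;

    public CrashSafeLogger(String path) throws IOException {
        out = new FileWriter(path, true); // append to the active, uncompressed log
    }

    public void log(String line) throws IOException {
        out.write(line);
        out.write('\n');
        out.flush(); // each entry reaches the OS before we move on
    }
}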
 
xyzzy12

Thank you for the help. I'll try a smaller buffer size and
experiment with these levels and strategies:

BEST_SPEED
Compression level for fastest compression.

FILTERED
Compression strategy best used for data consisting mostly of
small values with a somewhat random distribution.

I may also try:

HUFFMAN_ONLY
Compression strategy for Huffman coding only.

I noticed that there isn't a way to set those on a gzip output stream,
but through the protected variable def, I can set those values. Wish me
luck! :)
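
For the record, a sketch of that trick, assuming you subclass so the
protected def field (inherited from DeflaterOutputStream) is
reachable; the class name is made up:

import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.Deflater;
import java.util.zip.GZIPOutputStream;

public class TunableGZIPOutputStream extends GZIPOutputStream {

    public TunableGZIPOutputStream(OutputStream out) throws IOException {
        super(out);
        // def is the protected Deflater declared in DeflaterOutputStream.
        def.setLevel(Deflater.BEST_SPEED);
        def.setStrategy(Deflater.FILTERED); // or Deflater.HUFFMAN_ONLY
    }
}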
 
xyzzy12

Found the problem! I need Z_SYNC_FLUSH, and only a true implementation
of zlib will work:
http://www.jcraft.com/jzlib/

Why JZlib?
The Java platform API provides the java.util.zip package for accessing
zlib, but that support is very limited if you need the full power of
zlib. For example, we needed full access to zlib to add packet
compression support to a pure-Java SSH system, and java.util.zip was
useless for our requirements.

There are even two Java bug reports about the lack of a proper
flushing capability!
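
Something like this, assuming JZlib 1.0.x's ZOutputStream API (method
names may differ in later releases; the file name is made up):

import java.io.FileOutputStream;
import java.io.IOException;
import com.jcraft.jzlib.JZlib;
import com.jcraft.jzlib.ZOutputStream;

public class SyncFlushLog {
    public static void main(String[] args) throws IOException {
        ZOutputStream out = new ZOutputStream(
                new FileOutputStream("app.log.z"), JZlib.Z_BEST_SPEED);
        out.setFlushMode(JZlib.Z_SYNC_FLUSH);

        out.write("log entry\n".getBytes("UTF-8"));
        out.flush(); // Z_SYNC_FLUSH pushes all pending compressed bytes out

        out.close();
    }
}

(For what it's worth, JDK 7 later added a syncFlush flag to
GZIPOutputStream's constructor that gives the same behavior without a
third-party library.)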
 
