zipped socket

John

Is there any way to open a socket so that every send/listen/recv
goes through a zipping/unzipping process automatically?

Thanks,
--j
 
jepler

As far as I know, there is no prefabbed solution for this problem. One
issue you must solve is buffering (when must data you've written to the
compressor actually go out to the other side?), and another is what to do
when a read() or recv() consumes compressed bytes but doesn't yet produce
any uncompressed bytes; this is a problem because normally a read() that
returns '' indicates end-of-file.

If you only work with whole files at a time, then one easy thing to do is
use the 'zlib' encoding, e.g. 'abc'.encode('zlib'), but because zlib isn't
self-delimiting, this won't work if you want to write() multiple times, or
if you want to read() less than the full file.
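
For the whole-file case, a minimal sketch might look like the following
(it assumes the sender simply closes the connection to mark end-of-file;
the function names are made up):

    import socket
    import zlib

    def send_whole_file(host, port, payload):
        # Compress the complete payload and send it, then close the
        # connection so the receiver knows it has everything.
        with socket.create_connection((host, port)) as sock:
            sock.sendall(zlib.compress(payload))

    def recv_whole_file(conn):
        # Read until the peer closes, then decompress the whole blob.
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
        return zlib.decompress(b"".join(chunks))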

Jeff

 
Peter Hansen

John said:
> Is there any way to open a socket so that every send/listen/recv
> goes through a zipping/unzipping process automatically?

You ought to be able to do this easily by wrapping a bz2 compressor
around the socket (maybe using socket.makefile() to return a file object
first) and probably using a generator as well:

http://effbot.org/librarybook/bz2.htm includes relevant examples (not
specifically with sockets though).

Googling for "python incremental compression" ought to turn up any other
alternatives.
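
If incremental bz2 turns out to be awkward (its compressor can't be
flushed mid-stream), one workable sketch is to compress each message
separately and length-prefix it, so the reader knows exactly how many
compressed bytes to pull off the socket. The 4-byte framing below is my
own assumption, not something from the effbot examples:

    import bz2
    import struct

    def send_message(sock, payload):
        # Compress this message on its own and prefix it with a
        # 4-byte big-endian length so the receiver can frame it.
        blob = bz2.compress(payload)
        sock.sendall(struct.pack("!I", len(blob)) + blob)

    def _recv_exact(sock, n):
        # Keep calling recv() until exactly n bytes have arrived.
        data = b""
        while len(data) < n:
            chunk = sock.recv(n - len(data))
            if not chunk:
                raise ConnectionError("socket closed mid-message")
            data += chunk
        return data

    def recv_message(sock):
        (length,) = struct.unpack("!I", _recv_exact(sock, 4))
        return bz2.decompress(_recv_exact(sock, length))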

-Peter
 
Bryan Olson

> As far as I know, there is no prefabbed solution for this problem. One
> issue you must solve is buffering (when must data you've written to the
> compressor actually go out to the other side?), and another is what to do
> when a read() or recv() consumes compressed bytes but doesn't yet produce
> any uncompressed bytes; this is a problem because normally a read() that
> returns '' indicates end-of-file.
>
> If you only work with whole files at a time, then one easy thing to do is
> use the 'zlib' encoding, e.g. 'abc'.encode('zlib'), but because zlib isn't
> self-delimiting, this won't work if you want to write() multiple times, or
> if you want to read() less than the full file.

That's basically a solved problem; zlib does have a kind of
self-delimiting. The key is the 'flush' method of the
compression object:

some_send_function(compressor.compress(data) + compressor.flush(zlib.Z_SYNC_FLUSH))

The Python module doc is unclear/wrong on this, but zlib.h
explains:

If the parameter flush is set to Z_SYNC_FLUSH, all pending
output is flushed to the output buffer and the output is
aligned on a byte boundary, so that the decompressor can get
all input data available so far.


There's also Z_FULL_FLUSH, which also resets the compression
dictionary. For a stream socket, we'd usually want to keep the
dictionary, since that's what gives us the compression. The
Python doc states:

Z_SYNC_FLUSH and Z_FULL_FLUSH allow compressing further
strings of data and are used to allow partial error recovery
on decompression

That's not correct. Z_FULL_FLUSH allows recovery after errors,
but Z_SYNC_FLUSH is just to allow pushing all the compressor's
input to the decompressor's output.
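
A sketch of how that looks in practice, with one long-lived
compressor/decompressor per direction so the dictionary survives across
sends (the class and method names here are mine, not a standard API):

    import zlib

    class ZippedSocket:
        # Wraps a connected socket; every send is sync-flushed so the
        # peer can decompress everything sent so far without waiting
        # for the stream to be closed.
        def __init__(self, sock):
            self.sock = sock
            self.comp = zlib.compressobj()
            self.decomp = zlib.decompressobj()

        def send(self, payload):
            data = self.comp.compress(payload) + self.comp.flush(zlib.Z_SYNC_FLUSH)
            self.sock.sendall(data)

        def recv(self, bufsize=4096):
            chunk = self.sock.recv(bufsize)
            if not chunk:
                return b""  # peer closed the connection
            # Note: this can also return b"" while the connection is
            # still open, if the compressed bytes received so far don't
            # complete any output (which is Jeff's earlier point).
            return self.decomp.decompress(chunk)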
 
