creating ZIP files on the cheap

samwyse

I've got an app that's creating Open Office docs; if you don't know,
these are actually ZIP files with a different extension. In my case,
like many other people, I'm generating from boilerplate, so only one
component (content.xml) of my ZIP file will ever change. Instead of
creating the entire ZIP file each time, what is the cheapest way to
accomplish my goal? I'd kind of like to just write the first part of
the file as a binary blob, then write my bit, then write most of the
table of contents as another blob, and finally write a TOC entry for
my bit. Has anyone ever done anything like this? Thanks.
 
uticdmarceau2007

I think you simply intend to use the command-line Info-ZIP tool, run
from Python via subprocess:

    import subprocess
    import sys

    # Info-ZIP expects the archive name first, then the files to add or
    # update; a name already in the archive replaces the existing entry.
    cmd = ["zip", "foo.zip", "content.xml", "your.blob"]
    try:
        retcode = subprocess.call(cmd)
        if retcode < 0:
            print("Child was terminated by signal", -retcode, file=sys.stderr)
        else:
            print("Child returned", retcode, file=sys.stderr)
    except OSError as e:
        print("Execution failed:", e, file=sys.stderr)
 
Lie Ryan

You might want to look at solid and non-solid compression. Solid
compression writes all files into the archive as one huge block that
must be compressed/decompressed as a whole, while non-solid compression
writes each file as a separate chunk that can be decompressed individually.

I don't know if there's any way to compress/decompress/recompress as
non-solid compression from python though. Maybe others can point
something out, or maybe you can use an external zipper that understands
non-solid compression.
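
For what it's worth, ZIP is already a non-solid format: each member is compressed on its own, and Python's standard zipfile module can read one member without touching the rest. A minimal sketch, assuming an in-memory archive whose member names and contents are placeholders for illustration rather than a valid document:

```python
import io
import zipfile

# Build a small archive in memory. The member names and contents are
# placeholders for illustration, not a complete OpenOffice document.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("mimetype", "application/vnd.oasis.opendocument.text")
    zf.writestr("styles.xml", "<styles/>" * 100)
    zf.writestr("content.xml", "<content>hello</content>")

with zipfile.ZipFile(buf) as zf:
    # Every member has its own local header offset and compressed size,
    # i.e. it is stored and compressed independently of the others.
    layout = [(i.filename, i.header_offset, i.compress_size)
              for i in zf.infolist()]
    # Decompress only content.xml.
    content = zf.read("content.xml")

for name, offset, size in layout:
    print(name, offset, size)
print(content.decode("utf-8"))
```

The per-member offsets are what make the "blob, my bit, table of contents" idea from the original question possible in principle, although getting the table of contents right by hand is another matter.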
 
John Machin

Option 1: set up a file that contains everything except the
content.xml. Then for each new file: copy the "empty" file, open the
copy with zipfile (mode 'a') and write your content.xml. This at least
is understandable and maintainable.

Option 2 (recommended): insert some timing apparatus into your script.
How much time is taken by the template stuff? Is it worth chancing
your arm on getting the "binary blob" stuff correct? Is it
maintainable? I.e. pretend that the next person to maintain your code
knows where you live and owns a chainsaw.
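
A rough sketch of Option 1 with the timing from Option 2 wrapped around it. The helper name make_doc, the file names, and the stand-in template contents are all hypothetical; a real template would hold the actual boilerplate members of the document.

```python
import os
import shutil
import tempfile
import time
import zipfile


def make_doc(template_path, out_path, content_xml):
    """Copy the template archive, then append content.xml to the copy."""
    shutil.copyfile(template_path, out_path)
    with zipfile.ZipFile(out_path, "a", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("content.xml", content_xml)


# Demo setup: a stand-in template holding everything except content.xml.
workdir = tempfile.mkdtemp()
template = os.path.join(workdir, "template.zip")
with zipfile.ZipFile(template, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("styles.xml", "<styles/>" * 1000)
    zf.writestr("meta.xml", "<meta/>")

# Time a batch of documents before deciding whether anything cleverer
# than copy-and-append is worth the trouble.
runs = 50
start = time.perf_counter()
for n in range(runs):
    out = os.path.join(workdir, "doc%d.odt" % n)
    make_doc(template, out, "<content>document %d</content>" % n)
elapsed = time.perf_counter() - start
print("%.3f ms per document" % (1000 * elapsed / runs))

# Check the last document: template members first, content.xml appended.
with zipfile.ZipFile(out) as zf:
    names = zf.namelist()
    last_content = zf.read("content.xml")
print(names)
```

If the per-document time printed here is already small next to the rest of the job, the binary-blob approach is probably not worth the maintenance risk.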
 
Emile van Sebille

On 12/23/2009 3:47 PM John Machin said...
Is it
maintainable? I.e. pretend that the next person to maintain your code
knows where you live and owns a chainsaw.

Oooh... that's much better than the finger guillotine and annual
holiday-party finger count I normally threaten with...

Emile
 
