Unpacking byte strings from a file of unknown size

Mark

Hi;

I'm trying to use the struct.unpack to extract an int, int, char
struct info from a file. I'm more accustomed to the file.readlines
which works well in a 'for' construct (ending loop after reaching
EOF).

# This does OK at fetching one 10-byte string at a time:
# (4, 4, 2 ascii chars representing hex)
info1, info2, info3 = struct.unpack('<IIH', myfile.read(10))

# Now to do the entire file, putting into a loop just gives error:
# TypeError: 'int' object is not iterable
for info1, info2, info3 in struct.unpack('<IIH', myfile.read(10)):

In trying to shoehorn this into a 'for' loop I've been unsuccessful.
I also tried other variations that also didn't work but no point
wasting space. Using Python 2.5, WinXP

Thx,
Mark
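
The TypeError in the second snippet can be reproduced without a file. The sketch below assumes a record packed in memory with made-up values (1, 2, 3): struct.unpack returns a single tuple, the for statement iterates over that tuple's items, and the first item is an int that cannot be split into three names.

```python
import struct

# One packed 10-byte record with illustrative values.
record = struct.unpack('<IIH', struct.pack('<IIH', 1, 2, 3))  # -> (1, 2, 3)

# The for statement walks the tuple itself, so the first item is the
# int 1, and "info1, info2, info3 = 1" cannot be unpacked.
try:
    for info1, info2, info3 in record:
        pass
    error = None
except TypeError as exc:
    error = exc
```

The exact message differs between Python versions, but the exception type is TypeError in both Python 2 and Python 3.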
 
Steven Clark

I usually do something like:

    s = myfile.read(10)
    while len(s) == 10:
        info1, info2, info3 = struct.unpack('<IIH', s)
        s = myfile.read(10)
    # might want to check that len(s) == 0 here
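
A self-contained sketch of the same loop, under a few assumptions that are not in the post: an in-memory io.BytesIO stands in for myfile, the record size comes from a precompiled struct.Struct instead of the literal 10, and the helper name read_records and the sample values are made up for illustration. It also performs the trailing-bytes check suggested in the comment.

```python
import io
import struct

RECORD = struct.Struct('<IIH')   # 4 + 4 + 2 = 10 bytes; '<' means no padding

def read_records(f):
    """Return a list of (info1, info2, info3) tuples read from binary file f."""
    records = []
    s = f.read(RECORD.size)
    while len(s) == RECORD.size:
        records.append(RECORD.unpack(s))
        s = f.read(RECORD.size)
    if s:
        # Leftover bytes: the data length is not a multiple of the record size.
        raise ValueError('truncated record: %d trailing bytes' % len(s))
    return records

# Hypothetical sample data: two packed records.
sample = io.BytesIO(RECORD.pack(1, 2, 3) + RECORD.pack(4, 5, 6))
records = read_records(sample)
```

With a real file, the same function would be called on a file object opened in 'rb' mode.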
 
Gabriel Genellina

On Mon, 27 Oct 2008 19:03:37 -0200, Steven Clark wrote:

I usually do something like:

    s = myfile.read(10)
    while len(s) == 10:
        info1, info2, info3 = struct.unpack('<IIH', s)
        s = myfile.read(10)
    # might want to check that len(s) == 0 here

Pretty clear. Another alternative, using a for statement as the OP
requested (and separating the "read" logic from the "process" part):

    def chunked(f, size):
        while True:
            block = f.read(size)
            if not block:
                break
            yield block

    for block in chunked(open(filename, 'rb'), 10):
        info1, info2, info3 = struct.unpack('<IIH', block)
        ...

A third one:

    from functools import partial

    for block in iter(partial(open(filename, 'rb').read, 10), ''):
        ...

(rather unreadable, I admit, if one isn't familiar with partial functions
and the 2-argument iter variant)
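
A runnable sketch of the two-argument iter form, written for Python 3 as an assumption (the thread itself targets Python 2.5). There, a file opened in 'rb' mode returns bytes, so the sentinel has to be b''; with '' the comparison never succeeds and the loop would not end. The wrapper name iter_blocks and the sample values are hypothetical.

```python
import io
import struct
from functools import partial

def iter_blocks(f, size):
    """Iterate over successive reads of up to size bytes until read() returns b''."""
    # iter(callable, sentinel) calls callable() repeatedly and stops
    # as soon as the result equals the sentinel.
    return iter(partial(f.read, size), b'')

# In-memory stand-in for a binary file holding two records.
sample = io.BytesIO(struct.pack('<IIH', 10, 20, 30) +
                    struct.pack('<IIH', 40, 50, 60))
unpacked = [struct.unpack('<IIH', block) for block in iter_blocks(sample, 10)]
```

A short final block is still yielded by this form, so struct.unpack would raise struct.error on a file whose length is not a multiple of 10.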
 
Terry Reedy

Mark said:
Hi;

I'm trying to use the struct.unpack to extract an int, int, char
struct info from a file. I'm more accustomed to the file.readlines
which works well in a 'for' construct (ending loop after reaching
EOF).

You do not need .readlines to iterate through a file by lines.

    for line in f.readlines(): pass

is awkward if you have 100 million 100-byte lines, whereas

    for line in f: pass

will read one line at a time and process it before reading the next.

Mark also said:
In trying to shoehorn this into a 'for' loop I've been unsuccessful.

A for loop requires an iterable. File objects come with only one
iterator, which yields lines. So either use a while loop or define a
reusable file-block generator (untested):

    def blocks(openfile, n):
        while True:
            block = openfile.read(n)
            if len(block) == n:
                yield block
            else:
                return  # end of file, or a short final block

Terry Jan Reedy
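
A tested sketch of this generator, with two assumptions: it finishes with return, because raising StopIteration inside a generator body is an error from Python 3.7 on, and it reads from an io.BytesIO holding made-up data. The three stray bytes at the end show that a partial final block is silently dropped by this variant.

```python
import io
import struct

def blocks(openfile, n):
    """Yield successive n-byte blocks; stop at EOF or at a short final block."""
    while True:
        block = openfile.read(n)
        if len(block) != n:
            return  # ends the generator; a partial trailing block is dropped
        yield block

# Hypothetical data: two full records followed by 3 stray bytes.
data = struct.pack('<IIH', 1, 2, 3) + struct.pack('<IIH', 4, 5, 6) + b'xyz'
result = [struct.unpack('<IIH', b) for b in blocks(io.BytesIO(data), 10)]
```

Whether dropping the partial block is acceptable depends on the file format; raising an error there instead is an equally reasonable choice.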
 
Mark

Thanks I tested your solution and that works.

One of the things that didn't work was
    for chunk in myfile.read(10):
        info1, info2, info3 = struct.unpack('<IIH', chunk)

It gets an error saying unpack requires a string of length 10, which I
thought chunk would be after the read(10). I'm still a little
confused as to why.

But thanks very much Steven, for a working solution.

Mark
 
Mark

this code python interprets as:

    data = myfile.read(10)
    for chunk in data:
        <and here chunk is a single-character string>

Aha - now that you put it that way it makes sense. And thanks to all
who replied - I'll try out the other suggestions too.

Mark
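
The explanation above can be checked with a short sketch, assuming an in-memory io.BytesIO in place of myfile and made-up record values. Iterating over the result of read(10) walks a single 10-byte object one element at a time, and each element is far too short for the '<IIH' format.

```python
import io
import struct

f = io.BytesIO(struct.pack('<IIH', 1, 2, 3))
data = f.read(10)        # a single 10-byte object, not a sequence of records
elements = list(data)    # iterating walks it one byte at a time
# Python 2: ten 1-character strings; Python 3: ten ints in range(256).

# A 1-byte slice is far too short for '<IIH', hence the length error.
try:
    struct.unpack('<IIH', data[:1])
    failed = False
except struct.error:
    failed = True
```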
 
