How to receive a data file of unknown length using a Python socket?

twgray

I am attempting to send a jpeg image file created on an embedded
device over a wifi socket to a Python client running on a Linux pc
(Ubuntu). All works well, except that on the pc client side I don't know
what the file size is. The following is a snippet:

Code:
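        recvd = 0  # byte counter, initialized here so the snippet stands alone
        f = open("frame.jpg", mode='wb')
        while True:
            data = self.s.recv(MAXPACKETLEN)
            if len(data) == 0:
                # recv() returns nothing only after the sender closes the connection
                break
            recvd += len(data)
            f.write(data)
        f.close()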
[end]

It appears to be locking up in 'data=self.s.recv(MAXPACKETLEN)' on
the final packet, which will always be less than MAXPACKETLEN.

I guess my question is, how do I detect end of data on the client side?
 
Irmen de Jong

twgray said:
I am attempting to send a jpeg image file created on an embedded
device over a wifi socket to a Python client running on a Linux pc
(Ubuntu). All works well, except that on the pc client side I don't know
what the file size is.

You don't. Sockets are just endless streams of bytes. You will have to design some form
of 'wire protocol' that includes the length of the message that is to be read.
For instance a minimalistic protocol could be the following:
Send 4 bytes that contain the length (an int) then the data itself. The client reads 4
bytes, decodes it into the integer that tells it the length, and then reads the correct
amount of bytes from the socket.
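
A minimal sketch of the length-prefix protocol described above, assuming Python 3 and a 4-byte unsigned length in network (big-endian) byte order. The helper names recv_exact, send_message and recv_message are made up for illustration. The loop in recv_exact matters because a single recv() call may return fewer bytes than requested.

```python
import socket
import struct


def recv_exact(sock, n):
    """Read exactly n bytes from sock, or raise if the peer closes early."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError(
                "socket closed with %d bytes still expected" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_message(sock, payload):
    # 4-byte unsigned length (big-endian), followed by the data itself
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_message(sock):
    # Read the 4-byte header first, then exactly that many payload bytes
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)
```

With this framing the connection can stay open, and several files can be sent back to back over the same socket.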


--irmen
 
Tycho Andersen

You don't. Sockets are just endless streams of bytes. You will have to
design some form of 'wire protocol' that includes the length of the message
that is to be read.
For instance a minimalistic protocol could be the following:
Send 4 bytes that contain the length (an int) then the data itself. The
client reads 4 bytes, decodes it into the integer that tells it the length,
and then reads the correct amount of bytes from the socket.

Exactly, sending the length first is the only way to know ahead of
time. Alternatively, if you know what the end of the data looks like,
you can look for that 'flag' as well, and stop trying to recv() after
that.

Some things that may be useful, though, are socket.settimeout() and
socket.setblocking(). More information is available in the docs:
http://docs.python.org/library/socket.html.

You need to be careful with these, since network latency can make a
short timeout fire before all of the data has arrived. Used with care,
they will keep your program from sitting in recv() forever.
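
As a rough illustration of the settimeout() approach, assuming Python 3: the wrapper below (recv_with_timeout is a made-up name) distinguishes three outcomes, namely data arrived, the peer closed the connection, or nothing arrived within the timeout.

```python
import socket


def recv_with_timeout(sock, max_len, seconds):
    """Return received bytes, b"" if the peer closed, or None on timeout."""
    sock.settimeout(seconds)
    try:
        return sock.recv(max_len)
    except socket.timeout:
        # No data arrived in time; this does NOT mean the transfer is done
        return None
```

A timeout only says that the sender has been quiet for a while, so on its own it is not a reliable end-of-file signal.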

\t
 
twgray

You don't. Sockets are just endless streams of bytes. You will have to design some form
of 'wire protocol' that includes the length of the message that is to be read.
For instance a minimalistic protocol could be the following:
Send 4 bytes that contain the length (an int) then the data itself. The client reads 4
bytes, decodes it into the integer that tells it the length, and then reads the correct
amount of bytes from the socket.

--irmen

Thanks for the reply. But, now I have a newbie Python question. If I
send a 4 byte address from the embedded device, how do I convert that,
in Python, to a 4 byte, or long, number?
 
MRAB

twgray said:
Thanks for the reply. But, now I have a newbie Python question. If I
send a 4 byte address from the embedded device, how do I convert that,
in Python, to a 4 byte, or long, number?

If you send the length as 4 bytes then you'll have to decide whether
it's big-endian or little-endian. An alternative is to send the length
as characters, terminated by, say, '\n' or chr(0).
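
To illustrate both options, here is a small sketch assuming Python 3. The struct module converts the 4 raw bytes once you have settled on a byte order; parse_text_length is a hypothetical helper for the text-based alternative with a '\n' terminator.

```python
import struct

raw = b"\x00\x00\x12\x34"   # four bytes as they arrived from the socket

big = struct.unpack(">I", raw)[0]     # read as big-endian: 0x00001234
little = struct.unpack("<I", raw)[0]  # read as little-endian: 0x34120000


def parse_text_length(buf):
    """Split a buffer like b'4660' + newline + rest into (4660, rest)."""
    digits, sep, rest = buf.partition(b"\n")
    if not sep:
        raise ValueError("length line not complete yet")
    return int(digits), rest
```

The same four bytes give very different numbers depending on the byte order, so both ends have to agree on it.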
 
Nobody

It appears to be locking up in 'data=self.s.recv(MAXPACKETLEN)' on
the final packet, which will always be less than MAXPACKETLEN.

I guess my question is, how do I detect end of data on the client side?

recv() should return an empty string (zero bytes) when the sender closes its end of the connection.

Is the sender actually closing its end? If you are unsure, use a packet
sniffer such as tcpdump to look for a packet with the FIN flag.

If you need to keep the connection open for further transfers, you need to
incorporate some mechanism for identifying the end of the data into the
protocol. As others have suggested, prefixing the data by its length is
one option. Another is to use an end-of-data marker, but then you need a
mechanism to "escape" the marker if it occurs in the data. A length prefix
is probably simpler to implement, but has the disadvantage that you can't
start sending the data until you know how long it is going to be.
 
John Machin

You don't. Sockets are just endless streams of bytes. You will have to design some form
of 'wire protocol' that includes the length of the message that is to be read.

Apologies in advance for my ignorance -- the last time I dipped my toe
in that kind of water, protocols like zmodem and Kermit were all the
rage -- but I would have thought there would have been an off-the-
shelf library for peer-to-peer file transfer over a socket
interface ... not so?
 
MRAB

Nobody said:
recv() should return an empty string (zero bytes) when the sender closes its end of the connection.

Is the sender actually closing its end? If you are unsure, use a packet
sniffer such as tcpdump to look for a packet with the FIN flag.

If you need to keep the connection open for further transfers, you need to
incorporate some mechanism for identifying the end of the data into the
protocol. As others have suggested, prefixing the data by its length is
one option. Another is to use an end-of-data marker, but then you need a
mechanism to "escape" the marker if it occurs in the data. A length prefix
is probably simpler to implement, but has the disadvantage that you can't
start sending the data until you know how long it is going to be.
You could send it in chunks, ending with a chunk length of zero.
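
One way the chunked idea might look, assuming Python 3 and a 4-byte big-endian length in front of every chunk. The names send_chunked and recv_chunked are made up for this sketch. The sender does not need to know the total size in advance; a chunk length of zero marks the end.

```python
import socket
import struct


def _recv_exact(sock, n):
    """Read exactly n bytes, looping because recv() may return less."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed in mid-chunk")
        buf += chunk
    return buf


def send_chunked(sock, pieces):
    for piece in pieces:
        if piece:  # skip empty pieces, they would look like the terminator
            sock.sendall(struct.pack("!I", len(piece)) + piece)
    sock.sendall(struct.pack("!I", 0))  # zero-length chunk ends the data


def recv_chunked(sock):
    parts = []
    while True:
        (n,) = struct.unpack("!I", _recv_exact(sock, 4))
        if n == 0:
            return b"".join(parts)
        parts.append(_recv_exact(sock, n))
```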
 
Aahz

If you send the length as 4 bytes then you'll have to decide whether
it's big-endian or little-endian. An alternative is to send the length
as characters, terminated by, say, '\n' or chr(0).

Alternatively, make it a fixed-length string of bytes, zero-padded in
front.
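
A sketch of the fixed-width variant, assuming Python 3 and an arbitrarily chosen field width of 8 ASCII digits (WIDTH, encode_length and decode_length are illustrative names). The receiver always reads exactly WIDTH bytes for the header, and byte order never comes into it.

```python
WIDTH = 8  # number of ASCII digits in the length field (an assumption)


def encode_length(n):
    """Return n as a zero-padded ASCII decimal field of WIDTH bytes."""
    field = str(n).zfill(WIDTH).encode("ascii")
    if n < 0 or len(field) > WIDTH:
        raise ValueError("length does not fit in %d digits" % WIDTH)
    return field


def decode_length(field):
    """Parse a WIDTH-byte field produced by encode_length()."""
    if len(field) != WIDTH or not field.isdigit():
        raise ValueError("malformed length field: %r" % (field,))
    return int(field)
```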
 
Piet van Oostrum

JM> Apologies in advance for my ignorance -- the last time I dipped my toe
JM> in that kind of water, protocols like zmodem and Kermit were all the
JM> rage -- but I would have thought there would have been an off-the-
JM> shelf library for peer-to-peer file transfer over a socket
JM> interface ... not so?

Yes, many of them, for example HTTP or FTP. But I suppose they are
overkill in this situation. There are also remote procedure call
protocols which can do much more, like XMLRPC.

By the way, if the image file is the only thing you send, the sender
should close the socket after sending. The receiver will then see end
of file, which is detected by your `if len(data) == 0:'
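
For illustration, both sides of the close-to-signal-end approach might look like this in Python 3 (send_all_and_close and recv_until_eof are made-up names, and the real sender here is an embedded device rather than Python). The receive loop is the same as the one in the original question.

```python
import socket


def send_all_and_close(sock, data):
    sock.sendall(data)
    sock.shutdown(socket.SHUT_WR)  # tell the peer no more data is coming
    sock.close()


def recv_until_eof(sock, bufsize=4096):
    parts = []
    while True:
        data = sock.recv(bufsize)
        if len(data) == 0:  # peer closed its end: end of file
            break
        parts.append(data)
    return b"".join(parts)
```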
 
Hendrik van Rooyen

Apologies in advance for my ignorance -- the last time I dipped my toe
in that kind of water, protocols like zmodem and Kermit were all the
rage -- but I would have thought there would have been an off-the-
shelf library for peer-to-peer file transfer over a socket
interface ... not so?

*Grins at the references to Kermit and zmodem,
and remembers Laplink and PC Anywhere*

If there is such a transfer beast in Python, I have
not found it.
(There is an FTP module but that is not quite
the same thing)

I think it is because the network stuff is
all done in the OS or NFS and SAMBA
now - with drag and drop support and
other nice goodies.

I have ended up writing a netstring thingy,
that addresses the string transfer problem
by having a start sentinel, a four byte ASCII
length (so you can see it with a packet
sniffer/displayer) and the rest of the
data escaped to take out the start
sentinel and the escape character.

It works, but the four byte ASCII limits the size
of what can be sent and received.

It guarantees to deliver either the whole
string, or fail, or timeout.

If anybody is interested I will attach the
code here. It is not a big module.

This question seems to come up periodically
in different guises.

To the OP:

There are really very few valid ways of
solving the string transfer problem,
given a featureless stream of bytes
like a socket.

The first thing that must be addressed
is to sync up - you have to somehow
find the start of the thing as it comes
past.

And the second is to find the end of the
slug of data that you are transferring.

So the simplest way is to designate a byte
as a start and end sentinel, and to make
sure that such a byte does not occur in
the data stream, other than as a start
and end marker. This process is called
escaping, and the reverse is called
unescaping. (SDLC/HDLC does this at a bit
pattern level)
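
A byte-level sketch of escaping and unescaping, assuming Python 3. This is not the code from my module; the sentinel 0x7E, the escape byte 0x7D and the XOR with 0x20 are borrowed from PPP-style framing purely as an example, and any two reserved byte values would do.

```python
SENTINEL = 0x7E  # marks start and end of a frame (example value)
ESCAPE = 0x7D    # announces that the next byte was altered (example value)


def escape(payload):
    out = bytearray()
    for byte in payload:
        if byte in (SENTINEL, ESCAPE):
            out.append(ESCAPE)
            out.append(byte ^ 0x20)  # altered so it is no longer reserved
        else:
            out.append(byte)
    return bytes(out)


def unescape(stuffed):
    out = bytearray()
    it = iter(stuffed)
    for byte in it:
        if byte == ESCAPE:
            nxt = next(it, None)
            if nxt is None:
                raise ValueError("escape byte at end of data")
            byte = nxt ^ 0x20
        out.append(byte)
    return bytes(out)


def frame(payload):
    # After escaping, the sentinel can only appear as a frame boundary
    return bytes([SENTINEL]) + escape(payload) + bytes([SENTINEL])
```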

Another way is to use time, namely to
rely on there being some minimum
time between slugs of data. This
does not work well on TCP/IP sockets,
as retries at the lower protocol levels
can give you false breaks in the stream.
It works well on direct connections like
RS-232 or RS-485/422 lines.

Classic netstrings send length, then data.
They rely on the lower level protocols and
the length sent for demarcation of
the slug, and work well if you connect,
send a slug or two, and disconnect. They
are not so hot for long running processes,
where processors can drop out while
sending - there is no reliable way for a
stable receiver to sync up again if it is
waiting for a slug that will not finish.
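
For reference, a sketch of the classic netstring format in Python 3: the length as ASCII decimal digits, a colon, the data, and a trailing comma. The function names are made up, and this is the plain format without the sync character or timeout.

```python
def encode_netstring(payload):
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def decode_netstring(buf):
    """Return (payload, rest), or None if buf holds no complete netstring yet."""
    head, sep, tail = buf.partition(b":")
    if not sep:
        return None  # the length has not fully arrived
    if not head.isdigit():
        raise ValueError("bad netstring length: %r" % (head,))
    n = int(head)
    if len(tail) < n + 1:
        return None  # payload or trailing comma still missing
    if tail[n:n + 1] != b",":
        raise ValueError("missing trailing comma")
    return tail[:n], tail[n + 1:]
```

The receiver appends whatever recv() returns to a buffer and calls decode_netstring() until it stops returning None.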

Adapting the netstring by adding a sync
character and a timeout is a compromise
that I have found works well in practice.

- Hendrik
 
python

Hi Hendrik,
I have ended up writing a netstring thingy, that addresses the string transfer problem by having a start sentinel, a four byte ASCII length (so you can see it with a packet sniffer/displayer) and the rest of the data escaped to take out the start sentinel and the escape character. It works, but the four byte ASCII limits the size of what can be sent and received. It guarantees to deliver either the whole string, or fail, or timeout.
If anybody is interested I will attach the code here. It is not a big module.

I am interested in seeing your code and would be grateful if you shared
it with this list.

Thank you,
Malcolm
 
Hendrik van Rooyen

Hi Hendrik,

I am interested in seeing your code and would be grateful if you shared
it with this list.

All right here it is.

Hope it helps

- Hendrik
 
python

I am interested in seeing your code and would be grateful if you shared it with this list.
All right here it is. Hope it helps.

Hendrik,

Thank you very much!! (I'm not the OP, but found this thread
interesting)

Best regards,
Malcolm
 
