Receive data from socket stream

S

s0suk3

I wanted to ask for standard ways to receive data from a socket stream
(with socket.socket.recv()). It's simple when you know the amount of
data that you're going to receive, or when you'll receive data until
the remote peer closes the connection. But I'm not sure which is the
best way to receive a message with undetermined length from a stream
in a connection that you expect to remain open. Until now, I've been
doing this little trick:

data = client.recv(256)
new = data
while len(new) == 256:
new = client.recv(256)
data += new

That works well in most cases. But it's obviously error-prone. What if
the client sent *exactly* two hundred and fifty six bytes? It would
keep waiting for data inside the loop. Is there really a better and
standard way, or is this as best as it gets?

Sorry if this is a little off-topic and more related to networking,
but I'm using Python anyway.

Thanks,
Sebastian
 
S

s0suk3

You solve this by having a protocol that the client and server both
agree on, so that the client knows how much to read from the server.
There are any number of ways of doing this, all of which depend on the
kind of data you want to transfer and for what purpose.

--
Erik Max Francis && (e-mail address removed) &&http://www.alcyone.com/max/
  San Jose, CA, USA && 37 18 N 121 57 W && AIM, Y!M erikmaxfrancis
   In the final choice a solider's pack is not so heavy a burden as a
    prisoner's chains. -- Dwight D. Eisenhower, 1890-1969

So, in an HTTP client/server, I'd had to look in a Content-Length
header?
 
H

hdante

I wanted to ask for standard ways to receive data from a socket stream
(with socket.socket.recv()). It's simple when you know the amount of
data that you're going to receive, or when you'll receive data until
the remote peer closes the connection. But I'm not sure which is the
best way to receive a message with undetermined length from a stream
in a connection that you expect to remain open. Until now, I've been
doing this little trick:

data = client.recv(256)
new = data
while len(new) == 256:
    new = client.recv(256)
    data += new

That works well in most cases. But it's obviously error-prone. What if
the client sent *exactly* two hundred and fifty six bytes? It would
keep waiting for data inside the loop. Is there really a better and
standard way, or is this as best as it gets?

Sorry if this is a little off-topic and more related to networking,
but I'm using Python anyway.

Thanks,
Sebastian


done = False
remaining = ''
while done == False:
data = client.recv(256)
done, remaining = process(remaining + data)

PS: are you sure you shouldn't be using RPC or SOAP ?
 
S

s0suk3

Are you aware that recv() will not always return the amount of bytes asked for?
(send() is similar; it doesn't guarantee that the full buffer you pass to it will be
sent at once)

I suggest reading this:http://www.amk.ca/python/howto/sockets/sockets.html

--irmen

So every time I use I want to send some thing, I must use

totalsent = 0
while sent < len(data):
sent = sock.send(data[totalsent:])
totalsent += sent

instead of a simple sock.send(data)? That's kind of nasty. Also, is it
better then to use sockets as file objects? Maybe unbuffered?
 
H

Hrvoje Niksic

Nick Craig-Wood said:
What you are missing is that if the recv ever returns no bytes at all
then the other end has closed the connection. So something like this
is the correct thing to write :-

data = ""
while True:
new = client.recv(256)
if not new:
break
data += new

This is a good case for the iter() function:

buf = cStringIO.StringIO()
for new in iter(partial(client.recv, 256), ''):
buf.write(new)
data = buf.getvalue()

Note that appending to a string is almost never a good idea, since it
can result in quadratic allocation.
 
S

s0suk3

What you are missing is that if the recv ever returns no bytes at all
then the other end has closed the connection. So something like this
is the correct thing to write :-

data = ""
while True:
new = client.recv(256)
if not new:
break
data += new

From the man page for recv

RETURN VALUE

These calls return the number of bytes received, or -1 if an
error occurred. The return value will be 0 when the peer has
performed an orderly shutdown.

In the -1 case python will raise a socket.error.

But as I said in my first post, it's simple when you know the amount
of data that you're going to receive, or when you'll receive data
until the remote peer closes the connection. But what about receiving
a message with undetermined length in a connection that you don't want
to close? I already figured it out: in the case of an HTTP server/
client (which is what I'm doing), you have to look for an empty line
in the message, which signals the end of the message headers. As for
the body, you have to look at the Content-Length header, or, if the
message body contains the "chunked" transfer-coding, you have to
dynamically decode the encoding. There are other cases but those are
the most influent.

BTW, has anybody used sockets as file-like objects
(client.makefile())? Is it more secure? More efficient?

Sebastian
 
H

Hrvoje Niksic

Nick Craig-Wood said:
My aim was clear exposition rather than the ultimate performance!

That would normally be fine. My post wasn't supposed to pick
performance nits, but to point out potentially quadratic behavior.
Anyway str += was optimised in python 2.4 or 2.5 (forget which) wasn't
it?

That optimization works only in certain cases, when working with
uninterned strings with a reference count of 1, and then only when the
strings are in stored local variables, rather than in global vars or
in slots. And then, it only works in CPython, not in other
implementations. The optimization works by "cheating" -- breaking the
immutable string abstraction in the specific cases in which it is
provably safe to do so.
http://utcc.utoronto.ca/~cks/space/blog/python/ExaminingStringConcatOpt
examines it in some detail.

Guido was reluctant to accept the patch that implements the
optimization because he thought it would "change the way people write
code", a sentiment expressed in
http://mail.python.org/pipermail/python-dev/2004-August/046702.html
This discussion shows that he was quite right in retrospect. (I'm not
saying that the optimization is a bad thing, just that it is changing
the "recommended" way of writing Python in a way that other
implementations cannot follow.)
 
S

s0suk3

This is a good case for the iter() function:

buf = cStringIO.StringIO()
for new in iter(partial(client.recv, 256), ''):
buf.write(new)
data = buf.getvalue()

Note that appending to a string is almost never a good idea, since it
can result in quadratic allocation.

A question regarding cStringIO.StringIO(): is there a way to do get
getvalue() to return all the bytes after the current file position
(not before)? For example

buf = cStringIO.StringIO()
buf.write("foo bar")
buf.seek(3)
buf.getvalue(True) # the True argument means
# to return the bytes up
# to the current file position

That returns 'foo'. Is there a way to get it to return ' bar'?
 
G

Gabriel Genellina

A question regarding cStringIO.StringIO(): is there a way to do get
getvalue() to return all the bytes after the current file position
(not before)? For example

buf = cStringIO.StringIO()
buf.write("foo bar")
buf.seek(3)
buf.getvalue(True) # the True argument means
# to return the bytes up
# to the current file position

That returns 'foo'. Is there a way to get it to return ' bar'?

buf.read() - the obvious answer, once you know it :)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,982
Messages
2,570,186
Members
46,742
Latest member
AshliMayer

Latest Threads

Top