Developing a network protocol with Python

Laszlo Zsolt Nagy

Hello,

I would like to develop a new network protocol, where the server and the
clients are Python programs.
I think to be effective, I need to use TCP_NODELAY, and manually
buffered transfers.
I would like to create a general messaging object that has methods like

sendinteger
recvinteger
sendstring
recvstring

To be more secure, I think I can use this loads function to transfer
more elaborate Python structures:

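import cPickle
import cStringIO

def loads(s):
    """Loads an object from a string.

    @param s: The string to load the object from.
    @return: The object loaded from the string. This function will not
        unpickle globals and instances.
    """
    f = cStringIO.StringIO(s)
    p = cPickle.Unpickler(f)
    # With find_global set to None, cPickle raises UnpicklingError
    # whenever the stream refers to a global (a class or function).
    p.find_global = None
    return p.load()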

Am I on the right track to develop a new protocol?
Are there any common mistakes that programmers make?
Is there a howto where I can read more about this?

Thanks

Les
 
Diez B. Roggisch

Am I on the right track to develop a new protocol?
Are there any common mistakes that programmers make?
Is there a howto where I can read more about this?

If you _must_ develop your own protocol, at least use Twisted. But I'd
go for one of the existing solutions out there, namely Pyro. No need to
reinvent the wheel :)

Regards,

Diez
 
Tom Anderson

I think to be effective, I need to use TCP_NODELAY, and manually
buffered transfers.

Why?

I would like to create a general messaging object that has methods like

sendinteger
recvinteger
sendstring
recvstring

Okay. So you're really developing a marshalling layer, somewhere between
the transport and application layers - fair enough, there are a lot of
protocols that do that.
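
As a rough illustration of such a marshalling layer, here is a minimal sketch in present-day Python 3 rather than the Python 2 used elsewhere in this thread. Only the four method names come from the original post; the Messenger class name, the 8-byte integer encoding and the 4-byte length prefix are assumptions made for the example.

```python
import socket
import struct


class Messenger:
    """Sends and receives integers and strings over a connected socket."""

    def __init__(self, sock):
        self.sock = sock

    def _recv_exact(self, n):
        # recv() may return fewer bytes than requested, so loop until done.
        chunks = []
        while n > 0:
            chunk = self.sock.recv(n)
            if not chunk:
                raise ConnectionError("peer closed the connection")
            chunks.append(chunk)
            n -= len(chunk)
        return b"".join(chunks)

    def sendinteger(self, value):
        # "!q" means network byte order, signed 64-bit integer.
        self.sock.sendall(struct.pack("!q", value))

    def recvinteger(self):
        return struct.unpack("!q", self._recv_exact(8))[0]

    def sendstring(self, text):
        data = text.encode("utf-8")
        # 4-byte unsigned length prefix and payload go out in one write.
        self.sock.sendall(struct.pack("!I", len(data)) + data)

    def recvstring(self):
        (length,) = struct.unpack("!I", self._recv_exact(4))
        return self._recv_exact(length).decode("utf-8")
```

The length prefix is what lets the receiver find message boundaries, since TCP delivers a byte stream and does not preserve the sender's write sizes.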

To be more secure,

Do you really mean secure? I don't think using pickle will give you
security. If you want security, run your protocol over a TLS/SSL
connection.

If, however, you mean robustness, then this is a reasonable thing to do -
it reduces the amount of code you have to write, and so reduces the number
of bugs you'll write! One thing to watch out for, though, is the
compatibility of the pickling at each end - i have no idea what the
backwards- and forwards-compatibility of the pickle protocols is like, but
you might find that if they're on different python versions, the ends
won't understand each other. Defining your own protocol down to the
bits-on-the-socket level would preclude that possibility.
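
One way to reduce the version risk described above is to pin the pickle protocol number explicitly instead of relying on each interpreter's default. The following is a minimal Python 3 sketch; the constant name, the helper name and the choice of protocol 2 are assumptions for illustration. The pickle documentation describes the protocols as backward compatible, so a newer Python can read an older protocol, but classes whose definitions differ between the two ends can still fail to load.

```python
import pickle

# Assumed value: protocol 2 is readable by every Python since 2.3.
WIRE_PICKLE_PROTOCOL = 2


def dumps_for_wire(obj):
    # Never rely on the default protocol: it changes between Python releases.
    return pickle.dumps(obj, protocol=WIRE_PICKLE_PROTOCOL)


# Round trip on trusted, locally produced data only.
data = dumps_for_wire({"a": [1, 2]})
restored = pickle.loads(data)  # loads() detects the protocol by itself
```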

I think I can use this loads function to transfer more elaborate Python
structures:

def loads(s):
    """Loads an object from a string.
    @param s: The string to load the object from.
    @return: The object loaded from the string. This function will not
        unpickle globals and instances.
    """
    f = cStringIO.StringIO(s)
    p = cPickle.Unpickler(f)
    p.find_global = None
    return p.load()

I don't know the pickle module, so i can't comment on the code.

Am I on the right track to develop a new protocol?

Aside from the versioning issue i mention above, you should bear in mind
that using pickle will make it insanely hard to implement this protocol in
any language other than python (unless someone's implemented a python
pickle library in it - is there such a beast for any other language?).
Personally, i'd steer clear of doing it like this, and try to use an
existing, language-neutral generic marshalling layer. XML and ASN.1 would
be the obvious ones, but i wouldn't advise using either of them, as
they're abominations. JSON would be a good choice:

http://www.json.org/

If it's expressive enough for your objects. This is a stunningly simple
format, and there are libraries for working with it for a wide range of
languages.
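
For plain data, a JSON-based message can be sketched in a few lines of Python 3. The function names and the 4-byte length prefix are assumptions for this example, not part of any existing protocol.

```python
import json
import struct


def encode_message(obj):
    # Compact JSON, UTF-8 encoded, preceded by a 4-byte length prefix.
    body = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    return struct.pack("!I", len(body)) + body


def decode_message(data):
    (length,) = struct.unpack("!I", data[:4])
    body = data[4:4 + length]
    if len(body) != length:
        raise ValueError("truncated message")
    return json.loads(body.decode("utf-8"))
```

JSON only covers dictionaries, lists, strings, numbers, booleans and null. An object graph that contains a cycle cannot be encoded directly: json.dumps raises ValueError for it.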

Are there any common mistakes that programmers make?

The key one, i'd say, is not thinking about the future. Make sure your
protocol is able to grow - use a version number, so peers can figure out
what language they're talking, and perhaps an option negotiation
mechanism, if you're doing anything complex enough to warrant it (hey, you
could always start without it and add it in a later version!). Try to
allow for addition of new commands, message types or whatever, and for
extension of existing ones (within reason).
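
A version number and room for new message types can be as small as a fixed header in front of every message. This is a minimal Python 3 sketch under assumed names and field sizes (one byte for the version, one byte for the message type, four bytes for the payload length); none of it comes from an existing protocol.

```python
import struct

PROTOCOL_VERSION = 1
# Network byte order: version (1 byte), message type (1 byte), length (4 bytes).
HEADER = struct.Struct("!BBI")

MSG_PING = 1
MSG_DATA = 2


class UnsupportedVersion(Exception):
    pass


def pack_message(msg_type, payload, version=PROTOCOL_VERSION):
    return HEADER.pack(version, msg_type, len(payload)) + payload


def unpack_message(data):
    version, msg_type, length = HEADER.unpack_from(data)
    if version > PROTOCOL_VERSION:
        # The peer speaks a newer dialect than this code understands.
        raise UnsupportedVersion("peer uses protocol version %d" % version)
    payload = data[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated message")
    return version, msg_type, payload
```

Because the length is always in the header, a receiver can skip a message type it does not recognise instead of losing its place in the stream, which leaves room for adding commands later.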

Is there a howto where I can read more about this?

Not really - protocol design is a bit of a black art. Someone asked about
this on comp.protocols.tcp-ip a while ago:

http://groups.google.co.uk/group/co...read/thread/39f810b43a6008e6/72ca111d67768b83

And didn't get much in the way of answers. Someone did point to this,
though:

http://www.internet2.edu/~shalunov/writing/protocol-design.html

Although i don't agree with much of what that says.

tom
 
Laszlo Zsolt Nagy

Tom said:
Why?

Because of the big delays when sending small messages (size < 1500 bytes).
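
For reference, this is roughly what TCP_NODELAY plus manual buffering looks like with the standard socket module in Python 3. The two helper names are assumptions for the example. Delays of this kind are commonly attributed to Nagle's algorithm holding back small writes, which is the behaviour TCP_NODELAY switches off.

```python
import socket


def make_low_latency_socket():
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Disable Nagle's algorithm so small writes are sent immediately.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def send_batch(sock, parts):
    # Manual buffering: join the pieces of one logical message and hand
    # them to the kernel in a single call instead of many tiny writes.
    sock.sendall(b"".join(parts))
```

With Nagle disabled, every separate send can become its own small packet, so batching the parts of a message before sending matters more, not less.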

Personally, i'd steer clear of doing it like this, and try to use an
existing, language-neutral generic marshalling layer. XML and ASN.1 would
be the obvious ones, but i wouldn't advise using either of them, as
they're abominations. JSON would be a good choice:

http://www.json.org/

I need to send Python objects too. They are too elaborate to convert
them to XML. (They are using cyclic weak references and other Python
specific stuff.) I can be sure that on both sides, there are Python
programs. Is there any advantage in using XML if I already need to send
Python objects? Those objects cannot be represented in XML, unless
pickled into a CDATA string.

And didn't get much in the way of answers. Someone did point to this,
though:

http://www.internet2.edu/~shalunov/writing/protocol-design.html

Hmm, this was very helpful. Thank you!

Les
 
Lawrence Oluyede

On 2005-12-12, Laszlo Zsolt Nagy said:
Hello,

I would like to develop a new network protocol, where the server and the
clients are Python programs.

You should use Twisted for this:

Writing clients
http://twistedmatrix.com/projects/core/documentation/howto/clients.html

Writing servers
http://twistedmatrix.com/projects/core/documentation/howto/servers.html

I think to be effective, I need to use TCP_NODELAY, and manually
buffered transfers.
I would like to create a general messaging object that has methods like

sendinteger
recvinteger
sendstring
recvstring

You can inherit from twisted.internet.protocol.Protocol or one of its
subclasses; they handle buffering and all these sorts of things for
you. You don't have to reinvent the wheel.

To be more secure, I think I can use this loads function to transfer
more elaborate Python structures:

def loads(s):
    """Loads an object from a string.

    @param s: The string to load the object from.
    @return: The object loaded from the string. This function will not
        unpickle globals and instances.
    """
    f = cStringIO.StringIO(s)
    p = cPickle.Unpickler(f)
    p.find_global = None
    return p.load()

Unpickling untrusted data is *NOT* more secure:
http://www.python.org/doc/2.2.3/lib/pickle-sec.html
and
http://www.livejournal.com/users/jcalderone/15864.html
 
Lawrence Oluyede

On 2005-12-13, Laszlo Zsolt Nagy said:
I need to send Python objects too. They are too elaborate to convert
them to XML. (They are using cyclic weak references and other Python
specific stuff.) I can be sure that on both sides, there are Python
programs. Is there any advantage in using XML if I already need to send
Python objects? Those objects cannot be represented in XML, unless
pickled into a CDATA string.

If you have to send Python objects over the wire, the best way is to have
Twisted on both ends and use Twisted's Perspective Broker, see:
http://twistedmatrix.com/projects/core/documentation/howto/pb-intro.html
 
Irmen de Jong

Laszlo said:
I need to send Python objects too. They are too elaborate to convert
them to XML. (They are using cyclic weak references and other Python
specific stuff.) I can be sure that on both sides, there are Python
programs. Is there any advantage in using XML if I already need to send
Python objects? Those objects cannot be represented in XML, unless
pickled into a CDATA string.


Try Pyro http://pyro.sourceforge.net
before rolling your own Python-specific protocol.

--Irmen
 
Laszlo Zsolt Nagy

Try Pyro http://pyro.sourceforge.net
before rolling your own Python-specific protocol.

You are right. I wanted to use Pyro before, because it is well tested
and it has nice features.
Unfortunately, it is not good for me. :-(

I already have my own classes. My objects are in object ownership trees,
and they reference each other (weakly and strongly). These
classes have their own streaming methods, and they can be pickled
safely. This is something that pyro cannot handle. It cannot handle
these ownership object trees with weak references. Pyro can distribute
distinct objects only, and it uses an 'object naming scheme'. This is
not what I want. My objects do not have a name, and I do not want to
create 'technical names' just to force the Pyro style access. (Another
problem is that I do not want to rewrite my classes and inherit them
from the Pyro base object class.)

Thanks for the comment. I'm going to check the Pyro documentation again.
I might find something useful.

Les
 
Paul Rubin

Laszlo Zsolt Nagy said:
I already have my own classes. My objects are in object ownership
trees, and they reference each other (weakly and
strongly). These classes have their own streaming methods, and they
can be pickled safely.

Standard warning: if you're accepting requests from potentially
hostile sources, don't use pickle.
 
Laszlo Zsolt Nagy

Paul said:
Standard warning: if you're accepting requests from potentially
hostile sources, don't use pickle.

Yes, I know. I am not talking about TLS/SSL: there can be hostile people
who know a valid password and use a modified client program.

But how can I transfer pure python objects otherwise? Pyro also uses
Pickle and it also transfers bytecode.
Well, Pyro has an option to use XML messaging, but that is very
restricted, you cannot publish arbitrary python objects with XML. :-(

I read somewhere that Pickle had a security problem before Python 2.2,
but after 2.2 it has been solved.
BTW, how do CORBA and COM do this? They can do object marshaling safely.
Can we do the same with Python?
Isn't it enough to set find_global on a cPickle Unpickler?

Les
 
Paul Rubin

Laszlo Zsolt Nagy said:
But how can I transfer pure python objects otherwise? Pyro also uses
Pickle and it also transfers bytecode.

Pyro in the past used pickle in an insecure way. I'd heard it had
been fixed and I didn't realize it still uses pickle.

I read somewhere that Pickle had a security problem before Python 2.2,
but after 2.2 it has been solved.

If you use pickle in the obvious way, it's definitely still insecure.
There is some info in the docs about how to use some special pickle
features to protect yourself from the insecurity, but you have to go
out of your way for that. I'm skeptical that they really protect you
in all cases, so I'd avoid unpickling any untrusted data. But I don't
know a specific exploit.

BTW, how do CORBA and COM do this? They can do object marshaling safely.

I think they don't let you marshal arbitrary class instances and have
the class constructors called as part of demarshalling (COM anyway, I
don't know about CORBA).

Can we do the same with Python?

Yes, of course, it's possible in principle, but pickle doesn't do it
that way.

See SF RFE #467384 and bug #471893 for some more discussion of this.
Basically I think these issues are better understood now than they
were a few years ago.

Isn't it enough to set find_global on a cPickle Unpickler?

I can't definitely say the answer is no but I feel quite paranoid
about it. The cPickle code needs careful review which I don't think
it's gotten. It was not written with security in mind, though some
security hacks were added as afterthoughts. I continue to believe
that Python should have a deserializer designed to be absolutely
bulletproof no matter what anyone throws at it, and it doesn't
currently have one. I've gotten by with limited, ad hoc wire formats
for the applications where I've needed them.
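
For readers on Python 3, where cPickle and find_global no longer exist, the closest equivalent of the find_global = None trick is to subclass pickle.Unpickler and override find_class, as sketched below. The class and function names are assumptions for the example. It rejects every reference to a class or function, but the pickle documentation still warns against unpickling untrusted data, so treat this as a restriction and not as a proof of safety.

```python
import io
import pickle


class NoGlobalsUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the stream refers to a class or function.
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name))


def loads_no_globals(data):
    """Unpickle bytes, allowing only built-in container and scalar types."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()
```

Dictionaries, lists, tuples, strings and numbers still load, because their pickle opcodes never look up a global; anything that needs a class or a callable is refused.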
 
Laszlo Zsolt Nagy

Paul said:
Pyro in the past used pickle in an insecure way. I'd heard it had
been fixed and I didn't realize it still uses pickle.

On the features page, you can read this:

"Mobile objects. Clients and servers can pass objects around - even when
the server has never known them before. Pyro will then automatically
transfer the needed Python bytecode."

I believe that using cPickle and transferring data (but not the code) is
still more secure than transferring bytecode. :)

Les
 
Irmen de Jong

Laszlo said:
"Mobile objects. Clients and servers can pass objects around - even when
the server has never known them before. Pyro will then automatically
transfer the needed Python bytecode."

I believe that using cPickle and transferring data (but not the code) is
still more secure than transferring bytecode. :)

Note that the mobile *code* feature of Pyro is off by default.
And that the transfer of bytecodes is only part of the "problem",
because it is possible to craft specially constructed pickle streams
that will do nasty things on the receiving side...
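
A harmless Python 3 demonstration of the point about crafted pickle streams: the stream can name any importable callable and have it invoked during loading, without transferring any bytecode. The Payload class is a made-up name for this sketch, and the built-in len stands in for whatever an attacker would really pick.

```python
import pickle


class Payload:
    def __reduce__(self):
        # Tells pickle: "to rebuild me, call len('abc')".
        # An attacker would name a dangerous callable here instead.
        return (len, ("abc",))


data = pickle.dumps(Payload())

# The receiver never defined Payload and receives no bytecode, yet
# loading the stream calls len("abc") and returns its result.
result = pickle.loads(data)
```

Here result is the integer 3, not a Payload instance: the unpickler called the function named in the stream. This is why plain pickle.loads on data from an untrusted peer is unsafe.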

--Irmen
 
