python concurrency proposal

C

corey.coughlin

Alright, so I've been following some of the arguments about enhancing
parallelism in python, and I've kind of been struck by how hard things
still are. It seems like what we really need is a more pythonic
approach. One thing I've been seeing suggested a lot lately is that
running jobs in separate processes, to make it easy to use the latest
multiprocessor machines. Makes a lot of sense to me, those processors
are going to be more and more popular as time goes on. But it would
also be nice if it could also turn into a way to make standard
threading a little easier and trouble free. But I'm not seeing an easy
way to make it possible with the current constraints of the language,
so it seems like we're going to need some kind of language improvement.
Thinking of it from that perspective, I started thinking about how it
would be easy to deal with in a more idealized sense. It would be nice
to abstract out the concept of running something in parallel to
something that can be easily customized, is flexible enough to use in a
variety of concepts, and is resonably hard to screw up and fairly easy
to use. Perhaps a new built-in type might be just the trick. Consider
a new suite:

pardef <Name>(self, <par type>, arguments...):
self.send(<dest pardef>, <tag>, arguments)
self.receive(<tag>, arguments)
return arguments
yield arguments

so the object would then be something you can create an instance of,
and set up like a normal object, and it would have other interface
functions as well. Consider your basic vector add operation:

import concurrent
import array

pardef vecadd(self, concurrent.subprocess, veca, vecb, arrtype):
import array
output = array.array(arrtype)
for a,b in zip(veca, vecb):
output.append( a + b)
return output

a = array.array('d')
b = array.array('d')
for i in range(1000):
a.append(float(i))
b.append(float(i))

h1 = vecadd(a[:500], b[:500], 'd')
h2 = vecadd()
h2.veca = a[500:]
h2.vecb = b[500:]
h2.arrtype = 'd'

h1.run()
h2.run()
c = h1.result + h2.result

You can see a few things in this example. First off, you'll notice
that vecadd has the import for array inside it. One of the most
important things about the pardef is that it must not inherit anything
from the global scope, all variable passing must occur through either
the arguments or .receive statements. You'll also notice that it's
possible to set the arguments like instance values. This isn't as
important in this case, but it could be very useful for setting
arguments for other pardefs. Take this example of your basic SIMD-ish
diffusion simulation:

import concurrent

pardef vecadd(self, concurrent.subprocess, right, left, up, down,
initval):
current = initval
maxruns = 100
updef = not (isinstance(up, int) or isintance(up, float))
downdef = not (isinstance(down, int) or isintance(down, float))
rightdef = not (isinstance(right, int) or isintance(right, float))
leftdef = not (isinstance(left, int) or isintance(left, float))
for i in range(maxruns):
if updef:
upval = self.receive(up, 'up')
else:
upval = up
if downdef:
downval = self.receive(down, 'down')
else:
downval = down
if rightdef:
rightval = self.receive(right, 'right')
else:
rightval = right
if leftdef:
leftval = self.receive(left, 'left')
else:
leftval = left
current = (upval + downval + leftval + rightval) / 4
if updef:
up.send('down', current)
if downdef:
down.send('up', current)
if rightdef:
right.send('left', current)
if leftdef:
left.send('right', current)
return current

diffgrid = {}
for x, y in zip(range(10), range(10)):
diffgrid[(x, y)] = vecadd()
for x, y in zip(range(10), range(10)):
gridpt = diffgrid[(x, y)]
gridpt.initval = 50.0
if x == 0:
gridpt.left = 75.0
else:
gridpt.left = diffgrid[(x-1, y)]
if x == 10:
gridpt.right = 50.0
else:
gridpt.right = diffgrid[(x+1, y)]
if y == 0:
gridpt.down = 0.0
else:
gridpt.down = diffgrid[(x, y-1)]
if y == 10:
gridpt.up = 100.0
else:
gridpt.up = diffgrid[(x, y+1)]
for coord in diffgrid:
diffgrid[coord].run()
for x, y in zip(range(10), range(10)):
print '(%i, %i) = %f' % (x, y, diffgrid[(x,y)].return())

Now sure, this example is a little contrived, but it shows the utility
of allowing the input parameters to be set after instantiating the
pardef. You can also imagine that this would be useful for running a
single pardef over and over again with different arguments.
Remember that pardefs don't inherit any data from the global scope.
Data is only passed in through receive statements and the arguments.
The <par type> would control how the send and receive functions work,
and how the return and yield pass data back. In a way, the pardef
works something like a built in type, and something like an interface.
In this way, it can be used to implement different kinds of
parallelism. It could be used for threads, for processes, and
conceivably for clustering as well. I suspect that the easiest way to
implement it would be to use strict data copying for send and recieve,
and only allow shared data through the arguments, but it would be
easiest to leave it to the <partype> implementation. This would in
effect create this data type as more of an interface than anything
else. Since parallel processing is kind of complicated, this would
allow for a number of different approaches, but still keep them all
within the same framework, creating consistency between the approaches.
You could test with a safe version of threading, then change it to a
less safe but higher performance version if you need to, and so on.
Ultimately the approach would be obvious without having to get bogged
down in details.
So to be specific, the interface to this object would consist of at
least the following functions:

..run() -> runs the code, or restarts from a yield, non-blocking
..return() -> gets the return or yield value of the code, blocking
..send(<label>, values...) -> sends a value to the code, non-blocking
..receive(<label>) -> gets a new value for some code, blocking
..kill() -> kills some running code, blocking

Now sure, I suppose there could be other methods available, but I can't
think of any other good ones off hand. But anyway, that's pretty much
what I was thinking of, more or less. Of course, implementing it all
would be more than a little tricky. For one, all the standard
libraries that accessed limited system resources would have to be
played with, like file access and other such things. For instance,
consider 'print', which writes stuff to stdout, would probably have to
be rewritten as something like this:

import concurrent
import sys
pardef parprint(self, concurrent.systhread, outfile):
while True:
newmessage = self.receive('message')
outfile.write('%s\n' % newmessage)
outfile.flush() #may not be necessary

printp = parprint(sys.__stdout__)
printp.run()

then every 'print' call will need to be intercepted and recast as
something like this:

printp.send('message',outstr)

Of course, this kind of rewrite would only be needed for programs using
concurrency. I suppose you could have some kind of switch that got set
in the parser to detect pardef's and use concurrent libraries for
those, and use standard ones otherwise. In fact, if we wanted to ease
into the implementation, it might be simpler to allow ther parser to
detect when a parallel library isn't available for a function, so that
unimplemented libraries would give an error. For most libraries that
don't use scarce system resources, a parallel compatible library could
be added easily.
There could be some issues with this that I haven't thought of,
though. There may be some forms of concurrency that wouldn't work too
well with this scheme. It would take another keyword, and that usually
seems to be trouble. I'm not really even that fond of 'pardef', but
the alternatives I can think of (concurrentdef, parallelblock,
threaddef, processdef, thread, process, etc.) seem worse. Be sure to
chime in if you have any suggestions for a better one. There is also
no existing implementation for this, I thought I'd run it by everyone
here to see if it would even have a chance of being accepted. So hey,
let me know what you think, and if most people like it, I guess I'll
get started on an implementation and PEP and all that. OK? So what do
you all think?
 
P

Peter Tillotson

I'd really like to see a concurrency system come into python based on
theories such as Communicating Sequential Processes (CSP) or its
derivatives lambda or pi calculus. These provide an analytic framework
for developing multi thread / process apps. CSP like concurrency is one
of the hidden gems in the Java Tiger release (java.util.concurrency).
The advantages of the analytic framework is that they minimise livelock,
deadlock and facilitate debugging.

I'm no expert on the theory but i've developed under these frameworks
and found them a more reliable way of developing distributed agent systems.

You may also be interested in looking at
http://sourceforge.net/projects/pympi

p
Alright, so I've been following some of the arguments about enhancing
parallelism in python, and I've kind of been struck by how hard things
still are. It seems like what we really need is a more pythonic
approach. One thing I've been seeing suggested a lot lately is that
running jobs in separate processes, to make it easy to use the latest
multiprocessor machines. Makes a lot of sense to me, those processors
are going to be more and more popular as time goes on. But it would
also be nice if it could also turn into a way to make standard
threading a little easier and trouble free. But I'm not seeing an easy
way to make it possible with the current constraints of the language,
so it seems like we're going to need some kind of language improvement.
Thinking of it from that perspective, I started thinking about how it
would be easy to deal with in a more idealized sense. It would be nice
to abstract out the concept of running something in parallel to
something that can be easily customized, is flexible enough to use in a
variety of concepts, and is resonably hard to screw up and fairly easy
to use. Perhaps a new built-in type might be just the trick. Consider
a new suite:

pardef <Name>(self, <par type>, arguments...):
self.send(<dest pardef>, <tag>, arguments)
self.receive(<tag>, arguments)
return arguments
yield arguments

so the object would then be something you can create an instance of,
and set up like a normal object, and it would have other interface
functions as well. Consider your basic vector add operation:

import concurrent
import array

pardef vecadd(self, concurrent.subprocess, veca, vecb, arrtype):
import array
output = array.array(arrtype)
for a,b in zip(veca, vecb):
output.append( a + b)
return output

a = array.array('d')
b = array.array('d')
for i in range(1000):
a.append(float(i))
b.append(float(i))

h1 = vecadd(a[:500], b[:500], 'd')
h2 = vecadd()
h2.veca = a[500:]
h2.vecb = b[500:]
h2.arrtype = 'd'

h1.run()
h2.run()
c = h1.result + h2.result

You can see a few things in this example. First off, you'll notice
that vecadd has the import for array inside it. One of the most
important things about the pardef is that it must not inherit anything
from the global scope, all variable passing must occur through either
the arguments or .receive statements. You'll also notice that it's
possible to set the arguments like instance values. This isn't as
important in this case, but it could be very useful for setting
arguments for other pardefs. Take this example of your basic SIMD-ish
diffusion simulation:

import concurrent

pardef vecadd(self, concurrent.subprocess, right, left, up, down,
initval):
current = initval
maxruns = 100
updef = not (isinstance(up, int) or isintance(up, float))
downdef = not (isinstance(down, int) or isintance(down, float))
rightdef = not (isinstance(right, int) or isintance(right, float))
leftdef = not (isinstance(left, int) or isintance(left, float))
for i in range(maxruns):
if updef:
upval = self.receive(up, 'up')
else:
upval = up
if downdef:
downval = self.receive(down, 'down')
else:
downval = down
if rightdef:
rightval = self.receive(right, 'right')
else:
rightval = right
if leftdef:
leftval = self.receive(left, 'left')
else:
leftval = left
current = (upval + downval + leftval + rightval) / 4
if updef:
up.send('down', current)
if downdef:
down.send('up', current)
if rightdef:
right.send('left', current)
if leftdef:
left.send('right', current)
return current

diffgrid = {}
for x, y in zip(range(10), range(10)):
diffgrid[(x, y)] = vecadd()
for x, y in zip(range(10), range(10)):
gridpt = diffgrid[(x, y)]
gridpt.initval = 50.0
if x == 0:
gridpt.left = 75.0
else:
gridpt.left = diffgrid[(x-1, y)]
if x == 10:
gridpt.right = 50.0
else:
gridpt.right = diffgrid[(x+1, y)]
if y == 0:
gridpt.down = 0.0
else:
gridpt.down = diffgrid[(x, y-1)]
if y == 10:
gridpt.up = 100.0
else:
gridpt.up = diffgrid[(x, y+1)]
for coord in diffgrid:
diffgrid[coord].run()
for x, y in zip(range(10), range(10)):
print '(%i, %i) = %f' % (x, y, diffgrid[(x,y)].return())

Now sure, this example is a little contrived, but it shows the utility
of allowing the input parameters to be set after instantiating the
pardef. You can also imagine that this would be useful for running a
single pardef over and over again with different arguments.
Remember that pardefs don't inherit any data from the global scope.
Data is only passed in through receive statements and the arguments.
The <par type> would control how the send and receive functions work,
and how the return and yield pass data back. In a way, the pardef
works something like a built in type, and something like an interface.
In this way, it can be used to implement different kinds of
parallelism. It could be used for threads, for processes, and
conceivably for clustering as well. I suspect that the easiest way to
implement it would be to use strict data copying for send and recieve,
and only allow shared data through the arguments, but it would be
easiest to leave it to the <partype> implementation. This would in
effect create this data type as more of an interface than anything
else. Since parallel processing is kind of complicated, this would
allow for a number of different approaches, but still keep them all
within the same framework, creating consistency between the approaches.
You could test with a safe version of threading, then change it to a
less safe but higher performance version if you need to, and so on.
Ultimately the approach would be obvious without having to get bogged
down in details.
So to be specific, the interface to this object would consist of at
least the following functions:

.run() -> runs the code, or restarts from a yield, non-blocking
.return() -> gets the return or yield value of the code, blocking
.send(<label>, values...) -> sends a value to the code, non-blocking
.receive(<label>) -> gets a new value for some code, blocking
.kill() -> kills some running code, blocking

Now sure, I suppose there could be other methods available, but I can't
think of any other good ones off hand. But anyway, that's pretty much
what I was thinking of, more or less. Of course, implementing it all
would be more than a little tricky. For one, all the standard
libraries that accessed limited system resources would have to be
played with, like file access and other such things. For instance,
consider 'print', which writes stuff to stdout, would probably have to
be rewritten as something like this:

import concurrent
import sys
pardef parprint(self, concurrent.systhread, outfile):
while True:
newmessage = self.receive('message')
outfile.write('%s\n' % newmessage)
outfile.flush() #may not be necessary

printp = parprint(sys.__stdout__)
printp.run()

then every 'print' call will need to be intercepted and recast as
something like this:

printp.send('message',outstr)

Of course, this kind of rewrite would only be needed for programs using
concurrency. I suppose you could have some kind of switch that got set
in the parser to detect pardef's and use concurrent libraries for
those, and use standard ones otherwise. In fact, if we wanted to ease
into the implementation, it might be simpler to allow ther parser to
detect when a parallel library isn't available for a function, so that
unimplemented libraries would give an error. For most libraries that
don't use scarce system resources, a parallel compatible library could
be added easily.
There could be some issues with this that I haven't thought of,
though. There may be some forms of concurrency that wouldn't work too
well with this scheme. It would take another keyword, and that usually
seems to be trouble. I'm not really even that fond of 'pardef', but
the alternatives I can think of (concurrentdef, parallelblock,
threaddef, processdef, thread, process, etc.) seem worse. Be sure to
chime in if you have any suggestions for a better one. There is also
no existing implementation for this, I thought I'd run it by everyone
here to see if it would even have a chance of being accepted. So hey,
let me know what you think, and if most people like it, I guess I'll
get started on an implementation and PEP and all that. OK? So what do
you all think?
 
C

Cameron Laird

I'd really like to see a concurrency system come into python based on
theories such as Communicating Sequential Processes (CSP) or its
derivatives lambda or pi calculus. These provide an analytic framework
for developing multi thread / process apps. CSP like concurrency is one
of the hidden gems in the Java Tiger release (java.util.concurrency).
The advantages of the analytic framework is that they minimise livelock,
deadlock and facilitate debugging.

I'm no expert on the theory but i've developed under these frameworks
and found them a more reliable way of developing distributed agent systems.

You may also be interested in looking at
http://sourceforge.net/projects/pympi
.
.
.
Yes. Parallelism certainly deserves attention, and I believe
"amateurs" are likely to help in the breakthroughs to come. I
further suspect, though, that they'll be amateurs who benefit
from knowledge of existing research into the range of documented
concurrency concepts, including CSPs, tasks, guarded methods,
microthreads, weightless threads, chords, co-routines, and so on.
 
M

Michael

Alright, so I've been following some of the arguments about enhancing
parallelism in python, and I've kind of been struck by how hard things
still are. It seems like what we really need is a more pythonic
approach. [... major snippage ...]
OK? So what do you all think?

On the surface of it, what you've described resembles Kamaelia[1] -
specifically in the way used in the Axon Shell [2]. In other ways it
differs wildy. I think it's interesting because this (or some of this)
could be used as syntactic sugar for Kamaelia.
[1] http://kamaelia.sourceforge.net/Home
[2] http://kamaelia.sourceforge.net/AxonShell.html

The similarity is this: a pardef appears to be a function that can run,
and be made to run in parallel with other functions. In Kamaelia we use
generators to achieve precisely this.

However, given pythonic is a loaded term (beauty is in the eye of the
beholder), I'd personally say that there's some elements of your syntax
that suprise me, especially given the (IMO valid) approach of not being
able to access inside the pardef.

Since there are some strong similarities between what you've written and
Kamaelia it seems sensible to discuss them.

If you have your pardef:
pardef <Name>(self, <par type>, arguments...):
self.send(<dest pardef>, <tag>, arguments)
self.receive(<tag>, arguments)
return arguments
yield arguments

This maps to this: (in your terms :) )

class <Name>(<par type>):
Inboxes = [ <tag>, <tag>, <tag> ] # Explicit is better than implicit
Outboxes = [ <tag>, <tag>, <tag> ] # There are defaults available...
def __init__(self, arguments ...):
// standard initialisation //
super(<Name>, self).__init__()

def main(self):
// local
// Initialisation
while 1:
do stuff
yield //some value// (relinquish control to scheduler)
return
// We don't have the concept of a result, though this
would be useful, though we do have something
similar so this would be doable //

Inboxes and Outboxes are used as declarations to allow the baseclass
(component) to initialise some datastructures for communications. These
declarations are actually any iterable, and these days we tend to use
dictionaries because it simplifies documentation.

An example here might control the behaviour of a sprite onscreen. So,
suppose we have a sprite:

(This is a simplified example of an existing component)

class SimpleSprite(component, pygame.sprite.Sprite):
Inboxes = { "position" : "Expect to receive (x:int,y:int) values",
"rotation" : "Expect to an int in range 0..359",
}
def __init__(self, image, pos):
pygame.sprite.Sprite.__init__(self)
component.__init__(self) # Can't use super here because we inherit
# from pygame.sprite.Sprite as well
self.original = image
self.image = image # self.image is displayed by pygame
self.pos = pos

def main(self):
pos = self.pos
rotation = 0
image = self.image
while 1:
self.image = self.original
if not self.anyReady():
self.pause() # Ask scheduler not to run us until we
# receive data on an inbox
yield 1
if self.dataReady("position"):
pos = self.recv("position")
if self.dataReady("rotation"):
angle = self.recv("rotation")
self.image = pygame.transform.rotate(self.image, angle)
self.rect.center = pos
yield 1

We could then have some game logic that sends out information that controls
this sprite over time:

class WalkInASquare(component):
Outboxes = {
"position" : "Sends out an (x:int, y:int) pair",
"orientation" : "Sends out an angular orientation",
}
def __init__(self, left, top, right, bottom, steps):
# We assume here that left < right and top < bottom
# In something real you'd want to check or enforce this
# (eg asserts or swapping)
self.left = left
self.top = top
self.right = right
self.bottom = bottom
def main(self):
# You'd ideally want to check for shutdown messages as well
x = self.left
y = self.top
while 1: # do this forever, normally we'd shutdown
# Walk right
self.send(90, "orientation")
for i in xrange(self.left, self.right, steps):
self.send((i,y), "position")
yield 1
x = right

# Walk down
self.send(180, "orientation")
for i in xrange(self.top, self.bottom, steps):
self.send((x,i), "position")
yield 1
y = self.bottom

# Walk left
self.send(270, "orientation")
for i in xrange(self.right, self.left, -steps):
self.send((i,y), "position")
yield 1
x = self.left

# Walk up
self.send(0, "orientation")
for i in xrange(self.bottom, self.top, -steps):
self.send((x,i), "position")
yield 1
y = self.top

The logic of walking around should be clear:
* We take a step, and send out our position
* If we reach the end, we turn to face the new direction and
send that value out.

Clearly for this to be interesting this can be joined with the sprite.

Unlike your system, our components (your pardefs), don't know
about each other and only talk to inboxes AND outboxes - these
are connected at a higher level by something enclosing them.

Graphline(
Walker = SimpleSprite(walkerImage, 10,10),
WalkerLogic = WalkInASquare(10,10,400,400,50),
linkages = {
("WalkerLogic", "position") : ("Walker", "position"),
("WalkerLogic", "orientation") : ("Walker", "rotation"),
}
)

This graphline can then be activated or run like any other component.

[[ For more complete examples, grab Kamaelia from the website,
download and run :) ]]

You'll note from the above that clearly there are aspects to it where
your ideas could be used as syntactic sugar. Some of the benefits (like
collapsing __init__ and main together) can be gained by using decorators
in python 2.4.

You'll also note there are some areas where your ideas increase parallel
safety (such as when sending mutable data structures), at the expense on
increasing the cost of communication.

I'm not saying the above is the paragon of virtue (it isn't) - we're looking
to see what's possible - but not rushing into creating syntax until we have
an idea that the approaches work. That said, we are interested in ideas for
syntaxes. At the moment I'd prefer these to be based on API enhancements
rather than changes to python since I personally think that's an extreme
approach until you know the change to the language will actually be
beneficial.

((In some respects we're held back from some improvements because we want
to stay supporting python on series 60 (which limits us to 2.2 late
features rather than things like decorators and PEP 342 enhancements).
We may drop that if necessary later or simply have a specialised version
there. This might explain the emphasis on ))

Regarding other points, of your suggestions though...

You allow the following to be equivalent initialisations:

h1 = vecadd(a[:500], b[:500], 'd')
# vs
h1 = vecadd()
h1.veca = a[:500]
h1.vecb = b[:500]
h1.arrtype = 'd'

To me this looks dangerous. (as in likely to confuse and cause bugs)

What if vecadd is implemented as a thread, and can run concurrently with
the main piece of code? (especially since you suggest reusing the same
active object) This allows the possibility:

h1 = vecadd()
h1.run() # does this background? Do we block? (appears to do the former)
h1.veca = a[:500] # What happens, is h1 blocked until we get here?
h1.vecb = b[:500]
h1.arrtype = 'd'
h1.veca = b[:500] # Uh oh, what's happening here?
c = h1.result + h2.result # Um, what does this now mean ?

were the values to veca queued? Were they overwritten? Would this be valid?
To me it's completely non-obvious what this code means on reading it. It
also *looks* like you're updating internal state of the object rather than
an external interface. (It also has the implicit suggestion that you can
read these values as well, which may or may not be valid)

I think the core to of your idea is that your new suite really introduces
a scheduler for generators and a means for communicating (you call them
tags). What you've done here is also make the arguments for creation mutable
which becomes confusing.

I'm not trying to discourage you, I like the ideas, and would like to see
you expand them more since they interest me, but you have some other
inconsistencies. For example, you say here:

Remember your statement that pardefs don't inherit any data from the global
scope. Data is only passed in through receive statements and the arguments.

In practice however, you also have in your examples this:
updef = not (isinstance(up, int) or isintance(up, float))
downdef = not (isinstance(down, int) or isintance(down, float))
rightdef = not (isinstance(right, int) or isintance(right, float))
leftdef = not (isinstance(left, int) or isintance(left, float))

In these lines you have passed over global scope data - specifically
int and float.

Unlike your approach (syntax first), we've created a large (growing) number
of components (like your pardefs) which have been put together for various
systems which have varying levels of concurrency which are probably a useful
testbed for testing your ideas:

* http://kamaelia.sourceforge.net/Components.html

Example systems created using this vary from a "everything broadcast" PVR
for radio [*], a networked audio mixer matrix, simple games, a presentation
tool (probably including soon a live video mixer to go with the video
playback), topology viewing tools, a simple "paint" program, etc)

* Actually a box for creating a record of transmission, which
is slightly different, but the former description is more
accessible if slightly inaccurate.)

If you're interested in prototyping your ideas in python, you can simulate
some of your ideas using decorators. Something that might help you with
prototyping your ideas is our tutorial for Kamaelia, which is a "build your
own core" system type tutorial. It might also help show that your pardefs
are very similar to python's generators. It can be found here:
* http://kamaelia.sourceforge.net/MiniAxon/

In many respects, I feel that the API we have still isn't "100% pythonic" as
many would call it, but it does try to be unsurprising and consistent. You
could say we're aiming for pythonic - though personally I favour easy and
unsurprising as more concrete, less abstract goals - even if it's not there
yet. (portability of ideas to other languages is important to me, which
again is another reason for an API based view rather than syntax).

If you're willing to take a look at it, and make suggestions as to how your
ideas might fit, (or what you think is dumb :) I'd welcome it.

*Especially* if it simplifies the system (*or* the way it's used).

Finally though, as I say above,I'm not trying to discourage you, I like
the ideas, and would like to see you expand them more since they interest
me, and like you I think this is an area that needs work. I would suggest
playing with the ideas though and testing them against systems before
writing a PEP though. (real/useful systems rather than contrived ones! -)

Best Regards & Happy New Year,


Michael.
--
(e-mail address removed), http://kamaelia.sourceforge.net/
British Broadcasting Corporation, Research and Development
Kingswood Warren, Surrey KT20 6NP

This message (and any attachments) may contain personal views
which are not the views of the BBC unless specifically stated.
 
C

Corey Coughlin

Hey, some responses, let's see...

Peter said:
I'd really like to see a concurrency system come into python based on
theories such as Communicating Sequential Processes (CSP) or its
derivatives lambda or pi calculus. These provide an analytic framework
for developing multi thread / process apps. CSP like concurrency is one
of the hidden gems in the Java Tiger release (java.util.concurrency).
The advantages of the analytic framework is that they minimise livelock,
deadlock and facilitate debugging.

Yes, a CSP-like system would be kind of nice. Of course, to really pull
it off, you'd probably need some kind of primitive to represent a simple
process. Which is kind of what I'm proposing. It's kind of a primitive
version, but an object to easily represent processes and communication
channels would be a big help, I'd imagine. Once a primitive is in
place, I believe it would be fairly easy to build up a full set of
CSP-ish primitives to help assemble full systems.
I'm no expert on the theory but i've developed under these frameworks
and found them a more reliable way of developing distributed agent systems.

You may also be interested in looking at
http://sourceforge.net/projects/pympi

Ah yes, pympi, that's good stuff. It does suggest that I might need to
add blocking and non-blocking version of send and receive, there might
be a need for that. Especially the non-blocking receive, that looks
very handy. And maybe a .status for polling the status of running jobs.
It does go a lot further than I do with this proposal, it adds all
that stuff for collecting the jobs, running them as a group, gathering
results, and so on. This proposal could probably use an extra library
to provide a bunch of those type of operations, but I wanted to start a
little bit small here with just the proposal of a process primitive.
That seems like it would be a good first step. Of course, what I'm
really hoping for is that at some point you could do something like this:

import mpi

pardef vecadd(self, mpi.mpiproc, ...)

and so on. I'm not really all that concerned about the communication
channels and process distributions work, I suspect that many people will
want to try different methods like MPI, or the standard pthread wrapper
with some kind of standard queue communications channel, or pyro, maybe
even Kamaelia. I'm just proposing the primitive for it. So, if there's
anything else you'd like to see it work like, be sure to let me know.
Thanks for the input!

----- Corey
 
C

Corey Coughlin

Yes. Parallelism certainly deserves attention, and I believe
"amateurs" are likely to help in the breakthroughs to come. I
further suspect, though, that they'll be amateurs who benefit
from knowledge of existing research into the range of documented
concurrency concepts, including CSPs, tasks, guarded methods,
microthreads, weightless threads, chords, co-routines, and so on.

Yes, there are lots of different concepts, even in python, there's pympi
(as was mentioned), the standard python thread library, the subprocess
library, generators, microthreads and stackless, not to mention
Candygram, PyLinda, ATOM, Kamaelia (get to that in a minute), and other
things you can search for on the web. My motivation here is just to see
if I can find some lowest common denominator, to try to simplify this
stuff to the point where the whole concept is a little easier to use,
and the plumbing can be hidden away somewhere so "amateurs" don't have
to worry about it (too much) if they don't want to.
Now to be more specific, there does seem to be a lot of work with
generators to set up concurrency, and that's fine, but it does seem like
it takes a bunch of scaffolding and a different way of looking at
things, and it's not really obvious to me how it can scale up on
multiple processor systems with the GIL still in place. Now I'm not
sure that this will be the answer to all the problems, but breaking the
global address space and making it easy to break up jobs into small
communicating chunks seems like it would be a good way to go. Or maybe
I'm missing something? Is there anything you'd care to elaborate on?
 
C

Corey Coughlin

OK, thanks for all this criticism, you've obviously taken some time
here, guess I'll see if I can help clear some of this up....

Michael wrote:
On the surface of it, what you've described resembles Kamaelia[1] -
specifically in the way used in the Axon Shell [2]. In other ways it
differs wildy. I think it's interesting because this (or some of this)
could be used as syntactic sugar for Kamaelia.
[1] http://kamaelia.sourceforge.net/Home
[2] http://kamaelia.sourceforge.net/AxonShell.html

The similarity is this: a pardef appears to be a function that can run,
and be made to run in parallel with other functions. In Kamaelia we use
generators to achieve precisely this.

Hey, you got it, great!
However, given pythonic is a loaded term (beauty is in the eye of the
beholder), I'd personally say that there's some elements of your syntax
that suprise me, especially given the (IMO valid) approach of not being
able to access inside the pardef.

Since there are some strong similarities between what you've written and
Kamaelia it seems sensible to discuss them.

If you have your pardef:
pardef <Name>(self, <par type>, arguments...):
self.send(<dest pardef>, <tag>, arguments)
self.receive(<tag>, arguments)
return arguments
yield arguments

This maps to this: (in your terms :) )

class <Name>(<par type>):
Inboxes = [ <tag>, <tag>, <tag> ] # Explicit is better than implicit
Outboxes = [ <tag>, <tag>, <tag> ] # There are defaults available...
def __init__(self, arguments ...):
// standard initialisation //
super(<Name>, self).__init__()

def main(self):
// local
// Initialisation
while 1:
do stuff
yield //some value// (relinquish control to scheduler)
return
// We don't have the concept of a result, though this
would be useful, though we do have something
similar so this would be doable //

Inboxes and Outboxes are used as declarations to allow the baseclass
(component) to initialise some datastructures for communications. These
declarations are actually any iterable, and these days we tend to use
dictionaries because it simplifies documentation.

Yes, it would map in a more detailed level to a class of some kind with
the methods indicated. Although, I'm imagining that for different <par
type> base classes, the init and main loops would be implemented in
different ways. Different types of communication channels (shared
memory copies, file based pickling, socket communication, and so on)
would also have different implementations as well, beyond Inboxes and
Outboxes. This could make for some much more complicated base classes.
It also suggests some problems, notably that of coercing different
messages from different types of base classes (i.e., suppose the
standard library os and system classes get implemented as threads, but
you still want to send and receive messages to and from them from
processes and clustered processes).
An example here might control the behaviour of a sprite onscreen. So,
suppose we have a sprite:
Unlike your system, our components (your pardefs), don't know
about each other and only talk to inboxes AND outboxes - these
are connected at a higher level by something enclosing them.

Graphline(
Walker = SimpleSprite(walkerImage, 10,10),
WalkerLogic = WalkInASquare(10,10,400,400,50),
linkages = {
("WalkerLogic", "position") : ("Walker", "position"),
("WalkerLogic", "orientation") : ("Walker", "rotation"),
}
)

This graphline can then be activated or run like any other component.

Yes, I've become a little worried about the casual way I'm pulling the
pardefs together. I did want to keep the definition fairly primitive,
and then try to pull the jobs together using standard python objects
like lists and dictionaries and such, but given that pympi and Kamaelia
use more custom methods, it may be something to think about. It might
simplify the blocking/non-blocking behavior, which I'll get to soon...
[[ For more complete examples, grab Kamaelia from the website,
download and run :) ]]

You'll note from the above that clearly there are aspects to it where
your ideas could be used as syntactic sugar. Some of the benefits (like
collapsing __init__ and main together) can be gained by using decorators
in python 2.4.

I've actually been thinking that decorators may be the way to go to set
the base class of the pardef, either that or have it be another class
modifier. Neither seems very ideal, but having it after self also seems
kludgy.

You'll also note there are some areas where your ideas increase parallel
safety (such as when sending mutable data structures), at the expense on
increasing the cost of communication.

Yes, the hope is that the ability to have different base classes that
implement the same interface will allow people to experiment with
different base implementations. Sooner or later, the best solutions
would migrate to the standard library, with luck.
I'm not saying the above is the paragon of virtue (it isn't) - we're looking
to see what's possible - but not rushing into creating syntax until we have
an idea that the approaches work. That said, we are interested in ideas for
syntaxes. At the moment I'd prefer these to be based on API enhancements
rather than changes to python since I personally think that's an extreme
approach until you know the change to the language will actually be
beneficial.

Yeah, I guess I'm an extremist. Ah well, I just like to think of what
I'd ideally want, and see if I can get there. Maybe I can't, or maybe
nobody else shares my ideals, but I thought I'd toss it out and see what
people think.
((In some respects we're held back from some improvements because we want
to stay supporting python on series 60 (which limits us to 2.2 late
features rather than things like decorators and PEP 342 enhancements).
We may drop that if necessary later or simply have a specialised version
there. This might explain the emphasis on ))

Regarding other points, of your suggestions though...

You allow the following to be equivalent initialisations:

h1 = vecadd(a[:500], b[:500], 'd')
# vs
h1 = vecadd()
h1.veca = a[:500]
h1.vecb = b[:500]
h1.arrtype = 'd'

To me this looks dangerous. (as in likely to confuse and cause bugs)

What if vecadd is implemented as a thread, and can run concurrently with
the main piece of code? (especially since you suggest reusing the same
active object) This allows the possibility:

h1 = vecadd()
h1.run() # does this background? Do we block? (appears to do the former)
h1.veca = a[:500] # What happens, is h1 blocked until we get here?
h1.vecb = b[:500]
h1.arrtype = 'd'
h1.veca = b[:500] # Uh oh, what's happening here?
c = h1.result + h2.result # Um, what does this now mean ?

were the values to veca queued? Were they overwritten? Would this be valid?
To me it's completely non-obvious what this code means on reading it. It
also *looks* like you're updating internal state of the object rather than
an external interface. (It also has the implicit suggestion that you can
read these values as well, which may or may not be valid)

Ugh. Yeah, I go back and forth myself on the ability to change the
arguments. The only way it really works above is if the argument change
acts like a .send method, so it would probably have to act like the
arguments are sitting in a queue until the next .run() is called.
Probably the same for the .result, just take the next result from the
last run call. So the above sequence would hopefully just fail on the
..run() where the arguments are undefined, but if they were defined, then
the new arguments would go on a queue for the next run, I guess. But
yeah, it is probably the least 'pythonic' part of the proposal. It just
seems a little to valuable to give up. Although I suppose it's possible
to do, you could have to pass in all arguments through sends and
receives. Would that be less clunky or more clunky? Hard to say.
I think the core to of your idea is that your new suite really introduces
a scheduler for generators and a means for communicating (you call them
tags). What you've done here is also make the arguments for creation mutable
which becomes confusing.

Actually, I don't think I introduced a scheduler, and I sure don't want
to limit things to generators. I kind of figured that a scheduler would
be something a little more advanced that might be added in a library at
some point. I just wanted to see if people liked the primitive first off.
I'm not trying to discourage you, I like the ideas, and would like to see
you expand them more since they interest me, but you have some other
inconsistencies. For example, you say here:

Remember your statement that pardefs don't inherit any data from the global
scope. Data is only passed in through receive statements and the arguments.

In practice however, you also have in your examples this:
updef = not (isinstance(up, int) or isintance(up, float))
downdef = not (isinstance(down, int) or isintance(down, float))
rightdef = not (isinstance(right, int) or isintance(right, float))
leftdef = not (isinstance(left, int) or isintance(left, float))

In these lines you have passed over global scope data - specifically
int and float.

Well, the old preferred method was to import types and compare to those,
but I thought that comparing to the builtins was the shiny new way to do
things. I don't do this stuff very often, so I may have gotten this
wrong, though. But I do assume that a pardef does get a clean set of
built in functions, as you would get in any module without imports.
Unlike your approach (syntax first), we've created a large (growing) number
of components (like your pardefs) which have been put together for various
systems which have varying levels of concurrency which are probably a useful
testbed for testing your ideas:

* http://kamaelia.sourceforge.net/Components.html

Example systems created using this vary from a "everything broadcast" PVR
for radio [*], a networked audio mixer matrix, simple games, a presentation
tool (probably including soon a live video mixer to go with the video
playback), topology viewing tools, a simple "paint" program, etc)

* Actually a box for creating a record of transmission, which
is slightly different, but the former description is more
accessible if slightly inaccurate.)

If you're interested in prototyping your ideas in python, you can simulate
some of your ideas using decorators. Something that might help you with
prototyping your ideas is our tutorial for Kamaelia, which is a "build your
own core" system type tutorial. It might also help show that your pardefs
are very similar to python's generators. It can be found here:
* http://kamaelia.sourceforge.net/MiniAxon/

Yes, you have been busy. It is a very interesting system.
In many respects, I feel that the API we have still isn't "100% pythonic" as
many would call it, but it does try to be unsurprising and consistent. You
could say we're aiming for pythonic - though personally I favour easy and
unsurprising as more concrete, less abstract goals - even if it's not there
yet. (portability of ideas to other languages is important to me, which
again is another reason for an API based view rather than syntax).

If you're willing to take a look at it, and make suggestions as to how your
ideas might fit, (or what you think is dumb :) I'd welcome it.

I'm not sure, really, I'm still kind of confused as to the depth of your
system. It does look strongly generator based, so it should be possible
to port most of the algorithms you're using to this proposed system
(if I ever implement it) (and I really should mention that a .restart
method would really be useful for yield-ing pardefs, I keep forgetting
to mention that) but I'm not sure how much further Kamaelia goes. It
sounds like you have networking 'pipes' going that allow you distribute
jobs across a network, but it's not completely clear to me. If that's
the case, then it should be possible to encapsulate that into a Kamaelia
base class for this proposal so you could keep that functionality as
well. That might require some extra base-class specific methods, but I
have no problem with that. Like I said, I imagine this as a fairly
minimal interface, no problem supplementing it if you like.
But anyway, I'm beginning to ramble, but thanks for all the great
feedback, and a happy new year to you as well! :)


----- Corey
 
M

Mike Meyer

Alright, so I've been following some of the arguments about enhancing
parallelism in python, and I've kind of been struck by how hard things
still are. It seems like what we really need is a more pythonic
approach.

I certainly agree, and have thought about it some myself.
pardef <Name>(self, <par type>, arguments...):
self.send(<dest pardef>, <tag>, arguments)
self.receive(<tag>, arguments)
return arguments
yield arguments

[Rest elided]

This really has a lot in common with SCOOP. SCOOP makes concurrent
stuff easier, but i'm not sure it fits well with Python. I'll describe
it, on the off chance you may get some ideas from it. See <URL:
http://archive.eiffel.com/doc/manuals/technology/concurrency/ > for
full details.

Like pardef, you use SCOOP by creating objects that are going to run
concurrently. Unlike pardef, the objects are - well, objects, not just
restricted to functions. This has deep implications. For instance, one
of the things that sets SCOOP apart from other concurrency systems
that I know if is that the program doesn't have code to create
"processors" (threads, process, whatever - places for the code to
run). From my reading of your description, this isn't the case for
pardef. Invoking a pardef causes a new processor to be created, and
when the pardef returns, the processor is finished.

Like pardef, SCOOP separates the specification of the kind of
processor from the rest of the code. It carries things further than
pardef though. You specify a set of processors at build time, and the
support system takes care of getting objects on the right processor.

Synchronization and protection is all handled via method
invocation. There are no new primitives for locking objects, explicit
synchronization, or communications.

Calls to methods of a separate object are non-blocking. The calls are
queued, and executed later. Reading an attribute from a separate
object is a blocking action. Calling a method with a separate object
as an argument causes the call to block until the separate arguments
are all locked. There are rules - enforceable by the compiler - about
what you can do with separate objects that prevent a lot of the things
that cause headaches when doing concurrent programming.

So instead of "run", you instantiate a separate object. Instead of
".send", you just invoke a method on the seperate object, and the
arguments get sent to it. Instead of ".receive", you read an attribute
from a separate object. Instead of ."kill", you let the object be
garbage collected (or maybe those aren't the same thing). Return is
missing, because it's an object, not a function. On the other hand,
you could define an attribute that's the "return value", and reading
it fetches that value.

The problems with Python is that "separate" is really a property of a
variable, not an object. To see the difference, if I declare an object
separate, and it creates an attribute, and I read that attribute, then
that object is separate for me. But in the object that created it, it
isn't. This is one major clash with Python. The other is the
critical distinction that SCOOP places on reading vs. calling a
feature. What does:

a = Seperate_Object()
x = a.thing
x()

do with SCOOP semantics?

Oh well. I think SCOOP does the job you want of "making things
easier", and fits well in an OO language. Maybe you can find good
ideas in it.

<mike
 
C

Corey Coughlin

Mike said:
[Rest elided]

This really has a lot in common with SCOOP. SCOOP makes concurrent
stuff easier, but i'm not sure it fits well with Python. I'll describe
it, on the off chance you may get some ideas from it. See <URL:
http://archive.eiffel.com/doc/manuals/technology/concurrency/ > for
full details.

Eiffel! I should have thought of that. I may have to dig all my old
Meyer books out of storage......
Like pardef, you use SCOOP by creating objects that are going to run
concurrently. Unlike pardef, the objects are - well, objects, not just
restricted to functions. This has deep implications. For instance, one
of the things that sets SCOOP apart from other concurrency systems
that I know if is that the program doesn't have code to create
"processors" (threads, process, whatever - places for the code to
run). From my reading of your description, this isn't the case for
pardef. Invoking a pardef causes a new processor to be created, and
when the pardef returns, the processor is finished.

Well, not quite, the pardef is actually an object, and you can certainly
put all kinds of objects inside it if you want to. The pardef object is
really more of an interface specification, where the actual
implementation of most of the methods is really defined in the base
class. The body of the pardef really just defines what happens during a
..run() call. The whole 'def' in pardef and the use of returns (and
yields) may be kind of misleading, the return (and yield) are really
just ways to send info back to the calling scope without having to come
up with some wierd way of referencing it. What I'm envisioning can
return a single value, or stay resident and return multiple values
(through yield or .send calls) or whatever.
Like pardef, SCOOP separates the specification of the kind of
processor from the rest of the code. It carries things further than
pardef though. You specify a set of processors at build time, and the
support system takes care of getting objects on the right processor.

Synchronization and protection is all handled via method
invocation. There are no new primitives for locking objects, explicit
synchronization, or communications.

Yeah, I've been trying to think of some way to setup channels in a
better way. Tagged channels can certainly be used, but method (or
method-like) invocations would be more elegant in more than a few
situations. On the other hand, I've also been trying to avoid locking
people into a fully OO style in order to use concurrency, it would be
nice to allow people who don't want to define everything as an object to
use parallelism. But it can get ugly that way. Maybe something like a
"property" declaration to optionally simplify things, hmm....
Calls to methods of a separate object are non-blocking. The calls are
queued, and executed later. Reading an attribute from a separate
object is a blocking action. Calling a method with a separate object
as an argument causes the call to block until the separate arguments
are all locked. There are rules - enforceable by the compiler - about
what you can do with separate objects that prevent a lot of the things
that cause headaches when doing concurrent programming.

Undoubtedly enforced by contracts, Eiffel is nice that way. Some things
I do miss from stricter languages. But then again, having to define
everything as an object does drive me nuts.
So instead of "run", you instantiate a separate object. Instead of
".send", you just invoke a method on the seperate object, and the
arguments get sent to it. Instead of ".receive", you read an attribute
from a separate object. Instead of ."kill", you let the object be
garbage collected (or maybe those aren't the same thing). Return is
missing, because it's an object, not a function. On the other hand,
you could define an attribute that's the "return value", and reading
it fetches that value.

The problems with Python is that "separate" is really a property of a
variable, not an object. To see the difference, if I declare an object
separate, and it creates an attribute, and I read that attribute, then
that object is separate for me. But in the object that created it, it
isn't. This is one major clash with Python. The other is the
critical distinction that SCOOP places on reading vs. calling a
feature. What does:

a = Seperate_Object()
x = a.thing
x()

do with SCOOP semantics?

Oh well. I think SCOOP does the job you want of "making things
easier", and fits well in an OO language. Maybe you can find good
ideas in it.

<mike

Yeah, the dynamic nature of python makes a straight translation of SCOOP
pretty problematic. But it's good stuff to consider, definitely some
great ideas in there. I'm not really sure I want a solution that gets
as complicated as SCOOP, but it may not be something I can avoid. Ah
well, some thinking to do. But just out of curiosity, would you support
a PEP with some language additions to support and simplify concurrency?

------ Corey
 
M

Mike Meyer

Corey Coughlin said:
Undoubtedly enforced by contracts, Eiffel is nice that way.

Actually, no. "separate" is a property of a variable, and certain
things you can do with normal variables you can't do with separate
variables. The semantics of contracts do enter into things,
though. When you pass a separate variable to a function, the function
blocks until the separate variable is available, and all the
preconditions that mention it are true. This gives you gaurded
semaphores.
Yeah, the dynamic nature of python makes a straight translation of
SCOOP pretty problematic. But it's good stuff to consider,
definitely some great ideas in there. I'm not really sure I want a
solution that gets as complicated as SCOOP, but it may not be
something I can avoid. Ah well, some thinking to do. But just out
of curiosity, would you support a PEP with some language additions
to support and simplify concurrency?

I'm certainly interested in seeing such a thing done to
Python. Whether I'd support a specific PEP would depend on the PEP.

<mike
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,995
Messages
2,570,233
Members
46,820
Latest member
GilbertoA5

Latest Threads

Top