But that's exactly what you said you wanted people to be able to do:
"with my suggestion they could potentially get a massive speed up just by
changing 'for' to 'par' or 'map' to 'pmap'."
I am finding this conversation difficult because it seems to me you don't
have a consistent set of requirements.
How will 'par' be any different? It won't magically turn code with
deadlocks into bug-free code.
A compiler directive is just as clear about the programmer's intention as
a keyword. Possibly even more so.
    #$ PARALLEL-LOOP
    for x in seq:
        do(x)
Seems pretty obvious to me. (Not that I'm suggesting compiler directives
are a good solution to this problem.)
The problem is that "as simple as possible" is Not Very Simple. There's
no getting around the fact that concurrency is inherently complex. In
some special cases, you can keep it simple, e.g. parallel-map with a
function that has no side-effects. But in the general case, no, you can't
avoid dealing with the complexity, at least a little bit.
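The side-effect-free special case is already expressible with today's
standard library. A minimal sketch (the function name `square` is just
illustrative): because the mapped function touches no shared state, the
parallel version is guaranteed to give the same answer as the serial one.

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    # Pure function: no side effects, result depends only on x.
    return x * x

seq = range(10)

# Serial version:
serial = [square(x) for x in seq]

# Parallel map: safe precisely because square() shares nothing.
with ThreadPoolExecutor() as pool:
    parallel = list(pool.map(square, seq))

assert serial == parallel
```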
It's *already* easy to explain. And having explained it, you still need
to do something about it. You can't just say "Oh well, I've had all the
pitfalls explained to me, so now I don't have to actually do anything
about avoiding those pitfalls". You still need to actually avoid them.
For example, you can choose one of four tactics:
(1) the loop construct deals with locking;
(2) the caller deals with locking;
(3) nobody deals with locking, therefore the code is buggy and risks
deadlocks; or
(4) the caller is responsible for making sure he never shares data while
looping over it.
I don't think I've missed any possibilities. You have to pick one of
those four.
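For concreteness, tactic (2) is what the threading module already supports:
the caller supplies the lock. A small sketch (the counter and lock names
are illustrative, not part of any proposal):

```python
import threading

total = 0
lock = threading.Lock()

def add(n):
    global total
    with lock:  # caller-supplied lock serialises the shared update
        total += n

threads = [threading.Thread(target=add, args=(i,)) for i in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Without the lock, the read-modify-write of 'total' could interleave
# and silently drop updates; with it, the result is deterministic.
assert total == sum(range(100))
```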
So now you want a second keyword as well.
Hi Steven,
You wrote: "I am finding this conversation difficult because it seems to
me you don't have a consistent set of requirements."
I think that my position has actually been consistent throughout this
discussion about what I would like to achieve. However, I have learned
more about the inner workings of Python than I knew before, which has
made it clear that it would be difficult to implement (in CPython at
least). Also, I never intended to present this as a fait accompli -
the intention was to start a debate, as we have been doing. You also
wrote:
"So now you want a second keyword as well."
I actually described the 'sync' keyword in my second email before
anybody else contributed.
I *do* actually know a bit about concurrency and would never imply
that *any* for loop could be converted to a parallel one. The
intention of my remark "with my suggestion they could potentially get
a massive speed up just by changing 'for' to 'par' or 'map' to
'pmap'." was that it could be applied in the particular circumstances
where there are no dependencies between different iterations of the
loop.
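To make the "no dependencies between iterations" condition concrete, here
is a hedged sketch: the first loop is convertible to a parallel map, the
second is not, because each iteration reads state written by the previous
one (the function `f` is illustrative).

```python
from concurrent.futures import ThreadPoolExecutor

def f(x):
    return x * 2

# Independent iterations: each f(x) depends only on its own x, so
#     for x in seq: out.append(f(x))
# can become a parallel map without changing the result.
with ThreadPoolExecutor() as pool:
    out = list(pool.map(f, range(5)))

# Dependent iterations: the running total reads the value written by
# the previous iteration, so this loop cannot be converted the same way.
running, prefix = 0, []
for x in range(5):
    running += x
    prefix.append(running)
```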
Regarding your implementation strategies, mine would be related to
this one:
(4) the caller is responsible for making sure he never shares data while
looping over it.
However in my world there is no problem with threads sharing data as
long as they do not change the values. So the actual rule would be
something like:
5. The caller is responsible for making sure that one iteration of the
parallel loop never tries to write to a variable that another
iteration may read, unless separated by a 'sync' event.
This shows why the sync event is needed - to avoid race conditions on
shared variables. It is borrowed from the BSP paradigm, although that
is a distributed-memory approach. Without the sync clause, rule 5 would
just be the standard way of defining a parallelisable loop.
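The proposed 'sync' event is close to what threading.Barrier provides
today. A sketch of a BSP-style pair of supersteps, under the assumption
that 'sync' means "no thread proceeds until all have arrived" (the
worker function and list names are illustrative):

```python
import threading

N = 4
barrier = threading.Barrier(N)  # plays the role of 'sync'
phase1 = [0] * N
phase2 = [0] * N

def worker(i):
    phase1[i] = i * i      # superstep 1: each thread writes only slot i
    barrier.wait()         # 'sync': all writes above are complete here
    phase2[i] = sum(phase1)  # superstep 2: now safe to read every slot

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without the barrier, a fast thread could read phase1 before a slow thread
had written its slot - exactly the race condition rule 5 is meant to rule out.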
Jeremy
P.S. I have a couple of additional embellishments to share at this
stage:
1. Parallel slackness - this is a concept where one creates many more
threads than there are available cores to cover up for a load
imbalance. So you might use this in a case where the iterations you
wish to parallelise take a variable amount of time. I think my par
concept needs a mechanism for the user to specify how many threads to
create - overriding any decision made by Python, e.g.
par 100 i in list:  # create 100 threads no matter how many cores are available
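A hypothetical 'par 100' roughly corresponds to oversubscribing a pool of
workers today. A sketch, assuming iterations of uneven duration (the
`variable_work` helper and the oversubscription factor are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor
import os
import random
import time

tasks = list(range(40))

def variable_work(i):
    # Simulate iterations that take an uneven amount of time.
    time.sleep(random.uniform(0, 0.01))
    return i

cores = os.cpu_count() or 1

# Parallel slackness: more workers than cores, so idle cores can pick
# up queued tasks while slow ones are still running.
with ThreadPoolExecutor(max_workers=cores * 4) as pool:
    results = list(pool.map(variable_work, tasks))
```

Note that pool.map preserves input order regardless of which worker
finishes first.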
2. Scope of the 'sync' command. It was pointed out to me by a
colleague that I need to define what happens with sync when there are
nested par loops. I think that it should be defined to apply to the
innermost par construct which encloses the statement.