[ANN]: "newthreading" - an approach to simplified thread usage, and a path to getting rid of the GIL

John Nagle

We have just released a proof-of-concept implementation of a new
approach to thread management - "newthreading". It is available
for download at

https://sourceforge.net/projects/newthreading/

The user's guide is at

http://www.animats.com/papers/languages/newthreadingintro.html

This is a pure Python implementation of synchronized objects, along
with a set of restrictions which make programs race-condition free,
even without a Global Interpreter Lock. The basic idea is that
classes derived from SynchronizedObject are automatically locked
at entry and unlocked at exit. They're also unlocked when a thread
blocks within the class. So no two threads can ever be active
in such a class at the same time.
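
A minimal sketch of the lock-at-entry, unlock-at-exit idea, using only
the standard threading module. This is not the newthreading code, and
the names `synchronized`, `Counter` and `run_demo` are made up for the
illustration. It also does not show the unlock-while-blocked behaviour
described above.

```python
import functools
import threading


def synchronized(method):
    """Hold the instance's lock for the duration of the method call."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        with self._lock:  # locked at entry, unlocked at exit
            return method(self, *args, **kwargs)
    return wrapper


class Counter:
    """Hypothetical example class: one thread at a time runs a method."""

    def __init__(self):
        # Re-entrant lock, so one synchronized method may call another.
        self._lock = threading.RLock()
        self._count = 0

    @synchronized
    def increment(self):
        self._count += 1

    @synchronized
    def value(self):
        return self._count


def run_demo(n_threads=4, n_increments=10000):
    """Increment a shared Counter from several threads, return the total."""
    counter = Counter()

    def worker():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value()
```

Because every method body runs under the lock, no increments are lost,
so `run_demo()` returns `n_threads * n_increments`.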

In addition, only "frozen" objects can be passed in and out of
synchronized objects. (This is somewhat like the multiprocessing
module, where you can only pass objects that can be "pickled".
But it's not as restrictive; multiple threads can access the
same synchronized object, one at a time.)

This pure Python implementation is usable, but does not improve
performance. It's a proof of concept implementation so that
programmers can try out synchronized classes and see what it's
like to work within those restrictions.

The semantics of Python don't change for single-thread programs.
But when the program forks off the first new thread, the rules
change, and some of the dynamic features of Python are disabled.

Some of the ideas are borrowed from Java, and some are from
"safethreading". The point is to come up with a set of liveable
restrictions which would allow getting rid of the GIL. This
is becoming essential as Unladen Swallow starts to work and the
number of processors per machine keeps climbing.

This may in time become a Python Enhancement Proposal. We'd like
to get some experience with it first. Try it out and report back.
The SourceForge forum for the project is the best place to report problems.

John Nagle
 
Paul Rubin

I only had a couple minutes to look at it (maybe more during the
weekend). It looks interesting. I wonder whether Python is really the
right host language for it. How do you handle nested objects whose
outermost layer is immutable but whose contents are potentially mutable?
An obvious example is a list within a tuple

( [1,2,3], )

but conceivably the mutable stuff could exist at an arbitrary depth
(making a thorough scan impractical) and give rise to a race condition.

Also, I haven't followed PyPy development lately, but apparently stuff
has been happening. Maybe the issues of CPython and the GIL will be
behind us not that long from now. Is newthreading likely to be
applicable to PyPy?
 
John Nagle

> I only had a couple minutes to look at it (maybe more during the
> weekend). It looks interesting. I wonder whether Python is really the
> right host language for it. How do you handle nested objects whose
> outermost layer is immutable but whose contents are potentially mutable?
> An obvious example is a list within a tuple
>
> ( [1,2,3], )
>
> but conceivably the mutable stuff could exist at an arbitrary depth
> (making a thorough scan impractical) and give rise to a race condition.

"freeze" freezes recursively. So

freeze(( [1,2,3], ))

returns

( (1,2,3), )

So passing a big, mutable data object into a synchronized object is
inefficient, but it works. (The same issue appears with
the multiprocessing module; if you pass in something big, there's
considerable copying overhead.)
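
A rough sketch of what recursive freezing could look like. This is an
illustration only, not the `freeze` shipped with newthreading; the name
`freeze_sketch` and the set of types it handles are assumptions.

```python
def freeze_sketch(obj):
    """Return an immutable copy of obj, converting containers recursively.

    Hypothetical helper: handles only lists, tuples, sets and a few
    immutable scalar types, and rejects everything else.
    """
    if isinstance(obj, (list, tuple)):
        # Lists become tuples; tuples are rebuilt so their contents
        # are frozen too.
        return tuple(freeze_sketch(item) for item in obj)
    if isinstance(obj, (set, frozenset)):
        return frozenset(freeze_sketch(item) for item in obj)
    if obj is None or isinstance(obj, (bool, int, float, complex, str, bytes)):
        return obj  # already immutable, no copy needed
    raise TypeError("cannot freeze object of type %s" % type(obj).__name__)
```

With this sketch, `freeze_sketch(([1, 2, 3],))` gives `((1, 2, 3),)`.
Every container is copied on the way in, which is where the overhead
for large mutable structures comes from.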

If you have some complex structure, like
a tree, that needs to be visible to multiple threads, it's appropriate
to put it inside a SynchronizedObject or AtomicObject and
manipulate it through methods.
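
As an illustration of keeping a structure inside an object and working
on it through methods, here is a sketch that uses a plain
`threading.Lock` in place of SynchronizedObject. The class `SharedTree`
and its methods are hypothetical, not part of newthreading.

```python
import threading


class SharedTree:
    """Hypothetical binary search tree guarded by a single lock."""

    def __init__(self):
        self._lock = threading.Lock()
        # Nodes are [key, left, right] lists and never leave this object.
        self._root = None

    def insert(self, key):
        with self._lock:
            if self._root is None:
                self._root = [key, None, None]
                return
            node = self._root
            while True:
                if key == node[0]:
                    return  # key already present
                index = 1 if key < node[0] else 2
                if node[index] is None:
                    node[index] = [key, None, None]
                    return
                node = node[index]

    def sorted_keys(self):
        """Return an immutable snapshot of the keys, in order."""
        with self._lock:
            result = []
            stack, node = [], self._root
            # Iterative in-order traversal.
            while stack or node is not None:
                while node is not None:
                    stack.append(node)
                    node = node[1]
                node = stack.pop()
                result.append(node[0])
                node = node[2]
            return tuple(result)
```

Callers only ever see keys going in and a tuple coming out, so the
mutable nodes are never visible to more than one thread at a time.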

Python is a good language for this because so much Python data is
immutable. Sharing immutable data between threads is thread-safe,
which can eliminate much copying.

> Also, I haven't followed PyPy development lately, but apparently stuff
> has been happening. Maybe the issues of CPython and the GIL will be
> behind us not that long from now. Is newthreading likely to be
> applicable to PyPy?

PyPy has a global lock. See
"http://morepypy.blogspot.com/2008/05/threads-and-gcs.html". They've
looked at replacing their global lock
with many local locks, but the overhead on that is high.
"safethreading" tried, and it didn't work out well. The "newthreading"
concept could be applied to PyPy; it's not tied to a particular
implementation. I haven't looked much at PyPy though.

There's more on memory management issues in the "newthreading" theory
paper, "http://www.animats.com/papers/languages/pythonconcurrency.html".
Right now, I'm focusing on usability of the approach. It can be made
to go fast; the question is whether it can be made popular.

John Nagle