Going the PL/1 way

Christopher T King · Aug 10, 2004

Another reason is reference counting, which must be synchronized.

Forgot about that; a global reference count lock might work well, but this
could negatively impact performance in the case of things like argument
tuples. Perhaps internal objects that are guaranteed to be thread-local
can skip the reference-count-locking step, but I'm not sure how many (if
any) objects can guarantee this.
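
A toy model of the global reference-count lock idea may make the trade-off clearer. This is only a sketch: `Counted`, `incref`, `decref` and `_refcount_lock` are hypothetical names invented for illustration, and CPython's real reference counting lives in C and looks nothing like this.

```python
import threading

# Hypothetical stand-in for a single interpreter-wide reference-count lock.
_refcount_lock = threading.Lock()


class Counted:
    """Toy object carrying an explicit reference count."""

    def __init__(self):
        self.refcount = 1


def incref(obj):
    # Every increment, even on a short-lived object, pays for the lock.
    with _refcount_lock:
        obj.refcount += 1


def decref(obj):
    # Returns True when the count reaches zero and the object could be freed.
    with _refcount_lock:
        obj.refcount -= 1
        return obj.refcount == 0


def churn(obj, n):
    # Simulates many short-lived references being taken and dropped.
    for _ in range(n):
        incref(obj)
        decref(obj)


def run_demo(threads=4, n=10000):
    shared = Counted()
    workers = [threading.Thread(target=churn, args=(shared, n))
               for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return shared.refcount
```

The count stays consistent, but every `incref`/`decref` contends for the same lock, which is the cost worried about above for things like argument tuples. An object known to be thread-local could in principle skip the `with` block.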

Aahz · Aug 10, 2004

The real reason behind the GIL is that the Python interpreter is not
re-entrant; it keeps internal state in a global structure which must be
switched out (and stored somewhere) on thread changes. The real solution
to this problem is to make the interpreter stateless, thus obviating the
need for the GIL entirely. I think this task would be much easier to do
in Stackless than in CPython, but I may be wrong.

There's no "the" real reason. Another critical reason is that it's a
design goal of CPython to be a useful glue language for C libraries.
Many libraries use internal static variables....

Until a good API exists on all standard platforms for determining
whether a library call needs a GIL around it, we can't seriously discuss
removing the GIL. Currently, any library designed to work with Python's
GIL can easily unlock the GIL while it's running "standalone" (no calls
back into Python).
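
The effect of a library call unlocking the GIL can be seen from pure Python. A minimal sketch, assuming `time.sleep` as the stand-in for a blocking library call (it does release the GIL while waiting); `blocking_call` and `timed_run` are names made up for this example.

```python
import threading
import time


def blocking_call(seconds):
    # time.sleep releases the GIL while it waits, so other threads can run.
    time.sleep(seconds)


def timed_run(workers=4, seconds=0.3):
    threads = [threading.Thread(target=blocking_call, args=(seconds,))
               for _ in range(workers)]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Elapsed wall-clock time for all workers together.
    return time.monotonic() - start
```

If the GIL were held across the call, four workers would need roughly four times the sleep interval; because it is released, the total should stay close to a single interval.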
--
Aahz (e-mail address removed) <*> http://www.pythoncraft.com/

"To me vi is Zen. To use vi is to practice zen. Every command is a
koan. Profound to the user, unintelligible to the uninitiated. You
discover truth everytime you use it." (e-mail address removed)

G. S. Hayes · Aug 10, 2004

state machines and processes [and inter-process
communication/coordination]? It sounds like you're saying the alternative
to threads is "home brewed" threads.

No. I'm using alternative multiprocessing/event-driven programming
methods that are much less error prone (i.e. easier to program and
maintain) and easier to get good performance out of than using
threads. The core difference between threads and processes is that
threads share all their memory while processes have protected memory.
That difference _should_ be the deciding factor in whether you use
processes or threads in an application, although as I've said
throughout this thread two major platforms (Windows{see * below} and
Java) don't give you the tools to make that possible; thankfully,
Python does on OSes where it's possible.

I'll repeat: On platforms with sane process implementations, the
deciding factor between using threads or using processes should be
whether you want all your memory (including the code) shared or not.
Because THAT is the fundamental difference between the two.
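
That difference is easy to see in a few lines. The following is a sketch only, assuming a POSIX system where `os.fork` is available; the function names are invented for the example.

```python
import os
import threading


def append_marker(target, marker):
    target.append(marker)


def thread_shares_memory():
    data = []
    t = threading.Thread(target=append_marker, args=(data, "thread"))
    t.start()
    t.join()
    # The thread wrote into the very same list object.
    return data


def process_has_private_memory():
    data = []
    pid = os.fork()  # POSIX only
    if pid == 0:
        # Child: mutate its own copy of the list, then exit immediately
        # without running any of the parent's cleanup code.
        append_marker(data, "child")
        os._exit(0)
    os.waitpid(pid, 0)
    # The child changed only its private copy; the parent's list is untouched.
    return data
```

With the thread, the write shows up in the caller's list; with the forked process, it does not. Anything the processes need to share has to go through an explicit channel such as a pipe or socket.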

99% of the time, throwing out protected memory is the wrong thing to
do; it is extremely error prone, difficult to debug, and while it
often seems easy to design a multithreaded program up front it almost
always winds up being more work than using events and processes where
needed. In many cases it'll run dramatically slower than a multiple
process implementation as well.

Note that I'm not alone in this belief; since you seem particularly
interested in the GUI side of things, you might consider reading John
Ousterhout's famous talk "Why Threads Are A Bad Idea" (he's the
author of Tcl/Tk and knows a thing or two about programming GUI apps).

Until recently Linux implemented threads as processes.

This is not true for any reasonable definition of "recently". What
has recently changed is how the threads are shown by tools like ps and
top, and the threading library has changed from LinuxThreads to a more
efficient implementation (NPTL) that uses faster mutexes. But the
core COE implementation hasn't changed in nearly a decade (since 1996
at least):

Linux implements arbitrary "contexts of execution" in a way similar to
Plan 9. Traditional processes and threads are just two kinds of COE;
COEs can share various attributes. A traditional thread is akin to a
COE that shares memory, whereas traditional processes don't. But
other attributes (e.g. signal handlers or even the filesystem
namespace) can be private or shared. "Threads" and "processes" are
only two instances of a much more flexible object.

Various other OSes (e.g. Irix, BSD) do similar things (often using the
name "sproc" or "rfork" instead of "clone").

I'm really not concerned with how the threads are implemented in the
thread library. I just don't want a language to prevent me from
accessing the thread library.

I agree, though I'd much rather use a language like Python that hides
the thread library than a language like Java that hides the much more
useful process library.

*Windows has an excellent I/O Completion Port mechanism for
event-driven programming that can build very efficient multiplexed I/O
(e.g. scalable network servers); Linux's queued realtime signals were
based in part on that mechanism.
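
For readers who have not seen the event-driven style, here is a rough sketch of one thread multiplexing several connections. It uses Python's standard `selectors` module, which is readiness-based (select/poll/epoll style) and is not a binding to I/O Completion Ports; `echo_once` is a made-up name, and `socket.socketpair` stands in for real network clients.

```python
import selectors
import socket


def echo_once(messages):
    """Serve several connections from one thread, one reply each."""
    sel = selectors.DefaultSelector()
    clients = []
    for msg in messages:
        client, server = socket.socketpair()
        server.setblocking(False)
        sel.register(server, selectors.EVENT_READ)
        client.sendall(msg)  # the request is now waiting on the server side
        clients.append(client)

    pending = len(messages)
    while pending:
        # Block until at least one connection has data ready to read.
        for key, _events in sel.select(timeout=1.0):
            conn = key.fileobj
            payload = conn.recv(4096)
            conn.sendall(payload.upper())
            sel.unregister(conn)
            conn.close()
            pending -= 1

    replies = []
    for client in clients:
        replies.append(client.recv(4096))
        client.close()
    sel.close()
    return replies
```

No locks are needed because only one thread ever touches the connection state; the selector decides which connection to service next.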

Donn Cave · Aug 11, 2004

(e-mail address removed) wrote in message

I agree, though I'd much rather use a language like Python that hides
the thread library than a language like Java that hides the much more
useful process library.

May have missed some of the context here, I suspect there's
a particular language that's preventing some particular type
of access to the thread library, but the details seem to be
missing.

In more general terms, note that some languages implement
threads on their own, rather than integrating with the OS
thread system. That doesn't work for me, but for some
applications it's said to be much more efficient and robust.
Erlang is probably a good example, the great "microthreads"
stackless add-on of yore probably a more interesting one.

It would be interesting to get someone to revive microthreads,
and someone else to restore the free threading patches, and
then try both on the same compute intensive pure Python
problem on a 4-way Xeon. Microthreads would of course be
limited to one processor, but I wouldn't bet a dime on the
outcome.

Donn Cave, (e-mail address removed)

John J. Lee · Aug 12, 2004

No. I'm using alternative multiprocessing/event-driven programming
methods that are much less error prone (i.e. easier to program and
maintain) and easier to get good performance out of than using
threads. The core difference between threads and processes is that
threads share all their memory while processes have protected memory.
That difference _should_ be the deciding factor in whether you use
processes or threads in an application, although as I've said
throughout this thread two major platforms (Windows{see * below} and
Java) don't give you the tools to make that possible; thankfully,
Python does on OSes where it's possible.

[...]

....although Java 1.4 does provide a select()-workalike -- see Alan
Kennedy's recent post on implementing socket, select and asyncore for
Jython.

John

Anthony Baxter · Aug 13, 2004

May have missed some of the context here, I suspect there's
a particular language that's preventing some particular type
of access to the thread library, but the details seem to be
missing.

One other point on Python's thread library - it tries to offer a consistent
set of functions and methods and behaviour across all platforms where
threading exists. It also implements it in terms of the platform's native
threads libraries, rather than implementing its own. This, in theory, makes
Python more portable, and its threads more robust.

And then there's HP/UX. Oh well, nice theory.
