On Tuesday, 05.07.2005, at 08:37 -0700, Jonathan Ellis wrote:
> In many ways, Python is an incredibly bad choice for deeply
> multithreaded applications. One big problem is the global interpreter
> lock; no matter how many CPUs you have, only one will run python code
> at a time. (Many people who don't run on multiple CPUs anyway try to
> wave this off as a non-problem, or at least worth the tradeoff in terms
> of a simpler C API, but with multicore processors coming from every
> direction I think the "let's pretend we don't have a problem" approach
> may not work much longer.)
Well, it's not just a tradeoff for a simpler C API. It's a more
complicated thing than that. Much more complicated. Basically, nobody
has been able to propose a sensible solution for removing the GIL.
Any solution would have to have the following properties:
a) it must be safe.
b) it should not be as slow as a snail.
The problem with a) is that losing this property is not an option, as
it is probably a core benefit of Python. So the ceval function would
have to lock every object it encounters while executing. Trouble: how
do you deal with deadlocks? And it would entail locking and unlocking
heaps of objects on every line of Python code.
Basically, the current state of the art in threaded programming doesn't
include a safe model. General threaded programming is unsafe at the
moment, and there's nothing to be done about that. It requires the
developer to carefully add any needed locking by hand. Any error in
doing that produces very hard-to-debug bugs that might show up only in
very specific hardware configurations. And there is no way to detect
these errors automatically, which would be needed to raise a nice
exception, which is the standard behaviour in Python at the moment.
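
To make that concrete, here is a minimal sketch (my own illustration,
not something from the original post) of the "locking by hand" problem:
a shared counter bumped from several threads. Drop the lock and updates
are lost silently every now and then; no exception is raised, and
whether you ever see the bug depends on timing.

    import threading

    counter = 0
    lock = threading.Lock()

    def bump(n):
        # read-modify-write on a shared global: not atomic without the lock
        global counter
        for _ in range(n):
            with lock:          # remove this and updates get lost intermittently
                counter += 1

    threads = [threading.Thread(target=bump, args=(100000,)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(counter)              # 400000 only because the locking was added by hand
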
> If the GIL isn't an issue (and in your case it clearly is), you'll
> quickly find that there's little support for debugging multithreaded
> applications, and even less for profiling.
As I've said above, there is a case to be made that the current "safe
computing" model of Python isn't compatible with the current state of
the art in threading.
> Sometimes running multiple processes is an acceptable workaround; if
> not, good luck with the rewrite in Java or something else with real
> thread support. (IIRC Jython doesn't have a GIL; that might be an
> option too.)
Jython might not have a GIL, but it will probably have really bad
performance, because it has to lock all kinds of objects during
execution.
> Python is a great tool but if you really need good threading support
> you will have to look elsewhere.
Yes and no. For completely general threading support, Python probably
isn't what one wants. OTOH, general threaded development is a bit more
painful than many application developers want to endure. There are ways
to do more structured threading in Python quite well:
a) rewrite your number-crunching thread stuff in C and release the GIL
there.
b) if you've got a task that can live with less communication, you
might fork off the computation and communicate the results back via a
pipe (see the sketch after this list).
c) live with the GIL.
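
As a sketch of option b), assuming a Unix system (os.fork/os.pipe are
not available on Windows) and a hypothetical crunch() function standing
in for the actual number crunching:

    import os, pickle

    def crunch(data):
        # stand-in for the CPU-bound work
        return sum(x * x for x in data)

    def crunch_in_child(data):
        r, w = os.pipe()
        pid = os.fork()
        if pid == 0:
            # child process: do the work, send the pickled result back
            os.close(r)
            os.write(w, pickle.dumps(crunch(data)))
            os.close(w)
            os._exit(0)
        # parent process: read the result from the pipe
        os.close(w)
        chunks = []
        while True:
            chunk = os.read(r, 4096)
            if not chunk:
                break
            chunks.append(chunk)
        os.close(r)
        os.waitpid(pid, 0)
        return pickle.loads(b"".join(chunks))

    print(crunch_in_child(range(1000000)))

Since the child is a separate process, there is no GIL to share: each
process runs its Python code on its own core.
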
While multi-core CPUs are coming, it will be a while before mainstream
hardware gets to more than 2 logical CPUs. You already get quite a bit
of speedup just by delegating all the OS and other background tasks to
one CPU core.
And one thing one shouldn't forget is that finely multithreaded apps
aren't magically faster. If you need to add 50% more work for locking,
you will not get much benefit from threading on a 2-core box. Assume a
two-core box has about the processing power of 1.8 cores (because of
memory contention and other problems; depending upon use and the exact
hardware it might be even worse than that).
Now your single-threaded app runs in 10 seconds.
With locking this would be about 15 seconds.
15 seconds divided by 1.8 gives 8.33 seconds.
And that assumes that your application is perfectly threadable and will
have no contention for data between its threads. And it assumes a
favorable 1.8 speedup factor for 2 cores. The hardware-level overhead
for more cores goes up, especially if you run multiple threads of one
program, because the fine interaction between these threads raises the
contention for data between the cores/processors.
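
The back-of-the-envelope numbers above, as a quick calculation (the 1.5
and 1.8 factors are of course just assumptions for the sake of the
example):

    single_threaded = 10.0    # seconds for the unthreaded version
    locking_overhead = 1.5    # 50% extra work for fine-grained locking
    effective_cores = 1.8     # what a 2-core box really buys you
    print(single_threaded * locking_overhead / effective_cores)  # ~8.33 seconds
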
So, yes, Python doesn't multithread well. At least not at the moment.
But that is basically because there currently isn't a theoretical way
to safely provide the environment that Python does in a multithreaded
app without an incredible performance hit.
Andreas