global interpreter lock

Bryan Olson

km said:
> Hi all,
>
> is true parallelism possible in python ? or at least in the
> coming versions ? is global interpreter lock a bane in this
> context ?

No; maybe; and currently, not usually.

On a uniprocessor system, the GIL is no problem. On multi-
processor/core systems, it's a big loser.
 
Donn Cave

Bryan Olson said:
> No; maybe; and currently, not usually.
>
> On a uniprocessor system, the GIL is no problem. On multi-
> processor/core systems, it's a big loser.

I rather suspect it's a bigger winner there.

Someone who needs to execute Python instructions in parallel
is out of luck, of course, but that has to be a small crowd.
I would have to assume that most applications that need
the kind of computational support that implies are doing most
of the actual computation in C, in functions that run with the
lock released. Runnable threads are 1 interpreter, plus N
"allow threads" C functions, where N is whatever the OS will bear.

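A minimal sketch of the point about C functions that run with the lock released, assuming CPython. `time.sleep` is used here only as a stand-in for any C-level call that releases the GIL while it works or waits; the function name `run_blocking_calls` and the timings are illustrative and not from the thread.

```python
import threading
import time


def run_blocking_calls(n_threads=4, seconds=0.2):
    """Run n_threads blocking C-level calls at once; return (results, elapsed)."""
    results = [None] * n_threads

    def worker(i):
        # time.sleep releases the GIL while it waits, standing in for an
        # "allow threads" C function doing real work.
        time.sleep(seconds)
        results[i] = i

    start = time.perf_counter()
    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start
```

If the calls were serialized, four 0.2-second waits would take about 0.8 seconds; because each one releases the lock, the elapsed time should stay close to 0.2 seconds.
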
Meanwhile, the interpreter's serial concurrency limits the
damage. The unfortunate reality is that concurrency is a
bane, so to speak -- programming for concurrency takes skill
and discipline and a supportive environment, and Python's
interpreter provides a cheap and moderately effective support
that compensates for most programmers' unrealistic assessment
of their skill and discipline. Not that you can't go wrong,
but the chances you'll get nailed for it are greatly reduced -
especially in an SMP environment.

Donn Cave, (e-mail address removed)
 
km

Hi all,

is true parallelism possible in python ? or at least in the coming versions ?
is global interpreter lock a bane in this context ?

regards,
KM
 
Paul Rubin

Mike Meyer said:
> The real problem is that the concurrency models available in currently
> popular languages are still at the "goto" stage of language
> development. Better models exist, have existed for decades, and are
> available in a variety of languages.

But Python's threading system is designed to be like Java's, and
actual Java implementations seem to support concurrent threads just fine.

One problem with Python is it doesn't support synchronized objects
nearly as conveniently as Java, though. You need messy explicit
locking and unlocking all over the place. But it's not mysterious how
to do those explicit locks; it's just inconvenient.
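A small sketch of what the explicit locking looks like, assuming a modern Python where `with` can manage a `threading.Lock` (the thread predates that syntax, so the original code would have called `acquire()` and `release()` by hand). The `Counter` class and `hammer` helper are hypothetical names for illustration only.

```python
import threading


class Counter:
    """A shared counter guarded by an explicit lock (hypothetical example class)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def increment(self):
        # Rough counterpart of a Java `synchronized` method: the lock has to be
        # taken explicitly around every access to the shared state.
        with self._lock:
            self._value += 1

    def value(self):
        with self._lock:
            return self._value


def hammer(counter, n_threads=8, n_increments=1000):
    """Increment the counter from several threads and return the final value."""
    def worker():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value()
```

Every method that touches `_value` has to remember the lock, which is the inconvenience being described: nothing in the language enforces it.
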
 
Alan Kennedy

[km]
> is true parallelism possible in python ?

cpython: no.
jython: yes.
ironpython: yes.

> or at least in the coming versions ?

cpython: unknown.
pypy: don't have time to research. Anyone know?

> is global interpreter lock a bane in this context ?

beauty/bane-is-in-the-eye-of-the-beholder-ly y'rs
 
Donn Cave

Quoth Paul Rubin:

|> The real problem is that the concurrency models available in currently
|> popular languages are still at the "goto" stage of language
|> development. Better models exist, have existed for decades, and are
|> available in a variety of languages.
|
| But Python's threading system is designed to be like Java's, and
| actual Java implementations seem to support concurrent threads just fine.

I don't see a contradiction here. "goto" is "just fine", too --
you can write excellent programs with goto. 20 years of one very
successful software engineering crusade against this feature have
made it a household word for brokenness, but most current programming
languages have more problems in that vein that pass without question.
If you want to see progress, it's important to remember that goto
was a workable, useful, powerful construct that worked fine in the
right hands - and that wasn't enough.

Anyway, to return to the subject, I believe if you follow this
subthread back you will see that it has diverged a little from
simply whether or how Python could support SMP.

Mike, care to mention an example or two of the better models you
had in mind there?

Donn Cave, (e-mail address removed)
 
Bryan Olson

Mike said:
> The real problem is that the concurrency models available in currently
> popular languages are still at the "goto" stage of language
> development. Better models exist, have existed for decades, and are
> available in a variety of languages.

That's not "the real problem"; it's a different and arguable
problem. The GIL isn't even part of Python's threading model;
it's part of the implementation.

> It's not that these languages are for "thread-phobes", either. They
> don't lose power any more than Python loses power by not having a
> goto. These languages haven't taken off for reasons unrelated to the
> threading model(*).
>
> The rule I follow in choosing my tools is "Use the least complex tool
> that will get the job done."

Even if a more complex tool could do the job better?

> Given that the threading models in
> popular languages are complex and hard to work with, I look elsewhere
> for solutions. I've had good luck using async I/O in lieu of
> threads. It won't solve every problem, but where it does, it's much
> simpler to work with.

I've found single-line-of-execution async I/O to be worse than
threads. I guess that puts me in the Tanenbaum camp and not the
Ousterhout camp. Guido and Tanenbaum worked together on Amoeba
(and other stuff), which featured threads with semaphores and
seemed to work well.

Now I've gotten off-topic. Threads are winning, and the industry
is going to multiple processors even for PC-class machines.
Might as well learn to use that power.
 
Alan Kennedy

[Bryan Olson]
[Mike Meyer]
> The real problem is that the concurrency models available in currently
> popular languages are still at the "goto" stage of language
> development. Better models exist, have existed for decades, and are
> available in a variety of languages.

I think that having a concurrency mechanism that doesn't use goto will
require a fundamental redesign of the underlying execution hardware,
i.e. the CPU.

All modern CPUs allow flow control through the use of
machine-code/assembly instructions which branch, either conditionally or
unconditionally, to either a relative or absolute memory address, i.e. a
GOTO.

Modern languages wrap this goto nicely using constructs such as
generators, coroutines or continuations, which allow preservation and
restoration of the execution context, e.g. through closures, evaluation
stacks, etc. But underneath the hood, they're just gotos. And I have no
problem with that.

To really have parallel execution with clean modularity requires a
hardware redesign at the CPU level, where code units, executing in
parallel, are fed a series of data/work-units. When they finish
processing an individual unit, it gets passed (physically, at a hardware
level) to another code unit, executing in parallel on another execution
unit/CPU. To achieve multi-stage processing of data would require
breaking up the processing into a pipeline of modular operations, which
communicate through dedicated hardware channels.

I don't think I've described it very clearly above, but you can read a
good high-level overview of a likely model from the 1980's, the
Transputer, here

http://en.wikipedia.org/wiki/Transputer

Transputers never took off, for a variety of technical and commercial
reasons, even though there was full high-level programming language
support in the form of Occam: I think it was just too brain-bending for
most programmers at the time. (I personally *almost* took on the task of
developing a debugger for transputer arrays for my undergrad thesis in
1988, but when I realised the complexity of the problem, I picked a
hypertext project instead ;-)

http://en.wikipedia.org/wiki/Occam_programming_language

IMHO, python generators (which BTW are implemented with a JVM goto
instruction in jython 2.2) are a nice programming model that fits neatly
with this hardware model. Although not today.
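
As a rough illustration of the pipeline idea, here is a sketch in which each stage is a generator that receives work units from the previous stage and passes results on. The stage names (`source`, `square`, `keep_even`) are made up for this example, and everything runs in a single thread; it shows only the programming model, not parallel execution on separate hardware units.

```python
def source(items):
    """First stage: feed work units into the pipeline one at a time."""
    for item in items:
        yield item


def square(units):
    """Middle stage: transform each unit as it arrives."""
    for unit in units:
        yield unit * unit


def keep_even(units):
    """Final stage: pass along only the units that meet a condition."""
    for unit in units:
        if unit % 2 == 0:
            yield unit


def run_pipeline(items):
    # Each stage suspends at `yield` and resumes where it left off, so the
    # stages hand units to each other without any explicit jumps.
    return list(keep_even(square(source(items))))
```

Calling `run_pipeline(range(6))` pushes the values 0 through 5 through all three stages, one unit at a time.
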
 
Mike Meyer

Donn Cave said:
> Quoth Paul Rubin:
> |> The real problem is that the concurrency models available in currently
> |> popular languages are still at the "goto" stage of language
> |> development. Better models exist, have existed for decades, and are
> |> available in a variety of languages.
> | But Python's threading system is designed to be like Java's, and
> | actual Java implementations seem to support concurrent threads just fine.
> I don't see a contradiction here. "goto" is "just fine", too --
> you can write excellent programs with goto.

Right. The only thing wrong with "goto" is that we've since found
better ways to describe program flow. These ways are less complex,
hence easier to use and understand.

> Mike, care to mention an example or two of the better models you
> had in mind there?

I've seen a couple of such, but have never been able to find the one I
really liked in Google again :-(. That leaves Eiffel's SCOOP (aka
Concurrent Eiffel). You can find a short intro at <URL:
http://archive.eiffel.com/doc/manuals/technology/concurrency/short/page.html >.

Even simpler to program in is the model used by Erlang. It's more CSP
than threading, though, as it doesn't have shared memory as part of
the model. But if you can use the simpler model to solve your problem
- you probably should.

<mike
 
Mike Meyer

Bryan Olson said:
> That's not "the real problem"; it's a different and arguable
> problem. The GIL isn't even part of Python's threading model;
> it's part of the implementation.

Depends on what point you consider the problem.

> Even if a more complex tool could do the job better?

In that case, the simpler model isn't necessarily getting the job
done. I purposely didn't refine the word "job" just so this would be
the case.

> Now I've gotten off-topic. Threads are winning, and the industry
> is going to multiple processors even for PC-class machines.
> Might as well learn to use that power.

I own too many orphans to ever confuse popularity with technical
superiority. I've learned how to use threads, and done some
non-trivial thread programming, and hope to never repeat that
experience. It was the second most difficult programming task I've
ever attempted(*). As I said above, the real problem isn't threads per
se, it's that the model for programming them in popular languages is
still primitive. So far, to achieve the non-repetition goal, I've used
async I/O, restricted my use of real threads in popular languages to
trivial cases, and started using servers so someone else gets to deal
with these issues. If I ever find myself having to have non-trivial
threads again, I'll check the state of the threading models in other
languages, and make a serious push for implementing parts of the
program in a less popular language with a less primitive threading
model.

<mike

*) The most difficult task was writing horizontal microcode, which
also had serious concurrency issues in the form of device settling
times. I dealt with that by inventing a programming model that hid
most of the timing details from the programmer. It occasionally lost a
cycle, but the people who used it after me were *very* happy with it
compared to the previous model.
 
Paul Rubin

Mike Meyer said:
> Even simpler to program in is the model used by Erlang. It's more CSP
> than threading, though, as it doesn't have shared memory as part of
> the model. But if you can use the simpler model to solve your problem
> - you probably should.

Well, ok, the Python equivalent would be wrapping every shareable
object in its own thread, that communicates with other threads through
Queues. This is how some Pythonistas suggest writing practically all
multi-threaded Python code. It does a reasonable job of avoiding
synchronization headaches and it's not that hard to code that way.
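
A minimal sketch of that style, assuming Python 3 (where the module is spelled `queue`; at the time of this thread it was `Queue`). The `CounterActor` class and its message names are hypothetical, chosen only to show one thread owning the state while other threads talk to it through queues.

```python
import queue
import threading


class CounterActor:
    """Hypothetical wrapper: one thread owns the state, others send messages."""

    def __init__(self):
        self._inbox = queue.Queue()
        self._count = 0  # touched only by the owning thread
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # The owning thread handles one message at a time, so no lock is
        # needed around self._count.
        while True:
            message, reply = self._inbox.get()
            if message == "stop":
                break
            if message == "add":
                self._count += 1
            elif message == "get":
                reply.put(self._count)

    def add(self):
        self._inbox.put(("add", None))

    def get(self):
        # Each request carries its own reply queue for the answer.
        reply = queue.Queue()
        self._inbox.put(("get", reply))
        return reply.get()

    def stop(self):
        self._inbox.put(("stop", None))
        self._thread.join()
```

Callers never touch `_count` directly; they only enqueue messages, so the synchronization lives entirely inside `queue.Queue`.
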

But I think to do it on Erlang's scale, Python needs user-level
microthreads and not just OS threads. Maybe Python 3000 can add some
language support, though an opportunity was missed when Python's
generator syntax got defined the way it did.

I've been reading a bit about Erlang and am impressed with it. Here
is a good thesis about parallelizing Erlang, link courtesy of Ulf
Wiger on comp.lang.functional:

http://www.erlang.se/publications/xjobb/0089-hedqvist.pdf

The thesis also gives a good general description of how Erlang works.
 
Donn Cave

Quoth Mike Meyer:
[... wandering from the nominal topic ...]

| *) The most difficult task was writing horizontal microcode, which
| also had serious concurrency issues in the form of device settling
| times. I dealt with that by inventing a programming model that hid
| most of the timing details from the programmer. It occasionally lost a
| cycle, but the people who used it after me were *very* happy with it
| compared to the previous model.

My favorite concurrency model comes with a Haskell variant called
O'Haskell, and it was last seen calling itself "Timber" with some
added support for time as an event source. The most on topic thing
about it -- its author implemented a robot controller in Timber, and
the robot is a little 4-wheeler called ... "Timbot".

Donn Cave, (e-mail address removed)
 
Dennis Lee Bieber

> with these issues. If I ever find myself having to have non-trivial
> threads again, I'll check the state of the threading models in other
> languages, and make a serious push for implementing parts of the
> program in a less popular language with a less primitive threading
> model.

The June edition of "SIGPLAN Notices" (the PLDI'05 proceeding issue)
has a paper titled "Threads Cannot Be Implemented As a Library" -- which
is primarily concerned with the problems of threading being done, well,
via an add-on library (as opposed to a native part of the language
specification: C#, Ada, Java).

I suspect Python falls into the "library" category.
 
sjdevnull

km said:
> is true parallelism possible in python ? or at least in the coming versions ?
> is global interpreter lock a bane in this context ?

I've had absolutely zero problems implementing truly parallel programs
in python. All of my parallel programs have been multiprocess
architectures, though--the GIL doesn't affect multiprocess
architectures.
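
One way such a multiprocess design might look, sketched with the standard `subprocess` module: each chunk of work goes to a separate interpreter process, so no GIL is shared between them. This is an illustration under assumptions, not the poster's actual setup; the worker source string and the function `parallel_sum_of_squares` are invented for the example.

```python
import subprocess
import sys

# Hypothetical worker: each child interpreter sums squares over its own range.
WORKER_SOURCE = (
    "import sys; lo, hi = int(sys.argv[1]), int(sys.argv[2]); "
    "print(sum(i * i for i in range(lo, hi)))"
)


def parallel_sum_of_squares(n, workers=4):
    """Sum i*i for i in range(n), splitting the range across child processes."""
    if n <= 0:
        return 0
    step = (n + workers - 1) // workers
    procs = []
    for lo in range(0, n, step):
        hi = min(lo + step, n)
        # Start every child before waiting on any, so they run side by side.
        procs.append(subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE, str(lo), str(hi)],
            stdout=subprocess.PIPE, text=True))
    total = 0
    for proc in procs:
        out, _ = proc.communicate()
        total += int(out)
    return total
```

The processes share nothing but the pipes they report through, which is the protected-memory property the post values.
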

Good support for multiple process architectures was one of the things
that initially led me to pick Python over Java in the first place
(Java was woefully lacking in support facilities for this kind of
architecture at that time; it's improved somewhat since then but still
requires some custom C coding). I don't have much desire to throw out
decades of work by OS implementors on protected memory without a pretty
darn good reason.
 
Mike Meyer

Paul Rubin said:
> Well, ok, the Python equivalent would be wrapping every shareable
> object in its own thread, that communicates with other threads through
> Queues. This is how some Pythonistas suggest writing practically all
> multi-threaded Python code. It does a reasonable job of avoiding
> synchronization headaches and it's not that hard to code that way.

This sort of feels like writing your while loops/etc. with if and
goto. Sure, they really are that at the hardware level, but you'd like
the constructs you work with to be at a higher level. It's not really
that bad, because Queue is a higher level construct, but it's still
not quite as good as it could be.

> But I think to do it on Erlang's scale, Python needs user-level
> microthreads and not just OS threads. Maybe Python 3000 can add some
> language support, though an opportunity was missed when Python's
> generator syntax got defined the way it did.

I'm not sure we need to go as far as Erlang does. On the other hand,
I'm also not sure we can get a "much better" threading model without
language support of some kind. Threading and Queues are all well and
good, but they still leave the programmer handling primitive threading
objects.

<mike
 
Mike Meyer

Dennis Lee Bieber said:
> The June edition of "SIGPLAN Notices" (the PLDI'05 proceeding issue)
> has a paper titled "Threads Cannot Be Implemented As a Library" -- which
> is primarily concerned with the problems of threading being done, well,
> via an add-on library (as opposed to a native part of the language
> specification: C#, Ada, Java).

Thanks for the reference. A little googling turns up a copy published
via HP at <URL:
http://www.hpl.hp.com/techreports/2004/HPL-2004-209.html >.

> I suspect Python falls into the "library" category.

Well, that's what it's got now, so that seems likely.

<mike
 
