when does the GIL really block?

C

Craig Allen

I have followed the GIL debate in python for some time. I don't want
to get into the regular debate about if it should be gotten rid of
(though I am curious about the status of that for Python 3)...
personally I think I can do multi-threaded programming well, but I
also see the benefits of a multiprocess approach. I'm not so
egotistical that I don't realize perhaps my mt programming has not
been "right" (though it worked and was debuggable) or more likely that
doing it right I have avoided even trying some things people want mt
programming to do... i.e. to do mt programming right you start to use
queues a lot, inter-thread asynchronous, non-blocking, communication,
which is essentially the multi-process approach, using IPC (except
that the threads can see the same memory when, in your special case,
you know that's ok. Given something like a reader-writer lock, this
can have benefits... but again, whatever.

My question is that given this problem, years ago before I started
writing in python I wrote some short programs in python which could,
in fact, busy both my CPUs. In retrospect I assume I did not have
code in my run function that causes a GIL lock... so I have done this
again.

I start two threads... I use gkrellm to watch my processors (dual
processor machine). If I merely print a number... both CPUS are
getting 90% simultaneous loads. If I increment a counter and print it
too, the same, and if I create a small list and sort it, the same. I
did not expect this... I expected to see one processor pegged at
around 100%, which should sometimes switch to the other processor.
Granted, the same program in C/C++ would peg both processors at
100%... but given that the overhead in the interpreter cannot explain
the extra usage, I assume the code in my thread's run functions is
actually executing non-serially.

I assume this is because what I am doing does not require the GIL to
be locked for a significant part of the time my code is running...
what code could I put in my run function to see the behavior I
expected? What code could I put there to take advantage of the
possibility that really the GIL is not locked enough to cause actual
serialization of the threads... anyone care to explain?
 
R

Rhamphoryncus

I have followed the GIL debate in python for some time.  I don't want
to get into the regular debate about if it should be gotten rid of
(though I am curious about the status of that for Python 3)...
personally I think I can do multi-threaded programming well, but I
also see the benefits of a multiprocess approach. I'm not so
egotistical that I don't realize perhaps my mt programming has not
been "right" (though it worked and was debuggable) or more likely that
doing it right I have avoided even trying some things people want mt
programming to do... i.e. to do mt programming right you start to use
queues a lot, inter-threadasynchronous, non-blocking, communication,
which is essentially the multi-process approach, using IPC (except
that thethreads can see the same memory when, in your special case,
you know that's ok. Given something like a reader-writer lock, this
can have benefits... but again, whatever.

My question is that given this problem, years ago before I started
writing in python I wrote some short programs in python which could,
in fact, busy both my CPUs.  In retrospect I assume I did not have
code in my run function that causes a GIL lock... so I have done this
again.

I start twothreads... I use gkrellm to watch my processors (dual
processor machine).  If I merely print a number... both CPUS are
getting 90% simultaneous loads. If I increment a counter and print it
too, the same, and if I create a small list and sort it, the same. I
did not expect this... I expected to see one processor pegged at
around 100%, which should sometimes switch to the other processor.
Granted, the same program in C/C++ would peg both processors at
100%... but given that the overhead in the interpreter cannot explain
the extra usage, I assume the code in mythread's run functions is
actually executing non-serially.

Try using sys.setcheckinterval(10000) (or even larger), overriding the
default of 100. This will reduce the locking overhead, which might by
why you see both CPUs as busy.

I assume this is because what I am doing does not require the GIL to
be locked for a significant part of the time my code is running...
what code could I put in my run function to see the behavior I
expected?  What code could I put there to take advantage of the
possibility that really the GIL is not locked enough to cause actual
serialization of thethreads...  anyone care to explain?

The GIL is locked during *all* access to the python interpreter.
There's nothing pure python code can do to avoid it - only a C
extension that doesn't access python could.
 
C

Craig Allen

Try using sys.setcheckinterval(10000) (or even larger), overriding the
default of 100. This will reduce the locking overhead, which might by
why you see both CPUs as busy.


The GIL is locked during *all* access to the python interpreter.
There's nothing pure python code can do to avoid it - only a C
extension that doesn't access python could.

thanks
 
J

John Krukoff

I have followed the GIL debate in python for some time. I don't want
to get into the regular debate about if it should be gotten rid of
(though I am curious about the status of that for Python 3)...
personally I think I can do multi-threaded programming well, but I
also see the benefits of a multiprocess approach. I'm not so
egotistical that I don't realize perhaps my mt programming has not
been "right" (though it worked and was debuggable) or more likely that
doing it right I have avoided even trying some things people want mt
programming to do... i.e. to do mt programming right you start to use
queues a lot, inter-thread asynchronous, non-blocking, communication,
which is essentially the multi-process approach, using IPC (except
that the threads can see the same memory when, in your special case,
you know that's ok. Given something like a reader-writer lock, this
can have benefits... but again, whatever.

My question is that given this problem, years ago before I started
writing in python I wrote some short programs in python which could,
in fact, busy both my CPUs. In retrospect I assume I did not have
code in my run function that causes a GIL lock... so I have done this
again.

I start two threads... I use gkrellm to watch my processors (dual
processor machine). If I merely print a number... both CPUS are
getting 90% simultaneous loads. If I increment a counter and print it
too, the same, and if I create a small list and sort it, the same. I
did not expect this... I expected to see one processor pegged at
around 100%, which should sometimes switch to the other processor.
Granted, the same program in C/C++ would peg both processors at
100%... but given that the overhead in the interpreter cannot explain
the extra usage, I assume the code in my thread's run functions is
actually executing non-serially.

I assume this is because what I am doing does not require the GIL to
be locked for a significant part of the time my code is running...
what code could I put in my run function to see the behavior I
expected? What code could I put there to take advantage of the
possibility that really the GIL is not locked enough to cause actual
serialization of the threads... anyone care to explain?

It's worth mentioning that the most common place for the python
interpreter to release the GIL is during I/O, which printing a number to
the screen certainly counts as. You might try again with a set of loops
that only increment, and don't print, and you may more obviously see the
GIL in action.
 
C

Craig Allen

It's worth mentioning that the most common place for the python
interpreter to release the GIL is during I/O, which printing a number to
the screen certainly counts as. You might try again with a set of loops
that only increment, and don't print, and you may more obviously see the
GIL in action.

thanks, good idea, I think I'll try that.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,965
Messages
2,570,148
Members
46,710
Latest member
FredricRen

Latest Threads

Top