Why does start_new_thread() create an extra process under Linux?

Jon Perez

Running the following under Linux creates
3 processes instead of 2. Once the started
thread exits, 2 processes still remain. Why?


import thread
import time
from thread import start_new_thread

def newthread():
    print "child"
    time.sleep(10)  # suitable delay; the exact length is arbitrary
    thread.exit()

start_new_thread(newthread, ())
while 1:
    pass


Note: I am running Linux kernel 2.6.7 / glibc 2.3.2 (Slackware 10).
 
Jp Calderone

Jon said:
> Running the following under Linux creates
> 3 processes instead of 2. Once the started
> thread exits, 2 processes still remain. Why?

Most likely, the "extra" process you are seeing is an implementation detail
of your platform's underlying thread library. It probably exists to act
as a scheduler or perform other administrative tasks for the "real"
threads of your application.

Jp
 
Heiko Wundram

On Thursday, 29 July 2004 16:00, Jp Calderone wrote:
> Most likely, the "extra" process you are seeing is an implementation detail
> of your platform's underlying thread library. It probably exists to act
> as a scheduler or perform other administrative tasks for the "real"
> threads of your application.

Well, first of all, what the OP was seeing wasn't actually what he thought he
was seeing.

In Python there's always the main thread (which is started when python starts
up), and other threads may be started. Thus, if you start two threads in your
program, you'll see three processes in the process list (one for the main
thread, two for the started threads).

But whether these threads will show up as processes depends on the threading
library you use...

LinuxThreads creates a process for each thread that is run. All these
processes share the same memory, although they show up as separate processes
(and, as far as the kernel is concerned, they actually are: they are started
by the clone() system call, which creates a new process with its own process
ID, stack and instruction pointer, but keeps the data and code segments of
the cloning process).

NPTL (Native POSIX Thread Library), the "next-generation" threads library for
Linux, handles threads "correctly" in the sense that they are just one
process with separate execution frames but shared memory. NPTL requires
kernel >= 2.5.40-something and a specially adapted glibc. Most new Linux
distributions (>= 9.0 something, Debian sid aka unstable) ship with NPTL
enabled by default, although this creates compatibility problems with apps
written for LinuxThreads, as LinuxThreads isn't completely POSIX-threads
compatible (which NPTL is). NPTL also uses some form of syscall to create
threads, but you'd have to see the docs for the details; I don't know them.
ps from procps was extended to support NPTL threads some time ago; there's a
specific flag you have to specify to have threads shown.
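
One rough way to observe the "one process, several threads" behaviour from Python itself is to compare os.getpid() with the kernel's per-thread ID inside each thread. This is only a sketch and assumes Python 3.8 or later (for threading.get_native_id()), not the Python 2 thread module used in the original post; the helper name collect_ids is made up for illustration. Under NPTL every thread should report the same PID but a different native thread ID, whereas under LinuxThreads getpid() returned a different value in each thread.

```python
import os
import threading

def collect_ids(n_threads=3):
    """Start n_threads workers and record (pid, native thread id) for each."""
    results = []
    lock = threading.Lock()
    # The barrier keeps every worker alive until all have recorded their IDs,
    # so no thread ID can be recycled while the measurement is in progress.
    barrier = threading.Barrier(n_threads)

    def worker():
        ids = (os.getpid(), threading.get_native_id())
        with lock:
            results.append(ids)
        barrier.wait()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

if __name__ == "__main__":
    print("main: pid=%d tid=%d" % (os.getpid(), threading.get_native_id()))
    for pid, tid in collect_ids():
        print("worker: pid=%d tid=%d" % (pid, tid))
```

The native thread IDs printed here are the same numbers that a thread-aware ps lists for the process.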

There are also other Linux threads libraries out there, all of them completely
implemented in user space, using setjmp/longjmp and other black magic. When
a program uses one of these, you'll also see only one process, although I
don't know of any production program that uses one of these threading libraries.

Anyway, hope this clears it up a little...

Heiko.
 
Heiko Wundram

On Thursday, 29 July 2004 17:31, Heiko Wundram wrote:
> Well, first of all, what the OP was seeing wasn't actually what he thought
> he was seeing.
>
> In Python there's always the main thread (which is started when python
> starts up), and other threads may be started. Thus, if you start two
> threads in your program, you'll see three processes in the process list
> (one for the main thread, two for the started threads).

Forget that, I read the first post wrong. What the OP was probably seeing was
output from an NPTL-patched ps, which always shows the threads that are
running (not only when asked for it). ps outputs one line (the first) for the
process; every following line stands for one thread currently running under
that process. So, the following output from ps (actually ps ax -m on my
machine) means that pickup (part of postfix) runs only one thread (not two
processes), and xmms runs five threads.

17539 ?        -      0:00 pickup -l -t fifo -u
    - -        S      0:00 -
18338 ?        -      0:02 /usr/bin/xmms
    - -        S      0:02 -
    - -        S      0:00 -
    - -        S      0:00 -
    - -        S      0:00 -
    - -        S      0:00 -

I have an NPTL-enabled glibc and kernel (without a somewhat strange patch, as
the OP seems to have). When I type only ps ax, the same processes show up as:

17539 ? S 0:00 pickup -l -t fifo -u
18338 ? S 0:02 /usr/bin/xmms

To see whether you have an NPTL-enabled glibc, run /lib/libc.so.6 as a
command, and it'll output something like:

....
Available extensions:
....
NPTL 0.61 by Ulrich Drepper
....
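
The same information can probably be read from Python without running the C library directly. The following is a minimal sketch, assuming a glibc-based system on which os.confstr() recognises the name CS_GNU_LIBPTHREAD_VERSION; the function name libpthread_version is hypothetical. On other C libraries or platforms it simply returns None.

```python
import os

def libpthread_version():
    """Return glibc's thread-library string, or None if it is unavailable."""
    try:
        return os.confstr("CS_GNU_LIBPTHREAD_VERSION")
    except (AttributeError, ValueError, OSError):
        # AttributeError: os.confstr does not exist on this platform.
        # ValueError: this C library does not define the name.
        # OSError: the underlying confstr() call failed.
        return None

if __name__ == "__main__":
    print(libpthread_version() or "thread library not reported")
```

A result starting with "NPTL" would indicate an NPTL-enabled glibc; a LinuxThreads build would be expected to report a linuxthreads version string instead.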

Heiko.
 
Jon Perez

Heiko said:
> NPTL (Native POSIX Thread Library), the "next-generation" threads library for
> Linux, handles threads "correctly" in the sense that they are just one
> process with separate execution frames but shared memory.

Does the fact that NPTL threads are 'just one process' mean they
are not created using clone()? Are NPTL threads not scheduled by
the kernel?

If so, then how come NPTL is described as a 1:1 model, which afaik
means that each application thread is mapped to exactly one kernel-scheduled
thread (or lightweight process, if you will), which, again afaik, can only be
created via a clone() call (albeit with different flags) and nothing else?

If NPTL threads are scheduled by NPTL code as opposed to kernel code and are
all mapped onto one process started by one clone() call, wouldn't that make it M:1?

> What the op was probably seeing was output from an NPTL patched ps, which
> always shows the threads that are running (not only when asked for it).

Note that my glibc is not NPTL-enabled; it is the stock 2.3.2 used
in Slackware 10 (although the procps-3.2.1 it uses may already be NPTL-ready),
so this would not seem to be the explanation.

If you start the sample program in my original message and it hasn't launched
a thread yet, ps will only show one running process. The moment it calls
start_new_thread() however, two new processes show up in ps (so that makes
three total)! Once this newly started thread dies, only one process gets removed,
so there will still be two processes running and that's what's puzzling me.

The same thing applies if you start N threads. It seems there's always
one extra thread lying around after you call start_new_thread().
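
The before/during/after observation can also be taken from inside the program rather than from ps. What follows is a hedged sketch in Python 3 (using threading instead of the old thread module), assuming a Linux /proc that exposes a "Threads:" field in /proc/self/status; the function names are made up for illustration. Note that this field counts only the threads the kernel groups under the current process, so on a LinuxThreads system, where each thread is a separate process, it may not show the extra entry that ps reports.

```python
import threading
import time

def kernel_thread_count():
    """Read the 'Threads:' field from /proc/self/status (Linux only)."""
    try:
        with open("/proc/self/status") as status:
            for line in status:
                if line.startswith("Threads:"):
                    return int(line.split()[1])
    except OSError:
        pass
    return None  # no /proc, or the field is missing

def wait_for_count(target, timeout=2.0):
    """Poll until the kernel reports `target` threads or the timeout expires."""
    deadline = time.monotonic() + timeout
    count = kernel_thread_count()
    while count != target and time.monotonic() < deadline:
        time.sleep(0.01)
        count = kernel_thread_count()
    return count

def run_experiment():
    """Return (before, during, after) kernel thread counts around one thread."""
    started = threading.Event()
    release = threading.Event()

    def child():
        started.set()
        release.wait()  # stands in for the "suitable delay"

    before = kernel_thread_count()
    t = threading.Thread(target=child)
    t.start()
    started.wait()
    during = kernel_thread_count()
    release.set()
    t.join()
    # join() can return slightly before the kernel has torn the thread down,
    # so poll briefly instead of sampling once.
    after = wait_for_count(before) if before is not None else None
    return before, during, after

if __name__ == "__main__":
    print("before=%s during=%s after=%s" % run_experiment())
```

If "after" stays above "before" on a given system, that would point to a helper thread that the threading library keeps alive, which is what Jp suggested.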
 
Erno Kuusela

Jon Perez said:
> Does the fact that NPTL threads are 'just one process' mean they
> are not created using clone()? Are NPTL threads not scheduled by
> the kernel?

they are just hidden from the /proc directory listing.

(erno@fabulous) /home.b/erno % ls -l /proc/`pidof firefox-bin`/task
total 0
dr-xr-xr-x 3 erno erno 0 Jul 30 17:56 28319
dr-xr-xr-x 3 erno erno 0 Jul 30 17:56 31596
dr-xr-xr-x 3 erno erno 0 Jul 30 17:56 31597
dr-xr-xr-x 3 erno erno 0 Jul 30 17:56 31599
(erno@fabulous) /home.b/erno % ls -l /proc/28319 | wc -l
16
(erno@fabulous) /home.b/erno % ls -l /proc|grep -c 28319
0
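
The same check can be approximated from Python. This is a sketch only, assuming a Linux system with /proc/self/task and Python 3; the function name task_visibility is hypothetical. For every task ID of the current process it records whether the ID appears in the top-level /proc listing and whether /proc/<tid> can still be opened directly.

```python
import os
import threading

def task_visibility(n_threads=3):
    """Map each task ID of this process to (listed_in_proc, has_proc_dir).

    Returns None when /proc/self/task does not exist (non-Linux systems).
    """
    task_dir = "/proc/self/task"
    if not os.path.isdir(task_dir):
        return None

    release = threading.Event()
    ready = threading.Barrier(n_threads + 1)

    def worker():
        ready.wait()    # signal that this thread is running
        release.wait()  # stay alive while the main thread inspects /proc

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    ready.wait()
    try:
        top_level = set(os.listdir("/proc"))
        return {
            int(tid): (tid in top_level, os.path.isdir("/proc/" + tid))
            for tid in os.listdir(task_dir)
        }
    finally:
        release.set()
        for t in threads:
            t.join()

if __name__ == "__main__":
    for tid, (listed, reachable) in sorted((task_visibility() or {}).items()):
        print("task %d: listed in /proc=%s, /proc/<tid> exists=%s"
              % (tid, listed, reachable))
```

On a system that behaves like the one in the listing above, only the main task would be expected to show up as listed, while every task remains reachable by its path.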

-- erno
 
