thread limit in python

danieldsmith

hello all,

i have run into a problem where i cannot start more than 1021 threads
in a python program, no matter what my ulimit or kernel settings are.
my program crashes with 'thread.error: can't start new thread' when it
tries to start the 1021st thread.

in addition to tweaking my ulimit settings, i have also verified that
the version of glibc i'm using has PTHREAD_THREADS_MAX defined to be
16374. i have posted my test program below as well as the ulimit
settings i was running it with. any help would be greatly appreciated.


thanks in advance,
dan smith

<program>
import os
import sys
import time
import threading

num_threads = int(sys.argv[1])

def run():
    # just sleep
    time.sleep(10000)

for i in range(1, num_threads + 1):
    print('starting thread %d' % i)
    t = threading.Thread(None, run)
    t.start()

# exit at once without waiting for the sleeping threads
os._exit(1)
</program>

<ulimit -a>
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
max locked memory (kbytes, -l) 4
max memory size (kbytes, -m) unlimited
open files (-n) 1000000
pipe size (512 bytes, -p) 8
stack size (kbytes, -s) unlimited
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
</ulimit -a>
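
The limits listed above can also be read from inside the process, next to a few kernel-wide settings that `ulimit -a` does not show. This is a minimal sketch, assuming Python 3 on a Unix-like system. The helper name `read_thread_limits` is made up for illustration, and the `/proc` paths are Linux-specific, so they are skipped when absent. Which of these values (if any) is behind the 1021-thread ceiling is not established in this thread.

```python
import resource

def read_thread_limits():
    """Collect per-process and system-wide limits that can cap thread creation."""
    limits = {}
    for name in ("RLIMIT_NPROC", "RLIMIT_STACK", "RLIMIT_AS"):
        const = getattr(resource, name, None)  # not every platform defines all three
        if const is not None:
            # (soft, hard); resource.RLIM_INFINITY means "unlimited"
            limits[name] = resource.getrlimit(const)
    # Linux-only kernel tunables; silently skipped on other systems.
    for path in ("/proc/sys/kernel/threads-max",
                 "/proc/sys/kernel/pid_max",
                 "/proc/sys/vm/max_map_count"):
        try:
            with open(path) as f:
                limits[path] = int(f.read().strip())
        except (OSError, ValueError):
            pass
    return limits

if __name__ == "__main__":
    for key, value in sorted(read_thread_limits().items()):
        print(key, value)
```

Comparing the soft and hard values for each resource shows whether a limit raised in the shell actually reached the process that starts the threads.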
 
danieldsmith

also, on the same box a similar C program (posted below) has no problem
starting 5000+ threads.

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>

void *
run (void *arg) {
    sleep(1000);
    return NULL;
}

int main(int argc, char *argv[]) {
    int j;
    pthread_t tid;
    int num_threads = atoi(argv[1]);

    for (j = 0; j < num_threads; j++) {
        /* return code is not checked here */
        pthread_create(&tid, NULL, run, NULL);
        printf("created thread %d\n", j);
        fflush(stdout);
    }

    sleep(1000);
    return 0;
}
 
danieldsmith

disregard the C example. wasn't checking the return code of
pthread_create. the C program breaks in the same place, when creating
the 1021st thread.
 
Bryan Olson

> disregard the C example. wasn't checking the return code of
> pthread_create. the C program breaks in the same place, when creating
> the 1021st thread.

So that's pretty good evidence that it's an OS limit, not a
Python limit. The most likely problem is that the stack size is
too large, so you're running out of virtual address space.
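
One way to test the stack-size theory from the Python side is to shrink the per-thread stack with `threading.stack_size()` and count how many threads start before creation fails. This is a minimal sketch, assuming Python 3, where a failed start raises `RuntimeError` (the Python 2 program above reports it as `thread.error`). The helper name `count_startable_threads` is hypothetical.

```python
import threading

def count_startable_threads(limit, stack_bytes=0):
    """Start up to `limit` idle threads and return how many actually started."""
    # 0 keeps the platform default; other values must be at least 32 KiB.
    previous = threading.stack_size(stack_bytes)
    release = threading.Event()
    started = []
    try:
        for _ in range(limit):
            t = threading.Thread(target=release.wait)
            try:
                t.start()
            except RuntimeError:  # "can't start new thread"
                break
            started.append(t)
    finally:
        release.set()            # let every idle thread finish
        for t in started:
            t.join()
        threading.stack_size(previous)  # restore the earlier setting
    return len(started)
```

Calling `count_startable_threads(2000)` and then `count_startable_threads(2000, stack_bytes=256 * 1024)` gives two numbers to compare. If both stop at the same count, stack size is probably not the limiting factor.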
 
danieldsmith

i modified my C test program (included below) to explicitly set the
default thread stack size, and i'm still running into the same
problem. can you think of any other thing that would possibly be
limiting me?

and sorry to continue to post here. since this is occurring in both c
and python, i think there's no question i'm running into an os limit.


#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <pthread.h>

void *
run (void *arg) {
    sleep(1000);
    return NULL;
}

int main(int argc, char *argv[]) {
    int j;
    int ret;
    pthread_t tid;
    int num_threads = atoi(argv[1]);
    pthread_attr_t attr;
    size_t stacksize;   /* pthread_attr_getstacksize expects a size_t */

    pthread_attr_init(&attr);
    pthread_attr_getstacksize(&attr, &stacksize);
    printf("Default stack size = %lu\n", (unsigned long) stacksize);

    /* set stack size to 64K */
    pthread_attr_setstacksize(&attr, 0x10000);

    pthread_attr_getstacksize(&attr, &stacksize);
    printf("New stack size = %lu\n", (unsigned long) stacksize);

    for (j = 0; j < num_threads; j++) {
        /* pass &attr; with NULL the thread gets the default stack size */
        ret = pthread_create(&tid, &attr, run, NULL);

        if (ret != 0) {
            printf("thread %d create failed: %s\n", j, strerror(ret));
            fflush(stdout);
            exit(1);
        }
        printf("created thread %d\n", j);
        fflush(stdout);
    }

    sleep(1000);
    return 0;
}
 
Peter Hansen

> and sorry to continue to post here. since this is occurring in both c
> and python, i think there's no question i'm running into an os limit.

Probably, but I haven't yet seen anyone ask the really important question.
What possible use could you have for more than 1000 *simultaneously
active* threads? There are very likely several alternative approaches
that will fit your use case and have better characteristics (e.g. higher
performance, simpler code, safer architecture, etc).

-Peter
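
The post above does not name a specific alternative. As one illustration only, a bounded worker pool lets a fixed number of threads work through a much larger number of jobs. This is a minimal sketch, assuming Python 3 and blocking-style work. The names `handle` and `run_jobs` and the pool size of 32 are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def handle(job_id):
    """Stand-in for one unit of blocking work (hypothetical)."""
    time.sleep(0.001)
    return job_id * 2

def run_jobs(num_jobs, max_workers=32):
    """Run `num_jobs` jobs on at most `max_workers` OS threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in submission order
        return list(pool.map(handle, range(num_jobs)))

if __name__ == "__main__":
    results = run_jobs(5000)
    print(len(results))
```

Whether this fits depends on the real workload, which the original poster has not described.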
 
Bryan Olson

Peter said:
> Probably, but I haven't yet seen anyone ask the real important question.
> What possible use could you have for more than 1000 *simultaneously
> active* threads? There are very likely several alternative approaches
> that will fit your use case and have better characteristics (e.g. higher
> performance, simpler code, safer architecture, etc).

Threading systems have come a long way in the last decade or so,
and they're still advancing. 64-bit, multi-core processors make
mega-threading even more attractive.

To read why zillions of threads are good, see:

http://www.usenix.org/events/hotos03/tech/vonbehren.html

For an example of a high-performance server that uses massive
threading, I'd nominate MySQL.

Prediction: Ten years from now, someone will ask that same
"What possible use..." question, except the number of threads
will be a million.
 
Christopher Subich

> i modified my C test program (included below) to explicitly set the
> default thread stack size, and i'm still running into the same
> problem. can you think of any other thing that would possibly be
> limiting me?

Hrm, you're on an A64, so that might very well mean you're dealing with
4MB pages. If each thread gets its own page of memory for stack space
regardless of how small you've set it, then ~1k threads * 4MB ~= 4GB of
virtual memory.
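
The estimate above is a single multiplication. Here is a small sketch of it, where the 4 MiB per-thread figure is the assumption from this post rather than a measured value, and `address_space_needed` is a made-up helper name.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB

def address_space_needed(num_threads, per_thread_bytes):
    """Virtual address space reserved if every thread gets `per_thread_bytes`."""
    return num_threads * per_thread_bytes

# 1021 threads at an assumed 4 MiB each: just under 4 GiB
needed = address_space_needed(1021, 4 * MIB)
print("%.2f GiB" % (needed / GIB))

# The same thread count at the 64 KiB stack size from the C program
small = address_space_needed(1021, 64 * 1024)
print("%.1f MiB" % (small / MIB))
```

The gap between the two results shows why the per-thread reservation matters far more than the thread count on a system with a 4 GiB address space.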
 
Peter Hansen

Bryan said:
> Threading systems have come a long way in the last decade or so,
> and they're still advancing. 64-bit, multi-core processors make
> mega-threading even more attractive.
>
> To read why zillions of threads are good, see:
> http://www.usenix.org/events/hotos03/tech/vonbehren.html

Judging by the abstract alone, that article promotes "user-level"
threads, not OS threads, and in any case appears (with a quick scan) to
be saying nothing more than that threads are not *inherently* a bad
idea, just that current implementations suffer from various problems
which they believe could be fixed.

My question was in the context of the OP's situation. What possible use
for 1000 OS threads could he have? (Not "could his requirements be
answered with a non-existent proposed improved-performance
implementation of threads?")

> Prediction: Ten years from now, someone will ask that same
> "What possible use..." question, except the number of threads
> will be a million.

Perhaps. And in ten years it will still be a valid question whenever
the context is not fully known. Is the OP's situation IO-bound,
CPU-bound, or just an experiment to see how many threads he can pile on
the machine at one time? The fact that these threads are all sleeping
implies the latter, though what he posted could have been a contrived
example. I'm interested in the real requirements, and whether more than
1000 threads in this day and age (not some imaginary future) might not
be a poor approach.

(For the record, Bryan, I am a strong proponent of thread systems for
many of the reasons the authors of that article put forth. None of
which means my question was without merit.)

-Peter
 
Bryan Olson

Peter said:
> My question was in the context of the OP's situation. What possible use
> for 1000 OS threads could he have?

Is this a language thing? Surely you realize that "what possible
use could <thing> be" carries an insinuation that <thing> is not
such a good idea. Possible uses are many and perfectly
reasonable, such as building a quick, responsive server like
MySQL.

> Is the OP's situation IO-bound,
> CPU-bound, or just an experiment to see how many threads he can pile on
> the machine at one time? The fact that these threads are all sleeping
> implies the latter, though what he posted could have been a contrived
> example. I'm interested in the real requirements, and whether more than
> 1000 threads in this day and age (not some imaginary future) might not
> be a poor approach.

If you're interested in his requirements, why not just ask him
about his requirements? Kind of premature to condemn his
approach.
 
Peter Hansen

Bryan said:
> Is this a language thing? Surely you realize that "what possible
> use could <thing> be" carries an insinuation that <thing> is not
> such a good idea.

Obviously. Is it no longer permissible to question someone's approach
to doing something?

You're questioning my approach to inquiring after the OP's requirements,
and clearly you believe there is a better way to do it. Wonderful. You
may even be right. It's also off-topic.

I'm questioning why the OP thinks he needs more than 1000 threads, and
I'm genuinely interested in his answer. Not your answer, unless you
have personal knowledge of his actual situation.

If it's merely a contrived example, I have no further interest in the
thread, as I'm interested in practical matters. If he thinks he has a
real need, then I happen to believe there's very likely a better way to
do it. And _my_ question happens to be on topic...

-Peter
 
Bryan Olson

Peter said:
> Obviously. Is it no longer permissible to question someone's approach
> to doing something?
>
> You're questioning my approach to inquiring after the OP's requirements,
> and clearly you believe there is a better way to do it. Wonderful. You
> may even be right. It's also off-topic.

I'm just arguing against the notion that a couple thousand
threads is generally a bad idea; if you didn't mean to suggest
that, then I misread you. There are a lot of neat ways to do
things that use one-or-two threads per thing-they-can-support.
In days past such methods did not scale well, but on modern
systems that is no longer true.
 
