feature requests

macker · Oct 3, 2013

Hi, hope this is the right group for this:

I miss two basic (IMO) features in parallel processing:

1. make `threading.Thread.start()` return `self`

I'd like to be able to `workers = [Thread(params).start() for params in whatever]`. Right now, it's 5 ugly, menial lines:

workers = []
for params in whatever:
    thread = threading.Thread(params)
    thread.start()
    workers.append(thread)

2. make multiprocessing pools (incl. ThreadPool) limit the size of their internal queues

As it is now, the queue will greedily consume its entire input, and if the input is large and the pool workers are slow in consuming it, this blows up RAM. I'd like to be able to `pool = Pool(4, max_qsize=1000)`. Same with the output queue (finished tasks).

Or does anyone know of a way to achieve this?

Chris Angelico · Oct 3, 2013

I'd like to be able to `workers = [Thread(params).start() for params in whatever]`. Right now, it's 5 ugly, menial lines:

workers = []
for params in whatever:
    thread = threading.Thread(params)
    thread.start()
    workers.append(thread)

You could shorten this by iterating twice, if that helps:

workers = [Thread(params).start() for params in whatever]
for thrd in workers: thrd.start()

ChrisA

Tim Chase · Oct 3, 2013

workers = []
for params in whatever:
    thread = threading.Thread(params)
    thread.start()
    workers.append(thread)

You could shorten this by iterating twice, if that helps:

workers = [Thread(params).start() for params in whatever]
for thrd in workers: thrd.start()

Do you mean

workers = [Thread(params) for params in whatever]
for thrd in workers: thrd.start()

? ("Thread(params)" vs. "Thread(params).start()" in your list comp)

-tkc

Chris Angelico · Oct 3, 2013

Do you mean

workers = [Thread(params) for params in whatever]
for thrd in workers: thrd.start()

? ("Thread(params)" vs. "Thread(params).start()" in your list comp)

Whoops, copy/paste fail. Yes, that's what I meant.

Thanks for catching!

ChrisA

Ethan Furman · Oct 3, 2013

Hi, hope this is the right group for this:

I miss two basic (IMO) features in parallel processing:

1. make `threading.Thread.start()` return `self`

I'd like to be able to `workers = [Thread(params).start() for params in whatever]`. Right now, it's 5 ugly, menial lines:

workers = []
for params in whatever:
    thread = threading.Thread(params)
    thread.start()
    workers.append(thread)

Ugly, menial lines are a clue that a function to hide it could be useful.

2. make multiprocessing pools (incl. ThreadPool) limit the size of their internal queues

As it is now, the queue will greedily consume its entire input, and if the input is large and the pool workers are slow in consuming it, this blows up RAM. I'd like to be able to `pool = Pool(4, max_qsize=1000)`. Same with the output queue (finished tasks).

Have you verified that this is a problem in Python?

Or does anyone know of a way to achieve this?

You could try subclassing.
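
For what it's worth, here is a rough sketch of one way to cap the number of outstanding tasks without changing the stdlib. It is a small wrapper generator rather than a subclass, and `bounded_imap` and `max_pending` are made-up names for illustration, not part of `multiprocessing`. It assumes a `ThreadPool` and relies only on `apply_async()` and `AsyncResult.get()`.

```python
from collections import deque
from multiprocessing.pool import ThreadPool


def bounded_imap(pool, func, iterable, max_pending=1000):
    """Yield func(item) in input order, keeping at most max_pending tasks submitted.

    Hypothetical helper: the input iterable is only read as fast as results
    are taken out, so neither pending tasks nor finished results pile up.
    """
    pending = deque()
    for item in iterable:
        if len(pending) >= max_pending:
            # Block on the oldest task before submitting another one.
            yield pending.popleft().get()
        pending.append(pool.apply_async(func, (item,)))
    # Input exhausted: drain whatever is still in flight.
    while pending:
        yield pending.popleft().get()


consumed = []  # records how far the input generator has been read


def numbers(n):
    for i in range(n):
        consumed.append(i)
        yield i


def square(x):
    return x * x


with ThreadPool(4) as pool:
    results = bounded_imap(pool, square, numbers(100), max_pending=10)
    first = next(results)
    read_ahead = len(consumed)  # input items read when the first result arrives
    rest = list(results)
```

The trade-off is that results come back strictly in input order, so one slow task holds up everything queued behind it.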

macker · Oct 5, 2013

Ugly, menial lines are a clue that a function to hide it could be useful.

Or a clue to add a trivial change elsewhere (hint for Ethan: `return self` at the end of `Thread.start()`).

Have you verified that this is a problem in Python?
?

You could try subclassing.

I could try many things. What this thread is about is trying to fix it on stdlib level, so that people don't have to reinvent the wheel every time.

Thanks to Chris for his suggestion. Ethan, please stay away from this thread.

-macker

Ethan Furman · Oct 5, 2013

Or a clue to add a trivial change elsewhere (hint for Ethan: `return self` at the end of `Thread.start()`).

I'm aware that would solve your issue. I'm also aware that Python rarely does a 'return self' at the end of methods.
Since that probably isn't going to change, a helper function is probably your best way forward.
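
Something along these lines, as a minimal sketch. The name `start_threads` is just an illustration, and it assumes each thread runs the same target with its own tuple of arguments:

```python
import threading


def start_threads(target, arg_list):
    """Create one thread per argument tuple, start them all, and return them."""
    threads = [threading.Thread(target=target, args=args) for args in arg_list]
    for t in threads:
        t.start()
    return threads


results = []
lock = threading.Lock()


def work(n):
    # Guard the shared list so appends from different threads do not interleave.
    with lock:
        results.append(n * n)


workers = start_threads(work, [(i,) for i in range(5)])
for w in workers:
    w.join()
```

With that in place the call site is a single line, and `Thread.start()` keeps returning `None`.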

?

You stated it "would blow up RAM" -- have you actually tested this, or are you making assumptions based on experience
from other languages, or assumptions based on nothing at all?

I could try many things. What this thread is about is trying to fix it on stdlib level, so that people don't have to reinvent the wheel every time.

Did you really expect your idea to just sail through with no opposition, no counter-ideas, no reasons why it might not,
or would not, work?

Thanks to Chris for his suggestion. Ethan, please stay away from this thread.

Wow, you're rude.

Terry Reedy · Oct 5, 2013

I'm aware that would solve your issue. I'm also aware that Python
rarely does a 'return self' at the end of methods.

Not returning self is a basic design principle of Python since its
beginning. (I am not aware of any exceptions and would regard one as
possibly a mistake.) Guido is aware that not doing so prevents chaining
of mutation methods. He thinks it very important that people know and
remember the difference between a method that mutates self and one that
does not. Otherwise, one could write 'b = a.sort()' and not know
(remember) that b is just an alias for a. He must have seen this type of
error, especially in beginner code, in other languages before designing
Python.

Since that probably isn't going to change,

as it would only make things worse.

Note that some mutation methods also return something useful other than
default None. Examples are mylist.pop() and iterator.__next__ (usually
accessed by next(iterator))*. So it is impossible for all mutation
methods to just 'return self'.

* iterator.__next__ is a generalized specialization of list.pop. It can
only return the 'first' item, but can do so with any iterable, including
those that are not ordered and those that represent virtual rather than
concrete collections.
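
A short illustration of the distinction, using only builtins (the variable names are arbitrary):

```python
a = [3, 1, 2]

b = a.sort()            # mutates a in place; the method returns None
c = sorted([3, 1, 2])   # builds and returns a new list, input left alone

last = a.pop()          # mutates a AND returns something useful: the removed item
it = iter(a)
first = next(it)        # advances the iterator and returns the item it passed

print(b, a, c, last, first)  # None [1, 2] [1, 2, 3] 3 1
```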
