can't delete from a dictionary in a loop

Dan Upton · May 16, 2008

This might be more information than necessary, but it's the best way I
can think of to describe the question without being too vague.

The task:

I have a list of processes (well, strings to execute said processes)
and I want to, roughly, keep some number N running at a time. If one
terminates, I want to start the next one in the list, or otherwise,
just wait.

The attempted solution:

Using subprocess, I Popen the next executable in the list and store
it in a dictionary keyed on the pid:

(outside the loop)
    procs_dict = {}

(inside a while loop)
    process = Popen(benchmark_exstring[num_started], shell=True)
    procs_dict[process.pid] = process

Then I sleep for a while, then loop through the dictionary to see
what's terminated. For each one that has terminated, I decrement a
counter so I know how many to start next time, and then try to remove
the record from the dictionary (since there's no reason to keep
polling it since I know it's terminated). Roughly:

    for pid in procs_dict:
        if procs_dict[pid].poll() != None:
            # do the counter updates
            del procs_dict[pid]

The problem:

RuntimeError: dictionary changed size during iteration

So, the question is: is there a way around this? I know that I can
just /not/ delete from the dictionary and keep polling each time
around, but that seems sloppy and like it could keep lots of memory
around that I don't need, since presumably the dictionary holding a
reference to the Popen object means the garbage collector could never
reclaim it. Is the only reasonable solution to do something like
append all of those pids to a list, and then after I've iterated over
the dictionary, iterate over the list of pids to delete?

(Also, from the implementation side, is there a reason the dictionary
iterator can't deal with that? If I was deleting from in front of the
iterator, maybe, but since I'm deleting from behind it...)
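
The collect-then-delete idea described in the question can be sketched as below. This is a minimal Python 3 sketch under stated assumptions, not the poster's actual code: `FakeProcess` and `reap_finished` are hypothetical names, and `FakeProcess` only imitates the `poll()` method of a `subprocess.Popen` object so the example runs without starting real processes.

```python
class FakeProcess:
    """Hypothetical stand-in for subprocess.Popen; only poll() is imitated."""

    def __init__(self, returncode):
        # None means "still running"; an int means "terminated".
        self.returncode = returncode

    def poll(self):
        return self.returncode


def reap_finished(procs_dict):
    """Remove terminated processes from procs_dict and return their pids."""
    # First pass: only read the dictionary while iterating over it.
    finished = [pid for pid, proc in procs_dict.items()
                if proc.poll() is not None]
    # Second pass: iterate over the separate list, so deleting is safe.
    for pid in finished:
        del procs_dict[pid]
    return finished


procs_dict = {101: FakeProcess(0), 102: FakeProcess(None), 103: FakeProcess(1)}
finished_pids = reap_finished(procs_dict)
```

Because the dictionary is never resized while it is being iterated, the RuntimeError cannot occur, and `len(finished_pids)` says how many new processes could be started.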

Hans Nowak · May 16, 2008

Dan said:
    for pid in procs_dict:
        if procs_dict[pid].poll() != None:
            # do the counter updates
            del procs_dict[pid]

The problem:

RuntimeError: dictionary changed size during iteration

I don't know if the setup with the pids in a dictionary is the best way to
manage a pool of processes... I'll leave it to others, presumably more
knowledgeable, to comment on that.

But I can tell you how to solve the immediate problem:

    for pid in procs_dict.keys():
        ...

Hope this helps!

--Hans

bruno.desthuilliers · May 16, 2008

Hans Nowak said:

But I can tell you how to solve the immediate problem:

    for pid in procs_dict.keys():

I'm afraid this will do the same exact thing. A for loop on a dict
iterates over the dict keys, so both statements are strictly
equivalent from a practical POV.

bruno.desthuilliers · May 16, 2008

Hem. Forget it. I should think twice before posting - this will
obviously make a big difference here. Sorry for the noise.

Gary Herron · May 16, 2008

Hans Nowak said:

But I can tell you how to solve the immediate problem:

    for pid in procs_dict.keys():

No, keys() produces a list (which is what is wanted here).

It's iterkeys() that produces an iterator, which would reproduce the OP's
problem.

And then, in Python 3, keys() produces something else altogether (called a
view of the dictionary), which would provoke the same problem, so yet
another solution would have to be found then.

Gary Herron

Hans Nowak · May 16, 2008

bruno.desthuilliers said:

Hem. Forget it. I should think twice before posting - this will
obviously make a big difference here. Sorry for the noise.

It appears that you would be right if this was Python 3.0, though:

Python 3.0a5 (r30a5:62856, May 16 2008, 11:43:33)
[GCC 4.0.1 (Apple Computer, Inc. build 5367)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

>>> d = {1: 2, 3: 4, 5: 6}
>>> for i in d.keys(): del d[i]
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: dictionary changed size during iteration

Maybe 'for i in d' and 'for i in d.keys()' *are* functionally equivalent in 3.0,
as d.keys() returns an object that iterates over d's keys... but I haven't read
enough about it yet to be sure. In any case, the problem goes away when we
force a list:

>>> d = {1: 2, 3: 4, 5: 6}
>>> for i in list(d.keys()): del d[i]
...
>>> d
{}

--Hans

castironpi · May 16, 2008

You may be searching for:

    for i in d.keys()[:]:
        del d[i]

Eduardo O. Padoan · May 16, 2008

Gary Herron said:

No, keys() produces a list (which is what is wanted here).
It's iterkeys() that produces an iterator, which would reproduce the OP's
problem.

And then, in Python 3, keys() produces something else altogether (called a
view of the dictionary), which would provoke the same problem, so yet another
solution would have to be found then.

In Python 3.0, list(procs_dict.keys()) would have the same effect.
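
The difference between the live view and a list snapshot can be checked with a small script. This is a rough Python 3 sketch; the helper names `delete_all_via_view` and `delete_all_via_snapshot` are made up for illustration and are not part of any library.

```python
def delete_all_via_view(d):
    """Try to delete every key while iterating over the live keys() view.

    Returns the error message if Python refuses, otherwise None.
    """
    try:
        for key in d.keys():
            del d[key]
    except RuntimeError as exc:
        return str(exc)
    return None


def delete_all_via_snapshot(d):
    """Delete every key while iterating over a list copy of the keys."""
    for key in list(d.keys()):
        del d[key]


first = {1: 2, 3: 4, 5: 6}
error = delete_all_via_view(first)      # fails after the first deletion

second = {1: 2, 3: 4, 5: 6}
delete_all_via_snapshot(second)         # the list is unaffected by deletions
```

In Python 3, `list(d)` builds the same snapshot as `list(d.keys())`, since iterating a dictionary yields its keys.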

MRAB · May 17, 2008

Why do you need a counter? len(procs_dict) will tell you how many are
in the dictionary.

You can rebuild the dictionary, excluding those that are no longer
active, with:

    procs_dict = dict((pid, process) for pid, process in
                      procs_dict.iteritems() if process.poll() is None)

and then start N - len(procs_dict) new processes.
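
For readers on Python 3, where iteritems() no longer exists, the rebuild might look like the sketch below. It assumes that a process still counts as active while `poll()` returns None; `FakeProcess` and `prune_and_count` are hypothetical names used only so the example runs without real subprocesses.

```python
class FakeProcess:
    """Hypothetical stand-in for subprocess.Popen; only poll() is imitated."""

    def __init__(self, returncode):
        # None means "still running"; an int means "terminated".
        self.returncode = returncode

    def poll(self):
        return self.returncode


def prune_and_count(procs_dict, max_running):
    """Return (dict of still-running processes, number of free slots)."""
    # Build a new dictionary instead of deleting from the one being iterated.
    still_running = {pid: proc for pid, proc in procs_dict.items()
                     if proc.poll() is None}
    free_slots = max_running - len(still_running)
    return still_running, free_slots


procs_dict = {101: FakeProcess(0), 102: FakeProcess(None), 103: FakeProcess(1)}
procs_dict, to_start = prune_and_count(procs_dict, max_running=3)
```

Here `to_start` plays the role of N - len(procs_dict), so no separate counter is needed.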

George Sakkis · May 17, 2008

Since you don't look up processes by pid, you don't need a dictionary
here. A cleaner and more efficient solution is to use a deque: pop
processes from one end and push them onto the other if they are still
alive, something like this:

    import time
    from collections import deque

    processes = deque()
    # start processes and put them in the queue

    while processes:
        for i in xrange(len(processes)):
            p = processes.pop()
            if p.poll() is None:  # not finished yet
                processes.append_left(p)
        time.sleep(5)

HTH,
George
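
A runnable variant of the deque rotation, written for Python 3, could look like this. It is only a sketch under stated assumptions: `CountdownProcess` is a hypothetical stand-in whose `poll()` reports "running" a fixed number of times, `wait_for_all` and its `on_idle` hook are invented names, and the method used to push onto the left end is `deque.appendleft`.

```python
from collections import deque


class CountdownProcess:
    """Hypothetical stand-in: poll() returns None a fixed number of times."""

    def __init__(self, name, polls_until_done):
        self.name = name
        self.remaining = polls_until_done

    def poll(self):
        if self.remaining > 0:
            self.remaining -= 1
            return None   # still running
        return 0          # finished with exit status 0


def wait_for_all(processes, on_idle=lambda: None):
    """Rotate through the deque until every process has finished."""
    finished_order = []
    while processes:
        # Visit each queued process exactly once per round.
        for _ in range(len(processes)):
            p = processes.pop()            # take from the right end
            if p.poll() is None:
                processes.appendleft(p)    # still alive: requeue on the left
            else:
                finished_order.append(p.name)
        if processes:
            on_idle()  # a real loop might call time.sleep(5) here
    return finished_order


queue = deque([CountdownProcess("a", 2),
               CountdownProcess("b", 0),
               CountdownProcess("c", 1)])
order = wait_for_all(queue)
```

With real `subprocess.Popen` objects the same loop applies unchanged, since it relies only on `poll()`; passing `on_idle=lambda: time.sleep(5)` would reproduce the pause between rounds.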

castironpi · May 17, 2008

No underscore in appendleft: the deque method is appendleft(), not
append_left().
