heapq._siftdown / decrease-key support?

J

Joshua Bronson

I recently implemented A* search in Python using the heapq module and
in my first pass, to accomplish the decrease-key operation I resorted
to doing a linear scan of the list to find the position of the key in
the heap, and then calling the private undocumented method
heapq._siftdown at this position to enforce the heap property after
updating the key's priority[1].

Then I found Daniel Stutzbach's HeapDict[2], which looks like a whole
new package written just to support this operation (it also provides
some nice encapsulation to boot). So I switched[3]. The cost of this
improved implementation is slightly worse performance; I'm guessing
this is because more code is being run in Python now rather than in C.

Now I've just stumbled upon Mikael Lind's python-astar package[4],
which implements A* using heapq without needing to use _siftdown;
instead, if a better path to a neighbor is found, it marks the already-
found node on the worse path as invalid, pushes a new node onto the
heap with the improved priority, and at the end of each iteration,
pops off any invalid nodes from the top of the heap[5].

Before I rewrite my code another time to see what kind of performance
this results in, I figured I'd ask some interested parties first: Does
heapq not provide decrease-key (and increase-key) because it expects
you to work around it as Mikael has done, or for some other reason, or
is it planned for the future?

Thanks,
Josh

[1] http://bitbucket.org/jab/toys/src/7fe2965b275c/wordchain.py#cl-212
[2] http://github.com/DanielStutzbach/heapdict
[3] http://bitbucket.org/jab/toys/src/2ae894c91d08/wordchain.py#cl-218
[4] http://github.com/elemel/python-astar
[5] http://github.com/elemel/python-ast...7ee24968c6d90fc6b6957461dec/src/astar.py#L112
 
J

Joshua Bronson

Thanks for the responses!

Your guess is correct. Someday I'd like to rewrite HeapDict in C for speed, but I haven't been able to find the time (and no one has offered to pay me to make the time ;) ).

Daniel, did you realize you can accomplish logarithmic-time priority
change (albeit indirectly) with the built-in heapq module using this
mark-as-invalid trick? Are there other algorithms requiring decrease-
key besides A* and Dijkstra where this technique doesn't help?

If I recall correctly, avoiding the linear search did wonders for
algorithm complexity.

It's still possible to perform the decrease-key operation without
having to do a linear search; heapdict does it in logarithmic time. If
my cursory benchmarks were accurate (which upon reconsideration I'm
not actually sure they are), I think the only explanation for it being
possibly slower is that it's running in pure Python.

I just switched back to using heapq but now with the mark-as-invalid
trick instead of the linear scan, and the performance I'm observing is
very close to heapdict's. Which is curious, because now that both
operations are running in logarithmic time, I'd expect the heapq
implementation to be much faster since it's written in C. Maybe I have
some confounding variables in how I'm running my tests.

By the way, I just realized that the obvious reason the heapq module
doesn't support priority change is that its functions are designed to
operate on external lists rather than providing an encapsulating data
structure like heapdict. With heapdict, logarithmic decrease_key is
possible because it's also storing the positions of the elements in an
underlying dictionary, so instead of a linear scan it does a constant-
time lookup. My guess is that the simplicity of having heapq operate
on external lists is preferable to supporting decrease-key directly,
so it isn't likely to change any time soon, especially since you can
accomplish the same thing with the mark-as-invalid technique.

Raymond, do you think this technique is worth documenting in the heapq
module? It'd be too bad if any future users incorrectly think that it
won't meet their needs the way I did.
 
T

Terry Reedy

Thanks for the responses!



Daniel, did you realize you can accomplish logarithmic-time priority
change (albeit indirectly) with the built-in heapq module using this
mark-as-invalid trick? Are there other algorithms requiring decrease-
key besides A* and Dijkstra where this technique doesn't help?



It's still possible to perform the decrease-key operation without
having to do a linear search; heapdict does it in logarithmic time. If
my cursory benchmarks were accurate (which upon reconsideration I'm
not actually sure they are), I think the only explanation for it being
possibly slower is that it's running in pure Python.

I just switched back to using heapq but now with the mark-as-invalid
trick instead of the linear scan, and the performance I'm observing is
very close to heapdict's. Which is curious, because now that both
operations are running in logarithmic time, I'd expect the heapq
implementation to be much faster since it's written in C. Maybe I have
some confounding variables in how I'm running my tests.

By the way, I just realized that the obvious reason the heapq module
doesn't support priority change is that its functions are designed to
operate on external lists rather than providing an encapsulating data
structure like heapdict. With heapdict, logarithmic decrease_key is
possible because it's also storing the positions of the elements in an
underlying dictionary, so instead of a linear scan it does a constant-
time lookup.

If the values in the queue are associated with 0 to n-1, as can be used
for graph node identifiers , the positions can also be kept in an list,
again with constant time lookup.


My guess is that the simplicity of having heapq operate
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,995
Messages
2,570,228
Members
46,818
Latest member
SapanaCarpetStudio

Latest Threads

Top