which data structure to use?

  • Thread starter Robert Voigtländer
  • Start date
R

Robert Voigtländer

Hi,

which would be the best data structure to use for the following case?

I have objects like this:

class Node(object):
def __init__(self, pos, parent, g , h):
self.pos = pos
self.parent = parent
self.g = g
self.h = h
self.f = g+h


I need to build a "list" of these objects. The amount is unknown.
On this list I need to regularly

1. check if a specific item - identified by Node.pos - is in the list.
2. find the object with the lowest Node.f attribute and update or remove it


What would be a good approach. Using a list I always need to traverse the whole list to do one of the above actions.

Thanks
Robert
 
C

Chris Angelico

1. check if a specific item - identified by Node.pos - is in the list.
2. find the object with the lowest Node.f attribute and update or remove it

Are both those values constant once the Node is added? If so, the
easiest way would be to maintain a dictionary mapping pos to the Node
(or to a list of Nodes, if you can have multiple with the same pos),
and probably heapq for the f values. But if they change, you'll have
to update both data structures. If they change often, it's probably
not worth maintaining index structures - just search for what you
want, when you want it. And if you're working with a small list (say,
less than a thousand items), performance isn't going to matter at all,
so just do whatever looks cleanest in your code - probably you have
that already.

ChrisA
 
R

Robert Voigtländer

On Tue, Jan 21, 2014 at 03:17:43AM -0800, Robert Voigtl�nder wrote:


First thanks to all who responded. Although I only partially understand your answers as I am a Python starter.
What's clear to me is the difference between object and instance of an object. Just didn't explain it well.

Maybe I give some more info on what I need / want to do. I will also provide a working code example. Should have done this before.

I would very much appreciate a working example of what you mean. Then I have a chance to understand it. :)

I would like to implement a class for a A* pathfinding algorithm. (there are ready libraries out there but I would like to learn it myself) This requires to maintain a list of nodes to be processed and nodes already processed.. For new nodes I need to check if they are on one of the lists. I also need to regularly pick the node with the lowest value f from the list.

Here some working code. For one function I sill need a solution. Any bettersolution is welcome.

Thanks
Robert

---------------
class Node:
def __init__(self, pos, parent, g , h):
self.pos = pos
self.parent = parent
self.g = g
self.h = h
self.f = g+h


def isinlist(nodeToSeatch):
for item in openlist:
if item.pos == nodeToSeatch: return True
return False


def lowestF():
lowestF = ''
for item in openlist:
if item.f < lowestF: lowestF = item
return lowestF

def deleteItemWithPos(pos):
## need this function or different approach
pass

openlist=[]
openlist.append(Node((1,1),None,1,5))
openlist.append(Node((1,2),(1,1),4,6))
openlist.append(Node((1,3),(1,2),9,10))

for item in openlist: print item.pos, item.f

print isinlist((1,1))
print isinlist((1,5))

nextNode = lowestF()
print nextNode.pos, nextNode.f
 
R

Robert Voigtländer

Am Dienstag, 21. Januar 2014 14:38:34 UTC+1 schrieb Robert Voigtländer:
First thanks to all who responded. Although I only partially understand your answers as I am a Python starter.

What's clear to me is the difference between object and instance of an object. Just didn't explain it well.



Maybe I give some more info on what I need / want to do. I will also provide a working code example. Should have done this before.



I would very much appreciate a working example of what you mean. Then I have a chance to understand it. :)



I would like to implement a class for a A* pathfinding algorithm. (there are ready libraries out there but I would like to learn it myself) This requires to maintain a list of nodes to be processed and nodes already processed. For new nodes I need to check if they are on one of the lists. I also need to regularly pick the node with the lowest value f from the list.



Here some working code. For one function I sill need a solution. Any better solution is welcome.



Thanks

Robert

Sorry - found a bug right after my post.
Here the corrected version.

class Node:
def __init__(self, pos, parent, g , h):
self.pos = pos
self.parent = parent
self.g = g
self.h = h
self.f = g+h


def isinlist(nodeToSeatch):
for item in openlist:
if item.pos == nodeToSeatch: return True
return False


def lowestF():
lowestF = ''
for item in openlist:
if item.f < lowestF:
lowestF = item.f
lowestFItem = item
return lowestFItem

def deleteItemWithPos(pos):
## need this function or different approach
pass

openlist=[]
openlist.append(Node((1,1),None,1,5))
openlist.append(Node((1,2),(1,1),4,6))
openlist.append(Node((1,3),(1,2),9,10))

for item in openlist: print item.pos, item.f

print isinlist((1,1))
print isinlist((1,5))

nextNode = lowestF()
print nextNode.pos, nextNode.f
 
O

Oscar Benjamin

First thanks to all who responded. Although I only partially understand your answers as I am a Python starter.
What's clear to me is the difference between object and instance of an object. Just didn't explain it well.

Maybe I give some more info on what I need / want to do. I will also provide a working code example. Should have done this before.

I would very much appreciate a working example of what you mean. Then I have a chance to understand it. :)

I would like to implement a class for a A* pathfinding algorithm. (there are ready libraries out there but I would like to learn it myself) This requires to maintain a list of nodes to be processed and nodes already processed. For new nodes I need to check if they are on one of the lists. I also need to regularly pick the node with the lowest value f from the list.

Here some working code. For one function I sill need a solution. Any better solution is welcome.

Thanks
Robert

---------------
class Node:
def __init__(self, pos, parent, g , h):
self.pos = pos
self.parent = parent
self.g = g
self.h = h
self.f = g+h


def isinlist(nodeToSeatch):
for item in openlist:
if item.pos == nodeToSeatch: return True
return False


def lowestF():
lowestF = ''
for item in openlist:
if item.f < lowestF: lowestF = item
return lowestF

The function above is incorrect. I think it should be:

def lowestF():
lowestF = ''
for item in openlist:
if item.f < lowestF:
lowestF = item.f # Note the .f here
return lowestF
def deleteItemWithPos(pos):
## need this function or different approach
pass

def deleteItemWithPos(pos):
for n, item in enumerate(openlist):
if item.pos == pos:
del openlist[n]
return
else:
raise RuntimeError('Not in the list!')

openlist=[]
openlist.append(Node((1,1),None,1,5))
openlist.append(Node((1,2),(1,1),4,6))
openlist.append(Node((1,3),(1,2),9,10))

for item in openlist: print item.pos, item.f

print isinlist((1,1))
print isinlist((1,5))

nextNode = lowestF()
print nextNode.pos, nextNode.f

My first suggestion would be to use a dict instead of a list:


class Node:
def __init__(self, pos, parent, g , h):
self.pos = pos
self.parent = parent
self.g = g
self.h = h
self.f = g+h

def isinlist(pos):
return pos in opendict

def lowestF():
return min(opendict.values(), key=lambda x: x.f)

def deleteItemWithPos(pos):
del opendict[pos]

nodes = [
Node((1,1),None,1,5),
Node((1,2),(1,1),4,6),
Node((1,3),(1,2),9,10),
]

opendict = {}
for node in nodes:
opendict[node.pos] = node

for item in opendict.values():
print item.pos, item.f

print isinlist((1,1))
print isinlist((1,5))

nextNode = lowestF()
print nextNode.pos, nextNode.f


The above is more efficient and simpler. It is still O(N) for the lowestF()
function. Changing data structure could make that more efficient.


Oscar
 
P

Peter Otten

Robert said:
First thanks to all who responded. Although I only partially understand
your answers as I am a Python starter. What's clear to me is the
difference between object and instance of an object. Just didn't explain
it well.

Maybe I give some more info on what I need / want to do. I will also
provide a working code example. Should have done this before.

I would very much appreciate a working example of what you mean. Then I
have a chance to understand it. :)

I would like to implement a class for a A* pathfinding algorithm. (there
are ready libraries out there but I would like to learn it myself) This
requires to maintain a list of nodes to be processed and nodes already
processed. For new nodes I need to check if they are on one of the lists.
I also need to regularly pick the node with the lowest value f from the
list.

Here some working code. For one function I sill need a solution. Any
better solution is welcome.

Thanks
Robert

---------------
class Node:
def __init__(self, pos, parent, g , h):
self.pos = pos
self.parent = parent
self.g = g
self.h = h
self.f = g+h


def isinlist(nodeToSeatch):
for item in openlist:
if item.pos == nodeToSeatch: return True
return False


def lowestF():
lowestF = ''
for item in openlist:
if item.f < lowestF: lowestF = item
return lowestF

def lowestF():
return min(openlist, key=operator.attrgetter("f"))

def deleteItemWithPos(pos):
## need this function or different approach
pass

openlist=[]
openlist.append(Node((1,1),None,1,5))
openlist.append(Node((1,2),(1,1),4,6))
openlist.append(Node((1,3),(1,2),9,10))

for item in openlist: print item.pos, item.f

print isinlist((1,1))
print isinlist((1,5))

nextNode = lowestF()
print nextNode.pos, nextNode.f

Here is an OO implementation of Chris Angelico's suggestion:

import heapq

class Node:
def __init__(self, pos, parent, g , h):
self.pos = pos
self.parent = parent
self.g = g
self.h = h
self.f = g+h
def __str__(self):
return "Node(pos={!r}, f={!r})".format(self.pos, self.f)

class Nodes():
def __init__(self):
self.lookup = {}
self.heap = []
def add(self, node):
self.lookup[node.pos] = node
heapq.heappush(self.heap, (node.f, node))
def __iter__(self):
return iter(self.lookup.values())
def __contains__(self, pos):
return pos in self.lookup
def lowest(self):
return self.heap[0][1]
def pop(self):
f, node = heapq.heappop()
del lookup[node.pos]
return node

nodes = Nodes()
nodes.add(Node((1,1), None, 1, 5))
nodes.add(Node((1,2), (1,1), 4, 6))
nodes.add(Node((1,3), (1,2), 9, 10))

for node in nodes:
print(node)

print((1,1) in nodes)
print((1,5) in nodes)

print(nodes.lowest())
 
P

Peter Otten

Peter said:
def pop(self):
f, node = heapq.heappop()
del lookup[node.pos]
return node

That should be

def pop(self):
f, node = heapq.heappop(self.heap)
del self.lookup[node.pos]
return node
 
R

Robert Voigtländer

Am Dienstag, 21. Januar 2014 15:19:54 UTC+1 schrieb Peter Otten:
Peter Otten wrote:


def pop(self):
f, node = heapq.heappop()
del lookup[node.pos]
return node



That should be



def pop(self):

f, node = heapq.heappop(self.heap)

del self.lookup[node.pos]

return node

Hi Peter,

this works great. I will try to find some info about the functionality you used and to understnad what you did.
Maybe you can add a function to remove a node?

Thanks for your support and all tthe swift answers.

Robert
 
R

Robert Voigtländer

def pop(self):
f, node = heapq.heappop()
del lookup[node.pos]
return node
That should be
def pop(self):
f, node = heapq.heappop(self.heap)
del self.lookup[node.pos]
return node

Hi Peter,
this works great. I will try to find some info about the functionality you used and to understnad what you did.
Maybe you can add a function to remove a node?
Thanks for your support and all tthe swift answers.

Robert

Just realized what the pop function is for. I changed it from:
def pop(self):
to
def pop(self,pos):

and it works.

Now I go on with trying to understand it.
 
M

Mark Lawrence

On 21/01/2014 13:43, Robert Voigtländer wrote:

[double spaced google disease snipped]

I'm pleased to see the regular contributors helping out as usual. In
response would you please be kind enough to read and action this
https://wiki.python.org/moin/GoogleGroupsPython to prevent us seeing the
double line spacing that google inserts, thanks.
 
P

Peter Otten

Robert said:
def pop(self):
f, node = heapq.heappop()
del lookup[node.pos]
return node
That should be
def pop(self):
f, node = heapq.heappop(self.heap)
del self.lookup[node.pos]
return node

Hi Peter,
this works great. I will try to find some info about the functionality
you used and to understnad what you did. Maybe you can add a function to
remove a node? Thanks for your support and all tthe swift answers.

Robert

Just realized what the pop function is for. I changed it from:
def pop(self):
to
def pop(self,pos):

and it works.

Unlikely. Are you sure that .heap and .lookup contents are still in sync
with your modification?
Now I go on with trying to understand it.

The pop() method as posted can only remove the "lowest" node. If contrary to
your initial spec
2. find the object with the lowest Node.f attribute and update or remove
it

you want to remove arbitrary nodes a heap may not be the appropriate data
structure. The modified

import operator

#...

class Nodes():
def __init__(self):
self.lookup = {}
def add(self, node):
self.lookup[node.pos] = node
def __iter__(self):
return iter(self.lookup.itervalues())
def __contains__(self, pos):
return pos in self.lookup
def lowest(self):
return min(self, key=operator.attrgetter("f"))
def __delitem__(self, pos):
del self.lookup[pos]

nodes = Nodes()

nodes.add(Node((1,1), None, 1, 5))
nodes.add(Node((1,2), (1,1), 4, 6))
nodes.add(Node((1,3), (1,2), 9, 10))

del nodes[1, 2]


allows you to delete random nodes, but the lowest() method will slow down as
it has to iterate over all dict values.

One important caveat: both Nodes classes lead to data corruption if you
modify a .pos attribute while the node is in a Nodes instance. The same goes
for the first implementation and the .f attribute.
 
R

Robert Voigtländer

Unlikely. Are you sure that .heap and .lookup contents are still in sync
with your modification?
No it's not. Atfer having read about heapq it's clear why.
Thanks for the hint.

allows you to delete random nodes, but the lowest() method will slow down as
it has to iterate over all dict values.

Thanks a lot for the alternative class. I will go with two lists, each using one of the aproaches. In one list I always need to have only hte object with lowest .f.
With the other list I may have to delete random nodes.

So I can make perfect use of both classes.

One important caveat: both Nodes classes lead to data corruption if you
modify a .pos attribute while the node is in a Nodes instance. The same goes
for the first implementation and the .f attribute.

That's fine. pos is static and identifies the node.


Thanks again to all who responded. I learned a lot.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,999
Messages
2,570,246
Members
46,839
Latest member
MartinaBur

Latest Threads

Top