Representing Undirected Edges (advice, please)

G

Gavin Kistner

--Apple-Mail-2--176720861
Content-Transfer-Encoding: 7bit
Content-Type: text/plain;
charset=US-ASCII;
delsp=yes;
format=flowed

Summary
==================================
How do you represent and handle bidirectional links between items,
given that you will necessarily need two distinct variables for each
end? How then do you handle a directed traversal of those links?

(Sorry for the length of this email; the concept is simple but I
wanted to be clear about the trouble I'm having.)


Background
==================================
I got all caught up in the quiz I gave last week, researching graph/
network theory. As a result, I'm writing a generic library for
finding paths. More on this library when I get closer to finishing.

One thing that's causing me a bit of trouble is how to represent non-
directed edges between nodes. Let me explain the concept for those
not familiar with the terminology, and then the problem.

A non-directed edge between nodes is a fancy way of saying "ObjectA
links to ObjectB, and vice versa". So while a directed edge would be:
ObjectA -------------> ObjectB or ObjectA
<------------- ObjectB
an undirected edge is simply:
ObjectA <------------> ObjectB

Edges also may have other information associated with them (such as a
'weight' for the edge), so I currently have them pulled out into
their own class:

class Edge
attr_accessor :start_node, :end_node, :weight

def initialize( start_node, end_node, directed=false,
weight=nil )
@start_node, @end_node, @weight, @directed = start_node,
end_node, weight, directed
end

def directed?
@directed
end
end

To store both ends, I need two variables, and here's where the
trouble starts creeping in. In a directed edge it's necessary to
distinguish which node is the start and which is the end, but in an
undirected edge I want the class to be agnostic about this. So, for
example, I now have to redefine the == method to account for non-
directionality:

def ==( other_edge )
( !@directed == !other_edge.directed? )
and
( @weight == other_edge.weight )
and
(
( @start_node == other_edge.start_node && @end_node
== edge.end_node )
or
( !@directed && @start_node == other_edge.end_node
&& @end_node == edge.start_node )
)
end

So far I'm still staying afloat.


The Problem
==================================
Any path that links a few nodes together should be able to be
represented by its edges. So, for example, the path from 'a' to 'd'
a <-----> b <-----> c <-----> d
is represented by the ordered list of three edges:
<Edge @start_node=a @end_node=b>
<Edge @start_node=b @end_node=c>
<Edge @start_node=c @end_node=d>

If I want to recreate the list of nodes visited, I can just loop
through the edges and pick out the nodes. ... Except I can't, because
if these are non-directional edges, the list might look like:
<Edge @start_node=b @end_node=a>
<Edge @start_node=b @end_node=c>
<Edge @start_node=d @end_node=c>

* I can't just 'flip' the edges while creating the path, because the
same graph might be crawled multiple times to create multiple paths.

* I could use code to determine the correct node order (figuring out
which ends touch which) but that will be slightly messy and slower.
(Premature optimization is the root of all evil, but you gotta strive
for a good design at least.)

* I could store both the edge list AND the node list for a path, but
that's not very DRY or normalized.

* I could create a subclass of Edge which is a Link (since
essentially the problem is that I'm using undirected information to
represent directed information), but then I'd be duplicating
information. (Later changes to the weight of an edge would not be
reflected in the Link used in a Path.)

* Perhaps I could create a Link class which references edges and
expresses directionality. (This thought just occurred to me.)


I know I've hit this problem a few times before in programming, and
have never been pleased with any of my solutions.

Thanks for any insights/opinions! :)



--
"When I am working on a problem I never think about beauty. I only
think about how to solve the problem. But when I have finished, if
the solution is not beautiful, I know it is wrong."
- R. Buckminster Fuller


--Apple-Mail-2--176720861--
 
G

Gavin Kistner

--Apple-Mail-1--170645187
Content-Transfer-Encoding: 7bit
Content-Type: text/plain;
charset=US-ASCII;
delsp=yes;
format=flowed

Thanks for the prompt reply! Response inline.


You never explicitly write what is the problem here. It's very hard
to suggest solutions around it.

Sorry, you're right. I sort of stated it in the summary. It's not so
much a "how can I accomplish this?" question as a "what is the 'best'
way to solve it?"


Then list [a, b, c, d] would be right. One can deduct nodes_on_path
list from non_directional_edges_on_path if there are no loops,
nodes are not visited twice or other fancy issues.

I think even if there are loops and revisitations it can be deduces,
as long as non_directional_edges_on_path is ordered.

If your definition of path is "a list of successive nodes connected
by non-directional edges, so that one can visit all nodes or edges
or both in order they appear in path" then your list could be of form

[a, <Edge b a>, b, <Edge b c>, c, <Edge d c>, d]

or

[[a,b,c,d], [<Edge b a>, <Edge b c>, <Edge d c>]

or something else.

So by stating problem clearly you probably have your answer in it
already.

* I could store both the edge list AND the node list for a path,
but that's not very DRY or normalized.

The two latter forms I've written are not repeating oneself. But if
you're strongly against that you can verify you _must repeat
yourself_ if the solution requires it.

Correct, they're not really a violation of DRY, but if the node list
can be deduced from the ordered edge list (which it can) then it _is_
non-normalized; if my code somehow screws up and changes either the
node list or the edge list without properly updating the other, I
have an inconsistent state stored.

1) If you store just nodes_on_path, then you can find which edge
has both ends and recover edges_on_path. Except it doesn't work
with aforementioned special cases or even with multiple edges
between same two nodes.

Aye - the multiple-edges-between-the-same-nodes case is one that I
allow and that would mess this case up.

Where's the repetition?

I suppose it's trivial, but it's the extra data structure and two
more pointers to the 'real' start and end. I think it's becoming
apparent that I *am* succumbing to premature optimization. :)

If there's beef in here, feel free to forward to the list. If not,
I don't mind :).

Thanks, I will just for the shared knowledge.

(Original email follows in full for the record.)


Begin forwarded message:
From: Aleksi <[email protected]>
Date: May 8, 2005 11:32:45 AM MDT
To: Gavin Kistner <[email protected]>
Subject: Re: Representing Undirected Edges (advice, please)


Gavin said:
The Problem
==================================
Any path that links a few nodes together should be able to be
represented by its edges. So, for example, the path from 'a' to 'd'
a <-----> b <-----> c <-----> d
is represented by the ordered list of three edges:
<Edge @start_node=a @end_node=b>
<Edge @start_node=b @end_node=c>
<Edge @start_node=c @end_node=d>
If I want to recreate the list of nodes visited, I can just loop
through the edges and pick out the nodes. ... Except I can't,
because if these are non-directional edges, the list might look
like:
<Edge @start_node=b @end_node=a>
<Edge @start_node=b @end_node=c>
<Edge @start_node=d @end_node=c>

You never explicitly write what is the problem here. It's very hard
to suggest solutions around it.

For example, the latter list _is_ correct path through the graph,
that is a list of (non-directional) edges that you progress
through. That means that edge[1].node1 or node2 must be same as
either edge[0].node1 or node2. And this condition is fulfilled with
each step.

If you want to use the path to state which were the nodes that were
on the path, then you're using _different definition_ what is the
path. Then list [a, b, c, d] would be right. One can deduct
nodes_on_path list from non_directional_edges_on_path if there are
no loops, nodes are not visited twice or other fancy issues.

If your definition of path is "a list of successive nodes connected
by non-directional edges, so that one can visit all nodes or edges
or both in order they appear in path" then your list could be of form

[a, <Edge b a>, b, <Edge b c>, c, <Edge d c>, d]

or

[[a,b,c,d], [<Edge b a>, <Edge b c>, <Edge d c>]

or something else.

So by stating problem clearly you probably have your answer in it
already.

* I could store both the edge list AND the node list for a path,
but that's not very DRY or normalized.

The two latter forms I've written are not repeating oneself. But if
you're strongly against that you can verify you _must repeat
yourself_ if the solution requires it.

1) If you store just nodes_on_path, then you can find which edge
has both ends and recover edges_on_path. Except it doesn't work
with aforementioned special cases or even with multiple edges
between same two nodes.

2) If you store just edges_on_path, you can't find correct order in
case of loop (is it a->b->a->b->c or b->a->b->a->c).

If you store both it might be easier to see why it's not repeating
information by giving all things on graph an index. All nodes run
1..n (a=1, b=2, c=3, d=4) and all edges n+1..m (b-a=5, b-c=6, d-
c=7). Now you can make path as an indexing array:

path = [1, 6, 2, 6, 3, 7, 4]

and edges_on_path = path.find_all? {|i| i <= 4}, and
nodes_on_path = path.find_all? {|i| i => 5}.

Where's the repetition?

If there's beef in here, feel free to forward to the list. If not,
I don't mind :).

- Aleksi
 
P

Phil Tomson

--Apple-Mail-2--176720861
Content-Transfer-Encoding: 7bit
Content-Type: text/plain;
charset=US-ASCII;
delsp=yes;
format=flowed

Summary
==================================
How do you represent and handle bidirectional links between items,
given that you will necessarily need two distinct variables for each
end? How then do you handle a directed traversal of those links?

(Sorry for the length of this email; the concept is simple but I
wanted to be clear about the trouble I'm having.)


Background
==================================
I got all caught up in the quiz I gave last week, researching graph/
network theory. As a result, I'm writing a generic library for
finding paths. More on this library when I get closer to finishing.

One thing that's causing me a bit of trouble is how to represent non-
directed edges between nodes. Let me explain the concept for those
not familiar with the terminology, and then the problem.

A non-directed edge between nodes is a fancy way of saying "ObjectA
links to ObjectB, and vice versa". So while a directed edge would be:
ObjectA -------------> ObjectB or ObjectA
<------------- ObjectB
an undirected edge is simply:
ObjectA <------------> ObjectB

Edges also may have other information associated with them (such as a
'weight' for the edge), so I currently have them pulled out into
their own class:

class Edge
attr_accessor :start_node, :end_node, :weight

def initialize( start_node, end_node, directed=false,
weight=nil )
@start_node, @end_node, @weight, @directed = start_node,
end_node, weight, directed
end

def directed?
@directed
end
end

To store both ends, I need two variables, and here's where the
trouble starts creeping in. In a directed edge it's necessary to
distinguish which node is the start and which is the end, but in an
undirected edge I want the class to be agnostic about this. So, for
example, I now have to redefine the == method to account for non-
directionality:

def ==( other_edge )
( !@directed == !other_edge.directed? )
and
( @weight == other_edge.weight )
and
(
( @start_node == other_edge.start_node && @end_node
== edge.end_node )
or
( !@directed && @start_node == other_edge.end_node
&& @end_node == edge.start_node )
)
end

The thing that strikes me here is that you probably want to have an
base edge class (or maybe actually a Mixin module to put it in more
Ruby-ish terms) that would define the basic functionality of an edge and
then have undirected and directed edge classes that derive from it.

In my experience, you either have a directed graph or an undirected graph
- I've never run into a situation where a graph can have both directed
and undirected edges (I suppose it's possible, and maybe that's what
you're trying to plan for, but in my experience in working with graphs
they're either one or another).

Seperating your directed graphs and undirected graphs could greatly
simplify things.


So far I'm still staying afloat.


The Problem
==================================
Any path that links a few nodes together should be able to be
represented by its edges. So, for example, the path from 'a' to 'd'
a <-----> b <-----> c <-----> d
is represented by the ordered list of three edges:
<Edge @start_node=a @end_node=b>
<Edge @start_node=b @end_node=c>
<Edge @start_node=c @end_node=d>

If I want to recreate the list of nodes visited, I can just loop
through the edges and pick out the nodes. ... Except I can't, because
if these are non-directional edges, the list might look like:
<Edge @start_node=b @end_node=a>
<Edge @start_node=b @end_node=c>
<Edge @start_node=d @end_node=c>

* I can't just 'flip' the edges while creating the path, because the
same graph might be crawled multiple times to create multiple paths.

* I could use code to determine the correct node order (figuring out
which ends touch which) but that will be slightly messy and slower.
(Premature optimization is the root of all evil, but you gotta strive
for a good design at least.)

* I could store both the edge list AND the node list for a path, but
that's not very DRY or normalized.

* I could create a subclass of Edge which is a Link (since
essentially the problem is that I'm using undirected information to
represent directed information), but then I'd be duplicating
information. (Later changes to the weight of an edge would not be
reflected in the Link used in a Path.)

* Perhaps I could create a Link class which references edges and
expresses directionality. (This thought just occurred to me.)


I know I've hit this problem a few times before in programming, and
have never been pleased with any of my solutions.

Thanks for any insights/opinions! :)

One idea:
Graphs are often represented by hashes of lists, like so:

graph = {
a => [b,c,d],
b => [a,e],
c => [a,e],
d => [a],
e => [b,c]
}

So, for example, a connects with nodes b,c and d. You can find all the
nodes that a connects to by iterating through the list of graph[a]. You
can then traverse further down in the graph by taking every node in a's
list and recursively iterating through all of their children.

If you use a Hash for representing the graph, you really don't need an
Edge class. You can then have a DirectedGraph class and an
UndirectedGraph class each of which contains the connection hash - how
you traverse through the hash will be a bit different depending on
whether it's directed or not. The hash of lists representation is simple
and compact.

It can, however be useful to have an Edge class (for weighted edges, as
you point out).

Take a look at some other OO graph class libraries out there such as the
Boost Graph library. There are lots of them out there and different ones
emphasize different things: some emphasize the nodes (as in the hash
representation) while others emphasize the edges (each has it's
advantages/disadvantages).

Phil
 
A

Ara.T.Howard

The Problem
==================================
Any path that links a few nodes together should be able to be
represented by its edges. So, for example, the path from 'a' to 'd'
a <-----> b <-----> c <-----> d

i think you must also add the concept of direction to path. for example, if i
tell you walk around my block on a particular path i'd say

- start at my house and go NORTH
- when you hit the corner turn WEST
- when you hit the next corner turn SOUTH
- when you hit the corner turn EAST
- when you hit the corner turn NORTH and continue until you see my house

so a Path might be represented as someting like

class Edge
def <=> other
raise NotConnectedError unless connected_to? other or parallel_to? other
case other.start_node
when end_node
-1
when start_node
0
else
1
end
end
def connected_to? other
start_node == other.end or other.end_node == start_node
end
def parallel_to? other
start_node == other.start_node and end_node == other.end_node
end
end

class Path
def initialize
@edges = []
end
def add_edge edge
@edges << edge
@edges.sort!
end
end

in otherwords a Path is a set of ordered (directed) edges
is represented by the ordered list of three edges:
<Edge @start_node=a @end_node=b>
<Edge @start_node=b @end_node=c>
<Edge @start_node=c @end_node=d>

so this is a path

If I want to recreate the list of nodes visited, I can just loop
through the edges and pick out the nodes. ... Except I can't, because
if these are non-directional edges, the list might look like:
<Edge @start_node=b @end_node=a>
<Edge @start_node=b @end_node=c>
<Edge @start_node=d @end_node=c>

but this is not - there is no way described here to get from a to c

in summaray: i think adding a Path class, where paths are lists of connected
(or parallel) edges may help abstract the problem. then again maybe not ;-)
how does the RGL handle this?

cheers.

-a
--
===============================================================================
| email :: ara [dot] t [dot] howard [at] noaa [dot] gov
| phone :: 303.497.6469
| renunciation is not getting rid of the things of this world, but accepting
| that they pass away. --aitken roshi
===============================================================================
 
H

Horst Duchene

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,995
Messages
2,570,230
Members
46,817
Latest member
DicWeils

Latest Threads

Top