Returning "nearest in document" matches using XPath

Nick Leverton

I have an application which attempts to describe a tree of TCP subnets,
which in essence are not fully accessible from each other. I have a
description of the network in XML as shown in the excerpt below.

The application is actually trying to optimise delivery of large files
to multiple destinations over expensive links, so it's not just a matter
of opening up firewalls and adding a bit of NATting. To avoid duplicated
transfers I need to know what is the nearest machine which leads onto
the ultimate destination for the file I am currently handling.

So for instance files destined for units 26 and 27 are first delivered
to node V9990 which then delivers them to V9991, to which 26 and 27
are directly attached. The distinction between nodes and units isn't
important for this part of the task. The ID attribute defines the
ultimate destination which I am trying to reach and each ID is unique,
so there is only one "nearest" IP address corresponding to each ID.

<?xml version="1.0"?>
<nodes>
  <node id="V9990" ip="1.1.1.1">
    <unit id="23" ip="10.10.10.10"/>
    <unit id="24" ip="10.10.10.11"/>
    <node id="V9991" ip="10.10.10.12">
      <unit id="26" ip="192.168.0.1"/>
      <unit id="27" ip="192.168.0.2"/>
    </node>
  </node>
  <node id="V9992" ip="2.2.2.2">
    <node id="V9993" ip="10.10.10.10">
      <unit id="21"/>
      <unit id="22"/>
    </node>
  </node>
</nodes>

To simplify network maintenance I would like to use the same config file
on all the "nodes", and to modify the XPath query with extra terms on
the sub-nodes. In other words, on the "root" machine a query for id=26
will return ip=1.1.1.1, but on node V9990 a query for id=26 will return
ip=10.10.10.12.
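The behaviour described above can be sketched procedurally as a cross-check. This is only a minimal Python sketch using the standard library's xml.etree.ElementTree, not XPath and not the Perl XML::XPath setup; the names nearest_ip and contains_id are made up for illustration. It walks down from a starting element and returns the ip of the first element below it that both carries an ip attribute and leads to the wanted id.

```python
import xml.etree.ElementTree as ET

# Same network description as in the post (XML declaration omitted).
NETWORK_XML = """
<nodes>
  <node id="V9990" ip="1.1.1.1">
    <unit id="23" ip="10.10.10.10"/>
    <unit id="24" ip="10.10.10.11"/>
    <node id="V9991" ip="10.10.10.12">
      <unit id="26" ip="192.168.0.1"/>
      <unit id="27" ip="192.168.0.2"/>
    </node>
  </node>
  <node id="V9992" ip="2.2.2.2">
    <node id="V9993" ip="10.10.10.10">
      <unit id="21"/>
      <unit id="22"/>
    </node>
  </node>
</nodes>
"""


def contains_id(elem, target_id):
    """True if elem itself or any of its descendants has id == target_id."""
    return any(e.get("id") == target_id for e in elem.iter())


def nearest_ip(start, target_id):
    """Hypothetical helper: ip of the topmost element strictly below
    `start` that has an ip attribute and leads to target_id, else None."""
    for child in start:
        if contains_id(child, target_id):
            if child.get("ip") is not None:
                return child.get("ip")
            # No ip at this level, so keep descending towards the target.
            return nearest_ip(child, target_id)
    return None


root = ET.fromstring(NETWORK_XML)
v9990 = root.find(".//*[@id='V9990']")

print(nearest_ip(root, "26"))   # 1.1.1.1
print(nearest_ip(v9990, "26"))  # 10.10.10.12
```

Starting one level lower (at V9990 instead of the document root) is what changes the answer, which mirrors the idea of adding an extra term to the query on the sub-nodes.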

In summary, what I want to do is to retrieve the nearest ip attribute
in the document which has a given id attribute as a descendant. I am
currently using the following XPath:

Querying from the root:
descendant-or-self::*[@ip and descendant-or-self::*[@id="26"]][last()]/@ip

I used descendant-or-self as the first term here rather than //*
because I don't want XPath to descend the doc and return all matches,
only the node which matches nearest the root of the XML document.

Querying from a sub-node:
//*[@id="V9990"]/*[@ip and descendant-or-self::*[@id="26"]][last()]/@ip

Here I establish a context node first and then work on that with
predicates.

First question - these two work, but are probably not ideal since I'm
not yet very familiar with XPath. In particular I don't understand
why I need to use the [last()] predicate rather than [1], as I thought the
descendant axis should work downwards in document order, not upwards.

Secondly, I now have a requirement to retrieve all the "nearest" ip
attributes for polling/reporting purposes. In other words, querying
from the root I would want to return 1.1.1.1 and 2.2.2.2. Or querying
from node V9990 I would want to return 10.10.10.10, 10.10.10.11 and
10.10.10.12. I don't mind about getting multiple instances of the same
attribute back as de-duping is simple. But I cannot figure out how to
arrange the predicates so as to return only the "topmost" ip attribute,
either for the root case or for the sub-context case.
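To pin down what "topmost" means here, the following is a minimal Python sketch of the intended result, again using xml.etree.ElementTree rather than XPath. The function name topmost_ips is made up for illustration. It collects the ip of every element below a starting point and stops descending as soon as an ip is found on a branch.

```python
import xml.etree.ElementTree as ET

# Same network description as in the post (XML declaration omitted).
NETWORK_XML = """
<nodes>
  <node id="V9990" ip="1.1.1.1">
    <unit id="23" ip="10.10.10.10"/>
    <unit id="24" ip="10.10.10.11"/>
    <node id="V9991" ip="10.10.10.12">
      <unit id="26" ip="192.168.0.1"/>
      <unit id="27" ip="192.168.0.2"/>
    </node>
  </node>
  <node id="V9992" ip="2.2.2.2">
    <node id="V9993" ip="10.10.10.10">
      <unit id="21"/>
      <unit id="22"/>
    </node>
  </node>
</nodes>
"""


def topmost_ips(start):
    """Hypothetical helper: ip attributes of the topmost elements below
    `start`, one per branch, in document order."""
    found = []
    for child in start:
        ip = child.get("ip")
        if ip is not None:
            # This branch is reachable here; do not look any deeper.
            found.append(ip)
        else:
            found.extend(topmost_ips(child))
    return found


root = ET.fromstring(NETWORK_XML)
v9990 = root.find(".//*[@id='V9990']")

print(topmost_ips(root))   # ['1.1.1.1', '2.2.2.2']
print(topmost_ips(v9990))  # ['10.10.10.10', '10.10.10.11', '10.10.10.12']
```

For the root case, a conforming XPath 1.0 engine would usually express the same selection as //*[@ip][not(ancestor::*[@ip])]/@ip, that is, every element with an ip that has no ancestor with an ip. This has not been checked against XML::XPath, so treat it as an assumption to verify there.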

Am I bending XPath a step too far here? I was hoping not to have to
introduce an extra processing step, but I am thinking maybe the sub-nodes
need to extract their "local" view of the network and work only
on that. Any advice would be very helpful.

I'm working in Perl with XML::XPath, in case it makes a difference.

Thank you

Nick
 
Dimitre Novatchev

What do you mean by "nearest"? Is this the geographical distance between two
nodes? I don't see this reflected in the XML document.

Cheers,
Dimitre Novatchev

 
Nick Leverton

Dimitre Novatchev said:
What do you mean by "nearest"? Is this the geographical distance b/n two
nodes? I dont see this reflected in the XML document.

No, sorry for being unclear. I mean that from the set of ip attributes
on the axis which contains both the root and the required id attribute:

/ ... @ip ... @ip ... @ip ... @id

I want to find the left-most one in the above diagram, nearest to the root
(or to another selected starting node in between the root and the required @id).
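The diagram above can be made concrete with a short Python sketch, assuming xml.etree.ElementTree and a made-up helper name ip_chain (neither is part of the original Perl setup). It lists every ip attribute on the path from the root down to the element carrying the wanted id, so the "left-most" one is simply the first entry.

```python
import xml.etree.ElementTree as ET

# Same network description as in the original post (declaration omitted).
NETWORK_XML = """
<nodes>
  <node id="V9990" ip="1.1.1.1">
    <unit id="23" ip="10.10.10.10"/>
    <unit id="24" ip="10.10.10.11"/>
    <node id="V9991" ip="10.10.10.12">
      <unit id="26" ip="192.168.0.1"/>
      <unit id="27" ip="192.168.0.2"/>
    </node>
  </node>
  <node id="V9992" ip="2.2.2.2">
    <node id="V9993" ip="10.10.10.10">
      <unit id="21"/>
      <unit id="22"/>
    </node>
  </node>
</nodes>
"""


def ip_chain(root, target_id):
    """Hypothetical helper: ip attributes on the path from the root down
    to the element with id == target_id, root-most first; [] if absent."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    current = next((e for e in root.iter() if e.get("id") == target_id), None)
    chain = []
    while current is not None:
        if current.get("ip") is not None:
            chain.append(current.get("ip"))
        current = parent_of.get(current)  # None once the root is passed
    chain.reverse()
    return chain


root = ET.fromstring(NETWORK_XML)
chain = ip_chain(root, "26")
print(chain)     # ['1.1.1.1', '10.10.10.12', '192.168.0.1']
print(chain[0])  # 1.1.1.1, the left-most @ip in the diagram
```

When starting from an intermediate node such as V9990, the wanted address would be the entry that follows that node's own ip in the chain.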

I can do this for single ids with the XPath query I posted, although I
don't fully understand the ordering I am getting. I can't figure out
how to make a satisfactory query which will return the set of leftmost
@ip for all the ids in the XML document.

Thanks for your interest. If I'm still not explaining clearly, please let
me know. I'm quite new to XML/XPath and don't always know the correct
way to describe what I want to do.

Nick
 
