XSLT XPATH uses node content in logical conditions???!?!

Jean-François Michaud · Sep 19, 2006

Who in the name of #%@! thought this one out??

I noticed this behavior when trying to debug a problem I was having.

As part of a sequence of instructions that transforms a CALS table
model into our own table model, I used this test:

<xsl:if test="self::node()=../CELL[1] and self::node()[@colname > 1]">...

to artificially create cells at the beginning of the row if their
colname is greater than 1. The algorithm works fine, but the expression
DID NOT in every case.

../CELL[1] actually refers to the content of CELL[1] instead of
referring to a unique identifier specific to the first child of the
parent. Who?!?! What??!?. Shouldn't the XPATH expression point to
unique cells within the XML tree as opposed to outputting the content?

A freak case emerged when of course, 2 different cells within the same
row had exactly the same content. I had to patch up the logical
condition so that it would uniquely identify the cells.

Shouldn't the distinction be made between the node itself and its
content? Or am I missing something?

Regards
Jean-Francois Michaud

Richard Tobin · Sep 19, 2006

Jean-François Michaud said:
../CELL[1] actually refers to the content of CELL[1] instead of
referring to a unique identifier specific to the first child of the
parent. Who?!?! What??!?.

This is, of course, what you usually want. It's much more common to
compare the content of nodes than to compare them for identity.

A freak case emerged when of course, 2 different cells within the same
row had exactly the same content.

That's not a freak case, it's perfectly normal.

Unfortunately there's no "natural" way to determine node identity in
XPath (at least, in XPath 1). Two unnatural ways are:

generate-id(node1) = generate-id(node2)

and (probably more efficient):

count(node1 | node2) = 1

-- Richard
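
Applied to the test from the original post, the two identity checks above might look like the sketch below. It assumes, as the original expression does, that the context node is a CELL whose siblings are CELL children of the same parent and that colname holds a number; the template itself is hypothetical, and only one of the two xsl:if forms would be needed in practice.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical template: assumes CELL elements are children of a row
       element and carry a numeric colname attribute, as in the original post. -->
  <xsl:template match="CELL">

    <!-- Identity test via generate-id(): true only when the context node
         is itself the first CELL child of its parent. -->
    <xsl:if test="generate-id(.) = generate-id(../CELL[1]) and @colname &gt; 1">
      <!-- create the leading cells here -->
    </xsl:if>

    <!-- Alternative identity test via union: the union of a node with
         itself holds one node, so the count is 1 only for the same node. -->
    <xsl:if test="count(. | ../CELL[1]) = 1 and @colname &gt; 1">
      <!-- create the leading cells here -->
    </xsl:if>

    <xsl:copy-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

Neither form looks at the text of the cells, so two cells with identical content in the same row no longer confuse the test.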

A. Bolmarcich · Sep 19, 2006

On 2006-09-19 said:
Shouldn't the distinction be made between the node itself and its
content? Or am I missing something?

You must have missed the first sentence of the fifth paragraph of the
"Booleans" section of the XPATH specification (see
://www.w3.org/TR/xpath#booleans): "If both objects to be compared are
node-sets, then the comparison will be true if and only if there is a
node in the first node-set and a node in the second node-set such that
the result of performing the comparison on the string-values of the
two nodes is true."
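
The quoted rule can be seen side by side with an identity test in a small stylesheet. This is only a sketch: the input document in the comment is made up for illustration and is not the original poster's data.

```xml
<?xml version="1.0"?>
<!-- Hypothetical input, assumed for illustration:
     <ROW><CELL colname="2">abc</CELL><CELL colname="3">abc</CELL></ROW> -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:for-each select="ROW/CELL">
      <!-- String-value comparison: true for any CELL whose text equals
           the text of the first CELL, including a later sibling. -->
      <xsl:value-of select="concat('by value: ', . = ../CELL[1])"/>
      <!-- Identity comparison: true only for the first CELL itself. -->
      <xsl:value-of select="concat(', by identity: ',
                            generate-id(.) = generate-id(../CELL[1]))"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

For the assumed input, the first line of output reads "by value: true, by identity: true" and the second "by value: true, by identity: false", which is exactly the duplicate-content case described in the original post.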

Joseph Kesselman · Sep 19, 2006

The problem is the definition of comparison of nodesets as comparison of
their values to see if any node in the set compares true. That's
actually a very useful behavior, but as you've just demonstrated it
isn't always the one you want.

To obtain a unique identifier for that node for identity-comparison
purposes, use the generate-id() function. In fact there's an explicit
example of this in XSLT spec, during their description of the document()
function, where they use it to illustrate that repeated retrieval of the
same URI yields the same actual nodes and not just the same data:
generate-id(document("foo.xml"))=generate-id(document("foo.xml"))

Joseph Kesselman · Sep 19, 2006

Richard said:
count(node1 | node2) = 1

Thanks; I'd forgotten that alternative.

Jean-François Michaud · Sep 19, 2006

Richard said:
Jean-François Michaud said:

../CELL[1] actually refers to the content of CELL[1] instead of
referring to a unique identifier specific to the first child of the
parent. Who?!?! What??!?.

This is, of course, what you usually want. It's much more common to
compare the content of nodes than to compare them for identity.

Hmmm. It seems upside down to me.

That's not a freak case, it's perfectly normal.

According to what I read of the spec it seems to be. It seems to me
though as if it should be the other way around. Or at least, I believe
it should be made explicit whether we are referencing the node itself
or its content, giving us the choice of dealing with structure or with
content without having to resort to esoteric programming. As it is,
the XPATH expression gives me the impression that we are talking about
structure as opposed to content; it so happens to be the other way
around: we get at content by talking about structure. Do XPATH
expressions in 'select' and 'match' behave the same way? If not, then
it seems we have inconsistent behavior: two XPATH expressions talking
about two different things, one about structure and the other about
content.

Unfortunately there's no "natural" way to determine node identity in
XPath (at least, in XPath 1). Two unnatural ways are:

generate-id(node1) = generate-id(node2)

and (probably more efficient):

count(node1 | node2) = 1

Thanks for the tip.

Regards
Jean-Francois Michaud

Richard Tobin · Sep 19, 2006

This is, of course, what you usually want. It's much more common to
compare the content of nodes than to compare them for identity.

Hmmm. It seems upside down to me.

XSLT is primarily a language for processing text. It's far more
common to, for example, find the elements whose "name" attribute is
equal to some other element's "name" attribute than it is to test
whether two nodes are the same node.

I wouldn't be surprised if you've written dozens of expressions that
assumed that node equality compared text, without even realising it.
Look at your other uses of the = operator where the arguments are
node sets.

According to what I read of the spec it seems to be. It seems to me
though as if it should be the other way around. Or at least, I believe
it should be made explicit whether we are referencing the node itself
or its content, giving us the choice of dealing with structure or with
content without having to resort to esoteric programming. As it is,
the XPATH expression gives me the impression that we are talking about
structure as opposed to content; it so happens to be the other way
around: we get at content by talking about structure. Do XPATH
expressions in 'select' and 'match' behave the same way? If not, then
it seems we have inconsistent behavior: two XPATH expressions talking
about two different things, one about structure and the other about
content.

Read the definition of the equality operator when the arguments are
node sets. It's perfectly well-defined and consistent.

-- Richard
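
The attribute case mentioned above (finding elements whose "name" attribute equals some other element's "name" attribute) might be written as in the sketch below. ITEM and REF are hypothetical element names chosen for illustration; nothing in the thread names them.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- ITEM and REF are hypothetical element names, assumed for illustration. -->
  <xsl:template match="/">
    <!-- Both operands of = are node-sets of attribute nodes. The predicate
         is true for an ITEM if its name attribute has the same string-value
         as the name attribute of at least one REF in the document. -->
    <xsl:for-each select="//ITEM[@name = //REF/@name]">
      <xsl:copy-of select="."/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

If = tested node identity instead, this predicate could never be true, since an attribute of an ITEM is never the same node as an attribute of a REF.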

Joe Kesselman · Sep 20, 2006

Richard said:
I wouldn't be surprised if you've written dozens of expressions that
assumed that node equality compared text, without even realising it.

Yep. All those instances of FOO[@BAR="something"], or FOO[@bar=

Jean-François Michaud · Sep 20, 2006

Joe said:
Richard said:

I wouldn't be surprised if you've written dozens of expressions that
assumed that node equality compared text

Yep. All those instances of FOO[@BAR="something"], or FOO[@bar=

Oh I have, but this still refers to a structural construct rather than
content.

It selects the FOO element that meets the predicate requirement of
having its BAR attribute equal "something". To me this is still clearly
a structural reference, but oh well.

[snip]

Regards
Jeff
