XSLT XPATH uses node content in logical conditions???!?!

  • Thread starter Jean-François Michaud

Jean-François Michaud

Who in the name of #%@! thought this one out??

I noticed this behavior when trying to debug a problem I was having.

I was using the following test, together with some other XPATH, in a
sequence of instructions that transforms a CALS table model into our
own table model:

<xsl:if test="self::node()=../CELL[1] and self::node()[@colname > 1]">...

to artificially create cells at the beginning of the row if their
colname is greater than 1. The algorithm works fine, but the expression
DID NOT in every case.

../CELL[1] actually refers to the content of CELL[1] instead of
referring to a unique identifier specific to the first child of the
parent. Who?!?! What??!? Shouldn't the XPATH expression point to
unique cells within the XML tree as opposed to outputting the content?

A freak case emerged when of course, 2 different cells within the same
row had exactly the same content. I had to patch up the logical
condition so that it would uniquely identify the cells.

Shouldn't the distinction be made between the node itself and its
content? Or am I missing something?

Regards
Jean-Francois Michaud
 
Richard Tobin

Jean-François Michaud said:
../CELL[1] actually refers to the content of CELL[1] instead of
referring to a unique identifier specific to the first child of the
parent. Who?!?! What??!?.

This is, of course, what you usually want. It's much more common to
compare the content of nodes than to compare them for identity.

Jean-François Michaud said:
A freak case emerged when of course, 2 different cells within the same
row had exactly the same content.

That's not a freak case, it's perfectly normal.

Unfortunately there's no "natural" way to determine node identity in
XPath (at least, in XPath 1). Two unnatural ways are:

generate-id(node1) = generate-id(node2)

and (probably more efficient):

count(node1 | node2) = 1
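
To make the difference concrete, here is a minimal XSLT 1.0 sketch that runs the plain value comparison and both identity idioms side by side. The ROW/CELL input shown in the comment is a hypothetical example, not the original poster's data, and the output labels are made up for illustration.

```xml
<?xml version="1.0"?>
<!-- Sketch: value test vs. identity test on CELL elements (XSLT 1.0).
     Hypothetical input:
       <ROW><CELL colname="2">x</CELL><CELL colname="3">x</CELL></ROW> -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="CELL">
    <!-- Value comparison: true for any CELL whose string-value equals
         that of the first CELL, so both cells in the sample match. -->
    <xsl:if test=". = ../CELL[1]">value-match </xsl:if>

    <!-- Identity via generate-id(): true only for the first CELL itself. -->
    <xsl:if test="generate-id(.) = generate-id(../CELL[1])">id-match </xsl:if>

    <!-- Identity via union: a node-set never holds the same node twice,
         so the union has one member only when both operands are the
         same node. -->
    <xsl:if test="count(. | ../CELL[1]) = 1">union-match </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

With the sample input, the first CELL passes all three tests, while the second CELL passes only the value test, which is the situation described in the original post. Note that the union form relies on both operands being non-empty; here the context node is itself a CELL, so ../CELL[1] always selects something.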

-- Richard
 
A. Bolmarcich

On 2006-09-19, Jean-François Michaud said:
Shouldn't the distinction be made between the node itself and its
content? Or am I missing something?

You must have missed the first sentence of the fifth paragraph of the
"Booleans" section of the XPATH specification (see
://www.w3.org/TR/xpath#booleans): "If both objects to be compared are
node-sets, then the comparison will be true if and only if there is a
node in the first node-set and a node in the second node-set such that
the result of performing the comparison on the string-values of the
two nodes is true."
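
One consequence of the "there is a node ... and a node ..." wording quoted above is that = and != on node-sets are not opposites. The following XSLT 1.0 sketch illustrates this; the ROW/CELL input in the comment and the output labels are hypothetical.

```xml
<?xml version="1.0"?>
<!-- Sketch of the node-set comparison rule (XSLT 1.0).
     Hypothetical input: <ROW><CELL>a</CELL><CELL>b</CELL></ROW> -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="ROW">
    <!-- True: some CELL has the same string-value as some CELL. -->
    <xsl:if test="CELL = CELL">equal </xsl:if>

    <!-- Also true: some pair of CELLs (a and b) has different values. -->
    <xsl:if test="CELL != CELL">not-equal </xsl:if>

    <!-- False: this asks whether no pair at all compares equal. -->
    <xsl:if test="not(CELL = CELL)">none-equal </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Both of the first two tests succeed on the sample input, because each one only needs a single pair of nodes that satisfies it. That is why not(a = b) is usually the safer way to say "no match" than a != b when either side can hold more than one node.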
 
Joseph Kesselman

The problem is the definition of comparison of nodesets as comparison of
their values to see if any node in the set compares true. That's
actually a very useful behavior, but as you've just demonstrated it
isn't always the one you want.

To obtain a unique identifier for that node for identity-comparison
purposes, use the generate-id() function. In fact there's an explicit
example of this in XSLT spec, during their description of the document()
function, where they use it to illustrate that repeated retrieval of the
same URI yields the same actual nodes and not just the same data:
generate-id(document("foo.xml"))=generate-id(document("foo.xml"))
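
Applied to the test from the original post, the generate-id() approach might look like the sketch below. It assumes CELL elements carrying a numeric colname attribute, as the original post describes; the template as a whole and the PAD output element are hypothetical placeholders, not the poster's actual stylesheet.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical template: emit a placeholder before the first cell
       of a row when that cell does not start in column 1. -->
  <xsl:template match="CELL">
    <!-- generate-id() compares node identity, so a later cell with the
         same text as the first cell no longer satisfies the test. -->
    <xsl:if test="generate-id(.) = generate-id(../CELL[1])
                  and @colname &gt; 1">
      <PAD/>
    </xsl:if>
    <xsl:copy-of select="."/>
  </xsl:template>
</xsl:stylesheet>
```

For this particular case, "is the first CELL child" can also be written without any identity comparison, as not(preceding-sibling::CELL), which avoids the question of value versus identity entirely.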
 
Jean-François Michaud

Richard said:
Jean-François Michaud said:
../CELL[1] actually refers to the content of CELL[1] instead of
referring to a unique identifier specific to the first child of the
parent. Who?!?! What??!?.

This is, of course, what you usually want. It's much more common to
compare the content of nodes than to compare them for identity.

Hmmm. It seems upside down to me.

Richard said:
That's not a freak case, it's perfectly normal.

According to what I read of the spec, it seems to be. It seems to me,
though, as if it should be the other way around. Or at least, I believe
it should be made explicit whether we are referencing the node itself
or its content, giving us the choice of dealing with structure or with
content without having to scope into esoteric programming. As it is,
the XPATH expression gives me the impression that we are talking about
structure as opposed to content; it so happens to be the other way
around: we attain content by talking about structure. Do XPATH
expressions in 'select' and 'match' behave the same way? If this isn't
the case, then it seems we have inconsistent behavior: two XPATH
expressions talking about 2 different things, one talking about
structure whereas the other talks about content.

Richard said:
Unfortunately there's no "natural" way to determine node identity in
XPath (at least, in XPath 1). Two unnatural ways are:

generate-id(node1) = generate-id(node2)

and (probably more efficient):

count(node1 | node2) = 1

Thanks for the tip.

Regards
Jean-Francois Michaud
 
Richard Tobin

Richard Tobin said:
This is, of course, what you usually want. It's much more common to
compare the content of nodes than to compare them for identity.

Jean-François Michaud said:
Hmmm. It seems upside down to me.

XSLT is primarily a language for processing text. It's far more
common to, for example, find the elements whose "name" attribute is
equal to some other element's "name" attribute than it is to test
whether two nodes are the same node.
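
As an illustration of that kind of value-based lookup, here is a minimal XSLT 1.0 sketch. The doc, item, and ref element names and the sample input in the comment are hypothetical.

```xml
<?xml version="1.0"?>
<!-- Sketch: selecting elements by comparing attribute values (XSLT 1.0).
     Hypothetical input:
       <doc><item name="a"/><item name="b"/><ref name="b"/></doc> -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/doc">
    <!-- Selects each item whose name attribute has the same string-value
         as the name attribute of at least one ref element. -->
    <xsl:for-each select="item[@name = ../ref/@name]">
      <xsl:value-of select="@name"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With the sample input, only the item named b is selected. The two name attributes being compared are different nodes on different elements, so an identity comparison could never match here; the predicate works only because = compares string-values.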

I wouldn't be surprised if you've written dozens of expressions that
assumed that node equality compared text, without even realising it.
Look at your other uses of the = operator where the arguments are
node sets.

Jean-François Michaud said:
According to what I read of the spec, it seems to be. It seems to me,
though, as if it should be the other way around. Or at least, I believe
it should be made explicit whether we are referencing the node itself
or its content, giving us the choice of dealing with structure or with
content without having to scope into esoteric programming. As it is,
the XPATH expression gives me the impression that we are talking about
structure as opposed to content; it so happens to be the other way
around: we attain content by talking about structure. Do XPATH
expressions in 'select' and 'match' behave the same way? If this isn't
the case, then it seems we have inconsistent behavior: two XPATH
expressions talking about 2 different things, one talking about
structure whereas the other talks about content.

Read the definition of the equality operator when the arguments are
node sets. It's perfectly well-defined and consistent.

-- Richard
 
Joe Kesselman

Richard said:
I wouldn't be surprised if you've written dozens of expressions that
assumed that node equality compared text, without even realising it.

Yep. All those instances of FOO[@BAR="something"], or FOO[@bar=
 
Jean-François Michaud

Joe said:
Richard said:
I wouldn't be surprised if you've written dozens of expressions that
assumed that node equality compared text

Yep. All those instances of FOO[@BAR="something"], or FOO[@bar=

Oh I have, but this still refers to a structural construct rather than
content.

It selects the FOO element that meets the predicate requirement of
having its BAR attribute equal to "something". To me this is still
clearly a structural reference, but oh well :).

[snip]

Regards
Jeff
 
