XHTML vs. XPath: @class match?

Ivan Shmakov

Given that the "class" attribute is "a value that is a set of
space-separated tokens" [1], is there an easy way to match
elements of a particular class in XPath? Unfortunately,
//node ()[@class = "foo"] doesn't seem to fit.

TIA.

[1] http://www.w3.org/TR/html5/dom.html#classes

PS. I'm using libxml2 for XPath support.
 
Bjoern Hoehrmann

* Ivan Shmakov wrote in comp.text.xml:
> Given that the "class" attribute is "a value that is a set of
> space-separated tokens" [1], is there an easy way to match
> elements of a particular class in XPath? Unfortunately,
> //node ()[@class = "foo"] doesn't seem to fit.

That works with XPath 2.0 if the attribute is known to be an NMTOKENS
attribute through schema information, but with XPath 1.0 you have to
use something like

contains(concat(' ', normalize-space(@class), ' '), ' foo ')

To account for the various possible cases, with some caveats like the
definition of white space being different between HTML and XPath.
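
A minimal sketch of the same test in plain Python may help to check the logic. It is not libxml2 code; the helper names (normalize_space, class_predicate, has_class) are made up for illustration, and white space is assumed to mean the XML 1.0 set only (space, tab, CR, LF).

```python
import re

# XML 1.0 white space (production S): space, tab, CR, LF.
_XML_WS = re.compile(r"[ \t\r\n]+")


def normalize_space(value: str) -> str:
    """Mimic XPath 1.0 normalize-space(): collapse runs of XML white space, then trim."""
    return _XML_WS.sub(" ", value).strip(" ")


def class_predicate(token: str) -> str:
    """Build the XPath 1.0 predicate text for one class token (hypothetical helper)."""
    if not token or _XML_WS.search(token) or "'" in token:
        raise ValueError("token must be non-empty, without white space or quotes")
    return f"contains(concat(' ', normalize-space(@class), ' '), ' {token} ')"


def has_class(class_attr: str, token: str) -> bool:
    """Evaluate the same test in plain Python: pad with spaces, look for ' token '."""
    padded = " " + normalize_space(class_attr) + " "
    return f" {token} " in padded
```

The padding is what keeps "foobar" from matching "foo". In a full expression the predicate would go on an element step, e.g. //*[...], since only elements carry attributes.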
 
Ivan Shmakov

Bjoern Hoehrmann said:
> * Ivan Shmakov wrote in comp.text.xml:
>> Given that the "class" attribute is "a value that is a set of
>> space-separated tokens" [1], is there an easy way to match elements
>> of a particular class in XPath? Unfortunately,
>> //node ()[@class = "foo"] doesn't seem to fit.
>
> That works with XPath 2.0 if the attribute is known to be an NMTOKENS
> attribute through schema information, but with XPath 1.0 you have to
> use something like
>
> contains(concat(' ', normalize-space(@class), ' '), ' foo ')

ACK, thanks! I guess that with libxml2 I'm stuck with XPath 1.0.

> To account for the various possible cases, with some caveats like the
> definition of white space being different between HTML and XPath.

Which are "(#x20 | #x9 | #xD | #xA)" (as per XML 1.0),
vs. "(#xC | #x20 | #x9 | #xD | #xA)" (as per HTML5).

Isn't all that bad (especially given that the document I'm
processing is an XHTML template, to be shipped with the
application I'm working on); one of them could've been allowing
the whole host of Unicode whitespace characters, too.

Do I understand it correctly that explicitly translate'ing &#xC
to ' ' would be the proper solution? Or perhaps translate ()
may render normalize-space () unnecessary /in this case/?
Consider, e. g.:

contains (concat (' ', translate (@class, '&#xC;&#x9;&#xD;&#xA;', '    '), ' '),
          ' foo ')

TIA.
 
Bjoern Hoehrmann

* Ivan Shmakov wrote in comp.text.xml:
> Which are "(#x20 | #x9 | #xD | #xA)" (as per XML 1.0),
> vs. (#xC | #x20 | #x9 | #xD | #xA) (as per HTML5.)
>
> Isn't all that bad (especially given that the document I'm
> processing is an XHTML template, to be shipped with the
> application I'm working on); one of them could've been allowing
> the whole host of Unicode whitespace characters, too.
>
> Do I understand it correctly that explicitly translate'ing &#xC
> to ' ' would be the proper solution? Or perhaps translate ()
> may render normalize-space () unnecessary /in this case/?
> Consider, e. g.:

You can use translate() in place of normalize-space() to normalize white
space to simple spaces, but it might not be possible to handle the case
of U+000C since that's not a valid character in XPath expressions and no
escaping mechanism exists in XPath 1.0 (and neither is there a function
to get the Unicode character numbers which would be a possible solution
otherwise).
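
To illustrate why translate() alone can be enough here, a rough Python model of the translate-based test follows. The function name has_class_translate is hypothetical, and only tab, LF and CR are mapped, since U+000C cannot be written in the expression anyway.

```python
# XML 1.0 white space other than the space itself: tab, LF, CR.
# Each one maps to a space, as translate(@class, <tab LF CR>, '   ') would do.
_TO_SPACE = str.maketrans("\t\n\r", "   ")


def has_class_translate(class_attr: str, token: str) -> bool:
    """Model contains(concat(' ', translate(@class, ...), ' '), ' token ')."""
    padded = " " + class_attr.translate(_TO_SPACE) + " "
    return f" {token} " in padded
```

Runs of spaces are harmless because the search only needs one space on each side of the token. Note that the third argument of translate() has to be as long as the second: characters without a counterpart are removed rather than replaced, which would glue neighbouring tokens together.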
 
Ivan Shmakov

Bjoern Hoehrmann said:
>> Which are "(#x20 | #x9 | #xD | #xA)" (as per XML 1.0), vs. (#xC |
>> #x20 | #x9 | #xD | #xA) (as per HTML5.)
>> [...]
>> Do I understand it correctly that explicitly translate'ing &#xC to
>> ' ' would be the proper solution? Or perhaps translate () may
>> render normalize-space () unnecessary /in this case/?
>> [...]
>
> You can use translate() in place of normalize-space () to normalize
> white space to simple spaces, but it might not be possible to handle
> the case of U+000C since that's not a valid character in XPath
> expressions and no escaping mechanism exists in XPath 1.0 (and
> neither is there a function to get the Unicode character numbers
> which would be a possible solution otherwise).

ACK, thanks.

Though, as I've just found, the whole point is moot, for U+000C
is not a valid character in XML 1.0, either [1], and thus no
XHTML document may ever contain one, whether in @class, or any
other place.

[1] http://www.w3.org/TR/REC-xml/#charsets
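
A quick way to check this is sketched below. It assumes Python's standard xml.etree.ElementTree (expat-based) rather than libxml2, and the helper names is_xml10_char and parses are invented for the example. The ranges are those of the Char production in [1].

```python
import xml.etree.ElementTree as ET


def is_xml10_char(ch: str) -> bool:
    """True if ch matches the Char production of XML 1.0."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def parses(document: str) -> bool:
    """Report whether the parser accepts the document as well-formed."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


# U+000C is rejected both as a literal character and as a character reference.
print(is_xml10_char("\x0c"))
print(parses('<p class="a\x0cb"/>'))
print(parses('<p class="a&#xC;b"/>'))
```

A conforming parser should reject both forms, so an XHTML template that parses at all cannot carry a form feed in @class.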
 
Peter Flynn

> * Ivan Shmakov wrote in comp.text.xml:
>> Given that the "class" attribute is "a value that is a set of
>> space-separated tokens" [1], is there an easy way to match
>> elements of a particular class in XPath? Unfortunately,
>> //node ()[@class = "foo"] doesn't seem to fit.
>
> That works with XPath 2.0 if the attribute is known to be an NMTOKENS
> attribute through schema information

Does it also work if the attribute has been declared as IDREFS?

///Peter
 
Bjoern Hoehrmann

* Peter Flynn wrote in comp.text.xml:
>> * Ivan Shmakov wrote in comp.text.xml:
>>> Given that the "class" attribute is "a value that is a set of
>>> space-separated tokens" [1], is there an easy way to match
>>> elements of a particular class in XPath? Unfortunately,
>>> //node ()[@class = "foo"] doesn't seem to fit.
>>
>> That works with XPath 2.0 if the attribute is known to be an NMTOKENS
>> attribute through schema information
>
> Does it also work if the attribute has been declared as IDREFS?

Good question. I gave up researching this when I found that there is no
constructor function xs:IDREFS defined in the 2.0 specifications, but in
the 3.0 proposals there is one. So for 3.0 I suspect "yes", but I don't
know about 2.0.
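
For reference, the NMTOKENS case works because "=" in XPath 2.0 is a general comparison: it is true if any item of the sequence on the left equals the value on the right. The Python below is only a rough model of that rule, with made-up helper names (tokens, general_equals); it says nothing about how IDREFS is actually typed in 2.0.

```python
import re


def tokens(class_attr: str) -> list[str]:
    """Split a list-valued attribute on XML white space, dropping empty items."""
    return [t for t in re.split(r"[ \t\r\n]+", class_attr) if t]


def general_equals(sequence: list[str], value: str) -> bool:
    """Existential comparison: true if at least one item equals value."""
    return any(item == value for item in sequence)
```

Without schema information, the splitting has to be spelled out in the expression itself, e.g. with the XPath 2.0 function tokenize().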
 
Peter Flynn

> * Peter Flynn wrote in comp.text.xml:
>>> * Ivan Shmakov wrote in comp.text.xml:
>>>> Given that the "class" attribute is "a value that is a set of
>>>> space-separated tokens" [1], is there an easy way to match
>>>> elements of a particular class in XPath? Unfortunately,
>>>> //node ()[@class = "foo"] doesn't seem to fit.
>>>
>>> That works with XPath 2.0 if the attribute is known to be an NMTOKENS
>>> attribute through schema information
>>
>> Does it also work if the attribute has been declared as IDREFS?
>
> Good question. I gave up researching this when I found that there is no
> constructor function xs:IDREFS defined in the 2.0 specifications, but in
> the 3.0 proposals there is one. So for 3.0 I suspect "yes", but I don't
> know about 2.0.

Excellent, thanks. Some of us still have clients using ID/IDREF :)

///Peter
 
