upper-casing parts of xpath

Johannes Koch

Hi there,

I'd like to apply an xpath to both HTML and XHTML documents. First I
create a DOM document with a Java DOM parser, then apply the xpath with
Xalan's XPathAPI class. The problem is that in HTML DOM element names
are all upper-case, whereas in Core DOM (used for the XHTML documents)
element names are lower-case. When I use a lower-case xpath, e.g.

/head[@profile='http://www.example.org/MyProfile']

it won't match with a head element in an HTML document. OTOH, when I use

/HEAD[@profile='http://www.example.org/MyProfile']

it won't match with a head element in an XHTML document.

I cannot make the whole xpath lower-case in case of an XHTML document,
because there may be case-sensitive things in the xpath, like the URL in
the example above.

There may be some Java classes to parse the xpath string and get the
element names to make them upper-case for HTML. Does anyone know of such
things?
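A rough sketch of what such a rewriting class could look like, in case no library fits. The class name XPathElementNameUpperCaser and its method are made up for illustration; this is not part of Xalan or any other library. It assumes XPath 1.0, ASCII-only names, and leaves prefixed names (h:head) untouched. String literals, attribute names, function names, axis names and the operators and/or/div/mod are copied as they are; only element name tests are upper-cased.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: upper-cases element name tests in an XPath 1.0 string.
final class XPathElementNameUpperCaser {

    private static final Pattern TOKEN = Pattern.compile(
        "\"[^\"]*\"|'[^']*'"              // string literals, copied verbatim
        + "|[A-Za-z_][A-Za-z0-9_.-]*"     // names (ASCII only in this sketch)
        + "|\\d+(?:\\.\\d*)?|\\.\\d+"     // numbers
        + "|::|\\.\\.|//|!=|<=|>=|\\s+|.", Pattern.DOTALL);

    // After these tokens an operand follows, so a name is NOT an operator name.
    private static final Set<String> OPERAND_FOLLOWS = Set.of(
        "@", "$", ":", "::", "(", "[", ",", "/", "//", "|", "+", "-",
        "=", "!=", "<", "<=", ">", ">=");

    static String upperCaseElementNames(String xpath) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(xpath);
        while (m.find()) {
            tokens.add(m.group());
        }
        StringBuilder out = new StringBuilder();
        boolean expectOperand = true; // true at the start of the expression
        String prev = "";             // previous non-whitespace token
        String axis = "";             // most recent axis name seen before "::"
        for (int i = 0; i < tokens.size(); i++) {
            String t = tokens.get(i);
            if (t.trim().isEmpty()) {  // whitespace does not change the state
                out.append(t);
                continue;
            }
            if (isName(t) && expectOperand) {
                String next = nextSignificant(tokens, i + 1);
                boolean attributeLike = prev.equals("@") || (prev.equals("::")
                    && (axis.equals("attribute") || axis.equals("namespace")));
                if (next.equals("::")) {
                    axis = t;          // remember the axis for the next name
                }
                boolean elementName = !attributeLike
                    && !prev.equals("$") && !prev.equals(":")  // variable, local part
                    && !next.equals("(") && !next.equals("::") // function, axis
                    && !next.equals(":");                      // namespace prefix
                out.append(elementName ? t.toUpperCase(Locale.ROOT) : t);
                expectOperand = false;
            } else if (isName(t)) {
                out.append(t);         // operator name: and, or, div, mod
                expectOperand = true;
            } else if (t.equals("*")) {
                out.append(t);         // wildcard if operand expected, else multiply
                expectOperand = !expectOperand;
            } else {
                out.append(t);
                expectOperand = OPERAND_FOLLOWS.contains(t);
            }
            prev = t;
        }
        return out.toString();
    }

    private static boolean isName(String t) {
        char c = t.charAt(0);
        return c == '_' || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    private static String nextSignificant(List<String> tokens, int from) {
        for (int i = from; i < tokens.size(); i++) {
            if (!tokens.get(i).trim().isEmpty()) {
                return tokens.get(i);
            }
        }
        return "";
    }
}
```

With the example above, upperCaseElementNames("/head[@profile='http://www.example.org/MyProfile']") returns /HEAD[@profile='http://www.example.org/MyProfile'], so the URL keeps its case. Deciding whether "div" is an element name or the operator follows the disambiguation rule in section 3.7 of the XPath 1.0 spec.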
 

Philippe Poulard

Hi,

Maybe you could plug in a SAX parser that does the job before the DOM
model is built?
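A minimal sketch of that first idea, using only JDK classes. It assumes the parser in use can be driven as a SAX XMLReader (the thread does not say which HTML parser is used); the class name LowerCaseElementFilter is made up for illustration. The filter lower-cases element names on the fly, and an identity transform builds the DOM from the filtered events, so a single lower-case XPath would fit both kinds of documents.

```java
import java.io.StringReader;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.sax.SAXSource;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical SAX filter: lower-cases element names before the DOM is built.
final class LowerCaseElementFilter extends XMLFilterImpl {

    LowerCaseElementFilter(XMLReader parent) {
        super(parent);
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        // Attribute names and values are passed through unchanged.
        super.startElement(uri, localName.toLowerCase(Locale.ROOT),
                qName.toLowerCase(Locale.ROOT), atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        super.endElement(uri, localName.toLowerCase(Locale.ROOT),
                qName.toLowerCase(Locale.ROOT));
    }

    // Parses the markup through the filter and returns the resulting DOM.
    static Document parse(String markup) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            // Assumption: the JDK's XML parser stands in for the real HTML parser.
            XMLReader reader =
                    new LowerCaseElementFilter(factory.newSAXParser().getXMLReader());
            DOMResult result = new DOMResult();
            // An identity transform copies the filtered SAX events into a DOM tree.
            TransformerFactory.newInstance().newTransformer().transform(
                    new SAXSource(reader, new InputSource(new StringReader(markup))),
                    result);
            return (Document) result.getNode();
        } catch (Exception e) {
            throw new IllegalStateException("could not build DOM", e);
        }
    }
}
```

Swapping toLowerCase for toUpperCase would give the opposite normalization, if upper-case XPaths are preferred.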

Another solution is to use Jaxen instead of Xalan's XPathAPI. Instead
of parsing XPath expressions with jaxen.dom.DOMXPath, you can parse them
with a copy of the package jaxen.dom.*, for example koch.dom.*. The
main class is DocumentNavigator, which you would extend to write your
own methods, such as getElementName(), which should return upper-case
names. Easy!

The last solution is to use Jaxen again and write your own SAXPath
parser, but I don't really know where exactly you would have to hook in.
--
Cordialement,

///
(. .)
-----ooO--(_)--Ooo-----
| Philippe Poulard |
-----------------------
 

Kenneth Stephen

Hi,

You could write a preprocessor XSLT stylesheet that converts all
upper-case tags to lower-case, and then feed the output to your regular
program. Shown below is an example of such a preprocessor (warning:
not extensively tested; use at your own risk):

<?xml version="1.0"?>
<!-- Lower-cases all element and attribute names; text is copied as-is.
     Only XSLT 1.0 features are used, so version 1.0 is declared. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <xsl:apply-templates />
  </xsl:template>

  <!-- Recreate each element under its lower-cased local name. -->
  <xsl:template match="*">
    <xsl:variable name="elementName"
        select="translate(local-name(),'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz')" />
    <xsl:element name="{$elementName}">
      <xsl:apply-templates select="@*" />
      <xsl:apply-templates />
    </xsl:element>
  </xsl:template>

  <!-- Recreate each attribute under its lower-cased local name;
       the attribute value keeps its case. -->
  <xsl:template match="@*">
    <xsl:variable name="attrName"
        select="translate(local-name(),'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz')" />
    <xsl:attribute name="{$attrName}">
      <xsl:value-of select="." />
    </xsl:attribute>
  </xsl:template>

  <xsl:template match="text()">
    <xsl:value-of select="." />
  </xsl:template>

</xsl:stylesheet>

Regards,
Kenneth
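
For what it's worth, here is a sketch of how such a preprocessor could be wired into a Java program. Assumptions: it embeds a compact variant of the stylesheet above so that it is self-contained, it uses the JDK's javax.xml.transform and javax.xml.xpath instead of Xalan's XPathAPI class, and the class name LowerCasePreprocessor is made up for illustration. Like the stylesheet above, it drops namespaces and is not extensively tested.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: lower-cases names with XSLT, then evaluates an XPath.
final class LowerCasePreprocessor {

    // Compact XSLT 1.0 variant of the stylesheet in the post above.
    private static final String XSLT =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:variable name='U' select=\"'ABCDEFGHIJKLMNOPQRSTUVWXYZ'\"/>"
        + "<xsl:variable name='L' select=\"'abcdefghijklmnopqrstuvwxyz'\"/>"
        + "<xsl:template match='*'>"
        + "<xsl:element name='{translate(local-name(), $U, $L)}'>"
        + "<xsl:apply-templates select='@*|node()'/>"
        + "</xsl:element></xsl:template>"
        + "<xsl:template match='@*'>"
        + "<xsl:attribute name='{translate(local-name(), $U, $L)}'>"
        + "<xsl:value-of select='.'/></xsl:attribute></xsl:template>"
        + "</xsl:stylesheet>";

    // Builds a DOM from a string; stands in for whatever parser is really used.
    static Document parse(String markup) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse input", e);
        }
    }

    // Returns a new DOM in which element and attribute names are lower-case.
    static Document lowerCase(Document input) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            DOMResult result = new DOMResult();
            transformer.transform(new DOMSource(input), result);
            return (Document) result.getNode();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    // Returns the first node matching the XPath, or null if nothing matches.
    static Node selectFirst(Document doc, String xpath) {
        try {
            return (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, doc, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }
}
```

Usage would be along the lines of selectFirst(lowerCase(doc), "/html/head[@profile='http://www.example.org/MyProfile']"), with the same lower-case XPath for HTML and XHTML input.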
 
