XSLT 2.0 and string manipulation


Andreas Kraftl

Hello,

I have an XML file:
<lexikon>
<begriff>
<synonyme>
<synonym>a</synonym>
<synonym>b c</synonym>
</synonyme>
<beschreibung>
Great letters.
</beschreibung>
</begriff>
...
</lexikon>

Then there is another XML file in which I would like to expand every
string that matches a <synonym>.

Example XML file:
There is an b c and there an a

should be expanded to
There is an
<a href="lexikon.php" class="lexikon">b c<span>
Great letters.</span></a>
and there an <a href="lexikon.php" class="lexikon">a<span>
Great letters.</span></a>

The following does not work:

<!-- build the regex string -->
<xsl:variable name="lexwordstring">
<xsl:text>(.*?)(nurDummText</xsl:text>
<xsl:for-each
select="/pages/page/content/lexikon/begriff/synonyme/*">
<xsl:text>|</xsl:text>
<xsl:value-of select="normalize-space(.)"/>
</xsl:for-each>
<xsl:text>)([ ,.]*?)</xsl:text>
</xsl:variable>

<xsl:template match="text()">
<xsl:analyze-string select="." regex="{$lexwordstring}">
<xsl:matching-substring>
<xsl:value-of select="regex-group(1)"/>
<a href="{$lexikonpage}" class="lexikon">
<xsl:value-of select="regex-group(2)"/>
<span>
<xsl:apply-templates select="..."/>
</span>
</a>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="."/>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:template>

With <xsl:apply-templates>, Saxon reports:
"Cannot select a node here: the context item is an atomic value"
Without <xsl:apply-templates> it works, but then the description is missing ;).

Any ideas are welcome.

Thanks
Andy
 

Joris Gillis

Hi,
<xsl:template match="text()">
<xsl:analyze-string select="." regex="{$lexwordstring}">
<xsl:matching-substring>
<xsl:value-of select="regex-group(1)"/>
<a href="{$lexikonpage}" class="lexikon">
<xsl:value-of select="regex-group(2)"/>
<span>
<xsl:apply-templates select="..."/>

I've never touched XPath 2.0, but three dots in a row don't look legal.
You can use '..' to select the parent or '../..' to select the grandparent.
</span>
</a>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="."/>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:template>

regards,
 

Martin Honnen

Andreas Kraftl wrote:

The following does not work:

<!-- build the regex string -->
<xsl:variable name="lexwordstring">
<xsl:text>(.*?)(nurDummText</xsl:text>
<xsl:for-each
select="/pages/page/content/lexikon/begriff/synonyme/*">
<xsl:text>|</xsl:text>
<xsl:value-of select="normalize-space(.)"/>
</xsl:for-each>
<xsl:text>)([ ,.]*?)</xsl:text>
</xsl:variable>

<xsl:template match="text()">
<xsl:analyze-string select="." regex="{$lexwordstring}">
<xsl:matching-substring>
<xsl:value-of select="regex-group(1)"/>
<a href="{$lexikonpage}" class="lexikon">
<xsl:value-of select="regex-group(2)"/>
<span>
<xsl:apply-templates select="..."/>
</span>
</a>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="."/>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:template>

With <xsl:apply-templates>, Saxon reports:
"Cannot select a node here: the context item is an atomic value"

I don't really understand which node you want to apply templates to, but
it cannot work: as Saxon rightly tells you, the context item inside the
xsl:matching-substring instruction is an atomic value, namely the string
that the regular expression matched, so there is no node to navigate from.
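One common way around this, given here as a sketch and not as something taken from the posts above, is to bind the text node to a variable while the context item is still a node and then navigate from that variable inside xsl:matching-substring. The variable name $here, the hard-coded regex and the title attribute are illustrative assumptions only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="text()">
    <!-- Remember the text node while the context item is still a node. -->
    <xsl:variable name="here" select="."/>
    <!-- Hard-coded regex for illustration; it has no word-boundary check. -->
    <xsl:analyze-string select="." regex="a|b c">
      <xsl:matching-substring>
        <!-- "." is now the matched string, so navigate from $here instead. -->
        <a class="lexikon" title="{name($here/..)}">
          <xsl:value-of select="."/>
        </a>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

The same variable could be used in something like <xsl:apply-templates select="$here/.."/>, as long as a matching template exists for the selected node.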

Using the following "dictionary"

<?xml version="1.0" encoding="UTF-8"?>
<lexikon>
<begriff>
<synonyme>
<synonym>a</synonym>
<synonym>b c</synonym>
</synonyme>
<beschreibung>
Tolle Buchstaben.
</beschreibung>
</begriff>
</lexikon>

and the following XML input

<?xml version="1.0" encoding="UTF-8"?>
<text>Da ist ein b c und da ein a</text>

the following XSLT 2.0 stylesheet

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" encoding="UTF-8" />

<xsl:variable name="lexwordstring">
<xsl:text>(</xsl:text>
<xsl:for-each
select="document('test2004121801.xml')/lexikon/begriff/synonyme/*">
<xsl:value-of select="normalize-space(.)"/>
<xsl:if test="position() != last()">
<xsl:text>|</xsl:text>
</xsl:if>
</xsl:for-each>
<xsl:text>)</xsl:text>
</xsl:variable>

<xsl:template match="text()">
<xsl:analyze-string select="." regex="{$lexwordstring}">
<xsl:matching-substring>
<a href="" class="lexikon">
<xsl:value-of select="regex-group(1)"/>
<span>
<xsl:value-of
select="document('test2004121801.xml')/lexikon/begriff[synonyme/synonym
= regex-group(1)]/beschreibung" />
</span>
</a>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="."/>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:template>

<xsl:template match="*">
<xsl:copy>
<xsl:apply-templates />
</xsl:copy>
</xsl:template>

</xsl:stylesheet>

outputs

<?xml version="1.0" encoding="UTF-8"?><text>D<a href=""
class="lexikon">a<span>
Tolle Buchstaben.
</span></a> ist ein <a href="" class="lexikon">b c<span>
Tolle Buchstaben.
</span></a> und d<a href="" class="lexikon">a<span>
Tolle Buchstaben.
</span></a> ein <a href="" class="lexikon">a<span>
Tolle Buchstaben.
</span></a></text>

with Saxon 8.1.1. Maybe that helps you find a solution.
I realize that this matches more than you are looking for (e.g. the 'a'
in 'Da'). I tried to remedy that by using \b in the regular expression,
but unfortunately the regular expression syntax in XSLT 2.0/XPath 2.0
does not support \b (word boundary).
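A possible workaround for the missing \b, offered as an untested sketch: frame the alternation with (^|\W) and ($|\W), so that a synonym only matches when it is surrounded by non-word characters or by the start or end of the text node, and copy the two framing groups through unchanged. The variable $lexikon is an assumption introduced for this sketch; the file name is the one used above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: same dictionary file as in the stylesheet above. -->
  <xsl:variable name="lexikon"
      select="document('test2004121801.xml')/lexikon"/>

  <!-- Builds e.g. (^|\W)(a|b c)($|\W) from the synonym list. -->
  <xsl:variable name="lexwordstring"
      select="concat('(^|\W)(',
                     string-join($lexikon/begriff/synonyme/synonym/normalize-space(.), '|'),
                     ')($|\W)')"/>

  <xsl:template match="text()">
    <xsl:analyze-string select="." regex="{$lexwordstring}">
      <xsl:matching-substring>
        <!-- Groups 1 and 3 are the framing separators; group 2 is the synonym. -->
        <xsl:value-of select="regex-group(1)"/>
        <a href="" class="lexikon">
          <xsl:value-of select="regex-group(2)"/>
          <span>
            <xsl:value-of
                select="$lexikon/begriff[synonyme/synonym = regex-group(2)]/beschreibung"/>
          </span>
        </a>
        <xsl:value-of select="regex-group(3)"/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

  <xsl:template match="*">
    <xsl:copy>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Known limitations of this sketch: each match consumes its trailing separator, so of two synonyms separated by a single character only the first is linked. Synonyms containing regex metacharacters would need escaping, and longer synonyms should probably be listed before shorter ones that are their prefixes.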
 

Andreas Kraftl

Joris Gillis said:
I've never touched XPath2.0 , but 3 dots in a row doesn't sound legal.
You can use '..' to select the parent or '../..' to select the grandparent.

Of course you are right.
The "..." was just meant as a placeholder. :)

Thx
Andy
 
