Andy Fish wrote:
I'm stuck with an XSL problem - can anyone give me any hints?
I have some XML with nested formatting tags like this:
<text>
this is plain
<bold>
this is bold
<italic>
this is bold-italic
</italic>
</bold>
this is plain
</text>
which I need to 'flatten out' into something like this:
<text>this is plain</text>
<text bold="true">this is bold</text>
<text bold="true" italic="true">this is bold-italic</text>
<text>this is plain</text>
It doesn't have to work with any arbitrary tags - there are only a few
possible ones - but I'm not sure how to "remember" the outer level
formatting nodes when processing the text inside. It seems to be
crying out for some kind of state variable.
Modes can provide that kind of state: the mode in which a node is
processed records which formatting elements enclose it. Here is my
attempt at using them to solve the problem:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" encoding="UTF-8" indent="yes" />
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()" />
</xsl:copy>
</xsl:template>
<xsl:template match="text">
<xsl:apply-templates select="node()" mode="flatten" />
</xsl:template>
<xsl:template match="text()" mode="flatten">
<text><xsl:value-of select="." /></text>
</xsl:template>
<xsl:template match="bold" mode="flatten">
<xsl:apply-templates select="node()" mode="flattenBold" />
</xsl:template>
<xsl:template match="text()" mode="flattenBold">
<text bold="true"><xsl:value-of select="." /></text>
</xsl:template>
<xsl:template match="italic" mode="flattenBold">
<xsl:apply-templates select="node()" mode="flattenBoldItalic" />
</xsl:template>
<xsl:template match="text()" mode="flattenBoldItalic">
<text bold="true" italic="true"><xsl:value-of select="." /></text>
</xsl:template>
</xsl:stylesheet>
The result is not quite what you want, but apart from a whitespace-only
text node showing up, it has the right structure. (Note that I wrapped
your source above in a <doc> element, as otherwise the flattened result
wouldn't have a root element.)
<doc>
<text>
this is plain
</text>
<text bold="true">
this is bold
</text>
<text italic="true" bold="true">
this is bold-italic
</text>
<text bold="true">
</text>
<text>
this is plain
</text>
</doc>
Now, to solve the whitespace text node issue, I think the following
should help. Each text() template skips text nodes that contain only
whitespace:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" encoding="UTF-8" indent="yes" />
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()" />
</xsl:copy>
</xsl:template>
<xsl:template match="text">
<xsl:apply-templates select="node()" mode="flatten" />
</xsl:template>
<xsl:template match="text()" mode="flatten">
<xsl:variable name="normalizedText" select="normalize-space(.)" />
<xsl:if test="$normalizedText">
<text><xsl:value-of select="." /></text>
</xsl:if>
</xsl:template>
<xsl:template match="bold" mode="flatten">
<xsl:apply-templates select="node()" mode="flattenBold" />
</xsl:template>
<xsl:template match="text()" mode="flattenBold">
<xsl:variable name="normalizedText" select="normalize-space(.)" />
<xsl:if test="$normalizedText">
<text bold="true"><xsl:value-of select="." /></text>
</xsl:if>
</xsl:template>
<xsl:template match="italic" mode="flattenBold">
<xsl:apply-templates select="node()" mode="flattenBoldItalic" />
</xsl:template>
<xsl:template match="text()" mode="flattenBoldItalic">
<xsl:variable name="normalizedText" select="normalize-space(.)" />
<xsl:if test="$normalizedText">
<text bold="true" italic="true"><xsl:value-of select="." /></text>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
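The mode-based stylesheets above need one mode per combination of
formatting elements, and they only handle <italic> nested inside <bold>.
As a possible alternative, here is a minimal sketch that carries the
state in template parameters instead, which is about as close as XSLT 1.0
gets to a "state variable". The parameter names bold and italic are my
own choice, and unlike the stylesheets above this one outputs the
normalized text, so leading and trailing whitespace is dropped. Treat it
as a sketch under those assumptions rather than a tested solution:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes" />

  <!-- Identity template: copies everything outside <text> unchanged. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" />
    </xsl:copy>
  </xsl:template>

  <!-- Start flattening; no formatting flag is set yet. -->
  <xsl:template match="text">
    <xsl:apply-templates select="node()" mode="flatten" />
  </xsl:template>

  <!-- <bold> sets its own flag and passes the inherited italic flag on. -->
  <xsl:template match="bold" mode="flatten">
    <xsl:param name="italic" select="false()" />
    <xsl:apply-templates select="node()" mode="flatten">
      <xsl:with-param name="bold" select="true()" />
      <xsl:with-param name="italic" select="$italic" />
    </xsl:apply-templates>
  </xsl:template>

  <!-- <italic> sets its own flag and passes the inherited bold flag on. -->
  <xsl:template match="italic" mode="flatten">
    <xsl:param name="bold" select="false()" />
    <xsl:apply-templates select="node()" mode="flatten">
      <xsl:with-param name="bold" select="$bold" />
      <xsl:with-param name="italic" select="true()" />
    </xsl:apply-templates>
  </xsl:template>

  <!-- Text nodes emit one <text> element carrying the accumulated flags. -->
  <xsl:template match="text()" mode="flatten">
    <xsl:param name="bold" select="false()" />
    <xsl:param name="italic" select="false()" />
    <xsl:variable name="normalizedText" select="normalize-space(.)" />
    <!-- Skip whitespace-only text nodes. -->
    <xsl:if test="$normalizedText">
      <text>
        <xsl:if test="$bold">
          <xsl:attribute name="bold">true</xsl:attribute>
        </xsl:if>
        <xsl:if test="$italic">
          <xsl:attribute name="italic">true</xsl:attribute>
        </xsl:if>
        <xsl:value-of select="$normalizedText" />
      </text>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

With this approach, <bold> inside <italic> should work the same way as
<italic> inside <bold>, and supporting another formatting element means
adding one parameter and one template rather than a new set of modes.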