XSLT problem with single tags

dwergkees

Hi,

Got a little problem here. I'm trying to create an XSLT file that will do
a transformation from WordML format (MS Word XML format, see
http://rep.oio.dk/Microsoft.com/officeschemas/welcome.htm) to a
reasonably clean (X)HTML format.

(The reason being that, combined with some PHP scripting it should be
possible to store the embedded images, which is pretty neat).

I am, however, running into an XSLT problem. A piece of an old version
works like this:

<xsl:template match="w:r">
<xsl:choose>
<xsl:when test=".//w:i">
<i><xsl:apply-templates /></i>
</xsl:when>
<xsl:when test=".//w:b">
<b><xsl:apply-templates /></b>
</xsl:when>
<xsl:otherwise>
<xsl:apply-templates />
</xsl:otherwise>
</xsl:choose>
</xsl:template>

This matches the r element (Run element, kind of a default container
thingy). It tests whether the r element contains an i or b element
(meaning of course that the content of that r element is in italic or
bold.) When this is the case, nice html style tags are placed. This
doesn't function properly in the case where an r element contains both
an i and a b element, i.e. when the text is both italic and bold.
Therefore, I changed the code to:

<xsl:template match="w:r">
<xsl:if test=".//w:i">
<i>
</xsl:if>

<xsl:if test=".//w:b">
<b>
</xsl:if>

<xsl:apply-templates />

<xsl:if test=".//w:i">
</i>
</xsl:if>

<xsl:if test=".//w:b">
</b>
</xsl:if>
</xsl:template>

It now tests twice for each style, for the opening tag and for the
closing tag. In principle this works fine, but in practice the XSLT
sheet is not well-formed and will not be applied as it contains
unclosed tags (the <i> and <b> tags). I've tried to:
- replace the < and > with &lt; and &gt;
- put the tags inside CDATA sections, for example <![CDATA[<i>]]>


However, in both cases the tags appear as literal text instead of
HTML code.

Any ideas on how to insert single opening or closing tags in my
HTML code, or another solution to properly nest the <i> and <b>
elements?

TIA

Wilco - Dwergkees - Menge
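
For what it's worth, XSLT 1.0 does have an escape hatch for writing a lone opening or closing tag: the disable-output-escaping attribute on xsl:text. The sketch below is only an illustration of that attribute applied to the second template above, not a recommended design. It assumes the Word 2003 namespace URI for the w: prefix (adjust it to whatever the source document declares), and it only works when the processor serializes the result itself and supports the attribute, which the spec makes optional.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
    exclude-result-prefixes="w">
  <!-- The w: namespace URI above is an assumption (Word 2003 WordML). -->

  <xsl:output method="html"/>

  <xsl:template match="w:r">
    <!-- Opening tags are written as escaped text, so the stylesheet
         itself stays well-formed XML. -->
    <xsl:if test=".//w:i">
      <xsl:text disable-output-escaping="yes">&lt;i&gt;</xsl:text>
    </xsl:if>
    <xsl:if test=".//w:b">
      <xsl:text disable-output-escaping="yes">&lt;b&gt;</xsl:text>
    </xsl:if>

    <xsl:apply-templates/>

    <!-- Closing tags go in reverse order so the output nests properly. -->
    <xsl:if test=".//w:b">
      <xsl:text disable-output-escaping="yes">&lt;/b&gt;</xsl:text>
    </xsl:if>
    <xsl:if test=".//w:i">
      <xsl:text disable-output-escaping="yes">&lt;/i&gt;</xsl:text>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

The catch is that the processor no longer guarantees balanced output, since the tags are just text as far as the result tree is concerned. Nesting real literal result elements is the more robust route.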
 
Martin Honnen

dwergkees wrote:

<xsl:template match="w:r">
<xsl:choose>
<xsl:when test=".//w:i">
<i><xsl:apply-templates /></i>
</xsl:when>
<xsl:when test=".//w:b">
<b><xsl:apply-templates /></b>
</xsl:when>
<xsl:otherwise>
<xsl:apply-templates />
</xsl:otherwise>
</xsl:choose>
</xsl:template>

Why don't you simply do
<xsl:template match="w:r"><xsl:apply-templates /></xsl:template>

<xsl:template match="w:i">
<i><xsl:apply-templates /></i>
</xsl:template>

<xsl:template match="w:b">
<b><xsl:apply-templates /></b>
</xsl:template>

I am not familiar with WordML however, but based on what you have posted
and on how XSLT works it seems more natural to simply let
xsl:apply-templates do its work combined with templates for the
different elements you need to process.
 
dwergkees

Why don't you simply do
<xsl:template match="w:r"><xsl:apply-templates /></xsl:template>

<xsl:template match="w:i">
<i><xsl:apply-templates /></i>
</xsl:template>

<xsl:template match="w:b">
<b><xsl:apply-templates /></b>
</xsl:template>


I've tried fussin' about with your solution, but I can't get it to fit
just right. I'll show a small WordML example to demonstrate the problem
more clearly:

WordML:

<w:p>
<w:r>
<w:t>Plain text</w:t>
</w:r>
<w:r>
<w:rPr>
<w:b/>
</w:rPr>
<w:t>Bold text</w:t>
</w:r>
<w:r>
<w:t> </w:t>
</w:r>
<w:r>
<w:rPr>
<w:i/>
</w:rPr>
<w:t>Italic text</w:t>
</w:r>
<w:r>
<w:rPr>
<w:b/>
<w:i/>
</w:rPr>
<w:t>Bold and italic</w:t>
</w:r>
</w:p>

Should transform to:

<p>Plain text <b>Bold text</b> <i>Italic text</i> <i><b>Bold and
italic</b></i></p>

As you can see, the <w:i/> and <w:b/> tags are grandchildren of the r
element, the text itself is a child of the r element. So at each r
element I want to check the existence of w:i and w:b and surround the t
element with the corresponding HTML. Your solution matches the
existence, but then the processor is at the wrong current Node. (As far
as I understand the complexities of xslt).

Any thoughts?

TIA

Wilco.
 
Peter Flynn

dwergkees said:
I've tried fussin' about with your solution, but i can't get it to fit
just right. I'll show a small WordML example to demonstrate the problem
more clearly:

WordML:

<w:p>
<w:r>
<w:t>Plain text</w:t>
</w:r>
<w:r>
<w:rPr>
<w:b/>
</w:rPr>
<w:t>Bold text</w:t>
</w:r>
<w:r>
<w:t> </w:t>
</w:r>
<w:r>
<w:rPr>
<w:i/>
</w:rPr>
<w:t>Italic text</w:t>
</w:r>
<w:r>
<w:rPr>
<w:b/>
<w:i/>
</w:rPr>
<w:t>Bold and italic</w:t>
</w:r>
</w:p>

This is an interesting relic of (a) the fact that Word uses out-of-line
markup and (b) the sedulous avoidance of Mixed Content common to those
who think pointers are more fun to program than trees. It's also about
the only way you can model the behaviour of unschooled authors in Word.

Just add another condition to your original:

<xsl:template match="w:r">
<xsl:choose>
<xsl:when test=".//w:i and .//w:b">
<i><b><xsl:apply-templates/></b></i>
</xsl:when>
<xsl:when test=".//w:i and not(.//w:b)">
<i><xsl:apply-templates/></i>
</xsl:when>
<xsl:when test=".//w:b and not(.//w:i)">
<b><xsl:apply-templates/></b>
</xsl:when>
<xsl:otherwise>
<xsl:apply-templates/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>

Should transform to:

<p>Plain text <b>Bold text</b> <i>Italic text</i> <i><b>Bold and
italic</b></i></p>

No, there is no white-space after "Plain text" nor after "Italic text"
in your quoted XML document. If you need to introduce extra white-space
you need to specify the rules for doing so.

As you can see, the <w:i/> and <w:b/> tags are grandchildren of the r
element, the text itself is a child of the r element. So at each r
element I want to check the existence of w:i and w:b and surround the t
element with the corresponding HTML. Your solution matches the
existence, but then the processor is at the wrong current Node. (As far
as I understand the complexities of xslt).

Any thoughts?

Here's another way to do it, based on Martin's suggestion of using the
normal "apply-templates" way of proceeding down a document. You'll have
to jiggle the declared namespace for w: as I don't know what your
document declares it as. This method will handle anything occurring in
w:rPr, not just bold and italics.

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0"
xmlns:w="http://foo.bar.org"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="html"/>
<xsl:strip-space elements="*"/>
<xsl:preserve-space elements="w:t"/>

<xsl:template match="w:p">
<p>
<xsl:apply-templates/>
</p>
</xsl:template>

<xsl:template match="w:r">
<xsl:choose>
<xsl:when test="w:rPr">
<xsl:call-template name="nest">
<xsl:with-param name="styles" select="w:rPr/*"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:apply-templates/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>

<xsl:template name="nest">
<xsl:param name="styles"/>
<xsl:param name="counter" select="1"/>
<xsl:choose>
<xsl:when test="$counter>count($styles)">
<xsl:value-of select="w:t"/>
</xsl:when>
<xsl:otherwise>
<xsl:element name="{local-name($styles[$counter])}">
<xsl:call-template name="nest">
<xsl:with-param name="styles" select="$styles"/>
<xsl:with-param name="counter" select="$counter+1"/>
</xsl:call-template>
</xsl:element>
</xsl:otherwise>
</xsl:choose>
</xsl:template>

</xsl:stylesheet>

In this, I am stripping all space except in w:t. This will preserve the
otherwise vulnerable white-space-only node.

///Peter
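
A non-recursive variant of the same nesting idea, in case it helps. This is only a sketch: it assumes that bold and italic are the only run properties that should become HTML elements (so other children of w:rPr are ignored instead of being turned into elements named after them), and that the w: prefix is bound to the Word 2003 namespace URI. The mode names "italic" and "bold" are made up for the illustration. Each mode wraps the run in one element if the property is present and then hands the same w:r on to the next mode.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
    exclude-result-prefixes="w">
  <!-- The w: namespace URI above is an assumption (Word 2003 WordML). -->

  <xsl:output method="html"/>
  <xsl:strip-space elements="*"/>
  <xsl:preserve-space elements="w:t"/>

  <xsl:template match="w:p">
    <p><xsl:apply-templates select="w:r"/></p>
  </xsl:template>

  <!-- Entry point for a run: start the chain with the italic check. -->
  <xsl:template match="w:r">
    <xsl:apply-templates select="." mode="italic"/>
  </xsl:template>

  <!-- Wrap in <i> if the run has w:i, then continue with the bold check. -->
  <xsl:template match="w:r" mode="italic">
    <xsl:choose>
      <xsl:when test="w:rPr/w:i">
        <i><xsl:apply-templates select="." mode="bold"/></i>
      </xsl:when>
      <xsl:otherwise>
        <xsl:apply-templates select="." mode="bold"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Last link in the chain: wrap in <b> if needed and emit the text. -->
  <xsl:template match="w:r" mode="bold">
    <xsl:choose>
      <xsl:when test="w:rPr/w:b">
        <b><xsl:apply-templates select="w:t"/></b>
      </xsl:when>
      <xsl:otherwise>
        <xsl:apply-templates select="w:t"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <xsl:template match="w:t">
    <xsl:value-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

Adding another property means adding one more mode to the chain, which is more typing than the counter-based template above but keeps the set of generated HTML elements explicit.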
 
dwergkees

Thanks!

This is the kind of solution I was looking for!!! I have to tweak it
here and there, but this is just the kind of nesting principle I was
interested in, as it allows for extensions (I plan to use the same
Quote:
Should transform to:
<p>Plain text <b>Bold text</b> <i>Italic text</i> <i><b>Bold and
italic</b></i></p>

No, there is no white-space after "Plain text" nor after "Italic text"
in your quoted XML document. If you need to introduce

As for the extra whitespace in the desired output, it is just
random typos from me! I'm just as happy with all the original
whitespace minus unnecessary whitespace.
Again, thanks a lot to both Martin and Peter for helping out!!

Wilco Menge
 
