help w/HTML escaping in XML tags?

Jim Bancroft

Hi everyone,

I receive XML documents which sometimes have HTML in the element
content. When performing XSL transformations the HTML text is escaped,
which affects us when we eventually display it in a browser.

I understand there's a "disable-output-escaping" attribute that can be
used in <xsl:value-of> elements, but is there a way to do the same thing
across the entire XML document, by default, without having to modify
individual XSL tags?

Thanks for your advice.

-Jim
 
David Carlisle

Jim Bancroft said:
Hi everyone,

I receive XML documents which sometimes have HTML in the element
content. When performing XSL transformations the HTML text is escaped,
which affects us when we eventually display it in a browser.

It'll only be escaped in the output if it was escaped in the input,
probably with &lt; entity references rather than (say) CDATA sections,
although the two are equivalent. The usual advice is "don't start from
here", i.e. have input of
<foo><p>...<br/>...</p></foo>
rather than
<foo><![CDATA[<p>...<br/>...</p>]]></foo>
then you can just use <xsl:copy-of select="foo/node()"/>. However, you
can't always control your input...
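
To make the copy-of suggestion concrete, here is a minimal sketch of a complete stylesheet, assuming the input really is <foo> with element children (the first form above). The <div> wrapper is only an illustrative choice and is not from the original post.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- foo contains real child elements, so they can be copied
       to the result tree as nodes; no escaping is involved. -->
  <xsl:template match="foo">
    <!-- div is a hypothetical wrapper, chosen only for illustration -->
    <div>
      <xsl:copy-of select="node()"/>
    </div>
  </xsl:template>
</xsl:stylesheet>
```

If the input is the CDATA form instead, the same stylesheet copies a single text node, and the serializer escapes its < and > characters on output.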

I understand there's a "disable-output-escaping" attribute that can be
used in <xsl:value-of> elements, but is there a way to do the same thing
across the entire XML document, by default, without having to modify
individual XSL tags?

Thanks for your advice.

Yes and no. No: with the xml or html output method you need to do this on
each xsl:value-of. However, if a large part of your result is copied from
this kind of escaped HTML, you can use the text output method (which of
course never uses XML escaping), but then you can't output any nodes: you
have to generate all the tags directly, e.g.
<xsl:text>&lt;br/&gt;</xsl:text>
rather than <br/>
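
As a rough sketch of the text-output approach, assuming the input is <foo> holding escaped HTML (for example in a CDATA section), a stylesheet might look like the following. The <div> wrapper is a hypothetical choice for illustration.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- The text method writes string values as-is, with no XML escaping -->
  <xsl:output method="text"/>

  <xsl:template match="foo">
    <!-- Tags must be written as text, because element nodes in the
         result tree would lose their markup under the text method -->
    <xsl:text>&lt;div&gt;</xsl:text>
    <!-- The string value of foo already contains the HTML markup -->
    <xsl:value-of select="."/>
    <xsl:text>&lt;/div&gt;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

The trade-off is that nothing checks the output is well-formed any more; every tag is just a string.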

David
 
Jim Bancroft

Thanks David,

I may not have been completely clear in my original post. The XML
documents I receive don't come with pre-escaped HTML but actual HTML.
Here's a brief example:

<myDocument>
<tag1>This is some <b>HTML</b> code</tag1>
</myDocument>

In this case, the <b> tags are HTML-escaped during the XSL transformation; I
wind up with &lt; and &gt; instead, which breaks things when rendering the
XML document. I'd like to keep the <b> tags as-is, if possible, but it sounds
from your post like you can't do it at a global level, and that you have
to use the disable-output-escaping attribute on every text node? Sorry if
these questions sound newbieish, and thanks again.

-Jim


David Carlisle

I may not have been completely clear my original post. The XML
documents I receive don't come with pre-escaped HTML but actual HTML.
Here's a brief example:

<myDocument>
<tag1>This is some <b>HTML</b> code</tag1>
</myDocument>

In this case, the <b> tags are HTML-escaped during the XSL
transformation;

It's possible that that happens, but you would have to work pretty hard
at it. For example,
<xsl:template match="b">
&lt;b&gt;<xsl:apply-templates/> &lt;/b&gt;
</xsl:template>

would have that effect. If that is the case, the answer is to not do
that, and instead just copy the nodes to the output:

<xsl:template match="tag1">
<xsl:copy-of select="node()"/>
</xsl:template>

The result you say you want is far easier to obtain than the result you
say you are getting, so you'll have to give at least _some_ hint of what
your stylesheet looks like to give anyone a clue how to change it.

David
 
David Carlisle

I thought I'd replied to this but it hasn't shown up, so I'll try again;
sorry if you get two.


<myDocument>
<tag1>This is some <b>HTML</b> code</tag1>
</myDocument>

In this case, the <b> tags are HTML-escaped during the XSL
transformation;


That wouldn't happen by default, only if you explicitly program it that
way, e.g.

<xsl:template match="b">
&lt;b&gt;<xsl:apply-templates/> &lt;/b&gt;
</xsl:template>

If you copy nodes from the source, or generate nodes rather than text in
the stylesheet, they will be linearised as XML element tags, so the nodes
get re-created when the result is parsed.
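
For the myDocument example quoted above, a minimal sketch of a stylesheet that keeps the <b> element as a node might look like this. The html/body/p wrappers are assumptions for illustration; the original poster's stylesheet was never shown.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="myDocument">
    <!-- hypothetical page skeleton -->
    <html>
      <body>
        <xsl:apply-templates select="tag1"/>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="tag1">
    <p>
      <!-- Copies the text nodes and the b element as nodes, so b is
           serialized as a real tag rather than as escaped text -->
      <xsl:copy-of select="node()"/>
    </p>
  </xsl:template>
</xsl:stylesheet>
```

Note that using <xsl:value-of select="."/> in place of the copy-of would not escape the <b> tags either; it would drop them, because the string value of tag1 contains only its text.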

David
 
Peter Flynn

Jim said:
Hi everyone,

I receive XML documents which sometimes have HTML in the element
content. When performing XSL transformations the HTML text is escaped,
which affects us when we eventually display it in a browser.

I understand there's a "disable-output-escaping" attribute that can be
used in <xsl:value-of> elements, but is there a way to do the same thing
across the entire XML document, by default, without having to modify
individual XSL tags?

Just write a template to output them again, eg

<xsl:template match="b">
<b>
<xsl:apply-templates/>
</b>
</xsl:template>

Far easier than messing with disabling output escaping.
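
If the embedded HTML can contain many different elements, writing one template per element gets tedious. A common generalisation, sketched here under the assumption that every element without its own template should pass through unchanged, is the identity template plus overrides. The <p> wrapper for tag1 is only an illustrative choice.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Identity template: copies any attribute or node that has no
       more specific template, then recurses into its children -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Override for elements that need different output;
       the p wrapper is hypothetical -->
  <xsl:template match="tag1">
    <p>
      <xsl:apply-templates/>
    </p>
  </xsl:template>
</xsl:stylesheet>
```

With this in place, <b>, <i>, <br/> and so on inside tag1 all come through as elements without any further templates.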

///Peter
 
