Empty element match

  • Thread starter Tjerk Wolterink

Tjerk Wolterink

i have the following rule,


<xsl:template match="br">
<br/>
</xsl:template>


This should convert every <br/> to <br/>,
but my transformer outputs
<br></br>
instead. That may not look like a problem, but I use it in a
web application, and Microsoft's browser interprets this
as two line breaks: <br/><br/>

How can I force the transformer to write the empty-element
form, like this: <br/> ?
 

Joris Gillis

Tjerk Wolterink said:
How can i force the transformer to not use the
implicit end tag like this <br/> ?

Set your output method to 'html'.

regards,
 

David Carlisle

How can i force the transformer to not use the
implicit end tag like this <br/> ?

If you are writing html then you should use the html output method
and this will be output as <br>. However, if you are writing xml then
within standard xslt there is no way to customise this, although your
system may allow you to specify a non-standard serialiser.

If you send xml to IE as text/html then it will parse it as html, and
not understanding XML syntax for br is only a small part of the
problems. If you send it with an XML mime type then it will use an xml
parser and parse it correctly, but you then have to supply a stylesheet
in order for it to display, as it doesn't have xhtml support built in.

David
 

Tjerk Wolterink

Joris said (at 10:23:19 on Friday 01 July 2005):
Set your output method to 'html'.

regards,

It is set to html:
<xsl:output method="html" indent="yes"/>
 

Tjerk Wolterink

David said:
If you are writing html then you should use the html output method
and this will be output as <br> However if you are writing xml then
within standard xslt there is no way to customise this although your
system may allow you to specify a non standard serialiser.

ok im outputting to XHTML TRANSITIONAL:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

So I think I should set the output method to xml,
and Internet Explorer should see [...]

David said:
If you send xml to IE as text/html then it will parse it as html and
not understanding XML syntax for br is only a small part of the
problems.. If you send it with an XML mime type then it will use an xml
parser and parse it correctly, but you then have to supply a stylesheet
in order for it to display as it doesn't have xhtml support built in.

David

OK, I will try setting the response MIME type to text/xml.
Is text/xhtml also possible?
 

David Carlisle

Tjerk said:
Ok i will try to set the response mime type to text/xml,

OK (or application/xml is probably better).

Tjerk said:
is text/xhtml also possible?

No, the registered type for XHTML is application/xhtml+xml, but IE doesn't
know about that.

You will need to use <?xml-stylesheet to specify a stylesheet (which can
simply be <xsl:copy-of select="."/>).

But if you need to send to IE, why not send HTML, which it understands,
rather than XHTML, which it doesn't?
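
As a rough sketch of that arrangement (the file name copy.xsl and the sample page are made up for illustration, and this has not been checked against any particular IE version), the XHTML document served as application/xml would point at a stylesheet:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical page; href names an assumed stylesheet stored next to it. -->
<?xml-stylesheet type="text/xsl" href="copy.xsl"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body><p>first line<br/>second line</p></body>
</html>
```

and copy.xsl would do nothing but copy the document through:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Copy the whole source document to the result tree unchanged. -->
  <xsl:template match="/">
    <xsl:copy-of select="."/>
  </xsl:template>
</xsl:stylesheet>
```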

David
 

Tjerk Wolterink

David said:
ok (or application/xml is probably better)

no, the registered type for xhtml is application/xhtml+xml but IE doesn't know
about that.

You will need to use <?xml-stylesheet to specify a stylesheet (which can
simply be <xsl:copy-of select="."/>)

But if you need to send to IE, why not send HTML, which it understands,
rather than XHTML, which it doesn't?

I'm sending it as HTML now, with the output method set to html,
but then my document is not valid anymore:

elements like <meta> and <br> are not closed now.
Does HTML Transitional force you to close them?

Another strange thing:

I run two separate XSL processors (my hosting provider runs
another version of libxslt).

My processor understands <xsl:output method="html"
and converts <br></br> to <br>,
but the processor at the hosting provider does not understand it.
It is driving me crazy.

I thought XSL was a standard, but each processor handles that standard
differently.
 

David Carlisle

I'm sending it as html now, with output method set to html,
but then my document is not valid anymore,

XSLT does not ensure its output is valid, it is the responsibility of
the stylesheet author to do that. If the stylesheet generates an
element foobar then xslt will happily write <foobar>..</foobar>
and the resulting html will not validate.

If you were writing valid xhtml and you change the method to html and
change the doctype to specify an html dtd then it is very unlikely that
the result is invalid html. What validation error do you get?


elements like <meta and <br> are not closed now,
does html transitional force you to close them??

They are declared EMPTY in the HTML DTD so they are never opened and
don't need (and can't be) closed. HTML is not an XML language and doesn't
use XML syntax. <br> is the correct syntax for the element (or <BR> as
html is not case sensitive) <br></br> is a syntax error, <br/> is
legal syntax but equivalent to <br>&gt; and should (on a conformant system)
typeset a > after the newline. (Most browsers though don't do this, but
then they don't use conformant html parsers, they use purpose built
parsers aimed to do "something sensible" even in the face of incorrect
markup)

Tjerk said:
But the processor at the hostingprovider does not understand it.
It makes me crazy

The XSLT spec specifies that <xsl:output method="html" must be
understood (that is, not generate an xslt error) but any xslt system may
always ignore the xsl:output instruction and use its own system
specific methods to output the file. (For example, in cocoon the output
from xslt is always piped to another transformation process or
serialiser, so it is not serialised under the control of the stylesheet and
xsl:output is ignored.)

Tjerk said:
I thought xsl was a standard, but each processor handles that standard
differently

At places where the standard explicitly authorises this difference.

David
 

Tjerk Wolterink

David said:
XSLT does not ensure its output is valid, it is the responsibility of
the stylesheet author to do that. If the stylesheet generates an
element foobar then xslt will happily write <foobar>..</foobar>
and the resulting html will not validate.

If you were writing valid xhtml and you change the method to html and
change the doctype to specify an html dtd then it is very unlikely that
the result is invalid html. What validation error do you get?

Errors like this:
line 11 column 5 - Warning: <meta> element not empty or not closed
line 25 column 5 - Warning: <link> element not empty or not closed
line 2 column 1 - Warning: <html> proprietary attribute "xmlns:menu"

A piece of the generated html page:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
"http://www.w3.org/TR/REC-html40/loose.dtd">
<html xmlns:r="http://www.wolterinkwebdesign.com/xml/roles"
xmlns:menu="http://www.wolterinkwebdesign.com/xml/menu"
xmlns:page="http://www.wolterinkwebdesign.com/xml/page"
xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Scharenborg :: Assurantien en hyptheken</title>
<!--
! Wolterink Webdesign
! (C) Tjerk Wolterink
! (e-mail address removed)
! 2005
-->
<meta name="keywords" content="Scharenborg Assurantien">
<meta name="description" content="Berendsen Meubelen">
<meta name="author" content="Tjerk Wolterink;[email protected]">
<meta name="publisher" content="Scharenborg Assurantien">

<meta name="language" content="nl">
<meta name="robots" content="all">
<script type="text/javascript"
src="http://localhost/webapps/scharenborg/js/standard.js"></script>
<script type="text/javascript"
src="http://localhost/webapps/scharenborg/js/xmlhttprequest.js"></script>
<script type="text/javascript"
src="http://localhost/webapps/scharenborg/js/xmlsax.js"></script>
<script type="text/javascript"
src="http://localhost/webapps/scharenborg/js/xmlw3cdom.js"></script>
<script type="text/javascript"
src="http://localhost/webapps/scharenborg/js/vcXMLRPC.js"></script>

<script type="text/javascript"
src="http://localhost/webapps/scharenborg/js/fck_editor/fckeditor.js"></script>
<script type="text/javascript"
src="http://localhost/webapps/scharenborg/js/xcm.js"></script>
<script type="text/javascript"
src="http://localhost/webapps/scharenborg/js/submenu.js"></script>
<link rel="stylesheet" type="text/css" media="screen"
href="http://localhost/webapps/scharenborg/css/style.css">

David said:
They are declared EMPTY in the HTML DTD so they are never opened and
don't need (and can't be) closed.

With closing I also mean: <br/> (properly closed empty tag)

David said:
HTML is not an XML language and doesn't
use XML syntax. <br> is the correct syntax for the element (or <BR> as
html is not case sensitive) <br></br> is a syntax error, <br/> is
legal syntax but equivalent to <br>&gt; and should (on a conformant system)
typeset a > after the newline. (Most browsers though don't do this, but
then they don't use conformant html parsers, they use purpose built
parsers aimed to do "something sensible" even in the face of incorrect
markup)

OK, I understand, but so HTML Transitional and HTML Strict are not XML
languages? Or am I confusing XHTML with HTML?

David said:
The XSLT spec specifies that <xsl:output method="html" must be
understood (that is, not generate an xslt error) but any xslt system may
always ignore the xsl:output instruction and use its own system
specific methods to output the file. (For example, in cocoon the output
from xslt is always piped to another transformation process or
serialiser, so it is not serialised under the control of the stylesheet and
xsl:output is ignored.)

David said:
At places where the standard explicitly authorises this difference.

So the conclusion of this is:
if you're writing an HTML application that uses XSL transformation
to output the HTML, then
you have to change the XSL for each XSLT processor implementation,

and sometimes it just doesn't work,
for example with those <br><br/> tags.

Is there no possibility to enforce the parser to use <br/> for
empty tags?
 

David Carlisle

Errors like this:
line 11 column 5 - Warning: <meta> element not empty or not closed

Looks like those come from an XML parser.
Parsing HTML with an XML parser is like parsing FORTRAN with a C parser:
you get errors, but it doesn't mean that there is anything wrong with
the file.

A piece of the generated html page:

That looks OK, except of course all the xml namespace declarations
xmlns:... are invalid html attributes.

With closing i also mean: <br/> (properly closed empty tag)

But as I said, /> is XML syntax. It is not the syntax for an empty element
in HTML.

Tjerk said:
Ok i understand, but html transitional and html strict are not xml
languages? Or am i interchanging xhtml and html wrongly?

No version of HTML is an XML language. They are all defined via an SGML
DTD. The XML versions are all called XHTML.

Tjerk said:
So the conclusion of this is:
if you're writing an html application that uses xsl transformation
to output the html, then
you have to change the xsl for each xsltransformer-implementation,

No, that is not the conclusion at all.
You don't have to change your stylesheet, but you may have to specify
that you want html serialisation in other (system specific) places.
The reason why XSLT doesn't mandate that xsl:output always has an effect
is that it allows the result tree to be passed on as an in-memory tree
(or stream of sax events, or any other internal representation). This is
what happens in cocoon or if you run xslt inside of mozilla.
If the result tree is being passed on in such an in-memory format it is
never serialised to a linear document including tags, so the hints on
xsl:output as to how to serialise the tree are never used.

Is there no possibility to enforce the parser to use <br/> for
empty tags?

You can force xslt to use <br/> syntax by using the xml output method,
but that is not correct html markup for a line break.

You have to decide what you want to do, generate html or generate xhtml,
and in the latter case you need to decide if you want to make IE able
to read the file, as it has no built-in xhtml support (unlike, say, mozilla
or opera, which can render xhtml files).

The simplest, if you do not need xhtml in the document, is just to
generate html.

David
 

Tjerk Wolterink

OK, I want HTML,

but how do I overcome the problem that the parser/transformer
at the hosting provider does not honour the xsl:output method attribute?

Now it just uses the xml output method, and that does not render
well in Internet Explorer.
 

David Carlisle

but why how do i overcome the problem that the parser/transformer
on the hostingprovider does not listen to the xsl:output method attribute?

You need to tell your serialiser to output using html.
You haven't said which system you are using. The only one I use that
doesn't use xsl:output is cocoon, where the serialisers (be it html,
xhtml, text, pdf, ...) are set up in sitemap.xmap with lines looking
something like

<map:serializers default="html">
<map:serializer logger="sitemap.serializer.links" name="links" src="org.apache.cocoon.serialization.LinkSerializer"/>

<map:serializer logger="sitemap.serializer.xml" mime-type="application/xml" name="xml" src="org.apache.cocoon.serialization.XMLSerializer"/>

<map:serializer logger="sitemap.serializer.html" mime-type="text/html" name="html" pool-grow="4" pool-max="32" pool-min="4" src="org.apache.cocoon.serialization.HTMLSerializer">
<buffer-size>1024</buffer-size>
</map:serializer>
</map:serializers>

The above is just boilerplate declarations, part of the default
sitemap.

Then, for a particular file that is to be serialised as html:

<map:pipeline>
<map:match pattern="index.html">
<map:generate src="xml/frontpage.xml" type="file"/>
<map:transform src="stylesheets/html/om-page.xsl"/>
<!-- this step selects the html serialiser declared above -->
<map:serialize type="html"/>
</map:match>
</map:pipeline>


If you are not using cocoon then the above syntax will be wrong, but
probably something similar is available...


David
 

Tjerk Wolterink

The hosting provider is running:

xsl
  XSL: enabled
  libxslt Version: 1.1.12
  libxslt compiled against libxml Version: 2.6.16
  EXSLT: enabled
  libexslt Version: 1.1.12

So, libxslt 1.1.12.

With Sablotron (which I'm using) there is no problem.


But what I'm doing is:

data.xml + form.xsl (output=xml) -> temp.xml (contains html tags, but in xml format)
temp.xml + page.xsl (output=html) -> page.html

In page.xsl there is a rule like this:

<!--
! All html should remain html
!-->
<xsl:template match="*[namespace-uri(.)='' or
namespace-uri(.)='http://www.w3.org/1999/xhtml']">
<xsl:copy>
<xsl:for-each select="@*">
<xsl:copy/>
</xsl:for-each>
<xsl:apply-templates select="./node()"/>
</xsl:copy>
</xsl:template>


Maybe this rule causes <br/> to be converted to <br></br>?
 

Martin Honnen

Tjerk Wolterink wrote:

In page.xsl there is a rule like this:

<!--
! All html should remain html
!-->
<xsl:template match="*[namespace-uri(.)='' or
namespace-uri(.)='http://www.w3.org/1999/xhtml']">
<xsl:copy>
<xsl:for-each select="@*">
<xsl:copy/>
</xsl:for-each>
<xsl:apply-templates select="./node()"/>
</xsl:copy>
</xsl:template>


Maybe this rule forces <br/> to convert to <br></br>

If you really want HTML output then you should strip the namespace from
XHTML elements, e.g.
<xsl:template match="xhtml:*"
  xmlns:xhtml="http://www.w3.org/1999/xhtml">
<xsl:element name="{local-name()}">
<xsl:copy-of select="@*" />
<xsl:apply-templates />
</xsl:element>
</xsl:template>
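
Put into a complete stylesheet, that might look like the sketch below. The xsl:output line and the reliance on the built-in template rules are assumptions added for illustration; they are not taken from the page.xsl shown in the thread.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xhtml="http://www.w3.org/1999/xhtml">

  <!-- Ask for HTML serialisation; a processor may still ignore this hint. -->
  <xsl:output method="html"/>

  <!-- Recreate each XHTML element under its local name, in no namespace,
       so that the html output method treats it as an HTML element. -->
  <xsl:template match="xhtml:*">
    <xsl:element name="{local-name()}">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

Given <p xmlns="http://www.w3.org/1999/xhtml">one<br></br>two</p> as input, a processor that honours the html output method should write <p>one<br>two</p>. Text is copied by the built-in rules, comments and processing instructions are dropped, and elements in other namespaces (such as the menu: and page: ones) are skipped while their content is still processed.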
 

Tjerk Wolterink


OK, so namespaces are not allowed in HTML?

I cannot wait until browsers truly support XML syntax.

Another question:

How does XSL handle empty elements?
Does it convert them to
<element/>
or to
<element></element>

And are there any settings in the parser to control that?
 

Richard Tobin

Tjerk Wolterink said:
Ok thus namespaces are not allowed in html?

HTML, being SGML rather than XML, knows nothing about namespaces.
As far as HTML is concerned, xhtml:html is an element it's never
heard of, and <html xmlns="whatever-the-xhtml-namespace-is"> has
an invalid attribute (though it probably wouldn't mind that).

But as far as outputting HTML from XSLT goes, only elements in no
namespace are output according to the HTML rules. So you have to
choose between outputting XHTML using the appropriate namespace, or
outputting HTML with no namespace.

Tjerk Wolterink said:
How does xsl handle empty elements,
does it convert them to
<element/>
or to
<element></element>

When outputting XML, it's up to the implementation, because they are
equivalent as XML. When outputting HTML, it will do the Right Thing,
according to what kind of HTML element it is (so it will output
always-empty elements as, for example, <br>).
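
A small sketch of the difference, using literal result elements in no namespace (the stylesheet is made up for illustration and ignores its input):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Change method to "xml" to compare the two serialisations. -->
  <xsl:output method="html"/>

  <xsl:template match="/">
    <!-- br is declared EMPTY in HTML; span is not. -->
    <p>one<br/>two<span/></p>
  </xsl:template>

</xsl:stylesheet>
```

With method="html" a conforming serialiser should write <p>one<br>two<span></span></p>. With method="xml" it may write <br/> or <br></br>, and <span/> or <span></span>, at its own discretion.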

-- Richard
 

Tjerk Wolterink

Richard said:
HTML, being SGML rather than XML, knows nothing about namespaces.
As far as HTML is concerned, xhtml:html is an element it's never
heard of, and <html xmlns="whatever-the-xhtml-namespace-is"> has
an invalid attribute (though it probably wouldn't mind that).

But as far as outputting HTML from XSLT goes, only elements in no
namespace are output according to the HTML rules. So you have to
choose between outputting XHTML using the appropriate namespace, or
outputting HTML with no namespace.

OK, but my output method is set to html, so why are the namespaces still
there?

Richard said:
When outputting XML, it's up to the implementation, because they are
equivalent as XML. When outputting HTML, it will do the Right Thing,
according to what kind of HTML element it is (so it will output
always-empty elements as, for example, <br>).

-- Richard

ok thanks
 
