xsltproc and DocBook

Darel Finkbeiner

This may be the wrong group, so let me know.

My "problem" is this: I am writing my commentary in DocBook 5 and
using the program xsltproc and the docbook5 XSL stylesheets to produce
XHTML output. Since it is a commentary, it has both English and
polytonic Greek with combining diacritics in it. My console and VIM
are both perfectly configured to allow me to edit such documents in a
very natural and easy way, and one in which I can actually read the
Greek that I've typed in.

After processing with xsltproc, all of my beautiful UTF-8 encoded
Greek is being transformed into butt-ugly entity references.

Now, I suppose, "technically speaking", this isn't an issue when
viewing the HTML document in a browser... maybe. But I like to be
able to view and "debug" the resulting file in a text editor.
Additionally, how am I to be sure that the "correct" Unicode
code points are being used for crucial combining marks (and by
"correct", I mean the exact code points that I have chosen to use,
since there are alternatives in the Unicode standard)? I
specifically chose XHTML output because it is natively UTF-8, so why
convert them to entities in the first place?

My question is, how do I turn off this "feature"? Or can I? Or
should I use a different XSLT processor?
 
Joseph Kesselman

Did you specify UTF-8 as your output encoding in the xsl:output directive?

If you did, and you're still getting everything converted to character
references... you may want to try another XSLT processor and see if its
serializer does a better job of taking advantage of UTF-8.
 
Darel Finkbeiner

Joseph Kesselman said:
Did you specify UTF-8 as your output encoding in the xsl:output directive?

If you did, and you're still getting everything converted to character
references... you may want to try another XSLT processor and see if its
serializer does a better job of taking advantage of UTF-8.

It looks like the output method in the XSL stylesheet sets the
encoding correctly.....

<xsl:output method="xml" encoding="UTF-8" indent="no"
  doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN"
  doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"/>

Any suggestions on a good XSLT processor?
 
Alain Ketterlin

Darel Finkbeiner said:
My question is, how do I turn off this "feature"? Or can I? Or
should I use a different XSLT processor?

You may try to not even mention XHTML in <xsl:output> (make a new
"driver" xslt stylesheet, with only <xsl:output> and <xsl:include> of
the other xsl). xsltproc should not use any entity then.

-- Alain.
 
Darel Finkbeiner

Alain Ketterlin said:
You may try to not even mention XHTML in <xsl:output> (make a new
"driver" xslt stylesheet, with only <xsl:output> and <xsl:include> of
the other xsl). xsltproc should not use any entity then.

-- Alain.

Amazing... you were absolutely correct. I changed the output to:

<xsl:output method="xml" encoding="UTF-8"/>

And suddenly it worked perfectly. Thanks for the tip, Alain!
 
Joseph Kesselman

Darel said:
Amazing... you were absolutely correct. I changed the output to:
<xsl:output method="xml" encoding="UTF-8"/>
And suddenly it worked perfectly. Thanks for the tip, Alain!

Note that method="xhtml" is actually not defined in the XSLT 1.0
standard... but since XHTML is an XML language, outputting as XML should
work.
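
For anyone landing on this thread later, here is a minimal sketch of the "driver" stylesheet idea Alain describes. The file name driver.xsl and the href path to the DocBook XHTML stylesheet are placeholders, not taken from the thread, so adjust them for your installation. The sketch uses xsl:import rather than the xsl:include Alain mentions, so that the driver's xsl:output has higher import precedence than the one in the stock stylesheet. Be aware that XSLT 1.0 merges xsl:output declarations attribute by attribute, so doctype-public and doctype-system values set in the imported stylesheet may still apply. The thread does not say whether Darel used a driver or edited the stylesheet directly, so treat this as a starting point rather than a confirmed fix.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- driver.xsl (hypothetical name): a thin wrapper around the stock stylesheet -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Assumed location of the DocBook XHTML stylesheet; change to match your system.
       xsl:import must come before any other top-level element. -->
  <xsl:import href="docbook-xsl/xhtml/docbook.xsl"/>

  <!-- The output declaration Darel reported as working: plain XML, UTF-8,
       with no XHTML doctype attributes mentioned here. -->
  <xsl:output method="xml" encoding="UTF-8"/>

</xsl:stylesheet>
```

Assuming an input file called commentary.xml (also a placeholder), this could be run as: xsltproc -o commentary.html driver.xsl commentary.xml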
 
