Generating multiple XHTML pages from an XML file


Tristan Miller

Greetings.

I would like to produce a static multilingual website in XHTML. Is it
possible to specify each web page in its own XML file, but have all of the
translations encapsulated in that one file, and then process each XML file
to generate separate language-variant XHTML files?

For example, say we have a file foo.xml which contains, in part, something
like the following:

<body>
<h1>
<only lang="de">Guten Tag!</only>
<only lang="en">Hello!</only>
<only lang="hu">Jó napot kivánok!</only>
<only lang="fr">Bonjour!</only>
</h1>
</body>

I want to be able to run some simple command-line utility which will
automatically generate four separate XHTML files foo.de.html, foo.en.html,
foo.hu.html, and foo.fr.html, containing, respectively,

<body>
<h1>
Guten Tag!
</h1>
</body>

<body>
<h1>
Hello!
</h1>
</body>

And so on.

If this is possible, how do I go about doing this, and what software do I
need? (I am running a GNU/Linux system.)

Regards,
Tristan
 

Anne van Kesteren

Tristan said:
Greetings.

I would like to produce a static multilingual website in XHTML. Is it
possible to specify each web page in its own XML file, but have all of the
translations encapsulated in that one file, and then process each XML file
to generate separate language-variant XHTML files?

For example, say we have a file foo.xml which contains, in part, something
like the following:

[snip]

I want to be able to run some simple command-line utility which will
automatically generate four separate XHTML files foo.de.html, foo.en.html,
foo.hu.html, and foo.fr.html, containing, respectively,

[snip]

And so on.

If this is possible, how do I go about doing this, and what software do I
need? (I am running a GNU/Linux system.)

Use XSLT.
 

Martin Honnen

Tristan said:
I would like to produce a static multilingual website in XHTML. Is it
possible to specify each web page in its own XML file, but have all of the
translations encapsulated in that one file, and then process each XML file
to generate separate language-variant XHTML files?

For example, say we have a file foo.xml which contains, in part, something
like the following:

<body>
<h1>
<only lang="de">Guten Tag!</only>
<only lang="en">Hello!</only>
<only lang="hu">Jó napot kivánok!</only>
<only lang="fr">Bonjour!</only>

I suggest using, e.g.,
<only xml:lang="de">
as the xml:lang attribute is the way the XML standards suggest the
language of an element's content should be specified.

I want to be able to run some simple command-line utility which will
automatically generate four separate XHTML files foo.de.html, foo.en.html,
foo.hu.html, and foo.fr.html, containing, respectively,

<body>
<h1>
Guten Tag!
</h1>
</body>

If this is possible, how do I go about doing this, and what software do I
need? (I am running a GNU/Linux system.)

Transforming XML files is easily done with XSLT. There are several XSLT
processors implemented in Java, some of which have command-line
interfaces, for instance Saxon:
http://saxon.sourceforge.net/

Of course you need an XSLT stylesheet that takes the language as a
parameter. For instance, the following stylesheet copies all nodes
except <only> elements; for those, only the content is copied, and only
if the xml:lang attribute matches the outputLanguage parameter:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="html" encoding="UTF-8" />

<xsl:param name="outputLanguage" select="'en'" />

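<!-- XSLT 1.0 does not allow variable references in match patterns,
     so the language test is done inside the template instead. -->
<xsl:template match="only">
<xsl:if test="lang($outputLanguage)">
<xsl:apply-templates select="node()" />
</xsl:if>
</xsl:template>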

<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()" />
</xsl:copy>
</xsl:template>

</xsl:stylesheet>


The stylesheet is set to produce HTML output, but if needed it can also
produce XHTML (for example, by changing the xsl:output method to "xml").

Followup-To comp.text.xml
 

Tristan Miller

Greetings.

Of course you need an XSLT stylesheet that takes the language as a
parameter, for instance the following XSLT stylesheet just copies all
nodes besides <only> element nodes for which only the content is copied
if the xml:lang attribute matches the outputLanguage parameter:

OK, so I take it, then, that XSLT by default can't specify different output
streams, and that I would need to use a shell script to invoke the XSLT
processor on the stylesheet and source file n separate times (i.e., once
for each output language). Correct?

Thanks for the sample stylesheet; this is pretty much the sort of example I
was looking for to help get me started.

Regards,
Tristan
 

Martin Honnen

Tristan said:
OK, so I take it, then, that XSLT by default can't specify different output
streams, and that I would need to use a shell script to invoke the XSLT
processor on the stylesheet and source file n separate times (i.e., once
for each output language). Correct?

With XSLT 1.0 you can't have different output streams, and you would
indeed have to call the XSLT processor on the stylesheet and the source
file, passing in the language as a parameter, once for each language.
However, some processors (Saxon, for instance) have extensions to
produce multiple output files in one pass.
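
The one-call-per-language approach described above could be scripted
roughly as follows. This is a minimal sketch, not something posted in the
thread: it assumes xsltproc (from libxslt) as the XSLT 1.0 processor
rather than Saxon, the function name build_variants and the file name
only.xsl are made up for illustration, and the source file is assumed to
mark its variants with xml:lang as suggested earlier.

```shell
#!/bin/sh
# Sketch: run an XSLT 1.0 processor once per output language.
# Assumption: xsltproc (libxslt) is installed and on the PATH.

# build_variants STYLESHEET SOURCE LANG...
# For SOURCE foo.xml and LANG de, writes foo.de.html, and so on.
build_variants() {
    stylesheet=$1
    src=$2
    shift 2
    base=${src%.xml}    # strip the .xml suffix: foo.xml -> foo
    for lang in "$@"; do
        # --stringparam sets the stylesheet parameter to a string value
        xsltproc --stringparam outputLanguage "$lang" \
            "$stylesheet" "$src" > "$base.$lang.html" || return 1
    done
}
```

With the stylesheet saved as only.xsl (a hypothetical name), calling
build_variants only.xsl foo.xml de en hu fr
would write foo.de.html, foo.en.html, foo.hu.html, and foo.fr.html.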
 
