Novice - trying to get started with docbook

Jim Anderson

This is my first attempt at XML documentation.
I'm trying to get started with docbook so I can put a set
of documentation into docbook tags. I'm using "XML in a
Nutshell" and "DocBook: The Definitive Guide", both of which
are a bit outdated already.

I have a simple file that parses, but when I read it into
Netscape or Konqueror, I do not get the results that I would
hope for.

First, I'm not sure that the browsers are picking up
the referenced dtd file with the URL:
http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd

I'm opening a local file, 'book.xml', in each of the browsers.
This file should be reading the local file, 'chap1.xml'. I'm
attaching both files.


In Netscape I get the following results:

_________________________________________
This XML file does not appear to have any style information
associated with it. The document tree is shown below.

- <book>
<title>My First Book</title>
<chapter>Chapter 2</chapter>
<chapter>Chapter 3</chapter>
</book>

_________________________________________


In Konqueror I get these results:

_________________________________________
Chapter 2Chapter 3
_________________________________________



Netscape simply displays the xml tags, implying it did
not know how to interpret them. Konqueror seems to have
digested the tags, implying they are legitimate tags and
also implying that it read the dtd. But there is no
formatting for the 'chapters', so this implies Konqueror
did not really handle the dtd.

My questions are as follows:

How do I know if the docbookx.dtd is actually being read?

Do I need to use a different xml processor?

Jim Anderson
 
Andy Dingley

Throw away your XML Nutshell guide. I haven't read an O'Reilly worth
having in years now (sadly) - and that's certainly not one. DocBook
isn't great, but it's the only (?) book around.

I'd recommend a decent XML / XSLT primer but I'm actually stuck for one.

Is Michael Kay's XSLT book still the best around? Surely not by now?

Jim said:
First, I'm not sure that the browsers are picking up
the referenced dtd file with the URL:
http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd

They aren't. XML processors (except in rare cases) do nothing with
DTDs.

XML is not SGML. XML is basically limited, but simple, and with a
rather better developed and better integrated DOM interface (a
programming API that represents the parsed document). For SGML the DTD
is a key part of parsing all documents - for XML it's just
documentation for the humans writing the XSLT stylesheets that transform
your XML into something more useful.

Fortunately for DocBook you need to write very little of this stuff as
most is already available - try Norman Walsh's XSLT libraries.

Jim said:
I'm opening a local file, 'book.xml', in each of the browsers.
This file should be reading the local file, 'chap1.xml'. I'm
attaching both files.

Don't attach, upload and post URLs. For a lot of this sort of debugging
we need to see it live and for real.

Jim said:
Netscape simply displays the xml tags, implying it did
not know how to interpret them.

(Actually it probably did interpret them, but in a vanilla default
manner)

What you need to do here is to provide some XSLT to transform the XML
into HTML, then look at that through the browser. Either attach the
stylesheet to the XML document itself, or transform it server-side and
serve the resultant HTML. Web searching will surely turn up tutorials -
this is very old hat by now.
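
A minimal Java sketch of the transform-then-view route, using the JAXP classes bundled with the JDK. The input mirrors the document tree Netscape displayed; the stylesheet is a hypothetical toy written for this example, not the DocBook XSL distribution, and the class and method names are illustrative only.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class BookToHtml {
    // Sample input modelled on the tree shown in the original post.
    static final String BOOK_XML =
        "<book><title>My First Book</title>"
      + "<chapter>Chapter 2</chapter><chapter>Chapter 3</chapter></book>";

    // Hypothetical toy stylesheet (an assumption for illustration):
    // the book title becomes <h1>, each chapter becomes <h2>.
    static final String BOOK_XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html' indent='no'/>"
      + "<xsl:template match='/book'>"
      + "<html><body><h1><xsl:value-of select='title'/></h1>"
      + "<xsl:apply-templates select='chapter'/></body></html>"
      + "</xsl:template>"
      + "<xsl:template match='chapter'>"
      + "<h2><xsl:value-of select='.'/></h2>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Compile the stylesheet, run it over the XML, return the HTML text.
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(BOOK_XML, BOOK_XSL));
    }
}
```

For real DocBook input the same `transform` call should work with the downloaded DocBook stylesheet in place of the toy one, reading both from files with `new StreamSource(new File(...))`.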

Also look at making PDF etc. by use of XSL-FO and Apache FOP. This also
needs XSLT knowledge (or download existing work), but XSL-FO is less
trouble to learn than XSLT is and probably worth looking at.

Jim said:
Do I need to use a different xml processor?

This is a question for your language platform, not your document format.
There are any number of them around and most are usable. XML / XSLT is
surprisingly platform independent and it's really not that hard to
switch processors (this is amazing stuff if you're experienced with most
software development!)

If you really get stuck, Manning's "Ajax in Action" book wouldn't hurt
to read, irrelevant though it might seem at present.
 
Joe Kesselman

Andy said:
Is Michael Kay's XSLT book still the best around? Surely not by now?

It's still the best one I've seen for an intensive and authoritative
description of the language. Might not be the easiest thing to learn
from, but definitely worth having on hand as a reference if you aren't
good at reading formal specifications (and even if you are).

But I haven't been looking at books in a while, so it's certainly
possible there's something better out there.

Andy said:
They aren't. XML processors (except in rare cases) do nothing with
DTDs.

Not quite true. Some applications validate XML documents against their
DTD; many don't.

A DTD (or an XML Schema) is a formal description of what kinds of
documents are acceptable, and acts as a "contract" between the tool or
person writing the document and the tool or person reading it. For
informal use by humans that often isn't needed, so browsers generally
don't validate unless explicitly told to do so. But if you're trying to
design machine-to-machine transactions, you really do want to nail down
what you mean by "a purchase order document" or "a database query
transaction", to make sure everyone agrees on how to create and read
those messages.

In the case of DocBook, validation can help ensure that the document is
written correctly and hence will be processed correctly. But you may be
able to get away without it.

Andy said:
Fortunately for DocBook you need to write very little of this stuff as
most is already available - try Norman Walsh's XSLT libraries.

Good resource. On the other hand, DocBook can be legitimately
rendered/processed/filtered in many different ways and to different
target representations, so those are just one possible starting point.

Andy said:
There are any number of them around and most are usable. XML / XSLT is
surprisingly platform independent and it's really not that hard to
switch processors (this is amazing stuff if you're experienced with most
software development!)

The W3C, and the members who did the actual work of thrashing out these
details, put a lot of man-hours into achieving exactly that, making sure
XML and the specs built around it hit good "sweet spots" of generality,
usefulness, implementability and portability. I was involved in some of
that, both directly and informally; I was generally impressed by the
quality and seriousness of the people involved, and their willingness to
listen to other points of view.

XML and the related specs do have some warts; there are things I'm sure
we'd do differently if we were doing it all again with the benefit of
what we've learned. But for something that was built up incrementally,
in parallel, and sometimes backward from the ideal sequence, it's
surprisingly reasonable!
 
Henry S. Thompson

Jim said:
This is my first attempt at XML documentation.
I'm trying to get started with docbook so I can put a set
of documentation into docbook tags.

All you need is an xml-stylesheet processing instruction at the top of
your document, so the browser can get instructions on how to render
your XML:

<?xml-stylesheet type="text/xsl"
href=".../docbook-xsl-1.69.1/html/docbook.xsl"?>

Download the stylesheets from

http://docbook.sourceforge.net/projects/xsl/
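
For a whole set of documents, the processing instruction can also be added programmatically. The sketch below is one possible way to do it with the JDK's DOM and JAXP classes; the class name, the method name, and the relative `docbook.xsl` path are assumptions made for the example, so substitute the real location of the downloaded stylesheet.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.ProcessingInstruction;
import org.xml.sax.InputSource;

class StylesheetPi {
    // Parse the XML, put an xml-stylesheet PI in front of the root
    // element, and return the re-serialized document as a string.
    static String addStylesheetPi(String xml, String href) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            // The PI data is plain text that looks like attributes;
            // href is inserted as-is, so it must not contain quotes.
            ProcessingInstruction pi = doc.createProcessingInstruction(
                "xml-stylesheet",
                "type=\"text/xsl\" href=\"" + href + "\"");
            doc.insertBefore(pi, doc.getDocumentElement());

            // A Transformer created without a stylesheet copies input to output.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not add stylesheet PI", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<book><title>My First Book</title></book>";
        // "docbook.xsl" is a hypothetical relative path.
        System.out.println(addStylesheetPi(xml, "docbook.xsl"));
    }
}
```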

ht
--
Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
Half-time member of W3C Team
2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 650-4587, e-mail: (e-mail address removed)
URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
 
Andy Dingley

Another good solution for on-the-fly rendering to HTML is the following:
http://www.badgers-in-foil.co.uk/projects/docbook-css/

Now why didn't I think of trying that domain name? :cool:

I _wouldn't_ recommend this CSS approach entirely though. It's a lot
more limited than XSLT. CSS is entirely presentational, so it can't
generate links, re-order sections of the document, or duplicate sections
to more than one place (useful for tables-of-contents).
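
To illustrate the difference, here is a small Java sketch in which an XSLT stylesheet emits each chapter title twice, once as a linked table-of-contents entry and once as a heading, which is the kind of duplication and link generation described above. The stylesheet, the `ch1`/`ch2` anchor names, and the class name are hypothetical and written only for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class TocSketch {
    static final String BOOK_XML =
        "<book><title>My First Book</title>"
      + "<chapter>Chapter 2</chapter><chapter>Chapter 3</chapter></book>";

    // Hypothetical stylesheet: the first loop builds a linked TOC, the
    // second loop outputs the same chapters again as anchored headings.
    // position() numbers the chapters 1, 2, ... within each loop.
    static final String TOC_XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html' indent='no'/>"
      + "<xsl:template match='/book'>"
      + "<html><body><ul>"
      + "<xsl:for-each select='chapter'>"
      + "<li><a href='#ch{position()}'><xsl:value-of select='.'/></a></li>"
      + "</xsl:for-each>"
      + "</ul>"
      + "<xsl:for-each select='chapter'>"
      + "<h2 id='ch{position()}'><xsl:value-of select='.'/></h2>"
      + "</xsl:for-each>"
      + "</body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Apply the stylesheet to the XML and return the resulting HTML.
    static String render(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render(BOOK_XML, TOC_XSL));
    }
}
```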
 
Andy Dingley

Joe said:
Not quite true. Some applications validate XML documents against their
DTD; many don't.

A DTD (or an XML Schema) is a formal description of what kinds of
documents are acceptable, and acts as a "contract" between the tool or
person writing the document and the tool or person reading it. For
informal use by humans that often isn't needed, so browsers generally
don't validate unless explicitly told to do so. But if you're trying to
design machine-to-machine transactions, you really do want to nail down
what you mean by "a purchase order document" or "a database query
transaction", to make sure everyone agrees on how to create and read
those messages.

While you're obviously correct (and why I stated "except in rare cases")
I'd still claim that these are rare cases.

XML simply does not need a DTD to parse the document into the DOM. SGML
does. The major difference between the SGML and XML specs is that XML is
simplified to make documents self-parseable, without a DTD.
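
One way to see whether a DTD is actually being used is to parse the same document twice, with validation off and then on. The Java sketch below assumes a toy DTD placed in the internal subset, so it runs without network access. The document is deliberately well-formed but invalid, and the class and method names are illustrative.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class DtdValidationSketch {
    // Toy DTD (an assumption, not DocBook): a book is a title followed
    // by one or more chapters. The <para> element is undeclared, so the
    // document is well-formed but does not conform to the DTD.
    static final String XML =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE book [\n"
      + "  <!ELEMENT book (title, chapter+)>\n"
      + "  <!ELEMENT title (#PCDATA)>\n"
      + "  <!ELEMENT chapter (#PCDATA)>\n"
      + "]>\n"
      + "<book><title>My First Book</title><para>oops</para></book>";

    // Parse XML and return the validity problems the parser reported.
    static List<String> parse(boolean validating) {
        List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(validating); // DTD validation is off by default
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) {
                    problems.add(e.getMessage());
                }
                public void error(SAXParseException e) {
                    problems.add(e.getMessage()); // validity errors arrive here
                }
                public void fatalError(SAXParseException e) throws SAXParseException {
                    throw e; // not well-formed: give up
                }
            });
            builder.parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return problems;
    }

    public static void main(String[] args) {
        System.out.println("validation off: " + parse(false));
        System.out.println("validation on:  " + parse(true));
    }
}
```

With validation off the parser builds the DOM and reports nothing, even though the document breaks the DTD's rules; with validation on, the same document produces error reports.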

Secondly, DTDs are far from adequate for documenting a data format. XML
Schema isn't a lot better! (it does add data typing though) Neither of
these offer any semantics, so they're quite insufficient for acting as
the "contract" you describe. It's arguable if OWL is even enough for
this.

Thirdly, DTDs are in an obscure syntax unfamiliar to XML developers.
Very few XML developers understand it even slightly well.

For all of these reasons, DTDs simply aren't used by XML applications,
except in rare cases. In a few more cases you might see XML Schema used,
but even that is hardly common.
 
Joe Kesselman

Andy said:
XML simply does not need a DTD to parse the document into the DOM.

I'd say "does not need a DTD to simply parse the document". Whether you
care about validating depends on the application, and every time I
assume I know what a "typical" application is, someone hits me with
another important one that does things differently.

Andy said:
Secondly, DTDs are far from adequate for documenting a data format. XML
Schema isn't a lot better!

They define another layer of syntax checking. They don't define
semantics, but nothing short of an application or a brain can do that
very well.

Andy said:
Thirdly, DTDs are in an obscure syntax unfamiliar to XML developers.
Very few XML developers understand it even slightly well.

Here I agree. Yes, you can get by without understanding DTDs or schemas.
But I think you're going to have trouble defending calling yourself "an
XML developer" on your resume unless you're at least marginally familiar
with these schema languages. (I don't claim to be fully fluent in XML
Schema myself, but I recognize that as something I need to correct when
management stops expecting me to work miracles every week and gives me
time to breathe.)

Andy said:
For all of these reasons, DTDs simply aren't used by XML applications,
except in rare cases. In a few more cases you might see XML Schema used,
but even that is hardly common.

As I say: They may indeed be rare in the applications you're dealing
with. I wouldn't advise generalizing it beyond that statement, and I
expect that to change over time... in fact, I've seen some technology
recently that is likely to help accelerate that change by using schema
information to improve processing speed as well as precision.

Your mileage will vary. Void where prohibited. Absolutes are always
inherently false, including this one.
 
Jim Anderson

Thanks to all of you! It took a while to get
some free time to try out using XSLT. I started using
XSLT yesterday and I'm getting my xml files translated
using java.

So it's:

*.xml --> java parser --> *.html --> browsers

I'd really like to get:

*.xml --> browsers

When I get more time, I'll experiment some. It seems
like it should work. For now, this is ok.

Jim
 
Andy Dingley

Jim said:
So it's:

*.xml --> java parser --> *.html --> browsers

Yes.

Jim said:
I'd really like to get:

*.xml --> browsers

You can't.

You can do
*.xml --> {some} browsers
easily, but it limits your browser audience. Stick with doing it server
side.
 
Joe Kesselman

Andy said:
You can do
*.xml --> {some} browsers
easily, but it limits your browser audience. Stick with doing it [server]
side.

Or: Have your server check which browser is in use, and make the
decision on that basis.

Actually, the longterm right answer *is* client-side... provide a
default stylesheet, but let the client choose to make their own
decisions about how to style the information. Unfortunately everyone's
gotten so caught up in micro-styling their websites that they've
forgotten that the purpose is to present information in the form most
useful to the reader...
 
