Jeff said:
It's been a while since I had to delve into this topic, and I hope this is a
good place for this question.
Good as any.
I don't know much about schemas, probably a bit more about DTDs.
I have a system that people/users can customize extensively.
I'm thinking that if they could use XML to transfer information between
systems, that would be helpful.
So far, so good...
So I figured that I would need to define a DTD.
But someone told me a DTD would be too limiting, as customers need to be able
to support exporting/importing extra tables, columns, etc. that were not
originally specified.
This is jumping several guns at a time.
I assume from what you say that part of the document type will involve
tables of some kind, right?
Is that all numeric, or will the documents also contain text?
Importing and exporting *what* into and out of *what*? A database? Or
some other application like a wordprocessor or spreadsheet?
So is a DTD really limiting in this respect, or could I define elements to
handle that, such as generic table and column names and values?
There is no inbuilt restriction in a DTD or a Schema on the depth or
width of markup, so a table can have as many rows and columns as needed,
and no limit has to be placed on either.
It is certainly possible to constrain a table to having x columns and y
rows, if you needed to -- but from what you say, you don't.
But in a generic table typically all rows and columns have the same
element type name, e.g.:
<table>
  <row>
    <col></col><col></col><col></col><col></col>
  </row>
  <row>
    <col></col><col></col><col></col><col></col>
  </row>
  <row>
    <col></col><col></col><col></col><col></col>
  </row>
</table>
If you wanted them all to be called by different names, then you would
indeed be constraining them to a fixed number. But normally, if you want
a column to have a human-understandable name, you put it in an
attribute, e.g. <col name="Income">. This way any user can create tables
of any width. You can name the rows in a similar way.
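
As a rough sketch, a DTD for such a generic table might look like the
following. The declarations are placed in an internal subset only to keep
the example self-contained, and the row and column labels are made up for
illustration. Making the name attribute optional (#IMPLIED) is my
assumption; you could equally require it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE table [
  <!-- One or more rows, each with one or more columns: no upper limit -->
  <!ELEMENT table (row+)>
  <!ELEMENT row (col+)>
  <!ELEMENT col (#PCDATA)>
  <!-- The human-readable label goes in an attribute, not the element name -->
  <!ATTLIST row name CDATA #IMPLIED>
  <!ATTLIST col name CDATA #IMPLIED>
]>
<table>
  <row name="first">
    <col name="Income">20</col>
    <col name="Expenditure">19</col>
  </row>
  <row name="second">
    <col name="Income">20</col>
    <col name="Expenditure">21</col>
    <!-- An extra column nobody planned for is still valid -->
    <col name="Notes">added later</col>
  </row>
</table>
```

Because every column is a col, the second row can carry a column the first
one lacks without any change to the DTD.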
What you may be thinking of is a typical data application taken from a
database where a "table" has an entirely different meaning. Once
re-expressed in XML, these typically have a format like:
<quill-pen-stuff>
  <accounting-record id="abc" lots="of" other="attributes">
    <income>£20</income>
    <expenditure>£19.19s.6d</expenditure>
    <result>happiness</result>
  </accounting-record>
  <accounting-record id="xyz" lots="of" other="attributes">
    <income>£20</income>
    <expenditure>£20.0s.6d</expenditure>
    <result>misery</result>
  </accounting-record>
</quill-pen-stuff>
In XML terms this is not a table. It's just an element containing other
elements containing other elements. Referring to it in XML as "a table"
or "a record" or "a field" is inaccurate, fallacious, and misleading,
and usually evidence of spending far longer in the company of databases
than is healthy for a grown adult.
You can certainly format and view it in tabular form, no problem. But
you can't arbitrarily add undeclared elements to its content when using
a DTD or Schema, because that is precisely what they are intended to
prevent. They act as a control on the formation of a document,
to ensure that it happens according to plan. If there is no plan (as you
seem to imply) then all bets are off for this structure.
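
To make that concrete, here is a sketch of the kind of DTD that would sit
behind the accounting example. The content model and the attribute
declaration are my guesses, not anything from your system, and the
placeholder "lots of other attributes" are left out.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE quill-pen-stuff [
  <!ELEMENT quill-pen-stuff (accounting-record+)>
  <!-- Exactly these three children, in this order, and nothing else -->
  <!ELEMENT accounting-record (income, expenditure, result)>
  <!ATTLIST accounting-record id CDATA #REQUIRED>
  <!ELEMENT income (#PCDATA)>
  <!ELEMENT expenditure (#PCDATA)>
  <!ELEMENT result (#PCDATA)>
]>
<quill-pen-stuff>
  <accounting-record id="abc">
    <income>£20</income>
    <expenditure>£19.19s.6d</expenditure>
    <result>happiness</result>
    <!-- Adding <bonus>£5</bonus> here would make the document invalid:
         bonus is not declared and is not in the content model. -->
  </accounting-record>
</quill-pen-stuff>
```

A validating parser accepts this as it stands and rejects it as soon as a
customer slips in an element the DTD doesn't know about.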
Is a schema any more advantageous?
Nope. The problem is the conceptual part (whether there is a plan for the
structure), not the technology you apply to enforce it.
Without seeing your application I can't judge what you need, but at a
wild guess you could easily define an extensible container structure
like the generic table above, and allow your customers to add as many
subelements as they like provided they give each one a name in an
attribute value (or some such constraint). Neither a DTD nor a Schema
imposes any restrictions on this method.
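
For instance, the accounting data could be re-cast in an extensible form
along these lines. The element and attribute names (records, record, field,
type, name) are purely hypothetical, chosen for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE records [
  <!ELEMENT records (record*)>
  <!-- A record holds any number of fields -->
  <!ELEMENT record (field*)>
  <!ATTLIST record type CDATA #REQUIRED
                   id   CDATA #IMPLIED>
  <!ELEMENT field (#PCDATA)>
  <!-- Every field must say what it is -->
  <!ATTLIST field name CDATA #REQUIRED>
]>
<records>
  <record type="accounting-record" id="abc">
    <field name="income">£20</field>
    <field name="expenditure">£19.19s.6d</field>
    <field name="result">happiness</field>
    <!-- A customer-defined field: valid without touching the DTD -->
    <field name="mood">cheerful</field>
  </record>
</records>
```

The price of this flexibility is that the DTD can no longer check that a
given record actually contains an income or an expenditure; that check moves
into your application.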
///Peter