dtds, customization question

Jeff Kish

It's been a while since I had to delve into this topic, and I hope this is a
good place for this question.

I don't know much about schemas, probably a bit more about DTDs.
I have a system that people/users can customize extensively.
I'm thinking that if they could use XML to transfer information between
systems, that would be helpful.

So I figured that I would need to define a DTD.
But someone told me a DTD would be too limiting, as customers need to be able
to support exporting/importing extra tables, columns, etc. that were not
originally specified.

So is a DTD really limiting in this respect, or could I define elements to
handle that, such as generic table and column names and values?

Is a schema any more advantageous?


Thanks for your time. I'll post elsewhere if this isn't a good place for
this; just flame me.


Jeff Kish
 
Joseph Kesselman

Lax schemas can be written, which leave the content unconstrained
(except for well-formedness) in specific places. That's part of how you
design something like XHTML which will have elements from other schemas
mixed into it.

DTDs do something similar using the ANY keyword.
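
As a rough illustration of the two mechanisms mentioned above (the element names "record", "id" and "extensions" are invented for this sketch, not taken from any real system): an XML Schema wildcard with processContents="lax" accepts any well-formed element at that point and validates it only if a declaration for it can be found. The DTD counterpart is shown in a comment; note that with ANY, each element that actually appears must still be declared somewhere in the DTD for the document to be valid.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical schema: all names are illustrative only. -->
<!-- DTD counterpart of the open slot: <!ELEMENT extensions ANY> -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="record">
    <xs:complexType>
      <xs:sequence>
        <!-- Strictly declared part of the record -->
        <xs:element name="id" type="xs:string"/>
        <!-- Open slot: children are validated only when a declaration
             for them is available, otherwise they are just accepted -->
        <xs:element name="extensions" minOccurs="0">
          <xs:complexType>
            <xs:sequence>
              <xs:any processContents="lax"
                      minOccurs="0" maxOccurs="unbounded"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Confining the wildcard to a single container element keeps the rest of the record strictly checked, which is one possible design, not the only one.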
 
Peter Flynn

Jeff said:
> It's been a while since I had to delve into this topic, and I hope this is a
> good place for this question.

Good as any.

> I don't know much about schemas, probably a bit more about DTDs.
> I have a system that people/users can customize extensively.
> I'm thinking that if they could use XML to transfer information between
> systems, that would be helpful.

So far, so good...

> So I figured that I would need to define a DTD.
> But someone told me a DTD would be too limiting, as customers need to be able
> to support exporting/importing extra tables, columns, etc. that were not
> originally specified.

This is jumping several guns at a time.
I assume from what you say that part of the document type will involve
tables of some kind, right?
Is that all numeric, or will the documents also contain text?
Importing and exporting *what* into and out of *what*? A database? Or
some other application like a word processor or spreadsheet?

> So is a DTD really limiting in this respect, or could I define elements to
> handle that, such as generic table and column names and values?

There is no inbuilt restriction in a DTD or a Schema on the depth or
width of markup, so a table can have as many rows and columns as needed,
and no limit has to be placed on either.

It is certainly possible to constrain a table to having x columns and y
rows, if you needed to -- but from what you say, you don't.

But in a generic table typically all rows and columns have the same
element type name, eg

<table>
  <row>
    <col></col><col></col><col></col><col></col>
  </row>
  <row>
    <col></col><col></col><col></col><col></col>
  </row>
  <row>
    <col></col><col></col><col></col><col></col>
  </row>
</table>

If you wanted them all to be called by different names, then you would
indeed be constraining them to a fixed number. But normally, if you want
a column to have a human-understandable name, you put it in an
attribute, eg <col name="Income">. This way any user can create tables
of any width. You can name the rows in a similar way.
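
A minimal sketch of what such a generic table could look like with a DTD in the internal subset. The row and column labels are invented sample data, and making the name attribute required on col (but optional on row) is an assumption of this example rather than anything the DTD language demands.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE table [
  <!-- Any number of rows, each with any number of columns -->
  <!ELEMENT table (row*)>
  <!ELEMENT row   (col*)>
  <!ELEMENT col   (#PCDATA)>
  <!-- The human-readable label lives in an attribute,
       not in the element type name -->
  <!ATTLIST row name CDATA #IMPLIED>
  <!ATTLIST col name CDATA #REQUIRED>
]>
<table>
  <row name="first">
    <col name="Income">20</col>
    <col name="Expenditure">19</col>
  </row>
  <row name="second">
    <col name="Income">20</col>
    <col name="Expenditure">21</col>
    <col name="Notes">a column nobody planned for</col>
  </row>
</table>
```

The second row carries a third column that the first row lacks, and the document is still valid, because the DTD constrains only the element type names and not how many col elements a row holds or what their name attributes say.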

What you may be thinking of is a typical data application taken from a
database where a "table" has an entirely different meaning. Once
re-expressed in XML, these typically have a format like

<quill-pen-stuff>
  <accounting-record id="abc" lots="of" other="attributes">
    <income>£20</income>
    <expenditure>£19.19s.6d</expenditure>
    <result>happiness</result>
  </accounting-record>
  <accounting-record id="xyz" lots="of" other="attributes">
    <income>£20</income>
    <expenditure>£20.0s.6d</expenditure>
    <result>misery</result>
  </accounting-record>
</quill-pen-stuff>

In XML terms this is not a table. It's just an element containing other
elements containing other elements. Referring to it in XML as "a table"
or "a record" or "a field" is inaccurate, fallacious, and misleading,
and usually evidence of spending far longer in the company of databases
than is healthy for a grown adult :)

You can certainly format and view it in tabular form, no problem. But
you can't arbitrarily add undeclared elements to its content when using
a DTD or Schema, because that is precisely what they are intended to
prevent happening. They act as a control on the formation of a document,
to ensure that it happens according to plan. If there is no plan (as you
seem to imply) then all bets are off for this structure.

> Is a schema any more advantageous?

Nope. It's the conceptual part, not the application :)

Without seeing your application I can't judge what you need, but at a
wild guess you could easily define an extensible container structure
like the generic table above, and allow your customers to add as many
subelements as they like provided they give each one a name in an
attribute value (or some such constraint). Neither a DTD nor a Schema
imposes any restrictions on this method.
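
One way this could be sketched for the accounting example: the planned elements stay strictly declared, and a repeatable catch-all element carries whatever a customer adds. The element name "extra", its name attribute, and the sample value are assumptions made for illustration only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE quill-pen-stuff [
  <!ELEMENT quill-pen-stuff   (accounting-record*)>
  <!-- Planned elements first, then zero or more customer-defined extras -->
  <!ELEMENT accounting-record (income, expenditure, result, extra*)>
  <!ATTLIST accounting-record id ID #REQUIRED>
  <!ELEMENT income      (#PCDATA)>
  <!ELEMENT expenditure (#PCDATA)>
  <!ELEMENT result      (#PCDATA)>
  <!-- "extra" is a hypothetical name; the attribute holds the
       customer's own label for the added item -->
  <!ELEMENT extra       (#PCDATA)>
  <!ATTLIST extra name NMTOKEN #REQUIRED>
]>
<quill-pen-stuff>
  <accounting-record id="abc">
    <income>£20</income>
    <expenditure>£19.19s.6d</expenditure>
    <result>happiness</result>
    <extra name="currency">GBP</extra>
  </accounting-record>
</quill-pen-stuff>
```

Declaring the attribute as NMTOKEN rather than CDATA is a small extra constraint: it rules out labels containing spaces, which may or may not suit a given application.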

///Peter
 
Jeff Kish

Thanks for the reply.

Well, to talk about it in generic terms,
it is a database export/import, but the export is to be supplied by a
3rd party, and the import I think we get to write, so it is between
two disparate databases.

So I figured, hey XML is a good candidate.
So I wanted to specify elements like you mentioned, with subelements
representing table relationships.

For example, say we were talking about documentation for assets.

Then there might be an asset table, and a document table, and an
asset_document table that gave many-to-many relationships.
Then there could be a part table and a part_document table to map what
parts go with what document.

Finally, to get information from one db to another, I was thinking:
systemexport := asset+, document*, part* (i.e. one or more assets, zero or more documents, zero or more parts)
asset := id, description, document* (i.e. zero or more documents)
document := id, description, part+ (i.e. one or more parts)
part := id, description

So you could have an asset without a document,
a document without an asset,
a part related to neither,
or you could develop a bill of materials for an asset using
the information if structured as above.
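
For reference, the grammar above maps almost one-to-one onto DTD element declarations. This is only a sketch of that direct translation: the sample ids and descriptions are invented, and declaring id and description as plain text elements is an assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE systemexport [
  <!-- One or more assets, then zero or more documents,
       then zero or more parts, in that order -->
  <!ELEMENT systemexport (asset+, document*, part*)>
  <!ELEMENT asset        (id, description, document*)>
  <!ELEMENT document     (id, description, part+)>
  <!ELEMENT part         (id, description)>
  <!ELEMENT id           (#PCDATA)>
  <!ELEMENT description  (#PCDATA)>
]>
<systemexport>
  <asset>
    <id>A1</id>
    <description>Pump</description>
    <document>
      <id>D1</id>
      <description>Maintenance manual</description>
      <part>
        <id>P1</id>
        <description>Seal</description>
      </part>
    </document>
  </asset>
  <part>
    <id>P2</id>
    <description>Spare gasket, not tied to any document</description>
  </part>
</systemexport>
```

Two consequences of the nested form are worth noting. The commas fix the order of children, so every asset must precede any top-level document or part. And a document shared by several assets has to be repeated under each one; pointing at a single copy through ID/IDREF attributes would be a different design from the grammar as written.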

But if someone customizes their system, I'd like to be able to add more
'stuff' to the data.xml file, and was trying to figure out the DTD.

The data columns in the table could be anything (int, string, blob,
etc.).
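
One possible way to carry customer-added columns of differing types is a repeatable element with a name and an enumerated type attribute. Everything specific here is an assumption for the sake of the sketch: the element name "field", the three type tokens, the sample values, and the choice to carry binary data as base64 text.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE asset [
  <!ELEMENT asset       (id, description, field*)>
  <!ELEMENT id          (#PCDATA)>
  <!ELEMENT description (#PCDATA)>
  <!-- "field" is a hypothetical element for customer-added columns -->
  <!ELEMENT field       (#PCDATA)>
  <!ATTLIST field
    name NMTOKEN               #REQUIRED
    type (int | string | blob) "string">
]>
<asset>
  <id>A1</id>
  <description>Pump</description>
  <field name="install_year" type="int">1999</field>
  <!-- No type given, so the declared default "string" applies -->
  <field name="site_notes">Basement, east wall</field>
  <!-- Assumption: binary values travel as base64 text (placeholder value) -->
  <field name="photo" type="blob">aGVsbG8=</field>
</asset>
```

A DTD checks only that the type attribute holds one of the listed tokens. It treats the element content as plain text either way, so confirming that an "int" field really contains an integer would fall to the importing code.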

Thanks for your thoughts.
Jeff
 
