Timbo said:
How this information is formatted is not really relevant, as
long as the "is-a" relations (and others) are present.
When a new document type is to be defined, when should one
choose child elements and when attributes?
The criterion that makes sense semantically cannot be applied
in XML because of syntactic restrictions.
An element describes something. A description is an
assertion. An assertion may contain unary predicates or
binary relations.
Comparing this structure of assertions with the structure
of XML, it seems natural to represent unary predicates
with types and binary relations with attributes.
Say, "x" is a rose and belongs to Jack. The assertion is:
rose( x ) ^ owner( x, "Jack" )
This is written in XML as:
<rose owner="Jack" />
Thus, my answer would be: use element types for unary
predicates and attributes for binary relations.
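As a small sketch of this mapping (Python with the standard
library's xml.etree.ElementTree; the helper name
element_from_assertion is made up for illustration and is not
part of any library), the unary predicate becomes the element
type and each binary relation becomes one attribute:

```python
import xml.etree.ElementTree as ET

def element_from_assertion(predicate, relations):
    """Hypothetical helper: the unary predicate becomes the element
    type, each binary relation becomes one attribute."""
    return ET.Element(predicate, dict(relations))

# rose( x ) ^ owner( x, "Jack" )
rose = element_from_assertion("rose", {"owner": "Jack"})
xml_text = ET.tostring(rose, encoding="unicode")
print(xml_text)
```

Note that the relations are passed as a dictionary, so the
sketch already inherits the limit of one value per attribute
name.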
Unfortunately, this is not always possible in XML, because:
- there may be at most one type per element,
- there may be at most one attribute value per attribute
name, and
- attribute values are not allowed to be structured.
Therefore, the designers of XML document types are forced to
abuse element /types/ to describe the /relation/ of an
element to its parent element.
This /is/ an abuse, because the designation "element type"
is obviously supposed to give the /type of an element/,
i.e., a property which is intrinsic to the element alone
and has nothing to do with its relation to other elements.
The document type designers, however, are forced to
commit this abuse in order to reinvent, poorly, the missing
structured attribute values using the means of XML. If a rose
has two owners, the following element is not allowed in XML:
<rose owner="Jack" owner="Jill" />
One is forced to use representations such as the following:
<rose>
<owner>Jack</owner>
<owner>Jill</owner>
</rose>
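This can be checked with any conforming parser. The following
is a minimal sketch using Python's standard
xml.etree.ElementTree; the function name parses is my own
choice for illustration:

```python
import xml.etree.ElementTree as ET

def parses(text):
    """Return True if the text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

duplicate_attr = '<rose owner="Jack" owner="Jill" />'
child_elements = "<rose><owner>Jack</owner><owner>Jill</owner></rose>"

print(parses(duplicate_attr))   # repeated attribute name: not well-formed
print(parses(child_elements))   # repeated child elements are accepted

# Both owners are only reachable through the child-element form.
owners = [o.text for o in ET.fromstring(child_elements).findall("owner")]
print(owners)
```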
Here the notion "element type" suggests that Jack is marked
as being "an owner", in the sense that "owner" is supposed to
be the type (the kind) of Jack.
The intention of the author, however, is that "owner" is
supposed to give the /relation/ to the containing element
"rose". This is the natural field of application for
attributes, as the meaning of the word "attribute" outside of
XML makes clear, but it is not possible to use them for this
purpose in XML.
An alternative solution might be the following notation:
<rose owner="Jack Jill" />
Here a /new/ mini language (not XML anymore) is used within an
attribute value, which, of course, cannot be checked anymore
by XML validators. This is actually done, for example, in
XHTML, where class names are written this way.
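To illustrate, here is a sketch (again Python's standard
xml.etree.ElementTree; the element and class names are made up
for the example) showing that the XML parser delivers such a
value as one opaque string, and that splitting it into
separate values is left to the application:

```python
import xml.etree.ElementTree as ET

# An XHTML-style element with two space-separated class names.
element = ET.fromstring('<p class="note warning" />')

# The XML layer knows only a single attribute value.
raw = element.get("class")
print(repr(raw))

# The "mini language" has to be parsed by the application itself.
tokens = raw.split()
print(tokens)
```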
So in its main language, XHTML, the W3C has to abandon XML
even to write class attributes. This is not much of an
accomplishment, given that the W3C was able to draw on the
experience gained with SGML and HTML when designing XML and
that XHTML is one of the most prominent XML applications.
The needless restrictions of XML inhibit the meaningful use of
syntax. This leaves many document type designers wondering
when attributes and when elements are supposed to be used,
which is actually evidence of a deficiency in the design of
XML, which does not have many notations besides attributes
and elements. And yet the W3C failed to give even these two
notations a clear and meaningful purpose!
Without the restrictions described, XML alone would have
nearly the expressive power of RDF/XML, which has to
painfully repair some of the errors made in the design of XML.
Now, some recommend /always/ using subelements, because one
can never know whether an attribute value that seems to be
unstructured today might need to become structured tomorrow.
(Or it is recommended to use attributes only when one is quite
confident that they will never need to be structured.) This
recommendation does not even try to make sense of
attributes, but just explains how to circumvent the obstacles
the W3C has built into XML.
Others recommend using attributes for something they
call "metadata".
Others use an XML editor that happens to make the input of
attributes more comfortable than the input of elements and
therefore seriously suggest using as many attributes as
possible.
Still others have studied how to use CSS to format XML
documents and are using this to give recommendations about
when to use attributes and when to use subelements.
Of course, mixing all these criteria (structured vs.
unstructured, data vs. "metadata", by CSS, by the ease of
editing, ...) will often give conflicting recommendations.
Notations other than XML have solved the problem by either
omitting attributes altogether or by allowing structured
attributes. I believe that notations with structured
attributes, which also allow multiple element types and
multiple attribute values for the same attribute name,
are helpful.