Attributes vs. element content text

Eric Smith

I'm writing a DTD for a simulator to save the state of the simulated
machine, such as register and memory contents. In this particular
application, it is not expected that the generated XML will normally be
seen or edited by a person, either as text or in a structured XML
editor. It is only intended as an intermediate storage format, so that
the same state can be loaded back into the simulator at a later time.

My first thought was to use an element for a memory location, with an
attribute for the address, and text contents of the element for the
data:

<loc addr="12a7">4e</loc>

Then it occurred to me to try using attributes only:

<loc addr="12a7" data="4e"/>

When I started actually writing the DTD, I saw a web page advising that
attributes not be used for significant data. The given reason was that
attributes are harder to parse. I'm using a SAX-based parser, and find
that attributes are actually quite easy to deal with. So is there any
good reason to avoid the two styles above? It seems like the
alternative would be very cumbersome:

<loc><addr>12a7</addr><data>4e</data></loc>

And that would in fact take more work to process with SAX. By using
attributes only (my second example above), I can write the DTD to
simply require both the addr and data attributes, and not have to maintain
as much state in my own parser code.
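The difference in parser state can be sketched roughly as follows. This is an illustration only: it uses Python's standard xml.sax module rather than the poster's own parser, and the <state> root element, the DTD in the internal subset, and the handler names are assumptions made for the example. Python's expat-based SAX parser does not validate, so the #REQUIRED declarations are shown for reference and are not enforced here.

```python
import xml.sax

# Attribute-only style, with a DTD sketch that requires both attributes.
# The <state> root element is hypothetical; expat parses but does not
# enforce these declarations.
ATTR_DOC = b"""<!DOCTYPE state [
  <!ELEMENT state (loc*)>
  <!ELEMENT loc EMPTY>
  <!ATTLIST loc addr CDATA #REQUIRED
                data CDATA #REQUIRED>
]>
<state><loc addr="12a7" data="4e"/><loc addr="12a8" data="0f"/></state>"""

# Element-only style carrying the same information.
ELEM_DOC = (b"<state><loc><addr>12a7</addr><data>4e</data></loc>"
            b"<loc><addr>12a8</addr><data>0f</data></loc></state>")


class AttrHandler(xml.sax.ContentHandler):
    """Attribute-only: a single startElement call carries everything."""

    def __init__(self):
        super().__init__()
        self.memory = {}

    def startElement(self, name, attrs):
        if name == "loc":
            # Values are hexadecimal strings, as in the examples above.
            self.memory[int(attrs["addr"], 16)] = int(attrs["data"], 16)


class ElemHandler(xml.sax.ContentHandler):
    """Element-only: text has to be buffered across several callbacks."""

    def __init__(self):
        super().__init__()
        self.memory = {}
        self.text = []    # character data collected for the current element
        self.fields = {}  # addr/data seen so far for the current <loc>

    def startElement(self, name, attrs):
        if name == "loc":
            self.fields = {}
        self.text = []

    def characters(self, content):
        # SAX may deliver one run of text in several chunks.
        self.text.append(content)

    def endElement(self, name):
        if name in ("addr", "data"):
            self.fields[name] = int("".join(self.text), 16)
        elif name == "loc":
            self.memory[self.fields["addr"]] = self.fields["data"]


def load(doc, handler):
    """Parse doc with the given handler and return its memory map."""
    xml.sax.parseString(doc, handler)
    return handler.memory
```

Both handlers build the same address-to-data map, but the element-only one needs a text buffer and a per-location scratch dictionary to get there.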

I think what I'm mostly asking is whether the attribute-only model is
really considered to be poor form, and if so, how poor, and for what
reasons? Would using it cause problems that would haunt me later?

Thanks for any advice or insights.
Eric
 
Stefan Ram

Eric Smith said:
> When I started actually writing the DTD, I saw a web page
> advising that attributes not be used for significant data.
> The given reason was that attributes are harder to parse.

This reason sounds ridiculous to me.

> So is there any good reason to avoid the two styles above?

<loc addr="12a7" data="4e"/>

looks reasonable to me.

XML has quite a static concept of validity;
otherwise, semantically, one might even prefer

<loc 12a7="4e" />

(which is neither valid nor well-formed XML, of course;
just a fantasy of an XML-like notation).
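
The well-formedness point can be checked with any XML parser: an XML name may not begin with a digit, so an attribute named 12a7 is rejected before validity even comes into play. A small sketch, assuming Python's standard xml.etree.ElementTree (the helper name is_well_formed is made up for this example):

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The attribute-only form parses; the "fantasy" form does not, because
# the attribute name 12a7 starts with a digit.
ok = is_well_formed('<loc addr="12a7" data="4e"/>')
bad = is_well_formed('<loc 12a7="4e" />')
```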

> I think what I'm mostly asking is whether the attribute-only
> model is really considered to be poor form, and if so, how
> poor, and for what reasons? Would using it cause problems that
> would haunt me later?

No, to me it makes sense. It describes a location having an
address and data as its properties.
 
Eric Smith

Stefan said:
> <loc addr="12a7" data="4e"/>
> looks reasonable to me. [...] to me it makes sense. It describes a
> location having an address and data as its properties.

Thanks! I've proceeded with that plan, and it's working out pretty
well. I've now got a program using libxml2 that converts my old
simulator state save files to and from XML format. I don't yet have the
main simulator using the XML files, because I need to rework a few APIs
a bit, but I expect I'll have that done soon.
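
For readers who want to try the attribute-only layout, a round trip might look something like the sketch below. It is not the libxml2 converter described above: it uses Python's standard xml.etree.ElementTree, the <state> root element and the function names dump_state and load_state are invented for the example, and the in-memory dictionary stands in for whatever the old save-file format actually holds.

```python
import xml.etree.ElementTree as ET


def dump_state(memory):
    """Serialize {address: byte} as attribute-only <loc> elements."""
    root = ET.Element("state")  # hypothetical root element name
    for addr in sorted(memory):
        ET.SubElement(root, "loc",
                      addr=format(addr, "04x"),
                      data=format(memory[addr], "02x"))
    return ET.tostring(root, encoding="unicode")


def load_state(text):
    """Parse the XML text back into {address: byte}."""
    root = ET.fromstring(text)
    return {int(loc.get("addr"), 16): int(loc.get("data"), 16)
            for loc in root.iter("loc")}


# Save a tiny memory image and load it back again.
memory = {0x12a7: 0x4e, 0x0000: 0xff}
xml_text = dump_state(memory)
restored = load_state(xml_text)
```

Because each location is a single empty element, the loader needs no text handling at all; it only reads two attributes per <loc>.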

The simulator is Nonpareil, and it is a microcode-level simulation
of old HP calculators, such as the HP-35, HP-25, HP-33C, and HP-41CV.
http://nonpareil.brouhaha.com/
 
