Simplified DOM idiom for building XML - revisited

S

Steve Jorgensen

A while ago, I posted a 1/2 formed idea for what was probably an
overcomplicated scheme for improving how we create content using the DOM API.
Since then, I've been refactoring some real-world code that builds content
with DOM, and have come up with a much simpler, more practical idea that I've
had success in implementing.

The problem in a nutshell is that, to build a tree or fragment in the DOM,
it's just about impossible to arrange the code to be structured much like the
XML being generated. Furthermore, the size of the code and the degree of
coupling with the DOM is outrageuos. You have to invoke a method of the
document object to create each new node, you have to invoke an appendChild
method of each node to add content to it, and you have to pass a namespace
argument to each createNode call if you want it to have a namespace URI, even
if they're all the same. Finally, the syntax for creating and appending
attribute nodes is totally nuts, and nothing like adding child elements.

Of course, you can solve all these problems with an appropriate wrapper class
or set of wrapper classes, but we're each inventing our own wrappers to do
this, each solving the problems described above with varying degrees of
success.

The solution I came up with, I think is really nice, and involves using the
native capabilities of most languages to build nested arrays in-line along
with some very trivial parsing to allow the use of compact node definition
strings (syntax roughly similar to XPath). The inability of nested arrays to
represent children of sepcific nodes is solved simply by treating any nested
array as the set of children for the preceeding node in the array.

Here's a hypothetical example (similar to some real, working code) in VB...

DOMWrap.DOM.appendChild DOMWrap.createFragment( _
Array("Invoice", Array(
"@date=", Date, _
"Customer", Array( _
"@name=", CustName),
"Body",
InvoiceLines()))), _
"uri:fooinc.com:invoice"

In the array, each element may be a node definition string, a node object, or
a fragment. A node definition may end with a dangling = sign, in which case,
the next array item supplies the text content for the node. Notice that the
code above has a structure much like the XML it is generating, and is vastly
more compact and easy to read than if it were written as discrete createNode
and appendChild calls. Also notice that, if all the nodes being created from
string definitions are in the same namespace, I only have to pass a single
namespaceURI argument.

So - my suggestion is to create this simple wrapper in many languages (at
least Java, C#, VB, Perl, and Python), put it on SourceForge, and try to get
people to use it as a de facto standard, so no one has to reinvent this wheel
again.

Would anyone besides me be interested in having me do that?

- Steve J.
 
M

Mukul Gandhi

I personally feel, DOM is a nice API. I think, it was natural outcome
of the hierarchial nature of XML. By using arrays, you are introducing
a linear storage to map to a hierarchical object(i.e. XML) which is
hard to grasp..

Regards,
Mukul
 
S

Steve Jorgensen

I personally feel, DOM is a nice API. I think, it was natural outcome
of the hierarchial nature of XML. By using arrays, you are introducing
a linear storage to map to a hierarchical object(i.e. XML) which is
hard to grasp..

Regards,
Mukul

I'll start my reply by acknowledging that it's silly to ask for feedback, then
argue with it, but I'm human, and therefore silly.

I agree that the DOM has its advantages, or I'd never have bothered with it
enough to get tired of its verboseness. Yes, it's great that DOM allows such
fine control over nodes and abstracts away much of how they actually appear in
XML text. That does not mean, however, that I always want to use something
like 8 verbose procedure calls to build a minor 3-node fragment. Furthermore,
to decipher that code later takes a lot of effort, but I value writing code
that's obvious just looking at it, so the unwrapped DOM raises some real
issues for me. It's the unwrapped DOM that seems to me to be hard to grasp
for some parts of a task.

The wrapper I propose leaves the whole DOM API in place to use as much as you
like, and adds the ability to easily toss together simple fragments as needed.
Good uses for this include building the high-level skeleton of a document and
building skeletons of smallish fragments that will have details added to them,
and then be added as children somewhere in the main document skeleton. In
short, I have some control over the granularity of the code rather than being
stuck at the finest granularity for every single node.
 
M

Mukul Gandhi

your said:
That does not mean, however, that I always want to use something
like 8 verbose procedure calls to build a minor 3-node fragment.

sorry, but I don't mind writing tons of code if I am writing in a
standard language(that too published by W3C)..

certainly if people read your approach, they will find merit in it.. I
did'nt say I did'nt like this proposal :)

Regards,
Mukul
 
S

Soren Kuula

Steve said:
Would anyone besides me be interested in having me do that?

Well, have you ever looked at the ML language? Defining types and
constructing object structures of them is a breeze.

Soren
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,989
Messages
2,570,207
Members
46,782
Latest member
ThomasGex

Latest Threads

Top