Use an existing XML Serializing tool, or write my own?

M Jared Finder

I'm confused. XML looks to be extremely simple to read and write (so
simple that I feel confident I could program serialization and
deserialization from a DOM document in about an hour), yet I see
many serializing tools available (Xerces and .NET's System.Xml appear
to be the most popular). What functionality do these tools provide
that makes them useful over rolling my own parser?

-- MJF
 
Stefan Ram

I'm confused. XML looks to be extremely simple to read and write (so
simple that I feel confident I could program serialization and
deserialization from a DOM document in about an hour),

Could you please take that time and post the result?
 
Patrick TJ McPhee

% I'm confused. XML looks to be extremely simple to read and write (so
% simple that I feel confident I could program serialization and
% deserialization from a DOM document in about an hour),

If you want to have a non-conforming parser, you might be able to do it
in something like that time. If you're interested in being able to process
valid XML, just reading the spec takes an hour.
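
To make that concrete, here is a minimal sketch of a few spec details that a quick hand-rolled parser tends to miss. It assumes Python's bundled xml.etree.ElementTree as the off-the-shelf parser; the sample document and the helper name is_well_formed are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document. Each line exercises a rule from the XML
# spec that a quick, regex-style parser is likely to get wrong.
SAMPLE = """<?xml version="1.0"?>
<!-- a comment containing <fake> markup -->
<config note='a &gt; b, said "x"'>
  <expr><![CDATA[if (a < b && b > c) run();]]></expr>
  <name>caf&#233; &amp; bar</name>
  <empty/>
</config>"""

root = ET.fromstring(SAMPLE)

note = root.get("note")          # single-quoted attribute with an escaped '>'
expr = root.find("expr").text    # CDATA section: '<' and '&' are plain text
name = root.find("name").text    # numeric character reference plus &amp;
empty = root.find("empty").text  # empty-element tag carries no text (None)


def is_well_formed(text):
    """Return True if the conforming parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Mismatched tags have to be rejected, not silently repaired.
bad = is_well_formed("<a><b></a></b>")
```

A parser that skips any one of these (comments, CDATA, character references, either quote style, error reporting) will read some valid files wrongly or accept broken ones.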

% to be the most popular). What functionality do these tools provide
% that makes them useful over rolling my own parser?

Apart from providing (in some cases) correct and widely tested parsers,
the widely available tools often provide support for other useful related
specs, like xpath, xslt, xpointer, and xlink.
 
Rolf Magnus

M said:
I'm confused. XML looks to be extremely simple to read and write (so
simple that I feel confident I could program serialization and
deserialization from a DOM document in about an hour),

I am sure that you really don't know the full scope of XML. It might be
easy to write a parser that can parse a very limited subset of XML and
use it for a specific application, but I don't see many reasons to do
that. The only reason I can actually think of is if you are e.g. using
an embedded system with 2k each of RAM and ROM that needs to parse a
limited set of XML files.

M said:
yet I see many serializing tools available (Xerces and .NET's
System.Xml appear to be the most popular). What functionality do
these tools provide that makes them useful over rolling my own parser?

Existing parsers use standard interfaces (DOM, SAX), deal with stuff
like different character encodings, DTDs, xslt, Schemas and whatnot,
have lots of features, are easy to use, general purpose, well optimized
and heavily tested. None of those could be done in one hour or anything
near one hour.
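
As a rough illustration of the two standard interfaces, the sketch below reads the same small document once through DOM and once through SAX. It assumes Python's standard xml.dom.minidom and xml.sax modules stand in for whichever parser you pick; the document and the TitleHandler class are invented for the example.

```python
import xml.dom.minidom
import xml.sax

# Hypothetical sample document.
DOC = b"<library><book id='1'>Dune</book><book id='2'>Emma</book></library>"

# DOM: the whole document is loaded into a tree that can be walked freely.
dom = xml.dom.minidom.parseString(DOC)
dom_titles = [b.firstChild.data for b in dom.getElementsByTagName("book")]


# SAX: the parser pushes events to a handler and keeps no tree in memory.
class TitleHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_book = False
        self._buf = []

    def startElement(self, name, attrs):
        if name == "book":
            self._in_book = True
            self._buf = []

    def characters(self, content):
        if self._in_book:
            self._buf.append(content)  # text may arrive in several chunks

    def endElement(self, name):
        if name == "book":
            self.titles.append("".join(self._buf))
            self._in_book = False


handler = TitleHandler()
xml.sax.parseString(DOC, handler)
sax_titles = handler.titles  # same titles as dom_titles
```

Because both interfaces are standardized, code written against them carries over to other conforming parsers with little change.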
 
M Jared Finder

Rolf Magnus said:
Existing parsers use standard interfaces (DOM, SAX), deal with stuff
like different character encodings, DTDs, xslt, Schemas and whatnot,
have lots of features, are easy to use, general purpose, well optimized
and heavily tested. None of those could be done in one hour or anything
near one hour.

I'm relatively certain that I do not need to worry about DTDs or XSLT
or any character encodings other than ASCII. I just want a well
structured, human readable file format. In this case, would
you say that XML is not the right tool for the task at hand? I'm
starting to think that way.

-- MJF
 
Rolf Magnus

M said:
I'm relatively certain that I do not need to worry about DTDs or XSLT
or any character encodings other than ASCII.

But it certainly doesn't hurt if that's supported by the xml parser. And
if you later find out that you want to reduce redundant information in
your xml files by using entities (defined in the DTD) or to support
foreign (non-ascii) languages, you don't need to adapt your parser to
it because it already handles that for you.
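
A minimal sketch of both cases, assuming Python's bundled xml.etree.ElementTree as the ready-made parser (the entity name, element names and sample text are made up for illustration):

```python
import xml.etree.ElementTree as ET

# An entity declared once in the internal DTD subset and reused in the body.
WITH_ENTITY = """<!DOCTYPE order [
  <!ENTITY vendor "Example Widgets Ltd.">
]>
<order>
  <from>&vendor;</from>
  <billed-by>&vendor;</billed-by>
</order>"""

order = ET.fromstring(WITH_ENTITY)
sender = order.find("from").text       # entity expanded by the parser
biller = order.find("billed-by").text  # same replacement text again

# The same non-ASCII text stored in two encodings. The XML declaration
# tells the parser how to decode the bytes; the calling code does not change.
text = "Gr\u00fc\u00dfe"
latin1 = ('<?xml version="1.0" encoding="ISO-8859-1"?><msg>%s</msg>'
          % text).encode("iso-8859-1")
utf8 = ('<?xml version="1.0" encoding="UTF-8"?><msg>%s</msg>'
        % text).encode("utf-8")

from_latin1 = ET.fromstring(latin1).text
from_utf8 = ET.fromstring(utf8).text   # identical to from_latin1
```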

M said:
I just want a well structured, human readable file format. In
this case, would you say that XML is not the right tool for the task
at hand?

It probably is the right tool. I wonder why you are obsessed with
thinking that you can only use XML for your job if you wrote your own
parser. Why would you put work into writing your own XML parser if
there are already lots of them available? Actually, it's one of the
reasons that make XML the right tool for the job: You don't need to
bother writing your own parser for it, since excellent parsers are
already available.

M said:
I'm starting to think that way.

I can't see why.
 
Stefan Ram

I'm relatively certain that I do not need to worry about DTDs or XSLT
or any character encodings other than ASCII. I just want a well
structured, human readable file format. In this case, would
you say that XML is not the right tool for the task at hand? I'm
starting to think that way.

I am using a structured, human readable file format called
Unotal that is simpler than XML, while it has expressive
power (due to the possibility of structured attribute values)
and specific semantics (similar to RDF). (Elaborations on
request.) Some notes about it:

http://www.purl.org/stefan_ram/pub/unotal_en
 
Patrick TJ McPhee

% structured, human readable file format. In this case, would
% you say that XML is not the right tool for the task at hand? I'm
% starting to think that way.

You don't say what the task at hand is, so it's hard to answer the
question.

What XML does for you is define the syntax of the data representation.
This can be tremendously helpful if you ever need to describe the
syntax to somebody else, since you can just say `it's an XML file
with this structure...' and go on to describe the element hierarchy.
If you never need to document your file structure, and if you feel
that you can write a parser for any file format you're likely to
need faster than you can learn the ins and outs of one of the existing
XML parsers, then it probably doesn't give you much.

Beware that using an XML-like syntax but not supporting all the details
doesn't give you anything. You don't want your documentation to say
`it's an XML file, except ...'.
 
Chris Lovett

The real answer is that System.Xml saves you about 55 minutes :)

You run xsd.exe, passing in the schema for your XML, and it generates classes
for you. Let's say the top-level class that comes out of this is called
"MyData", matching the top-level element in your schema.

Then to deserialize the XML into memory you simply do the following:

// Needs: using System.Xml; using System.Xml.Serialization;
XmlSerializer s = new XmlSerializer(typeof(MyData));
MyData data = (MyData)s.Deserialize(new XmlTextReader(filename));

That's it !!

If you don't have a schema, then go to
http://www.gotdotnet.com/team/xmltools/ and download the XSD Inference
engine which will cook up a schema based on one or more input XML files in a
flash. If you have a DTD and not an XSD, no problem, there's a DTD to XSD
converter in the gotdotnet user samples.

The downside is that it's probably not going to be as fast as if you wrote
all the serialization code by hand. The upside is it probably won't contain
as many bugs as the code you crammed out in an hour while wishing you didn't
have to do it :) Lastly, the XmlSerializer does not support every wild and
crazy feature of the overbloated XSD standard. But it does handle the 80%
case.

Chris.

PS: don't waste your time on XML competitors. XML is here to stay.
 
