xml newbie question.

JustSomeGuy

I've run into a snag in my understanding of XML.

Up to now I thought that XML looked like this:

<someObject>
Its Value
</someObject>
<AnotherObject>
<AOSubObject>
Its Value
</AOSubObject>
</AnotherObject>

To find the value of AOSubObject you would specify a 'path' (for lack of a
better word in my knowledge of XML)
like AnotherObject/AOSubObject

However, I've just seen some XML that confuses me about how to search it to
retrieve an object's value.

<someObject>
Its Value
</someObject>
<AnotherObject>
<AOSubObject>
Its Value
</AOSubObject>
<AOSubObject>
Another Value
</AOSubObject>
<AOSubObject>
Yet Another Value
</AOSubObject>
</AnotherObject>

So how do you get AOSubObject's value? I mean, which one do you need? All of
them?
I suppose this is compliant XML syntax, as it comes from iTunes.
 
Jeff Kish

(:
Here are some simple XQuery expressions that return what I think you were asking about.
I got some pretty good tutorials off the net.
I believe this was at least partly from hompages.inf.ed.ac.uk_wadler.pdf
:)

(:
for $b in doc("ITunes01.xml")//AnotherObject
return name($b/AOSubObject[2])

returns
AOSubObject
:)
(:
for $b in doc("ITunes01.xml")//AnotherObject
return $b/AOSubObject[2]

returns
<AOSubObject>
Another Value
</AOSubObject>
:)
(:
for $b in doc("ITunes01.xml")//AnotherObject
return $b/AOSubObject[1]

returns
<AOSubObject>
Its Value
</AOSubObject>
:)
 
Andy Dingley

However, I've just seen some XML that confuses me about how to search it to
retrieve an object's value.

Firstly - that's a well-formed fragment, but it's not a well-formed XML
document (it has multiple root elements; a document must have exactly one)
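
The multiple-root point can be checked with any XML parser. Here is a minimal sketch using Python's standard xml.etree.ElementTree, assuming the fragment from the question is held in a string; the wrapper element name "library" is made up for illustration and is not from the iTunes file.

```python
import xml.etree.ElementTree as ET

# The fragment from the question: two top-level elements, no single root.
FRAGMENT = """
<someObject>Its Value</someObject>
<AnotherObject>
  <AOSubObject>Its Value</AOSubObject>
  <AOSubObject>Another Value</AOSubObject>
  <AOSubObject>Yet Another Value</AOSubObject>
</AnotherObject>
"""


def parses_as_document(text):
    """Return True if the parser accepts text as a complete XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# As posted, the parser rejects the second top-level element.
fragment_ok = parses_as_document(FRAGMENT)

# Wrapping everything in one root element (hypothetical name) is enough.
wrapped_ok = parses_as_document("<library>" + FRAGMENT + "</library>")

print(fragment_ok, wrapped_ok)
```

The repeated AOSubObject elements are not what the parser objects to; only the missing single root is.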

Secondly, why do you think you _should_ be able to retrieve data from
XML ? :cool:

XML isn't really a data storage format. It's a syntax-based
document format (like SGML) which had a data model described for it
later on. Read the W3C Recommendation "XML Information Set" (the
Infoset) if you want to know more. So
XML can be used to store and retrieve data, but it's often vague and
messy to do so. You're trying to deal with something that has the
mindset of a word processor, not a SQL database.

In an awful lot of "pure XML" cases (especially in XSLT) it's not easy
to identify specific nodes because it's _inappropriate_ to treat them
as single nodes. The tools (especially XSLT) work with node-sets, not
just single nodes. Although a node-set often contains only a single
node, or may be an empty set, you should always be aware of the
possibility that it's multiple, and that it's _correct_ for this to be
multiple. Don't write code that breaks with multiple nodes, unless
it's bound to a strong filter to avoid this happening. It's even
better to just make it work sensibly if you happen to pass it multiple
nodes.
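
As a rough illustration of "make it work sensibly with multiple nodes", here is a sketch in Python with xml.etree.ElementTree. The helper name sub_object_values and the "library" root element are assumptions for the example. The helper always returns a list, so zero, one, or many matches all go through the same code path.

```python
import xml.etree.ElementTree as ET

DOC = """<library>
  <AnotherObject>
    <AOSubObject>Its Value</AOSubObject>
    <AOSubObject>Another Value</AOSubObject>
    <AOSubObject>Yet Another Value</AOSubObject>
  </AnotherObject>
</library>"""


def sub_object_values(root, path):
    """Return the text of every node the path matches (possibly none, possibly many)."""
    return [(node.text or "").strip() for node in root.findall(path)]


root = ET.fromstring(DOC)

# Three matches: the caller gets all of them and decides what to do.
many = sub_object_values(root, ".//AnotherObject/AOSubObject")

# No matches: an empty list, not an error.
none = sub_object_values(root, ".//AnotherObject/NoSuchChild")

print(many)
print(none)
```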


Your question might be re-phrased as "How do I distinguish between
elements of equivalent name, at comparable positions in the document
tree".

One way is to simply count nodes in the matching set.
Some trivial XPath "//AnotherObject/AOSubObject" will give you the
set of possible elements. Using a predicate can give you just one of
them "//AnotherObject/AOSubObject [2]"
In your example this is the best you can do.
The trouble with this approach is that it's position-based. It might
break if there's a sorting operation, or if new elements get added.
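
To see why the position-based approach is fragile, here is a small Python sketch (xml.etree.ElementTree again, with a made-up "library" root and a made-up inserted value). The same path returns a different element once something new is added in front.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<library><AnotherObject>"
    "<AOSubObject>Its Value</AOSubObject>"
    "<AOSubObject>Another Value</AOSubObject>"
    "<AOSubObject>Yet Another Value</AOSubObject>"
    "</AnotherObject></library>"
)

# Positional predicate: the second AOSubObject child (positions count from 1).
PATH = ".//AnotherObject/AOSubObject[2]"

before = root.find(PATH).text

# Insert a new element at the front; every later element shifts by one.
parent = root.find(".//AnotherObject")
new = ET.Element("AOSubObject")
new.text = "Brand New Value"  # hypothetical value, for illustration only
parent.insert(0, new)

after = root.find(PATH).text

print(before, "->", after)
```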

The "XML Way" is to have an attribute on each element, and for this
attribute to have a type of ID. This is how XHTML works. NB - The
attribute needn't be _named_ "id", it's the _type_ that's important
rather than the name. Such an ID-type attribute is unique in the
document, so easily retrieved. It's hard to set them up though - you
have to allocate them with some XML-knowledge and watch out for name
well-formedness, duplication etc. It's risky to just load a non-XML
value straight from a database.

Probably the best way is some sort of attribute-based scheme. Think
database - there's usually some way of generating a unique key ID from
the structures you already have. Then add an attribute to contain it
and use XPath something like this to refer to it

"//AnotherObject/AOSubObject [@objectCode = 'abc-123']"

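A sketch of the attribute-based lookup in Python (xml.etree.ElementTree). The objectCode values other than 'abc-123', the "library" root, and the helper name by_code are invented for the example. The lookup keeps working after the elements are reordered, which the positional version would not.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<library><AnotherObject>"
    "<AOSubObject objectCode='abc-122'>Its Value</AOSubObject>"
    "<AOSubObject objectCode='abc-123'>Another Value</AOSubObject>"
    "<AOSubObject objectCode='abc-124'>Yet Another Value</AOSubObject>"
    "</AnotherObject></library>"
)


def by_code(root, code):
    """Return the text of all AOSubObject elements carrying this objectCode.

    Assumes code contains no quote characters, since it is pasted into the path.
    """
    path = f".//AnotherObject/AOSubObject[@objectCode='{code}']"
    return [node.text for node in root.findall(path)]


found = by_code(root, "abc-123")

# Reverse the children; the key-based lookup is unaffected by order.
parent = root.find(".//AnotherObject")
parent[:] = list(parent)[::-1]
found_after_reorder = by_code(root, "abc-123")

print(found, found_after_reorder)
```

It still returns a list, so a duplicated key shows up as two results rather than silently picking one.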

I suppose this is compliant XML syntax, as it comes from iTunes.

It's valid if a validator likes it. Commercially produced XML is often
far from well-formed, even from big names.
 
