Problems parsing, parsers disagree

  • Thread starter Christophe Vanfleteren

Christophe Vanfleteren

Hello,

I'm parsing XML that is returned by the Amazon web services (using their REST
interface).

Their dev-heavy.xsd has the following entry:

<xs:element name="Track">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="TrackName" type="xs:string" minOccurs="0"/>
      <xs:element name="ByArtist" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
<xs:element name="Tracks">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Track" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

The XML that is returned contains the following for tracks:

....
<Tracks>
<Track>Son of Sam</Track>
<Track>Somebody That I Used to Know</Track>
<Track>Junk Bond Trader</Track>
....
</Tracks>

When I unmarshal the XML using Castor (which uses the Xerces parser), I get
a SAXException:

org.xml.sax.SAXException: Illegal Text data found as child of: Track
value: "Son Of Sam"

The XML I get back also doesn't validate against the schema according to
the validator in the NetBeans IDE. The following error occurs:

cvc-complex-type.2.3: Element 'Track' cannot have character [children],
because the type's content type is element-only. [36]


But when I run xmllint from the command line:

xmllint --schema http://xml.amazon.com/schemas3/dev-heavy.xsd amazon.xml

and validate against the schema, the XML validates all right.

If I replace the <Track> section with
<Track><TrackName></TrackName></Track>, I can parse it all right with
Castor.

Now what I want to know is, which parser is correct here? I always thought
that only the replaced form should parse.
 

Chris Huebsch

Christophe Vanfleteren (Sun, 18 Apr 2004 08:14:56 GMT):
> I'm parsing XML that is returned by the Amazon web services (using their REST
> interface).
> [...]
>
> If I replace the <Track> section with
> <Track><TrackName></TrackName></Track>, I can parse it all right with
> Castor.
>
> Now what I want to know is, which parser is correct here? I always thought
> that only the replaced form should parse.

xmllint is wrong.


Chris
 

Christophe Vanfleteren

Chris said:
> Christophe Vanfleteren (Sun, 18 Apr 2004 08:14:56 GMT):
>> I'm parsing XML that is returned by the Amazon web services (using their
>> REST interface).
>> [...]
>>
>> If I replace the <Track> section with
>> <Track><TrackName></TrackName></Track>, I can parse it all right with
>> Castor.
>>
>> Now what I want to know is, which parser is correct here? I always
>> thought that only the replaced form should parse.
>
> xmllint is wrong.

OK, thanks, I'll file a bug report with Amazon.
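
For anyone landing on this thread later: the Track type is declared as a plain xs:sequence without mixed="true", so its content is element-only and bare text inside <Track> is invalid. That is why the Xerces-based validators reject the returned document. A minimal sketch that should reproduce this with the JDK's built-in javax.xml.validation API is below. It assumes a trimmed-down inline copy of the two declarations quoted in the first post rather than the full dev-heavy.xsd, and the names TrackValidation and validate are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class TrackValidation {
    // Assumption: a cut-down schema holding only the Track/Tracks
    // declarations quoted in the thread, not the real dev-heavy.xsd.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='Track'><xs:complexType><xs:sequence>"
      + "<xs:element name='TrackName' type='xs:string' minOccurs='0'/>"
      + "<xs:element name='ByArtist' type='xs:string' minOccurs='0'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "<xs:element name='Tracks'><xs:complexType><xs:sequence>"
      + "<xs:element ref='Track' maxOccurs='unbounded'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Returns null when the document is valid, otherwise the message of
    // the first validation error reported by the validator.
    static String validate(String xml) throws Exception {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema =
            factory.newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        try {
            // With no ErrorHandler set, validation errors are thrown.
            validator.validate(new StreamSource(new StringReader(xml)));
            return null;
        } catch (SAXException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        // Form returned by the service: text directly inside <Track>.
        String returned = "<Tracks><Track>Son of Sam</Track></Tracks>";
        // Form the schema describes: text wrapped in <TrackName>.
        String replaced =
            "<Tracks><Track><TrackName>Son of Sam</TrackName></Track></Tracks>";

        System.out.println("returned form: " + validate(returned));
        System.out.println("replaced form: " + validate(replaced));
    }
}
```

With the JDK's bundled validator, the first call should report a cvc-complex-type.2.3 error like the one NetBeans shows, and the second should report no error.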
 
