Christophe Vanfleteren
Hello,
I'm parsing XML that is returned by the Amazon web services (using their REST
interface).
Their dev-heavy.xsd contains the following declarations:
<xs:element name="Track">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="TrackName" type="xs:string" minOccurs="0"/>
      <xs:element name="ByArtist" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
<xs:element name="Tracks">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Track" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
The XML that is returned contains the following for tracks:
....
<Tracks>
  <Track>Son of Sam</Track>
  <Track>Somebody That I Used to Know</Track>
  <Track>Junk Bond Trader</Track>
  ....
</Tracks>
When I unmarshal the XML using Castor (which uses the Xerces parser), I get
a SAXException:
org.xml.sax.SAXException: Illegal Text data found as child of: Track
value: "Son Of Sam"
The returned XML also fails to validate against the schema according to
the validator in the NetBeans IDE. The following error occurs:
cvc-complex-type.2.3: Element 'Track' cannot have character [children],
because the type's content type is element-only. [36]
But when I run xmllint from the command line:
xmllint --schema http://xml.amazon.com/schemas3/dev-heavy.xsd amazon.xml
to validate against the same schema, the XML validates without errors.
If I replace each <Track> element with
<Track><TrackName></TrackName></Track>, Castor parses the document without
problems.
Now what I want to know is, which parser is correct here? I always thought
that only the replaced form should parse.
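
For anyone who wants to reproduce the check without Castor or NetBeans, here is a minimal sketch using the JDK's built-in javax.xml.validation API. It assumes a cut-down schema that holds only the two declarations quoted above (not the full dev-heavy.xsd), and the class and method names (TrackValidation, isValid) are made up for illustration. With the JDK's validator I would expect the text-only form to be rejected and the TrackName form to be accepted; it says nothing about why xmllint behaves differently.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class TrackValidation {
    // Assumption: reduced copy of the Track/Tracks declarations from the post,
    // wrapped in a schema element with no target namespace.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='Track'><xs:complexType><xs:sequence>"
        + "<xs:element name='TrackName' type='xs:string' minOccurs='0'/>"
        + "<xs:element name='ByArtist' type='xs:string' minOccurs='0'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "<xs:element name='Tracks'><xs:complexType><xs:sequence>"
        + "<xs:element ref='Track' maxOccurs='unbounded'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    // Returns true if the document validates, false if the validator reports an error.
    static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory
                    .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            try {
                // With no ErrorHandler set, validation errors are thrown as SAXException.
                validator.validate(new StreamSource(new StringReader(xml)));
                return true;
            } catch (SAXException e) {
                System.out.println("rejected: " + e.getMessage());
                return false;
            }
        } catch (SAXException | IOException e) {
            // Schema could not be compiled or input could not be read.
            throw new IllegalStateException("could not run validation", e);
        }
    }

    public static void main(String[] args) {
        // Form returned by the service: text directly inside <Track>.
        String returned = "<Tracks><Track>Son of Sam</Track></Tracks>";
        // Replaced form: text wrapped in <TrackName>.
        String replaced =
            "<Tracks><Track><TrackName>Son of Sam</TrackName></Track></Tracks>";
        System.out.println("returned form valid: " + isValid(returned));
        System.out.println("replaced form valid: " + isValid(replaced));
    }
}
```

Running the same two documents through xmllint with the same cut-down schema would show whether the disagreement comes from the validators themselves or from something else in the full schema.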