Transparent XML Tags?

Jim Whitehead

If I have a well-formed XML document, can I somehow tell a validating parser
to ignore selected tags? I want these tags to be ignored for validation
purposes, but I still want to validate the contents of these tags based on
the rules applicable to their parent tags.

For example, if my schema looks something like this:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Parent">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="Child"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="Child"/>
  <xs:element name="TransparentTag"/>
  <xs:element name="SomeOtherTag"/>
</xs:schema>

I would like this to be valid:

<Parent>
<TransparentTag><Child>Jim</Child></TransparentTag>
</Parent>

But not this:

<Parent>
<SomeOtherTag><Child>Jim</Child></SomeOtherTag>
</Parent>

or this:

<Parent>
<TransparentTag><Sibling>Lucy</Sibling></TransparentTag>
</Parent>

We have enough of these types of tags, like TransparentTag, that spelling
out all the possible combinations in my schema would be impractical, and
would make our schema extremely large. We could possibly use a really loose
schema. For example, we could allow a "Parent" tag to have either a "Child"
tag or a "TransparentTag", or both types of tag, and then we could allow a
"TransparentTag" to hold any other tag. But we would prefer a stricter
schema, if possible.

Using <xs:any> doesn't seem to help, because it allows any tag from, say, a
given namespace, but then it doesn't validate the contents of that tag based
on the rules applying to the parent tag.

Any suggestions or ideas would be much appreciated. Also, I apologize if my
previous posting on this topic was not clear.

Thank you.

Jim Whitehead
 
Georg Bauhaus

: I would like this to be valid:
:
: <Parent>
: <TransparentTag><Child>Jim</Child></TransparentTag>
: </Parent>
:
: But not this:
:
: <Parent>
: <SomeOtherTag><Child>Jim</Child></SomeOtherTag>
: </Parent>

Hi Jim,

As to these cases, perhaps you could first make all of
TransparentTag, SomeOtherTag, etc. valid and then apply a
transformation as a filter that would remove the TransparentTag
wrappers and invalid nested child elements?
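
A minimal sketch of the wrapper-removing filter, written in Python with only the standard library rather than as a stylesheet. The names TRANSPARENT and unwrap_transparent are hypothetical, and the sketch assumes element-only content: any text sitting directly inside a wrapper is dropped, and a transparent root element is left alone. It works on a copy, so the original document keeps its wrappers.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical list of wrapper names to treat as transparent.
TRANSPARENT = {"TransparentTag"}


def unwrap_transparent(root, transparent=TRANSPARENT):
    """Return a copy of root in which transparent wrappers are replaced
    by their own children. The input tree is not modified."""
    result = copy.deepcopy(root)
    _unwrap_children(result, transparent)
    return result


def _unwrap_children(parent, transparent):
    i = 0
    while i < len(parent):
        child = parent[i]
        if child.tag in transparent:
            # Splice the wrapper's children into the wrapper's position.
            kids = list(child)
            parent.remove(child)
            for offset, kid in enumerate(kids):
                parent.insert(i + offset, kid)
            # Do not advance: re-check index i so nested wrappers unwrap too.
        else:
            _unwrap_children(child, transparent)
            i += 1


doc = ET.fromstring(
    "<Parent><TransparentTag><Child>Jim</Child></TransparentTag></Parent>"
)
stripped = unwrap_transparent(doc)
print([c.tag for c in stripped])  # ['Child']
print([c.tag for c in doc])       # ['TransparentTag']
```

The stripped copy could then be checked against a strict schema in which Parent allows only Child. SomeOtherTag is not in the transparent set, so it would survive the filter and fail that check, as would a Sibling exposed by unwrapping.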

Or, to make specifying all cases somewhat more practical, maybe you can
more comfortably write a suitable SGML DTD and have that transformed
into a Schema? Like so:

<!ENTITY % Good "TransparentTag" -- may have child elements -->

<!ENTITY % Bad "(SomeOtherTag | YetAnotherTag)"
-- These elements must not have content -- >

<!ELEMENT Parent - - (%Good; | %Bad;)* >

<!ELEMENT %Good; - - (Child)>
<!ELEMENT %Bad; - - EMPTY>

<!ELEMENT Child - - (#PCDATA)>


Georg Bauhaus
 
Jim Whitehead

Georg Bauhaus said:
: As to these cases, perhaps you could first make all of
: TransparentTag, SomeOtherTag, etc. valid and then apply a
: transformation as a filter that would remove the TransparentTag
: wrappers and invalid nested child elements?

Thank you very much for the suggestion. The trouble is that I don't want to
remove the TransparentTags, because we have an application that needs to use
them. I guess we could remove the TransparentTags for validation purposes,
but then use the original, non-transformed document for our internal
purposes. It would seem we might then want to have two Schemas, one
pre-transformation and one post-transformation...

: Or, to make specifying all cases somewhat more practical, maybe you can
: more comfortably write a suitable SGML DTD and have that transformed
: into a Schema? Like so:
:
: <!ENTITY % Good "TransparentTag" -- may have child elements -->
:
: <!ENTITY % Bad "(SomeOtherTag | YetAnotherTag)"
: -- These elements must not have content -- >
:
: <!ELEMENT Parent - - (%Good; | %Bad;)* >
:
: <!ELEMENT %Good; - - (Child)>
: <!ELEMENT %Bad; - - EMPTY>
:
: <!ELEMENT Child - - (#PCDATA)>

I will have to spend some time with this SGML DTD idea before I can say if
it will work for us. Are you suggesting that we automatically generate the
XML schema from an SGML DTD because the SGML DTD allows more flexibility in
specifying validation rules (and so we can simplify the creation of the XML
Schema)? Thanks again!
 
Georg Bauhaus

: I guess we could remove the TransparentTags for validation purposes,
: but then use the original, non-transformed document for our internal
: purposes. It would seem we might then want to have two Schemas, one
: pre-Transformation and one post-Transformation...

: I will have to spend some time with this SGML DTD idea before I can say if
: it will work for us. Are you suggesting that we automatically generate the
: XML schema from an SGML DTD because the SGML DTD allows more flexibility in
: specifying validation rules (and so we can simplify the creation of the XML
: Schema)? Thanks again!

Yes, that is what I had in mind. However, I can't for the life of
me remember what software I have seen that can do the transformation
automatically (to the extent that a grammar transformation is
possible). So that might turn out to be less helpful than I had thought.

But maybe another idea works. Given a Schema, you transform that Schema
into another one that differs only in which tags may be present.
Here is a hypothetical Schema and a transformation that will
remove the wrapping tags SomeOtherTag and YetAnotherTag from Parent's
content model. (I don't speak Schema well, so there might be errors...)


<xsd:schema version='1.0'
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    targetNamespace="file:hide.xsd"
    xmlns:e="file:hide.xsd">

  <xsd:element name="Parent" type="e:ParentCM"/>

  <xsd:complexType name="ParentCM">
    <xsd:choice>
      <xsd:element ref="e:TransparentTag"/>
      <xsd:element ref="e:SomeOtherTag"/>
      <xsd:element ref="e:YetAnotherTag"/>
    </xsd:choice>
  </xsd:complexType>


  <xsd:complexType name="ChildCM">
    <xsd:sequence>
      <xsd:element ref="e:Child"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:element name="Child" type="xsd:string"/>

  <xsd:element name="TransparentTag" type="e:ChildCM"/>
  <xsd:element name="SomeOtherTag" type="e:ChildCM"/>
  <xsd:element name="YetAnotherTag" type="e:ChildCM"/>

</xsd:schema>

Now the transformation:


<stylesheet
    version='1.0'
    xmlns="http://www.w3.org/1999/XSL/Transform"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:e="file:hide.xsd">

  <template match="/">
    <apply-templates/>
  </template>

  <!--
    Replace every reference to a wrapper other than TransparentTag
    with just a reference to a Child
  -->
  <template match="xsd:element[@ref='e:SomeOtherTag' or
                               @ref='e:YetAnotherTag']">
    <xsd:element ref="e:Child"/>
  </template>

  <!-- everything else is unchanged -->
  <template match="node() | @*">
    <copy>
      <apply-templates select="node() | @*"/>
    </copy>
  </template>

</stylesheet>

The output Schema then contains


<xsd:complexType name="ParentCM">
<xsd:choice>
<xsd:element ref="e:TransparentTag"/>
<xsd:element ref="e:Child"/>
<xsd:element ref="e:Child"/>
</xsd:choice>
</xsd:complexType>


Georg
 
Jim Whitehead

Thank you very much, Georg. You have provided me with much good food for
thought.

Jim
 
