XML Spec CharData doubt

Manish Tomar

Hi All,

I have a doubt about the "CharData" grammar given in the XML 1.0 spec
(http://www.w3.org/TR/REC-xml/#NT-CharData).

[14] CharData ::= [^<&]* - ([^<&]* ']]>' [^<&]*)

The text above [14] (in the spec) describing CharData says that char data
must not have the following chars: ^,<,& and it must not have ']]>' (the
CDATA section end marker) inside it. The grammar is given above. I was
wondering if the grammar can be written as

CharData ::= [^<&]* - ']]>'

Wouldn't this mean the same thing as the text describing [14]?

Thanks in advance,
Manish

PS: I am sorry if the message doesn't fit in this group.
 
Bjoern Hoehrmann

* Manish Tomar wrote in comp.text.xml:
> I have a doubt about the "CharData" grammar given in the XML 1.0 spec
> (http://www.w3.org/TR/REC-xml/#NT-CharData).
>
> [14] CharData ::= [^<&]* - ([^<&]* ']]>' [^<&]*)
>
> The text above [14] (in the spec) describing CharData says that char data
> must not have the following chars: ^,<,& and it must not have ']]>' (the
> CDATA section end marker) inside it. The grammar is given above. I was
> wondering if the grammar can be written as
>
> CharData ::= [^<&]* - ']]>'
>
> Wouldn't this mean the same thing as the text describing [14]?

No, a string like "...]]>..." is not allowed by the original production
but is allowed by your rewrite. A production matches the whole string;
it does not check whether the string merely contains something that
matches, unless you write it as the original does.
 
Richard Tobin

Manish Tomar said:
> [14] CharData ::= [^<&]* - ([^<&]* ']]>' [^<&]*)
>
> The text above [14] (in the spec) describing CharData says that char data
> must not have the following chars: ^,<,& and it must not have ']]>' (the
> CDATA section end marker) inside it. The grammar is given above. I was
> wondering if the grammar can be written as
>
> CharData ::= [^<&]* - ']]>'
>
> Wouldn't this mean the same thing as the text describing [14]?

A production "A - B" means all the strings that match A, but which don't
match B. If we used your production, we would accept strings like

xyz]]>xyz

because it matches A and doesn't match B: in your version, B is only the
exact three-character string ]]>. The production could have been
more concisely written as

[^<&]* - ( .* ']]>' .* )

but presumably it was thought clearer to restrict the second part to
be a subset of the first. (In fact, I don't think the XML spec uses
"." at all.)

-- Richard
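
To make the whole-string point above concrete, here is a minimal Python sketch of the "A - B" semantics. The regex translations of the EBNF pieces and the helper names (matches_difference, chardata_spec, chardata_short, chardata_dot) are my own assumptions for illustration, not anything taken from the spec. Each side is tested against the entire string with re.fullmatch, which is how the replies above describe productions being matched.

```python
import re

# Assumed regex translations of the EBNF pieces (illustrative, not from the spec).
A = re.compile(r"[^<&]*")                       # [^<&]*
B_SPEC = re.compile(r"[^<&]*\]\]>[^<&]*")       # [^<&]* ']]>' [^<&]*
B_SHORT = re.compile(r"\]\]>")                  # ']]>'  (the proposed rewrite)
B_DOT = re.compile(r".*\]\]>.*", re.DOTALL)     # .* ']]>' .*  (the concise variant)


def matches_difference(a, b, text):
    """True if the whole of text matches a and the whole of text does not match b."""
    return a.fullmatch(text) is not None and b.fullmatch(text) is None


def chardata_spec(text):
    return matches_difference(A, B_SPEC, text)


def chardata_short(text):
    return matches_difference(A, B_SHORT, text)


def chardata_dot(text):
    return matches_difference(A, B_DOT, text)


if __name__ == "__main__":
    # Columns: sample, spec production, proposed rewrite, concise variant.
    for sample in ["xyz", "xyz]]>xyz", "]]>", "a<b"]:
        print(repr(sample), chardata_spec(sample),
              chardata_short(sample), chardata_dot(sample))
```

On "xyz]]>xyz" the spec form and the ".*" form both reject the string, while the proposed rewrite accepts it, because its B side only excludes the exact string ]]>.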
 
Manish Tomar

Thanks a lot guys! :)
I also checked out the Notation section in the spec and your
explanations were clear.
 