standalone="yes"

tah

Hey,
Can someone please clarify, confirm, or set me straight on my
understanding of the standalone="yes" attribute in the XML
declaration?
I assume it means that the XML document containing it is
standalone and does not refer to any external document to define
types. In other words, it doesn't use an external DTD to validate any
types; everything used would be defined within the doc itself. Put
another way, you would never see an XML document with standalone="yes"
if it had a corresponding, separate DTD file that defined
elements, attributes, etc. Is this correct? If so, then this should
almost never be used, since you don't usually define all types
in every doc.
Also, in one particular instance, I'm seeing this error from a
parser validation:

"White space must not occur between elements declared in an
external parsed entity with element content in a standalone document"

I really don't know what this means, or why it has anything to do
with 'standalone', yes or no.

Am I just misunderstanding the whole meaning?


Thanks!!

--Ty
 
Richard Tobin

tah said:
I assume it means that the xml document containing it is
standalone, and does not refer to any external document to define
types. In other words, it doesn't use an external dtd to validate any
types - everything used would be defined within the doc itself.

No. It means that a processor that doesn't read the external subset
will get the same results as one that does. So there can't, for example,
be declarations of NMTOKENS attributes in the external subset, because
that would cause attributes to be normalized differently. (Actually,
it's OK if there are such declarations *but there aren't any of those
attributes in the document*.)
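
A minimal sketch of the normalization point, assuming Python's built-in expat bindings. The `item` element and `tags` attribute are made up for illustration, and the internal subset stands in for a declaration that would normally sit in the external subset:

```python
import xml.parsers.expat

def read_attrs(xml_text):
    """Return the attribute dict that expat reports for the root element."""
    seen = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: seen.append(attrs)
    parser.Parse(xml_text, True)
    return seen[0]

# Hypothetical document: the parser can see the NMTOKENS declaration.
WITH_DECL = """<!DOCTYPE item [
  <!ATTLIST item tags NMTOKENS #IMPLIED>
]>
<item tags="  red   green  "/>"""

# The same element as seen by a processor that never reads the declaration.
WITHOUT_DECL = """<item tags="  red   green  "/>"""

print(repr(read_attrs(WITH_DECL)["tags"]))
print(repr(read_attrs(WITHOUT_DECL)["tags"]))
```

With the declaration visible, the value should come back collapsed to `red green`; without it, the attribute is treated as CDATA and the extra spaces survive. That difference is what a standalone="yes" document is not allowed to depend on.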
"White space must not occur between elements declared in an
external parsed entity with element content in a standalone document"

If an element is declared as having element-only content, a validating
parser will inform the application that the whitespace between child
elements is "ignorable" (that's not a term the standard uses, but
that's the idea). That is, the whitespace between child elements
is just for formatting, and is not significant. If an element is
declared in the external subset as having element-only content, then
the parser won't be able to report it correctly without reading the
external subset, so the document isn't standalone.
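
Roughly what that classification looks like, as a sketch assuming Python's expat bindings. expat is not a validating parser, so the bookkeeping is done by hand here, and the `list`/`item` names are hypothetical. The internal subset again stands in for the external one:

```python
import xml.parsers.expat
from xml.parsers.expat import model

ELEMENT_ONLY_TYPES = (model.XML_CTYPE_NAME, model.XML_CTYPE_CHOICE,
                      model.XML_CTYPE_SEQ)

def whitespace_runs(xml_text):
    """Return (parent, in_element_content) for each whitespace-only text run."""
    element_only = set()   # names declared with element-only content
    stack = []             # currently open elements
    runs = []

    def on_element_decl(name, content_model):
        # content_model[0] is the model type reported by expat.
        if content_model[0] in ELEMENT_ONLY_TYPES:
            element_only.add(name)

    def on_text(data):
        if stack and not data.strip(" \t\r\n"):
            runs.append((stack[-1], stack[-1] in element_only))

    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True  # deliver each text run in one piece
    parser.ElementDeclHandler = on_element_decl
    parser.StartElementHandler = lambda name, attrs: stack.append(name)
    parser.EndElementHandler = lambda name: stack.pop()
    parser.CharacterDataHandler = on_text
    parser.Parse(xml_text, True)
    return runs

DTD = """<!DOCTYPE list [
  <!ELEMENT list (item, item)>
  <!ELEMENT item (#PCDATA)>
]>
"""
BODY = """<list>
  <item>a</item>
  <item> </item>
</list>"""

with_decl = whitespace_runs(DTD + BODY)
without_decl = whitespace_runs(BODY)
print(with_decl)
print(without_decl)
```

With the declarations, the whitespace directly inside `list` is flagged as being in element content, while the space inside `item` is not. Without them, nothing can be flagged, so the application gets a different report.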

-- Richard
 
tah

Thanks Richard! That does clarify the first question, and your
answer helped me understand some of the other documentation I had read
and didn't quite get. The second point (whitespace) is still pretty
fuzzy, though.
Are you saying that if a parser tries to validate an XML doc
with standalone="yes", and finds whitespace between elements, it then
needs to know whether the element is declared to have element-only
content in order to determine whether the whitespace is ignorable? And
if in fact it is declared in an external DTD to have element-only
content, then it's not standalone? (***this is the important question
that I'd like to be clear on)

This seems pretty chicken-and-egg-ish to me: If there's
whitespace and I'm standalone, I need to know if it's element-only, but
if it's declared as element-only outside the doc, then it's not
standalone (I now know I can ignore the whitespace, but you lied, and
are not standalone, so I'm choking).


I know I must still be missing the main point. What's the point
of standalone, if it's not what I stated in the first place: I don't
need to, and cannot, rely on ANY external subset (DTD)?
What is it useful for? For example, if I say I'm standalone, but
in an external subset I declare an element to have, say, a required
attribute, but within the doc I don't have the attribute, am I still
valid? If I'm not valid, which rule did I fail: standalone or
required-attribute? In other words, why can we have any external subset
at all here? If we have one, it must be used for something, and if it's
used for something, then I can't be standalone.

Anyway, my brain's usually too small to understand the XML
standards, so I apologize if I'm missing some simple point. Thanks for
the help!
 
Joe Kesselman

.... Note that that's a validity constraint, not a well-formedness
constraint. If you don't validate, you may be able to get away with
abusing standalone -- but why would you want to?
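
For what it's worth, a small sketch of "getting away with it", assuming Python's expat bindings (a non-validating parser). The file name `list.dtd` is hypothetical and is never fetched:

```python
import xml.parsers.expat

def declared_standalone(xml_text):
    """Parse with expat and return the standalone flag from the XML declaration."""
    result = {}

    def on_xml_decl(version, encoding, standalone):
        # pyexpat passes 1 for "yes", 0 for "no", -1 if the clause is omitted.
        result["standalone"] = standalone

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = on_xml_decl
    # Raises ExpatError only for well-formedness errors; nothing is validated.
    parser.Parse(xml_text, True)
    return result.get("standalone")

# Claims to be standalone, yet points at an external subset and has
# whitespace between child elements.
DOC = """<?xml version="1.0" standalone="yes"?>
<!DOCTYPE list SYSTEM "list.dtd">
<list>
  <item>a</item>
</list>"""

print(declared_standalone(DOC))
```

The parse goes through without complaint and the flag is simply reported to the application. Whether the claim is actually true is only checked by a validating parser.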
 
Richard Tobin

tah said:
Are you saying that if a parser tries to validate an xml doc
with standalone=yes, and finds whitespace between elements, it then
needs to know whether the element is declared to have element-only
content in order to determine whether the whitespace is ignorable?

Whether it's really ignorable depends on the application. The parser's
job is to report that it's whitespace-in-element-content so that the
application can make that decision.

tah said:
And if in fact it is declared in an external DTD to have element-only
content, then it's not standalone? (***this is the important question
that I'd like to be clear on)

If an element is declared in the external subset to have element-only
content, AND there is such an element in the document with whitespace
between the children, then it's not standalone.

tah said:
I know I must still be missing the main point. What's the point
of standalone, if it's not what I stated in the first place: I don't
need to, and cannot, rely on ANY external subset (DTD)?
What is it useful for? For example, if I say I'm standalone, but
in an external subset I declare an element to have, say, a required
attribute, but within the doc I don't have the attribute, am I still
valid? If I'm not valid, which rule did I fail: standalone or
required-attribute?

Required attribute.

tah said:
In other words, why can we have any external subset
at all here? If we have one, it must be used for something, and if it's
used for something, then I can't be standalone.

On the one hand, you want to be able to verify that your documents are
correct, so you use a DTD. On the other hand, you want to be able to
use your documents with lightweight processors that won't bother to
validate, and certainly won't fetch an external subset. So you
validate your documents when you create them, and the lightweight
processors just assume that they're correct.

The standalone declaration allows the "offline" validation to warn
you that the lightweight processor is not going to see the right
thing, because (for example) you've defaulted an attribute that
the lightweight processor won't see.
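
A sketch of the defaulted-attribute case, assuming Python's expat bindings. The `note` element and `lang` attribute are made up, and the internal subset plays the role of the external one that a lightweight processor would skip:

```python
import xml.parsers.expat

def root_attrs(xml_text):
    """Return the attributes expat reports for the root element."""
    seen = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: seen.append(dict(attrs))
    parser.Parse(xml_text, True)
    return seen[0]

# What a processor sees when it has read the attribute declaration.
FULL_VIEW = """<!DOCTYPE note [
  <!ATTLIST note lang CDATA "en">
]>
<note/>"""

# What a lightweight processor sees when it skips the declaration.
LIGHT_VIEW = "<note/>"

print(root_attrs(FULL_VIEW))
print(root_attrs(LIGHT_VIEW))
```

The first call reports `lang` with its default value; the second reports no attributes at all. If the document said standalone="yes", a validating parser would flag exactly this mismatch.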

In practice, I don't think that standalone has been very widely used.
Many lightweight applications know enough about the document format to
provide default values, normalise attributes, and treat whitespace
appropriately, without reading the DTD at all.

-- Richard
 
Joe Kesselman

Richard said:
In practice, I don't think that standalone has been very widely used.

The spec also suggests that, if you want to distribute explicitly
"standalone" documents, you can explicitly convert them into that
form... which may be the right answer; don't use it unless you need it,
and when you do need it plug in the appropriate conversion.
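
One possible shape for such a conversion, as a rough sketch only: it assumes Python's expat bindings, uses a hypothetical `to_standalone` helper, and drops comments, processing instructions, and the DOCTYPE. The idea is to write defaulted and normalized attribute values back into the document so that the DTD no longer affects what a reader sees:

```python
import xml.parsers.expat
from xml.sax.saxutils import escape, quoteattr

def to_standalone(xml_text):
    """Re-serialize a document with all DTD-supplied values made explicit."""
    out = ['<?xml version="1.0" standalone="yes"?>\n']

    def on_start(name, attrs):
        # attrs already includes defaulted and normalized values.
        rendered = "".join(f" {k}={quoteattr(v)}" for k, v in attrs.items())
        out.append(f"<{name}{rendered}>")

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.EndElementHandler = lambda name: out.append(f"</{name}>")
    parser.CharacterDataHandler = lambda data: out.append(escape(data))
    parser.Parse(xml_text, True)
    return "".join(out)

# Hypothetical input; the internal subset stands in for an external DTD.
SOURCE = """<!DOCTYPE note [
  <!ATTLIST note lang CDATA "en">
]>
<note>Hi &amp; bye</note>"""

print(to_standalone(SOURCE))
```

The output carries `lang="en"` explicitly on `note`, so a processor that never reads a DTD sees the same attribute as one that does.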

Richard said:
Many lightweight applications know enough about the document format to
provide default values, normalise attributes, and treat whitespace
appropriately, without reading the DTD at all.

Very true. Also, see past debates here about whether DTDs are becoming
obsolete as schemas take over... and standalone says nothing about
schema validation; it's strictly a DTD-validation directive/assertion.
 
tah

All right! That last part makes perfect sense, and clears it up. I hadn't
considered offline (pre-)validation vs. online validation. If
that's the reason for it, that makes sense and sounds useful. Thanks
for all the help!
 
