Richard said:
The replacement text of the lt attribute is &#60; which does
not contain a <. Note that &#60; is a character reference,
not an entity reference. You can also use &#60; directly in
attributes.
Yes, the XML spec does note that a numeric character reference is not
an entity, nor is "&lt;", which is called a "string" even though its
structure suggests an entity reference.
In addition, the original 1998 XML spec, in rule 41, specifically
notes the following:
"The replacement text of any entity referred to directly or
indirectly in an attribute value (other than "&lt;") must not
contain a <."
So, the original intent was to allow "&lt;" to represent the "<"
character in attribute values (and, by section 2.4, also to allow the
numeric character references &#60; / &#x3C;). Tim Bray
commented on the above constraint in his well-known Annotated XML
Specification:
http://www.xml.com/axml/notes/NoLTinAtt.html
"Banishing the < ... This rule might seem a bit unnecessary, on
the face of it. Since you can't have tags in attribute values,
having an < can hardly be confusing, so why ban it?
"This is another attempt to make life easy for the DPH ["Desperate
Perl Hacker"]. The rule in XML is simple: when you're reading text,
and you hit a <, then that's a markup delimiter. Not just
sometimes, always. When you want one in the data, you have to use
&lt;. Not just sometimes, always. In attribute values too.
"This rule has another unintended beneficial side-effect; it makes
the catching of certain errors much easier. Suppose you have a
chunk of XML as follows:
<a href="notes.html> <img src='notes.gif'></a>
"Notice that the notes.html is missing its closing quote. Without
the no-< rule, it would be really hard to detect this problem
and issue a reasonable error message. Since attribute values can
contain almost anything, no error would be detected until the
processor finds the next quotation mark. Instead, you get an error
message the first time you hit a <, which in the example above, as
in many cases, is almost immediately."
So, from the possibilities list I previously posted:
1) <foo bar="is x < y ?">
2) <foo bar="is x &lt; y ?">
3) <foo bar="is x &#60; y ?">
4) <foo bar="is x &lessthan; y ?">
a) where in the DTD we have <!ENTITY lessthan "<">
b) where in the DTD we have <!ENTITY lessthan "&lt;">
c) where in the DTD we have <!ENTITY lessthan "&#60;">
It would seem that all are permissible except #1 and #4a, since
those involve the literal "<" character.
Am I right on this?
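For anyone who wants to try these cases, here is a rough Python sketch that
feeds each one to expat (through the standard xml.parsers.expat module) and
reports either the parsed value of the bar attribute or None if the parser
rejects the document. The names DTD, CANDIDATES and attribute_value are my
own, the cases are spelled out as I understand them, and I self-closed the
foo element so each case is a complete document. One parser's verdict is
only evidence, not a ruling on what the spec requires.

```python
import xml.parsers.expat

# Internal DTD subset declaring the hypothetical "lessthan" entity.
DTD = '<!DOCTYPE foo [<!ENTITY lessthan "{}">]>'
ENTITY_DOC = '<foo bar="is x &lessthan; y ?"/>'

# Test documents mirroring the numbered possibilities (self-closed here).
CANDIDATES = {
    "1": '<foo bar="is x < y ?"/>',        # literal < in the attribute
    "2": '<foo bar="is x &lt; y ?"/>',     # the "&lt;" string
    "3": '<foo bar="is x &#60; y ?"/>',    # numeric character reference
    "4a": DTD.format("<") + ENTITY_DOC,
    "4b": DTD.format("&lt;") + ENTITY_DOC,
    "4c": DTD.format("&#60;") + ENTITY_DOC,
}


def attribute_value(doc):
    """Return the parsed value of bar, or None if expat rejects doc."""
    seen = {}
    parser = xml.parsers.expat.ParserCreate()

    def start(name, attrs):
        # attrs is a dict of attribute name -> normalized value
        seen.update(attrs)

    parser.StartElementHandler = start
    try:
        parser.Parse(doc, True)
    except xml.parsers.expat.ExpatError:
        return None
    return seen.get("bar")


if __name__ == "__main__":
    for label, doc in CANDIDATES.items():
        print(label, repr(attribute_value(doc)))
```

Running it prints one line per case, so the 4a/4b/4c variants can be
compared side by side with the plain attribute forms.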
Thanks.
Jon