Ivan Shmakov
I've found a short discussion of XMPP as an XML application at
[1], which contains some points I cannot agree with. But then, I'm
not really that confident in my knowledge of XMPP particulars,
so I'd appreciate it if someone could comment on my arguments
below.
[1] http://search.cpan.org/~elmex/AnyEvent-XMPP-0.52/lib/AnyEvent/XMPP/Writer.pm
The whole "XML" concept of XMPP is fundamentally broken anyway. It's
supposed to be a subset of XML. But a subset of XML productions is
not XML.
It's true, but such a subset could satisfy the definition of an
XML application (AIUI), which XMPP is intended to be.
Strictly speaking you need a special XMPP "XML" parser and writer to
be 100% conformant.
OTOH, the requirement of a custom XMPP parser certainly doesn't
fit the notion of an XML application.
On top of that XMPP requires you to parse these partial "XML"
documents. But a partial XML document is not well-formed, heck, it's
not even an XML document! And a parser should bail out with an error.
But XMPP doesn't care, it just relies on implementation-dependent
behaviour of chunked parsing modes for SAX parsing. This
functionality isn't even specified by the XML recommendation in any
way. The recommendation even says that it's undefined what happens
if you process not-well-formed XML documents.
And as long as it's undefined (and not denied outright), the
particular interpretation of XML "fragments" used by XMPP seems
more like a natural extension than a failure to comply with the
standard.
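
To make the chunked-parsing point a bit more concrete, here is a
rough sketch in Python using its xml.parsers.expat binding. It is my
own illustration, not something taken from the quoted text; the
helper names parse_chunks and is_well_formed are made up, and the
stanza is a toy example. It feeds an unterminated stream to expat
piece by piece without ever declaring the input final, and then
checks what happens once the parser is told the same text is the
complete document.

```python
import xml.parsers.expat

# An XMPP-like stream header that is never closed, plus one toy stanza.
STREAM_OPEN = ("<stream:stream xmlns='jabber:client' "
               "xmlns:stream='http://etherx.jabber.org/streams'>")
STANZA = "<message><body>hi</body></message>"


def parse_chunks(chunks):
    """Feed chunks to expat without ever marking the input as final."""
    events = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: events.append(("start", name))
    parser.EndElementHandler = lambda name: events.append(("end", name))
    for chunk in chunks:
        parser.Parse(chunk, False)  # False: more data may follow
    return events


def is_well_formed(text):
    """Parse text as a complete document; report whether expat accepts it."""
    try:
        xml.parsers.expat.ParserCreate().Parse(text, True)  # True: input is final
    except xml.parsers.expat.ExpatError:
        return False
    return True


if __name__ == "__main__":
    print(parse_chunks([STREAM_OPEN, STANZA]))
    print(is_well_formed(STREAM_OPEN + STANZA))
    print(is_well_formed(STREAM_OPEN + STANZA + "</stream:stream>"))
```

So the events for the stanza are delivered while the root element is
still open, and the error only shows up when the parser is asked to
treat the unterminated text as a whole document.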
But I try to be as XMPP "XML" conformant as possible (it should be
around 99-100%). But it's hard to say what XML is conformant, as the
specifications of XMPP "XML" and XML contradict each other. For example
XMPP also says you only have to generate and accept UTF-8 encodings
of XML, but the XML recommendation says that each parser has to
accept UTF-8 and UTF-16.
Once again, this is a specialization, and it's my understanding
that an XML application may choose to explicitly define an
acceptable subset of XML.
Though, of course, this allows for XMPP parsers that aren't XML
parsers at the same time.
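
As a small illustration of that last remark (again my own sketch, not
anything from the quoted text): below, xml_parser_accepts stands for a
general XML parser, here Python's xml.etree.ElementTree, while
utf8_only_accepts is a hypothetical XMPP-style acceptor that insists
on UTF-8 before parsing. Both function names and the sample document
are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Toy document with non-ASCII text, so the encoding actually matters.
DOC = "<message><body>gr\u00fc\u00df</body></message>"


def xml_parser_accepts(data: bytes) -> bool:
    """General XML parser: relies on the parser's own encoding detection."""
    try:
        ET.fromstring(data)
    except ET.ParseError:
        return False
    return True


def utf8_only_accepts(data: bytes) -> bool:
    """Hypothetical UTF-8-only acceptor: reject anything that is not UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return xml_parser_accepts(data)


if __name__ == "__main__":
    for codec in ("utf-8", "utf-16"):
        raw = DOC.encode(codec)  # "utf-16" prepends a byte order mark
        print(codec, xml_parser_accepts(raw), utf8_only_accepts(raw))
```

The restricted acceptor handles a strict subset of what the general
parser handles, which is what I mean by a specialization: everything
it accepts is still XML, but it is not itself a full XML parser.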
So, what do you do? Do you use an XML-conformant parser or do you
write your own?
I'm using XML::Parser::Expat because expat knows how to parse broken
(aka 'partial') "XML" documents, as XMPP requires. Another argument
is that if you capture an XMPP conversation to the end, and even if a
'</stream:stream>' tag was captured, you won't have a valid XML
document. The problem is that you have to resend a <stream> tag
after each of TLS and SASL authentication! Awww... I'm repeating
myself.
This one indeed may be a problem, but probably not as much in
practice as in theory.
But well... AnyEvent::XMPP does its best with expat to cope with
the fundamental brokenness of "XML" in XMPP.
Back to the issue with "XML" generation: I've discovered that many
XMPP servers (e.g. jabberd14 and ejabberd) have problems with XML
namespaces. That's the reason why I'm assigning the namespace
prefixes manually: The servers just don't accept validly namespaced
XML. The draft 3921bis even states that a client SHOULD generate
a 'stream' prefix for the <stream> tag.
Indeed, and such a problem seems to be quite common.
Note that the XHTML 1.1 + MathML 2.0 + SVG 1.1 profile [2]
(as implemented by, e.g., the W3C validator [3]) explicitly
requires that the embedded MathML and SVG documents use the m:
and svg: namespace prefixes, respectively.
My understanding is that it simplifies the task of DTD-based
validation, but DTDs don't seem to be as major a part of XML as
they were of SGML, and I doubt whether it's really necessary to
continue to enforce such restrictions.
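
For what it's worth, the prefix issue is easy to demonstrate. The
sketch below is my own and says nothing about how jabberd14 or
ejabberd actually parse their input; prefix_sensitive_match is a
made-up stand-in for any receiver that looks at the literal prefix,
and expanded_name simply asks Python's namespace-aware ElementTree
parser for the element name. Only the namespace URI is the real XMPP
streams namespace.

```python
import xml.etree.ElementTree as ET

NS = "http://etherx.jabber.org/streams"

# Three serializations of the same element; only the prefix differs.
VARIANTS = [
    f"<stream:stream xmlns:stream='{NS}'/>",
    f"<s:stream xmlns:s='{NS}'/>",
    f"<stream xmlns='{NS}'/>",
]


def expanded_name(text):
    """Namespace-aware view: ElementTree reports '{uri}local'."""
    return ET.fromstring(text).tag


def prefix_sensitive_match(text):
    """Hypothetical receiver that only recognizes the literal 'stream:' prefix."""
    return text.startswith("<stream:stream")


if __name__ == "__main__":
    for variant in VARIANTS:
        print(expanded_name(variant), prefix_sensitive_match(variant))
```

To a namespace-aware parser all three variants are the same element,
so fixing the prefix only matters to consumers that compare prefixes
literally, as DTD-based validation has to.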
[2] http://w3.org/TR/XHTMLplusMathMLplusSVG/
[3] http://validator.w3.org/