Christian Roth
Hello,
I am merely asking this for my own understanding:
A processing instruction's data part is not entity-aware, i.e. character
and numeric entities are not resolved at parsing time. E.g.,
<?mypi <par/> ?>
delivers as data part the String(!) "<par/>".
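To check this on a concrete parser, here is a minimal sketch assuming Python's standard xml.dom.minidom (which sits on top of expat). The test document, the target name "mypi", and the helper pi_data are made up for illustration; other parsers should behave the same way, but I have only reasoned about this one.

```python
from xml.dom import minidom

# Hypothetical test document: the PI data and the <t> element both contain
# the numeric character reference &#233; (e with acute accent).
SOURCE = '<doc><?mypi &#233; &lt;par/&gt;?><t>&#233;</t></doc>'

def pi_data(document):
    """Return the data part of the first PI directly under the root element."""
    for node in document.documentElement.childNodes:
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE:
            return node.data
    return None

doc = minidom.parseString(SOURCE)

# Inside the PI the references are passed through as literal text.
data = pi_data(doc)

# Inside element content the same reference is resolved to the character.
element_text = doc.getElementsByTagName('t')[0].firstChild.data

print(repr(data))
print(repr(element_text))
```

The PI data comes back with the ampersand sequences untouched, while the element content comes back as the single resolved character.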
This effectively means that the possible character content of a PI is
limited by the document's encoding, since numeric entities cannot be
used to express characters outside of this encoding.
Consequently, this means that writing a PI and using any character
outside the ASCII range is bound for trouble when submitting such a
document (originally, say, in UTF-8) to an unknown XML workflow, since
intermediary stages may decide to serialize the document to e.g. ASCII
and therefore will lose any characters outside that range within PIs.
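The scenario I have in mind can be sketched as follows, again assuming Python's xml.dom.minidom. The "intermediary stage" here is hypothetical: it re-serializes the document and encodes it as ASCII, replacing unencodable characters with numeric character references, which is what many serializers do for element content. A real stage might instead fail or substitute a placeholder character.

```python
from xml.dom import minidom

def first_pi_data(xml_bytes):
    """Parse xml_bytes and return the data of the first PI under the root."""
    doc = minidom.parseString(xml_bytes)
    for node in doc.documentElement.childNodes:
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE:
            return node.data
    return None

# Original document, encoded as UTF-8, with a non-ASCII character in the PI.
original = '<doc><?mypi caf\u00e9?></doc>'.encode('utf-8')
before = first_pi_data(original)

# Hypothetical intermediary stage: serialize the tree to text, then encode
# as ASCII, writing characters outside ASCII as numeric references.
text = minidom.parseString(original).documentElement.toxml()
ascii_bytes = text.encode('ascii', errors='xmlcharrefreplace')

# The next stage parses the ASCII version; the reference inside the PI is
# not resolved, so the PI data is now a different string.
after = first_pi_data(ascii_bytes)

print(repr(before))
print(repr(after))
```

In this sketch the document stays well-formed, but the PI data seen downstream is no longer the string that was originally written.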
Is my understanding correct?
Regards, Christian.