Oliver said:
Is this even possible? Wouldn't the escaping mechanism depend on what
the punctuations of the file format are?
I don't see why not. There are several broad categories of encoding[*]
techniques.
([*] Don't take the word "encoding" to imply that the format is not
normally readable.)
One category simply requires that the encoded text is self-delimiting
and that /any/ text inside it is interpreted only according to the rules
of the encoding, so the syntax of the surrounding context is irrelevant.
Examples are a length prefix, or a strong quoting convention like the
'xyz' strings in the Unix Bourne shell and its derivatives.
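A rough Python sketch of both examples, just to make the idea concrete.
The function names and the "<length>:<payload>" framing are my own
assumptions for illustration, not taken from any standard. The point is
that neither encoder needs to know anything about the format the result
is embedded in.

```python
def encode_length_prefixed(payload: bytes) -> bytes:
    # Hypothetical framing: decimal byte count, a colon, then the raw bytes.
    return str(len(payload)).encode("ascii") + b":" + payload


def decode_length_prefixed(stream: bytes) -> tuple:
    """Return (payload, remainder_of_stream)."""
    header, sep, tail = stream.partition(b":")
    if not sep:
        raise ValueError("missing ':' after the length")
    length = int(header.decode("ascii"))
    if len(tail) < length:
        raise ValueError("stream is shorter than the declared length")
    # The payload is never inspected, so it may contain any bytes at all.
    return tail[:length], tail[length:]


def sh_single_quote(text: str) -> str:
    # Bourne-shell strong quoting: nothing is special inside '...' except
    # the quote itself, which is written as: close quote, \', reopen quote.
    return "'" + text.replace("'", "'\\''") + "'"
```

For instance, encode_length_prefixed(b"a:b") gives b"3:a:b", and the
decoder recovers b"a:b" even though the payload contains the separator.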
Another possibility is similar, but the encoding is parameterised. For
instance, a C-like escape mechanism could be parameterised on:
  the Start character (defaults to ")
  the End character (defaults to Start)
  the Escape character (defaults to \)
  the range of characters that need to be escaped (defaults to End and
  Escape itself).
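The four parameters above could be sketched in Python roughly as
follows. make_quoter and its argument names are hypothetical, and I am
assuming single-character Start, End and Escape, with Escape different
from End. The escape rule here is simply "Escape followed by the literal
character"; mnemonic escapes such as \n are left out.

```python
def make_quoter(start='"', end=None, escape="\\", must_escape=None):
    """Build (encode, decode) functions for one choice of parameters."""
    end = start if end is None else end
    # End and Escape must always be escaped, or decoding would be ambiguous.
    escaped = {end, escape} | set(must_escape or ())

    def encode(text):
        body = "".join(escape + ch if ch in escaped else ch for ch in text)
        return start + body + end

    def decode(quoted):
        """Return (text, remainder) for input beginning with a quoted string."""
        if not quoted.startswith(start):
            raise ValueError("input does not begin with the Start character")
        out = []
        i = len(start)
        while i < len(quoted):
            ch = quoted[i]
            if ch == escape:
                i += 1
                if i >= len(quoted):
                    raise ValueError("dangling Escape character")
                out.append(quoted[i])  # take the next character literally
            elif ch == end:
                return "".join(out), quoted[i + 1:]
            else:
                out.append(ch)
            i += 1
        raise ValueError("missing End character")

    return encode, decode
```

With the defaults this behaves like a C string without the mnemonic
escapes; make_quoter(start="<", end=">", escape="%") gives a different
surface syntax from the same code.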
Another set of possibilities is like URL-encoding or the numeric
character references in XML/HTML (I mean things like &#2345; but not
&amp;). In this case the mechanism is necessarily parameterised on the
surrounding format, since that determines what /has/ to be escaped.
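A small Python sketch of this third category. escape_for and unescape
are hypothetical helpers of my own; the set of reserved characters is
the parameter that the surrounding format has to supply. The last two
lines use the standard library's urllib.parse.quote, whose "safe"
argument plays the same role for URL-encoding.

```python
import re
from urllib.parse import quote


def escape_for(text, reserved):
    """Replace each reserved character with a decimal reference &#N;."""
    # '&' introduces a reference, so it must always be escaped as well.
    reserved = set(reserved) | {"&"}
    return "".join(f"&#{ord(ch)};" if ch in reserved else ch for ch in text)


_REFERENCE = re.compile(r"&#([0-9]+);")


def unescape(text):
    # One non-overlapping pass, so a decoded '&' is never rescanned.
    return _REFERENCE.sub(lambda m: chr(int(m.group(1))), text)


# The same text escaped for two different surrounding formats.
in_markup = escape_for('a<b & "c"', '<>"')   # reserved set of a markup format
in_csv = escape_for("a,b", ",\n")            # reserved set of a CSV-like format

# URL-encoding: 'safe' lists what the surrounding URL syntax lets through.
path_part = quote("a/b c", safe="/")
query_part = quote("a/b c", safe="")
```

The encoder is the same in every case; only the list of characters that
/have/ to be escaped changes with the context.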
And so on. My point is that it /could/ have been done (a "best practice"
RFC, perhaps). Sad that it was not...