Laszlo Nagy
I believe most data passed in URLs are character data. RFC 3986 also
suggests that the standard should be percent-encoded UTF-8:
The generic URI syntax mandates that new URI schemes that provide for
the representation of character data in a URI must, in effect,
represent characters from the unreserved set without translation, and
should convert all other characters to bytes according to UTF-8
<http://en.wikipedia.org/wiki/UTF-8>, and then percent-encode those
values. This requirement was introduced in January 2005 with the
publication of RFC 3986 <http://tools.ietf.org/html/rfc3986>. URI
schemes introduced before this date are not affected. [1]
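To make the quoted rule concrete, here is a minimal sketch using only Python's standard urllib.parse module (not the framework under discussion). The sample string "café" is just an illustrative value; quote() encodes to UTF-8 by default before percent-encoding, and unquote() reverses it.

```python
from urllib.parse import quote, unquote

# Illustrative value only: ASCII letters plus one non-ASCII character.
text = "café"

# Unreserved characters pass through untouched; "é" becomes its two
# UTF-8 bytes (0xC3 0xA9), each percent-encoded.
encoded = quote(text)

# unquote() percent-decodes and interprets the bytes as UTF-8 by default.
decoded = unquote(encoded)

print(encoded)  # caf%C3%A9
print(decoded)  # café
```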
It is somewhat confusing that a URI may be used to represent binary data.
More specifically, http and https URLs contain textual data in almost
all cases. When the data is textual, it must be in UTF-8 (as dictated by the
RFC). So what is the reason for arguments.get returning binary data?
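For illustration, the difference between the two cases can be sketched with the standard library alone. The helper decode_argument below is hypothetical (it is not part of any framework API); it assumes the caller wants text whenever the percent-decoded bytes happen to be valid UTF-8, and falls back to raw bytes otherwise.

```python
from urllib.parse import unquote_to_bytes


def decode_argument(percent_encoded):
    """Hypothetical helper: str if the bytes are valid UTF-8, else raw bytes."""
    # Percent-decoding always yields bytes first; no encoding is assumed yet.
    raw = unquote_to_bytes(percent_encoded)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8, so treat the value as opaque binary data.
        return raw


text_value = decode_argument("caf%C3%A9")  # valid UTF-8, comes back as str
binary_value = decode_argument("%FF%FE")   # 0xFF is never valid in UTF-8
```

With this sketch, text_value is a str while binary_value stays a bytes object, which is the ambiguity the question is about.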
[1] http://en.wikipedia.org/wiki/Percent-encoding#Percent-encoding_in_a_URI