Java script Dude
[Issue]
When a web page is served in a legacy encoding (ISO-8859-1, the default for
many engines) and the page contains extended ASCII characters (128 - 255),
and the user has configured their browser not to Auto-Select the encoding,
the browser may corrupt these special characters.
The developer may work around this by configuring the application
server to serve the pages as UTF-8, but this may not be an option if the
pages come from a third-party vendor. In that case, it may still be
possible to detect whether the user's browser configuration corrupts
these high-ASCII characters. I just wrote the following code to do
exactly that.
Code to test:
<script>
var sEnc = "%20%C3%A8%20";
var sUnEnc = " è ";
document.write("View Encoding Issues exist: " +
    (decodeURIComponent(sEnc) != sUnEnc));
</script>
When the above page is served as UTF-8, the è character is preserved
and the comparison comes out correct no matter what the browser's view
encoding is set to. However, if the document is served with the default
ISO-8859-1 and the browser's view encoding is set to, say, Chinese
Simplified (GB2312), the è character gets corrupted and the JavaScript
catches it.
I am considering putting such validation on the login page for our
system to alert users that they should set their browser's view
encoding to Auto-Select.
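That login-page check could be packaged as a small helper along these lines. This is only a sketch of the idea; the function name and warning text are my own assumptions, not part of the system:

```javascript
// Sketch of the login-page encoding check (name and message are
// assumptions). Returns a warning string when the page's bytes appear
// to have been decoded with the wrong encoding, or null when they match.
function encodingWarning() {
  // %C3%A8 is the UTF-8 percent-encoding of "è"; decodeURIComponent
  // always decodes it as UTF-8, independent of the page's encoding.
  var decoded = decodeURIComponent("%20%C3%A8%20");
  // " è " must appear as a raw literal (not a \u00E8 escape) so that a
  // wrong browser view encoding actually corrupts it in the page source.
  var literal = " è ";
  return decoded === literal
    ? null // encodings agree; nothing to report
    : "Please set your browser's View > Encoding to Auto-Select.";
}
```

On a correctly decoded page the function returns null; on a corrupted one it returns the message, which the login page could then display.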
[question] Absent the possibility of encoding the document in
UTF-8, do you see any issues with this methodology?
Thanks,
JsD