Hi Jmh,
Welcome to the ASP.NET newsgroup.
As for how encoding is handled for ASPX pages in VS.NET and the ASP.NET
runtime, the rules are as follows:
The strings we hardcode in code files (.cs or .vb) are compiled into the
assembly at compile time, so we don't need to care about them. The strings
in aspx (ascx) files are dynamically compiled into an assembly at runtime,
and these are what we need to take care of.
1. When developing an aspx page in VS.NET, the VS.NET IDE saves the page
using the default ANSI code page (set by the System Locale) by default. We
can also use the save options to change this to UTF-8 or Unicode encoding
manually.
In web.config's <globalization> element there is a fileEncoding attribute
that specifies the encoding of the aspx files and other dynamic resources
(ascx...). The ASP.NET runtime uses this encoding to parse the aspx pages.
By default, <globalization> does not explicitly set fileEncoding, which
means the runtime uses the machine's default ANSI code page (System Locale)
to load aspx files. This is fine when we develop the pages and run them on
the same machine. But if we develop pages on one box and deploy them to
another server that may have a different System Locale setting, it's
recommended that we explicitly save the aspx files in a specific charset
(encoding) and specify the same value for fileEncoding in web.config.
2. After the ASP.NET runtime has successfully parsed the aspx file and
loaded the strings into memory, all strings (characters) in .NET are
represented as UTF-16 in memory, no matter what charset they were encoded
in within the source file. When ASP.NET is about to render the page content
to the client, it encodes the in-memory strings using the charset
(encoding) specified in the "responseEncoding" attribute of the
<globalization> setting.
The "requestEncoding" attribute specifies the charset (encoding) used to
decode the bytes coming from the client (such as the querystring,
cookies...).
Both of them can be overridden manually in code through
Request.ContentEncoding / Response.ContentEncoding.
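To make the two rules above concrete, here is a minimal web.config sketch
that sets all three attributes explicitly. Using utf-8 for every attribute
is only an assumption for illustration; the important part is that
fileEncoding matches the encoding the aspx files were actually saved in.

```xml
<configuration>
  <system.web>
    <!-- fileEncoding: how the runtime reads .aspx/.ascx files from disk -->
    <!-- requestEncoding: how incoming bytes (querystring, cookies) are decoded -->
    <!-- responseEncoding: how in-memory strings are encoded for the client -->
    <globalization fileEncoding="utf-8"
                   requestEncoding="utf-8"
                   responseEncoding="utf-8" />
  </system.web>
</configuration>
```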
In addition, as for the
=================
Is there a MS tool (or 3rd Partly) that can quickly tell me if a file is
UTF-8 encoded and batch convert a set of file to UTF-8? I
================
question you mentioned, I'm not aware of any existing ones. However, we can
check whether a file is UTF-8 encoded by reading the first bytes of the
file. Many UTF-8 encoded files start with a three-byte BOM (EF BB BF),
similar to the two-byte BOM for Unicode (UTF-16) text files, although a
file without a BOM can still be UTF-8. See below:
#Byte Order Mark FAQ (from www.unicode.org)
http://www.websina.com/bugzero/kb/unicode-bom.html
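As a rough illustration of the BOM check, here is a small sketch. It is
written in Python rather than .NET purely to keep it short, and the
function names detect_bom and is_valid_utf8 are hypothetical. It only
looks for the UTF-8 and UTF-16 BOMs, and it adds a strict decode as a
fallback test for files that have no BOM.

```python
import codecs


def detect_bom(data):
    """Return the encoding implied by a leading BOM, or None if there is none.

    Only the UTF-8 and UTF-16 BOMs are checked; UTF-32 is not distinguished.
    """
    if data.startswith(codecs.BOM_UTF8):      # EF BB BF
        return "utf-8"
    if data.startswith(codecs.BOM_UTF16_LE):  # FF FE
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):  # FE FF
        return "utf-16-be"
    return None


def is_valid_utf8(data):
    """Return True if the bytes decode cleanly as UTF-8 (with or without BOM)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def detect_bom_in_file(path):
    """Read only the first few bytes of a file and check them for a BOM."""
    with open(path, "rb") as handle:
        return detect_bom(handle.read(4))
```

Note that a clean decode does not prove a file was meant to be UTF-8,
since plain ASCII text is valid in most code pages as well.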
Also, we can open a text file in Notepad and click the Save As menu item.
If Notepad has successfully loaded the file as UTF-8, the encoding in the
Save As dialog will automatically be set to UTF-8.
As for batch converting files between code pages/charsets, we can use
.NET's System.Text and System.IO APIs to convert them ourselves, as long as
we know the source and destination charsets.
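The conversion itself is just "decode with the source charset, encode with
the destination charset". The sketch below shows that idea in Python
instead of the .NET System.Text / System.IO APIs mentioned above, so treat
it as an illustration only. The names convert_file and convert_tree are
hypothetical, and the default destination "utf-8-sig" (UTF-8 with a BOM) is
an assumption.

```python
from pathlib import Path


def convert_file(path, src_encoding, dst_encoding="utf-8-sig"):
    """Re-encode one file in place from src_encoding to dst_encoding.

    Bytes are read and written directly so line endings stay untouched.
    Decoding is strict, so a wrong src_encoding raises UnicodeDecodeError
    instead of silently corrupting the text.
    """
    file_path = Path(path)
    text = file_path.read_bytes().decode(src_encoding)
    file_path.write_bytes(text.encode(dst_encoding))


def convert_tree(root, pattern, src_encoding, dst_encoding="utf-8-sig"):
    """Convert every file under root matching pattern; return the paths."""
    converted = []
    for file_path in sorted(Path(root).rglob(pattern)):
        if file_path.is_file():
            convert_file(file_path, src_encoding, dst_encoding)
            converted.append(file_path)
    return converted
```

For example, convert_tree("site", "*.aspx", "cp1252") would rewrite every
aspx file under a hypothetical "site" folder from Windows-1252 to UTF-8
with a BOM. Because the files are rewritten in place, it should be run only
once per file and on a backed-up copy.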
These are just some of my understandings. If you have any other questions
or ideas, please feel free to post here.
Hope this helps.
Steven Cheng
Microsoft Online Support
Get Secure!
www.microsoft.com/security
(This posting is provided "AS IS", with no warranties, and confers no
rights.)