Character lost in POST submit

Pavils Jurjans · Feb 5, 2005

Hello,

I am experiencing weird behaviour in my ASP.NET project. The project
consists of a client side, which can be any environment (web page, EXE
application, etc.). The client sends an HTTP POST request with data to the
server, where an ASP.NET application handles the request and returns an
answer.

I have boiled all the code down to a very simple test case consisting of
three files: an HTML page that makes a runtime HTTP POST request
(available in Internet Explorer from JS5, and in all versions of Mozilla)
and calls two supposed-to-be-identical scripts, one written in classic ASP,
the other in ASP.NET. The latter seems to lose every 0xE4 character in the
incoming POST data.

Please see the demo code here: http://www.s3.lv/demo/msnews.lostcharacter ,
you can also download the source code there.

I am now considering some mumbo-jumbo to handle the "%e4" characters in
some other way, which is a hassle of course, because it involves both
client-side and server-side code adjustments. It would be nice to
understand why this is happening.

Regards,

Pavils

Joerg Jooss · Feb 5, 2005

Pavils said:
Hello,

I am experiencing weird behaviour in my ASP.NET project. The project
consists of a client side, which can be any environment (web page, EXE
application, etc.). The client sends an HTTP POST request with data to
the server, where an ASP.NET application handles the request and
returns an answer.

I have boiled all the code down to a very simple test case consisting
of three files: an HTML page that makes a runtime HTTP POST request
(available in Internet Explorer from JS5, and in all versions of
Mozilla) and calls two supposed-to-be-identical scripts, one written in
classic ASP, the other in ASP.NET. The latter seems to lose every 0xE4
character in the incoming POST data.

First of all, what is 0xE4 supposed to be? A Unicode code point? A
character from an 8 bit character encoding?

Please see the demo code here:
http://www.s3.lv/demo/msnews.lostcharacter , you can also download
the source code there.

I am now considering some mumbo-jumbo to handle the "%e4" characters
in some other way, which is a hassle of course, because it involves
both client-side and server-side code adjustments. It would be nice to
understand why this is happening.

Your test code claims to post UTF-8, but 0xE4 is not a valid byte
sequence in UTF-8. Thus, the ASP.NET UTF-8 decoder stops after "a=X". I
guess the reason why it works in ASP is that the ASP runtime ignores
the charset attribute and uses some default character encoding like
ISO-8859-1 or Windows-1252 for which 0xE4 is a valid character.
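
The difference between the two decoders can be illustrated with a minimal
sketch. It assumes a JavaScript runtime that provides TextDecoder (a modern
browser or Node.js), and the variable names are made up for illustration.
The byte 0xE4 on its own is only the lead byte of a three-byte UTF-8
sequence, so a UTF-8 decoder cannot turn it into a character, while
ISO-8859-1 maps it straight to U+00E4.

```javascript
// The body "a=X" followed by the single byte 0xE4.
const body = Uint8Array.from([0x61, 0x3d, 0x58, 0xe4]);

// ISO-8859-1 maps each byte directly to the code point with the same value.
const asLatin1 = String.fromCharCode(...body);

// A lenient UTF-8 decoder substitutes U+FFFD for the malformed sequence.
const asUtf8Lenient = new TextDecoder("utf-8").decode(body);

// A strict UTF-8 decoder rejects the input outright.
let strictError = null;
try {
  new TextDecoder("utf-8", { fatal: true }).decode(body);
} catch (err) {
  strictError = err;
}

console.log(asLatin1 === "a=X\u00e4");
console.log(asUtf8Lenient === "a=X\ufffd");
console.log(strictError instanceof TypeError);
```

How a particular server reacts to the malformed byte (dropping it,
replacing it, or truncating the value) depends on its decoder settings;
the sketch only shows that the byte has no valid UTF-8 reading.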

Cheers,

Pavils Jurjans · Feb 5, 2005

Hi Joerg,

First of all, what is 0xE4 supposed to be? A Unicode code point? A
character from an 8 bit character encoding?

It's a character code for an Estonian character.

Your test code claims to post UTF-8, but 0xE4 is not a valid byte
sequence in UTF-8. Thus, the ASP.NET UTF-8 decoder stops after "a=X". I
guess the reason why it works in ASP is that the ASP runtime ignores
the charset attribute and uses some default character encoding like
ISO-8859-1 or Windows-1252 for which 0xE4 is a valid character.

This really shouldn't matter, because the sequence is URL-encoded anyway.

Pavils

Joerg Jooss · Feb 5, 2005

Pavils said:
Hi Joerg,

It's a character code for an Estonian character.

In what encoding? In Unicode, this character is "ä".

This really shouldn't matter, because the sequence is URL-encoded
anyway.

It does matter. URL encoding is a means to transport special characters
(or rather their code points) in URLs using safe character sequences.
Sender and receiver need to agree on how to map those sequences to the
real characters. There's no implicit or default character encoding when
using URL encoding.

If "ä" is the character you want to use, try %C3%A4 instead -- that's
the proper way to URL encode it based on UTF-8.
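
As a small illustration of this point, the standard JavaScript functions
encodeURIComponent and decodeURIComponent always work in terms of UTF-8,
so they accept %C3%A4 and reject a bare %E4. The variable names below are
made up for the example.

```javascript
// "\u00e4" is the character U+00E4 (a with diaeresis).
const utf8Encoded = encodeURIComponent("\u00e4");

// Decoding the UTF-8 based sequence gives the character back.
const roundTrip = decodeURIComponent("%C3%A4");

// A bare %E4 is not a complete UTF-8 sequence, so decoding fails.
let bareE4Error = null;
try {
  decodeURIComponent("%E4");
} catch (err) {
  bareE4Error = err;
}

console.log(utf8Encoded);
console.log(roundTrip === "\u00e4");
console.log(bareE4Error instanceof URIError);
```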

Cheers,

Peter Morris [Droopy eyes software] · Feb 5, 2005

Hi

I once experienced a similar problem with a page losing my £ pound sign on
postback.

Not sure if it is the same thing or not, but here is what the problem
was....
http://www.howtodothings.com/ViewArticle.aspx?id=d72d1885e8084320a8b520eeb0c77a42

--
Pete
====
ECO Modeler, Audio compression components, DIB graphics controls,
FastStrings
http://www.droopyeyes.com

Read or write articles on just about anything
http://www.HowToDoThings.com

My blog
http://blogs.slcdug.org/petermorris/

Pavils Jurjans · Feb 8, 2005

Hi Joerg,

It does matter. URL encoding is a means to transport special characters
(or rather their code points) in URLs using safe character sequences.
Sender and receiver need to agree on how to map those sequences to the
real characters. There's no implicit or default character encoding when
using URL encoding.

It is nice to see a person here who knows how to talk about encodings and
their application in transfers; it's a true rarity.

I was somewhat blinded by the assumption that the only thing the server
does is decode the URL-encoded content and create a string from the
resulting character codes right away. Of course, a character conversion
happens in between! I checked some code that I wrote for classic ASP two
years ago, which did just that: since classic ASP did not pick up any
character encoding information for the incoming POST data (nor is the
browser obliged to send a hint), I wrote code that reads
Request.BinaryRead(), splits the received content on the "=" and "&"
characters, URL-decodes every key-value pair, and finally applies the
character encoding passed as a function parameter to get the correct
Unicode string.
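
The approach described above can be sketched roughly as follows. This is
not the original classic ASP code: it is a JavaScript approximation, the
function names (percentDecodeToBytes, parseForm, latin1, utf8) are
hypothetical, and the body is assumed to arrive as an ASCII string. The
character encoding is passed in as a function that turns bytes into a
string.

```javascript
// Hypothetical helper: percent-decode one key or value into raw bytes.
function percentDecodeToBytes(text) {
  const bytes = [];
  for (let i = 0; i < text.length; i++) {
    const hex = text.slice(i + 1, i + 3);
    if (text[i] === "%" && /^[0-9a-fA-F]{2}$/.test(hex)) {
      bytes.push(parseInt(hex, 16));
      i += 2; // skip the two hex digits
    } else if (text[i] === "+") {
      bytes.push(0x20); // "+" stands for a space in form data
    } else {
      bytes.push(text.charCodeAt(i) & 0xff); // body is assumed to be ASCII
    }
  }
  return Uint8Array.from(bytes);
}

// Split on "&" and "=", percent-decode, then apply the chosen encoding.
function parseForm(body, decodeBytes) {
  const fields = {};
  for (const pair of body.split("&")) {
    if (pair === "") continue;
    const eq = pair.indexOf("=");
    const rawKey = eq === -1 ? pair : pair.slice(0, eq);
    const rawValue = eq === -1 ? "" : pair.slice(eq + 1);
    const key = decodeBytes(percentDecodeToBytes(rawKey));
    fields[key] = decodeBytes(percentDecodeToBytes(rawValue));
  }
  return fields;
}

// Two example decoders: ISO-8859-1 (byte value equals code point) and UTF-8.
const latin1 = (bytes) => String.fromCharCode(...bytes);
const utf8 = (bytes) => new TextDecoder("utf-8").decode(bytes);

console.log(parseForm("a=X%E4", latin1).a === "X\u00e4");
console.log(parseForm("a=X%C3%A4", utf8).a === "X\u00e4");
```

Passing the decoder in as a parameter keeps the percent-decoding step
separate from the character conversion, which is the distinction the
thread turns on.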

I decided to solve my %e4 problem by sending the code point itself to the
server:

var dataBody = "a=%u00e4";

This works on both classic ASP and ASP.NET, and I need not bother about
the applied character encodings.
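
For illustration, a body like this could be produced and checked as
sketched below. The helper name encodePercentU is hypothetical. Note that
the %uXXXX form is a non-standard extension rather than part of ordinary
percent-encoding: JavaScript's legacy unescape() understands it, while
decodeURIComponent() does not, so it only helps with receivers that are
known to accept it.

```javascript
// Hypothetical helper: write every non-ASCII UTF-16 code unit as %uXXXX
// and leave ASCII characters to the standard encodeURIComponent.
function encodePercentU(text) {
  let out = "";
  for (const unit of text.split("")) {
    const code = unit.charCodeAt(0);
    out += code < 0x80
      ? encodeURIComponent(unit)
      : "%u" + code.toString(16).padStart(4, "0");
  }
  return out;
}

const dataBody = "a=" + encodePercentU("\u00e4");

// The legacy unescape() function reads the %uXXXX form back.
const decodedLegacy = unescape("%u00e4");

// The standard decoder treats "%u0" as a malformed escape.
let standardError = null;
try {
  decodeURIComponent("%u00e4");
} catch (err) {
  standardError = err;
}

console.log(dataBody);
console.log(decodedLegacy === "\u00e4");
console.log(standardError instanceof URIError);
```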

Thanks for opening my eyes,

Pavils
