Yohan N. Leder
Hello.
All my tests are done using ActivePerl 5.8.8.817 under Win2K FR and
Apache2.
I'm trying to read (and display) user data submitted from a web form
with enctype 'application/x-www-form-urlencoded', and I don't succeed. I
can do it if the form is 'multipart/form-data', but not if it is
'application/x-www-form-urlencoded'.
Here is a script to show the difference:
---- BEGIN ----
#!/usr/bin/perl -w
my $this = "utf8_and_webform.pl";
require 5.8.0;
use utf8;
binmode(STDOUT, ':utf8');
print "Content-type: text/html; charset=UTF-8\n\n";
if (defined $ENV{'QUERY_STRING'} && length($ENV{'QUERY_STRING'}) > 0)
{&see;}
else {&ask;}
exit 0;
sub ask
{ # provide web forms for user to enter data
print <<PAGE;
<html><head><title>Test about UTF-8 and web form</title></head><body>
Use the form you want and see the resulting data.
<p>
FORM with enctype as 'application/x-www-form-urlencoded' :<br>
<form action='$this?x' method='post' accept-charset='UTF-8'
enctype='application/x-www-form-urlencoded'>
<textarea name='msg' rows='4' cols='30' wrap='virtual'></textarea>
<input type='submit' value='send'>
</form></p>
<p>
FORM with enctype as 'multipart/form-data' :<br>
<form action='$this?x' method='post' accept-charset='UTF-8'
enctype='multipart/form-data'>
<textarea name='msg' rows='4' cols='30' wrap='virtual'></textarea>
<input type='submit' value='send'>
</form></p>
</body></html>
PAGE
}
sub see
{ # display data which come from user form
my $data='';
binmode(STDIN, ':utf8'); # or ':encoding(UTF-8)'
read(STDIN, $data, $ENV{'CONTENT_LENGTH'});
# OR
#use Encode qw(decode);
#read(STDIN, $data, $ENV{'CONTENT_LENGTH'});
#$data = decode('UTF-8', $data);
print $data;
}
----- END ----
For example, if I submit the 'urlencoded' form (the first one, at the top of
the generated web page, which you get by running the script without any URL
parameter) with the letter 'é' (accented e) in the textarea, I get
'msg=%C3%A9' displayed in the browser (after it has been processed by
the see() sub).
Whereas if I submit the same 'é' from the 'multipart/form-data' form (the
second one, at the bottom of the generated web page), I get a correctly
interpreted UTF-8 'é', as expected.
How can I get the same UTF-8 'é' when the form uses the
'application/x-www-form-urlencoded' enctype? How should the see() sub be
modified for this urlencoded form case?
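To make the 'msg=%C3%A9' case concrete: in a urlencoded body each non-ASCII byte arrives as a %XX escape, so one possible approach is to undo the percent-encoding first and only then decode the resulting bytes as UTF-8. Below is a minimal sketch of that idea, not a tested fix for the script above. The names decode_urlencoded_field and parse_urlencoded_body are hypothetical helpers invented for this illustration, and the sketch assumes the body was read from STDIN as raw bytes (without a ':utf8' layer on STDIN).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not part of the original script): turns one
# urlencoded name or value such as '%C3%A9' into a Perl character string.
sub decode_urlencoded_field {
    my ($raw) = @_;
    $raw =~ tr/+/ /;                              # '+' stands for a space
    $raw =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;  # %XX -> the single byte XX
    return decode('UTF-8', $raw);                 # UTF-8 bytes -> characters
}

# Hypothetical helper: splits a raw body like 'msg=%C3%A9' into a hash
# of decoded names and values.
sub parse_urlencoded_body {
    my ($body) = @_;
    my %field;
    for my $pair (split /[&;]/, $body) {
        next unless length $pair;                 # skip empty segments
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        $field{ decode_urlencoded_field($name) } = decode_urlencoded_field($value);
    }
    return %field;
}

# Same input as in the example: the body received for 'é'.
my %field = parse_urlencoded_body('msg=%C3%A9');
binmode(STDOUT, ':utf8');
print $field{msg}, "\n";
```

In see(), the equivalent would be to read CONTENT_LENGTH bytes from STDIN and pass them through such a parser before printing. The multipart form does not need this step because its body carries the UTF-8 bytes directly rather than as %XX escapes.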