Realbot
Hi,
I'm having a character-encoding problem with a web application of mine.
To make things clearer, here is an HTML input form that reproduces it.
It takes two strings, one submitted with GET and one with POST, and it uses HTML::Mason.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Test utf</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<form name="formutfget" method="GET">
Enter text (get):<br>
<input type="text" name="textget" size="20" maxlength="30">
</form>
<form name="formutfpost" method="POST">
Enter text (post):<br>
<input type="text" name="textpost" size="20" maxlength="30">
</form>
Value of GET: <% $textget %><br>
Hex of GET: <% $hexget %><br>
Value of POST: <% $textpost %><br>
Hex of POST: <% $hexpost %><br>
</body>
</html>
<%args>
$textget => ''
$textpost => ''
$hexget => ''
$hexpost => ''
</%args>
<%init>
$hexget = unpack('H*', $textget);
$hexpost = unpack('H*', $textpost);
</%init>
The strange thing is that when I run this form in these environments:
Debian Woody - perl 5.6.1 - Mozilla 1.4.3/Firefox 1.0
Debian Sid - perl 5.8.4 - Mozilla 1.4.3/Firefox 1.0
with the input string "Δωδεκανήσων" (I don't know what it means, by the way), I get this output:
Value of GET: Δωδεκανήσων
Hex of GET: 26233931363b26233936393b26233934383b26233934393b26233935343b26233934353b26233935373b26233934323b26233936333b26233936393b26233935373b
Value of POST: Δωδεκανήσων
Hex of POST: 26233931363b26233936393b26233934383b26233934393b26233935343b26233934353b26233935373b26233934323b26233936333b26233936393b26233935373b
while on OpenBSD - perl 5.8.0 - Mozilla 1.4.3/Firefox 1.0, with the same input string, I get:
Value of GET: Δωδεκανήσων
Hex of GET: ce94cf89ceb4ceb5cebaceb1cebdceaecf83cf89cebd
Value of POST: Δωδεκανήσων
Hex of POST: ce94cf89ceb4ceb5cebaceb1cebdceaecf83cf89cebd
So it seems that in the former case I get escaped Unicode characters (HTML numeric character references such as &#916;, sent as plain ASCII) and in the latter case raw UTF-8 bytes.
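To double-check that reading, the two hex dumps can be decoded by hand. This is only a rough sketch: the helper names hex_to_bytes and expand_ncrs are made up for illustration, and it assumes the Encode module, which ships with perl 5.8.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Turn a hex dump (as produced by unpack('H*', ...)) back into raw bytes.
# Hypothetical helper, not part of Mason or Encode.
sub hex_to_bytes {
    my ($hex) = @_;
    return pack('H*', $hex);
}

# Expand decimal numeric character references (&#NNN;) into characters.
# Hypothetical helper; it only handles the decimal form seen in the dump.
sub expand_ncrs {
    my ($text) = @_;
    (my $out = $text) =~ s/&#(\d+);/chr($1)/ge;
    return $out;
}

# Hex dumps copied from the two outputs in this post.
my $debian_hex =
    '26233931363b26233936393b26233934383b26233934393b26233935343b'
  . '26233934353b26233935373b26233934323b26233936333b26233936393b'
  . '26233935373b';
my $openbsd_hex = 'ce94cf89ceb4ceb5cebaceb1cebdceaecf83cf89cebd';

# The Debian bytes are plain ASCII text starting with "&#916;".
my $debian_bytes = hex_to_bytes($debian_hex);

# Decode both forms into Perl character strings so they can be compared.
my $from_ncrs = expand_ncrs($debian_bytes);
my $from_utf8 = decode('UTF-8', hex_to_bytes($openbsd_hex));

binmode(STDOUT, ':encoding(UTF-8)');
print "Debian bytes as text: $debian_bytes\n";
print "Same characters after decoding: ",
    ($from_ncrs eq $from_utf8 ? "yes" : "no"), "\n";
```

If both forms decode to the same string, then the server code sees the same characters on every box and only the encoding of the submitted data differs.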
I thought it could be a perl 5.6 vs 5.8 difference, but as you can see, Debian Sid with perl 5.8.4 gives the same escaped characters as Woody.
Could it be an OpenBSD peculiarity? I've Googled with no luck; maybe someone can shed some light on it...
Thanks!