Yohan N. Leder
Hi. I'm on Windows with ActivePerl 5.8.8, adding UTF-8 support to a
script. The script generates an HTML page with charset=utf-8 that
contains a form with a file upload field. The same script receives the
form submission (a multipart/form-data POST): it reads raw STDIN, parses
the content, and decodes every name and value text field, including the
one containing the filename, to Perl's internal UTF-8 representation.
The relevant code looks something like this:
    use Encode qw(decode);
    binmode STDIN;
    read(STDIN, $data, $ENV{'CONTENT_LENGTH'});
    # ... parsing code here ...
For each name/value pair gathered in a hash:
    $name  = decode("utf8", $name);    # decode() returns the decoded string
    $value = decode("utf8", $value);
Well, it works for every pair except the filename one, when there's an
accented character in the file name.
For example:
- In the form, I select (under Windows): "c:\â.bin"
- I click the send button
- I receive "c:\â.bin" whatever the decoding stage.
The result is the same with or without decode() :-(
What does this mean exactly?
Does it mean that this filename has to be decoded from the current
operating system's charset, e.g. decode('iso-8859-1', $value)?
If yes, how can I find out the client OS's charset?
If no, what should I do?
And the same question on the server side: do I have to take the
server operating system's charset into account in order to create
files whose names are in that same charset?
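
To make the client-side part of the question concrete, here is a minimal sketch of the kind of fallback I have in mind. The helper name decode_filename and the cp1252 fallback are my own assumptions, not something I know to be correct for every browser: it tries a strict UTF-8 decode first and only falls back to the guessed legacy charset if the bytes are not valid UTF-8.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: decode raw filename bytes from the POST body.
# Assumption: if the bytes are not valid UTF-8, the client sent them in
# a legacy single-byte charset, guessed here as cp1252.
sub decode_filename {
    my ($bytes, $fallback) = @_;
    $fallback ||= 'cp1252';

    # Strict decode: FB_CROAK makes decode() die on malformed UTF-8,
    # LEAVE_SRC keeps $bytes intact for the fallback attempt.
    my $text = eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC) };
    return $text if defined $text;

    # Not valid UTF-8: reinterpret the same bytes in the guessed charset.
    return decode($fallback, $bytes);
}

# "â" as a UTF-8 client would send it (two bytes: C3 A2)
my $from_utf8   = decode_filename("c:\\\xC3\xA2.bin");
# "â" as a Latin-1/cp1252 client would send it (one byte: E2)
my $from_legacy = decode_filename("c:\\\xE2.bin");
```

Both calls should give the same character string, with "â" as the single code point U+00E2. The obvious weakness is that a legacy-encoded name can happen to be valid UTF-8 as well, so this is a heuristic and not a way to actually know the client's charset.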