peter pilsl
Thanks to Alan and Shawn for their replies to my last posting. I read a lot
of docs before and after, and still do, but it's all very confusing.
Finally I found an approach that actually works for me, and I wanted
to ask you whether it makes sense and *might* even keep working, or whether
it is just asking for trouble.
I read parameters delivered by the web browser (the charset in the HTML
header is always UTF-8!), and want to sort and lowercase them and print
them out again.
I don't set STDIN and STDOUT to ":utf8", because that does not work with
mod_perl.
.....
my $input = $cgi->param('myfield');
utf8::decode($input);
utf8::downgrade($input);          # otherwise sort will not sort according to
                                  # my LC_COLLATE setting, and I need a
                                  # localized sort (mainly German data)
my $value = do_a_lot($input);     # do some data processing, including sorting
utf8::upgrade($value);            # otherwise the lc() in the next line would
                                  # not lowercase characters like German umlauts
$value = lc($value);
utf8::downgrade($value);          # to make sort work again
$value = do_a_lot_more($value);   # do some more data processing and sorting
utf8::encode($value);
print $value;
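
To make the effect of each step visible, here is a minimal sketch of the same sequence on a fixed byte string. The literal bytes stand in for the value returned by $cgi->param and are an assumption, as are the variable names; the data-processing and sorting steps are left out, and no "use locale" is in effect.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $cgi->param('myfield'):
# the six UTF-8 bytes of "APFEL" with a leading A-umlaut.
my $input = "\xC3\x84PFEL";

utf8::decode($input);               # UTF-8 bytes -> characters, in place
my $len_chars = length $input;      # counts characters now, not bytes

utf8::downgrade($input);            # same characters, stored one byte each
my $flag_after_downgrade = utf8::is_utf8($input) ? 1 : 0;

utf8::upgrade($input);              # same characters, stored as UTF-8 internally
my $lower = lc $input;              # Unicode rules apply, so the umlaut is lowercased

utf8::encode($lower);               # characters -> UTF-8 bytes, in place
my $out_bytes = $lower;             # this is what would be printed
```

Note that upgrade and downgrade only change the internal storage; the characters in the string stay the same, which is why the same string can be passed through both.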
So is it OK to get the data somehow "raw" from the web interface, then
decode it, process it and encode it again to print it out, or is this a
rather stupid approach?
Is it normal that I need to decode values delivered by a web page that
has the UTF-8 charset in its header?
Is it OK to clear the UTF-8 flag to make sorting work according to the
locale, and set the flag again to make lc() work? Or does this just show
that there is something wrong in my script?
If I used Unicode::Collate I would not need this fiddling with UTF-8, but
it is very slow (because it loads the big allkeys.txt file) and might
cause trouble in multithreaded applications (as I read somewhere).
I did not provide a full script, because this posting is long enough as
it is. I hope that is OK.
I also tried to replace utf8::encode/decode with Encode::from_to, but
failed so far, because I don't actually know from what to what I want to
convert. One side is UTF-8, but what is the other side?
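
For comparison, a minimal sketch of how the Encode functions differ, assuming the input is a UTF-8 byte string. The sample bytes and variable names are made up for illustration. Encode::decode and Encode::encode convert between bytes and Perl's character strings, while Encode::from_to converts bytes in one named encoding into bytes in another, in place.

```perl
use strict;
use warnings;
use Encode qw(decode encode from_to);

# Hypothetical input: the UTF-8 bytes of "uber" with a leading u-umlaut.
my $bytes  = "\xC3\xBCber";
my $latin1 = $bytes;                        # copy, since from_to works in place

# Bytes -> characters: the "other side" is a Perl character string.
my $chars = decode('UTF-8', $bytes);

# Characters -> bytes again, after some processing on characters.
my $back = encode('UTF-8', lc $chars);

# Bytes -> bytes: both sides are named encodings, no character string involved.
from_to($latin1, 'UTF-8', 'ISO-8859-1');
```

With decode/encode the processing happens on characters between the two calls; with from_to there is no such middle step, so it only fits when the goal is to recode bytes.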
Thanks a lot,
peter