Helmut Richter said:
Well, each and every Perl text is in Unicode already. So there really
_is_ no problem. The problems appear when you start mucking around and
interfacing with other character sets and encodings. Then you really
need to keep track of whether you have (Perl) text (in Unicode) or some
binary data in some other format, and when and how to convert between
those. Not to mention using the right encoding settings when reading
from such files, as was discussed very recently here.
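To make the "encoding settings when reading" point concrete, here is a minimal sketch of reading the same file once as binary data and once as text. The sample content and the temporary file are my own assumptions for illustration, nothing from the earlier thread.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write seven raw bytes: the word "Gruesse" with u-umlaut and sharp s,
# encoded as UTF-8 (illustrative sample data).
my ($fh, $path) = tempfile(UNLINK => 1);
binmode $fh, ':raw';
print {$fh} "Gr\xC3\xBC\xC3\x9Fe";
close $fh or die "close failed: $!";

# Read as binary data: no decoding, one element per byte.
open my $raw, '<:raw', $path or die "Cannot open $path: $!";
my $bytes = do { local $/; <$raw> };
close $raw;

# Read as text: the layer decodes UTF-8 into Perl characters.
open my $in, '<:encoding(UTF-8)', $path or die "Cannot open $path: $!";
my $text = do { local $/; <$in> };
close $in;

printf "raw: %d, decoded: %d\n", length($bytes), length($text);
```

The two scalars come from the same file but differ in length, which is exactly why you have to know which kind you are holding.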
On the plus side, there are some really great conversion tools. Years
ago it was Perl that helped me save a very large software product,
because it could automatically convert text into numerous local email
encodings.
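A rough sketch of that kind of conversion using the core Encode module: decode once on the way in, then encode to whatever the recipient needs. The list of target encodings and the sample string are assumptions picked for illustration, not the ones from that product.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Incoming binary data: UTF-8 bytes (illustrative sample).
my $bytes_in = "Gr\xC3\xBC\xC3\x9Fe";

# Decode at the input boundary: bytes in, Perl text out.
my $text = decode('UTF-8', $bytes_in);

# Encode at the output boundary, once per target encoding.
my %encoded;
for my $name ('ISO-8859-1', 'UTF-16LE', 'UTF-8') {
    $encoded{$name} = encode($name, $text);
    printf "%-10s %2d bytes\n", $name, length($encoded{$name});
}
```

Everything between the decode and the encode works on text and never needs to care which encoding the data arrived in or will leave in.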
It is the user who has to keep track of which of his strings are meant as
bytes and which are meant as text characters. The details are explained
in
http://perldoc.perl.org/perlunitut.html .
Yikes! The term "string" usually implies text, so may I rephrase that
as "... has to keep track of which of his scalars are meant to contain
binary data (e.g. pictures, hex dumps, file images, yEnc-encoded data,
Shift-JIS encoded email, ...) and which are meant as text"? This way
you can avoid the awkward "byte string".
Problems may arise when subroutines of unknown modules are used and it is not
specified which kind of strings are expected.
It should (emphasis being on should) be clear whether they expect binary
data or text.
You should start with thoroughly understanding the tutorial cited above and
then understand other people's code.
Thanks, should have mentioned that myself.
jue