Bernie Cosell
I'm pretty buffaloed about the prospect of moving some of my programs
to the new version of RedHat that is native UTF-8. I don't understand
all the implications of it, and I wonder if there's some kind of
tutorial or programming-practices guide to deal with it besides
perlunicode(1)?
I note that much of my Perl code is already ugly because of a
character convention mismatch: on our system, the line terminator is
just \012, but on stuff coming in over a socket, the line terminator
is \015\012 and so I have some really sloppy code for inserting and
removing the "\r" in [most of?] the right places in the code, but
it has always felt a bit awkward.
Once we move to the new system, it'll get worse: *most* of the stuff
coming in over TCP connections will still be just ISO-Latin [with
\r\n] and my "local files" will be UTF-8 [with just \n], and I don't
know *what* I'm going to do.
I've read the "perlunicode" man page and it is more or less clear, but
I'm not sure how to structure my program in an environment that
necessarily has to handle data streams that could be *either* UTF-8 or
ISO-Latin [it is at least fathomable, if a bit tricky, to do one or
the other]. And in the process, any tricks for cleaning up the
programming to handle \r\n vs \n on a per-stream basis would be
nice.
Thanks!
/Bernie\
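
For concreteness, the per-stream handling asked about above might be sketched with PerlIO layers (Perl 5.8 or later), so that each handle declares its own encoding and line terminator once and the rest of the program sees only character strings ending in "\n". This is a minimal sketch under assumptions, not a tested recipe for the real program: the helper name set_stream_convention is made up for illustration, and in-memory handles stand in for the TCP socket and the local file.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is illustrative only): declare a stream's
# conventions once, at the point where the handle is created.
#   $encoding - an Encode name such as 'iso-8859-1' or 'UTF-8'
#   $crlf     - true if the stream uses \r\n line terminators
sub set_stream_convention {
    my ($fh, $encoding, $crlf) = @_;
    my $layers = ($crlf ? ':crlf' : '') . ":encoding($encoding)";
    binmode($fh, $layers) or die "binmode($layers) failed: $!";
    return $fh;
}

# Stand-in for a socket: ISO-Latin-1 bytes with \r\n terminators.
my $wire = "caf\xE9\r\nna\xEFve\r\n";
open(my $in, '<', \$wire) or die "open: $!";
set_stream_convention($in, 'iso-8859-1', 1);

# Stand-in for a local file: UTF-8 bytes with plain \n terminators.
my $file = '';
open(my $out, '>', \$file) or die "open: $!";
set_stream_convention($out, 'UTF-8', 0);

while (my $line = <$in>) {
    chomp $line;              # only "\n" is left to strip; the layer removed "\r"
    print {$out} "$line\n";   # encoded as UTF-8 on the way out
}
close $out or die "close: $!";
```

With a real socket, the same binmode call would be applied to the socket handle right after it is connected or accepted. On output, the :crlf layer turns "\n" back into "\r\n", so the inserting and removing of "\r" by hand should no longer be needed on such a handle.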