sln
I need help from Unicode gurus, or anybody with some knowledge of the subject.
Suppose I have a text (character) file that has already been opened. I don't know its
encoding, so I can't open it with an explicit encoding layer.
As I understand it, a text file may (or may not) begin with an encoding mark: a BOM,
a short sequence of octets that indicates what its encoding is.
Since I am writing a module, I might be passed only a file descriptor. If I rewind the
position to the start of the file, is there any way I can tell the encoding? If I could, and
it's not UTF-8, I could decode() the rest of the file as octets (i.e. decode it in memory),
create a decoded temp file, or possibly re-open the file with the proper encoding.
I expect the encoding to be one of the usual 8/16/32-bit UTFs, but with characters from
many locales. I am still unsure where to find a list of encoding markers that would let me
work this out, and equally unsure which methods are available for analysis and transformation.
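To make the idea concrete, here is a minimal sketch of the BOM check described above. It assumes a seekable filehandle and only the five standard Unicode BOM signatures. The names sniff_bom and detect_encoding are hypothetical helpers made up for illustration, not part of Encode. A file with no BOM yields undef, and the encoding would then have to be guessed some other way.

```perl
use strict;
use warnings;

# Standard Unicode BOM signatures. The 4-octet UTF-32 marks come first,
# because the UTF-32LE mark (FF FE 00 00) begins with the UTF-16LE mark (FF FE).
my @BOMS = (
    [ "\x00\x00\xFE\xFF", 'UTF-32BE' ],
    [ "\xFF\xFE\x00\x00", 'UTF-32LE' ],
    [ "\xEF\xBB\xBF",     'UTF-8'    ],
    [ "\xFE\xFF",         'UTF-16BE' ],
    [ "\xFF\xFE",         'UTF-16LE' ],
);

# Hypothetical helper: given the first octets of a file, return
# (encoding name, BOM length), or an empty list when no BOM matches.
sub sniff_bom {
    my ($head) = @_;
    for my $entry (@BOMS) {
        my ($bom, $name) = @$entry;
        return ($name, length $bom) if index($head, $bom) == 0;
    }
    return;
}

# Hypothetical helper: rewind an already-open handle, look for a BOM,
# leave the handle positioned just past it, and push a decoding layer.
# Returns the encoding name, or undef if there is no BOM or no seek.
sub detect_encoding {
    my ($fh) = @_;
    binmode($fh);                       # read raw octets
    seek($fh, 0, 0) or return;          # fails on pipes and sockets
    my $n = read($fh, my $head, 4);
    return unless defined $n;
    my ($name, $len) = sniff_bom($head);
    seek($fh, $len // 0, 0);            # skip the BOM, or rewind fully
    binmode($fh, ":encoding($name)") if defined $name;
    return $name;
}
```

One known ambiguity in this approach: FF FE 00 00 could also be UTF-16LE text whose first character is NUL, and the sketch treats it as UTF-32LE. For files without a BOM, Encode::Guess (shipped with Encode) may help, and File::BOM on CPAN covers similar ground to the code above.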
I know Perl has a massive Encode library ('use Encode'); nevertheless, this is what I need
to do to finalize a module I'm working on.
Thanks for the help.
-sln