alex
I have a (little-endian) UTF-16 Unicode file which I want
to read. I use code looking like:
    open(F, "<:encoding(utf16)", $file)
        or die "can't open $file: $!\n";
    while (<F>) {
        print;
    }
This works fine for the first few lines of the file,
before it throws an exception:
UTF-16:Unrecognised BOM 2400
What appears to be happening is that chunks of 1024 bytes
are being passed to Encode::Unicode::decode to break into
characters, and that on the second chunk there isn't (of
course!) a BOM and so it throws an exception.
The same also happens with
    open(F, "<$file")
        or die "can't open $file: $!\n";
    binmode(F, ":encoding(utf16)");
So, what is the correct incantation for open'ing UTF-16 files?
TIA
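
If the BOM detection really is what fails on later chunks, one possible workaround is to name the byte order explicitly, so the decoder never needs to look for a BOM at all. The following is only a sketch under that assumption: the helper name read_utf16le_lines and the temporary sample file are made up for illustration, and the file is assumed to be little-endian as stated above. Note that with an explicit UTF-16LE layer the BOM is not consumed by the decoder, so the sketch strips it from the first line by hand.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read a little-endian UTF-16 file line by line.
# The byte order is given explicitly, so no BOM lookup is needed.
sub read_utf16le_lines {
    my ($file) = @_;
    open(my $fh, "<:raw:encoding(UTF-16LE)", $file)
        or die "can't open $file: $!\n";
    my @lines;
    while (my $line = <$fh>) {
        # UTF-16LE leaves the BOM in the data as U+FEFF; drop it from line 1.
        $line =~ s/^\x{FEFF}// if $. == 1;
        push @lines, $line;
    }
    close($fh);
    return @lines;
}

# Build a sample file well over 1024 bytes so that more than one
# chunk has to be decoded (made-up test data, not the original file).
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
binmode($tmp_fh, ":raw:encoding(UTF-16LE)");
print {$tmp_fh} "\x{FEFF}";
print {$tmp_fh} "line $_ of the sample file\n" for 1 .. 200;
close($tmp_fh);

my @lines = read_utf16le_lines($tmp_name);
print scalar(@lines), " lines read\n";
```

On Windows, where the lines end in CRLF, it is probably also necessary to push a :crlf layer after the encoding layer; that part is not covered by this sketch.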