Peter J. Holzer
David said: I don't think the docs are that unclear. In perlfunc#pack it says:
"C An unsigned char value. Only does bytes. See U for Unicode."
Yup. But that's for pack, and there is no ambiguity for pack.
pack('C*', 0xFC) always returns "\x{FC}". But the reverse operation is
ambiguous: unpack('C*', "\x{FC}") may return (0xFC) or (0xC3, 0xBC),
depending on whether the string happens to have the UTF-8 flag set or
not. I find this surprising and I find no mention that this is the
intended behaviour (rather than a side-effect of the implementation).
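To make the two cases concrete, here is a minimal sketch. The variable
names are mine, and I am deliberately not claiming what the two "unpack"
lines print on every perl version, since that is exactly the open
question. The last part only shows how a caller can choose one of the two
readings explicitly with utf8::downgrade and utf8::encode.

```perl
use strict;
use warnings;

# Build the same one-character string in both internal representations.
my $bytes = pack('C*', 0xFC);   # one character, code point 0xFC
my $wide  = $bytes;
utf8::upgrade($wide);           # same character, now stored internally as UTF-8

# The two strings compare equal but differ in the internal UTF-8 flag.
printf "equal: %s\n", ($bytes eq $wide ? 'yes' : 'no');
printf "flag:  %d vs %d\n",
    (utf8::is_utf8($bytes) ? 1 : 0), (utf8::is_utf8($wide) ? 1 : 0);

# What unpack reports for each. Whether these two lines agree is the
# behaviour under discussion and may depend on the perl version.
printf "unpack C* (unflagged): %s\n",
    join(' ', map { sprintf '%02X', $_ } unpack('C*', $bytes));
printf "unpack C* (flagged):   %s\n",
    join(' ', map { sprintf '%02X', $_ } unpack('C*', $wide));

# Making the choice explicit instead of relying on the flag:
my $as_char = $wide;
utf8::downgrade($as_char);      # byte representation; dies if a char is > 0xFF
my $as_utf8 = $wide;
utf8::encode($as_utf8);         # replace the string by its UTF-8 octets

printf "downgraded: %s\n",
    join(' ', map { sprintf '%02X', $_ } unpack('C*', $as_char));
printf "encoded:    %s\n",
    join(' ', map { sprintf '%02X', $_ } unpack('C*', $as_utf8));
```

After utf8::downgrade the string is a plain byte string, so unpack sees
the single value 0xFC; after utf8::encode it holds the two octets 0xC3
and 0xBC. Either way the result no longer depends on how the string
happened to be stored.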
"Only does bytes" in the description of pack IMHO means "pack takes only
values from 0 to 255 and returns a byte string". It doesn't explicitly
say anything about the behaviour of unpack when fed a UTF-8 string, and
I'd like to have this explicitly spelled out (even if it is only "the
behaviour is undefined").
hp