toylet
Tad said: What happened when you tried it?
It worked as well. So it's the default separator used by split()?
Tad said: What happened when you tried it?
You mean ASCII doesn't work for Chinese?
I was talking about the length(). Where is the connection between
Chinese characters and length()?
All computer data are referenced as 8-bit bytes these days.
Characters are not stored in 8-bit bytes.
Nonsense. And what a particular machine/implementation calls a "byte"
has very little to do with characters.
What is a character in Perl's sense?
Under the BIG5 character encoding, each Chinese character is stored
as two bytes. One byte is always equal to 8 bits anyway.
I think we need to define "character".
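To make the byte/character distinction concrete, here is a minimal sketch using the core Encode module. It assumes a Perl with Encode's Big5 support available (the `big5` name is an Encode alias); the two characters are just sample code points picked for illustration, not anything from the original question.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Two Chinese characters given as Unicode code points (U+4E2D, U+6587).
# These sample characters are an assumption chosen for illustration.
my $chars = "\x{4E2D}\x{6587}";

# Encode the character string into a string of Big5 bytes.
my $bytes = encode('big5', $chars);

# length() counts characters in the first case and bytes in the second.
printf "characters: %d\n", length $chars;
printf "bytes:      %d\n", length $bytes;
```

So length() on the decoded string reports two characters, while length() on the Big5-encoded string reports four bytes.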
toylet said: You meant each char in a Perl string is not stored as one byte?
toylet said: I was talking about the length(). Where is the connection between
Chinese characters and length()?
All computer data are referenced as 8-bit bytes these days.
There is no simple and easy answer to that.
I think your question is probably best answered by referring you to
the perluniintro and perlunicode documentation (which come with
Perl); specifically the section titled "Byte and Character
semantics", and to advise you to read up on Unicode and the various
encoding schemes that come with it.
It is important to stop thinking of characters as matching C's char
type, and to stop thinking of C's char type always being 8 bits (even
though a char is always a byte).
to be stored?
So what do you want to know? The length of the string in characters or the
size of the allocated memory? You were asking for the memory size.
Which means 256 distinct values, which means there is just no way to encode
those tens of thousands of Chinese characters in one single byte.
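As a quick illustration of that point, here is a small sketch that looks at the numeric code point of one Chinese character. The character (U+4E2D) is only a sample chosen for this example.

```perl
use strict;
use warnings;

# A single Chinese character, U+4E2D (sample character, an assumption).
my $char = "\x{4E2D}";

# ord() returns the numeric code point of the first character.
my $code_point = ord $char;

printf "code point: %d (0x%X)\n", $code_point, $code_point;

# One 8-bit byte can hold only the values 0..255.
print "fits in one byte? ", ($code_point <= 255 ? "yes" : "no"), "\n";
```

The code point is far above 255, so some multi-byte encoding is needed to store it.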
You meant length() would react to Unicode settings in Perl?
I think one byte is always equal to 8 bits. All computer courses taught
that. A 9-bit byte? What machines do that?
toylet said: I didn't expect my question on displaying the bytes in a string would
end up talking about multilingual issues.
toylet said: hex() should not be relevant as I need to convert from numeric to hex
digits. Already knew about substr().
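For the original question of displaying the bytes in a string, one possible approach is sprintf with ord, which goes from a numeric value to hex digits. This is only a sketch: `hex_dump` is a made-up helper name, and it assumes the string already holds bytes (an encoded string) rather than decoded wide characters.

```perl
use strict;
use warnings;

# hex_dump is a hypothetical helper name used for this sketch.
# It assumes $bytes holds bytes, i.e. every ord() value is 0..255.
sub hex_dump {
    my ($bytes) = @_;

    # split // breaks the string into single characters (here: bytes);
    # ord() gives each one's numeric value, sprintf formats it as two hex digits.
    return join ' ', map { sprintf('%02x', ord($_)) } split //, $bytes;
}

print hex_dump("Perl"), "\n";
```

unpack with an "H*" template is another option if you do not need the spaces between the bytes.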