Paul Johnston
Hi
I have a string which I convert into a list and then loop over,
printing each character's glyph and numeric value:
# -*- coding: utf-8 -*-
thestring = "abcd"
thelist = list(thestring)
for c in thelist:
    print c,
    print ord(c)
This works fine for Latin characters, but when I put in a Unicode
character, a two-byte character gives me two characters. For example,
an Arabic alef returns
* 216
* 167
(the first asterisk stands for the empty set symbol, the second for a double "s")
Putting in sequential characters, i.e. alef, beh, teh marbuta, gives me
sequential listings, i.e.
216 167
216 168
216 169
So it is reading the correct details.
Is there any way to get the c in the for loop to recognise that it is
reading a multi-byte character?
I have followed the info in PEP 0263 and am using Python 2.4.3 Build
12 on a Windows box, within Eclipse 3.2.0 with Python plugins 1.2.2.
Cheers Paul
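A small sketch of what the numbers in the post suggest, offered as an illustration rather than a confirmed diagnosis: 216 167 is the UTF-8 encoding of alef (U+0627), so the loop appears to be walking over the bytes of an undecoded string. The snippet is written for a current Python 3 interpreter, not the Python 2.4.3 used in the post, and the names raw, byte_values, text and code_points are made up for the example.

```python
# UTF-8 bytes for alef, beh, teh marbuta (U+0627, U+0628, U+0629).
# These are the same byte pairs reported in the post.
raw = b"\xd8\xa7\xd8\xa8\xd8\xa9"

# Iterating over the undecoded bytes yields one number per byte.
byte_values = list(raw)

# Decoding first turns each two-byte sequence into a single character,
# so ord() then reports one code point per letter.
text = raw.decode("utf-8")
code_points = [ord(c) for c in text]

print(byte_values)   # [216, 167, 216, 168, 216, 169]
print(code_points)   # [1575, 1576, 1577]
```

Under Python 2 the rough equivalent would be to decode the byte string with .decode("utf-8"), or to write the literal with a u prefix, before looping over it.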