Terry Hancock
I still run into my own ignorance a lot with unicode in Python.
Is it possible to define some combination of __repr__, __str__,
and/or __unicode__ so that the unicode() wrapper isn't necessary
in this statement:
>>> print unicode(jp.concepts['adjectives']['BLUE'][0])
<GLOSS: 青い, cl=None, {'wd': u'\u9752\u3044'}>
(i.e. can I make it so that the object that print gets is already
unicode, so that the label '青い' will print readably?)
Or, put another way, what exactly does 'print' do when it gets
a class instance to print? It seems to do the right thing if
given a unicode or string object, but I can't figure out how to
make it do the same thing for a class instance.
I guess it would've seemed more intuitive to me if print attempted
to use __unicode__() first, then __str__(), and then __repr__(). But
it apparently skips straight to __str__(), unless the object is already
a unicode object. (?)
The following doesn't bother me:
>>> jp.concepts['adjectives']['BLUE'][0]
<GLOSS: \u9752\u3044, cl=None, {'wd': u'\u9752\u3044'}>
And I understand that I might want that if I'm working in
an ASCII-only terminal. But it's a big help to be able to
read/recognize the labels when I'm working with localized
encodings, and I'd like to save the extra typing if I'm
going to be looking at a lot of these.
So far, I've tried overriding the __unicode__ method to return
the unicode representation (doesn't seem like print calls it,
though), and I've tried returning the same thing from __repr__,
but the latter causes this unpleasant result:
>>> print jp.concepts['adjectives']['BLUE'][0]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode characters in position 8-9: ordinal not in range(128)
so I don't think I want to do that.
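For what it's worth, I think the same error can be reproduced without my class at all. This is only a sketch: the string below is a hypothetical stand-in for what my __repr__ returns, and I'm assuming (not certain) that print ends up pushing that unicode string through the 'ascii' codec. If so, "position 8-9" would simply be the two characters of the label, which start right after the eight characters of '<GLOSS: '.

```python
import sys

# Hypothetical stand-in for the unicode string returned by my __repr__.
text = u"<GLOSS: \u9752\u3044, cl=None>"

try:
    # Assumption: this is roughly what happens to the string on its way out.
    text.encode("ascii")
except UnicodeEncodeError:
    # sys.exc_info() is used so the sketch runs on old and new interpreters.
    err = sys.exc_info()[1]
    # err.start is the first offending index; err.end is one past the last.
    print("%s codec failed at positions %d-%d"
          % (err.encoding, err.start, err.end - 1))
```

The reported positions cover exactly the non-ASCII label, so the failure seems to be about the encoding step rather than anything specific to the class.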
Advice?
Terry