Tkinter / Unicode and UTF-8

Thomas

I used to pass Unicode strings to Tk widgets. IIRC, Tcl/Tk
expects UTF-8 encoded strings, but Tkinter took care of that.
This worked, as long as I was using Python/Tk on Red Hat 9 Linux
(and on earlier versions).

Now I switched to Fedora Core 1 Linux (where Python/Tk does not
work without fixing it - but I described that in another thread)
and I have to pass UTF-8 encoded strings to Tk widgets (i.e. I
cannot directly pass Unicode strings any more).

Now I have some questions:

- Was Tkinter changed to behave like that?
- Will it stay like that in the future?
- Isn't it strange, that you have to pass UTF-8 encoded strings
to Tk widgets, but that the widgets will return Unicode strings?

(My versions: Python 2.2.3, Tkinter 2.2.3, Tcl/Tk 8.3.5)

Thanks in advance for any comments and hints; I have to change
a lot of code if passing UTF-8 encoded strings to Tk widgets
is now the only way to do it. And before doing that, I would
really like to know what the 'correct' way is.
 
Martin v. Löwis

Thomas said:
Now I switched to Fedora Core 1 Linux (where Python/Tk does not
work without fixing it - but I described that in another thread)
and I have to pass UTF-8 encoded strings to Tk widgets (i.e. I
cannot directly pass Unicode strings any more).

Then you fixed it incorrectly.

Thomas said:
- Was Tkinter changed to behave like that?

No.

Thomas said:
- Will it stay like that in the future?

No, it wasn't even changed.

Thomas said:
- Isn't it strange, that you have to pass UTF-8 encoded strings
to Tk widgets, but that the widgets will return Unicode strings?

You don't have to. It works just fine with Unicode strings.

Thomas said:
Thanks in advance for any comments and hints; I have to change
a lot of code if passing UTF-8 encoded strings to Tk widgets
is now the only way to do it. And before doing that, I would
really like to know what the 'correct' way is.

Don't change the code.

Regards,
Martin
 
Thomas

Martin v. Löwis said:
Then you fixed it incorrectly.

Hi Martin!

I just used the 'Python' and 'tkinter' RPMs from www.python.org to
update ('rpm -U ...') the RPMs provided with the Fedora Core 1 Linux
distribution. (The RPMs of the distribution did not allow any
Python/Tk application to run, because Python was compiled with the UCS4
option whereas Tcl/Tk uses UCS2, if I understand this point correctly.)
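
As a side note, the width a given Python interpreter was built with can be read from sys.maxunicode. The following is only a minimal sketch: the helper name unicode_build is made up for illustration, it says nothing about how Tcl/Tk was built, and on Python 3.3 and later the value is always 0x10FFFF, so the check is only meaningful on older interpreters.

```python
import sys


def unicode_build(maxunicode=sys.maxunicode):
    """Classify an interpreter by its largest supported code point.

    unicode_build is a hypothetical helper, not part of any library.
    0xFFFF indicates a narrow (UCS-2) build; anything larger is
    treated as a wide (UCS-4) build.
    """
    if maxunicode == 0xFFFF:
        return "UCS-2"
    return "UCS-4"


# Report the build of the interpreter running this script.
print(unicode_build())
```

This only covers the Python half of the question; whether it matches the Tcl/Tk libraries that _tkinter was compiled against still has to be established separately.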


Now, the following example does not work correctly:

from Tkinter import *
tk = Tk()

txt = Text(tk)
txt.pack()
message = u"hello"
txt.insert('1.0', message)

tk.mainloop()

(The above example will display h\x00e\x00l in the Text widget.)


But the following works (only the 'UTF-8' part is new here):

from Tkinter import *
tk = Tk()

txt = Text(tk)
txt.pack()
message = u"hello"
txt.insert('1.0', message.encode('UTF-8'))

tk.mainloop()

(I just used the Text widget as an example here; the same holds
for many other widgets, e.g. menus.)
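
For reference, a small sketch of what the encode('UTF-8') call in the second example produces. UTF-8 is a plain byte sequence, so its content does not depend on how wide the interpreter's internal Unicode code units are. The helper name to_utf8 and the sample strings are only illustrations; how _tkinter hands such a byte string on to Tk is not shown here.

```python
def to_utf8(text):
    """Encode a Unicode string as UTF-8 bytes (hypothetical helper)."""
    return text.encode("utf-8")


# Pure ASCII text: one byte per character.
ascii_bytes = to_utf8(u"hello")

# A non-ASCII character (o-umlaut, U+00F6) becomes a two-byte sequence.
latin_bytes = to_utf8(u"L\xf6wis")

print(repr(ascii_bytes))
print(repr(latin_bytes))
print(len(latin_bytes))
```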

In some Tcl/Tk documentation, I read that Tk widgets expect UTF-8;
somewhere else (don't remember the URLs), I read that _tkinter.c
handles this (by encoding Unicode strings with UTF-8 for Tk widgets);
that was the reason why I thought there might have been a change
in Tkinter recently.

But (my main problem!): I still do not understand why the first
example does not work, while the second does!?

Thomas
 
Martin v. Löwis

Thomas said:
I just used the 'Python' and 'tkinter' RPMs from www.python.org to
update ('rpm -U ...') the RPMs provided with the Fedora Core 1 Linux
distribution.

Which RPM did you use specifically? If it is

http://www.python.org/ftp/python/2.3.2/rpms/redhat-9/python2.3-tkinter-2.3.2-1pydotorg.i386.rpm

then you can't use it on Fedora 1: The RPM is for Redhat 9, after all,
not for Fedora 1.

Thomas said:
But (my main problem!): I still do not understand why the first
example does not work, while the second does!?

Because you are using incorrect binaries. You will have to build Python
from source on Fedora 1, or wait for Redhat to fix the package.

The pydotorg RPM assumes that Tk uses UCS-4 internally, as it does on
Redhat 9. On Fedora 1, Tk uses UCS-2, so copying a Python Unicode string
into a Tcl Unicode string copies twice as many characters as you have
(and overwrites some unrelated memory in the process).
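
The reported symptom can be imitated in pure Python. The following is only a sketch and not what _tkinter.c actually does: it assumes a little-endian machine, 4-byte code units on the writing side and 2-byte code units on the reading side, and it is written for Python 3. The helper name misread_as_ucs2 is hypothetical.

```python
def misread_as_ucs2(text):
    """Imitate reading 4-byte code units as if they were 2-byte ones.

    Hypothetical helper for illustration only. The text is laid out
    as 4-byte little-endian code units (UTF-32-LE); the reader then
    takes len(text) code units of 2 bytes each from the start of that
    buffer, as a reader expecting 2-byte units would.
    """
    raw = text.encode("utf-32-le")   # 4 bytes per character
    wanted = 2 * len(text)           # reader consumes 2 bytes per character
    return raw[:wanted].decode("utf-16-le")


print(repr(misread_as_ucs2("hello")))   # prints 'h\x00e\x00l'
```

Every second code unit the reader sees is the zero upper half of a 4-byte unit, which matches the h\x00e\x00l shown in the Text widget.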

There is, unfortunately, no way to detect the problem at run-time. So
I repeat: You *have* to compile from source.

Regards,
Martin
 
Thomas

Martin v. Löwis said:
Because you are using incorrect binaries. You will have to build Python
from source on Fedora 1, or wait for Redhat to fix the package.

I think that will not happen (at least not for the Fedora 1 Release).
(My experience with RedHat is: security fixes: yes; bug fixes: no).

Martin v. Löwis said:
The pydotorg RPM assumes that Tk uses UCS-4 internally, as it does on
Redhat 9. On Fedora 1, Tk uses UCS-2, so copying a Python Unicode string
into a Tcl Unicode string copies twice as many characters as you have
(and overwrites some unrelated memory in the process).

Thanks for the explanation! Now I got it.

Martin v. Löwis said:
There is, unfortunately, no way to detect the problem at run-time. So
I repeat: You *have* to compile from source.

I just did that, following your advice.

(I compiled Tcl/Tk 8.4 and Python 2.3 from source without
removing the Fedora 1 installation of Python 2.2 and Tcl/Tk 8.3.
Now, with my new Python/Tkinter 2.3 and Tcl/Tk 8.4, everything
works as usual. Both Python and Tcl/Tk use UCS-2 now.
I only had to rename the Python 2.3 executable (so that
it does not get executed when the system actually wants to use its
own Python 2.2), and I have to use a startup script for Python 2.3
that sets LD_LIBRARY_PATH, because otherwise Python does not find
libtk8.4.so in /usr/local/lib. The LD_LIBRARY_PATH script could be
avoided with a permanent solution (a symlink or ldconfig), but at
the moment, this solution is OK for me.)


Thanks a lot for your help! This really drove me nuts :)

Regards, Thomas.
 
