Python 3.3, gettext and Unicode problems

Marcel Rodrigues · Dec 31, 2012

I'm using Python 3.3 (CPython) and am having trouble getting the standard
gettext module to handle Unicode messages.
My problem can be isolated as follows:

I have 3 files in a folder: greeting.py, greeting.po and msgfmt.py.

-- greeting.py --
import gettext

t = gettext.translation("greeting", "locale", ["pt"])
_ = t.lgettext

print("_charset = {0}\n".format(t._charset))
print(_("hello"))
-- EOF --

-- greeting.po --
msgid ""
msgstr ""
"Project-Id-Version: 1.0\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

msgid "hello"
msgstr "olá"
-- EOF --

msgfmt.py was downloaded from
http://hg.python.org/cpython/file/9e6ead98762e/Tools/i18n/msgfmt.py, since
this tool apparently isn't included in the python3 package in the official
Arch Linux repositories.

It's probably also worth noting that the file greeting.po is itself encoded
as UTF-8.

From that folder, I run the following commands:

$ mkdir -p locale/pt/LC_MESSAGES
$ python msgfmt.py -o !$/greeting.mo greeting.po
$ python greeting.py

The output is:
_charset = UTF-8

Traceback (most recent call last):
  File "greeting.py", line 7, in <module>
    print(_("hello"))
  File "/usr/lib/python3.3/gettext.py", line 314, in lgettext
    return tmsg.encode(locale.getpreferredencoding())
UnicodeEncodeError: 'ascii' codec can't encode character '\xe1' in position 2: ordinal not in range(128)

My interpretation of this output is that even though gettext correctly
detects the MO file charset as UTF-8, it tries to encode the translated
message with the system's "preferred encoding", which happens to be ASCII.
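
To check that interpretation independently of gettext, here is a minimal sketch that repeats only the failing step from the traceback (encoding the already-decoded message). The helper name encode_like_lgettext is made up for illustration, and the encodings are passed in explicitly rather than read from the locale, so this is an approximation of what lgettext does, not its actual code:

```python
import locale

# The translated message as a str; "\u00e1" is the "á" in "olá".
message = "ol\u00e1"


def encode_like_lgettext(text, encoding):
    """Hypothetical helper: repeat the failing step from the traceback,
    i.e. encode the translated str with a given encoding."""
    return text.encode(encoding)


# What the system reports as its preferred encoding (varies per machine).
print("preferred encoding:", locale.getpreferredencoding())

# Encoding with UTF-8 works: "á" becomes the two bytes C3 A1.
print(encode_like_lgettext(message, "UTF-8"))  # b'ol\xc3\xa1'

# Encoding with ASCII fails on the character at position 2.
try:
    encode_like_lgettext(message, "ascii")
except UnicodeEncodeError as exc:
    print("ASCII fails:", exc)
```

If this reading is right, the failure depends only on the locale's preferred encoding and not on the contents of the MO file.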

Does anyone know why this happens? Is this a bug in my code? Maybe I have
misunderstood gettext...

Thanks,

Marcel
