Laszlo Nagy
How can I specify the encoding for the built-in eval function? Here is the
documentation:
http://docs.python.org/lib/built-in-funcs.html
It says that the "expression" parameter is a string, but says nothing
about the encoding. The same goes for execfile, eval and compile.
The basic problem:
- expressions need to be evaluated by a program
- expressions are managed through a web-based interface. The browser
supports UTF-8, the database also supports UTF-8. The user needs to be
able to enter string expressions in different languages, and store them
in the database
- expressions are for filtering emails, and the emails can contain any
character in any encoding
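To make the setup concrete, here is a minimal sketch of what the program would do, assuming Python 3 and assuming the database returns the expression as UTF-8 bytes. The function name evaluate_stored_expression and the variable name body are hypothetical, chosen only for illustration. The sketch decodes explicitly, so eval receives text and the encoding is never left to guesswork:

```python
# Hypothetical sketch (Python 3); the names are illustrative only.

def evaluate_stored_expression(raw, message):
    """Decode a UTF-8 expression fetched from storage and evaluate it."""
    # Assumption: storage hands back UTF-8 bytes. Decode explicitly so that
    # eval() receives text and never has to infer an encoding.
    text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
    # Assumption: a filter only needs the variable `body`. Builtins are
    # emptied, but that alone does not make eval() safe for untrusted input.
    return eval(text, {"__builtins__": {}}, {"body": message})


# A filter expression as it might be stored, encoded as UTF-8 bytes.
stored = '"削減" in body or "ÁÍŰŐ" in body'.encode("utf-8")
print(evaluate_stored_expression(stored, "徹底したコスト削減"))
```

Decoding at the boundary keeps the question of encoding out of eval entirely: the same function accepts an already-decoded string unchanged.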
I tried to use eval with/without unicode strings and it worked. Example:
たコスト削減 ÁÍŰŐÜÖÚÓÉ трирова"' )
True
The above test was run on Ubuntu Linux in gnome-terminal, which
supports Unicode. What would happen under Windows?
I'm also confused about how this relates to PEP 0263. I always get a warning
when I put '"徹底したコスト削減 ÁÍŰŐÜÖÚÓÉ трирова"' in a source
file without a "# -*- coding: " line specified. Why is it not the same for
eval? Why does it not raise an exception (or why does the encoding not
need to be specified)?
Thanks,
Laszlo