Python encoding question

Marc Muehlfeld

Hi,

I'm taking my first steps with Python and I'm having trouble understanding
an encoding problem. My script:

import os
os.environ["NLS_LANG"] = "German_Germany.UTF8"
import cx_Oracle
connection = cx_Oracle.Connection("username/password@SID")
cursor = connection.cursor()
cursor.execute("SELECT NAME1 FROM COR WHERE CORNB='ABCDEF'")
TEST = cursor.fetchone()
print TEST[0]
print TEST


When I run this script, it prints:
München
('M\xc3\xbcnchen',)

Why is the umlaut printed for TEST[0] but not for TEST?


And why do both prints show the wrong encoding when I switch "fetchone()" to
"fetchall()":
('M\xc3\xbcnchen',)
[('M\xc3\xbcnchen',)]


I'm running Python 2.4.3 on CentOS 5.


Regards,
Marc
 
Jean-Michel Pichavant

This has nothing to do with encoding. TEST[0] is a string; TEST is a tuple.

s1 = 'aline \n anotherline'
print str(s1)
aline
 anotherline

print repr(s1)
'aline \n anotherline'

atuple = (s1,)
print str(atuple)
('aline \n anotherline',)
print repr(atuple)
('aline \n anotherline',)

Read http://docs.python.org/reference/datamodel.html regarding __repr__
and __str__.

Basically, __str__ and __repr__ give the same result for tuples, while they
differ from each other for strings.
If you want a nice representation of tuple elements you have to do it
yourself:

print ', '.join([str(elem) for elem in atuple])

More generally, only strings print nicely, with real line breaks and UTF-8
characters. Containers such as tuples and lists display their elements with
__repr__, which shows the formal representation, and other objects fall back
on __repr__ unless they define __str__.
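
For the fetchall() case, a minimal sketch of "doing it yourself" could look like the following. It is written for Python 3 (the thread uses Python 2.4, where print is a statement), and `rows` and `format_row` are made-up stand-ins for illustration, not part of cx_Oracle:

```python
# Hypothetical data, shaped like the list of tuples that fetchall() returns.
rows = [("München",), ("Köln",)]


def format_row(row, sep=", "):
    """Join the str() of every column so that no repr() escaping is applied."""
    return sep.join([str(col) for col in row])


for row in rows:
    # Each row is printed as plain text instead of as a tuple.
    print(format_row(row))
```

The same loop should work on a real cursor result, assuming every column value has a sensible str().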

JM

PS :
class Foo:
    def __str__(self):
        return 'I am a nice representation of a Foo instance'


print Foo()
I am a nice representation of a Foo instance
print str(Foo())
I am a nice representation of a Foo instance
print repr(Foo())
<__main__.Foo instance at 0xb73a07ac>
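
To see how a container treats such an object, here is a small sketch with a class that defines both methods. The `City` class is invented for this example, and the code is written for Python 3, although the behaviour shown is the same idea as in the Python 2 session above:

```python
class City:
    """Hypothetical class with separate informal and formal representations."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        # Used by print() and str() on the object itself.
        return "City " + self.name

    def __repr__(self):
        # Used by repr(), and by containers for their elements.
        return "City(%r)" % self.name


c = City("Berlin")
print(str(c))    # City Berlin
print(repr(c))   # City('Berlin')
print(str([c]))  # [City('Berlin')]
```

Note that str() of the list still shows the __repr__ of the element, which is the effect Marc saw with his tuple.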
 
Dave Angel

Hi,

<snip>
TEST = cursor.fetchone()
print TEST[0]
print TEST


When I run this script, it prints:
München
('M\xc3\xbcnchen',)

Why is the umlaut printed for TEST[0] but not for TEST?

When you print a string, it simply prints it, control characters,
international characters, and all.

When you print a more complex object, it's up to that object to decide
how to print. In the case of a tuple above, the tuple logic displays
the parentheses and the comma, but calls the repr() of any objects it
contains. Tuple doesn't make a special case for strings or for
numbers; it just always calls repr() (actually it's __repr__(), I think).

A list does the same thing, though it'll use square brackets at the ends.

So the question boils down to what repr() does. It attempts to create a
representation that could be used to create the specific object. So if
there's a newline, it uses \n. And if there are non-ASCII codes, it
uses hex escape sequences. And of course it adds the quote marks.
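
The round-trip idea can be checked with a short sketch. It assumes Python 3, where byte strings carry a b prefix; in Python 2.4, as used in this thread, a plain string is already a byte string and the prefix is absent. The variable names are only illustrative:

```python
import ast

# The UTF-8 bytes from the thread: the umlaut takes two bytes, 0xc3 0xbc.
raw = b"M\xc3\xbcnchen"

# repr() produces source text that could recreate the object.
escaped = repr(raw)
print(escaped)  # b'M\xc3\xbcnchen'

# Evaluating that text as a literal gives back an equal object.
assert ast.literal_eval(escaped) == raw

# Eight bytes decode to seven characters.
assert len(raw) == 8
assert len(raw.decode("utf-8")) == 7

# The same applies to control characters such as the newline.
multi = "aline \n anotherline"
print(repr(multi))  # 'aline \n anotherline'
assert ast.literal_eval(repr(multi)) == multi
```

ast.literal_eval is used here instead of eval because it only accepts literals, which is enough to show the round trip.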

DaveA
 
