unicode encoding problem

garykpdx

Every time I think I understand Unicode, I prove I don't.

I created a variable in interactive mode like this:
s = u'ä'
where the character is an a-umlaut. That worked all right. Then I encoded it like this:
s.encode('latin1')

and it printed a sigma (totally wrong).

Then I typed this:
s.encode('utf-8')

and it gave me two weird characters: +ñ

So how do I tell what encoding my unicode string is in, and how do I
retrieve that when I read it from a file?
 
Martin v. Löwis

So how do I tell what encoding my unicode string is in, and how do I
retrieve that when I read it from a file?

In interactive mode, it is best to avoid non-ASCII characters in a Unicode
literal.
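
One way to follow this advice is to spell the character with an escape sequence, so the literal itself stays pure ASCII and the console's encoding no longer matters. A minimal sketch (the variable names are only illustrative):

```python
import unicodedata

# Both literals are plain ASCII in the source, yet denote U+00E4 (a-umlaut).
s_hex = u'\xe4'
s_named = u'\N{LATIN SMALL LETTER A WITH DIAERESIS}'

assert s_hex == s_named

# Show the encoded bytes with repr() so the terminal cannot misrender them.
print(repr(s_hex.encode('latin-1')))  # a single byte, 0xE4
print(repr(s_hex.encode('utf-8')))    # two bytes, 0xC3 0xA4
print(unicodedata.name(s_hex))
```

Printing with repr() rather than printing the raw bytes keeps the terminal from reinterpreting them in its own code page.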

In theory, Python should look at sys.stdin.encoding when processing
the interactive source. In practice, various Python releases ignore
sys.stdin.encoding, and just assume it is Latin-1. What is
sys.stdin.encoding on your system?
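
To check that value, and to see why the output might have looked wrong, something like the following may help. It is only a sketch: the guess that the console displays bytes using code page 437 (common for Windows command prompts) is an assumption, not something stated in this thread.

```python
import sys

# Encoding Python believes the terminal uses; may be None if stdin is redirected.
print(getattr(sys.stdin, 'encoding', None))

s = u'\xe4'  # a-umlaut, written as an escape to keep the source ASCII

latin1_bytes = s.encode('latin-1')  # one byte: 0xE4
utf8_bytes = s.encode('utf-8')      # two bytes: 0xC3 0xA4

# Assumption: the console renders raw bytes as cp437 characters.
seen_latin1 = latin1_bytes.decode('cp437')
seen_utf8 = utf8_bytes.decode('cp437')

print(repr(seen_latin1))
print(repr(seen_utf8))
```

Under that assumption the Latin-1 byte is displayed as a capital sigma and the UTF-8 bytes as a box-drawing character followed by ñ, which resembles the symptoms described. A unicode object itself has no encoding; an encoding applies only to the bytes produced by encode() or consumed by decode().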

Regards,
Martin
 
Christos TZOTZIOY Georgiou

In theory, Python should look at sys.stdin.encoding when processing
the interactive source. In practice, various Python releases ignore
sys.stdin.encoding, and just assume it is Latin-1. What is
sys.stdin.encoding on your system?

The difference between theory and practice is that in theory there is no
difference.
 
