UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb6 in position

  • Thread starter Νίκος

feedthetroll

On Friday, 5 July 2013 12:33:05 UTC+2, Νίκος Gr33k wrote:
...
Wait!
Are you saying that the ip address is being returned as a byte string
which then i have to decode with something like:

host = socket.gethostbyaddr( os.environ['REMOTE_HOST'].decode('utf-8') )[0]

Wait!
I get a decode error when python tries to automatically decode a bytestring
assuming it to be utf-8 encoded.
I am sure the error will disappear when I try to decode it explicitly using
utf-8. Heureka! I got it!

Or in other words:
If a big stone falls on my foot accidentally, it hurts.
--------------------------------------------^
But I am sure it will not hurt if I take that same stone and throw it on my foot.

Heureka! I got it!



P.S.:

On 14.06.2013 10:35, Fábio Santos wrote:
 
Dave Angel

On 5/7/2013 12:21 PM, Dave Angel wrote:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.3/os.py", line 669, in __getitem__
    value = self._data[self.encodekey(key)]
KeyError: b'REMOTE_ADDR'


Wait!
Are you saying that the ip address is being returned as a byte string
which then i have to decode with something like:

host = socket.gethostbyaddr( os.environ['REMOTE_HOST'].decode('utf-8') )[0]

Don't fix the problem till you understand it. Figure out who is dealing
with a byte string here, and where that byte string came from. Adding a
decode, especially one that's going to do the same decode as your
original error message, is very premature.

You're quoting from my error output, and that's caused because I don't
have such an environment variable. But you do. So why aren't you in
there debugging it? And why on earth are you using the complex
expression instead of a refactored one which might be simple enough for
you to figure out what's wrong with it?

There is definitely something strange going on with that os.environ
reference (NOT call). So have you yet succeeded in running the factored
lines? If you can't get them to run, at least up to the point that you
get that unicode error, then you'll make progress only by guessing.

Get to that interactive debug session, and enter the lines till you get
an error. Then at least you know which line is causing the error.

xxx = os.environ['REMOTE_HOST']
yyy = socket.gethostbyaddr(xxx)
host = yyy[0]


I'll bet the real problem is you're using some greek characters in the
name of the environment variable, rather than "REMOTE_HOST". So
everything you show us is laboriously retyped, hiding the real problems
underneath.
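
A minimal sketch of the factored approach described above, not Dave's actual code: each of the three lines runs on its own, and the first failure is reported along with the repr() of the value that fed it. The helper name resolve_steps and the injectable environ and lookup parameters are assumptions, added only so the sketch can run without a CGI server or DNS.

```python
import os
import socket


def resolve_steps(environ=os.environ, lookup=socket.gethostbyaddr):
    """Run the lookup one step at a time and report where it fails.

    Returns (step_name, exception) for the first failing step,
    or ("ok", host) if every step succeeds.
    """
    try:
        xxx = environ['REMOTE_HOST']
    except Exception as exc:
        return ("environ lookup", exc)

    try:
        yyy = lookup(xxx)
    except Exception as exc:
        # repr() shows exactly what was passed, including odd bytes.
        return ("gethostbyaddr(%r)" % (xxx,), exc)

    try:
        host = yyy[0]
    except Exception as exc:
        return ("indexing result", exc)

    return ("ok", host)


# Stand-in values (hypothetical) so the example needs no server or DNS.
step, result = resolve_steps(
    environ={'REMOTE_HOST': '192.0.2.1'},
    lookup=lambda addr: ('example.invalid', [], [addr]),
)
print(step, result)
```

Called with no arguments inside the real CGI script, the same function would use os.environ and socket.gethostbyaddr, and the returned step name says which line to look at.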
 
Lele Gaifax

Dave Angel said:
You're quoting from my error output, and that's caused because I don't
have such an environment variable. But you do.

Dave, maybe you already know, but that variable is "injected" by the CGI
mechanism; it does not come from the OP's shell environment.

As Νίκος discovered, when he "cloudfare" (whatever that means) his site,
the REMOTE_HOST envvar contains some (I guess) latin-greek encoded
string, and the remote address is carried by a different envvar...

ciao, lele.
 
Νίκος Gr33k

On 5/7/2013 5:11 PM, Lele Gaifax wrote:
Dave, maybe you already know, but that variable is "injected" by the CGI
mechanism; it does not come from the OP's shell environment.

As Νίκος discovered, when he "cloudfare" (whatever that means) his site,
the REMOTE_HOST envvar contains some (I guess) latin-greek encoded
string, and the remote address is carried by a different envvar...

Exactly. Only when I CloudFlare (www.cloudflare.com) the domain can the
hostname not be retrieved.

At least I managed to solve this by:

try:
    host = socket.gethostbyaddr( os.environ['HTTP_CF_CONNECTING_IP'] )[0]
except Exception as e:
    host = repr(e)


It seems that when you cloudflare a domain you can no longer get the
visitor's originating IP address directly; you have to read the
environment variable above to be able to retrieve it!
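
A small sketch of the idea in this post, under stated assumptions: HTTP_CF_CONNECTING_IP is the variable named above, while the fallback to REMOTE_ADDR and the helper name client_address are illustrative additions, not something the thread prescribes.

```python
import os


def client_address(environ=os.environ):
    """Return the visitor's address, preferring the Cloudflare variable.

    Falls back to REMOTE_ADDR (an assumption) when the Cloudflare
    variable is missing or empty; returns None if neither is set.
    """
    for key in ('HTTP_CF_CONNECTING_IP', 'REMOTE_ADDR'):
        value = environ.get(key)
        if value:
            return value
    return None


# Stand-in environments (hypothetical values) instead of a real CGI request.
print(client_address({'HTTP_CF_CONNECTING_IP': '203.0.113.7',
                      'REMOTE_ADDR': '198.51.100.1'}))
print(client_address({'REMOTE_ADDR': '198.51.100.1'}))
```

Using environ.get() rather than indexing means a missing key yields None instead of a KeyError, so the script can decide what to do when no address is available.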
 
Wayne Werner

On 4/7/2013 6:10 PM, MRAB wrote:
What do you mean "I don't know how to catch the exception with
OSError"? You've tried "except socket.gaierror" and "except
socket.herror", well just write "except OSError" instead!


try:
    host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]
except OSError:
    host = "UnResolved"

also produces an internal server error.

Are you sure it is just except OSError?

Have you ensured that 'REMOTE_ADDR' is actually a key in os.environ? I
highly recommend using the logging module to help diagnose what the actual
exception is.

HTH,
-W
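
One way the logging suggestion above might look, as a sketch rather than Wayne's own code. The logger name "hostlookup", the helper lookup_host, and the injectable resolver parameter are assumptions made for illustration. Where the log is written depends on how logging is configured in the real script.

```python
import logging
import socket

log = logging.getLogger("hostlookup")  # hypothetical logger name


def lookup_host(environ, resolver=socket.gethostbyaddr):
    """Resolve REMOTE_ADDR to a host name, logging any failure in full."""
    try:
        return resolver(environ['REMOTE_ADDR'])[0]
    except Exception:
        # log.exception() records the message and the full traceback,
        # so the log shows the actual exception type (KeyError, OSError,
        # UnicodeDecodeError, ...) instead of a generic server error.
        log.exception("host lookup failed; REMOTE_ADDR=%r",
                      environ.get('REMOTE_ADDR'))
        return "UnResolved"


# Stand-in resolver (hypothetical) so the example needs no DNS.
print(lookup_host({'REMOTE_ADDR': '192.0.2.1'},
                  resolver=lambda addr: ('example.invalid', [], [addr])))
print(lookup_host({}))  # missing key: logged, then "UnResolved"
```

Catching Exception here is deliberate for diagnosis only: once the log shows which exception really occurs, the except clause can be narrowed to that type.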
 
Ferrous Cranus

On 12/7/2013 2:47 PM, Wayne Werner wrote:
On 4/7/2013 6:10 PM, MRAB wrote:
What do you mean "I don't know how to catch the exception with
OSError"? You've tried "except socket.gaierror" and "except
socket.herror", well just write "except OSError" instead!


try:
    host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]
except OSError:
    host = "UnResolved"

also produces an internal server error.

Are you sure it is just except OSError?

Have you ensured that 'REMOTE_ADDR' is actually a key in os.environ? I
highly recommend using the logging module to help diagnose what the
actual exception is.

HTH,
-W

Yes, it is a key, but the problem, as I suspected, was Cloudflare.
I had to use os.environ['HTTP_CF_CONNECTING_IP'], which Cloudflare passes
as a variable in the CGI environment, in order to retrieve the visitor's IP.


try:
    gi = pygeoip.GeoIP('/usr/local/share/GeoLiteCity.dat')
    city = gi.time_zone_by_addr( os.environ['HTTP_CF_CONNECTING_IP'] )
    host = socket.gethostbyaddr( os.environ['HTTP_CF_CONNECTING_IP'] )[0]
except Exception as e:
    host = repr(e)


Sometimes, though, I am still receiving the usual
UnicodeDecodeError('utf-8', b'\xc1\xf0\xef\xf4\xf5

but only for a few IP addresses; in most cases it works.
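
The five bytes visible in that error can be inspected directly. The sketch below is an assumption-laden illustration: the thread never establishes which encoding the bytes are in, so the two Greek codecs tried here (ISO-8859-7 and cp1253) are guesses in the spirit of the "latin-greek encoded string" remark earlier, and only the bytes shown in the truncated message are used.

```python
raw = b'\xc1\xf0\xef\xf4\xf5'  # only the bytes shown in the error above

# UTF-8 rejects the data outright, which matches the reported error.
try:
    raw.decode('utf-8')
except UnicodeDecodeError as exc:
    print("utf-8 fails:", exc.reason)

# Try two single-byte Greek encodings (assumed candidates, not confirmed).
for codec in ('iso-8859-7', 'cp1253'):
    print(codec, ascii(raw.decode(codec)), raw.decode(codec))
```

Under either Greek codec the bytes decode to Greek letters, which looks like ordinary text rather than a host name or IP address. That suggests checking where the byte string comes from before deciding how, or whether, to decode it.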
 
Dave Angel

On 12/7/2013 2:47 PM, Wayne Werner wrote:
On 4/7/2013 6:10 PM, MRAB wrote:
What do you mean "I don't know how to catch the exception with
OSError"? You've tried "except socket.gaierror" and "except
socket.herror", well just write "except OSError" instead!


try:
    host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]
except OSError:
    host = "UnResolved"

also produces an internal server error.

Are you sure it is just except OSError?

Have you ensured that 'REMOTE_ADDR' is actually a key in os.environ? I
highly recommend using the logging module to help diagnose what the
actual exception is.

HTH,
-W

Yes, it is a key, but the problem, as I suspected, was Cloudflare.
I had to use os.environ['HTTP_CF_CONNECTING_IP'], which Cloudflare passes
as a variable in the CGI environment, in order to retrieve the visitor's IP.


try:
    gi = pygeoip.GeoIP('/usr/local/share/GeoLiteCity.dat')
    city = gi.time_zone_by_addr( os.environ['HTTP_CF_CONNECTING_IP'] )
    host = socket.gethostbyaddr( os.environ['HTTP_CF_CONNECTING_IP'] )[0]
except Exception as e:
    host = repr(e)


Sometimes, though, I am still receiving the usual
UnicodeDecodeError('utf-8', b'\xc1\xf0\xef\xf4\xf5

but only for a few IP addresses; in most cases it works.

And naturally, you now know how to debug those UnicodeDecodeError
problems. Surely, the code you post here isn't what you actually do,
because when people spend time to give you detailed advice, you actually
read it, and work at understanding it.

Chortle, snort.
 
