urllib "quote" problem

John Nagle

This warning appeared from urllib.quote:

"/usr/local/lib/python2.6/urllib.py:1222: UnicodeWarning: Unicode equal
comparison failed to convert both arguments to Unicode - interpreting
them as being unequal
  res = map(safe_map.__getitem__, s)"

Here's urllib.quote from Python 2.6:

====

def quote(s, safe = '/'):
    """quote('abc def') -> 'abc%20def'

    Each part of a URL, e.g. the path info, the query, etc., has a
    different set of reserved characters that must be quoted.

    RFC 2396 Uniform Resource Identifiers (URI): Generic Syntax lists
    the following reserved characters.

    reserved    = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
                  "$" | ","

    Each of these characters is reserved in some component of a URL,
    but not necessarily in all of them.

    By default, the quote function is intended for quoting the path
    section of a URL.  Thus, it will not encode '/'.  This character
    is reserved, but in typical usage the quote function is being
    called on a path where the existing slash characters are used as
    reserved characters.
    """
    cachekey = (safe, always_safe)
    try:
        safe_map = _safemaps[cachekey]
    except KeyError:
        safe += always_safe
        safe_map = {}
        for i in range(256):
            c = chr(i)
            safe_map[c] = (c in safe) and c or ('%%%02X' % i)
        _safemaps[cachekey] = safe_map
    res = map(safe_map.__getitem__, s)  #### WARNING REPORTED HERE
    return ''.join(res)

=====

I don't, unfortunately, know what went into this call to
produce the message; probably a URL in Unicode.

This looks like code that will do the wrong thing in
Python 2.6 for characters in the range 128-255. Those are
illegal in type "str", but this code is constructing such
values with "chr".
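
For readers following along, the lookup table that quote() builds can be modelled in a few lines. This is only a sketch in Python 3, not the 2.6 source: build_safe_map and quote_utf8 are hypothetical names, ALWAYS_SAFE is assumed to match urllib's always_safe, and encoding to UTF-8 before the lookup is one possible workaround rather than something quote() does itself. It illustrates that a 256-entry table covers every byte value, so every lookup succeeds once the input has been reduced to bytes.

```python
# Assumed to mirror urllib's always_safe set (letters, digits, "_.-").
ALWAYS_SAFE = (b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
               b"abcdefghijklmnopqrstuvwxyz"
               b"0123456789_.-")


def build_safe_map(safe=b"/"):
    """Map each byte value 0..255 to itself or to its %XX escape."""
    allowed = set(safe + ALWAYS_SAFE)  # set of integer byte values
    return {i: (chr(i) if i in allowed else "%%%02X" % i)
            for i in range(256)}


def quote_utf8(text, safe=b"/"):
    """Percent-quote text after encoding it to UTF-8 bytes.

    Encoding first guarantees every key looked up is a byte value,
    so the table lookup can never miss.
    """
    table = build_safe_map(safe)
    return "".join(table[b] for b in text.encode("utf-8"))


print(quote_utf8("abc def"))    # abc%20def
print(quote_utf8("caf\u00e9"))  # caf%C3%A9
```

In Python 2.6 the analogous step would presumably be to encode a unicode URL explicitly, for example urllib.quote(url.encode('utf-8')), before passing it in.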

John Nagle
 
Aahz

John Nagle said:
> This looks like code that will do the wrong thing in
> Python 2.6 for characters in the range 128-255. Those are
> illegal in type "str", but this code is constructing such
> values with "chr".

WDYM "illegal"?
 
Robert Kern

John Nagle said:
> Type "str" in Python 2.6 is ASCII, 0..127.

In Python 2.6, type "str" consists of bytes 0..255, not ASCII characters.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
 
MRAB

John said:
> Type "str" in Python 2.6 is ASCII, 0..127.

Actually, 'str' in Python 2.6 is a bytestring: ASCII plus other characters,
by which I mean that the other characters aren't affected by .lower,
etc.
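
The two points above (byte values 128-255 are legal, and case methods leave them alone) can be checked with a short sketch. It uses Python 3's bytes type as the closest analogue of the 2.6 str; that mapping is an assumption of this example, and the variable names are illustrative only. It also shows that such bytes cannot be decoded as ASCII, which is presumably what the "failed to convert both arguments to Unicode" warning is complaining about.

```python
# Every value 0..255 is a legal element of a byte string.
all_bytes = bytes(range(256))
high = bytes(range(128, 256))

# Case conversion only touches the ASCII letters.
lowered_ascii = b"ABC".lower()          # b'abc'
high_unchanged = (high.lower() == high and high.upper() == high)

# The high bytes are valid bytes but are not valid ASCII text.
try:
    high.decode("ascii")
    decodes_as_ascii = True
except UnicodeDecodeError:
    decodes_as_ascii = False

print(len(all_bytes), lowered_ascii, high_unchanged, decodes_as_ascii)
```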
 
