Python 3.0 urllib.parse.parse_qs results in TypeError

ag73

Hi,

I am trying to parse data posted to a Python class that extends
http.server.BaseHTTPRequestHandler. Here is the code I am using:

def do_POST(self):
    ctype, pdict = cgi.parse_header(self.headers['Content-Type'])
    length = int(self.headers['Content-Length'])
    if ctype == 'application/x-www-form-urlencoded':
        qs = self.rfile.read(length)
        print("qs="+str(qs))
        form = urllib.parse.parse_qs(qs, keep_blank_values=1)

The print statement shows the following output, so it looks like the
data is being posted correctly:

qs=b'file_data=b
%27IyEvdXNyL2Jpbi9lbnYgcHl0aG9uCiMgZW5jb2Rpbmc6IHV0Zi04CiIiIgp1bnRpdGxlZC5weQoK
%5CnQ3JlYXRlZCBieSBBbmR5IEdyb3ZlIG9uIDIwMDgtMTItMDIuCkNvcHlyaWdodCAoYykgMjAwOCBf
%5CnX015Q29tcGFueU5hbWVfXy4gQWxsIHJpZ2h0cyByZXNlcnZlZC4KIiIiCgppbXBvcnQgc3lzCmlt
%5CncG9ydCBvcwoKCmRlZiBtYWluKCk6CglwcmludCAibmFtZTE9dmFsdWUxIgoJcHJpbnQgIm5hbWUy
%5CnPXZhbHVlMiIKCgppZiBfX25hbWVfXyA9PSAnX19tYWluX18nOgoJbWFpbigpCgo%3D
%5Cn%27&filename=test.py'

However, the last line of code that calls parse_qs causes the
following exception to be thrown:

<class 'TypeError'>
Type str doesn't support the buffer API

I haven't been able to find any information on the web about this. Any
pointers would be appreciated. I am using ActivePython 3.0 and have
tried this on Linux and Mac OS X with the same outcome.

Thanks,

Andy.
 
John Machin

Please show the full traceback.
 
Andy Grove

John,

Thanks. Here it is:

  File "/Library/Frameworks/Python.framework/Versions/3.0/lib/python3.0/socketserver.py", line 281, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/Library/Frameworks/Python.framework/Versions/3.0/lib/python3.0/socketserver.py", line 307, in process_request
    self.finish_request(request, client_address)
  File "/Library/Frameworks/Python.framework/Versions/3.0/lib/python3.0/socketserver.py", line 320, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/Library/Frameworks/Python.framework/Versions/3.0/lib/python3.0/socketserver.py", line 614, in __init__
    self.handle()
  File "/Library/Frameworks/Python.framework/Versions/3.0/lib/python3.0/http/server.py", line 363, in handle
    self.handle_one_request()
  File "/Library/Frameworks/Python.framework/Versions/3.0/lib/python3.0/http/server.py", line 357, in handle_one_request
    method()
  File "/Users/andy/Development/EclipseWorkspace/dbsManage/kernel.py", line 178, in do_POST
    form = urllib.parse.parse_qs(qs, keep_blank_values=1)
  File "/Library/Frameworks/Python.framework/Versions/3.0/lib/python3.0/urllib/parse.py", line 351, in parse_qs
----------------------------------------
    for name, value in parse_qsl(qs, keep_blank_values, strict_parsing):
  File "/Library/Frameworks/Python.framework/Versions/3.0/lib/python3.0/urllib/parse.py", line 377, in parse_qsl
    pairs = [s2 for s1 in qs.split('&') for s2 in s1.split(';')]
TypeError: Type str doesn't support the buffer API
 
Andy Grove

I don't fully understand this, but if I pass in "str(qs)" instead of
"qs" then the call works. However, qs is returned from a file.read()
operation, so shouldn't it be a string already?

In case it's not already obvious, I am new to Python :) .. so I'm
probably missing something here.
 
John Machin

| Python 3.0 (r30:67507, Dec 3 2008, 20:14:27) [MSC v.1500 32 bit (Intel)] on win32
| Type "help", "copyright", "credits" or "license" for more information.
| >>> qs_bytes = b'a;b&c;d'
| >>> qs_str = 'a;b&c;d'
| >>> pairs = [s2 for s1 in qs_bytes.split('&') for s2 in s1.split(';')]
| Traceback (most recent call last):
|   File "<stdin>", line 1, in <module>
| TypeError: Type str doesn't support the buffer API
| >>> pairs = [s2 for s1 in qs_str.split('&') for s2 in s1.split(';')]
| >>> pairs
| ['a', 'b', 'c', 'd']
| >>> b'x&y'.split('&')
| Traceback (most recent call last):
|   File "<stdin>", line 1, in <module>
| TypeError: Type str doesn't support the buffer API
| >>> b'x&y'.split(b'&')
| [b'x', b'y']
| >>> 'x&y'.split('&')
| ['x', 'y']
| >>>

The immediate cause is that, as expected, mixing str and bytes raises an
exception. This one, however, qualifies as "not very informative" and
possibly wrong [not having inspected the code for whatever.split(), I'm
left wondering what the relevance of the buffer API is].

The docs for urllib.parse.parse_qs() and .parse_qsl() are a bit vague:
"""query string given as a string argument (data of type
application/x-www-form-urlencoded)""" ... does "string" mean "str only"
or "str or bytes"?

Until someone can give an authoritative answer [*], you might like to
try decoding your data (presuming you know what its encoding is, or how
to dig it out the way you found the type and length) and feeding the
result to .parse_qs().

[*] I know next to zilch about cgi and urllib -- I'm just trying to
give you some clues to see if you can get yourself back on the road.
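
The decoding suggestion above could look something like the following. This is only a minimal sketch: raw_body is a made-up stand-in for whatever self.rfile.read(length) returns, and the choice of the ASCII codec is an assumption (percent-encoded form data is normally plain ASCII, but the real charset should be checked against the request).

```python
from urllib.parse import parse_qs

# Hypothetical stand-in for the bytes returned by self.rfile.read(length).
raw_body = b"filename=test.py&comment=hello+world&empty="

# Assumption: the body is ASCII, as percent-encoded form data usually is.
# A different charset would need a different codec name here.
text_body = raw_body.decode("ascii")

# parse_qs now receives a str, so its internal split('&') no longer
# mixes str and bytes.
form = parse_qs(text_body, keep_blank_values=True)
print(form)
# {'filename': ['test.py'], 'comment': ['hello world'], 'empty': ['']}
```

The point of the sketch is only that the conversion from bytes to str happens explicitly, with a named codec, before the parsing call.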
 
Aahz

One of the key features of Python 3.0 is the fact that it now
distinguishes between bytes and strings. Unfortunately, there are a lot
of ambiguous areas where the correct handling is not clear; for example,
nobody has yet agreed whether URLs are strings or bytes. As you
discovered, forced conversion to string seems to work here and I suggest
you make that your workaround. You could also file a bug on
bugs.python.org (first checking to see whether someone else has already
done so).
 
John Machin

On further reflection, however, I'm surprised that the str(qs)
workaround works; it must be only accidental.

"""if I pass in "str(qs)" instead of
"qs" then the call works."""

BUT str(bytes_instance) with no other args passed *doesn't* just do a
decoding: """When only object is given, this returns its nicely
printable representation."""

The nicely printable representation for bytes objects includes:
* wrapping it in b''
* showing non-ASCII bytes as \xdd hex escapes
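
A small sketch of the difference, using a made-up one-field body rather than the real posted data. It suggests why the str(qs) workaround only appears to work: the call succeeds, but the b' prefix and the closing quote end up inside the parsed key and value.

```python
from urllib.parse import parse_qs

# Hypothetical request body, standing in for the real posted data.
raw_body = b"filename=test.py"

as_repr = str(raw_body)             # printable representation, not a decoding
as_text = raw_body.decode("ascii")  # an actual decoding (ASCII assumed)

print(as_repr)  # b'filename=test.py'
print(as_text)  # filename=test.py

parsed_repr = parse_qs(as_repr)
parsed_text = parse_qs(as_text)

print(parsed_repr)  # {"b'filename": ["test.py'"]}
print(parsed_text)  # {'filename': ['test.py']}
```

Note that the first key is "b'filename", not "filename", so a later lookup by field name would miss it.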

3.0:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't d
 
