parsing json output

Gowri

Hi,

I have a service running somewhere which gives me JSON data. What I do
is this:

import urllib, urllib2
import cjson

url = 'http://cmsdoc.cern.ch/cms/test/aprom/phedex/dev/gowri/datasvc/tbedi/requestDetails'
params = {'format': 'json'}
eparams = urllib.urlencode(params)
request = urllib2.Request(url, eparams)
response = urllib2.urlopen(request)  # This request is sent as an HTTP POST
print response.read()

This prints a whole bunch of nonsense, as expected. I use cjson and am
unable to figure out how to print this JSON response; I guess that if I
can do this, parsing should be straightforward? Doing a
cjson.decode(str(response)) does not work. In what form is this data
returned, and how do I print it and later pass it on to another client
browser? Typing
http://cmsdoc.cern.ch/cms/test/aprom/phedex/dev/gowri/datasvc/tbedi/requestDetails
in my browser returns valid JSON data.

Regards,
Gowri
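
A side note on the "sent in HTTP POST" comment in the script above: urllib2 switches from GET to POST as soon as a data argument is passed to Request. The following is a minimal sketch of the same behaviour in Python 3, where urllib2 became urllib.request (the Python 3 port is an assumption; the thread itself uses Python 2). Nothing is sent over the network here; get_method() only reports which verb would be used.

```python
from urllib.parse import urlencode
from urllib.request import Request

url = ('http://cmsdoc.cern.ch/cms/test/aprom/phedex/dev/gowri/'
       'datasvc/tbedi/requestDetails')
params = {'format': 'json'}

# Passing a body (bytes in Python 3) turns the request into a POST.
post_request = Request(url, data=urlencode(params).encode('ascii'))

# Putting the parameters in the query string and passing no body keeps it a GET.
get_request = Request(url + '?' + urlencode(params))

print(post_request.get_method())
print(get_request.get_method())
```

Whether the service treats the two forms identically is not stated in the thread, so it is worth trying both when debugging.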
 
Justin Ezequiel

FWIW, using a json.py I got from somewhere I've since forgotten:

>>> import json
>>> url = 'http://cmsdoc.cern.ch/cms/test/aprom/phedex/dev/gowri/datasvc/tbedi/requestDetails'
>>> params = {'format':'json'}
>>> import urllib
>>> eparams = urllib.urlencode(params)
>>> import urllib2
>>> request = urllib2.Request(url,eparams)
>>> response = urllib2.urlopen(request)
>>> s = response.read()
>>> len(s)
115337
>>> s[:200]
'{"phedex":{"request":
[{"last_update":"1188037561","numofapproved":"1","id":"7425"},
{"last_update":"1188751826","numofapproved":"1","id":"8041"},
{"last_update":"1190116795","numofapproved":"1","id":"92'
>>> x = json.read(s)
>>> type(x)
<type 'dict'>
>>> x.keys()
['phedex']
>>> type(x['phedex'])
<type 'dict'>
>>> x['phedex'].keys()
['request_date', 'request_timestamp', 'request', 'call_time',
'instance', 'request_call', 'request_url']


## json.py implements a JSON (http://json.org) reader and writer.
## Copyright (C) 2005 Patrick D. Logan
## Contact mailto:p[email protected]
##
## This library is free software; you can redistribute it and/or
## modify it under the terms of the GNU Lesser General Public
## License as published by the Free Software Foundation; either
## version 2.1 of the License, or (at your option) any later version.
##
## This library is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## Lesser General Public License for more details.
##
## You should have received a copy of the GNU Lesser General Public
## License along with this library; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
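
Once the response is parsed, the result is ordinary nested dictionaries and lists, as the session above shows. Here is a small sketch of walking that structure, using the standard-library json module (Python 2.6+ and 3.x) instead of the third-party json.py; note that the standard module spells the call json.loads() rather than json.read(). The sample string holds only the first two complete records visible in the s[:200] output.

```python
import json

# First two records as they appear in the s[:200] output above;
# the remainder of the 115337-character response is omitted.
sample = ('{"phedex":{"request":['
          '{"last_update":"1188037561","numofapproved":"1","id":"7425"},'
          '{"last_update":"1188751826","numofapproved":"1","id":"8041"}'
          ']}}')

x = json.loads(sample)

# The values in these records are JSON strings, so convert explicitly
# when a number is needed.
requests = x['phedex']['request']
ids = [int(req['id']) for req in requests]

for req in requests:
    print(req['id'], req['last_update'])
```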
 
Gowri

I actually have a weirder problem. The code I posted earlier prints
garbage on Windows with Python 2.5, but prints perfect JSON data on RHEL
with Python 2.3.4. I'm so confused and helpless. json.py doesn't seem to
help either. It fails at:

raise ReadException, "Input is not valid JSON: '%s'" % self._generator.all()

Somebody please help!
 
Tim Roberts

Gowri said:
> I have a service running somewhere which gives me JSON data.
> ...
> This prints a whole bunch of nonsense as expected.

It's not nonsense. It's JSON.

> I use cjson and am
> unable to figure out how to print this json response

You are already printing the JSON response. That's what that script does.

> and I guess if I
> can do this, parsing should be straightforward? doing a
> cjson.decode(str(response)) does not work.

If response.read() returns the JSON string, as you show above, one might
expect
    cjson.decode(response.read())
to parse it. Have you read the cjson documentation?
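
To make Tim's point concrete: str(response) yields a description of the response object, not the body, so no JSON decoder can parse it. The sketch below uses io.BytesIO as a stand-in for the file-like object that urlopen() returns and the standard-library json module in place of cjson; both substitutions, and the shortened payload, are assumptions made so that the example runs without the service.

```python
import io
import json

# Stand-in for the object urlopen() returns: anything with a read() method.
# The payload is a shortened, hypothetical version of the service's output.
payload = b'{"phedex": {"instance": "tbedi", "request": [{"id": "7425"}]}}'
response = io.BytesIO(payload)

# str(response) is the object's repr (something like "<_io.BytesIO object at ...>"),
# which is not JSON, so decoding it fails.
try:
    json.loads(str(response))
    parsed_repr = True
except ValueError:
    parsed_repr = False

# Reading the body and decoding that is what works.
body = response.read()
data = json.loads(body.decode('utf-8'))

# To pass the data on to another client, serialise it again.
outgoing = json.dumps(data)
print(parsed_repr, data['phedex']['instance'])
```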
 
Gowri

Hi Tim,

I understand it's JSON. My problem is that it just prints crazy
characters instead of the JSON data. Like I mentioned, this happens on
my Windows machine, which has Python 2.5. On the other hand, the same
code worked perfectly on my Linux machine with Python 2.3.4.
What could the problem be?

Regards,
Gowri
 
Tim Roberts

Gowri said:
> I understand it's JSON. My problem is that it just prints crazy
> characters instead of the JSON data.

Ah, I see. I didn't get that from your original post.

> Like I mentioned, this happens on
> my Windows machine, which has Python 2.5. On the other hand, the same
> code worked perfectly on my Linux machine with Python 2.3.4.
> What could the problem be?

I'm not sure. It worked correctly on my Windows machine with Python 2.4.4.
Are you going through a proxy? Are you able to read other (non-JSON) web
pages using urllib2?
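
The thread never shows what the "crazy characters" actually were. One way to narrow down this kind of problem is to look at the first raw bytes of the body before handing it to a JSON decoder. The helper below is a hypothetical diagnostic sketch, not something from the thread: it assumes that an intermediary (a proxy or firewall) may have compressed the body or replaced it with an HTML page, and it only guesses from the leading bytes.

```python
import gzip


def describe_body(raw):
    """Guess what kind of payload a raw HTTP body is (hypothetical helper)."""
    # gzip streams always start with the two magic bytes 0x1f 0x8b.
    if raw[:2] == b'\x1f\x8b':
        return 'gzip-compressed'
    stripped = raw.lstrip()
    # A JSON document normally starts with an object or an array.
    if stripped[:1] in (b'{', b'['):
        return 'looks like JSON'
    # An error or block page from a proxy would typically be markup.
    if stripped[:1] == b'<':
        return 'looks like HTML/XML'
    return 'unknown'


print(describe_body(b' {"phedex": {}}'))
print(describe_body(gzip.compress(b'{"phedex": {}}')))
print(describe_body(b'<html>blocked</html>'))
```

Printing repr(raw[:60]) alongside the guess shows the exact bytes received, which is usually more informative than printing the body directly to a console.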
 
Gowri

Hi Tim,

I'm able to fetch and print some HTML content correctly. I don't know
how to fix this. I need all the help I can get with this :)

Regards,
Gowri
 
Paul McGuire

Gowri said:
> Hi,
>
> I have a service running somewhere which gives me JSON data. What I do
> is this:
>
> import urllib,urllib2
> import cjson
>
> url = 'http://cmsdoc.cern.ch/cms/test/aprom/phedex/dev/gowri/datasvc/tbedi/requestDetails'
> params = {'format':'json'}
> eparams = urllib.urlencode(params)
> request = urllib2.Request(url,eparams)
> response = urllib2.urlopen(request)    # This request is sent in HTTP POST
> print response.read()
>
> This prints a whole bunch of nonsense as expected. I use cjson and am
> unable to figure out how to print this json response and I guess if I
> can do this, parsing should be straightforward?
> <snip>

Gowri -

On a lark, I tried using the JSON parser that ships with the examples
in pyparsing (also available online at http://pyparsing.wikispaces.com/space/showimage/jsonParser.py).
The parsed data returned by pyparsing gives you a results object that
supports an attribute-style access to the individual fields of the
JSON object. (Note: this parser only reads, it does not write out
JSON.)

Here is the code to use the pyparsing JSON parser (after downloading
pyparsing and the jsonParser.py example), tacked on to your previously-
posted code to retrieve the JSON data in variable 's':


from jsonParser import jsonObject
data = jsonObject.parseString(s)

# dump out listing of object and attributes
print data.dump()
print

# print out specific attributes
print data.phedex.call_time
print data.phedex.instance
print data.phedex.request_call

# access an array of request objects
print len(data.phedex.request)
for req in data.phedex.request:
    #~ print req.dump()
    print "-", req.id, req.last_update


This prints out (long lines clipped with '...'):

[['phedex', [['request', [[['last_update', '1188037561'], ...
- phedex: [['request', [[['last_update', '1188037561'],
['numofapproved', '1'],...
- call_time: 0.10059
- instance: tbedi
- request: [[['last_update', '1188037561'], ['numofapproved',
'1'], ...
- request_call: requestDetails
- request_date: 2008-03-24 12:56:32 UTC
- request_timestamp: 1206363392.09
- request_url: http://cmsdoc.cern.ch/cms/test/aprom/phedex/dev/gowri/datasvc/tbedi/requestDetails?format=json

0.10059
tbedi
requestDetails
1884
- 7425 1188037561
- 8041 1188751826
- 9281 1190116795
- 9521 1190248781
- 12821 1192615612
- 13121 1192729887
...

The dump() method is a quick way to see what keys are defined in the
output object, and from the code you can see how to nest the
attributes following the nesting in the dump() output.
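
For comparison, similar attribute-style access can be had without pyparsing by giving the standard-library decoder an object_hook. This is a swapped-in technique rather than what Paul's parser does internally, and the sample below is a hand-made fragment based on the values in the output above (writing call_time as a JSON number is an assumption). It only works when every key is a valid Python identifier, which holds for the keys shown in this thread.

```python
import json
from types import SimpleNamespace

# Hand-made fragment using values from the dump() output above.
sample = ('{"phedex": {"call_time": 0.10059, "instance": "tbedi", '
          '"request_call": "requestDetails", "request": ['
          '{"last_update": "1188037561", "numofapproved": "1", "id": "7425"}, '
          '{"last_update": "1188751826", "numofapproved": "1", "id": "8041"}]}}')

# object_hook is called for every decoded JSON object (innermost first),
# so each dict is turned into a namespace with attribute access.
data = json.loads(sample, object_hook=lambda d: SimpleNamespace(**d))

print(data.phedex.call_time)
print(data.phedex.instance)
print(len(data.phedex.request))
for req in data.phedex.request:
    print("-", req.id, req.last_update)
```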

Pyparsing is pure Python, so it is quite portable, and works with
Python 2.3.1 and up (I ran this example with 2.5.1).

You can find out more at http://pyparsing.wikispaces.com.

-- Paul
 
Gowri

Hi all,

Thank you so much for all your help :) I really appreciate it. I
discovered that my problem was because of my firewall and everything
works now :)

Regards,
Gowri
 
