Download unnamed web image?

galileo228 · Feb 17, 2010

All,

My Python program signs onto the student facebook at my school and,
given email addresses, returns the associated full name. If I were to
do this through a regular browser, there would also be a picture of the
individual, and I am trying to get my program to download the picture
as well. The problem: the HTML of the page does not point to a
particular file, but rather refers to (what seems like) a query.

So, if one went to the facebook and searched for me using my school
net id (msb83), the image of my profile on the results page is:

<img width="100" height="130" border="0" class="border" alt="msb83"
src="deliverImage.cfm?netid=MSB83">

Using BeautifulSoup, mechanize, and urllib, I've constructed the
following:

br.open("http://www.school.edu/students/facebook/")
br.select_form(nr=1)

br.form['fulltextsearch'] = 'msb83'  # this searches the facebook for me
br.submit()
results = br.response().read()
soup = BeautifulSoup(results)
foo2 = soup.find('td', attrs={'width': '95'})
foo3 = foo2.find('a')
foo4 = foo3.find('img', attrs={'src': 'deliverImage.cfm?netid=msb83'})
# this just drills down to the <img> line and until this point the
# program does not return an error

save_as = os.path.join('./', 'msb83' + '.jpg')
urllib.urlretrieve(foo4, save_as)

I get the following error msg after running this code:

AttributeError: 'NoneType' object has no attribute 'strip'

I can download the picture through my browser by right-clicking,
selecting save as, and then the image gets saved as
'deliverImage.cfm.jpeg.'

Are there any suggestions as to how I might be able to download the
image using python?

Please let me know if more information is needed -- happy to supply
it.

Matt

John Bokma · Feb 17, 2010

galileo228 said:
[...]

urllib.urlretrieve(foo4, save_as)

I get the following error msg after running this code:

AttributeError: 'NoneType' object has no attribute 'strip'

Wild guess, since you didn't provide line numbers, etc.

foo4 is None

(I also would like to suggest to use more meaningful names)

galileo228 · Feb 17, 2010

John Bokma said:
Wild guess, since you didn't provide line numbers, etc.

foo4 is None

(I also would like to suggest to use more meaningful names)

I thought it was too, and I just doublechecked. It's actually

foo3 = foo2.find('a')

that is causing the NoneType error.

Thoughts?

galileo228 · Feb 17, 2010

I've now fixed the foo3 issue, and I now know that the problem is with
the urllib.urlretrieve line (see above). This is the error msg I get
in IDLE:

Traceback (most recent call last):
  File "/Users/Matt/Documents/python/dtest.py", line 59, in <module>
    urllib.urlretrieve(foo4, save_as)
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib.py", line 94, in urlretrieve
    return _urlopener.retrieve(url, filename, reporthook, data)
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib.py", line 226, in retrieve
    url = unwrap(toBytes(url))
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib.py", line 1033, in unwrap
    url = url.strip()
TypeError: 'NoneType' object is not callable

Is this msg being generated because I'm trying to retrieve a url
that's not really a file?

John Bokma · Feb 17, 2010

galileo228 said:
[...]

Is this msg being generated because I'm trying to retrieve a url
that's not really a file?

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/urllib.py", line 89, in urlretrieve
    return _urlopener.retrieve(url, filename, reporthook, data)
  File "/usr/lib/python2.5/urllib.py", line 210, in retrieve
    url = unwrap(toBytes(url))
  File "/usr/lib/python2.5/urllib.py", line 1009, in unwrap
    url = url.strip()
AttributeError: 'NoneType' object has no attribute 'strip'
--8<---------------cut here---------------end--------------->8---

To me it looks like you're still calling urlretrieve with None as a
first value.
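
The two tracebacks in this thread end on the same line (url = url.strip()) but with different exceptions: AttributeError in the first, TypeError in the second. The sketch below reproduces both with plain Python. It rests on an assumption the thread does not confirm: that in the second run foo4 was a tag object rather than None, and that unknown attribute access on such an object yields None (older BeautifulSoup versions are understood to treat it as a child-tag lookup). TagLike, unwrap_like and error_name are hypothetical names used only for illustration.

```python
class TagLike(object):
    """Hypothetical stand-in for an object whose unknown attributes resolve to None."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        return None


def unwrap_like(url):
    """Mimic the failing line from the tracebacks: url = url.strip()."""
    return url.strip()


def error_name(value):
    """Return the name of the exception raised by unwrap_like(value), or None."""
    try:
        unwrap_like(value)
    except (AttributeError, TypeError) as exc:
        return type(exc).__name__
    return None


if __name__ == "__main__":
    print(error_name(None))       # None has no strip attribute at all
    print(error_name(TagLike()))  # strip resolves to None, then None is called
```

If that reading holds, the second error would mean urlretrieve received an object that is neither None nor a string, so printing type(foo4) before the call would settle which case applies.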

Matthew Barnett · Feb 17, 2010

galileo228 said:
[...]

Is this msg being generated because I'm trying to retrieve a url
that's not really a file?

It's because the URL you're passing in, namely foo4, is None. This is
presumably because foo3.find() returns None if it can't find the entry.

You checked the value of foo3, but did you check the value of foo4?
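
A minimal sketch of the check suggested above, followed by turning the relative src into a full URL string that urlretrieve can accept. To stay self-contained it uses the standard library's html.parser instead of BeautifulSoup and is written for Python 3. ImgSrcCollector and find_image_url are hypothetical names, and using the facebook page address as the base URL is an assumption (the results page may live at a different address after the form is submitted). The comparison is case-insensitive because the page shown in the question has netid=MSB83 while the search used msb83, which may be one reason an exact-match find() returns None.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src is not None:
                self.sources.append(src)


def find_image_url(page_url, html, netid):
    """Return the absolute image URL for netid, or None if no <img> matches."""
    collector = ImgSrcCollector()
    collector.feed(html)
    wanted = "netid=" + netid.lower()
    for src in collector.sources:
        # Case-insensitive match: the page has MSB83, the search term was msb83.
        if wanted in src.lower():
            # Resolve the relative src against the address of the page.
            return urljoin(page_url, src)
    return None


if __name__ == "__main__":
    page = ('<td width="95"><a href="#"><img width="100" height="130" '
            'alt="msb83" src="deliverImage.cfm?netid=MSB83"></a></td>')
    url = find_image_url("http://www.school.edu/students/facebook/", page, "msb83")
    if url is None:
        print("no matching <img> found; nothing to download")
    else:
        print(url)
```

When the result is not None, that string (rather than the tag object) is what would be passed on, for example urllib.urlretrieve(url, save_as) on the Python 2.6 used in the question or urllib.request.urlretrieve(url, save_as) on Python 3. If the image is only served to a logged-in session, the download would presumably have to go through the same mechanize browser instead of a fresh urllib request.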
