Downloading files using urllib in a for loop?

justsee

Hi,
I'm using Python 2.3 on Windows for the first time, and am doing
something wrong in using urllib to retrieve images from URLs embedded
in a CSV file. If I explicitly specify a URL and image name it works
fine (see the commented-out line in the code), but if I pass in
variables in this for loop it throws errors:

--- The script:

import csv, urllib
reader = csv.reader(open("source.csv"))
for x,y,z,imagepath in reader
    theurl = imagepath[:55]
    theimage = imagepath[55:-8]
    urllib.urlretrieve(theurl, theimage)
    #urllib.urlretrieve("http://someurl/image.gif", "image.gif") # works!

--- The errors:

This throws the following errors:
  File "getimages.py", line 9, in ?
    urllib.urlretrieve(theurl,theimage)
  File "C:\Python23\lib\urllib.py", line 83, in urlretrieve
    return _urlopener.retrieve(url, filename, reporthook, data)
  File "C:\Python23\lib\urllib.py", line 213, in retrieve
    fp = self.open(url, data)
  File "C:\Python23\lib\urllib.py", line 181, in open
    return getattr(self, name)(url)
  File "C:\Python23\lib\urllib.py", line 410, in open_file
    return self.open_local_file(url)
  File "C:\Python23\lib\urllib.py", line 420, in open_local_file
    raise IOError(e.errno, e.strerror, e.filename)
IOError: [Errno 2] No such file or directory: ''

---

Would really appreciate some pointers on the right way to loop through
and retrieve images, as I've tried various other solutions but am
clearly missing something simple!

Thanks,

justin.
 
Martin Franklin

"No such file or directory: ''" sounds to me like you are trying
to open a file called '' (empty string)

Try adding some debugging just before the urllib.urlretrieve call:

print theimage, imagepath


 
justsee

Thanks, but I have printed the values and verified they are valid paths
and filenames. One correction to the code I listed:
theurl = imagepath[:-8]

For some reason the values aren't being passed through to
urllib.urlretrieve properly, which makes no sense to me.
 
Fredrik Lundh

Martin said:
"No such file or directory: ''" sounds to me like you are trying
to open a file called '' (empty string)

try adding some debugging

print theimage, imagepath

or, better:

print repr(theimage), repr(imagepath)

</F>
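
A small illustration of why repr is the better choice here: printing an empty string shows nothing at all, while repr makes it visible as ''. This is only a sketch. The cell text "imageurl" is a hypothetical stand-in for a short value in source.csv, and the helper name describe is made up for the example; the two slices are the ones posted in the thread. It is written with print() and %r so it also runs on Python 3.

```python
def describe(imagepath):
    """Return a debug line showing exactly what the two slices produce."""
    theurl = imagepath[:-8]      # corrected slice from the thread
    theimage = imagepath[55:-8]  # slice from the original script
    return "theurl=%r theimage=%r" % (theurl, theimage)

# Hypothetical short cell: eight characters, so [:-8] leaves nothing.
print(describe("imageurl"))  # theurl='' theimage=''
```

An empty theurl would fit the traceback above, where urllib falls through to open_local_file and reports a file named ''.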
 
Martin Franklin

I just noticed the code you sent will not work... notice the lack of a
colon ( : ) at the end of the 'for' line....

Please post an exact copy of your code and also the results with the
included print debugging line (with or without repr ;) )



Cheers
Martin

 
Martin Franklin

A working (for me!) example:-


import urllib

paths = ["http://www.python.org/index.html", ]

for remotepath in paths:
    # keep only last 10 chars (index.html)
    # for local file name
    localpath = remotepath[-10:]

    print remotepath, localpath

    urllib.urlretrieve(remotepath, localpath)
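
Fixed-width slices like [-10:] only work when every file name has the same length. One possible alternative, sketched here for Python 3 and not taken from the thread, is to take the last component of the URL path. The helper name local_name is hypothetical; on Python 2, urlsplit lives in the urlparse module instead of urllib.parse.

```python
import posixpath
from urllib.parse import urlsplit

def local_name(remotepath):
    """Return the last path component of a URL, for use as a local file name."""
    return posixpath.basename(urlsplit(remotepath).path)

print(local_name("http://www.python.org/index.html"))  # index.html
```

The result could then be passed as the second argument to urlretrieve in place of the sliced string.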
 
justsee

Ah, thanks for your help. What is happening is that the first row
returned by the reader contains the field names from the CSV file!
Schoolboy error :)
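
For anyone landing here later, a minimal sketch of skipping that header row before looping. It assumes Python 3 and a four-column file like the one in the script; the helper name image_paths and the sample data are hypothetical. On Python 2.3 the equivalent of next(reader, None) would be reader.next().

```python
import csv
import io

def image_paths(csvfile):
    """Yield the fourth column of each data row, skipping the header row."""
    reader = csv.reader(csvfile)
    next(reader, None)  # discard the field names; no error if the file is empty
    for row in reader:
        if len(row) < 4 or not row[3]:
            continue  # ignore short rows and empty cells
        yield row[3]

# Hypothetical sample data standing in for source.csv.
sample = io.StringIO(
    "x,y,z,imagepath\n"
    "1,2,3,http://someurl/image.gif\n"
)
print(list(image_paths(sample)))  # ['http://someurl/image.gif']
```

Each yielded value can then be sliced and handed to urlretrieve as in the original loop (urllib.request.urlretrieve on Python 3).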
 
