SMERSH009X
I've written a script that navigates various URLs on a website and
fetches their contents.
The URLs are fed from a list, "urlList". Everything works splendidly
until I try to add encoded parameters to a particular URL.
For example, how would I navigate to
http://online.investools.com/landing.iedu?signedin=true rather than
just http://online.investools.com/landing.iedu?
How can I modify the script to urlencode the parameters
{signedin:true} and associate them with a specific URL from
urlList?
Thank you!
import datetime, time, re, os, sys, traceback, smtplib, string
import urllib2, urllib, inspect
from urllib import urlencode
from urllib2 import build_opener, HTTPCookieProcessor, Request

opener = build_opener(HTTPCookieProcessor)

def urlopen2(url, data=None, user_agent='urlopen2'):
    """Opens our URLs."""
    if hasattr(data, "__iter__"):
        data = urlencode(data)
    headers = {'User-Agent': user_agent}  # User-Agent for unspecified browser
    return opener.open(Request(url, data, headers))

def badCharCheck(host, url):
    try:
        page = urlopen2("http://" + host + ".investools.com/" + url, ())
        pageRead = page.read()
        print "Loading:", url
        #print pageRead
    except:
        print "Failed: ", traceback.format_tb(sys.exc_info()[2]), '\n'

if __name__ == '__main__':
    host = "online"
    urlList = ["landing.iedu", "sitemap.iedu"]
    print "\n", "***** Begin BadCharCheck for", host
    for url in urlList:
        badCharCheck(host, url)
    print '***** TEST FINISHED! Total Runs:'
    sys.exit()
OUTPUT:
***** Begin BadCharCheck for online
Loading: landing.iedu
Loading: sitemap.iedu
***** TEST FINISHED! Total Runs:
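
To make the goal concrete, here is a minimal sketch of one way the parameters might be tied to individual entries. It assumes urlList is restructured into (path, params) pairs, which is not how the script above stores it, and buildUrl is a hypothetical helper name. The encoded string is appended to the URL itself rather than passed as the data argument, because urllib2 sends any non-None data as a POST body instead of a query string.

```python
try:
    from urllib import urlencode          # Python 2, as in the script above
except ImportError:
    from urllib.parse import urlencode    # Python 3

# Assumed structure (not in the original script): each entry pairs a path
# with its query parameters, or None when the page takes none.
urlList = [
    ("landing.iedu", {"signedin": "true"}),
    ("sitemap.iedu", None),
]

def buildUrl(host, path, params=None):
    """Hypothetical helper: return the full URL, with an encoded query string if params is given."""
    url = "http://" + host + ".investools.com/" + path
    if params:
        url += "?" + urlencode(params)
    return url

for path, params in urlList:
    print(buildUrl("online", path, params))
```

With this layout, the loop would hand the result of buildUrl(host, path, params) to urlopen2 and leave data unset, so the request stays a GET.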