John Salerno
After I've run the re.search function on a string and no match was
found, how can I access that string? When I try to print it directly,
it's an empty string, I assume because it has been "consumed." How do
I prevent this?
It seems to work fine for this 2.x code:
import urllib.request
import re

next_nothing = '12345'
pc_url = 'http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing='
pattern = re.compile(r'[0-9]+')

while True:
    page = urllib.request.urlopen(pc_url + next_nothing)
    match_obj = pattern.search(page.read().decode())
    if match_obj:
        next_nothing = match_obj.group()
        print(next_nothing)
    else:
        print(page.read().decode())
        break
But when I try it with my own code (3.2), it won't print the text of
the page:
import urllib.request
import re

next_nothing = '12345'
pc_url = 'http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing='
pattern = re.compile(r'[0-9]+')

while True:
    page = urllib.request.urlopen(pc_url + next_nothing)
    match_obj = pattern.search(page.read().decode())
    if match_obj:
        next_nothing = match_obj.group()
        print(next_nothing)
    else:
        print(page.read().decode())
        break
P.S. I plan to clean up my code; I know it's not great right now. But
my immediate goal is just to figure out why the 2.x code can print
"text" but my own code can't print "page", which are basically the
same thing. Has something significant changed in the urllib.request
module or in the way the response is decoded, or is it just an RE
issue?
Thanks.
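
A possible explanation, offered as a sketch and not a confirmed diagnosis: re.search does not consume or alter the string it is given, but the object returned by urlopen is file-like, so a second page.read() returns nothing because the first call already read to the end of the stream. The example below uses io.BytesIO as a hypothetical stand-in for the response so it runs without network access; the sample page text and the helper name next_step are assumptions made up for illustration.

```python
import io
import re

pattern = re.compile(r'[0-9]+')

# Hypothetical stand-in for the object returned by urllib.request.urlopen():
# both are file-like, so read() hands back the remaining bytes only once.
page = io.BytesIO(b'There is no next number on this page.')

first = page.read().decode()   # the whole body
second = page.read().decode()  # stream is already at the end, so this is ''

# Searching does not modify the str; 'first' is still intact afterwards.
match_obj = pattern.search(first)


def next_step(body):
    """Return ('number', digits) if body contains digits, else ('text', body).

    Illustrative helper: it works on text that was read once and kept in a
    variable, so the text is still available when no match is found.
    """
    found = pattern.search(body)
    if found:
        return ('number', found.group())
    return ('text', body)
```

Under that assumption, the loop in the post would read the page once into a variable (for example text = page.read().decode()), search that variable, and print the same variable in the else branch.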