Where could the problem be?

Lad · Aug 12, 2005

I use the following
###############
import re
Results=[]
data1='<a href="detailaspxmember=15015&mode=advert" </a><a
href="detailaspxmember=15016&mode=advert" </a><a
href="detailaspxmember=15017&mode=advert" </a>'
ID = re.compile(r'^.*=(\d+)&.*$',re.MULTILINE)
Results=re.findall(ID,data1)
print Results
#############
to extract all the numbers from data1, such as 15015, 15016 and 15017.

But the program extracts only the last number, 15017.
Why?
Thank you for your help.
La.

Peter Otten · Aug 12, 2005

After changing

data = '...
'

to

data = '''...
'''

I get all three numbers. There is probably another significant difference
between the posted code and the code you are actually running.

Peter

Lad · Aug 12, 2005

Peter,
I tried exactly this
########
import re
Results=[]
data1='<a href="detailaspxmember=15015&mode=advert" </a><a
href="detailaspxmember=15016&mode=advert" </a><a
href="detailaspxmember=15017&mode=advert" </a>'
ID = re.compile(r'^.*=(\d+)&.*$',re.MULTILINE)
Results=re.findall(ID,data1)
print "Results are= ",Results
#########
and received
Results are= ['15017']

Not all numbers

What exactly did you get?
Thanks.
L.

Peter Otten · Aug 12, 2005

With /exactly/ this, I get:

$ cat lad1.py
import re
Results=[]
data1='<a href="detailaspxmember=15015&mode=advert" </a><a
href="detailaspxmember=15016&mode=advert" </a><a
href="detailaspxmember=15017&mode=advert" </a>'
ID = re.compile(r'^.*=(\d+)&.*$',re.MULTILINE)
Results=re.findall(ID,data1)
print "Results are= ",Results
$ python lad1.py
File "lad1.py", line 3
data1='<a href="detailaspxmember=15015&mode=advert" </a><a
^
SyntaxError: EOL while scanning single-quoted string

When I modify it to compile, I get /exactly/ this:

$ cat lad2.py
import re
Results=[]
data1='''<a href="detailaspxmember=15015&mode=advert" </a><a
href="detailaspxmember=15016&mode=advert" </a><a
href="detailaspxmember=15017&mode=advert" </a>'''
ID = re.compile(r'^.*=(\d+)&.*$',re.MULTILINE)
Results=re.findall(ID,data1)
print "Results are= ",Results
$ python lad2.py
Results are= ['15015', '15016', '15017']

Peter
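
For anyone following along: the SyntaxError above comes from the string literal, not from the regular expression. A string in single quotes cannot contain a raw line break, while a triple-quoted one can. Below is a minimal sketch of that difference. It uses the built-in compile() only to check the two spellings without running them; the names bad_source, good_source and compiles are made up for this illustration, and the code is written so that it also runs on Python 3.

```python
# Hypothetical source snippets, held as text so they can be checked safely.
bad_source = "data1='first line\nsecond line'\n"        # single quotes, raw line break
good_source = "data1='''first line\nsecond line'''\n"   # triple quotes, raw line break


def compiles(source):
    """Return True if the source text is valid Python, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


single_quoted_ok = compiles(bad_source)   # False: the literal ends at the line break
triple_quoted_ok = compiles(good_source)  # True: triple quotes may span lines
print(single_quoted_ok, triple_quoted_ok)
```

So the code as posted cannot be the code that produced ['15017']; the data in the real program must have been built some other way.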

Lad · Aug 12, 2005

Thank you, Peter, for your help.
The reason it did not work is that findall, with this pattern,
needs line breaks between the lines of data1.
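
A short sketch of why the line breaks matter, assuming the real data had all three links on one line (the thread does not show it, so the link template and the variable names here are illustrative only). With re.MULTILINE, ^ and $ match at every line boundary, so each line gives its own match. Without any line break the greedy .* runs to the end of the string and backtracks only as far as the last "=digits&", which is why only 15017 comes back. A plain "\n" is enough; a full CRLF is not required.

```python
import re

# The pattern from the thread: anchored to line start/end, with greedy ".*".
ID = re.compile(r'^.*=(\d+)&.*$', re.MULTILINE)

# Illustrative template for one link (an assumption, modelled on the posted data).
link = '<a href="detailaspxmember=%d&mode=advert" </a>'
ids = (15015, 15016, 15017)

# The same three links, once separated by newlines and once on a single line.
multi_line = "\n".join(link % n for n in ids)
single_line = "".join(link % n for n in ids)

multi_result = ID.findall(multi_line)    # ['15015', '15016', '15017']
single_result = ID.findall(single_line)  # ['15017']

# A pattern without ^, $ and ".*" does not depend on line breaks at all.
unanchored = re.compile(r'=(\d+)&')
unanchored_result = unanchored.findall(single_line)  # ['15015', '15016', '15017']

print(multi_result)
print(single_result)
print(unanchored_result)
```

Dropping the anchors is probably the more robust choice here, since the result then no longer depends on how the HTML happens to be wrapped.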

Paul McGuire · Aug 12, 2005

Try this, it's a bit more readable than your re.

from pyparsing import Word,nums,Literal,replaceWith

data1='''<a href="detailaspxmember=15015&m-ode=advert" </a><a
href="detailaspxmember=15016&mode=advert" </a><a
href="detailaspxmember=15017&mode=advert" </a>'''

# a number is a word composed of nums, that is, the digits 0-9
# your search string is looking for a number between an '=' and '&'
EQUALS = Literal("=")
AMPER = Literal("&")
number = Word(nums)
hrefNumber = EQUALS + number + AMPER

# scanString is a generator, that returns matching tokens, start,
# and end location for each occurrence in the input string - we
# just care about the second token of each match
print [ tokens[1] for tokens,s,e in hrefNumber.scanString(data1) ]

# just for grins, here is how to convert the numbers to the
# string "###"
number.setParseAction( replaceWith("###") )
print number.transformString(data1)

Prints:

['15015', '15016', '15017']
<a href="detailaspxmember=###&m-ode=advert" </a><a
href="detailaspxmember=###&mode=advert" </a><a
href="detailaspxmember=###&mode=advert" </a>

Download pyparsing at http://pyparsing.sourceforge.net.

-- Paul
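
For readers without pyparsing installed, here is a rough standard-library counterpart to the two steps above. It is only a sketch of the same idea with the re module, not a description of how pyparsing works internally; the names href_number, matches, numbers and masked are chosen for this example. re.finditer yields one match object per occurrence, carrying the captured text together with its start and end offsets, which is similar in spirit to the triples that scanString returns.

```python
import re

data1 = '''<a href="detailaspxmember=15015&mode=advert" </a><a
href="detailaspxmember=15016&mode=advert" </a><a
href="detailaspxmember=15017&mode=advert" </a>'''

# A number between an '=' and an '&', as in the hrefNumber expression.
href_number = re.compile(r'=(\d+)&')

# One (number, start, end) triple per occurrence in the input string.
matches = [(m.group(1), m.start(), m.end()) for m in href_number.finditer(data1)]
numbers = [number for number, start, end in matches]

# Replace every run of digits with "###".
masked = re.sub(r'\d+', '###', data1)

print(numbers)
print(masked)
```

The start and end offsets index into the original string, so data1[start:end] gives back the full "=15015&" text of each match.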
