regex question

rusi

Can someone throw some light on this anomalous behavior?
>>> import re
>>> r = re.search('a(b+)', 'ababbaaabbbbb')
>>> r.group(1)
'b'
>>> r.group(0)
'ab'
>>> r.group(2)
Traceback (most recent call last):
  ...
IndexError: no such group
>>> re.findall('a(b+)', 'ababbaaabbbbb')
['b', 'bb', 'bbbbb']

So evidently group counts by the number of '()'s and not by the number of
matches (and this is the case whether one uses match or search). So
then what's the point of search-ing vs match-ing?

Or equivalently, how do I move on to the groups of the next match?

[Side note: The docstrings for this really suck:
>>> help(r.group)
Help on built-in function group:

group(...)
]
 
Thomas Jollans

Pretty standard regex behaviour: group 1 is the first pair of parentheses,
group 2 is the second, and so on. Group 0 is the whole match.
The difference between matching and searching is that match only succeeds
if the pattern matches at the very start of the string, while search scans
forward for the first position where it matches (this is documented in the
library docs IIRC). re.match(exp, s) is roughly equivalent to
re.search(r'\A(?:' + exp + ')', s). Simply prepending '^' is not quite the
same: it binds only to the first alternative in a pattern like 'a|b', and
with re.MULTILINE it also matches after every newline.

Apparently, findall() returns the content of the group if the pattern has
one. I didn't check this, but I assume it is documented.
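
To make the group numbering and the match/search difference concrete, here is a small sketch using the string from the original question. The helper name match_via_search is made up for illustration and is not part of the re module; it is only an approximation of what re.match does.

```python
import re

s = 'ababbaaabbbbb'

# Groups are numbered by opening parenthesis; group 0 is the whole match.
m = re.search(r'a(b+)', s)
whole = m.group(0)      # text matched by the entire pattern
first = m.group(1)      # text captured by the first (and only) group
captured = m.groups()   # tuple of all captured groups

# match() only tries position 0; search() scans forward through the string.
at_start = re.match(r'b+', s)     # None, because s starts with 'a'
anywhere = re.search(r'b+', s)    # finds the first run of b's


def match_via_search(exp, text):
    """Hypothetical helper: approximate re.match() using re.search()."""
    # \A anchors at the start of the string only, and the non-capturing
    # group keeps alternations such as 'a|b' fully anchored.
    return re.search(r'\A(?:' + exp + r')', text)


print(whole, first, captured)
print(at_start, anywhere.span())
print(match_via_search('a|b', 'xb'), re.search('^a|b', 'xb'))
```

The last line shows why plain '^' + exp is not a safe substitute: '^a|b' still finds the 'b' in 'xb', whereas the anchored version (like re.match) finds nothing.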

- Thomas
 
MRAB

Can someone throw some light on this anomalous behavior?
>>> import re
>>> r = re.search('a(b+)', 'ababbaaabbbbb')
>>> r.group(1)
'b'
>>> r.group(0)
'ab'
>>> r.group(2)
Traceback (most recent call last):
  ...
IndexError: no such group
>>> re.findall('a(b+)', 'ababbaaabbbbb')
['b', 'bb', 'bbbbb']

findall returns a list of tuples (what the groups captured) if there is
more than 1 group, or a list of strings (what the group captured) if
there is 1 group, or a list of strings (what the regex matched) if
there are no groups.
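
The three cases can be checked with a short sketch on the string from the question. It also uses re.finditer, which was not mentioned above but is one way to step through the matches and get at the groups of each one in turn; the variable names are only illustrative.

```python
import re

s = 'ababbaaabbbbb'

# No groups: a list of the strings the whole regex matched.
no_groups = re.findall(r'ab+', s)

# One group: a list of the strings that group captured.
one_group = re.findall(r'a(b+)', s)

# More than one group: a list of tuples, one tuple per match.
two_groups = re.findall(r'(a)(b+)', s)

print(no_groups)
print(one_group)
print(two_groups)

# finditer() yields one match object per match, so group(0), group(1), ...
# are available for each successive match.
per_match = [(m.start(), m.group(0), m.group(1))
             for m in re.finditer(r'a(b+)', s)]
print(per_match)
```

Each match object from finditer behaves like the one returned by search, so this is probably the closest thing to "moving to the groups of the next match".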
 
