Help with regexp please

Felix Collins

Hi,
I'm not a regexp expert and had a bit of trouble with the following
search.

I have an "outline number" system like

1
1.2
1.2.3
1.3
2
3
3.1

etc.

I want to parse an outline number and return the parent.

So for example...

parent("1.2.3.4") returns "1.2.3"

The only way I can figure is to do two searches feeding the output of
the first into the input of the second.

Here is the code fragment...

m = re.compile(r'(\d+\.)+').match("1.2.3.4")
n = re.compile(r'\d+(\.\d+)+').match(m.string[m.start():m.end()])
parentoutlinenumber = n.string[n.start():n.end()]

parentoutlinenumber
1.2.3

How do I get that into one regexp?

Thanks for any help...

Felix
 
Scott David Daniels

Felix said:
Hi,
I'm not a regexp expert and had a bit of trouble with the following search.
I have an "outline number" system like
1
1.2
1.2.3
I want to parse an outline number and return the parent.

Seems to me regex is not the way to go:
def parent(string):
    return string[: string.rindex('.')]
 
Christopher Subich

Scott said:
Felix said:
I have an "outline number" system like
1
1.2
1.2.3
I want to parse an outline number and return the parent.

Seems to me regex is not the way to go:
def parent(string):
    return string[: string.rindex('.')]

Absolutely, regex is the wrong solution for this problem. I'd suggest
using rsplit, though, since that will Do The Right Thing when a
top-level outline number is passed:
def parent(string):
    return string.rsplit('.',1)[0]

Your solution will raise a ValueError in that case, which may or may not
be the right behaviour.
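
The difference shows up only for a top-level number. A minimal sketch, with the names parent_rindex and parent_rsplit made up here just to keep the two versions apart:

```python
def parent_rindex(outline):
    # Slice up to the last dot; str.rindex raises ValueError when no dot exists.
    return outline[:outline.rindex('.')]

def parent_rsplit(outline):
    # Split once from the right and keep the left part.
    # With no dot, the one-element result holds the input unchanged.
    return outline.rsplit('.', 1)[0]

print(parent_rindex("1.2.3.4"))  # 1.2.3
print(parent_rsplit("1.2.3.4"))  # 1.2.3
print(parent_rsplit("1"))        # 1

try:
    parent_rindex("1")
except ValueError:
    print("rindex version: no dot, so ValueError")
```

Which behaviour is preferable depends on whether asking for the parent of a top-level number should count as an error in the calling code.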
 
Felix Collins

Thanks to you both. Wow! What a quick response!

Christopher said:
string.rsplit('.',1)[0]

Clever Python! ;-)


Sorry, I mainly code in C so I'm not very Pythonic in my thinking.
Thanks again...

Felix
 
Terry Hancock

Felix said:
Thanks to you both. Wow! what a quick response!
string.rsplit('.',1)[0]
Clever Python! ;-)
Sorry, I mainly code in C so I'm not very Pythonic in my thinking.
Thanks again...

I think this is the "regexes can't count" problem. When the repetition
count matters, you usually need something else. Usually some
combination of string and list methods will do the trick, as here.
 
Christopher Subich

Terry said:
I think this is the "regexes can't count" problem. When the repetition
count matters, you usually need something else. Usually some
combination of string and list methods will do the trick, as here.

Not exactly. Regexes are just fine at doing things like "first" and
"last." The "regexes can't count" saying applies mostly to activities
that reduce to parentheses matching at arbitrary nesting.
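
To illustrate the "last" case (a sketch, not code anyone in the thread posted): a greedy group followed by a literal dot backs off only as far as the final dot, so a single pattern can pick out the parent. The names PARENT and parent_re are hypothetical.

```python
import re

# Greedy (.*) takes as much as it can, then backtracks to the last '.'
# that is followed only by digits up to the end of the string.
PARENT = re.compile(r'(.*)\.\d+$')

def parent_re(outline):
    m = PARENT.match(outline)
    # No dot means a top-level number; this sketch returns None for it.
    return m.group(1) if m else None

print(parent_re("1.2.3.4"))  # 1.2.3
print(parent_re("3.1"))      # 3
print(parent_re("2"))        # None
```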

The OP's problem could easily be written as a regex substitution, it's
just that there's no need to; I believe that the sub would be
(completely untested, and I'm probably going to use the wrong call to
re.sub anyway since I don't have the docs open):

re.sub(r'([0-9.]+)\.[0-9]+', r'\1', outline_value)

It's just that the string.rsplit call is much more legible, much more
intuitive, doesn't do strange things if it's accidentally called on a
top-level outline value, and also extends immediately to handle
outlines of the form I.1.a.i.
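
A rough side-by-side check, assuming the documented argument order re.sub(pattern, replacement, string) and raw strings so that \1 stays a backreference. The function names parent_sub and parent_rsplit are made up for the comparison.

```python
import re

def parent_sub(outline):
    # Digits-and-dots prefix, then a final ".<digits>" that gets dropped.
    return re.sub(r'([0-9.]+)\.[0-9]+', r'\1', outline)

def parent_rsplit(outline):
    return outline.rsplit('.', 1)[0]

for value in ("1.2.3.4", "7", "I.1.a.i"):
    print(value, "->", parent_sub(value), "|", parent_rsplit(value))
```

With this digit-only pattern the substitution finds nothing to replace in "I.1.a.i" and hands it back unchanged, while the rsplit version yields "I.1.a" without any adjustment.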
 
