Reg Ex help

don

I have a string from a clearcase cleartool ls command.

/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT
from /main/parallel_branch_1/release_branch_1.0/4

I want to write a regex that gives me the branch the file was
checked out on, in this case 'dbg_for_python'.

Also if there is a better way than using regex, please let me know.

Thanks in advance,
Don
 
Aaron Barclay

Hi don,

There may well be a better way than regex, although I find them useful
and use them a lot.

How well they work depends on knowing some things about the input. For
example, if the directory you are after is always 4 levels deep in the
structure, you could try something like...

import re

path = ('/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT'
        ' from /main/parallel_branch_1/release_branch_1.0/4')

# Capture the fourth component; \S* is greedy, so this relies on the fixed depth.
p = re.compile(r'/\S*/\S*/\S*/(\S*)/')
m = p.search(path)
if m:
    print(m.group(1))
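
A variation on the fixed-depth idea, offered only as a sketch: using [^/\s] instead of \S keeps a segment from ever swallowing a slash, and a repeat count replaces the hand-written run of segments. It assumes the branch always sits at the same depth; the helper name branch_at_depth and its depth parameter are made up for illustration.

```python
import re

# Hypothetical helper: return the path component found at a fixed depth.
def branch_at_depth(path, depth=4):
    # Skip (depth - 1) leading components, then capture the next one.
    # [^/\s]+ cannot cross a slash or whitespace, so each group is one component.
    pattern = r'(?:/[^/\s]+){%d}/([^/\s]+)' % (depth - 1)
    m = re.match(pattern, path)
    return m.group(1) if m else None

path = ('/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT'
        ' from /main/parallel_branch_1/release_branch_1.0/4')
print(branch_at_depth(path))  # dbg_for_python
```

If the path is shorter than the requested depth, the helper returns None rather than a wrong component.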


This is a good reference...
http://www.amk.ca/python/howto/regex/

Hope that helps,
aaron.
 
James Thiele

Not regex, but does this do what you want?
>>> s = "/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT"
>>> s = s + " from /main/parallel_branch_1/release_branch_1.0/4"
>>> s.split('/')[4]
'dbg_for_python'
 
Tim Chase

Well, if you have it all in a single string:

s = "/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT from /main/parallel_branch_1/release_branch_1.0/4"

you can do

branch = s.split("/")[4]

which returns the branch, assuming the path from root is the
same for each item in question.

If not, you can tinker with something like

r = re.compile(r'/([^/]*)/CHECKEDOUT')
m = r.search(s)

which should make m.group(1) the resulting item (search rather
than match, because the pattern does not begin at the start of
the string). You don't give much detail regarding what is
constant (the number of subdirectories in the path? the
CHECKEDOUT portion? etc.), so it's kinda hard to figure out
what is most globally applicable.
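
One detail worth checking with a pattern like this: re.match only tries the pattern at the very start of the string, while re.search scans forward for the first place it fits. A small sketch on the string from the question shows the difference (the variable names anchored and scanned are just for illustration):

```python
import re

s = ('/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT'
     ' from /main/parallel_branch_1/release_branch_1.0/4')
r = re.compile(r'/([^/]*)/CHECKEDOUT')

anchored = r.match(s)   # only tries position 0, where the pattern does not fit
scanned = r.search(s)   # scans forward until the pattern fits

print(anchored)          # None
print(scanned.group(1))  # dbg_for_python
```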

-tkc
 
Mirco Wahab

Hi don

This is a good situation for a regex, because the other
solutions won't easily cope with differently structured
strings.

If you know that you need the string before CHECKEDOUT,
you can, for example, use a positive lookahead
(mentioned here today).

Pseudo: take the string between / ... / and
return it if the next thing is CHECKEDOUT
(or something else):

/ ([^/]+) / (?=CHECKEDOUT)

[^/] is a character class meaning "not /",
[^/]+ matches one or more such characters,
and the parentheses in ([^/]+) capture them.

The code:

import re

t = '/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT from /main/p...'
r = r'/([^/]+)/(?=CHECKEDOUT)'

print(re.search(r, t).group(1))

would do the job, independent of the structure
of the string, except for the /CHECKEDOUT part (which
has to be there).
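
Since the /CHECKEDOUT part has to be there, a line without it makes re.search return None, and calling .group(1) on that fails. A minimal sketch of the same lookahead wrapped in a guard; the names BRANCH_RE and checkout_branch are invented for this example:

```python
import re

# Capture the path component that is immediately followed by CHECKEDOUT.
BRANCH_RE = re.compile(r'/([^/]+)/(?=CHECKEDOUT)')

def checkout_branch(line):
    # Hypothetical helper: returns None when no CHECKEDOUT marker is present.
    m = BRANCH_RE.search(line)
    return m.group(1) if m else None

print(checkout_branch('/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT'))
print(checkout_branch('/main/parallel_branch_1/release_branch_1.0/4'))
```

The first call prints dbg_for_python; the second prints None because the lookahead never succeeds.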

If there are 'better ways' - that depends on
'better ways for whom?'. If you can handle
the Railgun, why bother with the Pistols ;-)

Regards

M.
 
Edward Elliott

Bruno said:
don wrote:
Also if there is a better way than using regex, please let me know.

s = "/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT from /main/parallel_branch_1/release_branch_1.0/4"
parts = s.replace(' ', '/').strip('/').split('/')
branch = parts[parts.index('CHECKEDOUT') - 1]

I wouldn't call these better (or worse) than regexes, but a slight variation
on the above:

marker = s.index('/CHECKEDOUT')
branch = s[s.rindex('/', 0, marker) + 1:marker]

This version will throw exceptions when the marker isn't found, which may or
may not be preferable under the circumstances.
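
For the case where an exception is not wanted, here is a sketch of the same index/rindex slicing with the ValueError caught. The function name branch_before_marker and the marker_text parameter are assumptions made for the example, not part of the code above.

```python
def branch_before_marker(s, marker_text='/CHECKEDOUT'):
    # Hypothetical helper: returns None instead of raising when the
    # marker (or a slash before it) cannot be found.
    try:
        marker = s.index(marker_text)            # start of '/CHECKEDOUT'
        start = s.rindex('/', 0, marker) + 1     # just past the previous slash
    except ValueError:
        return None
    return s[start:marker]

s = ('/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT'
     ' from /main/parallel_branch_1/release_branch_1.0/4')
print(branch_before_marker(s))  # dbg_for_python
```

Naming the two positions (marker and start) also makes the slice a little easier to read than the one-liner.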
 
Bruno Desthuilliers

s = "/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT from /main/parallel_branch_1/release_branch_1.0/4"
parts = s.replace(' ', '/').strip('/').split('/')
branch = parts[parts.index('CHECKEDOUT') - 1]
 
Paddy

P.S.

This is how it works:

>>> s = "/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT from /main/parallel_branch_1/release_branch_1.0/4"
>>> s.split()
['/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT', 'from', '/main/parallel_branch_1/release_branch_1.0/4']
>>> s.split()[0]
'/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT'
>>> s.split()[0].split('/')
['', 'main', 'parallel_branch_1', 'release_branch_1.0', 'dbg_for_python', 'CHECKEDOUT']
>>> s.split()[0].split('/')[-1]
'CHECKEDOUT'
>>> s.split()[0].split('/')[-2]
'dbg_for_python'
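
The steps above can be folded into a small function. This is only a sketch, assuming every line looks like the one in the question (a version path first, then optional extra fields); the name branch_from_ls_line is made up here.

```python
def branch_from_ls_line(line):
    # Hypothetical helper: the first whitespace-separated field is the version path.
    version_path = line.split()[0]
    # The last component is CHECKEDOUT (or a version number);
    # the component just before it is taken as the branch.
    return version_path.split('/')[-2]

s = ("/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT"
     " from /main/parallel_branch_1/release_branch_1.0/4")
print(branch_from_ls_line(s))  # dbg_for_python
```

Unlike the regex versions, this does not check for CHECKEDOUT at all, so on a line that is not a checkout it simply returns the second-to-last component, and on an empty line it raises IndexError.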

- Paddy.
 
bruno at modulix

Edward Elliott wrote:
(snip)
don wrote:
(snip)

I wouldn't call these better (or worse) than regexes, but a slight variation
on the above:

marker = s.index('/CHECKEDOUT')
branch = s[s.rindex('/', 0, marker) + 1:marker]

Much cleaner than mine. I shouldn't try to code when it's time for bed !-)
 
Edward Elliott

bruno said:
parts = s.replace(' ', '/').strip('/').split('/')
branch = parts[parts.index('CHECKEDOUT') - 1]

Edward said:
marker = s.index('/CHECKEDOUT')
branch = s[s.rindex('/', 0, marker) + 1:marker]

bruno said:
Much cleaner than mine. I shouldn't try to code when it's time for bed !-)

Not terribly readable though; it's hard to tell what the magic slice indexes
mean. Yours is easier to follow. I think I'd just use a regex though.
 
Anthra Norell

('/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT')
'dbg_for_python'

If I understand your problem, this might be a solution. It is a stream
editor I devised on the impression that it could handle, in a simple manner,
a number of relatively simple problems on this list for which no
commensurately simple methods seem to exist. I intend to propose it to
the group when I finish the doc. In the meantime, whom do I propose it to?

Frederic


----- Original Message -----
From: "don" <[email protected]>
Newsgroups: comp.lang.python
To: <[email protected]>
Sent: Thursday, May 11, 2006 7:39 PM
Subject: Reg Ex help
 
