Regexp help

Marcus Bristav

Hello everyone,

I have a string of the form

2h 3m

or

3m 2h

or

2h 3minutes

or

2hour 3min

and so on

Is there a smart regexp one liner that could produce

[2, 3]

for any of the above input? I know that there will be an m or an h.

If anyone types just, for example,

2

then that should produce [2].

/Marcus
 
Vincent Fourmond

Hello,

Marcus said:
I have a string of the form

2h 3m

or

3m 2h
[....]
Is there a smart regexp one liner that could produce

[2, 3]

If you want to get [2,3] in both cases, that will be really difficult.
As far as I know, you can only do that in C#, which has named capturing
groups. In all the other languages I know, the capturing groups are
numbered when they are found... That rules it out.

By the way, would it be difficult to implement named capturing groups
in regular expressions ? Would that interest someone ?

Cheers !

Vince
 
MonkeeSage

Marcus said:
Is there a smart regexp one liner that could produce

[2, 3]

r = /(\d+)h.*?(\d+)m/ # lazy .*? so a multi-digit minute count such as "13m" is captured whole
s1 = "2h 3m"
s2 = "2h 3minutes"
s3 = "2hour 3min"
m = r.match(s1)
p [m[1].to_i, m[2].to_i] # => [2, 3]
m = r.match(s2)
p [m[1].to_i, m[2].to_i] # => [2, 3]
m = r.match(s3)
p [m[1].to_i, m[2].to_i] # => [2, 3]

Regards,
Jordan
 
Steve Callaway

Not so difficult, but it's not, as far as I can see, a
one liner. I am working something up at the moment
using an array of regexps.



Steve Callaway

Ah, neat, Jordan, and more elegant than parsing an
array of regexps :)



Tom Armitage

But, of course, that *won't* capture "3m 2h", like you described...



MonkeeSage

Tom said:
But, of course, that *won't* capture "3m 2h", like you described...

True...

So:

r = Regexp.new(/(\d+)h?m?.*?(\d+)m?h?/)

'Course, then you'll have [3, 2] for the edge case rather than [2,
3]...but to get the full functionality that the OP described (including
the case where just "2" is given), you'd need fancier logic than just
regexp anyhow.

Regards,
Jordan
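
One way the "fancier logic" could look is sketched below. It is only a sketch under stated assumptions: the helper name hours_minutes is made up for illustration, and treating a lone number as the first element is a guess at what the original poster wants. Each unit is looked up with its own small regexp, so the order of the two parts no longer matters.

```ruby
# Hypothetical helper; the name and the bare-number rule are assumptions,
# not something posted in the thread.
def hours_minutes(str)
  hours   = str[/(\d+)\s*h/, 1]        # digits directly before an "h..." unit
  minutes = str[/(\d+)\s*m/, 1]        # digits directly before an "m..." unit
  bare    = str[/\A\s*(\d+)\s*\z/, 1]  # the whole input is just a number
  [hours || bare, minutes].compact.map(&:to_i)
end

p hours_minutes("2h 3m")      # => [2, 3]
p hours_minutes("3m 2h")      # => [2, 3]
p hours_minutes("2hour 3min") # => [2, 3]
p hours_minutes("2")          # => [2]
```

Note that an input with only minutes, such as "3m", comes back as [3], which cannot be told apart from a bare hour count; whether that matters depends on how the result is used.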
 
Park Heesob

Hi,

Marcus Bristav said (Fri, 29 Sep 2006 18:04:16 +0900):
Is there a smart regexp one liner that could produce

[2, 3]

str = "2h 3m" # or something
str.scan(/(\d+)(\w*)/).sort_by{|x|x[1]}.collect{|x|x[0].to_i}

Regards,

Park Heesob
 
Vincent Fourmond

Hello again !

Well, just to contradict myself, although this is no one-liner:

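def scan(str)
  # Two alternatives: hours first (groups 1 and 2) or minutes first (groups 3 and 4).
  # The lazy .*? keeps a multi-digit second number whole.
  re = /(\d+)h.*?(\d+)m|(\d+)m.*?(\d+)h/
  if m = re.match(str)
    return [m[1], m[2]] if m[1]
    return [m[4], m[3]]
  end
end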

p scan("2h 3m")
p scan("3m 2h")

Cheers !

Vince
 
Bruno Michel

Vincent Fourmond wrote:
Well, just to contradict myself, although this is no one-liner:
[....]

And the one-liner:

$ irb
irb> "3m 2h".scan(/(\d+)h.*(\d+)m|(\d+)m.*(\d+)h/).flatten.values_at(0,1,3,2).compact
=> ["2", "3"]
irb> "2h 3m".scan(/(\d+)h.*(\d+)m|(\d+)m.*(\d+)h/).flatten.values_at(0,1,3,2).compact
=> ["2", "3"]


It's possible to add .map { |i| i.to_i } at the end of this one-liner if
the result array must contain integers instead of strings.
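
As a rough sketch of that integer variant: the method name hm_one_liner and the constant name are made up here, and the pattern differs from the posted one only in using a lazy `.*?`, an assumption added so that a multi-digit second number is captured whole.

```ruby
# Groups 1-2 fill when hours come first, groups 3-4 when minutes come first.
# Hypothetical constant name; lazy .*? is a small deviation from the posted regexp.
RE_EITHER_ORDER = /(\d+)h.*?(\d+)m|(\d+)m.*?(\d+)h/

# Hypothetical wrapper around the one-liner, returning integers.
def hm_one_liner(str)
  # values_at(0, 1, 3, 2) swaps the minutes-first groups back into
  # [hours, minutes] order; compact drops the groups that did not take part.
  str.scan(RE_EITHER_ORDER).flatten.values_at(0, 1, 3, 2).compact.map { |i| i.to_i }
end

p hm_one_liner("2h 3m") # => [2, 3]
p hm_one_liner("3m 2h") # => [2, 3]
```

A bare "2" does not match either alternative, so this version returns an empty array for it.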
 
Pit Capitain

Park said:
str = "2h 3m" # or something
str.scan(/(\d+)(\w*)/).sort_by{|x|x[1]}.collect{|x|x[0].to_i}

Very nice idea, Park! I wouldn't have thought of that. Slightly shorter:

str.scan(/(\d+)(\w)/).sort_by{|n,u|u}.map{|n,u|n.to_i}

Regards,
Pit
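
For anyone following along, here is a small step-by-step sketch of why sorting by the unit works. The method names park_style and pit_style are labels invented for this comparison; the regexps are the two posted above.

```ruby
str     = "3m 2hour"
pairs   = str.scan(/(\d+)(\w*)/)           # [["3", "m"], ["2", "hour"]]
by_unit = pairs.sort_by { |_n, u| u }      # "hour" sorts before "m"
result  = by_unit.map { |n, _u| n.to_i }
p result # => [2, 3]

# Hypothetical wrappers so the two posted variants can be compared.
def park_style(str)
  str.scan(/(\d+)(\w*)/).sort_by { |_n, u| u }.map { |n, _u| n.to_i }
end

def pit_style(str)
  str.scan(/(\d+)(\w)/).sort_by { |_n, u| u }.map { |n, _u| n.to_i }
end

p park_style("2") # => [2]
p pit_style("2")  # => []
```

The two differ at the edges: `(\w)` needs a unit character, so a bare "2" yields nothing, while `(\w*)` swallows the rest of an unspaced input such as "2h3m" into the first unit and returns only [2].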
 
Matthias Luedtke

I have a string of the form [...]

Is there a smart regexp one liner that could produce

Hello Marcus,

here's my take on it:

times = %w{ 2hour3min 2h3minutes 3m2h 2h3m }
=> ["2hour3min", "2h3minutes", "3m2h", "2h3m"]

times.map{ |t| [t[/\d+h[a-z]*/].to_i, t[/\d+m[a-z]*/].to_i] }
=> [[2, 3], [2, 3], [2, 3], [2, 3]]

Probably a little slower than the other solutions but perhaps easier to grasp.

Regards
Matthias
 
Relm

Vincent said:
If you want to get [2,3] in both cases, that will be really difficult.
[....]

irb> a
=> ["2h 3m", "3m 2h", "2h 3minutes", "2hour 3min", "2"]
irb> re
=> /(?=.*\b(\d+)(?=h|\b))(?=.*\b(\d+)m|)/
irb> a.map {|x| x.match(re).captures}
=> [["2", "3"], ["2", "3"], ["2", "3"], ["2", "3"], ["2", nil]]
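
To get from those captures to the arrays asked for in the original question, a possible follow-up is sketched below. The constant name RELM_RE is made up for illustration, and the sketch assumes every input contains a number the pattern can match, since match would return nil otherwise.

```ruby
# Same pattern as above: the first lookahead finds the number followed by "h"
# (or a bare number), the second finds the number followed by "m", if any.
RELM_RE = /(?=.*\b(\d+)(?=h|\b))(?=.*\b(\d+)m|)/

inputs  = ["2h 3m", "3m 2h", "2h 3minutes", "2hour 3min", "2"]

# compact drops the nil left by the empty alternative; to_i converts the rest.
results = inputs.map { |s| s.match(RELM_RE).captures.compact.map(&:to_i) }
p results # => [[2, 3], [2, 3], [2, 3], [2, 3], [2]]
```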
 
Tom Pollard

Python regexps have named capturing groups. It's extremely helpful
if you need to construct complicated patterns, because the index of
each capturing group can easily change when you add and remove
things in the regexp.

Tom
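
For comparison, Ruby 1.9 and later also accept the (?<name>...) syntax. A minimal sketch of the either-order pattern with named groups follows; the group names h1, m1, m2, h2 and the method name are chosen here for illustration only.

```ruby
# Each alternative gets its own pair of names, so the code below never
# depends on group numbers. All names are hypothetical.
RE_NAMED = /(?<h1>\d+)h.*?(?<m1>\d+)m|(?<m2>\d+)m.*?(?<h2>\d+)h/

def hours_minutes_named(str)
  m = RE_NAMED.match(str)
  return nil unless m
  # A group that did not take part in the match reads as nil.
  [(m[:h1] || m[:h2]).to_i, (m[:m1] || m[:m2]).to_i]
end

p hours_minutes_named("2h 3m") # => [2, 3]
p hours_minutes_named("3m 2h") # => [2, 3]
```

Adding or removing a group elsewhere in the pattern leaves these lookups untouched, which is the point made above about indexes shifting.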
 
