Tom said:
Does Ruby support regexps that assign names to specific matched groups?
It works in Ruby 1.9. I have written some articles with many examples on
http://www.ruby-mine.de (the site may be down for maintenance for the next two
days), in particular this one:
http://www.ruby-mine.de?p=130
Unfortunately it is only available in German at the moment, but the examples are
Ruby code and irb sessions, so they should be understandable without reading the
German text.
Note, however, that Ruby 1.9 is still under development, so details may change
in the future.
Some examples:
irb(main):001:0> md="abba".match(/(?<a1>.)(?<a2>.)\k<a2>\k<a1>/)
=> #<MatchData:0x2bf0488>
irb(main):002:0> md[0]
=> "abba"
irb(main):003:0> md[1]
=> "a"
irb(main):004:0> md[2]
=> "b"
irb(main):005:0> md[:a1]
=> "a"
irb(main):006:0> md[:a2]
=> "b"
irb(main):007:0> md['a1']
=> "a"
irb(main):008:0> md['a2']
=> "b"
This shows that the contents of a matched group are accessible by number, by
name as a symbol, and by name as a string. However, named groups and ordinary
(numbered) capturing groups cannot be mixed in the same regular expression; a
numbered back-reference then raises an error:
irb(main):001:0> "abba".match(/(?<a1>.)(.)\2\k<a1>/)
SyntaxError: compile error
(irb):1: numbered backref/call is not allowed. (use name): /(?<a1>.)(.)\2\k<a1>/
from (irb):1:in `Kernel#binding'
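Written as a script rather than an irb session, the three access styles from the first example can be put side by side. This is only a small sketch: it assumes that MatchData#names is available (it is in released Ruby 1.9 and later), and the variable names are made up for illustration.

```ruby
# Sketch: access named groups by number, symbol, and string.
md = "abba".match(/(?<a1>.)(?<a2>.)\k<a2>\k<a1>/)

# MatchData#names lists the group names in order of definition.
group_names = md.names

# Build a name => value hash from the match (illustrative helper variable).
by_name = {}
group_names.each { |name| by_name[name] = md[name] }

# All three access styles return the same string for the first group.
same_value = (md[1] == md[:a1]) && (md[:a1] == md['a1'])

puts group_names.inspect
puts by_name.inspect
puts same_value
```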
-----
When "sub", "gsub", "sub!", or "gsub!" is used without a block, the groups can
only be accessed by name; positional access returns the empty string:
irb(main):001:0> puts 'axbx'.sub(/(?<r>.)x(?<s>.)x/, '\k<s>\k<r>')
ba
=> nil
irb(main):002:0> puts 'axbx'.sub(/(?<r>.)x(?<s>.)x/, '\2\1')
=> nil
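A slightly more practical use of the same '\k<name>' replacement syntax is reordering the parts of a date. This is just a sketch; the group names and the sample date are invented for illustration.

```ruby
# Sketch: reorder named groups in a replacement string with \k<name>.
date_pattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/

iso_date = "2007-03-15"

# Single quotes keep the backslashes intact for sub.
german_date = iso_date.sub(date_pattern, '\k<day>.\k<month>.\k<year>')

puts german_date
```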
-----
Inside a block, direct access to the group names is not possible; at least I
have not found a way to do it directly. The positional variables "$1" etc. can
be used. Another possibility is to use the MatchData object "$~" inside the
block, which offers the same access options as described for "match":
irb(main):001:0> 'axbx'.sub(/(?<i>.)x(?<j>.)x/){|k|p k;p $1;p $2;'u'+$2}
"axbx"
"a"
"b"
=> "ub"
irb(main):002:0> 'axbxcxdx'.gsub(/(?<i>.)x(?<j>.)x/){|k|p k;p $1;p $2;'u'+$2}
"axbx"
"a"
"b"
"cxdx"
"c"
"d"
=> "ubud"
and using the MatchData object:
irb(main):001:0> 'axbx'.sub(/(?<i>.)x(?<j>.)x/){|k|p $~[:i];p $~[:j];'u'+$~[:j]}
"a"
"b"
=> "ub"
irb(main):002:0> 'axbxcxdx'.gsub(/(?<i>.)x(?<j>.)x/){|k|p $~[:i];p $~[:j];'u'+$~[:j]}
"a"
"b"
"c"
"d"
=> "ubud"
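As a script, the "$~" approach might look like the following sketch, which swaps the two named groups in every match. The pattern is the one from the examples above; the variable names are only illustrative.

```ruby
# Sketch: read named groups through $~ inside a gsub block.
pattern = /(?<i>.)x(?<j>.)x/

# $~ holds the MatchData of the current match while the block runs.
swapped = "axbxcxdx".gsub(pattern) { $~[:j] + $~[:i] }

puts swapped
```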
------
There are special situations where the Oniguruma features in Ruby 1.9 allow
solutions that are not as simple to express in Ruby 1.8.
Ruby 1.8:
irb(main):001:0> "rasbuavb".scan(/(.)a|(.)b/){|i|p i}
["r", nil]
[nil, "s"]
["u", nil]
[nil, "v"]
=> "rasbuavb"
Ruby 1.9:
irb(main):002:0> "rasbuavb".scan(/(.){0}\g<1>a|\g<1>b/){|i|p i}
["r"]
["s"]
["u"]
["v"]
=> "rasbuavb"
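The same "define with {0}, then call" idea also works with a named subexpression. The following is a minimal sketch, not taken from the article; the group name "octet" and the helper method are hypothetical, and the pattern only checks the rough shape of a dotted address, not the value range of each part.

```ruby
# Sketch: define a subexpression once and call it several times.
ipv4_shape = /
  (?<octet>\d{1,3}){0}               # define the subexpression without matching anything here
  \A\g<octet>(?:\.\g<octet>){3}\z    # call it four times, separated by dots
/x

# Hypothetical helper: true if the text has the dotted four-part shape.
def shaped_like_ipv4?(text, pattern)
  !pattern.match(text).nil?
end

puts shaped_like_ipv4?("192.168.0.1", ipv4_shape)
puts shaped_like_ipv4?("192.168.0", ipv4_shape)
```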
Here it is not a named group that does the work, but the ability to call a
subexpression. This is a very powerful feature that allows recursive
constructs. In the article I built a pocket calculator as an example, but it may
also be useful for checking complex input fields in a GUI, or even later on in
Rails:
pattern = /
  (?<e>\g<t>\+\g<e>|\g<t>-\g<e>|\g<t>){0}     # expression
  (?<t>\g<f>\*\g<t>|\g<f>\/\g<t>|\g<f>){0}    # term
  (?<f>[-+]?\g<id>|\(\g<e>\)){0}              # factor
  (?<id>\g<n>|\g<v>){0}                       # variable name or number
  (?<n>[a-zA-Z_]\w*){0}                       # variable name
  (?<v>\d+(\.\d+)?){0}                        # numeric value
  ^((?<var>\g<n>)=)?(?<expr>\g<e>)$
/x
vars = Hash.new(0)
basbind = binding
# print 'input> ' # for interactive usage
while (!(inp = DATA.gets).chomp.match(/^quit$/i))
  if (md = inp.chomp.gsub(/\s+/,'').match(pattern))
    # replace each variable name by a lookup in the vars hash, then evaluate
    expr = md[:expr].gsub(/([a-zA-Z_]\w*)/, 'vars["\1"]')
    erg = eval(expr, basbind)
    vars[md[:var]] = erg if md[:var]
    puts "#{inp.chomp}, result> #{(md[:var])?(md[:var]+'='):''}#{erg}"
  else
    puts "+++++ incorrect input: '#{inp.chomp}'"
  end
  # print 'input> ' # for interactive usage
end
puts '***** variables *****'
vars.keys.sort.each{|v|puts "#{v}=#{vars[v]}"}
puts '******* End ********'
__END__
30+12
a = 30 + 12
b = 2*a
c = -(a*a+5)
d = (6+5*a)*c
quit
results in:
30+12, result> 42
a = 30 + 12, result> a=42
b = 2*a, result> b=84
c = -(a*a+5), result> c=-1769
d = (6+5*a)*c, result> d=-382104
***** variables *****
a=42
b=84
c=-1769
d=-382104
******* End ********
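The recursion that the calculator pattern relies on is easier to see in isolation in a much smaller pattern. The following sketch is not from the article; the constant, the group name "group" and the helper method are invented for illustration. It checks whether a string is a single, properly nested parenthesized group, which is the kind of input-field check mentioned above.

```ruby
# Sketch: a pattern that calls itself to match nested parentheses.
# The group calls itself through \g<group> for each inner pair.
BALANCED = /\A(?<group>\((?:[^()]|\g<group>)*\))\z/

# Hypothetical helper: true if the whole text is one balanced group.
def balanced?(text)
  !BALANCED.match(text).nil?
end

puts balanced?("(a(b)c)")
puts balanced?("(a(b)c")
```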
-----
Summary: in the near future you will have a lot of powerful new features in
Ruby's pattern matching facilities.
Wolfgang Nádasi-Donner