Andreas Launila
I'm trying to come up with a clean way to specify regexps in the integer
domain. I.e. instead of describing a pattern of characters (as in normal
regexps) one describes patterns of integers ("17 followed by 3 or 15"
rather than "'a' followed by 'b' or 'c'").
My main objective is to make the syntax intuitive to Ruby users. I have
been toying around with a few different approaches, but I'm not sure if
any meet the goal. How would you design the syntax for regexps in the
integer domain?
The syntax does not need to support many operators:
* Kleene star ('*' in character regexps)
* "At least once" ('+' in character regexps)
* Alternation ('|' in character regexps, i.e. "a|b" being "'a' or 'b'")
Below are some of the approaches one could take.
== String representation
Many Ruby users have used character regexps defined as strings, so it
would seem like a good idea to make integer regexps as similar as
possible. A delimiter would be needed though, since "17" could otherwise
mean "1 followed by 7" or just "17". One delimiter could for instance be
a blank space. The following is how the pattern "any number of (17, or 1
followed by 5), followed by 4711" could look.
IntRegexp.new('(17|1 5)*4711')
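To make the delimiter idea concrete, here is a minimal sketch of how such a string could be tokenized and matched. It assumes non-negative integers, whole-sequence matching, and a hypothetical helper module named IntPattern (none of this is an existing library). The trick it uses is to encode the integer sequence as text with a space after every integer, so that an ordinary character regexp can do the matching.

```ruby
# Hypothetical helper, only meant to illustrate the string representation.
module IntPattern
  # An integer, or one of the operators ( ) | * +. Whitespace only separates.
  TOKEN = /\d+|[()|*+]/

  # Split a pattern such as '(17|1 5)*4711' into integers and operators.
  def self.tokenize(source)
    source.scan(TOKEN)
  end

  # Translate the pattern into a character Regexp that works on sequences
  # encoded as "17 1 5 4711 " (every integer followed by one space).
  def self.compile(source)
    body = tokenize(source).map do |token|
      case token
      when /\A\d+\z/ then "(?:#{token} )" # one integer plus its delimiter
      when "("       then "(?:"           # group without capturing
      else token                          # ) | * + keep their usual meaning
      end
    end.join
    Regexp.new("\\A(?:#{body})\\z")
  end

  # Assumption: the pattern has to match the whole sequence.
  def self.match?(source, integers)
    compile(source).match?(integers.map { |i| "#{i} " }.join)
  end
end

IntPattern.tokenize('(17|1 5)*4711')
# => ["(", "17", "|", "1", "5", ")", "*", "4711"]
IntPattern.match?('(17|1 5)*4711', [17, 1, 5, 4711]) # => true
IntPattern.match?('(17|1 5)*4711', [1, 7, 4711])     # => false
```

Because each integer carries its own trailing space in the encoding, "17" and "1 7" can never be confused, which is the ambiguity the delimiter is there to remove.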
== Method combination
In this approach the regexp is gradually built up by invoking methods.
One major question is what the methods should be named.
The following is how the above example could look.
first_part = IntRegexp.new(17).or(IntRegexp.new(1).followed_by 5)
first_part.any_number_of_times.followed_by 4711
Or with some different method names:
first_part = IntRegexp.new(17) | (IntRegexp.new(1) + 5)
first_part.any_times + 4711
I'm unsure what sort of method names would strike a good balance between
verbosity and readability.
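For what it's worth, both naming styles can coexist through aliases. Below is a rough sketch of what a class behind the examples above might look like. The internals are entirely an assumption: each object holds a character-regexp fragment, and a sequence is matched by encoding it as text with a space after every integer. The at_least_once and match? methods are hypothetical additions that do not appear in the examples.

```ruby
# A sketch only; the internals are assumed, not an existing implementation.
class IntRegexp
  attr_reader :source # character-regexp fragment for this pattern

  def initialize(integer = nil, source: nil)
    @source = source || "(?:#{Integer(integer)} )"
  end

  # Allow plain integers wherever a pattern is expected.
  def self.wrap(value)
    value.is_a?(IntRegexp) ? value : new(value)
  end

  # Alternation: this pattern or the other one.
  def |(other)
    IntRegexp.new(source: "(?:#{source}|#{IntRegexp.wrap(other).source})")
  end

  # Sequence: this pattern followed by the other one.
  def +(other)
    IntRegexp.new(source: "(?:#{source}#{IntRegexp.wrap(other).source})")
  end

  # Kleene star.
  def any_times
    IntRegexp.new(source: "(?:#{source})*")
  end

  # "At least once" (hypothetical name).
  def at_least_once
    IntRegexp.new(source: "(?:#{source})+")
  end

  # The verbose names are aliases of the terse ones.
  alias_method :or, :|
  alias_method :followed_by, :+
  alias_method :any_number_of_times, :any_times

  # Assumption: the pattern has to match the whole sequence.
  def match?(integers)
    Regexp.new("\\A#{source}\\z").match?(integers.map { |i| "#{i} " }.join)
  end
end

first_part = IntRegexp.new(17) | (IntRegexp.new(1) + 5)
pattern = first_part.any_times + 4711
pattern.match?([17, 1, 5, 4711]) # => true
pattern.match?([1, 4711])        # => false
```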
== Blocks
TextualRegexp[1] provides a block-based interface for specifying normal
regexps. Perhaps something similar could be done for integer regexps?
The following is the syntax of TextualRegexp but with integers.
regexp do
  at_least_zero do
    any f do
      integer 17
      group{ integer 1; integer 5 }
    end
  end
  integer 4711
end
There's a fair amount of redundancy, since nothing other than "integer"
will ever be specified.
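As a rough idea of how a block interface like this could be wired up, here is a sketch built on instance_eval. It is not how TextualRegexp works internally (I have not checked), and the names IntPatternBuilder and int_regexp are made up for illustration. The alternation helper is assumed to be called any and to treat each item in its block as one alternative. As a simplification, the result is an ordinary character Regexp over sequences encoded with a space after every integer.

```ruby
# Hypothetical builder; every name here is an assumption for illustration.
class IntPatternBuilder
  attr_reader :parts

  def initialize
    @parts = []
  end

  # Run the block against a fresh builder and return the fragments it made.
  def self.collect(&block)
    builder = new
    builder.instance_eval(&block)
    builder.parts
  end

  # A single integer, encoded with its trailing space.
  def integer(value)
    @parts << "(?:#{Integer(value)} )"
  end

  # A sequence treated as one unit.
  def group(&block)
    @parts << "(?:#{self.class.collect(&block).join})"
  end

  # Kleene star over the sequence in the block.
  def at_least_zero(&block)
    @parts << "(?:#{self.class.collect(&block).join})*"
  end

  # Alternation: each item in the block is one alternative.
  def any(&block)
    @parts << "(?:#{self.class.collect(&block).join('|')})"
  end
end

# Assumption: the pattern has to match the whole encoded sequence.
def int_regexp(&block)
  Regexp.new("\\A#{IntPatternBuilder.collect(&block).join}\\z")
end

pattern = int_regexp do
  at_least_zero do
    any do
      integer 17
      group { integer 1; integer 5 }
    end
  end
  integer 4711
end

encode = ->(integers) { integers.map { |i| "#{i} " }.join }
pattern.match?(encode.call([17, 1, 5, 4711])) # => true
pattern.match?(encode.call([1, 4711]))        # => false
```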
[1] http://rubyforge.org/projects/texrex/