[ann] regexp engine 0.9

Simon Strandgaard

download:
http://rubyforge.org/frs/?group_id=18&release_id=422

play with regexp online:
http://neoneye.dk/regexp.rbx

RAA entry:
http://raa.ruby-lang.org/list.rhtml?name=regexp


About
=====

Here is a Regexp engine written entirely in Ruby.
It allows you to search in text with advanced search patterns.
It supports Perl5 syntax, plus some Perl6 syntax (more to
come in the future). It's fairly compatible with Ruby's native
regexp engine (GNU), and when run against the Rubicon
test suite, it passes 96.666% of the 1560 tests.

The implementation is simple and does not yet contain any optimizations,
so it is slow. At some point when optimizations
are in place, I plan to do a re-implementation in C++.
Because of the simplicity, the code should be easy to grasp
and extend with your own custom code.



Changes since 0.8
=================

Major refactoring of the allocation/deallocation scheme for mementos,
so that Context now acts as a caretaker. This has paved the way
for fixing a bug in lookahead inside repeat.
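
As a rough illustration of the memento/caretaker arrangement described above, here is a minimal sketch. Every name in it (Memento, checkpoint, rollback, the position and captures fields) is hypothetical and chosen for the example; the engine's real Context class may look quite different.

```ruby
# Hypothetical names throughout; a sketch of the memento pattern,
# not the engine's actual classes.
class Memento
  attr_reader :position, :captures

  def initialize(position, captures)
    @position = position
    @captures = captures.dup   # snapshot, so later changes do not leak in
  end
end

class Context
  attr_accessor :position, :captures

  def initialize
    @position = 0
    @captures = []
    @mementos = []             # the caretaker's stack of saved states
  end

  # Save the current state on the stack.
  def checkpoint
    @mementos.push(Memento.new(@position, @captures))
  end

  # Backtrack: drop the current state and restore the last snapshot.
  def rollback
    memento = @mementos.pop
    @position = memento.position
    @captures = memento.captures
  end
end

ctx = Context.new
ctx.checkpoint
ctx.position = 3
ctx.captures << "abc"
ctx.rollback
p ctx.position   # => 0
p ctx.captures   # => []
```

Keeping the stack inside Context means the matching nodes never own saved state themselves, which is one plausible reason such a refactoring would make backtracking through a lookahead inside a repeat easier to get right.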

MatchData extracted the subcaptures wrongly because it used
string[range] where the range was 0..-1, so the whole
string got extracted. Now it uses #slice, which solved the problem.
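
To see why a range of 0..-1 is a problem, here is a small illustration in plain Ruby. The (start, length) call to #slice is only an assumption about the shape of the fix; the announcement says that #slice is used, not how it is called. Note that #slice given a range behaves like [], so the arguments matter more than the method name.

```ruby
text = "hello world"

# A negative end index counts from the end of the string,
# so 0..-1 covers every character.
whole = text[0..-1]
p whole            # => "hello world"

# Assumption: an empty capture expressed as start and length.
# A length of zero yields an empty string.
empty = text.slice(0, 0)
p empty            # => ""

# An exclusive range with equal bounds is also empty.
p text[0...0]      # => ""
```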

Gained 0.66% more, so it now passes 96.66%. Version 0.8 only
passed 96.025 % of the rubicon tests (1560).
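
Assuming both percentages refer to the same 1560 tests, they can be turned back into approximate pass counts. The rounding step is an assumption, since the announcement gives only percentages.

```ruby
total = 1560

# Percentages as quoted in the announcement.
passed_now    = (total * 96.666 / 100).round
passed_before = (total * 96.025 / 100).round

puts passed_now                  # => 1508
puts passed_before               # => 1498
puts passed_now - passed_before  # => 10
```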



Question
========

Who on this list has interest in the Ruby-in-Ruby project?
 
ts

S> Gained 0.66% more, so it now passes 96.66%. Version 0.8 only
S> passed 96.025 % of the rubicon tests (1560).

Be careful with this rubicon test.

I can tell you how it was built: I retrieved tests from P languages,
RX, ..., adapted them for ruby, and ran them.

Each time I found a difference, I tried to see whether I could call it
a bug or whether it was just a different choice made by ruby.

Because I was never able to say that it was a bug, I adapted the test
to make ruby pass all its tests, even if sometimes I was surprised by the
result :))


Guy Decoux
 
Simon Strandgaard

ts said:
> S> Gained 0.66% more, so it now passes 96.66%. Version 0.8 only
> S> passed 96.025 % of the rubicon tests (1560).
>
> Be careful with this rubicon test.
>
> I can tell you how it was built: I retrieved tests from P languages,
> RX, ..., adapted them for ruby, and ran them.
>
> Each time I found a difference, I tried to see whether I could call it
> a bug or whether it was just a different choice made by ruby.
>
> Because I was never able to say that it was a bug, I adapted the test
> to make ruby pass all its tests, even if sometimes I was surprised by the
> result :))

Agreed, there are a bunch of oddities in the rubicon test data.
At the moment I have ~50 tests which are kind of odd.

I don't intend to be 100% bug-for-bug compatible. I just want to be compatible
with the good things in GNU.

There are ~20 tests where GNU has some empty sub-captures, where I have
chosen that these sub-captures should actually contain some text.
At least these 20 will never pass, so the theoretical limit is
(1560 - 20) / 1560 => 98.7%
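
The ceiling can be checked with a couple of lines of Ruby. The variable names are just for the example, and 20 is the approximate figure given above.

```ruby
total     = 1560
unfixable = 20     # approximate number of tests expected never to pass

# Multiply by 100.0 first so the division is done in floating point.
ceiling = (total - unfixable) * 100.0 / total
puts format("%.1f%%", ceiling)   # => 98.7%
```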

Besides Rubicon, I have my own blackbox tests, 341 to be precise;
these are somewhat more verbose than the rubicon tests.
Furthermore, I have some whitebox tests too.

I use the original rubicon 'regexp.test' file, unmodified. In order to use
it I must do heavy substitution, because I cannot overload $&, $1, $2...

def make_result(registers, repl)
  s = repl.clone
  reg = registers
  reg.map! { |i| i || "" }     # unmatched groups become empty strings
  reg.fill("", reg.size..12)   # pad so the indexes used below always exist
  s.gsub!(/([^\\])\\\#\{\$&\}/, '\1REG0') # hack to find slash prefixes
  s.gsub!(/([^\\])\\\#\{\$1\}/, '\1REG1') # hack to find slash prefixes
  s.gsub!(/\#\{\$&\}/, reg[0])
  1.upto(10) { |i| s.gsub!(/#\{\$#{i}\}/, reg[i]) }
  s.gsub!(/\\\\/, '\\')
  s.gsub!(/REG0/, '#{$&}') # hack to replace slash prefixes
  s.gsub!(/REG1/, '#{$1}') # hack to replace slash prefixes
  s
end

Regexp as we all love it.. It works :)
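
For readers who want to try the substitution idea in isolation, here is a reduced sketch. The helper name expand_placeholders is hypothetical and not part of the engine. It uses the block form of gsub, so backslashes inside a capture are inserted literally, and it deliberately skips the escaped `\#{$&}` case that the version above handles with the REG0/REG1 hack.

```ruby
# Hypothetical helper, not part of the engine: expand "#{$&}" and "#{$N}"
# placeholders from an array of captures (index 0 is the whole match).
def expand_placeholders(template, registers)
  template.gsub(/\#\{\$(&|\d+)\}/) do
    key = Regexp.last_match(1)
    index = key == "&" ? 0 : key.to_i
    registers[index] || ""   # unmatched groups expand to ""
  end
end

# Single quotes keep Ruby from interpolating the template itself.
puts expand_placeholders('#{$&}-#{$1}-#{$2}', ["ab", "a", nil])
# => ab-a-
```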


BTW: thanks for the 1560 tests ;-)
 
Mark Sparshatt

Simon said:
> Question
> ========
>
> Who on this list has interest in the Ruby-in-Ruby project?

Am I right in thinking that this is a Ruby interpreter written in Ruby,
sort of a Ruby version of PyPy?

I'd be interested in seeing something like that, though I probably won't
have time to be able to help with it.
 
Jean-Hugues ROBERT

> Question
> ========
> Who on this list has interest in the Ruby-in-Ruby project?

Such a beast would ease the bootstrapping of Ruby on whatever
VM it is first supported. As such I do have interest in it, but
I am not an active participant if that is what you are looking for.

Yours,

Jean-Hugues
 
Simon Strandgaard

Eric Hodel said:
> Ryan Davis and myself have been toying with this on and off at our
> weekly hacking sessions in Seattle.


Cool. What's the status of your project?
Can we do a checkout and play with it?


I know only a little about T-diagrams. I think I have some weak
ideas about why it matters to make Ruby-in-Ruby:
it makes it easier to spawn a new Ruby interpreter, which can self-translate.
But doesn't this only apply to compilers, not interpreters?

I am curious whether there are other useful things about Ruby-in-Ruby.
 
