[ann] regexp-engine-0.8, perl5 + some perl6

Simon Strandgaard

download:
http://rubyforge.org/frs/?group_id=18&release_id=422

play with regexp online:
http://neoneye.dk/regexp.rbx

RAA entry:
http://raa.ruby-lang.org/list.rhtml?name=regexp


About
=====

Here is a regexp engine written entirely in Ruby.
It lets you search text with advanced search patterns.
It supports Perl5 syntax plus some Perl6 syntax (more to
come in the future). It is fairly compatible with Ruby's native
regexp engine (GNU), and when run against the Rubicon
testsuite it passes 1498 of 1560 tests (96.025%).

The implementation is simple and has no optimizations yet,
so it is slow. At some point, when optimizations are in
place, I plan to do a re-implementation in C++.
Because of its simplicity, the code should be easy to grasp
and to extend with your own custom code.


Goals
=====

Be compatible with Ruby's GNU regexp engine (perl5 syntax).
DONE. This goal is fulfilled.

Support Perl6 regexp syntax (even before perl6 is finished).
The new syntax is less obfuscated than the old perl5 syntax.
Perhaps also make a converter between perl5 <-> perl6 syntax.
Not fulfilled yet, but I am working on it.

The AEditor project needs a flexible regexp-engine for doing
lexing, so that text can get syntax-colored.
Future.

The Ruby-in-Ruby project needs a regexp engine. This engine
will hopefully become suitable.
Optional.

Explain-regexp: output a verbose overview of what each
opcode in the regexp does.
Optional.


Status
======

The project has completed the 'make it work' phase, and has
entered the 'make it right' phase, where I will focus on
optimization, so that decent speed can be achieved.

Running the engine against the Rubicon testsuite yields
pass=1498, fail=62, pass/total=96.025%
The failing tests mostly exercise obscurities in the GNU engine.
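
As a quick check of the figures above, here is a minimal Ruby sketch that recomputes the pass rate from the reported counts. The variable names are only illustrative, and the truncation to three decimals is an assumption about how the 96.025% figure was produced (ordinary rounding would give 96.026%).

```ruby
# Counts taken from the Rubicon run reported above.
passed = 1498
failed = 62
total  = passed + failed            # 1560 tests in all

ratio = passed * 100.0 / total      # about 96.0256

# Assumption: the reported figure is truncated, not rounded.
truncated = (ratio * 1000).floor / 1000.0

puts format("pass=%d, fail=%d, pass/total=%.3f%%", passed, failed, truncated)
```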

Besides that, there are 402 tests, which do both whitebox
and blackbox testing. However, in order to run the tests
it is necessary to fetch Michael Granger's Test::Unit::Mock
package.


License
=======

Ruby's license.


Acknowledgements
================

Mark Sparshatt
* Came up with the initial idea of extending with perl6.
* NewMatchData class, NewRegexp class.

Guy Decoux/Dave Thomas
* I stole the part of the Rubicon testsuite which exercises regexps.


Contact
=======

In case you find a bug or have suggestions for improvements,
then feel free to mail me.

Simon Strandgaard <[email protected]>


Thanks for your patience.
 
Charles Comstock

Does it include the perl6 embedded grammars in regex stuff?
Charlie
 
Simon Strandgaard

Charles Comstock said:
Does it include the perl6 embedded grammars in regex stuff?
Charlie

Not yet. There is only a little perl6 support so far.
I plan to support perl6 fully.

However, there are many tasks for me to do. You are very welcome to
contribute to the project ;-)
 
Simon Strandgaard


Questions for those who have tried the package out, played with the homepage,
or are just following the discussion ;-)


What do you think about this perl6 regexp thing? Is it something Ruby needs?
What do you want to use perl6 syntax for?

The engine is going to support inline code inside regexps.
Is this a feature you would use?

Why did you play with regexps on the demo site?

Who has tried out this package? Why did you choose to do that?
Did it install correctly?

Did you browse the source code?

BTW: What happened to the Ruby-in-Ruby project?
 
ts

S> The engine is going to support inline code inside regexps.
S> Is this a feature you would use?

Don't forget this

"(?{ code })"

[...]

For reasons of security, this construct is forbidden if the
regular expression involves run-time interpolation of
variables, unless the perilous "use re 'eval'" pragma has been
used (see re), or the variables contain results of "qr//"
operator (see "qr/STRING/imosx" in perlop).
[...]

and they are not really paranoid on p5p


Guy Decoux
 
Simon Strandgaard

Yes, security is an issue here. I think one must supply an option when
one wishes inline code to be executed. Or perhaps rely on the $SAFE level?

code = "remember position; puts 'hello world'"
re = NewRegexp.new("xy(?{#{code}}).{42}z", INLINE_CODE)
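
The opt-in idea can be sketched in a few lines of Ruby. This is only an illustration of the guard, not the engine's code: `GuardedPattern`, `InlineCodeForbidden` and the `inline_code:` option are hypothetical names invented here, the scanner does not handle nested braces, and nothing is ever evaluated.

```ruby
# Hypothetical sketch: GuardedPattern and the inline_code: option are
# invented for illustration and are not part of regexp-engine.
class GuardedPattern
  class InlineCodeForbidden < StandardError; end

  # Finds Perl-style inline code blocks such as (?{ puts 'hi' }).
  # Nested braces inside a block are not handled by this toy scanner.
  INLINE_BLOCK = /\(\?\{(.*?)\}\)/m

  attr_reader :source, :code_blocks

  def initialize(source, inline_code: false)
    @code_blocks = source.scan(INLINE_BLOCK).flatten
    # Refuse patterns that carry code unless the caller opted in.
    if !@code_blocks.empty? && !inline_code
      raise InlineCodeForbidden,
            "pattern contains #{@code_blocks.size} inline code block(s); " \
            "pass inline_code: true to allow them"
    end
    @source = source
  end
end

code = "puts 'hello world'"
pattern = "xy(?{#{code}}).{42}z"

begin
  GuardedPattern.new(pattern)            # no opt-in, so this is rejected
rescue GuardedPattern::InlineCodeForbidden => e
  puts "rejected: #{e.message}"
end

allowed = GuardedPattern.new(pattern, inline_code: true)
puts "accepted with blocks: #{allowed.code_blocks.inspect}"
```

Making the default refuse such patterns means an interpolated string cannot smuggle code in unnoticed. As for the other option, `$SAFE` lost its special meaning in Ruby 3.0, so on current Rubies only an explicit option of this kind would still apply.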
 
