Writing a "String#jindex" method to do the same for "index" as "String#jlength" does for "length"

Greg Hurrell

So I've been using the 'jcode' library to get UTF-8 support and wanted
to add a convenience method called "jindex" that would work with
multi-byte characters in much the same way that "jlength" does.

The basic idea is to call the real "index" method to do the work, and
then convert the returned byte-count index into a character-count
index before returning it to the user. This works fine, but the
problem is that I can't find a way to propagate $~ back to the user.
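
To make the byte-to-character conversion concrete, here is a minimal sketch. It is not the actual jindex code: the helper names are hypothetical, it only handles plain substrings, and it assumes a newer Ruby where String#b and String#byteslice exist. Under Ruby 1.8 with jcode, the conversion step would presumably look more like str[0, byte_offset].jlength.

```ruby
# Hypothetical helper: turn a byte offset into a character offset by
# counting the characters in the bytes that precede it.
def byte_offset_to_char_index(str, byte_offset)
  return nil if byte_offset.nil?
  str.byteslice(0, byte_offset).length
end

# Hypothetical stand-in for "jindex", substring patterns only.
# Searching the binary copies yields a byte offset, which is then
# converted into a character offset.
def jindex_sketch(str, substring)
  byte_offset = str.b.index(substring.b)
  byte_offset_to_char_index(str, byte_offset)
end

jindex_sketch("h\u00e9llo", "l")   # => 2 (the byte offset would be 3)
```

Note that nothing here touches $~, since the pattern is a plain string rather than a regular expression.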

That is, after calling this:

'foo'.index /foo/

$~ is set and $~[0] is "foo".

But when I do this:

'foo'.jindex /foo/

$~ is nil. The following short irb session demonstrates what's
happening in the two cases:

irb(main):001:0> class StringTest < String
irb(main):002:1> def index foo
irb(main):003:2> r = super
irb(main):004:2> end
irb(main):005:1> end
=> nil
irb(main):006:0> StringTest.new('foo').index /o/
=> 1
irb(main):007:0> $~[0]
NoMethodError: undefined method `[]' for nil:NilClass
from (irb):7
from :0
irb(main):008:0> String.new('foo').index /o/
=> 1
irb(main):009:0> $~[0]
=> "o"

So I guess this is a more general question. If something inside a
method sets $~, how do you propagate this back to the caller? There
was a thread called "How to preserve $~ when extending Regexp#match"
last year that touched on a similar issue
(http://groups.google.com/group/comp.lang.ruby/browse_thread/thread/73ccf213a9c0802d/a93554cd466acb44).
This leads me to believe that the "special_local_set" function is
being used under the hood by the "index" method and there will be no
way for my "jindex" method to mimic its behaviour without also getting
down into the C extension level, something I am not too keen to do as
it is probably a bit over my head for now...

Are my worst fears right on this one?

Cheers,
Greg
 
Greg Hurrell

For the record, I decided to bite the bullet and try writing my first
C extension for Ruby. It works and uses rb_backref_get and
rb_backref_set to ensure that $~ gets correctly propagated back to the
caller. If anyone knows of a clean, pure-Ruby way to do this (without
a C extension) please let me know as it would be better from a
portability/deployment perspective.
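
One possible pure-Ruby workaround, sketched here only as an illustration, is to sidestep $~ and hand the MatchData back explicitly. This does not set $~ in the caller, so it is not a drop-in replacement for the C extension. The method name jindex_with_match is hypothetical, and the sketch assumes a Ruby where String#length counts characters; under Ruby 1.8 with jcode, jlength would presumably take the place of length.

```ruby
# Hypothetical helper: returns [character_index, match_data] so the
# caller can inspect the match without relying on $~.
def jindex_with_match(str, pattern)
  match = pattern.match(str)
  return [nil, nil] if match.nil?
  # The number of characters before the match is the character index.
  [match.pre_match.length, match]
end

index, match = jindex_with_match("h\u00e9llo", /l+/)
index      # => 2
match[0]   # => "ll"
```

The caller then uses match[0], match[1] and so on where it would otherwise have used $~.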

Cheers,
Greg
 
