[ANN] rocaml: Ruby extensions in Objective Caml

Mauricio Fernandez

rocaml allows you to write Ruby extensions in Objective Caml.

I never seem to manage to release things when I should, so here's a
pre-release announcement to let you know about this so you can play with it
before the actual release, which could take longer than necessary.

http://eigenclass.org/repos/rocaml/head/

Young as it is, rocaml is very usable and the generated extensions are
reliable, since they enforce type safety and handle exceptions both
in Ruby and OCaml (OCaml exceptions are passed to Ruby).

Developing Ruby extensions with rocaml is easier and more convenient than
writing a plain old C extension since rocaml performs Ruby<->OCaml conversions
for a wide range of types, including abstract types and arrays, tuples,
variants and records of values of any supported type (e.g. arrays of arrays of
variants of tuples of ...).

Making an extension with rocaml involves two steps:
* implementing the desired functionality in Objective Caml, and registering
  the functions to be exported
  (using Callback.register : string -> 'a -> unit)
* creating the extconf.rb file (just modify the sample extconf.rb distributed
  with rocaml) defining the interface of your Objective Caml code.

** At no point is there any need to write a single line of C code when **
** using rocaml. **

The mandatory trivial example
=============================

Let's create an extension with a 'fib' function.
Here's the OCaml code:

let rec fib n = if n < 2 then 1 else fib (n-1) + fib (n-2)

let _ = Callback.register "Fib.fib" fib

Here's the interface declaration in your extconf.rb:

Interface.generate("fib") do
  def_module("Fib") do
    fun "fib", INT => INT
  end
end

That's it. Running extconf.rb will generate all the required wrappers and
make will link them against your ml code, creating a normal Ruby extension
that can be used simply with

require 'fib'
p Fib.fib 10

Set of strings using a RB tree
==============================

Here's a simple set based on an RB tree, specialized for strings (see
examples/tree for how to create several classes from a single polymorphic
structure). The (unoptimized) RB tree takes only ~30 LoC, but lookup is 3X
faster than with RBTree, which takes >3000 lines in total and more than 250
lines for the equivalent functionality alone, not counting the *manually
written* wrappers for the underlying C data structure.

This shows how rocaml handles complex types, including variant and recursive
types.

Given this interface definition:

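Interface.generate("tree") do
  # assumed declaration of the color variant, inferred from the OCaml type
  # `color = R | B` and from the dump output shown further down
  color_t = sym_variant("color_t") do |t|
    constant :R
    constant :B
  end

  string_tree_t = sym_variant("string_tree_t") do |t|
    constant :Empty
    non_constant :Node, TUPLE(color_t, t, STRING, t)
  end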

  def_class("StringRBSet") do |c|
    t = c.abstract_type
    fun "empty", UNIT => t, :aliased_as => "new"
    fun "make", string_tree_t => t

    method "add", [t, STRING] => t
    method "mem", [t, STRING] => BOOL, :aliased_as => "include?"
    method "dump", t => string_tree_t
    method "iter", t => t, :aliased_as => "each", :yield => [STRING, UNIT]
  end
end

You can use the generated extension as follows (you can find the OCaml code
below):

require 'tree'
s = StringRBSet.new
s2 = s.add "foo"  # the RB set is a functional, i.e. persistent,
                  # data structure
# see how rocaml handles conversions for recursive variant types
p s.add("foo").dump
p s.add("foo").add("bar").dump

The above will print
[:Node, [:B, :Empty, "foo", :Empty]]
[:Node, [:B, [:Node, [:R, :Empty, "bar", :Empty]], "foo", :Empty]]

showing you the structure of the RB tree.


That's it for now, enjoy.
Further updates on eigenclass.org.

PS:
For the sake of completeness, here's the OCaml code. You can find the full
example in examples/tree.

exception Found

module RBSet =
struct
  type color = R | B
  type 'a t = Empty | Node of color * 'a t * 'a * 'a t

  let empty = Empty

  (* smaller elements live in the left subtree, larger ones in the right *)
  let rec mem x = function
      Empty -> false
    | Node(_, l, y, r) ->
        if x < y then mem x l else if x > y then mem x r else true

  (* restore the red-black invariants after an insertion *)
  let balance = function
      B, Node(R, Node(R, a, x, b), y, c), z, d
    | B, Node(R, a, x, Node(R, b, y, c)), z, d
    | B, a, x, Node(R, Node(R, b, y, c), z, d)
    | B, a, x, Node(R, b, y, Node(R, c, z, d)) ->
        Node(R, Node(B, a, x, b), y, Node(B, c, z, d))
    | (c, a, x, b) -> Node (c, a, x, b)

  let add x t =
    let rec ins = function
        Empty -> Node(R, Empty, x, Empty)
      | Node(color, a, y, b) ->
          if x < y then balance (color, ins a, y, b)
          else if x > y then balance (color, a, y, ins b)
          else raise Found
    in try match ins t with
        Node (_, a, y, b) -> Node(B, a, y, b)
      | Empty -> assert false (* ins always returns Node _ *)
    with Found -> t (* element already present: return the set unchanged *)

  (* in-order traversal *)
  let rec iter f = function
      Empty -> ()
    | Node(_, l, x, r) -> iter f l; f x; iter f r
end


external intset_yield : int -> unit = "IntRBSet_iter_yield"
external stringset_yield : string -> unit = "StringRBSet_iter_yield"

let identity x = x

open Callback
let _ =
  let def_set t =
    let r name f = register (t ^ "RBSet" ^ "." ^ name) f in
      r "empty" (fun () -> RBSet.empty);
      (* Ruby passes the set first, so the arguments are swapped here *)
      r "add" (fun t x -> RBSet.add x t);
      r "mem" (fun t x -> RBSet.mem x t);
      r "dump" identity;
      r "make" identity;
  in
    List.iter def_set ["Int"; "String"];
    register "IntRBSet.iter" (RBSet.iter intset_yield);
    register "StringRBSet.iter" (RBSet.iter stringset_yield);
 
benjohn

Hi Mauricio,

a quick thought that I'm pretty sure you've already had: I notice that
you define the interface in Ruby, and also define what should be
exported in OCaml. Couldn't you, say, just define it in OCaml (where the
type signatures will be known completely, I guess?), and have the Ruby
interface generated from this?

I've been meaning to learn OCaml; if I do, I'll definitely give this a
look.

Thanks,
Benjohn
 
Mauricio Fernandez

> Hi Mauricio,
>
> a quick thought that I'm pretty sure you've already had: I notice that
> you define the interface in Ruby, and also define what should be
> exported in OCaml. Couldn't you, say, just define it in OCaml (where the
> type signatures will be known completely, I guess?), and have the Ruby
> interface generated from this?

Even though the types are known by the compiler, human intervention is needed
at some point because:
* we have to define what needs to be exported
* the naming and parameter passing conventions might differ (e.g. the
  data structure often being given after the element to operate with in
  functional data structures)
* polymorphic functions have too broad a type, and the concrete
  type(s) you want must be specified. For instance, in the RB tree example,
  two classes are generated for sets of strings and ints. Their instance
  methods correspond to the same polymorphic OCaml functions, but providing
  the desired concrete types allows the wrapper generated by rocaml to perform
  the necessary type checking and Ruby->OCaml conversions. The alternative
  would be wrapping Ruby values in OCaml with an abstract universal type, but
  introducing dynamic typing defeats the purpose of writing an extension in
  OCaml to some extent (it'll be slower and the type system will not help you
  that much).
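
To make the last two points concrete, here is a minimal sketch in plain OCaml.
It is not taken from rocaml: the sorted-list "set" merely stands in for the RB
tree, and the registered names IntListSet.mem and StringListSet.mem are made
up for illustration. It shows one polymorphic function being pinned to two
concrete types, with the container moved to the first argument position in
the process.

```ocaml
(* A polymorphic membership test over a sorted list; it stands in for
   RBSet.mem and takes the element first, then the container. *)
let rec mem x = function
  | [] -> false
  | y :: rest -> if x = y then true else if x < y then false else mem x rest

(* The type annotations pin the polymorphic function to one concrete type
   each; the anonymous functions also swap the argument order so that the
   container comes first. *)
let int_mem : int list -> int -> bool = fun t x -> mem x t
let string_mem : string list -> string -> bool = fun t x -> mem x t

(* Hypothetical registration names, chosen only for this sketch. *)
let () =
  Callback.register "IntListSet.mem" int_mem;
  Callback.register "StringListSet.mem" string_mem
```

Both registered values share the same implementation; only the declared types
differ, which is the information a wrapper generator would need from a human.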

That said, it would be possible to encode all that information in a .ml file,
instead of splitting it into an OCaml part (which functions are to be
exported) and another in Ruby, in extconf.rb (how that functionality is
accessible from Ruby, and the method signatures). It's a bit harder to
implement though, as building the extension would involve an extra stage to
compile the file holding that information and extract it in order to generate
the wrapper. Going the other way around, specifying it all in extconf.rb and
generating the .ml code that registers the functions from it would be very
easy to implement, but would force one to use named functions(1).
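
As a rough illustration of that second approach (this is not something rocaml
does, and registration_line is a hypothetical helper), the generated .ml code
could be little more than one Callback.register line per exported function.
The generator only has names to work with, which is why the exported
functions would have to be named ones:

```ocaml
(* Hypothetical generator helper: emits one line of OCaml source that
   registers an existing, named function under the name Ruby would use. *)
let registration_line ~ruby_name ~ocaml_function =
  Printf.sprintf "let () = Callback.register %S %s" ruby_name ocaml_function

(* The second entry is made up for the sake of the example. *)
let generated =
  List.map
    (fun (ruby_name, ocaml_function) ->
       registration_line ~ruby_name ~ocaml_function)
    [ ("Fib.fib", "fib"); ("StringRBSet.iter", "string_iter") ]

let () = List.iter print_endline generated
```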

In the meantime, I don't find the need to register the functions in OCaml too
onerous, as it's at most one line per method, and the extra degree of freedom
in the OCaml -> Ruby mapping is quite convenient (I can do e.g. parameter
reordering with an anonymous function).

(1) another benefit from that would be the possibility to check that the
specified types are included in those from the .cmi (a file generated by OCaml
with interface information)
 
benjohn

Mauricio:
> *snip*
> In the meantime, I don't find the need to register the functions in OCaml too
> onerous, as it's at most one line per method, and the extra degree of freedom
> in the OCaml -> Ruby mapping is quite convenient (I can do e.g. parameter
> reordering with an anonymous function).
>
> *snip*

:) Thanks for the reasoned reply. I've got ocaml installed, and I've
started reading the manual. It feels somewhat like it is to functional
programming as c is to imperative programming at the moment, which is
interesting - but could be a useless thought ;-) I don't like the syntax
a lot at the moment, but that's just lack of experience. I'm very
intrigued by the OpenGL bindings.
 
Phil Tomson

> rocaml allows you to write Ruby extensions in Objective Caml.
>
> I never seem to manage to release things when I should, so here's a
> pre-release announcement to let you know about this so you can play with it
> before the actual release, which could take longer than necessary.
>
> http://eigenclass.org/repos/rocaml/head/
>
> Young as it is, rocaml is very usable and the generated extensions are
> reliable, since they enforce type safety and handle exceptions both
> in Ruby and OCaml (OCaml exceptions are passed to Ruby).

Nice. I've been learning OCaml for the last few months. I'll
definitely give this a try.

Phil
 
Mauricio Fernandez

> :) Thanks for the reasoned reply. I've got ocaml installed, and I've
> started reading the manual. It feels somewhat like it is to functional
> programming as c is to imperative programming at the moment, which is
> interesting - but could be a useless thought ;-)

If you mean by this that performance is easy to predict, you're very right :)

Allow me to expand a bit on this. One of the distinct advantages of OCaml is that
the compiler doesn't perform any deep magic (for instance, it doesn't do
loop-invariant code motion, IIRC). How can this be a good thing? It means that
it's very easy to get from e.g. guessing why so much time is spent in the GC
to giving the finishing touches to your code, because you can predict the
effect of your modifications quite easily (and OCaml also has nice profiling
tools for both native code and bytecode). OCaml's excellent performance is a
testament to the effectiveness of a good basic compilation strategy. So you
can reword the initial statement in a less enigmatic way: Objective Caml
*doesn't need* deep magic in the compiler to yield good performance.

This is especially true when you compare OCaml to languages with a lazy
evaluation discipline like Haskell. I don't have any sizeable experience with
the latter, but I've often heard about the difficulty of predicting the
performance of code involving lazy evaluation, even for seasoned programmers,
like a single line change *mysteriously* turning some O(n) code into O(exp(n))
or the other way around. This caught my attention when I read it in Okasaki's
book[1], which I can't recommend enough:

"Historically, the most common technique for analyzing lazy programs has been
pretending that they are actually strict".

Okasaki introduces a basic framework to perform such analyses, but the fact
remains that eager evaluation is easier to understand in that regard.
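
A small sketch of the difference, written in OCaml with its explicit Lazy
module rather than in Haskell (so it only approximates what a lazy language
does implicitly; the counter and the names are invented for the example).
With an eager binding you can read off where the cost is paid; with a
suspension you have to know where, and whether, it gets forced:

```ocaml
(* Counts how many times the "expensive" computation actually runs. *)
let evaluations = ref 0

let expensive () =
  incr evaluations;
  21 * 2

(* Eager (strict) binding: the cost is paid right here, exactly once. *)
let eager = expensive ()

(* Lazy binding: nothing runs until the value is forced. *)
let delayed = lazy (expensive ())

let runs_before_force = !evaluations

(* Forcing runs the suspension once; later forces reuse the cached result. *)
let first = Lazy.force delayed
let second = Lazy.force delayed

let runs_after_force = !evaluations
```

Here runs_before_force only counts the eager call, and forcing the suspension
twice adds a single further evaluation.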

> I don't like the syntax a lot at the moment, but that's just lack of
> experience. I'm very intrigued by the OpenGL bindings.

There are a few problems with the original syntax, but nothing that cannot be
solved with a couple parentheses here and there (for instance, with nested
pattern matching expressions). There is an alternative syntax ("revised
syntax") which addresses these "ambiguities" by adding some punctuation and
differentiating constructions that are arguably too hard to tell apart in the
original syntax, but it doesn't seem to be widely used (I prefer the original
one myself, but it's probably because it's the first one I was exposed to).
camlp4 can take code in one syntax and rewrite it using the other, so you
don't have to decide upfront what you will be using in the end.
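
The nested pattern matching case looks like this in the original syntax (the
types and names are made up for the example). Without the parentheses around
the inner match, the final `| Square` case would be parsed as one more case of
the inner match instead of the outer one:

```ocaml
type shape = Circle | Square
type size = Small | Large

let describe shape size =
  match shape with
  | Circle ->
      (* the parentheses delimit the inner match *)
      (match size with
       | Small -> "small circle"
       | Large -> "large circle")
  | Square -> "square"
```

Wrapping the inner match in begin ... end works just as well as parentheses.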

[1]
Purely Functional Data Structures by Chris Okasaki, Cambridge University
Press, 1998.

You can find the PhD thesis on which that book was based at
http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf
 
