Jon Harrop said:
Let's do a single-threaded benchmark first.
Symbolic rewriting would make a good benchmark. Here is one:
http://www.lambdassociates.org/studies/study10.htm
Try implementing this using reference counting and we can see how it
compares (probably on a trickier rewrite). Replace the arbitrary-precision
ints with doubles.
The (old) machine I read news on is not nearly so powerful as Mark's
so for reference here's the time for a slightly modified version[1] of
the O'Caml code :-
$ ocamlopt -v
The Objective Caml native-code compiler, version 3.08.2
Standard library directory: /usr/lib/ocaml/3.08
$ ocamlopt simplify.ml -o simplify
$ time ./simplify
Took 5.020000s
real 0m5.137s
user 0m5.021s
sys 0m0.005s
Now for C code that uses naive reference counting :-
$ cc -v
Reading specs from /usr/lib/gcc-lib/i486-linux/3.3.5/specs
Configured with: ../src/configure -v --enable-languages=c,c++,java,f77,pascal,objc,ada,treelang --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared --with-system-zlib --enable-nls --without-included-gettext --enable-__cxa_atexit --enable-clocale=gnu --enable-debug --enable-java-gc=boehm --enable-java-awt=xlib --enable-objc-gc i486-linux
Thread model: posix
gcc version 3.3.5 (Debian 1:3.3.5-8ubuntu2.1)
$ cc -O3 -Wall simplify.c -o simplify
$ time ./simplify
real 0m16.757s
user 0m16.553s
sys 0m0.003s
$
That's over 3x slower than O'Caml so O'Caml/GC is the winner, right?
We'll if your only choice is O'Caml or (naive) reference counting for
this problem then yes. However, there are other approaches. For
example, the following is the time for C code that uses a hand-coded
version of what a sufficiently smart compiler that implemented
linear/unique types could produce :-
$ time ./simplify
real 0m2.812s
user 0m2.751s
sys 0m0.002s
So on my machine naive reference counting is 3x slower than O'Caml GC
but the linear version is 2x faster than O'Caml. Should we conclude
that (O'Caml) GC and naive reference counting both suck?
My conclusion is that while the benchmark clearly measures something
it isn't clear to me that what it measures is interesting/useful. It
may be possible to make it useful in which case the benchmark should
also track peak/average heap size. At present that's pointless
because the the test expression is so small that the working set for
each iteration should only be a few hundred bytes. This will fit in
the nursery of any generational collector or only require a copying
collector to traverse a few hundred bytes. That's very GC friendly.
------------------
[1] Compiling the code from the site gave the following :-
$ ocamlopt simplify.ml -o simplify
No implementations provided for the following modules:
Num referenced from simplify.cmx
Rather than work out if I'm suffering from using an out of date
O'Caml or it just isn't installed right, I just changed the code to
use a good old data type and that works fine :-
$ cat simplify.ml
type expr = Int of int | Var of string | Add of expr * expr | Mul of expr * expr;;
let int n = (Int n);;
let ( +: ) f g = Add(f, g) and ( *: ) f g = Mul(f, g);;
let test_expr =
Var "x" *: ((int 12 *: int 0 +: (int 23 +: int 8)) +: Var "y");;
let time f x =
let t = Sys.time() in
let f_x = f x in
Printf.printf "Took %fs\n" (Sys.time() -. t);
f_x;;
let rec loop n f x = if n=1 then f x else (ignore(f x); loop (n-1) f x);;
let rec ( +: ) f g = match f, g with
| Int n, Int m -> Int (n + m)
| (Int 0), e | e, (Int 0) -> e
| f, Add(g, h) -> f +: g +: h
| f, g -> Add(f, g);;
let rec ( *: ) f g = match f, g with
| Int n, Int m -> Int (n * m)
| (Int 0), e | e, (Int 0) -> (Int 0)
| (Int 1), e | e, (Int 1) -> e
| f, Mul(g, h) -> f *: g *: h
| f, g -> Mul(f, g);;
let rec simplify = function
| Int _ | Var _ as f -> f
| Add (f, g) -> simplify f +: simplify g
| Mul (f, g) -> simplify f *: simplify g;;
time (loop 10000000 simplify) test_expr;;