Isaac said:
> I'm a bit confused about what you might mean, help me understand.
Absolutely!
> Do you mean small benchmark programs will be "much slower" when run
> once rather than run 100 times? How much slower - 0.1x, 10x, 1000x?
> A description of JRuby internals will help here.
JRuby starts running almost all code in interpreted mode right now. This
is partly because the compiler is incomplete and can't handle all
syntax, but also partly because parsing + compilation + classloading
costs more than parsing alone - frequently so much more that it
outweighs any performance gain during execution.
So JRuby currently runs its bytecode compiler in JIT mode. As methods
are called, their invocation counts are recorded, and once a method
crosses a certain threshold it is compiled. We do not do any adaptive
optimization in the compiler at present, though we do a few
ahead-of-time optimizations by inspecting the AST. Compiled code does
not (with very few exceptions) ever deoptimize.
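To make that trigger concrete, here's a rough Ruby sketch of the idea
(the real JIT is written in Java; the threshold value and names below
are purely illustrative):

class TinyJIT
  THRESHOLD = 50  # illustrative count, not JRuby's actual setting

  def initialize
    @counts   = Hash.new(0)  # per-method invocation counts
    @compiled = {}           # methods we have already "compiled"
  end

  # The (pretend) interpreter calls this for every method invocation.
  def invoke(name, &interpreted_body)
    if @compiled[name]
      @compiled[name].call           # run the compiled version
    else
      @counts[name] += 1
      if @counts[name] >= THRESHOLD
        @compiled[name] = compile(&interpreted_body)
      end
      interpreted_body.call          # keep interpreting until then
    end
  end

  private

  # Stand-in for bytecode generation and classloading.
  def compile(&body)
    body
  end
end

In JRuby itself the "compiled version" is generated JVM bytecode, which
HotSpot can then profile and optimize like any other Java code.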
Because of the JRuby JIT, we must balance our compilation triggers with
the JVM's own. Ideally, we get things compiled quickly enough for
HotSpot to take over and make a big difference, without compiling *too
many* methods or compiling them too frequently and having a negative
impact on performance.
> Do you mean we should not assume small program performance is a
> reasonable estimate of large program performance?
It depends on how small. For example, if the top level of a script
includes a while loop of a million iterations, it will not be
indicative of an app that has such loops in methods, as part of an
object model, and so on, because that top-level code may never compile
to bytecode (since it's only called once) or may execute only once and
never be JITed by the JVM. Soon, when the compiler is complete, we
could theoretically compile scripts on load, but it remains to be seen
whether that will incur more overhead than it's worth. And it still
wouldn't solve the problem of long-running methods or scripts that are
only invoked once.
As an example, try running the two following scripts in JRuby and
comparing the results:
SCRIPT1:
t = Time.now
i = 0
while i < 10_000_000
  i += 1
end
puts Time.now - t

t = Time.now
i = 0
while i < 10_000_000
  i += 1
end
puts Time.now - t

t = Time.now
i = 0
while i < 10_000_000
  i += 1
end
puts Time.now - t

t = Time.now
i = 0
while i < 10_000_000
  i += 1
end
puts Time.now - t

t = Time.now
i = 0
while i < 10_000_000
  i += 1
end
puts Time.now - t
SCRIPT2:
def looper
  i = 0
  while i < 10_000_000
    i += 1
  end
end

5.times {
  t = Time.now
  looper
  puts Time.now - t
}
My results:
SCRIPT1:
9.389
9.194
9.207
9.198
9.191
SCRIPT2:
9.128
9.012
2.001
1.822
1.823
This is fairly typical. And this should also be of interest to you for
the Alioth Shootout benchmarks: simply re-running the same script in a
loop will not allow JRuby or HotSpot to really get moving, since each
run through the script will define *new* classes and *new* methods that
must "warm up" again. You must leave the methods defined and re-run
only the work portion of the benchmark.
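Concretely, a harness along these lines (the iteration and pass counts
are arbitrary) keeps the method defined for the life of the process and
times only repeated runs of the work:

def run_benchmark
  i = 0
  while i < 10_000_000
    i += 1
  end
end

# Warm-up passes: let JRuby's JIT and HotSpot compile the method.
2.times { run_benchmark }

# Timed passes: the method stays defined, so nothing warms up again.
5.times do
  t = Time.now
  run_benchmark
  puts Time.now - t
end

That's essentially what SCRIPT2 above does, minus the explicit warm-up
passes.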
> Incidentally, I don't think "warm-up time" works as a description of
> adaptive optimization - it makes it sound like a one-time thing,
> rather than continual profiling, decompilation, and recompilation
> adapting to the current hot spot.
In our case, it's a bit of both. There's some warm-up time for JRuby to
compile to bytecode, and then there's the adaptive optimization of
HotSpot, which is a bit of a black box to us. We are working to reduce
JRuby's warm-up time to get HotSpot into the picture sooner.
- Charlie