Hash Surprises with Fixnum, #hash, and #eql?

Charles Oliver Nutter

Top-replying with a general observation: you can't please everyone all the time.

The special-cased logic for Fixnums and Symbols in hashes is obviously
done for performance purposes. No matter what you do, checking for
method redefinitions every single time will have a performance impact.
Even checking an inline cache has an impact. When you look at how
frequently hashes are used with Fixnum or Symbol keys, you'd basically
be asking everyone to take a perf hit to do it the "right way" for a
tiny minority of use cases.
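
One way to see the special-casing described above is a small probe. This is only a sketch, not the gist discussed later in the thread: the names `PROBE`, `ProbeCounts` and `probe_integer_lookup` are made up for illustration, and it patches `Integer` on the assumption that it runs on a current Ruby, where Fixnum has been folded into Integer. Running the same script on different interpreters should show whether a Hash lookup with an integer key ever reaches the Ruby-level methods.

```ruby
# Hypothetical probe: count how often a Hash lookup with an Integer key
# reaches the Ruby-level Integer#hash and Integer#eql?.
ProbeCounts = Struct.new(:hash_calls, :eql_calls)
PROBE = ProbeCounts.new(0, 0)

class Integer
  # Keep the original implementations reachable under new names.
  alias_method :__probe_orig_hash, :hash
  alias_method :__probe_orig_eql?, :eql?

  def hash
    PROBE.hash_calls += 1
    __probe_orig_hash
  end

  def eql?(other)
    PROBE.eql_calls += 1
    __probe_orig_eql?(other)
  end
end

# Store one Integer key, reset the counters, look the key up again,
# and report what the interpreter dispatched to during the lookup.
def probe_integer_lookup(key = 1000)
  table = { key => :found }
  PROBE.hash_calls = 0
  PROBE.eql_calls = 0
  result = table[key]
  { engine: RUBY_ENGINE, result: result,
    hash_calls: PROBE.hash_calls, eql_calls: PROBE.eql_calls }
end

p probe_integer_lookup
```

A count of zero for either method means the interpreter took a shortcut for that step; the counts are expected to differ between implementations and versions, so no particular output is claimed here.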

There are also plenty of other cases in all the implementations where
modifying critical core classes does not get reflected during
execution. For example, some impls treat operator calls against Fixnum
as always being the Fixnum version, regardless of modifications. This
allows using a fast type-identity check rather than a cache check or
class-modification check, and it can make a *huge* difference for raw
numeric performance.

In this case I think there's a fine line between consistency and
zealotry. The *vast* majority of Ruby users will never reopen and
modify Fixnum or Symbol, so it's a 99%-safe assumption that "fast"
logic for those types is just fine, especially if it's a noticeable
perf boost for the 99% of users. We're talking about the lowest-level
values in the system...if they can't be made fast, everything else
suffers.

JRuby follows MRI largely because of the perf improvement, but also
partially because MRI does it this way. If MRI always dispatched, we'd
do what we need to do to always dispatch (and we do have other ways
internally to reduce -- but not eliminate -- the modification check).

A side note on JRuby's optimization strategy over the years:

1. We find a largely-invariant piece of logic that could be optimized,
like fixnum operators or hashes of symbols
2. We come up with an optimization that may diverge slightly from
"pure" behavior and add an opt-in flag for that optimization
3. Based on user reports, test runs, and so on, we may eventually turn
the optimization on all the time and make the flag be opt-out

We've been more conservative than other impls, even.

- Charlie

 
Clifford Heath

The special-cased logic for Fixnums and Symbols in hashes is obviously
done for performance purposes. No matter what you do, checking for
method redefinitions every single time will have a performance impact.
Yes.

Even checking an inline cache has an impact.

You are mixing up the situations where there is no sane
case for allowing modifications, from the many fewer
ones where there is. No sane person would want to change
the implementation of 1+1; but someone can and has
implemented Fixnum+Complex. That works because Fixnum
will always call coerce where needed, so there's no need
to guard the optimisations.

In the few remaining cases where there is a good case
for supporting modifications, the minuscule cost of
a check would be justified. A single variable (saying
"this class has been modified from its standard form")
would take up a cache line, but the test would play
into branch prediction, so the actual effect would be
tiny.
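
The flag proposed here can be modelled at the Ruby level. This is only a sketch of the idea, not how any interpreter implements it: `CoreGuard`, `guarded_hash`, `BEFORE_PATCH` and `AFTER_PATCH` are hypothetical names, and `Integer` stands in for Fixnum on the assumption of a current Ruby. A real VM would do the check in native code, where its cost is exactly what the thread is arguing about.

```ruby
# Hypothetical Ruby-level model of a "this core class has been modified" flag.
module CoreGuard
  WATCHED = %i[hash eql?].freeze
  @integer_modified = false

  class << self
    attr_accessor :integer_modified
  end
end

class Integer
  # Flip the flag the first time a watched method is (re)defined on Integer.
  def self.method_added(name)
    CoreGuard.integer_modified = true if CoreGuard::WATCHED.include?(name)
    super
  end
end

# Stand-in for a hash table's "compute the bucket hash" step.
def guarded_hash(key)
  if key.is_a?(Integer) && !CoreGuard.integer_modified
    [:fast_path, key]      # shortcut: no method dispatch at all
  else
    [:dispatch, key.hash]  # conservative: honour a user-defined #hash
  end
end

BEFORE_PATCH = guarded_hash(23)

class Integer
  alias_method :__guard_orig_hash, :hash

  # Redefining #hash triggers method_added(:hash) and sets the flag.
  def hash
    __guard_orig_hash
  end
end

AFTER_PATCH = guarded_hash(23)

p BEFORE_PATCH.first, AFTER_PATCH.first
```

The flag only ever moves from false to true, which is the property the proposal relies on; whether reading it is cheap enough inside a real lookup path is a separate question.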

I know you guys have done amazing things to achieve the
performance that we now have, but please don't forget
why people choose Ruby; it's clean and consistent.

Another thing that could be done to assist; make sure
that a Hash only ever calls eql? on objects in the hash,
not on lookup keys. At least that way, if a non-standard
object is in the hash, it can still be found using values
that *it* considers equivalent. This change would cost
*nothing*, it would just make Ruby more consistent.
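
Which side receives `eql?` can be observed with two distinct key objects that share a hash value. The sketch below is an illustration under assumptions: `TracedKey`, `EQL_RECEIVERS` and `eql_receivers_for_lookup` are hypothetical names, and the result depends on the interpreter, so no expected output is given.

```ruby
# Records the label of every object that receives #eql? during a lookup.
EQL_RECEIVERS = []

class TracedKey
  attr_reader :value, :label

  def initialize(value, label)
    @value = value
    @label = label
  end

  # Equal values hash alike, so both keys land in the same bucket.
  def hash
    value.hash
  end

  def eql?(other)
    EQL_RECEIVERS << label
    other.is_a?(TracedKey) && value == other.value
  end
end

# Store one key, look it up with a different but equivalent key, and
# return the lookup result plus the labels of the #eql? receivers.
def eql_receivers_for_lookup
  EQL_RECEIVERS.clear
  stored = TracedKey.new(23, :stored_key)
  probe  = TracedKey.new(23, :lookup_key)
  table  = { stored => :found }
  result = table[probe]
  [result, EQL_RECEIVERS.dup]
end

p eql_receivers_for_lookup
```

If the second element contains `:lookup_key`, the interpreter asked the key passed in; if it contains `:stored_key`, it asked the key already in the Hash, which is the direction proposed above.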
In this case I think there's a fine line between consistency and
zealotry. The *vast* majority of Ruby users will never reopen and
modify Fixnum or Symbol,

People don't do it because it doesn't work, not because
it wouldn't be useful. Inability to make classes that act
like numbers is perhaps the biggest wart on an otherwise
clean language, on a par with Javascript using float for
all numbers.
JRuby follows MRI largely because of the perf improvement, but also
partially because MRI does it this way.

But JRuby does it differently from MRI. Try the code in
the gist I previously sent, you'll see that's true.
We've been more conservative than other impls, even.

and yet JRuby's Hash optimises both eql? and hash for Fixnums,
where MRI only optimises hash.

Clifford Heath.
 
Robert Klemme

You are mixing up the situations where there is no sane
case for allowing modifications, from the many fewer
ones where there is. No sane person would want to change
the implementation of 1+1; but someone can and has
implemented Fixnum+Complex. That works because Fixnum
will always call coerce where needed, so there's no need
to guard the optimisations.

I think Charlie got it exactly right.
In the few remaining cases where there is a good case
for supporting modifications, the minuscule cost of
a check would be justified. A single variable (saying
"this class has been modified from its standard form")
would take up a cache line, but the test would play
into branch prediction, so the actual effect would be
tiny.

Frankly, since what you are attempting seems a rather rare case, I'd
say it should stay the way it is. After all, you can easily monkeypatch
Hash[] etc. to get the behavior you desire. That way 99% of the usages
of Hash do not have to suffer so that 1% can write less code. I think
that is a fair balance.

I don't understand why you insist on changing a core class for your
rare case of making different classes equivalent and causing potential
harm for many, many users of Ruby instead of just going ahead and also
monkey patch Hash since you did already so for Fixnum. On one hand
you use Ruby's openness to change core classes to achieve what you
want but on the other you seem to refuse to change another to make
your change complete. The only reason for this that I can detect is
that you were surprised and your expectations were not met. But now
since you have learned otherwise what stops you from dealing with the
situation in the pragmatic way that is so typical for Ruby?
I know you guys have done amazing things to achieve the
performance that we now have, but please don't forget
why people choose Ruby; it's clean and consistent.

... most of the time. But it also tries to balance reasonable
usability with better-than-awful performance.
Another thing that could be done to assist; make sure
that a Hash only ever calls eql? on objects in the hash,
not on lookup keys. At least that way, if a non-standard
object is in the hash, it can still be found using values
that *it* considers equivalent. This change would cost
*nothing*, it would just make Ruby more consistent.

But it also would not have any positive impact for all others plus
that a change always brings a certain amount of risk of introducing
errors. Note though, that there might be situations where you want
the exact opposite: you have a Fixnum key and pass a
SomethingFixnumLike lookup key and want the match to succeed. You
can only have it one way. You happen to need the way that is not
possible right now.

Apart from that, it feels more natural to me to let the key passed in
compare itself with the internal key for equivalence, because this key
is what should determine whether I have a match or not.
People don't do it because it doesn't work, not because
it wouldn't be useful.

How do you know?
Inability to make classes that act
like numbers is perhaps the biggest wart on an otherwise
clean language, on a par with Javascript using float for
all numbers.

We *can* make classes that act like numbers (as has been demonstrated
often enough). And this works remarkably well.
and yet JRuby's Hash optimises both eql? and hash for Fixnums,
where MRI only optimises hash.

It is in the nature of optimizations that they are done differently on
different platforms. Actually they have to because characteristics of
all platforms are different.

Cheers

robert
 
Clifford Heath

monkey patch Hash since you did already so for Fixnum. On one hand
you use Ruby's openness to change core classes to achieve what you
want but on the other you seem to refuse to change another to make
your change complete. The only reason for this that I can detect is
that you were surprised and your expectations were not met. But now
since you have learned otherwise what stops you from dealing with the
situation in the pragmatic way that is so typical for Ruby?

I'm trying to express a logical programming paradigm, for users
of my API - for myself I'm happy to work around whatever I need
to.
But it also would not have any positive impact for all others plus
that a change always brings a certain amount of risk of introducing
errors. Note though, that there might be situations where you want
the exact opposite:

No. The reason a Hash uses eql? is to know whether it has found the
element which it's looking for *in the Hash*. It should only do that
comparison in one direction.
You happen to need the way that is not possible right now.

On the contrary, I could live with it being either way.
The different interpreters do it both ways, or not at all.
That is the fact. That you've missed it makes me think you
*still* haven't actually run my code on the three interpreters
I mentioned. If that's true, it just underscores my suspicion
that you're engaging in defensive sophistry rather than rational
argument.
How do you know?

Because, as you say they deal with things in "the pragmatic way that
is so typical for Ruby". In other words, they're prepared to hack
around Ruby's hacks. Forgive me for the crime of suggesting that we
should improve things instead.

Clifford Heath.
 
Charles Oliver Nutter

On the contrary, I could live with it being either way.
The different interpreters do it both ways, or not at all.
That is the fact. That you've missed it makes me think you
*still* haven't actually run my code on the three interpreters
I mentioned. If that's true, it just underscores my suspicion
that you're engaging in defensive sophistry rather than rational
argument.

Perhaps I'm missing something. Should I be seeing different results than this?

https://gist.github.com/917031

- Charlie
 
Clifford Heath

Perhaps I'm missing something. Should I be seeing different results than this?

https://gist.github.com/917031

That result is reasonable, using MRI. Interesting, and nice, that it
figures out it doesn't need to call Fixnum#hash to be able to choose
a bucket. (BTW, that *uncommon* case means there's a test and branch
which is superfluous and excessively costly for the more common case,
according to arguments you've made against detecting monkey patching)

If you do the same thing with JRuby however, the first test contains this:

Looking up using Integer:
nil

I.e. using Integer, JRuby doesn't call Fixnum#eql? the way MRI does.

Do it again using Rubinius, and you get this:

Looking up using Integer:
<Fixnum 23>.hash => 47
nil

I.e. Rubinius calls Fixnum#hash, but not Fixnum#eql?

So both non-MRI interpreters fail, and in different ways.

Clifford Heath.
 
Robert Klemme

On the contrary, I could live with it being either way.
The different interpreters do it both ways, or not at all.
That is the fact. That you've missed it makes me think you
*still* haven't actually run my code on the three interpreters
I mentioned. If that's true, it just underscores my suspicion
that you're engaging in defensive sophistry rather than rational
argument.

I have run the code and I haven't missed the fact. Right, that wasn't
obvious from my statement above. I qualify the behavior you are
seeing as "undefined" (across all Ruby implementations) so it doesn't
matter to me in which ways the test fails on different platforms.
Because, as you say they deal with things in "the pragmatic way that
is so typical for Ruby". In other words, they're prepared to hack
around Ruby's hacks.

I asked how you know that there is a significant number of cases where
people stumble into this and what you basically say is "I know it
because people pragmatically work around this and do not mention it".
This could not be further away from anything like evidence or even a
hint.
Forgive me for the crime of suggesting that we
should improve things instead.

I'm all for improvements. But that's the exact point where we differ:
in my opinion there is no overall improvement - as Charlie laid out
nicely: there may be an improvement for a rare case but many other,
more regular cases suffer.

Regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
 
Charles Oliver Nutter

That result is reasonable, using MRI. Interesting, and nice, that it
figures out it doesn't need to call Fixnum#hash to be able to choose
a bucket. (BTW, that *uncommon* case means there's a test and branch
which is superfluous and excessively costly for the more common case,
according to arguments you've made against detecting monkey patching)

If you do the same thing with JRuby however, the first test contains this:

This is JRuby 1.6.1.

- Charlie
 
Clifford Heath

I'm all for improvements. But that's the exact point where we differ:
in my opinion there is no overall improvement - as Charlie laid out
nicely: there may be an improvement for a rare case but many other,
more regular cases suffer.

Yeah, I get that argument, and I agree with it. But it's a general rule,
and Charlie applied it to the generalised case regarding whether, in
a fit of pure zealotry, we should require Rubies to support people who
want to redefine 1+1. I never suggested that we do, and in the case I
am requesting investigation, I frankly don't believe that the performance
penalty would even be measurable. Given a suitably designed fix, of
course. You ask for evidence; so do I.
 
Charles Oliver Nutter

Yep... same as 1.6.0 and 1.5.6. What's your point?

I'm confused...you made it sound as though JRuby results would be
different than the ones I posted. Am I missing something? Those
results *were* run with JRuby, and I was hoping you could tell me what
was missing...

- Charlie
 
Clifford Heath

I'm confused...you made it sound as though JRuby results would be
different than the ones I posted. Am I missing something? Those
results *were* run with JRuby, and I was hoping you could tell me what
was missing...

I presented results from MRI 1.8.7, JRuby 1.6.0, and Rubinius,
and showed that they all had different shortcuts, and none
reliably kept the Hash contract (of using eql? and hash).
I.e. you can't rely on sensible code working the same in MRI
and JRuby.

I think that's a problem. If you don't, then I'm done...

Unless you care to point me to the place in the JRuby code
where this shortcut occurs (where I could make a change to
make it invisible), and a performance benchmark that would
show the effect of doing so. Then I'll happily make the
experiment to see whether I'm right (and the shortcut can
be made invisible without affecting performance measurably,
i.e. above the noise level of the benchmark). If I'm not,
I'll openly admit I was wrong... but I've done some pretty
hardcore optimizing in machine code before, and I think I
can win this one.
 
Charles Oliver Nutter

I presented results from MRI 1.8.7, JRuby 1.6.0, and Rubinius,
and showed that they all had different shortcuts, and none
reliably kept the Hash contract (of using eql? and hash).
I.e. you can't rely on sensible code working the same in MRI
and JRuby.

I posted results on JRuby master (1.6.1). You responded to that email with:

"That result is reasonable, using MRI. Interesting, and nice, that it
figures out it doesn't need to call Fixnum#hash to be able to choose
a bucket. (BTW, that *uncommon* case means there's a test and branch
which is superfluous and excessively costly for the more common case,
according to arguments you've made against detecting monkey patching)

If you do the same thing with JRuby however, the first test contains this:

Looking up using Integer:
nil"

I'm not sure you actually looked at my results.
I think that's a problem. If you don't, then I'm done...

It may be a problem, or it may not. When running your example,
however, it seems to call your monkeypatched code in many places where
you claim it doesn't. So I'm still confused. I know we don't call
hash/eql? in all cases, but I'm trying to quantify what the correct
behavior should be and what that behavior would cost.

I'd be happy to continue discussing this as a JRuby issue. Would you
file something at http://bugs.jruby.org with expected and actual JRuby
1.6.1 results?
Unless you care to point me to the place in the JRuby code
where this shortcut occurs (where I could make a change to
make it invisible), and a performance benchmark that would
show the effect of doing so. Then I'll happily make the
experiment to see whether I'm right (and the shortcut can
be made invisible without affecting performance measurably,
i.e. above the noise level of the benchmark). If I'm not,
I'll openly admit I was wrong... but I've done some pretty
hardcore optimizing in machine code before, and I think I
can win this one.

I think you are underestimating the cost of performing a dynamic call.
Even in an optimizing VM (like JRuby/JVM) there's a much higher cost
for a dynamic call to "hash" than to just check that it's a Fixnum and
branch to custom logic. *Way* higher cost.

Here's stock JRuby 1.6.1, which isn't dispatching to "hash" for Fixnums:

~/projects/jruby ➔ jruby --server -rbenchmark -e "5.times { h = {}; h[1000] = 1000; puts Benchmark.measure { 10_000_000.times { h[1000] } } }"
0.908000 0.000000 0.908000 ( 0.859000)
0.623000 0.000000 0.623000 ( 0.622000)
0.699000 0.000000 0.699000 ( 0.699000)
0.747000 0.000000 0.747000 ( 0.747000)
0.753000 0.000000 0.753000 ( 0.753000)

Here's the same benchmark, dispatching to "hash" through a per-class
cache (faster than typical call-site caching, roughly on par with
inlined calls):

~/projects/jruby ➔ jruby --server -rbenchmark -e "5.times { h = {}; h[1000] = 1000; puts Benchmark.measure { 10_000_000.times { h[1000] } } }"
1.634000 0.000000 1.634000 ( 1.580000)
1.297000 0.000000 1.297000 ( 1.297000)
1.356000 0.000000 1.356000 ( 1.355000)
1.334000 0.000000 1.334000 ( 1.334000)
1.343000 0.000000 1.343000 ( 1.344000)

Now using an even faster check, that only dispatches to "hash" if the
object is a Fixnum and the Fixnum class has not been reopened. Notice
it's faster than full dyncall, but still a good bit slower than the
fast path:

~/projects/jruby ➔ jruby --server -rbenchmark -e "5.times { h = {}; h[1000] = 1000; puts Benchmark.measure { 10_000_000.times { h[1000] } } }"
1.057000 0.000000 1.057000 ( 1.014000)
0.977000 0.000000 0.977000 ( 0.976000)
0.885000 0.000000 0.885000 ( 0.885000)
0.903000 0.000000 0.903000 ( 0.903000)
0.871000 0.000000 0.871000 ( 0.871000)

And 1.9.2 to compare:

~/projects/jruby ➔ ruby1.9 -rbenchmark -e "5.times { h = {}; h[1000] = 1000; puts Benchmark.measure { 10_000_000.times { h[1000] } } }"
1.270000 0.010000 1.280000 ( 1.313950)
1.270000 0.010000 1.280000 ( 1.307163)
1.270000 0.000000 1.270000 ( 1.295588)
1.260000 0.010000 1.270000 ( 1.285083)
1.260000 0.010000 1.270000 ( 1.307108)

Bottom line is that *any* additional branching logic will add
overhead, and full dynamic calling introduces even more overhead on
just about any implementation. Whether that's a fair trade-off is not
for me to decide ;)
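
For anyone who wants to repeat the measurement, the one-liner above can be written out as a short script. This is a sketch of the same micro-benchmark, not a new methodology: `time_integer_lookups` is a hypothetical helper name, and the demo call uses fewer iterations than the 10_000_000 in the thread so that it finishes quickly.

```ruby
require "benchmark"

# Time `rounds` batches of `iterations` lookups of a single Integer key
# in a one-entry Hash. Returns the wall-clock seconds of each batch.
def time_integer_lookups(iterations: 10_000_000, rounds: 5)
  Array.new(rounds) do
    h = {}
    h[1000] = 1000
    Benchmark.measure { iterations.times { h[1000] } }.real
  end
end

# Reduced iteration count for a quick run; pass 10_000_000 to match the thread.
time_integer_lookups(iterations: 1_000_000).each do |seconds|
  puts format("%.6f", seconds)
end
```

As in the thread, the first round or two usually include warm-up effects, so the later rounds are the ones worth comparing between builds.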

- Charlie
 
Clifford Heath

Charles,

Thanks for persisting with me. I'm not deliberately being unreasonable :)
I think there are real issues around this case and others like them,
regarding what we expect from Ruby's standard behaviour.

It turns out that I've worked around the original issue, which occurred
in my forthcoming activefacts-api gem (forthcoming because although it's
old code, it's about to get to 1.0 and get publicly announced). The
workaround is mostly successful in hiding the problem(s).

But anyhow...

I posted results on JRuby master (1.6.1). You responded to that email with:

"That result is reasonable, using MRI. Interesting, and nice, that it
figures out it doesn't need to call Fixnum#hash to be able to choose
a bucket. (BTW, that *uncommon* case means there's a test and branch
which is superfluous and excessively costly for the more common case,
according to arguments you've made against detecting monkey patching)

I'm sorry for this comment, it's wrong. I must have been confused.
MRI doesn't call Fixnum#hash in any case, but uses a shortcut.
If you do the same thing with JRuby however, the first test contains this:
Looking up using Integer:
nil"
I'm not sure you actually looked at my results.
It may be a problem, or it may not. When running your example,
however, it seems to call your monkeypatched code in many places where
you claim it doesn't. So I'm still confused.

I think that will have to go down to communication error. Let it ride.
I know we don't call
hash/eql? in all cases, but I'm trying to quantify what the correct
behavior should be and what that behavior would cost.

The correct behavior would result in that last "Looking up using Integer:"
to find the value, as MRI does. MRI doesn't shortcut Fixnum#eql? in the
hash lookup, and my Number hash function is defined in terms of Integer#hash,
so it works. It would be better if MRI knew it shouldn't shortcut Fixnum#hash,
but I can live without that as long as it's clear.
I'd be happy to continue discussing this as a JRuby issue. Would you
file something at http://bugs.jruby.org with expected and actual JRuby
1.6.1 results?

Ok, perhaps. I think it's a Ruby issue though, not just a JRuby one.
I think you are underestimating the cost of performing a dynamic call.

I'm not. I'm expecting that JRuby would detect that a core Fixnum
method has been monkey-patched, and set a global variable. If the
variable is set (an inline check, susceptible to branch-prediction)
then it would default to conservative behaviour by calling dispatch,
otherwise continue with the shortcuts as normal.

I think this is what you mean when you say:
Now using an even faster check, that only dispatches to "hash" if the
object is a Fixnum and the Fixnum class has not been reopened. Notice
it's faster than full dyncall, but still a good bit slower than the
fast path:

If I read the result correctly, that's 0.871/0.753 or 15.6% slower,
on a test that does nothing but Fixnum hash lookups in a hash with
a single element.

What I'd ask is, what's the percentage of time in actual programs
typically spent doing Fixnum hash lookups? I'd wager it's less than
1%, meaning your 16% is now 0.16% - and it makes the behaviour
incorrect per the Ruby spec. Personally, I'd take the hit.
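
The arithmetic can be checked directly against the posted timings. In the sketch below the three arrays are the wall-clock ("real") columns copied from the JRuby runs quoted earlier; the helper names are made up, and the 1% share of runtime is the wager made above, not a measurement.

```ruby
# Wall-clock seconds from the three JRuby runs quoted in the thread.
FAST_PATH     = [0.859, 0.622, 0.699, 0.747, 0.753].freeze
GUARDED       = [1.014, 0.976, 0.885, 0.903, 0.871].freeze
FULL_DISPATCH = [1.580, 1.297, 1.355, 1.334, 1.344].freeze

# Middle value of an odd-sized sample.
def median(samples)
  sorted = samples.sort
  sorted[sorted.size / 2]
end

# Fractional slowdown of `slow` relative to `fast` (0.10 means 10% slower).
def slowdown(slow, fast)
  slow / fast - 1.0
end

# Whole-program cost if only `share_of_runtime` of the time is affected.
def app_level_cost(micro_slowdown, share_of_runtime)
  micro_slowdown * share_of_runtime
end

last_run  = slowdown(GUARDED.last, FAST_PATH.last)
by_median = slowdown(median(GUARDED), median(FAST_PATH))
dispatch  = slowdown(median(FULL_DISPATCH), median(FAST_PATH))

puts format("guarded vs fast path, last run: %.1f%%", last_run * 100)
puts format("guarded vs fast path, medians:  %.1f%%", by_median * 100)
puts format("full dispatch vs fast path:     %.1f%%", dispatch * 100)
puts format("app-level cost at a 1%% share:   %.2f%%", app_level_cost(last_run, 0.01) * 100)
```

Comparing only the last runs gives the figure of just under 16% used above; comparing medians gives about 21%, so the exact percentage depends on which runs are compared.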

The cost of the inline check of this variable is what I was implying
would vanish into the dust in performance tests, as the branch
prediction figures out that "we always go this way". Is your
implementation of this check implemented to make best advantage of
branch prediction? Frankly I'm rather amazed that it's as big as 18%
merely to decide to use the shortcut. Does that %age drop when using
a hash with more than one entry? It seems like a benchmark that was
created to magnify the change, not to realistically demonstrate the
effect.
Bottom line is that *any* additional branching logic will add
overhead, and full dynamic calling introduces even more overhead on
just about any implementation. Whether that's a fair trade-off is not
for me to decide ;)

Well, yes and no. If the effect is 0.16% in typical apps, you'd be
well within your right to maintain it should work like MRI and/or
the documentation.

Do you still think this needs a JRuby bug report?

Clifford Heath.
 
Robert Klemme

On 04/20/11 08:41, Charles Oliver Nutter wrote:

I'm not. I'm expecting that JRuby would detect that a core Fixnum
method has been monkey-patched, and set a global variable. If the
variable is set (an inline check, susceptible to branch-prediction)
then it would default to conservative behaviour by calling dispatch,
otherwise continue with the shortcuts as normal.
The cost of the inline check of this variable is what I was implying
would vanish into the dust in performance tests, as the branch
prediction figures out that "we always go this way".

Please keep in mind that in a multithreaded environment there is
synchronization overhead. A solution could use an AtomicBoolean
stored somewhere as static final. Now all threads that need to make
the decision need to go through this. Even if it is "only" volatile
semantics (and not synchronized) and allows for concurrent reads there
is a price to pay. Using a ThreadLocal which is initialized during
thread construction or lazily would reduce synchronization overhead at
the risk of the flag value becoming outdated - an issue which becomes
worse with thread lifetime. Applications which use a thread pool
could suffer.

But I agree, the effect vastly depends on the frequency of Hash
accesses with a Fixnum key. Unfortunately I guess nobody has figures
about this - and even if, those will probably largely vary with type
of application.

Kind regards

robert

 
Clifford Heath

Please keep in mind that in a multithreaded environment there is
synchronization overhead. A solution could use an AtomicBoolean

Oh get real. This is a single variable which may, during the course of
a single execution, *once* change from false to true. In so doing, it
enables a slightly more conservative approach to compatibility for one
small side-effect of a shortcut, which probably doesn't even matter to
the application, and which is almost certainly set by the one thread
that cares about that side effect. It *so* doesn't need to be synchronised.
 
Robert Klemme

Oh get real. This is a single variable which may, during the course of
a single execution, *once* change from false to true.

In multithreaded applications the change frequency is not important.
What's important is the access frequency (read and write) because even
if memory does not change access needs to be guarded by proper
synchronization means for an application to work correctly in light of
multiple threads.
In so doing, it
enables a slightly more conservative approach to compatibility for one
small side-effect of a shortcut, which probably doesn't even matter to
the application, and which is almost certainly set by the one thread
that cares about that side effect. It *so* doesn't need to be synchronised.

As I said, it does not need to be guarded by "synchronized". But it
needs to be at least volatile for multithreaded applications to work
properly - otherwise you might see odd effects. For each *access*
(that is write *and* read) there needs to be a memory barrier which
essentially means that modified thread local memory must be published
so other threads can see it. Since many pages might be modified and
there is NUMA this can actually have a measurable cost.

The JVM memory model is a fascinating topic but significantly more
complex than one might be inclined to believe at first.

Kind regards

robert

 
John W Higgins

Good Morning,

Oh get real. This is a single variable which may, during the course of
a single execution, *once* change from false to true. In so doing, it
enables a slightly more conservative approach to compatibility for one
small side-effect of a shortcut, which probably doesn't even matter to
the application, and which is almost certainly set by the one thread
that cares about that side effect. It *so* doesn't need to be synchronised.
I'm sorry but the old saying "people in glass houses shouldn't throw stones"
really comes to mind here. This entire thread has been a rant against
implementation semantics being "inconsistent" in 0.001% of applications (if
that) and then you come back with - oh we can cheat here because your fix
would "almost certainly set by the one thread that care about that side
effect".

You don't get to cut a corner in providing a solution to a problem you
believe is cutting a corner. You are asking every one else to take a
performance hit (however small) to fix a problem that you won't even fix
properly? That doesn't seem appropriate at all.

John
 
Charles Oliver Nutter

I'm not. I'm expecting that JRuby would detect that a core Fixnum
method has been monkey-patched, and set a global variable. If the
variable is set (an inline check, susceptible to branch-prediction)
then it would default to conservative behaviour by calling dispatch,
otherwise continue with the shortcuts as normal.

I think this is what you mean when you say:


If I read the result correctly, that's 0.871/0.753 or 15.6% slower,
on a test that does nothing but Fixnum hash lookups in a hash with
a single element.

What I'd ask is, what's the percentage of time in actual programs
typically spent doing Fixnum hash lookups? I'd wager it's less than
1%, meaning your 16% is now 0.16% - and it makes the behaviour
incorrect per the Ruby spec. Personally, I'd take the hit.

The cost of the inline check of this variable is what I was implying
would vanish into the dust in performance tests, as the branch
prediction figures out that "we always go this way". Is your
implementation of this check implemented to make best advantage of
branch prediction? Frankly I'm rather amazed that it's as big as 18%
merely to decide to use the shortcut. Does that %age drop when using
a hash with more than one entry? It seems like a benchmark that was
created to magnify the change, not to realistically demonstrate the
effect.

Well, it has to get the value from somewhere. In JRuby, the
Fixnum/Float checks are each a field on org.jruby.Ruby (the JRuby
"runtime" object in play), which is accessible via a field on
org.jruby.runtime.ThreadContext (passed on the call stack to most
methods) or via the current object's metaclass, an org.jruby.RubyClass
object with a "runtime" field. So at best, it's two field
dereferences and one (inlined) virtual method invocation, and at worst
it's three field references and two (probably inlined) virtual method
invocations. Perhaps that defeats branch prediction, since we're at
least two memory hops away from the value we're checking.

Branch prediction works great when the data is in a register. It
doesn't do as well when there's memory dereferences in the way.
Well, yes and no. If the effect is 0.16% in typical apps, you'd be
well within your right to maintain it should work like MRI and/or
the documentation.

Do you still think this needs a JRuby bug report?

In general we have followed the path of matching MRI in such cases,
except when there's a very strong argument not to do so. If you can
convince Ruby core, or if you think you have a strong enough case for
deviating from MRI's behavior, go ahead and file the bug and we can
continue exploring it there.

- Charlie
 
Charles Oliver Nutter

Please keep in mind that in a multithreaded environment there is
synchronization overhead. A solution could use an AtomicBoolean
stored somewhere as static final. Now all threads that need to make
the decision need to go through this. Even if it is "only" volatile
semantics (and not synchronized) and allows for concurrent reads there
is a price to pay. Using a ThreadLocal which is initialized during
thread construction or lazily would reduce synchronization overhead at
the risk of the flag value becoming outdated - an issue which becomes
worse with thread lifetime. Applications which use a thread pool
could suffer.

In this case, I'm not using a synchronized, atomic, *or* boolean
field. Because of the rarity of Fixnum and Float modification and the
potential for heavy perf impact, I'm considering redefinition of
methods in one thread while another thread is calling those methods as
somewhat undefined, at least for Fixnum and Float. That's not perfect
(JVM could optimize such that one thread's modifications never are
seen by another thread), but it's closer.

It's also worth pointing out that usually modifications to Fixnum or
Float are done for DSL purposes, where there's less likelihood of
heavy threading effects.

You're right, though...if I made that field volatile (it doesn't need
to be Atomic, since I only ever read *or* write, never both), the perf
impact would be higher.
But I agree, the effect vastly depends on the frequency of Hash
accesses with a Fixnum key. Unfortunately I guess nobody has figures
about this - and even if, those will probably largely vary with type
of application.

I operate at too low a level to see the 10000-foot view of application
performance. In other words, I spend my time optimizing individual
core methods, individual Ruby language features, and runtime-level
operations like calls and constant lookup...rather than really looking
at full app performance. Once you get to the scale of a real app, the
performance bottlenecks from badly-written code, slow IO, excessive
object creation, slow libraries and other userland issues almost
always trump runtime-level speed. As an example, I point at the fact
that Ruby 1.9 is almost always much faster than Ruby 1.8, but Rails
under Ruby 1.9 is only marginally faster than Rails on Ruby 1.8 (or so
I've seen when people try to measure it).

The benefit of a faster and faster runtime is often outweighed by
writing better Ruby code in the first place. But I don't live in the
application world...I work on JRuby low-level performance. You have to
do the rest :)

- Charlie
 
