optimsed HashMap

D

Daniele Futtorovic

How would interning help? The input is read only once anyway

Depends on the input, of course. But natural text on the web (which
appears to be what this is about) is quite likely to contain the same
words more than once each.
and if you
mean to intern individual words of the input then how does the JVM do
the interning?

Like it does all interning? I must admit I couldn't lay out the details
off the top of my head, but the JLS should have them within reasonable
accuracy.

Of course, this would only be an option for a batch-like program. You
wouldn't want to clutter the string pool of a long-running app.

Interning would also perhaps allow one to use an IdentityHashMap, and
thus doing away with the (probably relatively costly) string comparisons.

For sure, this wouldn't be a replacement for more sophisticated
solutions, but could one of the things to try if it is to be kept "simple".
My guess would be that some form of hashing would be
used there as well - plus that internal data structure must be thread
safe...

True.
 
D

Daniele Futtorovic

Depends on the input, of course. But natural text on the web (which
appears to be what this is about) is quite likely to contain the same
words more than once each.


Like it does all interning? I must admit I couldn't lay out the details
off the top of my head, but the JLS should have them within reasonable
accuracy.

Of course, this would only be an option for a batch-like program. You
wouldn't want to clutter the string pool of a long-running app.

Interning would also perhaps allow one to use an IdentityHashMap, and
thus doing away with the (probably relatively costly) string comparisons.

For sure, this wouldn't be a replacement for more sophisticated
solutions, but could one of the things to try if it is to be kept "simple".


True.

Hm. According to Roedy himself
(<http://www.mindprod.com/jgloss/interned.html#UNDERTHEHOOD>), the JVM
uses a HashMap for intern()'d string lookup. So there may be no point in
doing it after all.
 
A

Arved Sandstrom

java.util.concurrent.ConcurrentMap has putIfAbsent().
It's almost certainly a reflection (no pun intended) of what I do with
Maps, but a common variant use case for me is

putEmptyListAndAddValueToListIfAbsent

I don't normally use ConcurrentHashMap so the usual idiom for me is a
contains() check that may put() the new empty list as a value,
afterwards the list (empty or otherwise) is always got, and the actual
value added to it.

AHS
 
E

Eric Sosman

It's almost certainly a reflection (no pun intended) of what I do with
Maps, but a common variant use case for me is

putEmptyListAndAddValueToListIfAbsent

I don't normally use ConcurrentHashMap so the usual idiom for me is a
contains() check that may put() the new empty list as a value,
afterwards the list (empty or otherwise) is always got, and the actual
value added to it.

I usually write it as

List<V> list = map.get(key);
if (list == null) {
list = new SomeKindOfList();
map.put(key, list);
}
list.add(newValue);

.... which involves one extra lookup per mapped key, not one
extra lookup per query.

The imaginary getOrAddEntry() would be nice, although it
might complicate the Map.Entry interface itself (you'd want a
way to distinguish "This key maps to null" from "This key is
newly entered").
 
D

Daniel Pitts

java.util.concurrent.ConcurrentMap has putIfAbsent().
putIfAbsent isn't exactly what I want. First, I might not want to go to
the expense of creating a new value *unless* there isn't already one.
Second, this is an operation that may not need the overhead of a
concurrency-safe map.

The first problem *can* be overcome by using a wrapper object, but that
adds additional overhead, and the point of this is to reduce overhead.
 
D

Daniel Pitts

Here is the implementation of intern(), quoted in its
entirety from the Java 1.7 source:

public native String intern();

Draw your own conclusions about its use of HashMap.

I suspect it uses a hash map, not necessarily a HashMap. Of course, the
underlying mechanism is probably implementation dependent.
 
D

Daniele Futtorovic

I suspect it uses a hash map, not necessarily a HashMap. Of course, the
underlying mechanism is probably implementation dependent.

Yeah, the camel casing was out of habit.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,995
Messages
2,570,226
Members
46,816
Latest member
nipsseyhussle

Latest Threads

Top