Daniele Futtorovic
How would interning help? The input is read only once anyway
Depends on the input, of course. But natural text on the web (which
appears to be what this is about) is quite likely to contain the same
words more than once each.
and if you
mean to intern individual words of the input then how does the JVM do
the interning?
Like it does all interning? I must admit I couldn't lay out the details
off the top of my head, but the JLS should have them within reasonable
accuracy.
Of course, this would only be an option for a batch-like program. You
wouldn't want to clutter the string pool of a long-running app.
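
To make the interning behaviour concrete, here is a minimal sketch (the class and method names are made up for illustration). It only relies on the documented contract of String.intern(): two strings that are equal according to equals() intern to the same pooled instance.

```java
public class InternDemo {
    /** Returns true if both strings intern to the same pooled instance. */
    static boolean sameAfterIntern(String a, String b) {
        return a.intern() == b.intern();
    }

    public static void main(String[] args) {
        // Two distinct String objects with equal contents, as you would
        // get when the same word is read twice from the input.
        String a = new String("word");
        String b = new String("word");

        System.out.println(a == b);                // false: different objects
        System.out.println(a.equals(b));           // true: same contents
        System.out.println(sameAfterIntern(a, b)); // true: one pooled instance
    }
}
```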
Interning would also perhaps allow one to use an IdentityHashMap, and
thus do away with the (probably relatively costly) string comparisons.
For sure, this wouldn't be a replacement for more sophisticated
solutions, but it could be one of the things to try if it is to be kept "simple".
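
A rough sketch of the IdentityHashMap idea, assuming the task is something like counting word occurrences (that task, the class name and the whitespace tokenising are my assumptions, not something stated in the thread). Every key is interned before it goes into the map, so reference equality is enough to match repeated words.

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class InternedWordCount {
    /** Counts whitespace-separated words, using interned strings as identity keys. */
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new IdentityHashMap<>();
        for (String token : text.split("\\s+")) {
            if (token.isEmpty()) {
                continue; // leading whitespace produces one empty token
            }
            // intern() maps equal words to one instance, so the map can
            // compare keys with == instead of equals().
            counts.merge(token.intern(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count("the cat and the hat and the bat");
        // String literals are interned, so "the" is the pooled instance.
        System.out.println(counts.get("the")); // 3
        System.out.println(counts.get("and")); // 2
    }
}
```

Note the catch: lookups must use interned strings too. counts.get(new String("the")) returns null, because that object is not the pooled instance.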
My guess would be that some form of hashing would be
used there as well - plus that internal data structure must be thread
safe...
True.