Hello,
I would like to ask for your expert opinion regarding high-performance
cache implementation in Java.
My goal is to create a cache implementation that handles a large
volume of concurrent reads (preferably without any locking) and
handles write concurrency as well as possible. In-place updating of
records in the cache is of no concern; all updates are versioned and
therefore represent separate records in the cache.
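To make the versioning idea concrete, here is a minimal sketch (the class and field names are my own, purely illustrative) in which each (id, version) pair is a distinct immutable key, so an "update" is just a put under a new key and existing entries are never mutated:

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: versioned records stored as separate entries.
final class VersionedKey {
    final String id;
    final long version;

    VersionedKey(String id, long version) {
        this.id = id;
        this.version = version;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof VersionedKey)) return false;
        VersionedKey k = (VersionedKey) o;
        return version == k.version && id.equals(k.id);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id, version);
    }
}

final class VersionedCache<V> {
    private final ConcurrentHashMap<VersionedKey, V> map = new ConcurrentHashMap<>();

    V get(String id, long version) {
        return map.get(new VersionedKey(id, version));
    }

    void put(String id, long version, V value) {
        // A new version is simply a new entry; older versions stay untouched.
        map.put(new VersionedKey(id, version), value);
    }
}
```

Because entries are never mutated after insertion, readers never observe a partially updated record.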
I noticed that the properties of the JDK's ConcurrentHashMap seem to
fit this objective very well: retrievals are fully concurrent and do
not block, and the expected concurrency for updates is tunable, which
would make it a good basis for a cache. Alternatively, a custom
structure using the read-copy-update (RCU) approach could serve the
same purpose.
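As a sketch of the read-copy-update alternative (assuming an immutable-snapshot design of my own devising, not any particular library): readers dereference the current snapshot without locking, while writers copy the map, apply the change, and publish the new snapshot with a compare-and-set, retrying on contention.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical RCU-style map: wait-free reads, copy-on-write updates.
// Writes pay an O(n) copying cost, so this suits read-dominated workloads.
final class RcuMap<K, V> {
    private final AtomicReference<Map<K, V>> snapshot =
            new AtomicReference<>(new HashMap<>());

    V get(K key) {
        return snapshot.get().get(key);  // lock-free read of the current snapshot
    }

    void put(K key, V value) {
        while (true) {
            Map<K, V> current = snapshot.get();
            Map<K, V> next = new HashMap<>(current);  // copy
            next.put(key, value);                     // update
            if (snapshot.compareAndSet(current, next)) {
                return;                               // publish atomically
            }
            // CAS failed: another writer published first; retry on the new snapshot.
        }
    }
}
```

The trade-off versus ConcurrentHashMap is that every write copies the whole map, which is only acceptable when writes are rare relative to reads.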
The cache eviction algorithm could be based on a concurrent priority
queue, either skip-list or heap based, or even some simple threshold
scheme, with priorities assigned according to a chosen policy (FIFO,
LRU, LFU, etc.). Cache memory management is of course a problem: it
is not easy to estimate the in-memory size of a Java object instance,
and therefore to know exactly when a record should be evicted if the
cache is memory bound.

Soft references are one option, but they offer no way to guarantee
that records are removed in the order prescribed by the retention
algorithm. One possibility is to split the cache contents (e.g. into
two halves, or according to a usage threshold): records with higher
priorities (i.e. the ones that should be retained) would be held by
hard references, while those with lower priorities would be held by
soft references, with promotion and demotion between the two parts
driven by access statistics. The hard-referenced part would still
need to be bounded in size using some estimate of record size (e.g. a
maximum record size). Unfortunately, I do not have the luxury of
precomputing the in-memory size of an instance in advance from its
marshaled state.
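To illustrate the split I have in mind (this is only a sketch under my own assumptions, with coarse synchronization for clarity, not a production design): an LRU-ordered, hard-referenced tier bounded by entry count (a stand-in for memory, given that exact instance sizes are hard to obtain), with overflow demoted to a soft-referenced tier that the GC may reclaim under pressure, and promotion back on access.

```java
import java.lang.ref.SoftReference;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical two-tier cache: hard-referenced LRU tier plus a
// soft-referenced overflow tier, with demotion on eviction and
// promotion on access.
final class TwoTierCache<K, V> {
    private final int hardCapacity;
    private final Map<K, SoftReference<V>> softTier = new ConcurrentHashMap<>();
    private final LinkedHashMap<K, V> hardTier;

    TwoTierCache(int hardCapacity) {
        this.hardCapacity = hardCapacity;
        // accessOrder = true makes iteration order least-recently-used first.
        this.hardTier = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                if (size() > TwoTierCache.this.hardCapacity) {
                    // Demote the LRU entry to the soft tier instead of dropping it.
                    softTier.put(eldest.getKey(), new SoftReference<>(eldest.getValue()));
                    return true;
                }
                return false;
            }
        };
    }

    synchronized void put(K key, V value) {
        softTier.remove(key);
        hardTier.put(key, value);
    }

    synchronized V get(K key) {
        V v = hardTier.get(key);
        if (v != null) return v;
        SoftReference<V> ref = softTier.remove(key);
        if (ref != null) {
            v = ref.get();  // may be null if the GC already reclaimed it
            if (v != null) {
                hardTier.put(key, v);  // promote back on access
            }
        }
        return v;
    }
}
```

A real implementation would replace the single lock with finer-grained concurrency and a proper priority policy, but the promotion/demotion mechanics are the same.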
My question is: how should such a cache be constructed? Do you think
the proposed ideas are worth exploring? And what is the current state
of the art in cache implementation?
I would love to hear your opinion and would be very thankful for any
references to papers on the problems outlined here!
Thank you and kind regards,
Peter Skvarenina