Implementing a high-performance cache in Java

  • Thread starter skvarenina.peter

skvarenina.peter

Hello,

I would like to ask for your expert opinion on implementing a
high-performance cache in Java.

My goal is to create a cache that can handle a large number of
concurrent reads (preferably without any locking) and handles
concurrent writes as well as possible. Updating records in place is
of no concern: all updates are versioned and therefore represent
separate records in the cache.

I noticed that the JDK's ConcurrentHashMap seems to fit this
objective very well: the class supports full retrieval concurrency
without locking and a tunable expected concurrency for updates,
which would be a good basis for a cache. Alternatively, the same
could be achieved with a custom structure using the read-copy-update
(RCU) approach.
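
Roughly, a minimal sketch of such a ConcurrentHashMap core could
look like the following (Record and VersionedKey are placeholder
names of mine, and the concurrency level of 32 is just an arbitrary
tuning hint):

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    // Placeholder record type; assumed immutable, so readers never need a lock.
    final class Record {
        final String payload;
        Record(String payload) { this.payload = payload; }
    }

    // Key combining id and version, since every update becomes a new record.
    final class VersionedKey {
        final String id;
        final long version;
        VersionedKey(String id, long version) { this.id = id; this.version = version; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof VersionedKey)) return false;
            VersionedKey k = (VersionedKey) o;
            return version == k.version && id.equals(k.id);
        }
        @Override public int hashCode() { return 31 * id.hashCode() + (int) version; }
    }

    final class VersionedCache {
        // Reads never block; the concurrency level only hints how many
        // writers can proceed in parallel.
        private final ConcurrentMap<VersionedKey, Record> map =
                new ConcurrentHashMap<VersionedKey, Record>(1024, 0.75f, 32);

        Record get(VersionedKey key) {
            return map.get(key);
        }

        Record put(VersionedKey key, Record value) {
            // Versioned records never change, so putIfAbsent is enough and
            // avoids overwriting a record another thread already inserted.
            Record prior = map.putIfAbsent(key, value);
            return prior != null ? prior : value;
        }
    }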

The cache eviction algorithm could be based on a concurrent priority
queue, either skip-list or heap based, or even on some simple
threshold algorithm. Priorities would be assigned according to a
chosen policy (FIFO/LRU/LFU, etc.).

There is of course a problem with cache memory management: it is not
easy to estimate the in-memory size of a Java object, and therefore
to decide exactly when a record should be removed if the cache is
memory bound. Soft references are of course an option, but they give
no way to guarantee that records are removed according to the cache
retention policy. One possibility is to split the cache content
(e.g. into two halves, or according to a usage threshold): records
with higher priorities (i.e. the ones that should be retained) would
be held by hard references, while lower-priority records would be
held by soft references, with promotion/demotion between the two
parts according to access statistics. The hard-referenced part would
still need to be bounded in size by some estimate of record size
(e.g. a maximum record size). Unfortunately, I don't have the luxury
of precomputing the in-memory size of a Java object in advance from
its marshalled state.
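
A rough sketch of the hard/soft split could look like this, assuming
an access-ordered LinkedHashMap as the hard (LRU) tier, guarded by a
lock for simplicity (so not yet the lock-free read path I am after);
the fixed entry budget stands in for the record-size estimate
mentioned above:

    import java.lang.ref.SoftReference;
    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    final class TwoTierCache<K, V> {
        private final int hardCapacity;

        // Hard tier: access-ordered LRU map; evicted entries are demoted,
        // not dropped.
        private final LinkedHashMap<K, V> hardTier;

        // Soft tier: the GC may reclaim these values under memory pressure.
        private final ConcurrentMap<K, SoftReference<V>> softTier =
                new ConcurrentHashMap<K, SoftReference<V>>();

        TwoTierCache(final int hardCapacity) {
            this.hardCapacity = hardCapacity;
            this.hardTier = new LinkedHashMap<K, V>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                    if (size() > TwoTierCache.this.hardCapacity) {
                        // Demote the least recently used entry to the soft tier.
                        softTier.put(eldest.getKey(),
                                new SoftReference<V>(eldest.getValue()));
                        return true;
                    }
                    return false;
                }
            };
        }

        synchronized void put(K key, V value) {
            hardTier.put(key, value);
        }

        synchronized V get(K key) {
            V value = hardTier.get(key); // also records the access for LRU order
            if (value != null) return value;
            SoftReference<V> ref = softTier.remove(key);
            value = (ref != null) ? ref.get() : null;
            if (value != null) hardTier.put(key, value); // promote back on a hit
            return value;
        }
    }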

My question is: how should such a cache be constructed? Do you think
the ideas proposed above are worth exploring? And what is the
current state of the art in cache implementations?

I would love to hear your opinions and would be very grateful for
any references to papers on the problems outlined here!

Thank you and kind regards,
Peter Skvarenina
 

Arne Vajhøj


I think you should start by looking at existing Java cache products:
oscache
ehcache
JBOSS Cache
etc.

And then look at JSR 107, which is supposed to produce a standard
caching API for Java (note that it is not a fast-moving standard!).
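
As a rough sketch of the kind of provider-neutral usage JSR 107 is
aiming at (based on the javax.cache API; the cache name, value types
and expiry settings below are arbitrary examples):

    import javax.cache.Cache;
    import javax.cache.CacheManager;
    import javax.cache.Caching;
    import javax.cache.configuration.MutableConfiguration;
    import javax.cache.expiry.AccessedExpiryPolicy;
    import javax.cache.expiry.Duration;
    import javax.cache.spi.CachingProvider;

    public final class JCacheExample {
        public static void main(String[] args) {
            // Picks up whichever JSR 107 provider is on the classpath.
            CachingProvider provider = Caching.getCachingProvider();
            CacheManager manager = provider.getCacheManager();

            MutableConfiguration<String, String> config =
                    new MutableConfiguration<String, String>()
                            .setTypes(String.class, String.class)
                            .setStoreByValue(false) // keep references, as an in-process cache would
                            .setExpiryPolicyFactory(
                                    AccessedExpiryPolicy.factoryOf(Duration.ONE_HOUR));

            Cache<String, String> cache = manager.createCache("records", config);
            cache.put("order-42:v3", "payload");
            System.out.println(cache.get("order-42:v3"));

            manager.close();
        }
    }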

Arne
 

skvarenina.peter

Many thanks Arne & Mark!

I guess the easiest way is to explore the existing open-source
caches and see what ideas they offer for my particular case. I was
looking at cache4j and ehcache before writing my question; a plain
ConcurrentHashMap with soft references performed about the same as
cache4j in my testing scenario.
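
For reference, something along these lines is what I mean by
"ConcurrentHashMap with soft references" (a sketch only, with no
eviction policy; cleared references are simply dropped on the next
lookup):

    import java.lang.ref.SoftReference;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    final class SoftValueCache<K, V> {
        private final ConcurrentMap<K, SoftReference<V>> map =
                new ConcurrentHashMap<K, SoftReference<V>>();

        void put(K key, V value) {
            map.put(key, new SoftReference<V>(value));
        }

        V get(K key) {
            SoftReference<V> ref = map.get(key);
            if (ref == null) return null;
            V value = ref.get();
            if (value == null) map.remove(key, ref); // GC cleared it; drop the stale entry
            return value;
        }
    }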

Best regards,
Peter Skvarenina
 
