Implementing a cache

N

Nikolaus Rath

Hello,

I want to implement a caching data structure in Python that allows me
to:

1. Quickly look up objects using a key
2. Keep track of the order in which the objects are accessed (most
recently and least recently accessed one, not a complete history)
3. Quickly retrieve and remove the least recently accessed object.


Here's my idea for the implementation:

The objects in the cache are encapsulated in wrapper objects:

class OrderedDictElement(object):
__slots__ = [ "next", "prev", "key", "value" ]

These wrapper objects are then kept in a linked lists and in an ordinary
dict (self.data) in parallel. Object access then works as follows:

def __setitem__(self, key, value):
if key in self.data:
# Key already exists, just overwrite value
self.data[key].value = value
else:
# New key, attach at head of list
with self.lock:
el = OrderedDictElement(key, value, next=self.head.next, prev=self.head)
self.head.next.prev = el
self.head.next = el
self.data[key] = el

def __getitem__(self, key):
return self.data[key].value

To 'update the access time' of an object, I use

def to_head(self, key):
with self.lock:
el = self.data[key]
# Splice out
el.prev.next = el.next
el.next.prev = el.prev

# Insert back at front
el.next = self.head.next
el.prev = self.head

self.head.next.prev = el
self.head.next = el


self.head and self.tail are special sentinel objects that only have a
..next and .prev attribute respectively.


While this is probably going to work, I'm not sure if its the best
solution, so I'd appreciate any comments. Can it be done more elegantly?
Or is there an entirely different way to construct the data structure
that also fulfills my requirements?


I already looked at the new OrderedDict class in Python 3.1, but
apparently it does not allow me to change the ordering and is therefore
not suitable for my purpose. (I can move something to one end by
deleting and reinserting it, but I'd like to keep at least the option of also
moving objects to the opposite end).


Best,


-Nikolaus
 
S

Steven D'Aprano

Hello,

I want to implement a caching data structure in Python that allows me
to:

1. Quickly look up objects using a key 2. Keep track of the order in
which the objects are accessed (most
recently and least recently accessed one, not a complete history)
3. Quickly retrieve and remove the least recently accessed object.

Google for "python LRU cache".

Here are the first three hits:

http://code.activestate.com/recipes/498245/
http://code.activestate.com/recipes/252524/
http://www.algorithm.co.il/blogs/index.php/programming/python/small-python-challenge-no-2-lru-cache/
 
S

skip

Nikolaus> I want to implement a caching data structure in Python that
Nikolaus> allows me to:

Nikolaus> 1. Quickly look up objects using a key
Nikolaus> 2. Keep track of the order in which the objects are accessed
Nikolaus> (most recently and least recently accessed one, not a
Nikolaus> complete history)
Nikolaus> 3. Quickly retrieve and remove the least recently accessed
Nikolaus> object.

My Cache module does #1 and #3. I'm not sure if you want #2 for internal
cache maintenance or for as part of the API.

http://www.smontanaro.net/python/Cache.py

Skip
 
P

pdpi

    Nikolaus> I want to implement a caching data structure in Python that
    Nikolaus> allows me to:

    Nikolaus>  1. Quickly look up objects using a key
    Nikolaus>  2. Keep track of the order in which the objects are accessed
    Nikolaus>     (most recently and least recently accessed one, not a
    Nikolaus>     complete history)
    Nikolaus>  3. Quickly retrieve and remove the least recently accessed
    Nikolaus>     object.

My Cache module does #1 and #3.  I'm not sure if you want #2 for internal
cache maintenance or for as part of the API.

   http://www.smontanaro.net/python/Cache.py

Skip

I'm not sure whether #2 is doable at all, as written. You _need_ a
complete history (at least the full ordering of the items in the
cache) to be able to tell what the least recently used item is.
 
S

skip

pdpi> I'm not sure whether #2 is doable at all, as written. You _need_ a
pdpi> complete history (at least the full ordering of the items in the
pdpi> cache) to be able to tell what the least recently used item is.

I should have added that my Cache module does maintain a history. (Clearly
it needs that to determine the LRU item.) It is exposed as part of the API
via the ftimes method.

S
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,962
Messages
2,570,134
Members
46,690
Latest member
MacGyver

Latest Threads

Top