Lazy container

C

Cristiano Paris

Hi everyone,

I'm trying to write a Container which should mimic a list. Basically,
the container pulls items on the fly from an unspecified source through
a function and returns an instance of a given class over the pulled item.

That is:

class lazy(object):
def __getitem__(self,index):
return Foo(f(index))

lc = lazy()

Here I see a problem: two consecutive accesses to lc for the same index
would retrieve two different instances of Foo.

In some scenarios this is not desirable since one wants to have all the
accessors to share the same instance for the same index so as to reduce
memory consumption.

So, I thought to use an internal dictionary of all the Foo instances
given away so far. Something like:

class lazy(object):
def __init__(self):
self.cache = {}

def __getitem__(self,index):
if not self.cache.has_key(index):
value = Foo(f(index)) # MISS
self.cache[index] = value
else: # HIT
value = self.cache[index]

return value

This is acceptable as it reduces the number of instances living in the
system at any given time.

The problem with this implementation is that the cache never decreases
in length as Foo instances are no longer referenced by the lc accessor
since they're all referenced by the internal cache.

There are scenarios in which f() retrieves items from a very huge source
(i.e. a database) and threads sharing the lazing container and consuming
Foo instances to use them for a very short time and then discardi them:
after a while you'll eventually end up having a whole copy of the source
in memory, which is unacceptable.

One solution may be to use gc.get_referrers() to search the cache for
unused instances every time we have a cache miss so as to see if we
could discard some instance before inserting the new one in the cache.

But this comes at a cost.

May be someone has faced this problem before and came up with some
obscure language feature to solve this elegantly :D

Thanks for your replies.

Cristiano
 
D

Duncan Booth

Cristiano Paris said:
May be someone has faced this problem before and came up with some
obscure language feature to solve this elegantly :D

Yes. Try using weak references. Specifically a weakref.WeakValueDictionary
should meet your needs.
 
P

Peter Otten

Cristiano said:
I'm trying to write a Container which should mimic a list. Basically,
the container pulls items on the fly from an unspecified source through
a function and returns an instance of a given class over the pulled item.

That is:

class lazy(object):
def __getitem__(self,index):
return Foo(f(index))

lc = lazy()

Here I see a problem: two consecutive accesses to lc for the same index
would retrieve two different instances of Foo.

In some scenarios this is not desirable since one wants to have all the
accessors to share the same instance for the same index so as to reduce
memory consumption.

So, I thought to use an internal dictionary of all the Foo instances
given away so far. Something like:
The problem with this implementation is that the cache never decreases
in length as Foo instances are no longer referenced by the lc accessor
since they're all referenced by the internal cache.


You want a WeakValueDictionary:
http://docs.python.org/lib/module-weakref.html

Peter
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,962
Messages
2,570,134
Members
46,692
Latest member
JenniferTi

Latest Threads

Top