Freek Dijkstra
Is there a best practice on how to override __new__?
I have a base class, RDFObject, which is instantiated using a unique
identifier (a URI in this case). If an object with a given identifier
already exists, I want to return the existing object, otherwise, I
want to create a new object and add this new object to a cache. I'm
not sure if there is a name for such a creature, but I've seen the
name MultiSingleton in the archive.
This is not so hard; this can be done by overriding __new__(), as long
as I use a lock in case I want my code to be multi-threading
compatible.
import threading

threadlock = threading.Lock()

class RDFObject(object):
    # Class variable, shared among all RDFObject instances.
    _cache = {}

    def __new__(cls, *args, **kargs):
        assert len(args) >= 1
        uri = args[0]
        # Check and insert under the lock, so two threads cannot
        # both create an object for the same URI.
        with threadlock:
            if uri not in cls._cache:
                cls._cache[uri] = object.__new__(cls)
            return cls._cache[uri]

    def __init__(self, uri):
        pass
    # ...
However, I have the following problem:
The __init__-method is called every time you call RDFObject().
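To make the problem concrete, here is a minimal, self-contained sketch. The `init_calls` counter and the example URI are hypothetical and exist only for this demonstration. Python calls `__init__()` whenever `__new__()` returns an instance of the class, whether or not that instance came from the cache:

```python
import threading

_lock = threading.Lock()

class RDFObject(object):
    _cache = {}
    init_calls = 0  # hypothetical counter, only for this demonstration

    def __new__(cls, uri):
        with _lock:
            if uri not in cls._cache:
                cls._cache[uri] = object.__new__(cls)
            return cls._cache[uri]

    def __init__(self, uri):
        # Runs on every RDFObject(uri) call, cached or not.
        RDFObject.init_calls += 1

a = RDFObject("http://example.org/a")
b = RDFObject("http://example.org/a")
same_object = a is b
```

Both calls return the same object, yet `__init__()` has run twice by the end of the snippet.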
The benefit of this multi-singleton is that I can put this class in a
module, call RDFObject(someuri), and simply keep adding state to it
(which is what we want). If it already had some state, good, that is
retained. If it did not, fine, we get a new object.
For example:
x = RDFObject(someuri)
x.myvar = 123
# ... later in the code ...
y = RDFObject(someuri)
assert y.myvar == 123
My fellow programmers and I tend to forget about the __init__() catch.
For example, when we subclass RDFObject:
class MySubclass(RDFObject):
    def __init__(self, uri):
        RDFObject.__init__(self, uri)
        self.somevar = []
Now, this does not work. The list is unintentionally initialized a
second time, so the appended value is lost:
x = MySubclass(someotheruri)
x.somevar.append(123)
# ... later in the code ...
y = MySubclass(someotheruri)
assert y.somevar[0] == 123    # fails: somevar was reset to []
So I'm wondering: is there a best practice that allows the behaviour
we're looking for? (I can think of a few things, but I consider them
all rather ugly.) Is there a good way to suppress the second call to
__init__() from the base class? Perhaps even without overriding
__new__?
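For reference, one commonly suggested direction that avoids overriding __new__ is a metaclass whose `__call__()` consults the cache before any construction happens. The sketch below is only an illustration under stated assumptions, not an established best practice. The name `URICached` is hypothetical, the `metaclass=` keyword assumes Python 3, and the cache is shared across all subclasses, as in the original class variable. An `RLock` is assumed so that an `__init__()` that itself constructs another cached object does not deadlock.

```python
import threading

class URICached(type):
    """Hypothetical metaclass: at most one instance per URI."""
    _cache = {}                 # shared by every class using this metaclass
    _lock = threading.RLock()   # re-entrant, in case __init__ builds other objects

    def __call__(cls, uri, *args, **kwargs):
        with URICached._lock:
            if uri not in URICached._cache:
                # type.__call__ runs __new__ and __init__ exactly once.
                URICached._cache[uri] = super().__call__(uri, *args, **kwargs)
            return URICached._cache[uri]

class RDFObject(metaclass=URICached):
    def __init__(self, uri):
        self.uri = uri

class MySubclass(RDFObject):
    def __init__(self, uri):
        RDFObject.__init__(self, uri)
        self.somevar = []

x = MySubclass("http://example.org/thing")
x.somevar.append(123)
# ... later in the code ...
y = MySubclass("http://example.org/thing")
```

Here the second call returns the cached object without running `__init__()` again, so `y.somevar` still holds the appended value. One trade-off of the shared cache: whichever class first claims a URI determines the type of the object returned for it afterwards.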