Overriding a method at the instance level on a subclass of a builtin type

Zac Burns

Sorry for the long subject.

I'm trying to create a subclass dictionary that runs extra init code
on the first __getitem__ call. However, the performance of __getitem__
is quite important - so I'm trying in the subclassed __getitem__
method to first run some code and then patch in the original dict
method for the instance to avoid even the check to see if the init
code has been run. Various recipes using instancemethod and the like
have failed me.

Curiously, if __slots__ is not specified, no error occurs when setting
self.__getitem__, but the function is not overridden. If __slots__ is
['__getitem__'], however, it complains that __getitem__ is read-only. I
do not understand that behavior.

--
Zachary Burns
(407)590-4814
Aim - Zac256FL
Production Engineer (Digital Overlord)
Zindagi Games
 
George Sakkis

For new-style classes, special methods are always looked up in the
class, not the instance, so you're out of luck there. What are you
trying to do? Perhaps there is a less magic solution to the general
problem.
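
A minimal sketch of that lookup rule, written for Python 3 (the thread itself predates it). LazyDict is a hypothetical name used only for this illustration. It shows why the assignment in the original post raises no error yet changes nothing for d[key]:

```python
class LazyDict(dict):
    """Hypothetical dict subclass, used only for this demonstration."""


d = LazyDict(a=1)

# The assignment succeeds because instances of the subclass have a __dict__.
d.__getitem__ = lambda key: "patched"

# An explicit attribute lookup finds the instance attribute ...
explicit = d.__getitem__("a")

# ... but the subscript syntax looks __getitem__ up on type(d) and never
# consults the instance, so the original dict behaviour is used.
implicit = d["a"]

print(explicit, implicit)
```

The instance attribute exists, but the subscript operator does not see it.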

George
 
Aaron Brady

That sounds like the State Pattern, from GoF. http://en.wikipedia.org/wiki/State_pattern

I like the idea of 'renaming', not redefining, but reassigning methods
at different points during an object's lifetime. I often wish I had
more experience with it, and more docs talked about it.

It's hard on memory usage, since each instance has its own function
attribute, even if there's still only one instance of the function.
Without it, the function attribute is just looked up on the class.

Not thoroughly tested:
>>> class A:
...     def methA( self ):
...             print 'methA'
...             self.meth= self.methB
...     meth= methA
...     def methB( self ):
...             print 'methB'
...
>>> a= A()
>>> a.meth()
methA
>>> a.meth()
methB
 
Bryan Olson

Zac said:
Sorry for the long subject.

I'm trying to create a subclass dictionary that runs extra init code
on the first __getitem__ call. However, the performance of __getitem__
is quite important - so I'm trying in the subclassed __getitem__
method to first run some code and then patch in the original dict
method for the instance to avoid even the check to see if the init
code has been run. Various recipes using instancemethod and the like
have failed me.

One option is to re-assign the object's __class__, as in:

class XDict (dict):
    pass

class ZDict (XDict):
    def __getitem__(self, k):
        whatever_you_want_to_do_once(self)
        result = dict.__getitem__(self, k)
        self.__class__ = XDict
        return result


The first dict subtype is needed because __class__ assignment requires
that both the current and newly-assigned class be 'heap types', which
the native dict is not.
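
A Python 3 sketch of the same idea with a small check attached. The names PlainDict, LazyDict, _late_init and init_calls are made up for the illustration, and the one-time setup is only a placeholder. The last part assumes CPython, where assigning the builtin dict itself to __class__ is rejected:

```python
class PlainDict(dict):
    """Trivial heap-type subclass; behaves like dict."""


class LazyDict(PlainDict):
    init_calls = 0  # demo-only counter showing that the setup ran once

    def _late_init(self):
        # Stand-in for the real one-time setup code.
        LazyDict.init_calls += 1
        self["answer"] = 42

    def __getitem__(self, key):
        self._late_init()
        # Swap to the plain class: later lookups skip this method entirely.
        self.__class__ = PlainDict
        return dict.__getitem__(self, key)


d = LazyDict()
first = d["answer"]    # runs _late_init, then swaps the class
second = d["answer"]   # served directly by dict.__getitem__

# Assigning the builtin dict itself is refused (CPython raises TypeError).
try:
    LazyDict().__class__ = dict
    builtin_swap_error = None
except TypeError as exc:
    builtin_swap_error = exc

print(first, second, LazyDict.init_calls, type(d).__name__)
```

After the first lookup the object is a plain PlainDict, so there is no per-call flag check left.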
 
Arnaud Delobelle

Zac Burns said:
Sorry for the long subject.

I'm trying to create a subclass dictionary that runs extra init code
on the first __getitem__ call. However, the performance of __getitem__
is quite important - so I'm trying in the subclassed __getitem__
method to first run some code and then patch in the original dict
method for the instance to avoid even the check to see if the init
code has been run. Various recipes using instancemethod and the like
have failed me.

Curiously, if __slots__ is not specified, no error occurs when setting
self.__getitem__, but the function is not overridden. If __slots__ is
['__getitem__'], however, it complains that __getitem__ is read-only. I
do not understand that behavior.

You can change the class on the fly to achieve what you want:
>>> class D1(dict):
...     def __getitem__(self, key):
...         print 'first call'
...         self.__class__ = D2
...         return dict.__getitem__(self, key)
...
>>> class D2(dict):
...     pass
...
>>> d = D1(foo=42)
>>> d['foo']
first call
42
>>> d['foo']
42

HTH
 
Jason Scheirer

Aaron Brady said:
I like the idea of 'renaming', not redefining, but reassigning methods
at different points during an object's lifetime.
[...]

The problem with using this pattern in the way that you've
specified is that you have a potential memory leak/object lifetime
issue. Assigning a bound method of an instance (which itself holds a
reference to self) to another attribute in that same instance creates
a circular reference that, in my experience, can trip up the GC more
often than not.

You can subclass it as easily:

class dictsubclass(dict):
    def __getitem__(self, keyname):
        if not hasattr(self, '_run_once'):
            self.special_code_to_run_once()
            self._run_once = True
        return super(dictsubclass, self).__getitem__(keyname)

If that extra ~16 bytes associated with the subclass is really a
problem:

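class dictsubclass(dict):
    def __getitem__(self, keyname):
        self.special_code_to_run_once()
        # Note: CPython rejects this assignment with a TypeError because
        # dict is not a heap type; a trivial dict subclass is needed here.
        self.__class__ = dict
        return dict.__getitem__(self, keyname)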

But I don't think that's a good idea at all.
 
Aaron Brady

Interesting. The following code ran, and process memory usage rose to
150MB. It failed to return to normal afterward.
>>> [...]
...     a= []
...     a.append(a)
...

However, the following code succeeded in returning usage to normal.

It was in version 2.6. So, the GC succeeded in collecting circularly
linked garbage when invoked manually. That might have implications in
the OP's use case.
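
A rough Python 3 sketch of the cycle under discussion, assuming CPython's reference counting plus cyclic collector. The class mirrors the methA/methB example from earlier in the thread; the weak reference is only there to observe whether the object is still alive. The automatic collector is switched off for the demonstration so that the timing is predictable:

```python
import gc
import weakref


class A:
    def methA(self):
        # Storing a bound method on the instance makes the instance
        # reference itself: a -> a.__dict__ -> bound method -> a.
        self.meth = self.methB
        return "methA"

    def methB(self):
        return "methB"

    meth = methA


gc.disable()            # keep the automatic collector out of the way
a = A()
a.meth()                # creates the self-referencing attribute
ref = weakref.ref(a)

del a
alive_before = ref() is not None   # reference counting alone cannot free it

gc.collect()                       # the cyclic collector can
alive_after = ref() is not None
gc.enable()

print(alive_before, alive_after)
```

In normal operation the collector also runs automatically, so the cycle is reclaimed eventually, just not at the moment the last outside reference goes away.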

In another language, it would work differently, if it lacked unbound
method descriptors. C++ for example, untested:

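class C {
public:
    typedef void (C::*func_t)();   // pointer to a member function of C
    func_t meth;
    C( ) { meth= &C::methA; }
    void methA( ) { meth= &C::methB; }
    void methB( ) { }
};
// call through the pointer with: (c.*c.meth)();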

It has no problems with memory consumption (an extra pointer per
object), or circular references; functions are not first-class
objects. However they are in Python, which creates an entire bound
method object per instance.

The OP stated:
run some code and then patch in the original dict
method for the instance to avoid even the check to see if the init
code has been run.

So your, Arnaud's, and Bryan's '.__class__' solution is probably best,
and possibly even truer to the intent of the State Pattern.

It is too bad that you can't assign an unbound method to the member,
and derive the bound method on the fly. That might provide a middle-
ground solution.
 
Zac Burns

The __class__ approach seems to be the most promising; however, I have
more 'state' methods to worry about, so I might end up building new
classes on the fly rather than have a class per permutation of states!
Now the code isn't quite as clear as I thought it was going to be.

It seems unfortunate to me that methods are always looked up on the
class for new style objects. Was this done for speed reasons?



 
Arnaud Delobelle

Zac Burns said:
The class method seems to be the most promising, however I have more
'state' methods to worry about so I might end up building new classes
on the fly rather than have a class per permutation of states! Now the
code isn't quite as clear as I thought it was going to be.

It seems unfortunate to me that methods are always looked up on the
class for new style objects. Was this done for speed reasons?

It's only special methods such as __getitem__, ...

You can override a normal method on a per-object basis just by adding a
callable attribute with its name to the object:
>>> class A(object):
...     def foo(self): print 'A.foo'
...
>>> a = A()
[...]
a.foo
 
George Sakkis

Note that the overridden "method" here is a plain function; it doesn't
take self as the first argument. If you want to use a callable that
expects self as its first argument, you have to bind it to the object
explicitly:
a_foo
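
A short Python 3 sketch of both variants, since the original session did not survive intact. The class A and the function a_foo follow the names used above; the return strings are invented for the illustration. types.MethodType is one standard way to do the explicit binding:

```python
import types


class A:
    def foo(self):
        return "A.foo"


a = A()

# Variant 1: a plain callable stored on the instance. No self is passed.
a.foo = lambda: "plain function"
plain = a.foo()


# Variant 2: a function that expects self, bound to this one object.
def a_foo(self):
    return "a_foo bound to %s" % type(self).__name__


a.foo = types.MethodType(a_foo, a)
bound = a.foo()

# Other instances are unaffected and still use the class's method.
other = A().foo()

print(plain, bound, other)
```

This works for ordinary method names only; for special methods such as __getitem__ the instance attribute is ignored by the corresponding syntax.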

George
 
Zac Burns

Ok... but why are the special methods handled differently?
 
George Sakkis

Ok... but why are the special methods handled differently?

Because otherwise they wouldn't be special ;-) And also for
performance and implementation reasons I believe.

George
 
Zac Burns

Ok. Feature request then - assignment of a special method name to an
instance raises an error.

 
Aaron Brady

Zac Burns said:
The class method seems to be the most promising, however I have more
'state' methods to worry about so I might end up building new classes
on the fly rather than have a class per permutation of states! Now the
code isn't quite as clear as I thought it was going to be.

It seems unfortunate to me that methods are always looked up on the
class for new style objects. Was this done for speed reasons?

I thought of two more solutions. One, derive a new class for each
instance as you create it, and assign methods to that, that can be
shared. Even more memory consumption though.
>>> class A:
...     def methA( self ): print 'methA'
...     def methB( self ): print 'methB'
...
[...]
methB
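
Since most of that session was lost, here is one possible Python 3 reading of the first idea, not necessarily the code originally posted. The factory new_a and the class name A_private are hypothetical. Each object gets its own throwaway subclass, so reassigning a method on that subclass affects only that object, and because the assignment is made on a class it also works for special methods:

```python
class A:
    def methA(self):
        return "methA"

    def methB(self):
        return "methB"


def new_a():
    # One private subclass per object, so "its" class can be patched freely.
    cls = type("A_private", (A,), {"meth": A.methA})
    return cls()


a, b = new_a(), new_a()

type(a).meth = A.methB                 # change state for a only
results = (a.meth(), b.meth())

# Special methods can be swapped the same way, since they live on a class.
type(a).__len__ = lambda self: 3
length = len(a)

print(results, length)
```

The cost is one extra class object per instance, which is the memory overhead mentioned above.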

Two, use getter properties to return the right function, based on
index and array.
>>> class A( object ):
...     @property
...     def meth( self ):
...         return self.meths[ self.methA_i ].__get__( self, A )
...     def methA( self ):
...         print 'methA'
...         self.methA_i+= 1
...     def methB( self ):
...         print 'methB'
...         self.methA_i-= 1
...     meths= [ methA, methB ]
...     def __init__( self ):
...         self.methA_i= 0
...
>>> a= A()
>>> a.meth()
methA
>>> a.meth()
methB
>>> a.meth()
methA

Or (two B), look up the method by name, using a list of names and an
index.
>>> class A( object ):
...     @property
...     def meth( self ):
...         return getattr( self, self.meths[ self.methA_i ] )
...     def methA( self ):
...         print 'methA'
...         self.methA_i+= 1
...     def methB( self ):
...         print 'methB'
...         self.methA_i-= 1
...     meths= [ 'methA', 'methB' ]
...     def __init__( self ):
...         self.methA_i= 0
...
>>> a= A()
>>> a.meth()
methA
>>> a.meth()
methB
>>> a.meth()
methA

The 'meths' list will need separate lists for each 'group' of state-
dependent functions you wish to use.
 
Arnaud Delobelle

Zac Burns said:
Ok. Feature request then - assignment of a special method name to an
instance raises an error.

I haven't got the time to implement it, but I'm sure you can obtain the
behaviour you want.
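
If the request is read literally (make the silent no-op an error), one possible Python 3 sketch is a guard in __setattr__. GuardedDict and its _guarded set are hypothetical, and the list of names is deliberately short rather than complete:

```python
class GuardedDict(dict):
    # Hypothetical, deliberately short list of guarded special names.
    _guarded = frozenset(
        {"__getitem__", "__setitem__", "__delitem__", "__len__"}
    )

    def __setattr__(self, name, value):
        if name in self._guarded:
            raise AttributeError(
                "%s is looked up on the type; setting it on an instance "
                "has no effect on the corresponding syntax" % name
            )
        super().__setattr__(name, value)


d = GuardedDict(a=1)
d.note = "ordinary attributes still work"

try:
    d.__getitem__ = lambda key: None
    blocked = False
except AttributeError:
    blocked = True

print(blocked, d["a"], d.note)
```

An explicit set is used instead of a blanket check for double-underscore names because assigning __class__ on an instance also goes through __setattr__ and should stay possible.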
 
Arnaud Delobelle

Arnaud Delobelle said:
I haven't got the time to implement it, but I'm sure you can obtain the
behaviour you want.

OK, I've had half an hour to fill this afternoon, so I tried to implement
it. I've restricted the ability to override special methods to
__getitem__, but this could be extended to any special method AFAICS. It
combines a metaclass and two descriptors (one for the metaclass and one
for the class); there may be a simpler way! It is proof-of-concept
code: I have not tried to make it behave sensibly when no __getitem__
method is defined (although that would be straightforward), and I have
not thought about how it would work with (multiple) inheritance (this
may require lots more thinking). Here it is, tested very succinctly on
Python 2.5:

class ClassGetItem(object):
    def __get__(self, obj, objtype=None):
        return obj._getitem_
    def __set__(self, obj, val):
        obj._getitem_ = val

class GetItem(object):
    def __get__(self, obj, objtype=None):
        return obj._getitem_
    def __set__(self, obj, val):
        obj._getitem_ = val

class MetaOverrideSpecial(type):
    def __new__(meta, name, bases, attrs):
        if '__getitem__' in attrs:
            attrs['_getitem_'] = attrs['__getitem__']
            attrs['__getitem__'] = GetItem()
        return type.__new__(meta, name, bases, attrs)
    __getitem__ = ClassGetItem()

class OverrideSpecial(object):
    __metaclass__ = MetaOverrideSpecial


Here is an example that shows it in action:
>>> class Foo(OverrideSpecial):
...     def __getitem__(self, key): return 'Class getitem(%s)' % key
...
>>> foo = Foo()
>>> foo[3]
'Class getitem(3)'

Override the class's __getitem__ special method:
>>> Foo.__getitem__ = lambda self, key: 'Overriden class getitem(%s)' % key
>>> foo['bar']
'Overriden class getitem(bar)'

Override the instance's __getitem__ special method:
>>> foo.__getitem__ = lambda key: 'Instance getitem(%s)' % key
>>> foo['baz']
'Instance getitem(baz)'
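
For readers on Python 3, a rough port of the same proof of concept follows. It is a sketch, not the code as posted: it uses the metaclass= keyword instead of __metaclass__, uses a single descriptor class for both roles, and assumes the descriptor should be installed only when the class defines __getitem__. The result strings are changed slightly for the illustration.

```python
class GetItem:
    """Data descriptor that forwards __getitem__ to a _getitem_ attribute."""

    def __get__(self, obj, objtype=None):
        return obj._getitem_

    def __set__(self, obj, val):
        obj._getitem_ = val


class MetaOverrideSpecial(type):
    def __new__(meta, name, bases, attrs):
        if "__getitem__" in attrs:
            # Keep the real implementation under a private name and put
            # the forwarding descriptor in its place.
            attrs["_getitem_"] = attrs.pop("__getitem__")
            attrs["__getitem__"] = GetItem()
        return super().__new__(meta, name, bases, attrs)

    # Routes "SomeClass.__getitem__ = f" to SomeClass._getitem_.
    __getitem__ = GetItem()


class OverrideSpecial(metaclass=MetaOverrideSpecial):
    pass


class Foo(OverrideSpecial):
    def __getitem__(self, key):
        return "class getitem(%s)" % key


foo = Foo()
r1 = foo[3]

Foo.__getitem__ = lambda self, key: "overridden class getitem(%s)" % key
r2 = foo["bar"]

foo.__getitem__ = lambda key: "instance getitem(%s)" % key
r3 = foo["baz"]
r4 = Foo()["qux"]      # other instances see only the class-level override

print(r1, r2, r3, r4)
```

Every subscript now goes through an extra descriptor call and attribute lookup, so this trades away the speed the original question was after.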

What-a-way-to-waste-time'ly yours
 
Arnaud Delobelle

[...]
class ClassGetItem(object):
    def __get__(self, obj, objtype=None):
        return obj._getitem_
    def __set__(self, obj, val):
        obj._getitem_ = val

class GetItem(object):
    def __get__(self, obj, objtype=None):
        return obj._getitem_
    def __set__(self, obj, val):
        obj._getitem_ = val

It's funny how the brain works. I didn't realise both classes were the
same until I read my own post!

[...]
 
