Dynamically determine base classes on instantiation

Thomas Bach

Hi list,

I'm confronted with a strange problem I cannot find a clean solution
for. To me it seems like I need metaclasses. Anyway, I got stuck a bit
deeper in that topic and couldn't find a proper solution either. But
judge for yourselves.

I want a class that determines on instantiating its base classes
dynamically. Consider the following two use cases

a = Foo(['a', 'list']) # returns an instance that behaves like a list
assert len(a) == 2
assert a[0] == 'a'
assert a == ['a', 'list']
assert isinstance(a, list) # This would be nice, but no must-have

b = Foo({'blah': 8}) # returns an instance that behaves like a dict
assert b['blah'] == 8
assert b == {'blah': 8}
assert isinstance(b, dict) # again, no must-have

a.do_something() # common function to both instances as defined
b.do_something() # in the Foo class


What I'm currently doing is something like the following:

class Foo(object):

    def __init__(self, obj):
        self._obj = obj

    def __len__(self):
        return len(self._obj)

    def __getitem__(self, name):
        return self._obj[name]

    # …

    def do_something(self):
        # do something on self._obj
        pass

Which seems ugly. Is there a way to provide the functions of `list'
and `dict' in Foo's look-up path without having to write all the
stubs myself?

Regards,
Thomas Bach.
 
Steven D'Aprano

Hi list,

I'm confronted with a strange problem I cannot find a clean solution for.
I want a class that determines on instantiating its base classes
dynamically. Consider the following two use cases

Some comments:

1) What you show are not "use cases", but "examples". A use-case is a
description of an actual real-world problem that needs to be solved. A
couple of asserts is not a use-case.

2) You stated that you have a "strange problem", but you haven't told us
what that problem is, you went directly to what you think is the
solution: "a class that determines on instantiating its base classes
dynamically".

How about you tell us the problem, and we'll suggest a solution?

I'm pretty sure it isn't going to be what you asked for, because that
goes completely against the fundamentals of object-oriented design.

Consider your two examples:

a = Foo(['a', 'list'])
b = Foo({'blah': 8})

According to your design:

a is a Foo
b is a Foo
therefore a and b are the same type

So far so good: this is perfectly normal object-oriented design.

But you also have

a is a list, but not a dict
b is a dict, but not a list
therefore a and b are different types

So you contradict yourself: at the same time, a and b are both the same
and different types.

So now you see why you shouldn't do what you ask for. Now let me tell you
why you *can't* do what you ask for: Python's classes don't work like
that. You can't set the base classes of an instance individually. All
instances of a class share the same base classes.

I think that the right solution here is not inheritance, but composition
and delegation. You're already on the right track when you give your Foo
instances an attribute _obj and then operate on that, but you are wrong
to focus on inheritance.

Instead, Foo should implement only the shared operations, and everything
else should be delegated to _obj.

Automatic delegation is trivially easy (except see below):

http://code.activestate.com/recipes/52295

This is a *really old* recipe, from ancient days before you could inherit
from built-in types like list, dict etc., so the description of the
problem is no longer accurate. But the technique is still good, with
unfortunately one complication:

If you inherit from builtins, you cannot use automatic delegation on the
magic "double-underscore" (dunder) methods like __eq__, __len__, etc.

See this thread here for one possible solution:

http://www.velocityreviews.com/forums/t732798-automatic-delegation-in-python-3-a.html
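For what it's worth, here is a minimal sketch of that delegation approach in Python 3. It is not the code from the recipe or from the linked thread; the choice of forwarded dunder methods and the body of do_something are assumptions picked to match the asserts in the original post.

```python
class Foo:
    """Wrap a list or dict; delegate everything Foo does not define itself."""

    def __init__(self, obj):
        self._obj = obj

    def __getattr__(self, name):
        # Only called when normal lookup fails, so Foo's own methods win.
        if name == '_obj':
            # Guard against infinite recursion if _obj is not set yet.
            raise AttributeError(name)
        return getattr(self._obj, name)

    # Implicit dunder lookups bypass __getattr__, so forward them explicitly.
    def __len__(self):
        return len(self._obj)

    def __getitem__(self, key):
        return self._obj[key]

    def __eq__(self, other):
        return self._obj == other

    def do_something(self):
        # Placeholder for the shared behaviour (an assumption).
        return type(self._obj).__name__


a = Foo(['a', 'list'])
b = Foo({'blah': 8})
a.append('x')  # found via __getattr__ on the wrapped list
print(len(a), a[0], sorted(b.keys()), a.do_something(), b.do_something())
```

Note that isinstance(a, list) is False with this approach, which the original post listed as nice to have but not required.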
 
Thomas Bach

Some comments:

1) What you show are not "use cases", but "examples". A use-case is a
description of an actual real-world problem that needs to be solved. A
couple of asserts is not a use-case.

Thanks for the clarification on that one. So, here's the use-case: I'm
querying the crunchbase API which returns JSON data and is rather
poorly documented. I want to create a data model for the companies
listed on Crunchbase in order to be able to put the queried data in a
data-base. As I am too lazy to examine all the data by hand I thought
I would automate this. I thought that it would be nice to be able to pass
a function a parsed JSON object (AFAIK these are lists, dicts,
strings, ints, floats, strs in Python) and have it return the type of
these objects. For the simple classes (str, int, float) this is quite
trivial: F('foo') should return `str' and F(8) should return `int'.

For a compound object like dict I would like it to return the data
fields with their type. Hence, F({'foo': 8}) should return
{'foo': int}, and given that f = F({'foo': {'bar': 80}}) I would like
f to equal {'foo': dict}, with the option to query the type of
'foo' via f.foo, where the latter should equal {'bar': int}. So
far, this is not a complicated case. But, sometimes a data field on
returned data set is simply None. Thus, I want to extract the types from
another data set and merge the two.

So, my question (as far as I can see it, please correct me if I am
wrong) is less of the "How do I achieve this?"-kind, but more of the
"What is a clean design for this?"-kind. My intuitive thought was that
the `merge' function should be a part of the object returned from `F'.

How about you tell us the problem, and we'll suggest a solution?

I can see your point. On the other hand, by expressing my thoughts you
can at least tell me that these are completely wrong and correct my
way of thinking this way.

Consider your two examples:

a = Foo(['a', 'list'])
b = Foo({'blah': 8})

According to your design:

a is a Foo
b is a Foo

I actually never said that. I simply wanted `a' and `b' to share the
same function (the `merge' function), I thought that the easiest way
to achieve this is by letting them share the same name-space. But, as
you show: …

therefore a and b are the same type

So far so good: this is perfectly normal object-oriented design.

But you also have

a is a list, but not a dict
b is a dict, but not a list
therefore a and b are different types

So you contradict yourself: at the same time, a and b are both the same
and different types.

… I already made a mistake on the logical level.

Instead, Foo should implement only the shared operations, and everything
else should be delegated to _obj.

If you inherit from builtins, you cannot use automatic delegation on the
magic "double-underscore" (dunder) methods like __eq__, __len__, etc.

See this thread here for one possible solution:

http://www.velocityreviews.com/forums/t732798-automatic-delegation-in-python-3-a.html

OK, thanks for the hint. I will see how I'm going to put all this
stuff together.

Regards,
Thomas.
 
Hans Mulder

Thanks for the clarification on that one. So, here's the use-case: I'm
querying the crunchbase API which returns JSON data and is rather
poorly documented. I want to create a data model for the companies
listed on Crunchbase in order to be able to put the queried data in a
data-base. As I am too lazy to examine all the data by hand I thought
I automatize this. I thought that it would be nice to be able to pass
a function a parsed JSON object (AFAIK these are lists, dicts,
strings, ints, floats, strs in Python) and it returns me the type of
these objects. For the simple classes (str, int, float) this is quite
trivial: F('foo') should return `str' and F(8) should return `int'.

For a compound object like dict I would like it to return the data
fields with their type. Hence, F({'foo': 8}) should return
{'foo': int}, and given that f = F({'foo': {'bar': 80}}) I would like
f to equal to {'foo': dict}, with the option to query the type of
'foo' via f.foo, where the latter should equal to {'bar': int}. So
far, this is not a complicated case. But, sometimes a data field on
returned data set is simply None. Thus, I want to extract the types from
another data set and merge the two.

So, my question (as far as I can see it, please correct me if I am
wrong) is less of the "How do I achieve this?"-kind, but more of the
"What is a clean design for this?"-kind. My intuitive thought was that
the `merge' function should be a part of the object returned from `F'.

The misunderstanding is that you feel F should return an object with
a 'merge' method and a varying base type, while Steven and others
think that F should be a function.

Maybe something like:

def F(obj):
    if obj is None:
        return None
    tp = type(obj)
    if tp in (bool, int, float, str):
        return tp
    elif tp is list:
        return merge([F(elem) for elem in obj])
    elif tp is dict:
        return dict((k, F(v)) for k,v in obj.iteritems())
    else:
        raise ValueError("Unexpected type %s for value %s" %(tp, obj))

def merge(lst):
    if None in lst:
        not_nones = [elem for elem in lst if elem is not None]
        if not_nones:
            not_none = not_nones[0]
            lst = [not_none if elem is None else elem for elem in lst]
        else:
            return lst # all elements are None; nothing can be done
    types = {}
    for elem in lst:
        if type(elem) is dict:
            for k,v in elem.iteritems():
                if v is None:
                    if k in types:
                        elem[k] = types[k]
                    else:
                        for other in lst:
                            if (other is not elem
                                and type(other) is dict
                                and k in other
                                and other[k] is not None
                            ):
                                elem[k] = types[k] = other[k]
                                break
    return lst


The merge logic you have in mind may be different from what I just
made up, but the idea remains: F and merge can be functions.


Hope this helps,

-- HansM
 
Dennis Lee Bieber

Thanks for the clarification on that one. So, here's the use-case: I'm
querying the crunchbase API which returns JSON data and is rather
poorly documented. I want to create a data model for the companies
listed on Crunchbase in order to be able to put the queried data in a
data-base. As I am too lazy to examine all the data by hand I thought
I automatize this. I thought that it would be nice to be able to pass
a function a parsed JSON object (AFAIK these are lists, dicts,
strings, ints, floats, strs in Python) and it returns me the type of
these objects. For the simple classes (str, int, float) this is quite
trivial: F('foo') should return `str' and F(8) should return `int'.

I'm not familiar with JSON structure, but off-hand I'd say the point
to determine the nature of the data is during the so-called parsing of
the JSON data itself, not after... Based upon http://www.json.org/ the
type of an item can basically be determined from the first character of
the "value":
    {            dictionary (JSON "object")
    [            list (JSON "array")
    "            string
    t/f/n        true/false/null
    - or 0..9    number

I'd be looking for some way to have the parser itself return a
structure of tuples of (type, parsedJSONitem)

Of course, since the parse result (at least from my recent
experiment) is a Python structure, it isn't difficult to walk that
structure...
>>> import simplejson as j
>>> SAMPLE = '["foo", {"bar":["baz", null, 1.0, 2]}]'
>>> parse = j.loads(SAMPLE)
>>> parse
[u'foo', {u'bar': [u'baz', None, 1.0, 2]}]
>>> def typer(obj, level=0):
....     otype = type(obj)
....     print "\t" * level, otype
....     if otype == type([]):
....         for o in obj:
....             typer(o, level+1)
....     elif otype == type({}):
....         for k,o in obj.items():
....             print "\t" * level, " ", k
....             typer(o, level+1)
....     else:
....         print "\t" * level, " ", obj
....
>>> print parse
[u'foo', {u'bar': [u'baz', None, 1.0, 2]}]
>>> typer(parse)
<type 'list'>
    <type 'unicode'>
      foo
    <type 'dict'>
      bar
        <type 'list'>
            <type 'unicode'>
              baz
            <type 'NoneType'>
              None
            <type 'float'>
              1.0

{Hmmm, forgot to print the type of "k" for dictionaries}.
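A rough Python 3 take on the same walk, using the standard json module instead of simplejson, might look like the sketch below. Returning a list of lines rather than printing, the four-space indent, and the way the key's type is reported are all assumptions made for illustration.

```python
import json


def typer(obj, level=0):
    """Return indented lines describing the type of every node in obj."""
    indent = "    " * level
    lines = [indent + type(obj).__name__]
    if isinstance(obj, list):
        for item in obj:
            lines.extend(typer(item, level + 1))
    elif isinstance(obj, dict):
        for key, value in obj.items():
            # Report the key together with its type, then describe the value.
            lines.append("%s  %r (%s)" % (indent, key, type(key).__name__))
            lines.extend(typer(value, level + 1))
    else:
        lines.append("%s  %r" % (indent, obj))
    return lines


SAMPLE = '["foo", {"bar": ["baz", null, 1.0, 2]}]'
report = typer(json.loads(SAMPLE))
print("\n".join(report))
```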
For a compound object like dict I would like it to return the data
fields with their type. Hence, F({'foo': 8}) should return
{'foo': int}, and given that f = F({'foo': {'bar': 80}}) I would like
f to equal to {'foo': dict}, with the option to query the type of
'foo' via f.foo, where the latter should equal to {'bar': int}. So
far, this is not a complicated case. But, sometimes a data field on
returned data set is simply None. Thus, I want to extract the types from
another data set and merge the two.

"But, sometimes a data field on returned data set is simply None.
Thus, I want to extract the types from another data set and merge the
two." ??? A "data field" /value/ of None has the /type/ "<type
'NoneType'>", so I don't quite understand what you intend to merge? You
can't arbitrarily change the "type" without changing the "value".
 
Thomas Bach

The misunderstanding is that you feel F should return an object with
a 'merge' method and a varying base type, while Steven and others
think that F should be a function.

OK, then my design wasn't so bad in the first place. :)

I made a class `Model' which wraps the actual type and implemented
`merge' and `F' (with a better name, though) as classmethods of
`Model' in order to tie together the stuff that belongs together. By
the way, another need I saw for this design was that

setattr(Model(), 'foo', {'bar': int})

works, whereas

setattr(dict(), 'foo', {'bar': int})

raises an AttributeError (on Python 3.2). Could someone give me the
buzz word (or even an explanation) on why that is so?
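The difference can be reproduced with a short sketch. The usual explanation, as far as I know, is that instances of built-in types such as dict carry no per-instance __dict__, while instances of classes defined in Python do. Model below is an empty stand-in, not the actual class, and AttrDict is a hypothetical name.

```python
class Model:
    """Empty stand-in for the wrapper class (hypothetical)."""


m = Model()
setattr(m, 'foo', {'bar': int})  # stored in m.__dict__

error = None
try:
    setattr(dict(), 'foo', {'bar': int})
except AttributeError as exc:
    # A plain dict instance has no __dict__ to hold new attributes.
    error = exc


class AttrDict(dict):
    """A dict subclass defined in Python gets a per-instance __dict__."""


d = AttrDict()
d.foo = {'bar': int}  # works on the subclass

print(hasattr(m, '__dict__'), hasattr(dict(), '__dict__'), hasattr(d, '__dict__'))
```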

Thomas Bach
 
Richard Thomas

class Foo(object):
    def __new__(cls, arg):
        if isinstance(arg, list):
            cls = FooList
        elif isinstance(arg, dict):
            cls = FooDict
        return object.__new__(cls, arg)

class FooList(Foo, list):
    pass

class FooDict(Foo, dict):
    pass

You could even have __new__ make these Foo* classes dynamically when it encounters a new type of argument.

Chard.
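A rough sketch of that dynamic variant in Python 3 follows. The _cache dictionary, the generated class names and the body of do_something are assumptions made for illustration, and the instance is created through the built-in's own __new__ (reached via super()) rather than through object.__new__.

```python
class Foo:
    _cache = {}  # maps a built-in type to its generated Foo subclass

    def __new__(cls, arg):
        if cls is Foo:
            base = type(arg)
            if base not in Foo._cache:
                # type(name, bases, namespace) builds a class at runtime.
                name = 'Foo' + base.__name__.capitalize()
                Foo._cache[base] = type(name, (Foo, base), {})
            cls = Foo._cache[base]
        # Skip Foo in the MRO and let the built-in allocate the instance.
        # list and dict ignore arg here; immutable types such as str need it.
        return super(Foo, cls).__new__(cls, arg)

    def do_something(self):
        # Placeholder for the shared behaviour (an assumption).
        return type(self).__name__


a = Foo(['a', 'list'])
b = Foo({'blah': 8})
s = Foo('text')
print(a.do_something(), b.do_something(), s.do_something())
```

The generated classes are cached so that two lists wrapped this way end up with the same type.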
 
Thomas Bach

On Thu, 16 Aug 2012 14:52:30 +0200, Thomas Bach declaimed the
following in gmane.comp.python.general:

Of course, since the parse result (at least from my recent
experiment) is a Python structure, it isn't difficult to walk that
structure...

I prefer that one, as I have the parsed data already lying around in
memory. But, as I think about it, I could also pass it to json.dumps
and parse it again. But, that wouldn't make much sense, right?

"But, sometimes a data field on returned data set is simply None.
Thus, I want to extract the types from another data set and merge the
two." ??? A "data field" /value/ of None has the /type/ "<type
'NoneType'>", so I don't quite understand what you intend to merge? You
can't arbitrarily change the "type" without changing the "value".

OK, I am probably using the wrong vocabulary here again. :(

Imagine you have two data sets:

d1 = {'foo': None}
d2 = {'foo': 8}

Where I would assume that d1 has "foo" not set. That's why I want this
whole "merge"-thing in the first place: to be able to extract the type
{'foo': None} from d1 and {'foo': int} from d2 and merge the two
together which should result in {'foo': int}.

Regards,
Thomas Bach.
 
Thomas Bach

class Foo(object):
    def __new__(cls, arg):
        if isinstance(arg, list):
            cls = FooList
        elif isinstance(arg, dict):
            cls = FooDict
        return object.__new__(cls, arg)

class FooList(Foo, list):
    pass

class FooDict(Foo, dict):
    pass

You could even have __new__ make these Foo* classes dynamically when
it encounters a new type of argument.

Chard.

Thanks for that one. Your solution just hit me like a punch in the
face. I had something similar in mind, but I could not work out how
the mechanics behind it work.

Regards,

Thomas
 
Steven D'Aprano

What you mean here is "a and b share a common base class".

No. I mean what I said: since a and b are both direct instances of Foo,
not subclasses, they are both the same type, namely Foo.

"Share a common base class" is a much weaker statement:

class Foo: pass
class Bar(Foo): pass

a = Foo()
b = Bar()

Now we can see that a and b are NOT the same type, but they share a
common base class, Foo.
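The distinction can be checked directly with type() and isinstance(). This is only a small sketch; the variable names holding the results are made up for illustration.

```python
class Foo:
    pass


class Bar(Foo):
    pass


a = Foo()
b = Bar()

same_type = type(a) is type(b)  # compares the exact classes
b_is_foo = isinstance(b, Foo)   # follows inheritance upwards
a_is_bar = isinstance(a, Bar)   # a base instance is not a Bar

print(same_type, b_is_foo, a_is_bar)
```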
 
Steven D'Aprano

class Foo(object):
    def __new__(cls, arg):
        if isinstance(arg, list):
            cls = FooList
        elif isinstance(arg, dict):
            cls = FooDict
        return object.__new__(cls, arg)

class FooList(Foo, list):
    pass

class FooDict(Foo, dict):
    pass


Did you actually try your code?


py> x = Foo([])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 7, in __new__
TypeError: object.__new__(FooList) is not safe, use list.__new__()
 
Steven D'Aprano

I'm querying the crunchbase API which returns JSON data and is rather poorly
documented. I want to create a data model for the companies listed on
Crunchbase in order to be able to put the queried data in a data-base.
As I am too lazy to examine all the data by hand I thought I automatize
this. I thought that it would be nice to be able to pass a function a
parsed JSON object (AFAIK these are lists, dicts, strings, ints, floats,
strs in Python) and it returns me the type of these objects. For the
simple classes (str, int, float) this is quite trivial: F('foo') should
return `str' and F(8) should return `int'.

Um, this is utterly trivial for *any* object. Just call type(object), and
it will return the type of the object.

For a compound object like dict I would like it to return the data
fields with their type. Hence, F({'foo': 8}) should return {'foo': int},

Your first problem is defining what you consider a compound object. Once
you've done that, it just becomes a matter of recursion:

def recursive_type(obj):
    if isinstance(obj, dict):
        return dict((k, recursive_type(v)) for (k, v) in obj.items())
    elif isinstance(obj, list):
        pass # whatever...
    else:
        return type(obj)
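To make that concrete, here is one possible filled-in version for Python 3, run on a small sample. Describing a list by the list of its element types is purely an assumption about what "whatever" should be, and the sample JSON is made up.

```python
import json


def recursive_type(obj):
    if isinstance(obj, dict):
        return {k: recursive_type(v) for k, v in obj.items()}
    elif isinstance(obj, list):
        # Assumption: describe a list by the types of its elements.
        return [recursive_type(v) for v in obj]
    else:
        return type(obj)


data = json.loads('{"foo": 8, "bar": {"baz": [1.5, null]}}')
schema = recursive_type(data)
print(schema)
```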


and given that f = F({'foo': {'bar': 80}}) I would like f to equal to
{'foo': dict}, with the option to query the type of 'foo' via f.foo,
where the latter should equal to {'bar': int}. So far, this is not a
complicated case. But, sometimes a data field on returned data set is
simply None. Thus, I want to extract the types from another data set and
merge the two.

So, my question (as far as I can see it, please correct me if I am
wrong) is less of the "How do I achieve this?"-kind, but more of the
"What is a clean design for this?"-kind. My intuitive thought was that
the `merge' function should be a part of the object returned from `F'.

This isn't Java you know. Just write a function to merge the two data
sets and be done with it.

Consider your two examples:

a = Foo(['a', 'list'])
b = Foo({'blah': 8})

According to your design:

a is a Foo
b is a Foo

I actually never said that.

You might not have said that, but that's what instantiation implies. If
you instantiate Foo, you get a Foo instance.

I simply wanted `a' and `b' to share the
same function (the `merge' function), I thought that the easiest way to
achieve this is by letting them share the same name-space.

Or you could use composition, or a mixin, or traits, or prototypes.
Well, prototypes are hard in Python -- I'm not sure how you would go
about doing that.
 
Steven D'Aprano

Imagine you have two data sets:

d1 = {'foo': None}
d2 = {'foo': 8}

Where I would assume that d1 has "foo" not set. That's why I want this
whole "merge"-thing in the first place: to be able to extract the type
{'foo': None} from d1 and {'foo': int} from d2 and merge the two
together which should result in {'foo': int}.

That becomes trivial if you do the merge before converting to types:

d3 = d1.copy()  # the merged dict
for key, value in d2.items():
    if key in d1 and d1[key] is None:
        d3[key] = value  # merge

Now pass d3 to your recursive_type function.
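If the data sets are nested, the same idea can be applied recursively. The following is only a sketch under that assumption; the function name merge and the recursion into nested dicts are not part of the snippet above.

```python
def merge(d1, d2):
    """Return a copy of d1 whose None values are filled in from d2."""
    merged = dict(d1)  # shallow copy, so d1 itself is left untouched
    for key, value in d2.items():
        if key in merged and merged[key] is None:
            merged[key] = value
        elif isinstance(merged.get(key), dict) and isinstance(value, dict):
            # Both sides hold a dict under this key: merge one level down.
            merged[key] = merge(merged[key], value)
    return merged


d1 = {'foo': None, 'nested': {'bar': None, 'baz': 'x'}}
d2 = {'foo': 8, 'nested': {'bar': 1.5}}
d3 = merge(d1, d2)
print(d3)
```

As in the flat version, keys that exist only in d2 are not copied over.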
 
Richard Thomas

Did you actually try your code?

I rarely test code I'm confident in, however undeserved the confidence. :) In this case that's not an error I've ever seen before. Obvious easy fix:

return cls.__new__(cls, arg)

Incidentally when I reply to your posts through the groups.google.com interface it inserts a blank quoted line between each pair of lines. My first thought was that it was a line endings bug with Google's app but in retrospect that seems very likely to have been fixed years ago. Any ideas?
 
Dennis Lee Bieber

Incidentally when I reply to your posts through the groups.google.com interface it inserts a blank quoted line between each pair of lines. My first thought was that it was a line endings bug with Google's app but in retrospect that seems very likely to have been fixed years ago. Any ideas?

Actually, it appears to be a more recent bug -- I've only seen the
double-spaced results coming in over the last few months.

Also, note that your comment shows up as one long single line when
quoted, rather than the common NNTP/SMTP "80 character" line length.

I suspect Google is treating each <cr><lf> as an HTML <p> tag, so
your comments become long extended lines until explicitly split, and
regular posts following NNTP conventions are turning each "hard line"
into a paragraph all by itself.
 
Steven D'Aprano

I rarely test code I'm confident in, however undeserved the confidence.
:) In this case that's not an error I've ever seen before. Obvious easy
fix:

return cls.__new__(cls, arg)

It might be easy, but it's obviously wrong.

py> sys.setrecursionlimit(10)
py> x = Foo([])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 7, in __new__
  File "<stdin>", line 7, in __new__
  File "<stdin>", line 7, in __new__
  File "<stdin>", line 7, in __new__
  File "<stdin>", line 7, in __new__
  File "<stdin>", line 7, in __new__
  File "<stdin>", line 7, in __new__
RuntimeError: maximum recursion depth exceeded
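One way around both errors, sketched here for Python 3 and not taken from any post in this thread: let Foo.__new__ pick the subclass, then hand allocation to the class that follows Foo in that subclass's MRO (list or dict) via super(Foo, cls). The TypeError for other argument types and the body of do_something are assumptions.

```python
class Foo:
    def __new__(cls, arg):
        if cls is Foo:
            if isinstance(arg, list):
                cls = FooList
            elif isinstance(arg, dict):
                cls = FooDict
            else:
                raise TypeError("unsupported type: %s" % type(arg).__name__)
        # super(Foo, cls) skips Foo itself, so this reaches list.__new__
        # or dict.__new__ instead of recursing into Foo.__new__.
        return super(Foo, cls).__new__(cls)

    def do_something(self):
        # Placeholder for the shared behaviour (an assumption).
        return len(self)


class FooList(Foo, list):
    pass


class FooDict(Foo, dict):
    pass


a = Foo(['a', 'list'])
b = Foo({'blah': 8})
print(type(a).__name__, type(b).__name__, a.do_something(), b.do_something())
```

Because the returned object is still an instance of Foo, Python goes on to call its __init__ with the original argument, which is what fills the list or dict.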

Incidentally when I reply to your posts through the groups.google.com
interface it inserts a blank quoted line between each pair of lines. My
first thought was that it was a line endings bug with Google's app but
in retrospect that seems very likely to have been fixed years ago. Any
ideas?

What makes you think that Google is interested in fixing the bugs in their
crappy web apps? They have become as arrogant and as obnoxious as
Microsoft used to be.
 
Mark Lawrence

What makes you think that Google is interested in fixing the bugs in their
crappy web apps? They have become as arrogant and as obnoxious as
Microsoft used to be.

Charging off topic again, but I borrowed a book from the local library a
couple of months back about Google Apps as it looked interesting. I
returned it in disgust rather rapidly as it was basically a "let's bash
Microsoft" tome.
 
