Implementing polymorphism with vtables?

Jef Driesen

Hi,

I'm implementing polymorphism with vtables in C. Everything works fine
(see code at the end of this post), but I have some questions.

A minimal sample application would look similar to this:

int
main (void)
{
    base_t *b = base_new ();
    derived_t *d = derived_new ();

    foo (b, 1);
    bar ((derived_t *) b, 1, 2); /* invalid: b is a plain base_t, its vtable has no bar entry */
    destroy (b);

    bar (d, 1, 2);
    foo ((base_t *) d, 1);
    destroy ((base_t *) d);

    return 0;
}

As you can see, calling a base class function on a derived object
requires a manual cast, because C does no automatic upcasting (converting
a derived pointer to a base pointer). That's not unexpected because,
unlike a real OOP language like C++, C has no built-in knowledge of the
inheritance tree. (Note that I don't want to use C++ for portability
reasons.)

A possible solution would be to hide the existence of the derived_t
class from the user. Thus all constructors would return a base_t
pointer, and all functions of the derived classes would also accept a
base_t pointer:

base_t *d = derived_new ();

bar (d, 1, 2);
foo (d, 1);
destroy (d);

Thus no more manual casts, but now there is the risk of passing a
real base_t object to a function from one of the derived_t classes,
without any warnings or errors from the compiler:

base_t *b = base_new ();

bar (b, 1, 2); /* Ouch! */

Is there some solution for this? Can I catch this somehow at runtime?

For functions that are not called through the vtable (i.e. non-virtual
functions) on classes that are not further subclassed (i.e. leaf classes
in the inheritance tree), I can easily detect this type of error by
inspecting the vtable pointer. If the vtable pointer is equal to the
vtable pointer for that class, I know the object is of the correct
class. However, for functions that are called through the vtable, that
doesn't work because the vtable has already been dereferenced at that
point and the damage has already been done. And for classes that are
further subclassed, the vtable pointers won't necessarily match, even
though that shouldn't be considered an error.
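
A minimal sketch of the leaf-class check described above, assuming a
simplified layout rather than the full code at the end of this post. The
names leaf_vtable, leaf_new and leaf_frob are hypothetical, and returning
-1 on a type mismatch is just one possible error convention:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct base_t base_t;

typedef struct base_vtable_t
{
    void (*destroy) (base_t *object);
} base_vtable_t;

struct base_t
{
    const base_vtable_t *vtable;
};

static void
base_destroy (base_t *object)
{
    free (object);
}

/* Two distinct objects, so their addresses identify the class. */
static const base_vtable_t base_vtable = { base_destroy };
static const base_vtable_t leaf_vtable = { base_destroy };

base_t *
base_new (void)
{
    base_t *object = malloc (sizeof *object);

    if (object != NULL)
        object->vtable = &base_vtable;
    return object;
}

base_t *
leaf_new (void)
{
    base_t *object = malloc (sizeof *object);

    if (object != NULL)
        object->vtable = &leaf_vtable;
    return object;
}

/* Non-virtual function of the (hypothetical) leaf class. The exact
 * pointer comparison is only valid because leaf is never subclassed. */
int
leaf_frob (base_t *object, int a)
{
    assert (object != NULL);

    if (object->vtable != &leaf_vtable)
        return -1; /* wrong type: refuse instead of crashing */

    return a * 2;
}

void
destroy (base_t *object)
{
    object->vtable->destroy (object);
}
```

With this layout, leaf_frob (leaf_new (), 21) yields 42, while passing an
object from base_new () yields -1.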

Any suggestions to solve this problem?

Thanks for your time,

Jef

------------------------------------------------------------

#include <stdlib.h>

/*
* The base class
*/

typedef struct base_t base_t;
typedef struct base_vtable_t base_vtable_t;

typedef void (*destroy_func_t) (base_t *object);
typedef int (*foo_func_t) (base_t *object, int a);

struct base_vtable_t
{
    destroy_func_t destroy;
    foo_func_t foo;
};

struct base_t
{
    const base_vtable_t *vtable;
    /* ... */
};


static void
base_destroy (base_t *object)
{
    free (object);
}

static int
base_foo (base_t *object, int a)
{
    return 0;
}

static const base_vtable_t *
base_get_vtable (base_t *object)
{
    return object->vtable;
}

void
base_init (base_t *object)
{
    /* One shared, read-only vtable for all base_t instances. */
    static const base_vtable_t vtable =
    {
        base_destroy,
        base_foo
    };

    object->vtable = &vtable;
}

base_t *
base_new (void)
{
    base_t *object = malloc (sizeof (base_t));

    if (object == NULL)
        return NULL;

    base_init (object);

    return object;
}

int
foo (base_t *object, int a)
{
    const base_vtable_t *vtable = object->vtable;

    return vtable->foo (object, a);
}

void
destroy (base_t *object)
{
    const base_vtable_t *vtable = object->vtable;

    /* destroy() returns void, so the call is not used in a return. */
    vtable->destroy (object);
}

/*
* The derived class
*/

typedef struct derived_t derived_t;
typedef struct derived_vtable_t derived_vtable_t;

typedef int (*bar_func_t) (derived_t *object, int a, int b);

struct derived_vtable_t
{
    base_vtable_t base; /* must be the first member */
    bar_func_t bar;
};

struct derived_t
{
    base_t base; /* must be the first member */
    /* ... */
};

static void
derived_destroy (base_t *object)
{
    free (object);
}

static int
derived_foo (base_t *object, int a)
{
    return a;
}

static int
derived_bar (derived_t *object, int a, int b)
{
    return a + b;
}

static const derived_vtable_t *
derived_get_vtable (derived_t *object)
{
    return (const derived_vtable_t *) ((base_t *) object)->vtable;
}

void
derived_init (derived_t *object)
{
    static const derived_vtable_t vtable =
    {
        {
            derived_destroy,
            derived_foo
        },
        derived_bar
    };

    base_init (&object->base);

    /* Override the vtable pointer installed by base_init(). */
    object->base.vtable = &vtable.base;
}

derived_t *
derived_new (void)
{
    derived_t *object = malloc (sizeof (derived_t));

    if (object == NULL)
        return NULL;

    derived_init (object);

    return object;
}

int
bar (derived_t *object, int a, int b)
{
    const derived_vtable_t *vtable =
        (const derived_vtable_t *) ((base_t *) object)->vtable;

    return vtable->bar (object, a, b);
}
 
Nobody

Are you expecting the user to call virtual functions using the vtable
explicitly, e.g. foo->vtable->method(foo, arg) ?

If you are, then the objects must have their correct types, not the base
type.

If not, then the wrapper can perform the type check before accessing the
vtable.

If you want run-time type checks, the vtable should include a pointer to
the superclass' vtable. Then, you can define isinstance() as e.g.:

int isinstance(base *obj, base_vtable *cls)
{
    base_vtable *ocls;
    for (ocls = obj->vtable; ocls; ocls = ocls->super)
        if (ocls == cls)
            return 1;
    return 0;
}

This returns 1 if obj is an instance of cls or any of its subclasses,
0 otherwise.
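
A self-contained sketch of that idea, under the assumption that every
vtable starts with a "super" pointer. The class names (base, middle, leaf,
other) and the stripped-down vtable_t are made up for illustration, and a
real vtable would carry its function pointers after the link:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal vtable: only the superclass link is shown. */
typedef struct vtable_t vtable_t;

struct vtable_t
{
    const vtable_t *super; /* NULL for the root class */
    /* virtual function pointers would follow here */
};

typedef struct object_t
{
    const vtable_t *vtable;
} object_t;

/* Hierarchy: base <- middle <- leaf, and base <- other. */
const vtable_t base_vtable   = { NULL };
const vtable_t middle_vtable = { &base_vtable };
const vtable_t leaf_vtable   = { &middle_vtable };
const vtable_t other_vtable  = { &base_vtable };

/* Walk the superclass chain; 1 if cls is found, 0 otherwise. */
int
isinstance (const object_t *obj, const vtable_t *cls)
{
    const vtable_t *ocls;

    assert (obj != NULL);

    for (ocls = obj->vtable; ocls != NULL; ocls = ocls->super)
        if (ocls == cls)
            return 1;
    return 0;
}

void
isinstance_demo (void)
{
    object_t leaf = { &leaf_vtable };
    object_t other = { &other_vtable };

    assert (isinstance (&leaf, &leaf_vtable));   /* exact class */
    assert (isinstance (&leaf, &middle_vtable)); /* parent */
    assert (isinstance (&leaf, &base_vtable));   /* root */
    assert (!isinstance (&leaf, &other_vtable)); /* sibling branch */
    assert (!isinstance (&other, &middle_vtable));
}
```

The cost is one pointer comparison per inheritance level, so the check
stays cheap for shallow hierarchies.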

Personally, I would expose the different subclass types, and require the
user to cast as needed, either statically (C cast) or dynamically (provide
a macro for each type to cast to that type, after optionally checking that
the object is actually an instance of that type).

This seems to work well enough for GTK, which is widely used, even by
relatively inexperienced programmers.
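
The per-type cast macro mentioned above could look roughly like this. It
is only a sketch in the spirit of that approach, not GTK's actual
implementation: checked_cast, DERIVED and DERIVED_UNCHECKED are
hypothetical names, and returning NULL on a mismatch is an assumed
convention.

```c
#include <assert.h>
#include <stddef.h>

typedef struct vtable_t vtable_t;

struct vtable_t
{
    const vtable_t *super; /* NULL for the root class */
};

typedef struct base_t
{
    const vtable_t *vtable;
} base_t;

typedef struct derived_t
{
    base_t base; /* must be the first member */
    int extra;
} derived_t;

const vtable_t base_vtable    = { NULL };
const vtable_t derived_vtable = { &base_vtable };

/* Returns obj if it is an instance of cls (or a subclass), else NULL. */
void *
checked_cast (base_t *obj, const vtable_t *cls)
{
    const vtable_t *v;

    if (obj == NULL)
        return NULL;

    for (v = obj->vtable; v != NULL; v = v->super)
        if (v == cls)
            return obj;
    return NULL;
}

/* Dynamic (checked) and static (unchecked) casts to derived_t. */
#define DERIVED(obj) \
    ((derived_t *) checked_cast ((base_t *) (obj), &derived_vtable))
#define DERIVED_UNCHECKED(obj) ((derived_t *) (obj))

void
derived_cast_demo (void)
{
    derived_t d = { { &derived_vtable }, 7 };
    base_t b = { &base_vtable };

    assert (DERIVED (&d.base) == &d);   /* real derived_t: cast succeeds */
    assert (DERIVED (&d.base)->extra == 7);
    assert (DERIVED (&b) == NULL);      /* plain base_t: cast refused */
}
```

A release build could define DERIVED as DERIVED_UNCHECKED to drop the
check, which is the "optionally checking" part.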
 
Jef Driesen

Nobody said:
Are you expecting the user to call virtual functions using the vtable
explicitly, e.g. foo->vtable->method(foo, arg) ?

No. All my objects are implemented as opaque objects, thus the internals
are entirely hidden from the user; only the public API can be used.
If not, then the wrapper can perform the type check before accessing the
vtable.

That's indeed the idea.
If you want run-time type checks, the vtable should include a pointer to
the superclass' vtable. Then, you can define isinstance() as e.g.:

int isinstance(base *obj, base_vtable *cls)
{
    base_vtable *ocls;
    for (ocls = obj->vtable; ocls; ocls = ocls->super)
        if (ocls == cls)
            return 1;
    return 0;
}

This returns 1 if obj is an instance of cls or any of its subclasses,
0 otherwise.

Nice.

Downside is that adding the superclass' vtable to each vtable requires
that this superclass' vtable is exposed. In my current code, the vtable
pointer is declared as static, making it local to the function (or the
file).

Unless there is some other way to achieve this?
Personally, I would expose the different subclass types, and require the
user to cast as needed, either statically (C cast) or dynamically (provide
a macro for each type to cast to that type, after optionally checking that
the object is actually an instance of that type).

One of the reasons I'm hiding the subclasses from the user is that my
code has one base class with a number of different backends implemented
as subclasses. Once constructed, all backends are accessed through an
identical interface (that of the base class), so it made sense to return
a base class pointer immediately. The only exception here is that some
backends support functionality outside of the base class interface. They
are implemented as non-virtual functions and since there was only one
level in the inheritance tree, I could simply check the vtable.

This approach is also used in other libraries, like for instance the
cairo graphics library:

http://cairographics.org/manual/cairo-surfaces.html

Now, I'm doing some reorganization of the code and I'm adding additional
levels in the inheritance tree as a way to share some common code. But
these intermediate classes are mostly implementation details, that are
not meant to be "known" to the backend users, only the backend writers.

The runtime checking would be useful to prevent my code from crashing
when a user attempts to pass a wrong type of object.
This seems to work well enough for GTK, which is widely used, even by
relatively inexperienced programmers.

I'm one of its users :)
 
Nobody

Nice.

Downside is that adding the superclass' vtable to each vtable requires
that this superclass' vtable is exposed. In my current code, the vtable
pointer is declared as static, making it local to the function (or the
file).

Unless there is some other way to achieve this?

If you have a "rooted" hierarchy (with everything derived from an "object"
class), you could make isinstance a virtual class method of the object
class, with the class represented by a string, then override it in each
subclass. E.g.

int object_isinstance(const char *cls) {
    /* strcmp() returns 0 when the strings are equal */
    if (strcmp(cls, "object") == 0)
        return 1;
    return 0;
}

object_vtable = {
    .isinstance = object_isinstance,
    ...
};

int is_object(object *obj) {
    return obj->vtable->isinstance("object");
}

int base_isinstance(const char *cls) {
    if (strcmp(cls, "base") == 0)
        return 1;
    return object_isinstance(cls);
}

base_vtable = {
    .isinstance = base_isinstance,
    ...
};

int is_base(object *obj) {
    return obj->vtable->isinstance("base");
}

int derived_isinstance(const char *cls) {
    if (strcmp(cls, "derived") == 0)
        return 1;
    return base_isinstance(cls);
}

derived_vtable = {
    .isinstance = derived_isinstance,
    ...
};

int is_derived(object *obj) {
    return obj->vtable->isinstance("derived");
}

So, calling e.g. is_object() on a derived instance will call
derived_isinstance("object"), which will call base_isinstance("object"),
which will call object_isinstance("object"), which will return 1. The only
vtable access is for the isinstance method, which will always be available.

If you don't have an object class (i.e. multiple disjoint hierarchies
rather than a single hierarchy), this mechanism will only work if you know
in advance the hierarchy to which the instance belongs (otherwise, the
obj->vtable->isinstance lookup is unsafe).
One of the reasons I'm hiding the subclasses from the user is that my
code has one base class with a number of different backends implemented
as subclasses. Once constructed, all backends are accessed through an
identical interface (that of the base class), so it made sense to return
a base class pointer immediately. The only exception here is that some
backends support functionality outside of the base class interface. They
are implemented as non-virtual functions and since there was only one
level in the inheritance tree, I could simply check the vtable.

You don't necessarily need to expose the details to the user (i.e.
"public"), but it may help to expose some of them to subclasses
(i.e. "protected"). Obviously, you can only implement this via convention
(public vs "internal" headers), not by making functions and variables
"static".
Now, I'm doing some reorganization of the code and I'm adding additional
levels in the inheritance tree as a way to share some common code. But
these intermediate classes are mostly implementation details, that are
not meant to be "known" to the backend users, only the backend writers.

The vtable pointer only needs to be exposed to the subclasses for the
isinstance check, not to the user.
The runtime checking would be useful to prevent my code from crashing
when a user attempts to pass a wrong type of object.

This should only be possible if the user has access to backend-specific
methods. If the only public methods apply to all subclasses, the only way
that backend-specific methods can be invoked is via a specific backend.
If you're executing a virtual method which belongs to the implementation of
a specific backend, you already know that the instance is of the correct
type.
 
Jef Driesen

Nobody said:
Downside is that adding the superclass' vtable to each vtable requires
that this superclass' vtable is exposed. In my current code, the vtable
pointer is declared as static, making it local to the function (or the
file).

Unless there is some other way to achieve this?

If you have a "rooted" hierarchy (with everything derived from an "object"
class), you could make isinstance a virtual class method of the object
class, with the class represented by a string, then override it in each
subclass. E.g.

[code removed]

So, calling e.g. is_object() on a derived instance will call
derived_isinstance("object"), which will call base_isinstance("object"),
which will call object_isinstance("object"), which will return 1. The only
vtable access is for the isinstance method, which will always be available.

If you don't have an object class (i.e. multiple disjoint hierarchies
rather than a single hierarchy), this mechanism will only work if you know
in advance the hierarchy to which the instance belongs (otherwise, the
obj->vtable->isinstance lookup is unsafe).

I do have a rooted hierarchy, but I think this approach only makes it
more complicated. And on top of that, you would still need access to
"something" of the parent class to be able to construct the tree. In
this case it's not the vtable pointer directly, but the
parent_isinstance function.

I was more thinking in the direction of the constructor functions. In a
derived class, you call the constructor of the parent class, which fills
in the vtable pointer. Once the base class constructor is finished, you
override the vtable pointer again. Thus at that point, you can easily
grab the vtable pointer of the parent class, without doing anything
special. But I can't come up with a way to actually use it. It can't be
stored in the vtable anymore, because that one is constant...
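
One way this could be made to work, sketched under assumptions that are
not part of the code posted earlier: the constant vtable points at a small
mutable per-class record (class_info_t here, a hypothetical name), and the
derived constructor stores the captured parent vtable pointer there. The
vtable itself stays const, and nothing of the parent needs to be exposed
beyond base_init().

```c
#include <assert.h>
#include <stddef.h>

typedef struct vtable_t vtable_t;

/* Hypothetical mutable per-class record holding the parent link. */
typedef struct class_info_t
{
    const vtable_t *super; /* NULL until captured, or for the root */
} class_info_t;

struct vtable_t
{
    class_info_t *info;
    /* virtual function pointers would follow here */
};

typedef struct base_t { const vtable_t *vtable; } base_t;
typedef struct derived_t { base_t base; } derived_t;

/* "base.c": the vtable stays private to the constructor. */
void
base_init (base_t *object)
{
    static class_info_t info = { NULL };
    static const vtable_t vtable = { &info };

    object->vtable = &vtable;
}

/* "derived.c" */
static class_info_t derived_info = { NULL };
static const vtable_t derived_vtable = { &derived_info };

void
derived_init (derived_t *object)
{
    base_init (&object->base);

    /* Grab the parent's vtable pointer before overriding it. */
    derived_info.super = object->base.vtable;
    object->base.vtable = &derived_vtable;
}

/* 1 if object is a derived_t or a subclass of it, 0 otherwise. */
int
derived_check (const base_t *object)
{
    const vtable_t *v;

    for (v = object->vtable; v != NULL; v = v->info->super)
        if (v == &derived_vtable)
            return 1;
    return 0;
}

void
capture_demo (void)
{
    base_t b;
    derived_t d;

    base_init (&b);
    derived_init (&d);

    assert (derived_check (&d.base) == 1);
    assert (derived_check (&b) == 0);
}
```

The link is written on every construction, always with the same value. In
a multithreaded program that write would need synchronisation or a
one-time initialisation step, which this sketch leaves out.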

BTW, is there a reason why you are using strings as the class
representation, and not the vtable pointer itself? I assume that would
be a little more efficient than all those strcmp's.
You don't necessarily need to expose the details to the user (i.e.
"public"), but it may help to expose some of them to subclasses
(i.e. "protected"). Obviously, you can only implement this via convention
(public vs "internal" headers), not by making functions and variables
"static".

Making a function or global variable static does limit its scope to only
that file. Since I use a separate file for each class, that does make
such function or variable private. But I understand what you mean with
the difference between public and protected is "by convention only" :)

Right now, only my base class has a separate public and internal header.
The subclasses only have a public header (for the constructor and the
backend specific functions), because they were never intended to be
subclassed further. When adding an intermediate class, that changes of
course. Having to add an extra internal header only to make the vtable
"protected" feels somewhat silly to me (resulting in twice the number of
header files). But now that I think of that, my intermediate classes are
already for internal use only.
The vtable pointer only needs to be exposed to the subclasses for the
isinstance check, not to the user.

I'm aware of that. I'm just trying to expose only the absolute minimum,
simply because I consider that a good coding style. If functions or
variables are not accessible, there is also no risk to (accidentally)
access or modify them where you shouldn't. But if it's necessary to make
the vtable pointer "protected", that's fine too.
This should only be possible if the user has access to backend-specific
methods. If the only public methods apply to all subclasses, the only way
that backend-specific methods can be invoked is via a specific backend.
If you're executing a virtual method which belongs to the implementation of
a specific backend, you already know that the instance is of the correct
type.

True, and that's why the runtime check needs to be done in the wrapper
function, before the corresponding entry in the vtable is called
(as you already mentioned in your previous reply). For backend-specific
functions (which don't go through the vtable) the check needs to be done
in the function itself, of course.
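
A sketch of such a wrapper, assuming the vtable carries a "super" pointer
as suggested earlier in the thread. The status-code return value and the
result out-parameter are assumptions made for the example, since the
original bar() has no way to report a type error:

```c
#include <assert.h>
#include <stddef.h>

typedef struct base_t base_t;
typedef struct base_vtable_t base_vtable_t;

struct base_vtable_t
{
    const base_vtable_t *super; /* NULL for the root class */
    int (*foo) (base_t *object, int a);
};

typedef struct derived_vtable_t
{
    base_vtable_t base; /* must be the first member */
    int (*bar) (base_t *object, int a, int b);
} derived_vtable_t;

struct base_t
{
    const base_vtable_t *vtable;
};

static int base_foo (base_t *o, int a) { (void) o; (void) a; return 0; }
static int derived_foo (base_t *o, int a) { (void) o; return a; }
static int derived_bar (base_t *o, int a, int b) { (void) o; return a + b; }

static const base_vtable_t base_vtable = { NULL, base_foo };
static const derived_vtable_t derived_vtable =
{
    { &base_vtable, derived_foo },
    derived_bar
};

static int
isinstance (const base_t *object, const base_vtable_t *cls)
{
    const base_vtable_t *v;

    for (v = object->vtable; v != NULL; v = v->super)
        if (v == cls)
            return 1;
    return 0;
}

/* Public wrapper: 0 on success, -1 if object is not a derived_t.
 * The check runs before the vtable is treated as a derived_vtable_t. */
int
bar (base_t *object, int a, int b, int *result)
{
    const derived_vtable_t *vtable;

    if (object == NULL || result == NULL
        || !isinstance (object, &derived_vtable.base))
        return -1;

    vtable = (const derived_vtable_t *) object->vtable;
    *result = vtable->bar (object, a, b);
    return 0;
}

void
bar_demo (void)
{
    base_t b = { &base_vtable };
    base_t d = { &derived_vtable.base };
    int result = 0;

    assert (bar (&d, 1, 2, &result) == 0 && result == 3);
    assert (bar (&b, 1, 2, &result) == -1); /* the "Ouch!" case */
}
```

Because the check walks the chain instead of comparing for equality, it
keeps working when derived_t is subclassed further.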
 
Nobody

I do have a rooted hierarchy, but I think this approach only makes it
more complicated. And on top of that, you would still need access to
"something" of the parent class to be able to construct the tree. In
this case it's not the vtable pointer directly, but the
parent_isinstance function.

Right; but I was assuming that you didn't have a problem with classes
exposing functions. Ultimately, you have to expose *something* to
the subclasses.
BTW, is there a reason why you are using strings as the class
representation, and not the vtable pointer itself? I assume that would
be a little more efficient than all those strcmp's.

Simply to avoid exposing internals.

Personally, I'd just make the vtable pointers visible to subclasses.
 
Jef Driesen

Nobody said:
Right; but I was assuming that you didn't have a problem with classes
exposing functions. Ultimately, you have to expose *something* to
the subclasses.

Usually I prefer functions over direct access to variables, because that
allows me to change the internals while keeping the interface unchanged
(cf. opaque objects). But in this case it only adds unnecessary complexity.
Simply to avoid exposing internals.

Well, the is_base/derived() function could easily pass the vtable
pointer to the isinstance() function, without having to expose it
outside of its class.

int derived_isinstance(const base_vtable *cls) {
    if (cls == &derived_vtable)
        return 1;
    return base_isinstance(cls);
}

derived_vtable = {
    .isinstance = derived_isinstance,
    ...
};

int is_derived(object *obj) {
    return obj->vtable->isinstance(&derived_vtable);
}

Anyway, it doesn't matter much to me because I do prefer the other
method of using the vtable pointer directly.
Personally, I'd just make the vtable pointers visible to subclasses.

It's simpler and more elegant, so that's what I prefer too.
 
