I'm trying to dynamically generate functions [as is done in languages
like Scheme and ML and all sorts of other "functional" languages] ...
... it seems that what I really want is beyond C itself, but I'd
like to be confirmed here.
It is (beyond C, that is), except insofar as you can write a
Lisp interpreter in C, and then write your code in Lisp.
If C's ordinary data objects ("int", "double", and so on) are "first
class" items, and structures are also "first class"[1] citizens,
and arrays are then "second class" citizens[2], that puts C's
functions at a distant third-class. You cannot take the size of
a function, functions may be (and somewhat often, really are) in
"special" memory (ROM or write-protected RAM and/or a separate
instruction-space on the processor), and functions can never be
generated "on the fly", or even loaded dynamically[3].
-----
[1] Really true in C99, only half-true in C89. C99 has a new
feature called a "compound literal" so that you can write things
like (const struct foo){1,2}. These allow the creation of
anonymous aggregates. Thus, given void f1(int), you can do
both f1(i) and f1(3) in C89; but given void f2(struct foo),
you can only do f2((struct foo){3}) in C99.
[2] The entire value of an array can never be accessed all at once.
In particular, f(array) passes &array[0] to f(), instead of a copy
of every element of the array. Similarly:
    struct S a, b;
    ...
    a = b;
is OK, but:
    int a[5], b[5];
    ...
    a = b;
is not.
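As a rough illustration of the usual workarounds (the names copy_array, copy_wrapped, and struct int5 are made up for this sketch, not part of any standard interface): either copy the elements with memcpy(), or wrap the array in a structure so that the "first class" structure assignment carries the array along with it.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical wrapper type: an array inside a struct can be assigned. */
struct int5 { int v[5]; };

/* "a = b" is not allowed for arrays, so copy the five elements explicitly.
   Note that a and b are really pointers here (see the f(array) remark). */
void copy_array(int a[5], const int b[5]) {
    assert(a != NULL && b != NULL);
    memcpy(a, b, 5 * sizeof a[0]);
}

/* Structure assignment copies the embedded array as a whole. */
void copy_wrapped(struct int5 *a, const struct int5 *b) {
    assert(a != NULL && b != NULL);
    *a = *b;
}
```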
[3] Lots of systems have dynamically-loaded functions (whether
called "DLLs" or "dynamic shared libraries" or "DSOs" or some other
name(s)). This is done by those implementations going beyond the
minimum requirements for Standard C. Such systems can, in general,
implement dynamic code generation as well -- but in some cases it
takes quite a lot of fancy footwork, and in all cases it is not
portable.
-----
There is a sort of compromise position in C, halfway between "writing
a full-blown interpreter" and "doing full-blown runtime code
generation". To build a Lisp-like "closure", so that you can
perform partial application of some function f() to generate new
functions f1() through fN(), write function f() so that it takes
an extra parameter, e.g.:
    struct adder_context { int addend; };
    int adder(int param, struct adder_context *p) {
        return param + p->addend;
    }
Now you can generate a new adder simply by allocating an
"adder_context" and filling in the addend:
    /* remember to #include <stdlib.h> */
    struct adder_context *new_adder(int addend) {
        struct adder_context *p = malloc(sizeof *p);
        if (p == NULL)
            panic("out of memory");
        p->addend = addend;
        return p;
    }
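To make the "partial application" concrete, here is a small sketch of how two such adders might be built and used. It is only an illustration: this version of new_adder() returns NULL on failure rather than calling panic(), and apply_both() is a hypothetical helper I made up to show that each context behaves like its own function.

```c
#include <assert.h>
#include <stdlib.h>

struct adder_context { int addend; };

/* The one real function; the context chooses which "adder" it acts as. */
int adder(int param, struct adder_context *p) {
    return param + p->addend;
}

/* Build a new adder.  Assumption: return NULL on failure instead of panic(). */
struct adder_context *new_adder(int addend) {
    struct adder_context *p = malloc(sizeof *p);
    if (p != NULL)
        p->addend = addend;
    return p;
}

/* Hypothetical helper: apply adder "a", then adder "b", to x. */
int apply_both(int x, struct adder_context *a, struct adder_context *b) {
    assert(a != NULL && b != NULL);
    return adder(adder(x, a), b);
}
```

With add3 = new_adder(3) and add10 = new_adder(10), the call adder(4, add3) yields 7 and adder(4, add10) yields 14, even though both go through the same compiled function. The caller owns each context and releases it with free().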
Of course, this is not very general. You probably want to be
able to construct not only an adder but also a multiplier, and/or
various other functions. So now we fancy up the context, and
perhaps also use "void *":
    struct generic_context {
        int (*func)(void *ctx, int arg);
    };
    struct adder_context {
        struct generic_context common;
        int addend;
    };
    struct multiplier_context {
        struct generic_context common;
        int mult;
    };
    struct mult_and_add_context {
        struct generic_context common;
        int mult;
        int addend;
    };
We now have three specific kinds of contexts, and can write three
functions (which can be "static" so that their names are invisible
outside the implementation routine) and their three exported
function-builders:
    static int do_add(void *ctx, int arg) {
        struct adder_context *p = ctx;
        return arg + p->addend;
    }
    /* NB: emalloc is malloc + panic-if-out-of-memory */
    struct generic_context *new_adder(int addend) {
        struct adder_context *p = emalloc(sizeof *p);
        p->common.func = do_add;
        p->addend = addend;
        return &p->common;
    }
    static int do_mult(void *ctx, int arg) {
        struct multiplier_context *p = ctx;
        return arg * p->mult;
    }
    struct generic_context *new_mult(int mult) {
        struct multiplier_context *p = emalloc(sizeof *p);
        p->common.func = do_mult;
        p->mult = mult;
        return &p->common;
    }
    static int do_mult_and_add(void *ctx, int arg) {
        struct mult_and_add_context *p = ctx;
        return (arg * p->mult) + p->addend;
    }
    struct generic_context *new_mult_and_add(int mult, int addend) {
        struct mult_and_add_context *p = emalloc(sizeof *p);
        p->common.func = do_mult_and_add;
        p->mult = mult;
        p->addend = addend;
        return &p->common;
    }
Whatever code calls these need only use the "generic" context that
is common to all these functions:
    struct generic_context *p;
    if (...)
        p = new_adder(3); /* so that p->func computes x + 3 */
    else if (...)
        p = new_mult(5); /* here p->func computes 5x */
    else
        p = new_mult_and_add(7, 2); /* p->func computes 7x + 2 */
    ...
    printf("func(4) = %d\n", p->func(p, 4));
    /* prints 7, 20, or 30, depending on p->func */
The use of "void *" provides a kind of "type system sleight-of-hand"
that allows us to write cast-free (i.e., "far less ugly") C code.
The only constraint is that the generic context must be the first
member of the specific contexts.
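The "first member" constraint can be checked by the compiler. The sketch below assumes a C11 compiler (for _Static_assert), and the helpers adder_init() and call_context() are hypothetical names of mine. A pointer to a structure, suitably converted, points to its first member and vice versa, which is what makes the "void *" conversion inside do_add() valid.

```c
#include <assert.h>
#include <stddef.h>

struct generic_context {
    int (*func)(void *ctx, int arg);
};

struct adder_context {
    struct generic_context common;   /* must stay the first member */
    int addend;
};

/* C11: refuse to compile if someone moves "common" away from offset 0. */
_Static_assert(offsetof(struct adder_context, common) == 0,
               "generic context must be the first member");

int do_add(void *ctx, int arg) {
    /* ctx really points at "common"; since that is the first member,
       the same address is also the start of the whole adder_context. */
    struct adder_context *p = ctx;
    return arg + p->addend;
}

/* Hypothetical helper: fill in a caller-provided adder context. */
void adder_init(struct adder_context *a, int addend) {
    assert(a != NULL);
    a->common.func = do_add;
    a->addend = addend;
}

/* Hypothetical helper: call through the generic part only. */
int call_context(struct generic_context *g, int arg) {
    assert(g != NULL && g->func != NULL);
    return g->func(g, arg);
}
```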
For those familiar with C++, note that this is really just a "hand
expansion" of a C++ "base class" with a single "virtual function".
If we had multiple virtual functions, it would often be good to
use a second level of indirection, where the generic context has
a pointer to a table of function pointers, so that instead of:
    result = p->func(p, other_args);
we would write:
    result1 = p->ops->func1(p, other_args);
    result2 = p->ops->func2(p, other_args);
    result3 = p->ops->func3(p, other_args);
and so on.
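A minimal sketch of that second level of indirection follows. The names here (struct context_ops, apply, destroy, scaler_context, new_scaler) are invented for the example, standing in for func1, func2, and so on; new_scaler() returns NULL on allocation failure rather than panicking.

```c
#include <assert.h>
#include <stdlib.h>

/* One table of "virtual functions" per kind of context. */
struct context_ops {
    int  (*apply)(void *ctx, int arg);
    void (*destroy)(void *ctx);
};

/* The generic context now holds only a pointer to the shared table. */
struct generic_context {
    const struct context_ops *ops;
};

struct scaler_context {
    struct generic_context common;   /* first member, as before */
    int mult;
};

static int scaler_apply(void *ctx, int arg) {
    struct scaler_context *p = ctx;
    return arg * p->mult;
}

static void scaler_destroy(void *ctx) {
    free(ctx);   /* same address as the block malloc() returned */
}

/* A single table shared by every scaler, however many are created. */
static const struct context_ops scaler_ops = { scaler_apply, scaler_destroy };

struct generic_context *new_scaler(int mult) {
    struct scaler_context *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->common.ops = &scaler_ops;
    p->mult = mult;
    return &p->common;
}
```

The design trade is one extra pointer dereference per call in exchange for each context carrying a single pointer, no matter how many operations the table grows to hold.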
Note that, if this were to be used in a serious program, the common
"generic" context would go in some header file, along with the
declarations of the various builder functions. The actual
implementations (and the specific contexts that contain the generic
context as their first element) can then be in a separate translation
unit. The actual contents of a specific context are thus
well-contained, with the interface being determined entirely by
the generic context. That generic context may be as simple or as
complicated as you like -- the only real constraint is that it is
fixed at compile-time, and all the functions have the same "type
signature" (return type, and number-and-types of parameters).
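Here is roughly how that split might look, squeezed into one listing so that it compiles as a single file. The file names "closure.h" and "closure.c" and the helper call_closure() are hypothetical; the builder returns NULL on failure where the earlier code used emalloc.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- would go in a header, e.g. a hypothetical "closure.h" ---- */
struct generic_context {
    int (*func)(void *ctx, int arg);
};
struct generic_context *new_mult_and_add(int mult, int addend);
int call_closure(struct generic_context *g, int arg);

/* ---- would go in a separate translation unit, e.g. "closure.c" ---- */
struct mult_and_add_context {        /* invisible to users of the header */
    struct generic_context common;   /* first member, as required */
    int mult;
    int addend;
};

static int do_mult_and_add(void *ctx, int arg) {
    struct mult_and_add_context *p = ctx;
    return (arg * p->mult) + p->addend;
}

struct generic_context *new_mult_and_add(int mult, int addend) {
    struct mult_and_add_context *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;                 /* assumption: caller checks for NULL */
    p->common.func = do_mult_and_add;
    p->mult = mult;
    p->addend = addend;
    return &p->common;
}

/* Hypothetical convenience wrapper around the p->func(p, arg) idiom. */
int call_closure(struct generic_context *g, int arg) {
    assert(g != NULL && g->func != NULL);
    return g->func(g, arg);
}
```

Code that includes only the header part can build and call the closure without ever seeing mult or addend.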