Anonymous functions in C.

Richard Heathfield · Apr 22, 2007

Steve Thompson said:

If it is usually considered necessary and good to
prevent side-effects from occurring when variables are expanded in
macros, then why is it that the pre-processor is not defined to ensure
this doesn't occur?

Side effects aren't necessarily the problem. Indeed, adding a side
effect may on occasion be the whole point of defining a macro.

The problem comes when a macro argument is evaluated twice or more and
is written in such a way that this re-evaluation is significant in a
bad way. This happens surprisingly rarely - it seems to be one of the
few C pitfalls that students do actually learn about during their
formal studies, and it is easy enough to remember not to do it - so it
isn't really all that big a deal after all.
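
A minimal sketch of the re-evaluation problem described above. The names MAX_UNSAFE, max_int and the two counting helpers are made up for illustration; they do not come from the thread.

```c
/* Classic unsafe macro: each argument appears twice in the expansion. */
#define MAX_UNSAFE(a, b) ((a) > (b) ? (a) : (b))

/* Function version: each argument is evaluated exactly once at the call. */
static inline int max_int(int a, int b)
{
    return a > b ? a : b;
}

/* Returns how many times the macro evaluated its first argument. */
static int count_macro_evaluations(void)
{
    int calls = 0;
    int other = 0;
    /* ++calls runs once in the comparison and once more in the branch
       that is selected, so calls ends up as 2. */
    int result = MAX_UNSAFE(++calls, other);
    (void)result;
    return calls;
}

/* Returns how many times the function call evaluated its first argument. */
static int count_function_evaluations(void)
{
    int calls = 0;
    int result = max_int(++calls, 0);
    (void)result;
    return calls;
}
```

Here count_macro_evaluations() returns 2 and count_function_evaluations() returns 1. The side effect itself is harmless; the repetition is what bites.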

Chris Torek · Apr 22, 2007

Eh?

#define fluf( array, size, n ) \
{ size_t index = 0; n = 0; \
while( index < size ) { n += array[index++]; } }

Here, "fluf" *is* a macro, but is not a *function-like* macro:

x = fluf(a, b, c);

produces a syntax error.

[but] gcc-like compound statement expressions let you use, for
example, if, while, and for statements.

I still prefer inline functions, which have "more obvious" semantics
than gcc's ({ ... }) statement-expressions (and are even in C99).
The function versions avoid name clashes in the usual way that
functions avoid name clashes: we know that at "function call" time
the actual arguments are bound to the formal parameter variables,
and use of the inline function's local variables works in the usual
manner.
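
As a rough illustration of that point, the summing macro quoted earlier could be written as a C99 inline function. The name sum_ints and the choice of int elements are assumptions made for this sketch.

```c
#include <stddef.h>

/* Hypothetical inline-function counterpart of the "fluf" macro.
   The locals total and index live in the function's own scope, so they
   cannot clash with identically named variables at the call site. */
static inline int sum_ints(const int *array, size_t size)
{
    int total = 0;
    for (size_t index = 0; index < size; index++) {
        total += array[index];
    }
    return total;
}
```

Unlike the brace-block macro, this can appear on the right-hand side of an assignment, as in x = sum_ints(a, n);, and each argument is evaluated once.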

Although it is off topic here, I will, however, note that we found
that gcc's inline assembly behaved differently (in at least some
versions of gcc) if put into an inline function. In particular,
while:

static inline int magic_instruction(args) {
...
__asm__ __volatile__("some instruction" : "outputs" : "inputs");
...
return whatever;
}

and:

#define magic_instruction(args) ({ \
... \
__asm__ __volatile__("some instruction" : "outputs" : "inputs"); \
... \
whatever; \
})

both "worked" (modulo compiler bugs and getting the constraints
right on the input and output strings given to the __asm__), the
expression version ({...}) generated better code on the x86 than
the inline function version. (Exactly why was not clear -- both
compilations had the same optimization flags -- and I never
investigated further. I simply gave up on trying to convince people
to use the inline function version. Both were irrevocably intertwined
with using gcc anyway; if we were to use another compiler, either
method could usually be replaced with a call to assembly code in
a separate file, without changing the rest of the C code.)

Ian Collins · Apr 22, 2007

Richard said:
The only form of control flow available in a function-like macro is
usually the ternary operator (and && and ||). gcc-like compound
statement expressions let you use, for example, if, while, and for
statements.

Eh?

#define fluf( array, size, n ) \
{ size_t index = 0; n = 0; \
while( index < size ) { n += array[index++]; } }

This is not function-like in that it cannot return a value. The GCC
extension lets you return a value from a block.
OK, but it is void function-like!

Keith Thompson · Apr 22, 2007

Chris Torek said:
Eh?

#define fluf( array, size, n ) \
{ size_t index = 0; n = 0; \
while( index < size ) { n += array[index++]; } }

Here, "fluf" *is* a macro, but is not a *function-like* macro:

x = fluf(a, b, c);

produces a syntax error.

[...]

Strictly speaking, it is a function-like macro (and I should have been
clearer on that point upthread). An object-like macro is one that
takes no arguments; a function-like macro is one that takes arguments
(the standard defines both terms). But not all function-like macros
can actually be used as if they were functions.

My point was that a macro that can actually be used like a function
can't use if, while, and for statements; gcc's compound statement
expressions make this possible. (I'm not necessarily arguing that
they should be added to the standard, though I suppose I wouldn't
mind.)
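
A sketch of what this looks like in practice. It uses the GNU C statement-expression extension (accepted by gcc and clang), so it is not standard C; SUM_INTS and the underscore-suffixed locals are names invented for the example.

```c
#include <stddef.h>

/* GNU C extension, not standard C: a brace block in parentheses whose
   last expression statement supplies the value of the whole construct.
   The trailing-underscore names are chosen to be unlikely to collide
   with identifiers used in the macro's arguments. */
#define SUM_INTS(array, size) ({                          \
    const int *sum_p_ = (array);                          \
    size_t sum_n_ = (size);                               \
    int sum_total_ = 0;                                   \
    for (size_t sum_i_ = 0; sum_i_ < sum_n_; sum_i_++)    \
        sum_total_ += sum_p_[sum_i_];                     \
    sum_total_;                                           \
})
```

With this definition, x = SUM_INTS(a, n); is valid inside a function body, even though the macro contains a for loop.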
[...]

Chris Dollin · Apr 22, 2007

jacob said:
GNU C features some interesting extensions, among others
compound statements that return a value. For instance:

({ int y = foo(); int z;
if (y>0) z = y; else z=-y;
z;
})
A block enclosed by braces can appear within parentheses
to form a block that "returns" a value. This is handy
in some macros, or in other applications.

Actually this construct is nothing more (and nothing less)
than anonymous functions.

It's no more an anonymous function than a five-pound note
is a Brandenburg concerto.

The point about an anonymous function is that it's a /function/,
something you can apply to arguments, and -- in most modern
and decent languages -- that you can treat as a first-class
value, ie, pass it as an argument, return it as a result,
store it in a variable, that sort of thing [1]. The GCC
feature above doesn't do that (it /is/ anonymous, though).

It's a poor man's VALOF-RESULTIS.

[1] That C calls those things "function /pointers/" doesn't
materially affect the argument.
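
For comparison, a small sketch of the first-class uses listed above, done with ordinary function pointers in standard C. All names here are hypothetical.

```c
#include <stddef.h>

typedef int (*int_unary_fn)(int);

static int negate(int x) { return -x; }
static int twice(int x)  { return 2 * x; }

/* Pass a function as an argument: apply fn to every element in place. */
static void map_ints(int *values, size_t count, int_unary_fn fn)
{
    for (size_t i = 0; i < count; i++)
        values[i] = fn(values[i]);
}

/* Return a function as a result. */
static int_unary_fn choose_fn(int want_negate)
{
    return want_negate ? negate : twice;
}
```

The result of choose_fn can be stored in a variable of type int_unary_fn and called later. None of this is possible with a ({ ... }) block, which is evaluated on the spot and has no identity afterwards.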

Eric Sosman · Apr 22, 2007

(Sorry for earlier blank reply; hit Send too soon.)

Steve said:
Please excuse my butting in here given my relative lack of experience and
expertise, but it seems that there is a point that everyone here is missing
about this problem. If it is usually considered necessary and good to
prevent side-effects from occurring when variables are expanded in macros,
then why is it that the pre-processor is not defined to ensure this doesn't
occur?

Because "the preprocessor doesn't know C." That is, the
preprocessor just deals with a sequence of tokens (formally,
"preprocessing tokens") but doesn't attach any meaning to
them unless they happen to be macro identifiers or preprocessor
directives. The preprocessor doesn't know that `int' is a
numeric type, that `extern' is a storage class specifier, or
that `++' is an operator. And in particular, the preprocessor
cannot tell which sequences of tokens have side-effects and
which do not.
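
To make "the preprocessor doesn't know C" concrete, here is a small sketch; the macro and function names are invented for illustration.

```c
/* The preprocessor sees only tokens. It can turn them into a string... */
#define STRINGIFY(x) #x

/* ...or paste them into the expansion, with no notion of precedence. */
#define SQUARE_NAIVE(x) x * x

/* The argument i++ is just the tokens `i` and `++`; no variable named i
   needs to exist, and nothing is incremented. */
static const char *argument_as_seen_by_preprocessor(void)
{
    return STRINGIFY(i++);
}

/* Expands to 1 + 2 * 1 + 2, which is 5 rather than 9. */
static int naive_square_of_sum(void)
{
    return SQUARE_NAIVE(1 + 2);
}
```

The first function returns the string "i++" and the second returns 5. In neither case did the preprocessor attach any meaning to `++` or `+`.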

Eric Sosman · Apr 22, 2007

jacob.navia said:
As you may know, macros and function calls are not
easy to distinguish in C.

This is just a special case of "macros and any construct
whatsoever are not easy to distinguish in C." If you come
across

x = MACRO(y, z);

you may suspect that MACRO is a macro identifier, and you may
even guess that it is a function-like macro (although that's
by no means certain), but you still don't know its expansion.

#define MACRO(u,v) sqrt((u) * (v))
#define MACRO(p,q) p ## q
#define MACRO ++n ,

You HAVE TO KNOW that you are calling a macro and not a function.
You may know that, or you may not.

In C as it stands this has to be a matter of documentation.
You have to know that MACRO is a macro (and what it's good for)
in the same way that you have to know that fflush(stdin) is no
good.

"C as it stands" means, more or less, "C with a preprocessor."
As long as the preprocessor is powerful enough to generate any
arbitrary sequence of source tokens, and as long as it operates
"before" those parts of C that attach meanings to the tokens, the
job of understanding what a macro does must remain a matter for
documentation. One can imagine a C-ish language in which macro
invocations were set off by a special syntax, e.g.

if ((ch = getchar()) == #EOF) ...
#assert(x <= #INT_MAX / 2);

... but I don't think this would be an improvement. Yes, it
would make the reader of source code aware that a macro was in
use, but it still wouldn't solve the rest of the documentation
issue -- and if you need to look up the documentation for #MACRO
anyhow, you might as well look up the documentation for MACRO.
Meanwhile, it would give up the ability to do things like

#undef malloc
#define malloc debug_malloc
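
A sketch of how such a redirection is commonly set up. The body of debug_malloc and the call counter are assumptions for this example. Strictly speaking the standard reserves library names such as malloc once <stdlib.h> is included, so this relies on the implementation tolerating the redefinition.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical counter, for illustration only. */
static unsigned long debug_malloc_calls = 0;

/* Hypothetical wrapper: logs the request, then calls the real malloc.
   It is defined before the #define below, so `malloc` in its body still
   refers to the library's own malloc. */
static void *debug_malloc(size_t size)
{
    debug_malloc_calls++;
    fprintf(stderr, "malloc(%zu)\n", size);
    return malloc(size);
}

/* From here on, every use of malloc in this file goes to the wrapper. */
#undef malloc
#define malloc debug_malloc
```

Code below these lines can keep writing p = malloc(n); unchanged, which is exactly what a mandatory #MACRO call syntax would rule out.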

What is important for language coherence and transparency is that

foo(i++);

evaluates i only once, whether it is a macro or not.

Horse non-proximal to barn, I'm afraid. A "pre"processor
more fully integrated with the rest of the language would be a
Good Thing in some ways (e.g., ability to #if a typedef), but
the language would not be a lot like C. If you want PL/I (and
its more closely integrated source-transformer), you know where
to find it.

Steve Thompson · Apr 22, 2007

Steve Thompson said:

Side effects aren't necessarily the problem. Indeed, adding a side
effect may on occasion be the whole point of defining a macro.

Well, yes. C macros are often useful precisely because they have that
property, but those sorts of macros are rare. I think most people do not
really want to expose their programs to the risks associated with the use
of byzantine macro constructs.

The problem comes when a macro argument is evaluated twice or more and
is written in such a way that this re-evaluation is significant in a
bad way. This happens surprisingly rarely - it seems to be one of the
few C pitfalls that students do actually learn about during their
formal studies, and it is easy enough to remember not to do it - so it
isn't really all that big a deal after all.

I was just wondering why it is that the C pre-processor was not defined to
fix this problem. It is not a stretch to imagine that macro definitions
such as

#define foo(x, y) ({ if (y >0) (x / y); else x; })

could be automatically treated by the preprocessor as if it were something
like

#define foo(x, y) ({ typeof(x) __x = x; typeof(y) __y = y; \
if(__y > 0) (__x /__y); else (__x); })

I suppose you could say that if you need that behaviour, then you should
write an inline function, but that fails to address the usability risk
involved in macros that evaluate their parameters more than once. IMHO it
is a needless risk.

Regards,

Steve

Steve Thompson · Apr 22, 2007

(Sorry for earlier blank reply; hit Send too soon.)

Because "the preprocessor doesn't know C." That is, the
preprocessor just deals with a sequence of tokens (formally,
"preprocessing tokens") but doesn't attach any meaning to
them unless they happen to be macro identifiers or preprocessor
directives. The preprocessor doesn't know that `int' is a
numeric type, that `extern' is a storage class specifier, or
that `++' is an operator. And in particular, the preprocessor
cannot tell which sequences of tokens have side-effects and
which do not.

Aha; that was what I was trying to get at. But as you imply the
preprocessor knows a little bit about C, specifically it almost knows what
a function argument is. In my reply to Mr. Heathfield I ask whether the
pre-processor could not automatically protect macro arguments at very
little cost to the language.

Regards,

Steve

Army1987 · Apr 22, 2007

#define my_abs(x) ({ int y = x; y > 0 ? y : -y; })

would fail if called as my_abs(y).

Use some identifier you know you'll never use, such as
#define my_abs(x) ({ int y__ = x; y__ > 0 ? y__ : -y__; })

Eric Sosman · Apr 22, 2007

Steve said:
Aha; that was what I was trying to get at. But as you imply the
preprocessor knows a little bit about C, specifically it almost knows what
a function argument is. In my reply to Mr. Heathfield I ask whether the
pre-processor could not automatically protect macro arguments at very
little cost to the language.

No, the preprocessor really doesn't know what a function
argument is, not at all. It knows what macro arguments are,
and macro arguments are specified with a syntax that is similar
to that of function arguments, but that's the end of it. The
preprocessor operates at an early stage of translation, before
types and operators and keywords and side-effects exist. This
is why the preprocessor can't evaluate the sizeof an expression
or tell whether a typedef has been declared, and why __func__
cannot be a macro. What doesn't yet exist can't be discerned.
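
A brief sketch of the sizeof point. The thread predates C11, so _Static_assert is offered here only as a later-standard illustration, and INT_IS_WIDE_ENOUGH is a made-up name.

```c
#include <limits.h>

/* #if sizeof(int) >= 2
   would not compile: inside #if, `sizeof` and `int` are just unknown
   identifiers, because types do not exist yet at that stage. */

/* What the preprocessor can test is a macro such as INT_MAX. */
#if INT_MAX >= 32767
#define INT_IS_WIDE_ENOUGH 1
#else
#define INT_IS_WIDE_ENOUGH 0
#endif

/* A check on sizeof has to wait for the compiler proper (C11 and later). */
_Static_assert(sizeof(int) >= 2, "int must be at least 16 bits");
```

The #if is decided during preprocessing from <limits.h> macros alone, while the _Static_assert is evaluated later, once types have meaning.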

One could imagine a linguistic construct that would test
whether an expression was "pure," that is, without side-effects.
Such a construct would need to operate after the preprocessor
was all through and thus couldn't affect what it did, but
might be used in macros. For example, imagine a purify(x)
operator whose value is x if x is an expression with no side-
effects, but which causes a compilation error if x might have
side-effects. You could use it like this:

#define SQUARE(x) ( purify(x) * (x) )

Looks good ... but is it? SQUARE(42) works fine, and SQUARE(++i)
generates a diagnostic, but what about SQUARE(f(x)), where f()
is a function that hasn't even been written yet? Or how about
SQUARE(asin(x)), which may or may not set errno? A static
guard of this kind might not be as useful as could be desired.

Army1987 · Apr 22, 2007

Richard Heathfield said:
Michal Nazarewicz said:

At this point (if you don't want #define max(x,y) ((x)>(y) ? (x) :
(y)) to avoid either operand being evaluated twice) the stuff above
already needs to know the types of a and b; so why is it better
than
int max(int a, int b) { return a > b ? a : b; }
?

Servé Laurijssen · Apr 22, 2007

Dave Vandervies said:
Like C99's compound literals? A block statement constituting the body
of the function, cast to a function pointer type.
The cast would have to name the function arguments, to identify them in
the body.

So they'd be used something like:
--------
qsort(arr,num,sizeof *arr,
(int (*)(const void *va,const void *vb)){
const int *a=va;const int *b=vb; return (*a<*b)?(-1):(*a>*b);});
--------

Not sure if that would be an improvement, although compound literals can
make messy-looking code too.
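
For reference, the conventional way to do the same sort in standard C uses a named comparison function. The names compare_ints and sort_ints are assumptions for this sketch.

```c
#include <stdlib.h>

/* Comparison callback for qsort: negative, zero or positive depending on
   whether *va sorts before, equal to or after *vb. */
static int compare_ints(const void *va, const void *vb)
{
    const int *a = va;
    const int *b = vb;
    return (*a < *b) ? -1 : (*a > *b);
}

/* Sort an array of int in ascending order. */
static void sort_ints(int *arr, size_t num)
{
    qsort(arr, num, sizeof *arr, compare_ints);
}
```

The cost compared with the proposed anonymous form is one extra name at file scope; the comparison logic is identical.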

Richard Bos · Apr 23, 2007

I imagine that one reason why uses of this are not obvious is just
that in C (unlike Lisp) it's not traditional to write things as nested
multi-line expressions, and that in turn is because at present you
can't generally do it.

Well, no, it's traditional not to write things as nested multi-line
expressions because such things rapidly grow beyond every possibility of
control and legibility. I'll admit that Lisp _has_ solved the spaghetti
code problem, but it's done so by replacing it with

)))))))))))
)))))
))
))))
)

maccheroni code, and I'm not entirely sure that that is _that_ much of
an improvement.

Richard

Richard Tobin · Apr 23, 2007

I imagine that one reason why uses of this are not obvious is just
that in C (unlike Lisp) it's not traditional to write things as nested
multi-line expressions, and that in turn is because at present you
can't generally do it.

Well, no, it's traditional not to write things as nested multi-line
expressions because such things rapidly grow beyond every possibility of
control and legibility. I'll admit that Lisp _has_ solved the spaghetti
code problem, but it's done so by replacing it with
[...]

"Spaghetti code" doesn't usually refer to heavy nesting. On the
contrary, it usually refers to the use of goto to flatten out control
structure.

But I'm not arguing for or against that style of coding, just pointing
out that it's not useless as Richard Heathfield was suggesting. One
might reasonably argue that it's not a C-like way to do things.

Going back to macros, it's very common to use the conditional operator
in them (consider the traditional implementation of getc()), and one
might say that it provides all the control structure that most macros
need (if they need any). Or you might say it's a kludge, and we
should have the full power of C at our disposal even if it would be
ugly to overuse it.
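
A sketch in the spirit of the traditional getc() macro, showing the conditional operator as the only control structure. The struct, the READER_GETC macro and the underflow function are hypothetical and far simpler than a real stdio implementation.

```c
#include <stddef.h>

/* Hypothetical minimal input buffer. */
struct byte_reader {
    const unsigned char *next;  /* next unread byte */
    size_t remaining;           /* bytes left in the buffer */
};

/* Slow path. A real stdio would try to refill the buffer here; this
   sketch simply reports end of input with -1. */
static int reader_underflow(struct byte_reader *r)
{
    (void)r;
    return -1;
}

/* Fast path inline, slow path via a function call, selected with ?:.
   Note that r is evaluated more than once, which the C standard also
   permits for a macro implementation of getc. */
#define READER_GETC(r) \
    ((r)->remaining > 0 ? ((r)->remaining--, (int)*(r)->next++) \
                        : reader_underflow(r))
```

Because the whole expansion is a single expression, c = READER_GETC(&rd) works anywhere a function call would, provided the argument has no side effects.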

-- Richard

Richard Bos · Apr 23, 2007

Well, no, it's traditional not to write things as nested multi-line
expressions because such things rapidly grow beyond every possibility of
control and legibility. I'll admit that Lisp _has_ solved the spaghetti
code problem, but it's done so by replacing it with
[...]

"Spaghetti code" doesn't usually refer to heavy nesting.

Yes, that is precisely what I said. Lisp doesn't have spaghetti code,
because it replaces it with abyssal nesting, a.k.a. maccheroni code.

Richard

Richard Tobin · Apr 23, 2007

Yes, that is precisely what I said. Lisp doesn't have spaghetti code,
because it replaces it with abyssal nesting, a.k.a. maccheroni code.

Perhaps "onion code" would be a better term.

-- Richard

Richard Bos · Apr 27, 2007

Perhaps "onion code" would be a better term.

True, but I am rather fond of pasta.

Richard

Mark McIntyre · Apr 28, 2007

Perhaps "onion code" would be a better term.

True, but I am rather fond of pasta.

In which case, a flame war about the correct spelling of macaroni is
probably in order....

--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
