And to answer the next question (I can feel it coming;
me rheumatism allus' gits a-cranky when one o' them li'l
questions is in the wind): No, this doesn't cast doubt on
the status of the main() function. (K.B.'s original post
mentioned that it was prompted by the recent main() thread.)
The Standard prescribes two forms for main(), and allows
an implementation to support additional forms. So it's
sometimes said that main() is unique among C functions in
having "multiple signatures." But this isn't quite right:
In any particular (hosted) program, main() has exactly one
signature, namely, the signature it is defined with. ...
And, perhaps not incidentally, here is how it works on one
system on which "return" from a non-variadic function pops
the arguments pushed by the caller: ... um, hang on a moment.
First, let me describe how arguments work, on this machine.
The machine has a conventional stack, and arguments are pushed
onto the stack in the usual way:
    extern int foo(char *, int);
    ...
    result = foo("this", 7);
compiles to:
    push $7           # push last argument first, and then
    push $.LC3        # first argument last
    call foo_         # call the function
    mov r1, -4(sp)    # save result
    ...
and "foo" itself ends with:
    ret $8
where the "$8" means "pop 8 bytes off the stack". This makes
function calls slightly smaller and faster, since all the callers
get to omit their "pop bytes off the stack" instructions.
Naturally, this method does not work with variadic functions, so
those end with "ret $0" and the caller, who knows how many bytes
were pushed, pops the pushed bytes.
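For anyone who wants to poke at the bookkeeping, here is a toy model
of the two conventions in C. It is only a sketch: the simulated stack
counter, the 4-byte WORD size (inferred from "ret $8" for two
arguments), and every function name are made up for illustration and
are not anything this machine's compiler emits.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model: sp counts the bytes currently pushed on a simulated stack.
   All names are hypothetical; this is not real compiler output. */
enum { WORD = 4 };              /* assumed size of one pushed argument */

static size_t sp = 0;

static void push_word(void) { sp += WORD; }

/* Models "ret $n": the callee removes n bytes of arguments on return. */
static void ret_pop(size_t n)
{
    assert(n <= sp);            /* cannot pop more than was pushed */
    sp -= n;
}

/* Non-variadic callee such as foo(char *, int): ends with "ret $8". */
static void fixed_callee(void) { ret_pop(2 * WORD); }

/* Variadic callee: does not know the argument count, so "ret $0". */
static void variadic_callee(void) { ret_pop(0); }

/* Caller of the non-variadic function: push, call, nothing left to pop.
   Returns the stack depth after the call. */
size_t call_fixed(void)
{
    push_word();
    push_word();
    fixed_callee();
    return sp;
}

/* Caller of the variadic function: pushes nargs words, calls, and then
   pops those same words itself.  Returns the depth after the call. */
size_t call_variadic(size_t nargs)
{
    size_t i;

    for (i = 0; i < nargs; i++)
        push_word();
    variadic_callee();
    sp -= nargs * WORD;         /* the caller's own "pop" instruction */
    return sp;
}
```

Either way the stack ends up balanced; the only difference is which
side does the popping.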
Now, the problem lies in handling main(). The compiler sees
that main() is *not* a variadic function -- its parameter list does
not end with ", ...":
    int main(void) {
        return 42;
    }
    /* or
    int main(int argc, char **argv) {
        return 0;
    }
    */
The first one compiles to:
    mov $42, r1
    ret $0
which pops nothing at all off the stack; but the second compiles
to:
    mov $0, r1
    ret $8
which pops 8 bytes off the stack. How can the startup code know
what to do?
There are three "obvious" solutions. The first one is that
the compiler recognizes main() and always uses "ret $0" (or always
uses "ret $8"), so that the startup code can be sure what happened.
Another is to have the startup code examine the value in the
"sp" register upon return, to see whether main() popped 8 bytes
or none (or save and restore it so that it works no matter what).
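The second approach can be modeled with the same sort of toy stack.
Again, this is a sketch under assumptions: the 4-byte WORD, the
function-pointer type, and all the names below are hypothetical
stand-ins, and real startup code would do this in a few instructions
of assembly rather than in C.

```c
#include <assert.h>
#include <stddef.h>

enum { WORD = 4 };              /* assumed size of one pushed argument */

static size_t sp = 0;           /* bytes on the simulated stack */

typedef void (*main_model)(void);   /* hypothetical stand-in for main() */

/* Models "int main(void)": returns with "ret $0", popping nothing. */
void main_void(void) { }

/* Models "int main(int, char **)": returns with "ret $8". */
void main_args(void)
{
    assert(sp >= 2 * WORD);
    sp -= 2 * WORD;
}

/* Current depth of the simulated stack. */
size_t stack_depth(void) { return sp; }

/* Startup model: remember sp, push argv and argc unconditionally, call
   main, work out how many bytes main popped, then restore sp so the
   stack is balanced whichever form of main() was compiled. */
size_t bytes_popped_by(main_model m)
{
    size_t saved = sp;
    size_t popped;

    sp += 2 * WORD;                     /* push argv, then argc */
    m();
    popped = (saved + 2 * WORD) - sp;   /* 0 for "ret $0", 8 for "ret $8" */
    sp = saved;                         /* restore: works no matter what */
    return popped;
}
```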
A third method is to have the compiler recognize "main", and compile
it to one of *two* "link names". Here, the two examples above
actually produce the following:
    __0_main: .global __0_main
        mov $42, r1
        ret $0
and:
    __2_main: .global __2_main
        mov $0, r1
        ret $8
Now, at link time, the compiler (strictly speaking, the link step it
runs) simply looks to see whether it can find the symbol "__0_main"
or the symbol "__2_main". Depending on which one it finds, it links
in the matching startup code:
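    void __0_startup(void) {
        extern int main(void);
        extern void __init_c_library(void);
        extern void exit(int);
        __init_c_library();     /* library setup comes first */
        exit(main());           /* main's return value is the exit status */
    }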
or:
    void __2_startup(void) {
        extern int main(int, char **);
        extern void __init_c_library(void);
        extern void exit(int);
        int argc;
        char **argv;
        __init_c_library();
        ... find argc and argv ...
        exit(main(argc, argv));
    }
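The link-time choice itself is nothing more than a symbol lookup. Here
is a minimal sketch of it in C, assuming the link step can see the
defined symbols as a plain array of names; choose_startup() and that
array are hypothetical, and a real linker's symbol table looks nothing
like this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Given the names of the symbols defined by the program, return the
   name of the startup routine to link in, or NULL if neither form of
   main() was found.  Hypothetical helper, for illustration only. */
const char *choose_startup(const char *const *symbols, size_t n)
{
    size_t i;

    assert(symbols != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (strcmp(symbols[i], "__0_main") == 0)
            return "__0_startup";   /* main(void): nothing to pop */
        if (strcmp(symbols[i], "__2_main") == 0)
            return "__2_startup";   /* main(int, char **): pops 8 bytes */
    }
    return NULL;                    /* no main() at all */
}
```

A program that defines "int main(void)" exports "__0_main", so the
lookup yields "__0_startup"; one with neither symbol has no main()
at all.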
Which method does this compiler actually use, on this machine?
The answer is: who cares? It works, just as the Standard requires.
That is all you need to know.