(It is indeed undefined behavior, and it does in fact break on
x86 architecture systems that use the "RET N" instruction form
for fixed-argument functions. What happens, of course, is that
the "RET 12" or whatever in the fixed-argument callee removes the
wrong number of bytes from the caller's stack frame, so that
after calling the function, all your local variables go kaboom.
Depending on whether the caller uses ENTER/LEAVE and/or a frame
pointer, the caller itself may also not be able to return.)
Of course. That's pretty much the raison d'être (as the French would
put it) of the "cdecl" calling convention. I would imagine that the
original authors of C invented cdecl (though they may not have called
it that) precisely so that printf() could be implemented.
Notes for the prissy:
1) I say "cdecl", using that as a term in and of itself, to avoid
any implication that the cdecl calling convention is part of or
gets any funding from or is in any way associated with the
C language (the holy grail of this ng).
2) It is probably OT to even mention calling conventions, but
I personally think it is OK to discuss cdecl as long as we make
it clear, as we must, that it is technically OT.
To me, the word "cdecl" refers to the program for turning C
declarations into pseudo-English and vice versa
... as in:
% cdecl
cdecl> explain int (*foo)(arg)
declare foo as pointer to function (arg) returning int
cdecl> declare p as pointer to array 3 of pointer to function (args) returning pointer to pointer to char
char **(*(*p)[3])(args)
(Note: my copy of cdecl disappeared long ago and the above is made
up from memory.)
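(Incidentally, that last declaration does check out as real C. A
quick sketch, with a concrete parameter list substituted for the
made-up "(args)" -- the "(int)" here is purely my own invention,
for illustration:

    /* p: pointer to array 3 of pointer to
       function (int) returning pointer to pointer to char */
    char **(*(*p)[3])(int);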
That said, there are some fundamental errors above.
C grew out of the B -> NB (New B) -> C progression, as Dennis
Ritchie has noted. The first C compilers were done on the PDP-11.
(See http://cm.bell-labs.com/cm/cs/who/dmr/chist.html for details.)
The PDP-11 has a hardware-denoted "stack pointer" register ("sp")
with "push" and "pop" semantics implied by its subroutine call
instruction. (As with many such machines, there are separate "user"
and "kernel"/"system" stack pointers as well, controlled by the
"PSW" or Processor Status Word. As such, one can talk about the
"usp" vs the "ssp": user and system stack pointers. But this is
invisible to ordinary user programs.)
The "most natural" design of assembly code on the PDP-11 has the
stack pointer adjusted by the caller. The reason is that if you
push the arguments on the stack and then call a function, you end
up with a stack on which the return address sits on top (pushed
last), so a "return from subroutine" instruction pops it
automatically but leaves the arguments behind. Unlike the x86,
there is no "return and pop more arguments" instruction -- you
must use a separate pop (or add to the stack pointer), and of
course, this has to go in the caller, because the "rts pc" has
already returned.
Given these "machine facts", as it were, the obvious, simple, easy
way to handle subroutine calls in a language that supports recursion
and uses the machine's stack is to evaluate the arguments right to
left, pushing each value as soon as it is computed, and then emit
the "call subroutine" instruction followed by the "pop as many bytes
as we pushed" instruction. Variadic routines are easy to write as
well. If you take the address of the stack element just above the
return-address, you have a pointer to the first argument. Each
subsequent argument is the next two bytes (because all arguments
are "int"s -- early C did not have "long", and nobody actually used
floating-point). You could even write an "nargs" function:
just trace up the call stack, look at the instruction following the
call, and figure out how many bytes it pops (and divide by two since
two PDP-11 bytes gives you one PDP-11 int).
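Here is a minimal sketch of that argument-walking trick, in roughly
the C of the era. It assumes the pure PDP-11 model just described:
every argument is a two-byte "int", pushed contiguously, so the
address just past the first named parameter is the address of the
second argument. (The function name and its "count first" protocol
are my own invention, and on any modern ABI this is flagrantly
undefined behavior; it is here only to show why caller-pops made
variadic routines so easy.)

    /* sum(n, a, b, ...): add up the n int arguments after n.
       Assumes all arguments sit contiguously on the stack. */
    int
    sum(n)
        int n;
    {
        int *ap = &n + 1;   /* just past n: the first variadic arg */
        int tot = 0;

        while (n-- > 0)
            tot += *ap++;   /* advance one two-byte "int" at a time */
        return tot;
    }

A call such as sum(3, 10, 20, 30) would, on that model, yield 60.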
Of course, "nargs" stops working once you support "long" and
floating-point ("double") arguments, which use 4 and 8 bytes (2
and 4 ints) respectively. (It also stops working when you get a
big separate-I&D PDP-11 and are unable to read instructions
because they are in the wrong address space.)
The problems came in when C started growing up and trying to move
out of the restricted "everything is a PDP-11" world. Suddenly
there were Honeywell machines with 9-bit bytes, the Interdata with
32-bit "int"s, machines on which "char" should properly be unsigned,
and so on. And of course, there were machines like the Intel x86,
with its "RET 12" instructions that both returned *and* popped
arguments off the caller's stack -- but required knowing for sure
exactly how many bytes to pop.
The ANSI C89 folks decided to allow Microsoft and other vendors
to use the "RET 12" style instructions, by the simple means of
requiring that a prototype be in scope before calling variadic
functions like printf(). The C compiler could assume that a call
like:
f(1, 2L);
that passed 6 bytes (a two-byte "int" plus a four-byte "long")
called a non-variadic function f() that ended
with a "RET 6". On the other hand, given:
int g(int, ...);
g(1, 2L);
that same C compiler would know that g() used a plain "RET"
instruction, and pop the 6 bytes itself.
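In source form, the distinction the compiler relies on looks
something like the sketch below. The declarations are illustrative
(the names f and g come from the example above, and the 6-byte
figure assumes the two-byte "int" and four-byte "long" of the
16-bit x86):

    int f(int, long);   /* fixed arguments: callee may pop ("RET 6") */
    int g(int, ...);    /* variadic: caller must pop (plain "RET") */

    void
    demo(void)
    {
        f(1, 2L);   /* compiler may emit a callee-pops sequence */
        g(1, 2L);   /* compiler must emit a caller-pops sequence */
    }

Omit the prototype for g() and a callee-pops compiler has no way to
tell the two cases apart, which is exactly why C89 requires the
prototype to be in scope.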
Fortunately, the Intel x86 CPU remained obscure and was never used
in any popular hardware, so nobody actually has to declare their
variadic functions in advance.
(OK, I *am* kidding: fortunately,
C compiler vendors used the slower ret-and-separate-pop instruction
sequence, so that C programmers could get away with sloppy code.
And as it turns out, ret-and-separate-pop is now often faster
anyway.)