Does GCC optimize variadic functions to death?

Elmar

Hi everyone,

I just switched to GCC 4.4 (gcc44 (GCC) 4.4.0 20090514 (Red Hat
4.4.0-6)) and noted that quite a lot of my functions with variable
arguments are not compiled correctly anymore at -O2 (things are OK at
-O).

Since the problem could well be my knowledge of the C standard, I'd
like to ask about it here before I file a bug report.

Here is the minimum example:

/* PRINT A LIST OF INTEGER VALUES
==============================
list[0] is the number of values in list */
void lst_print(int *list)
{ int i;

for (i=0;i<*list;i++) printf("Value %d: %d\n",i,list[1+i]); }

/* PRINT INTEGER VALUES
====================
The values are specified directly as function parameters */
void val_print(int values,...)
{ /* The list has been put on the stack by the caller... */
lst_print(&values); }

/* TEST PROGRAM
============ */
int main(int argc,char **argv)
{ /* Print 5 integer values from 10 to 50 */
val_print(5,10,20,30,40,50);
return(0); }


Now, at optimization level -O, the result is as expected:
Value 0: 10
Value 1: 20
Value 2: 30
Value 3: 40
Value 4: 50

But at optimization level -O2, the result is garbage:
Value 0: 241472600
Value 1: 0
Value 2: 1
Value 3: 12652532
Value 4: 11273664

Analysis of the assembler code shows that at -O2, GCC entirely
bypasses the function val_print and calls lst_print directly from
main(), but forgets to also put the variable arguments on the stack:

0x08167820 <main+0>: push %ebp
0x08167821 <main+1>: mov %esp,%ebp
0x08167823 <main+3>: sub $0x28,%esp
0x08167826 <main+6>: lea -0xc(%ebp),%eax
0x08167829 <main+9>: mov %eax,(%esp)
0x0816782c <main+12>: movl $0x5,-0xc(%ebp)
0x08167833 <main+19>: call 0x81677c0 <lst_print>
0x08167838 <main+24>: xor %eax,%eax
0x0816783a <main+26>: leave
0x0816783b <main+27>: ret


What do you think about this? I would say that since variable
arguments can only be accessed via a pointer to the last non-variable
argument, GCC is wrong to ignore the variable arguments here: Function
val_print takes the address of the last non-variable argument and
passes it on, so GCC must expect that the variable arguments will
still be accessed later, no..?

In case GCC is right and my functions are not allowed by the C
standard: does anyone have a hint how to fix my program?
An ugly hack that works is to move function val_print or lst_print to
another source code file.
Maybe there is an __attribute__ I can add to the function?

Many thanks for your time and help,
Elmar
 
Tom St Denis

Hi everyone,

I just switched to GCC 4.4 (gcc44 (GCC) 4.4.0 20090514 (Red Hat
4.4.0-6)) and noted that quite a lot of my functions with variable
arguments are not compiled correctly anymore at -O2 (things are OK at
-O).

Since the problem could well be my knowledge of the C standard, I'd
like to ask about it here before I file a bug report.

Here is the minimum example:

/* PRINT A LIST OF INTEGER VALUES
   ==============================
   list[0] is the number of values in list */
void lst_print(int *list)
{ int i;

  for (i=0;i<*list;i++) printf("Value %d: %d\n",i,list[1+i]); }

/* PRINT INTEGER VALUES
   ====================
   The values are specified directly as function parameters */
void val_print(int values,...)
{ /* The list has been put on the stack by the caller... */
  lst_print(&values); }

If this is what you think "portable code" is ...

From what I recall, nowhere in the spec does it say that function
parameters need to be adjacent in memory at all. That your hack happened
to work before is just coincidence, not standard.

Look up va_start (and associated functions) on how to properly access
the parameters.

Tom
 
Noob

Elmar said:
I just switched to GCC 4.4 and noted that quite a lot of my functions
with variable arguments are not compiled correctly anymore at -O2
(things are OK at -O).

For reference, GCC's -O is equivalent to -O1.
Since the problem could well be my knowledge of the C standard, I'd
like to ask about it here before I file a bug report.

The problem comes from your code ;-)

You need to use a va_list.

<quote>

4.8 VARIABLE ARGUMENTS <stdarg.h>

The header <stdarg.h> declares a type and defines three macros, for
advancing through a list of arguments whose number and types are not
known to the called function when it is translated.

A function may be called with a variable number of arguments of
varying types. As described in $3.7.1, its parameter list contains
one or more parameters. The rightmost parameter plays a special role
in the access mechanism, and will be designated parmN in this
description.

The type declared is

va_list

which is a type suitable for holding information needed by the macros
va_start , va_arg , and va_end . If access to the varying arguments
is desired, the called function shall declare an object (referred to
as ap in this section) having type va_list . The object ap may be
passed as an argument to another function; if that function invokes
the va_arg macro with parameter ap , the value of ap in the calling
function is indeterminate and shall be passed to the va_end macro
prior to any further reference to ap .

</quote>

cf. also
http://www.opengroup.org/onlinepubs/009695399/basedefs/stdarg.h.html
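
As a rough illustration of the three macros described in the quote, here is
a minimal sketch of a function that walks its variable arguments portably.
The name sum_ints is hypothetical and does not come from this thread; it
simply assumes that every variable argument is an int and that the first
parameter gives their number.

```c
#include <stdarg.h>

/* Hypothetical helper: adds up the `count` int arguments that follow. */
int sum_ints(int count, ...)
{
    va_list ap;
    int i, total = 0;

    va_start(ap, count);            /* count is the last named parameter */
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);   /* fetch the next int argument */
    va_end(ap);                     /* required before returning */

    return total;
}
```

Nothing here takes the address of a parameter, so the compiler is free to
pass the arguments however the platform's calling convention dictates.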

Regards.
 
Seebs

I just switched to GCC 4.4 (gcc44 (GCC) 4.4.0 20090514 (Red Hat
4.4.0-6)) and noted that quite a lot of my functions with variable
arguments are not compiled correctly anymore at -O2 (things are OK at
-O).

Yes, they are.
/* PRINT A LIST OF INTEGER VALUES
==============================
list[0] is the number of values in list */
void lst_print(int *list)
{ int i;

for (i=0;i<*list;i++) printf("Value %d: %d\n",i,list[1+i]); }

This is 100% wrong.

You misspelled:
void lst_print(int *list, ...)
{
  va_list ap;
  int i;
  va_start(ap, list);
  for (i = 0; i < *list; ++i) {
    int val = va_arg(ap, int);
    printf("Value %d: %d\n", i, val);
  }
  va_end(ap);
}

... But that wouldn't work in your case, because...
void val_print(int values,...)
{ /* The list has been put on the stack by the caller... */

No, it hasn't.

The arguments are accessible only through a "va_list".
"man va_arg".
What do you think about this? I would say that since variable
arguments can only be accessed via a pointer to the last non-variable
argument,

Uh, no.

<stdarg.h> is your friend, and has been available for roughly twenty
years. The thing you were doing before was never portable or guaranteed,
and there have been targets where it would never, ever, have worked -- that
it worked at all was pure coincidence, and given how much it's confused
you, I'd call it a pretty unfortunate coincidence.

-s
 
Seebs

If the compiler can prove that a parameter is never referenced, can it
eliminate it entirely, to the point where it's not even passed?

Of course it can. Since you can't, in conforming code, tell whether it
did or not, it makes no difference.

-s
 
Elmar

Many thanks for all your comments!

I forgot to mention that I'm not a complete newbie (remember, I could
trace the problem on the assembler level ;-), so I did know about
va_list and friends.
This is 100% wrong.


You misspelled:
void lst_print(int *list, ...)
{
  va_list ap;
  int i;
  va_start(ap, list);
  for (i = 0; i < *list; ++i) {
    int val = va_arg(ap, int);
    printf("Value %d: %d\n", i, val);
  }
  va_end(ap);
}

The problem is that in real life, lst_print from my minimum example is
a much larger function, which I don't want to write twice, once as
shown originally (other functions need to call it with a list, not
with variable arguments), and once as typed out by you (would be a
nasty code duplication).

So the only correct alternative would be to turn val_print into a
wrapper for lst_print, and use va_start and va_arg to create a list
that is then passed to lst_print. This would require malloc/free etc.
to create an exact copy of what is already beautifully laid out on the
stack. My application is very performance critical, so creating
intermediate exact copies is something I usually try to avoid.
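
For comparison, a wrapper of the kind described above does not strictly need
malloc/free if an upper bound on the number of values is acceptable. The
following is only a sketch under that assumption: MAX_VALUES is a
hypothetical limit, and lst_print is changed to return the sum of the values
it prints (the original returns void) purely so the result can be checked.

```c
#include <stdarg.h>
#include <stdio.h>

#define MAX_VALUES 16   /* assumed upper bound, not from the original post */

/* Same layout as the original: list[0] is the number of values.
   Returns the sum of the printed values (added for illustration only). */
int lst_print(const int *list)
{
    int i, sum = 0;

    for (i = 0; i < list[0]; i++) {
        printf("Value %d: %d\n", i, list[1 + i]);
        sum += list[1 + i];
    }
    return sum;
}

/* Copies the variable arguments into a local array, then forwards it. */
int val_print(int count, ...)
{
    int list[1 + MAX_VALUES];
    va_list ap;
    int i;

    if (count < 0)
        count = 0;
    if (count > MAX_VALUES)
        count = MAX_VALUES;         /* extra values are silently dropped */

    list[0] = count;
    va_start(ap, count);
    for (i = 0; i < count; i++)
        list[1 + i] = va_arg(ap, int);
    va_end(ap);

    return lst_print(list);
}
```

The copy costs one store per value, which is comparable to the pushes the
caller already performs to pass the arguments.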

Portability is not my concern, since my application has >100000 lines
of assembler code and is tied to x86.

That's why I hoped there would be a simple trick to tweak the GCC
optimizer...

A few months from now, I'll be in trouble anyway, when it's time to
port to x86_64. There, va_start and va_arg are a true nightmare (the
assembler code is blown up and slowed down by a factor 2-3 since
variable arguments can end up in registers). So in case anyone has a
brilliant idea how to make my original example work on x86_64, you'll
be my hero!

Anyway, thanks!
Elmar
 
Seebs

I forgot to mention that I'm not a complete newbie (remember, I could
trace the problem on the assembler level ;-), so I did know about
va_list and friends.

In this case, knowing assembly is screwing you up -- you're assuming
that the function uses the same calling sequence all the time, but that's
not the case.

As an example, there has existed at least one system on which integer
and floating point arguments were collated -- the first floating point
argument would be in a particular register no matter how many integer
arguments there were.
The problem is that in real life, lst_print from my minimum example is
a much larger function, which I don't want to write twice, once as
shown originally (other functions need to call it with a list, not
with variable arguments), and once as typed out by you (would be a
nasty code duplication).

Ahh, I see. So either it's passed as an array of objects, or a series
of arguments.

void
va_or_list_print(int use_va, int *list, va_list ap) {
  int i;
  for (i = 0; i < *list; ++i) {
    /* list[0] holds the count, so the values start at list[1] */
    printf("%d\n", use_va ? va_arg(ap, int) : list[1 + i]);
  }
}

void
va_print(int *count, ...) {
  va_list ap;
  va_start(ap, count);
  va_or_list_print(1, count, ap);
  va_end(ap);
}

void
list_print(int *items, ...) {
  va_list ap;
  va_start(ap, items);
  va_or_list_print(0, items, ap);
  va_end(ap);
}

The reason you need a '...' and a va_list for list_print is that you have
to be able to pass a valid va_list object to va_or_list_print.

Basically, you can pass another argument to the innermost function telling
it how to get its arguments. Then it'll work either way.

It ain't pretty, but I think it should work. As long as you can
make sure you always have a pointer to the number of arguments... But
you could do that in other ways, too:

void
va_print(int dummy, ...) {
  int count = 0;
  va_list ap;
  va_start(ap, dummy);
  while (va_arg(ap, int) != -1)
    ++count;
  va_end(ap);
  va_start(ap, dummy);
  va_or_list_print(1, &count, ap);
  va_end(ap);
}

You can't get rid of that first argument, though. (But I suppose
you could make it a pointer, so you'd store the count of arguments
found before the sentinel in it.)
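
Spelled out as a self-contained function, the two-pass sentinel idea might
look like the sketch below. The name sum_until_sentinel and the choice to
return a sum are assumptions made for illustration; -1 is the sentinel as in
the code above, so -1 cannot appear as a real value.

```c
#include <stdarg.h>

/* Hypothetical example: counts the int arguments up to a -1 sentinel,
   then restarts the traversal and adds them up. */
int sum_until_sentinel(int dummy, ...)
{
    va_list ap;
    int count = 0, total = 0, i;

    (void)dummy;                    /* only needed as the anchor for va_start */

    va_start(ap, dummy);            /* first pass: count the values */
    while (va_arg(ap, int) != -1)
        ++count;
    va_end(ap);

    va_start(ap, dummy);            /* second pass: start over from the top */
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);

    return total;
}
```

Each va_start is paired with its own va_end, which is what allows the list
to be walked twice.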

-s
 
Alan Curry

|
|The problem is that in real life, lst_print from my minimum example is
|a much larger function, which I don't want to write twice, once as
|shown originally (other functions need to call it with a list, not
|with variable arguments), and once as typed out by you (would be a
|nasty code duplication).
|
|So the only correct alternative would be to turn val_print into a
|wrapper for lst_print, and use va_start and va_arg to create a list
|that is then passed to lst_print. This would require malloc/free etc.

It doesn't require all that... it just requires a couple of C99 features:
a variadic macro and an array literal.

#define val_print(...) lst_print((int[]){__VA_ARGS__})

There you have something that can be used like a variadic function, but
compiles as a call to a function that takes a pointer to the first element
of an array. And only one copy of the list is generated.

There will probably be an array-copy operation in the generated code, but
your original version had a bunch of stack pushes, which did an equivalent
amount of copying.
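
A complete version of this idea could look as follows. It is only a sketch:
lst_print is made to return the sum of the values (the original returns
void) so that the effect is easy to check, and its parameter is declared
const int * as a small assumption. As in the original program, the first
macro argument is the count.

```c
#include <stdio.h>

/* list[0] is the number of values in list.
   Returns the sum of the printed values (added for illustration only). */
int lst_print(const int *list)
{
    int i, sum = 0;

    for (i = 0; i < list[0]; i++) {
        printf("Value %d: %d\n", i, list[1 + i]);
        sum += list[1 + i];
    }
    return sum;
}

/* C99: the macro arguments become the elements of an unnamed int array,
   and a pointer to its first element is passed to lst_print. */
#define val_print(...) lst_print((int[]){__VA_ARGS__})
```

A call such as val_print(5, 10, 20, 30, 40, 50) then expands to
lst_print((int[]){5, 10, 20, 30, 40, 50}).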
 
Ben Bacarisse

|
|The problem is that in real life, lst_print from my minimum example is
|a much larger function, which I don't want to write twice, once as
|shown originally (other functions need to call it with a list, not
|with variable arguments), and once as typed out by you (would be a
|nasty code duplication).
|
|So the only correct alternative would be to turn val_print into a
|wrapper for lst_print, and use va_start and va_arg to create a list
|that is then passed to lst_print. This would require malloc/free etc.

It doesn't require all that... it just requires a couple of C99 features:
a variadic macro and an array literal.

#define val_print(...) lst_print((int[]){__VA_ARGS__})

An ingenious idea! However, you don't get exactly the same effect as
a function. For example, on my machine

val_print(L"abc");

is silently accepted. Of course this might be considered a feature.

Less of a feature is the fact that you can't take the address of
val_print but, honestly, I like it (in a sort of dark and twisted way).
 
Nick Keighley

<stdarg.h> is your friend, and has been available for roughly twenty
years.

and before that there was varargs, which I believe gave "semi-"
portability. I like the quote (from memory, unfortunately) from
Plauger's library book:

"...in the early days everyone knew how Ritchie's compiler laid out the
arguments in memory so it was a simple exercise in pointer arithmetic
to walk the argument list"
 
Stephen Sprunk

I just switched to GCC 4.4 (gcc44 (GCC) 4.4.0 20090514 (Red Hat
4.4.0-6)) and noted that quite a lot of my functions with variable
arguments are not compiled correctly anymore at -O2 (things are OK at
-O).

Incidentally, your code worked fine for me at -O2 but broke at -O3. Not
that the particular level matters, but apparently whatever optimization
is breaking your code moved from -O3 in GCC 4.2 to -O2 in GCC 4.4, which
might help you track down which one it is if you're curious...
The problem is that in real life, lst_print from my minimum example is
a much larger function, which I don't want to write twice, once as
shown originally (other functions need to call it with a list, not
with variable arguments), and once as typed out by you (would be a
nasty code duplication).

Obviously, one should never duplicate massive amounts of code without a
very good reason, and this doesn't appear to be one of those cases.
So the only correct alternative would be to turn val_print into a
wrapper for lst_print, and use va_start and va_arg to create a list
that is then passed to lst_print. This would require malloc/free etc.
to create an exact copy of what is already beautifully laid out on the
stack. My application is very performance critical, so creating
intermediate exact copies is something I usually try to avoid.

You shouldn't need malloc/free for this at all; if so, you're probably
doing something wrong.

Here is a minimally modified version of your program, which compiles and
works just fine for me at all optimization levels and is, AFAICT,
completely portable:

#include <stdio.h>
#include <stdarg.h>

void lst_print(int count, va_list list) {
int i;

for (i=0; i<count; i++)
printf("Value %d: %d\n", i, va_arg(list, int));
}

void val_print(int count, ...) {
va_list list;

va_start(list, count);
lst_print(count, list);
va_end(list);
}

int main(int argc, char *argv[]) {
val_print(5, 10, 20, 30, 40, 50);
return 0;
}

This, as you noted, turns val_print() into a "wrapper" of sorts but
doesn't require additional memory allocation, copying data, etc. and
should perform at least as well as your code. The magic is in the
(va_list) parameter that is passed from val_print() to lst_print().

If you need other functions to be able to call lst_print() with a
variable argument list, just like val_print(), the trick is to create a
third function (by convention, with the same name as the simplest
wrapper with a "v" prefix or suffix) with the meat of the logic:

void lst_printv(int count, va_list list) {
int i;

for (i=0; i<count; i++)
printf("Value %d: %d\n", i, va_arg(list, int));
}

void lst_print(int count, ...) {
va_list list;

va_start(list, count);
lst_printv(count, list);
va_end(list);
}

void val_print(int count, ...) {
va_list list;

va_start(list, count);
/* do something complicated */
lst_printv(count, list);
va_end(list);
}


If neither of these solutions is workable in your specific scenario,
you'll need to provide more detail so that we can understand why not and
make better suggestions.
Portability is not my concern, since my application has >100000 lines
of assembler code and is tied to x86.

That's why I hoped there would be a simple trick to tweak the GCC
optimizer...

Unfortunately for you, when you change the optimization levels, you are
effectively getting a completely new C implementation which naturally
behaves differently--and all the "portability" problems that entails.
A few months from now, I'll be in trouble anyway, when it's time to
port to x86_64. There, va_start and va_arg are a true nightmare (the
assembler code is blown up and slowed down by a factor 2-3 since
variable arguments can end up in registers). So in case anyone has a
brilliant idea how to make my original example work on x86_64, you'll
be my hero!

If you make your code portable today, it'll work on x64 systems as well
without change. However, IIRC in x64 functions with variable argument
lists go on the stack just like in x86, and only functions with fixed
argument lists get the (faster) register-based calling convention.

S
 
Elmar

Hi Alan,
|The problem is that in real life, lst_print from my minimum example is
|a much larger function, which I don't want to write twice, once as
|shown originally (other functions need to call it with a list, not
|with variable arguments), and once as typed out by you (would be a
|nasty code duplication).
|
|So the only correct alternative would be to turn val_print into a
|wrapper for lst_print, and use va_start and va_arg to create a list
|that is then passed to lst_print. This would require malloc/free etc.


It doesn't require all that... it just requires a couple of C99 features:
a variadic macro and an array literal.


#define val_print(...) lst_print((int[]){__VA_ARGS__})


There you have something that can be used like a variadic function, but
compiles as a call to a function that takes a pointer to the first element
of an array. And only one copy of the list is generated.


There will probably be an array-copy operation in the generated code, but
your original version had a bunch of stack pushes, which did an equivalent
amount of copying.

This is amaaaazing! Exactly the brilliant idea I was hoping for!

Truly ideal:

On x86_32, it reduces source and object code size a bit and thus
offers more "simplistic beauty" than my original approach, and now
it's even standard compliant and adds type safety!

And on x86_64, it's light-years ahead of the frightening va_list/
va_arg monsters. Just for fun, I attached the x86_64 assembler code
generated for a variadic function, a creepily slow nightmare caused by
variable arguments passed in registers.

And it doesn't even need a -std=gnu99 switch or so. Even my ancient
GCCs which I use for cross-compiling ate it without complaints! This
should really go into the FAQ as the ideal solution when all variable
arguments have the same type..

So I spent this morning on an extreme makeover for my application,
putting in your clever approach wherever possible. You really saved
the day, so if you got a PayPal account, please mail it to
elmar(a)cmbi.ru.nl, I'll return a 100 USD thank you. Or pick something
at Amazon..

Cheers,
Elmar

P.S.: Thanks also to Stephen Sprunk for his long comment:
However, IIRC in x64 functions with variable argument
lists go on the stack just like in x86, and only functions with fixed
argument lists get the (faster) register-based calling convention.

This would certainly help, but I fear it can't be done: C allows
implicit declaration, so the caller doesn't know that he's calling a
variadic function, and thus cannot choose a stack-based calling
convention.

P.P.S.:
Here is the minimal lst_print which can be used thanks to Alan's
suggestion:

void lst_print(int *list)
{ int i;

for (i=0;i<*list;i++) printf("Value %d: %d\n",i,list[1+i]); }

0x0000000000400498 <lst_print+0>: push %rbp
0x0000000000400499 <lst_print+1>: mov %rsp,%rbp
0x000000000040049c <lst_print+4>: sub $0x20,%rsp
0x00000000004004a0 <lst_print+8>: mov %rdi,-0x18(%rbp)
0x00000000004004a4 <lst_print+12>: movl $0x0,-0x4(%rbp)
0x00000000004004ab <lst_print+19>: jmp 0x4004da <lst_print+66>
0x00000000004004ad <lst_print+21>: mov -0x18(%rbp),%rdx
0x00000000004004b1 <lst_print+25>: add $0x4,%rdx
0x00000000004004b5 <lst_print+29>: mov -0x4(%rbp),%eax
0x00000000004004b8 <lst_print+32>: cltq
0x00000000004004ba <lst_print+34>: shl $0x2,%rax
0x00000000004004be <lst_print+38>: lea (%rdx,%rax,1),%rax
0x00000000004004c2 <lst_print+42>: mov (%rax),%edx
0x00000000004004c4 <lst_print+44>: mov -0x4(%rbp),%esi
0x00000000004004c7 <lst_print+47>: mov $0x400ad8,%edi
0x00000000004004cc <lst_print+52>: mov $0x0,%eax
0x00000000004004d1 <lst_print+57>: callq 0x400398 <printf@plt>
0x00000000004004d6 <lst_print+62>: addl $0x1,-0x4(%rbp)
0x00000000004004da <lst_print+66>: mov -0x18(%rbp),%rax
0x00000000004004de <lst_print+70>: mov (%rax),%eax
0x00000000004004e0 <lst_print+72>: cmp -0x4(%rbp),%eax
0x00000000004004e3 <lst_print+75>: jg 0x4004ad <lst_print+21>
0x00000000004004e5 <lst_print+77>: leaveq


And that's what it would look like using va_list/va_arg: ;-(((

void lst_print2(int count,...)
{ int i;
va_list list;

va_start(list,count);
for (i=0;i<count;i++) printf("Value %d: %d\n",i,va_arg(list,int)); }

0x00000000004004e7 <lst_print2+0>: push %rbp
0x00000000004004e8 <lst_print2+1>: mov %rsp,%rbp
0x00000000004004eb <lst_print2+4>: sub $0xf0,%rsp
0x00000000004004f2 <lst_print2+11>: mov %rsi,-0xa8(%rbp)
0x00000000004004f9 <lst_print2+18>: mov %rdx,-0xa0(%rbp)
0x0000000000400500 <lst_print2+25>: mov %rcx,-0x98(%rbp)
0x0000000000400507 <lst_print2+32>: mov %r8,-0x90(%rbp)
0x000000000040050e <lst_print2+39>: mov %r9,-0x88(%rbp)
0x0000000000400515 <lst_print2+46>: movzbl %al,%eax
0x0000000000400518 <lst_print2+49>: mov %rax,-0xe8(%rbp)
0x000000000040051f <lst_print2+56>: mov -0xe8(%rbp),%rdx
0x0000000000400526 <lst_print2+63>: lea 0x0(,%rdx,4),%rax
0x000000000040052e <lst_print2+71>: movq $0x40056d,-0xe8(%rbp)
0x0000000000400539 <lst_print2+82>: sub %rax,-0xe8(%rbp)
0x0000000000400540 <lst_print2+89>: lea -0x1(%rbp),%rax
0x0000000000400544 <lst_print2+93>: mov -0xe8(%rbp),%rdx
0x000000000040054b <lst_print2+100>: jmpq *%rdx
0x000000000040054d <lst_print2+102>: movaps %xmm7,-0xf(%rax)
0x0000000000400551 <lst_print2+106>: movaps %xmm6,-0x1f(%rax)
0x0000000000400555 <lst_print2+110>: movaps %xmm5,-0x2f(%rax)
0x0000000000400559 <lst_print2+114>: movaps %xmm4,-0x3f(%rax)
0x000000000040055d <lst_print2+118>: movaps %xmm3,-0x4f(%rax)
0x0000000000400561 <lst_print2+122>: movaps %xmm2,-0x5f(%rax)
0x0000000000400565 <lst_print2+126>: movaps %xmm1,-0x6f(%rax)
0x0000000000400569 <lst_print2+130>: movaps %xmm0,-0x7f(%rax)
0x000000000040056d <lst_print2+134>: mov %edi,-0xd4(%rbp)
0x0000000000400573 <lst_print2+140>: lea -0xd0(%rbp),%rax
0x000000000040057a <lst_print2+147>: movl $0x8,(%rax)
0x0000000000400580 <lst_print2+153>: lea -0xd0(%rbp),%rax
0x0000000000400587 <lst_print2+160>: movl $0x30,0x4(%rax)
0x000000000040058e <lst_print2+167>: lea -0xd0(%rbp),%rax
0x0000000000400595 <lst_print2+174>: lea 0x10(%rbp),%rdx
0x0000000000400599 <lst_print2+178>: mov %rdx,0x8(%rax)
0x000000000040059d <lst_print2+182>: lea -0xd0(%rbp),%rax
0x00000000004005a4 <lst_print2+189>: lea -0xb0(%rbp),%rdx
0x00000000004005ab <lst_print2+196>: mov %rdx,0x10(%rax)
0x00000000004005af <lst_print2+200>: movl $0x0,-0xb4(%rbp)
0x00000000004005b9 <lst_print2+210>: jmp 0x40062e <lst_print2+327>
0x00000000004005bb <lst_print2+212>: mov -0xd0(%rbp),%eax
0x00000000004005c1 <lst_print2+218>: cmp $0x30,%eax
0x00000000004005c4 <lst_print2+221>: jae 0x4005f0 <lst_print2+265>
0x00000000004005c6 <lst_print2+223>: mov -0xc0(%rbp),%rdx
0x00000000004005cd <lst_print2+230>: mov -0xd0(%rbp),%eax
0x00000000004005d3 <lst_print2+236>: mov %eax,%eax
0x00000000004005d5 <lst_print2+238>: add %rax,%rdx
0x00000000004005d8 <lst_print2+241>: mov %rdx,-0xe0(%rbp)
0x00000000004005df <lst_print2+248>: mov -0xd0(%rbp),%eax
0x00000000004005e5 <lst_print2+254>: add $0x8,%eax
0x00000000004005e8 <lst_print2+257>: mov %eax,-0xd0(%rbp)
0x00000000004005ee <lst_print2+263>: jmp 0x400609 <lst_print2+290>
0x00000000004005f0 <lst_print2+265>: mov -0xc8(%rbp),%rax
0x00000000004005f7 <lst_print2+272>: mov %rax,-0xe0(%rbp)
0x00000000004005fe <lst_print2+279>: add $0x8,%rax
0x0000000000400602 <lst_print2+283>: mov %rax,-0xc8(%rbp)
0x0000000000400609 <lst_print2+290>: mov -0xe0(%rbp),%rax
0x0000000000400610 <lst_print2+297>: mov (%rax),%edx
0x0000000000400612 <lst_print2+299>: mov -0xb4(%rbp),%esi
0x0000000000400618 <lst_print2+305>: mov $0x400ad8,%edi
0x000000000040061d <lst_print2+310>: mov $0x0,%eax
0x0000000000400622 <lst_print2+315>: callq 0x400398 <printf@plt>
0x0000000000400627 <lst_print2+320>: addl $0x1,-0xb4(%rbp)
0x000000000040062e <lst_print2+327>: mov -0xb4(%rbp),%eax
0x0000000000400634 <lst_print2+333>: cmp -0xd4(%rbp),%eax
0x000000000040063a <lst_print2+339>: jl 0x4005bb <lst_print2+212>
0x0000000000400640 <lst_print2+345>: leaveq
0x0000000000400641 <lst_print2+346>: retq
End of assembler dump.
 
Keith Thompson

Elmar said:
This would certainly help, but I fear it can't be done: C allows
implicit declaration, so the caller doesn't know that he's calling a
variadic function, and thus cannot choose a stack-based calling
convention.
[...]

In C90, if you call a function with no visible declaration, it's
assumed to take a fixed number of arguments of the (promoted) type(s)
you passed it and return a result of type int. Attempting to call a
variadic function with no visible prototype (not just a declaration;
the compiler has to see the "...") invokes undefined behavior.

So yes, a conforming C90 compiler can use different calling
conventions for variadic and non-variadic functions.

On the other hand, this would break (incorrect) programs that call
printf() without the required "#include <stdio.h>" -- including the
first program in K&R1. I think many implementers have chosen to keep
consistent calling conventions for backward compatibility.
 
Keith Thompson

Kenneth Brody said:
On 3/18/2010 6:46 PM, Keith Thompson wrote:
[...]
So yes, a conforming C90 compiler can use different calling
conventions for variadic and non-variadic functions.

On the other hand, this would break (incorrect) programs that call
printf() without the required "#include<stdio.h>" -- including the
first program in K&R1. I think many implementers have chosen to keep
consistent calling conventions for backward compatibility.

I have seen systems which pass the first several parameters to
non-variadic functions in registers, and only use the stack for
parameters beyond the first N. To do this on a variadic function
would be "messy" to say the least, and therefore variadic functions
are done purely on-stack.

Now, it may be that the compiler has "special knowledge" of things
like printf() even w/o the header files included, "just in case". (I
believe it is allowed to do so for standard library functions, as
replacing them with user-defined versions introduces UB.)

Ah, that's a possibility I hadn't thought of. Yes, that would work.

Of course, it means that the compiler has to know which standard
functions are variadic (snprintf, for example, exists in C99 but
not in C90). It also means that this feature is unavailable for
user-written functions. But it's a reasonable workaround if your
goal is to do as good a job as possible with obsolete or incorrect
code.

Though I'd certainly hope that, as long as the compiler is working
around the lack of a prototype by assuming a function of a certain
name is variadic, it will at least issue a warning.
 
Tim Rentsch

|
|The problem is that in real life, lst_print from my minimum example is
|a much larger function, which I don't want to write twice, once as
|shown originally (other functions need to call it with a list, not
|with variable arguments), and once as typed out by you (would be a
|nasty code duplication).
|
|So the only correct alternative would be to turn val_print into a
|wrapper for lst_print, and use va_start and va_arg to create a list
|that is then passed to lst_print. This would require malloc/free etc.

It doesn't require all that... it just requires a couple of C99 features:
a variadic macro and an array literal.

#define val_print(...) lst_print((int[]){__VA_ARGS__})

There you have something that can be used like a variadic function, but
compiles as a call to a function that takes a pointer to the first element
of an array. And only one copy of the list is generated. [snip incidental]

Good idea, to which I would like to suggest an addition, namely,
an additional argument giving the number of elements in the array,
(computed using the usual 'sizeof x / sizeof *x' technique).
Details left as an exercise for the reader.
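
One possible answer to the exercise is sketched below. The names lst_sum,
VAL_COUNT and val_sum are hypothetical, and the function returns a sum
rather than printing, purely to keep the example short. It assumes at least
one value is passed, since an empty initializer list is not valid in C99.

```c
#include <stddef.h>

/* Hypothetical variant of lst_print: the count is a separate parameter,
   so the array holds only the values themselves. */
int lst_sum(size_t count, const int *values)
{
    size_t i;
    int sum = 0;

    for (i = 0; i < count; i++)
        sum += values[i];
    return sum;
}

/* Number of arguments, computed at compile time. The operand of sizeof
   is not evaluated, so side effects in the arguments do not run here. */
#define VAL_COUNT(...) (sizeof((int[]){__VA_ARGS__}) / sizeof(int))

/* The arguments appear twice in the expansion but are evaluated once,
   in the compound literal that is actually passed. */
#define val_sum(...) \
    lst_sum(VAL_COUNT(__VA_ARGS__), (const int[]){__VA_ARGS__})
```

With this, val_sum(10, 20, 30) needs no leading count, because the macro
supplies it.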

There will probably be an array-copy operation in the generated code, but
your original version had a bunch of stack pushes, which did an equivalent
amount of copying.

The array-copy operation might be avoided if the function
parameter and compound literal used 'const int' rather than 'int'
(and the array values were constants not needing run-time
computation).
 
