some doubts in a .s file of a c program

  • Thread starter ramasubramanian.rahul

ramasubramanian.rahul

hi
I was trying to see how the compiler hides static globals from the
linker and makes ordinary global variables visible to it. I managed
to figure out how it does that (the .lcomm and .comm directives), but the
assembly code for the C program raised a few more doubts. I am
enclosing the .c and the .s files. If someone could explain what the
starred statements in the .s file mean, I would be grateful.


the .c file is as follows


#include<stdio.h>
int i ;
static int j ;
int main()
{
printf ("\t%d\t%d\n" ,i , j );
}





the .s file it generated on a P4 RHEL machine using the cc command is
as follows

.file "some.c"
.section .rodata
.LC0:
.string "\t%d\t%d\n"
.text
.globl main
.type main, @function
main:
pushl %ebp
movl %esp, %ebp
subl $8, %esp * why more space on the stack when no local vars are used?
andl $-16, %esp * this has something to do with aligning the stack to a 16 bit field.. can you explain it in detail?
movl $0, %eax * the next six statements make no sense to me at all... why are they here?
addl $15, %eax *
addl $15, %eax *
shrl $4, %eax *
sall $4, %eax *
subl %eax, %esp *
subl $4, %esp *
pushl j
pushl i
pushl $.LC0
call printf
addl $16, %esp
leave
ret
.size main, .-main
.comm i,4,4
.local j
.comm j,4,4
.section .note.GNU-stack,"",@progbits
.ident "GCC: (GNU) 3.4.3 20041212 (Red Hat 3.4.3-9.EL4)"






any help will be appreciated
kind regards
rahul
 

Cong Wang

Well, this is off-topic here. Posting it to one of the gcc.* groups
would be better.

Maybe gcc is cleverer than all of us. If you really want to know how gcc
works, why don't you ask a gcc hacker?
 

jacob navia

The function call to printf has 3 arguments.
Gcc attempts to keep the stack aligned by rounding
up the stack value before pushing the arguments.

This feature of gcc is an incredibly costly feature. As you
can see, a large percentage of gcc's emitted code is just
stack manipulation.
 

Rod Pemberton

Unfortunately, most here either don't know any assembly or feign ignorance
to keep the topics to C only.

First, the assembly is using some GAS directives I'm not familiar with which
I suspect are unique to Linux.
.file "some.c"
.section .rodata
.LC0:
.string "\t%d\t%d\n"
.text
.globl main
.type main, @function
main:
pushl %ebp
movl %esp, %ebp

These two instructions save the caller's frame pointer (ebp) and then set ebp
to the current stack pointer. They are equivalent to the 'enter' instruction
with a zero frame size. These two when combined with the
'subl $8, %esp' are the C function's prolog. The prolog and epilog (below)
create and destroy the stackframe, respectively.
subl $8, %esp * why more space in stack when no local vars
are used

This is used to allocate stack space for variables, but it doesn't appear
that anything accesses the allocated space... (It may be in the C startup,
or printf() code, or main's return value, etc...) main() does return an int
(probably 4 bytes and not 8) but it isn't set or cleared in the posted code
(it should be done in the C startup). This may go away with optimization.
It seems the extra space is removed further below.
andl $-16, %esp * this has something to do with aligning the stack to a 16 bit field.. can you explain it in detail?

$-16 is the same as 0xfffffff0. ANDing the stack pointer (esp) with it clears
the lower four bits, thereby "aligning" esp down to a multiple of 16 bytes.
movl $0, %eax * the next six statements make no sense to me
addl $15, %eax *
addl $15, %eax *
shrl $4, %eax *
sall $4, %eax *
subl %eax, %esp *
subl $4, %esp *

In C syntax:

eax=0;
eax+=15;
eax+=15; /* eax is 30 */
eax>>=4; /* eax is 1 */
eax<<=4; /* eax is 16 */
esp-=eax; /* esp=esp-16 */
esp-=4; /* esp=esp-20 */

The end result is 20 more bytes of stack allocation. The first six
instructions _seem_ to be an aligned stack allocation (of two items) and the
last line appears to be an unaligned allocation of one item (although its
value is 4). Since there isn't a nice correspondence with the C code, and
space for the variables i and j is allocated at the bottom of the listing,
I'm not sure what specifically causes these instructions to be generated.
But I've seen it before, and it usually goes away or is reduced with
optimization.
pushl j
pushl i
pushl $.LC0
call printf

Although this GAS syntax is slightly different from the way I've seen it,
this pushes the arguments to printf() and calls it.
addl $16, %esp
leave
ret

'addl $16, %esp' removes 16 bytes from the stack, i.e., it cleans up after
the call (the three pushed arguments, 12 bytes, plus the 4 bytes from
'subl $4, %esp'). 'leave' restores the saved stack pointer. It is
equivalent to 'movl %ebp, %esp; popl %ebp'. This is also the C function's
epilog. As I stated earlier, it seems that there is extra space allocated
and destroyed for some reason, probably lack of optimization.
.size main, .-main
.comm i,4,4
.local j
.comm j,4,4
.section .note.GNU-stack,"",@progbits
.ident "GCC: (GNU) 3.4.3 20041212 (Red Hat 3.4.3-9.EL4)"

About a month ago, there was a similar thread on comp.lang.c++ and
comp.lang.asm.x86:

original
http://groups.google.com/group/comp.lang.c/msg/3c6d3ce5bb9616fd?hl=en
explanation
http://groups.google.com/group/comp.lang.c/msg/c9cae37b8b61eeac?hl=en
explanation
http://groups.google.com/group/comp.lang.asm.x86/msg/1828a343930da546?hl=en

Jacob Navia also listed this link:
http://en.wikibooks.org/wiki/X86_Assembly/GAS_Syntax


Someone around here claimed to have in depth knowledge of GCC "about a
decade ago." They might be able to help further. (Was that Chris Torek?)


Rod Pemberton
 

Ancient_Hacker

jacob navia wrote:

This feature of gcc is an incredibly costly feature.

Note this is in function main(), which is usually called just once.

Aligning the stack pointer can be a BIG win on some architectures, like
2x faster access to parameters and local variables.

But of course anything having to do with the real world is somewhat
off-topic here :)
 

ramasubramanian.rahul

hi all
many thanks to Rod for giving such a detailed explanation... I
appreciate his help.
Just continuing further though: any idea where I can find some
literature on this stack aligning business? Some doc where it has
been discussed in detail...
kind regards
rahul
 

Rod Pemberton

I would start with the Intel microprocessor manuals. For example, the
Volume 1 of the Pentium4 manuals has a section (6.2.2) on stack alignment:

http://www.intel.com/design/pentium4/manuals/253665.htm

Then, I would check the gcc documentation,
specifically -mpreferred-stack-boundary.

http://gcc.gnu.org/onlinedocs/gcc-4...2d64-Options.html#i386-and-x86_002d64-Options

Then, I would use Yahoo (search for "stack alignment" "how to") to find
stuff like this, note the 'sub esp,xx' comment:

"With newer versions of GCC, programs whose inner loops include many
function calls, or which are deeply recursive, could benefit from using
the -mpreferred-stack-boundary=2 compiler option. This causes the compiler
to relax its stack-alignment requirements that need a lot of sub esp,xx
instructions. The default stack alignment is 16 bytes, unless overridden
by -mpreferred-stack-boundary. The argument to this option is the power of 2
used for alignment, so 2 means 4-byte alignment; if your code uses double
and long double variables, an argument of 3 might be a better choice. "

http://www.delorie.com/djgpp/v2faq/faq14_2.html

Then, I would use Google's Groups advanced search to search for say: gcc
"stack alignment." This will pull up hundreds of posts similar to the
detailed response I provided, especially some much older ones on
comp.lang.asm.x86.

http://groups.google.com/advanced_search?hl=en


Rod Pemberton
 

Stephen Sprunk

None of this stuff makes much sense; my guess is that you compiled
without optimization. GCC is infamous for putting out incredibly stupid
code when you do that. OTOH, it's very difficult to match GCC's
optimized code up to the C source to figure out why it does what it
does, so there's a purpose in leaving that mode in.

Still, here's what I get with my GCC on Linux without optimization:

main:
pushl %ebp
movl %esp,%ebp
movl j,%eax
pushl %eax
movl i,%eax
pushl %eax
pushl $.LC0
call printf
addl $12,%esp
.L1:
leave
ret

and with -O3:

main:
pushl %ebp
movl %esp,%ebp
pushl j
pushl i
pushl $.LC0
call printf
leave
ret

The unoptimized version isn't really that much worse, certainly not as
bad as the version you posted. Turn on optimizations and recompile; if
the odd stack stuff is still there, go report it to the GCC folks as a
bug.

S
 
