Long long int

Keoncheol Shin

Hi to all!

I wonder how gcc treats the 64-bit integer type (long long int) internally
on a 32-bit machine.

If possible, please point me to a related web page.

Thank you.
 
lallous

Keoncheol Shin said:
Hi to all!

I wonder how gcc treats the 64-bit integer type (long long int) internally
on a 32-bit machine.

If possible, please point me to a related web page.

Thank you.

I would suspect it is treated the way a 32-bit long was treated in a 16-bit
application: two words = 1 dword.

In your case, two 32-bit longs form one 64-bit long long.
 
Malcolm

Keoncheol Shin said:
I wonder how gcc treats the 64-bit integer type (long long int) internally
on a 32-bit machine.

Why not do some experiments? Try copying a long long int into an array of
unsigned chars and examine the layout by printing the bytes in hex.
 
Mark Shelor

Keoncheol said:
I wonder how gcc treats the 64-bit integer type (long long int) internally
on a 32-bit machine.


The following comments apply to gcc 3.3.1, but the same or similar
structures likely exist in the latest version (3.3.3) as well.

You can see for yourself how gcc handles the "long long" type by
examining two files from the source distribution: gcc/libgcc2.h and
gcc/longlong.h.

The first file defines the following struct and union:

#if LIBGCC2_WORDS_BIG_ENDIAN
  struct DWstruct {Wtype high, low;};
#else
  struct DWstruct {Wtype low, high;};
#endif

typedef union
{
  struct DWstruct s;
  DWtype ll;
} DWunion;


The second file contains a flurry of assembler macros that define the
primitive "long long" operations for different processor types. The
"DWunion" type is employed to hold "long long" values for both 32-bit
and 64-bit processors.

As you can see (and as you might well have guessed), "long long" values
are manipulated as structs on 32-bit machines, with the "high" and "low"
members storing the upper and lower 32 bits, respectively, of the 64-bit
value.

NB: this description presents an over-simplified picture of the relevant
header files for the purpose of providing a brief explanation. To
obtain a fuller and more precise understanding of what's going on with
gcc's handling of the "long long" type, a much more extensive review of
the source code would be required. In other words, things are not
always quite as simple as they look <g>

Mark
 
August Derleth

Keoncheol said:
Hi to all!

I wonder how gcc treats the 64-bit integer type (long long int) internally
on a 32-bit machine.

Well, one of the best ways to see how gcc does /anything/ related to the
details of code generation is to give it the -S option to make it
generate assembly instead of object code. (Assuming you're on an Intel
machine: Note that the default assembly syntax is AT&T, which gas likes
but most humans don't. ;) Give it the -masm=intel option to cure it of
this behavior.)

Also, gcc is a compiler, not a human. It will generate odd and obviously
non-optimal code unless you hand it options like -O3 and
-march=[/something/], where /something/ is what gcc calls your hardware.
(Even then, it's not quite right.) Modern Intel chips are called i686,
and I think the -dumpmachine option will cause gcc to print its idea of
your hardware.

Keoncheol said:
If possible, please point me to a related web page.

I like using showasm, a Perl program that takes a C source file and
generates an output file that contains the assembly annotated with the C
source lines that the compiler used to generate it. It's colored with
ANSI escapes, but less will handle it fine if you give less the -R option.

showasm seems to live here:

http://www.ibiblio.org/pub/Linux/devel/showasm-1.0.tar.gz
 
RagnarDanneskjöld

Keoncheol said:
I wonder how gcc treats the 64-bit integer type (long long int) internally
on a 32-bit machine.

When dealing with 64-bit numbers on a 32-bit CPU, you need to emulate the
number processing using TWO 32-bit registers or memory locations. All the
calculations are done in terms of 32 bits, of course.

EXAMPLE:

long.c (Compiled to assembly with gcc -S long.c)

main() {
    long long n;
    n = 33;
    n += 15;
}

long.S

    .file "long.c"
    .def ___main; .scl 2; .type 32; .endef
    .text
.globl _main
    .def _main; .scl 2; .type 32; .endef
_main:
    pushl %ebp
    movl %esp, %ebp
    subl $24, %esp
    andl $-16, %esp
    movl $0, %eax
    movl %eax, -12(%ebp)
    movl -12(%ebp), %eax
    call __alloca
    call ___main
    movl $0, -8(%ebp)
    movl $0, -4(%ebp)
    leal -8(%ebp), %eax
    addl $15, (%eax)
    adcl $0, 4(%eax)
    leave
    ret

Well I have absolutely no idea what this odd-ball assembly code is doing
so let's look at how a "real" compiler does it.

// unoptimized assembly
main() {
    __int64 n;
    n = 33;
    n += 15;
}

assembly (important parts)

; 2 : __int64 n;
; 3 : n = 33;

mov DWORD PTR _n$[ebp], 33 ; 00000021H
mov DWORD PTR _n$[ebp+4], 0

; 4 : n += 15;

mov eax, DWORD PTR _n$[ebp]
add eax, 15 ; 0000000fH
mov ecx, DWORD PTR _n$[ebp+4]
adc ecx, 0
mov DWORD PTR _n$[ebp], eax
mov DWORD PTR _n$[ebp+4], ecx
 
Tom St Denis

RagnarDanneskjöld said:
Well I have absolutely no idea what this odd-ball assembly code is doing
so let's look at how a "real" compiler does it.

You put that code in main, where gcc imposes some limitations. Try:

long long test(long long n) {
    return n + 15;
}

With -O3 on my Cygwin box, this produces:

_test:
    movl 4(%esp), %eax
    movl 8(%esp), %edx
    addl $15, %eax
    adcl $0, %edx
    ret

This is not only straightforward but also about the best code you can
really write for this function on the x86.

Tom
 
RagnarDanneskjöld

Tom said:
You put that code in main, where gcc imposes some limitations. Try:

Damn, why didn't I think to generate it with a non-main function?

When I said I had no idea what the heck it was doing, I was referring
more to the syntax of GAS or AT&T or whatever assembly. It's zany.
 
August Derleth

RagnarDanneskjöld said:
Damn, why didn't I think to generate it with a non-main function?

When I said I had no idea what the heck it was doing, I was referring
more to the syntax of GAS or AT&T or whatever assembly. It's zany.

It's ugly as hell, and it's called AT&T. Try using the -masm=intel
argument. That won't make it as clean as, say, nasm's assembly, but it
will make it readable to someone coming from an Intel background.

AT&T's assembly syntax is supposedly ugly so that it can be used as-is on
multiple architectures. Supposedly, you can use AT&T syntax as a template
for a very generic (non-processor-specific) assembly language, filling in
the opcodes and their exact arguments as the compiler generates them.

Of course, that doesn't wash with me. First off, I doubt the benefits of
trying to shoehorn Intel assembly into something that worked for VAX
assembly, or PowerPC assembly into something that worked for System/360
assembly. Code reuse is good, but that is simple absurdity.

Secondly, trained monkeys could have come up with a better template.
Memory references are particularly FUBARed: What does -3(%eax,%edx,4)
mean? Looking it up, I find it means [eax+edx*4-3]. I chose a rather
absurd example, but it isn't that odd to use rather complex addressing
schemes in a chip as register-starved as an x86 clone. And Intel's
algebraic notation is clearly a win over AT&T's What-The-**** notation.
 
