jacob navia
In the last installment we looked into object files and what they
contain.
Some readers objected that I was generalizing too much: there could be
C implementations without object files (C interpreters, for instance),
or implementations that do not link separately compiled files at all,
but parse and digest each module and perform the whole code generation
step in the linker, from some intermediate representation.
Granted, unusual implementations and special options may exist. Here I
am speaking about the most common case, where the compiler produces
traditional object files, stored somewhere on disk.
Those object files in an abstract way contain:
(1) A symbol table that specifies which symbols are exported and which
symbols are imported
(2) Several "Sections" containing the data of the program. (Code
instructions, initialized tables, and just reserved space)
(3) A series of relocation records that specify which parts of the data
(code or tables) must be patched by the linker to insert the external
symbols required by the module
The linking process
-------------------
The linker opens all object files that it receives, and builds a symbol
table. In this table we have several sets of symbols:
(a) The set of defined symbols, not in the common section. All these
symbols already have a fixed address.
(b) The set of symbols in the common section
(c) The set of undefined symbols that have been seen as externals but
whose definition has not yet been processed.
Symbols can be moved from the undefined set into the common set or into
the defined set.
This needs some explanation. Suppose you have in the file file1.c the
following declaration:
int iconst;
The symbol ‘iconst’ will be assigned to the common section, which is
initialized to zero at program startup. But consider what happens if you
include ‘file2.c’ in the link, which contains the declaration:
int iconst = 53433;
The linker will move the symbol ‘iconst’ from the common section to the
data section. The definition in file1.c will be lost. If you relied on
"iconst" being zero at startup, you are now wrong.
And worse things can be done:
file1.c:
int buff[256];
file2.c:
int buff[512];
The linker will leave ‘buff’ in the common section, but will set its
size to the bigger value, i.e. 512. This by itself is harmless, but
beware: suppose you add a definition in a file3.c:
int buff[4] = {0,0,0,0};
Your table will now have a size of just four elements instead of 512!
This can be tested, for instance, with the following two files:
File t1.c:
int tab[12];
File t2.c:
int tab[256];
int main(void){return 0;}
Linking t1.c and t2.c with MSVC 8 we obtain an executable *without any
warnings*, not even at the highest warning level.
In the linker of lcc-win I added a warning:
in t1.obj warning: '_tab' (defined with size 48)
is redefined in t2.obj with size 1024
The GNU linker doesn't emit any warning by default:
root@ubuntu:/tmp# gcc -Wall t1.c t2.c
root@ubuntu:/tmp#
The explanation commonly given for this behavior is that any definition
in the "common" section (non-initialized data) is a "tentative
definition", valid only until another definition is seen by the linker.
Dave Hanson, one of the authors of the original lcc compiler, told me
this when we discussed this problem:
<<quote>>
For the record, the declaration for p is indeed a tentative definition,
but that status persists only until the end of the compilation unit,
i.e., the end of f1.c. Since there's no subsequent external definition
of p, the tentative declaration acts exactly as if there was a
file-scope declaration for p with an initializer equal to 0. (See Sec.
3.7.2 of the ANSI Standard, which I think is Sec. 6.7.2 of the ISO
Standard). As a result, p is null at program startup--assuming there are
no other similar declarations for p.
This example illustrates nicely a problem with the common storage model:
You can't determine whether or not a declaration is also a definition by
examining just the program text, and it's easy to get strange behaviors.
In this example, there was only one definition, which passes without
complaint from linkers. In the stricter definition/reference model,
linkers would complain about multiple definitions when combining the
object code for f1.c and f2.c. This example also shows why it's best to
initialize globals, because linkers will usually catch these kinds of
multiple definitions.
The common model also permits C's (admittedly weak) type system to be
violated. I've seen programmers declare "int x[2]" in one file and
"double x" in another one, for example, just so they can access x as a
double and as a pair of ints.
For a good summary of the four models of external
definitions/declarations, see Sec. 4.8 in Harbison & Steele, C: A
Reference Manual, 4th ed., Prentice-Hall, 1995.
<<end quote>>
------------------------------------------------------------------------------
Relocating all symbols
----------------------
Let's come back to our linker, however. I will use lcclnk and Windows
as examples, but under Unix and many other systems the operations done
by the linker are very similar.
The next thing to do is to go through all symbols, and decide whether
they will go into the final symbol table or not. Many of them are
discarded, since they are local symbols of each compilation unit.
Global symbols need to be relocated, i.e. the ‘value’ of the symbol has
to be set to its final address. This is easy now that the position of
the section that contains the symbol is exactly known: we just go
through them setting the value field to the right number.
The algorithm outline is simple:
1. Read the relocation information from the object file.
2. According to the type of relocation, adjust the value of the symbol.
The relocations supported by lcclnk are just a few: the PC-relative
relocation (code 20), the direct 32-bit relocations (codes 6 and 7),
and two types of relocations for the debug information, codes 10 and 11.
3. Save the position within the executable file where the relocation is
applied, in the case of relocation type 6 (normal 32-bit relocation),
so that the .reloc section can be built later if needed.
Normally this is needed only when generating a DLL, since executables
aren't relocated under Windows.
The .reloc section of the executable is data for the program loader; it
tells the loader which addresses it should patch when loading the file
into memory.
Other, more sophisticated linkers support fancier features: a symbol
can be included only once even if it appears several times, among many
other things.
Performing the relocations
--------------------------
More specifically, the linker fixes the references that each module
makes to the data and code of all the others, patching the code with
the offsets of the external symbols now that the positions of all
sections are known. For a C source line like:
foo(5);
the linker reads the corresponding relocation record emitted by the
compiler and looks up the symbol ‘foo’ in the symbol table. It patches
the zeroes that the assembler stored after the call opcode with the
relative offset from the point of the call to the address of foo. This
allows the processor to perform a PC-relative call instruction: the 4
bytes after the call opcode contain a 32-bit offset to the address of
foo.
Using the utility pedump, you can see this process. Consider the
following well-known program:
#include <stdio.h>
int main(int argc,char *argv[])
{
printf("Hello\n");
}
Compile this with:
lcc -g2 hello.c
Now, disassemble hello.obj with pedump like this:
pedump /A hello.obj
You will see near the end of the long listing that follows, the
disassembled text section:
section 00 (.text) size: 00020 file offs: 00220
--------------------------------------------------------------
_main: Size 18
--------------------------------------------------------------
[0000000] 55 pushl %ebp
[0000001] 89e5 movl %esp,%ebp
Line 5
[0000003] 6800000000 pushl $0 (_$2) (relocation)
[0000008] e800000000 call _printf (relocation)
[0000013] 83c404 addl $4,%esp
Line 6
[0000016] 5d popl %ebp
[0000017] c3 ret
[0000018] 0000 addb %al,(%eax)
Let’s follow the relocation to the function printf. You will see that
pedump has a listing of the relocations that looks like this:
Section 01 (.text) relocations
Address Type Symbol Index Symbol Name
------- ---- ------------ ----- ----
4 DIR32 4 _$2
9 REL32 16 _printf
The linker will then take the 4 bytes starting at address 4 and store
there the address of symbol index 4 (_$2, the string literal) from the
symbol table of hello.obj. It will look up the address of printf and
store the relative address, i.e. the difference between the address of
printf and the address of main+9, in the 4 bytes starting at address 9.
As you can see there are several types of relocations, each specifying a
different way of doing these additions. The compiler emits only three
types of relocations:
• Type 6: Direct 32-bit reference to the symbol's virtual address.
• Type 7: Direct 32-bit reference to the symbol's virtual address, base
not included.
• Type 20: PC-relative 32-bit reference to the symbol's virtual address.
This last one is the type used for the relocation to printf. We also
have to know that the relative call is relative to the next
instruction, i.e. to byte 13 and not to byte 9. Happily for us, the
linker knows this stuff...