Why does execution start at main()?

glen herrmannsfeldt

Probably as a lame compensation for not providing a decent hierarchical
file system. Such a file could be considered as a poor surrogate of a
directory.

Well, yes, there is that. Though the 44 character names were
long enough for fake hierarchy in many cases.

Still, the idea of keeping related programs together in one
file seems useful. Unix has ar files for libraries.

Though #include files tend to be in a directory structure
as separate files.

-- glen
 
CBFalconer

Dan said:
Probably as a lame compensation for not providing a decent
hierarchical file system. Such a file could be considered as a
poor surrogate of a directory.

Back in CP/M days, shortly after .LBR files appeared, people began
packing executables into them. Then the LRUN program appeared, to
extract and run a particular component from COMMAND.LBR. Even
later my CCPLUS incorporated that function, together with a search
path, into the 'shell'. Similar things were done with interpreted
executables, using PCDS.LBR. I had systems using the same format
as conventional searchable libraries for object code modules.
There is very little new under the sun.

All these packages provided a convenient way to supply a set of
utilities to the user.

Those interested can explore the download/cpm/ area of my site,
below.
 
Mark McIntyre

Though #include files tend to be in a directory structure
as separate files.

If anyone remembers VMS, they'll recall that all the system headers were in
a text library, a file with a .tlb extension. The C compiler knew
automagically how to read out the relevant block of data at compile time.
 
glen herrmannsfeldt

Mark said:
On Tue, 18 May 2004 18:43:05 GMT, in comp.lang.c , glen herrmannsfeldt
If anyone remembers VMS, they'll recall that all the system headers were in
a text library, a file with a .tlb extension. The C compiler knew
automagically how to read out the relevant block of data at compile time.

Someday I might have a running MicroVAX running VMS,
and some compilers under the hobbyist license.

I did some C under VMS many years ago. The most interesting
one is the messages you get out from exit() for arguments
other than 0. I had a program with exit(8) and it took
me a while to understand the message that came out.

-- glen
 
Dan Pop

In said:
Still, the idea of keeping related programs together in one
file seems useful. Unix has ar files for libraries.

Though #include files tend to be in a directory structure
as separate files.

ar files are for linker consumption, while header files are for compiler
and human consumption. Furthermore, there is no one-to-one correspondence
between them. So, the different handling makes sense.

The Unix model is quite popular on non-Unix systems, too, except that
ar files are usually called libraries (ar stands for "archiver" and the
.a suffix for "archive").

Another Unix oddity is the usage of the "lib" *prefix* in the name of
archives built for linker consumption: this prefix is omitted, along
with the .a suffix, when the linker is asked to search a library using
the -l option: to include libm.a in the linking process (it contains
the <math.h> stuff and is not searched by default) you have to use
the -lm option.

Dan
 
-wombat-

Beni said:
I have been programming in C for about a year now. It sounds silly,
but I never took the time to question why a C (or C++ or Java) program's
execution begins only at main(). Is it a convention or is there
some deeper underlying reason?

I love reading all of the wanking on this thread regarding "hosted
implementations" and the CRT.

An executable image gets loaded by the OS and somewhere in the image, there
has to be a starting point, usually denoted in the header. The OS jumps to
the starting point's address after the image is loaded into memory (today,
this also means dynamically linking libraries too and a lot of other
garbage.) But in C terms, the starting point is an address (pointer to fcn
returning int). return (*entry_point)(). BFD.

This starting point gets set by the linker (loader or linking loader in
years gone by). The C compiler frontend adds the symbol name of that
starting point to the linker command line when the objects are linked into
an executable. Now, pick a name for that starting point -- "main" sounded
like a good one and it stuck. Worse yet, the CRT is generally now the entry
point and the CRT expects to call main as a function.

Why is the user's entry point called "main"? It could have been something
else entirely, but at the time of PDP-11s and early Unix/C, the "main entry
point of the program" was a phrase in common usage.
 
Mark McIntyre

Someday I might have a running MicroVAX running VMS,
and some compilers under the hobbyist license.

Can you do that? Point me at a URL; my office has a stack of VMS
workstations it's junking, I could have hours of fun...

I did some C under VMS many years ago. The most interesting
one is the messages you get out from exit() for arguments
other than 0. I had a program with exit(8) and it took
me a while to understand the message that came out.

Same with return (somedigit) from main.
 
CBFalconer

Mark said:
.... snip ...

Can you do that? Point me at a URL; my office has a stack of VMS
workstations it's junking, I could have hours of fun...

You may get a lot of interest in those from a.f.c, to which I have
cross-posted this.

(Mark is in the UK)
 
August Derleth

Another Unix oddity is the usage of the "lib" *prefix* in the name of
archives built for linker consumption: this prefix is omitted, along
with the .a suffix, when the linker is asked to search a library using
the -l option: to include libm.a in the linking process (it contains
the <math.h> stuff and is not searched by default) you have to use
the -lm option.

As long as we're [OT], I might as well defend this behavior. ;-)

It relies on a naming convention that should be universally followed, and
the desire to avoid typing boilerplate. Assuming all library files are named
following the libfoo.a scheme, only the foo part is at all meaningful to
the human linking the program: It is, or should be, predictable from the
headers you included or the naming scheme of the functions you called, and
it certainly should be trivially derived from the full name of the
collection of code you just called into.

Therefore, stripping the first and last bits off ought to leave the poor
typist compiling programs all day (and recall, Unix was designed for poor
typists ;-)) with something short and memorable to put behind that -l flag.

(Not scanning libm.a by default might arguably be a bug in this age of
cheap clock cycles and the possibility of intelligent linkers, but that's
a different Holy War.)
 
Dan Pop

In said:
Can you do that? Point me at a URL; my office has a stack of VMS
workstations it's junking, I could have hours of fun...

Lots of hours, indeed, given the speed of these things...

Dan
 
Dan Pop

In said:
Another Unix oddity is the usage of the "lib" *prefix* in the name of
archives built for linker consumption: this prefix is omitted, along
with the .a suffix, when the linker is asked to search a library using
the -l option: to include libm.a in the linking process (it contains
the <math.h> stuff and is not searched by default) you have to use
the -lm option.

As long as we're [OT], I might as well defend this behavior. ;-)

Well, I was not attacking it in the first place... But since you want
to defend it: what's the point of having the lib prefix in the first
place, as long as the purpose of the file can be derived from the suffix
alone?

(Not scanning libm.a by default might arguably be a bug in this age of
cheap clock cycles and the possibility of intelligent linkers, but that's
a different Holy War.)

It's not as much an issue of cheap clock cycles as it is of fast disks.
Back when it was not uncommon to have a floppy disk as your fastest
I/O device, any such optimisations made perfect sense. Nowadays, they
merely reflect the stubbornness of a few people who think that whatever
was the right thing 30 years ago *must* be the right thing today.

Back then, the "standard" C library was contained in two library files,
libc.a and libm.a that were covered by two headers, <stdio.h> and
<math.h>. For obvious reasons, libm.a wasn't searched by default
(I can't remember writing a single *production* Unix program including
<math.h> and I guess I'm not alone). By the time the one-to-one
correspondence between headers and library files was broken, treating
the <math.h> stuff in a special manner at link time started to make less
sense than before. In time, it made less and less sense...

Dan
 
CBFalconer

Dan said:
August Derleth said:
Another Unix oddity is the usage of the "lib" *prefix* in the
name of archives built for linker consumption: this prefix is
omitted, along with the .a suffix, when the linker is asked to
search a library using the -l option: to include libm.a in the
linking process (it contains the <math.h> stuff and is not
searched by default) you have to use the -lm option.

As long as we're [OT], I might as well defend this behavior. ;-)

Well, I was not attacking it in the first place... But since
you want to defend it: what's the point of having the lib prefix
in the first place, as long as the purpose of the file can be
derived from the suffix alone?

I think historically .a (archive) files were used for other
purposes, much as tar and zip are today. Only some of these were
object libraries.
 
August Derleth

Another Unix oddity is the usage of the "lib" *prefix* in the name of
archives built for linker consumption: this prefix is omitted, along
with the .a suffix, when the linker is asked to search a library using
the -l option: to include libm.a in the linking process (it contains
the <math.h> stuff and is not searched by default) you have to use the
-lm option.

As long as we're [OT], I might as well defend this behavior. ;-)

Well, I was not attacking it in the first place... But since you want
to defend it: what's the point of having the lib prefix in the first
place, as long as the purpose of the file can be derived from the suffix
alone?

As CBFalconer said, the ar archiver was used for other things before tar
became widespread enough to be the archiver of choice. (Although seeing
shell archives is more common among old Unix source archives, a shell
archive is really just a shell program. You can see the potential for
abuse.)

Secondly, it's more common for extensions to describe the format of a file
and not its use. (Except for executable files, which ideally shouldn't
have extensions at all, no matter what they look like internally.)

So, really, it's the same reason the compilers don't scan libm.a by
default: Tradition.
 
Dan Pop

In said:
Dan said:
August Derleth said:
On Wed, 19 May 2004 14:32:31 +0000, Dan Pop wrote:

Another Unix oddity is the usage of the "lib" *prefix* in the
name of archives built for linker consumption: this prefix is
omitted, along with the .a suffix, when the linker is asked to
search a library using the -l option: to include libm.a in the
linking process (it contains the <math.h> stuff and is not
searched by default) you have to use the -lm option.

As long as we're [OT], I might as well defend this behavior. ;-)

Well, I was not attacking it in the first place... But since
you want to defend it: what's the point of having the lib prefix
in the first place, as long as the purpose of the file can be
derived from the suffix alone?

I think historically .a (archive) files were used for other
purposes, much as tar and zip are today.

AFAIK, ar and tar must be roughly of the same age and tar is the one
supposed to be used for other purposes than archiving on tapes.

But there may have been a time window when ar was and tar wasn't...

Dan
 
Dan Pop

In said:
On Wed, 19 May 2004 14:32:31 +0000, Dan Pop wrote:

Another Unix oddity is the usage of the "lib" *prefix* in the name of
archives built for linker consumption: this prefix is omitted, along
with the .a suffix, when the linker is asked to search a library using
the -l option: to include libm.a in the linking process (it contains
the <math.h> stuff and is not searched by default) you have to use the
-lm option.

As long as we're [OT], I might as well defend this behavior. ;-)

Well, I was not attacking it in the first place... But since you want
to defend it: what's the point of having the lib prefix in the first
place, as long as the purpose of the file can be derived from the suffix
alone?

As CBFalconer said, the ar archiver was used for other things before tar
became widespread enough to be the archiver of choice. (Although seeing
shell archives is more common among old Unix source archives, a shell
archive is really just a shell program. You can see the potential for
abuse.)

Secondly, it's more common for extensions to describe the format of a file
and not its use. (Except for executable files, which ideally shouldn't
have extensions at all, no matter what they look like internally.)

Unix doesn't use file extensions at all...

Then, pray tell, what are the prefixes used in conjunction with the .tar
suffix? How about the .c or .o suffix?

The common convention is that the name of the directory containing a file
together with the file name suffix provide an indication about the file's
use. So, if the directory name is "lib" and the suffix is .a, the "lib"
prefix is as redundant as you can get.

So, really, it's the same reason the compilers don't scan libm.a by
default: Tradition.

Nope: you can't *explain* tradition by invoking it ;-)

Dan
 
Dave Thompson

Dik T. Winter said:
Sprunk" <[email protected]> writes:

Linux will load ELF objects and begin execution wherever the ELF header
specifies. There may be common values for those addresses, but any value in
the first 2GB of virtual memory works on the i386 platform. I don't know
about other platforms Linux runs on, nor do I have any clue how COFF objects
work on i386.

M$ PE~COFF (at least) does have a start-address field.

However, *original* Unix did start at virtual 0 -- and "a.out" format
began with an 8-word header whose first word was (always) 0407, the
PDP-11 instruction to branch over the 7 remaining words to the first
actual code (thus) always at virtual byte 020. AFAICT, this plus ar
originated the now-hoary convention of identifying file formats by a
"magic number" in the first 2 or nowadays often 4-8 bytes.

- David.Thompson1 at worldnet.att.net
 
