Compiling source code out of the blue

  • Thread starter Tomás Ó hÉilidhe

Tomás Ó hÉilidhe

I'd post this on a gcc newsgroup but I'd be more productive talking
to the wall.

Anyway, let's say someone throws some source code at you for a
particular program and says, "Just compile it, it works fine". Now
admittedly, I tend to have a phobia of this situation because I recall
from my Windows days the numerous times I was given code that was
supposedly "good to go", but which failed to compile for some stupid
reason. Of course I like to program, but I couldn't be bothered going
through header files looking for the dodgy definition of DWORD which is
hidden within #if blocks pertaining to the Windows version.

Anyway, pretending I have faith for a moment in receiving a program's
source code to compile to yield an executable binary, I'd just like to
ask how best to compile it "in release mode" using gcc. I don't need
errors or warnings, I just want the executable.

At the moment, I'm unzipping the zip file, opening a command prompt
in the relevant directory and doing the following:

gcc *.c -D NDEBUG -o prog.exe

I'm looking for a gcc command line sequence that does the following:

* Compiles and links all the source files (*.c) present in the current
directory.
* Applies any and all optimisations it wants.
* Doesn't give me warnings (or any output for that matter)
* Strips all the garbage out of the executable (HelloWorld shouldn't be
400 KB)

The method I'm using at the moment does the trick, but still the
executable file is a bit big (roughly 25 KB for a simple-enough program).
Also, I'd like to know that I'm getting all the optimisations that are on
offer.

I'm getting into cross-platform programming lately, compiling
something for Linux one minute and Windows the other, which is why I've
been wondering what's the best "Give Me An Executable" method of using
gcc.

And just out of curiosity, is gcc restricted mainly to normal 8-Bit
byte systems, or does it have binaries for all sorts of different systems,
9-Bit ones with padding inside a sign-magnitude int perhaps?
 
Randy Howard

Anyway, let's say someone throws some source code at you for a
particular program and says, "Just compile it, it works fine". Now
admittedly, I tend to have a phobia of this situation because I recall
from my Windows days the numerous times I was given code that was
supposedly "good to go", but which failed to compile for some stupid
reason. Of course I like to program, but I couldn't be bothered going
through header files looking for the dodgy definition of DWORD which is
hidden within #if blocks pertaining to the Windows version.

That can happen with code coming from <any platform>. It isn't fun,
because a blanket statement like "it just works fine", when taken from
code that hasn't been ported already, implies a lot about the person
giving it to you, and almost nothing about the code itself.
Anyway, pretending I have faith for a moment in receiving a program's
source code to compile to yield an executable binary, I'd just like to
ask how best to compile it "in release mode" using gcc. I don't need
errors or warnings, I just want the executable.

You don't need errors or warnings? How can you possibly support that,
/especially/ the former?
At the moment, I'm unzipping the zip file, opening a command prompt
in the relevant directory and doing the following:

gcc *.c -D NDEBUG -o prog.exe

I'm looking for a gcc command line sequence that does the following:

* Compiles and links all the source files (*.c) present in the current
directory.

Makefiles are nice for this for anything but trivially small projects.
If you are getting projects that were built with something like Visual
from MS, the project files are likely to not help you much on other
platforms, so you may have to build some of these yourself, unless you
really like invoking things manually file by file.
* Applies any and all optimisations it wants.

You want the compiler to decide the optimization settings? I'd think a
good starting point might be -O2.
* Doesn't give me warnings (or any output for that matter)

I think this is a horribly bad idea. Usually you want as many as you
can get, especially in an initial port. Later on you may decide that
some aren't going to be addressed, but a lot of broken code will
compile without error, but with meaningful warnings and still generate
a binary.
* Strips all the garbage out of the executable (HelloWorld shouldn't be
400 KB)

Many platforms have a means to do this after linking, try man strip on
UNIX systems, for example.
The method I'm using at the moment does the trick, but still the
executable file is a bit big (roughly 25 KB for a simple-enough program).
Also, I'd like to know that I'm getting all the optimisations that are on
offer.

The gcc compiler has a wide variety of optimization settings, and deciding
which to turn on and off usually requires more consideration than just
"turn them all on".
I'm getting into cross-platform programming lately, compiling
something for Linux one minute and Windows the other, which is why I've
been wondering what's the best "Give Me An Executable" method of using
gcc.

I can understand you wanting to apply the KISS principle to minimize
some of it, but I suspect you're going to cause more problems than you
solve by pursuing this path.
And just out of curiosity, is gcc restricted mainly to normal 8-Bit
byte systems, or does it have binaries for all sorts of different systems,
9-Bit ones with padding inside a sign-magnitude int perhaps?

http://gcc.gnu.org/install/specific.html

The above addresses some of them, but not all. It should be a good
starting point though. Google should cough up answers for a specific
processor you may have in mind.
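
Whatever targets a particular gcc build supports, a program can ask its own implementation how wide a byte is and how negative ints are represented. A minimal sketch using only <limits.h>; the function names are made up for illustration:

```c
#include <limits.h>

/* Number of bits in a byte on this implementation (standard C
   guarantees at least 8). */
int bits_per_byte(void)
{
    return CHAR_BIT;
}

/* Classifies the representation of negative ints by looking at the two
   low-order bits of -1:
     two's complement -> ...11
     ones' complement -> ...10
     sign-magnitude   -> ...01 */
const char *int_representation(void)
{
    switch (-1 & 3) {
    case 3:  return "two's complement";
    case 2:  return "ones' complement";
    default: return "sign-magnitude";
    }
}
```

On an ordinary desktop target, bits_per_byte() would be expected to return 8 and int_representation() to report two's complement.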
 
Tomás Ó hÉilidhe

Randy Howard said:
That can happen with code coming from <any platform>. It isn't fun,
because a blanket statement like "it just works fine", when taken from
code that hasn't been ported already, implies a lot about the person
giving it to you, and almost nothing about the code itself.


I tend to be dealing with command-line programs which should be fully
portable. For example, a program to calculate a network card's serial
number from its MAC address.

You don't need errors or warnings? How can you possibly support that,
/especially/ the former?


Because the program is known to work perfectly. When developing my own
code, I of course make use of high warning settings... but when I'm given
source code where I'd prefer to receive a binary, I just want to compile
to an executable and be done with it.

Working with Linux, people distribute source code a lot. So much so, that
gcc comes built-in to the operating system. You think you're downloading
a binary for a program, and then when you open readme.txt, it tells you
to do:

make
make install

Makefiles are nice for this for anything but trivially small projects.
If you are getting projects that were built with something like Visual
from MS, the project files are likely to not help you much on other
platforms, so you may have to build some of these yourself, unless you
really like invoking things manually file by file.


What's a makefile? Is it a list of parameters to pass to the compiler? Is
there any standard kind of makefile, or do all compilers have a different
format?

You want the compiler to decide the optimization settings? I'd think
a good starting point might be -O2.


Interestingly enough, I compiled my program with -O3 and now the binary
doesn't work properly... I'll look into why.

I think this is a horribly bad idea. Usually you want as many as you
can get, especially in an initial port. Later on you may decide that
some aren't going to be addressed, but a lot of broken code will
compile without error, but with meaningful warnings and still generate
a binary.


As I said, I would have preferred a binary but I'm left with source code,
so I just want to compile it and pretend I was given a binary to begin
with.

Many platforms have a means to do this after linking, try man strip on
UNIX systems, for example.


I've had a look at "strip" that comes with gcc. What I'm curious about
though, is why this needs to be done at all? Why fill an executable with
crap that it doesn't need? (assuming we're compiling in Release Mode of
course)
 
Kenny McCormack

Tomás Ó hÉilidhe said:
I've had a look at "strip" that comes with gcc. What I'm curious about
though, is why this needs to be done at all? Why fill an executable with
crap that it doesn't need? (assuming we're compiling in Release Mode of
course)

Compile (or, more precisely - and in this ng, you always gotta be
precise! - link) with "-s". This does the strip as part of the linking.

Note also that the default operation in most Linux-y situations is to
compile with "-g", which puts "all that crap" in there in the first
place. But linking with "-s" will remove it.

Note: Any minute now, someone is going to post an "off topic, can't
discuss it here, blah, blah, blah" message, telling you that makefiles
(and everything else you're interested in) are verboten.
 
Randy Howard

I tend to be dealing with command-line programs which should be fully
portable. For example, a program to calculate a network card's serial
number from its MAC address.




Because the program is known to work perfectly.

On some other platform. "Known to work" programs frequently suffer
from bugs that don't appear on Platform A when you move them to Platform
B. Platform-specific interfaces, UB, byte-order issues, lots of
things.
Working with Linux, people distribute source code a lot. So much so, that
gcc comes built-in to the operating system. You think you're downloading
a binary for a program, and then when you open readme.txt, it tells you
to do:

make
make install

Usually some configuration is required for anything but trivial
programs, such as the ./configure stuff.
What's a makefile? Is it a list of parameters to pass to the compiler? Is
there any standard kind of makefile, or do all compilers have a different
format?

Are you serious?
Interestingly enough, I compiled my program with -O3 and now the binary
doesn't work properly... I'll look into why.

I guess the code isn't perfect after all.
 
Tomás Ó hÉilidhe

Randy Howard said:
On some other platform. "Known to work" programs frequently suffer
from bugs that don't appear on Platform A when you move them to Platform
B. Platform-specific interfaces, UB, byte-order issues, lots of
things.


Most of the source code I come across is for programs targeted at
"Windows / Mac / Linux". There are no major differences between the
platforms.

Still though, if *I* had written the code, it'd work on any compliant
implementation of C89.

Are you serious?


I get the feeling that makefiles are something I should know about...?
I only ever give the compiler a list of source files and voilà I get my
executable -- I've never needed a makefile. I may as well look it up on
Wikipedia now while we're on the topic.

I guess the code isn't perfect after all.


Then again it could be the compiler. I'll check over the code in the next
few hours, it's only 200 lines or so and I know pretty much where things
are acting strange in the code.
 
Tomás Ó hÉilidhe

Tomás Ó hÉilidhe said:
Then again it could be the compiler. I'll check over the code in the
next few hours, it's only 200 lines or so and I know pretty much where
things are acting strange in the code.


Found the problem. I'm gonna start a new thread about it entitled "Sequence
point violation?".
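
The offending code isn't shown in this thread, so the following is only a generic illustration of what a sequence point violation tends to look like and how it is usually repaired. The function name copy_ints is hypothetical:

```c
#include <stddef.h>

/* Undefined behaviour: i is modified by i++ and also read for the other
   index with no intervening sequence point, so different compilers (or
   different optimisation levels) may produce different results:

       dst[i] = src[i++];
*/

/* Well-defined version: the index is advanced in its own statement. */
void copy_ints(int *dst, const int *src, size_t n)
{
    size_t i = 0;

    while (i < n) {
        dst[i] = src[i];
        i++;
    }
}
```

Code of the first kind can appear to work at one optimisation level and misbehave at another, which matches the symptom of a binary that only breaks at -O3.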
 
CBFalconer

Tomás Ó hÉilidhe said:
.... snip ...


I get the feeling that makefiles are something I should know
about? I only ever give the compiler a list of source files and
voilà I get my executable -- I've never needed a makefile. I
may as well look it up on Wikipedia now while we're on the topic.

Try "info make" or "man make" on your Linux or Cygwin or DJGPP
system. Maybe also MinGW.

BTW, please do not remove attributions for material you quote.
Attributions are the initial lines of the form "Joe wrote:". We
have no idea who wrote "What's a makefile?".
 
John Bode

I tend to be dealing with command-line programs which should be fully
portable. For example, a program to calculate a network card's serial
number from its MAC address.

"Should be" doesn't often translate to "is". Anything that relies on
byte order, alignment rules, etc., won't be portable.
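
As a rough illustration of the byte-order and alignment point, here is a sketch of reading a 32-bit big-endian value out of a byte buffer. The name read_be32 is made up for the example, and the buffer is assumed to hold at least four octets:

```c
/* Non-portable: the result depends on the host's byte order, the cast may
   break alignment rules, and unsigned long is not 32 bits everywhere:

       unsigned long v = *(unsigned long *)buf;
*/

/* Portable: assemble the value from individual octets, most significant
   first. The masks keep the result correct even where a byte is wider
   than 8 bits. */
unsigned long read_be32(const unsigned char *buf)
{
    return ((buf[0] & 0xFFUL) << 24)
         | ((buf[1] & 0xFFUL) << 16)
         | ((buf[2] & 0xFFUL) << 8)
         |  (buf[3] & 0xFFUL);
}
```

The second form gives the same answer on little-endian and big-endian hosts, because it never looks at how the host lays out an integer in memory.
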
Because the program is known to work perfectly.

On a specific platform. The code may be making some non-portable
assumptions about the underlying architecture, some of which may show
up in the warnings.
When developing my own
code, I of course make use of high warning settings... but when I'm given
source code where I'd prefer to receive a binary, I just want to compile
to an executable and be done with it.

So would I. I also want to win the lottery.
Working with Linux, people distribute source code a lot. So much so, that
gcc comes built-in to the operating system. You think you're downloading
a binary for a program, and then when you open readme.txt, it tells you
to do:

make
make install

That's an artifact of multiple distros running on widely different
hardware -- it's highly impractical to build binaries for all possible
targets (e.g., Red Hat on x86, Yellow Dog on PPC, etc.).
What's a makefile? Is it a list of parameters to pass to the compiler? Is
there any standard kind of makefile, or do all compilers have a different
format?

A makefile is sort of equivalent to a project file, in that it
identifies the files in a project and the rules to build the project.
The make utility builds a dependency graph from the information in the
makefile, and automatically rebuilds any files that are out of date
with respect to the target.

The make utility and makefile format are common across most Unix and
Linux systems, and are independent of any specific compiler.
Interestingly enough, I compiled my program with -O3 and now the binary
doesn't work properly... I'll look into why.

Yeah, that smells like the code in question is making some non-
portable assumptions.
As I said, I would have preferred a binary but I'm left with source code,
so I just want to compile it and pretend I was given a binary to begin
with.

Doesn't help much if the code doesn't work.
 
pete

Tomás Ó hÉilidhe said:
I'd post this on a gcc newsgroup
but I'd be more productive talking to the wall.

Anyway, let's say someone throws some source code at you for a
particular program and says, "Just compile it, it works fine".

Crank up the compiler warning level.
Just compile it, don't link it.
See what happens and take it from there.
 
Stephen Sprunk

Tomás Ó hÉilidhe said:
Because the program is known to work perfectly. When developing my own
code, I of course make use of high warning settings... but when I'm given
source code where I'd prefer to receive a binary, I just want to compile
to an executable and be done with it.

"known to work perfectly" is rarely conclusive. There are a great many
programs which work fine _on a particular platform, when compiled with a
particular compiler_ but break horribly when you step outside those bounds.

IOW, a program can appear to be correct while relying on
implementation-defined or even undefined behavior, but that is no guarantee
it will continue to work if ported (even to a new version of the same
compiler or OS).
Working with Linux, people distribute source code a lot. So much so, that
gcc comes built-in to the operating system. You think you're downloading a
binary for a program, and then when you open readme.txt, it tells you to
do:

make
make install

Add a "./configure" step at the beginning for most open source programs
these days.
What's a makefile? Is it a list of parameters to pass to the compiler? Is
there any standard kind of makefile, or do all compilers have a different
format?

A makefile is processed by the "make" command, and there is a standard
format, at least for UNIX-y systems. If a program comes with a file called
"Makefile", you can start with that and, if compiling fails, usually get it
working with minor changes. If it comes with a program called "configure",
that will build a Makefile for you -- and determine all the dependencies the
program has, adjust for your local environment, etc. If it comes with
neither, then either it's a completely trivial program or (if it comes with
"project" files) it was intended to only work on Windows. The latter is
rarely fun.

If it is a trivial program, here's a good start:

gcc -ansi -pedantic -W -Wall foo.c -o foo

or, if you're serious about not being warned of broken code:

gcc foo.c -o foo

I _always_ create Makefiles for anything larger than a single source file;
it's not tough, and the time spent is recovered within a few
modify-compile-test cycles.
Interestingly enough, I compiled my program with -O3 and now the binary
doesn't work properly... I'll look into why.

That means the code is broken, though it may not show up on the author's
system. Broken code often appears to work correctly if you only use it on a
particular OS with a particular compiler; porting such code can be a
nightmare.
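
One common way for code to "work" unoptimised and break at a higher optimisation level is a check that relies on signed overflow wrapping around. The sketch below is an illustration only, not the code from this thread, and the function name is hypothetical:

```c
#include <limits.h>

/* Broken: signed overflow is undefined behaviour, so an optimiser is
   allowed to assume x + 1 < x can never be true and reduce the whole
   test to "return 0":

       int increment_would_overflow(int x) { return x + 1 < x; }
*/

/* Well-defined: compare against INT_MAX before doing any arithmetic. */
int increment_would_overflow(int x)
{
    return x == INT_MAX;
}
```

The broken form often behaves as intended without optimisation, which is why such code can pass for correct on the author's machine.
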
I've had a look at "strip" that comes with gcc. What I'm curious about
though, is why this needs to be done at all? Why fill an executable with
crap that it doesn't need? (assuming we're compiling in Release Mode of
course)

Because you often want to debug "release" binaries when customers find
problems in the field, or when the program mysteriously crashes after you're
"done" working on it. Having that extra info there doesn't hurt
performance, so there's rarely reason to take it out.

S
 
andreyvul

* Compiles and links all the source files (*.c) present in the current
directory.
Makefile.
* Applies any and all optimisations it wants.
General optimization: -O2
Size optimization: -Os (-O2 plus space-saving optimizations)
Speed optimization: -O3 (-O2 plus more aggressive optimizations such as
function inlining and, in some gcc versions, register renaming - note:
-O3 is a bitch to debug) and -mtune= (tunes the generated code for a
certain processor, e.g. cache timings, instruction latencies), -march=,
-m<feature> (lets the compiler use instruction-set extensions such as
MMX, SSE/2/3/4, 3DNow!/+, x86_64, etc. - code will *not* be
backwards-compatible)

I usually use -O2 -finline-functions -mtune=athlon64 -march=athlon64
-msse3 for all my code.
* Doesn't give me warnings (or any output for that matter)
1) *VERY* bad idea. If the compiler chokes on your code, you won't know
what happened or where.
2) Compile with -ggdb. It will make debugging far easier.
* Strips all the garbage out of the executable (HelloWorld shouldn't be
400 KB)
Tweak the linker parameters. A HelloWorld object file is ~500-600
bytes. Unfortunately, both MSVC and MinGW link in a lot of libraries
because msvcrt.dll has a LOT of Win32 api calls. In Unix, glibc is
modular enough that you only need to add libc.a (and libm.a if you're
using anything in math.h). I'm far too lazy to turn on my linux laptop
in order to check how small an ELF HelloWorld can be.
 
andreyvul

I tend to be dealing with command-line programs which should be fully
portable. For example, a program to calculate a network card's serial
number from its MAC address.

Show me an ANSI/ISO C way of getting a network card's MAC address.
*NO* OS calls, syscalls, shell scripts, calls to system(), or OS API
functions. Posix functions are maybe.
Because the program is known to work perfectly. When developing my own
code, I of course make use of high warning settings... but when I'm given
source code where I'd prefer to receive a binary, I just want to compile
to an executable and be done with it.

Working with Linux, people distribute source code a lot. So much so, that
gcc comes built-in to the operating system. You think you're downloading
a binary for a program, and then when you open readme.txt, it tells you
to do:

make
make install


What's a makefile? Is it a list of parameters to pass to the compiler? Is
there any standard kind of makefile, or do all compilers have a different
format?

GNU Make is standard on GNU/Linux distributions.
If in doubt, write a GNU make compatible makefile.

An example:
# Makefile for Tower of Hanoi
# (C)2007 Andrey Vul
# GPL
# make will be using the gcc compiler
CC = gcc
# Enable the common warnings (-Wall does not enable every single warning)
CFLAGS = -Wall
# Speed optimization
OPTIMIZE_CFLAGS = -O3 -fomit-frame-pointer

# These targets are commands, not files
.PHONY: all clean compile install

# make (all)
all: towerofhanoi

# make clean: remove object files, nano backups and ultraedit-32 backups
clean:
	-rm -f *.o *~ *.bak

# Linking phase; compile is a prerequisite target
towerofhanoi: compile
	$(CC) *.o -o $@

# Create object files
compile:
	$(CC) $(CFLAGS) $(OPTIMIZE_CFLAGS) -c *.c

# I'm making this up to show an example
# make install
install: towerofhanoi
	cp towerofhanoi /usr/local/bin

Interestingly enough, I compiled my program with -O3 and now the binary
doesn't work properly... I'll look into why.
-O3 is risky. If the code crashes, use -O2.
As I said, I would have preferred a binary but I'm left with source code,
so I just want to compile it and pretend I was given a binary to begin
with.
The code crashes because you used void* and inputted a float instead
of a struct. The compiler would have warned you about this, but you
chose to disable warnings.
I've had a look at "strip" that comes with gcc. What I'm curious about
though, is why this needs to be done at all? Why fill an executable with
crap that it doesn't need? (assuming we're compiling in Release Mode of
course)
Because it will be far easier to debug.
 
Randy Howard

Show me an ANSI/ISO C way of getting a network card's MAC address.
*NO* OS calls, syscalls, shell scripts, calls to system(), or OS API
functions. Posix functions are maybe.

Yes, but that's not what he wrote. He may have meant that, but what he
wrote seems to imply that

int calc_serialnumber(const char *MAC_Address);

could be all that is required, employing some algorithm to extract
serial number from the MAC Address string passed to it.
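
To show that the string-handling half of such a function needs nothing beyond standard C, here is a sketch of the parsing step only. The helper name parse_mac and the colon-separated input format are assumptions, and the brand-specific serial number formula is not given in this thread, so it is left out:

```c
#include <stdio.h>

/* Parses "xx:xx:xx:xx:xx:xx" into six octets.
   Returns 1 on success, 0 if a field is missing or extra text follows.
   Each field may be one or two hex digits; stricter validation is
   deliberately left out of this sketch. */
int parse_mac(const char *text, unsigned char out[6])
{
    unsigned int v[6];
    char extra;
    int i;

    /* The trailing %c only converts if something follows the sixth
       field, in which case sscanf returns 7 instead of 6. */
    if (sscanf(text, "%2x:%2x:%2x:%2x:%2x:%2x%c",
               &v[0], &v[1], &v[2], &v[3], &v[4], &v[5], &extra) != 6)
        return 0;

    for (i = 0; i < 6; i++)
        out[i] = (unsigned char)v[i];
    return 1;
}
```

A calc_serialnumber along the lines of the prototype above could call this first and then apply whatever formula the manufacturer uses to the six octets.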
 
Gordon Burditt

andreyvul said:
Show me an ANSI/ISO C way of getting a network card's MAC address.
*NO* OS calls, syscalls, shell scripts, calls to system(), or OS API
functions. Posix functions are maybe.

printf() followed by fgets(). Or look at argv[]. The post did not
say the network card in question is one connected to this computer,
and even if it is, you can prompt for the MAC address.

Presumably if you CAN calculate a network card's serial number from
its MAC address, there's either a database of them or there's some
formula for brand X (it is likely that the person using the program
works for the manufacturer or distributor of Brand X network cards)
network cards that relates the two.
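
The "look at argv[], or printf() followed by fgets()" approach could be sketched roughly as follows. The function name get_mac_text and the prompt text are made up for the example; everything it calls is standard C:

```c
#include <stdio.h>
#include <string.h>

/* Copies the MAC address text into buf: from argv[1] if one was given,
   otherwise by prompting on stdout and reading one line from `in`.
   Returns 1 on success, 0 on failure (no input, or text too long). */
int get_mac_text(int argc, char **argv, FILE *in, char *buf, size_t size)
{
    if (size == 0)
        return 0;

    if (argc > 1) {
        if (strlen(argv[1]) >= size)
            return 0;
        strcpy(buf, argv[1]);
        return 1;
    }

    printf("MAC address: ");
    fflush(stdout);
    if (fgets(buf, (int)size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';   /* drop the trailing newline */
    return 1;
}
```

From main this would typically be called as get_mac_text(argc, argv, stdin, buf, sizeof buf), with no operating system interface involved.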
 
Gordon Burditt

* Strips all the garbage out of the executable (HelloWorld shouldn't be
400 KB)
Tweak the linker parameters. A HelloWorld object file is ~500-600
bytes. Unfortunately, both MSVC and MinGW link in a lot of libraries
because msvcrt.dll has a LOT of Win32 api calls. In Unix, glibc is
modular enough that you only need to add libc.a (and libm.a if you're
using anything in math.h). I'm far too lazy to turn on my linux laptop
in order to check how small an ELF HelloWorld can be.

For what it's worth: a HelloWorld object file on FreeBSD 6.2 (gcc
3.4.6) is 800 bytes. The executable, linked with gcc -s and dynamic
linking of the C library (dynamic symbols can't be removed and leave
the executable runnable), is 3156 bytes. I suspect there's some
alignment to page boundaries here.

The size(1) command gives 65 bytes for the object file and 1600
bytes for the executable. About 150 bytes of the executable comes
from two RCS tags for the crt "startup code".

I suspect there's some alignment to pages going on.
 
Tomás Ó hÉilidhe

Presumably if you CAN calculate a network card's serial number from
its MAC address, there's either a database of them or there's some
formula for brand X (it is likely that the person using the program
works for the manufacturer or distributor of Brand X network cards)
network cards that relates the two.


Exactly, there's a particular brand of ADSL router where the MAC
address can be determined from the serial, and vice versa. I don't actually
work for the manufacturer though.
 
