Let's say someone produces a tool that converts C code to compliant C++
code - e.g. alters C++ keywords used as identifiers, adds prototypes, adds
explicit casts of void * etc. Would you describe such a program as a C
compiler? If not, why not?
Generally, I *would* call it a compiler (provided it produced an
executable image in the process, perhaps by later invoking the
"assembler" that translates the C++ to machine code). But if this
particular translator depended on the C++-to-machine-code step to
find certain fundamental errors, that is a -- perhaps even the only
-- condition under which I would not call it a compiler.
I am not sure I can define it very well, so consider the following
as an example, before I go on to an attempt at a definition:
% cat bug.c
int main(void] { return *42; }
% ctocxx -C bug.c
(Here, please assume the -C option means "leave the C++ `assembly'
visible for inspection", and that no diagnostics occur.)
% cat bug.c++
int main(] { return *42; }
%
This fails the "compiler" criterion by missing the obvious syntax
error ("]" should be ")") and semantic error (unary "*" cannot be
applied to an integer constant). (And of course, if main() were
to call itself recursively in the C version, the C++ code would
have to use some other function, or depend on that particular C++
implementation to allow recursive calls to main() -- either would
be acceptable, provided the "C compiler" comes *with* the C++
compiler portion. If the C compiler is meant to work with *any*
C++ compiler, depending on implementation-defined characteristics
would be at best a bug.)
The difference is basically one of responsibility: to be called a
"compiler", the program must make a complete syntactic and semantic
analysis of the source code, determine its "intended meaning" (or
one of several meanings, in cases where the source language has
various freedoms), and generate as its output code that is intended
to pass cleanly through any (required and/or supplied) intermediate
stages before it produces the final "executable". If something
fails to "assemble" without the "compiler" stage first pointing out
an error, this indicates a bug in the compiler.
A preprocessor, macro-processor, or textual-substitution system, on
the other hand, does not need to make complete analyses -- if the
input is erroneous, its output can be arbitrarily malformed without
this necessarily being a bug. Diagnostics from later passes are
acceptable and expected.
Of course, escape hatches (as commonly found in C compilers with
__asm__ keywords and the like) can muddy things up a bit. If you
use __asm__ to insert invalid assembly code, while the compiler
assumes that you know what you are doing, this is probably "your
fault". Likewise, a C-via-C++-to-executable compiler might provide
an escape hatch to "raw C++", and if you muck that up, it would be
your fault, rather than a compiler bug or disqualifier.
(Note that a clever implementor might even use the C++ stage to
find [some of the] required-diagnostic bugs in incorrect C code.
I consider this "OK" and "not a disqualifier" *if* the C compiler
actually reads and digests the C++ stage's diagnostics, and re-forms
them back to refer to the original C code, so that the process is
invisible to the C programmer.)
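
As a rough sketch of what "re-forming" a diagnostic might look like,
assume the C++ stage prints messages of the form "FILE:LINE: MESSAGE"
and that the translator kept a table mapping generated C++ lines back
to C lines. Both the message format and the names line_map and
remap_diag() are assumptions made for illustration, not any real
compiler's interface.

```c
#include <stdio.h>

/* Hypothetical table entry: generated C++ line -> original C line. */
struct line_map {
    int cxx_line;
    int c_line;
};

/* Rewrite a diagnostic of the assumed form "FILE:LINE: MESSAGE" so
   that it names c_file and the corresponding C line.  Returns 0 on
   success, or -1 if the diagnostic is not in that form, the line is
   not in the map, or the result does not fit in out. */
int remap_diag(const char *diag, const char *c_file,
               const struct line_map *map, size_t n,
               char *out, size_t outsz)
{
    int line, used = 0, len;
    size_t i;

    /* Skip the file name, read the line number, and record in "used"
       where the message text begins. */
    if (sscanf(diag, "%*[^:]:%d: %n", &line, &used) != 1 || used == 0)
        return -1;

    for (i = 0; i < n; i++) {
        if (map[i].cxx_line == line) {
            len = snprintf(out, outsz, "%s:%d: %s",
                           c_file, map[i].c_line, diag + used);
            return (len < 0 || (size_t)len >= outsz) ? -1 : 0;
        }
    }
    return -1;
}
```

With a map entry pairing C++ line 3 with C line 4, the input
"bug.c++:3: error: bad thing" comes back as "bug.c:4: error: bad
thing". A real implementation would also have to cope with file
names containing colons, column numbers, multi-line messages, and
wording that mentions C++ constructs the C programmer never wrote.
Emitting standard #line directives into the generated C++ is another
way to get the line numbers right, though it does nothing about the
wording.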