Making Fatal Hidden Assumptions

toby · Mar 12, 2006

If the computation in one version can be reduced to a constant by the
compiler, that would be a reason for using that version.

I can imagine a number of situations in which "bad coding" is the
result of a programmer with a mental idea of how to accomplish
something efficiently, trying to render that approach in C as if it
were assembly language.

This reminds me of C--: http://www.cminusminus.org/

Hans-Bernhard Broeker · Mar 12, 2006

In comp.arch.embedded Jordan Abel said:

"tiny amounts of storage" may preclude a conforming hosted
implementation [which must support an object of 65535 bytes, and,
of course, a decent-sized library]

The "decent-sized" library for small embedded systems is easier to
meet than you may think. Be sure to look up "freestanding
implementation" in the library section of the C standard.

By the only truly formal definition, "C compilers, just not
conforming ones" exist no more than cars on the highway with "wheels,
just not round ones."

[...]
Long long is, for many such compilers, still a non-issue, because they
never claimed to have implemented C99. C89==C is still a widely
accepted assumption.

Or even the range of int itself. [People have claimed that "c"
implementations exist with 8-bit int]

Some people would probably also not shy away from claiming there are
18-wheeler trucks built with only two wheels. Even all things
considered, such people are blatantly wrong (because they've been lied
to by, or are, marketroids). There's no excuse for violating a strict
requirement of the standard just to match users' likely interpretation
of one of the helpful suggestions, like "int should be the natural
integer type of the target CPU".

I'd question how much of the other stuff can be gone and still
considered "c", though.

The rule of thumb should be one of practicality. Try hard to fit as
much of the language as you can on the CPU, but stay reasonable.
I.e. all features that lie in the intersection between the standard's
requirements and the platform's feature set, should be implemented
strictly by the C standard. For the rest, stay as close to the
standard as you can bear. And above all, *document* all such
deviations prominently.

In case of doubt, do what every clever politician would do: refuse to
decide, so the users can unload the decision and the responsibility on
their own shoulders. Implement both a "standard as standard can"
mode, and a "as much standard as we think makes sense" mode. E.g. on
8-bit or smaller CPUs C's default integer promotion rules can turn
into a serious liability; offering a flag to turn them on or off makes
sense.
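
To make the promotion point concrete, here is a minimal sketch. The
function names are made up for illustration, and the second function
only imitates what a hypothetical "promotions off" mode might compute;
it is not any particular vendor's flag. It assumes int can hold every
unsigned char value, which is the usual case.

```c
#include <limits.h>

/* Standard C: both operands are promoted to int before the addition
   (assuming int can represent every unsigned char value), so an 8-bit
   CPU has to do arithmetic wider than its native word. */
int promoted_sum(unsigned char a, unsigned char b)
{
    return a + b;
}

/* Imitation of a non-standard "no promotion" mode: the result is
   reduced modulo UCHAR_MAX + 1, as a byte-wide ALU would do natively. */
unsigned char narrow_sum(unsigned char a, unsigned char b)
{
    return (unsigned char)(a + b);
}
```

With 8-bit char, promoted_sum(200, 100) yields 300 while
narrow_sum(200, 100) yields 44. Code that silently relies on the first
result breaks in the second mode, which is why the deviation needs to
be documented.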

Allan Herriman · Mar 12, 2006

Jordan Abel said:
Jordan Abel said:

On Wed, 08 Mar 2006 18:07:45 -0700, Al Balmer wrote:
I spent 25 years writing assembler.

Yeah, me too. Still do, regularly, on processors that will never
have a C compiler.

It's a little OT in c.l.c, but would you mind telling us just what
processors those are, that you can make such a guarantee? What
characteristics do they have that means they'll never have a C
compiler?

(A few I can recall having been proposed are: tiny amounts of storage,
Harvard architecture, and lack of programmer-accessible stack.
Funnily enough, these are characteristics possessed by chips for which
I compile C code every day.)

"tiny amounts of storage" may preclude a conforming hosted
implementation [which must support an object of 65535 bytes, and, of
course, a decent-sized library]

Got me there. Freestanding only, with maybe 16 bytes of RAM and 256
words of ROM - no 'malloc()' here, and 'printf()' can be problematic.

A freestanding C implementation does, however, still have a C
compiler. I'm curious which processors Andrew Reilly claims "will
never have a C compiler", and why he makes that claim.

I assume Dr. Reilly's referring to various DSP devices. These often
have features such as saturating arithmetic and bit-reversed
addressing.

Regards,
Allan

Jordan Abel · Mar 12, 2006

Jordan Abel said:
Jordan Abel said:

On Wed, 08 Mar 2006 18:07:45 -0700, Al Balmer wrote:
I spent 25 years writing assembler.

Yeah, me too. Still do, regularly, on processors that will never
have a C compiler.

It's a little OT in c.l.c, but would you mind telling us just what
processors those are, that you can make such a guarantee? What
characteristics do they have that means they'll never have a C
compiler?

(A few I can recall having been proposed are: tiny amounts of storage,
Harvard architecture, and lack of programmer-accessible stack.
Funnily enough, these are characteristics possessed by chips for which
I compile C code every day.)

"tiny amounts of storage" may preclude a conforming hosted
implementation [which must support an object of 65535 bytes, and, of
course, a decent-sized library]

Got me there. Freestanding only, with maybe 16 bytes of RAM and 256
words of ROM - no 'malloc()' here, and 'printf()' can be problematic.

A freestanding C implementation does, however, still have a C
compiler. I'm curious which processors Andrew Reilly claims "will
never have a C compiler", and why he makes that claim.

I assume Dr. Reilly's referring to various DSP devices. These often
have features such as saturating arithmetic

Overflow's undefined, so what's wrong here?

and bit-reversed addressing.

I don't know what that is, so I have no idea how it would affect the
ability for there to be a conforming implementation

Guy Macon · Mar 12, 2006

Mark said:
A freestanding C implementation does, however, still have a C
compiler. I'm curious which processors Andrew Reilly claims "will
never have a C compiler", and why he makes that claim.

I don't think anyone will ever write a C Compiler for the
Atmel MARC4 [ http://www.atmel.com/products/MARC4/ ].

Andrew Reilly · Mar 12, 2006

I don't see much room for a "universal" assembler between C and a
traditional assembler, since the instruction sets can vary quite a
lot.

I think that a useful "universal assembler" would be something that had
the basic set of operators and types, all of which were well defined for a
particular machine model (flat data memory map, 2's complement arithmetic,
etc.) It could have expressions, as long as the operator precedence was
rigorous enough so that you could absolutely know what the order of
evaluation would be, at coding time.

The two or three most painful things about assembly language programming
are register allocation and making up control-flow symbol names (in
assemblers that don't already have nice structured control flow
macros/pseudo-ops). Both of these can be included in a "universal
assembler", if you forgo some pure control for convenience: conventional
control structures, subroutine calls that follow common conventions. The
machine instruction sets of Java's JVM and C#'s CLR (?) avoid the register
name issue by being stack-based (and muck up the memory model by being
object-centric). Tao's VM is more nearly a plain 32-bit RISC model, but
with an infinite number of registers, which are managed by the "assembler".
(The third painful thing is instruction scheduling, in super-scalar or
VLIW machines of various sorts. That would probably want to be subsumed
by the language "compiler" too.)

A data model, a set of operators, control flow, a syntax for building
abstractions and domain-specific sub-languages. That could almost be C
right there, except that there are too many holes in the data model and
operator function, both to support old/strange hardware, and to
allow/support compiler optimization transformations. Java has tightened
up the model, but it's not a model of a "bare processor", it's a model of
an "object machine". I'd like the same kind of low-level language
definition, but with objects only built using the language's
meta-programming/macro features, rather than being the only way to do
things.

Just dreaming...

Cheers,

Andrew Reilly · Mar 12, 2006

It's a little OT in c.l.c, but would you mind telling us just what
processors those are, that you can make such a guarantee? What
characteristics do they have that means they'll never have a C
compiler?

I'm thinking mainly of deeply embedded DSP processors, like those of the
TI TAS3000 family, or Analog Devices Sigma DSPs, or any of several
similar-scale engines from several Japanese manufacturers.

Small memory, sure. Strange word lengths (not really that much of a
problem for C, admittedly). Some of these things don't have pointers in
the usual sense, let alone subroutine call stacks. Their arithmetic
usually doesn't match C's (integer only, usually with saturation on
overflow, frequently with different word lengths for data, coefficient and
result).

(A few I can recall having been proposed are: tiny amounts of storage,
Harvard architecture, and lack of programmer-accessible stack. Funnily
enough, these are characteristics possessed by chips for which I compile
C code every day.)

Apart from the reasons that I mentioned, the biggest one is simply utility
and man-power. No-one is building C compilers for these things because
no-one could or would use one if it existed: the hardware is tuned to do a
particular class of (fairly simple) thing, and that's easy enough to code
up in assembler. Easier than figuring out how to write a C compiler for
it, anyway.

Cheers,

Michael N. Moran · Mar 12, 2006

Jordan said:
and, of course, a decent-sized library]

Off topic? Yes. But it bothers me when we confuse the
language with the supporting libraries.

--
Michael N. Moran (h) 770 516 7918
5009 Old Field Ct. (c) 678 521 5460
Kennesaw, GA, USA 30144 http://mnmoran.org

"So often times it happens, that we live our lives in chains
and we never even know we have the key."
The Eagles, "Already Gone"

The Beatles were wrong: 1 & 1 & 1 is 1

James Dow Allen · Mar 12, 2006

Dik said:
This was however not done on any of the 1's complement machines I have
worked with. The +0 preferent machines (CDC) just did not generate it
in general. ..

The CDC 6400 and 6600 used "complement recomplement arithmetic."
The only way to get -0 as the result of integer arithmetic was to start
with -0, i.e. (-0)+(-0) = -0 and (-0)-(+0) = -0.

The normal way to copy a B-register was to write, e.g. SB6, B5
(IIRC) which generated the same machine opcode as SB6, B5+B0.
(B0 was an always-zero register.) Hence SB6, B5 was not guaranteed
to copy B5 exactly! Instead SB6, B5-B0 should be coded.

Since there were fast tests for negative and zero, using both +0 and -0
as flags for testing was a micro-optimization sometimes useful for
speed.

James Dow Allen
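
The +0/-0 behaviour described above can be reproduced with a toy
model. This is only a sketch under stated assumptions: an 8-bit word
(the real machines were much wider), a subtract-based adder with
end-around borrow, and addition implemented as subtraction of the
complement. The names oc_sub and oc_add are invented for the example
and say nothing about the actual CDC hardware design.

```c
#include <stdint.h>

/* One's complement subtraction on a toy 8-bit word.
   0x00 is +0 and 0xFF is -0. A borrow out of the top bit is
   wrapped around and subtracted from the low end. */
uint8_t oc_sub(uint8_t a, uint8_t b)
{
    unsigned d = (unsigned)a - (unsigned)b; /* unsigned wraparound */
    if (a < b)
        d -= 1u;                            /* end-around borrow */
    return (uint8_t)(d & 0xFFu);
}

/* Addition done as a - (~b), i.e. subtracting the complement. */
uint8_t oc_add(uint8_t a, uint8_t b)
{
    return oc_sub(a, (uint8_t)~b);
}
```

In this model oc_add(0xFF, 0x00) gives 0x00, so "B5+B0" turns a -0
into +0, while oc_sub(0xFF, 0x00) gives 0xFF and preserves it. The
only addition that produces -0 is oc_add(0xFF, 0xFF).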

David Holland · Mar 13, 2006

>> On 2006-03-07 said:
>> On 2006-03-07 said:

>>> [...] but I'm sincerely curious whether anyone knows of an *actual*
>>> environment where p == s will ever be false after (p = s-1; p++).

>>
>> The problem is that evaluating s-1 might cause an underflow and a
>> trap, and then you won't even reach the comparison. You don't
>> necessarily have to dereference an invalid pointer to get a trap.
>>
>> You might hit this behavior on any segmented architecture (e.g.,
>> 80286, or 80386+ with segments on) ...

>
> I'm certainly no x86 expert. Can you show or point to the output
> of any C compiler which causes an "underflow trap" in this case?

Have you tried bounds-checking gcc?

I don't think I've ever myself seen a compiler that targeted 286
protected mode. Maybe some of the early DOS-extender compilers did,
before everyone switched to 386+. If you can find one and set it to
generate code for some kind of "huge" memory model (supporting
individual objects more than 64K in size) I'd expect it to trap if you
picked a suitable location for `s' to point to.

That assumes you can find a 286 to run it on, too.

Otherwise, I don't know of any, but I'm hardly an expert on strange
platforms.

(Note: Coherent was a 286 protected mode platform, but it only
supported the "small" memory model... and it had a K&R-only compiler,
so it's not a viable example.)

Richard Bos · Mar 13, 2006

Michael N. Moran said:
Jordan said:

and, of course, a decent-sized library]

Off topic? Yes. But it bothers me when we confuse the
language with the supporting libraries.

As long as we're talking about C, they are part of the same Standard.
You can get a freestanding implementation which is allowed not to
implement much of the Standard, but that doesn't make those parts any
less C.

Richard

Richard Bos · Mar 13, 2006

Jordan Abel said:
Overflow's undefined, so what's wrong here?

If the saturation also occurs for unsigned integers, you're going to
have a pain of a time implementing C's wraparound-on-unsigned-overflow
behaviour.

Richard
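
For readers who want the difference spelled out, a minimal sketch in
portable C. The function names are illustrative only. wrap_add is what
the Standard requires of unsigned arithmetic; sat_add is what a
saturating ALU would compute in a single instruction.

```c
#include <limits.h>

/* Required C behaviour: unsigned arithmetic is reduced modulo
   UINT_MAX + 1, so overflow wraps around to small values. */
unsigned wrap_add(unsigned a, unsigned b)
{
    return a + b;
}

/* Saturating behaviour: clamp to UINT_MAX instead of wrapping. */
unsigned sat_add(unsigned a, unsigned b)
{
    return (a > UINT_MAX - b) ? UINT_MAX : a + b;
}
```

wrap_add(UINT_MAX, 1) is 0, whereas sat_add(UINT_MAX, 1) stays at
UINT_MAX. A compiler for hardware that only saturates would have to
synthesize the first behaviour from the second for every unsigned
operation.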

Allan Herriman · Mar 13, 2006

Overflow's undefined, so what's wrong here?

I don't know what that is, so I have no idea how it would affect the
ability for there to be a conforming implementation

http://www.google.com.au/search?q=bit-reversed+addressing
(First hit)

Bit reversed addressing is just another addressing mode that's simple
to access from assembly language. Do you think that this could be
generated by a C compiler?

Bit reversed addressing is used in the calculation of an FFT (and
almost nowhere else).

Regards,
Allan

Paul Burke · Mar 13, 2006

Mark said:
A freestanding C implementation does, however, still have a C
compiler. I'm curious which processors Andrew Reilly claims "will
never have a C compiler", and why he makes that claim.

Motorola 4500- though some bugger will probably write one just to prove
me wrong.

Paul Burke

Andrew Reilly · Mar 13, 2006

http://www.google.com.au/search?q=bit-reversed+addressing
(First hit)

Bit reversed addressing is just another addressing mode that's simple
to access from assembly language. Do you think that this could be
generated by a C compiler?

Bit reversed addressing is used in the calculation of an FFT (and
almost nowhere else).

To be fair, though, I suspect that bit reversed addressing is a bit
over-rated, within the DSP community. If I were designing a new DSP
processor, I'd be very tempted to leave it out, unless the instruction
space, die space and cycle-time impact it introduced were
completely negligible. In many ordinary processors you can perform a
bit-reverse re-ordering in about 10% of the cost of performing the FFT
itself with ordinary instructions (see below), recursive counting code, or
a lookup table (for smallish FFT sizes, probably less for larger). FFTW
manages to hold many performance benchmark crowns while producing in-order
results on conventional processors. Besides which, not all FFT algorithms
produce results in bit-reverse order anyway.

As an example of the sort of algorithmic weirdness that is sometimes put
into hardware, and for which there is no good, let alone convenient, way
to express it in C, it's pretty good.

For the non-DSP-inclined, here's a simple expression of a bit-reverse
increment operation, in C:

unsigned int
bitrev_inc(unsigned int i, unsigned int N)
{
    return (N & i) ? bitrev_inc(i ^ N, N >> 1) : i ^ N;
}

That one needs to be called with N = bins/2 where bins is the size of the
FFT, a power of 2. It's not especially efficient, but it is at least a
single pure function, and GCC does a good job on the tail recursion. An
iteration over an array could use i = bitrev_inc(i, bins/2) as an index
increment operation. It could be coded iteratively as a loop around
while(N & i), but that seems to be even more of a stretch for a compiler
to recognize as simply the invocation of an addressing mode.
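
The iterative form mentioned above might look like the following
sketch. The name bitrev_inc_iter is made up here; the logic is a
direct unrolling of the recursive function, with the same calling
convention (N = bins/2, bins a power of 2).

```c
/* Bit-reversed increment, iterative form.
   Clears set bits from the top down (a carry propagating in the
   reversed direction), then sets the first clear bit it finds.
   If every bit was set, N reaches 0 and the index wraps to 0. */
unsigned int
bitrev_inc_iter(unsigned int i, unsigned int N)
{
    while (N & i) {
        i ^= N;
        N >>= 1;
    }
    return i ^ N;
}
```

For an 8-point FFT (N = 4), starting from 0 and applying
i = bitrev_inc_iter(i, 4) repeatedly visits 0, 4, 2, 6, 1, 5, 3, 7 and
then returns to 0.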

Cheers,

Dik T. Winter · Mar 13, 2006

> I would agree that if an assembler must be a one-to-one mapping from
> source line to opcode, then C doesn't fit. I just don't agree with that
> definition of assembler.

On the other hand for every machine instruction there should be a
construct in the assembler to get that instruction. With that in
mind C doesn't fit either.

S.Tobias · Mar 13, 2006

[ F'ups set to c.l.c. - please reset if other groups are interested too. ]

In comp.lang.c Keith Thompson said:
Andrew Reilly said:

Question: If the C Standard guarantees that for any array a, &a [-1]
should be valid, should it also guarantee that &a [-1] != NULL

Probably, since NULL has been given the guarantee that it's unique in some
sense. In an embedded environment, or assembly language, the construct

....

How exactly do you get from NULL (more precisely, a null pointer
value) being "unique in some sense" to a guarantee that &a[-1], which
doesn't point to any object, is unequal to NULL?

The standard guarantees that a null pointer "is guaranteed to compare
unequal to a pointer to any object or function". &a[-1] is not a
pointer to any object or function, so the standard doesn't guarantee
that &a[-1] != NULL.

....

By the same logic, ptr to just past the end of an array (which, of course,
does not point to an object), _can_ compare equal to null ptr? ;-)
- Of course not, semantics for "==" guarantee that both ptrs must be
null (among others) to compare equal.

AFAIK the one-past-the-end ptrs were introduced to enable certain
idioms, and as such are not a necessity in the language:
while (*dest++ = *src++) ; /* strcpy-like */
IMHO the language could also use the concept of one-before-the-start
pointers for similar purposes, but it doesn't for practical (and good)
reasons. Here's what C99 Rationale says (6.5.6):

: In the case of p-1, on the other hand, an entire object would have
: to be allocated prior to the array of objects that p traverses,
: so decrement loops that run off the bottom of an array can fail.
: This restriction allows segmented architectures, for instance, to
: place objects at the start of a range of addressable memory.
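
A decrement loop does not need a one-before-the-start pointer at all.
The sketch below (find_last is a hypothetical helper, not a library
function) starts from the one-past-the-end pointer, which is always
valid to form, and decrements before each access, so a - 1 is never
computed.

```c
#include <stddef.h>

/* Return a pointer to the last element of a[0..n-1] equal to key,
   or NULL if there is none. Only pointers in the range a .. a + n
   are ever formed. */
const int *find_last(const int *a, size_t n, int key)
{
    const int *p = a + n;   /* one past the end: allowed */

    while (p != a) {
        --p;                /* now points at a real element */
        if (*p == key)
            return p;
    }
    return NULL;
}
```

The loop test compares against a itself rather than against a - 1, so
it works even when the array sits at the very start of a segment.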

James Beck · Mar 13, 2006

Jordan said:
Jordan said:

On Wed, 08 Mar 2006 18:07:45 -0700, Al Balmer wrote:

I spent 25 years writing assembler.

Yeah, me too. Still do, regularly, on processors that will never
have a C compiler.

It's a little OT in c.l.c, but would you mind telling us just what
processors those are, that you can make such a guarantee? What
characteristics do they have that means they'll never have a C
compiler?

(A few I can recall having been proposed are: tiny amounts of
storage, Harvard architecture, and lack of programmer-accessible
stack. Funnily enough, these are characteristics possessed by
chips for which I compile C code every day.)

"tiny amounts of storage" may preclude a conforming hosted
implementation [which must support an object of 65535 bytes, and,
of course, a decent-sized library]

These machines may well have C compilers, just not conforming
ones. The areas of non-conformance are likely to be:

object size available
floating point arithmetic
recursion depth (1 meaning no recursion)

availability of long and long-long.

Or even the range of int itself. [People have claimed that "c"
implementations exist with 8-bit int]

I don't fully understand what you are saying here.
Please elaborate.
At first I thought maybe you meant that you can find no documented case
of an implementation that has an 8 bit int, then I thought, well MAYBE,
he means that it aint 'C' if it has an 8 bit int........

Jim

Walter Roberson · Mar 13, 2006

On the other hand for every machine instruction there should be a
construct in the assembler to get that instruction. With that in
mind C doesn't fit either.

Over the years, there have been notable cases of "hidden" machine
instructions -- undocumented instructions, quite possibly with no
assembler construct (at least not in any publicly available
assembler.)

Ed Prochak · Mar 13, 2006

Michael said:
Sure, if "most machines" excludes load/store architectures, and
machines which cannot operate directly on an object of the size of
whatever x happens to be, and all the cases where "x" is a pointer to
an object of a size other than the machine's addressing granularity...

I suppose you could argue that "can" in your claim is intended to be
weak - that, for "most machines" (with a conforming C implementation,
presumably), there exists at least one C program containing the
statement "x++;", and a conforming C implementation which will
translate that statement to a single machine instruction.

But that's a very small claim. All machines "can" map that statement
to multiple instructions as well; many "can" map it to zero
instructions in that sense (taking advantage of auto-increment modes
or the like). What can happen says very little about what will.

The presence in C of syntactic sugar for certain simple operations
like "x++" doesn't support the claim that C is somehow akin to
assembler in any case. One distinguishing feature of assembler is
a *lack* of syntactic sugar. (Macros aren't a counterexample
because they're purely lexical constructs; in principle they're
completely separate from code generation.)

C isn't assembler because:

- It doesn't impose a strict mapping between (preprocessed) source
and generated code. The "as if" clause allows the implementation
to have the generated code differ significantly from a strict
interpretation of the source acting on the virtual machine.

- It has generalized constructs (expressions) which can result in
the implementation generating arbitrarily complex code.

C is an assembler because

-- It doesn't impose strict data type checking, especially between
integers and pointers.
(While there has been some discussion about cases where conversions
back and forth between them can fail, for most machines it works. Good
thing too or some OS's would be written in some other language.)

-- datatype sizes are dependent on the underlying hardware. While a lot
of modern hardware has formed around the common 8bit char, and
multiples of 16 for int types (and recent C standards have started to
impose these standards), C still supports machines that used 9bit char
and 18bit and 36bit integers. This was the most frustrating thing for
me when I first learned C. It forces precisely some of the hidden
assumptions of this topic.

-- C allows for easy "compilation" in that you could do it in one pass
of the source code (well two counting the preprocessor run). The
original C compiler was written in C so that bootstrapping onto a new
machine required only a simple easily written initial compiler to
compile the real compiler.

-- original versions of the C compiler did not have passes like
data-flow optimizers. So optimization was left to the programmer. Hence
things like x++ and register storage became part of the language.
Perhaps they are not needed now, but dropping these features from the
language will nearly make it a different language. I do not know of
any other HLL that has register, but about every assembler allows
access to the registers under programmer control.
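
On the datatype-size point in the list above: one way to keep such
assumptions from staying hidden is to state them in the source so the
build fails where they do not hold. The following is only a sketch;
the typedef name and the helper value_bits are invented for the
example, and the particular assumptions (8-bit char, int of at least
32 bits) are placeholders for whatever a given program really needs.

```c
#include <limits.h>

/* Refuse to compile on machines with, say, 9-bit char. */
#if CHAR_BIT != 8
#error "this code assumes 8-bit char"
#endif

/* C89-style compile-time check: the array size becomes -1, which
   the compiler must reject, if int is narrower than 32 bits. */
typedef char assume_int_at_least_32_bits[(INT_MAX >= 2147483647) ? 1 : -1];

/* Count the value bits needed to represent max, e.g. UINT_MAX. */
int value_bits(unsigned long max)
{
    int n = 0;

    while (max != 0) {
        n++;
        max >>= 1;
    }
    return n;
}
```

value_bits(UCHAR_MAX) always equals CHAR_BIT, and value_bits(UINT_MAX)
reports how wide unsigned int really is on the machine at hand.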

So IMHO, C is a nice generic assembler. It fits nicely in the narrow
world between hardware and applications. The fact that it is a decent
application development language is a bonus. I like C, I use it often.
Just realize it is a HLL with an assembler side too.

Ed
