Zero overhead overflow checking


Phil Carmody

Chris Dollin said:
The C standard specifies that unsigned arithmetic wraps around and
does not "overflow"; there's nothing to check and no room to manoeuvre.

At least n869.txt *does not directly define* what it means by
overflow, nor, it appears, does n1256.txt. However, I think it's
clear that many people, and at least one large microprocessor
manufacturer, consider that trying to fit a large thing into a
smaller thing, such that it doesn't fit, can be called "overflow".

If you accept it as valid usage of the word overflow, then clearly
C's unsigned types can overflow when being shifted to the left, as
the standard defines the resulting value of E1<<E2 in terms of the
*mathematical* value E1 * 2^E2 which it reduces modulo an appropriate
number. That mathematical value may not fit into a variable of the
desired type, and therefore I think it's fair to consider that the
operation has involved an overflow, albeit one which has precisely
defined numerical semantics.
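
As a minimal illustration of that reading, assuming a 32-bit unsigned
int: the mathematical value 2^35 does not fit, and the defined result
is that value reduced modulo 2^32.

#include <stdio.h>

int main(void)
{
    unsigned int x = 1u;              /* assume 32-bit unsigned int          */
    unsigned int y = x << 31;         /* 2^31 still fits: 2147483648         */
    unsigned int z = (x << 31) << 4;  /* mathematically 2^35; reduced modulo
                                         2^32, so the stored value is 0      */

    printf("%u %u\n", y, z);          /* prints "2147483648 0"               */
    return 0;
}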

It's clear that the C standardisation committee do not support this
use of the word overflow, given their blanket assertion about it not
applying to these operations on unsigned types.

In contrast, it's also clear that Intel does support this use of
the word when they describe the semantics of the overflow bit.

In the face of such huge opposition, it might be prudent to make
sure that no ambiguity is possible, by defining what is meant by
the term in advance of first use.

Given my cross-post location, I hope this will be given formal
consideration. Should there be anything else I need to do, I will
do so.

Phil
 

jacob navia

Eric Sosman a écrit :
The "overflow flag" that your implementation rests upon was
present in the original 8086 thirty-plus years ago, was even then
a conscious imitation of still earlier designs, and as far as I can
see has not changed materially since that time. New instructions
that can set, clear, and test the flag may have been added in the
meantime -- but the overflow flag itself is a relic of the far past,
a vermiform appendix that some more modern designs have chosen to
do without.

This is just not true. Take IBM's Cell processor, a modern
RISC architecture for parallel processing. It has (just like
Intel's x86/x64) a flags register in which the executed
instruction records its status. True, on the Cell you can do
without it by using the versions of the instructions that do NOT
set the flags, but you can use the flags if you want.

A processor that can't report overflow is unusable. If I propose
to check for overflow in C, it is because the language has a hole
here: overflow checking is necessary to avoid getting GARBAGE out
of your calculations.

If you think the DEC alpha is the "state of the art" in hardware
processing please think again.
 

Dag-Erling Smørgrav

jacob navia said:
Which ones? I mean short and chars promote to int as far as
I know.

short a = 20000;
short b = 20000;

a += b;

The addition won't overflow, because both operands are promoted to int;
it's the conversion back to short on assignment that will (assuming a
16-bit short and a wider int).
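
Spelling the same example out step by step, as a minimal sketch
assuming a 16-bit short and a 32-bit int:

#include <stdio.h>

int main(void)
{
    short a = 20000;
    short b = 20000;

    int sum = a + b;   /* both operands promote to int; 40000 fits, so the
                          addition itself does not overflow                  */
    a = (short)sum;    /* 40000 does not fit in a 16-bit short: the conversion
                          gives an implementation-defined result (or, in C99,
                          raises an implementation-defined signal)           */

    printf("sum = %d, a = %d\n", sum, (int)a);
    return 0;
}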

DES
 

Nick Keighley

"almost-zero" isn't "zero overhead"
:)

It must be great being a cpu designer, being able to create novel new
hardware incompatible with anything in the past, present or future,
and apparently not caring whether it's compatible with any software
either.

pretty rare I'd have thought these days. Intel's processors wouldn't
look the way they do if they had a clean sheet.
That all seems to be perfectly acceptable. But when it comes to
languages, we're only allowed to have this single, monolithic super-
language that must run on any conceivable hardware, from the lowliest
crummy microprocessor up to supercomputers, even though there are
several evidently different strata of application areas.

that's what C does, yes. If you want Java...
Odd, isn't it. (And I'm talking about C, since I don't know of any
mainstream languages quite like it.)

I'd always thought C was a mainstream language.
Personally I wouldn't have a problem with, say, a C-86 language that
is targeted at x86-class processors.

yuk. In that case rename it C-86 or something. Or even better, give
it a completely different name. EightySixScript or something. Script-86.
Gosh the least portable HLL ever. 40 years of computer science
and computer engineering, vanished like a soap bubble.
It would make a lot of things a lot simpler.

and a few things that some people think are important much harder.
And people who use other processors can have their own
slightly different version.
bleah

C is after all supposed to work at the
machine level; finally it will know exactly what that machine is!

Read some computer science books until you understand the term
"abstraction"


--
"Programs must be written for people to read, and only
incidentally for machines to execute."
- Abelson & Sussman, Structure and Interpretation of Computer Programs

In the development of the understanding of complex phenomena,
the most powerful tool available to the human intellect is
abstraction. Abstraction arises from the recognition of similarities
between certain objects, situations, or processes in the real world
and the decision to concentrate on these similarities and to ignore,
for the time being, their differences.
- C.A.R. Hoare
 

Nick Keighley

Yet, hardware created around a SPARC processor presumably won't work
with an ARM? Somebody made a decision to use a specific set of
hardware, requiring different circuitry, peripherals, power supply,
different manuals and expertise, the software however must work,
unchanged, across the lot?

that's kind of the point. That's what device drivers are all about.
I can run the same software on my new(ish) laptop that I ran on my
old computer. The manufacturers of computers, OSs and their drivers
go to a fair bit of trouble to make the hardware invisible. There's
a reason for this.

People who develop embedded software (the hidden software in your
phone, DVD player, washing machine, car etc. etc.) can test much of
it on bog-standard desktops even though it's going to run on some
weird hardware with a chip named after an Acorn.

The software can be tested *before* the hardware even exists.

C portability makes this easier.

OK, fair enough. I'm sure the C code driving that 486 monitoring the
IBM servers that someone mentioned, will also work unchanged driving
the Dec Alpha monitoring Fuji servers instead (I've no idea what this
stuff does, and I suspect Fuji actually make film stock...).

Fuji make (or used to make) computers. They make an awful lot of
stuff.
My point is that C software can be considered an integral part of a
system and therefore can be allowed to be specific to that system in
the same way the bits of hardware can be. I.e., not just doing a
specific job but taking advantage of known characteristics of the
processor.

But it's easier if it doesn't. Oh, some parts will be hardware
specific. But they should be restricted to small localised parts of
the software (think drivers again). The bulk of the application can
be tested sans hardware. Handy if the real hardware isn't real
portable or failure isn't an option. (I suspect avionics software is
tested on the ground first).
 

Nick Keighley

     Isn't that the "counted string" you were so vigorously
promoting just a week or two ago?  My, but how fashions change!
Ah, but that's a rant already ranted.

ooh! So Navia strings use a 24-bit or 32-bit (or, I suppose, 64-bit)
count value. Let's assume it's an int.

So on a 32-bit architecture.

Nstring s = NS_make ("a");

s takes 5 bytes
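
For what the 5-byte figure implies, here is a purely hypothetical
layout (not Navia's actual definition, just a sketch consistent with
the arithmetic): a 32-bit count followed by the characters, with no
terminating NUL.

#include <stdint.h>

/* Hypothetical counted-string layout, for illustration only. */
struct NString {
    uint32_t len;    /* 4-byte count on a 32-bit architecture */
    char     data[]; /* 'len' characters, no terminating NUL  */
};

/* NS_make("a") would then need 4 + 1 == 5 bytes for the count
   plus the single character.                                  */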
 

Eric Sosman

jacob said:
Eric Sosman a écrit :

This is just not true.

"Some more modern designs have chosen to do without [flags]"
is "just not true?" Every CPU designed in the last three decades
has arithmetic condition flags in its architecture? You're sure
of this, are you?
 

Bart

pretty rare I'd have thought these days. Intel's processors wouldn't
look the way they do if they had a clean sheet.

Someone mentioned hundreds of embedded processors for each advanced
processor. I guess these must all be a little different.
that's what C does, yes. If you want Java...

C does it with penalties, such as making some kinds of programming a
minefield because this won't work on processor X, and that is a wrong
assumption for processor Y, even though this application is designed
to run only on processor Z.
I'd always thought C was a mainstream language.

I didn't say it wasn't. Just that there are no others that I know of
which are like C and are mainstream (though presumably there are
plenty of private or in-house ones, like one or two of mine).
yuk. In that case rename it C-86 or something. Or even better, give
it a completely different name. EightySixScript or something. Script-86.
Gosh the least portable HLL ever. 40 years of computer science
and computer engineering, vanished like a soap bubble.

But this doesn't apply to hardware? Why can't that abstraction layer
that you mention a bit later be applied to C-86?
and a few things that some people think are important much harder.

Have a look at C#'s basic types:

Byte: 8 bits; Short: 16 bits; Int: 32 bits; Long: 64 bits. Now try
and get the same hard and fast facts about C's types, you can't! It's
like asking basic questions of a politician.

That's one thing that would be simpler; what sort of things would be
harder (other than the obvious one of running on a system where these
type sizes are all different)?
 

Bart

Would your hypothetical C-86 language have enough advantages for
x86-specific code to make up for the fact that it wouldn't work *at
all* on anything else?

C-86 might be as simple as a bunch of assumptions (such as basic type
sizes, alignment needs, byte-order and so on). And I guess some people
may be programming in C-86 already!
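
One way to read "a bunch of assumptions" is a header that pins them
down at compile time; a hedged sketch (the file name is invented, and
_Static_assert is C11; older compilers would use the usual
negative-array-size trick instead):

/* c86_assumptions.h -- hypothetical header asserting what a "C-86"
   dialect would take for granted.                                   */
#include <limits.h>

_Static_assert(CHAR_BIT == 8,          "8-bit bytes assumed");
_Static_assert(sizeof(short) == 2,     "16-bit short assumed");
_Static_assert(sizeof(int) == 4,       "32-bit int assumed");
_Static_assert(sizeof(long long) == 8, "64-bit long long assumed");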

As for portability, you'd have to ask them; it may not be a big deal
if they know their product will have to run on x86 for the next few
years.

Or perhaps C is used for (firmware for) a processor inside a phone
say, but that phone will be produced in the millions. Surely it must
help (in programmer effort, performance, code size, any sort of
measure except actual portability) to have a tailored C version for
that system.

(When it comes to downloadable software, that is a different matter,
and perhaps a more adept language is needed, or just a non-specific
C!)
 

Dag-Erling Smørgrav

Bart said:
Or perhaps C is used for (firmware for) a processor inside a phone
say, but that phone will be produced in the millions. Surely it must
help (in programmer effort, performance, code size, any sort of
measure except actual portability) to have a tailored C version for
that system.

You'd be surprised how many simply use gcc; many microcontroller vendors
port gcc to every new chip (or pay someone to do it for them).

DES
 

jacob navia

Dag-Erling Smørgrav a écrit :
You'd be surprised how many simply use gcc; many microcontroller vendors
port gcc to every new chip (or pay someone to do it for them).

DES

There is no way gcc can function correctly for small architectures.
Obviously there are many ports of it to many architectures, most of
them full of bugs.

The gcc source code is around 12-15 MB. To rewrite a BIG
part of this for a small microprocessor is like trying to kill a fly
with an atomic bomb.

Yes, maybe the fly dies, but it could fly away before the missile arrives.

:)

I am biased, of course, just as you are biased towards gcc. I just
want to restore a sense of proportion. Porting gcc to a new
architecture and debugging the resulting code is a job of several
years before everything is debugged and fixed.

And no, it can't be done by a single person.
 

Chris Dollin

jacob said:
Dag-Erling Smørgrav a écrit :

There is no way gcc can function correctly for small architectures.

That's an interesting claim; what's your reasoning? [I read Dag as
saying that the gcc /back-end/ has been ported and that gcc is
being used as a cross-compiler.]
Obviously there are many ports of it to many architectures, most of
them full of bugs.

Isn't "full" so ridiculous a claim as to weaken what case you have?
The way I'd use "full of bugs" would make a product essentially
useless. Is that the claim you're making?
The gcc source code is around 12-15 MB. To rewrite a BIG
part of this for a small microprocessor is like trying to kill a fly
with an atomic bomb.

/Is/ it a BIG part that needs to be rewritten?
I am biased, of course, just as you are biased towards gcc. I just
want to restore a sense of proportion. Porting gcc to a new
architecture and debugging the resulting code is a job of several
years before everything is debugged and fixed.

That's going to depend heavily on how new a "new" architecture
is, yes?

--
"My name is Hannelore Ellicott-Chatham. I *end messes*." Hannelore,
/Questionable Content/

Hewlett-Packard Limited   registered no: 690597 England
registered office: Cain Road, Bracknell, Berks RG12 1HN
 

Bart

jacob said:
Eric Sosman a écrit :
This is just not true.

     "Some more modern designs have chosen to do without [flags]"
is "just not true?"  Every CPU designed in the last three decades
has arithmetic condition flags in its architecture?  You're sure
of this, are you?

Knuth's MMIX architecture also makes use of integer overflow (in the
form of a trip flag). So he seems to think it's still worthwhile.
However he also acknowledges he had help from the "people at [Dec?]
Alpha", so perhaps it's fortunate overflow handling didn't go out the
window.
 

Keith Thompson

jacob navia said:
A processor that can't report overflow is unusable. If I propose
to check for overflow in C, it is because the language has a hole
here: overflow checking is necessary to avoid getting GARBAGE out
of your calculations.

If you think the DEC alpha is the "state of the art" in hardware
processing please think again.

The DEC Alpha certainly can detect overflow. It just uses a
different mechanism than the x86.

Don't confuse the ability to detect overflow (something that I
agree any CPU should have, and I don't know of any that can't)
with the particular mechanism of a dedicated flag.

I don't think your recent proposal actually depends on that
particular mechanism, so I don't know what the argument is about
anyway. If your proposed feature would impose some extra overhead
on systems that use different overflow detection mechanisms (and I
don't know that it would), I don't think that's a serious problem,
as long as the overhead is not too bad and can be avoided by
disabling checking.

Think of the DEC Alpha as just one example of a CPU that, while it's
perfectly capable of detecting overflow, doesn't happen to use the
same mechanism as the CPUs you usually work with. There are likely
other examples, and there are likely to be yet more in the future.
The fact that DEC has been acquired by another company is hardly
relevant.
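
For what it's worth, the check itself needs no particular hardware
mechanism at all. A minimal sketch of a portable signed-addition
check (the function name is mine, not part of jacob's proposal):

#include <limits.h>
#include <stdbool.h>

/* Store a + b in *result and return true if the sum is representable
   as an int; return false on overflow. No flag register, no trap,
   and no undefined behaviour involved.                               */
static bool add_int_checked(int a, int b, int *result)
{
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b))
        return false;
    *result = a + b;
    return true;
}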
 

Keith Thompson

Francis Glassborow said:
and assuming a 16-bit int, the addition will overflow.

However this raises the whole issue of narrowing conversions. They are
allowed in C but what should happen if the converted value loses
information?

Note that, unlike overflow as a result of a computation, there is no
implicit problem with narrowing integer conversions: just drop the
high bits (on a sign-and-magnitude machine you will need to retain the
sign bit).

The standard says overflow on a signed integer computation
invokes undefined behavior, but overflow on conversion either
yields an implementation-defined result or (in C99) raises an
implementation-defined signal.

I've never understood why the language treats these two kinds of
overflow differently.

But if the language were to add a mechanism for handling overflows,
I'd want it to apply to both arithmetic operations and conversions.
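
The two cases being contrasted, side by side, as a sketch assuming a
32-bit int and a 16-bit short (the function exists only to illustrate
the rules; the first assignment is deliberately the undefined case):

#include <limits.h>

void overflow_contrast(void)
{
    int   i = INT_MAX;
    short s;

    i = i + 1;        /* overflow in a signed computation: undefined behaviour */
    s = (short)70000; /* overflow on conversion: implementation-defined result,
                         or (in C99) an implementation-defined signal           */
    (void)i;
    (void)s;
}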
 

jacob navia

Keith Thompson a écrit :
The standard says overflow on a signed integer computation
invokes undefined behavior, but overflow on conversion either
yields an implementation-defined result or (in C99) raises an
implementation-defined signal.

I've never understood why the language treats these two kinds of
overflow differently.

But if the language were to add a mechanism for handling overflows,
I'd want it to apply to both arithmetic operations and conversions.


In principle you are right but in practice...

There are SO many places where overflow in integer conversions is
assumed that the whole thing would be unusable.

int i;
char p[12];

p[1] = i;

assuming that the compiler will do the equivalent of
p[1]=i&0xff;

The checking could be done, of course, but I would make it a different
proposal, with a different name. In any case, for the language the two
overflows are NOT the same.
 

Beej Jorgensen

Bart said:
Byte: 8 bits; Short: 16 bits; Int: 32 bits; Long: 64 bits. Now try
and get the same hard and fast facts about C's types, you can't!

I'm going to go with:

Byte int_least8_t
Short int_least16_t
Int int_least32_t
Long int_least64_t
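
In code, that mapping is one #include away; a minimal sketch assuming
a C99 <stdint.h> (the exact-width int8_t/int16_t/int32_t/int64_t are
also there where the implementation provides them):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int_least8_t  b = 100;                    /* at least  8 bits */
    int_least16_t s = 20000;                  /* at least 16 bits */
    int_least32_t i = 2000000000;             /* at least 32 bits */
    int_least64_t l = 9000000000000000000LL;  /* at least 64 bits */

    printf("%d %d %ld %lld\n", (int)b, (int)s, (long)i, (long long)l);
    return 0;
}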

-Beej
 

Keith Thompson

jacob navia said:
Keith Thompson a écrit :
The standard says overflow on a signed integer computation
invokes undefined behavior, but overflow on conversion either
yields an implementation-defined result or (in C99) raises an
implementation-defined signal.

I've never understood why the language treats these two kinds of
overflow differently.

But if the language were to add a mechanism for handling overflows,
I'd want it to apply to both arithmetic operations and conversions.

In principle you are right but in practice...

There are SO many places where overflow in integer conversions is
assumed that the whole thing would be unusable.

int i;
char p[12];

p[1] = i;

assuming that the compiler will do the equivalent of
p[1]=i&0xff;

Sure, there's plenty of code that *assumes* they're equivalent.

They're not, and programmers have had 20 years warning that you
can't make that assumption.

I think there's even more code that assumes that the value of i is
in the range CHAR_MIN..CHAR_MAX; checking would detect cases where
that assumption is incorrect due to a programming error.
The checking could be done, of course, but I would make it a different
proposal, with a different name. In any case, for the language the two
overflows are NOT the same.

I disagree.
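
What the range check described above might look like in practice, as
a minimal sketch (the helper name is invented for illustration):

#include <limits.h>
#include <stdbool.h>

/* Store i into *dest only if it is representable as a char;
   report failure instead of silently truncating.            */
static bool store_char_checked(char *dest, int i)
{
    if (i < CHAR_MIN || i > CHAR_MAX)
        return false;          /* would not fit the narrower type */
    *dest = (char)i;
    return true;
}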
 

jacob navia

Francis Glassborow a écrit :
My immediate reaction is 'rubbish'. The whole raison d'etre of C was to
enable quick porting of unix to new hardware. The only new stuff needed
is the code generator and I see no reason that should not be the work of
a single person in a relatively short time.

For a compiler like lcc it took me at least 8 months to get
some confidence in the modifications I was making to the code
generator. It took me much longer to fully understand the
machine description and to be able to write new rules
for it.

lcc's code was 250K or so. It has VERY GOOD DOCUMENTATION.

gcc's code is 15MB or more, with confusing documentation.

With all respect, I disagree with you.

But we are drifting away from the main subject of this
discussion, which was the overflow-checking proposal.

Let's agree, then, that we disagree on this point.

:)
 
