Zero overhead overflow checking

Hallvard B Furuseth

I said:
char and short arithmetic can overflow, if they are as wide as int so
promotion to int does not protect from overflow. (In the case of char,
that means char must be at least 16 bits wide.)

Duh, sorry. They get promoted to int first, of course.
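
To make the promotion rule concrete, here is a minimal sketch (the helper names are made up for illustration). Both short operands are converted to int before the addition, so the sum is computed in int and can only lose information when it is converted back to a narrower type. The value check in the usage note assumes int is wider than short, which holds on mainstream platforms but is not guaranteed by the standard.

```c
#include <limits.h>

/* Hypothetical helper: the operands are promoted to int before the
   addition, so the result has type int, not short. */
int add_shorts(short a, short b)
{
    return a + b;
}

/* Size of the type of (short + short); after promotion this is the
   size of int. */
unsigned long promoted_size(void)
{
    short a = 1, b = 2;
    return (unsigned long)sizeof(a + b);
}
```

On a platform where int is wider than short, add_shorts(SHRT_MAX, 1) simply yields SHRT_MAX + 1 with no overflow.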
 
jacob navia

christian.bau wrote:
Some comments:

1. In addition to "off" and "on", the pragma could have a third
setting "restore" which will restore to the state before the previous
"on" or "off", allowing this to be nested. So if someone doing
encryption wants no overflow checking, they put "off" and "restore"
around their code, and after that code we are back to the initial
setting.

Lcc-win offers that for ALL pragmas it supports. The syntax is:

#pragma something(push,on)

#pragma something(pop)

Microsoft's compiler has offered this for years, at least since
the 32-bit versions. I have generalized it to all pragmas. But that
is ANOTHER proposal. :)
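
As an illustration of the push/pop style, here is a minimal sketch using #pragma pack, an existing pragma that MSVC, GCC and Clang all accept with push and pop. It is not the overflow pragma under discussion, only an example of a pragma with a stack of settings; the struct and function names are made up.

```c
#include <stddef.h>

#pragma pack(push, 1)      /* save the current packing, then pack to 1 byte */
struct packed_pair {
    char tag;
    int  value;            /* no padding is inserted before this member */
};
#pragma pack(pop)          /* restore whatever packing was in effect before */

struct default_pair {      /* declared under the restored setting */
    char tag;
    int  value;
};

size_t packed_size(void)  { return sizeof(struct packed_pair); }
size_t default_size(void) { return sizeof(struct default_pair); }
```

The code between push and pop sees the changed setting; everything after the pop is back to the earlier state, whatever it was.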
2. Instead of specifying what will happen, there could be wording like
"this pragma is intended to have documented behavior that can be
noticed and can be used to find problems instead of producing
undefined behavior when an overflow happens". If the compiler can
guarantee that any overflow will trap into the debugger, that would be
something I would find useful. This would also be more efficient with
processors that have a "sticky" overflow flag.

That is too vague to be really useful. A defined API and a defined
type are more concrete and allow for more portable code.
3. This should handle as many overflow situations as possible. For
example signed left shifts, pointer overflow (if p is a pointer and i
is an int, then one would expect p+i > p if i > 0 and p+i < p if
i < 0. In practice this happens only for a limited range of integers i.
Adding i outside that range would be overflow).

I am not sure whether (p+i) is evaluated as p + (unsigned)i. The relevant part of the
standard (6.5.6) doesn't require this, so I think we could add your
suggestion.


The addition of a pointer with an integer should be checked.
5. I would have suggested an operator overflow () ("operator" in the
sense that "sizeof" is an operator): The "overflow" operator has a
single argument which is an expression; it evaluates the expression
including side effects as other operators would do and yields 1 if any
of the operators in the source code of the expression produced an
overflow, 0 otherwise. For example:

if (overflow (x = a+b)) printf ("Calculation produced overflow\n");
else printf ("Sum of a and b is %d\n", x);

If the expression contains function calls, the function calls would
not be checked.

As in the "pragma" version, I think this is easiest to implement if
the compiler generates tokens like "checked-operator-plus" instead of
"operator-plus", depending on the situation. This handles things like
an inline function compiled with overflow checking when overflow
checking is disabled where it is called; this would perform checks for
the inline function even when it is inlined.
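
No such overflow operator exists in standard C. As a rough approximation for a single addition, GCC and Clang provide the builtin __builtin_add_overflow, which stores the wrapped result and returns true if the mathematical result did not fit. The sketch below assumes one of those compilers; try_add and print_sum are names invented here.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper approximating "overflow (x = a+b)" for one addition.
   Requires GCC or Clang: __builtin_add_overflow is a compiler extension. */
bool try_add(int a, int b, int *x)
{
    return !__builtin_add_overflow(a, b, x);   /* true when the sum fits */
}

void print_sum(int a, int b)
{
    int x;
    if (!try_add(a, b, &x))
        printf("Calculation produced overflow\n");
    else
        printf("Sum of a and b is %d\n", x);
}
```

C23 later standardized a similar facility as ckd_add in <stdckdint.h>.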

1. Your proposal is very good when you are interested in checking a
single expression. It is bad for checking a whole program.

2. I think your approach is complementary to mine. But it is a much
bigger change in the language.

Given the opposition my proposals have met, I have greatly
reduced their scope. In principle your approach is very good and
I would support it.
 
jacob navia

Eric Sosman wrote:
Ugh. That means the source code needs a different version
of the overflow handler for every implementation it might run on.
Sounds like a return to the pre-ANSI days, with separate #ifdef
blocks for every compiler known to Man.

ONLY if you want to handle those extra arguments!

If you want maximum portability you stick to the 3 arguments
specified in the standard. If you handle the optional arguments,
then the code is of course no longer portable. The extra arguments are there
to provide flexibility to the implementors of this.
 
jacob navia

Eric Sosman wrote:
Yes (except that char might promote to unsigned int). But
<stdint.h> can define types that are wider than int, and these
types do not promote at all. Arithmetic on int39_t values is
done (as if) in 39-bit arithmetic, not in promoted-to-64-bit
arithmetic. (This has implications for detection of overflow,
if the underlying hardware uses 64-bit arithmetic to simulate
39-bit operations.)

Actually, the overflow checking should be done for all integer
types with bit size >= int.
Is that enough? You say that if the overflow handler returns,
execution proceeds as if the overflowing operation had produced
some implementation-defined value. But if there are several
possible overflow sites in the same expression, and you have
only one "bail out" label per line, how can you know where in
the expression execution should resume?

x = a * f() >= 0 ? b * g() : c * h();

You do not have access to that information. If you want to
know which expression overflows you break that expression into
several lines. Some implementations could generate code that
would store the variables being checked and would pass them on to you
so you would get in your handler (in the extra arguments that are
provided) the whole expression and the position within that
expression.

You see why the extra arguments are useful?

The standard would prescribe the minimum; implementations could do
much more with implementation-specific flags.
The choice of "What next?" -- and the observable side-effects
of that choice -- cannot be figured out based only on knowing
that the overflow occurred in line 42.

You can very easily break up the expression. Besides, with this
proposal you know at least that there was an overflow.

With the current situation you just get a wrong result with
no warning whatsoever!
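
Breaking up the expression from the example could look roughly like the sketch below. It uses the GCC/Clang builtin __builtin_mul_overflow in place of the proposed pragma, and the function calls of the original are replaced by plain parameters; the struct and the site codes are inventions for illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* Which multiplication overflowed: 0 = none, 1 = a*f, 2 = b*g, 3 = c*h. */
struct checked_result {
    int value;
    int site;
};

/* Hypothetical decomposition of  x = a * f >= 0 ? b * g : c * h;
   with one statement per multiplication, so each has its own check.
   Requires GCC or Clang for __builtin_mul_overflow. */
struct checked_result select_product(int a, int f, int b, int g, int c, int h)
{
    struct checked_result r = { 0, 0 };
    int t;

    if (__builtin_mul_overflow(a, f, &t)) {
        r.site = 1;
        return r;
    }
    if (t >= 0) {
        if (__builtin_mul_overflow(b, g, &r.value))
            r.site = 2;
    } else {
        if (__builtin_mul_overflow(c, h, &r.value))
            r.site = 3;
    }
    return r;
}
```

With one operation per statement, a line number alone is enough to tell the three sites apart.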
Sure, but you're going to greatly increase the number of
jumps by inserting a new one after every arithmetic operation.
Many more jumps means many more delay slots, which means it's
less likely that useful work can be found for them, which means
that more of them will be filled with no-ops. Instead of one
ADD or whatever, you get ADD,JOV,NOP -- maybe not quite as bad
as it looks because there are other instructions to load operands
and store results and stuff, but it still has the effect of
shrinking the instruction cache.

Why insert nops? You do not insert anything. If there is an overflow,
it doesn't matter if one extra instruction was executed because you
are going to go to the exception handler anyway and the result of
the whole expression is undefined, so you can avoid those NOPs
without any danger.
It's my impression -- only an impression, mind you -- that
ISO is not interested in a "Programming Languages - C for x86"
standard. You can't simply live in the 8086 past and ignore
modern designs.

The Intel/AMD architecture is "the past"???

Well, I am sorry, but it really doesn't look that way, mind you.

But anyway, even machines like the Alpha can detect overflow with
not a lot of problems... They will have a small performance hit.

If that is too much for you, avoid those machines or do not check for
overflow.
What happened to "zero overhead?"

I said zero overhead for normal RISC machines or for the x86. Not
for the alpha, or for any brain dead machines out there!
 
jacob navia

Keith Thompson wrote:
christian.bau said:
1. In addition to "off" and "on", the pragma could have a third
setting "restore" which will restore to the state before the previous
"on" or "off", allowing this to be nested. So if someone doing
encryption wants no overflow checking, they put "off" and "restore"
around their code, and after that code we are back to the initial
setting.

The existing STDC pragmas (see C99 6.10.6) take an "on-off-switch"
argument, which can be any of ON, OFF, or DEFAULT.

What you're suggesting would require an implicit stack of settings.
Having every occurrence of the pragma push a new value onto that stack
seems wasteful, conceptually if not practically.

I think there's some precedent for having PUSH and POP arguments. So
you could have:

#pragma STDC OVERFLOW_CHECK PUSH
/* doesn't change the current state, but sets things up for a
following POP */

#pragma STDC OVERFLOW_CHECK ON

...

#pragma STDC OVERFLOW_CHECK POP
/* restores previous state */

Or perhaps:

#pragma STDC OVERFLOW_CHECK PUSH_ON
...
#pragma STDC OVERFLOW_CHECK POP

I'm undecided whether this is worth doing, and if so just how it
should be specified. But if this were to be done for the
OVERFLOW_CHECK pragma, it should be done for all the STDC pragmas.

[...]


Microsoft proposed

#pragma something(push, newvalue)
and
#pragma something(pop)

I have generalized that, so ALL lcc-win pragmas have a stack.

This would be very useful for many things but it is ANOTHER
discussion and ANOTHER proposal.
 
jacob navia

Eric Sosman wrote:
A (weakly) related issue is to describe how faithfully the
generated code must follow the abstract machine. For example,
is it permissible to rewrite

#define OVERHEAD 1
#pragma STDC OVERFLOW_CHECK ON
x = a + OVERHEAD - 1;
as
x = a;

? The original expression can overflow if executed literally,
but the rewritten expression cannot; is the transformation
allowed? What optimizations (if any) must overflow detection
inhibit?

This is implementation defined.

This checking should NOT interfere with the language in such a manner
as to be very expensive and make optimizations impossible. If constant
propagation, a very simple and safe optimization, is rendered impossible
because of the overflow checking, nobody will want to use this.
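
The difference between the two forms in the question can be seen with a small sketch. It uses the GCC/Clang builtins __builtin_add_overflow and __builtin_sub_overflow to evaluate a + OVERHEAD - 1 step by step, as a literal reading of the abstract machine would; the function names are made up for illustration.

```c
#include <limits.h>
#include <stdbool.h>

#define OVERHEAD 1

/* Evaluates (a + OVERHEAD) - 1 literally, left to right.
   Returns true if either step overflowed. Requires GCC or Clang. */
bool literal_eval_overflows(int a, int *x)
{
    int t;
    bool over = __builtin_add_overflow(a, OVERHEAD, &t);
    over |= __builtin_sub_overflow(t, 1, x);
    return over;
}

/* The folded form x = a; has no arithmetic left that could overflow. */
bool folded_eval_overflows(int a, int *x)
{
    *x = a;
    return false;
}
```

For a == INT_MAX the literal evaluation reports an overflow and the folded form does not, so the choice of optimization is observable to a handler.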
 
Keith Thompson

jacob navia said:
Eric Sosman wrote:

ONLY if you want to handle those extra arguments!

If you want maximum portability you stick to the 3 arguments
specified in the standard. If you handle the optional arguments,
then the code is of course no longer portable. The extra arguments are there
to provide flexibility to the implementors of this.

The prototype you propose is:

typedef void (*overflow_handler_t)(unsigned line_number,
char *filename,
char *function_name,...);

For any variadic function, the values of the parameters that precede
the ", ..." need to be able at least to tell the function whether to
start looking for more arguments.

How is an overflow handler going to be able to tell, from the values
of line_number, filename, and function_name, whether any more
arguments were passed?
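
One conventional way to address this, shown here purely as a sketch and not as part of the proposal, is to add a fixed parameter that says how many extra arguments follow. The parameter extra_count, the function sample_handler and the choice of long for every extra argument are all assumptions.

```c
#include <stdarg.h>

long last_extra_sum;   /* records what the handler saw, for demonstration */

/* Hypothetical variant of the handler with an explicit count of the
   implementation-specific arguments; all extras are assumed to be long. */
void sample_handler(unsigned line_number, const char *filename,
                    const char *function_name, int extra_count, ...)
{
    va_list ap;
    long sum = 0;

    (void)line_number;
    (void)filename;
    (void)function_name;

    va_start(ap, extra_count);
    for (int i = 0; i < extra_count; i++)
        sum += va_arg(ap, long);   /* safe only because the count is known */
    va_end(ap);

    last_extra_sum = sum;
}
```

Without such a count (or a sentinel, or a format string as in printf), calling va_arg for an argument that was not passed is undefined behavior.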
 
Keith Thompson

jacob navia said:
Keith Thompson wrote: [...]
Or perhaps:

#pragma STDC OVERFLOW_CHECK PUSH_ON
...
#pragma STDC OVERFLOW_CHECK POP

I'm undecided whether this is worth doing, and if so just how it
should be specified. But if this were to be done for the
OVERFLOW_CHECK pragma, it should be done for all the STDC pragmas.

[...]

Microsoft proposed

#pragma something(push, newvalue)
and
#pragma something(pop)

I have generalized that, so ALL lcc-win pragmas have a stack.

This would be very useful for many things but it is ANOTHER
discussion and ANOTHER proposal.

If you're going to use the "#pragma STDC" syntax, I think you need
to be consistent with the existing (and any future) STDC pragmas.
If you're going to propose a new mechanism to be used with #pragma
STDC OVERFLOW_CHECK, I think you should propose the same mechanism
for all of them, just to avoid creating a gratuitous inconsistency
in the language.
 
Keith Thompson

Hallvard B Furuseth said:
I'd call it #pragma STDC INT_OVERFLOW_CHECK or something, to make the
distinction from floating-point pragmas visible.

INTEGER_OVERFLOW_CHECK, not INT_OVERFLOW_CHECK, since it doesn't just
apply to type int.
 
Keith Thompson

jacob navia said:
christian.bau wrote: [...]
3. This should handle as many overflow situations as possible. For
example signed left shifts, pointer overflow (if p is a pointer and i
is an int, then one would expect p+i > p if i > 0 and p+i < p if
i < 0. In practice this happens only for a limited range of integers i.
Adding i outside that range would be overflow).

I am not sure whether (p+i) is evaluated as p + (unsigned)i. The relevant part of the
standard (6.5.6) doesn't require this, so I think we could add your
suggestion.

No, p+i is certainly not equivalent to p+(unsigned)i.

int arr[10];
int *p = arr+5;
int i = -1;
p + i; /* points to arr[4] */

"Overflow" for pointer arithmetic is defined in terms of the bounds of
the object being pointed to. I suggest that checking such overflows
is beyond the scope of your proposal. I've discussed this in more
detail elsethread.
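
To see why the conversion cannot work: converting -1 to unsigned yields UINT_MAX, an enormous forward offset rather than a step back. A check in terms of the object's bounds compares the resulting index against the array length instead, as in this sketch; offset_in_bounds is a hypothetical helper, not something from the proposal.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: may a pointer at index pos of an array with len
   elements be moved by i elements? One past the end (index len) is a valid
   pointer value in C, though it must not be dereferenced. */
bool offset_in_bounds(size_t len, size_t pos, int i)
{
    if (pos > len)
        return false;
    if (i >= 0)
        return (size_t)i <= len - pos;
    /* -(i + 1) cannot overflow, even when i is INT_MIN */
    return (size_t)(-(i + 1)) < pos;
}
```

For the example above, offset_in_bounds(10, 5, -1) is true, since the result points to arr[4].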

[...]
 
Eric Sosman

Keith said:
[...]
The prototype you propose is:

typedef void (*overflow_handler_t)(unsigned line_number,
char *filename,
char *function_name,...);

For any variadic function, the values of the parameters that precede
the ", ..." need to be able at least to tell the function whether to
start looking for more arguments.

It suffices that the function be able to tell "somehow," not
that the information be conveyed in the fixed arguments. If the
implementation-specific stuff changes from one implementation to
another but not within a single implementation, then a suitable
#ifdef will do it.

That said, I think

typedef void (*overflow_handler_t)
(struct overflow_handler_data *);

would be a better choice. The struct would have certain "always
present" elements and others that the implementation might choose
to add, in the manner of various other library functions.
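
A sketch of what such a struct-based interface might look like. The member names, set_overflow_handler, raise_overflow and record_line are all hypothetical; nothing here is specified by the proposal.

```c
#include <stddef.h>

/* Hypothetical layout: these three members would always be present;
   an implementation could append its own members after them. */
struct overflow_handler_data {
    unsigned    line_number;
    const char *filename;
    const char *function_name;
};

typedef void (*overflow_handler_t)(struct overflow_handler_data *);

static overflow_handler_t current_handler;   /* NULL until one is installed */

/* Installs a handler and returns the previous one, in the style of signal(). */
overflow_handler_t set_overflow_handler(overflow_handler_t h)
{
    overflow_handler_t old = current_handler;
    current_handler = h;
    return old;
}

/* Stand-in for what compiler-generated code would do at an overflow site. */
void raise_overflow(unsigned line, const char *file, const char *func)
{
    struct overflow_handler_data d = { line, file, func };
    if (current_handler != NULL)
        current_handler(&d);
}

unsigned last_reported_line;   /* set by the sample handler below */

/* Sample handler that only looks at an always-present member. */
void record_line(struct overflow_handler_data *d)
{
    last_reported_line = d->line_number;
}
```

A portable handler reads only the always-present members, so it never has to ask whether anything extra was passed.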
How is an overflow handler going to be able to tell, from the values
of line_number, filename, and function_name, whether any more
arguments were passed?

By not asking questions it doesn't need to ask?
 
Eric Sosman

jacob said:
Eric Sosman wrote:

Actually, the overflow checking should be done for all integer
types with bit size >= int.


You do not have access to that information. If you want to
know which expression overflows you break that expression into
several lines. Some implementations could generate code that
would store the variables being checked and would pass them on to you
so you would get in your handler (in the extra arguments that are
provided) the whole expression and the position within that
expression.

The original proposal says "If [the handler] returns,
execution continues with an implementation defined value as
the result of the operation that overflowed." How can you
accomplish that if all you know is that the overflow occurred
somewhere in line 42? If all three potential overflows are
lumped together as "line 42," where do you resume execution
without knowing which of the three multiplications overflowed?
You see why the extra arguments are useful?

I did not say they weren't. But they're not enough, absent
a way to get back to the point just after the overflow.
You can very easily break up the expression. Besides, with this
proposal you know at least that there was an overflow.

Is it permissible to have more than one potentially-overflowing
operator in an expression, *and* for the handler to return? If so,
the implementation needs to keep track of the restart point.
Why insert nops? You do not insert anything. If there is an overflow,
it doesn't matter if one extra instruction was executed because you
are going to go to the exception handler anyway and the result of
the whole expression is undefined, so you can avoid those NOPs
without any danger.

Not if the instruction in the branch delay slot changes the
state -- for example, by executing another arithmetic opcode that
also overflows ...
The Intel/AMD architecture is "the past"???

Yes. The 8086 reached the market in the middle of the Carter
administration, and its origins stretch back to the Nixon years,
if not further. It's a pretty old design.
But anyway, even machines like the Alpha can detect overflow with
not a lot of problems... They will have a small performance hit.

You've measured that performance hit, have you? Or at least
made a serious attempt to estimate it?
I said zero overhead for normal RISC machines or for the x86. Not
for the alpha, or for any brain dead machines out there!

Too bad -- You'd been doing so well, so very well, and then ...
(If you wear wooden shoes, does shooting yourself in the foot
count as sabotage?)
 
jacob navia

Eric Sosman wrote:
I did not say they weren't. But they're not enough, absent
a way to get back to the point just after the overflow.

The point just after the overflow is just a label. Look at the generated
code:

subl %ebx,%ecx
jo _$L5 ; jump if overflow to L5
_$L6:
cdq
idivl %ecx
;; the rest of the normal control flow
;; goes here

After the end of the function we have label L5:
_$L5:
pusha
pushl $6
pushl $main__labelname
call __overflow
addl $8,%esp
popa
jmp _$L6

You see now?
Is it permissible to have more than one potentially-overflowing
operator in an expression, *and* for the handler to return? If so,
the implementation needs to keep track of the restart point.

Yes, and I do that above.
Not if the instruction in the branch delay slot changes the
state -- for example, by executing another arithmetic opcode that
also overflows ...

The result of the expression is undefined. It doesn't matter.
Yes. The 8086 reached the market in the middle of the Carter
administration, and its origins stretch back to the Nixon years,
if not further. It's a pretty old design.

Mmmm I would say there are a few differences between the 8086 and
the intel i7 with 8 cores I am using now, excuse me. Obviously
just small differences from your point of view

:)

You've measured that performance hit, have you? Or at least
made a serious attempt to estimate it?

Yes. According to the discussion we had it has instructions that test
overflow. You have to make a pipeline flush, to keep the overflow
flag in synch with the execution unit, so there is pipeline turbulence.

This is an old problem. Chicken or egg?

Since C doesn't test overflow, hardware designers start getting rid of
the overflow flag. This #pragma could make them think again.

Too bad -- You'd been doing so well, so very well, and then ...
(If you wear wooden shoes, does shooting yourself in the foot
count as sabotage?)

I think that a machine where overflow can't be easily checked is,
well, brain dead. The DEC people were known for their VAX, that
decided arbitrarily to cut strings at 64K.

Well, each company has its bugs and stuff. Intel people have other bugs.
 
Richard Bos

Dag-Erling Smørgrav said:
C is not restricted to "the most advanced CPUs". For every "advanced
CPU" in the world, there are tens or hundreds of embedded processors,
microcontrollers, DSPs etc. Even a COTS desktop, laptop or server with
an "advanced CPU" can contain multiple secondary processors: I've worked
with IBM servers that had an i486 (IIRC) on the backplane monitoring the
main CPUs.

Yes, but they don't count, because they don't support jacob's compiler.

Richard
 
Bart

Yes, but they don't count, because they don't support jacob's compiler.

It must be great being a cpu designer, being able to create novel new
hardware incompatible with anything in the past, present or future,
and apparently not caring whether it's compatible with any software
either.

That all seems to be perfectly acceptable. But when it comes to
languages, we're only allowed to have this single, monolithic super-
language that must run on any conceivable hardware, from the lowliest
crummy microprocessor up to supercomputers, even though there are
several evidently different strata of application areas.

Odd, isn't it. (And I'm talking about C, since I don't know of any
mainstream languages quite like it.)

Personally I wouldn't have a problem with, say, a C-86 language, that
is targeted at x86-class processors. It would make a lot of things a
lot simpler. And people who use other processors can have their own
slightly different version. C is after all supposed to work at the
machine level; finally it will know exactly what that machine is!
 
Eric Sosman

jacob said:
Eric Sosman wrote:

The point just after the overflow is just a label. Look at the generated
code:

subl %ebx,%ecx
jo _$L5 ; jump if overflow to L5
_$L6:
cdq
idivl %ecx
;; the rest of the normal control flow
;; goes here

After the end of the function we have label L5:
_$L5:
pusha
pushl $6
pushl $main__labelname
call __overflow
addl $8,%esp
popa
jmp _$L6

You see now?

I see that you cannot implement the specification you
promulgated, not with one "L5" per line of possibly-overflowing
code. You need one "L5" per potential overflow *site*, and
that's another matter.

The penalty can be reduced, if each "L5" is just a call-and-
return to The Great Caller of the Overflow Handler. But you still
need something like four instructions (three typically not executed)
per arithmetic operation: ADD, JOV, CALLequiv, RETequiv.
Yes, and I do that above.

No, you fail to do that above. If all you've got is an "L5"
that jumps unconditionally to "L6," you've got either an infinite
loop or a total abortion.
The result of the expression is undefined. It doesn't matter.

Then how can you say that "execution continues?" What does
it mean to "continue" an execution, but either skipping its side-
effects or performing them infinitely often?

Perhaps the proposal would be improved (it would certainly
be simplified) by saying that the behavior is undefined if the
handler returns. (See precedents in signal handlers.) Take
careful note: This is a constructive suggestion, not an attack.
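
If returning from the handler were undefined, a handler would have to leave by other means, for example abort() or longjmp(). The sketch below shows the longjmp route, with the GCC/Clang builtin __builtin_add_overflow standing in for compiler-inserted checks; all names are made up.

```c
#include <limits.h>
#include <setjmp.h>

static jmp_buf recovery_point;

/* Hypothetical handler that never returns to the overflow site. */
void overflow_handler(void)
{
    longjmp(recovery_point, 1);
}

/* Returns a + b, or fallback if the addition overflowed.
   Requires GCC or Clang for __builtin_add_overflow. */
int add_or_fallback(int a, int b, int fallback)
{
    int sum;

    if (setjmp(recovery_point) != 0)
        return fallback;            /* reached via longjmp from the handler */

    if (__builtin_add_overflow(a, b, &sum))
        overflow_handler();         /* does not come back here */
    return sum;
}
```

Because control never re-enters the overflowing expression, no restart point needs to be tracked at all.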
Mmmm I would say there are a few differences between the 8086 and
the intel i7 with 8 cores I am using now, excuse me. Obviously
just small differences from your point of view

The "overflow flag" that your implementation rests upon was
present in the original 8086 thirty-plus years ago, was even then
a conscious imitation of still earlier designs, and as far as I can
see has not changed materially since that time. New instructions
that can set, clear, and test the flag may have been added in the
meantime -- but the overflow flag itself is a relic of the far past,
a vermiform appendix that some more modern designs have chosen to
do without.
Yes. According to the discussion we had it has instructions that test
overflow.

Yes: You hang on to the operands, and do some comparisons
involving them and the computed result to see whether overflow
has occurred. Observe that these would be instructions *always*
executed, not branches seldom taken -- put that in your "zero
overhead" pipe and smoke it!
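
The operand comparison described above can be sketched as follows for addition. The sum is computed in unsigned arithmetic, which wraps without undefined behavior, and overflow is reported when both operands have the same sign but the sign of the result differs. This assumes a two's complement int with no padding bits; add_overflows is a made-up name.

```c
#include <limits.h>
#include <stdbool.h>

/* Detects signed addition overflow without a hardware overflow flag.
   Assumes two's complement int with no padding bits. */
bool add_overflows(int a, int b)
{
    unsigned ua = (unsigned)a;
    unsigned ub = (unsigned)b;
    unsigned ur = ua + ub;                    /* wraps, well defined */
    unsigned sign_bit = ~(UINT_MAX >> 1);     /* only the top bit set */

    /* The top bit of (ua ^ ur) & (ub ^ ur) is set exactly when the result's
       sign differs from the sign of both operands. */
    return (((ua ^ ur) & (ub ^ ur)) & sign_bit) != 0;
}
```

The two XORs, the AND and the test run on every addition, whether or not it overflows, which is the cost being pointed out.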
You have to make a pipeline flush, to keep the overflow
flag in synch with the execution unit, so there is pipeline turbulence.

You've lost me.
This is an old problem. Chicken or egg?

You've lost me again. Tonight, it was chicken, on the grill
with a Secret Sauce of my wife's invention.
Since C doesn't test overflow, hardware designers start getting rid of
the overflow flag. This #pragma could make them think again.

No; hardware designers got rid of overflow flags (and carry
flags and sign flags and zero flags and parity flags and The Holy
Flag Of The Cause) because they are contention points, reducing
the amount of parallelism one can achieve in the hardware. Hardware
designers pay attention to Amdahl's Law (even as software designers
blunder along in cack-handed ignorance).
I think that a machine where overflow can't be easily checked is,
well, brain dead. The DEC people were known for their VAX, that
decided arbitrarily to cut strings at 64K.

Isn't that the "counted string" you were so vigorously
promoting just a week or two ago? My, but how fashions change!
Ah, but that's a rant already ranted.
 
Eric Sosman

Eric said:
[...]
The penalty can be reduced, if each "L5" is just a call-and-
return to The Great Caller of the Overflow Handler. But you still
need something like four instructions (three typically not executed)
per arithmetic operation: ADD, JOV, CALLequiv, RETequiv.

Sorry; thinko; that should have been "ADD, JOV, CALLequiv, JMP."
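
In C terms, the one-resume-point-per-site structure might be pictured like this. Each checked operation gets its own rarely taken branch that calls a shared handler and then falls through to the next operation, continuing with the wrapped value as a stand-in for the "implementation defined" result. This is a sketch using the GCC/Clang builtins, not the output of any particular compiler; on_overflow and the two counters are hypothetical.

```c
#include <limits.h>

int overflow_count;          /* how many times the handler ran */
int last_overflow_site;      /* which site called it most recently */

/* Hypothetical shared handler; it returns, so execution must resume. */
void on_overflow(int site)
{
    overflow_count++;
    last_overflow_site = site;
}

/* Computes (a - b) * c with a separate check, and therefore a separate
   resume point, after each operation. Requires GCC or Clang. */
int sub_then_mul(int a, int b, int c)
{
    int diff, product;

    if (__builtin_sub_overflow(a, b, &diff))
        on_overflow(1);      /* resumes at the multiplication */
    if (__builtin_mul_overflow(diff, c, &product))
        on_overflow(2);      /* resumes at the return */
    return product;
}
```

Each if corresponds to the JOV, each call to the CALLequiv, and falling out of the if body to the JMP back into the normal flow.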
 
Keith Thompson

Bart said:
Personally I wouldn't have a problem with, say, a C-86 language, that
is targeted at x86-class processors. It would make a lot of things a
lot simpler. And people who use other processors can have their own
slightly different version. C is after all supposed to work at the
machine level; finally it will know exactly what that machine is!

I'd have a *big* problem with that, if it meant that software
written for x86 systems won't run on SPARC, or ARM, or even x86-64.

I generally don't even think about what CPU I'm using at the moment,
because it doesn't matter. Making it matter would not be a step
forward.
 
Bart

[...]
Personally I wouldn't have a problem with, say, a C-86 language, that
is targeted at x86-class processors. It would make a lot of things a
lot simpler. And people who use other processors can have their own
slightly different version. C is after all supposed to work at the
machine level; finally it will know exactly what that machine is!

I'd have a *big* problem with that, if it meant that software
written for x86 systems won't run on SPARC, or ARM, or even x86-64.

Yet, hardware created around a SPARC processor presumably won't work
with an ARM? Somebody made a decision to use a specific set of
hardware, requiring different circuitry, peripherals, power supply,
different manuals and expertise, the software however must work,
unchanged, across the lot?

OK, fair enough. I'm sure the C code driving that 486 monitoring the
IBM servers that someone mentioned, will also work unchanged driving
the DEC Alpha monitoring Fuji servers instead (I've no idea what this
stuff does, and I suspect Fuji actually make film stock...).

My point is that C software can be considered an integral part of a
system and therefore can be allowed to be specific to that system in
the same way the bits of hardware can be. I.e., not just doing a
specific job but taking advantage of known characteristics of the
processor.
 
Keith Thompson

Bart said:
[...]
Personally I wouldn't have a problem with, say, a C-86 language, that
is targeted at x86-class processors. It would make a lot of things a
lot simpler. And people who use other processors can have their own
slightly different version. C is after all supposed to work at the
machine level; finally it will know exactly what that machine is!

I'd have a *big* problem with that, if it meant that software
written for x86 systems won't run on SPARC, or ARM, or even x86-64.

Yet, hardware created around a SPARC processor presumably won't work
with an ARM? Somebody made a decision to use a specific set of
hardware, requiring different circuitry, peripherals, power supply,
different manuals and expertise, the software however must work,
unchanged, across the lot?

Um, yes.
OK, fair enough. I'm sure the C code driving that 486 monitoring the
IBM servers that someone mentioned, will also work unchanged driving
the DEC Alpha monitoring Fuji servers instead (I've no idea what this
stuff does, and I suspect Fuji actually make film stock...).

My point is that C software can be considered an integral part of a
system and therefore can be allowed to be specific to that system in
the same way the bits of hardware can be. I.e., not just doing a
specific job but taking advantage of known characteristics of the
processor.

My point is that C can be used both for portable software (including
most of the software carrying these words between my keyboard and your
monitor) as well as for non-portable software (such as device
drivers).

Would your hypothetical C-86 language have enough advantages for
x86-specific code to make up for the fact that it wouldn't work *at
all* on anything else?
 
