Exception propagation in C

Joe keane · Jan 2, 2012

Can anyone come up with a faster or better way to propagate exceptions?
What other ways are there to handle exceptions in C? What do you do?

I worked at a place where the coding guidelines said
'every function returns an int, and that's the error code'.

Kleuske · Jan 2, 2012

I worked at a place where the coding guidelines said 'every function
returns an int, and that's the error code'.

So have I, and I must say I much preferred it to the place I worked
where it was deemed acceptable to silently fail and return nothing. At
least the former delivered stable code which was easy to trace and debug.

Hurray for returning error codes!

Jens Gustedt · Jan 2, 2012

On 12/31/2011 05:40 PM, BGB wrote:

who knows how long until it is really well supported though, as it has
been long enough just waiting for MSVC to support C99 stuff...

I don't think it will take the major vendors as long to implement most
of C11 as it did for C99. For one thing, most of what goes into C11 is
already there in most compilers (or in the OS, for threads). And they
learnt from the experience and made some features optional.

and alas they couldn't just reuse GCC's keywords?...
well, in any case, probably macro magic.

yes, for tls the replacement by macros is trivial

I had forgotten about a trick I had actually used before:
one can fake TLS using a macro wrapping a function call.

yes, in particular with the POSIX like per thread keys, that also come
with C11
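
The macro-wrapped function call mentioned above is the same trick many C libraries use for errno. A minimal sketch, assuming a C11 compiler for _Thread_local; the names thread_excep_location and demo_set_and_read are hypothetical, and on an older toolchain the function body could look up a POSIX per-thread key instead:

```c
/* Hypothetical sketch: a thread-local word hidden behind a function call. */

/* Returns the address of the calling thread's exception word. */
static unsigned *thread_excep_location(void)
{
    /* Assumes C11 _Thread_local; a per-thread key lookup could replace it. */
    static _Thread_local unsigned excep_word = 0;
    return &excep_word;
}

/* The macro makes the function call read like an ordinary variable. */
#define thread_excep (*thread_excep_location())

static unsigned demo_set_and_read(void)
{
    thread_excep = 0;       /* assignment goes through the returned pointer */
    thread_excep |= 4u;     /* compound assignment works as well */
    return thread_excep;    /* reads back this thread's value */
}
```

Because the macro expands to an lvalue, code using thread_excep does not need to know whether it is a real variable or a lookup.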

hmm:
in C one could use macros to be like:
BEGIN_TRY
...
END_TRY
BEGIN_CATCH(FooException, ex)
...
END_CATCH
BEGIN_FINALLY
...
END_FINALLY

I think it is easy to do that even nicer, where exceptions would throw
an integer value

TRY {
...
} CATCH(identifier) {
case 1: ...

default:
printf("uncaught exception %d\n", identifier);
RETHROW;
}

I'll probably implement something in that vein in the next weeks on
top of P99.
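
For readers who want to see the shape of such macros, here is a rough sketch built on setjmp/longjmp. It is not the P99 implementation; every name in it (handler_top, pending_code, throw_int, END_TRY and the demo functions) is made up for illustration, and it assumes a single thread. It needs a closing END_TRY macro because plain braces leave no place to close the scopes the macros open.

```c
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical names throughout; single-threaded for brevity. */
static jmp_buf *handler_top = NULL;  /* innermost active TRY, if any */
static int pending_code = 0;         /* integer value being thrown */

static void throw_int(int code)
{
    if (handler_top == NULL) {
        fprintf(stderr, "uncaught exception %d\n", code);
        abort();
    }
    pending_code = code;
    longjmp(*handler_top, 1);
}

#define TRY \
    do { \
        jmp_buf env_; \
        jmp_buf *prev_ = handler_top; \
        handler_top = &env_; \
        if (setjmp(env_) == 0) {

#define CATCH(id) \
            handler_top = prev_; \
        } else { \
            int id = pending_code; \
            handler_top = prev_; \
            switch (id) {

#define END_TRY \
            } \
        } \
    } while (0)

#define RETHROW throw_int(pending_code)

static int halve_even(int x)
{
    if (x % 2 != 0)
        throw_int(1);            /* 1: odd input */
    if (x < 0)
        throw_int(2);            /* 2: negative input */
    return x / 2;
}

static int safe_halve(int x)
{
    volatile int result = 0;     /* volatile: assigned after setjmp */
    TRY {
        result = halve_even(x);
    } CATCH(ex) {
        case 1:
            result = -1;
            break;
        default:
            RETHROW;             /* pass anything else to the outer handler */
    } END_TRY;
    return result;
}

static int outer(int x)
{
    volatile int result = 0;
    TRY {
        result = safe_halve(x);
    } CATCH(ex) {
        case 2:
            result = -2;
            break;
    } END_TRY;
    return result;
}
```

Leaving the TRY body with return, break or goto would skip the step that restores handler_top, so a sketch like this relies on programmer discipline.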

Jens

BGB · Jan 2, 2012

On 12/31/2011 05:40 PM, BGB wrote:

I don't think it will take the major vendors as long to implement most
of C11 as it did for C99. For one thing, most of what goes into C11 is
already there in most compilers (or in the OS, for threads). And they
learnt from the experience and made some features optional.

yes, but it also took until 2010 for MSVC to add "stdint.h", which
hardly seemed like a huge/complex issue.

I still can't just use "%lld" as a printf modifier, one has to type
"%I64d" with MSVC to get the expected results (but then have this not
work on Linux). (I created a function "gcrlltoa()" which does little
more than to print a long-long into a buffer and return a pointer to a
temporary string, so I can use "%s" instead.)

....

yes, for tls the replacement by macros is trivial

yeah, probably.

yes, in particular with the POSIX like per thread keys, that also come
with C11

ok.

I think it is easy to do that even nicer, where exceptions would throw
an integer value

TRY {
...
} CATCH(identifier) {
case 1: ...

default:
printf("uncaught exception %d\n", identifier);
RETHROW;
}

I'll probably implement something in that vein in the next weeks on
top of P99.

the main issue with directly using braces is that it doesn't give
anywhere to un-register the exception.

catch with an integer could also use a FOURCC for well-known / named
exceptions.
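
A FOURCC-style exception code could look something like the following. This is only a sketch: the macro, the two example codes and the describe function are hypothetical, and the packing order (first character in the high byte) is an arbitrary choice.

```c
#include <stdint.h>

/* Hypothetical macro: pack four characters into one 32-bit exception code. */
#define FOURCC(a, b, c, d) \
    (((uint32_t)(unsigned char)(a) << 24) | \
     ((uint32_t)(unsigned char)(b) << 16) | \
     ((uint32_t)(unsigned char)(c) << 8)  | \
      (uint32_t)(unsigned char)(d))

/* Illustrative named exceptions; the names are made up for this sketch. */
#define EX_NOMEM FOURCC('N', 'M', 'E', 'M')
#define EX_IOER  FOURCC('I', 'O', 'E', 'R')

/* The codes are integer constants, so they can be used as case labels. */
static const char *describe(uint32_t code)
{
    switch (code) {
    case EX_NOMEM: return "out of memory";
    case EX_IOER:  return "I/O error";
    default:       return "unknown";
    }
}
```

The codes stay readable in a hex dump or debugger, which is the main attraction over arbitrary small integers.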

my current exception system uses dynamically-typed references
(generally, type-checking would be done as part of catching an
exception). generally, my type-system identifies types by type-name
strings (additionally, class/instance objects and certain C structs may
also be used, as these are built on top of the dynamic type system).

or such...

Dr Nick · Jan 2, 2012

BGB said:
yes, but it also took until 2010 for MSVC to add "stdint.h", which
hardly seemed like a huge/complex issue.

I still can't just use "%lld" as a printf modifier, one has to type
"%I64d" with MSVC to get the expected results (but then have this not
work on Linux). (I created a function "gcrlltoa()" which does little
more than to print a long-long into a buffer and return a pointer to a
temporary string, so I can use "%s" instead.)

Easier, probably, to take a hint from stdint and do something like this:

#if ..MSVC test..
#define PRINT_LLD "I64d"
#else
#define PRINT_LLD "lld"
#endif

....
long long var = 3872984822;

printf("this is a big number %" PRINT_LLD ", isn't it?",var);

ImpalerCore · Jan 2, 2012

yes, but it also took until 2010 for MSVC to add "stdint.h", which
hardly seemed like a huge/complex issue.

I still can't just use "%lld" as a printf modifier, one has to type
"%I64d" with MSVC to get the expected results (but then have this not
work on Linux). (I created a function "gcrlltoa()" which does little
more than to print a long-long into a buffer and return a pointer to a
temporary string, so I can use "%s" instead.)

To workaround that particular problem, I rely on the PRI[diouxX]64
modifier to print 64-bit integers. I have this in my stdint.h +
inttypes.h wrapper that you may find useful.

\code snippet
/*
* Even if the Microsoft Runtime library is linked against a C99
* compiler, the printf modifier "ll" will likely not work. Use the
* "I64" modifier to declare the correct PRI[di]64 fprintf modifier
* macros.
*/
#if (defined(__STDC__) && defined(__STDC_VERSION__))
# if (__STDC__ && __STDC_VERSION__ >= 199901L)
# define inttypes_prid64_defined
# if defined(_WIN32) || defined(__WIN32__)
/* Microsoft Runtime does not support the "ll" modifier. */
# define PRId64 "I64d"
# define PRIi64 "I64i"
# else
# define PRId64 "lld"
# define PRIi64 "lli"
# endif
# endif
#endif

/*
* If one lacks a C99 compiler, but allows 64-bit integer extensions,
* one can still define the set of PRI[di]64 fprintf modifier macros.
*/
#if !defined(inttypes_prid64_defined) && !defined(ANSI_C_PSTDINT)
# if defined(_WIN32) || defined(__WIN32__)
# define PRId64 "I64d"
# define PRIi64 "I64i"
# else
# if (INT64_MAX == LLONG_MAX)
# define PRId64 "lld"
# define PRIi64 "lli"
# elif (INT64_MAX == LONG_MAX)
# define PRId64 "ld"
# define PRIi64 "li"
# elif (INT64_MAX == INT_MAX)
# define PRId64 "d"
# define PRIi64 "i"
# else
# error "Platform not supported"
# endif
# endif
#endif
\endcode
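
Once a PRId64 macro is available, whether from a wrapper like the one above or from the standard inttypes.h, using it looks roughly like this. The sketch assumes a C99 library that provides inttypes.h and snprintf; format_i64 is a hypothetical helper name.

```c
#include <inttypes.h>  /* assumes a C99 library; provides PRId64 and int64_t */
#include <stdio.h>

/* Formats a 64-bit value into buf; returns what snprintf returns. */
static int format_i64(char *buf, size_t size, int64_t value)
{
    /* Adjacent string literals are concatenated into a single format string. */
    return snprintf(buf, size, "big number: %" PRId64, value);
}
```

The call site never spells out "lld" or "I64d", so the same source line works wherever the macro is defined correctly.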

Best regards,
John D.

Rod Pemberton · Jan 2, 2012

Scott Wood said:
The relevant factor isn't "C" versus "non-C",

Sorry, "non-C" is inaccurate. I think of C as "a single-threaded execution
model." By that, I mean there is no other C code executing in parallel,
even though that's *not* the correct definition for single-threaded.
Obviously, there could be C code executing in parallel with each modifying
the same machine word. So, "non-C" should be "everything other than" the C
code deemed to be active.

The relevant factor is [...] synchronous versus asynchronous [...]

I don't see this as being relevant. His C code could be 100% synchronous
and he could still need a volatile. His code could be single-threaded,
single-core, single-processor, serial, non-parallel, not use setjmp/longjmp,
and be entirely interrupt free. If hardware modifies his machine word that
his C code is accessing, he still needs a volatile keyword for it.

AFAICT, James never fully stated what exactly is modifying his machine word.
It could be other software - in C or in assembly. Or, it could be hardware.
From his description, all I know is that it's "in the cloud" somewhere ...
I.e., it might not even be on his own computer. Maybe it's only accessible
from another computer by a remote procedure call over a network. Without
knowing exactly what is modifying his machine word, I'm not sure how he is
going to guarantee non-volatility and/or atomicity.

(or perhaps, outside this instance of the single-threaded C
execution model).

Yes. Or, I like that better, although technically neither of us is being
precise enough. I'm not entirely sure of the correct terminology.
Single-threaded doesn't specifically exclude parallel processing for the
single-thread. Single-threaded just means a single command. E.g., one or
more parts of a single-threaded and in parallel C application could optimize
away use of the variable because it's unaware of the other parts' usage of
it.

If it's modified by code during what the C code sees as an ABI-conformant
function call (which seems to be the case here -- I don't think he was
talking about recovering from hardware exceptions), it doesn't matter what
language the function is implemented in, as long as it is indeed
ABI-conformant -- volatile is not needed. ....

OTOH, if the variable is modified from an interrupt/signal handler,
another thread, etc. you'll need (at a minimum) either volatile or some
implementation-specific optimization barrier, even if the code that
actually modifies the variable is also written in C.

Yes.
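
To make the asynchronous case concrete, here is a small sketch of a flag written by a signal handler. The names got_signal, on_signal and wait_for_flag are hypothetical; the use of volatile sig_atomic_t for such a flag is what the C standard prescribes.

```c
#include <signal.h>

/* Written by the handler, read by normal code: needs volatile sig_atomic_t. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

static int wait_for_flag(void)
{
    got_signal = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    raise(SIGINT);          /* deliver the signal from this thread for the demo */
    while (!got_signal)     /* without volatile this read could be hoisted */
        ;
    signal(SIGINT, SIG_DFL);
    return 1;
}
```

By contrast, a word that is only modified inside an ordinary function call needs no such qualifier.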

Rod Pemberton

Rod Pemberton · Jan 2, 2012

Kleuske said:
So have I, and I must say I much preferred it to the place I worked
where it was deemed acceptable to silently fail and return nothing. At
least the former delivered stable code which was easy to trace and debug.

Hurray for returning error codes!

I'm curious. In response to not being able to return other items, what did
you guys do? Did you guys also pass in an additional variable(s) to every
procedure or almost every procedure? E.g., an error structure, or a
function number?

Rod Pemberton

ImpalerCore · Jan 2, 2012

Easier, probably, to take a hint from stdint and do something like this:

#if ..MSVC test..
#define PRINT_LLD "I64d"
#else
#define PRINT_LLD "lld"
#endif

...
long long var = 3872984822;

printf("this is a big number %" PRINT_LLD ", isn't it?",var);

I'd probably go with PRIdLL or PRIuLL, just to keep in style with the
other fprintf modifiers, if the PRI prefix isn't reserved (although
I'd still be quite tempted to use them).

Best regards,
John D.

James Harris · Jan 3, 2012

On 12/30/2011 3:38 PM, James Harris wrote:

I've snipped the bits about how TLS is generally accessed. Thanks for
explaining that info. It was new to me.

....

I was thinking of thread_excep being at a fixed location so it could
literally be checked with one instruction:

cmp [thread_excep], 0

if it is a global though, it is not a TLS.

otherwise, I guess it would need to be changed around by the scheduler
or something (register variables with scheduler, and it saves/restores
their values on context switches or something?...).

Yes. The task switcher would include code along the lines of

push [thread_excep]

for the outgoing task and, for the incoming task,

pop [thread_excep]

Since the word would be checked many times between task switches this
is much faster than following a structure off FS or similar.

....

I guess, if one is making an OS, other possibilities also exist (special
magic TLS sections?).

say, if one has a section ".tls" section, where the scheduler
automatically saves/restores the contents of anything stored within
these sections. the advantage could be very fast access to TLS variables
at the cost of potentially more expensive context switches.

As thread_excep would be accessed very frequently it would be changed
explicitly on a context switch. That doesn't apply to the bulk of the
thread info, though, which might be wanted by the task. Most of that
info would be changed simply by updating a pointer. For example, as
well as there being

thread_excep: resd 1

in a page that the user-mode task can update there would also be

thread_data_p: resd 1

in a page which, to the task, was read-only. On a context switch both
words would be updated. This keeps thread switching minimal yet allows
the user-mode program easy and fast access to the kinds of things it
may want info on. For example, if the program wanted its thread id it
might find it at thread_data_p->id.

Incidentally, these could be wrapped in getter and setter functions
but, alternatively, for performance they can be linked to fixed
locations.

I don't think flags are the ideal strategy here.
flags don't as easily allow user-defined exceptions, and if one has only
a few bits for a user-defined exception class or similar, then it
creates a bit of a burden WRT avoiding clashes.

I suppose it depends on the normal first level of selecting
exceptions. The best way might be whatever the handler uses to first
distinguish them. Most code I've seen or written is interested, at the
top level, in which *type* of exceptions to catch and which to ignore
and it generally does that by looking at the exception type:
indexerror, valueerror, computationerror etc. That coupled with the
fact that multiple exceptions can be outstanding at the same time
suggested the use of a bit array but it's not the only option.

You mention user-defined exceptions. I planned only one bit for them
(as there is an arbitrary number of them). To distinguish one user
exception from another would probably require a call to a routine that
examines the detail.
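
A bit-per-type indicator word with a single shared user bit might be sketched as below. All names are hypothetical, a plain static stands in for the per-thread word, and the detail record is reduced to one int.

```c
/* One bit per built-in exception type, plus one shared bit for user types. */
enum {
    EXCEP_INDEX   = 1u << 0,
    EXCEP_VALUE   = 1u << 1,
    EXCEP_COMPUTE = 1u << 2,
    EXCEP_USER    = 1u << 3
};

static unsigned excep_word = 0;     /* stand-in for the per-thread word */
static int user_excep_detail = 0;   /* hypothetical detail for user exceptions */

static void raise_excep(unsigned bit)
{
    excep_word |= bit;              /* several bits may be pending at once */
}

static void raise_user(int detail)
{
    user_excep_detail = detail;     /* the detail tells user exceptions apart */
    excep_word |= EXCEP_USER;
}

/* Handles the types this caller cares about; returns the bits still pending. */
static unsigned handle_some(unsigned wanted, int *handled_count)
{
    unsigned hit = excep_word & wanted;
    unsigned bit;
    for (bit = 1u; bit != 0 && bit <= hit; bit <<= 1)
        if (hit & bit)
            ++*handled_count;       /* real code would dispatch per type here */
    excep_word &= ~hit;             /* clear only what was handled */
    return excep_word;
}
```

A handler interested in index and value errors would pass EXCEP_INDEX | EXCEP_VALUE and leave everything else pending for its callers.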

one "could" potentially use interned strings and magic index numbers
though, say, 10 or 12 bits being plenty to allow a range of named
exception classes (potentially being keyed via a hash-table or similar).

this is actually how my dynamic type-system works (types are identified
by name, but encoded via hash indices).

I see your point. The problem is that this allows only one exception
at a time to be signalled-but-not-yet-handled. Which one? I suppose
the most critical one or the first one or the last one could be chosen
to be marked in the indicator word.

<<END
Just on this point, and bringing it back specifically to C, if TLS is
hard to obtain or slow to access there are two other possibilities
that I don't think have been mentioned yet that spring to mind for use
in C.

1. If only running a single thread simply use a global. Job done.

2. Use errno. Set it to zero at the start and check it where
appropriate. On a user-detected exception that does not already set
errno set it to a value which is outside the normal range (and create
or append to the detailed exception object, as before).

Using errno does only allow one exception to be indicated at a time
but that's the same as what I understand your suggestion to be. It
also doesn't work between languages (unless they obtain the address of
errno) but I think it could work well for C.
END
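
Option 2 could be sketched as follows. EXCEP_BAD_INPUT and both function names are hypothetical, and the value 0x4000 is only assumed to lie above the errno values the system itself uses.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical application code, assumed to sit above system errno values. */
#define EXCEP_BAD_INPUT 0x4000

static long parse_positive(const char *text)
{
    char *end;
    long value = strtol(text, &end, 10);   /* sets errno to ERANGE on overflow */
    if (errno == 0 && (end == text || value <= 0))
        errno = EXCEP_BAD_INPUT;           /* user-detected: library set nothing */
    return value;
}

static int sum_two(const char *a, const char *b, long *out)
{
    long x, y;
    errno = 0;                 /* clear once at the start */
    x = parse_positive(a);
    if (errno) goto handler;
    y = parse_positive(b);
    if (errno) goto handler;
    *out = x + y;
    return 0;
handler:
    return errno;              /* one code only, as with any single word */
}
```

The detailed exception object mentioned above is left out here; only the indicator word is shown.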

At the end of the day, any scheme could be used. For performance the
idea is that the exception-indicating word is either zero or non-zero.
If it's non-zero one or more exceptions has/have occurred and must be
dealt with.

James

Kleuske · Jan 3, 2012

I'm curious. In response to not being able to return other items, what
did you guys do?

Given the problem domain (controlling hugely expensive equipment very, very,
VERY accurately), there wasn't any big need to return all kinds of
stuff, but there's always the possibility to pass a pointer in the argument
list and return results through that. Stability and correctness were the prime
concerns; everything else was rather less important.

Did you guys also pass in an additional variable(s) to
every procedure or almost every procedure? E.g., an error structure, or
a function number?

Error structures? Nope. Not really. Other variables passed by pointer in order
to return results? Occasionally, but there wasn't any big need for that. The
flow of information was a rather one-way affair.

I should have mentioned it wasn't your average piece of software and should
have qualified my remarks to reflect that.

James Harris · Jan 3, 2012

Maybe this is clear to you already, but the words you use leave
me in some doubt: In pondering the performance of an exception scheme
one should not worry about how fast exceptions are handled, but about
how fast non-exceptions are not-handled. If you make a million calls
to malloc() and get an exception on one of them, it's the performance
of the other 999,999 that you need to consider.

They're called "exceptions," not "usuals."

I'm principally thinking of overall performance. I did say that but I
can see my later comments could be taken to mean the opposite. There
is a small caveat below, but unless an exception is repeatedly generated
in a loop, the time it takes to generate or propagate tends to
zero and becomes insignificant.

Perhaps it's better expressed as: what's the fastest way in standard C
to ignore non-exceptions (still not a good phrase but you know what I
mean) while still reliably propagating and handling exceptions.

The caveat is that the design of C and its libraries may overuse
return values where exceptions, if available, would be a better
choice. So it may be hard for me to convince people on a C newsgroup
that invalid return values are not the best thing since sliced bread!

Exceptions should probably always be unusual but there are cases where
they could be beneficially used instead of invalid values.

....

The only portable ways to unwind the stack are to return or to
use longjmp(). With the latter there's no way to unwind the stack
frames one by one, nor to inspect what's being unwound: You just
longjmp() to the next-higher handler, do whatever local cleanup's
needed, longjmp() again to a handler higher still, and so on until
somebody's local cleanup says "This exception has propagated far enough;
I'm absorbing it."

For me, this paragraph sums it up. While a single longjmp back to main
would be very fast, in the general case a hierarchy of setjmps and
longjmps would be far too slow.

The only other portable option of returning can be made fast (very
fast) using option 3.

Again, I feel you're worrying about performance in a situation
where performance is not very important.

Isn't this just an `errno' equivalent, with a pointer instead
of an `int'? In "The Standard C Library," PJ Plauger writes

Nobody likes errno or the machinery that it implies. I can't
recall anybody defending this approach to error reporting,
not in two dozen or more meetings of X3J11, the committee
that developed the C Standard. Several alternatives were
proposed over the years. At least one faction favored simply
discarding errno.

I'd hesitate before imitating an interface nobody likes ...

This is not the same as errno and does not use the same 'machinery'.
errno can be ignored and provides no instance data (such as what
specifically went wrong?). The value of errno, if potentially wanted,
must be saved before another call is made as it may be overwritten. An
exception, on the other hand, is something that should record specific
data and cause an immediate change in the processing path.

I chose TLS since that's the 'right' place for an exception condition.
Once an exception occurs the thread itself needs to sit up and take
notice, not just one routine, not just part of the thread, and not the
whole process.

As mentioned in reply to BGB, though, if not running multiple threads
a global could be used instead of TLS. I'm not advocating general use
of globals. They carry state between routines and break interfaces
which is a very bad idea. This, however, is an exception indication
that applies to the locus of control wherever it happens to be in the
single-threaded process. The alternative of adding an explicit
exception-return parameter from each called routine is also good but I
doubt most C programmers would like it as it's so unfamiliar.

Also, the performance probably suffers. For all its drawbacks,
the function-returns-a-distinguishable-value-on-failure approach has
the advantage that the value is very likely in a CPU register when
you need to test it, not in a global variable somewhere off in RAM.

I disagree here. As with the case at the top, performance is only an
issue if code is executed repeatedly. If being run repeatedly the
variable would normally be in L1 cache. Consider the proposed x86-32
code following a call to a routine.

cmp [thread_excep], 0
jnz exception_handler

Given this being executed in a loop the only normal way for
thread_excep not to be cached would be if there was extreme contention
for cache capacity. If that were the case a lot of things would be
slow and, again, this sequence would tend to become insignificant.

As well as thread_excep being cached, the branch would nearly always
fall through (be not taken) so is an ideal case for branch prediction
to correctly predict as not taken. Finally, the two instructions may
be fusible into one operation.

In other words, there's almost zero cost to this sequence.

The main negative to me is that, in C, it uses a goto but this hasn't
elicited the chorus of disapproval that I thought it might.

[...] Anyone know of
something better?

I've seen various exception-like frameworks overlaid on C, some
down-and-dirty, some fairly elaborate. All of them, however useful
for the particular task at hand, suffered from lack of support by
the language itself: There's just no way to tell the compiler that
a function might throw an exception, which means the compiler can't
tell the programmer when he's forgotten to catch one. Somebody opens
a file, obtains values from various accessor functions and writes them
to the file, closes the stream, and returns -- except that one of the
accessors throws an exception that gets caught somewhere way up the
stack, and the file-writing function never closes its stream ...
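
With plain return codes, the usual discipline against this kind of leak is a single cleanup label that every exit path passes through. A sketch, with hypothetical names and tmpfile() standing in for the real output file:

```c
#include <stdio.h>

/* Hypothetical accessor that can fail; returns 0 on success. */
static int get_value(int index, int *out)
{
    if (index == 2)
        return -1;             /* simulated failure on the third value */
    *out = index * 10;
    return 0;
}

static int streams_open = 0;   /* counts open streams, to show nothing leaks */

static int write_report(int count)
{
    int status = 0, i, v;
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;
    ++streams_open;
    for (i = 0; i < count; ++i) {
        status = get_value(i, &v);
        if (status != 0)
            goto cleanup;      /* error path still reaches fclose */
        if (fprintf(fp, "%d\n", v) < 0) {
            status = -2;
            goto cleanup;
        }
    }
cleanup:
    if (fclose(fp) != 0 && status == 0)
        status = -3;
    --streams_open;
    return status;
}
```

Nothing in the language enforces the pattern, which is exactly the complaint: a longjmp past write_report would skip the cleanup label entirely.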

Errors of this kind can be prevented by programmer discipline,
but I've seen no scheme that will help keep the programmer on the
strait and narrow path. This is sad, because it means an exception
mechanism that was supposed to make the programmer's job easier can
wind up making it more burdensome -- and more perilous.

Exceptions aren't something you can just bolt on to a language..
Well, actually, you *can*: people have bolted them on to C, over and
over again. Yet none of the various schemes has caught on, which I
think should cause you to ponder. Maybe it's because bolting on is
a poor substitute for baking in.

Agreed. Inbuilt is better in so many ways. However, we have to work
with what there is available. C is C, after all, and I don't want to
mess with C-with-extensions whether they are plusses or something
else!

James

James Harris · Jan 3, 2012

So have I, and I must say I much preferred it to the place I worked
where it was deemed acceptable to silently fail and return nothing. At
least the former delivered stable code which was easy to trace and debug.

Hurray for returning error codes!

What did you guys return if an exception occurred in an exception
handler?

James

Kleuske · Jan 3, 2012

What did you guys return if an exception occurred in an exception
handler?

Lacking exception handlers, the question did not arise. In case of error
in that subsystem, the machine just freezes, in order to prevent damage
to the expensive (think jet-fighter price-range) equipment.

I remember celebrating the first time the software ran for ten seconds
without freezing the machine. I got home a bit wobbly on that occasion.

James Harris · Jan 3, 2012

On Jan 2, 8:59 pm, "Rod Pemberton" wrote:

....

AFAICT, James never fully stated what exactly is modifying his machine word.
It could be other software - in C or in assembly. Or, it could be hardware.
From his description, all I know is that it's "in the cloud" somewhere ....
I.e., it might not even be on his own computer. Maybe it's only accessible
from another computer by a remote procedure call over a network. Without
knowing exactly what is modifying his machine word, I'm not sure how he is
going to guarantee non-volatility and/or atomicity.

Nothing special would modify the word. It is intended just as a way
for a callee to return an exception indication to its caller.

To make an example,

callee
if ((temp = malloc(SIZ)) == NULL) {
... record the details ...
thread_excep |= EXCEP_MEMSPACE;
return NULL;
}
... further processing and return to caller ...

caller
p1 = callee(SIZ1);
if (thread_excep) goto handler;
p2 = callee(SIZ2);
if (thread_excep) goto handler;
... normal processing ...

As I understand it, even though the callee could modify thread_excep
it would not need to be declared as volatile.
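
A compilable version of the outline above might look like this. It is a sketch under assumptions: _Thread_local needs C11 (and could be dropped in a single-threaded program), a size of zero is used to simulate an allocation failure, and the cleanup at the end is an addition for the demo.

```c
#include <stdlib.h>

#define EXCEP_MEMSPACE 0x1u

/* Plain and non-volatile; it only changes during ordinary function calls. */
static _Thread_local unsigned thread_excep = 0;

static void *callee(size_t size)
{
    void *temp = (size == 0) ? NULL : malloc(size);  /* size 0 simulates failure */
    if (temp == NULL) {
        thread_excep |= EXCEP_MEMSPACE;   /* details would be recorded here */
        return NULL;
    }
    return temp;
}

static int caller(size_t siz1, size_t siz2)
{
    void *p1 = NULL, *p2 = NULL;
    int status = 0;

    thread_excep = 0;
    p1 = callee(siz1);
    if (thread_excep) goto handler;
    p2 = callee(siz2);
    if (thread_excep) goto handler;
    /* ... normal processing ... */
    goto done;

handler:
    status = (int)thread_excep;
    thread_excep = 0;         /* mark the exception as handled */
done:
    free(p2);                 /* free(NULL) is a no-op */
    free(p1);
    return status;
}
```

The compiler has to treat the store in callee as visible to caller after the call returns, so each test reads the current value without volatile.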

James

James Harris · Jan 3, 2012

On Jan 2, 6:50 am, "Rod Pemberton" wrote:

....

My experience with c.l.c people
over the past decade is that many of them are insanely hostile, intolerant
of any criticism, and will argue to their death that they are correct when
in fact they are provably wrong ...

A lot of the antagonists have moved on. IME clc is a far better place
than it was a few years ago.

James

Keith Thompson · Jan 3, 2012

James Harris said:
On Jan 2, 6:50 am, "Rod Pemberton" wrote:

...

A lot of the antagonists have moved on. IME clc is a far better place
than it was a few years ago.

And a well-constructed killfile can make it even more pleasant.

BGB · Jan 3, 2012

On 12/30/2011 3:38 PM, James Harris wrote:

I've snipped the bits about how TLS is generally accessed. Thanks for
explaining that info. It was new to me.

...

I was thinking of thread_excep being at a fixed location so it could
literally be checked with one instruction:

cmp [thread_excep], 0

if it is a global though, it is not a TLS.

otherwise, I guess it would need to be changed around by the scheduler
or something (register variables with scheduler, and it saves/restores
their values on context switches or something?...).

Yes. The task switcher would include code along the lines of

push [thread_excep]

for the outgoing task and, for the incoming task,

pop [thread_excep]

Since the word would be checked many times between task switches this
is much faster than following a structure off FS or similar.

possibly, however the issue then partly becomes:
how does the task switcher know where the variable is?...

possible options:
do it like the BIOS, with every task having a dedicated kernel area
which is also accessible to the userspace process;
the shared-object or DLL holding the variable is "owned" by the
OS/kernel (the kernel then depends on the presence of the library and on
the location of the variable within the library).

...

As thread_excep would be accessed very frequently it would be changed
explicitly on a context switch. That doesn't apply to the bulk of the
thread info, though, which might be wanted by the task. Most of that
info would be changed simply by updating a pointer. For example, as
well as there being

thread_excep: resd 1

in a page that the user-mode task can update there would also be

thread_data_p: resd 1

in a page which, to the task, was read-only. On a context switch both
words would be updated. This keeps thread switching minimal yet allows
the user-mode program easy and fast access to the kinds of things it
may want info on. For example, if the program wanted its thread id it
might find it at thread_data_p->id.

Incidentally, these could be wrapped in getter and setter functions
but, alternatively, for performance they can be linked to fixed
locations.

FWIW:
if one uses PE/COFF DLLs, typically all accesses to imported global
variables are indirect (and require explicit declaration);
if one uses ELF, potentially nearly all access to global variables is
indirect (the typical strategy is to access them via the GOT except in
certain special cases).

also, segment overrides are fairly cheap.
the cheapest option then would probably be to include whatever magic
state directly into the TEB/TIB/whatever, then be like:
mov eax, [fs:address]

in some cases, accessing a variable this way could in-fact be
potentially cheaper than accessing a true global.

x86-64 is a little better, since the CPU defaults to relative
addressing, but is messed up some by ELF and the SysV/AMD64 ABI by
demanding that all access to globals still be via the GOT. the slight
advantage though is that one doesn't need dllimport/dllexport
annotations to import/export functions and variables, but overall I
prefer PE/COFF DLLs more.

back when I was developing an OS (so long ago), I also used PE/COFF.

a possible idea (for an alternative to both a GOT and the traditional
DLL imports, if developing a custom format for binaries, or just
tweaking PE/COFF) could be the use of a relocation table for any imports
(potentially in a semi-compressed form). the main issue is mostly on
x86-64, how to best deal with the 2GB window issue for global variables
if using this strategy (besides just requiring all libraries to be in
the low 2GB, or using indirect addressing if the compiler and/or linker
can't determine if the variable will be within the 2GB window).

this would probably be mostly applicable in the case where one is
reusing an existing compiler but writing a custom linker, and probably
while using a non-ELF object format (such as COFF).

possible nifty features one could add in such a case: fat-binaries,
partial late-binding (sadly, non-trivial cases would require compiler
support), ...

I suppose it depends on the normal first level of selecting
exceptions. The best way might be whatever the handler uses to first
distinguish them. Most code I've seen or written is interested, at the
top level, in which *type* of exceptions to catch and which to ignore
and it generally does that by looking at the exception type:
indexerror, valueerror, computationerror etc. That coupled with the
fact that multiple exceptions can be outstanding at the same time
suggested the use of a bit array but it's not the only option.

You mention user-defined exceptions. I planned only one bit for them
(as there is an arbitrary number of them). To distinguish one user
exception from another would probably require a call to a routine that
examines the detail.

probably ok for a single app, but is dubious for an OS or general mechanism.

I see your point. The problem is that this allows only one exception
at a time to be signalled-but-not-yet-handled. Which one? I suppose
the most critical one or the first one or the last one could be chosen
to be marked in the indicator word.

errm... typical exception mechanisms only allow a single exception at a
time. as soon as an exception is thrown, it is handled immediately. no
delays or queuing are allowed. if an exception occurs within a handler,
this may either throw the new exception (the prior one is forgotten), or
simply kill the app.

Windows does this: if something goes wrong and an exception can't be
thrown in-app, Windows will simply kill the process. if something goes
wrong in-kernel, it is a blue-screen / BSOD.

likewise for the CPU:
if an exception occurs while the CPU is trying to invoke an exception
handler, this is a "double fault"; a further fault at that point (a
triple fault) generally leads to a reboot.

an exception is not a status code...

<<END
Just on this point, and bringing it back specifically to C, if TLS is
hard to obtain or slow to access there are two other possibilities
that I don't think have been mentioned yet that spring to mind for use
in C.

it is slow, yes, but typically not enough to care about.
for most things people don't worry about a few clock-cycles here and
there (generally, it is big algorithm-level stuff that kills
performance, and not a lack of sufficient micro-optimization).

1. If only running a single thread simply use a global. Job done.

except when globals are slower.

2. Use errno. Set it to zero at the start and check it where
appropriate. On a user-detected exception that does not already set
errno set it to a value which is outside the normal range (and create
or append to the detailed exception object, as before).

FWIW, depending on the C-library, errno may be in-fact a function call
wrapped in a macro.

Using errno does only allow one exception to be indicated at a time
but that's the same as what I understand your suggestion to be. It
also doesn't work between languages (unless they obtain the address of
errno) but I think it could work well for C.
END

At the end of the day, any scheme could be used. For performance the
idea is that the exception-indicating word is either zero or non-zero.
If it's non-zero, one or more exceptions have occurred and must be
dealt with.

except that "exceptions as a status word" aren't really "exceptions" in
the traditional sense.

may as well just call them status-codes, and leave the whole mess of
exception-handling out of this.

Joe keane · Jan 3, 2012

* Option 3. The function detecting an exception creates an exception
object and sets an exception type in a thread-specific variable. Any
calling functions, after each call (or group of calls) test that
variable. If non-zero they jump to a local exception handler if there
is one or jump to a return from the current function if not. This does
involve a test and branch for each call but that is very fast (fusible
and single cycle) and it avoids the need for exception handling
context to be pushed and popped. It would also preserve the branch
prediction of calls and returns.
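
For readers who want to see the shape of Option 3, here is a rough C11 sketch. All names (thread_excep as a C variable, EXC_RANGE, checked_div, ratio_sum, safe_ratio_sum) are made up for illustration, and the exception object itself is omitted; only the per-thread indicator word and the test-and-branch after each call are shown.

```c
/* Hypothetical exception types; zero means "no exception pending". */
enum { EXC_NONE = 0, EXC_RANGE = 1 };

/* One indicator word per thread (C11 thread-local storage). */
static _Thread_local int thread_excep = EXC_NONE;

/* The detecting function sets the word and returns a dummy value. */
static int checked_div(int a, int b)
{
    if (b == 0) {
        thread_excep = EXC_RANGE;
        return 0;
    }
    return a / b;
}

/* No local handler here: test the word after each call and, if it is
   set, return at once so the caller can deal with it. */
static int ratio_sum(int a, int b, int c)
{
    int x = checked_div(a, c);
    if (thread_excep) return 0;
    int y = checked_div(b, c);
    if (thread_excep) return 0;
    return x + y;
}

/* This caller has a local handler: jump to it, clear the word, recover. */
static int safe_ratio_sum(int a, int b, int c, int fallback)
{
    int r = ratio_sum(a, b, c);
    if (thread_excep) goto handler;
    return r;
handler:
    thread_excep = EXC_NONE;
    return fallback;
}
```

The cost on the normal path is one load, test and untaken branch per call site; nothing is pushed or popped.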

I'm baffled as to why one would do that.

It's like using a global variable to pass information between a function
and a function it calls or the reverse. Then you realize that it
doesn't work for multi-threaded and make it more complicated.

It still doesn't work when you get an exception, then the cleanup code
itself gets an exception. [Of course, in some cases you just throw up
your hands and call abort(), but this often can be handled.] Presumably
the callers want the -first- error code but you overwrote it.

If the stack works [or registers] why not use that?
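
One way to read the point about the first error, sketched in C with the error code passed back through the return value. do_work, do_cleanup and run are hypothetical names, and the codes 5 and 9 are arbitrary.

```c
/* Hypothetical steps: each returns 0 on success or a non-zero error code. */
static int do_work(int fail)    { return fail ? 5 : 0; }
static int do_cleanup(int fail) { return fail ? 9 : 0; }

static int run(int work_fails, int cleanup_fails)
{
    int err = do_work(work_fails);

    /* Cleanup always runs, and its result lives in its own local, so it
       cannot overwrite an error that was detected earlier. */
    int cleanup_err = do_cleanup(cleanup_fails);

    if (err == 0)
        err = cleanup_err;  /* report the cleanup failure only if it came first */
    return err;
}
```

Because each code is a local on the stack, a failure during cleanup does not clobber the original one, and no thread-specific storage is involved.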

James Harris · Jan 4, 2012

On Dec 31 2011, 2:24 am, BGB wrote:

....

Yes. The task switcher would include code along the lines of

    push [thread_excep]

for the outgoing task and, for the incoming task,

    pop [thread_excep]

Since the word would be checked many times between task switches this
is much faster than following a structure off FS or similar.

possibly, however the issue then partly becomes:
how does the task switcher know where the variable is?...

Isn't this easy if you have a flat address space: just put the
variable at a fixed location?

possible options:
do it like the BIOS, with every task having a dedicated kernel area
which is also accessible to the userspace process;
the shared-object or DLL holding the variable is "owned" by the
OS/kernel (the kernel then depends on the presence of the library and on
the location of the variable within the library).
....

FWIW:
if one uses PE/COFF DLLs, typically all accesses to imported global
variables are indirect (and require explicit declaration);
if one uses ELF, potentially nearly all accesses to global variables are
indirect (the typical strategy is to access them via the GOT except in
certain special cases).

Does this mean that the global, G, in the following would not be
accessed directly?

int G;
void sub() {
    while (G > 0) {
        G -= 1;
        sub();
    }
}
int main(int argc, char **argv) {
    G = argc;
    sub();
    return 0;
}

also, segment overrides are fairly cheap.

True. It obviously depends on the CPU but in my tests, while segment
loading cost a bit, segment overrides were completely free.

the cheapest option then would probably be to include whatever magic
state directly into the TEB/TIB/whatever, then be like:
mov eax, [fs:address]

in some cases, accessing a variable this way could in-fact be
potentially cheaper than accessing a true global.

As mentioned, I would expect a file-scope global to be accessed
directly - as a machine word at a location that gets filled in by the
linker. Am I wrong? Are you thinking of library code?

....

probably ok for a single app, but is dubious for an OS or general mechanism.

Why is this no good? There can be any number of user exception types
so it's not possible to have one bit for each. The best I could think
of was to have one bit to indicate a user exception and functions to
examine as much other detail as needed.

errm... typical exception mechanisms only allow a single exception at a
time. as soon as an exception is thrown, it is handled immediately. no
delays or queuing are allowed. if an exception occurs within a handler,
this may either throw the new exception (the prior one is forgotten), or
simply kill the app.

If only one exception at a time is needed that makes things
significantly easier. But is only one needed? I was thinking of
exceptions caused in exception handlers. For example, the exception
handler divides by the number it first thought of and triggers an
overflow or it checks an array index and finds it out of bounds. What
does the system do?

It could leave the first exception in place - after all that was the
root cause. Or it could replace the exception information with its own
- after all it has an internal fault. Neither seems obviously best. It
might be best to keep both indications until both have been either
resolved or reported ... but it would be easier to handle if only one
was kept.
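
A small sketch of the "keep both" possibility, in C. The structure and function names (excep_state, excep_raise, excep_clear) are hypothetical; the idea is only that the root cause and the most recent fault are both still available when the time comes to resolve or report them.

```c
/* Hypothetical record: remembers the root cause and the latest fault. */
struct excep_state {
    int first;   /* first exception raised, 0 if none */
    int last;    /* most recent exception raised, 0 if none */
    int count;   /* number raised since the last clear */
};

/* Record an exception without losing one that is already pending. */
static void excep_raise(struct excep_state *s, int code)
{
    if (s->count == 0)
        s->first = code;    /* the root cause is never overwritten */
    s->last = code;
    s->count++;
}

/* Called once everything pending has been resolved or reported. */
static void excep_clear(struct excep_state *s)
{
    s->first = 0;
    s->last = 0;
    s->count = 0;
}
```

A fast path can still test a single word, since count is non-zero exactly when something is pending.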

Windows does this: if something goes wrong and an exception can't be
thrown in-app, Windows will simply kill the process.

If an exception "can't be thrown"? What would prevent it being thrown?

if something goes wrong in-kernel, it is a blue-screen / BSOD.

Not very good, though, is it! In reality I think that Windows or any
OS will handle almost all exceptions. A BSOD should only occur on a
hardware failure such as a machine-check exception. (It looks like
Windows does run third-party drivers as privileged, thus increasing its
vulnerability many-fold.)

likewise for the CPU:
if an exception occurs within an interrupt handler, this is a
"doublefault" and generally leads to a reboot.

an exception is not a status code...

Following your analogy of a CPU you could perhaps think of each
exception condition as an interrupt line that gets asserted until the
condition causing the interrupt has been resolved. I know, as an
analogy it's not perfect. :-(

it is slow, yes, but typically not enough to care about.
for most things people don't worry about a few clock-cycles here and
there (generally, it is big algorithm-level stuff that kills
performance, and not a lack of sufficient micro-optimization).

except when globals are slower.

I'd like to understand why they would be. I tried compiling the code
above with gcc and it seems to refer to G directly. For example, in
sub() the compiler generates

mov DWORD PTR G, %eax

FWIW, depending on the C library, errno may in fact be a function call
wrapped in a macro.

I know. I found that in some systems there is a call which returns the
address of errno. In that case the compiler should (hopefully)
remember that address for each use in a function.

except that "exceptions as a status word" aren't really "exceptions" in
the traditional sense.

may as well just call them status-codes, and leave the whole mess of
exception-handling out of this.

Don't forget the original intention was speed. The point of a status
word, as you call it, is to allow exception handling (or non-handling)
quickly. It is not an end in itself but a means of adding
exceptions to C in a way that recognises exceptions in the right
places while having the lowest possible impact on overall performance.

James
