On 12/30/2011 3:38 PM, James Harris wrote:
I've snipped the bits about how TLS is generally accessed. Thanks for
explaining that info. It was new to me.
...
I was thinking of thread_excep being at a fixed location so it could
literally be checked with one instruction:
if it is a global though, it is not TLS.
otherwise, I guess it would need to be swapped around by the scheduler
or something (register the variables with the scheduler, which then
saves/restores their values on context switches or something?...).
Yes. The task switcher would include code along the lines of
    push dword [thread_excep]
for the outgoing task and, for the incoming task,
    pop dword [thread_excep]
Since the word would be checked many times between task switches this
is much faster than following a structure off FS or similar.
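A minimal C sketch of the save/restore described above, assuming the switcher already knows where the word lives. The names struct task, saved_excep, switch_task and exception_pending are made up for illustration and are not from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fixed-location word that user code tests for pending exceptions. */
uint32_t thread_excep;

/* Hypothetical per-task save area kept by the task switcher. */
struct task {
    uint32_t saved_excep;   /* value of thread_excep while the task is not running */
};

/* Rough C equivalent of "push [thread_excep]" for the outgoing task and
   "pop [thread_excep]" for the incoming one. */
void switch_task(struct task *outgoing, struct task *incoming)
{
    assert(outgoing != incoming);
    outgoing->saved_excep = thread_excep;   /* save the outgoing task's word */
    thread_excep = incoming->saved_excep;   /* restore the incoming task's word */
}

/* The frequent check: one load and compare against a fixed address. */
int exception_pending(void)
{
    return thread_excep != 0;
}
```

The point of the sketch is only that the cost moves into switch_task, which runs rarely, while exception_pending stays a plain global access.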
possibly, however the issue then partly becomes:
how does the task switcher know where the variable is?...
possible options:
do it like the BIOS, with every task having a dedicated kernel area
which is also accessible to the userspace process;
the shared-object or DLL holding the variable is "owned" by the
OS/kernel (the kernel then depends on the presence of the library and on
the location of the variable within the library).
...
As thread_excep would be accessed very frequently it would be changed
explicitly on a context switch. That doesn't apply to the bulk of the
thread info, though, which might be wanted by the task. Most of that
info would be changed simply by updating a pointer. For example, as
well as there being
    thread_excep: resd 1
in a page that the user-mode task can update there would also be
    thread_data_p: resd 1
in a page which, to the task, was read-only. On a context switch both
words would be updated. This keeps thread switching minimal yet allows
the user-mode program easy and fast access to the kinds of things it
may want info on. For example, if the program wanted its thread id it
might find it at thread_data_p->id.
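As a rough illustration of the pointer word, here is a C sketch in which the bulk per-thread info changes just by repointing thread_data_p. The struct layout and the function names publish_thread_data and current_thread_id are assumptions; the read-only page mapping cannot be expressed in plain C, so a const pointee stands in for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of the bulk per-thread info. */
struct thread_data {
    uint32_t id;
    /* ... further fields the task may want to read ... */
};

/* In a real system this word would sit in a page mapped read-only to the
   task; here the const pointee only approximates that. */
const struct thread_data *thread_data_p;

/* Kernel side: on a context switch, point the word at the incoming
   thread's info instead of copying the info itself. */
void publish_thread_data(const struct thread_data *incoming)
{
    assert(incoming != NULL);
    thread_data_p = incoming;
}

/* Task side: the equivalent of reading thread_data_p->id. */
uint32_t current_thread_id(void)
{
    assert(thread_data_p != NULL);
    return thread_data_p->id;
}
```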
Incidentally, these could be wrapped in getter and setter functions
but, alternatively, for performance they can be linked to fixed
locations.
FWIW:
if one uses PE/COFF DLLs, typically every access to an imported global
variable is indirect (and the import requires an explicit declaration);
if one uses ELF, potentially nearly all access to global variables is
indirect (the typical strategy is to access them via the GOT except in
certain special cases).
also, segment overrides are fairly cheap.
the cheapest option then would probably be to include whatever magic
state directly into the TEB/TIB/whatever, then be like:
    mov eax, [fs:address]
in some cases, accessing a variable this way could in fact be
cheaper than accessing a true global.
x86-64 is a little better, since the CPU defaults to relative
addressing, but is messed up some by ELF and the SysV/AMD64 ABI, where
position-independent code in shared objects still goes via the GOT for
access to exported globals. the slight advantage though is that one
doesn't need dllimport/dllexport annotations to import/export functions
and variables, but overall I prefer PE/COFF DLLs more.
back when I was developing an OS (so long ago), I also used PE/COFF.
a possible idea (for an alternative to both a GOT and the traditional
DLL imports, if developing a custom format for binaries, or just
tweaking PE/COFF) could be the use of a relocation table for any imports
(potentially in a semi-compressed form). the main issue is mostly on
x86-64, how to best deal with the 2GB window issue for global variables
if using this strategy (besides just requiring all libraries to be in
the low 2GB, or using indirect addressing if the compiler and/or linker
can't determine if the variable will be within the 2GB window).
this would probably be mostly applicable in the case where one is
reusing an existing compiler but writing a custom linker, and probably
while using a non-ELF object format (such as COFF).
possible nifty features one could add in such a case: fat-binaries,
partial late-binding (sadly, non-trivial cases would require compiler
support), ...
I suppose it depends on the normal first level of selecting
exceptions. The best way might be whatever the handler uses to first
distinguish them. Most code I've seen or written is interested, at the
top level, in which *type* of exceptions to catch and which to ignore
and it generally does that by looking at the exception type:
indexerror, valueerror, computationerror etc. That coupled with the
fact that multiple exceptions can be outstanding at the same time
suggested the use of a bit array but it's not the only option.
You mention user-defined exceptions. I planned only one bit for them
(as there is an arbitrary number of them). To distinguish one user
exception from another would probably require a call to a routine that
examines the detail.
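The bit-array idea could be sketched in C roughly as follows. The bit assignments, the names and the single user_excep_code detail slot are all assumptions made for the example; the thread does not fix any of them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit assignments, one bit per built-in exception type. */
#define EXC_INDEX       UINT32_C(0x00000001)
#define EXC_VALUE       UINT32_C(0x00000002)
#define EXC_COMPUTATION UINT32_C(0x00000004)
#define EXC_USER        UINT32_C(0x80000000)  /* shared by all user exceptions */

uint32_t excep_word;      /* stands in for thread_excep */
int user_excep_code;      /* detail needed to tell user exceptions apart */

/* Mark a built-in exception as signalled but not yet handled. */
void raise_excep(uint32_t bit)
{
    assert(bit != 0);
    excep_word |= bit;
}

/* User exceptions share one bit; the code is kept on the side. */
void raise_user_excep(int code)
{
    user_excep_code = code;
    excep_word |= EXC_USER;
}

/* Return the pending bits among `wanted` and clear them, leaving any
   other outstanding exceptions in place. */
uint32_t take_pending(uint32_t wanted)
{
    uint32_t hit = excep_word & wanted;
    excep_word &= ~hit;
    return hit;
}

/* The "call to a routine that examines the detail". */
int user_excep_detail(void)
{
    return user_excep_code;
}
```

A handler interested only in index and value errors would call take_pending(EXC_INDEX | EXC_VALUE) and leave everything else for an outer handler.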
probably ok for a single app, but dubious for an OS or as a general mechanism.
I see your point. The problem is that this allows only one exception
at a time to be signalled-but-not-yet-handled. Which one? I suppose
the most critical one or the first one or the last one could be chosen
to be marked in the indicator word.
errm... typical exception mechanisms only allow a single exception at a
time. as soon as an exception is thrown, it is handled immediately. no
delays or queuing are allowed. if an exception occurs within a handler,
this may either throw the new exception (the prior one is forgotten), or
simply kill the app.
Windows does this: if something goes wrong and an exception can't be
thrown in-app, Windows will simply kill the process. if something goes
wrong in-kernel, it is a blue-screen / BSOD.
likewise for the CPU:
if an exception occurs while the CPU is trying to invoke the handler for
an earlier one, this is a "double fault"; if that can't be handled
either, it is a triple fault, which generally leads to a reboot.
an exception is not a status code...
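To make the contrast concrete, here is a small C sketch of "thrown and handled immediately" using setjmp/longjmp. It is only an illustration, not how Windows or any particular runtime implements exceptions, and the names current_handler, throw_excep, half_or_throw and try_half are invented for it.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

#define EXC_ODD 3           /* hypothetical exception code */

jmp_buf *current_handler;   /* innermost active handler, NULL if none */
int thrown_code;            /* code of the exception in flight */

/* Control leaves here at once; nothing is queued or checked later. */
void throw_excep(int code)
{
    /* With no handler a real system would kill the process instead. */
    assert(current_handler != NULL);
    thrown_code = code;
    longjmp(*current_handler, 1);
}

int half_or_throw(int x)
{
    if (x % 2 != 0)
        throw_excep(EXC_ODD);
    return x / 2;
}

/* Returns 0 and stores the result on success, else the exception code. */
int try_half(int x, int *out)
{
    jmp_buf here;
    jmp_buf *outer = current_handler;

    if (setjmp(here) != 0) {        /* arrives here straight from the throw */
        current_handler = outer;
        return thrown_code;
    }
    current_handler = &here;
    *out = half_or_throw(x);        /* the assignment is skipped on a throw */
    current_handler = outer;
    return 0;
}
```

Note that the code after the failing call never runs, which is the difference from setting a status word and carrying on.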
<<END
Just on this point, and bringing it back specifically to C, if TLS is
hard to obtain or slow to access there are two other possibilities
that I don't think have been mentioned yet that spring to mind for use
in C.
it is slow, yes, but typically not enough to care about.
for most things people don't worry about a few clock-cycles here and
there (generally, it is big algorithm-level stuff that kills
performance, and not a lack of sufficient micro-optimization).
1. If only running a single thread simply use a global. Job done.
except when globals are slower.
2. Use errno. Set it to zero at the start and check it where
appropriate. On a user-detected exception that does not already set
errno set it to a value which is outside the normal range (and create
or append to the detailed exception object, as before).
FWIW, depending on the C library, errno may in fact be a function call
wrapped in a macro.
Using errno does only allow one exception to be indicated at a time
but that's the same as what I understand your suggestion to be. It
also doesn't work between languages (unless they obtain the address of
errno) but I think it could work well for C.
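A sketch of option 2 in C, assuming a value such as 0x4000 really is outside the range the C library uses for errno (that would need checking per platform). EXC_USER_ERRNO and parse_percent are hypothetical names, and the detailed exception object is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical code, assumed not to collide with the library's own values. */
#define EXC_USER_ERRNO 0x4000

/* Parses a percentage in 0..100. The caller sets errno to zero first and
   checks it afterwards, as described in the text. */
long parse_percent(const char *text)
{
    char *end;
    long v;

    assert(text != NULL);
    v = strtol(text, &end, 10);        /* sets errno to ERANGE on overflow */
    if (errno == 0 && (end == text || v < 0 || v > 100))
        errno = EXC_USER_ERRNO;        /* user-detected exception */
    return v;
}
```

Usage would be errno = 0; v = parse_percent(s); followed by a check of errno at whatever point is convenient, which again allows only one outstanding indication at a time.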
END
At the end of the day, any scheme could be used. For performance the
idea is that the exception-indicating word is either zero or non-zero.
If it's non-zero, one or more exceptions have occurred and must be
dealt with.
except that "exceptions as a status word" aren't really "exceptions" in
the traditional sense.
may as well just call them status-codes, and leave the whole mess of
exception-handling out of this.