43 am, (e-mail address removed) (Richard Harter) wrote:
On Wed, 30 Dec 2009 08:39:34 -0800 (PST), spinoza1111
[snip]
Um, the stack of the threads is where you typically put cheap
per-thread data. Otherwise you allocate it off the heap. In the
case of the *_r() GNU libc functions they store any transient data
in the structure you pass it. That's how they achieve thread safety.
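To make the caller-supplied-state idea concrete, here is a minimal sketch in C. The functions next_shared and next_r are hypothetical names invented for this illustration, loosely shaped like the POSIX rand_r interface; they are not the glibc implementation.

```c
#include <assert.h>

/* Non-reentrant style: hidden static state shared by every caller,
   so two threads calling this would interfere with each other. */
unsigned next_shared(void)
{
    static unsigned state = 1;
    state = state * 1103515245u + 12345u;
    return state;
}

/* Reentrant "_r" style (hypothetical): the caller owns the state and
   passes a pointer to it, so each thread can keep its own copy. */
unsigned next_r(unsigned *state)
{
    *state = *state * 1103515245u + 12345u;
    return *state;
}

/* Two independent streams started from the same seed stay in step,
   because neither call touches the other's state. Returns 1 on success. */
int independent_streams(void)
{
    unsigned a = 1, b = 1;
    unsigned a1 = next_r(&a);
    unsigned b1 = next_r(&b);   /* not disturbed by the call on a */
    unsigned a2 = next_r(&a);
    assert(a1 == b1);
    return a1 == b1 && a2 != a1;
}
```

Each thread would keep its own state variable on its stack and hand its address to next_r, which is the arrangement described above.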
It's a clumsy and old-fashioned method, not universally used. It also
has bugola potential.
You see, the called routine is telling the caller to supply him with
"scratch paper". This technique is an old dodge. It was a requirement
in IBM 360 BAL (Basic Assembler Language) that the caller provide the
callee with a place to save the 16 "general purpose registers" of the
machine.
The problem then and now is what happens if the caller is called
recursively by the callee as it might be in exception handling, and
the callee uses the same structure. It's not supposed to but it can
happen.
He's not talking about the technique used in BAL etc. The
transient data is contained within a structure that is passed by
the caller to the callee. The space for the structure is on the
stack. Recursion is permitted.
If control "comes back" to the caller who has stacked the struct and
the caller recalls the routine in question with the same struct, this
will break.
This isn't right; I dare say the fault is mine for being unclear.
C uses call by value; arguments are copied onto the stack. The
upshot is that callee operates on copies of the original
variables. This is true both of elemental values, i.e., ints,
floats, etc, and composite values, i.e., structs.
So, when the calling sequence contains a struct a copy of the
struct is placed on the stack. The callee does not have access
to the caller's struct. To illustrate suppose that foo calls bar
and bar calls foo, and that foo passes a struct to bar which in
turn passes the struct it received to foo. There will be two
copies of the struct on the stack, one created when foo called
bar, and one created when bar called foo.
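A small sketch of the copy semantics described above may help. The struct layout and the function names bar_by_value and caller_copy_survives are made up for this illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical struct for illustration only. */
struct stuff {
    int i;
    char workarea[16];   /* array member: copied along with the struct */
};

/* The callee receives its own copy of the struct and scribbles on it. */
void bar_by_value(struct stuff s)
{
    s.i = 99;
    strcpy(s.workarea, "secret");
}

/* The caller's original is unchanged after the call. Returns mine.i. */
int caller_copy_survives(void)
{
    struct stuff mine = { 1, "" };
    bar_by_value(mine);               /* passes a copy, not the original */
    assert(mine.i == 1);
    assert(mine.workarea[0] == '\0');
    return mine.i;
}
```

Whether the copy physically lives on the stack or in registers is up to the implementation; the language only guarantees that the callee cannot change the caller's object this way.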
Correct. And foo sees bar's scratchpad memory, which is a security
exposure. It creates opportunities for fun and games.
I'm foo. I have to pass bar this struct:
struct { int i; char * workarea; }
i is data. workarea is an area which bar needs to do its work. bar
puts customer passwords in workarea. Control returns to foo in an
error handler which is passed the struct. foo can now see the
passwords.
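The scenario above depends on the struct holding a pointer: the struct itself is copied, but both copies point at the same work area. A minimal sketch, with the hypothetical names struct job, bar_with_pointer and foo_sees_scratch and a placeholder string standing in for the sensitive data:

```c
#include <assert.h>
#include <string.h>

/* Same shape as the struct in the post; the tag name is an assumption. */
struct job {
    int i;
    char *workarea;
};

/* bar gets a copy of the struct, but the copied pointer still refers
   to the caller's buffer. */
void bar_with_pointer(struct job j)
{
    j.i = 99;                         /* changes only bar's copy */
    strcpy(j.workarea, "hunter2");    /* writes into the shared buffer */
}

/* Returns 1 if foo can read what bar left in the work area. */
int foo_sees_scratch(void)
{
    char buffer[32] = "";
    struct job mine = { 1, buffer };
    bar_with_pointer(mine);
    assert(mine.i == 1);              /* the int was copied by value */
    return strcmp(buffer, "hunter2") == 0;
}
```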
Because you violate encapsulation, you have a security hole, right?
Better to use an OO language in which each invocation of the stateful
foo object gets as much memory as it needs.
Let me know if I am missing anything.
As an initial remark, your example is not the kind of thing that
St. Denis was discussing. The "per-thread data" is data that is
passed down to routines in a thread. The top level routines in
the thread get passed a copy of the per-thread data struct; in
turn they pass copies down to the routines they call. Either the
struct has no pointers at all or, if it does have pointers, they
are opaque pointers. There is no path for passing data up.
As a second initial remark, your example should be qualified.
Generally speaking, functions that need scratch space provide it
internally. For the sake of argument, let's suppose that there
is some good reason for supplying scratch space.
You don't "have to pass bar this struct" - there are a number of
alternatives, perhaps too many. Here are some:
(1) If we know the size of work area we can make it an array.
The struct is:
struct stuff {
int i;
char workarea[size]; /* size must be a compile-time constant */
};
Foo and bar will have separate copies of the workarea. The
upside is that there is no security hole. The down side is that
there will be two copies on the stack.
(2) We can malloc the space in the calling sequence. The
following won't pass muster in a code review but it has the
general idea:
struct stuff {int i; void * workarea;} barbill;
barbill.workarea = malloc(size);
bar(barbill);
In this version, bar frees the workarea. The upside is that
there is only one copy of the work area. The downside is that
bar must free the work area space.
(3) Bar can zero out the space when it is done with it.
(4) Foo does not call bar directly; instead it calls an interface
routine with a handle to bar as an argument; the interface
routine calls bar and takes care of zeroing out the space.
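Options (2) through (4) can be combined in one rough sketch. Everything named here (worker_fn, wipe, call_with_scratch, bar_worker) is hypothetical, and the volatile loop is only one common way to discourage the compiler from dropping the zeroing stores; where available, a platform facility such as memset_s or explicit_bzero may be a better choice.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Handle type for the routine being wrapped (an assumption). */
typedef int (*worker_fn)(int i, char *workarea, size_t size);

/* Zero n bytes through a volatile pointer so the stores are less
   likely to be optimized away just before free(). */
void wipe(void *p, size_t n)
{
    volatile unsigned char *b = p;
    while (n--)
        *b++ = 0;
}

/* Interface routine: allocates the work area, calls the worker,
   then zeroes and frees the space. Returns -1 if malloc fails. */
int call_with_scratch(worker_fn fn, int i, size_t size)
{
    char *workarea = malloc(size);
    if (workarea == NULL)
        return -1;
    int result = fn(i, workarea, size);
    wipe(workarea, size);
    free(workarea);
    return result;
}

/* Stand-in for bar: uses the scratch space, returns a value. */
int bar_worker(int i, char *workarea, size_t size)
{
    if (size < 8)
        return -1;                    /* not enough scratch space */
    strcpy(workarea, "secret");
    return i + (int)strlen(workarea);
}
```

Foo would then call call_with_scratch(bar_worker, 1, 32) and never holds a pointer to the work area at all, so there is nothing for it to read afterwards.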