Newsgroups: comp.lang.ada,comp.lang.c++,comp.realtime,comp.software-eng
Subject: Re: Teaching new tricks to an old dog (C++ -->Ada)
References: <
[email protected]> <
[email protected]> <
[email protected]> <
[email protected]> <
[email protected]> <
[email protected]> <
[email protected]>
From: Robert A Duff <
[email protected]>
Organization: The World Public Access UNIX, Brookline, MA
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2
Date: 14 Mar 2005 17:35:28 -0500
Message-ID: <
[email protected]>
Lines: 139
--text follows this line--
CTips said:
My knowledge of this is a little dated and hazy, so I could be wrong.
If we're doing something like:
foo()
{
int x, y;
bar()
{
int y;
gah()
{
use(x), use(y);
}
}
}
then, at run-time, in use(x), x comes from the stack-frame of the
nearest invocation of foo() on the stack, and in use(y), y comes from
the stack-frame of the nearest invocation of bar().
Right. Note that multi-level nesting as in this example is rare in
languages that allow nesting. (And of course nonexistent in languages
that don't!) The vast majority of procedures are not nested (level 0),
some are nested (level 1), and extremely few are more nested (level 2 or
more). So to implement this stuff efficiently, the compiler writer
should keep this in mind.
There has to be a mechanism to identify the nearest invocation of
foo()/bar(). There are, if I remember correctly, 4 mechanisms to find
the invocation:
- dynamic chaining: you just follow the back-chain pointers, stepping
through all stack frames, till you come to the right stack frame.
I don't know of any compilers that do that. To make that work, you'd
have to have some way to identify the "right" stack frame, and I can't
think of any way to do that efficiently.
- static chianing: you maintain a separate chain (the static chain) that
directly link to the stack frame of the enclosing functions. Thus the
static chain for gah() would have the last invocation of bar() followed
by the last invocation of foo().
That's the method I prefer. The best way to think about it is: the
static link is an extra parameter passed to the called function.
Each nested function needs a parameter that somehow represents its
context.
Given that multi-level nesting is rare, the chain is usually length 1
(or we don't need to do anything at all, for the nonnested ones).
- display: somewhat like the static chain, except its an array instead
of a list
To me, that seems like an attempt to optimize multi-level nesting, which
is why I don't prefer it.
- currying (?): this one I'm really hazy about, but, roughly, you passed
pointers to the correct uplevel variables as extra arguments to the
functions. Thus bar would be passed the pointer to x and gah would be
passed pointers to x and y, as well as their other arguments, or
something like that.
Yes, something like that makes sense; I view it as an optimization of
static chains. It's not what I know as "currying", though, which is
something completely different. In the example you gave, you can pass x
and y, not pointers to them.
There were some additional nastinesses dealing with what happens when a
nested function is passed as an argument
It's really no big deal. When passing a nested procedure to an outer
one, you need to pass an extra parameter indicating the enclosing
context. When using static chains, this is just the static link.
Or, in some cases, optimize by doing what you called "currying" above.
(Which is just like optimizing by passing the components of a struct
instead of passing a pointer to it.)
Note that Ada distinguishes (syntactically) the case where the passed
procedure can be nested, and the case where it cannot. The case where
it cannot is handled exactly as in C -- pass just the address of the
procedure's code.
(Actually, that's an oversimplification -- the nonnested case is really
the same-nested case, which includes the C case of nonnested.)
Or, in many cases, inline the whole mess, and there's no call overhead
at all.
Yes, there is some cost -- it complexifies the compiler, for one thing.
But the run-time cost is really near zero.
By the way, there's another method you didn't mention. The gcc dialect
of C supports nested functions (and passing those as parameters), and
they're implemented using trampolines.
Depending on the techinques, you either pay whenever you access
variables of enclosing functions, or you pay to maintain the
data-structures [static-chain,display] which make this access cheaper,
or both.
Right. But in most cases, you pay nothing, and in the cases where you
pay, you're getting something for it (like, not having to pass extra
parameters, which you would have to pay for instead).
On some implementations (on RISC architectures) a register is reserved
for the static-chain. This means one less register for *every* function
(or maybe only for enclosed functions?) when used with a language that
support this kind of nested functions.
Only for nested functions. And when passing functions as parameters, if
the function *might* be nested (which, in Ada, the compiler knows).
C++ probably left it out because of its C heritage,
Yes.
... while Ada probably
dropped it in because of its Pascal heritage.
Perhaps, but it's not just Pascal. It's pretty much every language from
Algol and Lisp onward. This feature is really extremely useful!
... IMHO, its probably not
worth the cost.
Well, my philosophy is: the (run-time) cost is irrelevant. If you need
some feature, you need it, and if it's not in the language, you have to
implement it yourself, and that will cost at least as much. What's
relevant is distributed overhead: if you pay the cost for a feature when
you *don't* use a feature, then that's a good reason to leave it out of
the language. IMHO, to argue against nested procedures, you have to
either argue that they're not very useful, or you have to argue that
they cause distributed overhead (cost when you don't need them). You
can't just argue that they're slow, because the alternatives
(implemented by hand) are just as slow.
- Bob