Stack usage

  • Thread starter Michael Sig Birkmose
  • Start date
M

Malcolm

Richard Bos said:
You don't understand what he's saying. There is _no_ way to
predict, in advance, what path an arbitrarily complex program with
arbitrary human input is going to take. It is tantamount to the halting
problem.
Comparing the drawing of a call-tree, even of a large program, to the
halting problem is ridiculous.
There are problems with recursive functions, and with function pointers
(this last more theoretical than a real issue), but for most structured
programs it should not be too difficult.
 
M

Michael Wojcik

Run time efficiency. Maintaining a stack-like data structure has lower
overhead than the alternatives.

I figured that was what you meant. However, run-time efficiency is
not the sole metric of quality, and an implementation could prefer
a mechanism other than a (contiguous) stack for at least two reasons:
robust response to undefined behavior, and providing extensions to
the language (such as call-with-current-continuation).

Indeed, I think a C implementation that used non-contiguous activation
records with guard pages for function calls would be very handy
indeed, for development code - similar to an Electric Fence but for
activation records (and anything stored adjacent to them).

--
Michael Wojcik (e-mail address removed)

Unlikely predition o' the day:
Eventually, every programmer will have to write a Java or distributed
object program.
-- Orfali and Harkey, _Client / Server Programming with Java and CORBA_
 
M

Martin Dickopp

Malcolm said:
Comparing the drawing of a call-tree, even of a large program, to the
halting problem is ridiculous.

"Large" is not the same as "arbitrary". In fact, a program small enough
that it can be posted here suffices to show that drawing a call tree for
an arbitrary program is impossible.

Image a function `p' with the following prototype:

int p (const char *);

This function should accept C source code (in the form of a string) as
an argument. I don't require it to draw a complete call tree; all it
has to do is to determine whether the code passed as argument calls a
function `f'. If yes, it should return a nonzero value, otherwise zero.
Do you think such a function could in principle be written?

Consider the following translation unit. It should be linked together
with the implementation of the `p' function, and it should be stored
in a file `t.c' (so that it can read its own source code by reading a
text file `t.c').

#include <stdio.h>
extern int p (const char *);
void f (void) { }

int main (void)
{
FILE *fp;
char mycode [10000];

fp = fopen ("t.c", "r");
if (fp != NULL)
{
const size_t n = fread (mycode, 1, sizeof mycode - 1, fp);
fclose (fp);
mycode [n] = '\0';
if (!p (mycode))
f ();
}

return 0;
}

Let's assume for simplicity that no read error occurs when the program
reads its own source code. What will `p' return? Will it claim that
the code calls `f' or will it claim that it doesn't?

Martin
 
M

Malcolm

"Large" is not the same as "arbitrary". In fact, a program small
enough that it can be posted here suffices to show that drawing a
call tree for an arbitrary program is impossible.

Image a function `p' with the following prototype:

int p (const char *);

This function should accept C source code (in the form of a string) as
an argument. I don't require it to draw a complete call tree; all it
has to do is to determine whether the code passed as argument calls a
function `f'. If yes, it should return a nonzero value, otherwise zero.
Do you think such a function could in principle be written?

Consider the following translation unit. It should be linked together
with the implementation of the `p' function, and it should be stored
in a file `t.c' (so that it can read its own source code by reading a
text file `t.c').

#include <stdio.h>
extern int p (const char *);
void f (void) { }

int main (void)
{
FILE *fp;
char mycode [10000];

fp = fopen ("t.c", "r");
if (fp != NULL)
{
const size_t n = fread (mycode, 1, sizeof mycode - 1, fp);
fclose (fp);
mycode [n] = '\0';
if (!p (mycode))
f ();
}

return 0;
}

Let's assume for simplicity that no read error occurs when the program
reads its own source code. What will `p' return? Will it claim that
the code calls `f' or will it claim that it doesn't?
Firstly you often generate logical conundrums by applying a function (or a
statemement) to itself, eg "Everything I say is a lie".

Secondly, we want the call tree, so we simply assume that a conditional
branch can go each way. This is even in the case of statements such as if(
pow(x, 2) >= 0). We do not need to know if a function is actually called.
Normally we assume that if the programmer has put in a conditional branch
then he doesn't know which will be true at compile time.

In your example, f() is clearly part of the call tree. If p() returns false
it is taken. We don't need to know about the internals of p().
 
J

James Kanze

(e-mail address removed) (Dan Pop) writes:

|> In <[email protected]> CBFalconer <[email protected]>
|> writes:
|> >Peter Nilsson wrote:
|> >>> Michael Sig Birkmose wrote:

|> >>>| Does anyone know, if it is possible to meassure the maximum
|> >>>| stack usage of a C program throughout it's entire execution?

|> >>> It may be, but it is irrelevant here. As far as standard C is
|> >>> concerned, there is no such thing as a 'stack'. ...

|> >> Well, yes there is. It just isn't ever explicitly called a stack.
|> >> But the behaviour of function calls is well defined and
|> >> consistent with the notion of stacks.

|> >Nonsense. I can fairly easily design a machine with no stack, no
|> >call instruction, and have it execute C programs.

|> You forgot to engage your brain while reading Peter Nilsson's post.

|> It doesn't matter how function calls work, implementing C without at
|> least one stack data structure, even if conceivable, is not going to
|> happen, due to QoI reasons.

Well, by definition the data structure for auto variables and return
addresses is a stack. How the stack is implemented, of course, is
implementation defined -- I once used a C compiler in which function
prologues called malloc to obtain the local stack frame.
 
R

Richard Bos

Malcolm said:
Comparing the drawing of a call-tree, even of a large program, to the
halting problem is ridiculous.

It's not ridiculous, it's correct. It's nothing to do with mere size,
btw, and everything with logical complexity. If you want an accurate
prediction of how much memory a program could take, you have to be able
to predict which program paths it takes, and that means you have to
predict where it halts.
If, OTOH, you're satisfied with a rough guess, well, just do a
good-sized run, ask your OS how much memory it takes now, then double
that amount. Much simpler.

Richard
 
C

CBFalconer

Richard said:
It's not ridiculous, it's correct. It's nothing to do with mere
size, btw, and everything with logical complexity. If you want an
accurate prediction of how much memory a program could take, you
have to be able to predict which program paths it takes, and that
means you have to predict where it halts.

If, OTOH, you're satisfied with a rough guess, well, just do a
good-sized run, ask your OS how much memory it takes now, then
double that amount. Much simpler.

On the contrary, speaking C, and ignoring memory usage via malloc
and friends, and also (IMPORTANT) assuming no recursion, we can
find maximum local (automatic) memory use for each function, and
then deduce the maximum usage from the call tree. The values will
not be portable, i.e. they will be system and compiler dependant.

Simple recursive functions can usually be also measured if they
properly protect themselves against bad calls, by examining the
conditions that terminate recursion. For example, a recursive
function to output a char representation of an integer is
intrinsically limited by the size of an integer, and thus the
maximum number of digits possible.

Once mutual recursion is encountered things become much murkier.
The comparison to the halting problem is valid for arbitrary
programs, but most code is not arbitrary, or at least we hope so.
 
I

I. Appel

CBFalconer said:
Richard Bos wrote:
On the contrary, speaking C, and ignoring memory usage via malloc
and friends, and also (IMPORTANT) assuming no recursion, we can
find maximum local (automatic) memory use for each function, and
then deduce the maximum usage from the call tree. The values will
not be portable, i.e. they will be system and compiler dependant.

People, you're reinventing the bycicle! Fortran 77 was designed to make
programs portable to stackless iron, so it permitted no recursion and
AFAIR no equivalent to malloc.

Ivan.
 
O

ozbear

It's not ridiculous, it's correct. It's nothing to do with mere size,
btw, and everything with logical complexity. If you want an accurate
prediction of how much memory a program could take, you have to be able
to predict which program paths it takes, and that means you have to
predict where it halts.
If, OTOH, you're satisfied with a rough guess, well, just do a
good-sized run, ask your OS how much memory it takes now, then double
that amount. Much simpler.

Richard


For a crude overestimate, assuming no recursion, one could just add
up all the automatic storage used in all functions irrespective
of whether they are ever called or not, then tack on the rest of
the variables (file-scope variables, statics, etc.).

Oz
 
A

August Derleth

People, you're reinventing the bycicle! Fortran 77 was designed to make
programs portable to stackless iron, so it permitted no recursion and
AFAIR no equivalent to malloc.

I believe Fortran 77 allowed recursion, but earlier versions did not.

And Fortran doesn't have malloc for the same reason Pascal and Perl don't:
It's operating too far above the machine level for malloc to make sense.
The compiler is expected to figure out the low-level stuff.
 
I

I. Appel

August Derleth said:
I believe Fortran 77 allowed recursion, but earlier versions did not.

It didn't.
And Fortran doesn't have malloc for the same reason Pascal and Perl don't:
It's operating too far above the machine level for malloc to make sense.
The compiler is expected to figure out the low-level stuff.

I denoted not the low-level memory blocks allocation, but general use of as
much memory
as you want. In Perl you can manage as much memory as you want, while in
Pascal
(and AFAIR in Fortran 77) all you can is to use arrays, whose size is
defined in the
compilation time.

Ivan.
 
M

Michael Wojcik

It's not ridiculous, it's correct. It's nothing to do with mere size,
btw, and everything with logical complexity. If you want an accurate
prediction of how much memory a program could take, you have to be able
to predict which program paths it takes, and that means you have to
predict where it halts.

I think there's a difference of definition here. I believe Malcolm
is referring not to a map of possible program paths, but a static
call graph.

That is, he's referring to a directed graph (it's only a tree for
programs where each function has exactly one caller, and all
functions are reachable from main - not the usual case) where nodes
are functions in the program, and there's an edge from node A to node
B if function A calls function B (directly or through a function
pointer).

Since C does not allow the dynamic construction of function calls,
it's possible to construct such a graph by static analysis of the
source. Whether a given edge will actually be followed in any run of
the program is a separate question (and is equivalent to the Halting
Problem).

For a C program which is non-recursive, the call graph will be
acyclic. If nodes are weighted by their automatic storage
requirement, then the maximum weighted depth of the graph can be
computed.

However, there's no mechanism in standard C for computing the
automatic storage requirement of a function; many interesting C
programs are recursive; and in many cases not all of the source is
available for analysis. And even for programs which satisfy the
latter two conditions and an implementation which supplies sufficient
information for the first, all you know is the upper bound on
automatic storage, which is unlikely to be particularly useful in
most cases. So the prospects for exact analysis of automatic storage
requirements are not good.

As a side note, the non-recursive constraint can be avoided. Any
recursive program can be algorithmically transformed into an
iterative one, by rewriting mutually-recursive functions as a single
function, singly-recursive functions with non-tail recursion as
tail-recursive continuation-passing functions, and finally
tail-recursive functions as iterative. Since C doesn't directly
support continuation-passing style, such a transformation would
require a manual implementation of it - which would mean, in effect,
implementing a CPS-capable language on top of C. Hardly worth the
effort. And all that does is transform auto-storage requirement into
direct-allocated storage requirement anyway, which is a zero-sum or
worse tradeoff in most environments.

--
Michael Wojcik (e-mail address removed)

[After the lynching of George "Big Nose" Parrot, Dr. John] Osborne
had the skin tanned and made into a pair of shoes and a medical bag.
Osborne, who became governor, frequently wore the shoes.
-- _Lincoln [Nebraska] Journal Star_
 
R

Rob Thorpe

August Derleth said:
I believe Fortran 77 allowed recursion, but earlier versions did not.

No, Fortran 77 does not require recursion to be supported. Fortran 90
does as long as the function is marked as recursive AFAIK.
And Fortran doesn't have malloc for the same reason Pascal and Perl don't:
It's operating too far above the machine level for malloc to make sense.

Fortran is not operating on a much different level to C. Allocatable
arrays plus the ALLOCATE and DEALLOCATE statements are used to give
the same effect as malloc and free. It also supports variable length
arrays that are allocated automatically, as C99 does.
 
C

CBFalconer

August said:
.... snip ...

I believe Fortran 77 allowed recursion, but earlier versions did
not.

And Fortran doesn't have malloc for the same reason Pascal and
Perl don't: It's operating too far above the machine level for
malloc to make sense. The compiler is expected to figure out the
low-level stuff.

Pascal has the equivalent of malloc. Try "new(p);" where p is a
pointer type. This is almost the equivalent of a c macro:

#define new(p) p = malloc(sizeof *p)

it exists until "dispose(p);"
 
D

Dan Pop

In said:
I believe Fortran 77 allowed recursion, but earlier versions did not.

Nope, it didn't.
And Fortran doesn't have malloc for the same reason Pascal and Perl don't:

Pascal does have *explicit* support for dynamic memory allocation.
It's operating too far above the machine level for malloc to make sense.
The compiler is expected to figure out the low-level stuff.

Nonsense! It's a language issue, not a compiler issue. If the language
doesn't provide support for dynamically sized arrays, the compiler has
nothing to figure out. Many F77 implementations supported a Cray
extension that provided malloc functionality. F90 supports dynamic
memory allocation as a feature long awaited for by the Fortran community.

Dan
 
D

Dan Pop

In said:
On the contrary, speaking C, and ignoring memory usage via malloc
and friends, and also (IMPORTANT) assuming no recursion, we can

malloc and friends are a non-issue when measuring/evaluating the amount of
automatically allocated memory, anyway.
find maximum local (automatic) memory use for each function, and
then deduce the maximum usage from the call tree. The values will
not be portable, i.e. they will be system and compiler dependant.

C99 VLAs render this approach useless: you must simulate the program
execution in order to figure out their sizes.

Dan
 
I

I. Appel

CBFalconer said:
Pascal has the equivalent of malloc. Try "new(p);" where p is a
pointer type. This is almost the equivalent of a c macro:

#define new(p) p = malloc(sizeof *p)

it exists until "dispose(p);"

I'm not sure it was in original Wirth's Pascal and I'm not sure it is
suitable for creating dynamic arrays.

Ivan.
 
M

Malcolm

Michael Wojcik said:
I think there's a difference of definition here. I believe Malcolm
is referring not to a map of possible program paths, but a static
call graph.
Exactly. Consider this

long x, y, z;
scanf("%ld %ld %ld", &x, &y, &z)

x = clamp(x, 2, CUBEROOTLONGMAX);
y = clamp(y, 2, CUBEROOTLONGMAX);
z = clamp(z, 2, CUBEROOTLONGMAX);

if( x*x*x + y*y*y == z*z*z)
thisisnevercalled();


The function call can never be taken, regardless of the size of long (assume
CUBEROOTLONGMAX is calculated dynamically).

However how is an automatic simulator going to determine that? In fact the
average programmer probably couldn't prove it, unless he has a maths degree.

It is also not of practical use. We can imagine a young programmer writing
exactly this function to test Fermat's theorem for himself. If for some
reason he is short of stack space, he will want to know that the function
thisisnevercalled() doesn't overflow the stack.
 
J

James Kanze

|> |> > > And Fortran doesn't have malloc for the same reason Pascal and
|> > > Perl don't: It's operating too far above the machine level for
|> > > malloc to make sense. The compiler is expected to figure out the
|> > > low-level stuff.

|> > Pascal has the equivalent of malloc. Try "new(p);" where p is a
|> > pointer type. This is almost the equivalent of a c macro:

|> > #define new(p) p = malloc(sizeof *p)

|> > it exists until "dispose(p);"

|> I'm not sure it was in original Wirth's Pascal and I'm not sure it
|> is suitable for creating dynamic arrays.

It was in Wirth's Pascal, but a lot of early Pascal compilers skipped
it.
 
A

August Derleth

C99 VLAs render this approach useless: you must simulate the program
execution in order to figure out their sizes.

And hope nobody passes in a value too big to be used to size a VLA of a
given type. How big is too big? Only your debugger (or maintenance
programmer) knows for sure. Until then, you have no way of checking that
your next run won't cause all sorts of crap to happen.

At least malloc() gives me a fighting chance at handling errors in memory
allocation. You can quote at me all the systems which will over-commit to
Hell and back so malloc() can never fail*, but all things equal, I'd
rather have /something/ to test instead of being forced to hope nothing
blows up.

VLAs are the new gets().

*Yes, I'm following that thread with rapt interest.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,143
Messages
2,570,822
Members
47,368
Latest member
michaelsmithh

Latest Threads

Top