Avoiding recursive stack overflow in C on Unix/Linux?

B

boltar2003

You're not taking into account the stack space above the call to main(),
which will include some amount of space used by Libc plus the command
line and environment.

Ah yes, of course. Doh!

Would there be any way to find out how much space they take up or alternatively
a way to find the starting base address of the stack when the process was
first exec'd?

B2003
 
R

Richard Kettlewell

Ah yes, of course. Doh!

Would there be any way to find out how much space they take up or alternatively
a way to find the starting base address of the stack when the process was
first exec'd?

You could extend the stack down (or up as the case may be) until you get
a segfault and round the last good address to the page size. Not very
efficient though.

If you have access to the process's memory map you may be able to
identify the stack (e.g. /proc/self/maps on Linux, possibly analogous
interfaces on other platforms).

Split stack tricks like the gcc feature mentioned earlier will probably
betray you whatever you do...
 
J

James Kuyper

On May 5, 11:37 am, (e-mail address removed) wrote:
Hello
If a program recurses too deeply there's always a chance that it could
run out of stack space and die with a SIGSEGV. Is there any API that can
tell you how much stack space is left or some other method to prevent this
in C/C++? I realise I could catch the signal but thats a pretty damn ugly
way to do it.

The only proper way to avoid stack-overflows is to prevent it from
happening in the first place.

And how do you do that? If the stack size is sufficiently limited, even

� � � � int main(int argc, char*argv[]) { return 0; }

could overflow the stack.

That would be _very_ bad design, omitting to consider the basic
processing needs of the system.
You can take measures to reduce the likelihood of overflow, but there's
no easy way to know whether you've done enough.

Nope, there isn't. But then again, there's no easy way to produce any
software. Designing software _is_ hard and it will remain hard. Anyone
offering an "easy" solution to this kind of considerartions is either
a fool or a fraud.

I'm not complaining about it being too much work to deal with this
problem; I'm complaining about the fact that there's absolutely no
portable way to deal with it, at all. There's not too many non-portable
ways, either, and none that I'm aware of that are not prohibitively
expensive in terms of developer time.
True. But then again, considering the platform(s) your software will
run on is an integral part of the design process.

There are programs whose purpose is inherently platform specific;
failing to take into consideration platform-specific features is indeed
a design flaw for such programs. There are other programs, such as the
ones I work on, that are required to be portable to wide variety of
platforms.

My own programs are allowed to assume support for C90, a correspondingly
early version of POSIX, and three widely ported third-party libraries.
They are required to either work correctly or fail gracefully (if, for
instance, malloc() returns a null pointer) on any platform which meets
those requirements. Thankfully, one of those libraries (HDF) handles a
lot of the portability issues itself, freeing my code to concentrate on
other issues. If my code has to take into consideration
platform-specific characteristics that cannot be tested using code that
meets these requirements, then it has violated a design requirement.

It shouldn't be difficult to meet such a requirement. The fact that C
provides no mechanism for determining whether or not you can safely call
a particular function without overflowing the stack makes it
unnecessarily difficult to do so.
Nice. So what?

So, it would be ridiculous to blindly prohibit the use of techniques
that take advantage of the capabilities of such platforms, just because
they won't work on other platforms. It's a problem with C (though it's
not unique to C) that it provides no way for my code to determine
whether it can safely take advantage of such capabilities.
Avoid recursion if at all possible, for instance by using things like
finite state machines for parsing data. Recursion is quite costly. I'm
surprised i have to explain this in this forum.

And no, it's not _always_ possible to avoid recursion, not is it
_always_ wise to do so, but as a rule of thumb, it's good.

I'm not strongly tempted by recursion, except for traversal of certain
tree-like data structures - an issue which has seldom come up in my work.

But arbitrarily avoiding recursion makes as much sense to me as
arbitrarily avoiding binary trees, or any other design option.
 
J

James Kuyper

On 6 mei, 10:59, (e-mail address removed) wrote: ....

Bravo! That's he first sensible objection I've read in this thread,
and you're right. Recursive algorithms are usually easier to grasp.

I think that this objection was considered so obvious by many of us that
we didn't feel it needed emphasizing. It was referred to indirectly in
some of the other messages people did post. As it turns out, we were
right: you were in fact fully aware of the issue, and reached your
conclusion despite that fact.
 
K

Kleuskes & Moos

On 05/05/2011 03:24 PM, Kleuskes & Moos wrote:
On May 5, 11:37 am, (e-mail address removed) wrote:
Hello
If a program recurses too deeply there's always a chance that it could
run out of stack space and die with a SIGSEGV. Is there any API thatcan
tell you how much stack space is left or some other method to prevent this
in C/C++? I realise I could catch the signal but thats a pretty damnugly
way to do it.
B2003
The only proper way to avoid stack-overflows is to prevent it from
happening in the first place.
And how do you do that? If the stack size is sufficiently limited, even
int main(int argc, char*argv[]) { return 0; }
could overflow the stack.
That would be _very_ bad design, omitting to consider the basic
processing needs of the system.
Nope, there isn't. But then again, there's no easy way to produce any
software. Designing software _is_ hard and it will remain hard. Anyone
offering an "easy" solution to this kind of considerartions is either
a fool or a fraud.

I'm not complaining about it being too much work to deal with this
problem; I'm complaining about the fact that there's absolutely no
portable way to deal with it, at all. There's not too many non-portable
ways, either, and none that I'm aware of that are not prohibitively
expensive in terms of developer time.

Of course there are portable ways of dealing with it. For instance:
it's simple enough to pass long an extra integer parameter and
decrementing some initial setting on every recursion. If the counter
reaches zero, print an error message and abort. Portable as hell. The
only thing you need to take a stab at are sensible initial values,
high enough to allow wellbehaved programs to run, but low enough not
to allow runaway recursion to blast the stack to kingdom come.
There are programs whose purpose is inherently platform specific;
failing to take into consideration platform-specific features is indeed
a design flaw for such programs. There are other programs, such as the
ones I work on, that are required to be portable to wide variety of
platforms.

Which is difficult and takes different and way more consideration of
the platforms involved.
My own programs are allowed to assume support for C90, a correspondingly
early version of POSIX, and three widely ported third-party libraries.
They are required to either work correctly or fail gracefully (if, for
instance, malloc() returns a null pointer) on any platform which meets
those requirements. Thankfully, one of those libraries (HDF) handles a
lot of the portability issues itself, freeing my code to concentrate on
other issues. If my code has to take into consideration
platform-specific characteristics that cannot be tested using code that
meets these requirements, then it has violated a design requirement.

Wonderful. Sounds like you're doing your job. I wasn't talking about
introducing anything platform specific, though. Hence the '(s)' at the
end of 'platform(s)'. And judging by what you write, you ponder your
target platforms as diligently as any good programmer would. Otherwise
your would not choose C90.
It shouldn't be difficult to meet such a requirement. The fact that C
provides no mechanism for determining whether or not you can safely call
a particular function without overflowing the stack makes it
unnecessarily difficult to do so.



So, it would be ridiculous to blindly prohibit the use of techniques
that take advantage of the capabilities of such platforms, just because
they won't work on other platforms.

Replacing a recursive algorithm with a non-recursive one is not
detrimental to performance on high-volume machines. In fact, it may
improve performance quite dramatically, depending on what gets
replaced with what.

Besides, i'm not 'prohibiting' anything, 'blindly' or otherwise. I'm
providing a simple bit of advice: 'lay off recursion, if at all
possible. Often, there are other ways'.
It's a problem with C (though it's
not unique to C) that it provides no way for my code to determine
whether it can safely take advantage of such capabilities.

So pick another language. Besides, you realize that you are argueing
my point? That is if 'such capabilities' did indeed refer to
'employing recursion safely'.

Depending on your definition of 'safety' recursion can be safely
employed. If the greatest risk you run is getting a SIGSEGV and you
and your customer can live with that, recursion isn't really a
problem, especially since the risk of an actual stack-overflow and
consequent SIGSEGV is mostly hypothetical in many cases.

In other cases, it may kill someone.
I'm not strongly tempted by recursion, except for traversal of certain
tree-like data structures - an issue which has seldom come up in my work.

But arbitrarily avoiding recursion makes as much sense to me as
arbitrarily avoiding binary trees, or any other design option.

The word "arbitrarily" should be taken out of that sentence and shot
at dawn. That's the second time your argumen is maily in the
adjectives and i resent that. There is nothing arbitrary about it.
 
K

Kleuskes & Moos

What about tail-calls? Sure not all recursive functions use tail-calls,
but for tail-calls gcc (and I guess other good compilers) do
tail-call-optimization. Can they be considered safe?

Who do you think i am? The Omniscient Oracle of C-Implementations?
well i'm not. If tail-recursion can be optimized away, then i guess it
should be no problem on platforms that actually do that. But if
portability is a issue, then it may produces a lot of other headaches,
since not ll compiler may support it, or support it in a subtly
different way.

RTFM, would be my advice, and use your own brain instead of mine.
 
K

Kleuskes & Moos

I think that this objection was considered so obvious by many of us that
we didn't feel it needed emphasizing.

It's always nice to provide your own applauding audience, so unless
the 'we' is a 'pluralis maiestatis' i'm no likely to take that very
seriously.
It was referred to indirectly in
some of the other messages people did post. As it turns out, we were
right: you were in fact fully aware of the issue, and reached your
conclusion despite that fact.

Of course i was aware of the issue, and yes i reached that onclusion
in spite of the fact that non-recursive algorithm's tend to be a bit
less intuitive.

But since you have already (and wrongly, i might add) argued that
recursive algorithms are unsafe to begin with i still feel my position
is sound on this matter.
 
M

Michael Press

Well its always theoretically possible to re-write recursive code in an
iterative format but generally it leads to any reasonably complex recursive
code becoming an unreadable mess.

Does not have to. Implement a dynamically growable list
with caching and memory recovery. Design a data
structure for the data in the list. Design for list
growing failure. Write push and pop. Presto! Chango!
You have written a recursive function.
 
J

James Kuyper

Of course there are portable ways of dealing with it. For instance:
it's simple enough to pass long an extra integer parameter and
decrementing some initial setting on every recursion. If the counter
reaches zero, print an error message and abort. Portable as hell.

That has nothing to do with the issue I was referring to.
... The
only thing you need to take a stab at are sensible initial values,

That is the issue I was referring to: there's so portable solution to
determining what initial values are "sensible".
Wonderful. Sounds like you're doing your job. I wasn't talking about
introducing anything platform specific, though. ...

Any non-zero "sensible initial value' is definitely platform specific.
... Hence the '(s)' at the
end of 'platform(s)'. And judging by what you write, you ponder your
target platforms as diligently as any good programmer would. Otherwise
your would not choose C90.

Don't give me the credit for that decision; it was made by our clients
at a time when C90 was half the age that C99 is now. At the time they
made that choice, C90 was substantially more widely supported than C99
is now, but it was still a decision that cost them significant
portability at that time. If it were up to me, right now, I'd choose C99
and C++98.
Replacing a recursive algorithm with a non-recursive one is not
detrimental to performance on high-volume machines. In fact, it may
improve performance quite dramatically, depending on what gets
replaced with what.

The main reason I'd go for a recursive algorithm, when appropriate,
would be maintainability, not performance, because the code would be
easier to write and easier to read. My comments about those
platform-specific capabilities is that I don't like being forced away
from that choice by reason of the existence of platforms where there
wasn't enough stack to support the recursion, unless it's actually
necessary to port my code to such platforms.
Besides, i'm not 'prohibiting' anything, 'blindly' or otherwise. I'm
providing a simple bit of advice: 'lay off recursion, if at all
possible. Often, there are other ways'.

Your use of "if at all possible" is a bit disingenuous. As I'm sure you
already know, it's always possible to avoid recursion; it's never a
mandatory approach - just a convenient one.
So pick another language.

I'm not aware of any other language that provides a mechanism for
avoiding or dealing with this problem, either. There might be one, but I
don't know it's name.
... Besides, you realize that you are argueing
my point? That is if 'such capabilities' did indeed refer to
'employing recursion safely'.

I was thinking more broadly than that: "employing function calls
safely". C provides no mechanism for determining whether or not there's
enough stack left to safely call any function - recursively or not -
note even the initial call to main(). Nor does it provide a mechanism
for dealing with the consequences, if there's not enough stack.
As I said above, I don't know of any other language that provides such
mechanisms, either.

....
The word "arbitrarily" should be taken out of that sentence and shot
at dawn. That's the second time your argumen is maily in the
adjectives and i resent that. There is nothing arbitrary about it.

Your choice seems insufficiently motivated to me; "arbitrary" is a fine
word for conveying that judgment. It might be a mistaken judgment, but
then the fault is in the judgment, not in the word used to express it.
 
K

Kleuskes & Moos

That has nothing to do with the issue I was referring to.

In that case, you failed to make it clear what you were talking about.
I took it to be "applying recursion safely", but if it did not,
please...
That is the issue I was referring to: there's so portable solution to
determining what initial values are "sensible".

Usually there are. You can inspect the application domain and make an
estimate how deep the recursion could sensibly be. It is application
dependent, but then again, any software is application specific.

Any non-zero "sensible initial value' is definitely platform specific.

Nope, it's application specific, and merely assumes the platform has
enough memory to cope, and that's an assumption inherent to any
program, even your ultra-portable stuff.

If you're not capable of making an estimate of the maximum depth of
reccusion which will occur and how that jives with your target
platforms, every installation of your software is equivalent to just
rolling the dice.

And you don't strike me as a gambler. You rely on SIGSEGV so cap it, I
suggest you cap it yourself, since SIGSEGV is a luxury item on many
platforms.
Don't give me the credit for that decision; it was made by our clients
at a time when C90 was half the age that C99 is now. At the time they
made that choice, C90 was substantially more widely supported than C99
is now, but it was still a decision that cost them significant
portability at that time. If it were up to me, right now, I'd choose C99
and C++98.

And why? Perhaps because you know that version is supported on all
your target platforms? If you're not sure that it's supported across
all your targets, your choice would be irresponsable, and you do not
strike me as being that.

Besides, my point was only to note that target platforms must be
considered, and i consider it made and acknowledged.

The main reason I'd go for a recursive algorithm, when appropriate,
would be maintainability, not performance, because the code would be
easier to write and easier to read. My comments about those
platform-specific capabilities is that I don't like being forced away
from that choice by reason of the existence of platforms where there
wasn't enough stack to support the recursion, unless it's actually
necessary to port my code to such platforms.

Nobody's forcing anybody. You're right o consider readability and
maintainability important objectives. As a matter of fact, so do i.

However in my experience, the difference between recursive and non-
recursive algorithm's isn't that big. Unless of course you merely
produce a kludge instead of a design.
Your use of "if at all possible" is a bit disingenuous. As I'm sure you
already know, it's always possible to avoid recursion; it's never a
mandatory approach - just a convenient one.

Perhaps, and i tend to disagree without your providing some sort of
proof for that suggested theorem. It' also a costly and (at times)
risky one. For the reasons me, you and others have already mentioned.
The risk is in fact what gave rise to this thread.

The advice i provided still seems reasonable.
I'm not aware of any other language that provides a mechanism for
avoiding or dealing with this problem, either. There might be one, but I
don't know it's name.

So why say it's a problem with 'C', when in fact it's a problem in any
given language you know of?
I was thinking more broadly than that: "employing function calls
safely".

Oh, wow...
C provides no mechanism for determining whether or not there's
enough stack left to safely call any function - recursively or not -
note even the initial call to main(). Nor does it provide a mechanism
for dealing with the consequences, if there's not enough stack.
As I said above, I don't know of any other language that provides such
mechanisms, either.

C does not have a mechanism to guarantee 'any' statement will execute
successfully. Nor does any other (compiled) language. In interpreted
languages, however, it's a doddle.

But then again, the interpreter for such a language is compiled and so
the problem is merely shifted. Is there a point in any of this, or are
you merely providing verbal foliage?
Your choice seems insufficiently motivated to me; "arbitrary" is a fine
word for conveying that judgment. It might be a mistaken judgment, but
then the fault is in the judgment, not in the word used to express it.

Well, pigheadedness seems a nice word to describe my judgment, so i'll
guess i'll use that.

Have a nice day. Remind me never to buy your software.
 
K

Kleuskes & Moos

Does not have to. Implement a dynamically growable list
with caching and memory recovery. Design a data
structure for the data in the list. Design for list
growing failure. Write push and pop. Presto! Chango!
You have written a recursive function.

:) Thanx...
 
D

Dr Nick

Kleuskes & Moos said:
The saving grace of recursion is that recursive implementations are
usually easier to understand. If it weren't for that, i'd ban the
practice outright.

I'd be intrigued to see a non-recursive implementation of the code of a
library for creating, reading into memory and printing JSON data
structures as an obvious example.
 
B

Barry Margolin

Dr Nick said:
I'd be intrigued to see a non-recursive implementation of the code of a
library for creating, reading into memory and printing JSON data
structures as an obvious example.

Recursive algorithms for applications like this are generally not a
problem. The recursion depth is proportional to the level of nesting of
the data structures, which is usually in the single digits, and
practically never more than a couple dozen. This is negligible on any
non-toy implementation.

Where recursion becomes an issue is when you use it for every element of
a sequential data structure. For instance, if a parser's algorithmic
structure were something like:

read_first_token
process_token
recurse(rest of document)

it would probably run into a stack limit on most implementations when
processing any real input.
 
J

James Kuyper

Usually there are. You can inspect the application domain and make an
estimate how deep the recursion could sensibly be. It is application
dependent, but then again, any software is application specific.

If your application must run on a given platform, and that platform
doesn't provide enough room on the stack to handle the worst case, it
would be helpful to have some way of detecting, from within the program,
that it is in fact the worst case - before it causes your program to
abort. That is just as much a problem with recursive solutions, as
non-recursive ones.

....
And you don't strike me as a gambler. You rely on SIGSEGV so cap it, I
suggest you cap it yourself, since SIGSEGV is a luxury item on many
platforms.

I don't rely on SIGSEGV. Among the restrictions placed on my code by our
client is a prohibition on using the features of <signal.h>. My code has
to play nicely with a library provided by another vendor, and that
library presumably (I've never checked) does things with signals that my
code is not allowed to interfere with.

....
So why say it's a problem with 'C', when in fact it's a problem in any
given language you know of?

Yes. The fact that it's a problem with every language I know of implies,
in particular, that it is a problem with C. There's no contradiction
there, and I even emphasized precisely that point by saying "(though
it's not unique to C)".

....
Have a nice day. Remind me never to buy your software.

You can't buy it - it's only available for free - another requirement
imposed by our client.
 
D

Dr Nick

Barry Margolin said:
Recursive algorithms for applications like this are generally not a
problem. The recursion depth is proportional to the level of nesting of
the data structures, which is usually in the single digits, and
practically never more than a couple dozen. This is negligible on any
non-toy implementation.

I know. It's not me who is arguing that recursion is evil and in an
ideal world ought to be banned!
Where recursion becomes an issue is when you use it for every element of
a sequential data structure. For instance, if a parser's algorithmic
structure were something like:

read_first_token
process_token
recurse(rest of document)

it would probably run into a stack limit on most implementations when
processing any real input.

Most implementations where the compiler doesn't optimise tail recursion
away, anyway.
 
K

Kleuskes & Moos

If your application must run on a given platform, and that platform
doesn't provide enough room on the stack to handle the worst case, it
would be helpful to have some way of detecting, from within the program,
that it is in fact the worst case - before it causes your program to
abort. That is just as much a problem with recursive solutions, as
non-recursive ones.

Sure, and theres a slew of profilers and other nifty gadgets around
that will help you with such problems. That's still at compile time,
but if you need to inspect such things at runtime, either the design
(and with that, the hardware restrictions such as minimum required
memory size) or the implementation is seriously flawed.

In non-recursive software, the problem is relatively easily solved
using such tools, recursive software is a bit more tricky, since you
CANNOT calculate a maximum call-tree height in advance. Hence my
suggestion not to use recursive algorithms, if you can avoid it.
I don't rely on SIGSEGV. Among the restrictions placed on my code by our
client is a prohibition on using the features of <signal.h>.

Let me see if i got that straight... You've been arguing that there's
no portable way to cap recursion and a runaway recursive process will
overflow the stack, resulting in a SIGSEGV...

That means the only cap placed on recursion in such a case is the
kernel THROWING a SIGSEGV. This will terminate the process in question
quite effectively, but not to the liking f your client, i imagine.
My code has
to play nicely with a library provided by another vendor, and that
library presumably (I've never checked) does things with signals that my
code is not allowed to interfere with.

I guess it's more a question of portability, even the POSIX standard
leaves room for variation across platforms, so using <signal.h> would
be a major liability if portability is a major concern.

Yes. The fact that it's a problem with every language I know of implies,
in particular, that it is a problem with C. There's no contradiction
there, and I even emphasized precisely that point by saying "(though
it's not unique to C)".

It's like telling a man "the problem with this car is that you might
run out of gas driving it, if you don't pay attention to the fuel-
gauge". No contradiction, but that doesn't mean nothing is wrong with
it.
You can't buy it - it's only available for free - another requirement
imposed by our client.


I like your client.
 
R

Rainer Weikusat

Dr Nick said:
[...]
Where recursion becomes an issue is when you use it for every element of
a sequential data structure. For instance, if a parser's algorithmic
structure were something like:

read_first_token
process_token
recurse(rest of document)

it would probably run into a stack limit on most implementations when
processing any real input.

Most implementations where the compiler doesn't optimise tail recursion
away, anyway.

Eh ... you do understand that 'compiler detects that programmer was a
crackpot and works around that automatically' implies that recursion
is probematic, do you?
 
D

Dr Nick

Rainer Weikusat said:
Dr Nick said:
[...]
Where recursion becomes an issue is when you use it for every element of
a sequential data structure. For instance, if a parser's algorithmic
structure were something like:

read_first_token
process_token
recurse(rest of document)

it would probably run into a stack limit on most implementations when
processing any real input.

Most implementations where the compiler doesn't optimise tail recursion
away, anyway.

Eh ... you do understand that 'compiler detects that programmer was a
crackpot and works around that automatically' implies that recursion
is probematic, do you?

I don't even understand the question.
 
N

Niklas Holsti

James said:
I'm not aware of any other language that provides a mechanism for
avoiding or dealing with this problem, either. There might be one, but I
don't know it's name.

Stack overflow in the Ada language is expected to raise the
Storage_Error exception, which can be caught and handled in the Ada
program itself. However, this can depend on compilation options, for
example the GNU Ada compiler gnat needs the option -fstack-check for
Storage_Error to work in this way, otherwise stack overflow just signals
a SIGSEGV.

You can also do the recursive computation in a dedicated thread (an Ada
"task") and control the amount of stack space allocated to the thread by
means of pragma Storage_Size, with a static constant value or a
dynamically calculated value, when the thread is created.

If you expect that Storage_Error may be raised by stack overflow during
recursion, it is a good idea to put the handler for Storage_Error
outside the recursion, so that the recursion unwinds and stack space is
again available before the handler is entered.
 
B

boltar2003

Does not have to. Implement a dynamically growable list
with caching and memory recovery. Design a data
structure for the data in the list. Design for list
growing failure. Write push and pop. Presto! Chango!
You have written a recursive function.

So to implement a recursive function iteratively you manual stacks?
Wow, who knew!
</sarcasm>

B2003
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,102
Messages
2,570,645
Members
47,245
Latest member
ShannonEat

Latest Threads

Top