Strange bug

E

Eric Sosman

I like quicksort, mergesort, and the 8 queens problem,
as recursion exercises.

Quicksort, yes: It teaches an important algorithm, it teaches the
idea of recursion, and it shows what can go wrong with recursion. As
for the other two, I can't see why anyone would use a recursive approach
for either. (Maybe these are the same people who use recursion to fill
an N-place array with 0..N.)
 
B

BartC

Eric Sosman said:
Quicksort, yes: It teaches an important algorithm, it teaches the
idea of recursion, and it shows what can go wrong with recursion. As

It would be teaching mostly about sorting.
for the other two, I can't see why anyone would use a recursive approach
for either. (Maybe these are the same people who use recursion to fill
an N-place array with 0..N.)

I can't see the problem with examples such as Fibonacci; it is very simple;
it does show how recursion works, without other distractions; and because it
is so inefficient, it is also a good comparative benchmark *when you are
measuring the performance of function calls* (a lot of people don't get this
point).

(I had a go at implementing a fast version of the OP's Trim function; it was
about a magnitude faster, but the recursive content was so obviously
contrived, that it wasn't worth bothering with, causing a 3x slowdown over a
version using iteration. Not a good example.)

There are few other examples that might be useful: recursive or hierarchical
data structures (trees and such) are naturally handled with recursion, and
might be easier to appreciate than sorting (since the sort data is flat).
Then there is language and expression parsing. And puzzle solving. And ...

So lots of examples to draw from. Many of practical use too, although this
doesn't really matter for teaching (or benchmarking) purposes.
 
W

William Hughes

     Quicksort, yes: It teaches an important algorithm, it teaches the
idea of recursion, and it shows what can go wrong with recursion.  As
for the other two, I can't see why anyone would use a recursive approach
for either.  (Maybe these are the same people who use recursion to fill
an N-place array with 0..N.)

As an introduction to recursion
I like Towers of Hanoi. Here is a problem that is very hard to solve
iteratively but easy to solve recursively but is simple and
does not require complex data structures. If introduction to
recursion is done using Fibonacci (or even worse factorial!)
students are likely to think, "So What?".

I prefer looking at the problem of computing finite Fourier
transforms,
to the problem of sorting.
- William Hughes
 
S

Super Ted

Well anyway, I have fixed it now with strcpy_safe. Here is the complete
working code.

#include <stdio.h>

int strcpy_safe(char *s, char *t)
{
int i = 0;
if(strlen(t) == 0)
return; // handle null strings
for(; i < strlen(t); ++i)
*(s+i) = *(t+i);
*(s+strlen(t))='\0';
return 0;
}

static int TrimWSpaceL(char *s)
{
int memmove(char *,char*);
if(isspace(*s)) {
strcpy_safe(s,s+1/*,strlen(s)*/);
return 1+TrimWSpaceL(s);
}
else
return 0;
}

static int TrimWSpaceR(char *s)
{
if(isspace(*(s+strlen(s)-1))) {
*(s+strlen(s)-1)='\0';
return 1+TrimWSpaceR(s);
}
else
return 0;
}

/* trims white-space, returning #spaces trimmed */
int TrimWSpace(char *s)
{
return TrimWSpaceL(s) + TrimWSpaceR(s);
}

int main()
{
char s[] = " \t *Hello world* \n ";
int i = TrimWSpace(s);
printf("%s\n(deleted %d spaces)\n", s,i);
return 0;
}
 
I

Ian Collins

Well anyway, I have fixed it now with strcpy_safe. Here is the complete
working code.

#include<stdio.h>

int strcpy_safe(char *s, char *t)
{
int i = 0;
if(strlen(t) == 0)
return; // handle null strings

Empty strings, you don't test for null strings. You have also omitted
the return value.
for(; i< strlen(t); ++i)
*(s+i) = *(t+i);
*(s+strlen(t))='\0';
return 0;
}

Why do you call strlen(t) 3 times?
static int TrimWSpaceL(char *s)
{
int memmove(char *,char*);

Why don't you include <string.h>? Then you'd see this prototype is
wrong (and you don't call memmove anyway!).
 
K

Keith Thompson

Super Ted said:
Well anyway, I have fixed it now with strcpy_safe. Here is the complete
working code.

#include <stdio.h>

You have calls to strlen(). You must declare it before using it,
preferably via "#include <string.h>".

You have calls to isspace(). You must declare it before using it,
preferably via "#include said:
int strcpy_safe(char *s, char *t)

The identifier "strcpy_safe" is just as reserved now as it was last time
we discussed it.
{
int i = 0;

Making i a size_t rather than an int would make your code more portable.
Consider copying a 40000-byte string on a system with 16-bit int and
32-bit size_t.
if(strlen(t) == 0)
return; // handle null strings

This handles *empty* strings.

It also means that if t points to an empty string, your strcpy_safe()
does nothing. For example:

char target[] = "hello";
strcpy_safe(target, "");
printf("target = \"%s\", target);

With strcpy(), this would print
target = ""
With your strpcy_safe(), this will print:
target = "hello"
for(; i < strlen(t); ++i)

It would be more idiomatic to write:

for (i = 0; i < strlen(t); ++i)

Or, if you can assume C99 conformance:

for (size_t i = 0; i < strlen(t); ++i)

But since you've said before that you care about efficiency, you're
calling strlen() multiple times, turning what should be an O(N)
algorithm into an O(N*M) algorithm.

Call strlen() once and save the result in a variable. Or just
detect the end of the string pointed to by t by looking for the '\0'.
*(s+i) = *(t+i);

Do you have reason not to use array notation?

s = t;
*(s+strlen(t))='\0';
return 0;
}

You said before that you're using this strcpy_safe() because it
works with overlapping strings. Have you documented its behavior?
Have you thought about what happens with strcpy_safe(s, s+1)
vs. strcpy_safe(s+1, s)?

And even if you had gotten it right, you're likely to be paying
a substantial performance penalty for the greater flexibility.
strcpy() can be optimized in ways that depend on the assumption
that the source and target strings do not overlap.
static int TrimWSpaceL(char *s)
{
int memmove(char *,char*);

Why are you declaring memmove() inside your function, why are you
declaring a function you never call, and why are you declaring
it incorrectly? You didn't even get the number of arguments right.
The only sane way to do this is to add #include <string.h> to the top
of your source file -- or, since you never call it, to leave it out.

[more code snipped]
int main()

Do you have a reason to continue using "int main()" rather than
the more correct "int main(void)"?

[...]
 
J

Jens Thoms Toerring

Super Ted said:
Well anyway, I have fixed it now with strcpy_safe. Here is the complete
working code.
#include <stdio.h>
int strcpy_safe(char *s, char *t)
{
int i = 0;
if(strlen(t) == 0)
return; // handle null strings

That's an empty string. And then a simple

if ( ! *t )
return;

is more efficient. And the 'return' call is broken since
you promised that the function would return and int but
you don't do so here.

And a final comment is: why don't you copy the
empty string to the destination when the user
is asking for it? It's a completely legal request
and most users will be rather surprised if the
function doesn;t do what (s)he askes it to do,
especially in the most simple case.

If you call your function strcpy_safe() wouldn't it
make sense to also check that neither 's' nor 't'
are NULL pointers?
for(; i < strlen(t); ++i)
*(s+i) = *(t+i);
*(s+strlen(t))='\0';

Now, that's a truely inefficient implementation. You call
strlen() over and over again without any good reason.

The usual idiom for all this, whichs work perfectly well
(and much more efficient), is

while ( *s++ = *t++ )
;
return 0;
}

Another problem with this implementation is that you forgo
all optimizations that the normal strcpy() function can use.
A more reasonable implementation would probably be to check
if the two strings can overlap and then call either memcpy()
(if they don't) or memmove() (if the do).

And, of course, by calling the function strcpy_safe() you
have invaded the namespace that belongs to the implementa-
tion...
static int TrimWSpaceL(char *s)
{
int memmove(char *,char*);

This is the name of a standard function (which is declared
in <string.h>) but which has different arguments. There
also doesn't seem to be any reason for declaring it here.
(If you want the standard memmove() function just include
if(isspace(*s)) {
strcpy_safe(s,s+1/*,strlen(s)*/);
return 1+TrimWSpaceL(s);
}

Can you get any more inefficient? For each white-space
character you remove you do a copy and then another function
call (recursively) while simply counting how many white-
spaces there are and the do a single copy would reduce the
amount of work to be done considerably.
else
return 0;
}
static int TrimWSpaceR(char *s)
{
if(isspace(*(s+strlen(s)-1))) {
*(s+strlen(s)-1)='\0';
return 1+TrimWSpaceR(s);
}
else
return 0;
}

Similar inefficiency as with the other function.

Regards, Jens
 
J

James Dow Allen

     Quicksort, yes: It teaches an important algorithm, it teaches the
idea of recursion, and it shows what can go wrong with recursion.

I also often choose simple looping in many problems where recursion
is plausible. Odd, that only one good example for recursion has
been presented so far - quicksort - and it's recursive only because
tail recursion can handle only one of two tails.

Let me nominate another example: digraph explorers; in particular
game tree solvers.

Even there, I'm not ashamed to confess I wrote one solver
http://fabpedigree.com/james/lesson9.htm#p93
that, to emphasize its non-recursion, does everthing in main()!

James Dow Allen
 
N

Nick

James Dow Allen said:
I also often choose simple looping in many problems where recursion
is plausible. Odd, that only one good example for recursion has
been presented so far - quicksort - and it's recursive only because
tail recursion can handle only one of two tails.

Working your way through naturally recursive data structures is a pretty
good one. So if you have an interpreter for a language in which you can
have arrays or lists of objects, each of which can be an array or list,
recursion is a very obviously way to, say, write the data to a file.
 
M

Mark Wooding

William Hughes said:
As an introduction to recursion I like Towers of Hanoi. Here is a
problem that is very hard to solve iteratively but easy to solve
recursively but is simple and does not require complex data
structures.

Towers of Hanoi is relatively straightforward to solve nonrecursively.
The important observation is that the binary representation of the step
number encodes the state of the game and the action to take in a fairly
simple way.

/* Determine what to do in step I of a Towers of Hanoi problem.
*
* N is the number of discs; we must have 1 <= I < 2^N. Store
* in *DD the number of the disc to move (numbered 1 up to N)
* and in *FF and *TT the numbers of the source (`from') and
* destination (`to') spindles (numbered 0, 1, 2) respectively.
* It is assumed that the overall task is to move all N discs
* from spindle 0 to spindle 2.
*/
static void hanoi_step(unsigned n, unsigned long i,
unsigned *dd, unsigned *ff, unsigned *tt)
{
unsigned m;
unsigned f = 0, t = 2, s = 1;

#define XCH(a, b) do { unsigned x = a; a = b; b = x; } while (0)

m = 1 << (n - 1);
for (;;) {
if (i == m) break;
else if (i & m) { XCH(f, s); i &= ~m; }
else { XCH(s, t); }
m >>= 1; n--;
}
*dd = n; *ff = f; *tt = t;

#undef XCH
}

Being able to determine a single step in the game without computing the
rest is handy if you're using a Towers of Hanoi backup schedule, for
example: you probably know how long it is since you took a full dump,
and would like to know what kind of incremental dump to do now.
If introduction to recursion is done using Fibonacci (or
even worse factorial!) students are likely to think, "So What?".

Recursion is an even worse way to compute Fibonacci numbers. The
following algorithm computes F(n) in time proportional to log(n)
multiplications.

/* Return the Nth Fibonacci number F(N). F(0) is defined to be
* 0 and F(1) to be 1; F(I) = F(I - 1) + F(I - 2).
*/
static unsigned long fib(unsigned n)
{
unsigned long u = 0, v = 1;
unsigned long u2, v2, iiuv, uu, vv;
unsigned m = ~0u ^ (~0u >> 1);

while (m) {
u2 = u*u; v2 = v*v; iiuv = 2*u*v;
uu = u2 + iiuv; vv = u2 + v2;
if (n & m) { u = uu + vv; v = uu; }
else { u = uu; v = vv; }
m >>= 1;
}

return (u);
}

A good algorithm to compute factorials is a little harder. If we assume
we're going to use arbitrary-precision arithmetic then multiplication is
fastest when we try to make the operands equally long, which suggests a
simple algorithm which maintains a stack with the invariant that entries
higher up in the stack are smaller. Then you push the numbers 1 up to
n, each time maintaining the invariant by popping and multiplying as
necessary. When you're done, you simply multiply the remaining entries
on the stack. Again, this is not naturally recursive.

This is probably now much faster than the algorithm used to print the
answer in decimal. A simple-minded base conversion will repeatedly take
quotient and remainder by (say) R, building up the base-R digits in
reverse order. A cleverer approach is to find the largest power R^(2^k)
smaller than the number we're meant to convert, take quotient and
remainder, and convert both separately and stitch the results together
(being careful with leading zeroes; note that R^(2^(k-1)) is an upper
bound for the power of the base needed to split the remaining parts).
This /is/ a naturally recursive algorithm, and it's very efficient; it
also has the advantage that it produces output in more or less the right
order. (There are faster ways if R is a power of 2, obviously.)
I prefer looking at the problem of computing finite Fourier
transforms, to the problem of sorting.

I can see the merits of that. Of course, Fourier transforms require a
certain amount of explaining in order to motivate them and explain why
the recursive algorithm is a plausible approach, but this is useful in
itself. (My university course didn't cover Fourier transforms at all.)

-- [mdw]
 
S

Super Ted

Jens said:
That's an empty string. And then a simple

if ( ! *t )
return;

is more efficient. And the 'return' call is broken since you promised
that the function would return and int but you don't do so here.

And a final comment is: why don't you copy the empty string to the
destination when the user is asking for it? It's a completely legal
request and most users will be rather surprised if the function doesn;t
do what (s)he askes it to do, especially in the most simple case.

If you call your function strcpy_safe() wouldn't it make sense to also
check that neither 's' nor 't' are NULL pointers?

If the function is called with a null string pointer then the program
will not crash but the return value will be undefined. In the normal
"success" case, the return value will be 0.
This is the name of a standard function (which is declared in
<string.h>) but which has different arguments. There also doesn't seem
to be any reason for declaring it here. (If you want the standard
memmove() function just include <string.h> and if it's a different
function give it a different name.)

Someone earlier in this topic suggested using memmove so this code is
left over from when I was trying that out. For efficiency reasons I
decided to use my own strcpy_safe function which is tailored for strings,
not arbitrary memory.
 
T

Tim Rentsch

Francois Grieu said:
[snip]

I note that the archetypal learning use case for recursion are
factorials and the "Tower of Hanoi", although both happen to have
a non-recursive solution which is much simpler:
http://en.wikipedia.org/wiki/Tower_of_Hanoi#Non-recursive_solution
This should be a giveaway that recursion is seldom useful, since
instructors fail to find a good use case for it.

I think the reasoning here is bad. These examples are chosen
because they usually are already familiar to students outside the
realm of computing. More specifically, they are chosen to *explain*
recursion, not to *motivate* recursion. I think they do a good job
in this respect -- in both cases the recursive solutions are simple
enough and easy to understand. Calling the referenced non-recursive
tower-of-hanoi algorithm simpler, let alone much simpler, is highly
misleading; the *steps* may be simpler, but seeing how and why the
algorithm works certainly is not. Conversely, for a recursive
version it's easy to see at a glance both how it works and why, and
that's the point of the exercise - to present a simple example where
recursion can be understood. Deeper understanding, about when it
might be good to use recursion, and when not, comes only later.
 
K

Keith Thompson

Super Ted said:
If the function is called with a null string pointer then the program
will not crash but the return value will be undefined. In the normal
"success" case, the return value will be 0.

You have a "return" statement within a function that returns an int.
In C99, this is a constraint violation; the compiler must issue a
diagnostic, and will probably reject your program.

Even in C90, since you pass the value of s to strlen(), your
program's behavior is undefined if s is a null pointer. It might
crash, it might return an undefined value, or it might do something
entirely unexpected.

And even if your description were correct, do you really think that
deliberately returning an undefined value would the right thing to
do here?
Someone earlier in this topic suggested using memmove so this code is
left over from when I was trying that out. For efficiency reasons I
decided to use my own strcpy_safe function which is tailored for strings,
not arbitrary memory.

Your strcpy_safe() function is horrendously inefficient. It's also
incorrect.

And even if you had used memmove(), your declaration is both
unnecessary and incorrect.

Seebs recently brought the marvelous phrase "fractally wrong" to
our attention. It refers to something that's so deeply wrong that,
the more you explore it, the more wrong it becomes, with errors on
multiple levels. If you're deliberately trying to write "fractally
wrong" code, you're pretty good at it (and it might be entertaining
if you had acknowledged it).

On the other hand, if you're actually trying to write good code,
you should consider following the advice you've been getting here
rather than ignoring it.
 
T

Tim Rentsch

Eric Sosman said:
I like quicksort, mergesort, and the 8 queens problem,
as recursion exercises.

Quicksort, yes: It teaches an important algorithm, it teaches the
idea of recursion, and it shows what can go wrong with recursion. As
for the other two, I can't see why anyone would use a recursive approach
for either. [snip]

Take two software developers of approximately equal ability,
assign each one the task of writing mergesort, one recursively,
one non-recursively. With high probability, the recursive
version will be written faster, have source code that is both
shorter and easier to understand, and has a fair chance of
working more reliably in corner cases, than the non-recursive
version. That's the most consistent with my experience anyway,
and it's certainly the way I'd bet. Perhaps your experience is
different?
 
T

Tim Rentsch

Keith Thompson said:
Super Ted said:
Well anyway, I have fixed it now with strcpy_safe. Here is the complete
working code.

#include <stdio.h>

You have calls to strlen(). You must declare it before using it,
preferably via "#include <string.h>". [snip]

The return type of strlen() is size_t, and therefore it is undefined
behavior to use strlen() without a '#include <string.h>' directive.
 
I

Ian Collins

Keith Thompson said:
Super Ted said:
Well anyway, I have fixed it now with strcpy_safe. Here is the complete
working code.

#include<stdio.h>

You have calls to strlen(). You must declare it before using it,
preferably via "#include<string.h>". [snip]

The return type of strlen() is size_t, and therefore it is undefined
behavior to use strlen() without a '#include<string.h>' directive.

Isn't that what Keith said?
 
T

Tim Rentsch

Ian Collins said:
Keith Thompson said:
Well anyway, I have fixed it now with strcpy_safe. Here is the complete
working code.

#include<stdio.h>

You have calls to strlen(). You must declare it before using it,
preferably via "#include<string.h>". [snip]

The return type of strlen() is size_t, and therefore it is undefined
behavior to use strlen() without a '#include<string.h>' directive.

Isn't that what Keith said?

Maybe it was but I didn't read it that way. I thought
Keith was saying it's okay to declare strlen() without
a #include <string.h>, even though with the #include
would be preferable. My point is that declaring strlen()
"manually", without the #include, still leads to undefined
behavior.
 
K

Keith Thompson

Ian Collins said:
Keith Thompson said:
Well anyway, I have fixed it now with strcpy_safe. Here is the complete
working code.

#include<stdio.h>

You have calls to strlen(). You must declare it before using it,
preferably via "#include<string.h>". [snip]

The return type of strlen() is size_t, and therefore it is undefined
behavior to use strlen() without a '#include<string.h>' directive.

Isn't that what Keith said?

Pretty much -- except that in C99 it's also a constraint violation
(implicit int is gone).

Oh, and you *could* declare strlen() yourself rather than
#include'ing <string.h>, but there's no good reason to do so (I
deliberately glossed over that alternative).
 
T

Tim Rentsch

Keith Thompson said:
Ian Collins said:
Well anyway, I have fixed it now with strcpy_safe. Here is the complete
working code.

#include<stdio.h>

You have calls to strlen(). You must declare it before using it,
preferably via "#include<string.h>". [snip]

The return type of strlen() is size_t, and therefore it is undefined
behavior to use strlen() without a '#include<string.h>' directive.

Isn't that what Keith said?

Pretty much -- except that in C99 it's also a constraint violation
(implicit int is gone).

Oh, and you *could* declare strlen() yourself rather than
#include'ing <string.h>, [snip].

You can't, at least not definedly; that was my point.
 
K

Keith Thompson

Tim Rentsch said:
Ian Collins said:
Well anyway, I have fixed it now with strcpy_safe. Here is the complete
working code.

#include<stdio.h>

You have calls to strlen(). You must declare it before using it,
preferably via "#include<string.h>". [snip]

The return type of strlen() is size_t, and therefore it is undefined
behavior to use strlen() without a '#include<string.h>' directive.

Isn't that what Keith said?

Maybe it was but I didn't read it that way. I thought
Keith was saying it's okay to declare strlen() without
a #include <string.h>, even though with the #include
would be preferable. My point is that declaring strlen()
"manually", without the #include, still leads to undefined
behavior.

The standard explictly permits this. C99 7.1.4p2:

Provided that a library function can be declared without
reference to any type defined in a header, it is also permissible
to declare the function and use it without including its
associated header.

(There is almost no sane reason to take advantage of this permission.)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,083
Messages
2,570,589
Members
47,211
Latest member
JaydenBail

Latest Threads

Top