Translation from recursive to iterative

B

BQ

Due to a lack of resources, I have to translate the following recursive
function in its iterative form.
It's a kind of dichotomic search.

void SearchSlaves(unsigned long start_uid, unsigned long end_uid)
{
char ret;
//ping over a range of addresses (all slaves with uid in the range from
start_uid to end_uid will reply)
ret = PingSlave(start_uid,end_uid);

if (start_uid == end_uid) //single item
{
if (ret == VALID_PING_REPLY)
AddUidToSlaveList(start_uid);
return;
}
if (ret == VALID_PING_REPLY)
{
SearchSlaves( start_uid , ( (start_uid + end_uid + 1) >> 1) - 1 );
//left subtree
SearchSlaves( ( (start_uid + end_uid + 1) >> 1) , end_uid); //right
subtree
}
}

Any help will be greatly appreciated! :)
Regards,
Marco
 
R

Richard Bos

BQ said:
Due to a lack of resources, I have to translate the following recursive
function in its iterative form.
It's a kind of dichotomic search.

void SearchSlaves(unsigned long start_uid, unsigned long end_uid)
{
char ret;
//ping over a range of addresses (all slaves with uid in the range from
start_uid to end_uid will reply)
ret = PingSlave(start_uid,end_uid);

if (start_uid == end_uid) //single item
{
if (ret == VALID_PING_REPLY)
AddUidToSlaveList(start_uid);
return;
}
if (ret == VALID_PING_REPLY)
{
SearchSlaves( start_uid , ( (start_uid + end_uid + 1) >> 1) - 1 );
//left subtree
SearchSlaves( ( (start_uid + end_uid + 1) >> 1) , end_uid); //right
subtree

[ BTW, using C++-style comments _and_ tabs in a newsgroup is not a good
idea. ]

First thought: get rid of the premature optimisation, get rid of the
unnecessary object (yes, I do see the irony):

void SearchSlaves(unsigned long start_uid, unsigned long end_uid)
{
if (PingSlave(start_uid,end_uid)==VALID_PING_REPLY) {
if (start_uid == end_uid) {
AddUidToSlaveList(start_uid);
return;
}
SearchSlaves(start_uid, (start_uid+end_uid+1)/2) - 1);
SearchSlaves((start_uid+end_uid+1)/2, end_uid);
}
}

Second thought: while this algorithm is efficient if slaves are very
sparse, or relatively sparse and clustered within the range, this is
probably an efficient algorithm. Every slave is pinged several times,
though (log2(end_uid-start_uid) times, circa), so if there are many
slaves, this is not efficient. Nor is it optimal if there are a
reasonable number of slaves peppered through the range. If you suspect
one of those may be the case, a straightforward linear search may beat
this algorithm.
In fact, if PingSlave uses the obvious, naive algorithm for searching
its range, this already _is_ a linear search - several times over. One
other possibility would be to improve PingSlave's efficiency, perhaps by
giving it a (resettable) memory.

Richard
 
B

BQ

Richard said:
[ BTW, using C++-style comments _and_ tabs in a newsgroup is not a good
idea. ]

ehr, sorry.... :)
First thought: get rid of the premature optimisation, get rid of the
unnecessary object (yes, I do see the irony):

void SearchSlaves(unsigned long start_uid, unsigned long end_uid)
{
if (PingSlave(start_uid,end_uid)==VALID_PING_REPLY) {
if (start_uid == end_uid) {
AddUidToSlaveList(start_uid);
return;
}
SearchSlaves(start_uid, (start_uid+end_uid+1)/2) - 1);
SearchSlaves((start_uid+end_uid+1)/2, end_uid);
}
}

Second thought: while this algorithm is efficient if slaves are very
sparse,

-- and this is the case: max.20/30 'slaves', sparse over a range 2^16 wide
or relatively sparse and clustered within the range, this is
probably an efficient algorithm. Every slave is pinged several times,
though (log2(end_uid-start_uid) times, circa), so if there are many
slaves, this is not efficient. Nor is it optimal if there are a
reasonable number of slaves peppered through the range. If you suspect
one of those may be the case, a straightforward linear search may beat
this algorithm.

in my situation, a straightforward linear search can't be made.
Every ping has a 150ms timeout that can't be reduced, and the
searchspace is very large.
In fact, if PingSlave uses the obvious, naive algorithm for searching
its range, this already _is_ a linear search - several times over.

Slaves answer to the ping if their uid is start_uid<= uid <= end_uid.
So searching 10/20 slaves requires about 2-3 seconds at most

Anyway, this code is compiled against a microprocessor with very limited
resources and I have a problem with stack that pushes me toward
searching an iterative version of this algorithm: I can make at most 32
subroutine calls, and SearchSlave calls itself at least 16 times in a
2^16 space. Since SearchSlave is not at level 0, I sometime obtain a
stack overflow.

By the way, theory assures that a recursive algorithm can always be
rewritten in an iterative way. If anyone has an idea...

Regards,
Marco
 
C

CBFalconer

BQ said:
Due to a lack of resources, I have to translate the following recursive
function in its iterative form.
It's a kind of dichotomic search.

void SearchSlaves(unsigned long start_uid, unsigned long end_uid)
{
char ret;
//ping over a range of addresses (all slaves with uid in the range from
start_uid to end_uid will reply)
ret = PingSlave(start_uid,end_uid);

if (start_uid == end_uid) //single item
{
if (ret == VALID_PING_REPLY)
AddUidToSlaveList(start_uid);
return;
}
if (ret == VALID_PING_REPLY)
{
SearchSlaves( start_uid , ( (start_uid + end_uid + 1) >> 1) - 1 );
//left subtree
SearchSlaves( ( (start_uid + end_uid + 1) >> 1) , end_uid); //right
subtree
}
}

Any help will be greatly appreciated! :)
Regards,
Marco

You might start by making it easy for people to help you. 8 char
indents are excessive, // comments don't survive line wrapping
well, and any linelength over about 72 is likely to cause a problem
on newsgroups. Do not use tabs, use spaces. Include dummy
functions for the undefined things, so that people can compile it
immediately. As it is the SearchSlaves operation seems singularly
useless, since it returns nothing.
 
E

Eric Sosman

BQ said:
Richard Bos wrote:

[ BTW, using C++-style comments _and_ tabs in a newsgroup is not a good
idea. ]


ehr, sorry.... :)

First thought: get rid of the premature optimisation, get rid of the
unnecessary object (yes, I do see the irony):

void SearchSlaves(unsigned long start_uid, unsigned long end_uid)
{
if (PingSlave(start_uid,end_uid)==VALID_PING_REPLY) {
if (start_uid == end_uid) {
AddUidToSlaveList(start_uid);
return;
}
SearchSlaves(start_uid, (start_uid+end_uid+1)/2) - 1);
SearchSlaves((start_uid+end_uid+1)/2, end_uid);
}
}

Second thought: while this algorithm is efficient if slaves are very
sparse,


-- and this is the case: max.20/30 'slaves', sparse over a range 2^16 wide

or relatively sparse and clustered within the range, this is
probably an efficient algorithm. Every slave is pinged several times,
though (log2(end_uid-start_uid) times, circa), so if there are many
slaves, this is not efficient. Nor is it optimal if there are a
reasonable number of slaves peppered through the range. If you suspect
one of those may be the case, a straightforward linear search may beat
this algorithm.


in my situation, a straightforward linear search can't be made.
Every ping has a 150ms timeout that can't be reduced, and the
searchspace is very large.

In fact, if PingSlave uses the obvious, naive algorithm for searching
its range, this already _is_ a linear search - several times over.


Slaves answer to the ping if their uid is start_uid<= uid <= end_uid.
So searching 10/20 slaves requires about 2-3 seconds at most

Anyway, this code is compiled against a microprocessor with very limited
resources and I have a problem with stack that pushes me toward
searching an iterative version of this algorithm: I can make at most 32
subroutine calls, and SearchSlave calls itself at least 16 times in a
2^16 space. Since SearchSlave is not at level 0, I sometime obtain a
stack overflow.

By the way, theory assures that a recursive algorithm can always be
rewritten in an iterative way. If anyone has an idea...

If you're sure this is the algorithm you want to use,
go ahead and implement it with an explicit stack of your own:
a local array of start/end pairs plus an index indicating
where the stack top is.

#include <limits.h>
#define STACK_MAX (sizeof(unsigned long) * CHAR_BIT)
/* ... or perhaps less if you know something about
* the possible range of inputs.
*/

void SearchSlaves(unsigned long start, unsigned long end) {
struct { unsigned long start, end; } stack[STACK_MAX];
int depth;

/* Begin by pushing the whole range as one pair */
stack[0].start = start;
stack[0].end = end;
depth = 1;

/* Keep looping as long as the stack still contains
* unexplored ranges
*/
while (--depth >= 0) {

/* Pop a range from the stack */
start = stack[depth].start;
end = stack[depth].end;

/* Do the test */
if (PingSlave(start, end) == VALID_PING_REPLY) {
if (start == end) {
/* Base case: a trivial range */
AddUidToSlaveList(start);
}
else {
/* "Recursive" case: split the range and
* push the two halves
*/
unsigned long mid = (start + end + 1) / 2;
stack[depth].start = start;
stack[depth++].end = mid - 1;
stack[depth].start = mid;
stack[depth++].end = end;
}
}
}
}

This could be cleaned up a bit to eliminate about half of the
stack pushes and pops; I've written it this way to show the
essentials of the transformation. Also, with some cleverness
you might be able to stack just one endpoint of each range.
 
B

BQ

CBFalconer said:
You might start by making it easy for people to help you. 8 char
indents are excessive, // comments don't survive line wrapping
well, and any linelength over about 72 is likely to cause a problem
on newsgroups. Do not use tabs, use spaces. Include dummy
functions for the undefined things, so that people can compile it
immediately. As it is the SearchSlaves operation seems singularly
useless, since it returns nothing.

Thanks CBFalconer, I'll keep it in mind when I'll write a post next time.
Regards,
Marco
 
B

BQ

Eric said:
If you're sure this is the algorithm you want to use,
go ahead and implement it with an explicit stack of your own:
a local array of start/end pairs plus an index indicating
where the stack top is.

[CUT]

Eric, your post was very useful and teached my this new (new for me) use
of a stack. Thank you very much!!
Regards,
Marco
 
C

CBFalconer

BQ said:
Thanks CBFalconer, I'll keep it in mind when I'll write a post next
time.

Well, I thought you would post back with a cleaned up version. The
first and obvious thing is to remove the tail recursion. That can
automatically (with a choice of which half to recurse to) cut the
maximum recursion depth to log2(n) with practically zero effort.
After that there are straight forward mechanical methods for
replacing recursion with a local stack.
 
E

E. Robert Tisdale

BQ said:
Due to a lack of resources,
I have to translate the following recursive function in its iterative form.
It's a kind of dichotomic search.

void SearchSlaves(unsigned long start_uid, unsigned long end_uid) {
char ret;
//ping over a range of addresses
//(all slaves with uid in the range
// from start_uid to end_uid will reply)
ret = PingSlave(start_uid,end_uid);

if (start_uid == end_uid) //single item {
if (ret == VALID_PING_REPLY)
AddUidToSlaveList(start_uid);
return;
}
if (ret == VALID_PING_REPLY) {
// left subtree
SearchSlaves(start_uid, ((start_uid + end_uid + 1) >> 1) - 1);
// right subtree
SearchSlaves(((start_uid + end_uid + 1) >> 1), end_uid);
}
}

Any help will be greatly appreciated!

I don't think that I can help you with your immediate problem
but you should keep in mind that
many C compilers will optimize [tail-]recursion for you.
For example:
> cat factorial.c
unsigned int
factorial1(unsigned int n, unsigned int accumulator) {
if (n == 0)
return accumulator;
else
return factorial1(n - 1, n*accumulator);
}

unsigned int
factorial(unsigned int n) {
return factorial1(n, 1);
}
> gcc -Wall -std=c99 -pedantic -O2 -S factorial.c
> cat factorial.s
.file "factorial.c"
.text
.p2align 4,,15
.globl factorial1
.type factorial1, @function
factorial1:
pushl %ebp
movl %esp, %ebp
movl 8(%ebp), %edx
movl 12(%ebp), %eax
jmp .L4
.p2align 4,,7
.L7:
imull %edx, %eax
decl %edx
.L4:
testl %edx, %edx
jne .L7
popl %ebp
ret
.size factorial1, .-factorial1
.p2align 4,,15
.globl factorial
.type factorial, @function
factorial:
pushl %ebp
movl $1, %eax
movl %esp, %ebp
subl $8, %esp
movl %eax, 4(%esp)
movl 8(%ebp), %eax
movl %eax, (%esp)
call factorial1
leave
ret
.size factorial, .-factorial
.section .note.GNU-stack,"",@progbits
.ident "GCC: (GNU) 3.4.1"
 
B

BQ

CBFalconer said:
Well, I thought you would post back with a cleaned up version.

Sorry, I didn't do that because Eric had already answered my question.
Anyway, here is the cleaned up version:

void SearchSlaves(unsigned long start_uid, unsigned long end_uid)
{
if (PingSlave(start_uid,end_uid)==VALID_PING_REPLY) {
if (start_uid == end_uid) {
AddUidToSlaveList(start_uid);
return;
}
SearchSlaves(start_uid, (start_uid+end_uid+1)/2) - 1);
SearchSlaves((start_uid+end_uid+1)/2, end_uid);
}
}
The
first and obvious thing is to remove the tail recursion.
[CUT]

After that there are straight forward mechanical methods for
replacing recursion with a local stack.

I agree with you and I think this will be the solution I'll chose, and
I'll use Eric's function.
Thank you all very much.
Regards,
Marco
 
C

CBFalconer

E. Robert Tisdale said:
.... snip ...

.file "factorial.c"
.text
.p2align 4,,15
.globl factorial1
.type factorial1, @function
factorial1:
pushl %ebp
movl %esp, %ebp
movl 8(%ebp), %edx
movl 12(%ebp), %eax

It seems as if it is once again time to warn the various newbies
out there that ANYTHING Trollsdale alias Tisdale posts is almost
certainly off-topic, non-responsive, wrong, and designed purely to
troll the newsgroup. The above is typical.
 
C

CBFalconer

BQ said:
Sorry, I didn't do that because Eric had already answered my
question. Anyway, here is the cleaned up version:

void SearchSlaves(unsigned long start_uid, unsigned long end_uid)
{
if (PingSlave(start_uid,end_uid)==VALID_PING_REPLY) {
if (start_uid == end_uid) {
AddUidToSlaveList(start_uid);
return;
}
SearchSlaves(start_uid, (start_uid+end_uid+1)/2) - 1);
SearchSlaves((start_uid+end_uid+1)/2, end_uid);
}
}

Which immediately becomes:

void SearchSlaves(unsigned long start_uid, unsigned long end_uid)
{
while (PingSlave(start_uid, end_uid) == VALID_PING_REPLY) {
if (start_uid == end_uid) {
AddUidToSlaveList(start_uid);
return;
}
SearchSlaves(start_uid, (start_uid + end_uid + 1) / 2) - 1);
start_uid = (start_uid + end_uid + 1) / 2;
}
}

Now if we have some knowledge of possible recursion level, it is
easy to build a local stack with an array and an index, and
eliminate the internal recursion. All it has to save is end_uid
values and know when it is empty. You can write it in terms of
push and pop, together with "if (stackempty) ...". Afterwards the
push/pop/empty will become something like "stack[tos++] = value"
and "value = stack[--tos]" and "if (tos) ...". Don't forget to
initialize tos.
 
E

Eric Sosman

CBFalconer said:
[Concerning what amounts to a binary search written recursively]
Well, I thought you would post back with a cleaned up version. The
first and obvious thing is to remove the tail recursion. That can
automatically (with a choice of which half to recurse to) cut the
maximum recursion depth to log2(n) with practically zero effort.

The code as shown already has a recursion depth of lg(n).
How does eliminating one of the two recursive calls while leaving
the other intact change that? Maybe it goes from lg(n) to lg(n)-1,
but I don't see where any significant improvement arises.
 
C

CBFalconer

Eric said:
CBFalconer said:
[Concerning what amounts to a binary search written recursively]
Well, I thought you would post back with a cleaned up version. The
first and obvious thing is to remove the tail recursion. That can
automatically (with a choice of which half to recurse to) cut the
maximum recursion depth to log2(n) with practically zero effort.

The code as shown already has a recursion depth of lg(n).
How does eliminating one of the two recursive calls while leaving
the other intact change that? Maybe it goes from lg(n) to lg(n)-1,
but I don't see where any significant improvement arises.

I was speaking in general. You are correct, in that his partitions
are equi-sized. If they were not the improvement would apply. I
had not bothered to read his code, because things were so ugly.

Come to think of it, if the partitions are made with a size ratio
of 1:2, and the 2 sized portion is handled with tail recursion, the
overall depth will be reduced from log2(n) to log3(n). However the
speed will be reduced. For some particular costs of comparison,
calling, etc. there is probably an optimum ratio.
 
R

Richard Bos

As it is the SearchSlaves operation seems singularly
useless, since it returns nothing.

It doesn't need to; it adds found slaves to a list. Whether it's a good
idea to have a (presumably single) global list for this is a different
matter.

Richard
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,982
Messages
2,570,186
Members
46,740
Latest member
JudsonFrie

Latest Threads

Top