Musings, alternatives to multiple return, named breaks?

Ian Collins

glen said:
(snip, I wrote)


And faster function calls help move in that direction.

For programs with run-times of hours, days, or weeks, instead of
seconds or minutes, there is a tendency to code for speed.

Another reason to extract functionality. The optimiser can make
inlining decisions based on many factors, including cache size.
 
BartC

Another reason, frequently, is that something went very right.
"Right" as in "I was looking for something, and found it." There
you are, looping from node to node in a search tree or something,
and suddenly you come upon the match you've been looking for. What
could be more natural than to say "Hooray!" and spell it "return?" ....
Yes, you *could* collapse the two returns into one by introducing
an extra variable and a `break':
... and further obfuscations are possible. But to my mind that's
all they are: obfuscations, unnecessary increases in the amount
of information the reader must decipher and understand to see
what's going on. In short, ideology overcoming clarity. Pfui!
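
To make the trade-off concrete, here is a minimal sketch in C of the two styles being argued over. It is not the code elided from the post above: `struct node`, `find_early` and `find_single` are hypothetical names, and a binary search tree is assumed purely for illustration.

```c
#include <stddef.h>

/* Hypothetical binary search tree node, for illustration only. */
struct node {
    int key;
    struct node *left, *right;
};

/* Early-return style: leave as soon as the match is found. */
struct node *find_early(struct node *n, int key)
{
    while (n != NULL) {
        if (n->key == key)
            return n;                /* the "Hooray!" return */
        n = (key < n->key) ? n->left : n->right;
    }
    return NULL;                     /* not found */
}

/* Single-return style: an extra variable plus a break. */
struct node *find_single(struct node *n, int key)
{
    struct node *result = NULL;

    while (n != NULL) {
        if (n->key == key) {
            result = n;
            break;
        }
        n = (key < n->key) ? n->left : n->right;
    }
    return result;
}
```

Both functions return the same pointer for the same input; the difference is only in how much bookkeeping the reader has to follow.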

I've just had to revise a function which used 6 returns: 5 'success' returns
(returning a reference to a table entry), and one 'failure' return.

It worked fine. Except I now need the function to set a flag in the table
entry for the successful returns. For simplicity, because I'd already
decided to use multiple returns, I duplicated that extra code five times
(*and* made it into a function call, so that if the code for setting
changes, it's in one place).

With a single point of return, only one line of extra code would have been
needed, in place of ten or so.
 
Eric Sosman

I've just had to revise a function which used 6 returns: 5 'success'
returns (returning a reference to a table entry), and one 'failure' return.

It worked fine. Except I now need the function to set a flag in the
table entry for the successful returns. For simplicity, because I'd
already decided to use multiple returns, I duplicated that extra code
five times (*and* made it into a function call, so that if the code for
setting changes, it's in one place).

With a single point of return, only one line of extra code would have
been needed, in place of ten or so.

"One line of extra code," not counting the extra lines required
to collapse the six returns into one: Status flags, tests for "Should
I do this next part, or am I already done?" and so on.

As the King says, "Begin at the beginning, and go on till you
come to the end, then set a silly status variable, test it to see
if you've set it or not, copy the actual answer somewhere else, test
a couple more times, and stop when you've completely forgotten what
you once thought you were doing."
 
BartC

Eric Sosman said:
"One line of extra code," not counting the extra lines required
to collapse the six returns into one: Status flags, tests for "Should
I do this next part, or am I already done?" and so on.

The thread discusses alternatives to multiple returns which do not require
convoluting the following code. Any alternative would likely be a single
statement just like the original return, so no extra lines.

As for having to store the return value first to be picked up at the return
point, in my case that was already in a variable.
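
In C as it stands, one way to get that "single statement" today is a `goto` to a shared exit label. The following is only a sketch under assumptions: `struct entry`, `set_used_flag` and `lookup` are invented for illustration and are not the code from the post, and two linear scans stand in for the several success paths of a real tree search.

```c
#include <stddef.h>

struct entry { int key; int used; };    /* hypothetical table entry */

/* Hypothetical stand-in for the flag-setting helper. */
static void set_used_flag(struct entry *e) { e->used = 1; }

struct entry *lookup(struct entry *table, size_t n, int key, int alt_key)
{
    struct entry *e = NULL;
    size_t i;

    for (i = 0; i < n; i++) {
        e = &table[i];
        if (e->key == key)
            goto found;                 /* first success path */
    }
    for (i = 0; i < n; i++) {
        e = &table[i];
        if (e->key == alt_key)
            goto found;                 /* second success path */
    }
    return NULL;                        /* the one failure return */

found:
    set_used_flag(e);                   /* added once, not per success path */
    return e;
}
```

Each success path stays a single statement, and the result is already in a variable (`e`) when control reaches the label.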
 
Ian Collins

BartC said:
I've just had to revise a function which used 6 returns: 5 'success' returns
(returning a reference to a table entry), and one 'failure' return.

It worked fine. Except I now need the function to set a flag in the table
entry for the successful returns. For simplicity, because I'd already
decided to use multiple returns, I duplicated that extra code five times
(*and* made it into a function call, so that if the code for setting
changes, it's in one place).

With a single point of return, only one line of extra code would have been
needed, in place of ten or so.

Wouldn't it have been easier (and maybe a cleaner design) to wrap the
original function with a simple function that sets the flag via the
returned value?
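
A minimal sketch of that wrapper idea, assuming a hypothetical `struct entry` with a `used` flag and a hypothetical multi-return function `find_entry_raw` (neither name comes from the thread):

```c
#include <stddef.h>

struct entry { int key; int used; };    /* hypothetical table entry */

/* Stand-in for the original multi-return function, left unmodified. */
static struct entry *find_entry_raw(struct entry *table, size_t n, int key)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (table[i].key == key)
            return &table[i];           /* success return */
    return NULL;                        /* failure return */
}

/* Wrapper: the only place where the flag is set. */
struct entry *find_entry(struct entry *table, size_t n, int key)
{
    struct entry *e = find_entry_raw(table, n, key);

    if (e != NULL)
        e->used = 1;
    return e;
}
```

Callers use `find_entry` and never see the raw function; however many returns the inner function has, the flag logic lives in one spot.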
 
BartC

Ian Collins said:
BartC wrote:

Wouldn't it have been easier (and maybe a cleaner design) to wrap the
original function with a simple function that sets the flag via the
returned value?

I suppose that's one general approach to convert any multi-return function
into a single-return one!

That might be something I would do if I couldn't modify the function for any
reason (didn't have the source, or it's in a module maintained by someone
else, or as a temporary wrapper for debugging purposes).

But with the source available it's not ideal because it will create an
arbitrary extra function (what will the new one be called? Why is it there
(if looking at the source in future)?) The answer is to do it properly.
 
BartC

Can you not set it at the start of the function and reset it before the
failure return?

In this case (searching a complex tree for an entry matching some criteria)
it won't work because I don't know what entry I'm dealing with until each
return point.

(Out of interest, and since I've recently discovered how to use pastebin,
the de-commented code for that function is here:

http://pastebin.com/tsb86XHa

However, this is not C code (nor even static, compiled code). The principle
is the same though.

In that example, the 'setusedflag()' line is the code I had to add to each
return. (Also the "==" operator is for identity, not equality, if anyone
wonders why "!=" or "<>" wasn't used.))
 
glen herrmannsfeldt

(snip, someone wrote)
I suppose that's one general approach to convert any multi-return function
into a single-return one!
That might be something I would do if I couldn't modify the function for any
reason (didn't have the source, or it's in a module maintained by someone
else, or as a temporary wrapper for debugging purposes).

Interestingly (to me), gfortran does that for programs with the ENTRY
statement. Makes using a debugger interesting.

All this discussion about multiple returns, but some languages allow
for multiple entries into a function (or subroutine).
But with the source available it's not ideal because it will create an
arbitrary extra function (what will the new one be called? Why is it there
(if looking at the source in future)?) The answer is to do it properly.

-- glen
 
Tim Rentsch

This can be written avoiding both breaks and returns:

int f()
{
    int fail = 0;
    if( fail = do_something_1() );
    else if( fail = do_something_2() );
    else { ... }
    return fail;
}

, and this will also exit immediately when do_something_1()
is nonzero.

A rather meaningless comparison, since the revised definition has
different semantics than the original.
 
Tim Rentsch

Keith Thompson said:
Much like a return statement, yes?

Certainly not.
The behavior of a return statement is defined relative to a named
entity (a function definition) in whose scope it appears. The
same would be true of a named break.

This analogy doesn't hold up. It happens that functions have
names, but we don't give a function name in a 'return' statement,
and we don't have to look for a name (any name) to see which
block is exited upon seeing a 'return'. Similarly, in a code
fragment such as

XYZZY:
    while(condition){
        ...
        if(something) break;
        ...
    }

the 'break' is exiting a named statement, but the presence of a
name, or what name is used, is completely incidental to how break
works, and what we have to do to understand it in the context of
the function body. So it is also with return.
 
glen herrmannsfeldt

Tim Rentsch said:
Keith Thompson <kst-u@mib.org> writes:

(snip, someone wrote)
Certainly not.
This analogy doesn't hold up. It happens that functions have
names, but we don't give a function name in a 'return' statement,
and we don't have to look for a name (any name) to see which
block is exited upon seeing a 'return'.

Standard C doesn't have nested functions, but some languages do,
and as I understand it, the C compiled by gcc does. In that case,
it isn't so obvious looking at a return statement which function
it returns from.
Similarly, in a code fragment such as

XYZZY:
    while(condition){
        ...
        if(something) break;
        ...
    }
the 'break' is exiting a named statement, but the presence of a
name, or what name is used, is completely incidental to how break
works, and what we have to do to understand it in the context of
the function body. So it is also with return.

Java has the named break, which can break out of more than one
level of loop. Also, Java doesn't have a goto statement, though
it is a reserved word.

-- glen
 
Keith Thompson

Tim Rentsch said:
Certainly not.


This analogy doesn't hold up. It happens that functions have
names, but we don't give a function name in a 'return' statement,
and we don't have to look for a name (any name) to see which
block is exited upon seeing a 'return'.

Hypothetically, if the proposed named break feature were extended
to cover returns from nested functions (which, yes, don't exist in
standard C), then it would make sense to allow a return statement
to return from a named function rather than from the nearest
enclosing one. The syntax might be something like
    return value from name;
or perhaps
    return value _From name;
where "name" is a function name, defaulting to the innermost enclosing
function if it's not specified.

I'm not advocating such a feature, merely stating that it could be
defined consistently.
Similarly, in a code
fragment such as

XYZZY:
    while(condition){
        ...
        if(something) break;
        ...
    }

the 'break' is exiting a named statement, but the presence of a
name, or what name is used, is completely incidental to how break
works, and what we have to do to understand it in the context of
the function body. So it is also with return.

In C as it's currently defined, yes, the break always refers to the
innermost enclosing loop or switch statement. What we're discussing
is a hypothetical change (or extension) to the language, in which
a loop or switch statement can be named and a break statement
may optionally refer to an enclosing loop or switch statement by
its name.

There's precedent for this in Ada, for example:

Outer: loop
    Inner: loop
        -- ...
    end loop Inner;
end loop Outer;

Inside the inner loop, you can write any of:

exit;        -- Ada's spelling of "break"
exit Inner;  -- exits the loop named "Inner"
exit Outer;  -- exits the loop named "Outer"

The names "Inner" and "Outer" refer to the entire respective loop
constructs, not to a point in the code at the top of a loop.
(There's a different syntax for labels used as the targets of
goto statements.)

If such a feature were to be added to C, it could either re-use
the existing syntax for goto labels (Perl does this), or invent
a new syntax used only to name loops and switch statements (and
perhaps blocks).

I'm certainly aware that loops don't have names that are relevant
to break statements *in C as it's currently defined*.
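
For comparison, the usual way to get the effect of "exit Outer" in C as currently defined is a `goto` to a label placed just after the outer loop. A minimal sketch, in which `find_in_grid` and its parameters are hypothetical and the grid is assumed to be stored row by row in a flat array:

```c
#include <stddef.h>

/* Returns 1 and stores the position if target is in the grid, else 0. */
int find_in_grid(const int *grid, size_t rows, size_t cols, int target,
                 size_t *row_out, size_t *col_out)
{
    int found = 0;
    size_t r, c;

    for (r = 0; r < rows; r++) {            /* "Outer" */
        for (c = 0; c < cols; c++) {        /* "Inner" */
            if (grid[r * cols + c] == target) {
                *row_out = r;
                *col_out = c;
                found = 1;
                goto done;                  /* plays the role of "exit Outer" */
            }
        }
    }
done:
    return found;
}
```

Unlike the Ada loop names, the label `done` marks a point in the code rather than the loop construct itself, which is the distinction drawn above.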
 
Tim Rentsch

glen herrmannsfeldt said:
Standard C doesn't have nested functions, but some languages do,
and as I understand it, the C compiled by gcc does. In that case,
it isn't so obvious looking at a return statement which function
it returns from.

That's irrelevant to what I was saying. What happens still doesn't
depend on looking for a particular name the way a labelled break
would; return's are like unlabelled break's, not labelled breaks.
 
glen herrmannsfeldt

(snip, I wrote)
That's irrelevant to what I was saying. What happens still doesn't
depend on looking for a particular name the way a labelled break
would; return's are like unlabelled break's, not labelled breaks.

Yes. All I meant was that sometimes, when reading a program, it
isn't so easy to know which function a return statement belongs to.

Even without nested functions, complicated nesting of blocks might be
confusing to read and understand.

Compilers are not confused, but sometimes people can be.

I suppose in a language with nested functions one might consider
a labelled return. I don't know of any that do that.

-- glen
 
glen herrmannsfeldt

Keith Thompson said:
Hypothetically, if the proposed named break feature were extended
to cover returns from nested functions (which, yes, don't exist in
standard C), then it would make sense to allow a return statement
to return from a named function rather than from the nearest
enclosing one. The syntax might be something like
    return value from name;
or perhaps
    return value _From name;
where "name" is a function name, defaulting to the innermost enclosing
function if it's not specified.

There is a long thread, I believe in a different newsgroup, on
non-local GOTO.

With nested procedures in Pascal, one can GOTO out of one to a
containing procedure (I believe that is required to be on the call
path.) You could, then GOTO the return statement in the outer
procedure.

PL/I allows for that, and also GOTO with label variables. Again,
you can GOTO back up the call path.

C has longjmp() which allows for a similar operation.
(The only time I remember using longjmp() was a C translation
of a Pascal program.)
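
As a rough illustration of what that looks like, here is a small sketch using the standard `<setjmp.h>` facilities to jump back up the call path out of a recursive search. The names `bail_out`, `bail_depth`, `search_level` and `run_search` are invented for this example, and the depth limit of 10 is arbitrary.

```c
#include <setjmp.h>

static jmp_buf bail_out;    /* hypothetical: where to resume in the caller */
static int bail_depth;      /* hypothetical: result passed via a static */

static void search_level(int depth, int stop_at)
{
    if (depth == stop_at) {
        bail_depth = depth;
        longjmp(bail_out, 1);           /* unwind straight to run_search */
    }
    if (depth < 10)
        search_level(depth + 1, stop_at);
}

/* Returns the depth at which the search bailed out, or -1 if it never did. */
int run_search(int stop_at)
{
    if (setjmp(bail_out) != 0)
        return bail_depth;              /* reached via longjmp */
    search_level(0, stop_at);
    return -1;                          /* recursion ended normally */
}
```

The result travels through a static variable because the value of `setjmp` is only used here as a controlling expression, which keeps the sketch within the contexts the C standard permits for `setjmp`.
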
I'm not advocating such a feature, merely stating that it could be
defined consistently.
In C as it's currently defined, yes, the break always refers to the
innermost enclosing loop or switch statement. What we're discussing
is a hypothetical change (or extension) to the language, in which
a loop or switch statement can be named and a break statement
may optionally refer to an enclosing loop or switch statement by
its name.

Like break in Java. Java doesn't have goto (though it is a reserved
word) so it is a little more important there.
There's precedent for this in Ada, for example:
(snip)

If such a feature were to be added to C, it could either re-use
the existing syntax for goto labels (Perl does this), or invent
a new syntax used only to name loops and switch statements (and
perhaps blocks).
I'm certainly aware that loops don't have names that are relevant
to break statements *in C as it's currently defined*.

And Fortran also has EXIT with labels, to allow more than one
loop to be exited.

-- glen
 
Tim Rentsch

glen herrmannsfeldt said:
(snip, I wrote)


Yes. All I meant was that sometimes, when reading a program, it
isn't so easy to know which function a return statement belongs to.

Sure. Obviously true, and again irrelevant to what I
was saying. You could have said that programs whose
variables consist solely of o's, l's, 0's and 1's
can be difficult to read - just as true, and just as
relevant.
 
