Musings, alternatives to multiple return, named breaks?

mathog

I am not a big fan of code that uses multiple returns in one function.
(I do it at times, but try not to.) However, without using goto,
eliminating the multiple returns leads either to a deeply nested series
of "if" blocks or a construct like this

int function(){
    int status = 0;
    while(1) { /* this is NOT used as a loop */
        if(something)break;
        if(something_else)break;
        status = 1;
        break;
    }
    return(status);
}

This sort of while construct has a couple of problems:

1. It looks like it should be running in cycles, and the only thing
telling a programmer who stumbles onto it that it isn't is the comment -
or code analysis, and that takes time if the code is long.

2. If there are real loops used within this then an extra variable is
needed solely to carry the "break" up to the outer level.

3. It needs the final break, to force a loop to not be a loop, because a
loop construct is being used in a nonlooping context.
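
To make problem 2 concrete, here is a minimal sketch of what the extra variable looks like once a real loop sits inside the run-once while(1). The function name all_non_negative and the flag name bail are made up for illustration; they are not from any real code base.

```c
#include <stddef.h>

/* Hypothetical example: returns 1 if every value is non-negative, else 0.
   The outer while(1) runs at most once; `bail` relays the inner break. */
int all_non_negative(const int *values, size_t count)
{
    int status = 0;
    int bail = 0;               /* exists only to carry the break outward */

    while (1) {                 /* NOT used as a loop */
        if (values == NULL) break;

        for (size_t i = 0; i < count; i++) {
            if (values[i] < 0) {
                bail = 1;       /* a plain break only leaves the for loop */
                break;
            }
        }
        if (bail) break;        /* second break leaves the while(1) */

        status = 1;
        break;                  /* final break keeps the loop from repeating */
    }
    return status;
}
```

Calling all_non_negative on an array containing a negative value returns 0 by way of two breaks and the flag, which is the overhead being complained about.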

The final alternative, to use goto, is not a programming problem, but it
is a social one, since one will inevitably have to defend the use of the
"evil" goto, even though if used properly it does not result in
spaghetti code - here it actually simplifies the structure. I hate
those annoying exchanges, and admit that I avoid goto's even in some
cases where it would make sense to use them just so I never have to
waste time discussing the matter.

I think something like this would be a better syntax than the preceding,
but AFAIK it isn't part of C and it isn't even on the drawing board:

int function(){
    int status = 0;
    once(label) { /* will not be confused with a loop */
        if(something)break; //exits once()
        if(something_else)break(label); //exits once()
        while(1){ /* really a loop */
            if(some_loop_ending_test)break; // exits while()
            if(some_logic_ending_test)break(label); //exits once()
        }
        status = 1;
    } /* target for break(label) */
    return(status);
}

and of course these could be nested with the constraints that all labels
must be unique in function and break must be to the last line of an
enclosing once():

once(label1){
once(label2){
if(test1)break;
if(test2)break(label1);
}
/* other tests */
}

In terms of moving code around, for instance, when editing it, it seems
like it wouldn't be particularly problematical. If part of the contents
of once(label1) are moved into once(label2), for whatever reason, to
clean it up would only require a search and replace on the pasted block
for "break(label1)" -> "break(label2)". If an editing error is made the
compiler will catch labeled breaks to labels for structures that they
are not within.

Another way to go about this would be to provide a way to "label" any
pair of matched {} at the top bracket, so that "once" would not be
needed. For discussion's sake, one such syntax would be "label:{", which
could be used any place a "{" is currently used. It looks very odd to
me in some contexts

if(test)first_test_set:{
if(test2)break(first_test_set);
//other stuff
}

but OK in others:

parity_tests:{
if(test)break(parity_tests);
// other stuff
}

Presumably there would be some conflicts between this syntax and the
existing C line label syntax, so the actual syntax for this sort of
label might be something else.

The equivalent "continue(label)" is suggested by symmetry, but maybe it
should be avoided because it would turn code that does not look like it
cycles, into code that does cycle. The labeled continue might be useful
though if its use was restricted to the innards of existing loops, like
a for(){} or a while(){}.
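
For comparison, the effect of a labeled continue restricted to real loops can already be had in standard C with a goto to a label at the bottom of the outer loop body. This is only a sketch of that workaround, not the proposed syntax; the function name count_clean_rows and the label next_row are invented for the example.

```c
#include <stddef.h>

/* Hypothetical example: counts the rows of a rows x cols grid (stored
   row by row in `cells`) that contain no negative entry. */
int count_clean_rows(const int *cells, size_t rows, size_t cols)
{
    int clean = 0;

    for (size_t r = 0; r < rows; r++) {
        for (size_t c = 0; c < cols; c++) {
            if (cells[r * cols + c] < 0)
                goto next_row;  /* stands in for "continue(outer)" */
        }
        clean++;                /* reached only if no negative was found */
next_row:
        ;                       /* a label must be followed by a statement */
    }
    return clean;
}
```

The jump only ever goes forward to the end of the enclosing loop body, so it does not turn non-cycling code into cycling code.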


Any chance we will ever see something like this in a C language
standard? Labeled breaks are not required, of course, since we can do
the same things with existing structures, but it would certainly be
_convenient_ if something like this was available.

Regards,

David Mathog
 
Kenny McCormack

I am not a big fan of code that uses multiple returns in one function.
(I do it at times, but try not to.) However, without using goto,
eliminating the multiple returns leads either to a deeply nested series
of "if" blocks or a construct like this

int function(){
int status = 0;
while(1) { /* this is NOT used as a loop */
if(something)break;

The more common idiom is:

do {
...
}
while (0);

which is guaranteed to execute exactly once.
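
Applied to the function quoted above, the idiom might look like the following sketch. The name check_inputs and the two int parameters are assumptions standing in for the original "something" tests.

```c
/* Hypothetical rewrite of the quoted function using do { } while (0).
   The parameters stand in for the original "something" conditions. */
int check_inputs(int something, int something_else)
{
    int status = 0;

    do {
        if (something) break;       /* leaves the do block */
        if (something_else) break;  /* leaves the do block */
        status = 1;
    } while (0);                    /* condition is false: body runs once */

    return status;
}
```

No trailing break is needed, because the loop condition itself ends the block after one pass.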
 
James Kuyper

I am not a big fan of code that uses multiple returns in one function.
(I do it at times, but try not to.) However, without using goto,
eliminating the multiple returns leads either to a deeply nested series
of "if" blocks or a construct like this

int function(){
int status = 0;
while(1) { /* this is NOT used as a loop */
if(something)break;
if(something_else)break;
status = 1;
break;
}
return(status);
}

This sort of while construct has a couple of problems:

The biggest problem is that it doesn't avoid the real problem of code
with multiple returns: tracing the control flow. It is precisely as
difficult to track down "return" statements as it is to track down
"break;" statements. If the breaks are acceptable, replacing them with
returns should be equally acceptable.
 
Stefan Ram

mathog said:
I am not a big fan of code that uses multiple returns in one function.
(I do it at times, but try not to.) However, without using goto,
eliminating the multiple returns leads either to a deeply nested series
of "if" blocks or a construct like this

Multiple returns are ok, as long as the function is small.

1. It looks like it should be running in cycles

A C programmer knows that the body of a »while« statement
can be executed 0 times or 1 time. He is not surprised when
this happens.

But the bloated coding style given has no obvious advantage
compared to

status = !something && !something_else;
 
Eric Sosman

I am not a big fan of code that uses multiple returns in one function.
(I do it at times, but try not to.) However, without using goto,
eliminating the multiple returns leads either to a deeply nested series
of "if" blocks or a construct like this

int function(){
int status = 0;
while(1) { /* this is NOT used as a loop */
if(something)break;
if(something_else)break;
status = 1;
break;
}
return(status);
}

int function() {
return !something && !something_else;
}

(I do in fact feel some of your pain, but even a local
anesthetic suffices for the injury you've shown. Perhaps
a more agonizing example would help; otherwise, "Take two
aspirin and call me in the morning.")

This sort of while construct has a couple of problems:

1. It looks like it should be running in cycles, and the only thing
telling a programmer who stumbles onto it that it isn't is the comment -
or code analysis, and that takes time if the code is long.

2. If there are real loops used within this then an extra variable is
needed solely to carry the "break" up to the outer level.

3. It needs the final break, to force a loop to not be a loop, because a
loop construct is being used in a nonlooping context.

The `do { ... } while(0);' construct solves that one.

The final alternative, to use goto, is not a programming problem, but it
is a social one, since one will inevitably have to defend the use of the
"evil" goto, even though if used properly it does not result in
spaghetti code - here it actually simplifies the structure. I hate
those annoying exchanges, and admit that I avoid goto's even in some
cases where it would make sense to use them just so I never have to
waste time discussing the matter.

It doesn't seem to me that you've avoided the goto: You've
just found an alternate spelling. True, that might be good enough
for social purposes -- but if you think there are reasons other
than societal pressure to avoid goto, you're deciding to ignore
those reasons. In other words: If you think you can defend this
rather obfuscated construct, you ought to be able to defend a goto.
Or a return, for that matter ...

I think something like this would be a better syntax than the preceding,
but AFAIK it isn't part of C and it isn't even on the drawing board:

int function(){
int status = 0;
once(label) { /* will not be confused with a loop */
if(something)break; //exits once()
if(something_else)break(label); //exits once()
while(1){ /* really a loop */
if(some_loop_ending_test)break; // exits while()
if(some_logic_ending_test)break(label); //exits once()
}
status = 1;
} /* target for break(label) */
return(status);
}

Java has something a bit like this. You can stick a label
on a loop (the loop constructs are much the same as C's), and
inside the loop you can `break label;' (or `continue label;').
It's occasionally (but only occasionally) helpful, when an inner
loop wants to break out of (or continue) a containing loop. Your
technique would work, but it's hard to imagine it surviving a
thoughtful code review.

(Unfortunately, Java's syntax shares a drawback with the one
you propose: The label is *here*, and `break label;' transfers
control to *there*, possibly far away. Ugh.)

Any chance we will ever see something like this in a C language
standard? Labeled breaks are not required, of course, since we can do
the same things with existing structures, but it would certainly be
_convenient_ if something like this was available.

I have no idea whether anything along these lines has been
proposed, and even less about what the Committee would likely
do with it. To me, it feels like the convenience would be limited
to a relatively few corner cases, meaning that the overall utility
might be rather small. (Since Java has *no* goto the designers may
have felt more pressure to provide multi-level control transfer--
but C isn't caught in that particular bind.)

Still, that's just a feeling. Coding styles differ, and it's
quite possible other people run into the lack of multi-level
breaks more often than I do.
 
Stefan Ram

But the bloated coding style given has no obvious advantage
compared to
status = !something && !something_else;

The whole function indeed can be written as just

int function(){ return !something && !something_else; }

. Maybe now you (mathog) say that this was just an example.
But in programming details matter. So to discuss the best
way to implement something, we first need to know /what to
implement/, i.e., a requirement specification. Without
one, one cannot tell whether possible alternative solutions
still solve the problem at hand.
 
Stefan Ram

The whole function indeed can be written as just
int function(){ return !something && !something_else; }

or

int function(){ return !( something || something_else ); }

, which I deem more readable.
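
The two spellings are equivalent by De Morgan's law. As a quick sanity check, here is a small sketch that compares them over all four truth-value combinations; the function name forms_agree is made up for this example.

```c
/* Hypothetical check: returns 1 if (!a && !b) and !(a || b) agree for
   every combination of truth values, and 0 if any combination differs. */
int forms_agree(void)
{
    for (int a = 0; a <= 1; a++) {
        for (int b = 0; b <= 1; b++) {
            if ((!a && !b) != !(a || b))
                return 0;   /* a mismatch would end the search early */
        }
    }
    return 1;
}
```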
 
Stefan Ram

int function(){ return !( something || something_else ); }

Another refactor: Make it meaningful! In a specific project,
we always have more meaning than »something«. Assume:

int function(){ return !( internal_intruder || external_intruder ); }

, we now can make this more readable by defining two functions:

static inline int intruder(){ return internal_intruder || external_intruder; }
static inline int no_intruder(){ return !intruder(); }

. Now compare the readability with the original:

int function(){
int status = 0;
while(1) { /* this is NOT used as a loop */
if(something)break;
if(something_else)break;
status = 1;
break;
}
return(status);
}
 
BartC

mathog said:
I am not a big fan of code that uses multiple returns in one function. (I
do it at times, but try not to.) However, without using goto, eliminating
the multiple returns leads either to a deeply nested series of "if" blocks
or a construct like this

int function(){
int status = 0;
while(1) { /* this is NOT used as a loop */
if(something)break;
if(something_else)break;
status = 1;
break;
}
return(status);
}

I don't think it's good when attempts to avoid goto lead to more convoluted
code. Why not just embrace it, but make the goto a little less obvious; this
defines a new 'statement' to jump to a common return point:

#define break_end goto end

int function(void){
int status = 0;
if (something) break_end;
if (something_else) break_end;
status = 1;
end:
return(status);
}

You can probably dress up multiple-level loop breaks in the same way.
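
As a rough sketch of what that could look like for a two-level loop break: the macro name break_search, the label search_done and the function find_row are all invented here, and the grid is assumed to be stored row by row in a flat array.

```c
#include <stddef.h>

#define break_search goto search_done   /* hypothetical "statement" */

/* Hypothetical example: returns the row index of the first cell equal
   to target in a rows x cols grid, or -1 if no cell matches. */
int find_row(const int *cells, size_t rows, size_t cols, int target)
{
    int found = -1;

    for (size_t r = 0; r < rows; r++) {
        for (size_t c = 0; c < cols; c++) {
            if (cells[r * cols + c] == target) {
                found = (int)r;
                break_search;   /* leaves both loops at once */
            }
        }
    }
search_done:
    return found;
}
```

The macro hides the keyword but not the jump, so the label still has to be unique within the function.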
 
mathog

Stefan said:
Another refactor: Make it meaningful! In a specific project,
we always have more meaning than »something«. Assume:

The examples were all trimmed down to as few lines as possible. The
issue at hand only comes up in the real world when a very much larger
set of tests is applied.

Regards,

David Mathog
 
Stefan Ram

mathog said:
The examples were all trimmed down to as few lines as possible. The
issue at hand only comes up in the real world when a very much larger
set of tests is applied.

When the issue at hand only comes up when a very much larger
set of tests is applied, then the issue at hand cannot be
discussed using examples trimmed down to as few lines as possible.
 
mathog

Eric said:
(Unfortunately, Java's syntax shares a drawback with the one
you propose: The label is *here*, and `break label;' transfers
control to *there*, possibly far away. Ugh.)

Don't let my use of the word "label" lead you into the idea that it is a
C label of the existing type - it is a "{}" label (something we
currently have no name for). It is probably better considered a name
for the block of code delimited by a pair of brackets. So break(name)
doesn't mean "take me to the C label 'name'" but rather "take me to the
line after the 'name' block of code". (And only from within that block
of code.)

In other words, not "take me THERE" but "get me out of HERE".

Regards,

David Mathog
 
Kaz Kylheku

I am not a big fan of code that uses multiple returns in one function.
(I do it at times, but try not to.)

I'm not going to put this at the end of a function:

out: ;
}

just so that I can replace "return;" statements with "goto out;",
and then congratulate myself that my function has only a single return
point.

That function still has multiple returns. The "goto out;" is a return.
It's just not spelled with the return keyword.

Any function which can bail without executing the rest of the function in
fact has multiple returns.

I'm not a fan of code which has multiple return *statements*, such that each
return case contains a different variation on the cleanup which is needed to be
performed, customized to that specific case.

"If this fails, then we need to bail out. But oh, by this statement, we have
acquired the foo handle already, so unlike the previous bail, this one now
needs to also release foo ..."
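
For the cleanup case just described, one commonly used arrangement is a single exit label that releases whatever has been acquired so far. The following is a minimal sketch under stated assumptions: two heap buffers play the role of the resources, and the function name duplicate_twice and its 0 / -1 return convention are made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: copies text into two scratch buffers.
   Returns 0 on success and -1 on any failure. */
int duplicate_twice(const char *text)
{
    int status = -1;
    char *first = NULL;
    char *second = NULL;
    size_t size;

    if (text == NULL) goto out;
    size = strlen(text) + 1;

    first = malloc(size);
    if (first == NULL) goto out;

    second = malloc(size);
    if (second == NULL) goto out;   /* `first` is released at `out` */

    memcpy(first, text, size);
    memcpy(second, text, size);
    status = (strcmp(first, second) == 0) ? 0 : -1;

out:
    free(second);   /* free(NULL) does nothing, so one cleanup fits all exits */
    free(first);
    return status;
}
```

Because every pointer starts out as NULL, each bail-out point can jump to the same label without a customized cleanup sequence.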

If you *can* achieve multiple returns using just a return statement, without
any cleanup, that is a nice situation, and you should just do it that way:

int classify_string(char *a)
{
    if (strncmp(a, "abc", 3) == 0)   /* strncmp returns 0 on a match */
        return CLS_STARTS_WITH_ABC;

    if (strspn(a, " \t") != 0)
        return CLS_HAS_LEADING_SPACES_TABS;

    /* ... */

    return CLS_UNINTERESTING;
}

I will not put gotos into this type of function, just to have one return
statement. Just like I will not arrange pencils on my desk in order
of increasing length.
 
David Brown

Multiple returns are ok, as long as the function is small.

I think it's also OK with extra returns at the start of a function, for
quick exit. So you might have something like this:

int sortOfSquareRoot(int x) {
if (x < 0) return 0;
... big lump of code ...
return y;
}

Having an early return saves a layer of bracketing and indentation, and
should be quite clear in such cases.

But multiple returns in the middle of a function are often hard to
trace, and that's never a good idea.
 
Stefan Ram

David Brown said:
I think it's also OK with extra returns at the start of a function, for
quick exit. So you might have something like this:
int sortOfSquareRoot(int x) {
if (x < 0) return 0;
... big lump of code ...
return y;
}
Having an early return saves a layer of bracketing and indentation, and
should be quite clear in such cases.

The above could also be written as

int root( int const x ){ return x < 0 ? 0 : root_nonnegative( x ); }
 
BartC

Stefan Ram said:
When the issue at hand only comes up when a very much larger
set of tests is applied, then the issue at hand cannot be
discussed using examples trimmed down to as few lines as possible.

It seems obvious to me that this is about both avoiding multiple returns within
a large body of code, and avoiding gotos, and wasn't just the OP's example
taken literally.

We don't need to know what the rest of the code looks like! We can just
assume it's a bunch of statements within which the returns can occur
anywhere, including deeply nested statements (and where it is impractical to
combine them all into one giant return expression).
 
BartC

Kaz Kylheku said:
I'm not going to put this at the end of a function:

out: ;
}

just so that I can replace "return;" statements with "goto out;",
and then congratulate myself that my function has only a single return
point.

That function still has multiple returns. The "goto out;" is a return.
It's just not spelled with the return keyword.

Any function which can bail without executing the rest of the function in
fact has multiple returns.

That would also apply to a function consisting of an if/else-if chain, or a
switch statement, followed by a single return.

If you *can* achieve multiple returns using just a return statement, without
any cleanup, that is a nice situation, and you should just do it that way:

int classify_string(char *a)
{
    if (strncmp(a, "abc", 3) == 0)   /* strncmp returns 0 on a match */
        return CLS_STARTS_WITH_ABC;

    if (strspn(a, " \t") != 0)
        return CLS_HAS_LEADING_SPACES_TABS;

    /* ... */

    return CLS_UNINTERESTING;
}

Obviously that's possible, although the OP wanted alternatives (and his
example had a yes or no return value). The trouble is that alternatives
might not be as clear.

(FWIW, I strive to have a single return, because it makes some things
simpler. If I decided I needed to add some common code just before all the
returns, that's easier with one return point than 28 (and you might only
spot 26). But one at the end, and perhaps one or two near the start, will
do.)
 
Keith Thompson

mathog said:
I am not a big fan of code that uses multiple returns in one function.
(I do it at times, but try not to.) However, without using goto,
eliminating the multiple returns leads either to a deeply nested series
of "if" blocks or a construct like this

int function(){
int status = 0;
while(1) { /* this is NOT used as a loop */
if(something)break;
if(something_else)break;
status = 1;
break;
}
return(status);
}

An alternative is the

do {
/* ... */
} while (0);

idiom, commonly used in macro definitions.
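
A short sketch of why macros use it: wrapping a multi-statement body in do { } while (0) makes the expansion behave as one statement, so it stays safe after an if that has an else. The macro SWAP_INT and the function max_first are hypothetical names for this example.

```c
/* Hypothetical macro: swaps two int lvalues. The do { } while (0)
   wrapper turns the three statements into a single statement. */
#define SWAP_INT(a, b) do { int tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Hypothetical example: ensures *x >= *y. Returns 1 if a swap was
   needed and 0 if the values were already in order. */
int max_first(int *x, int *y)
{
    if (*x < *y)
        SWAP_INT(*x, *y);   /* one statement, so the else still pairs up */
    else
        return 0;
    return 1;
}
```

With a bare { ... } body instead, the semicolon after the macro call would end the if statement and leave the else without a matching if.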

[...]
 
glen herrmannsfeldt

(previous snip on return from nested loops)

Personally, I am not against multiple returns, but, as someone else
noted, it is different in the case of small functions.

Otherwise, the goal is always readability, not blindly following
rules. If you comment it enough, multiple returns might be the
best way. That probably includes a comment near the end, indicating
that there are other returns.

mathog wrote:
Don't let my use of the word "label" lead you into the idea that
it is a C label of the existing type - it is a "{}" label (something
we currently have no name for). It is probably better considered a
name for the block of code delimited by a pair of brackets.

Fortran and Java now have named loops with a statement to exit from
the correspondingly named loop. Personally, I find it harder to read
than a goto.

So break(name) doesn't mean "take me to the C label 'name'" but
rather "take me to the line after the 'name' block of code".
(And only from within that block of code.)
In other words, not "take me THERE" but "get me out of HERE".

Yes, but to follow it, you have to search back for the matching label,
and then forward to find the end of the block. With proper indenting,
that shouldn't be too hard, but then it gets harder as the loop
(and program) get longer.

-- glen
 
mathog

int function(){

Related to this simple case, the only reason the while(1) construct is
needed, with its accompanying terminal "break", is because C does not
support this:

int function(){
    int status = 0;
    { /* start a block of code */
        if(something)break;
        if(something_else)break;
        // many more cases to handle, OK?
        status = 1;
    } /* end a block of code */
    return(status);
}

When a break is used within a delimiting set of brackets like that there
is a compile-time error like:

break statement not within loop or switch

That limitation results because this needs to do what it currently does:

while(1){
if(test){
break; //from outer switch or loop
}
}

Because in C {} delimited code blocks are not nameable, it was
impossible to specify with a break the identity of the block to exit.
So other rules were employed to determine "break from what", and one
corollary was that some pairs of brackets could not be exited with a break.

In terms of naming blocks (a range of lines), which is different than
setting labels (a single line), it might make sense to require both ends
to be tagged, that is (using a made up syntax just to indicate the sort
of thing I mean):

if(test){:has_parity:
//lots of stuff
if(test2)break;
if(test3){
//stuff
break(has_parity);
}
//lots more stuff
}:has_parity:

that looks strange, but many times in long or very nested code
programmers add comments exactly like that to help them keep track of
the code blocks:

if(test){ /* has_parity */
//contents omitted
} /* has_parity */

On thinking about this a bit further, the if/else if/else case is
interesting in terms of block naming, it would not be:

if(test1){:block1:
}:block1:
else if(test2){:block2:
}:block2:

but rather

if(test1){:block_for_whole:
}
else if(test2){
}:block_for_whole:

since a break from "block1" in the first example goes where? Certainly
not to the "else" line after }:block1:, but rather to the one after
}:block2:. Similarly, a break from any other part of that construct
would go to the same place. This indicates that the entire if/else
if/else block should have one name for the whole code block. Similarly,
if the if/else if/else block was not explicitly named, then a break
would still go to the line after the entire block.


Regards,

David Mathog
 
