Note: this article is cross-posted to [comp.lang.c++] and
[comp.programming].
The subject of how to best express a logical "loop-and-a-half"
has popped up in a number of recent [clc++] threads.
The language, C++, matters here because in C++ all code
must be prepared for exceptions at any point, so that in this
language the possibility of missed cleanup due to early exit
is not an issue.
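To make the cleanup point concrete, here is a minimal sketch. The `LogGuard` class and the `firstNegative` function are hypothetical names invented for this illustration; they do not come from the thread. The guard's destructor runs however the function is left, so the early return from the middle of the loop releases it just as the normal return does.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical guard (an assumption for this sketch): it records its own
// construction and destruction in a log so that cleanup can be observed.
class LogGuard
{
public:
    explicit LogGuard( std::vector<std::string>& log ) : log_( log )
    {
        log_.push_back( "acquire" );
    }
    ~LogGuard() { log_.push_back( "release" ); }

    LogGuard( LogGuard const& ) = delete;
    LogGuard& operator=( LogGuard const& ) = delete;

private:
    std::vector<std::string>& log_;
};

// Returns the index of the first negative value, or -1 if there is none.
// The return inside the loop is an early exit; the guard is released anyway.
int firstNegative( std::vector<int> const& values,
                   std::vector<std::string>& log )
{
    LogGuard const guard( log );
    for( std::size_t i = 0; i < values.size(); ++i )
    {
        if( values[i] < 0 ) { return static_cast<int>( i ); }   // early exit
    }
    return -1;
}
```

The same destructor call would happen if the loop body threw an exception, which is why code written this way does not depend on reaching a single exit point for its cleanup.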
It never was really an issue. Dijkstra raised the issue over
thirty years ago. The issues were intensely discussed at the
time, and it quickly became clear that Dijkstra was right, and
that spaghetti code couldn't be made correct. I find it simply
amazing that over 30 years later, there are still people ready
to argue that it's better just to hack away, on simple gut
feeling, rather than to use established techniques of
guaranteeing readability and correctness.
Note that there are different degrees of "spaghetti". While
there are strong reasons to prefer establishing the loop
invariant at the top, having one exit in the middle is still
orders of magnitude better than having several exits.
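#include <iostream>
#include <stdexcept>
#include <string>

bool isValidCommandChar( char c );    // assumed to be defined elsewhere

void throwX( char const s[] ) { throw std::runtime_error( s ); }

// Prompts for a command and returns the line read; throws on i/o failure.
std::string commandStringFromUser()
{
    using namespace std;
    string line;
    cout << "Command? ";
    if( !getline( cin, line ) ) { throwX( "i/o failure" ); }
    return line;
}

// A valid command string is exactly one valid command character.
bool isValidCommandString( std::string const& s )
{
    return (s.length() == 1 && isValidCommandChar( s[0] ));
}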
Then, an example of "loop-and-a-half" expressed with exit in the middle:
// Variant 1, exit in middle.
char validUserCommand()
{
    for( ;; )
    {
        std::string const line = commandStringFromUser();
        if( isValidCommandString( line ) )
        {
            return line[0];     // the single character of a valid command
        }
        giveShortHelpAboutValidCommands();
    }
}
char
getValidatedCommand()
{
    std::string line( getUserInput() ) ;
    while ( ! isValidCommandString( line ) ) {
        giveShortHelpAboutValidCommands();
        line = getUserInput() ;
    }
    return line[ 0 ] ;
}
(I've changed the names to reflect standard conventions as
well---functions are verbs.)
Short, simple, to the point. I don't see why anything more
complicated is needed.
Easy to prove correct. If I'm in the loop, I don't have a valid
command.
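The reasoning can be written into the code as assertions. What follows is only a sketch under stated assumptions: the parameters `readLine` and `onInvalid` are hypothetical stand-ins for getUserInput() and giveShortHelpAboutValidCommands(), added so the function can run without a terminal, and the rule that a valid command is a single lowercase letter is invented for the example.

```cpp
#include <cassert>
#include <cctype>
#include <functional>
#include <string>

// Assumption for this sketch: a valid command character is a lowercase letter.
bool isValidCommandChar( char c )
{
    return std::islower( static_cast<unsigned char>( c ) ) != 0;
}

bool isValidCommandString( std::string const& s )
{
    return s.length() == 1 && isValidCommandChar( s[0] );
}

// `readLine` and `onInvalid` are hypothetical parameters, not part of the
// original function, which took no arguments.
char getValidatedCommand( std::function<std::string()> const& readLine,
                          std::function<void()> const& onInvalid )
{
    std::string line( readLine() );
    // Invariant at the loop test: `line` holds the most recent input.
    while( !isValidCommandString( line ) )
    {
        // Inside the body the loop condition guarantees an invalid line.
        assert( !isValidCommandString( line ) );
        onInvalid();
        line = readLine();      // progress: one more line consumed
    }
    // Post-condition: the negation of the loop condition.
    assert( isValidCommandString( line ) );
    return line[0];
}
```

Both assertions follow directly from the loop condition, which is the point: nothing beyond the `while` test is needed to state what holds inside the loop and what holds after it.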
[...]
It has been argued that the last two forms have a loop
invariant whereas the exit-in-middle lacks one, or more
specifically, that its loop invariant isn't established until
halfway through the first iteration.
I think that's a bogus argument, and prefer variant 1,
exit-in-middle.
Why do you think it's bogus? Why do you think we should ignore
loop invariants? This is an honest question. Your first
example does not violate the single entry/single exit principle,
which is the most important. But some sort of reasoning
involving loop invariants is necessary if you want to be sure
the code is correct. (I think we can agree that calling
"getUserInput()" makes progress toward loop termination, the other
critical "proof" necessary for correctness.) I'm not saying
that such reasoning is necessarily impossible, but it seems to
me much simpler if we establish the condition up front (and if
the post-condition of the function---that we've read a valid
line---is also the termination condition of the loop).
FWIW: at one point (some 15 years ago), I did consider the idea
that the two versions are "equivalent"---that your version is a
transformation (in the Chomskian sense) of my version, much
like, say, "The dog is seen by me" is a transformation of "I see
the dog". And of course, it is possible to prove that if the
underlying sentence is correct, then the transformation is as
well. But that's exactly how I'd go about analysing your
version: prove that it is functionally equivalent to mine, then
prove that mine works. Which means extra work in understanding
yours.
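The claimed equivalence can at least be spot-checked mechanically, though a test over a few inputs is not a proof. In the sketch below, the `Script` type, the function names `exitInMiddle`, `primingRead` and `agreeOn`, and the rule that 'a' to 'c' are the valid commands are all assumptions made for the illustration. `Script` replaces the interactive prompt with a fixed list of lines.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical scripted input source standing in for the interactive prompt.
struct Script
{
    std::vector<std::string> lines;
    std::size_t next;
    int helpShown;

    explicit Script( std::vector<std::string> const& l )
        : lines( l ), next( 0 ), helpShown( 0 ) {}

    std::string read() { return lines.at( next++ ); }  // throws if exhausted
    void help() { ++helpShown; }
};

// Assumption for this sketch: 'a' through 'c' are the valid commands.
bool isValidCommandString( std::string const& s )
{
    return s.length() == 1 && s[0] >= 'a' && s[0] <= 'c';
}

// Exit-in-middle form.
char exitInMiddle( Script& io )
{
    for( ;; )
    {
        std::string const line = io.read();
        if( isValidCommandString( line ) ) { return line[0]; }
        io.help();
    }
}

// Priming-read form: the condition is established before the loop.
char primingRead( Script& io )
{
    std::string line( io.read() );
    while( !isValidCommandString( line ) )
    {
        io.help();
        line = io.read();
    }
    return line[0];
}

// True if both forms return the same command after consuming the same
// number of lines and showing the help text the same number of times.
bool agreeOn( std::vector<std::string> const& lines )
{
    Script a( lines );
    Script b( lines );
    char const ca = exitInMiddle( a );
    char const cb = primingRead( b );
    return ca == cb && a.next == b.next && a.helpShown == b.helpShown;
}
```

Each script passed to `agreeOn` must contain at least one valid command; otherwise `Script::read` throws `std::out_of_range` when the lines run out.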