Is goto Still Considered Harmful?

T

Tim Rentsch

Lynn McGuire said:

gotos are like weeds. A garden may have a few weeds scattered
here and there yet still be a fairly nice garden. But when a
garden has weeds all over the place it starts to look pretty
ratty, even if it's a rather nice garden otherwise. And of
course the very nicest gardens have no weeds at all.

Dijkstra's point is not about gotos but about programmers, viz.,
programmers who feel no compunction against using goto freely
tend to write lousy code. And this is borne out IME: even among
developers who are okay with using goto on occasion, the good
ones still have an aversion to using it generally and look for an
alternate formulation before resorting to goto.
 
T

Tim Rentsch

James Van Buskirk said:
["Followup-To:" header set to comp.lang.c.]

Posted to clf because the nature of the question is central to an
issue due to Fortran interoperability (_Bool <-> LOGICAL(C_BOOL)).


At last! A C programmer to inflict the Question on! Now, what
if true is a const _Bool with internal representation B'00000010'
and var is also an _Bool. Does the assert happen if NDEBUG is
not on?

The code generated by the assert macro, when NDEBUG is not
present, should not treat this expression any differently than
something like

if (var = true) ...

The assignment must happen if NDEBUG is absent, and its result
tests as true.

Yes, this last line of code above is perhaps a better illustration
of the issue. My impression was that assignment between variables
of the same integer type in C was to copy all the bits, ...

Not quite - the value of the right operand of an assignment
expression is determined (a process that may involve the usual
arithmetic conversions), and is then converted to the destination
type (if different).

Part of the premise of this subthread is that both left and right
operands have the same intrinsic type, _Bool. See above. In
n1124.pdf, section 6.3 I read:

"Conversion of an operand value to a compatible type causes no
change to the value or the representation."

Doesn't that mean that the compiler is just supposed to copy bits?
[snip elaboration]

The short answer is no, but it's instructive to look at why.

First, if the width of _Bool is 1, then it may only ever store
the values 0 and 1. Any bit pattern occupying a _Bool object
will therefore denote either 0, or 1, or a trap representation.
If the bit pattern denotes a 0 value then 0 will be stored; if
the bit pattern denotes a 1 value then 1 will be stored; and if
the _Bool object holds a trap representation then any behavior at
all is permissible, either copying the bits, or storing a 0 or 1
value, or crashing the program. So a compiler is not obliged to
exactly copy the bits in this case, although of course it may,
but the key thing is it doesn't have to.

Now consider the question when the width of _Bool is greater
than 1. Under this assumption a _Bool may now hold an object
representation corresponding to the value 2 or 3 (and perhaps
others, but these are enough), and are legal values, ie, not
trap representations. The Standard imposes two requirements:
one, that conversion to the same type causes no change to the
value or representation; and two, that conversion to a _Bool
type produces 0 if the original value compares equal to 0, and 1
otherwise. The only way both of these requirements can be
satisfied simultaneously is if a _Bool object may not definedly
hold a value other than 0 or 1. Hence the possibility that the
width of a _Bool is greater than 1 is not satisfiable in a
conforming implementation. Therefore we may conclude that _Bool
/must/ have a width of 1, which gives each implementation enough
freedom to either just copy the bits or not, as it chooses.
 
M

Malcolm McLean

When you write a function »double average( int const N ...«,
what value do you return on an abort?
It just depends. There's a case for nan or 0, sometimes I use DBL_MAX.
You might or might not want an assert in there too. Whilst you've still got
to return something, on many environments asserts are left in on release
builds, so it's largely academic.

If you know the context the function is going to be called from, that tells
you what to do. For example I recently wrote an average function for a
palette chooser. You could conceivably get an empty list (e.g. if user chooses
a 256 colour palette from 8x8 image), so it's correct to quietly return black
for unused cells. So in that case, putting return 0 is the easy if slightly
hacky way of dealing with it.
 
J

James Kuyper

My assertion is that in C we (the majority of educated
programmers of today) prefer short functions. I do not wish
to make any assertion about whether this holds or does not
hold for any other language.

Well, in my experience, most C programmers seem to prefer function
lengths that I'm not willing to call "short". I can't say for sure how
many of them you would be willing to consider "educated", but I believe
that a significant fraction of them had CS degrees.

I do see a lot of short member functions written by educated C++
programmers, because they tend to make heavy use of classes with
inheritance, with each class in the inheritence tree having
responsibility for only a small piece of the total behavior of the code.
That responsibility can often be met by defining a small number of very
simple member functions. Non-member functions, however, (main() in
particular) are often very long.
 
M

Malcolm McLean

I do see a lot of short member functions written by educated C++
programmers, because they tend to make heavy use of classes with
inheritance, with each class in the inheritence tree having
responsibility for only a small piece of the total behavior of the code.
That responsibility can often be met by defining a small number of very
simple member functions. Non-member functions, however, (main() in
particular) are often very long.
A lot of C++ programmers write short access functions, getField() and setField(), that simply
read and assign a member variable, maybe with a range check. Whilst you do need the odd one,
heavy use of getter and setters is usually a sign that the programmer hasn't really understood how
to abstract functionality. Whilst the idea is that the getters and setters can be over-ridden or
rewritten at a later date, that's unlikely to happen, more likely the code will need a total rewrite.
 
S

Stephen Sprunk

A lot of C++ programmers write short access functions, getField() and
setField(), that simply read and assign a member variable, maybe with
a range check. Whilst you do need the odd one, heavy use of getter
and setters is usually a sign that the programmer hasn't really
understood how to abstract functionality. Whilst the idea is that the
getters and setters can be over-ridden or rewritten at a later date,
that's unlikely to happen, more likely the code will need a total
rewrite.

The usual rationale is that changing the member variable means you need
to update _other_ state in the object (or resource you're wrapping) as a
side effect; the setField() functions provide a place for that to
happen, and the getField() functions are for symmetry. Or you may need
to get the current state from the wrapped resource, or calculate it, etc.

Some member variables _can_ be safely updated by callers, at least in
this version of the class. But what if you may need to make them
private in a future version due to adding side effects? Better to make
callers use the accessor functions now so they aren't broken by an API
change that _should_ be transparent to them.

(The above applies to OO interfaces in C as well, just minus the
syntactic sugar that C++ classes provide to make it a little less painful.)

Overloading operators (so callers appear to be directly updating a
member variable) seems more elegant in some respects, but it hides a
(possibly expensive) function call, which may introduce other problems.
That's a general problem with C++'s operator overloading, not specific
to this.

S
 
S

Stephen Sprunk

I prefer functions not to be "unnecessarily long".

Sometimes "necessarily long" can mean a thousand lines or more.

I've never run across such a case.
If a large function is broken up into a small top function which
calls many smaller ones, but those smaller ones have no caller other
than the top function, then the change is pretty much a waste of
effort.

Not at all. You can test the smaller functions in isolation and remove
their complexity from the top-level function; that itself has value,
even if the compiler just ends up inlining them.

Also, massive functions imply that your function is doing too many
things, which is inherently problematic and difficult to test or
troubleshoot; it is often better to create a set of small but related
functions, which then call a common set of helper functions.
Not only that, but C has poor support for dividing logic into
functions. Breaking up a large function into smaller ones can mean
changing the structure of the code in order to simulate the
environment passing that naturally occurs in a language with nested
functions. What were originally nice, local variables in the
original function turn into some ugly structure or structures that
only exists in order to support the fragmentation of the logic into
multiple functions.

How did all that state get into the top-level function, though? Use the
same mechanism to pass (a portion of) that state to the helper functions.
Breaking up logic into functions should only be done when:

- some of those functions will be called from more than one place,
eliminating repetitions.

Unit tests mean _every_ function is called from at least two places.
- some of those functions are useful on their own and will end up
being called by other functions or modules.

That's nice but not necessary.
- pieces of the original large code were being selectively dispatched
by a large switch or if/else ladder, so that when the function is
broken up, a dispatch based on function pointer indirection can
streamline the flow and possibly even be faster.

If you have a fairly uniform dispatch loop that can use such a
structure, sure. Many times, you can't.

Still, if the cases are more than a few lines each, I prefer to put the
code in a separate function, even if that's the only caller (besides the
unit test, of course).
Experienced, mature programmers can handle large functions,

Experience and maturity cannot overcome the inherent limit on the amount
of information that can be kept in short-term memory.
and large functions occur in all advanced, real-world projects.

That doesn't mean they're a good idea.
For instance, the function _dl_map_object_from_fd in the glibc
dynamic linker:

https://sourceware.org/git/?p=glibc.git;a=blob;f=elf/dl-load.c

goes from line 919 to 1623. (Commit 9455df...bef85a).

A lot of that is whitespace, preprocessor directives or comments. Also,
that function clearly does a sequence of operations that could have been
broken up into helper functions--even if it were the only caller. My
bet is that it started out a lot smaller but grew over time as features
and special cases (or bug fixes) were added, a series of straws that
should have broken the camel's back long ago.

S
 
J

James Kuyper

On 25-Mar-14 11:14, Kaz Kylheku wrote: ....
Unit tests mean _every_ function is called from at least two places.

That depends upon how you define unit tests. For purposes of unit
testing, I treat functions with internal linkage as part of the
function(s) with external linkage in the same translation unit that
(directly or indirectly) call them. If they have any feature which
cannot be properly tested through calls to those externally linked
functions, then that generally means that the feature is either
unnecessary or at least badly designed.

....
Experience and maturity cannot overcome the inherent limit on the amount
of information that can be kept in short-term memory.

No, but it can dramatically increase that limit. Well-designed programs
stress your short-term memory capacity much less than poorly designed
programs, because they're easier to understand.
 
I

Ian Collins

Stephen said:
A lot of that is whitespace, preprocessor directives or comments. Also,
that function clearly does a sequence of operations that could have been
broken up into helper functions--even if it were the only caller. My
bet is that it started out a lot smaller but grew over time as features
and special cases (or bug fixes) were added, a series of straws that
should have broken the camel's back long ago.

That function appears to be a classic case of the code rot that occurs
when functionality as added without any thought to improving the design.
One of the things I found surprisingly hard when managing development
teams was getting programmers to refactor code before updating it. It's
all too easy just to pile on more functionality. This is usually a
symptom of poor management not allowing time for code improvements and a
lack of unit tests which support refactoring.
 
M

Malcolm McLean

The usual rationale is that changing the member variable means you need
to update _other_ state in the object (or resource you're wrapping) as a
side effect; the setField() functions provide a place for that to
happen, and the getField() functions are for symmetry. Or you may need
to get the current state from the wrapped resource, or calculate it, etc.

Some member variables _can_ be safely updated by callers, at least in
this version of the class. But what if you may need to make them
private in a future version due to adding side effects? Better to make
callers use the accessor functions now so they aren't broken by an API
change that _should_ be transparent to them.

(The above applies to OO interfaces in C as well, just minus the
syntactic sugar that C++ classes provide to make it a little less painful.)
That's the standard rationale. It does work sometimes. The reality is that
setting a member directly can't usually be replaced by an interface that
does something, because that would change the contract and break code in
unpredicatable ways.
E.g. we have a setSalary method that sets an employee's salary. We could
decide that we'll update his tax bracket at the same time to keep the class
always consistent. But now we've broken logic that goes

employee.setSalary( newsalary);
if(employee.getTaxBracket() != taxBracketForSalary(newsalary))
{
SendLetter("Your tax has changed");
employee.updateTaxBracket();
}

So in fact we've got to rewrite the program. Which is possibly the best thing
to do.
 
S

Stefan Ram

Malcolm McLean said:
E.g. we have a setSalary method that sets an employee's salary. We could
decide that we'll update his tax bracket at the same time to keep the class
always consistent. But now we've broken logic that goes

When the documentation of the public interface (the
contract) of the setSalary function already contained the
guarantee that this function will only change the salary and
nothing else, then changing this guarantee will change the
public interface of the entity, which in any case (not only
in the case of fields) means that all clients of the old
public interface are not necessarily valid clients of the
new interface. So, one usually tries to avoid to change
public interfaces, not only in the case of field and setters.
 
P

Phil Carmody

Ian Collins said:
That function appears to be a classic case of the code rot that occurs
when functionality as added without any thought to improving the
design. One of the things I found surprisingly hard when managing
development teams was getting programmers to refactor code before
updating it. It's all too easy just to pile on more functionality.
This is usually a symptom of poor management not allowing time for
code improvements and a lack of unit tests which support refactoring.

Oh, come on, that function is carp, there's no other word for it.

Phil
 
I

Ian Collins

Phil said:
Oh, come on, that function is carp, there's no other word for it.

Ah Phil, you can which one of us has been corrupted by the politics of
management!
 
S

Stefan Ram

A carp is a fish. The (other )word for it is »crap«.

As an example for how to solve problems with small functions,
I wrote this program that prints a list of prime numbers:

#include <stdio.h>

static int divides( int const divisor, int const number )
{ return number % divisor == 0; }

static int numberofdivisors( int const n )
{ int sum = 0; for( int i = 1; i <= n; ++i )sum += divides( i, n ); return sum; }

static int prime( int const number )
{ return numberofdivisors( number )== 2; }

static void printifprime( int const number )
{ if( prime( number ))printf( "%d\n", number ); }

int main(){ for( int i = 1; i < 101; ++i )printifprime( i ); }
 
S

Stefan Ram

As an example for how to solve problems with small functions,
I wrote this program that prints a list of prime numbers:

As an example for how to solve problems with if-goto
I wrote this assembler program to print some prime numbers:

#include <stdio.h>

int main()
{ int i;
int sum;
int j;
int p;
int tmp;
i = 1;
loop: if( i >= 101 )goto out;
sum = 0;
j = 1;
innerloop: if( j > i )goto innerout;
tmp = i % j;
tmp = tmp == 0;
sum += tmp;
++j;
goto innerloop;
innerout: p = sum == 2;
if( !p )goto over;
printf( "%d\n", i );
over: ++i;
goto loop;
out: ; }
 
M

Malcolm McLean

When the documentation of the public interface (the
contract) of the setSalary function already contained the
guarantee that this function will only change the salary and
nothing else, then changing this guarantee will change the
public interface of the entity, which in any case (not only
in the case of fields) means that all clients of the old
public interface are not necessarily valid clients of the
new interface. So, one usually tries to avoid to change
public interfaces, not only in the case of field and setters.
There's some truth in that.
The situation is that we have an employee, a salary, an associated tax band,
and when a change in salary results in a change in tax band, we need to
send him a letter informing him of the fact.
So we've got three levels of task.
Lowest level - write the new salary to the field.
Medium level - ensure the tax band is calculated and represented correctly.
High level - send the letter.

The idea of abstracted get/setters is that a task at the lowest level can
be reimplemented as a task at a higher level. So instead of rewriting the
field, we recalculate the tax band or even send the letter.
But a change in the level of a function is much more likely to break things
deeply than a change in what the function does (e.g. a high-level
updateSalary() method sends a letter, we rewrite it to send a carbon copy
to the printer as well. That change is of a different order to changing
setSalary() from a field write to sending a letter).

You do need some access functions. But they should be the exception.
 
M

Malcolm McLean

On 4/1/14, 3:24 AM, Malcolm McLean wrote:

I suppose if your data model says that to meet data consistency rules
that are to kept, the lower two functions/operations are NOT part of the
"Public" interface of the class, but should be private or maybe
protected members. Your lowest level operation, being seemingly DEFINED
as just changing the member, should not be a member function in my mind.
Effectively that's right. A setter is "kind of defined as just changing the member".
So it shouldn't be public, which means there's no reason for it to be a function.
The middle level, which sets the variable and updates other fields to be
consistent would be a function, but with access restrictions so that
only are in charge of maintaining the full constraints call it. This
also should be the only place where the member is changed (which is one
reason the lowest level shouldn't be a function, it will be, by
definition, a one liner called in all likelyhood one place.
We've also got special cases. For example it's probably valid to copy an employee, so we'll need
to set the salary field there. Similarly in retrieving him from disk. Then when he retires or leaves
the company there might be a rule, or it might be added at a later date.
The High level function is really the only place that you have your
public API. Things that don't know the details about the rules for
updating a salary should only deal with this. And in fact, this
function should probably be the only place the medium level routine is
called. This could be a reason to say that the middle level shouldn't
even exist as a function, but there may be enough difference in
abstraction of these operations that it does make sense to have them
separate.
There's fundamental difference. The middle level is only manipulating the employee. It needs to
know about the employee and the rules for tax bands (which probably means a connection to
some kind of tax law engine). But it doesn't need to know about letters and printers, nor should it.
Note, that since the whole purpose of having a setter is so that it my
add any needed constraint update, anything that breaks when you add it
doing something (like writing the letter) was already broken, you just
couldn't see it, or should have already been documented as needing some
"special access" so that it was automatically reviewed (and perhaps
updated) when the change was made.
The entire concept of "change the implementation but not the interface" is flawed, when you think
about it. With the exception of cosmetic changes to code, the only reason to change an implementation
is to change the object's behaviour in some respect. So what we really mean is "keep the interface valid".
That's not an easy thing to define. For example, say we move from drawing pixel lines to drawing
anti-aliased lines. That's fine if caller wants something to display to the end user, hopeless if he's
doing post-processing of the output. That doesn't mean that the change is unreasonable, just
that we've got to be careful.
But getters and setters are probably more likely to break than other functions, if rewritten. They
create the illusion of object-oriented programming when in fact they're just syntactical sugar for
structures.
 
S

Stefan Ram

Malcolm McLean said:
Effectively that's right. A setter is "kind of defined as just changing the member".

There are different definitions of »setter«. For example,
some definitions say that a setter does not change a member,
but a property. Some definitions say that a setter does not
always change a member, but might conditionally not change a
member to protect class invariants, or might have additional
effects, such as »setVisible( false )« making a screen
object invisible. Some people seem to call every function
whose name begins with »set« followed by a word a setter.
 
M

Malcolm McLean

There are different definitions of »setter«. For example,
some definitions say that a setter does not change a member,
but a property. Some definitions say that a setter does not
always change a member, but might conditionally not change a
member to protect class invariants, or might have additional
effects, such as »setVisible( false )« making a screen
object invisible. Some people seem to call every function
whose name begins with »set« followed by a word a setter.
There's a continuum between setting a field, range-checking and setting a field, setting and changing
another member, and triggering widespread changes.
However if you've got a matching pair of get/set functions, and a field which contains that value,
then I'd say you're using get/setters. I don't ban them totally - obviouslya GUI checkbox, for example,
will reasonably have an internal state and a get/set pair to access it. Butthat's rather a special case,
it's a sort of graphical wrapper for a boolean which the user can edit, so you're giving the user direct
access to a field.
You might also have a background colour. But I'd be unhappy (though not toounhappy) about a
get / set background colour pair. In reality the user is probably going to want white or grey - if you
permit a "theme" then you don't really want it implemented by thousands of calls to set background
methods. And it's not clear what a set background should do if you move to a fancy 3d type effect
with a shadow under the tick. And if it's necessary to query the checkbox background as part of
caller's logic control, really his program is extremely badly designed.
So I'd prefer a "set appearance" method which offers a limited menu of aesthetically pleasing
colours.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,120
Messages
2,570,710
Members
47,283
Latest member
hopkins1988

Latest Threads

Top