Coding standards

Lew Pitcher

Chris Croughton wrote:
[snip]
I read C code in words, so I will mentally pronounce

if (var == 4)

as "if var is four", the alternative "if four is var" doesn't make much
sense.

Which just goes to show the inadequacy of trying to express mathematical truths
in common English.

Mathematically, if a is equal to b, then b is equal to a, and thus it is just as
true to say "if four is var" as it is to say "if var is four".


--
Lew Pitcher
IT Consultant, Enterprise Data Systems,
Enterprise Technology Solutions, TD Bank Financial Group

(Opinions expressed are my own, not my employers')
 
Jarno A Wuolijoki

Is it really possible for someone who can understand
variable == constant
to get lost while trying to read
constant == variable
?

No. Instead it's possible that it strikes out as unnecessary obfuscation
that uses a mathematical trick just to do things in an atypical way
(perhaps gaining a debatable advantage) thus slowing down the process of
processing the meaning of the source.

To put it in more technical terms, my brain has 7 registers (plus a stack
pointer which is unavailable) and a disproportionately high penalty for
accesses to long-term memory. In order to optimize their use, I can take
advantage of sophisticated data compression algorithms to make more room
for obvious things with a minimal expense for less obvious stuff. (I have
hardware bit extraction functions so the act of (de)compression doesn't
take those resources)

Now, considering that variable==constant is about 10 times more popular
way of putting things than constant==variable, which of them is more
likely to require spilling an extra register causing a pipeline stall
and a tedious recollection of lost context that might result in a yet
another coffee break?
 
Keith Thompson

pete said:
Is it really possible for someone who can understand
variable == constant
to get lost while trying to read
constant == variable
?

Yes, if only briefly.

Of course we all know that (x == 4) and (4 == x) are entirely
equivalent. The difference is in how I process it mentally as I read
it left-to-right.

The expression (x == 4) tells me something about x, presumably
something useful. The expression (4 == x) tells me something about 4,
a topic on which I think I learned all I need to know before I left
elementary school.

The (4 == x) idiom grates on my nerves in a similar way as something
like:

#include <stdio.h>
int main (
void )
{
printf (
"Hello, world\n" )
;
return 0 ;
}

That's an absurdly exaggerated example, but it's the same idea; it's
exactly equivalent to the more legible version, but it takes too long
to figure that out, for no sufficient benefit.

Now if your mind happens to work in such a way that (x == 4) and
(4 == x) are equally legible, that's great (no sarcasm). But mine
doesn't, and that's not going to change. I can deal with it when I
have to, just as I deal with indentation styles that I dislike; I just
find (x == 4) easier to read.

And yes, I know that some people feel that avoiding (x = 4) errors is
a sufficient benefit. I don't.

If it weren't for the issue of accidentally using "=" rather than
"==", would you even consider writing (4 == x) (or 3["hello"])?
 
E. Robert Tisdale

Lawrence said:
E. Robert Tisdale said:
Write

constant == variable

instead of

variable == constant

I have to say this is a pet [peeve] of mine.
You are significantly compromising the readability of the code
for a limited case that
many compilers will warn about anyway if you ask them to.

I agree. A quality implementation should warn the programmer
whenever an assignment operator appears in a conditional expression.
IMO it is far better to write code that is more readable.



That doesn't always express the logic naturally, e.g.

for (x = max; min <= x; --x)

looks nasty to me.
I'd have to flip the condition operation around in my mind
to feel comfortable with what it is doing.
Writing x >= min here is much more natural.

I don't agree with your comments about readability.
I am completely comfortable with my style rules
and your rules now seem unnatural to me.
But that is exactly the advantage
of developing or adopting a consistent coding style.
It narrows the possible patterns so that
any deviation (often an error) stands out.
Your style may bother other programmers
when they begin to read your code but,
if you are consistent, they will quickly become habituated
and should have no trouble reading, understanding
and maintaining your code.
 
Christian Bau

pete said:
Is it really possible for someone who can understand
variable == constant
to get lost while trying to read
constant == variable

It takes more time, and it is pointless. Considering how much code I
have to _read_ every day, any unnecessary obstacles like that are really
annoying.
 
CBFalconer

Chris said:
The first one is more expected to change, in English. "Is the
ship at its destination?" makes a lot more sense than "Is the
destination at the ship?", although "Is ship 1 at ship 2?"
(where both are variable) makes sense.

Neither thing especially implies either variability or constancy.
The only question being asked is "are they the same" for some
definition of "the same". In some languages this would require
that they reside in the identical place.

I maintain that it is a mistake to read either more or less into an
expression than is actually meant. In this case the meaning is set
by the ISO C standard. Of course you are perfectly free to use it
in the way that makes most sense to you, but please do not imply
that this is the only usage, nor criticize those who set up an
additional bulwark against silly mistakes. Meanwhile I am
attempting to convince you to join that crowd.
 
Eric Sosman

CBFalconer said:
Please explain to me how:

thinga == thingb;

carries any different connotations than:

thingb == thinga;

"Backwards run sentences until boggles the mind."
-- Dorothy Parker
 
Thomas Stegen

CBFalconer said:
Please explain to me how:

thinga == thingb;

carries any different connotations than:

thingb == thinga;

Maybe they don't.
although I will readily concede that, in C, replacement of '=='
with '=' will immediately attach special connotations.

I don't know about you, but I am human. And as such I use language
in certain ways to enhance my understanding of many things.

a == b says to me that the important thing here is that a is equal to
b, not that b is equal to a. Now, they mean the same thing, but being
human... Maybe you perceive this as a flaw, but hey, I've found a
coping mechanism.
 
Lawrence Kirby

On Mon, 20 Dec 2004 13:35:10 -0800, E. Robert Tisdale wrote:

....
I don't agree with your comments about readability.
I am completely comfortable with my style rules
and your rules now seem unnatural to me.

In which case you've made it more difficult for yourself to read the
majority of C code out there. Also consider that x in the loop above is
the control variable, if you like the subject, of the loop. It is very
natural that in an expression designed specifically to test its value that
it comes first.

But that is exactly the advantage
of developing or adopting a consistent coding style.

That's an argument for using things like #define BEGIN { as long as you
use them "consistently". That's fine if you work in complete isolation,
but if you ever have to read other people's code or expect other people to
read yours then "consistency" has a broader context than personal
programming preferences. It even goes beyond programming languages where
you can leverage people's abilities to parse English and mathematical
expressions.

It narrows the possible patterns so that
any deviation (often an error) stands out.
Your style may bother other programmers
when they begin to read your code but,
if you are consistent, they will quickly become habituated
and should have no trouble reading, understanding
and maintaining your code.

I agree with the idea of picking a style and using it consistently, but
that doesn't mean that all styles are equal.

Lawrence
 
Charlie Gordon

pete said:
Is it really possible for someone who can understand
variable == constant
to get lost while trying to read
constant == variable
?

Yes ! Because it goes against natural flow of thinking : you are comparing the
value of x (variable) to some constant A, not the value of constant A to some
variable x. Take for instance the following example :

if (constant == var) {
...
} else
if (some_other_constant == var) {
....
}

the two tests are closely related, as in a switch statement, placing the
constant first breaks that symmetry.

The possibility for == being mistyped as = is almost always caught in practice,
with the proper warning enabled at compilation time, as most of these
comparisons take place in tests where assignments are definitely suspicious.
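
A minimal sketch of the trade-off under discussion, assuming gcc with -Wall. The names is_four and is_four_reversed are hypothetical and do not come from any post in this thread. With the variable first, the typo still compiles and is only flagged by a warning; with the constant first, the typo is rejected outright.

```c
#include <stdbool.h>

/* Conventional order: variable first. */
bool is_four(int x)
{
    /* Mistyped as `if (x = 4)`, this still compiles. gcc -Wall reports
       "suggest parentheses around assignment used as truth value". */
    if (x == 4)
        return true;
    return false;
}

/* Reversed order: constant first. */
bool is_four_reversed(int x)
{
    /* Mistyped as `if (4 = x)`, this is a compile error, because the
       constant 4 is not an lvalue and cannot be assigned to. */
    if (4 == x)
        return true;
    return false;
}
```

Both functions return the same result for every x; the only difference is what happens when one `=` is dropped by mistake.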
 
Mike Wahler

Chris Croughton said:
The first one is more expected to change, in English. "Is the ship at
its destination?" makes a lot more sense than "Is the destination at the
ship?", although "Is ship 1 at ship 2?" (where both are variable) makes
sense.

I read C code in words, so I will mentally pronounce

if (var == 4)

as "if var is four", the alternative "if four is var" doesn't make much
sense.

I think your problem is using the word "is".

var == 4

means:

The value of the expression 'var' is equal to
the value of the expression '4'.

That is, it doesn't mean 'var' *is* 4.

Personally I typically write the "var == 4" form, but I don't
have trouble reading it the other way.

This issue seems to come up here from time to time,
I think it's one of those 'non-issues' myself.


-Mike
 
Charlie Gordon

Lawrence Kirby said:
On Sun, 19 Dec 2004 19:29:55 -0800, E. Robert Tisdale wrote:

for (x = max; min <= x; x--)

looks nasty to me. I'd have to flip the condition operation around in my
mind to feel comfortable with what it is doing. Writing x >= min here is
much more natural.

There is a more compelling reason why this code looks nasty :
The <= operator is more often than not an indication of problems to come !
Here the loop is bogus in these cases :

- x signed int and min = INT_MIN (a contorted counter example ;-)
- x unsigned int or size_t and min = 0 (a very common mistake !)

Downward for loops are a difficult craft indeed.
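
One way to sidestep the unsigned case is to test before decrementing, so the index is never compared against a value below zero. A minimal sketch, where reverse_copy is a hypothetical helper and not code from the thread:

```c
#include <stddef.h>

/* Hypothetical helper: copies src[0..n-1] into dst in reverse order,
   walking the index downward from n-1 to 0. */
void reverse_copy(int *dst, const int *src, size_t n)
{
    /* Broken variant: for (size_t i = n - 1; i >= 0; --i)
       never terminates, because an unsigned i is always >= 0,
       and n == 0 makes n - 1 wrap to SIZE_MAX immediately. */
    for (size_t i = n; i-- > 0; ) {
        /* Inside the body, i runs n-1, n-2, ..., 0. */
        dst[n - 1 - i] = src[i];
    }
}
```

The condition `i-- > 0` compares the old value of i and then decrements, so the loop also does nothing, correctly, when n is 0.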
 
E. Robert Tisdale

Charlie said:
There is a more compelling reason why this code looks nasty:
The <= operator is more often than not an indication of problems to come!
Here the loop is bogus in these cases:

- x signed int and min = INT_MIN (a contorted counter example ;-)
- x unsigned int or size_t and min = 0 (a very common mistake !)

Downward for loops are a difficult craft indeed.

Your example is bogus.
> cat main.c
#include <stdio.h>
#include <limits.h>

int main(int argc, char* argv[]) {
    for (int x = INT_MAX - 1; x <= INT_MAX; ++x)
        fprintf(stdout, "x = %d\n", x);
    return 0;
}
> gcc -Wall -std=c99 -pedantic -o main main.c
> ./main
x = 2147483646
x = 2147483647
x = -2147483648
x = -2147483647
.
.
.

It doesn't matter whether the loop increments downward or upward.
You just need to be careful at the limits of the loop variable.
 
Michael Mair

E. Robert Tisdale said:
Charlie said:
There is a more compelling reason why this code looks nasty:
The <= operator is more often than not an indication of problems to come!
Here the loop is bogus in these cases:

- x signed int and min = INT_MIN (a contorted counter example ;-)
- x unsigned int or size_t and min = 0 (a very common mistake !)

Downward for loops are a difficult craft indeed.


Your example is bogus.
cat main.c
#include <stdio.h>
#include <limits.h>

int main(int argc, char* argv[]) {
    for (int x = INT_MAX - 1; x <= INT_MAX; ++x)
        fprintf(stdout, "x = %d\n", x);
    return 0;
}
gcc -Wall -std=c99 -pedantic -o main main.c
./main
x = 2147483646
x = 2147483647
x = -2147483648
x = -2147483647
.
.
.

It doesn't matter whether the loop increments downward or upward.
You just need to be careful at the limits of the loop variable.

For the newbies: ERT, the poor sod, cannot get it into his skull
that for signed integers, overflow invokes undefined behaviour.
That means:
int x = INT_MAX;
x++;
will not necessarily wrap around to (x == INT_MIN), so the above
construction is not guaranteed to work.
For unsigned integers, if we run out of the range, the new value
is computed by adding or subtracting one more than the maximum
representable value as often as necessary to come back into
the range 0 .. maximum representable value. Example:
unsigned int x = UINT_MAX;
x++;
means that (x == 0).


-Michael
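
The distinction above can be sketched as follows. This is only an illustration: checked_increment and wrapping_increment are hypothetical names, and the signed version avoids the overflow by testing against INT_MAX first rather than relying on any wrap-around.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: increments *x only when that cannot overflow.
   Returns false and leaves *x unchanged if *x is already INT_MAX. */
bool checked_increment(int *x)
{
    if (*x == INT_MAX)
        return false;   /* ++*x here would be undefined behaviour */
    ++*x;
    return true;
}

/* Unsigned arithmetic is defined to wrap modulo UINT_MAX + 1,
   so UINT_MAX + 1u yields 0 on every conforming implementation. */
unsigned int wrapping_increment(unsigned int x)
{
    return x + 1u;
}
```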
 
E. Robert Tisdale

Michael said:
E. Robert Tisdale said:
Your example is bogus.
cat main.c
#include <stdio.h>
#include <limits.h>

int main(int argc, char* argv[]) {
    for (int x = INT_MAX - 1; x <= INT_MAX; ++x)
        fprintf(stdout, "x = %d\n", x);
    return 0;
}
gcc -Wall -std=c99 -pedantic -o main main.c
./main
x = 2147483646
x = 2147483647
x = -2147483648
x = -2147483647
.
.
.

It doesn't matter whether the loop increments downward or upward.
You just need to be careful at the limits of the loop variable.

For signed integers, overflow invokes undefined behaviour.
That means:

int x = INT_MAX;
++x;

will not necessarily wrap around to (x == INT_MIN),

I never said that it did.
so the above construction is not guaranteed to work.

The "above construction" was never guaranteed to work.
But it is guaranteed to fail.
For unsigned integers, if we run out of the range,
the new value is computed by adding or subtracting
one more than the maximum representable value as often as necessary
to come back into the range 0 .. maximum representable value.
Example:

unsigned int x = UINT_MAX;
++x;

means that (x == 0).

So what's your point?
 
Lawrence Kirby

There is a more compelling reason why this code looks nasty :
The <= operator is more often than not an indication of problems to come !

Well, the <= operator is more commonly associated with upwards loops,
which is perhaps another reason why the form of the test above looks
strange. You do have to be careful about off-by-1 errors however.
Here the loop is bogus in these cases :

- x signed int and min = INT_MIN (a contorted counter example ;-)
- x unsigned int or size_t and min = 0 (a very common mistake !)

Downward for loops are a difficult craft indeed.

We know there are issues using unsigned variables around 0 because you are
working close to the boundary, and it is easy to just step over it. But
that doesn't really relate to the style issue being discussed.

Lawrence
 
Charlie Gordon

E. Robert Tisdale said:
Your example is bogus.

The example code is yours !
I only showed in what cases the use of <= in your example will lead to errors.
You can turn the example around and make a loop going up with the same issue of
course
Or you could use >= even, that will not cure the problem.
cat main.c
#include <stdio.h>
#include <limits.h>

int main(int argc, char* argv[]) {
    for (int x = INT_MAX - 1; x <= INT_MAX; ++x)
        fprintf(stdout, "x = %d\n", x);
    return 0;
}
gcc -Wall -std=c99 -pedantic -o main main.c

gcc complains about comparing an unsigned number to 0 with >= but not in your
example where the result of the comparison is constant, this is a pity !
especially when looking at the assembly, where it is obvious that gcc detected
the infinite loop and removed all traces of the comparison code :

.globl _main
.def _main; .scl 2; .type 32; .endef
_main:
pushl %ebp
movl %esp, %ebp
pushl %ebx
pushl %eax
andl $-16, %esp
xorl %eax, %eax
call __alloca
call ___main
movl $2147483646, %ebx
.align 4
L6:
pushl %eax
pushl %ebx
pushl $LC0
pushl %ecx
call ___getreent
popl %edx
pushl 8(%eax)
call _fprintf
incl %ebx
addl $16, %esp
jmp L6
.def ___getreent; .scl 2; .type 32; .endef
.def _fprintf; .scl 2; .type 32; .endef

It doesn't matter whether the loop increments downward or upward.
You just need to be careful at the limits of the loop variable.

Yes, and using strict comparisons is a better choice.
 
Jason Taylor

I am using a tool, Crystal REVS for C (www.sgvsarc.com) that provides a
practical solution to code layout style and programming practice
issues.

Crystal REVS provides the best code-formatter and the best flowchart
generator that I have seen. You can set format parameters such
K&R/non-K&R braces, indent size, code/comment widths, spaces
before/after operators etc.


1. Code layout style issues - how about the following scenario:

Let the author/responsible engineer use the code layout style he/she
prefers and thus be more productive.

When you review the code, set the format parameters as per your
preference. You will see the code in the format that you prefer - so
you will be more productive.

2. When you edit, Crystal REVS frees you from low-level editing. As
soon as you complete a statement or a declaration, Crystal REVS
automatically formats it as per the format settings. It lines up
successive declarations and successive assignments. It formats long
expressions as per operator precedence rules.

3. Some engineers like to sprinkle comments before and after every few
statements - it is fine - it works for them. I personally prefer the
comments to be on the right so that I can get an uninterrupted view of
the code sequence. With one command, Crystal REVS shifts the comments
to the right and formats them.


4. If you prefer fully bracketed syntax to help you detect problems,
you can command Crystal REVS to automatically insert the brackets.

5. Whether you have multiple exits or a single exit, with Crystal REVS'
flowcharts, you can understand the code easily.

6. Crystal REVS assists you in adding comments to your code. You can
generate a high-level comment flowchart.
 
Natt Serrasalmus

First I want to mention that I have read all the responses in this thread, I
just don't have time to respond to them all. Thanks to those who have
replied. More below:

E. Robert Tisdale said:
Are you that guy?


Yes, but are you that guy?

I'm trying not to be. There are things that I like that I can give objective
reasons for liking. For instance, I don't like K&R brace style. I like for
the braces to be at the same indentation level so it's easy to see which
braces match which. Having said that, I also am not that concerned about it
since the editor I use will show which braces match.
You need to distinguish between *style* issues
and good programming practice.

I think that all programmers blur that distinction. Is it good programming
practice to put the braces on the same level to show where they match or is
that just a style issue? Is it good programming practice to use if
statements instead of the ternary operator, or is that style?
I also think [that] it is wrong for coding standards
to try to prevent idiots from doing stupid things.

I don't think it's wrong.
I just don't think it works.

Yet you have advocated it!?!?

<snip>

issue that I have come across is the one way out of a function
This is a bit of a "straw man" argument.
The problem is that functions which contain multiple exit points
are extremely difficult to read, understand and maintain.

No, not at all. Consider:

int func(arglist)
{
    (declarations)
    int retval = BADSTATUSVALUE;
    void *pointer = NULL;

    if(condition1)
    {
        pointer = malloc(SOMESIZE);
        codeline;
        codeline;
        codeline;
        codeline;
        codeline;
        codeline;
        codeline;
        codeline;
        codeline;
        codeline;
        codeline;
        codeline;
        codeline;
        codeline;
        if(condition2)
        {
            codeline;
            codeline;
            codeline;
            codeline;
            codeline;
            codeline;
            codeline;
            codeline;
            codeline;
            codeline;
            codeline;
            codeline;
            codeline;
            codeline;
            if(condition3)
            {
                codeline;
                codeline;
                codeline;
                codeline;
                codeline;
                codeline;
                codeline;
                codeline;
                codeline;
                codeline;
                codeline;
                codeline;
                codeline;
                codeline;
                codeline;
                if(condition4)
                {
                    codeline;
                    codeline;
                    codeline;
                    codeline;
                    codeline;
                    codeline;
                    codeline;
                    codeline;
                    codeline;
                    codeline;
                    while(condition5)
                    {
                        long_line_that_should_
                            be_broken_up_because_it_is_so_long_and_indented;
                        codeline;
                        codeline;
                        codeline;
                        codeline;
                        codeline;
                        codeline;
                        codeline;
                        codeline;
                        retval = GOODSTATUSVALUE;
                    }
                } /* condition4 */
            } /* condition3 */
        } /* condition2 */
    } /* condition1 */

    free(pointer);
    return retval;
}


I'm sure most of us have seen code similar to the above. Now consider:

int func(arglist)
{
    (declarations)
    void *pointer;

    if(!condition1)
        return BADSTATUSVALUE;

    pointer = malloc(SOMESIZE);
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    if(!condition2)
    {
        free(pointer);
        return BADSTATUSVALUE;
    }

    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    if(!condition3)
    {
        free(pointer);
        return BADSTATUSVALUE;
    }
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    if(!condition4)
    {
        free(pointer);
        return BADSTATUSVALUE;
    }
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    codeline;
    while(condition5)
    {
        long_code_line_that_doesn't_have_to_be_broken;
        codeline;
        codeline;
        codeline;
        codeline;
        codeline;
        codeline;
        codeline;
        codeline;
    }

    free(pointer);
    return GOODSTATUSVALUE;
}

Now then, this second example is far easier to deal with. The extreme
indentation has been eliminated, the distant brace matching has been
eliminated and the retval variable has been optimized away. This results in
several fewer things to keep track of. With less indentation, fewer long
lines need to be broken up.

Suppose, for example, that you need to modify the code
and malloc some temporary storage.
How do you ensure that that storage is free'd
before any of the multiple exits?

Look at the example I've given above. If you were a maintenance programmer
adding another condition that required an early exit, you'd have to be a
complete moron not to realize that the pointer has to be freed before the
early exit. The only justification I can see for the one way out in the
first example is to protect the code from morons. That's something that you
admitted doesn't work.
Maybe you can write more concise code
with one, two, three or more return statements.
But where do you draw the line?
The line has been drawn, perhaps somewhat arbitrarily,
at a single point of return --
preferably at the end of the function.
This simplifies cleanup, for example,
because the thread of execution is guaranteed
to run through any statements such as free
which are placed just before it.

The examples I've given have made your point moot.
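
A third arrangement, which neither poster proposes here, is sometimes used in C to combine early exits with a single cleanup path: jump to a cleanup label. A minimal sketch under that assumption, where check_text and the two status constants are hypothetical stand-ins for func and its status values:

```c
#include <stdlib.h>
#include <string.h>

enum { BAD_STATUS = -1, GOOD_STATUS = 0 };  /* hypothetical status values */

/* Hypothetical function: copies text into a scratch buffer and accepts
   it only if it is non-empty and contains no space character. */
int check_text(const char *text)
{
    int status = BAD_STATUS;
    char *copy = NULL;

    if (text == NULL)
        return BAD_STATUS;          /* nothing allocated yet */

    copy = malloc(strlen(text) + 1);
    if (copy == NULL)
        return BAD_STATUS;          /* allocation failed, nothing to free */
    strcpy(copy, text);

    if (copy[0] == '\0')
        goto cleanup;               /* early exit that still frees copy */
    if (strchr(copy, ' ') != NULL)
        goto cleanup;

    status = GOOD_STATUS;

cleanup:
    free(copy);                     /* the only place copy is released */
    return status;
}
```

The nesting stays flat, as in the second example, and the free call appears once, as in the first.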

As far as coding style is concerned
that should be left up to each individual programmer.
Any rules regarding the style used for code repositories
can and should be resolved with a code reformatter such as indent

http://www.gnu.org/software/indent/indent.html

Funny you (as well as others) should mention a reformatter. I've been an
advocate of this for years. A coder would check out code, do anything they
want to it as far as style is concerned and then when they check it back in,
it is reformatted to the standard. Probably nobody will be happy with the
results, but it is consistent. Whenever I have mentioned this in the past to
anyone, I get sort of a bemused look and an admission that it might be
interesting, but nobody ever seems to want to do it.

<snip a lot of style stuff that seems to be primarily concerned with the use
of whitespace>
Write

constant == variable

instead of

variable == constant

when comparing a variable to a constant for equality
so that if you write

constant = variable

by mistake, the compiler will detect the error.

It's amazing how much response this recommendation generated. Nobody seems
to like it. What is the sole purpose of this recommendation? It is to
protect the code from morons who can't keep track of the difference between
= and ==. Here is a case where those who responded felt that the risk was
worth the convenience of having the expression written the way they prefer
it. Yet I've had many coders of the same ilk act as if some of my
suggestions were just not worth the risk of the damage that some moron might
do to the code. Here's where it becomes a matter of where you draw the line
and unfortunately it really is subjective.


Incidentally, this rule would be better written as:

rvalue == lvalue

but the coders that need this rule probably haven't figured out the
difference between an rvalue and an lvalue anyway and admittedly the
compiler may not catch the problem for all rvalues.
 
Mark McIntyre

First I want to mention that I have read all the responses in this thread, I
just don't have time to respond to them all. Thanks to those who have
replied. More below:

Natt, a word of advice. E Robert Tisdale is well-known round here for being
a very inaccurate and misleading poster. Treat his advice with caution.

And don't take my word for this (ERT will undoubtedly reply to suggest that
I'm a troll) but look at responses to his posts via google groups. You'll
see the picture pretty quickly.
 
