Stylistic questions on UNIX C coding.


BruceS

I've actually laughed out loud when I've read Bruce's part above; it's
so blatant that I'm almost completely sure it was meant as irony.
(Perhaps so is your response, too.)

Cheers,
lacos

Well *someone* laughed anyway. Yes, this was intended ironically, and
I thought it was so obvious that I didn't put in an emoticon. This is
one of the oldest pointless battles in C. Now it looks like I've
started it again, rather than the intended humor. I would have said
something about the inferiority of little-endianism, but I'm afraid
few people really get that.

Years ago, I had my one and only protest shirt. It said, in big
letters, "STOP PLATE TECTONICS". So many people were confused by it I
almost gave up wearing it. Then one Saturday, a manager at my work
started to walk past me, stopped to double-check what my shirt said,
and walked on laughing heartily.
 

Tom St Denis

Hmmm...

if (condition) {    <-- one line

if (condition)      <-- one line
{                   <-- and another one, makes two lines

Does that really make a BIG difference?  




Really? What's so bad about that? I prefer the latter btw...

The only reason I don't use the two-liner is that it's two lines. If
you write functions with a lot of conditionals (e.g. you're calling
functions that can fail) it adds up really quickly. It's also another
line you have to indent properly [something a lot of people don't do].

There is no technical reason beyond that for favouring either.

Personally I use the one-liner but I'll work on code that uses the
other. If I take ownership of code that uses the two liner I might
switch it back to one-liner but beyond that I'm not that petty :)

my two cents...

Tom
 

BruceS

As CS Lewis said, the first pot, which would prove its maker a genius
if it were the first pot ever made, would prove him a dunce if it came
after many millennia of pot-making.

That's a good one; I'll try to remember it.
FWIW, the K&R bit was intentional, to make the joke more obvious. Not
obvious enough for some, but apparently I either overestimated the
volume in the ng, or underestimated the mass. My bad.
 

Poster Matt

Julienne said:
It's only error prone if you have multiple variables in a declaration
statement (which the OP's example did not). That itself is often
viewed as an unsafe practice.

OP here... Yes and agreed. My view is that each variable should have its own
line. The only exception I make is when a for loop needs 2 index variables and
both are initialized in the for loop statement. EG:

int i, j;
for (i = pos, j = 0; ...)
 

John Bode

Hi,

I've a few questions concerning style when programming C on UNIX systems. I
don't want to look like an amateur. :)

1. Having been programming in higher level languages for the last 15 years, I'm
finding it hard to get used to DEFINES in all capitals. Is it really frowned on
not to do so? Is CamelCase acceptable?

EG. '#define MaxNumFiles 1024' not '#define MAXNUMFILES 1024'.

The all-caps convention makes it easier to distinguish preprocessor
macros from other symbols. This can matter, especially when using
function-like macros (macros that take arguments). Remember that
macro expansions are simple text substitutions; a macro like

#define square(x) (x*x)

does not compute "x*x" and return a value; it *replaces* the text
"square(x)" with "(x*x)". If x is "z++", then the replacement text is
"(z++*z++)", which invokes undefined behavior. If x is "a+b", then
the replacement text is "(a+b*a+b)". Using all-uppercase names for
macros makes it easier to spot potential red flags like "SQUARE(x++)"
or "SQUARE(x+y)".
2. My personal variable and function naming style is camel case, with variable
names beginning with a lower case char and function names not. Is that
acceptable, if not what is?

EG:
Variables: int numFiles = 0;
Functions: int CountNumFilesInDir(char* path);

That's fine. Just be consistent.
3. Is there an accepted maximum line length? I've got a 24" monitor, if I reach
120 chars I start thinking this might not look great in someone else's editor.

Side scrolling is irritating. Just make sure you break lines up
sensibly.
4. Does anyone care where the pointer * is? I prefer keeping it next to the
type, rather than next to the variable name.

EG. I like: char* firstName; and not so much: char *firstName;

It's an accident of C lexical rules that you can write it either way.
The second form correctly reflects the language syntax (the '*'
operator is bound to the declarator, not the type specifier), and
tends to be preferred among C programmers. It also guards against
potential mistakes like

char* a, b; // b is a regular char

Declarations in C (and C++) reflect the type of an *expression*, not
an object. The type of the *expression* "*firstName" is char. The
idea is that the form of the declaration should closely match the form
of an expression that yields a value of that type. If you have an
array of pointers to int, and you wanted to access a specific int
value, the expression would be "x = *a[i];" The type of the
*expression* "*a[i]" is int, so the declaration of a is "int *a[N]".

C++ programmers prefer "char* firstName", because the type of the
*object* "firstName" is char*. Fine, but what about "char
lastName[20]"? The type of "lastName" is "20-element array of char",
but we can't write "char[20] lastName". Same thing for function
types. Same thing for pointers to arrays, pointers to functions,
etc.
5. On a slightly different note, I've been handling my error messages by using
#define string constants in a header file. I saw some code which did this and it
looked good to me. Is that standard practice, if not what is?

EG. #define ErrorDirNotFound "The directory was not found."

There are so many style guides out there, most of them say contradictory things
at one point or another. What do the pros do?

I can't speak for anyone else, but I typically create a string table
and an enum to index it:

enum ErrorCodes {ErrDirNotFound, ErrInvalidPath, ...};
const char *ErrorStrings[] = {"Directory not found", "Path is invalid", ...};

It scales a little better than using #defines all over the place.
 

Stephen Sprunk

Stephen Sprunk said:
On 24 Feb 2010 12:35, Poster Matt wrote: [...]
EG:
Variables: int numFiles = 0;

This is camelCase.
Functions: int CountNumFilesInDir(char* path);

This is PascalCase.

Mixing the two in the same project will drive adherents of _both_ styles
nuts. Pick one and stick to it; that way you'll only drive half of your
readers nuts.

His convention apparently is to use camelCase for variables and
PascalCase for functions. It's not necessarily a style I'd use, but
it's not obviously horrible (and it's more or less the style I use
at work).

Granted, it's not as bad as mixing the two within the same type of
thing, but it's still bad, IMHO.

I hate PascalCase; I can deal with it if _everything_ is cased that way,
but a lot of my functions and function calls would end up in camelCase
if I had to switch back and forth.
As with most of these rules, conforming to existing code
is far more important than any benefits of one style over another.

Of course; I said roughly the same thing at the top of my post. I've
just never seen a project that does it that way--or perhaps it was so
painful for me to read that I gave up, moved on, and repressed the memory.
Again, This_Type_Of_Identifier isn't obviously horrible. (I use it
myself, though not in C.)

IMHO, that's also horrible. What I was thinking of was worse, though:
this_typeOfIdentifier. "Horrible" isn't strong enough for that.

S
 

Poster Matt

Where did you pick up these preferences?
I'd have guessed Java, but as Rainer Weikusat points out in
<[email protected]>, CamelCase and co. originate from
much earlier. (Thanks for the interesting historical tidbit, Rainer!)

OP here...

I picked them up from a university lecturer of mine during a software
engineering course in the mid 1990's. The course required groups of 5 students
collaborating on a coding project, don't remember what language we used but
definitely not C or C++, possibly Java, but not sure.

The lecturer explained that in real world projects with a programming team, a
style guide would be the norm. So he imposed his own one on us. When free to
choose, I've been using my own variation of it ever since. It may have been
influenced by a book called Code Complete, but it's so long since I read it that
I can't recall exactly what I picked up from that and what from personal
experience and preference.
 

Poster Matt

Eric said:
Contradict each other, of course! Why did you ask?

Given the amount of contradictory advice I'm getting in this
thread I am beginning to wonder why myself. :)
 

Keith Thompson

John Bode said:
The all-caps convention makes it easier to distinguish preprocessor
macros from other symbols. This can matter, especially when using
function-like macros (macros that take arguments). Remember that
macro expansions are simple text substitutions; a macro like

#define square(x) (x*x)

does not compute "x*x" and return a value; it *replaces* the text
"square(x)" with "(x*x)". If x is "z++", then the replacement text is
"(z++*z++)", which invokes undefined behavior. If x is "a+b", then
the replacement text is "(a+b*a+b)". By using all-uppercase for
macros, it makes it easier to see potential red flags like
"SQUARE(x++)" or "SQUARE(x+y)".
[...]

The side effect problem can't be solved (or at least can't be easily
solved) if you're using a macro, but the operator precedence problem
can.

#define SQUARE(x) ((x)*(x))

You need to parenthesize the entire definition *and* each reference to
the parameter(s).

BTW, here's my favorite example of preprocessor abuse:

#include <stdio.h>

#define SIX 1+5
#define NINE 8+1

int main(void)
{
    printf("%d * %d = %d\n", SIX, NINE, SIX * NINE);
    return 0;
}
 

Rainer Weikusat

Keith Thompson said:
Rainer Weikusat said:
Keith Thompson said:
[...]
4. Does anyone care where the pointer * is? I prefer keeping it next
to the type, rather than next to the variable name.

EG. I like: char* firstName; and not so much: char *firstName;

C knows about three kinds of derived types, arrays

char a[];

Pointers to functions

char (*a)();

and pointers

char *a;

Array types, structure types, union types, function types, and pointer
types are all derived types (C99 6.2.5).

Exercise for the reader: Which of the six types defined above are
irrelevant for this statement about 'declaration of derived types'
because they belong to a different syntactical class than the three
examples? Which type is irrelevant because it cannot directly appear
in a variable declaration? Which other class of types should appear
instead because they can?

You said that "C knows about three kinds of derived types". In fact,
there are six. I was disputing the accuracy of your statement, not
its relevance.

Given that these three kinds of derived types exist, my statement was
accurate. I didn't claim that there weren't any other derived types, I
just didn't write about them because they were irrelevant for the
statement I intended to make (this is, of course, just the same kind
of pointless nitpicking).

[...]
Wrong. char* is a type. Specifically, it's a derived type and a
pointer type. See C99 6.2.5, particularly paragraph 20.

T *o;

is a pointer declarator and it means that the type of o is 'pointer to
T', not 'T*' (6.7.5.1|1).

[...]
(And for the record, I agree that "T *o;" is preferable to
"T* o;"; I just don't agree that it's a huge deal.)

It's a pointless inconsistency.
 

Keith Thompson

Rainer Weikusat said:
Keith Thompson said:
Rainer Weikusat said:
C knows about three kinds of derived types, arrays

char a[];

Pointers to functions

char (*a)();

and pointers

char *a;

Array types, structure types, union types, function types, and pointer
types are all derived types (C99 6.2.5).

Exercise for the reader: Which of the six types defined above are
irrelevant for this statement about 'declaration of derived types'
because they belong to a different syntactical class than the three
examples? Which type is irrelevant because it cannot directly appear
in a variable declaration? Which other class of types should appear
instead because they can?

You said that "C knows about three kinds of derived types". In fact,
there are six. I was disputing the accuracy of your statement, not
its relevance.

Given that these three kinds of derived types exist, my statement was
accurate. I didn't claim that there weren't any other derived types, I
just didn't write about them because they were irrelevant for the
statement I intended to make (this is, of course, just the same kind
of pointless nitpicking).

Ok. I read your statement as implying that C recognizes those three
kinds of derived types and no others. Thanks for the clarification.

[...]
T *o;

is a pointer declarator and it means that the type of o is 'pointer to
T', not 'T*' (6.7.5.1|1).

So you agree that "pointer to char" is a type, right? The type
"pointer to char" is commonly referred to, using C syntax, as "char*".
I see no problem with referring to it that way.

And if the point you were trying to make was the type often referred
to as "char*" is properly referred to as "pointer to char", that
wasn't at all clear to me from what you wrote.
[...]
(And for the record, I agree that "T *o;" is preferable to
"T* o;"; I just don't agree that it's a huge deal.)

It's a pointless inconsistency.

I don't agree that it's entirely pointless, but I'm not going to argue
the point.
 

Ben Pfaff

Kelsey Bjarnason said:
My take on this has always been to standardize the format "checked in",
then use indent or some equivalent, on check in and check out, to convert
between the "checkin format" and the individual coder's preferred format.

I've heard of this approach, but I always assumed that it was a
joke. You really use it?
 

Phred Phungus

Keith said:
John Bode said:
The all-caps convention makes it easier to distinguish preprocessor
macros from other symbols. This can matter, especially when using
function-like macros (macros that take arguments). Remember that
macro expansions are simple text substitutions; a macro like

#define square(x) (x*x)

does not compute "x*x" and return a value; it *replaces* the text
"square(x)" with "(x*x)". If x is "z++", then the replacement text is
"(z++*z++)", which invokes undefined behavior. If x is "a+b", then
the replacement text is "(a+b*a+b)". By using all-uppercase for
macros, it makes it easier to see potential red flags like
"SQUARE(x++)" or "SQUARE(x+y)".
[...]

The side effect problem can't be solved (or at least can't be easily
solved) if you're using a macro, but the operator precedence problem
can.

#define SQUARE(x) ((x)*(x))

You need to parenthesize the entire definition *and* each reference to
the parameter(s).

BTW, here's my favorite example of preprocessor abuse:

#include <stdio.h>

#define SIX 1+5
#define NINE 8+1

int main(void)
{
    printf("%d * %d = %d\n", SIX, NINE, SIX * NINE);
    return 0;
}

$ gcc -D_GNU_SOURCE -Wall -Wextra k1.c -o out
$ ./out
6 * 9 = 42
$ cat k1.c
#include <stdio.h>

#define SIX 1+5
#define NINE 8+1

int main(void)
{
    printf("%d * %d = %d\n", SIX, NINE, SIX * NINE);
    return 0;
}

// gcc -D_GNU_SOURCE -Wall -Wextra k1.c -o out
$


SIX * NINE equals 1 + 5 * 8 + 1

I save these nice little toy programs in my linuxlog.
 

robertwessel2

I have, repeatedly, over the years.

The problem with _not_ doing it is that you tend to get a lot of checkins
where the "diff" is wildly out of sync with what actually got changed,
particularly if the coder's tools do things such as switching tabs to
spaces or re-organizing braces, etc, etc, etc, which many tools do.

By using indent or an equivalent on checkin, you ensure a standard format
going in, such that only "real" changes are recorded, and by using it on
checkout, you deliver to the coder whatever flavour he's happiest with.

Or, you can skip it on checkout and just let him use whatever tools he
likes, but I've found delivering to the developer something which he is
maximally comfortable with, right out of the gate, tends to produce
maximum productivity and minimum frustration.


A problem with that approach is that it will trash any special/careful
formatting that you might use to clarify a complex section of code.
While indent allows you to disable formatting for sections, that's not
an ideal solution in many ways.

On the flip side, some people can't be coerced into clean formatting,
or making an effort to follow the formatting already established for
the module they're editing, no matter how much violence you threaten.
Or keeping "use tabs" turned off in their editors. At times that gets
bad enough that the easiest solution is to go ahead and run indent (or
the equivalent) and live with the damage.

But there's really no excuse - following a formatting style is not
that hard, even if it's not exactly the one you prefer. Frankly there
is almost no excuse for not following the existing style when modifying a
program. Or following the shop standards when creating a new module.
 

Nick Keighley

I worked on a project where you couldn't check code in unless it had
been through indent with a specified set of flags. Well you *could*
but it got flagged somewhere. At first I used to indent to my style
(the indent style was a mix of tabs and spaces which I hated) edit and
indent to project style before check in. But it was such a PITA I just
accepted project style in the end. [cue twenty page rant by nilg about
the corporate castration of the programmer]


I do a special layout change only checkin if the layout had gone
wildly astray.

I think your tools are broken

A problem with that approach is that it will trash any special/careful
formatting that you might use to clarify a complex section of code.
While indent allows you to disable formatting for sections, that's not
an ideal solution in many ways.

On the flip side, some people can't be coerced into clean formatting,
or making an effort to follow the formatting already established for
the module they're editing, no matter how much violence you threaten.

on the project I was on they simply wouldn't be able to checkin the
code. If they did end-run around *that* they'd run into trouble at
build time.

Or keeping "use tabs" turned off in their editors.  At times that gets
bad enough that the easiest solution is to go ahead and run indent (or
the equivalent) and live with the damage.

But there's really no excuse - following a formatting style is not
that hard, even if it's not exactly the one you prefer.  Frankly there
is almost no excuse for not following the existing style when modifying a
program.  Or following the shop standards when creating a new module.

apart from some truly hideous layout styles. One guy must have been
told that using blank lines would make his code clearer.

#include <stdio.h>

int main ( void )

{
int i = 99;

printf ( "%d\n", i );

return 0;

}

he never used more than one blank line
 

Nick Keighley

C++ programmers prefer "char* firstName", because the type of the
*object* "firstName" is char*.  Fine, but what about "char
lastName[20]"?  The type of "lastName" is "20-element array of char",
but we can't write "char[20] lastName".

but no True C++ programmer would use a C-style array!
 

Tim Woodall

You've gotten some strong reactions to your not-seen-in-30-years
comment above.

FWIW I use the latter form myself. However, one good reason I got
from an ex-colleague for using the first style was that if you are
scrolling upward in source code, and get to the closing brace of a
block, using the keyboard shortcut in your editor for jumping to the
matching brace will get you to see the if (or while or for)
condition. Otherwise, it will only scroll upward to a point where the
opening brace is shown as the first line, and one has to go up one
line more. To me, this is not a big inconvenience.
However, the converse is that when cursoring down the code you have to:

a) watch the RH edge of the code for the brace and watch for when the
cursor is on the same line but in column 1 - this is especially true if
you've just bounced to a '}' previously because, at least in vim, you
then lose the "end of line" behaviour of the cursor and it retains the
column position.

b) At least for vim you have to then do $ to get to the end of the line
before you hit the % to find the matching close brace.

if(x<0) {
}
Hitting % while in column 1 of the if line in vim bounces you between
the '(' and ')'

if(x<0)
{
}
Hitting % while in column 1 (or any column) of the line with '{' bounces
you to the '}'

c) Assuming the code is correctly indented - starting with your cursor
on the '{' and then moving downwards you will have read the entire block
when the cursor first stops on a '}' (Assuming your cursor isn't
following the end of the line)

d) For people who persist in using long lines, putting the '{' on a line
on its own will mean it is not off the edge of the screen and invisible
or wrapped onto the next line

Obviously, the major advantage to putting the '{' on the same line as
the conditional is that it reduces paper usage when typesetting and
hence why you will normally see it in books.

Tim.
 

Rich Webb

Since I don't know what tab setting you had when you handed me the code,
I think s/exactly two/a random number of/ applies here.

[George Lucas missed a bet when he chose to make Star Wars instead of
Style Wars.]

In the context of C source files, the tab character (\t or \x09 or 0x09
or however expressed) should always advance the cursor position to the
next mod 8 column.

The tab *key* on the other hand, may be chosen by the user to insert a
tab character, or a fixed number of spaces, or a variable number of
spaces which advance the cursor to a preferred mod n column.

Hard tabs in code that presume other than "every eight" are just evil.
 
