bartc said:
Tidying up of C Language
For someone taking a fresh look at C, and who doesn't care too much
about its history or compatibility with existing code, these are a
few things that might stand out.
I'm not necessarily providing any fixes here, just pointing out areas
that might cause raised eyebrows:
As I'm sure you know, anything that will break existing code,
particularly that will silently break existing code, has no chance of
making it into a future C standard. That's not intended to discourage
you from posting your ideas; some might be adapted to avoid breaking
existing code, some might be useful for future design of other
languages, and many are interesting in their own right.
[...]
'Header' Files
A misnomer for ordinary include files. Someone used to Import
statements for example might expect the contents of header files
to exist in a separate scope from the module being compiled.
(Although a proper treatment of header files might be difficult
without introducing module namespaces too.)
The solution for C as it is now is to understand how #include
directives actually work.
I wouldn't mind some sort of "import" feature that works on a higher
level than simple text inclusion. But it would need to be defined,
which would be a lot of work, and I've never seen a proposal.
And it would have to coexist with what we have now.
[...]
Type Declarations
These are C's famous convoluted inside-out declarations. I've
already suggested a left-to-right alternative (which might just
co-exist with the old scheme).
I'm skeptical that such a scheme could actually coexist with the
current one. I'm even more skeptical that a new scheme could be
*proven* to be compatible with the current scheme. In the best case,
there would almost inevitably be cases where a casual reader would
have difficulty figuring out whether a given declaration uses the
old scheme or the new one, and what it means.
I'm not a big fan of C's declaration syntax, but I don't think
adding a new set of rules is the answer.
Struct Namespace
Just a cause of confusion. Just let a struct name be equivalent to
a typedef name.
It worked for C++. I don't see much of a problem with the idea.
I think it would break some existing code, but only in relatively
obscure cases.
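To make the "obscure cases" concrete, here is a minimal sketch (the names node, value and tag_and_typedef are made up for illustration). In current C, struct tags live in their own namespace, so a tag and a typedef with the same spelling can name two different types. That is the kind of code an implicit typedef would break.

```c
#include <assert.h>

/* The usual idiom today: give the tag a matching typedef name by hand. */
struct node { int data; struct node *next; };
typedef struct node node;

/* Legal C today: the tag 'value' and the typedef 'value' are unrelated,
   because tags have their own namespace. If every struct tag were also
   a typedef name (as in C++), this pair would conflict. */
struct value { int v; };
typedef double value;

/* Uses both meanings of 'value' in one function. */
double tag_and_typedef(void)
{
    struct value s = { 2 };   /* the struct type */
    value d = 0.5;            /* the double typedef */
    assert(sizeof d == sizeof(double));
    return s.v + d;
}
```

I'd expect code like the second pair to be rare, but it is valid, and a compiler for the changed language would have to reject it or pick a meaning.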
Numeric Literals
Who would have guessed that 0123 is an octal number? Get rid of that
notation, and introduce 8x123 if anyone is still interested.
Making 0123 mean 123 (decimal) would quietly change the meaning of
existing code. You could avoid that by banning leading 0s except for
"0", but that would invalidate existing code.
I agree that the existing octal notation is confusing, and a different
syntax like 8x123 would have been better.
And there needs to be *some* octal notation, at least for Unix file
permissions.
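For anyone who hasn't been bitten yet, a small sketch of what the leading 0 does today (the function names are mine, purely for illustration):

```c
#include <assert.h>

/* A leading 0 makes the constant octal: 0123 is 1*64 + 2*8 + 3. */
int leading_zero(void) { return 0123; }

/* rwxr-xr-x, the sort of value passed to chmod() on a Unix system. */
unsigned typical_mode(void) { return 0755; }

void check_octal(void)
{
    assert(leading_zero() == 83);
    assert(typical_mode() == 493u);                       /* same bits in decimal */
    assert(typical_mode() == (7u << 6 | 5u << 3 | 5u));   /* one digit per rwx triple */
}
```

The last assertion is why octal is still convenient for permissions: each digit maps onto exactly one group of three bits.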
And there should be a notation for binary literals, perhaps 2x11011.
There's some precedent for 0b11011.
And those strange suffixes you sometimes see: LU, LLU and so on,
are they really necessary? Why can't the type of the constant be
automatic?
The type of an unsuffixed constant already is automatic; see C99
6.4.4.1. The suffixes are needed only when you need to specify a type
other than the implicit one.
I suppose you could have the type of an integer constant depend on
the context in which it appears but (a) I'm not sure what that would
buy you, and (b) it would break the existing rule that the type of
a subexpression is (almost always) determined by the subexpression
itself, not by its context.
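A rough way to see this for yourself, assuming a C11 compiler (IS_TYPE is a hypothetical helper macro built on _Generic, not anything standard):

```c
#include <assert.h>

/* Hypothetical helper: 1 if expr has exactly type T, otherwise 0. */
#define IS_TYPE(expr, T) _Generic((expr), T: 1, default: 0)

void check_constant_types(void)
{
    /* Unsuffixed: the first type in the standard's list that fits. */
    assert(IS_TYPE(1, int));

    /* Suffixes ask for a type other than the implicit one. */
    assert(IS_TYPE(1U, unsigned int));
    assert(IS_TYPE(1L, long));
    assert(IS_TYPE(1LLU, unsigned long long));

    /* The constant's type comes from the constant itself; in
       'long x = 1;' the 1 is still an int that gets converted. */
    assert(IS_TYPE(1 + 1L, long));   /* int + long yields long */
    assert(!IS_TYPE(1L, int));
}
```

An unsuffixed constant too big for int becomes long or long long depending on the implementation, so I've left that case out of the assertions.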
Sizeof Operator
This just gives the number of bytes in a type (or the type of an
expression). That's fine, but what about getting the number of
elements of an array? Ie. without bothering having to divide the
bytes in the entire array by the bytes in one element...
It's easy enough to write a macro:
#define ARRLEN(arr) (sizeof (arr) / sizeof (arr)[0])
On the other hand, it's easy to misuse that by applying it to a
pointer. Something that can only be applied to array expressions
could be useful.
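A sketch of the misuse, using the macro above (the two function names are invented for the example). Both versions compile; only the first gives the element count. Some compilers will warn about the second, but they are not required to.

```c
#include <assert.h>
#include <stddef.h>

#define ARRLEN(arr) (sizeof (arr) / sizeof (arr)[0])

/* Applied to a real array: the number of elements. */
size_t len_of_array(void)
{
    int a[5] = { 0 };
    return ARRLEN(a);
}

/* Applied to a pointer: still compiles, but the result is
   sizeof(int *) / sizeof(int), which has nothing to do with 5. */
size_t len_of_pointer(void)
{
    int a[5] = { 0 };
    int *p = a;
    return ARRLEN(p);
}
```

The same thing happens with a function parameter declared as int arr[], since that is really a pointer.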
Type Limits
This is where you start seeing names such as USHRT_MAX and
LLONG_MIN (all tacky abbreviations we are constantly told to avoid
as macros and typedefs), and where you start wondering, is there a
Better Way?
(Such as, perhaps, long.max or signed char'min, which together with
a set of standardised short type names would tidy things up
considerably.)
Ada, for example, has a number of "attributes" that can be applied
to various entities: Typename'Size, Object'Size, Typename'First,
Typename'Last, Array'Length, and so forth. Something like that
in C could replace sizeof, offsetof, the above ARRLEN macro, and
probably a number of other things. On the other hand, creeping
featurism is always dangerous.
[...]
Operators
A power operator is missing (I think because no-one can decide what
to use, since * is heavily involved with pointers).
I suppose ^^ is available.
The << and >> operators have a strange precedence (they effectively
multiply and divide, so should be the same as * and /)
Can't be fixed without quietly breaking existing code.
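To show what "quietly" means here, a minimal sketch (function names are mine). Under the current rules + binds tighter than <<, so the first function shifts by 5. If << were given the precedence of *, the same source text would mean what the second function computes.

```c
#include <assert.h>

/* Current C: parsed as 1 << (2 + 3). */
int shift_as_parsed(void) { return 1 << 2 + 3; }

/* What it would mean if << had the precedence of *: (1 << 2) + 3. */
int shift_as_multiply(void) { return (1 << 2) + 3; }

void check_shift(void)
{
    assert(shift_as_parsed() == 32);
    assert(shift_as_multiply() == 7);
}
```

No diagnostic is required in either world; the program would just start producing different numbers.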
Switch Statement
It should not be necessary to use break to terminate every case
statement. (And there's the problem that break cannot then be used
to escape from a loop).
csh uses "breaksw" to break out of a switch statement.
Case expressions should be able to use ranges and commas:
case 1,2,3,5..7,8:
instead of:
case 1: case 2: case 3: case 5: case 6: case 7: case 8:
No excuses!
On the other hand, that introduces the temptation to write
case 'a'..'z':
which isn't portable.
gcc uses the existing "..." token for this.
And Switch statements are very strange in that the case statements
do not form a normal block scope, so that you can have a case label
buried deep inside an embedded if statement or a loop! This is just
too weird to have in a serious language.
Take a look at ioccc.org and tell me C is a serious language.
I don't think the existing switch statement can be removed,
but a new form of selection statement might be added. (Again,
this runs into the creeping featurism problem.)
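The best-known use of case labels buried inside a loop is Duff's device. Here is a sketch of a copying variant of it in standard C (duff_copy is my name for it, and this is an illustration, not a recommendation over memcpy):

```c
#include <assert.h>
#include <stddef.h>

/* Copies count bytes, eight per trip round the loop. The switch
   jumps into the middle of the do-while to handle the remainder. */
void duff_copy(char *to, const char *from, size_t count)
{
    if (count == 0)
        return;

    size_t n = (count + 7) / 8;   /* number of passes through the loop */

    switch (count % 8) {
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}
```

Any replacement for switch that made cases into proper blocks could not express this, which is one more reason the existing statement would have to stay.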
For Statement
This is a funny but useful variation of a loop statement, but it is
not a For statement as normally understood. A streamlined 'proper'
For statement would be handy (but is awkward to fit into C's zero-
based philosophy).
Can you be more specific?
[...]
Named Constants
Ie. what someone might expect when writing const int x=1000; x is a
variable, not an alias for 1000.
Given that const really means read-only, there is no proper way of
assigning a name to a literal, other than workarounds using #define
and enum, both with their own restrictions.
I think stealing what C++ did with this would be quite reasonable.
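A sketch of the difference as things stand in C (limit, LIMIT, table and the function names are all invented for the example). In C++ the const object could be used in both of the commented-out places; in C a conforming compiler need not accept either.

```c
#include <assert.h>
#include <stddef.h>

const int limit = 1000;   /* a read-only object, not a constant expression */
enum { LIMIT = 1000 };    /* a true constant, but only of type int */

int table[LIMIT];         /* fine: LIMIT is an integer constant expression */
/* int bad[limit]; */     /* not valid at file scope in C */

int is_limit(int v)
{
    switch (v) {
    case LIMIT:           /* 'case limit:' need not be accepted in C */
        return 1;
    default:
        return 0;
    }
}

size_t table_len(void) { return sizeof table / sizeof table[0]; }
```

The enum trick works for int values only, and #define has no scope or type, which is the restriction the quoted text is complaining about.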
Arrays
Array handling is ... different. However I don't have suggestions
to fix that, without completely changing the language.
Name Scoping
As I understand it, function names, and variable names declared
outside of functions, are always exported unless some attribute
(static?) is used.
I don't think this is what one would expect (ie. names are normally
private unless explicitly exported). The way C works now seems just
a little too casual.
I agree with the principle, but again, this would break existing code.
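For reference, a minimal sketch of how the current default plays out in one source file (all names here are hypothetical). Everything at file scope is visible to other translation units unless you remember to write static.

```c
#include <assert.h>

static int count;          /* internal linkage: private to this file */

/* Also private; another file can define its own next_id(). */
static int next_id(void) { return ++count; }

/* No keyword at all, so this one is exported by default. */
int counter_next(void) { return next_id(); }
```

Flipping the default would mean every existing function without static suddenly stops being visible to the linker.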
Something related to this: if I were designing my own language,
declared objects would be read-only by default. If you want to be
able to change an object's value after declaring it, you need to
say so, perhaps with a "var" keyword.
Text and Binary File Modes
No comments needed...
Compiler Attributes
When you look at actual header files they always seem to be full of
cr*p like this (and often a lot worse):
_CRTIMP __p_sig_fn_t __cdecl __MINGW_NOTHROW signal(int, __p_sig_fn_t);
all full of ad-hoc non-portable extensions specially designed to
make declarations completely incomprehensible.
Whatever it is these attributes are supposed to do, why not just
standardise them?
Because many of them are implementation-specific. Though a standard
syntax for implementation-defined attributes wouldn't be a bad thing;
is that what you meant? gcc provides some precedent for this.
[...]
Well, that's about all I could think of before breakfast. I've tried to
leave out personal preferences as that would have made it several
times the size.
And I've mainly concentrated on syntax...
Hey, you forgot to define a new meaning for "static"!