Sidney Cadot said:
Paul said:
Paul Hsieh wrote:
[...] I for one would be happy if more compilers would
fully start to support C99, [...]
I don't think that day will ever come. In its totality, C99 is almost
completely worthless in real-world environments. Vendors will be smart to
pick up restrict and a few of the goodies in C99 and just stop there.
Want to take a bet...?
Sure. Vendors are waiting to see what the C++ people do, because they are
well aware of the irreconcilable conflicts that have arisen. Bjarne and crew
are going to be forced to take the new stuff from C99 only in the bits and
pieces that don't cause any conflict or aren't otherwise stupid for other
reasons. The vendors are going to look at this, decide that the subset of
C99 that the C++ people chose will be the least problematic solution, and
just go with that.
Ok. I'll give you 10:1 odds; there will be a (near-perfect) C99 compiler
by the end of this decade.
A single vendor?!?! Ooooh ... try not to set your standards too high.
Obviously, it's well known that the gnu C++ people are basically converging
towards C99 compliance and are most of the way there already. That's not my
point. My point is: will Sun, Microsoft, Intel, MetroWerks, etc. join the
fray so that C99 is ubiquitous, to the point of obsoleting all previous C's
for all practical purposes for the majority of developers? Maybe the Comeau
guy will join the fray to serve the needs of the "perfect spec compliance"
market that he seems to be interested in.
If not, then projects that have a claim of real portability will never
embrace C99 (like Lua, or Python, or the JPEG reference implementation, for
example.) Even average developers will forgo the C99 features for fear that
someone will try to compile their stuff on an old compiler.
Look, nobody uses K&R-style function declarations anymore. The reason is
that the ANSI standard obsoleted them, and everyone picked up the ANSI
standard. That only happened because *EVERYONE* moved forward and picked up
the ANSI standard. One vendor is irrelevant.
If instead, the preprocessor were a lot more functional, then you
could simply extract packed offsets from a list of declarations and
literally plug them in as offsets into a char[] and do the slow memcpy
operations yourself.
This would violate the division between preprocessor and compiler too
much (the preprocessor would have to understand quite a lot of C semantics).
No, that's not what I am proposing. I am saying that you should not use
structs at all, but you can use their contents as a list of comma-separated
entries. With a more beefed-up preprocessor one could find the offset into a
packed char array that corresponds to the nth element of the list, as a sum
of sizeof()'s, and you'd be off to the races.
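To illustrate (my example here, not part of the original proposal): for a
hypothetical field list of int id; double value; char flag, this is the
sum-of-sizeof() boilerplate you have to write by hand today, and which a
beefier preprocessor could generate from the list itself:

    #include <string.h>

    /* Hand-written packed offsets as running sums of sizeof() */
    #define OFF_ID      0
    #define OFF_VALUE   (OFF_ID + sizeof (int))
    #define OFF_FLAG    (OFF_VALUE + sizeof (double))
    #define PACKED_SIZE (OFF_FLAG + sizeof (char))

    void pack (unsigned char *buf, int id, double value, char flag) {
        memcpy (buf + OFF_ID,    &id,    sizeof id);
        memcpy (buf + OFF_VALUE, &value, sizeof value);
        memcpy (buf + OFF_FLAG,  &flag,  sizeof flag);
    }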
Perhaps I'm missing something here, but wouldn't it be easier to use the
offsetof() macro?
It would be, but only if you have the packed structure mechanism. Other
people have posted indicating that in fact _Packed is more common than I
thought, so perhaps my suggestion is not necessary.
That's true. I don't quite see how this relates to the preceding
statement though.
I'm saying that trying to fix C's intrinsic problems shouldn't start or end
with some kind of resolution of call stack issues. Anyone who understands
machine architecture will not be surprised about call stack depth limitations.
There are far more pressing problems in the language that one would like to
fix.
Explain to me how you implement malloc() in a *multithreaded* environment
portably. You could claim that C doesn't support multithreading, but I
highly doubt you're going to convince any vendor that they should shut off
their multithreading support based on this argument. By dictating its
existence in the library, it would put the responsibility of making it work
right in the hands of the vendor, without affecting the C standard's stance
of not acknowledging the need for multithreading.
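To make the point concrete (my sketch, not anything from the standard), here
is the race a purely portable allocator runs into, with nothing in C89/C99
to lock around it:

    static unsigned char heap[1 << 20];
    static size_t next_free = 0;

    void *naive_malloc (size_t n) {
        size_t old = next_free;        /* two threads can read the same value */
        if (old + n > sizeof heap) return 0;
        next_free = old + n;           /* ...and one update is silently lost  */
        return heap + old;             /* so both get overlapping blocks      */
    }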
Well, it looks to me like you're proposing to have a feature-rich heap
manager. I honestly don't see why this couldn't be implemented portably,
without platform-specific knowledge. Could you elaborate?
See my multithreading comment above. Also, efficient heaps are usually
written with a flat view of memory in mind. That is more or less impossible
on non-flat memory architectures (like segmented architectures.)
[...] I want this more for reasons of orthogonality in design than anything
else.
You want orthogonality in the C language? You must be joking ...
Well, I'm a programmer, and I don't care about binary output -- how does your
proposal help me decide what I think is useful to me?
I don't think it's too bad an idea (although I have never gotten round to
trying the mechanism gcc provides for this). In any case, this kind of thing
is so much more naturally done in an OOP-supporting language like C++.
Without being belligerent: why not use that if you want this kind of thing?
Well, when I am programming in C++ I will use it. But I'm not going to move
all the way to using C++ just for this single purpose by itself.
I used "%x" as an example of a format specifier that isn't defined ('x'
being a placeholder for any letter that hasn't been taken by the
standard). The statement is that there'd be only 15 about letters left
for this kind of thing (including 'x' by the way -- it's not a hex
specifier). Sorry for the confusion, I should've been clearer.
Well what's wrong with %@, %*, %_, %^, etc?
* I think I would like to see a real string-type as a first-class
citizen in C, implemented as a native type. But this would open
up too big a can of worms, I am afraid, and a good case can be
made that this violates the principles of C too much (being a
low-level language and all).
The problem is that real string handling requires memory handling. The
other primitive types in C are flat structures of fixed width. You either
need something like C++'s constructor/destructor semantics or automatic
garbage collection; otherwise you're going to have some trouble with memory
leaking.
A very simple reference-counting implementation would suffice. [...]
This would complexify the compiler to no end. It's also hard to account for
a reference that was arrived at via something like "memcpy".
A first-class citizen string wouldn't be a pointer; neither would you
necessarily be able to get its address (although you should be able to
get the address of the characters it contains).
But a string has variable length. If you allow strings to be mutable, then
the actual sequence of characters has to be put into some kind of dynamic
storage somewhere. Either way, the base part of the string would in some way
have to be storable into, say, a struct. But you can copy a struct via
memcpy or however else. But that then requires a count increment, since
there is now an additional copy of the string. So how is memcpy supposed to
know that its contents contain a string whose ref count it needs to
increase? Similarly, memset needs to know how to *decrease* such a ref
count.
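A sketch of the problem (my illustration, using a hypothetical
reference-counted handle type):

    #include <string.h>

    typedef struct {
        int  *refcount;   /* shared count for the character data */
        char *data;
    } str_t;

    struct record { str_t name; int id; };

    void copy_record (struct record *dst, const struct record *src) {
        memcpy (dst, src, sizeof *dst);  /* copies the handle bit for bit... */
        /* ...but *src->name.refcount was never incremented, so releasing
           both records later over-releases the shared character buffer. */
    }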
If you allow the base of the string itself to move (like those morons did in
the Safe C String Library) then simple things like:
string *a, b;
a = (string *) malloc (sizeof (string));
*a = b;
b = b + b + b; /* triple up b, presumably relocating the base */
/* But now *a is undefined */
are just broken.
Look, the semantics of C just don't easily allow for a useful string
primitive that doesn't have an impact on the memory model (i.e., leak if you
aren't careful.) Even the Better String Library (http://bstring.sf.net/)
concedes that the programmer has to diligently call bdestroy() to clean up
after themselves; otherwise you'll just leak.
Ok, so the language should have a big bunch of operators, ready for the
taking. Incidentally, Mathematica supports this, if you want it badly.
Hey, it's not me -- apparently it's people like you who want more operators.
My point is that no matter what operators get added to the C language,
you'll never satisfy everyone's appetite. People will just want more and
more, though almost nobody will want all of what could be added.
My solution solves the problem once and for all. You have all the operators
you want, with whatever semantics you want.
This seems to me a bad idea for a multitude of reasons. First, it would
complicate most stages of the compiler considerably. Second, a maintenance
nightmare ensues: while the standard operators of C are basically burnt into
my soul, I'd have to get used to the Fantasy Operator Of The Month every
time I take on a new project originally programmed by someone else.
Yes, but if instead of actual operator overloading you only allow the
definition of these new operators, there will not be any of the *surprise*
factor. If you see one of these new operators, you can just view it like you
view an unfamiliar function -- you'll look up its definition, obviously.
There's a good reason that we use things like '+' and '*' pervasively,
in many situations; they are short, and easily absorbed in many
contexts. Self-defined operator tokens (consisting, of course, of
'atomic' operators like '+', '=', '<' ...) will lead to unreadable code,
I think; perhaps something akin to a complicated 'sed' script.
And allowing people to define their own functions with whatever names they
like doesn't lead to unreadable code? It's just the same thing. What makes
your code readable is adherence to an agreed-upon coding standard that
exists outside of what the language defines.
Do you have a reference? That's bound to be a fun read, and he probably
missed a few candidates.
It was just in the notes to some meeting Bjarne had in the last year or so
to discuss the next C++ standard. His quote was something like this: while
adding a feature to C++ can have value, removing one would have even more
value. Maybe someone who is following the C++ standardization threads can
find a reference -- I just spent a few minutes on Google and couldn't find
it.
I can only speak for myself; I have been exposed, and think it's a bad
idea. When used very sparingly, it has its uses. However, introducing new
user-definable operators as you propose would be folly; the only way
operator overloading works in practice is if you maintain some sort of link
to the intuitive meaning of an operator. User-defined operators lack this by
definition.
But so do user-definable function names. Yet functionally they are almost
the same.
"<>" would be a bad choice, since it is easy to confuse for "not equal
to". I've programmed a bit in IDL for a while, which has my dear "min"
and "max" operators.... It's a pity they are denoted "<" and ">",
leading to heaps of misery by confusion.
<<< and @ are nice though. I would be almost in favour of adding them,
were it not for the fact that this would drive C dangerously close in
the direction of APL.
You missed the "etc., etc., etc." part. I could keep coming up with them
until the cows come home: a! for factorial, a ^< b for "a choose b" (you
want language support for this because of the overflow concerns of using the
direct definition), <-> a for endian swapping, $% a for the fractional part
of a floating point number, a +>> b for the average (there is another
overflow issue), etc., etc.
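(The overflow issue with averaging, for the record: (a + b) / 2 can overflow
before the divide. The usual portable workaround for unsigned operands -- my
example, not anything from the thread -- is:)

    unsigned avg (unsigned a, unsigned b) {
        return (a & b) + ((a ^ b) >> 1);   /* floor((a+b)/2), never overflows */
    }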
Again I wonder, seriously: wouldn't you be better of using C++ ?
No because I want *MORE* operators -- not just the ability to redefine the
ones I've got (and therefore lose some.)
Sure, but you're talking about something that goes a lot further than
run-of-the-mill operator overloading. I think the simple way would be to
just introduce these min and max operators and be done with it.
"min" and "max" are perhaps less important than "+" and "*", but they
are probably the most-used operations that are not available right now
as operators. If we are going to extend C with new operators, they would
be the most natural choice I think.
WATCOM C/C++ defined the macros min(a,b) and max(a,b) in some header files.
Why wouldn't the language just accept this? Is it because you want
variable-length parameter lists? -- Well, in that case, does my preprocessor
extension proposal start to look like it's making more sense?
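(For what it's worth -- my example, not a claim about the WATCOM headers --
the usual complaint about macro versions of min/max is double evaluation of
the arguments, which a real operator would not have:)

    #define max(a,b) (((a) > (b)) ? (a) : (b))

    extern int f (int), g (int);

    void demo (int x, int y, int i, int j) {
        int worst = max (f (x), g (y));  /* f() or g() is called twice */
        int bump  = max (i++, j);        /* i may be incremented twice */
        (void) worst; (void) bump;
    }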
Those are not existing operators, as you know. They would have to be
defined in your curious "operator definition" scheme.
I find the idea freaky, yet interesting. I think C is not the place for
this (really, it would be too easy to compete in the IOCCC) but perhaps
in another language... Just to follow your argument for a bit, what
would an "operator definition" declaration look like for, say, the "?<"
min operator in your hypothetical extended C?
This is what I've posted elsewhere:
int _Operator ?< after + (int a, int b) {
    if (a < b) return a;
    return b;
}
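Used in an expression (again, hypothetical syntax), it would look like any
other binary operator:

    int m = a ?< b;   /* m gets the smaller of a and b */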
Well, I'd show you, but it's impossible _in principle_. Given that you
are multiplying two expressions of the widest type supported by your
compiler, where would it store the result?
In two values of the widest type -- just like almost every microprocessor
that has a multiply does it:
high *% low = a * b;
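(For contrast, here is roughly what you are forced to write today when
unsigned long long is already the widest type you have -- my sketch of the
standard split-and-reassemble approach, not code from the thread:)

    typedef unsigned long long u64;

    void mul_wide (u64 a, u64 b, u64 *high, u64 *low) {
        u64 a_lo = a & 0xFFFFFFFFULL, a_hi = a >> 32;
        u64 b_lo = b & 0xFFFFFFFFULL, b_hi = b >> 32;

        u64 p0 = a_lo * b_lo;   /* partial products of the 32-bit halves */
        u64 p1 = a_lo * b_hi;
        u64 p2 = a_hi * b_lo;
        u64 p3 = a_hi * b_hi;

        u64 mid = (p0 >> 32) + (p1 & 0xFFFFFFFFULL) + (p2 & 0xFFFFFFFFULL);

        *low  = (mid << 32) | (p0 & 0xFFFFFFFFULL);
        *high = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    }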
Well, I don't know if these dozen-or-so big-number 'powermod' operations
that are needed to establish an SSL connection are such a big deal as you
make them out to be.
It's not me -- it's Intel, IBM, Motorola, Sun and AMD who seem to be
obsessed with these instructions. Of course Amazon, Yahoo and eBay and most
banks are kind of obsessed with them too, even if they don't know it.
It looks cute, I'll give you that. Could you please provide semantics? It
may be a lot less self-evident than you think.
How about:
- carry is set to either 1 or 0, depending on whether or not a + b overflows
(just follow the 2's complement rules if one of a or b is negative.)
- var is set to the result of the addition; the remainder if a carry occurs.
- The whole expression (if you put the whole thing in parentheses) returns
the value of carry.
+< would not be an operator in and of itself -- the whole syntax is
required. For example: c +< v = a * b would just be a syntax error. The
"cuteness" was stolen from an idea I saw in some ML syntax. Obviously the
analogous construct for - (subtraction with borrow) would also be useful.
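(For comparison, this is how carry out of an addition has to be recovered
today -- the common idiom for unsigned types, my example:)

    unsigned add_carry (unsigned a, unsigned b, unsigned *carry) {
        unsigned sum = a + b;   /* unsigned arithmetic wraps modulo 2^N   */
        *carry = (sum < a);     /* 1 if the addition wrapped, 0 otherwise */
        return sum;
    }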
Ah, I see you've never implemented a non-table-driven CRC or a binary
greatest common divisor algorithm.
You can find a binary gcd algorithm that I wrote here:
http://www.pobox.com/~qed/32bprim.c
You will notice how I don't use or care about carries coming out of a right
shift. There wouldn't be enough of a savings to matter.
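(For readers without that file handy, a minimal binary gcd along the same
lines -- my reconstruction, not the code from 32bprim.c -- which shows that
the bits shifted out on the right are simply discarded:)

    unsigned bin_gcd (unsigned a, unsigned b) {
        unsigned shift = 0;
        if (a == 0) return b;
        if (b == 0) return a;
        while (((a | b) & 1) == 0) { a >>= 1; b >>= 1; shift++; }
        while ((a & 1) == 0) a >>= 1;
        while (b != 0) {
            while ((b & 1) == 0) b >>= 1;   /* shifted-out bits just vanish */
            if (a > b) { unsigned t = a; a = b; b = t; }
            b -= a;
        }
        return a << shift;
    }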
[...] They are both hard at work when you establish an SSL connection.
The specific operations I am citing make a *HUGE* difference and have billion
dollar price tags associated with them.
These numbers you made up out of thin air, no? Otherwise, I'd welcome a
reference.
Widening multiplies cost transistors on the CPU. The hardware algorithms are
variations of your basic public-school long multiplication -- so it takes on
the order of n^2 transistors to perform the complete operation, where n is
the width in bits of the largest word the machine accepts for the
multiplier. If the multiply were not widened they could save half of those
transistors. So multiply those extra transistors by the number of CPUs
shipped with a widening multiply (PPCs, x86s, Alphas, UltraSparcs, etc.) and
you easily end up in the billion dollar range.
Sure is. Several good big-number libraries are available that have
processor-dependent machine code to do just this.
And that's the problem. They have to be hand-written in assembly. Consider
just the SWOX GNU multiprecision library. When the Itanium was introduced,
Intel promised that it would be great for e-commerce. The problem is that
the SWOX guys were having a hard time with IA-64 assembly language (as
apparently lots of people are.) So they projected performance results for
the Itanium without having code available to do what they claimed. So people
who wanted to consider an Itanium system based on its performance for
e-commerce were stuck -- they had no code, and had to believe Intel's
claims, or SWOX's, as to what the performance would be.
OTOH, if instead the C language had exposed a carry-propagating add and a
widening multiply, then it would just be up to the Intel *compiler* people
to figure out how to make sure the widening multiply was used optimally, and
the SWOX/GMP people would just do a recompile to get baseline results at
least.