How would you design C's replacement?

Keith Thompson

Lanarcam said:
On 29/04/2012 18:17, Rui Maciel wrote:
My responses will probably be OT, they are the result of many years
chasing elusive and costly bugs in the domain of real time embedded
systems. My wish list:

- prevents writing at wrong memory locations by the use of wild
pointers or out of bound array indices.

- erasing of the stack memory resulting in wild jumps.

How would you change the language to prevent these?

C implementations are already *permitted* to perform bounds checking,
but most don't, since it imposes performance costs.
- prevents mistaking one structure with another by the
use of improper casts.

Would you forbid casts from one pointer type to another?
- overrun of integral types. The language should define
types such as uint32_t, int16_t, etc.

It does, or rather the standard library does (<stdint.h>, <inttypes.h>).

By "overrun", do you mean "overflow"? If so, how do int16_t and friends
address that? What prevents me from writing

int16_t x = 32767;
x ++;

and what should happen if I do?
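To make the question concrete: on an implementation where int is wider than 16 bits, x is promoted to int for the increment, so the addition itself does not overflow; it is the conversion of 32768 back to int16_t that yields an implementation-defined result (or raises an implementation-defined signal). A minimal sketch of what a "checked" increment could look like in today's C follows. The helper name int16_checked_inc is hypothetical, not part of <stdint.h> or any standard library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (an assumption for illustration only):
   increments *x if the result still fits in int16_t.
   Returns false and leaves *x unchanged if it would not fit. */
static bool int16_checked_inc(int16_t *x)
{
    assert(x != NULL);
    if (*x == INT16_MAX)
        return false;            /* 32767 + 1 is not representable */
    *x = (int16_t)(*x + 1);      /* value is known to be in range here */
    return true;
}
```

The caller still has to decide what to do when the helper returns false, which is exactly the open question: saturate, wrap, trap, or report an error.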
- errors in comparison instructions resulting in an
assignation, if (a = b) instead of if (a == b)

Many compilers already warn about "if (a = b)". But it can only be a
warning, since "if (a = b)" is valid code with unambiguous semantics.

If I were designing the language from scratch today, I'd probably use
":=" for assignment and "=" for comparison, but it's too late to change
it now.

One possible change would be to require parentheses when an assignment
is used as a condition, so you have to write "if ((a = b))" if that's
what you really want -- but that would break existing code.
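Some compilers already approximate this rule as a diagnostic: GCC and Clang, for instance, warn under -Wparentheses (enabled by -Wall) about an assignment used directly as a condition, and an extra pair of parentheses silences the warning. A small sketch of the "intentional" form, using a hypothetical function count_chars written only for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example function: counts the characters before the
   terminating '\0', deliberately using an assignment as the loop
   condition. */
static size_t count_chars(const char *s)
{
    size_t n = 0;
    char c;

    assert(s != NULL);
    while ((c = *s++))   /* double parentheses: the assignment is intended */
        n++;
    return n;
}
```

Writing `while (c = *s++)` instead is still valid C with the same meaning; the only difference today is whether the compiler chooses to warn.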
- no dangling pointers after free()

If I write:

int *p0 = malloc(sizeof *p0);
int *p1 = p0;
/* ... */
free(p0);

then p1 is a dangling pointer. How would you change the language to
prevent that?
 
Lanarcam

On 01/05/2012 22:43, Keith Thompson wrote:
How would you change the language to prevent these?

I have no idea, it's a wish list, you are the (potential)
writer. If you say it is impossible I accept that.
 
Jens Gustedt

On 05/01/2012 10:32 PM, Keith Thompson wrote:
As I understand it, C++ introduced a new use for the "auto" keyword
*without* breaking existing uses.

(The new form of "auto" lets you declare an object without explicitly
specifying its type; the type is inferred from the initializer.)

For example:

{
auto int x = 42; /* old-style "auto", useless but legal */
auto y = x; /* new-style "auto" */
}

It's no more problematic than C's habitual re-use of "static".

that would be good news, since I like the feature itself

As long as there is a type following auto, this is an old style
declaration of an automatic variable?

How would one then resolve ambiguities, and declaration order?

Would it fit well with the rest of C?

In C++ an "auto" type determined variable can't have a {} initializer,
since these don't have types. But in C we could use a compound
literal:

typedef struct toto toto;
struct toto { toto * x; };
auto y = (toto){ &y };

The compound literal that is used for initialization is only valid if
y is of type toto. So the compound literal would first be evaluated
for its type, the type of y would be determined, and then the compound
literal would be evaluated for its value?

Jens
 
James Kuyper

On 05/01/2012 04:32 PM, Keith Thompson wrote:
....
As I understand it, C++ introduced a new use for the "auto" keyword
*without* breaking existing uses.

(The new form of "auto" lets you declare an object without explicitly
specifying its type; the type is inferred from the initializer.)

For example:

{
auto int x = 42; /* old-style "auto", useless but legal */
auto y = x; /* new-style "auto" */
}

It's no more problematic than C's habitual re-use of "static".

The latest draft of the C++ standard that I currently have access to is
n3035, dated 2010-02-16.

In section C, it describes differences between C and C++. In section
C.1.5 it says "The keyword auto cannot be used as a storage class
specifier.". The rationale for this change is given as "Allowing the use
of auto to deduce the type of a variable from its initializer results in
undesired interpretations of auto as a storage class specifier in
certain contexts." It cites section 7.1.6.4, which describes the 'auto'
type specifier, as the basis for this prohibition.

In section 7.1.6.4p5 it says "A program that uses auto in a context not
explicitly allowed in this section is ill-formed.". I found no mention,
in that section, of using 'auto' with the meaning it has in C, only the
new meaning that it has in C++.
 
BartC

Keith Thompson said:
a better mechanism could be one which is "similar" to include, except it
leaves the matter undefined as to whether actual textual inclusion is
used, or if the headers are compiled independently (and their contents
imported later).

one idea here:
#pragma pch_standalone
or:
#pragma precompiled_header
[...]

If we're trying to come up with something better than #include, I'd
rather create a new mechanism that isn't part of the preprocessor, and
doesn't look like it is (#pragma).

For example, this:

#include <stdio.h>

could be replaced by:

import stdio;

This would make all the declarations in the stdio "module" visible; it

*All* the declarations? Perhaps only the ones intended to be exported..
 
BartC

Rui Maciel said:
That's true. Nonetheless, there are languages out there who are referred to
as high level assembly languages which also employ some abstractions that,
at least in some aspects, don't make their abstraction level much lower than
C's.

All the high-level assembler kind of languages I've used or created, were
generally far lower level than C (although most were some time ago too).
Another issue we might consider is that if we were given the task of
designing an assembly language which should be able to generate code for
multiple architectures, I suspect we would end up with a language which, in
terms of features, wouldn't be much different than C's core language.

Probably not. But C is actually rather higher level than it's given credit
for. That also means it's not as great at doing low-level stuff as people
think. I only recently tried to use C as a target language for intermediate
code (no control structures, and the simplest possible expressions), yet
there were still a few problems in doing exactly what I wanted it to do.

(In the end I opted for generating assembly code directly; I know I can get
it to do anything, and I don't have a C compiler messing with my
intermediate code doing who knows what to it.)

(Try this test; given:

int a;

write some C code that will push 'a' onto the stack, and leave it there. By
'the stack', I mean the hardware stack if there is one. Then, I don't know,
perhaps pop it off the stack, treat the value as a label address, and jump
to that label. Any actual assembler worth its salt wouldn't have the
slightest problem with any of this. But can C do it? And in a way that could
fit neatly into the margin of this message?)
 
Keith Thompson

James Kuyper said:
On 05/01/2012 04:32 PM, Keith Thompson wrote:
...

The latest draft of the C++ standard that I currently have access to is
n3035, dated 2010-02-16.

In section C, it describes differences between C and C++. In section
C.1.5 it says "The keyword auto cannot be used as a storage class
specifier.". The rationale for this change is given as "Allowing the use
of auto to deduce the type of a variable from its initializer results in
undesired interpretations of auto as a storage class specifier in
certain contexts." It cites section 7.1.6.4, which describes the 'auto'
type specifier, as the basis for this prohibition.

In section 7.1.6.4p5 it says "A program that uses auto in a context not
explicitly allowed in this section is ill-formed.". I found no mention,
in that section, of using 'auto' with the meaning it has in C, only the
new meaning that it has in C++.

Then I stand corrected.

I would have thought that the new usage could coexist with the
old one (with "auto" having the new meaning if and only if the
declaration lacks an explicit type), but apparently the C++ committee
disagreed. Probably there are ambiguous cases I haven't thought of,
or perhaps they just wanted to avoid potential confusion and felt
that the minor backward-incompatibility was a price worth paying.
 
Keith Thompson

Lanarcam said:
On 01/05/2012 22:43, Keith Thompson wrote:

I have no idea, it's a wish list, you are the (potential)
writer. If you say it is impossible I accept that.

I'm not saying it's impossible. There are plenty of languages that
avoid wild pointers, typically by not having pointers, or at least by
not having pointer arithmetic. But I don't think any such language
could reasonably be called "C", or even C-like.
 
Keith Thompson

Keith Thompson said:
Assembly code that uses macros still unambiguously specifies the CPU
instructions to be generated; it just does so indirectly.

I'm not sure what you mean by "high level assembly"; can you cite an
example?

I remember a very similar discussion from last year; looking back at my
archives, I see that it was with you.

You claimed at the time that:

| I'm not aware of any specification of an assembly language which
| explicitly forbids any form of optimization and also requires that
| each and every instruction must be exactly mapped to a specific
| opcode.

I asked you for a concrete example:

| Then show us an example of an assembly language, or of an
| assembler, that *does* permit this kind of optimization, or where
| the assembly language input doesn't specify the exact sequence of
| CPU instructions.

Unless I missed something, you never did so; you merely asserted that
most assemblers work in a way that's inconsistent with my understanding
of every assembler I've ever used.

This was in a thread from May and June of 2011, subject "for your
languages".

References:

Message-ID: <[email protected]>
Message-ID: <[email protected]>
Message-ID: <[email protected]>

http://groups.google.com/group/comp.lang.c/msg/7d6294f32f8537d4
http://groups.google.com/group/comp.lang.c/msg/bc4fee4fceca1fc5
http://groups.google.com/group/comp.lang.c/msg/64a9897cb09d3832

We ran into an impasse back then; I'm not interested in repeating that.
 
BGB

I'm not saying it's impossible. There are plenty of languages that
avoid wild pointers, typically by not having pointers, or at least by
not having pointer arithmetic. But I don't think any such language
could reasonably be called "C", or even C-like.

a trick used in my own language / VM is having support for "bounded"
pointers, where the VM basically keeps track of the underlying memory
object, and will trap if a pointer access goes out of range (as a result
of either arithmetic or indexing).

granted, however, this is not free.

currently, this is only done for things like arrays and strings, whereas
more explicit pointers don't do this.
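For readers who want a picture of the idea in plain C, here is a rough sketch of a "bounded" pointer that carries its underlying object along with it. It is only an illustration under assumptions: the type bounded_int_ptr and the functions bounded_make, bounded_add, bounded_load and bounded_store are hypothetical names, not the VM's actual mechanism, and an out-of-range access is reported with a false return value rather than a trap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bounded pointer: remembers the array it points into. */
typedef struct {
    int   *base;  /* start of the underlying array */
    size_t len;   /* number of elements in the array */
    size_t pos;   /* current element offset from base */
} bounded_int_ptr;

static bounded_int_ptr bounded_make(int *base, size_t len)
{
    assert(base != NULL || len == 0);
    bounded_int_ptr p = { base, len, 0 };
    return p;
}

/* "Pointer arithmetic": only the offset changes. A negative delta wraps
   around (unsigned arithmetic is well defined), so a position before the
   start becomes a huge offset that the access check below rejects. */
static bounded_int_ptr bounded_add(bounded_int_ptr p, ptrdiff_t delta)
{
    p.pos += (size_t)delta;
    return p;
}

/* Checked read: fails instead of touching memory outside the array. */
static bool bounded_load(bounded_int_ptr p, int *out)
{
    if (p.pos >= p.len)
        return false;
    *out = p.base[p.pos];
    return true;
}

/* Checked write, with the same range test as bounded_load. */
static bool bounded_store(bounded_int_ptr p, int value)
{
    if (p.pos >= p.len)
        return false;
    p.base[p.pos] = value;
    return true;
}
```

The cost mentioned above is visible here: every pointer is three words instead of one, and every access pays for a comparison.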
 
Ian Collins

On 05/01/2012 04:32 PM, Keith Thompson wrote:
....

The latest draft of the C++ standard that I currently have access to is
n3035, dated 2010-02-16.

That's rather old, I believe N3225 was the last draft (Like C11, the
actual standard is still overpriced).
In section C, it describes differences between C and C++. In section
C.1.5 it says "The keyword auto cannot be used as a storage class
specifier.". The rationale for this change is given as "Allowing the use
of auto to deduce the type of a variable from its initializer results in
undesired interpretations of auto as a storage class specifier in
certain contexts." It cites section 7.1.6.4, which describes the 'auto'
type specifier, as the basis for this prohibition.

In section 7.1.6.4p5 it says "A program that uses auto in a context not
explicitly allowed in this section is ill-formed.". I found no mention,
in that section, of using 'auto' with the meaning it has in C, only the
new meaning that it has in C++.

That's why it gets a mention in the compatibility section!
 
Ian Collins

On 05/01/2012 10:32 PM, Keith Thompson wrote:

that would be good news, since I like the feature itself

As long as there is a type following auto, this is an old style
declaration of an automatic variable?

How would one then resolve ambiguities, and declaration order?

Would it fit well with the rest of C?

James Kuyper addressed compatibility.
In C++ an "auto" type determined variable can't have a {} initializer,
since these don't have types. But in C we could use a compound
literal:

It can, where the type can be deduced; the standard uses an example:

auto x1 = { 1, 2 }; // decltype(x1) is std::initializer_list<int>

Something simple like

auto x = {1};

is also OK.
typedef struct toto toto;
struct toto { toto * x; };
auto y = (toto){&y };

This wouldn't work, but in that case there is nothing to be gained with
auto (you had to name the type in the cast).
The compound literal that is used for initialization is only valid if
y is of type toto. So the compound literal would first be evaluated
for its type, the type of y would be determined, and then the compound
literal would be evaluated for its value?

But you still have to name the type somewhere in the statement.
 
Guest

What are C's design goals? I would say that they allow low-level manipulation and production of efficient code while opting for a terse syntax and portability.

....whilst still providing reasonable HLL capabilities. C's current survivability is more related to its ubiquity than to its original design goals.
If we are interested in allowing better documentation, we should have a "Long Comment Begins" and "Long Comment Ends" in addition to "short comment begins" and "short comment ends". "/*" and "*/" are fine for the latter.

If we are interested in low-level manipulation, how about a "Fake Assembly Language" syntax as an option.

never quite understood what "fake assembly language" was? What is it you want that FAL can provide but other syntax can't?

<snip>
 
Guest

But not persuasively, IMHO.

C is at a lower semantic level than a lot of other languages.
But the major difference between C and assembly language is that
an assembly program specifies CPU instructions, while a C program
specifies run-time behavior.

That's a *huge* distinction, with assembly languages on one side,
and C, C++, Ada, Python, APL, and Intercal firmly on the other.

this is why I keep asking people like io_x why he wants "fake assembly language"? Or even what it *is*!

wasn't there a guy a while back who was into Portable Assembly Language? PASM?
 
Guest

On 29/04/2012 18:17, Rui Maciel wrote:

My responses will probably be OT, they are the result of many years
chasing elusive and costly bugs in the domain of real time embedded
systems. My wish list:

I think most of these are un-doable without removing C's essential "C-ness". I think you want some sort of reasonably efficient script language running on top of a VM that enforces these sorts of limits, with the ability to plunge into C when necessary. Unfortunately I don't know of such a thing. Forth maybe?
- prevents writing at wrong memory locations by the use of wild
pointers or out of bound array indices.

who decides when a pointer is "wild"? Forbid pointer arithmetic except in critical parts of the code.
- erasing of the stack memory resulting in wild jumps.

- prevents mistaking one structure with another by the
use of improper casts.

ban casting XICP
- overrun of integral types. The language should define
types such as uint32_t, int16_t, etc.

puke. It does.
- errors in comparison instructions resulting in an
assignation, if (a = b) instead of if (a == b)

experienced programmers hardly ever do this. Some compilers warn about it. How would you fix it?
- no dangling pointers after free()

you use malloc/free on an embedded system? How would you implement this? Zeroing the pointer isn't the solution...

Rewrite all your applications in Lisp.
 
Guest

I'm not saying it's impossible. There are plenty of languages that
avoid wild pointers, typically by not having pointers, or at least by
not having pointer arithmetic. But I don't think any such language
could reasonably be called "C", or even C-like.

and embedded people often want to do this sort of thing

volatile unsigned char *sio_reg = (unsigned char *)0x8010;

*sio_reg = 0xfe;
*sio_reg = 0x10;
*sio_reg = 0x10;

sio_reg++;
*sio_reg = 'h';
*sio_reg = 'i';
*sio_reg = '!';
 
BartC

experienced programmers hardly ever do this. Some compilers warn about it.
How would you fix it?

That's being a little elitist. Why shouldn't anyone be able to use the
language, for example those who have to program other languages too?

And the fix is easy: use different symbols for assignment and equality. By
using the natural "=" symbol for equality, you won't then fall into the trap
of using it by mistake instead of the very 1970s-looking "==".

Or use two symbols for assignment; one that returns the value of the left
side ("="), and one which returns void (eg. ":="). This could be added to C
today. It won't eliminate the problem, but can reduce it.
 
Keith Thompson

Ian Collins said:
On 05/ 2/12 09:14 AM, James Kuyper wrote: [...]
The latest draft of the C++ standard that I currently have access to is
n3035, dated 2010-02-16.

That's rather old, I believe N3225 was the last draft (Like C11, the
actual standard is still overpriced).

N3290 was newer, but it's no longer available.

The C++11 standard is now available for $30 from the ANSI store:

http://webstore.ansi.org/RecordDetail.aspx?sku=INCITS/ISO/IEC+14882-2012

They're still charging $285 for the C11 standard ($228 if you're a
member); my understanding is that that should drop to $30 as soon as
ANSI officially adopts ISO C11 as an ANSI standard.

(If you think $30 is still overpriced, I won't argue with you.)

[...]
 
James Kuyper

On 05/ 2/12 09:14 AM, James Kuyper wrote: ....

That's why it gets a mention in the compatibility section!

Well, yes. Since section C is only informative, not normative, I was
trying to explain precisely why what it says in C.1.5 is in fact a
correct description of the normative text. In my experience, what C.1.5
says is occasionally less accurate than it could be, particularly in
its explanation of the C side of C/C++ differences.
 
