What's the deal with C99?

Paul Hsieh · Mar 28, 2008

I have ignored the remainder. Getting too complex for me.

Ok, this I don't get. Did you not write your own heap manager
yourself?!?! I was pretty sure you did, and used it for the DOS port
of gcc or something.

Look, you fetch big blocks from system memory in some way, whether
through sbrk() or VirtualAlloc() or some DPMI memory allocation or
whatever. You can track these big blocks in a list. The
isInAllocatedMemoryRange function would just look through this list to
see if the pointer was pointing to the inside of one of these blocks.
What is the complication here?!?! There are no issues -- I *did* this
in order to revive some old buggy code while doing major invasive
refactoring; this level of debugging was critical in making the effort
practical.
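
As a rough illustration of the block-list idea above, here is a minimal sketch in C. The struct sysBlock record, the trackBlock helper and the one-argument form of isInAllocatedMemoryRange are assumptions made for this example, not the code from the refactoring effort described. A real heap manager would presumably record a block wherever it obtains memory from sbrk(), VirtualAlloc() or a DPMI call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one big block obtained from the system. */
struct sysBlock {
    struct sysBlock *next;
    char *base;   /* first byte of the block */
    size_t len;   /* length of the block in bytes */
};

static struct sysBlock *blockList = NULL;

/* Record a block in the list; the caller owns the node's storage. */
static void trackBlock(struct sysBlock *node, void *base, size_t len) {
    assert(node != NULL && base != NULL && len > 0);
    node->base = base;
    node->len = len;
    node->next = blockList;
    blockList = node;
}

/* Nonzero if p points inside one of the tracked blocks. */
static int isInAllocatedMemoryRange(const void *p) {
    uintptr_t a = (uintptr_t)p;
    for (const struct sysBlock *b = blockList; b != NULL; b = b->next) {
        uintptr_t lo = (uintptr_t)b->base;
        if (a >= lo && a - lo < b->len)
            return 1;
    }
    return 0;
}
```

The lookup is linear in the number of big blocks, which is usually small. It only answers "is this somewhere in the heap"; it says nothing about which individual allocation the pointer belongs to.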

However, my point is that there is no way (apart from comprehensive
runtime tables) to tell that a pointer is to malloced memory,
static memory, or automatic memory.

What? If you are a compiler implementer (remember, someone was
challenging me on compiler extensions, so this is our premise) then in
fact, exactly those three memory regions represent exactly what you
*DO* know. In the effort that I referred to above, what I did is I
had the program read back in its own MAP file to figure what its
static ranges were and tracked the bottom of the stack from main (and
top from ESP in an _asm fragment.) This allowed me to implement a
memClassify function as well. Of course, it's possible to get pointers
from other sources, so this does not represent a comprehensive memory
classifier.

But the compiler can do way better. At link time the memory map can
be determined, so hacks like reading back its own map file are
unnecessary when you can just append a small data region with the
start and length information for the static memory (and init/const
memory, if there is any.) For the stack, the beginning can be marked
before the call to main, and a macro be used for determining the
current stack top.
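
A sketch of how such a classifier might look, under stated assumptions: the struct memMap table stands in for the small data region the linker and startup code would emit, classifyAddr and the MEM_* names are invented for this example, and the stack is assumed to be one contiguous region that grows downward. The address of a local variable is used as a stand-in for reading the stack pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum MemType { MEM_UNKNOWN, MEM_STATIC, MEM_STACK, MEM_HEAP };

/* Hypothetical table filled in at link time and before main runs. */
struct memMap {
    uintptr_t staticStart, staticLen; /* static data range */
    uintptr_t heapStart, heapLen;     /* memory owned by the allocator */
    uintptr_t stackBase;              /* highest stack address, marked at startup */
};

static int inRange(uintptr_t a, uintptr_t start, uintptr_t len) {
    return a >= start && a - start < len;
}

/* Classify an address given the map and the current stack top. */
static enum MemType classifyAddr(const struct memMap *m, uintptr_t a,
                                 uintptr_t stackTop) {
    assert(m != NULL);
    if (inRange(a, m->staticStart, m->staticLen)) return MEM_STATIC;
    if (inRange(a, m->heapStart, m->heapLen)) return MEM_HEAP;
    /* Assumes a single contiguous stack that grows downward. */
    if (a >= stackTop && a < m->stackBase) return MEM_STACK;
    return MEM_UNKNOWN;
}

/* Pointer front end: a local's address approximates the stack top. */
static enum MemType memClassify(const struct memMap *m, const void *p) {
    char marker;
    return classifyAddr(m, (uintptr_t)p, (uintptr_t)&marker);
}
```

Pointers from other sources (memory-mapped files, other threads' stacks, and so on) fall through to MEM_UNKNOWN, which matches the caveat that this is not a comprehensive classifier.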

[...] There is also no way to tell
that that pointer is to the whole chunk, or to a minuscule chunk
of that. This makes memory checking virtually impossible, if you
want to preserve efficiency.

Huh? Look, you want the block with its header to be aligned. But you
also want the base pointer to be aligned. That means you have *LOTS*
of header space. If you do an approximation of (uint32_t)
SHA2({ptr,size}) (you can go waaay simpler, I am just giving an
overkill example to illustrate the point) and store it in the header
somewhere, then that's it; the memory allocation is marked in a pseudo-
unique fashion. You aren't going to run into flukey internal or
static pointers that randomly reproduce this signature.
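
To make the header-signature idea concrete, here is a minimal sketch layered on top of malloc(). Everything named here (union memHdr, sigOf, dbgAlloc, dbgIsBasePtr, dbgFree) is hypothetical, and the mixing function is a cheap stand-in for the "waaay simpler" alternative to SHA2, not a recommendation. The check assumes the caller already knows the pointer is suitably aligned and lies in the heap with a readable header's worth of bytes before it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header placed in front of each allocation; the extra union members
   keep the user pointer aligned for the common fundamental types. */
union memHdr {
    struct { size_t size; uint32_t sig; } h;
    long double alignLd;
    long long alignLl;
    void *alignP;
};

/* Cheap 32-bit mix of {ptr, size}; a stand-in for a real hash. */
static uint32_t sigOf(const void *user, size_t size) {
    uint32_t x = (uint32_t)(uintptr_t)user ^ ((uint32_t)size * 2654435761u);
    x ^= x >> 16;
    x *= 0x85EBCA6Bu;
    x ^= x >> 13;
    return x;
}

static void *dbgAlloc(size_t n) {
    union memHdr *hdr;
    if (n > SIZE_MAX - sizeof *hdr) return NULL;
    hdr = malloc(sizeof *hdr + n);
    if (hdr == NULL) return NULL;
    hdr->h.size = n;
    hdr->h.sig = sigOf(hdr + 1, n);
    return hdr + 1;
}

/* Nonzero if p carries the signature of an exact dbgAlloc result. */
static int dbgIsBasePtr(const void *p) {
    const union memHdr *hdr = (const union memHdr *)p - 1;
    return hdr->h.sig == sigOf(p, hdr->h.size);
}

static void dbgFree(void *p) {
    union memHdr *hdr;
    if (p == NULL) return;
    hdr = (union memHdr *)p - 1;
    assert(hdr->h.sig == sigOf(p, hdr->h.size));
    hdr->h.sig = ~hdr->h.sig; /* invalidate the signature before release */
    free(hdr);
}
```

An interior or static pointer would have to be preceded by bytes that happen to reproduce the signature for its own address and the neighbouring size field, which is the sense in which the marking is pseudo-unique rather than guaranteed.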

Sure, I can do some things with a subset of those pointers, which I
have allocated and manipulated since birth. But those operations
won't function on another system.

Huh? I am talking about how one implements this in the compiler
library for each specific system. Obviously there's no way the
implementations would be portable; but this is unnecessary.

[...] They will be totally confused if
ever passed non-malloced pointers, or even malloced pointers using
another malloc package.

Huh? If it doesn't come from the malloc that corresponds to these
functions, then it will fail the isInAllocatedMemoryRange () function.

I just don't like any software that 'works sometimes'.

It's a debugging tool, and "sometimes" is 99.99% of the time.

[...] If its
purpose is to tell me that memory is being misused, I want a
definite answer. Barring that, I prefer no answer.

An automated system that is correct 99.99% of the time, is a higher
standard than I have ever seen in any human when tracking down a
memory corruption bug in a non-trivial amount of code.

Antoninus Twink · Mar 28, 2008

What the hell is your problem?

How long have you got?

CBFalconer · Mar 28, 2008

Paul said:
.... snip ...

[...] There is also no way to tell
that that pointer is to the whole chunk, or to a minuscule chunk
of that. This makes memory checking virtually impossible, if you
want to preserve efficiency.

Huh? Look, you want the block with its header to be aligned. But
you also want the base pointer to be aligned. That means you have
*LOTS* of header space. If you do an approximation of (uint32_t)
SHA2({ptr,size}) (you can go waaay simpler, I am just giving an
overkill example to illustrate the point) and store it in the
header somewhere, then that's it; the memory allocation is marked
in a pseudo-unique fashion. You aren't going to run into flukey
internal or static pointers that randomly reproduce this signature.

Consider:

void *cpymem(void* restrict s1, const void* restrict s2, size_t
n);

How is that code going to tell the three types of pointers
mentioned above (apart from other subdivisions) apart? How is it
to tell that the n passed is one-off? Every time.

The compiler and its tables do not go along with the object code.
For C large and complex run-time tables are needed, and total
indirection of all pointers. This gives up the fundamental
advantage of C, i.e. its closeness to the underlying machine.

Walter Banks · Mar 28, 2008

Paul said:
I am not a proposal-generating bureaucrat. I am not an active
participant in the standards process, and I am not a stakeholder. I
have nevertheless proposed various API modifications in this very
newsgroup over the years just on memory alone. They have never even
been acknowledged by anyone who has influence with the standards
committee.

The (admittedly raw) proposals have typically been in the form of

size_t memTotalAllocated (void);
void * memExpand (...);
size_t memSize(void *);
enum MemType memClassify (void *);
int memCheck (void *);
int memWalk (int (* cb) (void * ctx, int state, void * loc), void *
ctx);

Paul,

If you want to see real change from your proposals, expand the
outline into a real document that can be debated. The process of
producing a document will require that the details be well thought
out and the implications understood. Circulate the
document for honest debate and answer the critics.

It is this approach that starts real change in the standards process.
You are a stakeholder in this by the very fact that you are using
the language.

Spell out the details, all the details. Standards are hard work
that involves a lot of time, but it is worth it. Prove that your approach
is better than the alternatives. Produce something that standards
organizations can actually debate so you will be heard.

Rant off

w..

Paul Hsieh · Mar 28, 2008

Paul Hsieh wrote:

[...] There is also no way to tell
that that pointer is to the whole chunk, or to a minuscule chunk
of that. This makes memory checking virtually impossible, if you
want to preserve efficiency.

Huh? Look, you want the block with its header to be aligned. But
you also want the base pointer to be aligned. That means you have
*LOTS* of header space. If you do an approximation of (uint32_t)
SHA2({ptr,size}) (you can go waaay simpler, I am just giving an
overkill example to illustrate the point) and store it in the
header somewhere, then that's it; the memory allocation is marked
in a pseudo-unique fashion. You aren't going to run into flukey
internal or static pointers that randomly reproduce this signature.

Consider:

void *cpymem(void* restrict s1, const void* restrict s2, size_t n);

How is that code going to tell the three types of pointers
mentioned above (apart from other subdivisions) apart? How is it
to tell that the n passed is one-off? Every time.

Wait. Remember that the extensions I was discussing are about dynamic
memory. I can classify the other *regions* as well, but obviously
declared variable boundaries are not available (so even fields inside
of a struct on the heap cannot be protected, and I never claimed such
a thing). But we can, in fact, do something:

struct chkCtx {
    const void * ptr;   /* Start of the range being checked */
    size_t sz;          /* Length of that range in bytes */
    enum relative2heap { INSIDE_GOOD, CROSS_BOUNDARY, NOT_IN } ret;
};

/* Called once per live allocation; ptr is that allocation's base.
   Returns 0 to keep iterating, -1 to stop. */
static int memCheck (void * ctx, void * ptr) {
    struct chkCtx * c = (struct chkCtx *) ctx;
    size_t asz = sizeOfMemoryAllocation (ptr);
    /* Offset of the checked pointer within this allocation; wraps
       to a huge value if the pointer lies below the base. */
    size_t l = (size_t) ((const char *) c->ptr - (const char *) ptr);
    if (l >= asz) return 0; /* Not in this allocation */
    if (c->sz > asz - l) {
        c->ret = CROSS_BOUNDARY;
        return -1;
    }
    c->ret = INSIDE_GOOD;
    return -1;
}

/* I dropped the restricts, since they have no relevance to safe
   code */
void *cpymem (void* s1, const void* s2, size_t n) {
    struct chkCtx c;
    c.sz = n;

    if (isInAllocatedMemoryRange (s1, n)) {
        c.ret = NOT_IN; /* Default: in a free entry */
        c.ptr = s1;
        memAllocationsIterate (memCheck, &c);
        if (c.ret != INSIDE_GOOD) {
            /* Failed */
            return NULL;
        }
    }
    if (isInAllocatedMemoryRange (s2, n)) {
        c.ret = NOT_IN; /* Default: in a free entry */
        c.ptr = s2;
        memAllocationsIterate (memCheck, &c);
        if (c.ret != INSIDE_GOOD) {
            /* Failed */
            return NULL;
        }
    }
    /* Substance of the code: */
    memmove (s1, s2, n);
    return s1;
}

The compiler and its tables do not go along with the object code.
For C large and complex run-time tables are needed, and total
indirection of all pointers. This gives up the fundamental
advantage of C, i.e. its closeness to the underlying machine.

Ok, but you are putting a demand on C that is too high. I never claimed
or implied that such a demand could be satisfied.

But, in many cases, a library may be able to assert that certain
structures must be allocated as exact results from malloc(), which can
be tested for.

When trying to track down wild pointers in real-world code, you
usually don't need to know things to a resolution of exact declared
variable boundaries. Static corruptions which you want to guard
against can still be easily hunted down with a binary search once you
have located your corruption. The point of the extensions that *I*
have proposed is that they let you deal with corruption that you
cannot see with the ANSI C interface when it happens.

CBFalconer · Mar 29, 2008

Paul said:
.... snip ...

Ok, but you are putting a demand on C that is too high. I never
claimed or implied that such a demand could be satisfied.

Ok, I think we can conclude that we are thinking along different
lines, for different objectives. Having come down with the flu, I
am not in the least inclined to worry about it.

Andrew Haley · Mar 30, 2008

Initially (back in the days leading up to C89), the GCC developers
were actively hostile to the standardization process (I'm not
entirely sure why), as illustrated by their initial implementation
of #pragma.

Ah, yes. That was a long time ago.

That attitude is, thankfully, long gone and the current developers
have made significant progress in conformance, although it's
disappointing that standards compliance still isn't part of their
mission statement.

Why ought it to be? The goal is freedom; if standards compliance
helps free software -- and it usually does -- gcc will be standards
compliant, but if standards compliance hurts free software it won't
be.

Being an open source project,

GCC is free software.

I'm sure there are logistical problems that would have to be
addressed for formal participation, but I don't think that any of
them are insurmountable. In particular, I would think that the GCC
steering committee could become a member of the ANSI committee
(DECUS was a member for quite a while), which would entitle them to
one Principal and unlimited Alternate representatives.

This is to misunderstand the role of the steering committee, which is
not to make technical decisions about language details, but to make
major decisions "in the best interests of the GCC project", which
usually means politics.

It makes sense for gcc contributors to attend WG meetings, and in the
case of C++ some do -- but they don't represent the GCC project but
their employers.

Andrew.

lawrence.jones · Mar 31, 2008

Andrew Haley said:
Why ought it to be? The goal is freedom; if standards compliance
helps free software -- and it usually does -- gcc will be standards
compliant, but if standards compliance hurts free software it won't
be.

Unless a standard is encumbered in some way (like requiring use of
patented technology that is not freely licensed), I can't see compliance
doing anything but helping free software. It seems to me that it
belongs on the list of Design and Development Goals.

This is to misunderstand the role of the steering committee, which is
not to make technical decisions about language details, but to make
major decisions "in the best interests of the GCC project", which
usually means politics.

The standardization process is, at its heart, political.

It makes sense for gcc contributors to attend WG meetings, and in the
case of C++ some do -- but they don't represent the GCC project but
their employers.

That's the problem -- it seems to me that GCC is important enough that
it deserves direct representation. The Steering Committee is the
closest thing GCC has to a formal organization that "owns" the
implementation.

-Larry Jones

I don't see why some people even HAVE cars. -- Calvin

Keith Thompson · Apr 4, 2008

Keith Thompson said:
santosh said:

Ioannis said:

Keith Thompson wrote:
(e-mail address removed) wrote:
[...]
I'm just reporting what Intel says in
<http://www.intel.com/support/performancetools/c/sb/cs-015003.htm>:

The Intel(R) C++ Compilers conforms to the ANSI/ISO standard
ISO/IEC 9899:1999 for C language with one limitation:

* long double (128-bit representations) is not supported.

Perhaps the support is mostly there but not quite complete?
Perhaps C99 "upgraded" long double to larger value range than C90
had it?

No, it didn't.

Then I think their long double comment doesn't make any sense.

Obviously they are saying that 128 bit long doubles are not yet
supported. That doesn't mean that a smaller long double isn't.
Currently they seem to be using 96 bit long doubles.

They're saying that they don't support 128-bit long double, which is
correct.

They're also saying that this is a limitation on their conformance to
the C99 standard, which is incorrect, since C99 doesn't require
128-bit long double.

I've just submitted a comment on this to Intel via the web page's
feedback link.

Intel got back to me, and they've fixed the page. It now says:

The Intel(R) C++ Compilers conforms to the ANSI/ISO standard
ISO/IEC 9899:1999 for C language.

Assuming it's true (and I have no reason to doubt it), this is good
news.

(I'll reply and ask them to fix the grammatical error.)

Ioannis Vranos · Apr 4, 2008

Keith said:
Intel got back to me, and they've fixed the page. It now says:

The Intel(R) C++ Compilers conforms to the ANSI/ISO standard
ISO/IEC 9899:1999 for C language.

Assuming it's true (and I have no reason to doubt it), this is good
news.

(I'll reply and ask them to fix the grammatical error.)

I am surprised Intel made this kind of previous mistake, and this
grammatical error.

Ioannis Vranos · Apr 4, 2008

Keith said:
Intel got back to me, and they've fixed the page. It now says:

The Intel(R) C++ Compilers conforms to the ANSI/ISO standard
ISO/IEC 9899:1999 for C language.

Assuming it's true (and I have no reason to doubt it), this is good
news.

(I'll reply and ask them to fix the grammatical error.)

I am surprised Intel made this kind of the previous mistake.

Keith Thompson · Apr 4, 2008

Ioannis Vranos said:
I am surprised Intel made this kind of the previous mistake.

Intel, much like Soylent Green, is made of people. We all make
misteaks.
