How much memory does malloc(0) allocate?

Eric Sosman

[...]
The wording in the standard above allows malloc(0) to return anything,
you just should not dereference the result.
Not quite "anything." If it returns a non-NULL value, the value

- Must compare unequal to every other valid pointer value,
- Including NULL
- Including prior malloc(0) results not yet free()'d

i don't understand why
"Must compare unequal...
Including prior malloc(0) results not yet free()'d"

See 7.22.3p1: "Each such allocation shall yield a pointer
to an object disjoint from any other object."
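
The distinctness rule can be checked with a small sketch like the one below. The helper name `zero_allocs_are_distinct` is made up for illustration and is not from the thread. It only tests the case where both calls return non-null, since a null result for size 0 is also permitted.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper (name invented for this example): returns true if
 * two simultaneously live malloc(0) results obey the distinctness rule. */
bool zero_allocs_are_distinct(void)
{
    void *a = malloc(0);
    void *v = malloc(0);
    bool ok;

    if (a == NULL || v == NULL) {
        /* A null result is allowed for size 0; the distinctness rule
         * only applies to non-null results. */
        ok = true;
    } else {
        /* Both allocations are live at once, so they must differ. */
        ok = (a != v);
    }

    free(a);   /* free(NULL) is a no-op, so both calls are safe */
    free(v);
    return ok;
}
```

Neither pointer is dereferenced here; comparing and freeing them is all the code does.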
 
James Kuyper

On 7/26/2013 1:27 PM, James Kuyper wrote: ....
It occurs to me that there's a third option, not allowed by the standard
for malloc(), that is available for your wrapper. ...
....

I think 7.22.3p1 forbids this dodge:

"If [malloc(0) returns non-null] the behavior is as if
the size were some nonzero value, [...]"

and (a few lines earlier)

"Each such allocation shall yield a pointer to an object
disjoint from any other object."

When the size is non-zero, a successful allocation must be
distinct from all other successful allocations (and from all
other valid pointers, null and non-null). That's part of the
behavior of a successful malloc(), and if malloc(0) undertakes
to imitate that behavior it must imitate the uniqueness, too.

See above.

The wrapper is not required to follow the same rules that malloc()
itself is bound by. Allowing behavior that's different from that allowed
for the wrapped function is one of the most common reasons for writing a
wrapper.
 
Malcolm McLean

That last sentence was probably a bit overstated and dismissive of some
real concerns.

I think that most code that calls malloc does so in such a way that it
*cannot* call malloc(0). Typically the argument is some positive
multiple of sizeof something, and sizeof always yields a positive
result.

But if malloc is used to allocate a dynamically sized buffer, it can be
possible for the size to be 0 -- and in that case, well-written code has
to take extra care to allow for the implementation-defined result of
malloc(0). "Well-written code" is code that takes this extra care so
that it doesn't have to care what malloc(0) returns.
The poet and divine John Donne condemned the newfangled introduction of
zero in a sermon.
The null or empty case is often difficult. For instance if we are pasting
the empty image into a larger image, obviously it should be a no-op.
But what if we are stretching an image to 256x256 and displaying it?
What should that routine do if passed the empty image? Should a 2x0 image
compare as equal to a 0x3 image? There aren't any obvious answers and,
unless null images are used extensively, code authors are unlikely to
document what happens in the null case.
malloc(0) not being properly defined just adds one more gotcha. It's
unsatisfactory. Of course it's possible to implement correct behaviour on
top of it, but at the cost of cluttering every allocation request with a
test for zero.
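
One way to keep that test for zero out of every call site is to put it in a single wrapper. The following is only a sketch under assumptions: the name `alloc_nonzero` is hypothetical, and rounding a zero request up to one byte is just one possible policy, not something the thread prescribes.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrapper (name invented for this example): never passes 0
 * to malloc(), so a null return always means the allocation failed. */
void *alloc_nonzero(size_t size)
{
    /* Assumed policy: treat a request for 0 bytes as a request for 1. */
    return malloc(size != 0 ? size : 1);
}
```

A caller would use it exactly like malloc(), for example `p = alloc_nonzero(n);` followed by the usual null check and a later `free(p)`. The caller still must not access more than the `n` bytes it asked for.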
 
Joe Pfeiffer

Keith Thompson said:
That last sentence was probably a bit overstated and dismissive of some
real concerns.

I think that most code that calls malloc does so in such a way that it
*cannot* call malloc(0). Typically the argument is some positive
multiple of sizeof something, and sizeof always yields a positive
result.

But if malloc is used to allocate a dynamically sized buffer, it can be
possible for the size to be 0 -- and in that case, well-written code has
to take extra care to allow for the implementation-defined result of
malloc(0). "Well-written code" is code that takes this extra care so
that it doesn't have to care what malloc(0) returns.

And yes, this implementation-defined behavior is annoying. If the C
library were being designed from scratch today, I'm sure that the
behavior of malloc(0) would be defined one way or the other. It's the
way it is (presumably) because early pre-standard implementations did
not behave consistently, and the authors of the standard wanted to avoid
breaking code that depended on one particular behavior. (Such code was
already non-portable, but not all C code has to be portable.)

To me, it seems like code shouldn't try to malloc(0). There have been
examples in this thread of programs that attempt to allocate an empty
image; it still seems to me that if you're trying to allocate 0 bytes
by the time you get to the malloc() call, you've failed to adequately
check your inputs.
 
James Kuyper

On 07/26/2013 10:23 PM, Joe Pfeiffer wrote:
....
image; it still seems as if you're trying to allocate 0 bytes by the
time you get to the malloc() call you've failed to adequately check your
inputs.

Whether it's a "failure" or "inadequately checked" depends upon how it
would be used. If, without having to write any special-case code, the
simple fact that the amount of memory needed is zero guarantees that a
given program will never dereference the pointer (which is fairly
plausible), it's no problem if that pointer happens to be null. The
program could either avoid calling malloc() when the size is zero, or
avoid panicking when malloc returns 0 if the size is zero. I would favor
avoiding the call, but I can't see anything horribly wrong with avoiding
the panic, instead.
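
The two alternatives can be sketched as follows. The function names and the `bool`-plus-out-parameter interface are assumptions made for this illustration and do not come from the thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Option 1 (hypothetical name): avoid calling malloc() when n is zero.
 * On success *out is NULL for n == 0, otherwise a fresh allocation. */
bool make_buffer_skip_call(size_t n, unsigned char **out)
{
    if (n == 0) {
        *out = NULL;
        return true;
    }
    *out = malloc(n);
    return *out != NULL;
}

/* Option 2 (hypothetical name): call malloc() regardless, but treat a
 * null result as failure only when a nonzero size was requested. */
bool make_buffer_tolerate_null(size_t n, unsigned char **out)
{
    *out = malloc(n);
    return *out != NULL || n == 0;
}
```

In both cases the caller can pass `*out` to free() afterwards, because free(NULL) does nothing, and must not dereference it when `n` is zero.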
 
Bart van Ingen Schenau

if a=malloc(0) and v=malloc(0)
why a!=v
if both a and v point to 0 space mem
Because a and v point to *different* blocks of memory that can hold 0
bytes of data.

There are two ways that malloc(0) can work:
- It always fails (returns NULL)
- It works exactly like malloc(X) for X>0 (so there is nothing special
about allocating 0 bytes)

Bart v Ingen Schenau
 
Rosario1903

Because a and v point to *different* blocks of memory that can hold 0
bytes of data.

i think malloc() function i use would return one address
in response to malloc(0), each different each call of malloc(0)

but why allocate all these different blocks of mem?
only because standard say so?
 
Kleuske

Not quite "anything." If it returns a non-NULL value, the value

- Must compare unequal to every other valid pointer value,
- Including NULL
- Including prior malloc(0) results not yet free()'d

- Must be convertible to any data pointer type and back again
to void* without damage

So, maybe s/anything/anything within reason/ ?

Ok. Almost anything. It remains tricky, though.
The C99 Rationale (I haven't seen a C11 version yet) explains
the Committee's thinking; see section 7.20.3.


Fine -- But in your usage, the assert should precede the
call to malloc(), and not depend on the returned value.

Of course.
 
Kleuske

The C99 Rationale (I haven't seen a C11 version yet) explains
the Committee's thinking; see section 7.20.3.

Thanks for that, it is enlightening.

The rationale is "we do not wish to break existing code, reported to
be in widespread use." (paraphrased). The committee doesn't seem to be
very happy about that coding practice.

The C89-committee was a bit wiser and refused to accept zero byte
objects, resulting in a "quiet change" whenever programs rely on zero-
byte allocations. Hence the resulting mishmash, which is, arguably, the
worst of both worlds.

I hope a future committee will be even wiser and force a bit of
maintenance on programs that rely on zero-byte allocations.
 
Eric Sosman

i think malloc() function i use would return one address
in response to malloc(0), each different each call of malloc(0)

That's all right, if the address returned is NULL.
but why allocate all these different blocks of mem?
only because standard say so?

Why allow `short'? Only because standard say so?

That is, "The Standard requires behavior X" is a very
good reason for an implementation to behave X-ishly.
 
Eric Sosman

Thanks for that, it is enlightening.

The rationale is "we do not wish to break existing code, reported to
be in widespread use." (paraphrased). The committee doesn't seem to be
very happy about that coding practice.

The C89-committee was a bit wiser and refused to accept zero byte
objects, resulting in a "quiet change" whenever programs rely on zero-
byte allocations. Hence the resulting mishmash, which is, arguably, the
worst of both worlds.

I think you're misreading the Rationale. No edition of the
Committee -- for ANSI, C99, C11, or the various intermediate
TC's and amendments -- supported the idea of a zero-size object.
But all of them, starting with ANSI, allowed zero-byte allocations.
The "quiet change" two-and-a-half decades ago amounted to

- Code that assumes malloc(0)==NULL may break

- Code that assumes malloc(0)!=NULL may break

.... which wasn't really a "change" at all, since code that made
either assumption might break when moved from one system to
another.
I hope a future committee will be even wiser and force a bit of
maintenance on programs that rely on zero-byte allocations.

Just the way C99 "forced" variable-length arrays on, for
example, Microsoft?

If a Committee were to issue a Standard whose adoption would
require reviewing and patching a billion lines of C (my off-the-
cuff estimate, probably low) for no benefit beyond "It'll be
cleaner when you're finished," how eager do you think anyone
would be to adopt it? Such a Standard would be a dead letter.
 
Kleuske

Just the way C99 "forced" variable-length arrays on, for
example, Microsoft?

If a Committee were to issue a Standard whose adoption would
require reviewing and patching a billion lines of C (my off-the-cuff
estimate, probably low) for no benefit beyond "It'll be cleaner when
you're finished," how eager do you think anyone would be to adopt it?
Such a Standard would be a dead letter.

You're right. Just live with the history and learn from it. It isn't C's
only quirk.
 
Heinrich Wolf

Lynn McGuire said:
How much memory does malloc(0) allocate?

I tried with 3 different compilers

Borland Turbo C 2.0 and Borland C++ Builder 5 both return a NULL pointer.
Here I did not try to write bytes to that pointer nor to free the pointer.

Fedora 14 gcc returns some non-NULL pointer. I can write up to 5000 bytes and
maybe more to that pointer without a crash. There's no crash when I have
written at most 13 bytes to that pointer and then free it. Otherwise there is
a crash on free.
 
James Kuyper

I tried with 3 different compilers

Borland Turbo C 2.0 and Borland C++ Builder 5 both return a NULL pointer.
Here I did not try to write bytes to that pointer nor to free the pointer.

Fedora 14 gcc returns some non-NULL pointer. I can write up to 5000 bytes and
maybe more to that pointer without a crash. There's no crash when I have
written at most 13 bytes to that pointer and then free it. Otherwise there is
a crash on free.

Keep in mind that overwriting the end of allocated memory can cause very
serious problems that DO NOT have to include crashing your program. The
single most common consequence is that if your program, either directly
or indirectly, uses the malloc() family to allocate two or more
different blocks of memory at the same time, overwriting the end of one
will result in writing to one of the others. Since the behavior of such
code is undefined, the compiler need not keep track of such overwrites,
so data extracted from the other block might currently be copied to a
register. As a result, it might appear to your program that no overwrite
has occurred. It could be quite some time before the program needs to
retrieve data from the actual overwritten memory, and even longer before
that fact causes noticeable malfunctions (in some cases, the damage can
be so subtle that it isn't noticed for years). That can make it very
difficult to track down where the overwrite occurred.

One of the two most common ways I know of for implementing the malloc()
family requires storing heap-management information about each allocated
block of memory in a location that is adjacent to that block. In that
case, overwriting the end of one block will generally damage the
heap-management information associated with the next block. That can
cause the malloc() family of functions to malfunction wildly.

Don't assume your code is safe just because it doesn't crash. When
malloc(0) returns a non-null value, the only part of the returned memory
you can portably assume is writable is the very first byte in that block.
 
Eric Sosman

[...]
Don't assume your code is safe just because it doesn't crash. When
malloc(0) returns a non-null value, the only part of the returned memory
you can portably assume is writable is the very first byte in that block.

That's "portable" in the sense of "likely to work," not in the
sense of "guaranteed to work." The Standard does not require that
the zero-byte allocation be writable or even readable. If such an
attempt is made the behavior is undefined:

7.22.3p1: "If the size of the space requested is zero, [...]
the returned pointer shall not be used to access an object."
 
Geoff

[...]
Don't assume your code is safe just because it doesn't crash. When
malloc(0) returns a non-null value, the only part of the returned memory
you can portably assume is writable is the very first byte in that block.

That's "portable" in the sense of "likely to work," not in the
sense of "guaranteed to work." The Standard does not require that
the zero-byte allocation be writable or even readable. If such an
attempt is made the behavior is undefined:

7.22.3p1: "If the size of the space requested is zero, [...]
the returned pointer shall not be used to access an object."

Is this another case of "trust the programmer"? The word "shall"
implies that there is some kind of enforcement of this section. Is the
compiler/runtime required to disallow access of a zero sized object or
is the programmer required to guard against it?


#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

int main(int argc, char *argv[])
{
    void *mem;
    size_t size = 0;

    mem = malloc(size);
    if (mem == NULL) {
        printf("malloc failed\n");
        return EXIT_FAILURE;
    }
    memset(mem, 'A', size);
    free (mem);

    return EXIT_SUCCESS;
}
 
Keith Thompson

Geoff said:
[...]
Don't assume your code is safe just because it doesn't crash. When
malloc(0) returns a non-null value, the only part of the returned memory
you can portably assume is writable is the very first byte in that block.

That's "portable" in the sense of "likely to work," not in the
sense of "guaranteed to work." The Standard does not require that
the zero-byte allocation be writable or even readable. If such an
attempt is made the behavior is undefined:

7.22.3p1: "If the size of the space requested is zero, [...]
the returned pointer shall not be used to access an object."

Is this another case of "trust the programmer"? The word "shall"
implies that there is some kind of enforcement of this section. Is the
compiler/runtime required to disallow access of a zero sized object or
is the programmer required to guard against it?

In this case, "shall" doesn't imply enforcement. Violating a "shall"
that's not part of a constraint means the program has undefined
behavior.

Section 4 of the C standard says:

In this International Standard, "shall" is to be interpreted as a
requirement on an implementation or on a program; conversely, "shall
not" is to be interpreted as a prohibition.

If a "shall" or "shall not" requirement that appears outside of a
constraint or runtime-constraint is violated, the behavior is
undefined. Undefined behavior is otherwise indicated in this
International Standard by the words "undefined behavior" or by the
omission of any explicit definition of behavior. There is no
difference in emphasis among these three; they all describe
"behavior that is undefined".
 
Eric Sosman

[...]
Don't assume your code is safe just because it doesn't crash. When
malloc(0) returns a non-null value, the only part of the returned memory
you can portably assume is writable is the very first byte in that block.

That's "portable" in the sense of "likely to work," not in the
sense of "guaranteed to work." The Standard does not require that
the zero-byte allocation be writable or even readable. If such an
attempt is made the behavior is undefined:

7.22.3p1: "If the size of the space requested is zero, [...]
the returned pointer shall not be used to access an object."

Is this another case of "trust the programmer"? The word "shall"
implies that there is some kind of enforcement of this section.

No, although you need more context than I quoted to make
the determination. The quoted "shall" is not inside a constraint or
runtime-constraint section, so the implementation is not required to
detect violations or take any particular action if they occur. All
you get is "the behavior is undefined" (4p2).
Is the
compiler/runtime required to disallow access of a zero sized object or
is the programmer required to guard against it?

Neither. The implementation can do whatever it pleases ("the
behavior is undefined"). On the other hand, the programmer is not
required to avoid undefined behavior! Perhaps the programmer knows
how the implementation at hand will respond to a particular violation,
and wants to elicit that response.

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

int main(int argc, char *argv[])
{
    void *mem;
    size_t size = 0;

    mem = malloc(size);
    if (mem == NULL) {
        printf("malloc failed\n");
        return EXIT_FAILURE;
    }
    memset(mem, 'A', size);
    free (mem);

    return EXIT_SUCCESS;
}

I'm not sure what you're trying to demonstrate here. As has
already been pointed out, the NULL return in this case need not
indicate a "failure" of malloc(). Bailing out might be a good idea,
though, because `memset(NULL, 'A', 0)' invokes undefined behavior
(7.1.4p1; also, you need to #include <string.h> to declare memset).
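
A variant of the quoted program that takes those points into account might look like the sketch below. It is written as a function rather than main() so the size can be varied. The name `fill_and_free` is hypothetical, and the choice to treat a null result for size 0 as success is an assumption of this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>   /* declares memset() */

/* Hypothetical helper (name invented for this example): allocate size
 * bytes, fill them with 'A', and release them again. */
int fill_and_free(size_t size)
{
    void *mem = malloc(size);

    /* A null result only counts as failure for a nonzero request. */
    if (mem == NULL && size != 0) {
        fprintf(stderr, "malloc failed\n");
        return EXIT_FAILURE;
    }

    /* Skip memset() for size 0: mem may be null, and even a non-null
     * zero-size result must not be used to access an object. */
    if (size != 0) {
        memset(mem, 'A', size);
    }

    free(mem);   /* free(NULL) is a no-op */
    return EXIT_SUCCESS;
}
```

Calling `fill_and_free(0)` takes the same path whether the implementation returns null or a unique pointer for malloc(0).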
 
James Kuyper

[...]
Don't assume your code is safe just because it doesn't crash. When
malloc(0) returns a non-null value, the only part of the returned memory
you can portably assume is writable is the very first byte in that block.

That's "portable" in the sense of "likely to work," not in the
sense of "guaranteed to work." The Standard does not require that
the zero-byte allocation be writable or even readable. If such an
attempt is made the behavior is undefined:

7.22.3p1: "If the size of the space requested is zero, [...]
the returned pointer shall not be used to access an object."

I'd forgotten about that clause when I wrote the above paragraph. The
clause that I was thinking about corresponds to the "..." in the above
citation, so I really had no very good excuse for thinking about one
without the other, but that's what I did.
Is this another case of "trust the programmer"? The word "shall"
implies that there is some kind of enforcement of this section. Is the
compiler/runtime required to disallow access of a zero sized object or
is the programmer required to guard against it?

There's only one circumstance in which the standard mandates rejection
of a program: if it contains a correctly formatted #error directive that
survives conditional compilation.

ISO has no enforcement arm, no authority to create one. A false claim of
conformance to the C standard might be legally actionable in many
countries, but no more so than any other kind of false advertising.

The standard is, in effect, a contract. It never prohibits anything - it
just specifies what a conforming implementation of C is (and is not)
required to do when asked to translate and execute a program. An
implementation isn't prohibited from violating those requirements, it
just fails to be conforming if it does so. A program can have syntax
errors, constraint violations, or undefined behavior, but you're not
prohibited from writing such code. It's a bad idea to write such code,
but not because it's prohibited - it's simply because the standard
doesn't guarantee that such code will do what you want it to do.

A program that violates a "shall" that occurs in a normative section of
the C standard, but outside of a constraint section (as is the case
here) has undefined behavior. That means the C standard imposes no
requirements of any kind on how a conforming implementation may deal
with it. Therefore, if there is anything, anything at all, that you
don't want your program to do, then you should not write such code,
because it's possible, at least in principle, that your program will do
one of the things you don't want it to do.
 
BGB


FWIW, in my custom allocator.

previously:
allocating 0-7 bytes resulted in a 16-byte (1 cell) allocation, and 8-23
would allocate 32 bytes (2 cells).

currently:
0 will allocate 16 bytes, 1-15 will allocate 32, 16-31 will allocate 48
bytes, ...

the first 8/16 bytes are actually the object header.


the change was mostly due to allowing larger allocations, and also the
ability to track source location (file name, line number), ...

previously, there was no support for source file/line tracking, and
allocations were limited to 1GB (now 64PB, on 64-bit targets).

another expansion is now there are (theoretically) up to 16M unique
object types, though the current limit is still a bit smaller (types IDs
are currently hash-indices, and 16M entries would be *huge* for a
hash-table). the current limit may go away if I switch over to a
hash-chained-array or similar.

the headers also contain a small check-value (header hash), used to
detect corrupted object headers.


note:
unlike "malloc()", memory objects have an associated type-name, which is
usable for things like run-time type-checks for pointers (among other
things).

the use of source-file/line information is mostly for things like
debugging and leak-detection (IOW: help trying to figure out where
memory is being leaked from).


allocator statistics have generally shown that small objects (< 4kB)
represent a good portion of the total memory use, *1, but currently with
a big spike at 32kB (one of the major subsystems allocates a lot of 32kB
arrays).

*1: roughly forming a Gaussian distribution centered on 0, with millions
of small objects.

currently, typically, it is dealing with heap-sizes of around 500MB to 2GB.
 
