Question about malloc, alignment, ...

B

Bhalchandra Thatte

Christian Bau said:
He missed pointer to short, int, long, long long and pointer to struct;
he missed long long as well. All the pointers that I mentioned could in
principle have different size and alignment requirements than void*.

I was trying to convey the main idea of the author, but errors
are mine. Probably the author didn't miss anything.
On one very common architecture where sizeof (double) == 8, it would be
quite reasonable that structs with two consecutive double members have
alignment requirement of 16. And I cannot see anything in the C Standard
that would prevent this from happening.

This is the kind of thing that makes me uncomfortable with that idea.
But then it seems there is no easy general solution without wasting much space.
Thanks.
 
B

Bhalchandra Thatte

Richard Heathfield said:
No. For the alignment thing to be ***guaranteed*** to work on a general
basis, you must list every type used by the program, which includes all
user-defined types (which is why implementing a generalised leak detector
is much harder than it looks). It would be sufficient, in *your* case, to
include

struct header hdr;

and

struct record rcd;

into the alignment structure. But of course my solution doesn't require you
to include any of the other types (into your program, to solve your current
problem). A simple union of struct header and struct record is sufficient.

It won't work if I want to use the same pool for different applications.
Now I see why Knuth does it somewhat differently. He puts the header
at the end, (that is after the allocation for structs). Since the header is
fixed in all applications, you know its alignment requirement.
Based on his idea, probably I can do something like this:
(Errors in the following are of course mine, not Knuth's.)

struct header {
void* first;
header* next;
};

struct record {
....
}

/* to allocate for num records */
unsigned int n = sizeof(record)*num;
unsigned int s = sizeof(header);
unzigned int offset = (n%s) ? s - n%s : 0;
void* x = malloc(n + offset + s);
header* head = (header *) x+n+offset;
head->first = x;
head->next = ...

This is definitely better than putting the header in the
beginning of the block.

Thanks.
 
T

The Real OS/2 Guy

Consider:

================
#include <stdlib.h>

struct one { char ch; };
struct two { long l; };

char *fred = malloc(sizeof(one) + sizeof(two));

struct two *ptr2 = fred + sizeof(one);

is identical with:

strut one *o = (struct one *) fred;
ptr2 = (struct two*) (o +1),

When you put your compier in the sharpest warning level you needs the
casting. ALL warnings are really helpful! But when you knows exactly
what you does you can avoid the warning by saying the compiler: be
quirte, I know what I do.
long x = ptr2->l;
================

Code equivalent to the above will generate an alignment fault on some
architectures.

Never - except

struct two t2[2];
t2[1].l = 0;

gives the same exception - but then your compiler is buggy.

Each struct is defined on an aligned address - ever! You must do some
undefined pointer arithmetic to get an misaligned pointer.
 
A

Arthur J. O'Dwyer

is identical with:

struct one *o = (struct one *) fred;
struct two *ptr2 = (struct two*) (o + 1);

Modulo your typos, which I've fixed above, yes.
When you put your [compiler] in the sharpest warning level you [need]
the casting. ALL warnings are really helpful! But when you [know]
exactly what you [are doing] you can avoid the warning by saying [to]
the compiler: be [quiet], I know what I [am doing].

While this is also true, I don't see what casting has to do with
anything. As you said, the two code samples are basically equivalent.
Neither one contains any compiler error. Both contain undefined
behavior.
long x = ptr2->l;
================

Code equivalent to the above will generate an alignment fault on some
architectures.

Never - except

struct two t2[2];
t2[1].l = 0;

gives the same exception - but then your compiler is buggy.

That doesn't make any sense. My compiler is *not* buggy, and I don't
see any reason to suspect that Jack's is, either. The code above,
using t2[1], exhibits perfectly well-defined behavior. The code
that Jack posted, and the equivalent code that you posted, using
ptr2, exhibits undefined behavior.
Each struct is defined on an aligned address - [always]!

This is patently false. Consider again:

struct foo { char c; };
struct bar { int i; };

On many architectures, 'foo' will have an alignment restriction
of 1 or 2 bytes, and 'bar' will have an alignment restriction of
4 bytes. This is not theoretical.
You must do some
undefined pointer arithmetic to get an misaligned pointer.

True, but vacuous. Consider:

char c;
void *p = &c;
int *i = p;

Of course this is "undefined", and it's "pointer arithmetic",
but I don't think it's meaningful to call it "undefined pointer
arithmetic", since each operation is perfectly reasonable in
and of itself.
Replace the above with

struct foo c;
void *p = &c;
struct bar *i = p;

and you have the exact same problem - undefined behavior that
your average compiler won't be able to catch.


Indeed.

-Arthur
 
C

Christian Bau

pete said:
What's the reason ?

What can a cpu do with a struct that has two double members,
when the the alignment requirements are for the size of the structure,
that a cpu can't can't do
when the structure alignment requirements are for the size of a member ?

Pentium 4, SSE2 instructions. Load a pair of double precision numbers
into a 16 byte register and know that the pair of doubles is correctly
aligned for this.
 
T

The Real OS/2 Guy

The said:
Now your union has the size of the biggest struct.

Correct. This is why I pointed out that efficiency might be an issue if the
structs were wildly different in size.
When the big struct
is 4 K, your header 20 bytes and the smallest struct is 16 bytes then
you needs more than 400 KB instead 20 + 16 * 100 = 1620 bytes when you
needs header + smallest struct [100].

But, as it turns out, the OP's header is slightly smaller than his struct,
so there is almost no waste of space, and the solution is much simpler than
any other so far proposed.
The idea is really bad.

Your assertion is noted. I disagree entirely with it. The idea of following
the advice slavishly is bad (for example, consider a very large header and
a comparatively small record), but in certain circumstances, including
those of the OP, it is a sensible strategy.
Understund how pointer arithmetic works and your memory usage is
thifty.

Parse error.

Always let the compiler think about alignment - it is its work
to do so.

It's the programmer's job to understand his program.

And for that he has to know how pointer arithmetic works.

For that it is guaranteed that any struct of a given struct type has
equal size. Sou you can easy build single structs and struct arrays
dynamically and not only statically. It is guaranteed that each
pointer to each struct has the same size and aligned well.

sizeof(struct) gives you always the size of a struct. Adding this to a
pinter to char gives you always an well aligned address to start a new
struct - but native pointer arithmetic on the right type does it as
well.

I'd made the test not too long ago by writing an ANSI C program based
on that principe to get it in production on 5 different compilers in 5
different architectures - 16 and 32 bit, some of them were sensitive
against misalignment.
Written and debuged on ONE mashine (really insensitve against
misalignment), compiled and linked on all. Whereas I'd seen the
comilers of the four other mashines the first time as I had to start
to compile the program on that mashines. My knowledge of that mashines
was 0, zero, void at that time. As the program was tested well as I
started to port it nothing was to change.

The program was written completely without #ifdef, pure ANSI C.

Massive pointer arithmetic was needed to get all requiremts done - the
same as I'd desribed in the artikles before.
 
R

Richard Heathfield

Bhalchandra said:
It won't work if I want to use the same pool for different applications.

Very true, but that's new information. I've already explained that it won't
work well in the general case.
Now I see why Knuth does it somewhat differently. He puts the header
at the end, (that is after the allocation for structs).

Um, if it's a header, shouldn't it go at the fr... well, never mind. If it
makes you happy, by all means stick it at the back.
Since the header
is fixed in all applications, you know its alignment requirement.

How would you go about determining this, in a *portable* way? (This could
turn out to be a very interesting sub-thread.)
Based on his idea, probably I can do something like this:
(Errors in the following are of course mine, not Knuth's.)

struct header {
void* first;
header* next;
};

struct record {
...
}

/* to allocate for num records */
unsigned int n = sizeof(record)*num;
unsigned int s = sizeof(header);
unzigned int offset = (n%s) ? s - n%s : 0;
void* x = malloc(n + offset + s);
header* head = (header *) x+n+offset;
head->first = x;
head->next = ...

This is definitely better than putting the header in the
beginning of the block.

If you find it more convenient to put the header at the end, by all means do
so! :)

It is not, however, a portable solution, and it may very well not work on
some platforms.
 
R

Richard Heathfield

The said:
And for that he has to know how pointer arithmetic works.

Yes. It's not difficult.
sizeof(struct) gives you always the size of a struct.

No, it gives you a syntax error.

I don't have time to continue this conversation. Please ask your C teacher
about syntax.
 
L

LibraryUser

Richard said:
The said:
Now your union has the size of the biggest struct.

Correct. This is why I pointed out that efficiency might be an
issue if the structs were wildly different in size.
When the big struct is 4 K, your header 20 bytes and the
smallest struct is 16 bytes then you needs more than 400 KB
instead 20 + 16 * 100 = 1620 bytes when you needs header +
smallest struct [100].

But, as it turns out, the OP's header is slightly smaller than
his struct, so there is almost no waste of space, and the
solution is much simpler than any other so far proposed.

In addition, the explicit case of large header and small items
can be catered to by a struct containing the header and a pointer
to an array of small items. Allocation of this requires two
malloc calls, but all references can then be made via the header
alone. It also allows the luxury (or foolishness) of static
allocation of the header.

You can see an example of this in hashlib, available at:

<http://cbfalconer.home.att.net/download/>

which actually goes through three levels of allocation: the
header, a (variable) sized array of pointers, and a (variable)
sized item with no restrictions.

As Richard has pointed out, there is no substitute for
considering the data to be handled. This may actually require
thought.
 
T

Tim Prince

pete said:
The memory returned by malloc is suitably aligned for
the most restrictive type on your machine.

Not true on IA-32 platforms, unless you "restrict" yourself to types which
are no more "restrictive" than double, and accept mis-aligned doubles as
"suitable" if you should happen to choose a malloc which aligns only ints.
 
P

pete

Tim said:
pete wrote:

Not true on IA-32 platforms,
unless you "restrict" yourself to types which
are no more "restrictive" than double,
and accept mis-aligned doubles as "suitable"
if you should happen to choose a malloc which aligns only ints.

Where are you going to get a malloc that only aligns ints ?

N869
7.20.3 Memory management functions
[#1] The order and contiguity of storage allocated by
successive calls to the calloc, malloc, and realloc
functions is unspecified. The pointer returned if the
allocation succeeds is suitably aligned so that it may be
assigned to a pointer to any type of object and then used to
access such an object or an array of such objects in the
space allocated
 
K

Kevin Easton

LibraryUser said:
Richard said:
The said:
RJH>> > If you must do it this way, the least clumsy way
RJH>> > I can think of that is guaranteed to work is to
RJH>> > make a union of the header and the record itself,
RJH>> > and allocate a bunch of those.

Now your union has the size of the biggest struct.

Correct. This is why I pointed out that efficiency might be an
issue if the structs were wildly different in size.
When the big struct is 4 K, your header 20 bytes and the
smallest struct is 16 bytes then you needs more than 400 KB
instead 20 + 16 * 100 = 1620 bytes when you needs header +
smallest struct [100].

But, as it turns out, the OP's header is slightly smaller than
his struct, so there is almost no waste of space, and the
solution is much simpler than any other so far proposed.

In addition, the explicit case of large header and small items
can be catered to by a struct containing the header and a pointer
to an array of small items. Allocation of this requires two
malloc calls, but all references can then be made via the header
alone. It also allows the luxury (or foolishness) of static
allocation of the header.

You don't need two malloc calls - you simply start the small item array
at the smallest multiple of the small item size that is larger than the
header size from the start of the malloc'ed region (and calculate a
suitable size to malloc in the first place).

- Kevin.
 
T

Tim Prince

pete said:
Where are you going to get a malloc that only aligns ints ?
As supplied with the most popular compilers for Windows.
N869
7.20.3 Memory management functions
[#1] The order and contiguity of storage allocated by
successive calls to the calloc, malloc, and realloc
functions is unspecified. The pointer returned if the
allocation succeeds is suitably aligned so that it may be
assigned to a pointer to any type of object and then used to
access such an object or an array of such objects in the
space allocated
"suitably aligned" is commonly taken as aligned on 4-byte boundaries for
Windows, and 8-byte boundaries for linux, although both OS support the same
data types, on the same hardware, one of which (not a std C data type)
requires 16-byte alignment. The hardware supports mis-aligned doubles, at
potentially great cost in performance. Thus, the definition of suitability
differs among implementations, with a different compromise between space
and performance.
 
L

LibraryUser

Kevin said:
LibraryUser said:
Richard said:
The Real OS/2 Guy wrote:

RJH>> > If you must do it this way, the least clumsy way
RJH>> > I can think of that is guaranteed to work is to
RJH>> > make a union of the header and the record itself,
RJH>> > and allocate a bunch of those.

Now your union has the size of the biggest struct.

Correct. This is why I pointed out that efficiency might be an
issue if the structs were wildly different in size.

When the big struct is 4 K, your header 20 bytes and the
smallest struct is 16 bytes then you needs more than 400 KB
instead 20 + 16 * 100 = 1620 bytes when you needs header +
smallest struct [100].

But, as it turns out, the OP's header is slightly smaller than
his struct, so there is almost no waste of space, and the
solution is much simpler than any other so far proposed.

In addition, the explicit case of large header and small items
can be catered to by a struct containing the header and a pointer
to an array of small items. Allocation of this requires two
malloc calls, but all references can then be made via the header
alone. It also allows the luxury (or foolishness) of static
allocation of the header.

You don't need two malloc calls - you simply start the small item
array at the smallest multiple of the small item size that is
larger than the header size from the start of the malloc'ed region
(and calculate a suitable size to malloc in the first place).

While you can play such foolish games, it is error prone and a
maintenance nightmare. It is also totally unnecessary.
 
K

Kevin Easton

LibraryUser said:
While you can play such foolish games, it is error prone and a
maintenance nightmare. It is also totally unnecessary.

I don't believe it is either - the correct offset can be calculated
using sizeof so that it will continue to be correct after the object
sizes are changed, and I think it's more error-prone to use two mallocs
(and hence require two free()s).

- Kevin.
 
L

LibraryUser

Kevin said:
I don't believe it is either - the correct offset can be calculated
using sizeof so that it will continue to be correct after the object
sizes are changed, and I think it's more error-prone to use two mallocs
(and hence require two free()s).

IMNSHO the optimum method is to create two functions, something
like:

thing *creatething(size_t sizerequired);
void destroything(thing *thething);

which handle all the interrelationships and strictly limit any
error propagation. The application will then not need to know
anything about the organization of a thing, provided that
suitable functions are provided to operate on it. Once more, see
my hashlib package for an example.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Similar Threads


Members online

Forum statistics

Threads
474,078
Messages
2,570,572
Members
47,204
Latest member
MalorieSte

Latest Threads

Top