Alignment of pointer to void

David Deharbe

Hi,

Assuming that a pointer to void is always a multiple of 4 provides an
opportunity to use as tags the two bits that are always 0. Knowing that
several experts on the ISO99 C standard should be monitoring this
group, I would like to know if this assumption complies with this
standard before using it in a project. Also, if this is the case, where
does the standard state this?

Best regards,

David.
--
 
pete

David said:
> Hi,
>
> Assuming that a pointer to void is always a multiple of 4 provides an
> opportunity to use as tags the two bits that are always 0. Knowing that
> several experts on the ISO99 C standard should be monitoring this
> group, I would like to know if this assumption complies with this
> standard before using it in a project.

It doesn't.

N869
6.2.5 Types
[#27] A pointer to void shall have the same representation
and alignment requirements as a pointer to a character type.

and that's all that the standard says about the representation
of pointer to void.
 
Skarmander

David said:
> Assuming that a pointer to void is always a multiple of 4 provides an
> opportunity to use as tags the two bits that are always 0. Knowing that
> several experts on the ISO99 C standard should be monitoring this
> group, I would like to know if this assumption complies with this
> standard before using it in a project. Also, if this is the case, where
> does the standard state this?

If by "complies" you mean "is guaranteed by", then no, this is not the case.

If you mean that it's legal for an implementation to have pointers to void
for which the binary representation has the two least significant bits
unset, then yes.

If you mean that it's legal for an implementation to have pointers to void
for which the binary representation *always* has the two least significant
bits unset, then yes.

If you're asking whether it's possible to write a C program that assumes
either is the case, then also yes, but with the proviso that such a C
program cannot be strictly conforming. Conversions from pointers to integers
and back are implementation-defined at best and undefined at worst.
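
To make the non-portable variant concrete, here is a minimal sketch of what such tagging tends to look like. The names TAG_MASK, tag_set, tag_get and tag_strip are made up for illustration. It assumes that the implementation provides uintptr_t (which is optional), that the converted value has its two low-order bits clear, and that converting the modified integer back yields something usable; none of this is guaranteed by the standard.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, NOT portable: they assume uintptr_t exists and
   that the integer obtained from the pointer has its two low bits clear. */
#define TAG_MASK ((uintptr_t)3)

/* Store a 2-bit tag in the (assumed) unused low-order bits. */
static void *tag_set(void *p, unsigned tag)
{
    uintptr_t bits = (uintptr_t)p;
    assert((bits & TAG_MASK) == 0);  /* fails where the assumption is wrong */
    assert(tag <= 3u);
    return (void *)(bits | tag);
}

/* Read the tag back. */
static unsigned tag_get(void *p)
{
    return (unsigned)((uintptr_t)p & TAG_MASK);
}

/* Recover the original pointer; a tagged pointer must never be
   dereferenced without stripping the tag first. */
static void *tag_strip(void *p)
{
    return (void *)((uintptr_t)p & ~TAG_MASK);
}
```

The assertion in tag_set at least makes the program fail loudly, rather than silently, on an implementation where the assumption does not hold.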

If you're asking in general whether it's a good idea to do this, then
emphatically no. Your program will be chained to a relatively small set of
circumstances, it will have to be careful doing low-level manipulations, and
the gain is questionable. If you need to store extra information with a
pointer, consider using a struct for each individual pointer or a hashtable.
(Solving the problem of how to hash pointers to void as portably as possible
is still more rewarding than manipulating the representation directly.)
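
As a rough sketch of the struct alternative (the names tagged_ptr and tp_make are hypothetical, chosen only for this example), the tag simply lives next to the pointer instead of inside it:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: the pointer is stored untouched and the extra
   bits are kept beside it, so no representation tricks are needed. */
struct tagged_ptr {
    void *ptr;
    unsigned char tag;
};

/* Build a tagged pointer; two tag bits, as in the original question. */
static struct tagged_ptr tp_make(void *p, unsigned tag)
{
    struct tagged_ptr t;
    assert(tag < 4u);
    t.ptr = p;
    t.tag = (unsigned char)tag;
    return t;
}
```

This costs some memory per pointer (typically the tag plus padding), but it is strictly conforming and works the same on every implementation.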

S.
 
Roberto Waltman

pete said:
> David said:
>> Assuming that a pointer to void is always a multiple of 4 provides an
>> opportunity to use as tags the two bits that are always 0. Knowing that
>> several experts on the ISO99 C standard should be monitoring this
>> group, I would like to know if this assumption complies with this
>> standard before using it in a project.
>
> It doesn't.
>
> N869
> 6.2.5 Types
> [#27] A pointer to void shall have the same representation
> and alignment requirements as a pointer to a character type.
>
> and that's all that the standard says about the representation
> of pointer to void.

(In my copy of the standard, ISO/IEC 9899:1999(E), that is paragraph 26
of 6.2.5, not 27. I wonder what changed.)

A pointer to void returned by malloc() etc. has stronger alignment
requirements (see below; this is probably the source of the wrong
assumption), but that does not hold for just any void pointer. It may be
a multiple of 4 in a particular environment, but of course any dependency
on that fact will make your code non-portable.

7.20.3
"The pointer returned if the allocation succeeds is suitably aligned
so that it may be assigned to a pointer to any type of object..."
 
Dik T. Winter

....
> If you're asking in general whether it's a good idea to do this, then
> emphatically no. Your program will be chained to a relatively small set of
> circumstances, it will have to be careful doing low-level manipulations, and
> the gain is questionable. If you need to store extra information with a
> pointer, consider using a struct for each individual pointer or a hashtable.

Indeed. The assumption that for some particular pointer type some low order
bits were always 0 and could be used for administrative purposes led to
a headache when porting the Korn shell to the Cray.
 
Dik T. Winter

> 7.20.3
> "The pointer returned if the allocation succeeds is suitably aligned
> so that it may be assigned to a pointer to any type of object..."

But suitable alignment does *not* imply that the low order bits are 0.
For instance, on the Cray it implies that the high order 16 bits are 0
(they contain the byte in word pointer).
 
pete

Dik said:
> But suitable alignment does *not* imply that the low order bits are 0.
> For instance, on the Cray it implies that the high order 16 bits are 0
> (they contain the byte in word pointer).

The concepts of low order bits or value bits or padding bits,
are not applied to pointers by the standard.
 
Roberto Waltman

Dik T. Winter said:
> ...
>
> But suitable alignment does *not* imply that the low order bits are 0.
> For instance, on the Cray it implies that the high order 16 bits are 0
> (they contain the byte in word pointer).

No disagreement here. What Cray model/line are you referring to?
The Crays are one of the architectures often mentioned in c.l.c as
examples of environments where common (but wrong) assumptions break your
code. I would like to read the C manual for other implementations'
"oddities".
 
Dik T. Winter

> No disagreement here. What Cray model/line are you referring to?

The Cray-1 and successors (i.e. those based on the original architecture).

> The Crays are one of the architectures often mentioned in c.l.c as
> examples of environments where common (but wrong) assumptions break your
> code. I would like to read the C manual for other implementations'
> "oddities".

I do not know whether such manuals are available online or offline. One
oddity is that there is no division instruction, so the quotient can be
wrong in the two low order bits. In the course of time I have used four
different architectures that would break common assumptions.
(1) Cray-1 and successors. Pointers are (64-bit) word pointers. A char
    pointer is constructed by putting the char number in the high order
    16 bits. Also no division instruction, so the quotient could be
    quite a bit wrong. Integers contain padding bits (the high order
    16 bits of the 64 bit word).
(2) Data General MV series. A char pointer would have the low order
    24 (I think) bits as byte pointer and the high order 8 bits as
    "ring number", which would be non-zero for user programs (and so
    NULL is not all bits 0). Any other pointer would have the (16-bit)
    word address in the low order 23 bits, next 8 bits for the ring number
    and one bit that indicates indirection. On that machine with c a
    char pointer, the cast (int *)c was certainly *not* a no-op.
(3) CDC 205. Every pointer was a bit pointer. So a char pointer would
    have the lower three bits 0. Also 0.0 (when normalised) would not
    be all bits zero. Also on this machine (a == b) == (b == a) could
    be false (there was asymmetry in the instruction).
(4) Intel i960. No division instruction, so division could be a bit off
    (but not as far off as the Cray).
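
One practical consequence of item (2), sketched with a made-up struct node and helper node_init: when NULL need not be all bits zero, pointers should be set by assigning NULL. Zeroing the bytes with memset() or calloc() is not guaranteed to produce null pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tree node, used only for illustration. */
struct node {
    struct node *left;
    struct node *right;
    int key;
};

/* Portable initialisation: assigning NULL yields a null pointer whatever
   its bit pattern is. By contrast, memset(n, 0, sizeof *n) only produces
   all-bits-zero, which need not be a null pointer on every machine. */
static void node_init(struct node *n, int key)
{
    assert(n != NULL);
    n->left = NULL;
    n->right = NULL;
    n->key = key;
}
```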
 
Keith Thompson

Dik T. Winter said:
> (1) Cray-1 and successors. Pointers are (64-bit) word pointers. A char
>     pointer is constructed by putting the char number in the high order
>     16 bits. Also no division instruction, so the quotient could be
>     quite a bit wrong. Integers contain padding bits (the high order
>     16 bits of the 64 bit word).

My experience on the Cray T90 is that the byte offset is stored in the
high order 3 bits of the 64-bit word. (Presumably the other 13 bits
of the top 16 are always 0; you'd never need more than 48 bits to
address all of memory.) So saying that the offset is stored in the
high 16 bits is correct, but imprecise.

("High-order bits" are not, of course, a concept defined by the C
standard for pointers.)

It would be nice if C provided ways to look at the alignments of
pointers without depending on their representation. I think it would
suffice to define a macro MAX_ALIGN specifying the maximum meaningful
alignment, and a "%" operator that takes a pointer and an integer,
defined only where the integer is a power of 2 no greater than
MAX_ALIGN. (But this assumes everything is done in powers of 2, which
might not be the case.)
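
For comparison, a later revision of the standard (C11) added max_align_t in <stddef.h> and alignof in <stdalign.h>, which cover the "maximum meaningful alignment" half of this idea. It did not add an alignment operator on pointers, so a check like the sketch below still goes through uintptr_t and therefore still depends on the implementation-defined pointer-to-integer conversion. The helper name is_aligned is hypothetical.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reports whether p is a multiple of align.
   Assumes uintptr_t exists and that the converted value behaves like a
   flat byte address, which the standard does not promise. */
static int is_aligned(const void *p, size_t align)
{
    /* align must be a power of 2 ... */
    assert(align != 0 && (align & (align - 1)) == 0);
    /* ... and no greater than the maximum fundamental alignment. */
    assert(align <= alignof(max_align_t));
    return ((uintptr_t)p % align) == 0;
}
```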
 
David Deharbe

Thanks for all your answers - that was very helpful. Actually, I first
saw this technique in a BDD implementation (i.e.
http://www.cs.cmu.edu/~modelcheck/bdd.html), and was thinking of using
it to implement balanced trees. Since I strive for portability I will
discard this solution.

Best,

David.
--
 
