a variable use of variables is what i'm after i think

B

ben

is there any way in c to write code that can variably make use of one of
two structs (one that has 32 bit vals and the other that has 64 bit
vals) throughout the code? i'm writing some code that parses some data.
there's a few data types whose max values are 10^10 which requires 34
bits, so a u_int64_t type of variable would be needed to hold them
while using/manipulating. (as i write this the more i think about it
the more i think what i'm hoping for isn't on at all and the answer is
no but anyway..). nearly always, or even always (just with the tiny,
tiny, incredibly remote chance they won't) the values in question will
fit into 32 bit values fine. on a 32 bit machine dealing with 64 bit
values throughout the code is going to slow things up i think, so what
i'm wondering is, is there any way to variably use 32 bit values and 64
bit values somehow? the only way i can think of is just having pretty
much all the code duplicated, once for 32 bit handling and once for 64
bit handling. what i would like to do is set up two structs (one for 32
bit vals the other for 64) then based on an if statement at the start
of the code that determines which should be used, use that struct
throughout. at the end of the day the only way to do that is
duplication of the code that deals with the structs right? i just
thought i'd check in case there is some nifty way to get something like
this that i don't know about.

unions aren't of any direct use. function pointers could be of use but
they don't get round the problem of having to duplicate most, probably
nearly all, of the code. so i guess just use 64 bit vals and be done
with it. does seem a shame though because the real need for over 32 bit
values will be so rare -- although who knows in the future.
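
for example, the kind of duplication i mean (struct and function
names made up just to illustrate):

    #include <stdint.h>

    /* same fields, different widths */
    struct rec32 { uint32_t count, total; };
    struct rec64 { uint64_t count, total; };

    /* each handling function is bound to one type, so it gets
       written twice: */
    static void accum32(struct rec32 *r, uint32_t v)
    { r->total += v; r->count++; }
    static void accum64(struct rec64 *r, uint64_t v)
    { r->total += v; r->count++; }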
 
A

aegis

ben said:
is there any way in c to write code that can variably make use of one of
two structs (one that has 32 bit vals and the other that has 64 bit
vals) throughout the code? i'm writing some code that parses some data.
there's a few data types whose max values are 10^10 which requires 34
bits, so a u_int64_t type of variable would be needed to hold them
while using/manipulating. (as i write this the more i think about it
the more i think what i'm hoping for isn't on at all and the answer is
no but anyway..). nearly always, or even always (just with the tiny,
tiny, incredibly remote chance they won't) the values in question will
fit into 32 bit values fine. on a 32 bit machine dealing with 64 bit
values throughout the code is going to slow things up i think, so what
i'm wondering is, is there any way to variably use 32 bit values and 64
bit values somehow? the only way i can think of is just having pretty
much all the code duplicated, once for 32 bit handling and once for 64
bit handling. what i would like to do is set up two structs (one for 32
bit vals the other for 64) then based on an if statement at the start
of the code that determines which should be used, use that struct
throughout. at the end of the day the only way to do that is
duplication of the code that deals with the structs right? i just
thought i'd check in case there is some nifty way to get something like
this that i don't know about.

unions aren't of any direct use. function pointers could be of use but
they don't get round the problem of having to duplicate most, probably
nearly all, of the code. so i guess just use 64 bit vals and be done
with it. does seem a shame though because the real need for over 32 bit
values will be so rare -- although who knows in the future.

object types can vary in size and thus the range of values they can
represent from platform to platform. In this respect, one can consider
a degree of variability in object types. But this is beyond /your/
control.

As such, I'd probably create my own container
capable of representing some [x,y] range of values.
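
For instance, that container might be as little as one typedef
(the name here is invented), so the representation is fixed in a
single place:

    #include <stdint.h>

    /* covers [0, 10^10]; change this one line to change the
       width everywhere */
    typedef uint64_t val_t;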

<OT>
Have you actually profiled your program to determine the net effect
of supporting 64-bit values on a 32-bit system?

It would seem from your post that you are merely speculating that
it would slow things up. But any slowness experienced may actually
be negligible. However, this really all depends on
the nature of the application.
</OT>
 
B

ben

aegis said:
object types can vary in size and thus the range of values they can
represent from platform to platform. In this respect, one can consider
a degree of variability in object types. But this is beyond /your/
control.

As such, I'd probably create my own container
capable of representing some [x,y] range of values.

right, ok.

<OT>
Have you actually profiled your program to determine the net effect
of supporting 64-bit values on a 32-bit system?

It would seem from your post that you are merely speculating that
it would slow things up. But any slowness experienced may actually
be negligible. However, this really all depends on
the nature of the application.
</OT>

no i haven't and yes it's all speculation -- it's just something i
thought was a bit wasteful.

ok thanks for the confirmation.

cheers, ben.
 
C

Chris Torek

is there any way in c to write code that can variably make use of one of
two structs (one that has 32 bit vals and the other that has 64 bit
vals) throughout the code?

Not directly, no.

Consider the fact that, at the machine level, you often -- I would
say "always" but there are exceptions to everything -- get different
instruction-sequences for two identical-except-for-type source code
constructs. For instance:

struct foo32 {
    uint32_t a, b, c;
};
struct foo64 {
    uint64_t a, b, c;
};

#define MUL(p) ((p)->a = (p)->b * (p)->c)

...
struct foo32 x;
struct foo64 y;
...
MUL(&x);
MUL(&y);

This might emit code like:

ld %r1,x+4 # fetch x.b
ld %r2,x+8 # fetch x.c
umul %r1,%r2 # compute (32-bit) product
st %r1,x # store result

ldx %r1,y+8 # fetch y.b
ldx %r2,y+16 # fetch y.c
mulx %r1,%r2 # compute (64-bit) product
stx %r1,y # store result

Here, the C compiler has used the type information in the source
code -- that the members of "x" are "uint32_t"s, while those in
"y" are "uint64_t"s -- to generate different code: the offsets
for y.b and y.c differ from those for x.b and x.c; the sizes of
all the operands differ; and the "mul" instruction for x uses 32
bit arithmetic while that for "y" uses 64-bit arithmetic.

i'm writing some code that parses some data.
there's a few data types whose max values are 10^10 which requires 34
bits, so a u_int64_t type of variable would be needed to hold them
while using/manipulating.

In that case, write the code using "unsigned long long" (or uint64_t
if you really want to restrict it to *exactly* 64 bits, which seems
like overkill). Then, if and only if it turns out to be "too slow",
profile the code, find where it spends "too much time", and optimize
it once it actually works.

Note that you can use macros to generate "equivalent code except
for types". This works becaues the preprocessor phase is ignorant
of C semantics: it merely expands tokens, and C's type-information
is more deeply embedded, so that individual tokens carry their
types on into the C compiler. The MUL example above is a trivial
one (so trivial that it probably should just be expanded in-line
"by hand", but this was meant to be a simple example). In other
words, the preprocessor's inability to "see" types is both a drawback
(type-checking is impossible) and a feature (type-independence is
natural).
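
For instance, here is a sketch of that trick (the macro and
function names are invented for illustration):

    #include <stddef.h>
    #include <stdint.h>

    /*
     * One body, expanded twice with different types. The
     * preprocessor just pastes tokens, so the same source text
     * compiles once with 32-bit and once with 64-bit arithmetic.
     */
    #define DEFINE_SUM(T) \
        static T sum_##T(const T *a, size_t n) { \
            T s = 0; \
            for (size_t i = 0; i < n; i++) \
                s += a[i]; \
            return s; \
        }

    DEFINE_SUM(uint32_t) /* defines sum_uint32_t() */
    DEFINE_SUM(uint64_t) /* defines sum_uint64_t() */

A run-time "if" can then pick between sum_uint32_t and sum_uint64_t
(through a function pointer, say), though as the original poster
observed, everything that touches the structs still has to be
generated both ways.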
 
B

ben

Chris Torek said:
Not directly, no.

Consider the fact that, at the machine level, you often -- I would
say "always" but there are exceptions to everything -- get different
instruction-sequences for two identical-except-for-type source code
constructs. For instance:

struct foo32 {
    uint32_t a, b, c;
};
struct foo64 {
    uint64_t a, b, c;
};

#define MUL(p) ((p)->a = (p)->b * (p)->c)

...
struct foo32 x;
struct foo64 y;
...
MUL(&x);
MUL(&y);

This might emit code like:

ld %r1,x+4 # fetch x.b
ld %r2,x+8 # fetch x.c
umul %r1,%r2 # compute (32-bit) product
st %r1,x # store result

ldx %r1,y+8 # fetch y.b
ldx %r2,y+16 # fetch y.c
mulx %r1,%r2 # compute (64-bit) product
stx %r1,y # store result

Here, the C compiler has used the type information in the source
code -- that the members of "x" are "uint32_t"s, while those in
"y" are "uint64_t"s -- to generate different code: the offsets
for y.b and y.c differ from those for x.b and x.c; the sizes of
all the operands differ; and the "mul" instruction for x uses 32
bit arithmetic while that for "y" uses 64-bit arithmetic.

right, yes. i hadn't considered it in such detail but i had started to
think/realise that there was no possibility for flexibility in the way
i wanted -- the code is very much bound to the types, as you've
illustrated above.

In that case, write the code using "unsigned long long" (or uint64_t
if you really want to restrict it to *exactly* 64 bits, which seems
like overkill). Then, if and only if it turns out to be "too slow",
profile the code, find where it spends "too much time", and optimize
it once it actually works.

don't understand why you think using uint64_t is overkill. the values
that i said have max values of 10^10 -- they really are never going to
be more than that -- the format of the data itself is fixed in that
way. i guess in the future long long might be 128 bits. that would be
overkill for a 34 bit value right? i'm just picking the smallest size
that the values fit into.

yes i'll just use a larger value. and profile when done. i'm sure it'll
be fine.

Note that you can use macros to generate "equivalent code except
for types". This works because the preprocessor phase is ignorant
of C semantics: it merely expands tokens, and C's type-information
is more deeply embedded, so that individual tokens carry their
types on into the C compiler. The MUL example above is a trivial
one (so trivial that it probably should just be expanded in-line
"by hand", but this was meant to be a simple example). In other
words, the preprocessor's inability to "see" types is both a drawback
(type-checking is impossible) and a feature (type-independence is
natural).

right ok.

thanks very much for that. 64 bit variables it is.
ben.
 
C

Chris Torek

don't understand why you think using uint64_t is overkill. the values
that i said have max values of 10^10 -- they really are never going to
be more than that -- the format of the data itself is fixed in that
way. i guess in the future long long might be 128 bits. that would be
overkill for a 34 bit value right? i'm just picking the smallest size
that the values fit into.

Exercise: The *smallest* size it fits into is uint34_t. Why not
use that? And if there are some reasons not to use that, do they
also apply to uint64_t?

Specifically, uintN_t (for any N) tells the (C99) compiler: "don't
you dare use anything other than an exactly-N-bit type, no matter
how <censored> slow it might be." The "native" types (char, short,
int, long, and long long, and their signed and unsigned variants)
are less "forceful" requests: "please get me a type that holds at
least 127, 32767, etc" and the compiler can use whatever makes them
go fastest / smallest-code / "best".

In other words, when you use "int" or "long" or similar, you are
cutting the compiler some slack.
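
In code, the two flavors of request look like this (the variable
names are invented):

    #include <stdint.h>

    uint64_t strict;         /* exactly 64 bits, no padding; need not exist */
    unsigned long long easy; /* at least 64 bits, whatever is natural here */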
 
M

Malcolm

Chris Torek said:
Exercise: The *smallest* size it fits into is uint34_t. Why not
use that? And if there are some reasons not to use that, do they
also apply to uint64_t?

Specifically, uintN_t (for any N) tells the (C99) compiler: "don't
you dare use anything other than an exactly-N-bit type, no matter
how <censored> slow it might be."

It also tells the maintaining programmer that this type is, for some
reason, naturally an exact number of bits. For instance, if an integer
holds rgba pixel values, with 256 levels for each channel, it makes
sense to call it an int32_t. If it just has to hold a big number, like
the size of a file in bytes, use an int or a long.
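
A sketch of that distinction (the names are invented; the unsigned
variant suits packed channels):

    #include <stdint.h>

    /* naturally an exact number of bits: four 8-bit channels packed */
    typedef uint32_t rgba_pixel;   /* 0xRRGGBBAA */

    /* just "a big number": let the implementation pick what fits */
    unsigned long file_size_bytes;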
 
B

ben

Chris Torek said:
Exercise: The *smallest* size it fits into is uint34_t. Why not
use that?

because it doesn't exist :/

And if there are some reasons not to use that, do they
also apply to uint64_t?

no; because it exists.

Specifically, uintN_t (for any N) tells the (C99) compiler: "don't
you dare use anything other than an exactly-N-bit type, no matter
how <censored> slow it might be." The "native" types (char, short,
int, long, and long long, and their signed and unsigned variants)
are less "forceful" requests: "please get me a type that holds at
least 127, 32767, etc" and the compiler can use whatever makes them
go fastest / smallest-code / "best".

In other words, when you use "int" or "long" or similar, you are
cutting the compiler some slack.

oh right, i see. ok i'll definitely use an unsigned long long then --
thanks very much for all the info -- much appreciated.

ben.
 
K

Keith Thompson

ben said:
because it doesn't exist :/


no; because it exists.

Correction: it *probably* exists (on any C99 conforming
implementation). If there is no 64-bit unsigned integer type with no
padding bits, the implementation won't define uint64_t. All C99
implementations are required to support unsigned long long, which is
required to have a width of at least 64 bits, so the only way
uint64_t won't be defined is if unsigned long long is either bigger
than 64 bits or has padding bits.
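
Incidentally, you can test for this at compile time: <stdint.h>
defines the limit macro UINT64_MAX if and only if it defines
uint64_t. A sketch (the typedef name is invented):

    #include <stdint.h>

    #ifdef UINT64_MAX
    typedef uint64_t u64;        /* the exact-width type exists here */
    #else
    typedef uint_least64_t u64;  /* always available in C99 */
    #endif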
 
B

ben

Keith Thompson said:
Correction: it *probably* exists (on any C99 conforming
implementation). If there is no 64-bit unsigned integer type with no
padding bits, the implementation won't define uint64_t. All C99
implementations are required to support unsigned long long, which is
required to have a width of at least 64 bits, so the only way
uint64_t won't be defined is if unsigned long long is either bigger
than 64 bits or has padding bits.

righty ho -- ok, thanks.
 
C

Chris Torek

(This was mostly already covered, but I wanted to add the "least"
and "fast" items here.)

because it doesn't exist :/

Well, that *is* a darned good reason, of course. :)

no; because it exists.

But, as someone else noted, it might not: a 36-bit machine will
have uint18_t instead of uint16_t, uint36_t instead of uint32_t,
and uint72_t instead of uint64_t. Fortunately (?), 36-bit machines
are rare today.

C99 has a whole host of new type-names to account for several
different needs and desires. A "uint34_t" is the smallest type
that would have served, in this case; but it usually does not
exist. C99 has a bunch of additional types, including:

uint_fast64_t
uint_least64_t

These two *are* guaranteed to exist, and mean different things:
"a type that is 64 bits or more and is fast" and "a type that is
64 bits or more, exceeding that by as little as possible".

So in this case you could use uint_least64_t, which will exist
and will usually map to "unsigned long long"; or you could use
uint_fast64_t, which will also exist and will usually map to the
same thing; or you can just write "unsigned long long" and not
worry about it all. :)
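
For what it's worth, a minimal sketch of the first option, with
the matching <inttypes.h> print macro:

    #include <inttypes.h>
    #include <stdio.h>

    int main(void) {
        uint_least64_t v = 10000000000; /* 10^10 needs 34 bits */
        printf("v = %" PRIuLEAST64 "\n", v);
        return 0;
    }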

Other languages attack this problem in a different manner, with
"subrange" types: one simply says that some variable will, e.g.,
hold values between -167 and +981781513 inclusive, and the compiler
finds some appropriate type. This method is at least quite
straightforward (though wildly insufficient for dealing with floating
point); but C is not such a language.
 
