32/64 bit cc differences


Keith Thompson

JohnF said:
The code's supposed to be portable, and not care as long as int>=32.

Then it's only *mostly* portable; the standard allows int to be as
narrow as 16 bits. If you don't mind that restriction, that's fine (I'd
probably add a compile time assertion that int is at least 32 bits) --
or you might consider using int32_t and uint32_t when you need a 32-bit
type. Or int_least32_t or int_fast32_t if you need *at least* 32
bits.
The one place it wanted strictly >32 I used long long (despite
obnoxious -Wall warnings about it). Anyway, I found the problem,
explained in subsequent followup, kind of along the lines you're
suggesting, but a rounding problem.

I'd probably use int64_t and friends. But what warnings do you get when
you use long long? You can likely get rid of any such warnings by
telling your compiler to conform to C99 or later.
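
A minimal sketch of the <stdint.h> suggestion, assuming a C99 (or later) compiler. The helper names widening_multiply() and format_wide() are hypothetical and are not taken from fm.c; they only illustrate using int64_t instead of long long when more than 32 bits are needed.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: multiply two 32-bit values without overflow by
   widening both operands to a 64-bit type first. */
int64_t widening_multiply(int32_t a, int32_t b)
{
    return (int64_t)a * (int64_t)b;
}

/* Hypothetical helper: format a 64-bit value portably. PRId64 expands to
   the correct printf length modifier for this implementation.
   Returns the number of characters snprintf() would have written. */
int format_wide(char *buf, size_t size, int64_t value)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%" PRId64, value);
}
```

Note that int32_t and int64_t are optional in the standard (they exist only where the platform has exact-width types), while int_least32_t and int_least64_t are always available in C99.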
 

James Kuyper

In C99, several features were added that addressed rounding:
the macro FLT_ROUNDS was added to <float.h>; it expands to an expression
that will indicate the current rounding mode.

"A floating expression may be contracted, that is, evaluated as though
it were a single operation, thereby omitting rounding errors implied by
the source code and the expression evaluation method. The FP_CONTRACT
pragma in <math.h> provides a way to disallow contracted expressions."
(6.5p8)

fegetround() gets the current rounding direction, fesetround() sets it.
The possible rounding directions are identified by macros #defined in
<fenv.h>. They are all optional: if a given macro is not #defined, the
corresponding rounding direction is not supported.

Several new functions were added to <math.h> for rounding to an integer
in various ways: nearbyint(), lrint(), round().
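
As a rough illustration of the "optional macros" point, the sketch below tests each rounding-direction macro with #ifdef and reports FLT_ROUNDS. The function names are made up for this example; only the macros themselves come from <fenv.h> and <float.h>.

```c
#include <assert.h>
#include <fenv.h>
#include <float.h>

/* Count how many of the four standard rounding-direction macros this
   implementation defines. A macro that is not defined means the
   corresponding rounding direction is not supported. */
int supported_rounding_directions(void)
{
    int count = 0;
#ifdef FE_TONEAREST
    count++;
#endif
#ifdef FE_UPWARD
    count++;
#endif
#ifdef FE_DOWNWARD
    count++;
#endif
#ifdef FE_TOWARDZERO
    count++;
#endif
    assert(count >= 0 && count <= 4);
    return count;
}

/* FLT_ROUNDS: -1 indeterminable, 0 toward zero, 1 to nearest,
   2 toward positive infinity, 3 toward negative infinity;
   other values are implementation-defined. */
int current_flt_rounds(void)
{
    return FLT_ROUNDS;
}
```

Actually changing the mode with fesetround() additionally needs the math library on many systems, so it is left out of this sketch.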

... I don't normally use floating point calculations if I don't

I think C is a little stricter, but there is a claim that a valid
Fortran floating point system could return 42.0 (or 42.D0) for all
floating point expressions.

C is a little stricter than that: Conversions to floating point type are
required to result in the nearest higher or nearest lower representable
value (6.3.1.4, 6.3.1.5), so (double)196 is not allowed to have a value
of 42.0. Also, "For decimal floating constants, and also for hexadecimal
floating constants when FLT_RADIX is not a power of 2, the result is
either the nearest representable value, or the larger or smaller
representable value immediately adjacent to the nearest representable
value, chosen in an implementation-defined manner.
For hexadecimal floating constants when FLT_RADIX is a power of 2, the
result is correctly rounded." (6.4.4.2p3), so the floating point
expression 3.0 is not allowed to have a value of 42.0. Also, all
floating point comparison operators (<, >, <=, >=, ==, !=) are required
to return either 0 or 1, and they must return the correct value - they
are not allowed to perform fuzzy comparisons.

However, unless an implementation of C chooses to pre-#define
__STDC_IEC_559__, in all other regards C is not very strict at all: "The
accuracy of the floating-point operations (+, -, *, /) and of the
library functions in <math.h> and <complex.h> that return floating-point
results is implementation defined, as is the accuracy of the conversion
between floating-point internal representations and string
representations performed by the library functions in <stdio.h>,
<stdlib.h>, and <wchar.h>. The implementation may state that the
accuracy is unknown."

Therefore, a conforming implementation of C is allowed to evaluate the
division in DBL_MIN/DBL_MAX < DBL_MAX with such lousy accuracy that the
comparison ends up being false.

If __STDC_IEC_559__ is pre-#defined, the implementation must conform to
the requirements of Annex F, which is based upon, but not identical
with, IEC 60559 (which in turn, is essentially equivalent to IEEE 754).
In that case, the floating point accuracy requirements are quite strict
- they come pretty close to being as strict as it is practically
possible for them to be. Which is still too low a requirement for this
kind of use.
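
A small sketch of how code might check for the __STDC_IEC_559__ claim discussed above. The function names are hypothetical. The second function evaluates the DBL_MIN/DBL_MAX comparison from the text at run time; on an implementation that makes the IEC 60559 claim it should be true, elsewhere the standard gives no such promise.

```c
#include <assert.h>
#include <float.h>

/* Returns 1 if the implementation pre-#defines __STDC_IEC_559__,
   0 otherwise. */
int claims_iec_60559(void)
{
#ifdef __STDC_IEC_559__
    return 1;
#else
    return 0;
#endif
}

/* Evaluates DBL_MIN/DBL_MAX < DBL_MAX. The volatile qualifiers are there
   so the division is not simply folded at compile time. */
int min_over_max_is_below_max(void)
{
    volatile double lo = DBL_MIN, hi = DBL_MAX;
    assert(lo > 0.0 && hi > lo);   /* guaranteed by <float.h> limits */
    return lo / hi < hi;
}
```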
 

JohnF

Jorgen Grahn said:
-Wno-long-long is a decent cure for that. Or better, switch to C99
and tell the compiler you're doing it. /Jorgen

It's no problem. I meant "obnoxious" humorously.
In fact, I'd rather continue seeing the warnings -- reminds me
I'm doing something a little funky that probably ought to be changed.
 

JohnF

Keith Thompson said:
Then it's only mostly portable; the standard allows int to be as
narrow as 16 bits.

Yeah, but that's pretty much deprecated/archaic, at least for
general purpose computers. I usually just try to follow K&R 2nd ed
for "portable" syntax, whereas "portable semantics" gets trickier,
and I usually just try to figure "anything that can go wrong will".
If you don't mind that restriction, that's fine (I'd
probably add a compile time assertion that int is at least 32 bits) --
or you might consider using int32_t and uint32_t when you need a 32-bit
type. Or int_least32_t or int_fast32_t if you need *at least* 32
bits.
The one place it wanted strictly >32 I used long long (despite
obnoxious -Wall warnings about it). Anyway, I found the problem,
explained in subsequent followup, kind of along the lines you're
suggesting, but a rounding problem.

I'd probably use int64_t and friends. But what warnings do you get when
you use long long? You can likely get rid of any such warnings by
telling your compiler to conform to C99 or later.


That might be preferable to LL. All three compilers
64-bit: cc --version cc (Debian 4.3.2-1.1) 4.3.2
32-bit: cc --version cc (NetBSD nb2 20110806) 4.5.3
cc --version cc (GCC) 4.7.1
issue similar -pedantic -Wall warnings. Explicitly, from 4.7.1,
fm.c: In function 'rseeder':
fm.c:865:6: warning: ISO C90 does not support 'long long' [-Wlong-long]
fm.c:866:11: warning: use of C99 long long integer constant [-Wlong-long]
fm.c:877:20: warning: ISO C90 does not support 'long long' [-Wlong-long]
fm.c:878:11: warning: ISO C90 does not support 'long long' [-Wlong-long]
fm.c:880:25: warning: use of C99 long long integer constant [-Wlong-long]
fm.c:880:30: warning: use of C99 long long integer constant [-Wlong-long]
fm.c:880:37: warning: use of C99 long long integer constant [-Wlong-long]
fm.c:892:3: warning: ISO C90 does not support the 'll' gnu_printf
length modifier [-Wformat]
But that whole function ought to be re-algorithmized anyway,
so my concern is pretty minimal.
 

JohnF

Aleksandar Kuktin said:
int-using-bankers-rounding-in-c

Got it: ceil(), floor(), round(), rint() and others.

I think you're right that I mistakenly said "round" once or twice,
where I really should have said "truncate". Rounding isn't actually
an issue. The code is using a (random) float 0.0<=f<1.0 to calculate
an int ilo<=i<=ihi, with more-or-less the usual kind of formula
i = ilo + f*(ihi-ilo+1); better written as
i = ilo + (int)(f*(float)(ihi-ilo+1));
So I want it to truncate.
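
For reference, a minimal sketch of that float-to-int mapping. The function name and the final clamp are additions for this example, not part of the quoted formula. The (int) conversion discards the fractional part, which is the truncation wanted here. One caveat: C allows the product to be evaluated in wider precision than float, so the value just before truncation can differ between compilers in rare borderline cases.

```c
#include <assert.h>

/* Map a float f in [0.0, 1.0) onto an integer in [ilo, ihi] by
   truncation, following the formula quoted above. */
int scale_by_truncation(float f, int ilo, int ihi)
{
    assert(ilo <= ihi);
    assert(f >= 0.0f && f < 1.0f);
    int i = ilo + (int)(f * (float)(ihi - ilo + 1));
    /* Guard (an addition, not in the original formula): if the product
       rounds up to the full range, clamp so the result stays in bounds. */
    if (i > ihi)
        i = ihi;
    return i;
}
```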
 

JohnF

Eric Sosman said:
[...]
This calculation is better done in fixed point.
The RNG already computes an integer between 0 and 21747483647,

Unlikely: The larger value requires >=35 bits, which isn't
usual these days. ;-) Also, isn't his RNG based on Park & Miller's
"Minimal Standard" generator? If so, the lower limit is not 0
but 1, and the upper is not 2147483647 but 2147483646.

Yes, Park&Miller, according to the discussion in Numerical Recipes
in C, 2nd ed, page 280, from which I copied the code.
Many authors recommend against this, because the low-order
bits of maximum-period linear congruential generators have short
periods. But the "Minimal Standard" generator is not such a
generator: It's a pure congruential generator with prime modulus,
and its low-order bits are "as random as" the others.


Yet another method is to use a rejection technique. If you
want a value in the range 0<=V<N and the generator produces values
in LO<=R<HI, you generate an R and compute V=(R-LO)/((HI-LO)/N).
If that's <N you're done; if not, throw it away, generate another R,
and keep trying. (In the worst case you'll reject a hair fewer
than half of the R's, so the average quantity of R's you need is
no worse than 1 + 1/2 + 1/4 + ... = 2.)
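
The rejection technique described above might look roughly like this. Everything here is a sketch under stated assumptions: pm_next() is a plain Park & Miller "Minimal Standard" generator (multiplier 16807, modulus 2^31-1) used as a stand-in, not the ran1() shipped in fm.zip, and uniform_below() is a made-up name.

```c
#include <assert.h>
#include <stdint.h>

#define PM_MODULUS    2147483647L   /* 2^31 - 1, prime */
#define PM_MULTIPLIER 16807L

/* Stand-in generator (assumption): returns 1 <= R <= 2147483646 and
   updates *state. The state must start in that same range. */
long pm_next(long *state)
{
    assert(*state >= 1 && *state < PM_MODULUS);
    *state = (long)(((int64_t)PM_MULTIPLIER * *state) % PM_MODULUS);
    return *state;
}

/* Rejection technique: compute V = (R-LO)/((HI-LO)/N) and retry
   whenever V >= N. Returns a value with 0 <= V < n. */
long uniform_below(long n, long *state)
{
    const long lo = 1, hi = PM_MODULUS;   /* generator yields lo <= R < hi */
    assert(n >= 1 && n <= hi - lo);
    const long bucket = (hi - lo) / n;    /* raw values per output value */
    long v;
    do {
        v = (pm_next(state) - lo) / bucket;
    } while (v >= n);
    return v;
}
```

Only integer arithmetic is involved, so the result does not depend on floating-point evaluation details.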

Thanks, guys. I was just getting around to thinking about
the best way to handle this. Glad I read your discussion first.
 

JohnF

Aleksandar Kuktin said:
Out of curiosity - how did you come up with 67584? Its hexadecimal
representation - 0x10800 - isn't particularly round and I can't think of
any other magic properties that number could have.
I didn't read the code, if the answer is hiding in there.

Answer's hiding in
http://www.forkosh.com/fm.html?algorithms.permutebits
(use the ?, not a #),
"With default block sizes randomly selected between
2048 and 8192 bytes, and with default noise between
32 and 256 bytes per block, fm's permutations range
between 16640 and 67584 bits."
 

JohnF

BartC said:
If you're using ran1() to generate 0.0 to 1.0, couldn't differences start
from there? In that in one instance, the ran1() might return, say,
0.249999... and in another, 0.250000... from the same sets of integer start
values.

Looking at ran1(), it only seems to use floats to convert a 0 to 2147483647
integer result, to a 0.0 to 1.0 one. Could you make use of an integer random
function which returns the range 0 to 2147483647 as it is?

Yeah, that's been the suggestion, which I'll definitely be taking.
(If you integer-divide 0...2147483647 by 31775, you do get 0...67584 (just
about), which is close to the 0...67583 you seem to need. I don't know how
to get rid of that one out-of-range result without skewing the results.)

Eric's preceding followup seems to contain the definitive discussion
about how to best do this.
(BTW, there seems to be something 'off' about using a 23-bit float value
(1.0/2147483647) to multiply a 31-bit random integer. But probably OK unless
you try and recreate the 31-bit value from the result....)

Not sure myself. See Numerical Recipes in C, 2nd ed, page 280
for discussion of theory and algorithm.
I copied their code -- hopefully correctly.
 

JohnF

Ben Bacarisse said:
Well there are some very good ones around (search for KISS and George
Marsaglia) but you don't need a new one. Your PRNG is an integer one,
it just does a final divide to return a float rather than an int.
If it is a good PRNG, the final int can be used instead of the
divided float.

Oh, yeah, that's what I understood you as saying, and what I intended
to do, from your previous remarks. Existing ran1() does what's needed.
Unfortunately, for cryptographic work, you should have very strong
guarantees about the way it behaves, but since this PRNG is designed for
numerical work, you have probably tacitly assumed it is good enough.

To get some more confidence, test the integer PRNG using any one of the
standard random test suites. It won't give you cryptographic levels of
confidence, but it will ensure that you can use all the bits with equal
confidence.

The section, starting on page 278 of Numerical Recipes in C, 2nd ed,
discusses (and provides code for) several rng's, including some tests.
Based on all that, you're right, I tacitly assumed ran1() okay.
Or okay enough. And, in any case, my fm.html page has a short
"Random number generation" <h3> section that ends up with
the "cya" remark,
"Of course, you're welcome to replace the "stock" versions of ran1( )
and rseeder( ) supplied in fm.zip with any other code of your own
choosing (keeping the same calling sequences, etc)."
 

glen herrmannsfeldt

Yeah, but that's pretty much deprecated/archaic, at least for
general purpose computers. I usually just try to follow K&R 2nd ed
for "portable" syntax, whereas "portable semantics" gets trickier,
and I usually just try to figure "anything that can go wrong will".

Well, pretty much int is the word size of the machine. On a PDP-11,
or even on 8-bit processors, int is usually 16 bits. The same goes for
early MS-DOS machines before the 80386, and even for some later ones.

But VAX is a 16 bit word machine, but I would expect 32 bit int
for it, though I don't remember what VAX-C does.

The tradition from some years ago was to use short for 16 bits,
long for 32, but that got confusing when 64 bit systems came out.

-- glen
 

glen herrmannsfeldt

Dr Nick said:
Eric Sosman writes:

(snip, I wrote)
Another reason to do that is that it can lead to a bias in the numbers.
Consider a generator that produces 0-9 inclusive, with equal
probability. If you take the results mod 3 you get 3 instances of 1,
three of 2, and four of 0. It's a small bias, but a real one.

And, of course, using floating point there isn't any bias...

-- glen
 

JohnF

Dr Nick said:
Eric Sosman said:
[...]
This calculation is better done in fixed point.
The RNG already computes an integer between 0 and 21747483647,

Unlikely: The larger value requires >=35 bits, which isn't
usual these days. ;-) Also, isn't his RNG based on Park & Miller's
"Minimal Standard" generator? If so, the lower limit is not 0
but 1, and the upper is not 2147483647 but 2147483646.
the easy way is to take that modulo one more than the largest
value you want.

Many authors recommend against this, because the low-order
bits of maximum-period linear congruential generators have short
periods. But the "Minimal Standard" generator is not such a
generator: It's a pure congruential generator with prime modulus,
and its low-order bits are "as random as" the others.

Another reason to do that is that it can lead to a bias in the numbers.
Consider a generator that produces 0-9 inclusive, with equal
probability. If you take the results mod 3 you get 3 instances of 1,
three of 2, and four of 0. It's a small bias, but a real one.
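
The 0-9 example can be checked with a few lines of code. This is only an illustrative tally (the function name is made up); it counts how many inputs land on each residue.

```c
#include <assert.h>

/* Tally how often each residue occurs when every value 0..max_value
   is reduced modulo `modulus`. counts[] must have `modulus` elements. */
void tally_residues(int max_value, int modulus, int counts[])
{
    assert(modulus > 0 && max_value >= 0);
    for (int r = 0; r < modulus; r++)
        counts[r] = 0;
    for (int v = 0; v <= max_value; v++)
        counts[v % modulus]++;
}
```

Calling tally_residues(9, 3, counts) leaves counts[0] at 4 and the other two entries at 3, which is the bias described above.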

Fyi, the solution I've now coded was based on Eric's preceding (but
snipped here) discussion. His formula was a little opaque to me,
so I googled the keywords he introduced to come up with the following,
which is pseudocoded below from the real code in forkosh.com/fm.zip,
int iran1 ( int ilo, int ihi ) {  /* you want int rn from ilo to ihi */
  long ran1(/*some args go here*/), /*original rng from Numerical Recipes*/
       iran = ran1(/*args*/),       /* integer result from rng */
       range = ihi-ilo+1,           /* ihi-ilo+1 */
       IM = 2147483647,             /* ran1()'s actual range is 1...IM */
       imax = IM - (IM%range);  /* force iran's max to a multiple of range */
  while ( iran >= imax ) iran=ran1(/*args*/); /*discard out-of-range iran*/
  return ( ilo + (iran%range) ); }  /* back with random ilo <= i <= ihi */
 

Ike Naar

int iran1 ( int ilo, int ihi ) { /* you want int rn from ilo to ihi */
long ran1(/*some args go here*/), /*original rng from Numerical Recipes*/
iran = ran1(/*args*/), /* integer result from rng */
range = ihi-ilo+1, /* ihi-ilo+1 */
IM = 2147483647, /* ran1()'s actual range is 1...IM */

Isn't ran1()'s actual range [1..2147483646] ?
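
If the raw range really is [1..2147483646], one way to account for it is to shift the raw value down to start at 0 before discarding and reducing. The sketch below is an assumption-laden illustration, not the code in fm.zip: minstd_next() is a bare Park & Miller generator standing in for ran1(), and iran_between() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

#define IM 2147483647L   /* modulus 2^31 - 1; raw values are 1 .. IM-1 */

/* Stand-in generator (assumption): Park & Miller "Minimal Standard",
   returns 1 <= R <= IM-1 and updates *seed. */
long minstd_next(long *seed)
{
    assert(*seed >= 1 && *seed < IM);
    *seed = (long)((INT64_C(16807) * *seed) % IM);
    return *seed;
}

/* Returns a random int i with ilo <= i <= ihi, without modulo bias. */
int iran_between(int ilo, int ihi, long *seed)
{
    assert(ilo <= ihi);
    const int64_t span  = IM - 1;                   /* number of raw values */
    const int64_t range = (int64_t)ihi - ilo + 1;
    assert(range <= span);
    const int64_t limit = span - span % range;      /* multiple of range */
    int64_t r;
    do {
        r = minstd_next(seed) - 1;                  /* shift to 0 .. IM-2 */
    } while (r >= limit);                           /* discard the excess */
    return ilo + (int)(r % range);
}
```

Because limit is an exact multiple of range, every output value is backed by the same number of raw values.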
 

J. Clarke

I'm getting a tiny-cum-microscopic, but nevertheless fatal,
difference in the behavior of the exact same C code compiled
on one 64-bit linux machine...
o dreamhost.com
uname -a Linux mothman 2.6.32.8-grsec-2.1.14-modsign-xeon-64 #2 SMP
Sat Mar 13 00:42:43 PST 2010 x86_64 GNU/Linux
cc --version cc (Debian 4.3.2-1.1) 4.3.2
versus two other 32-bit linuxes...
o panix.com
uname -a NetBSD panix3.panix.com 6.1.2 NetBSD 6.1.2 (PANIX-USER) #0:
Wed Oct 30 05:25:05 EDT 2013 i386
cc --version cc (NetBSD nb2 20110806) 4.5.3
o my own local box running slackware 14.0 32-bit
cc --version cc (GCC) 4.7.1

The code is an en/de-cryption utility forkosh.com/fm.zip,
which is way too many lines to ask anybody to look at.
But my own debugging is failing to identify where the
difference creeps in, and googling failed to help suggest
where to look more deeply.

Firstly, both executables "work", i.e., if you encrypt and
then decrypt, you get back the exact same original file.
But if you encrypt using the 32-bit executable, scp the
encrypted file to the 64-bit machine (md5's match) and then
decrypt, the result is exactly the same length and almost
identical except for about one byte in a thousand that doesn't
diff. Vice versa (encrypt on 64-bit, decrypt on 32) gives
the same behavior. (By the way, the 32-vs-64-bit encrypted files
are also ~one-in-a-thousand different, so both stages exhibit
this small problem.)
And I tried cc -m32 on the 64-bit machine, but there's
some stubs32.h that it's missing. So instead, I cc -static
on my own box, and that executable does work on the 64-bit
machine when run against files encrypted on either 32-bit box.
So the problem doesn't seem to be the 64-bit os, but rather
the cc executable, though I'm not 100% sure.

What I'm really finding weird is that ~one-byte-in-a-thousand
diff. The program uses several streams of random numbers
(generated by its own code) to xor bytes, permute bits, etc.
The slightest problem would garble up the data beyond belief.
Moreover, it's got a verbose flag, and I can see the streams
are identical. And everywhere else I've thought to look
seems okay, too, as far as I can tell.
So I'm asking about weird-ish 32/64-bit cc differences
that might give rise to this kind of behavior. Presumably,
there's some subtle bug that I'm failing to see in the code,
and which the output isn't helping me to zero in on. Thanks,

I'm no expert but one thing I learned <mumble> years ago was to make
sure that the problem you're chasing really is the problem you _think_
you're chasing. You've got three different versions of the compiler
with two of them giving one behavior and the third, oldest one giving a
different behavior, which you are attributing to 64 bit vs 32-bit. It
could also be the result of some change made to the more recent releases
of the compiler and I would want to rule that out rather than assuming
that it's a 32- vs 64- bit issue.
 

Keith Thompson

JohnF said:
Yeah, but that's pretty much deprecated/archaic, at least for
general purpose computers. I usually just try to follow K&R 2nd ed
for "portable" syntax, whereas "portable semantics" gets trickier,
and I usually just try to figure "anything that can go wrong will".

Most modern *hosted* implementations make int 32 bits, but there's
nothing deprecated or archaic about 16-bit int (at least as far as the C
standard is concerned).

POSIX requires at least 32 bits, so if your program already depends on
POSIX features, you can safely make that assumption. Otherwise, you can
certainly assume 32-bit or wider int if you want to, but I personally
would take care to make that assumption explicit, so if someone tries to
compile my code with a fully conforming implementation that happens to
have 16-bit int the problem will be detected early.

#include <limits.h>
#if INT_MAX < 2147483647
#error This code requires at least 32-bit int
#endif

[...]
I'd probably use int64_t and friends. But what warnings do you get when
you use long long? You can likely get rid of any such warnings by
telling your compiler to conform to C99 or later.

That might be preferable to LL. All three compilers
64-bit: cc --version cc (Debian 4.3.2-1.1) 4.3.2
32-bit: cc --version cc (NetBSD nb2 20110806) 4.5.3
cc --version cc (GCC) 4.7.1
issue similar -pedantic -Wall warnings. Explicitly, from 4.7.1,
fm.c: In function 'rseeder':
fm.c:865:6: warning: ISO C90 does not support 'long long' [-Wlong-long]
fm.c:866:11: warning: use of C99 long long integer constant [-Wlong-long]
fm.c:877:20: warning: ISO C90 does not support 'long long' [-Wlong-long]
fm.c:878:11: warning: ISO C90 does not support 'long long' [-Wlong-long]
fm.c:880:25: warning: use of C99 long long integer constant [-Wlong-long]
fm.c:880:30: warning: use of C99 long long integer constant [-Wlong-long]
fm.c:880:37: warning: use of C99 long long integer constant [-Wlong-long]
fm.c:892:3: warning: ISO C90 does not support the 'll' gnu_printf
length modifier [-Wformat]
But that whole function ought to be re-algorithmized anyway,
so my concern is pretty minimal.

Note how the warning is phrased: "ISO C90 does not support 'long long'".
The long long type has been a standard C feature since the 1999 standard
(and a common extension before that). Failure to support long long is
not merely deprecated, it's completely non-standard. If you're willing
to assume that int is at least 32 bits, you should be even more willing
to assume that long long exists.

And <stdint.h> also did not exist in C90; both it and long long were
introduced by C99.

Just invoke your compiler with options to tell it to use a more modern
version of the language.

gcc in particular uses "-std=gnu89" by default, which is C89/C90 with
GNU extensions. IMHO this is unfortunate, and it's time for gcc to
support C99 by default. But it probably doesn't make much sense to rely
on gcc's default anyway.
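
As a rough sketch of what that looks like from inside the code: __STDC_VERSION__ is 199901L or greater when the compiler is operating as C99 or later (for gcc, for example, when invoked with -std=c99). The function names below are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 if this translation unit is being compiled as C99 or later,
   0 otherwise (for example under -std=gnu89 or -ansi). */
int compiled_as_c99_or_later(void)
{
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
    return 1;
#else
    return 0;
#endif
}

/* Format a long long with the standard C99 "ll" length modifier.
   Returns the number of characters snprintf() would have written. */
int format_long_long(char *buf, size_t size, long long value)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%lld", value);
}
```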

If you need your code to be portable to Microsoft's compiler, you might
have a problem; I don't remember whether it supports long long, but I
know it doesn't support C99 or C11.
 

jacob navia

On 12/01/2014 22:09, Keith Thompson wrote:
#include <limits.h>
#if INT_MAX < 2147483647
#error This code requires at least 32-bit int
#endif

A system with sizeof(int) of 16 bits will have problems with the above
constant "2147483647" since it is an integer constant that overflows
thompson

i would rather write

#if INT_MAX < 2147483647L

since long must be at least 32 bits


conclusion:
 

Kaz Kylheku

-Wno-long-long is a decent cure for that. Or better, switch to C99
and tell the compiler you're doing it.

GCC's warnings about "long long" in C90 code (-ansi) are not a consequence of
-Wall. They are a consequence of -pedantic.

"long long" is a conforming extension to C90: it relies on grammar which
is a syntax error in C90, requiring a diagnostic.

The -pedantic mode means "generate all standard-required diagnostics" (or
rather, make an effort to do that, modulo bugs and omissions).

Try the following with "gcc -Wall -ansi". The only diagnostic you get is
about the unused variable:

#include <stdio.h>

long long x;

int main(void)
{
    printf("hello, world\n");
    int declaration_after_statement = 0;
    return 0;
}

If you don't use -pedantic, you risk accidentally using extensions that you
don't intend to use. The cure for that is to compile your code base with
-pedantic once in a while and fix everything that you care to fix.
 

Ben Bacarisse

jacob navia said:
On 12/01/2014 22:09, Keith Thompson wrote:

A system with sizeof(int) of 16 bits will have problems with the above
constant "2147483647" since it is an integer constant that overflows
thompson

No. What matters is the types intmax_t and uintmax_t and they can't be
anything like as small as 16 bits. See 6.10.1p4.

<snip>
 

Ben Bacarisse

JohnF said:
Ben Bacarisse wrote:

The section, starting on page 278 of Numerical Recipes in C, 2nd ed,
discusses (and provides code for) several rng's, including some tests.

They may be numerical tests based on the floating point value. It will
make almost no difference to a numerical test if the bottom bit of the
int (just before the final divide) cycles 0,1,1,0,1,1,0,... (for
example) but it will make a big difference if you make binary choices by
using ran1(...) & 1. Eric S suggests that this sort of thing does not
happen with the PRNG you use, but I'd not seen that post when I wrote.
Based on all that, you're right, I tacitly assumed ran1() okay.

Assuming that the floats are well distributed, is not quite the same as
assuming that the ints have all the right properties so a test or two
would not go amiss.

<snip>
 

Ike Naar

On 12/01/2014 22:09, Keith Thompson wrote:

A system with sizeof(int) of 16 bits will have problems with the above
constant "2147483647" since it is an integer constant that overflows
thompson

i would rather write

#if INT_MAX < 2147483647L

since long must be at least 32 bits


conclusion:

6.10.1 p4:
"For the purposes of this token conversion and evaluation, all
signed integer types and all unsigned integer types act as if they
have the same representation as, respectively, the types intmax_t
and uintmax_t defined in the header <stdint.h>."

The maximum value of type intmax_t, INTMAX_MAX, is required to be
at least (2 to the power 63) - 1 (see 7.20.2.5 p1),
so 2147483647 <= INTMAX_MAX, and therefore "#if INT_MAX < 2147483647"
is valid even on implementations where INT_MAX < 2147483647.
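
Putting that together, a minimal sketch of the compile-time check, assuming C99's <stdint.h> is available. The second #if and the function int_value_bits() are illustrative additions, not something posted in the thread.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Inside #if, integer constants are evaluated as if they had type
   intmax_t (or uintmax_t), so 2147483647 cannot overflow here even
   where int is only 16 bits wide. */
#if INT_MAX < 2147483647
#error This code requires at least 32-bit int
#endif

/* Illustrative only: this can never fire on a conforming C99
   implementation, because INTMAX_MAX is at least 2^63 - 1. */
#if 2147483647 > INTMAX_MAX
#error intmax_t is narrower than the standard allows
#endif

/* Number of value bits in int, counted from INT_MAX. The #if above
   guarantees at least 31. */
int int_value_bits(void)
{
    int bits = 0;
    for (int m = INT_MAX; m > 0; m >>= 1)
        bits++;
    assert(bits >= 31);
    return bits;
}
```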
 
