Martin said:
Not sure what operations you are doing: In Python, bits never drop off
(at least not in recent versions).
If you need to drop bits, you need to do so explicitly, by using the
bit mask operations. I could tell you more if you'd tell us what
the specific operations are.
This code is in a contribution to the reportlab toolkit that handles TTF fonts.
The fonts contain checksums computed using 32-bit arithmetic. The original
C definition is as follows:
ULONG CalcTableChecksum(ULONG *Table, ULONG Length)
{
    ULONG Sum = 0L;
    ULONG *EndPtr = Table+((Length+3) & ~3) / sizeof(ULONG);
    while (Table < EndPtr)
        Sum += *Table++;
    return Sum;
}
so effectively we're doing only additions and letting bits roll off the end.
Of course the actual semantics is dependent on what C unsigned arithmetic does
so we're relying on that being the same everywhere.
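To make the "bits roll off the end" behaviour concrete: C unsigned arithmetic wraps modulo 2**N, so with a 32-bit ULONG each addition is taken modulo 2**32. A minimal sketch of the same thing in Python, assuming current Python 3 syntax (where int is unbounded like the old long); the names MASK32 and add_u32 are made up for illustration and are not part of reportlab:

```python
MASK32 = 0xFFFFFFFF  # keeps only the low 32 bits (hypothetical name)

def add_u32(x, y):
    """Return (x + y) modulo 2**32, as a 32-bit unsigned C add would."""
    return (x + y) & MASK32

print(hex(add_u32(0xFFFFFFFF, 1)))           # 0x0
print(hex(add_u32(0x80000000, 0x80000001)))  # 0x1
```

The mask has to be applied explicitly because Python integers never overflow on their own.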
This algorithm was pretty simple in Python until 2.3, when shifts over the end of
ints started going wrong. For some reason we didn't do the obvious thing, which
is to do everything in longs and mask off the upper bits. Instead (probably
my fault) we seem to have accumulated code like
def _L2U32(L):
    '''convert a long to u32'''
    return unpack('l',pack('L',L))[0]

if sys.hexversion>=0x02030000:
    def add32(x, y):
        "Calculate (x + y) modulo 2**32"
        return _L2U32((long(x)+y) & 0xffffffffL)
else:
    def add32(x, y):
        "Calculate (x + y) modulo 2**32"
        lo = (x & 0xFFFF) + (y & 0xFFFF)
        hi = (x >> 16) + (y >> 16) + (lo >> 16)
        return (hi << 16) | (lo & 0xFFFF)

def calcChecksum(data):
    """Calculates TTF-style checksums"""
    if len(data)&3: data = data + (4-(len(data)&3))*"\0"
    sum = 0
    for n in unpack(">%dl" % (len(data)>>2), data):
        sum = add32(sum,n)
    return sum
and also silly stuff like
def testAdd32(self):
    "Test add32"
    self.assertEquals(add32(10, -6), 4)
    self.assertEquals(add32(6, -10), -4)
    self.assertEquals(add32(_L2U32(0x80000000L), -1), 0x7FFFFFFF)
    self.assertEquals(add32(0x7FFFFFFF, 1), _L2U32(0x80000000L))

def testChecksum(self):
    "Test calcChecksum function"
    self.assertEquals(calcChecksum(""), 0)
    self.assertEquals(calcChecksum("\1"), 0x01000000)
    self.assertEquals(calcChecksum("\x01\x02\x03\x04\x10\x20\x30\x40"), 0x11223344)
    self.assertEquals(calcChecksum("\x81"), _L2U32(0x81000000L))
    [...] _L2U32(0x80000000L))
where, while it might be reasonable to do testing, the tests don't seem very
sensible, e.g. what is -6 doing in a u32 test? This stuff just about works on a
32-bit machine, but is failing miserably on a 64-bit AMD. As far as I can see I just
need to use masked longs throughout.
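A rough sketch of what "masked longs throughout" could look like, again assuming current Python 3 syntax (bytes input, unbounded int) rather than the Python 2 of the code above; MASK32 and calc_checksum are hypothetical names, not the reportlab functions. One likely reason the existing helper behaves differently on a 64-bit box is that the native struct codes 'l' and 'L' follow the platform's C long, whereas a byte-order prefix such as '>' selects the standard 4-byte size everywhere:

```python
import struct

MASK32 = 0xFFFFFFFF  # keeps only the low 32 bits (hypothetical name)

def calc_checksum(data):
    """TTF-style checksum: sum of big-endian unsigned 32-bit words, modulo 2**32."""
    # Pad with zero bytes to a multiple of four, as the original calcChecksum does.
    data = data + b"\0" * (-len(data) % 4)
    total = 0
    # ">%dL" reads unsigned, standard-size (4-byte) words on every platform.
    for word in struct.unpack(">%dL" % (len(data) // 4), data):
        total = (total + word) & MASK32
    return total

# Native "l" is as wide as the platform's C long; ">l" is always 4 bytes.
print(struct.calcsize("l"), struct.calcsize(">l"))
print(hex(calc_checksum(b"\x01\x02\x03\x04\x10\x20\x30\x40")))  # 0x11223344
```

Because the words are unpacked as unsigned and the running total is masked after every addition, the result always lies in the range 0 to 2**32 - 1, so no signed conversion helper is needed.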
In a C extension I can still do the computation efficiently on a 32-bit machine,
but I need to do masking for a 64-bit machine.