B
Bengt Richter
Just to see if I understand your problem statement, does the following (though it's slow)If we can tease out a little more about your problem, maybe we can help better ;-)
E.g., how do your tuple-keys come into being and how many times are they re-used?
If there is no nesting, you could create a string key just by ''.join([chr(el) for el in tup])
which wouldn't be as compact as a compressed bit string, but would be smaller than the tuple.
If you were lucky, a change could gain you both time and space.
I'll be implementing that this weekend, as my employer is expecting
results before Christmas. Damn deadlines.
I'm curious what your dicts and their int tuples represent in the real world ;-)
Well, I'm a researcher for an Australian university (University of
Wollongong) and my current task is to enumerate every possible 6 digit
binary Gray code.
Most recently people have done the 5 digit BGCs, something which I
have also accomplished, but never the 6 digit codes.
For a graphical representation, think of a cube in n-dimensions,
where n represents the number of digits in the code. A binary Gray
code around that cube represents a path starting from any arbitrary
point and then visiting every vertex exactly once.
The applications of such research are enourmous, and I'll not go
over them here, but that is my aim. To find the number of ways a 6
dimensional fly could walk a 6d cube visiting every vertex exactly
once.
I've actually been working on this problem for many months now and
have gotten so close. The previous best estimate for calculation time
(with a program written in C, I might add) was 2.8 * 10^15 years.
I've got the thing down to about 10^4 years. If the calculation
becomes tractable on a supercomputer, e.g. Barossa hosted at
www.ac3.com.au, then I'll run it on that.
I hope that's sated your curiosity for now . If you'd like any
more information, or if anyone else would like to know anything about
this, then I'll be happy to correspond with them. I don't know how
on-topic it is for c.l.py.
do what you are trying to do? If so, what would be the point of generating
all those gray code sequences? IOW, how are you going to use them? Assuming my
prog generated all the 6-bit grey code seqences to a monster file, all the 64-number
sequences packed into 64-byte strings end to end, then you would either have to
read that file sequentially, or seek to some point and read 64 bytes to get a
given gray code sequence. If the latter, how about an algorithm that would
do a virtual seek into the whole data as a virtual file, and then generate
what you would read? I.e., there would be no point in storing all that data
if you could easily generate an arbitrary segment of it. Of course, maybe that's
not so easy ;-) But I'm wonder what the use is going to be, and the access pattern,
if you assumed availability in a monster file, virtual or not.
If you really do want to generate all that data, I suppose the loop and recursions
could be fanned out to parallel processing. But first, do I understand the problem?
====< gray.py >=========================================================
# gray.py -- enumerate n-bit gray codes
# 20031220 10:34:27 alpha version by Bengt Richter. No warranties ;-)
#
def enumgraycodes(n):
"""
Enumerate all possible complete n-bit gray code sequences
returning a list of 2**n-length strings whose bytes each
contain a complete gray code sequence in the ord values.
"""
visit_mask = 2L**(2**n)-1
nbits = [1<<i for i in xrange(n)]
vbits = [1L<<i for i in xrange(2**n)]
codes = []
def visit(curr=0, visited=0L, code=''):
"""visit curr, and recursively available successors"""
visited |= vbits[curr]
code += chr(curr)
if visited == visit_mask:
codes.append(code)
return
for nbit in nbits:
succ = curr^nbit
if vbits[succ]&visited: continue
visit(succ, visited, code)
for start in xrange(2**n):
visit(start)
return codes
if __name__ == '__main__':
import sys
codes = enumgraycodes(int(sys.argv[1]))
for code in codes:
print ''.join(['%02x'%ord(c) for c in code])
========================================================================
Example: (even 4 bits generates >3mb and takes a while, so I won't post it ;-)
[10:41] C:\pywk\clp\gray>python gray.py 2
00010302
00020301
01000203
01030200
02030100
02000103
03020001
03010002
[10:41] C:\pywk\clp\gray>python gray.py 3
0001030206070504
0001030206040507
0001030705040602
0001050406070302
0001050406020307
0001050703020604
0002030105040607
0002030105070604
0002030706040501
0002060703010504
0002060405070301
0002060405010307
0004050706020301
0004050103020607
0004050103070602
0004060705010302
0004060203010507
0004060203070501
0100020307060405
0100020307050406
0100020604050703
0100040507060203
0100040507030206
0100040602030705
0103020004050706
0103020004060705
0103020607050400
0103070602000405
0103070504060200
0103070504000206
0105040607030200
0105040002030706
0105040002060703
0105070604000203
0105070302000406
0105070302060400
0203010004050706
[...]
0602000405010307
0706040501000203
0706040501030200
0706040002030105
0706020301000405
0706020301050400
0706020004050103
0705040602030100
0705040602000103
0705040001030206
0705010004060203
0705010302000406
0705010302060400
0703020001050406
0703020604050100
0703020604000105
0703010002060405
0703010504060200
0703010504000206
Regards,
Bengt Richter