I don't have the code with me, but for huge arrays, I have used
something like:
arr = [initializer]
for i in range(N):
    arr.extend(arr)
This doubles the array every time through the loop, and if the length
you want is not a power of 2, you can combine the powers of 2 to get
the desired result.
Gil
To all who asked about my previous post,
I'm sorry I posted such a cryptic note before; I should have waited
until I had the code available.
I am still new to Python, and I couldn't find a way to create an array
of size N, with each member initialized to a given value. If there is
one, please let me know.
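(If sequence repetition works on arrays the way it does on lists,
which the docs seem to say, then an untested sketch like this would do
it; N and value are just placeholder names:)
Code:
import array
N = 134217728                         # any size you want
value = 4294967295
arr = array.array('I', [value]) * N   # repeat a 1-element array N times
print(len(arr), arr[N - 1])           # confirm size and contents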
I think Paul Rubin and Paul Rudin are right (if they really are two
different people): an array is a better solution if you're dealing
with integers. They both mention either numpy, which I know nothing
about, or the array module.
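For anyone who does know numpy, I gather from its docs that something
along these lines would do the same job; I haven't run it, so treat it
as a sketch:
Code:
import numpy
N = 134217728
bits = numpy.empty(N, dtype=numpy.uint32)  # uninitialized 32-bit slots
bits.fill(4294967295)                      # set every element to all ones
print(bits[N - 1])                         # confirm it's initialized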
kj, what *are* you going to do with this list/array? As others have
pointed out, there are differences between lists, arrays, and
dictionaries.
The problem I was solving was this: I wanted an array of 32-bit
integers to be used as a bit array, and I wanted it initialized with
all bits set, that is, each member of the array had to be set to
4294967295. Of course, you could set your initializer to 0, or any
other 32-bit number.
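In case it helps, here is a rough, untested sketch of how such an
array gets used as a bit array; the helper names (make_bit_array,
test_bit, clear_bit) are just made up for this example:
Code:
import array

def make_bit_array(nbits, initializer=4294967295):
    nwords = (nbits + 31) // 32           # 32 bits per array member
    return array.array('I', [initializer]) * nwords

def test_bit(bits, i):
    word, offset = divmod(i, 32)
    return (bits[word] >> offset) & 1

def clear_bit(bits, i):
    word, offset = divmod(i, 32)
    bits[word] &= ~(1 << offset) & 0xFFFFFFFF  # keep it 32-bit unsigned

bits = make_bit_array(100)                # 100 bits, all set
clear_bit(bits, 42)
print(test_bit(bits, 42), test_bit(bits, 43))  # prints: 0 1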
Originally I found that the doubling method I wrote about before was a
LOT faster than appending the elements one at a time, and tonight I
tried the "list = [initializer] * N" method. Running the code below,
the doubling method is still the fastest, at least on my system.
Of course, as long as you avoid the 'one at a time' method, we're
talking about fractions of a second, even for arrays that I think are
huge, like the 536,870,912 byte beastie below.
Code:
# Written in Python 3.x
import array
import time
#* * * * * * * * * * * * * * * * * * * * * *
# Doubling method, run time = 0.413938045502
t0 = time.time()
newArray = array.array('I')           # 32-bit unsigned integers
newArray.append(4294967295)
for i in range(27):                   # 2**27 integers, 2**29 bytes
    newArray.extend(newArray)
print(time.time() - t0)
print(newArray[134217727]) # confirm array is fully initialized
#* * * * * * * * * * * * * * * * * * * * * *
# One at a time, run time = 28.5479729176
t0 = time.time()
newArray2 = array.array('I')
for i in range(134217728):            # the same size as above
    newArray2.append(4294967295)
print(time.time() - t0)
print(newArray2[134217727]) # confirm array
#* * * * * * * * * * * * * * * * * * * * * *
# List with "*", run time = 1.06160402298
t0 = time.time()
newList = [4294967295] * 134217728
print(time.time() - t0)
print(newList[134217727]) # confirm list
If, instead of 134,217,728 integers, I want something different, like
100,000,000, the method I use is:
Code:
#* * * * * * * * * * * * * * * * * * * * * *
# Not a power of 2, run time = 0.752086162567
t0 = time.time()
newArray = array.array('I')
tempArray = array.array('I')
tempArray.append(4294967295)
size = 100000000
while size:                           # chew through 'size' until it's gone
    if size & 1:                      # if this bit of 'size' is 1
        newArray.extend(tempArray)    # add a copy of the temp array
    size >>= 1                        # chew off one bit
    tempArray.extend(tempArray)       # double the size of the temp array
print(time.time() - t0)
print(newArray[99999999])
#* * * * * * * * * * * * * * * * * * * * * *
# Not a power of 2, list with "*", run time = 1.19271993637
t0 = time.time()
newList = [4294967295] * 100000000
print(time.time() - t0)
print(newList[99999999])
I think it is interesting that the shorter list takes longer to build
than the one whose length is a power of 2. This may suggest that the
"list = [initializer] * N" method uses something similar to the
doubling method internally.
Also, tempArray (above) gets reallocated frequently, which suggests
that reallocation is not a big problem.
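If anyone wants to check these numbers on their own machine, timeit is
probably more trustworthy than my time.time() calls; a minimal
harness, assuming Python 3:
Code:
import timeit

doubling = """
a = array.array('I', [4294967295])
for i in range(27):
    a.extend(a)
"""
star = "l = [4294967295] * 134217728"

# number=1 because each statement builds a roughly 0.5 GB object
print(timeit.timeit(doubling, setup="import array", number=1))
print(timeit.timeit(star, number=1))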
Finally, I just looked into calling C functions, and found
PyMem_Malloc, PyMem_Realloc, PyMem_Free, etc. in the Memory Management
section of the Python/C API Reference Manual. These give you
uninitialized memory and should be really fast, but it's 6:45 AM
here, and I don't have the energy to try them.
Gil