MartinBroadhurst
In my experience, the most common data structures used for lists
supporting addition and removal at the tail, and random access, are
linked lists and dynamic arrays.
* Linked lists require following pointers for traversal (3 lookups per
element, poor cache performance), and have O(n) indexed access.
Adding and removing elements cause many small allocations and
deallocations of memory.
* Dynamic arrays have efficient traversal (1 lookup per element, good
cache performance) and indexed access is O(1).
Adding elements requires reallocation when full, however, which
involves copying all elements.
Instead, I should like to propose a structure consisting of a dynamic
array of increasing-sized arrays:
0 -> 0 1 2 3
1 -> 4 5 6 7
2 -> 8 9 10 11 12 13 14 15
3 -> 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
* Each array after the first two is twice the size of its predecessor.
This means that the structure has the same growth strategy as most
dynamic arrays (double when full), reducing the number of
allocations.
* As new arrays are appended to the end, no copying is required,
except for the pointers when the containing array is resized.
* The regular increase in the size of the arrays makes indexed access
efficient.
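To make the growth strategy concrete, here is a minimal sketch of the idea in C (the names and layout are mine, not necessarily those of the linked implementation): a resizable array of block pointers, where blocks 0 and 1 hold 4 elements each and every later block is twice the size of its predecessor (4, 4, 8, 16, 32, ...).

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int **blocks;      /* dynamic array of block pointers */
    size_t num_blocks; /* blocks allocated so far */
    size_t count;      /* elements stored */
    size_t capacity;   /* total capacity of all blocks */
} tiered_list;

/* Size of block b: 4, 4, 8, 16, 32, ... */
static size_t block_size(size_t b)
{
    return b < 2 ? 4 : (size_t)4 << (b - 1);
}

static int tiered_append(tiered_list *l, int value)
{
    if (l->count == l->capacity) {
        /* Full: add one new block. Only the (small) pointer array is
         * reallocated; the elements themselves are never copied. */
        size_t size = block_size(l->num_blocks);
        int **blocks = realloc(l->blocks,
                               (l->num_blocks + 1) * sizeof *blocks);
        if (blocks == NULL)
            return -1;
        l->blocks = blocks;
        l->blocks[l->num_blocks] = malloc(size * sizeof(int));
        if (l->blocks[l->num_blocks] == NULL)
            return -1;
        l->num_blocks++;
        l->capacity += size;
    }
    /* The next free slot sits in the tail block, at an offset of
     * count minus the combined capacity of all earlier blocks. */
    size_t last = block_size(l->num_blocks - 1);
    l->blocks[l->num_blocks - 1][l->count - (l->capacity - last)] = value;
    l->count++;
    return 0;
}
```

Appending 32 elements produces exactly the four blocks in the diagram above, with only four element allocations and no element copies along the way.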
I have placed the source code for an implementation here:
http://www.martinbroadhurst.com/c/list.html
To do:
* Allow specification of a starting capacity.
* The algorithm for finding the block for an index should use log2,
not repeated multiplication and addition.
* The block at the tail should not be freed as soon as it becomes
empty, as this means that a list whose population oscillates around a
power of 2 will repeatedly allocate and deallocate that block.
Instead, the block should be freed when some number of elements from
the next block have been removed, introducing a bit of "hysteresis".
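The log2-based block lookup from the second to-do item can be sketched as follows. Under the layout above (block sizes 4, 4, 8, 16, ...), an index i >= 4 lives in block floor(log2(i)) - 1 at offset i - 2^floor(log2(i)); indices 0 to 3 are in block 0. This is a hypothetical helper, not taken from the linked code:

```c
#include <assert.h>
#include <stddef.h>

/* Map a flat index onto (block, offset) for the block layout
 * 4, 4, 8, 16, ... using floor(log2(i)) rather than repeated
 * multiplication and addition. */
static void tiered_locate(size_t i, size_t *block, size_t *offset)
{
    if (i < 4) {            /* block 0 holds indices 0..3 */
        *block = 0;
        *offset = i;
        return;
    }
    size_t b = 0, n = i;
    while (n >>= 1)         /* b = floor(log2(i)); a compiler intrinsic */
        b++;                /* such as __builtin_clz could replace this */
    *block = b - 1;
    *offset = i - ((size_t)1 << b);
}
```

With a bit-scan intrinsic this makes indexed access a handful of instructions, keeping it close to the O(1) access of a plain dynamic array.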
Martin