author     Søren Sandmann Pedersen <ssp@l3000.localdomain>  2012-05-13 12:39:10 -0400
committer  Søren Sandmann Pedersen <ssp@l3000.localdomain>  2012-05-13 12:39:10 -0400
commit     76c81c6fa2044a3614e32d0b29327320fac29408 (patch)
tree       f440bf1b6fcc4cd5e1ea432e595a559cd9a0305c
parent     d009c6234c499968bba36901160dae658df70b50 (diff)
More text
-rw-r--r--  radixheap  96
1 file changed, 59 insertions(+), 37 deletions(-)
diff --git a/radixheap b/radixheap
index e42f9a0..d7f60ad 100644
--- a/radixheap
+++ b/radixheap
@@ -1,15 +1,15 @@
-The radix heap
+The Radix Heap
The binary heap is a well-known data structure that allows a priority
queue to be implemented in time O(log n). It's quite efficient because
-it can be done as an array. However, if you accept a few restricitons,
+it can be done as an array. However, if you accept a few restrictions,
it's possible to do better from a caching point of view. The
restrictions that you must accept are (a) that all the keys in the
heap are integers and (b) that you never insert a new item that is
smaller than the last item that was extracted from the heap.
-These restrictions are not that severe. The resulting heap still works
-in many algorithms that use heaps as a subroutine: Dijkstra's
+These restrictions are not that severe. The resulting Radix Heap still
+works in many algorithms that use heaps as a subroutine: Dijkstra's
shortest-path algorithm, Prim's minimum spanning tree algorithm,
various sweepline algorithms in computational geometry.
@@ -33,9 +33,10 @@ The invariant is as follows:
that is different from last_deleted. The items in bucket 0
are equal to last_deleted.
-That means items in bucket 10 are different in the 9th bit, and
-possibly in bit 0 to 8. And items in bucket 32 are different in bit 31
-and possibly any other bit.
+That means for items in bucket 10, bit 9 is different, bits 0 to 8
+are possibly different, and bits 10 to 31 are equal to those in
+last_deleted. Items stored in bucket 32 are different in bit 31 and
+possibly in any other bit.
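+
+As a concrete picture of that state, here is a minimal sketch in C of
+what the heap might look like for 32-bit keys. The type and field
+names are just illustrative, not taken from any particular
+implementation:
+
+    #include <stdint.h>
+
+    /* One bucket for each possible position of the highest bit that
+     * differs from last_deleted, plus bucket 0 for items equal to
+     * last_deleted: 33 buckets for 32-bit keys. */
+    #define N_BUCKETS 33
+
+    typedef struct
+    {
+        uint32_t *items;        /* dynamically growing array of keys */
+        int       n_items;
+        int       capacity;
+    } bucket_t;
+
+    typedef struct
+    {
+        uint32_t last_deleted;  /* the most recently extracted item */
+        bucket_t buckets[N_BUCKETS];
+    } radix_heap_t;
+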
Operations
@@ -43,22 +44,16 @@ When a new item is inserted, it has to be added to the correct
bucket. How can we compute the bucket number? We have to find the
highest bit where the new item differs from last_deleted. This is
easily done by XOR'ing them together, and then finding the highest set
-bit. Adding one gives the bucket number. Inserting a new item is as
-simple as adding it to the bucket. The invariant is clearly preserved
-because last_deleted didn't change and the new item will be in the
-correct bucket.
+bit. Adding one then gives the bucket number. Inserting the item
+clearly preserves the invariant because last_deleted didn't change and
+the new item will be in the correct bucket.
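+
+For instance, a sketch of insertion along those lines might look like
+this. It assumes the radix_heap_t type from the sketch above, uses
+GCC's __builtin_clz to find the highest set bit, and relies on a
+hypothetical bucket_push() helper that appends a key to a bucket's
+array:
+
+    /* Position of the highest set bit of x (x must be non-zero).
+     * __builtin_clz typically compiles to a single instruction such
+     * as bsr or lzcnt on x86. */
+    static int
+    highest_set_bit (uint32_t x)
+    {
+        return 31 - __builtin_clz (x);
+    }
+
+    /* Items equal to last_deleted go in bucket 0; otherwise the
+     * bucket number is one more than the highest differing bit. */
+    static int
+    bucket_number (radix_heap_t *heap, uint32_t key)
+    {
+        uint32_t diff = key ^ heap->last_deleted;
+
+        return diff ? highest_set_bit (diff) + 1 : 0;
+    }
+
+    /* Insert a key; the key must not be smaller than last_deleted. */
+    void
+    radix_heap_insert (radix_heap_t *heap, uint32_t key)
+    {
+        bucket_push (&heap->buckets[bucket_number (heap, key)], key);
+    }
+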
Extracting the minimum involves first finding the minimal item by
walking the lowest-numbered non-empty bucket and finding the minimal
item in that bucket. Then that item is deleted and last_deleted is
-updated. Then the bucket is walked again and each item is reinserted
+updated. Then the bucket is walked again and each item is reinserted
into a new bucket according to the new last_deleted item.
-We have to make sure this preserves this invariant. All the items in
-the bucket that is being redistributed will satisfy the invariant
-because they are simply being reinserted. The items that are not being
-redistributed will still differ
-
We have to be sure this doesn't violate the invariant. First note that
all the items in the bucket that is being redistributed will satisfy
the invariant because they are simply being reinserted. And a bucket
@@ -68,24 +63,51 @@ itself have belonged in bucket c. And there are no non-empty buckets
with a lower number than the one that is being redistributed.
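+
+To make the procedure concrete, here is a sketch of extraction in the
+same hypothetical C implementation as above. It assumes the heap is
+not empty and that keys may contain duplicates:
+
+    uint32_t
+    radix_heap_extract_min (radix_heap_t *heap)
+    {
+        bucket_t *bucket;
+        uint32_t min;
+        int b, i, min_idx, n;
+
+        /* Find the lowest-numbered non-empty bucket. */
+        for (b = 0; heap->buckets[b].n_items == 0; b++)
+            ;
+        bucket = &heap->buckets[b];
+
+        /* Find the minimal item in that bucket. */
+        min_idx = 0;
+        for (i = 1; i < bucket->n_items; i++)
+        {
+            if (bucket->items[i] < bucket->items[min_idx])
+                min_idx = i;
+        }
+        min = bucket->items[min_idx];
+
+        /* Delete it by overwriting it with the last item. */
+        bucket->items[min_idx] = bucket->items[--bucket->n_items];
+
+        if (b > 0)
+        {
+            /* Update last_deleted and walk the bucket again,
+             * reinserting each remaining item. They all move to
+             * lower-numbered buckets, so it is safe to empty this
+             * bucket before reinserting from its array. */
+            heap->last_deleted = min;
+
+            n = bucket->n_items;
+            bucket->n_items = 0;
+
+            for (i = 0; i < n; i++)
+                radix_heap_insert (heap, bucket->items[i]);
+        }
+
+        /* If b is 0, the remaining items are still equal to
+         * last_deleted, so nothing needs to be redistributed. */
+
+        return min;
+    }
+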
Performance
-- Of insertion
-- Of extraction
-
-Refinement
-- Maintain a pointer to the minimum item in each bucket.
-- Maintain a pointer to the minimum bucket
-
-
-
-So
-
-items are different from the previous last_deleted at position k -
-1. Since a bit can only be 0 or 1,
-
-the removed item has the same bit
-One reason the radix heap is more efficient than the binary heap is
-that it has better caching behavior when implemented properly. Various
-descriptions of this data structure recommend implementing the buckets
-with doubly linked lists, but it's much better to implement them as
-arrays.
+The time complexity of insertion is O(1) since all that is required is
+adding the new item to a bucket. (There is also a requirement to
+determine the highest set bit, but that can be done in constant time
+with an instruction such as bsr).
+
+The performance of extraction is dominated by the redistribution of
+items. When a bucket is redistributed, it ends up being empty. To see
+why, remember that all the items in bucket k differ from the old
+last_deleted in bit k-1. The new last_deleted also comes from bucket
+k, so it differs from the old one in bit k-1 as well, and since a bit
+can only be 0 or 1, the items must all be *equal* to the new
+last_deleted in bit k-1 (and in every higher bit). Hence they will all
+be redistributed to a lower-numbered bucket.
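+
+As a concrete example, suppose last_deleted is 2 (binary 0010) and
+bucket 3 holds 4, 6 and 7, each of which differs from 2 in bit 2.
+Extracting the minimum makes 4 (binary 0100) the new last_deleted, and
+the remaining items 6 and 7 now agree with it in bit 2, so both move
+down to bucket 2 (the highest bit where they differ from 4 is bit 1).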
+
+Now consider the life-cycle of a single element. In the worst case, it
+starts out being added to bucket 32. Then each redistribution moves
+the element to a lower-numbered bucket. When it reaches bucket 0, it
+will be next in line for extraction. Since a redistribution takes
+constant time per element distributed, it follows that the total
+amount of redistribution that an element will experience is
+proportional to the number of bits in the keys.
+
+In other words, in the worst case the time complexity of extraction is
+O(d) where d is the number of bits in the integer keys. It's worth
+noting that in practice the cost will usually be lower, since most
+items will not move through all the buckets.
+
+
+Caching performance
+
+Some descriptions of the radix heap recommend implementing the buckets
+as doubly linked lists. That's a big mistake because cache locality
+will be terrible. Instead, implement them as dynamically growing
+arrays.
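+
+For example, the bucket_push() helper assumed in the sketches above
+might be implemented like this, keeping each bucket contiguous in
+memory and growing it geometrically:
+
+    #include <stdlib.h>
+
+    /* Append a key to a bucket, doubling the array when it is full. */
+    static void
+    bucket_push (bucket_t *bucket, uint32_t key)
+    {
+        if (bucket->n_items == bucket->capacity)
+        {
+            bucket->capacity = bucket->capacity ? 2 * bucket->capacity : 16;
+            bucket->items = realloc (bucket->items,
+                                     bucket->capacity * sizeof (uint32_t));
+        }
+
+        bucket->items[bucket->n_items++] = key;
+    }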
+
+If you do that, the top of the buckets will tend to be hot, which
+means the per-item time to redistribute a bucket will be more like
+O(1/B), where B is the number of integers in a cache line. This means
+the amortized complexity of extraction will be closer to O(d/B) than
+to O(d).
+
+In a regular binary heap, both insertion and extraction require O(log
+n) swaps, and each swap (except those very close to the top of the
+heap) will cause a cache miss.
+
+In other words, if d = Theta(log n), extraction from a radix heap
+will need O(log n / B) cache misses, whereas a binary heap will
+require O(log n).