summaryrefslogtreecommitdiff
path: root/src/intel
diff options
context:
space:
mode:
authorJason Ekstrand <jason.ekstrand@intel.com>2017-04-24 01:51:51 -0700
committerJason Ekstrand <jason.ekstrand@intel.com>2017-05-04 19:07:54 -0700
commitf82d3d38b62048246c4df999a1789b5cb60184c6 (patch)
treea2ebc05fe825f8149e87edcbca1ef10f9f927d7f /src/intel
parent8c079b566e59df2f6d0e0deb951078aba862991d (diff)
anv/allocator: Allow state pools to allocate large states
Previously, the maximum size of a state that could be allocated from a state pool was a block. However, this has caused us various issues particularly with shaders which are potentially very large. We've also hit issues with render passes with a large number of attachments when we go to allocate the block of surface state. This effectively removes the restriction on the maximum size of a single state. (There's still a limit of 1MB imposed by a fixed-length bucket array.) For states larger than the block size, we just grab a large block off of the block pool rather than sub-allocating. When we go to allocate some chunk of state and the current bucket does not have state, we try to pull a chunk from some larger bucket and split it up. This should improve memory usage if a client occasionally allocates a large block of state. This commit is inspired by some similar work done by Juan A. Suarez Romero <jasuarez@igalia.com>. Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Diffstat (limited to 'src/intel')
-rw-r--r--src/intel/vulkan/anv_allocator.c69
1 file changed, 69 insertions, 0 deletions
diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/anv_allocator.c
index a74b8decd2..3988a1ab2b 100644
--- a/src/intel/vulkan/anv_allocator.c
+++ b/src/intel/vulkan/anv_allocator.c
@@ -663,6 +663,12 @@ anv_fixed_size_state_pool_alloc_new(struct anv_fixed_size_state_pool *pool,
struct anv_block_state block, old, new;
uint32_t offset;
+ /* If our state is large, we don't need any sub-allocation from a block.
+ * Instead, we just grab whole (potentially large) blocks.
+ */
+ if (state_size >= block_size)
+ return anv_block_pool_alloc(block_pool, state_size);
+
restart:
block.u64 = __sync_fetch_and_add(&pool->block.u64, state_size);
@@ -715,6 +721,69 @@ anv_state_pool_alloc_no_vg(struct anv_state_pool *pool,
goto done;
}
+ /* Try to grab a chunk from some larger bucket and split it up */
+ for (unsigned b = bucket + 1; b < ANV_STATE_BUCKETS; b++) {
+ int32_t chunk_offset;
+ if (anv_free_list_pop(&pool->buckets[b].free_list,
+ &pool->block_pool.map, &chunk_offset)) {
+ unsigned chunk_size = anv_state_pool_get_bucket_size(b);
+
+ /* We've found a chunk that's larger than the requested state size.
+ * There are a couple of options as to what we do with it:
+ *
+ * 1) We could fully split the chunk into state.alloc_size sized
+ * pieces. However, this would mean that allocating a 16B
+ * state could potentially split a 2MB chunk into 128K smaller
+ * chunks. This would lead to unnecessary fragmentation.
+ *
+ * 2) The classic "buddy allocator" method would have us split the
+ * chunk in half and return one half. Then we would split the
+ * remaining half in half and return one half, and repeat as
+ * needed until we get down to the size we want. However, if
+ * you are allocating a bunch of the same size state (which is
+ * the common case), this means that every other allocation has
+ * to go up a level and every fourth goes up two levels, etc.
+ * This is not nearly as efficient as it could be if we did a
+ * little more work up-front.
+ *
+ * 3) Split the difference between (1) and (2) by doing a
+ * two-level split. If it's bigger than some fixed block_size,
+ * we split it into block_size sized chunks and return all but
+ * one of them. Then we split what remains into
+ * state.alloc_size sized chunks and return all but one.
+ *
+ * We choose option (3).
+ */
+ if (chunk_size > pool->block_size &&
+ state.alloc_size < pool->block_size) {
+ assert(chunk_size % pool->block_size == 0);
+ /* We don't want to split giant chunks into tiny chunks. Instead,
+ * break anything bigger than a block into block-sized chunks and
+ * then break it down into bucket-sized chunks from there. Return
+ * all but the first block of the chunk to the block bucket.
+ */
+ const uint32_t block_bucket =
+ anv_state_pool_get_bucket(pool->block_size);
+ anv_free_list_push(&pool->buckets[block_bucket].free_list,
+ pool->block_pool.map,
+ chunk_offset + pool->block_size,
+ pool->block_size,
+ (chunk_size / pool->block_size) - 1);
+ chunk_size = pool->block_size;
+ }
+
+ assert(chunk_size % state.alloc_size == 0);
+ anv_free_list_push(&pool->buckets[bucket].free_list,
+ pool->block_pool.map,
+ chunk_offset + state.alloc_size,
+ state.alloc_size,
+ (chunk_size / state.alloc_size) - 1);
+
+ state.offset = chunk_offset;
+ goto done;
+ }
+ }
+
state.offset = anv_fixed_size_state_pool_alloc_new(&pool->buckets[bucket],
&pool->block_pool,
state.alloc_size,