diff options
author | Jordan Justen <jordan.l.justen@intel.com> | 2018-03-06 23:28:00 -0800 |
---|---|---|
committer | Jordan Justen <jordan.l.justen@intel.com> | 2018-03-09 16:15:58 -0800 |
commit | 24b415270ffeef873ba4772d1b3c7c185c9b1958 (patch) | |
tree | 8b45498a09faa641af94d564ab4361153b9f3d38 | |
parent | 06e3bd02c01e499332a9c02b40f506df9695bced (diff) |
intel/vulkan: Hard code CS scratch_ids_per_subslice for Cherryview
Ken suggested that we might be underallocating scratch space on HD
400. Allocating scratch space as though there was actually 8 EUs
seems to help with a GPU hang seen on synmark CSDof.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
-rw-r--r-- | src/intel/vulkan/anv_allocator.c | 45 |
1 files changed, 28 insertions, 17 deletions
diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/anv_allocator.c index fe14d6cfab..a27af4eccc 100644 --- a/src/intel/vulkan/anv_allocator.c +++ b/src/intel/vulkan/anv_allocator.c @@ -1097,24 +1097,35 @@ anv_scratch_pool_alloc(struct anv_device *device, struct anv_scratch_pool *pool, &device->instance->physicalDevice; const struct gen_device_info *devinfo = &physical_device->info; - /* WaCSScratchSize:hsw - * - * Haswell's scratch space address calculation appears to be sparse - * rather than tightly packed. The Thread ID has bits indicating which - * subslice, EU within a subslice, and thread within an EU it is. - * There's a maximum of two slices and two subslices, so these can be - * stored with a single bit. Even though there are only 10 EUs per - * subslice, this is stored in 4 bits, so there's an effective maximum - * value of 16 EUs. Similarly, although there are only 7 threads per EU, - * this is stored in a 3 bit number, giving an effective maximum value - * of 8 threads per EU. - * - * This means that we need to use 16 * 8 instead of 10 * 7 for the - * number of threads per subslice. - */ const unsigned subslices = MAX2(physical_device->subslice_total, 1); - const unsigned scratch_ids_per_subslice = - device->info.is_haswell ? 16 * 8 : devinfo->max_cs_threads; + + unsigned scratch_ids_per_subslice; + if (devinfo->is_haswell) { + /* WaCSScratchSize:hsw + * + * Haswell's scratch space address calculation appears to be sparse + * rather than tightly packed. The Thread ID has bits indicating + * which subslice, EU within a subslice, and thread within an EU it + * is. There's a maximum of two slices and two subslices, so these + * can be stored with a single bit. Even though there are only 10 EUs + * per subslice, this is stored in 4 bits, so there's an effective + * maximum value of 16 EUs. Similarly, although there are only 7 + * threads per EU, this is stored in a 3 bit number, giving an + * effective maximum value of 8 threads per EU. + * + * This means that we need to use 16 * 8 instead of 10 * 7 for the + * number of threads per subslice. + */ + scratch_ids_per_subslice = 16 * 8; + } else if (devinfo->is_cherryview) { + /* Cherryview devices have either 6 or 8 EUs per subslice, and each EU + * has 7 threads. The 6 EU devices appear to calculate thread IDs as if + * it had 8 EUs. + */ + scratch_ids_per_subslice = 8 * 7; + } else { + scratch_ids_per_subslice = devinfo->max_cs_threads; + } uint32_t max_threads[] = { [MESA_SHADER_VERTEX] = devinfo->max_vs_threads, |