summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorJason Ekstrand <jason.ekstrand@intel.com>2018-05-21 09:51:50 -0700
committerDylan Baker <dylan@pnwbakers.com>2018-07-03 10:24:58 -0700
commit2fa4b4dfc0a9a02b8eb46d10e2dee359612698f8 (patch)
treed707d5f6a3eb843837576c991dc9da85ab99ed81
parentf102eada9b5c133146a2918e4514a6e582928eb3 (diff)
intel/fs: Split instructions low to high in lower_simd_width
Commit 0d905597f fixed an issue with the placement of the zip and unzip instructions. However, as a side-effect, it reversed the order in which we were emitting the split instructions so that they went from high group to low instead of low to high. This is fine for most things like texture instructions and the like but certain render target writes really want to be emitted low to high. This commit just switches the order back around to be low to high. Reviewed-by: Matt Turner <mattst88@gmail.com> Fixes: 0d905597f "intel/fs: Be more explicit about our placement of [un]zip" (cherry picked from commit d5b617a28e89fda62fb6cceec10686b0bb4b4fb2)
-rw-r--r--src/intel/compiler/brw_fs.cpp37
1 files changed, 35 insertions, 2 deletions
diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
index 815650706c..27018ee483 100644
--- a/src/intel/compiler/brw_fs.cpp
+++ b/src/intel/compiler/brw_fs.cpp
@@ -5588,16 +5588,49 @@ fs_visitor::lower_simd_width()
* after \p inst, inst->next is a moving target and we need to save
* it off here so that we insert the zip instructions in the right
* place.
+ *
+ * Since we're inserting split instructions after after_inst, the
+ * instructions will end up in the reverse order that we insert them.
+ * However, certain render target writes require that the low group
+ * instructions come before the high group. From the Ivy Bridge PRM
+ * Vol. 4, Pt. 1, Section 3.9.11:
+ *
+ * "If multiple SIMD8 Dual Source messages are delivered by the
+ * pixel shader thread, each SIMD8_DUALSRC_LO message must be
+ * issued before the SIMD8_DUALSRC_HI message with the same Slot
+ * Group Select setting."
+ *
+ * And, from Section 3.9.11.1 of the same PRM:
+ *
+ * "When SIMD32 or SIMD16 PS threads send render target writes
+ * with multiple SIMD8 and SIMD16 messages, the following must
+ * hold:
+ *
+ * All the slots (as described above) must have a corresponding
+ * render target write irrespective of the slot's validity. A slot
+ * is considered valid when at least one sample is enabled. For
+ * example, a SIMD16 PS thread must send two SIMD8 render target
+ * writes to cover all the slots.
+ *
+ * PS thread must send SIMD render target write messages with
+ * increasing slot numbers. For example, SIMD16 thread has
+ * Slot[15:0] and if two SIMD8 render target writes are used, the
+ * first SIMD8 render target write must send Slot[7:0] and the
+ * next one must send Slot[15:8]."
+ *
+ * In order to make low group instructions come before high group
+ * instructions (this is required for some render target writes), we
+ * split from the highest group to lowest.
*/
exec_node *const after_inst = inst->next;
- for (unsigned i = 0; i < n; i++) {
+ for (int i = n - 1; i >= 0; i--) {
/* Emit a copy of the original instruction with the lowered width.
* If the EOT flag was set throw it away except for the last
* instruction to avoid killing the thread prematurely.
*/
fs_inst split_inst = *inst;
split_inst.exec_size = lower_width;
- split_inst.eot = inst->eot && i == 0;
+ split_inst.eot = inst->eot && i == n - 1;
/* Select the correct channel enables for the i-th group, then
* transform the sources and destination and emit the lowered