author Topi Pohjolainen <topi.pohjolainen@intel.com> 2017-10-25 16:50:11 +0300
committer Topi Pohjolainen <topi.pohjolainen@intel.com> 2017-10-25 16:51:14 +0300
commit 8057bae2e73701bd3f6cc357a499775864620d58
tree 9458c8a61fd4eff9412c1d502581db5e9ad502e8
parent 9e5a5a11ed93637fe28735e3dd161e59c4c3e5d0
intel/compiler/gen9: Pixel shader header only workaround (daimler)
Fixes intermittent GPU hangs on Broxton with an Intel internal test case.

There are plenty of similar fragment shaders in piglit that do not use any varyings and any uniforms. According to the documentation special timing is needed between pipeline stages. Apparently we just don't hit that with piglit. Even with the failing test case one doesn't always get the hang.

Moreover, according to the error states the hang happens significantly later than the execution of the problematic shader. There are multiple render cycles (primitive submissions) in between. I've also seen error states where the ACTHD points outside the batch. Almost as if the hardware writes somewhere that gets used later on. That would also explain why piglit doesn't suffer from this - most tests kick off one render cycle and any corruption is left unseen.

v2 (Ken): Instead of enabling push constants, enable one of the inputs (PSIZ).

CC: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
-rw-r--r-- src/intel/compiler/brw_fs.cpp 29
1 file changed, 29 insertions, 0 deletions
diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
index 30e8841242..f62b8f5a42 100644
--- a/src/intel/compiler/brw_fs.cpp
+++ b/src/intel/compiler/brw_fs.cpp
@@ -6164,6 +6164,31 @@ fs_visitor::run_gs()
    return !failed;
 }
 
+/* From the SKL PRM, Volume 16, Workarounds:
+ *
+ * 0877 3D Pixel Shader Hang possible when pixel shader dispatched with
+ * only header phases (R0-R2)
+ *
+ * WA: Enable a non-header phase (e.g. push constant) when dispatch would
+ * have been header only.
+ *
+ * Instead of enabling push constants one can alternatively enable one of the
+ * inputs. Here one simply chooses point size which shouldn't impose much
+ * overhead.
+ */
+static void
+gen9_ps_header_only_workaround(struct brw_wm_prog_data *wm_prog_data)
+{
+   if (wm_prog_data->num_varying_inputs)
+      return;
+
+   if (wm_prog_data->base.curb_read_length)
+      return;
+
+   wm_prog_data->urb_setup[VARYING_SLOT_PSIZ] = 0;
+   wm_prog_data->num_varying_inputs = 1;
+}
+
 bool
 fs_visitor::run_fs(bool allow_spilling, bool do_rep_send)
 {
@@ -6227,6 +6252,10 @@ fs_visitor::run_fs(bool allow_spilling, bool do_rep_send)
       optimize();
 
       assign_curb_setup();
+
+      if (devinfo->gen >= 9)
+         gen9_ps_header_only_workaround(wm_prog_data);
+
       assign_urb_setup();
 
       fixup_3src_null_dest();
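
The decision made by the patch can be sketched outside the driver as a small standalone C model. This is only an illustration under stated assumptions: `model_wm_prog_data`, `MODEL_SLOT_PSIZ`, `model_init` and `model_header_only_workaround` are hypothetical stand-ins invented here, not Mesa's definitions, and the convention that `-1` marks an unread slot is an assumption of the model. The generation check that the patch places in `run_fs()` is folded into the helper for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's types; NOT Mesa's definitions. */
enum { MODEL_SLOT_PSIZ = 0, MODEL_SLOT_COUNT = 4 };

struct model_wm_prog_data {
   unsigned num_varying_inputs;      /* inputs the fragment shader reads */
   unsigned curb_read_length;        /* push constant registers it reads */
   int urb_setup[MODEL_SLOT_COUNT];  /* model assumption: -1 = slot unread */
};

/* Start from a shader that reads no inputs and no push constants. */
static void
model_init(struct model_wm_prog_data *prog)
{
   assert(prog != NULL);
   prog->num_varying_inputs = 0;
   prog->curb_read_length = 0;
   for (int i = 0; i < MODEL_SLOT_COUNT; i++)
      prog->urb_setup[i] = -1;
}

/* Mirrors the checks in the patch: if the dispatch would carry only the
 * header, force one input (point size) so a non-header phase exists.
 * Returns true when the workaround was applied.
 */
static bool
model_header_only_workaround(struct model_wm_prog_data *prog, int gen)
{
   assert(prog != NULL);

   if (gen < 9)
      return false;                  /* the patch only targets gen >= 9 */

   if (prog->num_varying_inputs)
      return false;                  /* an input phase already exists */

   if (prog->curb_read_length)
      return false;                  /* a push constant phase already exists */

   prog->urb_setup[MODEL_SLOT_PSIZ] = 0;
   prog->num_varying_inputs = 1;
   return true;
}
```

In this model a shader with no varyings and no push constants ends up with exactly one input, while a shader that already reads either is left untouched, which matches the two early returns in the patch.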