summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorNirmoy Das <nirmoy.das@intel.com>2024-04-22 22:19:51 +0200
committerAndi Shyti <andi.shyti@linux.intel.com>2024-04-24 18:48:32 +0200
commit4d3421e04c5dc38baf15224c051256204f223c15 (patch)
tree55e9cca99f1d7a86e9dba479d47f74059761bf9b
parent31c3c53ee3a3e39aac690dffab75765d25e318dd (diff)
drm/i915: Fix gt reset with GuC submission is disableddrm-intel-gt-next-2024-04-26
Currently intel_gt_reset() kills the GuC and then resets requested engines. This is problematic because there is a dedicated CSB FIFO which only GuC can access and if that FIFO fills up, the hardware will block on the next context switch until there is space that means the system is effectively hung. If an engine is reset whilst actively executing a context, a CSB entry will be sent to say that the context has gone idle. Thus if reset happens on a very busy system then killing GuC before killing the engines will lead to deadlock because of filled up CSB FIFO. To address this issue, the GuC should be killed only after resetting the requested engines and before calling intel_gt_init_hw(). v2: Improve commit message(John) Cc: John Harrison <john.c.harrison@intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240422201951.633-2-nirmoy.das@intel.com
-rw-r--r--drivers/gpu/drm/i915/gt/intel_reset.c16
1 files changed, 14 insertions, 2 deletions
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index 6990cb5c89ae..de775f8ddc2a 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -879,8 +879,17 @@ static intel_engine_mask_t reset_prepare(struct intel_gt *gt)
intel_engine_mask_t awake = 0;
enum intel_engine_id id;
- /* For GuC mode, ensure submission is disabled before stopping ring */
- intel_uc_reset_prepare(&gt->uc);
+ /**
+ * For GuC mode with submission enabled, ensure submission
+ * is disabled before stopping ring.
+ *
+ * For GuC mode with submission disabled, ensure that GuC is not
+ * sanitized, do that after engine reset. reset_prepare()
+ * is followed by engine reset which in this mode requires GuC to
+ * process any CSB FIFO entries generated by the resets.
+ */
+ if (intel_uc_uses_guc_submission(&gt->uc))
+ intel_uc_reset_prepare(&gt->uc);
for_each_engine(engine, gt, id) {
if (intel_engine_pm_get_if_awake(engine))
@@ -1226,6 +1235,9 @@ void intel_gt_reset(struct intel_gt *gt,
intel_overlay_reset(gt->i915);
+ /* sanitize uC after engine reset */
+ if (!intel_uc_uses_guc_submission(&gt->uc))
+ intel_uc_reset_prepare(&gt->uc);
/*
* Next we need to restore the context, but we don't use those
* yet either...