|
For: VIZ-0000
Signed-off-by: Do Not Submit <DoNotSubmit@Nowhere.com>
|
|
The changes in gen8_emit_pipe_control_qw_store_index() and
gen8_emit_flush_coherentl3_wa() are probably necessary, or at least more
strictly correct than the current code.
OTOH turning preemption off/on in gen9_init_indirectctx_bb() and
gen9_init_perctx_bb() is just a hack, in case it isn't really fixed in
SKL D1+.
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
|
|
|
|
and implement several preemption level options.
New range is 0-7(+)
0: none
1: between-batch
2: mid-batch cooperative
3: mid-batch but no mid-thread or mid-object
4: mid-thread-group but not mid-thread or mid-object
5: mid-thread-group and mid-object (but still no mid-thread)
6: same as 5, but allow user override
7+: unrestricted (h/w defaults)
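A minimal sketch of how such a level parameter might decode into capability flags (all names here are illustrative, not the driver's actual identifiers; levels 2 and 3 are collapsed into one mid-batch flag for simplicity):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative decode of the 0-7(+) preemption-level module parameter.
 * Each level is a superset of the one below, so >= comparisons suffice. */
struct preempt_caps {
	bool between_batch;    /* level 1+ */
	bool mid_batch;        /* level 2+ */
	bool mid_thread_group; /* level 4+ */
	bool mid_object;       /* level 5+ */
	bool user_override;    /* level 6+ */
	bool mid_thread;       /* level 7+: unrestricted, h/w defaults */
};

static struct preempt_caps decode_preempt_level(int level)
{
	struct preempt_caps c = { false };

	if (level >= 1) c.between_batch = true;
	if (level >= 2) c.mid_batch = true;
	if (level >= 4) c.mid_thread_group = true;
	if (level >= 5) c.mid_object = true;
	if (level >= 6) c.user_override = true;
	if (level >= 7) c.mid_thread = true;
	return c;
}
```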
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
Since we now use PIPE_CONTROL to update seqno on the render ring, it
may be necessary to include a few extra workarounds for this opcode.
Using PIPE_CONTROL this way was recently introduced in:
7c17d37 drm/i915: Use ordered seqno write interrupt generation on gen8+
execlists
which appears to have triggered some stability regression in Yocto after
recent forklift.
Conflicts:
drivers/gpu/drm/i915/intel_lrc.c
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
The code in gen6_render_ring_flush() wasn't quite right; whereas the
workarounds were emitted with GFX_OP_PIPE_CONTROL(5) (allowing for a
QWord write), the final PIPE_CONTROL used GFX_OP_PIPE_CONTROL(4) even
though it would also have the PIPE_CONTROL_QW_WRITE flag set. This may
(should?) have resulted in the next instruction word being consumed as
data, or the PIPE_CONTROL not working as intended due to the flags-vs-
length mismatch.
The refactored code has a Gen6-specific function which will emit a
complete PIPE_CONTROL sequence, with the correct number of extension
DWords. The emit-workarounds and emit-flush functions are then trivially
expressed as calls to the new low-level function.
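The flags-vs-length rule the fix enforces can be sketched as follows (the QW_WRITE bit position here is illustrative; the real value lives in the driver's headers):

```c
#include <assert.h>
#include <stdint.h>

#define PIPE_CONTROL_QW_WRITE (1u << 14) /* illustrative bit position */

/* Total DWords a Gen6 PIPE_CONTROL must occupy for a given flag set:
 * opcode + flags + address + 1 data DWord = 4; a QWord write carries
 * one extra data DWord, so the instruction must be emitted as
 * GFX_OP_PIPE_CONTROL(5), never (4) with QW_WRITE set. */
static unsigned int gen6_pipe_control_len(uint32_t flags)
{
	return (flags & PIPE_CONTROL_QW_WRITE) ? 5 : 4;
}
```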
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
The names used for the various flag bits in the PIPE_CONTROL instruction
are neither self-consistent nor self-explanatory, nor, in some cases,
easily mapped to the names used in the BSpec.
This patch renames a few of them to reduce ambiguity (DC != Depth
Cache) or increase consistency (e.g. remove "ENABLE" everywhere),
and explains which bits are supposed to go where (pre-SNB vs SNB vs
post-SNB).
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
WaEnableSamplerGPGPUPreemptionSupport fixes a problem
related to mid thread pre-emption.
Change-Id: Idea8e709eaa94e5d7addf901b4fb5d40fd744603
Tracked-On: https://jira01.devtools.intel.com/browse/OAM-21345
Signed-off-by: Tim Gore <tim.gore@intel.com>
|
|
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
v2: Fixed a typo (and improved the names in general). Updated for
changes to notify() code.
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
requests
This patch refactors the ringbuffer-level code (in execlists/GuC mode
only) and enhances it so that it can emit the proper sequence of opcodes
for preemption requests.
A preemption request is similar to a batch submission, but doesn't
actually invoke a batchbuffer, the purpose being simply to get the
engine to stop what it's doing so that the scheduler can then send it a
new workload instead.
Preemption requests use different locations in the hardware status page
than regular batches to hold the 'active' and 'done' seqnos, so that
information pertaining to a preempted batch is not overwritten. Also,
whereas a regular batch clears its 'active' flag when it finishes (so
that TDR knows it's no longer to blame), preemption requests leave this
set and the driver clears it once the completion of the preemption
request has been noticed. Only one preemption (per ring) can be in
progress at one time, so this handshake ensures correct sequencing of
the request between the GPU and CPU.
Actually-preemptive requests are still disabled via a module parameter
at this stage, but all the components should now be ready for us to turn
it on :)
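The active/done handshake described above can be modelled roughly as follows (struct and function names are illustrative, not the driver's):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the preemption handshake: the GPU writes the 'active'
 * seqno when the request starts and the 'done' seqno when it completes,
 * but (unlike a regular batch) leaves 'active' set; the CPU side clears
 * it once completion is noticed, releasing the ring for the next
 * preemption. */
struct hws_preempt_slots {
	uint32_t active; /* seqno written at request start */
	uint32_t done;   /* seqno written at request completion */
};

static void gpu_start(struct hws_preempt_slots *s, uint32_t seqno)
{
	s->active = seqno;
}

static void gpu_finish(struct hws_preempt_slots *s, uint32_t seqno)
{
	s->done = seqno; /* 'active' deliberately left set */
}

/* CPU side: returns true (and clears 'active') once the preemption
 * request has completed. */
static bool cpu_complete(struct hws_preempt_slots *s)
{
	if (s->active && s->done == s->active) {
		s->active = 0; /* driver, not GPU, clears the flag */
		return true;
	}
	return false;
}
```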
v2: Updated to use locally cached request pointer and to fix the
location of the dispatch trace point.
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
Batch buffers which have been pre-empted mid-way through execution
must be handled separately. Rather than simply re-submitting the batch
as a brand new piece of work, the driver only needs to requeue the
context. The hardware will take care of picking up where it left off.
v2: New patch in series.
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
This patch adds the scheduler logic for postprocessing of completed
preemption requests. It cleans out both the fence_signal list (dropping
references as it goes) and the primary request_list. Requests that
didn't complete are put into the 'preempted' state for resubmission by
the scheduler, and their ringbuffers are emptied by setting head==tail
so there is no pending work in any preempted context. The (dummy)
preemption request itself is also recycled in the same way, and should
then be (re)selected by the scheduler to be submitted next (unless
anything with even higher priority has been queued in the meantime); but
because there are now no requests flying, the next-submitted batch will
not need to preempt, and so will be launched 'for real' as a regular
non-preemptive batch.
Actually-preemptive requests are still disabled via a module parameter
at this stage, as we don't yet have the code to emit preemption requests
into the ringbuffer.
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
After preemption, we need to empty out the ringbuffers associated
with preempted requests, so that the scheduler has a clean ring
into which to (re-)insert requests (not necessarily in the same
order as before they were preempted).
So this patch refactors the existing routine intel_lr_context_reset()
into a new inner core intel_lr_context_resync() which just updates
a context and the associated ringbuffer, and an outer wrapper which
implements the original operation of intel_lr_context_reset() in
terms of resync().
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
Author: John Harrison <John.C.Harrison@Intel.com>
Date: Thu Apr 10 10:41:06 2014 +0100
The scheduler needs to know what each seqno that pops out of the ring is
referring to. This change adds a hook into the 'submit some random
work that got forgotten about' clean up code to inform the scheduler
that a new seqno has been sent to the ring for some non-batch buffer
operation.
Reworked for latest scheduler+preemption by Dave Gordon: with the newer
implementation, knowing about untracked requests is merely helpful for
debugging rather than being mandatory, as we have already taken steps to
prevent untracked requests intruding at awkward moments!
v2: Removed unnecessary debug spew.
For: VIZ-2021
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
This patch adds the GEM & scheduler logic for detection and first-stage
processing of completed preemption requests. Similar to regular batches,
they deposit their sequence number in the hardware status page when
starting and again when finished, but using different locations so that
information pertaining to a preempted batch is not overwritten. Also,
the in-progress flag is not cleared by the GPU at the end of the batch;
instead driver software is responsible for clearing this once the
request completion has been noticed.
Actually-preemptive requests are still disabled via a module parameter
at this early stage, as the rest of the logic to deal with the
consequences of preemption isn't in place yet.
v2: Re-worked to simplify 'pre-emption in progress' logic.
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
If the scheduler is busy (e.g. processing a preemption) it will need to
be able to acquire the struct_mutex, so we can't allow untracked
requests to bypass the scheduler and go directly to the hardware (much
confusion will result). Since untracked requests are used only for
initialisation of logical contexts, we can avoid the problem by forcing
any thread trying to initialise a context at an unfortunate time to drop
the mutex and retry later.
*v?* Add documentation.
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
Once a preemptive request has been dispatched to the hardware-layer
submission mechanism, the scheduler must not send any further requests
to the same ring until the preemption completes. Here we add the logic
that ensures that only one preemption per ring can be in progress at one
time.
Actually-preemptive requests are still disabled via a module parameter
at this early stage, as the logic to process completion isn't in place
yet.
*v?* Added documentation.
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
This patch adds the scheduler logic for managing potentially preemptive
requests, including validating dependencies and working out when a
request can be downgraded to non-preemptive (e.g. when there's nothing
ahead for it to preempt).
Actually-preemptive requests are still disabled via a module parameter
at this early stage, as the rest of the logic to deal with the
consequences of preemption isn't in place yet.
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
This is the very first stage of the scheduler's preemption logic, where
it determines whether a request should be marked as potentially
preemptive, at the point where it is added to the scheduler's queue.
Subsequent logic will determine how to handle the request on the basis
of the flags set here.
Actually-preemptive requests are disabled via a module parameter at this
early stage, as the rest of the logic to process them isn't in place
yet.
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
With the scheduler, request allocation can happen long before
the ring is filled in, and in a different order. So for that case,
we update the request head at the start of _final (the initialisation
on allocation is still useful for the direct-submission mode).
v2: Updated to use locally cached request pointer.
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
The current setting of request 'head' in add_request() isn't useful
and has been replaced for purposes of knowing how full the ring is
by 'postfix'. So we can instead use 'head' to define and locate the
entire range spanned by a request.
Pictorially,
                 head  postfix  tail
                   |      |       |
                   v      v       v
ringbuffer: [......S......P.......I.....]
where S, P, and I are the Start of the request, start of the Postfix,
and the user-Interrupt respectively. To help with debugging, this
request's tail should also be the next request's head, thus showing
that all ringbuffer usage is accounted for.
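The accounting property described above (each request's tail equals the next request's head) can be expressed as a simple check (types and names are illustrative):

```c
#include <assert.h>

/* One request's footprint in the ringbuffer: S (head), P (postfix)
 * and I (tail, where the user-interrupt sits). */
struct req_span {
	unsigned int head, postfix, tail;
};

/* Returns 1 iff consecutive requests tile the ring with no gaps:
 * every request's tail is the next request's head. */
static int spans_account_for_ring(const struct req_span *r, int n)
{
	for (int i = 0; i + 1 < n; i++)
		if (r[i].tail != r[i + 1].head)
			return 0;
	return 1;
}
```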
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
'relative_constants_mode' has always been tracked per-device, but this
is wrong in execlists (or GuC) mode, as INSTPM is saved and restored
with the logical context, and the per-context value could therefore get
out of sync with the tracked value. This patch moves the tracking
element from the dev_priv structure into the intel_context structure,
with corresponding adjustments to the code that initialises and uses it.
Test case (if anyone wants to write it) would be to create two contexts,
submit a batch with a non-default mode in the first context, submit a
batch with the default mode in the other context, submit another batch
in the first context, but this time in default mode. The driver will
fail to insert the instructions to reset INSTPM into the first context's
ringbuffer.
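A toy reproduction of that failure mode, using a single device-wide tracking variable (all names are illustrative, not the driver's):

```c
#include <assert.h>

/* Per-context saved INSTPM state (what the logical context restores). */
struct ctx {
	int instpm_mode;
};

/* Buggy per-device tracking: the INSTPM update is skipped whenever the
 * submitted mode matches the last mode seen on *any* context, even
 * though this context's saved INSTPM may still hold an older value.
 * Returns 1 if the update instructions were emitted. */
static int submit_per_device(int *dev_mode, struct ctx *c, int mode)
{
	int emitted = 0;

	if (mode != *dev_mode) {
		c->instpm_mode = mode; /* emit INSTPM write into this ring */
		emitted = 1;
	}
	*dev_mode = mode;
	return emitted;
}
```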
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92448
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
Record a few more things about the requests outstanding at the time of
capture ...
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
Context capture hasn't worked for a while now, probably since the
introduction of execlists; this patch makes it work again by using
a different way of identifying the context of interest.
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
Per-context initialisation GPU instructions (which are injected directly
into the ringbuffer rather than being submitted as a batch) should not
be allowed to mix with user-generated batches in the same submission; it
will cause confusion for the GuC (which might merge a subsequent
preemptive request with the non-preemptive initialisation code), and for
the scheduler, which wouldn't know how to reinject a non-batch request
if it were the victim of preemption.
Therefore, we should wait for the initialisation request to complete
before making the newly-initialised context available for user-mode
submissions.
Here, we add a call to i915_wait_request() after each existing call to
i915_add_request_no_flush() (in i915_gem_init_hw(), for the default
per-engine contexts, and intel_lr_context_deferred_create(), for all
others).
Adapted from Alex's earlier patch, which added the wait only to
intel_lr_context_render_state_init(), and which John Harrison was
dubious about:
"JH thinks this isn't a good idea. Why do we need to wait?".
But we will need to after all, if only because of preemption.
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
If a batch is submitted via the preemptive (KMD_HIGH-priority) client
then instead of ringing the doorbell we dispatch it using the GuC
"REQUEST_PREEMPTION" action. Also, we specify "clear work queue" and
"clear submit queue" in that request, so the scheduler can reconsider
what is to be done next after preemption.
Note that the preemption request requires a reference to the GuC per-
context shared data, which in early versions of the GuC firmware was at
the end of the context object but nowadays is at the start.
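The shape of such an action might look like the following (the action number and flag values are placeholders, not the GuC firmware ABI):

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder encodings for the host-to-GuC preemption request. */
#define HOST2GUC_ACTION_REQUEST_PREEMPTION 0x2        /* placeholder */
#define PREEMPT_FLAG_CLEAR_WORK_QUEUE      (1u << 0)  /* placeholder */
#define PREEMPT_FLAG_CLEAR_SUBMIT_QUEUE    (1u << 1)  /* placeholder */

/* Builds the 3-DWord action payload: action id, the target context
 * descriptor, and the "clear work queue" / "clear submit queue" flags
 * so the scheduler can reconsider what to run after preemption. */
static void build_preempt_action(uint32_t *data, uint32_t ctx_desc)
{
	data[0] = HOST2GUC_ACTION_REQUEST_PREEMPTION;
	data[1] = ctx_desc;
	data[2] = PREEMPT_FLAG_CLEAR_WORK_QUEUE |
		  PREEMPT_FLAG_CLEAR_SUBMIT_QUEUE;
}
```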
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
This second client is created with priority KMD_HIGH, and marked
as preemptive.
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
Also decode and output CSB entries, in time order
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
At present, execlist status/ctx_id and CSBs, not the submission queue
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
To reinitialise a ringbuffer after a hang (or preemption), we need not
only to set both h/w and s/w HEAD and TAIL to 0, but also to clear
last_retired_head and recalculate the available space.
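A sketch of that resync step (field names mirror the driver's ringbuffer struct but the code is illustrative; the space formula follows the old intel_ring_space() convention of keeping a few bytes free):

```c
#include <assert.h>

struct ring {
	unsigned int head, tail, size;
	int last_retired_head;
};

/* Free space with wrap-around; 8 bytes are kept unused so a full ring
 * is distinguishable from an empty one. */
static unsigned int ring_space(const struct ring *r)
{
	int space = (int)r->head - (int)(r->tail + 8);

	if (space < 0)
		space += (int)r->size;
	return (unsigned int)space;
}

/* Reset both ends of the ring and invalidate stale retirement info,
 * then the caller recalculates the available space. */
static void ring_resync(struct ring *r)
{
	r->head = 0;
	r->tail = 0;
	r->last_retired_head = -1;
}
```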
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
Added various definitions that will be useful for the scheduler in
general and pre-emptive context switching in particular.
v4: Corrected a spelling typo in a comment.
For: VIZ-1587
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
|
|
Big squash of patches to implement pre-emptive scheduling.
NB: MI_ARB_CHECK must be disabled in LRC mode otherwise the context
switch interrupts stop coming out :(.
Change-Id: I5dc3facb962a492226b642542dc48407b3d2602d
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
|
|
For: VIZ-0000
Signed-off-by: Do Not Submit <DoNotSubmit@Nowhere.com>
|
|
There are useful statistics and debug information about fences that
can be returned via the scheduler's existing reporting mechanisms
(sysfs and debug output). These changes were previously part of the
patches that originally added those mechanisms. However, as the sync
framework has now been rebased to after the scheduler patches, they
must now be done as a separate patch on top.
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
|