Age | Commit message (Collapse) | Author | Files | Lines |
|
Added a facility for triggering the scheduler state dump via a debugfs
entry.
v2: New patch in series.
v6: Updated to newer nightly (lots of ring -> engine renaming).
Updated to use 'to_i915()' instead of dev_private. Converted all enum
labels to uppercase. [review feedback from Joonas Lahtinen]
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
|
|
When debugging batch buffer submission issues, it is useful to be able
to see what the current state of the scheduler is. This change adds
functions for decoding the internal scheduler state and reporting it.
v3: Updated a debug message with the new state_str() function.
v4: Wrapped some long lines to keep the style checker happy. Removed
the fence/sync code as that will now be part of a separate patch series.
v5: Removed forward declarations and white space. Added documentation.
[Joonas Lahtinen]
Also squashed in later patch to add seqno information from the start.
It was only being added in a separate patch due to historical reasons
which have since gone away.
v6: Updated to newer nightly (lots of ring -> engine renaming).
Added 'for_each_scheduler_node()' helper macro. Updated to use
'to_i915()' instead of dev_private. Converted all enum labels to
uppercase. Moved the enum to string conversion function to debugfs.c
rather than scheduler.c [review feedback from Joonas Lahtinen]
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
The scheduler has always tracked batch buffer dependencies based on
DRM object usage. This means that it will not submit a batch on one
ring that has outstanding dependencies still executing on other rings.
This is exactly the same synchronisation performed by
i915_gem_object_sync() using hardware semaphores where available and
CPU stalls where not (e.g. in execlist mode and/or on Gen8 hardware).
Unfortunately, when a batch buffer is submitted to the driver the
_object_sync() call happens first. Thus in case where hardware
semaphores are disabled, the driver has already stalled until the
dependency has been resolved.
This patch adds an optimisation to _object_sync() to ignore the
synchronisation in the case where it will subsequently be handled by
the scheduler. This removes the driver stall and (in the single
application case) provides near hardware semaphore performance even
when hardware semaphores are disabled. In a busy system where there is
other work that can be executed on the stalling ring, it provides
better than hardware semaphore performance as it removes the stall
from both the driver and from the hardware. There is also a theory
that this method should improve power usage as hardware semaphores are
apparently not very power efficient - the stalled ring does not go
into as low a power a state as when it is genuinely idle.
The optimisation is to check whether both ends of the synchronisation
are batch buffer requests. If they are, then the scheduler will have
the inter-dependency tracked and managed. If one or other end is not a
batch buffer request (e.g. a page flip) then the code falls back to
the CPU stall or hardware semaphore as appropriate.
To check whether the existing usage is a batch buffer, the code simply
calls the 'are you tracking this request' function of the scheduler on
the object's last_read_req member. To check whether the new usage is a
batch buffer, a flag is passed in from the caller.
v6: Updated to newer nightly (lots of ring -> engine renaming).
Replaced the i915_scheduler_is_request_tracked() function with
i915_scheduler_is_request_batch_buffer() as the need for the former
has gone away and it was really being used to ask the latter question
in a convoluted manner. [review feedback from Joonas Lahtinen]
Issue: VIZ-5566
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
|
|
If a given context submits too many hanging batch buffers then it will
be banned and no further batch buffers will be accepted for it.
However, it is possible that a large number of buffers may already
have been accepted and are sat in the scheduler waiting to be
executed. This patch adds a late ban check to ensure that these will
also be discarded.
v4: New patch in series.
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
|
|
Added an interface for user land applications/libraries/services to
set their GPU scheduler priority. This extends the existing context
parameter IOCTL interface to add a scheduler priority parameter. The
range is +/-1023 with +ve numbers meaning higher priority. Only
system processes may set a higher priority than the default (zero),
normal applications may only lower theirs.
v2: New patch in series.
v6: Updated to use 'to_i915()' instead of dev_private.
[review feedback from Joonas Lahtinen]
For: VIZ-1587
Signed-off-by: Dave Gordon <David.S.Gordon@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
|
|
Now that all the scheduler patches have been applied, it is safe to enable.
v5: Updated for new module parameter.
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
|
|
The TDR code needs to know what the scheduler is up to in order to
work out whether a ring is really hung or not.
v4: Removed some unnecessary braces to keep the style checker happy.
v5: Removed white space and added documentation. [Joonas Lahtinen]
Also updated for new module parameter.
v6: Updated to newer nightly (lots of ring -> engine renaming).
Added 'for_each_scheduler_node()' helper macro. Updated to use
'to_i915()' instead of dev_private. [review feedback from Joonas
Lahtinen]
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
It is useful for know what the scheduler is doing for both debugging
and performance analysis purposes. This change adds a bunch of
counters and such that keep track of various scheduler operations
(batches submitted, completed, flush requests, etc.). The data can
then be read in userland via the debugfs mechanism.
v2: Updated to match changes to scheduler implementation.
v3: Updated for changes to kill code and flush code.
v4: Removed the fence/sync code as that will be part of a separate
patch series. Wrapped a long line to keep the style checker happy.
v5: Updated to remove forward declarations and white space. Added
documentation. [Joonas Lahtinen]
Used lighter weight spinlocks.
v6: Updated to newer nightly (lots of ring -> engine renaming).
Added 'for_each_scheduler_node()' and 'assert_scheduler_lock_held()'
helper macros. Updated to use 'to_i915()' instead of dev_private.
Converted all enum labels to uppercase. Removed even more white space.
Moved the enum to string conversion function to debugfs.c rather than
scheduler.c [review feedback from Joonas Lahtinen]
Added running totals of 'flying' and 'queued' nodes rather than
re-calculating each time as a minor CPU performance optimisation.
Added stats to the new file queue wait implementation.
v6.1: Added a target engine parameter to the throttle function. This
is used for tracking per engine throttle stats introduced in the v6
update. This fixes a random memory corruption bug caused by using an
invalid engine pointer.
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
One of the major purposes of the GPU scheduler is to avoid stalling
the CPU when the GPU is busy and unable to accept more work. This
change adds support to the ring submission code to allow a ring space
check to be performed before attempting to submit a batch buffer to
the hardware. If insufficient space is available then the scheduler
can go away and come back later, letting the CPU get on with other
work, rather than stalling and waiting for the hardware to catch up.
v3: Updated to use locally cached request pointer.
v4: Line wrapped some comments differently to keep the style checker
happy. Downgraded a BUG_ON to a WARN_ON as the latter is preferred.
Removed some obsolete, commented out code.
v6: Updated to newer nightly (lots of ring -> engine renaming).
Updated to use 'to_i915()' instead of dev_private. [review feedback
from Joonas Lahtinen]
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
|
|
There are various parameters within the scheduler which can be tuned
to improve performance, reduce memory footprint, etc. This change adds
support for altering these via debugfs.
v2: Updated for priorities now being signed values.
v5: Squashed priority bumping entries into this patch rather than a
separate patch all of their own.
v6: Updated to newer nightly (lots of ring -> engine renaming).
Updated to use 'to_i915()' instead of dev_private. [review feedback
from Joonas Lahtinen]
Added an admin only check when setting the parameters to prevent rogue
user code trying to break the system with strange settings. [review
feedback from Jesse Barnes]
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
The scheduler decouples the submission of batch buffers to the driver
from their subsequent submission to the hardware. This means that an
application which is continuously submitting buffers as fast as it can
could potentialy flood the driver. To prevent this, the driver now
tracks how many buffers are in progress (queued in software or
executing in hardware) and limits this to a given (tunable) number. If
this number is exceeded then the queue to the driver will return
EAGAIN and thus prevent the scheduler's queue becoming arbitrarily
large.
v3: Added a missing decrement of the file queue counter.
v4: Updated a comment.
v5: Updated due to changes to earlier patches in series - removing
forward declarations and white space. Also added some documentation.
[Joonas Lahtinen]
v6: Updated to newer nightly (lots of ring -> engine renaming).
Replace the simple 'return to userland when full' scheme with a 'sleep
on request' scheme. The former could lead to the busy polling and
wasting lots of CPU time as user land continuously retried the execbuf
IOCTL in a tight loop. Now the driver will sleep (without holding the
mutex lock) on the oldest request outstanding for that file and then
automatically retry. This is closer to the pre-scheduler behaviour of
stalling on a full ring buffer.
v6.1: Moved throttle point to later common location. Required for a
subsequent patch that needs the engine to have been determined already.
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
|
|
Added trace points to the scheduler to track all the various events,
node state transitions and other interesting things that occur.
v2: Updated for new request completion tracking implementation.
v3: Updated for changes to node kill code.
v4: Wrapped some long lines to keep the style checker happy.
v6: Updated to newer nightly (lots of ring -> engine renaming).
Dropped 'min_seqno' value from 'i915_scheduler_remove' tracepoint as
it has also been removed from the code.
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
The seqno value is now only used for the final test for completion of
a request. It is no longer used to track the request through the
software stack. Thus it is no longer necessary to allocate the seqno
immediately with the request. Instead, it can be done lazily and left
until the request is actually sent to the hardware. This is particular
advantageous with a GPU scheduler as the requests can then be
re-ordered between their creation and their hardware submission
without having out of order seqnos.
v2: i915_add_request() can't fail!
Combine with 'drm/i915: Assign seqno at start of exec_final()'
Various bits of code during the execbuf code path need a seqno value
to be assigned to the request. This change makes this assignment
explicit at the start of submission_final() rather than relying on an
auto-generated seqno to have happened already. This is in preparation
for a future patch which changes seqno values to be assigned lazily
(during add_request).
v3: Updated to use locally cached request pointer.
v4: Changed some white space and comment formatting to keep the style
checker happy.
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
|
|
When the seqno wraps around zero, the entire GPU is forced to be idle
for some reason (possibly only to work around issues with hardware
semaphores but no-one seems too sure!). This causes a problem if the
force idle occurs at an inopportune moment such as in the middle of
submitting a batch buffer. Specifically, it would lead to recursive
submits - submitting work requires a new seqno, the new seqno requires
idling the ring, idling the ring requires submitting work, submitting
work requires a new seqno...
This change adds a 'flush' parameter to the idle function call which
specifies whether the scheduler queues should be flushed out. I.e. is
the call intended to just idle the ring as it is right now (no flush)
or is it intended to force all outstanding work out of the system
(with flush).
In the seqno wrap case, pending work is not an issue because the next
operation will be to submit it. However, in other cases, the intention
is to make sure everything that could be done has been done.
v6: Updated to newer nightly (lots of ring -> engine renaming).
Added kerneldoc for intel_engine_idle().
Wrapped boolean 'flush' parameter with an _flush() macro.
[review feedback from Joonas Lahtinen]
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
It can be useful to be able to disable the GPU scheduler via a module
parameter for debugging purposes.
v5: Converted from a multi-feature 'overrides' mask to a single
'enable' boolean. Further features (e.g. pre-emption) will now be
separate 'enable' booleans added later. [Chris Wilson]
v6: Moved scheduler parameter declaration to correct place in
i915_params struct. [review feedback from Matt Roper]
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
CC: Matt Roper <matthew.d.roper@intel.com>
|
|
When the watchdog resets the GPU, all interrupts get disabled despite
the reference count remaining. As the scheduler probably had
interrupts enabled during the reset (it would have been waiting for
the bad batch to complete), it must be poked to tell it that the
interrupt has been disabled.
v5: New patch in series.
v6: Updated to newer nightly (lots of ring -> engine renaming).
Updated to use 'to_i915()' instead of dev_private. Converted all enum
labels to uppercase. [review feedback from Joonas Lahtinen]
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
|
|
When requesting that all GPU work is completed, it is now necessary to
get the scheduler involved in order to flush out work that queued and
not yet submitted.
v2: Updated to add support for flushing the scheduler queue by time
stamp rather than just doing a blanket flush.
v3: Moved submit_max_priority() to this patch from an earlier patch
is it is no longer required in the other.
v4: Corrected the format of a comment to keep the style checker happy.
Downgraded a BUG_ON to a WARN_ON as the latter is preferred.
v5: Shuffled functions around to remove forward prototypes, removed
similarly offensive white space and added documentation. Re-worked the
mutex locking around the submit function. [Joonas Lahtinen]
Used lighter weight spinlocks.
v6: Updated to newer nightly (lots of ring -> engine renaming).
Added 'for_each_scheduler_node()' helper macro. Updated to use
'to_i915()' instead of dev_private. Converted all enum labels to
uppercase. [review feedback from Joonas Lahtinen]
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
GPU page faults can now require scheduler operation in order to
complete. For example, in order to free up sufficient memory to handle
the fault the handler must wait for a batch buffer to complete that
has not even been sent to the hardware yet. Thus EAGAIN no longer
means a GPU hang, it can occur under normal operation.
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
The scheduler can cause batch buffers, and hence requests, to be
submitted to the ring out of order and asynchronously to their
submission to the driver. Thus at the point of waiting for the
completion of a given request, it is not even guaranteed that the
request has actually been sent to the hardware yet. Even it is has
been sent, it is possible that it could be pre-empted and thus
'unsent'.
This means that it is necessary to be able to submit requests to the
hardware during the wait call itself. Unfortunately, while some
callers of __wait_request() release the mutex lock first, others do
not (and apparently can not). Hence there is the ability to deadlock
as the wait stalls for submission but the asynchronous submission is
stalled for the mutex lock.
This change hooks the scheduler in to the __wait_request() code to
ensure correct behaviour. That is, flush the target batch buffer
through to the hardware and do not deadlock waiting for something that
cannot currently be submitted. Instead, the wait call must return
EAGAIN at least as far back as necessary to release the mutex lock and
allow the scheduler's asynchronous processing to get in and handle the
pre-emption operation and eventually (re-)submit the work.
v3: Removed the explicit scheduler flush from i915_wait_request().
This is no longer necessary and was causing unintended changes to the
scheduler priority level which broke a validation team test.
v4: Corrected the format of some comments to keep the style checker
happy.
v5: Added function description. [Joonas Lahtinen]
v6: Updated to newer nightly (lots of ring -> engine renaming).
Updated to use 'to_i915()' instead of dev_private. Changed extra
boolean wait_request() parameter to a flags word and consumed the
original boolean parameter too. Also, replaced the
i915_scheduler_is_request_tracked() function with
i915_scheduler_is_mutex_required() as the need for the former has gone
away and it was really being used to ask the latter question in a
convoluted manner. [review feedback from Joonas Lahtinen]
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
The scheduler keeps its own lock on various DRM objects in order to
guarantee safe access long after the original execbuff IOCTL has
completed. This is especially important when pre-emption is enabled as
the batch buffer might need to be submitted to the hardware multiple
times. This patch hooks the clean up of these locks into the request
retire function. The request can only be retired after it has
completed on the hardware and thus is no longer eligible for
re-submission. Thus there is no point holding on to the locks beyond
that time.
v3: Updated to not WARN when cleaning a node that is being cancelled.
The clean will happen later so skipping it at the point of
cancellation is fine.
v5: Squashed the i915_scheduler.c portions down into the 'start of
scheduler' patch. [Joonas Lahtinen]
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
The scheduler needs to track interdependencies between batch buffers.
These are calculated by analysing the object lists of the buffers and
looking for commonality. The scheduler also needs to keep those
buffers locked long after the initial IOCTL call has returned to user
land.
v3: Updated to support read-read optimisation.
v5: Updated due to changes to earlier patches in series for splitting
bypass mode into a separate function and consoliding the clean up code.
v6: Updated to newer nightly (lots of ring -> engine renaming).
Renamed 'saved_objects' to just 'objs'. [review feedback from Joonas
Lahtinen]
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
Ring space is reserved when constructing a request to ensure that the
subsequent 'add_request()' call cannot fail due to waiting for space
on a busy or broken GPU. However, the scheduler jumps in to the middle
of the execbuffer process between request creation and request
submission. Thus it needs to cancel the reserved space when the
request is simply added to the scheduler's queue and not yet
submitted. Similarly, it needs to re-reserve the space when it finally
does want to send the batch buffer to the hardware.
v3: Updated to use locally cached request pointer.
v5: Updated due to changes to earlier patches in series - for runtime
PM calls and splitting bypass mode into a separate function.
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
Updated the execbuffer() code to pass the packaged up batch buffer
information to the scheduler rather than calling execbuffer_final()
directly. The scheduler queue() code is currently a stub which simply
chains on to _final() immediately.
v6: Updated to newer nightly (lots of ring -> engine renaming).
Renamed 'i915_gem_execbuff_release_batch_obj' to
'i915_gem_execbuf_release_batch_obj'. [review feedback from Joonas
Lahtinen]
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
The scheduler needs to do interrupt triggered work that is too complex
to do in the interrupt handler. Thus it requires a deferred work
handler to process such tasks asynchronously.
v2: Updated to reduce mutex lock usage. The lock is now only held for
the minimum time within the remove function rather than for the whole
of the worker thread's operation.
v5: Removed objectionable white space and added some documentation.
[Joonas Lahtinen]
v6: Updated to newer nightly (lots of ring -> engine renaming).
Added an i915_scheduler_destroy() function instead of doing explicit
clean up of scheduler internals from i915_driver_unload().
[review feedback from Joonas Lahtinen]
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
The scheduler needs to know when requests have completed so that it
can keep its own internal state up to date and can submit new requests
to the hardware from its queue.
v2: Updated due to changes in request handling. The operation is now
reversed from before. Rather than the scheduler being in control of
completion events, it is now the request code itself. The scheduler
merely receives a notification event. It can then optionally request
it's worker thread be woken up after all completion processing is
complete.
v4: Downgraded a BUG_ON to a WARN_ON as the latter is preferred.
v5: Squashed the i915_scheduler.c portions down into the 'start of
scheduler' patch. [Joonas Lahtinen]
v6: Updated to newer nightly (lots of ring -> engine renaming).
Changed an '|=' to an 'if() ='. [review feedback from Joonas Lahtinen]
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
The scheduler decouples the submission of batch buffers to the driver
with submission of batch buffers to the hardware. Thus it is possible
for an application to close its DRM file handle while there is still
work outstanding. That means the scheduler needs to know about file
close events so it can remove the file pointer from such orphaned
batch buffers and not attempt to dereference it later.
v3: Updated to not wait for outstanding work to complete but merely
remove the file handle reference. The wait was getting excessively
complicated with inter-ring dependencies, pre-emption, and other such
issues.
v4: Changed some white space to keep the style checker happy.
v5: Added function documentation and removed apparently objectionable
white space. [Joonas Lahtinen]
Used lighter weight spinlocks.
v6: Updated to newer nightly (lots of ring -> engine renaming).
Added 'for_each_scheduler_node()' helper macro. Updated to use
'to_i915()' instead of dev_private. Removed the return value from
i915_scheduler_closefile() as it is not used for anything. [review
feedback from Joonas Lahtinen]
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
MMIO flips are the preferred mechanism now but more importantly, pipe
based flips cause issues for the scheduler. Specifically, submitting
work to the rings around the side of the scheduler could cause that
work to be lost if the scheduler generates a pre-emption event on that
ring.
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
|
|
Hardware sempahores require seqno values to be continuously
incrementing. However, the scheduler's reordering of batch buffers
means that the seqno values going through the hardware could be out of
order. Thus semaphores can not be used.
On the other hand, the scheduler superceeds the need for hardware
semaphores anyway. Having one ring stall waiting for something to
complete on another ring is inefficient if that ring could be working
on some other, independent task. This is what the scheduler is meant
to do - keep the hardware as busy as possible by reordering batch
buffers to avoid dependency stalls.
v4: Downgraded a BUG_ON to WARN_ON as the latter is preferred.
v5: Squashed the i915_scheduler.c portions down into the 'start of
scheduler' patch. [Joonas Lahtinen]
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
Initial creation of scheduler source files. Note that this patch
implements most of the scheduler functionality but does not hook it in
to the driver yet. It also leaves the scheduler code in 'pass through'
mode so that even when it is hooked in, it will not actually do very
much. This allows the hooks to be added one at a time in bite size
chunks and only when the scheduler is finally enabled at the end does
anything start happening.
The general theory of operation is that when batch buffers are
submitted to the driver, the execbuffer() code packages up all the
information required to execute the batch buffer at a later time. This
package is given over to the scheduler which adds it to an internal
node list. The scheduler also scans the list of objects associated
with the batch buffer and compares them against the objects already in
use by other buffers in the node list. If matches are found then the
new batch buffer node is marked as being dependent upon the matching
node. The same is done for the context object. The scheduler also
bumps up the priority of such matching nodes on the grounds that the
more dependencies a given batch buffer has the more important it is
likely to be.
The scheduler aims to have a given (tuneable) number of batch buffers
in flight on the hardware at any given time. If fewer than this are
currently executing when a new node is queued, then the node is passed
straight through to the submit function. Otherwise it is simply added
to the queue and the driver returns back to user land.
The scheduler is notified when each batch buffer completes and updates
its internal tracking accordingly. At the end of the completion
interrupt processing, if any scheduler tracked batches were processed,
the scheduler's deferred worker thread is woken up. This can do more
involved processing such as actually removing completed nodes from the
queue and freeing up the resources associated with them (internal
memory allocations, DRM object references, context reference, etc.).
The work handler also checks the in flight count and calls the
submission code if a new slot has appeared.
When the scheduler's submit code is called, it scans the queued node
list for the highest priority node that has no unmet dependencies.
Note that the dependency calculation is complex as it must take
inter-ring dependencies and potential preemptions into account. Note
also that in the future this will be extended to include external
dependencies such as the Android Native Sync file descriptors and/or
the linux dma-buff synchronisation scheme.
If a suitable node is found then it is sent to execbuff_final() for
submission to the hardware. The in flight count is then re-checked and
a new node popped from the list if appropriate. All nodes that are not
submitted have their priority bumped. This ensures that low priority
tasks do not get starved out by busy higher priority ones - everything
will eventually get its turn to run.
Note that this patch does not implement pre-emptive scheduling. Only
basic scheduling by re-ordering batch buffer submission is currently
implemented. Pre-emption of actively executing batch buffers comes in
the next patch series.
v2: Changed priority levels to +/-1023 due to feedback from Chris
Wilson.
Removed redundant index from scheduler node.
Changed time stamps to use jiffies instead of raw monotonic. This
provides lower resolution but improved compatibility with other i915
code.
Major re-write of completion tracking code due to struct fence
conversion. The scheduler no longer has it's own private IRQ handler
but just lets the existing request code handle completion events.
Instead, the scheduler now hooks into the request notify code to be
told when a request has completed.
Reduced driver mutex locking scope. Removal of scheduler nodes no
longer grabs the mutex lock.
v3: Refactor of dependency generation to make the code more readable.
Also added in read-read optimisation support - i.e., don't treat a
shared read-only buffer as being a dependency.
Allowed the killing of queued nodes rather than only flying ones.
v4: Updated the commit message to better reflect the current state of
the code. Downgraded some BUG_ONs to WARN_ONs. Used the correct array
memory allocator function (kmalloc_array instead of kmalloc).
Corrected the format of some comments. Wrapped some lines differently
to keep the style checker happy.
Fixed a WARN_ON when killing nodes. The dependency removal code checks
that nodes being destroyed do not have any oustanding dependencies
(which would imply they should not have been executed yet). In the
case of nodes being destroyed, e.g. due to context banning, then this
might well be the case - they have not been executed and do indeed
have outstanding dependencies.
Re-instated the code to disble interrupts when not in use. The
underlying problem causing broken IRQ reference counts seems to have
been fixed now.
v5: Shuffled various functions around to remove forward declarations
as apparently these are frowned upon. Removed lots of white space as
apparently having easy to read code is also frowned upon. Split the
direct submission scheduler bypass code out into a separate function.
Squashed down the i915_scheduler.c sections of various patches into
this patch. Thus the later patches simply hook in existing code into
various parts of the driver rather than adding the code as well. Added
documentation to various functions. Re-worked the submit function in
terms of mutex locking, error handling and exit paths. Split the
delayed work handler function in half. Made use of the kernel 'clamp'
macro. [Joonas Lahtinen]
Added runtime PM calls as these must be done at the top level before
acquiring the driver mutex lock. [Chris Wilson]
Removed some obsolete debug code that had been forgotten about.
Moved more clean up code into the 'i915_gem_scheduler_clean_node()'
function rather than replicating it in mutliple places.
Used lighter weight spinlocks.
v6: Updated to newer nightly (lots of ring -> engine renaming).
Added 'for_each_scheduler_node()' and 'assert_scheduler_lock_held()'
helper macros. Renamed 'i915_gem_execbuff_release_batch_obj' to
'i915_gem_execbuf_release_batch_obj'. Updated to use 'to_i915()'
instead of dev_private. Converted all enum labels to uppercase.
Removed various unnecessary WARNs. Renamed 'saved_objects' to just
'objs'. Split code for counting incomplete nodes out into a separate
function. Removed even more white space. Added a destroy() function.
[review feedback from Joonas Lahtinen]
Added running totals of 'flying' and 'queued' nodes rather than
re-calculating each time as a minor CPU performance optimisation.
Removed support for out of order seqno completion. All the prep work
patch series (seqno to request conversion, late seqno assignment,
etc.) that has now been done means that the scheduler no longer
generates out of order seqno completions. Thus all the complex code
for coping with such is no longer required and can be removed.
Fixed a bug in scheduler bypass mode introduced in the clean up code
refactoring of v5. The clean up function was seeing the node in the
wrong state and thus refusing to process it.
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
The seqno value cannot always be used when debugging issues via trace
points. This is because it can be reset back to start, especially
during TDR type tests. Also, when the scheduler arrives the seqno is
only valid while a given request is executing on the hardware. While
the request is simply queued waiting for submission, it's seqno value
will be zero (meaning invalid).
v4: Wrapped a long line to keep the style checker happy.
v5: Added uniq to the dispatch trace point [Svetlana Kukanova]
For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
|
|
Keep a local copy of the request pointer in the _final() functions
rather than dereferencing the params block repeatedly.
v3: New patch in series.
v6: Updated to newer nightly (lots of ring -> engine renaming).
For: VIZ-1587
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
Split the execbuffer() function in half. The first half collects and
validates all the information required to process the batch buffer. It
also does all the object pinning, relocations, active list management,
etc - basically anything that must be done upfront before the IOCTL
returns and allows the user land side to start changing/freeing
things. The second half does the actual ring submission.
This change implements the split but leaves the back half being called
directly from the end of the front half.
v2: Updated due to changes in underlying tree - addition of sync fence
support and removal of cliprects.
v3: Moved local 'ringbuf' variable to make later patches in the
series a bit neater.
v4: Corrected a typo in the commit message and downgraded a BUG_ON to
a WARN_ON as the latter is preferred. Also removed all the external
sync/fence support as that will now be a separate patch series.
v5: Updated for runtime PM changes.
v6: Updated to newer nightly (lots of ring -> engine renaming).
Renamed 'i915_gem_execbuff_release_batch_obj' to
'i915_gem_execbuf_release_batch_obj' and updated to use 'to_i915()'
instead of dev_private. [review feedback from Joonas Lahtinen]
Added an explicit remove_from_client() call to the failure path to fix
a race condition with invalid requests the client list.
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
The scheduler decouples the submission of batch buffers to the driver
with their submission to the hardware. This basically means splitting
the execbuffer() function in half. This change rearranges some code
ready for the split to occur.
v5: Dropped runtime PM calls as they conflict with the mutex lock.
Instead of being done at the lowest submission level, they are now
left right at the top driver entry level. [feedback from Chris Wilson]
v6: Updated to newer nightly (lots of ring -> engine renaming).
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
When there are lots and lots and even more lots of contexts (e.g. when
running with execlists) it is useful to be able to immediately see
what the total context count is.
v4: Re-typed a variable (review feedback from Joonas)
For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
For: VIZ-0000
Signed-off-by: Do Not Submit <DoNotSubmit@Nowhere.com>
|
|
The notify function can be called many times without the seqno
changing. A large number of duplicates are to prevent races due to the
requirement of not enabling interrupts until requested. However, when
interrupts are enabled the IRQ handle can be called multiple times
without the ring's seqno value changing. This patch reduces the
overhead of these extra calls by caching the last processed seqno
value and early exiting if it has not changed.
v3: New patch for series.
v5: Added comment about last_irq_seqno usage due to code review
feedback (Tvrtko Ursulin).
v6: Minor update to resolve a race condition with the wait_request
optimisation.
v7: Updated to newer nightly - lots of ring -> engine renaming plus an
interface change to get_seqno().
For: VIZ-5190
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
|
|
Added the '_complete' trace event which occurs when a fence/request is
signaled as complete. Also moved the notify event from the IRQ handler
code to inside the notify function itself.
v3: Added the current ring seqno to the notify trace point.
v5: Line wrapping to keep the style checker happy.
v7: Updated to newer nightly (lots of ring -> engine renaming).
For: VIZ-5190
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
|
|
The intended usage model for struct fence is that the signalled status
should be set on demand rather than polled. That is, there should not
be a need for a 'signaled' function to be called everytime the status
is queried. Instead, 'something' should be done to enable a signal
callback from the hardware which will update the state directly. In
the case of requests, this is the seqno update interrupt. The idea is
that this callback will only be enabled on demand when something
actually tries to wait on the fence.
This change removes the polling test and replaces it with the callback
scheme. Each fence is added to a 'please poke me' list at the start of
i915_add_request(). The interrupt handler then scans through the 'poke
me' list when a new seqno pops out and signals any matching
fence/request. The fence is then removed from the list so the entire
request stack does not need to be scanned every time. Note that the
fence is added to the list before the commands to generate the seqno
interrupt are added to the ring. Thus the sequence is guaranteed to be
race free if the interrupt is already enabled.
Note that the interrupt is only enabled on demand (i.e. when
__wait_request() is called). Thus there is still a potential race when
enabling the interrupt as the request may already have completed.
However, this is simply solved by calling the interrupt processing
code immediately after enabling the interrupt and thereby checking for
already completed requests.
Lastly, the ring clean up code has the possibility to cancel
outstanding requests (e.g. because TDR has reset the ring). These
requests will never get signalled and so must be removed from the
signal list manually. This is done by setting a 'cancelled' flag and
then calling the regular notify/retire code path rather than
attempting to duplicate the list manipulatation and clean up code in
multiple places. This also avoid any race condition where the
cancellation request might occur after/during the completion interrupt
actually arriving.
v2: Updated to take advantage of the request unreference no longer
requiring the mutex lock.
v3: Move the signal list processing around to prevent unsubmitted
requests being added to the list. This was occurring on Android
because the native sync implementation calls the
fence->enable_signalling API immediately on fence creation.
Updated after review comments by Tvrtko Ursulin. Renamed list nodes to
'link' instead of 'list'. Added support for returning an error code on
a cancelled fence. Update list processing to be more efficient/safer
with respect to spinlocks.
v5: Made i915_gem_request_submit a static as it is only ever called
from one place.
Fixed up the low latency wait optimisation. The time delay between the
seqno value being to memory and the drive's ISR running can be
significant, at least for the wait request micro-benchmark. This can
be greatly improved by explicitly checking for seqno updates in the
pre-wait busy poll loop. Also added some documentation comments to the
busy poll code.
Fixed up support for the faking of lost interrupts
(test_irq_rings/missed_irq_rings). That is, there is an IGT test that
tells the driver to loose interrupts deliberately and then check that
everything still works as expected (albeit much slower).
Updates from review comments: use non IRQ-save spinlocking, early exit
on WARN and improved comments (Tvrtko Ursulin).
v6: Updated to newer nigthly and resolved conflicts around the
wait_request busy spin optimisation. Also fixed a race condition
between this early exit path and the regular completion path.
v7: Updated to newer nightly - lots of ring -> engine renaming plus an
interface change on get_seqno(). Also added a list_empty() check
before acquring spinlocks and doing list processing.
For: VIZ-5190
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
The request structure is reference counted. When the count reached
zero, the request was immediately freed and all associated objects
were unrefereced/unallocated. This meant that the driver mutex lock
must be held at the point where the count reaches zero. This was fine
while all references were held internally to the driver. However, the
plan is to allow the underlying fence object (and hence the request
itself) to be returned to other drivers and to userland. External
users cannot be expected to acquire a driver private mutex lock.
Rather than attempt to disentangle the request structure from the
driver mutex lock, the decsion was to defer the free code until a
later (safer) point. Hence this patch changes the unreference callback
to merely move the request onto a delayed free list. The driver's
retire worker thread will then process the list and actually call the
free function on the requests.
v2: New patch in series.
v3: Updated after review comments by Tvrtko Ursulin. Rename list nodes
to 'link' rather than 'list'. Update list processing to be more
efficient/safer with respect to spinlocks.
v4: Changed to use basic spinlocks rather than IRQ ones - missed
update from earlier feedback by Tvrtko.
v5: Improved a comment to keep the style checker happy.
v7: Updated to newer nightly (lots of ring -> engine renaming). Also
added a list_empty() check before wasting time with spinlocks and list
processing.
For: VIZ-5190
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
If an execbuff IOCTL call fails for some reason, it would leave the
request in the client list. The request clean up code would remove
this but only later on and only after the reference count has dropped
to zero. The entire sequence is contained within the driver mutex
lock. However, there is still a hole such that any code which does not
require the mutex lock could still find the request on the client list
and start using it. That would lead to broken reference counts, use of
dangling pointers and all sorts of other nastiness.
The throttle IOCTL in particular does not acquire the mutex and does
process the client list. And the likely situation of the execbuff
IOCTL failing is when the system is busy with lots of work
outstanding. That is exactly the situation where the throttle IOCTL
would try to wait on a request.
Currently, this hole is tiny - the gap between the reference count
dropping to zero and the free function being called in response.
However the next patch in this series enlarges that gap considerably
by deferring the free function (to remove the need for the mutex lock
when unreferencing requests).
v7: New patch in series.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
|
|
The fence object used inside the request structure requires a sequence
number. Although this is not used by the i915 driver itself, it could
potentially be used by non-i915 code if the fence is passed outside of
the driver. This is the intention as it allows external kernel drivers
and user applications to wait on batch buffer completion
asynchronously via the dma-buff fence API.
To ensure that such external users are not confused by strange things
happening with the seqno, this patch adds in a per context timeline
that can provide a guaranteed in-order seqno value for the fence. This
is safe because the scheduler will not re-order batch buffers within a
context - they are considered to be mutually dependent.
v2: New patch in series.
v3: Renamed/retyped timeline structure fields after review comments by
Tvrtko Ursulin.
Added context information to the timeline's name string for better
identification in debugfs output.
v5: Line wrapping and other white space fixes to keep style checker
happy.
v7: Updated to newer nightly (lots of ring -> engine renaming).
For: VIZ-5190
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
The change to the implementation of i915_gem_request_completed() means
that the lazy coherency flag is no longer used. This can now be
removed to simplify the interface.
v6: Updated to newer nightly and resolved conflicts.
v7: Updated to newer nightly (lots of ring -> engine renaming).
For: VIZ-5190
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
|
|
There is a construct in the linux kernel called 'struct fence' that is
intended to keep track of work that is executed on hardware. I.e. it
solves the basic problem that the drivers 'struct
drm_i915_gem_request' is trying to address. The request structure does
quite a lot more than simply track the execution progress so is very
definitely still required. However, the basic completion status side
could be updated to use the ready made fence implementation and gain
all the advantages that provides.
This patch makes the first step of integrating a struct fence into the
request. It replaces the explicit reference count with that of the
fence. It also replaces the 'is completed' test with the fence's
equivalent. Currently, that simply chains on to the original request
implementation. A future patch will improve this.
v3: Updated after review comments by Tvrtko Ursulin. Added fence
context/seqno pair to the debugfs request info. Renamed fence 'driver
name' to just 'i915'. Removed BUG_ONs.
v5: Changed seqno format in debugfs to %x rather than %u as that is
apparently the preferred appearance. Line wrapped some long lines to
keep the style checker happy.
v6: Updated to newer nigthly and resolved conflicts. The biggest issue
was with the re-worked busy spin precursor to waiting on a request. In
particular, the addition of a 'request_started' helper function. This
has no corresponding concept within the fence framework. However, it
is only ever used in one place and the whole point of that place is to
always directly read the seqno for absolutely lowest latency possible.
So the simple solution is to just make the seqno test explicit at that
point now rather than later in the series (it was previously being
done anyway when fences become interrupt driven).
v7: Rebased to newer nightly - lots of ring -> engine renaming and
interface change to get_seqno().
For: VIZ-5190
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
For: VIZ-0000
Signed-off-by: Do Not Submit <DoNotSubmit@Nowhere.com>
|
|
|
|
Conflicts:
drivers/cpufreq/intel_pstate.c
|
|
|
|
|
|
|
|
|