Adds support for per-client engine busyness stats i915 exports in sysfs
and produces output like the below:
==========================================================================
intel-gpu-top - 935/ 935 MHz; 0% RC6; 14.73 Watts; 1097 irqs/s
IMC reads: 1401 MiB/s
IMC writes: 4 MiB/s
ENGINE BUSY MI_SEMA MI_WAIT
Render/3D/0 63.73% |███████████████████ | 3% 0%
Blitter/0 9.53% |██▊ | 6% 0%
Video/0 39.32% |███████████▊ | 16% 0%
Video/1 15.62% |████▋ | 0% 0%
VideoEnhance/0 0.00% | | 0% 0%
PID NAME RCS BCS VCS VECS
4084 gem_wsim |█████▌ ||█ || || |
4086 gem_wsim |█▌ || ||███ || |
==========================================================================
Apart from the existing physical engine utilization it now also shows
utilization per client and per engine class.
v2:
* Version to match removal of global enable_stats toggle.
* Plus various fixes.
v3:
* Support brief backward jumps in client stats.
v4:
* Support device selection.
v5:
* Rebase for class aggregation.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
TIOCGWINSZ returns zero columns and rows on serial so let's assume 80x24.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
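
A minimal sketch of the fallback described above, written in Python for brevity rather than the C of the tool itself. The helper names are hypothetical, and two details are assumptions not stated in the commit: the fallback triggers when either dimension is zero, and a failing ioctl is treated the same way.

```python
import fcntl
import struct
import termios

FALLBACK_ROWS, FALLBACK_COLS = 24, 80  # the 80x24 assumed for serial consoles


def effective_size(rows, cols):
    """Return (rows, cols), substituting 80x24 when the terminal reports zero."""
    if rows <= 0 or cols <= 0:
        return FALLBACK_ROWS, FALLBACK_COLS
    return rows, cols


def terminal_size(fd=1):
    """Query TIOCGWINSZ on fd and apply the fallback."""
    try:
        # struct winsize holds four unsigned shorts: rows, cols, xpixel, ypixel.
        packed = fcntl.ioctl(fd, termios.TIOCGWINSZ,
                             struct.pack("HHHH", 0, 0, 0, 0))
        rows, cols, _, _ = struct.unpack("HHHH", packed)
    except OSError:
        # Assumption: treat "not a terminal" like a terminal reporting zeros.
        rows, cols = 0, 0
    return effective_size(rows, cols)
```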
|
|
Similarly to how top(1) handles SMP, we can default to showing engines of
the same class as a single bar graph entry.
To achieve this a little bit of hackery is employed. PMU sampling is left
as is and only at the presentation layer we create a fake set of engines,
one for each class, summing and normalizing the load respectively.
v2:
* Fix building the aggregated engines.
* Tidy static variable handling.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
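
The presentation-layer aggregation can be sketched roughly as follows, in Python for brevity. The function name and the input shape are hypothetical, and "normalizing" is read here as dividing the summed load by the number of engines in the class, which is an assumption about the tool's behaviour.

```python
from collections import defaultdict


def aggregate_by_class(engines):
    """Collapse physical engines into one entry per class.

    engines: iterable of (class_name, busy_percent) pairs.
    Returns {class_name: busy_percent}, the per-class sum divided by the
    number of engines in that class.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for class_name, busy in engines:
        totals[class_name] += busy
        counts[class_name] += 1
    return {name: totals[name] / counts[name] for name in totals}
```

Sampling is untouched in this sketch: only the values handed to the display code are combined, matching the description that PMU sampling is left as is.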
|
|
Analogous to top(1) we can enable the user to exit from the tool by
pressing 'q' on the console.
v2:
* Fix sleep period with closed stdin. (Chris)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
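
One plausible reading of the behaviour, including the v2 note, sketched in Python. The function name is hypothetical and the EOF handling is an assumption: once stdin is closed, select() reports it readable immediately, so the sketch sleeps out the rest of the period instead of spinning.

```python
import os
import select
import time


def wait_for_quit(fd, period):
    """Wait up to `period` seconds on fd; return True if 'q' was read."""
    deadline = time.monotonic() + period
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        try:
            ready, _, _ = select.select([fd], [], [], remaining)
        except (OSError, ValueError):
            ready = None  # fd is closed or invalid
        if ready is None:
            time.sleep(max(0.0, deadline - time.monotonic()))
            return False
        if not ready:
            return False  # period elapsed without input
        data = os.read(fd, 1)
        if data == b"":
            # EOF: keep the refresh period by sleeping the remainder.
            time.sleep(max(0.0, deadline - time.monotonic()))
            return False
        if data == b"q":
            return True
        # Any other key is ignored; keep waiting for the rest of the period.
```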
|
|
Ask for CLOCK_MONOTONIC which is more stable than the default perf clock.
(Ability to select a clock has been available since kernel version 4.1.)
The change should not have any significant impact on IGT as a whole
apart from maybe improving the occasional jitter in tests/tools which use
nanosleep(2) and use time slept together with perf reported time delta
either in direct or indirect calculations.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
igt_free_spins() took the lock to iterate the list, igt_spin_free() took
the lock to remove the list element. We only want one.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2823
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Measure the sample gt-awake time while each engine and every engine is
busy. They should all report the same duration, the elapsed runtime of
the batch.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
|
|
In order to prevent issues with 32b stateless address, the last page
under 4G is excluded for non-48b objects.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: CQ Tang <cq.tang@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
We cannot assume we know how many PMUs there are exactly, so pick -1ULL
to represent all invalid metrics. Similarly, we have to rely on explicit
testing for each PMU to prove their existence and correct functioning.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
If we let an object idle in a shared GTT, it may be evicted by the
kernel in favour of another client. Thus, we have to be very careful
when asserting that two different executions of the same object will
be at the same address. If there's an idle point between the two
asserts, it will only be guaranteed to hold for full-ppgtt.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2810
References: c20db20a7cd7 ("i915/api_intel_bb: Only assert objects are unmoved for full-ppgtt")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Acked-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
|
|
Made a silly mistake and didn't update bool success to true,
fixing that with this patch.
Signed-off-by: Kunal Joshi <kunal1.joshi@intel.com>
Reviewed-by: Karthik B S <karthik.b.s@intel.com>
|
|
The kms_atomic_transition subtests run on all the pipes defined in
IGT, i.e. IGT_MAX_PIPES, whether an output is available or not.
The results later have to be analysed and discarded as valid
skips. To save this time, update the test to first check the
available outputs and then execute the test.
v7: -Modified commit message. (Karthik)
-Replaced for_each_single_pipe_with_single__output with
for_each_connected_output to execute the test only on
connected display. (Karthik)
v8: -Modified the description subject line. (Petri)
-Modified subtests names to remove redundancy. (Petri)
-Added extra line before for loop. (Karthik)
v9: -Made separate subtests for non-blocking and
fencing parameters. (Petri)
Signed-off-by: Nidhi Gupta <nidhi1.gupta@intel.com>
Reviewed-by: Karthik B S <karthik.b.s@intel.com>
Reviewed-by: Petri Latvala <petri.latvala@intel.com>
|
|
With full-ppgtt, userspace has complete control over their GTT. Verify
that we can place an object at the very beginning and the very end of
our GTT.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
Without opting into 48b addressing, objects are strictly limited to
being placed only in the first (4G - 4K). This is to avoid an issue with
stateless 32b addressing being unable to access the last 32b page.
Assert that we do indeed fail to fit in a 4G object without setting the
EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag.
Reported-by: CQ Tang <cq.tang@intel.com>
References: 48ea1e32c39d ("drm/i915/gen9: Set PIN_ZONE_4G end to 4GB - 1 page")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: CQ Tang <cq.tang@intel.com>
Reviewed-by: CQ Tang <cq.tang@intel.com>
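
The arithmetic behind the assertion, as a small Python sketch. The constant and function names are hypothetical; the only fact taken from the commit is that the usable zone ends one 4K page short of 4G.

```python
SZ_4K = 4 << 10
SZ_4G = 4 << 30
ZONE_4G_END = SZ_4G - SZ_4K  # the last page below 4G is excluded


def fits_without_48b(offset, size):
    """True if [offset, offset + size) lies wholly inside the first (4G - 4K)."""
    return offset >= 0 and size > 0 and offset + size <= ZONE_4G_END
```

A full 4G object therefore cannot fit at any offset, which is the failure the test expects to see without the EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag.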
|
|
Commit 265f1d4a5a14 ("tests/device_reset: Work around for driver unbind
issue with audio") introduced an assertion on is_i915_device() by way of
intel_get_drm_devid(). Since this is a work-around for an i915-specific
issue, guard it with a check on the device type so the test runs on
other devices.
Signed-off-by: Jeremy Cline <jcline@redhat.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
RC6 should work before suspend, and continue to increment while idle
after suspend. Should.
v2: Include a longer sleep after suspend; it appears we are reticent to
idle so soon after waking up.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Refactor the allocation such that we utilise just enough memory pressure
to invoke the shrinker, and just enough processes to spread across the
CPUs and contend on the shrinker.
v2: Reduce over-allocation from mem_size/2 to mem_size/8, and 9
processes per cpu.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Check that timeslices for an oversaturated system (where there is more
work than can be supported by a single engine) are evenly distributed
between the clients.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
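
A rough sketch of what "evenly distributed" could mean in practice, in Python. The function name and the 10% tolerance are placeholders chosen for illustration; the actual test's statistics and thresholds are not given in this log.

```python
def is_fair(runtimes, tolerance=0.1):
    """True if every client's runtime is within `tolerance` of the mean.

    runtimes: per-client GPU time, in any consistent unit.
    tolerance: allowed relative deviation from the mean (hypothetical value).
    """
    if not runtimes:
        return True
    mean = sum(runtimes) / len(runtimes)
    if mean == 0:
        return True
    return all(abs(r - mean) <= tolerance * mean for r in runtimes)
```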
|
|
realloc() and friends return NULL if they fail; simplify the
new_escaped_json_string() by allocating all the necessary memory
up-front and checking for a failed allocation.
new_escaped_json_string() can already return NULL since
json_object_new_string_len() returns NULL for various undocumented error
paths, and NULL is valid input for json_object_object_add(), which this
new_escaped_json_string() is currently exclusively used with. Thus,
returning NULL when memory allocation fails should be safe.
Signed-off-by: Jeremy Cline <jcline@redhat.com>
Reviewed-by: Petri Latvala <petri.latvala@intel.com>
|
|
We're about to remove the filtering in i915 on 0 reason fields because
we assumed this was a possibility, but it turned out to be a corruption
in the tail pointer register that made us read cleared data.
This test is here to verify that our assumption holds that the HW never
produces such reports.
v2: Fix len checking (Umesh)
Count report lost events (Umesh)
Check report sanity (Umesh)
Limit test to 3 times the OA buffer (Umesh)
v3: Bump OA sampling frequency to 20us, seeing some lost buffer
failures on KBL (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> (v2)
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
If we know the right device fd, we can find the exact matching pci
device.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
|
|
Replace the open-coded value of 0x3 with the I915_CONTEXT_PARAM_GTT_SIZE
from i915_drm.h
Suggested-by: Bruce Chang <yu.bruce.chang@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Bruce Chang <yu.bruce.chang@intel.com>
|
|
Oversaturate the virtual engines on the system and check that each
workload receives a fair share of the available GPU time.
v2: Apply a modicum of statistical integrity.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
core_setmaster does special trickery to the dri device files before it
calls drm_open_device. Now that drm_load_module() exists for just
loading the modules without the other things that drm_open_device()
does, use it to ensure the files exist.
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Closes: https://gitlab.freedesktop.org/drm/igt-gpu-tools/-/issues/91
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
In order to find the correct aperture size for the test, we want to pass
the test's device into the query.
Reported-by: Bruce Chang <yu.bruce.chang@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
|
|
While this regularly breaks upstream, that is also a good reason to keep
testing! Let's see if upstream is in a working mood.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
|
|
We always expect to be able to create a new buffer, regardless of the
state of the GPU.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Matthew Auld <matthew.auld@intel.com>
|
|
Report we cannot run the timeline tests (SKIP) if the kernel doesn't
support the API.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
|
|
Matthew postulated that we should be able to hit a race in
__i915_vm_close() between the RCU object free and vma unbind viz
GEM_BUG_ON(!list_empty(&vm->bound_list));
due to the effect of leaving the vma on the list if we are unable to
obtain the kref to the object. Let's try and find that race.
In practice, this does not happen because to race the object free vma
cleanup against vm close requires a leak of a ppGTT vma.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Acked-by: Matthew Auld <matthew.auld@intel.com>
|
|
Submit a chain of spinners across all the engines, using the submit
fence to launch them in parallel.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
|
|
Concurrent access to a mmap is covered by gem_mmap_gtt/concurrent;
if we add tiled access to it, we make gem_threaded_access_tiled entirely
redundant.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
|
|
Use intel_bb / intel_buf to remove libdrm dependency.
Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Cc: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Acked-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Simplify the cross-check by asserting that the existence of an engine in
the list matches the existence of the engine as reported by GETPARAM.
By using the comparison, we check both directions at once.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
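
Why a single comparison covers both directions can be shown with a tiny Python sketch (the function name is hypothetical). Comparing the two booleans for equality fails both when an engine is listed but not reported by GETPARAM, and when it is reported but missing from the list.

```python
def engine_consistent(in_list, has_param):
    """True when list membership and GETPARAM agree about an engine."""
    return bool(in_list) == bool(has_param)


def engine_consistent_longhand(in_list, has_param):
    """The same check written as two separate implications."""
    listed_implies_param = (not in_list) or has_param
    param_implies_listed = (not has_param) or in_list
    return bool(listed_implies_param and param_implies_listed)
```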
|
|
Check that every engine listed can be used in execbuf.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
|
|
We may still be interested in results of a test even if it has tainted
the kernel. On the other hand, we need to kill the test on taint if no
other means of killing it on a jam is active.
If abort on both kernel taint or a timeout is requested, decrease all
potential timeouts significantly while the taint is detected instead of
aborting immediately. However, report the taint as the reason of the
abort if a timeout decreased by the taint expires.
v2: Fix missing show_kernel_task_state() lost on rebase conflict
resolution (Chris - thanks!)
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Petri Latvala <petri.latvala@intel.com>
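
The policy described above can be sketched as follows, in Python. The names and the divisor of 10 are hypothetical; the commit only says the timeouts are decreased "significantly" while the taint is detected.

```python
def effective_timeout(timeout, tainted, divisor=10):
    """Shorten the timeout while the kernel is tainted (divisor is made up)."""
    return timeout / divisor if tainted else timeout


def abort_reason(elapsed, timeout, tainted, divisor=10):
    """Return None while the test may keep running, else the abort reason."""
    if elapsed < effective_timeout(timeout, tainted, divisor):
        return None
    # A timeout that expired while shortened by the taint is blamed on the taint.
    return "kernel taint" if tainted else "timeout"
```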
|
|
Instead of going through all the deltas even after one has succeeded,
break as soon as we pass, thus saving some time and decreasing load on
chamelium for capturing the rest of the frames.
Signed-off-by: Kunal Joshi <kunal1.joshi@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
|
|
Do a check to see if we support a pollable spinner before forking to
avoid upsetting libigt.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Race the execution and interrupt handlers along a context, while
closing it at a random time.
v2: Some comments to handwave away the knowledge of internal
implementation details.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Use the elapsed time for the fastest timeslice to define the acceptable
error threshold. This gives us a lot more leeway for the slow devices.
Maybe too much leeway?
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
|
|
If we let an object idle in a shared GTT, it may be evicted by the
kernel in favour of another client. Thus, we have to be very careful
when asserting that two different executions of the same object will
be at the same address. If there's an idle point between the two
asserts, it will only be guaranteed to hold for full-ppgtt.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2754
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
|
|
On packed formats there's no need to show separate colors on separate fbs.
Add a path with packed color handling which checks colors on just
one fb.
This cannot be directly applied to planar yuv formats because of scaler
use with those formats.
On my ICL this drops test execution time from 44.91s to 21.98s.
v2 (vsyrjala): paint entire screen instead of lines. small refinement.
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Patnana Venkata Sai <venkata.sai.patnana@intel.com>
|
|
Use the spinners to provide exactly the right amount of background
busyness.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
|
|
These tests were removed under a blatantly false statement to hide a
regression. Denying that userspace can deadlock other clients does not
stop the violations. Introducing such deadlocks deliberately is a cause
for concern.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Default device list prefers vendor and device names. Add -n switch
to display vendor/device as hex strings.
Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
In a multi-device world we may want to see the generation of the device
whose counters we're tracking. Add the pretty name of the device to be
more verbose.
Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
If we want to use the pci device id of a device that is not opened, we
need to keep it in the card structure.
Export the igt_device_get_pretty_name() function to callers so they can
obtain the pretty name (for lsgpu, intel_gpu_top and others).
Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
While we want the capture to last long enough to delay the concurrent
client, we don't want to wait forever for the capture to complete to
proceed with the testing.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2559
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
|