Age | Commit message | Author | Files | Lines |
|
The nvc0 subsystem of the nouveau driver
previously contained sufficient format modifier
support to report the format modifiers supported
on a given chipset and the format modifier used
by a given image/texture. However, when importing
a dma-buf object as an EGLImage, the format
modifier specified was ignored unless it was
DRM_FORMAT_MOD_LINEAR. Since the nouveau kernel
driver lacked any support for format modifiers,
this effectively limited the usage to exporting
block linear buffers on a Tegra GPU and importing
them as framebuffers with the Tegra DRM kernel
driver.
This change fleshes out the existing support and
switches to the more descriptive
DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D() format
modifier macro to export support for specific
page kinds, including compressed surfaces. This
makes it possible to differentiate between surfaces
compatible with Tegra GPUs and those compatible
with desktop GPUs, improves performance when using
the compressed layouts, and distinguishes formats
supported by nvc0-class hardware from those of
other GPU families which use slightly different
variations of the block linear buffer layout.
Note this change does not add support for format
modifiers to the NV5x driver backend, nor does
it attempt to add modifier support for Turing-
class hardware. The code would be very similar,
but each would use a slightly different set of
format modifiers than those exposed here.
XXX Need to disable compressed modifiers (0xdb page kind) for 32-bit color buffers on Fermi?
XXX Test on Fermi hardware.
XXX Test on Tegra TK1-TX2 class hardware with Tegra DRM.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Older Tegra GPUs use a different sector bit
swizzling layout than desktop and Xavier GPUs.
Hence their format modifiers must be
differentiated from those of other GPUs. As
a precursor to supporting more expressive
block linear format modifiers, deduce the
sector layout used for a given GPU from its
chipset and stash the layout in the nouveau
screen structure.
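As a rough illustration of the deduction described above, the following
Python sketch maps a chipset ID to a sector layout. The function name, the
chipset IDs (assumed here to correspond to GK20A, GM20B and GP10B) and the
0/1 encoding are assumptions made for this example, not the driver's code.

```python
# Hypothetical set of older Tegra chipsets (assumed IDs for GK20A, GM20B, GP10B).
LEGACY_TEGRA_CHIPSETS = frozenset({0x0EA, 0x12B, 0x13B})

# Assumed encoding: 0 = older Tegra sector layout, 1 = desktop/Xavier layout.
SECTOR_LAYOUT_TEGRA = 0
SECTOR_LAYOUT_DESKTOP = 1


def sector_layout_for_chipset(chipset: int) -> int:
    """Return the sector layout assumed for the given chipset ID."""
    if chipset in LEGACY_TEGRA_CHIPSETS:
        return SECTOR_LAYOUT_TEGRA
    # Everything else, including Xavier, is treated as using the desktop layout.
    return SECTOR_LAYOUT_DESKTOP
```

The result would be computed once at screen creation and stashed, so later
modifier code does not need to look at the chipset again.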
XXX Test on at least one Tegra sector layout chip.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Pulls in the definition of the more expressive
DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D() format
modifier macro. This macro can be used to define
the existing DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK()-
based format modifiers, as well as format
modifiers that use additional fields to
differentiate Tegra and desktop GPU block linear
layouts, to select other "page kinds" that support
on-chip lossless framebuffer compression and CDE
compression, and to differentiate page kinds and
block layouts between the various families of
NVIDIA GPUs that support some version of the block
linear layouts.
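To make the field packing concrete, here is a Python transliteration of how
such a modifier value could be assembled. The bit positions and the vendor
code follow the upstream drm_fourcc.h header as best recalled and should be
treated as assumptions to verify against that header; the Python function
names are hypothetical.

```python
DRM_FORMAT_MOD_VENDOR_NVIDIA = 0x03  # assumed vendor code


def fourcc_mod_code(vendor: int, val: int) -> int:
    """Combine an 8-bit vendor ID with a 56-bit vendor-specific value."""
    return (vendor << 56) | (val & 0x00FFFFFFFFFFFFFF)


def nvidia_block_linear_2d(c: int, s: int, g: int, k: int, h: int) -> int:
    """Pack the fields (assumed layout):
    c: compression type, s: sector layout, g: kind generation,
    k: page kind, h: log2 of the block height in GOBs.
    """
    val = (0x10
           | (h & 0xF)
           | ((k & 0xFF) << 12)
           | ((g & 0x3) << 20)
           | ((s & 0x1) << 22)
           | ((c & 0x7) << 23))
    return fourcc_mod_code(DRM_FORMAT_MOD_VENDOR_NVIDIA, val)


def nvidia_16bx2_block(v: int) -> int:
    """The older modifier expressed with the newer one: all extra fields zero."""
    return nvidia_block_linear_2d(0, 0, 0, 0, v)
```

With this packing, the older 16BX2 modifiers are the special case where only
the block height field is set, which is why existing users keep working.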
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
|
|
|
|
This reverts commit a54bc1337b09a87d44371ddad5ef9d1b909c17f0.
|
|
When cross-compiling OpenCL support, clover will encode the LLVM library
path so that it can add the proper directory (containing opencl-c.h) to
the include path during runtime compilation of programs. In order for
that to work, the LLVM library directory needs to be an absolute path in
the host filesystem.
However, during cross-compilation the LLVM library directory will also
be used to find the clang modules to link against. But at build time the
clang modules will have to be looked up in the LLVM library directory
within the host sysroot, which is a cross-compilation staging area that
is located in an arbitrary directory on the build filesystem.
However, the library search path provided by the dep_llvm dependency
contains the correct path to the clang modules at build time, so the
dependency can be passed to the cc.find_library() command to properly
check for the existence of the library and whether it can actually be
linked to.
NOTE: This depends on a patch that hasn't been merged into Meson yet.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
|
|
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Allow USB devices to be used as output slaves for PRIME. Note that this
currently doesn't work on the X.Org server's built-in modesetting driver
because it requires glamor in order to expose the necessary capabilities
through RandR.
It should be possible to use this in order to accelerate Wayland clients
on the GPU, though it's questionable how useful that is without having a
compositor that gets accelerated.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
A single blank line is enough to separate functions from each other, no
need for two.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
This can be useful for debugging purposes because the flush flag names
are easier to read for humans than the numerical values.
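A minimal sketch of the idea in Python, turning a flag bitmask into
readable names. The flag names and bit values in the table are hypothetical
placeholders, not the driver's actual definitions.

```python
# Hypothetical flush flags, for illustration only.
FLUSH_FLAG_NAMES = [
    (1 << 0, "END_OF_FRAME"),
    (1 << 1, "DEFERRED"),
    (1 << 2, "FENCE_FD"),
]


def flush_flags_to_string(flags: int) -> str:
    """Return the names of all set flags joined by '|'."""
    names = [name for bit, name in FLUSH_FLAG_NAMES if flags & bit]
    known = 0
    for bit, _ in FLUSH_FLAG_NAMES:
        known |= bit
    # Keep unknown bits visible as hex instead of silently dropping them.
    leftover = flags & ~known
    if leftover:
        names.append(hex(leftover))
    return "|".join(names) if names else "0"
```

Printing unknown bits in hex keeps the debug output honest when new flags
are added before the table is updated.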
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Adds a simple helper that can be used to dump the name of a framebuffer
modifier for debug purposes.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Implements fence FDs based on new libdrm API and the accompanying IOCTL.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
And only use --process-isolation false for the quick_gl tests.
This will hopefully avoid variance in the test results that we've been
seeing lately. But even if it doesn't, it should at least help narrow
down the cause of the variance.
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
|
|
This expression was unused by the macro, which is probably why it went
unnoticed during compilation.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
This is a more accurate description of what happens in processing the
OA reports.
Previously we only had a somewhat difficult to parse state machine
tracking the context ID.
All we really need to check to decide whether the delta between 2
reports (r0 & r1) should be accumulated in the query result is:
* whether r0 is tagged with the context ID relevant to us
* if r0 is not tagged with our context ID and r1 is: does r0 have an
invalid context ID? If not then we're in a case where i915 has
resubmitted the same context for execution through the execlist
submission port
v2: Update comment (Ken)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
If we read the OA reports late enough after the query happens, we can
get a timestamp in the report that is significantly in the past
compared to the start timestamp of the query. The current code must
deal with the wraparound of the timestamp value (every ~6 minutes). So
if the difference is greater than half that wraparound period, consider
that we're probably dealing with an old report and make the caller
aware it should read more reports when they're available.
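The half-period test can be sketched in Python as follows. The 32-bit
timestamp width and the function name are assumptions for the example; the
real code works on the raw OA report fields.

```python
TIMESTAMP_BITS = 32  # assumed width of the wrapping timestamp counter
TIMESTAMP_MASK = (1 << TIMESTAMP_BITS) - 1
HALF_PERIOD = 1 << (TIMESTAMP_BITS - 1)


def report_is_stale(start_ts: int, report_ts: int) -> bool:
    """Return True if the report most likely predates the query start.

    The difference is taken modulo the counter width, so a report just
    after a wraparound still yields a small forward delta, while a report
    from before the start yields a delta larger than half the period.
    """
    delta = (report_ts - start_ts) & TIMESTAMP_MASK
    return delta > HALF_PERIOD
```

A caller seeing a stale report would skip it and keep reading until newer
reports become available.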
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
We always add an empty buffer in the list when creating the query.
Let's set the len appropriately so that we can recognize it when we
read OA reports up to the end of a query.
We were using a 0 timestamp value associated with the empty buffer
and incorrectly assuming this was a valid value. In turn that led to
not reading enough reports and resulted in deltas added to our counter
values which should have been discarded because those would be flagged
for a different context.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
Accumulation happens between 2 reports, and it can be between a start/end
report from another context. So only consider updating the hw_id of
the results when it's not already valid and we have a valid value
to put in there.
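The guard described above amounts to a small conditional update, sketched
here in Python. The sentinel value and the function name are hypothetical.

```python
INVALID_HW_ID = 0xFFFFFFFF  # hypothetical sentinel for "no context ID"


def updated_hw_id(result_hw_id: int, report_hw_id: int) -> int:
    """Return the hw_id the result should carry after seeing a report."""
    # Only fill in the ID when the result has none yet and the report
    # actually carries a valid one; otherwise keep what we have.
    if result_hw_id == INVALID_HW_ID and report_hw_id != INVALID_HW_ID:
        return report_hw_id
    return result_hw_id
```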
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 41b54b5faf ("i965: move OA accumulation code to intel/perf")
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
My fix wasn't totally correct as pointed out by Marek.
Ported from RadeonSI.
Fixes: deafe4cc587 ("radv/gfx10: fix primitive indices orientation for NGG GS")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
|
|
The number of loaded channels should always be > 0 now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
|
|
No pipeline-db changes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
|
|
In d1c4e64a69e, we added a parameter to tell the back-end compiler to
ignore the param array and just push however many constants you ask it
to push. Iris doesn't want to push anything so it gives a bogus number
of parameters and trusts the back-end compiler to dead-code all of them.
Now that we can tell the back-end compiler to stop re-arranging things,
delete the hack and enable the new simpler code path.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
Fixes: 44a6b0107b37 (gallivm: add nir->llvm translation (v2))
Tested-by: Vinson Lee <vlee@freedesktop.org>
|
|
To make NIR transitioning easier, move the driver to using
texcoord semantics.
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
|
|
Reject the new formats in swr to prevent crashes because it doesn't
know how to handle the new formats.
Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
|
|
They don't seem to be hugely useful, and seem to be bogging down gitlab.
Signed-off-by: Rob Clark <robdclark@chromium.org>
|
|
gl_Viewport is also in the VUE header so we need to whack the read
offset to 0 and emit a default (no overrides) SBE_SWIZ entry in that
case as well.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
|
|
Fixes skopeo copy failures.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
|
|
Fixes some CTS regressions.
Fixes: e61a826f396 ("ac/llvm: fix pointer type for global atomics")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
|
|
Tomeu: - Small rebase fixups
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
|
|
This wires up the front facing value as a sysval. I'd like to
remove the other facing code but I'd need to confirm VMware
don't use it first.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
Just return 0 for unbound atomic operations.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
This is no longer used.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
|
|
Now that the Mali T720 GPU is supported at the same level as the T760,
test it on PINE64 H64 boards.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
|
|
Over the past months, Panfrost has matured considerably and several
tests are no longer flaky or failing.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
|
|
Support for this GPU is now equal to that of the T760, so whitelist it.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
|
|
Make sure that the fragment is complete when writing it out.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
|
|
We need to always upload anyway.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
|
|
Fixes dEQP-GLES3.functional.primitive_restart.*. Note that the 0x18000
value somehow enables primitive restart by accident.
I'm not sure where this value came from but let's not.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
|
|
The algorithm is as described. Nothing fancy here, just need to add some
new code paths depending on which model we're running on.
Tomeu:
- Also disable tiling when !hierarchy and !vertex_count
- Avoid creating polygon lists smaller than the minimum when
vertex_count > 0 but the tile size is smaller than 16 bytes
- Take into account tile size when calculating polygon list size for
!hierarchy
- Allow 0-sized tiles in a single dimension
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
|
|
We've figured out most of the big pieces, and though it looks faintly
like other Midgards, it's much simpler.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
|
|
Similarly to how it's already done in the compiler, add a way to express
differences between GPU models that need to be taken into account when
assembling the cmdstream.
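One way to express such per-model differences is a table of quirk bits
keyed by GPU ID, sketched below in Python. The quirk name, the GPU ID and
the helper names are hypothetical and only illustrate the shape of the
mechanism.

```python
# Hypothetical quirk bits; real names and values may differ.
QUIRK_NO_HIERARCHICAL_TILING = 1 << 0

# Hypothetical mapping from GPU ID to the quirks that apply to it.
QUIRKS_BY_GPU_ID = {
    0x720: QUIRK_NO_HIERARCHICAL_TILING,
}


def quirks_for_gpu(gpu_id: int) -> int:
    """Return the quirk bitmask for a GPU, or 0 when none are known."""
    return QUIRKS_BY_GPU_ID.get(gpu_id, 0)


def has_quirk(gpu_id: int, quirk: int) -> bool:
    """Check whether a specific quirk applies to the given GPU."""
    return bool(quirks_for_gpu(gpu_id) & quirk)
```

Command stream code can then branch on has_quirk() instead of comparing
GPU IDs directly, which keeps model-specific knowledge in one place.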
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
|