Age | Commit message (Collapse) | Author | Files | Lines |
|
When cross-compiling OpenCL support, clover will encode the LLVM library
path so that it can add the proper directory (containing opencl-c.h) to
the include path during runtime compilation of programs. In order for
that to work, the LLVM library directory needs to be an absolute path in
the host filesystem.
However, during cross-compilation the LLVM library directory will also
be used to find the clang modules to link against. But at build time the
clang modules will have to be looked up in th LLVM library directory
within the host sysroot, which is a cross-compilation staging area that
is located in an arbitrary directory on the build filesystem.
However, the library search path provided by the dep_llvm dependency
contains the correct path to the clang modules at build time, so the
dependency can be passed to the cc.find_library() command to properly
check for the existence of the library and whether it can actually be
linked to.
NOTE: This depends on a patch that hasn't been merged into Meson yet.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Using empty lists ([]) for placeholder dependencies causes warnings from
recent versions of Meson about how dependency types cannot be compared
to a list.
To fix that, use proper placeholder dependencies using:
dependency('', required : False)
This creates an object that can be used in the same context as normal
dependencies, but it is marked as not found and doesn't provide any
flags or libraries on compiler command-lines.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
declare_dependency() returns a project internal dependency which is
always marked as found. However, that's not what we want for default
external dependencies. External dependencies should by default not be
found so that conditionals will behave correctly.
The proper way to do that is using dependency('', required : false).
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
|
|
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Allow USB devices to be used as output slaves for PRIME. Note that this
currently doesn't work on the X.Org server's built-in modesetting driver
because it requires glamor in order to expose the necessary capabilities
through RandR.
It should be possible to use this in order to accelerate Wayland clients
on the GPU, though it's questionable how useful that is without having a
compositor that gets accelerated.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
A single blank line is enough to separate functions from each other, no
need for two.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
This can be useful for debugging purposes because the flush flag names
are easier to read for humans than the numerical values.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Adds a simple helper that can be used to dump the name of a framebuffer
modifier for debug purposes.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Implements fence FDs based on new libdrm API and the accompanying IOCTL.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Resources created with modifiers are treated as scanout because there is
no way for applications to specify the usage (though that capability may
be useful to have in the future). Currently all the resources created by
applications with modifiers are for scanout, so make sure they have bind
flags set accordingly.
This is necessary in order to properly export buffers for such resources
so that they can be shared with scanout hardware.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Resources created for scanout but without modifiers need to be treated
as pitch-linear. This is because applications that don't use modifiers
to create resources must be assumed to not understand modifiers and in
turn won't be able to create a DRM framebuffer and passing along which
modifiers were picked by the implementation.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Disabled by default for now, it can be enabled with
RADV_PERFTEST=outoforder.
No CTS regressions on Polaris, and all Vulkan games I tested
look good as well.
Expect small performance improvements for applications where
out-of-order rasterization can be enabled by the driver.
Loosely based on RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
|
|
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
|
|
With cb_target_enabled_4bit in order to have four bits per CB.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
|
|
Some will be used for further optimizations (ie. out-of-order rast).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
|
|
For GFX9+ only, RadeonSI does this too.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
|
|
Ported from RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
|
|
Ported from RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
|
|
../src/intel/compiler/brw_reg.h: In function ‘bool brw_regs_negative_equal(const brw_reg*, const brw_reg*)’:
../src/intel/compiler/brw_reg.h:305:1: warning: control reaches end of non-void function [-Wreturn-type]
Introduced by 8f83eea71e233 ("i965: Add negative_equals methods").
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
From the SPIR-V spec, OpTypeImage:
"Depth is whether or not this image is a depth image. (Note that
whether or not depth comparisons are actually done is a property of
the sampling opcode, not of this type declaration.)"
The sampling opcodes that specify depth comparisons are
OpImageSample{Proj}Dref{Explicit,Implicit}Lod, so we should set
is_shadow only for these (we were using the deph property of the
image until now).
v2:
- Do the same for OpImageDrefGather.
- Set is_shadow to false if the sampling opcode is not one of these (Jason)
- Reuse an existing switch statement instead of adding a new one (Jason)
Fixes crashes in:
dEQP-VK.spirv_assembly.instruction.graphics.image_sampler.depth_property.*
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
|
|
Gen8+ use 48-bit address relocations so need to extend the sign
to 64-bit return value. Without it we have higher bits zeroed
and missing the negavive values.
Haswell and older use 32-bit deltas so are unaffected by this issue.
v2:
used int32_t fucntion parameter instead of explicit type conversion.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101408
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Tested-by: Andriy Khulap <andriy.khulap@globallogic.com>
Tested-by: Stuart Young <cefiar@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "18.0 17.3" <mesa-stable@lists.freedesktop.org>
|
|
Otherwise we may end up trying to coalesce in a case such as
ssa_1 = fadd r1, r2
r3.x = fneg(r2);
r3 = vec4(ssa_1, ssa_1.y, ...)
and that would cause us to move the writes to r3 from the vec to the
fadd which would re-order them with respect to the write from the fneg.
In order to solve this, we just don't coalesce if the destination of the
vec is not SSA. We could try to get clever and still coalesce if there
are no writes to the destination of the vec between the vec and the ALU
source. However, since registers only come from phi webs and indirects,
the chances of having a vec with a register destination that is actually
coalescable into its source is very slim.
Shader-db results on Haswell:
total instructions in shared programs: 13657906 -> 13659101 (<.01%)
instructions in affected programs: 149291 -> 150486 (0.80%)
helped: 0
HURT: 592
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105440
Fixes: 2458ea95c56 "nir/lower_vec_to_movs: Coalesce movs on-the-fly when possible"
Reported-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Tested-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
|
|
If we close the fd before calling DRM_IOCTL_PRIME_FD_TO_HANDLE the kernel
will hit a -EBADF error. Move the close(fd) call to the end of
anv_CreateDmaBufImageINTEL().
Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
|
|
This fixes a bug in radeonsi where LLVM cannot handle the case where
a break exists but its not the last instruction in the block.
LLVM would fail with:
Terminator found in the middle of a basic block!
LLVM ERROR: Broken function found, compilation aborted!
Fixes: 96fe8834f539 "glsl_to_tgsi: do fewer optimizations with GLSLOptimizeConservatively"
Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105317
|
|
Fixes: bfa22266cd ("vulkan/wsi/wayland: Add support for zwp_dmabuf")
Reviewed-by: Daniel Stone <daniels@collabora.com>
CC: Jason Ekstrand <jason@jlekstrand.net>
|
|
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
|
|
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit ba371c7262a484391cace9d5e17635ed14c58692)
|
|
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 3bf5c10c5c0e9fac6eb0b2c201bcf44755ecfaec)
|
|
When running virgl on a GLES host the only sRGB formats that support
rendering is RGBA and RGBX. That pipe format is in the sRGB default
lists that the state tracker uses when mapping mesa formats.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
|
|
Prior to printing a decoded field, print out all dwords that field
belongs to. In particular with address fields spanning multiple
dwords, we want to have all the dwords presented before the field is
decoded to make it easier to read.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
|
|
MI_LOAD_REGISTER_IMM can load multiple (register, value) tuples in one
command. In our drivers we only use one tuple at a time, but the
kernel might load more than one at a time.
Instead of making all the tuple part of a group, we leave out the
first tuple (the one we use in the generated packing structures).
This is particularly useful for looking at error stats generated by
the kernel.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
|
|
For example, a PIPE_CONTROL with DWordLength = 2 should look like
this :
0xffffe374: 0x7a000002: PIPE_CONTROL
0xffffe374: 0x7a000002 : Dword 0
DWord Length: 2
0xffffe378: 0x00800000 : Dword 1
Depth Cache Flush Enable: false
Stall At Pixel Scoreboard: false
State Cache Invalidation Enable: false
Constant Cache Invalidation Enable: false
VF Cache Invalidation Enable: false
DC Flush Enable: false
Pipe Control Flush Enable: false
Notify Enable: false
Indirect State Pointers Disable: false
Texture Cache Invalidation Enable: false
Instruction Cache Invalidate Enable: false
Render Target Cache Flush Enable: false
Depth Stall Enable: false
Post Sync Operation: 0 (No Write)
Generic Media State Clear: false
TLB Invalidate: false
Global Snapshot Count Reset: false
Command Streamer Stall Enable: false
Store Data Index: 0
LRI Post Sync Operation: 1 (MMIO Write Immediate Data)
Destination Address Type: 0 (PPGTT)
Flush LLC: false
0xffffe37c: 0x00000000 : Dword 2
Address: 0x00000000
0xffffe384: 0x05000000: MI_BATCH_BUFFER_END
Prior to this change, fields beyond the length of the command would be
decoded (notice the MI_BATCH_BUFFER_END decoded as part of the
previous PIPE_CONTROL) :
0xffffe374: 0x7a000002: PIPE_CONTROL
0xffffe374: 0x7a000002 : Dword 0
DWord Length: 2
0xffffe378: 0x00800000 : Dword 1
Depth Cache Flush Enable: false
Stall At Pixel Scoreboard: false
State Cache Invalidation Enable: false
Constant Cache Invalidation Enable: false
VF Cache Invalidation Enable: false
DC Flush Enable: false
Pipe Control Flush Enable: false
Notify Enable: false
Indirect State Pointers Disable: false
Texture Cache Invalidation Enable: false
Instruction Cache Invalidate Enable: false
Render Target Cache Flush Enable: false
Depth Stall Enable: false
Post Sync Operation: 0 (No Write)
Generic Media State Clear: false
TLB Invalidate: false
Global Snapshot Count Reset: false
Command Streamer Stall Enable: false
Store Data Index: 0
LRI Post Sync Operation: 1 (MMIO Write Immediate Data)
Destination Address Type: 0 (PPGTT)
Flush LLC: false
0xffffe37c: 0x00000000 : Dword 2
Address: 0x00000000
0xffffe380: 0x00000000 : Dword 3
0xffffe384: 0x05000000 : Dword 4
Immediate Data: 83886080
0xffffe384: 0x05000000: MI_BATCH_BUFFER_END
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
|
|
The kernel reports workaround batch buffers, but we're not presenting
them currently. Also they might not be useful for debugging purely
userspace driver issues, when problems arise because of interactions
between kernel & userspace drivers, it's nice to be able to decode
them.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
|
|
Helpful to debug kernel workaround batchbuffers.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
|
|
This variable controls whether we link using the glsl code path or the
spirv path. It's set when we validate that all shaders are glsl or
spirv, but if there are no shaders attached to the program it will
remain unset, resulting in undefined behavior. We want to go down the
glsl path in that case, so initialize to false.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105820
Fixes: 16f6634e7fb5ada308e55b852cd49251e7f3f8b1
("mesa/program: Link SPIR-V shaders using the SPIR-V code-path")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
|
|
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
|
|
The driver already supports exporting the Layer and ViewportIndex
built-ins from vertex or tessellation shaders.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
|
|
Add helpers to get the number of src/dest components for an intrinsic,
and update spots that were open-coding this logic to use the helpers
instead.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
Since we were MARK flag for both preventing loops, and tracking whether
instructions were used, we could end up in an infinite loop due to
bd2ca2bcdd. Instead invert the logic.. mark all instructions UNUSED
up front and clear the flag as we visit them.
Fixes: bd2ca2bcdd freedreno/ir3: eliminate unused false-deps
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
Without this the return value will never get set to -1. This
was first added in 49866c8f3457 and copied in 2b396eeed983.
Fixes: 2b396eeed983 "gallium/pb_cache: add a copy of cache bufmgr independent of pb_manager"
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102342
|
|
This reverts commit 41cf30b8bc55fdf36adac3311002dc32b6715949.
Commit caused regressions with KHR-GLES3.packed_pixels.* tests.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Suggested-by: Eric Anholt <eric@anholt.net>
|
|
Include llvm-c/Transforms/Utils.h with the newest LLVM 7
Signed-of-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
|
|
Include llvm-c/Transforms/Utils.h with the newest LLVM 7
Signed-of-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
|
|
Include llvm-c/Transforms/Utils.h with the newest LLVM 7
Signed-of-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
|
|
When allocating a buffer for DRI2, set the modifier to INVALID to inform
the backend that we have no supplied modifiers and it should do its own
thing. The missed initialisation forced linear, even if the
implementation had made other decisions.
This resulted in VC4 DRI2 clients failing with:
Modifier 0x0 vs. tiling (0x700000000000001) mismatch
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reported-by: Andreas Müller <schnitzeltony@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Fixes: 3f8513172ff6 ("gallium/winsys/drm: introduce modifier field to winsys_handle")
|