path: root/src
Age | Commit message | Author | Files | Lines
2018-04-20 | i965/miptree: Don't leak the clear_color_bo | Nanley Chery | 1 | -2/+1
Free the clear_color_bo in addition to freeing the intel_miptree_aux_buffer which holds the reference to it. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-20 | i965/blorp: Do the gen11 BTI flush | Jason Ekstrand | 1 | -0/+14
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-04-20 | anv/blorp: Do the gen11 BTI flush | Jason Ekstrand | 1 | -0/+14
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-04-20 | etnaviv: fix texture_format_needs_swiz | Lucas Stach | 1 | -1/+1
memcmp returns 0 when both swizzles are the same, which means we don't need any hardware swizzling. texture_format_needs_swiz should return true when the return value of the memcmp is non-zero. Fixes: 751ae6afbefd ("etnaviv: add support for swizzled texture formats") Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Tested-by: Marek Vasut <marex@denx.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
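The inverted memcmp test described above can be illustrated with a minimal sketch. The `swizzle4` struct and `needs_swizzle` function are hypothetical stand-ins, not the etnaviv types or the driver's actual code; the only point shown is that a non-zero memcmp result means the swizzles differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a 4-component swizzle; not the etnaviv type. */
struct swizzle4 {
    unsigned char comp[4];
};

/* Returns true when the requested swizzle differs from the native one.
 * memcmp() returns 0 for identical buffers, so "needs swizzling" is the
 * non-zero case. */
static bool needs_swizzle(const struct swizzle4 *native,
                          const struct swizzle4 *requested)
{
    assert(native != NULL && requested != NULL);
    return memcmp(native->comp, requested->comp, sizeof(native->comp)) != 0;
}
```

Returning the raw memcmp result compared against zero the wrong way round reports "no swizzle needed" exactly when one is needed, which matches the bug the commit message describes.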
2018-04-20 | ac/nir: fix image dimension for subpass attachments | Samuel Pitoiset | 1 | -3/+15
For subpass attachments we need one more coordinate with the layer, so make them array types. This fixes a bunch of CTS fails with RADV. Fixes: 24fb3e6aa1 ("ac/nir: use ac_build_image_opcode for image intrinsics") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-20 | radv: Mark GTT memory as device local for APUs. | Bas Nieuwenhuizen | 1 | -3/+5
Otherwise a lot of games complain about not having enough memory, and it is sort of local so this seems reasonable to me. CC: 18.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-20 | radv/winsys: allow to submit up to 4 IBs for chips without chaining | Samuel Pitoiset | 1 | -50/+168
The SI family doesn't support chaining, which means the maximum size in dwords per CS is limited. When that limit was reached we failed to submit the CS and the application crashed. This patch allows submitting up to 4 IBs, which is currently the limit, although recent amdgpu supports more than that. Please note that we can reach the limit of 4 IBs per submit but currently we can't improve that. The only solution is to upgrade libdrm. That will be improved later but for now this should fix crashes on SI or when using RADV_DEBUG=noibs. Fixes: 36cb5508e89 ("radv/winsys: Fail early on overgrown cs.") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105775 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
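The arithmetic behind the 4-IB limit can be sketched as follows. This is only an illustration under assumptions: `count_ibs`, `MAX_IBS_PER_SUBMIT` and the per-IB dword limit passed in are hypothetical names and values, not the RADV winsys code, which splits the command stream while recording rather than after the fact.

```c
#include <assert.h>
#include <stdint.h>

/* Limit taken from the commit message; the name is hypothetical. */
#define MAX_IBS_PER_SUBMIT 4

/* Returns how many IBs are needed to hold total_dw dwords when a single IB
 * can hold at most max_dw_per_ib dwords, or -1 if the submission would need
 * more IBs than one submit allows. */
static int count_ibs(uint32_t total_dw, uint32_t max_dw_per_ib)
{
    assert(max_dw_per_ib > 0);

    /* Round up without risking overflow in total_dw + max_dw_per_ib. */
    uint32_t n = total_dw / max_dw_per_ib + (total_dw % max_dw_per_ib != 0);

    return n > MAX_IBS_PER_SUBMIT ? -1 : (int)n;
}
```

With a hypothetical limit of 64 dwords per IB, 256 dwords still fit in one submit while 257 dwords do not.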
2018-04-20 | gallium/util: Android backtrace support | Stefan Schake | 3 | -1/+114
We can't use any of the existing implementations in u_debug_stack. Android technically has libunwind, but it's been modified to the point where it no longer compiles with the Mesa usage. The library is also not meant to be referenced by vendor libraries. The officially sanctioned way of obtaining backtraces is through Android's own libbacktrace, a C++ library. Access it through a separate C++ source file on Android only. Signed-off-by: Stefan Schake <stschake@gmail.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-04-20 | gallium/util: Don't stub u_debug_stack on Android | Stefan Schake | 1 | -1/+2
The fallback path for no libunwind ends up being stubs for Android. Don't compile them in so we can provide our own implementation. Signed-off-by: Stefan Schake <stschake@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-04-20 | ac/nir: handle nir_intrinsic_load_first_vertex like base_vertex | Samuel Pitoiset | 1 | -2/+2
This fixes a ton of CTS crashes. Fixes: c366f422f0 ("nir: Offset vertex_id by first_vertex instead of base_vertex") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-20 | radv/winsys: allow local BOs on APUs | Samuel Pitoiset | 1 | -1/+2
Ported from RadeonSI. Local BOs ignore BO priorities, and we don't need those on APUs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-20 | radv: use a global BO list only for VK_EXT_descriptor_indexing | Samuel Pitoiset | 3 | -9/+34
Maintaining two different paths is annoying but this gets rid of the performance regression introduced by the global BO list. We might find a better solution in the future, but for now just keep two paths. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-20 | Revert "radv: Don't store buffer references in the descriptor set." | Samuel Pitoiset | 5 | -13/+82
In order to reduce a performance regression introduced by 4b13fe55a4 ("radv: Keep a global BO list for VkMemory."), we are going to maintain two different paths. One when VK_EXT_descriptor_indexing is enabled by the application because we need to have a global BO list, and one (the old one) when it's not enabled. With Talos on Polaris, the global BO list reduces performance by 10% which is too much for me. This reverts commit ab6cadd3ecc7fbdd9079808b407674e0b19c52f0. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-20 | i965/fs: retype offset_reg to UD at load_ssbo | Jose Maria Casanova Crespo | 1 | -1/+1
All operations with offset_reg at do_vector_read are done with UD type. So copy propagation was not working through the generated MOVs: mov(8) vgrf9:UD, vgrf7:D This change allows removing the MOV generated for reading the first components for 16-bit and 64-bit ssbo reads with non-constant offsets. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-04-20 | ac/nir: use ac_build_image_opcode for image intrinsics | Nicolai Hähnle | 3 | -140/+78
So that we'll use the dimension-aware intrinsics in the future. Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-20 | radeonsi: generate image load/store/atomic ops using ac_build_image_opcode | Nicolai Hähnle | 4 | -164/+210
In preparation of dimension-aware LLVM image intrinsics. Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-20 | amd/common: pass address components individually to ac_build_image_intrinsic | Nicolai Hähnle | 5 | -409/+295
This is in preparation for the new image intrinsics. Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-20 | amd/common: pass new enum ac_image_dim to ac_build_image_opcode | Nicolai Hähnle | 4 | -13/+114
This is in preparation for the new, dimension-aware LLVM image intrinsics. Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-20 | radeonsi/nir: fix crash in test involving the sample mask | Nicolai Hähnle | 1 | -1/+2
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-20 | radeonsi/nir: set FS properties only when scanning a fragment shader | Nicolai Hähnle | 1 | -1/+2
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-20 | ac/nir: fix atomic compare-and-swap | Nicolai Hähnle | 1 | -0/+1
The LLVM instruction returns { i32, i1 }, where the i1 indicates success. We're only interested in the first part, which is the loaded value. Fixes dEQP-GLES31.functional.compute.shared_var.atomic.compswap.* Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
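The same "keep the loaded value, drop the success flag" idea can be shown with C11 atomics instead of LLVM IR. This is an analogy, not the ac/nir code: `comp_swap` is a hypothetical helper, and the boolean returned by `atomic_compare_exchange_strong` plays the role of the i1 in the `{ i32, i1 }` pair.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compare-and-swap that returns the value found in
 * *obj before the operation, whether or not the swap took place. */
static uint32_t comp_swap(_Atomic uint32_t *obj, uint32_t compare,
                          uint32_t data)
{
    assert(obj != NULL);

    uint32_t expected = compare;

    /* The boolean result (success flag) is deliberately ignored. */
    (void)atomic_compare_exchange_strong(obj, &expected, data);

    /* On failure, expected was overwritten with the loaded value; on
     * success it still equals the loaded value, which matched compare. */
    return expected;
}
```

A caller that wants shader-style compare-and-swap semantics only ever looks at the returned value, which is why extracting the first element of the pair is sufficient.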
2018-04-20 | radeonsi: fix error paths of si_texture_transfer_map | Nicolai Hähnle | 1 | -13/+12
trans is zero-initialized, but trans->resource is set up immediately, so its reference needs to be released. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-20 | glsl: prevent spurious Valgrind errors when serializing NIR | Nicolai Hähnle | 1 | -2/+4
It looks as if the structure fields array is fully initialized below, but in fact at least gcc in debug builds will not actually overwrite the unused bits of bit fields. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
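A common way to avoid this class of warning is to zero the whole object before assigning the bit fields, so the unused bits have a defined value when the bytes are later hashed or written out. The sketch below assumes a made-up `field_flags` layout; it is not the glsl_struct_field definition and not necessarily how the commit fixes it.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical bit-field layout; the remaining bits of the storage unit
 * are unused and would otherwise keep whatever was in memory. */
struct field_flags {
    unsigned location : 10;
    unsigned interpolation : 2;
    unsigned centroid : 1;
};

/* Fills *f so that every byte, including unused bits, is initialized. */
static void init_field_flags(struct field_flags *f, unsigned location,
                             unsigned interpolation, unsigned centroid)
{
    assert(f != NULL);

    /* Zero first: assigning the fields alone need not touch unused bits. */
    memset(f, 0, sizeof(*f));

    f->location = location;
    f->interpolation = interpolation;
    f->centroid = centroid;
}
```

After this initialization, two objects holding the same field values also compare equal byte for byte, which is what serialization code typically relies on.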
2018-04-19 | clover: Fix host access validation for sub-buffer creation | Aaron Watry | 1 | -2/+7
From CL 1.2 Section 5.2.1: CL_INVALID_VALUE if buffer was created with CL_MEM_HOST_WRITE_ONLY and flags specify CL_MEM_HOST_READ_ONLY , or if buffer was created with CL_MEM_HOST_READ_ONLY and flags specify CL_MEM_HOST_WRITE_ONLY , or if buffer was created with CL_MEM_HOST_NO_ACCESS and flags specify CL_MEM_HOST_READ_ONLY or CL_MEM_HOST_WRITE_ONLY . Fixes CL 1.2 CTS test/api get_buffer_info v2: Correct host_access_flags check (Francisco) Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>
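The three CL_INVALID_VALUE conditions quoted above translate into a small predicate. The sketch uses local `HOST_*` constants as illustrative stand-ins for the OpenCL flags so that it compiles without the CL headers; the function name is hypothetical and this is not clover's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for CL_MEM_HOST_WRITE_ONLY, CL_MEM_HOST_READ_ONLY
 * and CL_MEM_HOST_NO_ACCESS; the bit values here are arbitrary. */
enum {
    HOST_WRITE_ONLY = 1u << 0,
    HOST_READ_ONLY  = 1u << 1,
    HOST_NO_ACCESS  = 1u << 2,
};

/* Returns false for exactly the parent/sub-buffer flag combinations that
 * the quoted spec text says must yield CL_INVALID_VALUE. */
static bool sub_buffer_host_access_valid(uint32_t parent, uint32_t sub)
{
    if ((parent & HOST_WRITE_ONLY) && (sub & HOST_READ_ONLY))
        return false;
    if ((parent & HOST_READ_ONLY) && (sub & HOST_WRITE_ONLY))
        return false;
    if ((parent & HOST_NO_ACCESS) &&
        (sub & (HOST_READ_ONLY | HOST_WRITE_ONLY)))
        return false;
    return true;
}

/* Small usage example. */
static void example_usage(void)
{
    assert(sub_buffer_host_access_valid(HOST_READ_ONLY, HOST_READ_ONLY));
    assert(!sub_buffer_host_access_valid(HOST_NO_ACCESS, HOST_WRITE_ONLY));
}
```

Note that a sub-buffer may always narrow access further, for example requesting no host access on a read-only parent; only widening or contradicting the parent's host access is rejected.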
2018-04-19 | nir: Offset vertex_id by first_vertex instead of base_vertex | Neil Roberts | 6 | -13/+6
base_vertex will be zero for non-indexed calls and in that case we need vertex_id to be offset by the ‘first’ parameter instead. That is what we get with first_vertex. This is true for both GL and Vulkan. The freedreno driver is also setting vertex_id_zero_based on nir_options. In order to avoid breakage this patch switches the relevant code to handle SYSTEM_VALUE_FIRST_VERTEX so that it can retain the same behavior. v2: change a3xx/fd3_emit.c and a4xx/fd4_emit.c from SYSTEM_VALUE_BASE_VERTEX to SYSTEM_VALUE_FIRST_VERTEX (Kenneth). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: Rob Clark <robdclark@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-19 | spirv: Lower BaseVertex to FIRST_VERTEX instead of BASE_VERTEX | Neil Roberts | 3 | -5/+18
The base vertex in Vulkan is different from GL in that for non-indexed primitives the value is taken from the firstVertex parameter instead of being set to zero. This coincides with the new SYSTEM_VALUE_FIRST_VERTEX instead of BASE_VERTEX. v2 (idr): Add comment describing why SYSTEM_VALUE_FIRST_VERTEX is used for SpvBuiltInBaseVertex. Suggested by Jason. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1] Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-19 | intel: Handle firstvertex in an identical way to BaseVertex | Antia Puentes | 7 | -13/+35
Until we set gl_BaseVertex to zero for non-indexed draw calls both have an identical value. The Vertex Elements are kept like that: * VE 1: <BaseVertex/firstvertex, BaseInstance, VertexID, InstanceID> * VE 2: <Draw ID, 0, 0, 0> v2 (idr): Mark nir_intrinsic_load_first_vertex as "unreachable" in emit_system_values_block and fs_visitor::nir_emit_vs_intrinsic.
2018-04-19 | intel/compiler: Add a uses_firstvertex flag | Neil Roberts | 2 | -0/+5
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-19 | compiler: Add SYSTEM_VALUE_FIRST_VERTEX and instrinsics | Antia Puentes | 5 | -0/+21
This VS system value will contain the value passed as <basevertex> for indexed draw calls or the value passed as <first> for non-indexed draw calls. It can be used to calculate the gl_VertexID as SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus SYSTEM_VALUE_FIRST_VERTEX. From the OpenGL 4.6 spec, 10.4 "Drawing Commands Using Vertex Arrays": - Page 352: "The index of any element transferred to the GL by DrawArraysOneInstance is referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID. The vertex ID of the ith element transferred is first + i." - Page 355: "The index of any element transferred to the GL by DrawElementsOneInstance is referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID. The vertex ID of the ith element transferred is the sum of basevertex and the value stored in the currently bound element array buffer at offset indices + i." Currently the gl_VertexID calculation uses SYSTEM_VALUE_BASE_VERTEX but this will have to change when the value of gl_BaseVertex is fixed. Currently its value is broken for non-indexed draw calls because it must be zero but we are setting it to <first>. v2: use SYSTEM_VALUE_FIRST_VERTEX as name for the value, instead of SYSTEM_VALUE_BASE_VERTEX_ID (Kenneth). v3 (idr): Rebase on Rob Clark converting nir_intrinsics.h to be generated. Reformat commit message to 72 columns. Reviewed-by: Neil Roberts <nroberts@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-19 | meson: Build st_tests_common with gtest | Mike Lothian | 1 | -1/+1
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106131 Fixes: 34cb4d0ebc1 ("meson: build tests for gallium mesa state tracker") Signed-off-by: Mike Lothian <mike@fireburn.co.uk> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-04-19 | radv: Add Vega M support. | Bas Nieuwenhuizen | 4 | -2/+11
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-19 | radv: Add bound checking workaround for dynamic buffers. | Bas Nieuwenhuizen | 3 | -1/+5
I have seen a few applications and games set the dynamic buffer bounds incorrectly; this makes that easier to work around, e.g. for debugging. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-19 | svga: Fix incorrect advertizing of EGL_KHR_gl_colorspace | Thomas Hellstrom | 1 | -1/+1
When advertizing this extension, egl_dri2 uses the DRI2_RENDERER_QUERY extension to query whether an sRGB format is supported. That extension will query our driver with the BIND flag PIPE_BIND_RENDER_TARGET rather than PIPE_BIND_DISPLAY_TARGET which is used when building the configs. We only return the correct value for PIPE_BIND_DISPLAY_TARGET. The inconsistency causes EGL to crash at surface initialization if sRGB is not supported. Fix this by supporting both bind flags. Testing done: piglit egl_gl_colorspace srgb Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-04-19 | swr: Fix include for createPromoteMemoryToRegisterPass | Mike Lothian | 1 | -0/+3
Include llvm/Transforms/Utils.h with the newest LLVM 7 v2: Include with " " rather than < > (Vinson Lee) v3: Use LLVM_VERSION_MAJOR rather than HAVE_LLVM (George Kyriazis) Signed-off-by: Mike Lothian <mike@fireburn.co.uk> Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
2018-04-19 | radv: enable DCC for MSAA 2x textures on VI under an option | Samuel Pitoiset | 4 | -1/+13
This can be enabled with RADV_PERFTEST=dccmsaa. DCC for MSAA textures is actually not as easy to implement. It looks like there are some corner cases. I will improve support incrementally. Vega support, as well as Polaris improvements, will be added later. No CTS changes on Polaris using RADV_DEBUG=zerovram and RADV_PERFTEST=dccmsaa. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 | radv: decompress DCC for multisampled source images before resolving | Samuel Pitoiset | 4 | -4/+18
Multisampled source images (i.e. color attachments) can now be DCC compressed, so the driver needs to perform a DCC decompression pass before resolving. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 | radv: add a workaround for fast clears with DCC and MSAA textures | Samuel Pitoiset | 1 | -0/+9
This should be fixed at some point in order to improve performance. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 | radv: allocate CMASK for DCC fast clear with MSAA | Samuel Pitoiset | 1 | -0/+7
CMASK is required because it should be cleared to 0xCCCCCCCC for MSAA textures. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 | radv: implement fast color clear for DCC with MSAA | Samuel Pitoiset | 1 | -1/+16
When DCC is enabled with MSAA textures, CMASK should be cleared to 0xCCCCCCCC. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 | radv: make sure to sync after resolving using the compute path | Samuel Pitoiset | 1 | -0/+3
This fixes some random CTS failures: dEQP-VK.renderpass.multisample.*. Performing a fast-clear eliminate is still useless, but it seems that we need to sync. Found while running CTS with RADV_DEBUG=zerovram. Fixes: 56a171a499c ("radv: don't fast-clear eliminate after resolving a subpass with compute") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 | radv: dump the SHA1 of SPIRV in the hang report | Samuel Pitoiset | 1 | -1/+8
Might be useful for debugging purposes, especially when we want to replace a shader on the fly. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-18 | radv: Enable VK_EXT_descriptor_indexing. | Bas Nieuwenhuizen | 3 | -0/+63
This adds everything except non-uniform indexing, which needs a bit more work and testing. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 | spirv: Add support for runtime descriptor array cap. | Bas Nieuwenhuizen | 2 | -0/+5
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 | spirv: Add support for VK_EXT_descriptor_indexing uniform indexing caps. | Bas Nieuwenhuizen | 2 | -0/+7
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 | radv: Support allocating variable size descriptor sets. | Bas Nieuwenhuizen | 1 | -4/+17
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 | radv: Add support for variable descriptor set layouts. | Bas Nieuwenhuizen | 2 | -1/+30
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 | radv: Fix GetDescriptorSetLayoutSupport. | Bas Nieuwenhuizen | 1 | -3/+0
The continue means we do alignment differently than during creation, making the buffer smaller than expected. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 | radv: Use sorted bindings for set layout creation. | Bas Nieuwenhuizen | 1 | -2/+41
Previously we did not care about having the set storage in order, but for variable descriptor count we want the highest binding at the end of the storage. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
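Sorting by binding number before laying out storage is enough to guarantee that the highest binding ends up last. The sketch below uses a hypothetical `binding` struct and `qsort`; it is not the RADV layout code and says nothing about how the driver actually computes offsets.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified descriptor set layout binding. */
struct binding {
    uint32_t binding;          /* binding number from the application */
    uint32_t descriptor_count; /* number of descriptors in this binding */
};

/* qsort comparator: ascending by binding number, written without
 * subtraction so large unsigned values cannot overflow. */
static int compare_bindings(const void *a, const void *b)
{
    const struct binding *x = a;
    const struct binding *y = b;
    return (x->binding > y->binding) - (x->binding < y->binding);
}

/* Sorts in place so the highest binding is stored last, which is where a
 * variable-sized binding has room to grow. */
static void sort_bindings(struct binding *bindings, size_t count)
{
    if (count == 0)
        return;
    assert(bindings != NULL);
    qsort(bindings, count, sizeof(*bindings), compare_bindings);
}
```

Applications may pass bindings in any order, so the sort runs on a copy or an index array in practice if the original order must be preserved; that choice is left open here.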
2018-04-18 | radv: Don't store buffer references in the descriptor set. | Bas Nieuwenhuizen | 5 | -82/+13
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 | radv: Keep a global BO list for VkMemory. | Bas Nieuwenhuizen | 4 | -39/+146
With update after bind we can't attach bo's to the command buffer from the descriptor set anymore, so we have to have a global BO list. I am somewhat surprised this works really well even though we have implicit synchronization in the WSI based on the bo list associations and with the new behavior every command buffer is associated with every swapchain image. But I could not find slowdowns in games because of it. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>