Age | Commit message | Author | Files | Lines
2024-03-27 | wsi: Implement linux-drm-syncobj-v1 [explicit-sync] | Joshua Ashton | 1 | -5/+101
This implements explicit sync with linux-drm-syncobj-v1 for the Wayland WSI. Signed-off-by: Joshua Ashton <joshua@froggi.es>
2024-03-27 | wsi: Add common infrastructure for explicit sync | Joshua Ashton | 3 | -34/+474
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2024-03-27 | wsi: Get timeline semaphore exportable handle types | Joshua Ashton | 2 | -1/+13
We need to know this for explicit sync Signed-off-by: Joshua Ashton <joshua@froggi.es>
2024-03-25 | wsi: Track CPU side present ordering via a serial | Joshua Ashton | 2 | -0/+4
We will use this in our heuristics to pick the most suitable buffer in AcquireNextImageKHR. Signed-off-by: Joshua Ashton <joshua@froggi.es>
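A minimal sketch of how a present serial could feed an acquire heuristic. The struct names, fields, and the "prefer the free image presented longest ago" rule are all assumptions for illustration; the commit message does not say which heuristic the WSI code actually uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the real wsi_image / wsi_swapchain layout. */
struct sketch_image {
   bool acquired;            /* currently owned by the application */
   uint64_t present_serial;  /* serial of the last present; 0 = never presented */
};

struct sketch_swapchain {
   struct sketch_image images[4];
   size_t image_count;
   uint64_t present_serial;  /* last serial handed out on the CPU side */
};

/* Presenting stamps the image with the next serial and releases it. */
static void sketch_present(struct sketch_swapchain *chain, size_t index)
{
   chain->images[index].present_serial = ++chain->present_serial;
   chain->images[index].acquired = false;
}

/* Assumed heuristic: among images the app does not hold, pick the one
 * presented longest ago (lowest serial). Returns -1 if none is free. */
static int sketch_acquire(struct sketch_swapchain *chain)
{
   int best = -1;
   for (size_t i = 0; i < chain->image_count; i++) {
      if (chain->images[i].acquired)
         continue;
      if (best < 0 ||
          chain->images[i].present_serial < chain->images[best].present_serial)
         best = (int)i;
   }
   if (best >= 0)
      chain->images[best].acquired = true;
   return best;
}
```

The serial gives a total CPU-side order of presents, so the choice does not depend on array position or on timestamps.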
2024-03-25 | wsi: Add acquired member to wsi_image | Joshua Ashton | 3 | -1/+19
Tracks whether this wsi_image has been acquired by the app Signed-off-by: Joshua Ashton <joshua@froggi.es>
2024-03-22 | wsi: Track if timeline semaphores are supported | Joshua Ashton | 2 | -0/+3
This will be needed before we expose and use explicit sync. Even if the host Wayland compositor supports timeline semaphores, in the case of Venus, etc., the underlying driver may not. Signed-off-by: Joshua Ashton <joshua@froggi.es>
2024-03-22 | build: Add linux-drm-syncobj-v1 wayland protocol | Joshua Ashton | 2 | -0/+2
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2024-03-22 | wsi: Add explicit_sync to wsi_drm_image_params | Joshua Ashton | 2 | -0/+5
Allow the WSI frontend to request explicit sync buffers. Signed-off-by: Joshua Ashton <joshua@froggi.es>
2024-03-22 | wsi: Add explicit_sync to wsi_image_info | Joshua Ashton | 3 | -10/+8
Will be used in the future for specifying explicit sync for Vulkan WSI when supported. Additionally cleans up wsi_create_buffer_blit_context, etc. Signed-off-by: Joshua Ashton <joshua@froggi.es>
2024-03-20 | wsi: Pass wsi_drm_image_params to wsi_configure_prime_image | Joshua Ashton | 1 | -8/+9
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2024-03-20 | wsi: Pass wsi_drm_image_params to wsi_configure_native_image | Joshua Ashton | 1 | -13/+9
No need to split this out into function parameters, it's just less clean. Signed-off-by: Joshua Ashton <joshua@froggi.es>
2024-03-20 | radv: add a workaround for null IBO on GFX6 | Samuel Pitoiset | 3 | -0/+9
Based on PAL. Fixes dEQP-VK.draw.*nulldescriptor_maintenance_5_maintenance6 on GFX6. Cc: mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28263>
2024-03-20 | broadcom/ci: add new expected failures | Juan A. Suarez Romero | 2 | -67/+80
Add more expected failures that should have been included in 74be42d9a4ac. Fixes: 74be42d9a4a ("broadcom/ci: add new expected test failures") Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28298>
2024-03-20 | zink: do io fixup on patch variables too | Mike Blumenkrantz | 1 | -1/+1
fixes spec@arb_separate_shader_objects@rendezvous by location (5 stages) cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28296>
2024-03-20 | radv: use dual_color_blend_by_location with Half-Life Alyx | Rhys Perry | 1 | -0/+4
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Ethan Lee <flibitijibibo@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10462 Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28269>
2024-03-20 | intel/brw: Eliminate top-level FIND_LIVE_CHANNEL & BROADCAST once | Kenneth Graunke | 1 | -1/+22
brw_fs_opt_eliminate_find_live_channel eliminates FIND_LIVE_CHANNEL outside of control flow. None of our optimization passes generate additional cases of that instruction, so once it's gone, we shouldn't ever have to run the pass again. Moving it out of the loop should save a bit of CPU time.
While we're at it, also clean adjacent BROADCAST instructions that consume the result of our FIND_LIVE_CHANNEL. Without this, we have to perform copy propagation to get the MOV 0 immediate into the BROADCAST, then algebraic to turn it into a MOV, which enables more copy propagation...not to mention CSE gets involved. Since this FIND_LIVE_CHANNEL + BROADCAST pattern from emit_uniformize() is really common, and it's trivial to clean up, we can do that. This lets the initial copy prop in the loop see MOV instead of BROADCAST.
Zero impact on fossil-db, but less work in the optimization loop. Together with the previous patches, this cuts compile time in Borderlands 3 on Alchemist by -1.38539% +/- 0.1632% (n = 24).
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28286>
2024-03-20 | intel/brw: Don't consider UNIFORM_PULL_CONSTANT_LOAD a send-from-GRF | Kenneth Graunke | 1 | -2/+0
It's a logical opcode which is lowered to a send-from-GRF later. That lowering code is responsible for ensuring the sources are set up in a proper SEND payload.
This was preventing copy propagation of surface handles which started out as scalars, were splatted out to full-SIMD values with NoMask, then actually consumed as only component 0 (scalar again), because we thought that scalar values were not allowed.
fossil-db on Alchemist shows improvements in q2rtx but no other titles:
Totals:
Instrs: 161310436 -> 161310152 (-0.00%)
Cycles: 14370605159 -> 14370601066 (-0.00%)
Totals from 17 (0.00% of 652298) affected shaders:
Instrs: 16097 -> 15813 (-1.76%)
Cycles: 185508 -> 181415 (-2.21%)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28286>
2024-03-20 | intel/brw: Split out 64-bit lowering from algebraic optimizations | Kenneth Graunke | 4 | -72/+101
We don't necessarily want to split up MOVs for 64-bit addresses into 2x 32-bit MOVs right away, as this makes things like copy propagating the whole address around harder. We should do this late, once, while still doing other algebraic optimizations earlier.
fossil-db results for Alchemist show tiny improvements:
Totals:
Instrs: 161310502 -> 161310436 (-0.00%); split: -0.00%, +0.00%
Cycles: 14370605606 -> 14370605159 (-0.00%); split: -0.00%, +0.00%
Totals from 33 (0.01% of 652298) affected shaders:
Instrs: 15053 -> 14987 (-0.44%); split: -0.64%, +0.20%
Cycles: 196947 -> 196500 (-0.23%); split: -0.25%, +0.02%
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28286>
2024-03-19 | iris: Use resource_get_param in resource_get_handle | Nanley Chery | 1 | -8/+15
Refactor iris_resource_get_handle to use iris_resource_get_param to pick up the fix from the previous patch. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9994 Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28258>
2024-03-19 | iris: Report the correct modifier for Tile4 images | Nanley Chery | 1 | -2/+26
In iris_resource_get_param, report the Tile4 modifier for Tile4 images instead of reporting the linear modifier. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28258>
2024-03-19 | intel/dev: remove pci revision from shader cache key | Mark Janes | 2 | -1/+9
PCI revision was included in the shader cache key because it can enable platform workarounds. While some platform workarounds exist in the compiler, none are dependent on the silicon stepping. Many platforms differ only in the PCI revision ID, causing needless duplication in cache entries between platforms. When a platform ships publicly with stepping-specific compiler workarounds, PCI ID must be incorporated into the shader cache key. Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28085>
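To illustrate the effect described above, here is a minimal sketch of a cache key that hashes device properties while deliberately skipping the revision. The struct, its fields, and the use of FNV-1a are assumptions for illustration only; they are not the fields or the hash Mesa uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device description; not the real intel_device_info. */
struct sketch_devinfo {
   uint32_t platform_id;      /* what the compiler actually keys on */
   uint8_t pci_revision_id;   /* silicon stepping */
};

/* 64-bit FNV-1a, used here only as a compact stand-in hash. */
static uint64_t sketch_fnv1a(uint64_t hash, const void *data, size_t len)
{
   const unsigned char *bytes = data;
   for (size_t i = 0; i < len; i++) {
      hash ^= bytes[i];
      hash *= 0x100000001b3ULL;
   }
   return hash;
}

static uint64_t sketch_shader_cache_key(const struct sketch_devinfo *dev)
{
   uint64_t hash = 0xcbf29ce484222325ULL;
   hash = sketch_fnv1a(hash, &dev->platform_id, sizeof(dev->platform_id));
   /* pci_revision_id is intentionally not hashed: no compiler workaround
    * depends on it, so devices differing only in stepping share entries. */
   return hash;
}
```

Two devices that differ only in `pci_revision_id` then map to the same key, which is the deduplication the commit is after.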
2024-03-19 | aco: Allow passing constant operand to is_overwritten_since. | Timur Kristóf | 1 | -5/+13
This is to make it more intuitive and also consistent with last_writer_idx which does allow constant operands. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28046>
2024-03-19 | zink: acquire - maybe clear timeout after waiting for presentation fence | Gert Wollny | 1 | -0/+19
If the presentation fence was signalled and we still hold max_acquires or more images, then clear the timeout to avoid a possible deadlock. With that we avoid the validation error VUID-vkAcquireNextImageKHR-surface-07783 triggered by piglit spec@!opengl 1.0@gl-1.0-drawbuffer-modes and others. v2: clear timeout only if we have acquired more images than the reported max and add some comment why the timeout is cleared (Mike). Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28245>
2024-03-19 | nouveau: Add support for TERT opcodes in vk_push_print | Mary Guillemard | 1 | -52/+93
Those opcodes are a vestige of the old command format. This implements handling of them and fixes issues when analysing command buffers that use them. Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28277>
2024-03-19 | intel/fs: Avoid generating useless UNDEFs for every SSA def | Kenneth Graunke | 1 | -1/+4
Emitting UNDEF is only necessary when the instructions we generate to produce the NIR def are considered partial writes. By adding a simple check (adapted from fs_inst::is_partial_write()), we can avoid creating loads of unnecessary UNDEFs that we have to clean up later.
Our first dead code elimination pass does get rid of them pretty quickly, but this should save memory and time during our first split_virtual_grfs and dead_code_elimination passes. This generates roughly 30% fewer instructions at the beginning.
Improves compilation time of shaders:
- Rise of the Tomb Raider: -3.51563% +/- 0.103951% (n=7)
- Borderlands 3: -3.64422% +/- 0.300951% (n=7).
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28169>
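The check described above might look roughly like the following sketch. It assumes that a write is "partial" when one component of the def does not fill a whole number of registers, and that a register holds 32 bytes; the function name and the exact condition are hypothetical and are not copied from the patch.

```c
#include <stdbool.h>

/* Assumption: one hardware register holds 32 bytes. */
#define SKETCH_REG_SIZE 32u

/* Returns true when writing one component of a def at the given dispatch
 * width leaves part of a register untouched, so an UNDEF would still be
 * useful. Assumes bit_size is a multiple of 8. */
static bool sketch_def_needs_undef(unsigned bit_size, unsigned dispatch_width)
{
   unsigned bytes_per_component = (bit_size / 8u) * dispatch_width;
   return bytes_per_component % SKETCH_REG_SIZE != 0;
}
```

Under these assumptions a 32-bit def at SIMD8 writes exactly one register and needs no UNDEF, while a 16-bit def at SIMD8 writes half a register and still does.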
2024-03-19 | radv/printf: Use fprintf instead of printf | Konstantin Seurer | 3 | -15/+15
For using other destinations than stdout. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28228>
2024-03-19 | radv: Skip more acceleration structure build markers | Konstantin Seurer | 1 | -17/+20
We should skip even more stuff when using updates only. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28228>
2024-03-19 | anv: Enable VK_KHR_shader_quad_control | Caio Oliveira | 2 | -0/+5
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27279>
2024-03-19 | intel/brw: Use predicates for quad_vote_any and quad_vote_all when available | Caio Oliveira | 1 | -0/+22
Up until Xe2, we can use the predicates ANY4H and ALL4H to achieve the same result with fewer instructions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27279>
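For readers unfamiliar with quad votes, the result these operations compute can be modelled in scalar C as below. This is only a reference model of the semantics, assuming a 16-lane subgroup stored as a bitmask with every lane active; the function names are hypothetical and this is not how the compiler emits the operation.

```c
#include <stdint.h>

/* Each group of four adjacent lanes is one quad. A lane's result is true
 * if any lane in its quad has its input bit set. */
static uint16_t sketch_quad_vote_any(uint16_t lanes)
{
   uint16_t result = 0;
   for (unsigned quad = 0; quad < 4; quad++) {
      uint16_t bits = (lanes >> (4 * quad)) & 0xF;
      if (bits != 0)
         result |= (uint16_t)(0xF << (4 * quad));
   }
   return result;
}

/* A lane's result is true only if all four lanes in its quad are set. */
static uint16_t sketch_quad_vote_all(uint16_t lanes)
{
   uint16_t result = 0;
   for (unsigned quad = 0; quad < 4; quad++) {
      uint16_t bits = (lanes >> (4 * quad)) & 0xF;
      if (bits == 0xF)
         result |= (uint16_t)(0xF << (4 * quad));
   }
   return result;
}
```

A per-quad predicate gives the same per-lane answer in one step, which is why the commit can drop instructions.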
2024-03-19 | intel/brw: Implement quad_vote_any and quad_vote_all | Caio Oliveira | 1 | -0/+52
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27279>
2024-03-19 | intel/fs: Don't allow 0 stride on MOV destination | Ian Romanick | 2 | -1/+4
Outside SIMD1 instructions, a destination stride of zero doesn't make any sense. When such strides exist, they would be fixed by the FS generator. Currently the only place that intentionally generates such a stride is setup_barrier_message_payload_gfx125, and this commit changes that.
The existence of a zero stride that won't really be a zero stride causes a variety of problems with other optimization passes. Those passes don't know that 0 actually means 1, and they make incorrect assumptions about sizes written, etc.
The assertion helped catch many bugs in some other work in progress that tries to store convergent values in SIMD8 registers regardless of the dispatch width. That code would accidentally generate destination strides of zero.
v2: Check stride differently depending on register file. Suggested by Caio.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28256>
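The "incorrect assumptions about sizes written" can be made concrete with a small sketch. It assumes a simplified size formula (lanes times stride times element size) and a single validity rule; both function names are hypothetical, and the register-file-dependent check mentioned in v2 is not modelled.

```c
#include <stdbool.h>

/* Simplified model of what a pass might compute for bytes written. */
static unsigned sketch_bytes_written(unsigned exec_size, unsigned dst_stride,
                                     unsigned type_size)
{
   return exec_size * dst_stride * type_size;
}

/* The rule from the commit message: only SIMD1 may use a zero stride. */
static bool sketch_dst_stride_valid(unsigned exec_size, unsigned dst_stride)
{
   return exec_size == 1 || dst_stride != 0;
}
```

With a stride of 0, a SIMD8 write of 4-byte elements appears to write 0 bytes in this model, while the same instruction with the stride the generator would substitute (1) writes 32 bytes.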
2024-03-19 | freedreno/replay: Use real queueid for submissions and waits | Danylo Piliaiev | 1 | -2/+19
Otherwise replay failed when the expected queueid was not 0. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27123>
2024-03-19 | zink/ci: enable RADV_PERFTEST=shader_object for polaris10 | Samuel Pitoiset | 1 | -0/+1
It's passing in CI now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28273>
2024-03-19 | radv/rt: Use 32-bit offsets for load_sbt_entry | Konstantin Seurer | 1 | -15/+8
Totals from 82 (18.06% of 454) affected shaders:
MaxWaves: 820 -> 821 (+0.12%)
Instrs: 2765694 -> 2766338 (+0.02%); split: -0.08%, +0.10%
CodeSize: 14751988 -> 14735464 (-0.11%); split: -0.13%, +0.01%
VGPRs: 8464 -> 8448 (-0.19%)
SpillSGPRs: 454 -> 512 (+12.78%)
Latency: 19368679 -> 19344967 (-0.12%); split: -0.21%, +0.09%
InvThroughput: 5354427 -> 5346317 (-0.15%); split: -0.24%, +0.08%
VClause: 100183 -> 100331 (+0.15%); split: -0.02%, +0.17%
SClause: 66584 -> 66590 (+0.01%); split: -0.02%, +0.03%
Copies: 237008 -> 238684 (+0.71%); split: -0.53%, +1.23%
Branches: 113344 -> 113386 (+0.04%); split: -0.00%, +0.04%
PreSGPRs: 6141 -> 6194 (+0.86%)
PreVGPRs: 7916 -> 7880 (-0.45%)
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27725>
2024-03-19 | radv: Use radv_buffer_map for parsing IBs | Konstantin Seurer | 1 | -5/+6
We need matching pointers for annotations to work. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27549>
2024-03-19 | ac: Improve context roll readability | Konstantin Seurer | 1 | -1/+12
Add new lines to improve visual separation and color registers:
- red = unchanged
- green = changed
Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27549>
2024-03-19 | radv: Add an IB annotation layer | Konstantin Seurer | 5 | -4/+156
The layer annotates the command buffers with api entrypoint names. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27549>
2024-03-19 | radv: Add support for IB annotations | Konstantin Seurer | 4 | -2/+52
Wires up ac_parse_ib annotation support. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27549>
2024-03-19 | ac/parse_ib: Implement annotations | Konstantin Seurer | 2 | -0/+8
Annotates the IB dump with driver specified strings. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27549>
2024-03-19 | ac/parse_ib: Replace the parameter list with ac_ib_parser | Konstantin Seurer | 6 | -61/+117
It's more code but it should be more readable. This also makes adding optional arguments easier. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27549>
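The general pattern behind this change, bundling a long parameter list into one struct so that optional arguments can simply be left zeroed, can be sketched as follows. The struct, its fields, and the function are hypothetical illustrations of the pattern and do not reproduce the real ac_ib_parser.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical parser description; fields left at zero/NULL are "unset". */
struct sketch_ib_parser {
   FILE *out;                       /* required: where the dump goes */
   const uint32_t *ib;              /* required: dwords to dump */
   size_t num_dw;                   /* required: number of dwords */
   const char *name;                /* optional: label printed first */
   const char *const *annotations;  /* optional: one string or NULL per dword */
};

/* Dumps the buffer and returns the number of lines written. */
static size_t sketch_parse_ib(const struct sketch_ib_parser *parser)
{
   size_t lines = 0;
   if (parser->name) {
      fprintf(parser->out, "%s:\n", parser->name);
      lines++;
   }
   for (size_t i = 0; i < parser->num_dw; i++) {
      if (parser->annotations && parser->annotations[i]) {
         fprintf(parser->out, "  // %s\n", parser->annotations[i]);
         lines++;
      }
      fprintf(parser->out, "  0x%08x\n", (unsigned)parser->ib[i]);
      lines++;
   }
   return lines;
}
```

A caller fills in only what it needs with designated initializers, for example `{ .out = stdout, .ib = ib, .num_dw = 3 }`, and a later optional field does not require touching existing call sites.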
2024-03-19 | ac: Annotate context rolls | Konstantin Seurer | 4 | -7/+20
Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27549>
2024-03-19 | radv: Use mapped driver locations for determining I/O strides. | Timur Kristóf | 3 | -4/+45
This will allow us to more accurately determine the input and output strides, because the I/O locations mapped by RADV don't match the locations in NIR. As a result, ESO will use less LDS. It also fixes the per-patch output stride of tess control shaders, because previously we omitted tess factors from them. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28021>
2024-03-19 | radv: Extract input and output stride info to new functions. | Timur Kristóf | 1 | -14/+53
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28021>
2024-03-19 | r300: mark new fails | Eric Engestrom | 1 | -0/+11
https://gitlab.freedesktop.org/mesa/mesa/-/jobs/56480445 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28271>
2024-03-19 | nvk: Add NVK to the Vulkan device name | Echo J | 1 | -2/+8
Other Mesa Vulkan drivers do the same thing (this helps to identify the driver better especially with the recent official name import) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28262>
2024-03-19 | freedreno/devices: Do not write to 8E79 on a750, KGSL has it protected | Danylo Piliaiev | 1 | -1/+0
Writing REG_A7XX_RB_UNKNOWN_8E79 causes: adreno-gen7-gmu 3d68000.qcom,gmu: CP | Protected mode error | WRITE | addr=0x08e79 | status=0x00608e79 Fixes: ebde7d5e870d7d0d0386d553cf36854697e17824 ("tu/a7xx: Write even more magic regs to fix rendering issues on Android") Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27912>
2024-03-19 | aco: use small_vec as Block::edge_vec for predecessors and successors | Daniel Schürmann | 9 | -31/+29
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27984>
2024-03-19 | aco/util: small_vec few additions | Daniel Schürmann | 1 | -7/+42
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27984>
2024-03-19 | aco/util: add small_vec | Rhys Perry | 1 | -0/+147
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27984>
2024-03-19 | aco: reorder code and use namespaces in aco_interface.cpp | Daniel Schürmann | 1 | -119/+124
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27984>