Age | Commit message | Author | Files | Lines
2024-03-29intel/eu/validate: Allow SIMD16 for mixed mode float operations on xe2+ [xe2-compiler-2-subset-2]Rohan Garg1-1/+2
Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2024-03-29intel/brw: Lower DWORD scattered read writes to lscRohan Garg1-6/+8
Rework: * Francisco Jerez: Rebase on 07b9bfacc789 ("intel/compiler: Move logical-send lowering to a separate file") * Jordan: Move SHADER_OPCODE_DWORD_SCATTERED_*_LOGICAL from previous patch, as it seems to make more sense here. * Jordan: Change `devinfo->has_lsc` ?: to if/else as suggested by idr Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2024-03-29intel/brw: Handle typed surface and atomic messages for xe2+Rohan Garg2-2/+14
Reworks: * Francisco: Rebase on 07b9bfacc789 ("intel/compiler: Move logical-send lowering to a separate file") * Jordan: Rebase on 952a523abb2 ("intel: switch over to unified atomics") Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2024-03-29intel/brw/xehp+: Drop redundant arguments of lsc_msg_desc*().Francisco Jerez4-94/+44
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2024-03-29intel/eu/xehp+: Don't initialize mlen and rlen descriptor fields from lsc_msg_desc*().Francisco Jerez1-11/+0
These fields are overlapping with the ones set by brw_message_desc(), so the latter should be used instead. This fixes corruption of the LSC message descriptors when inconsistent values are specified through both helpers, which can happen if the 'inst->mlen' field is modified during optimization (e.g. by opt_split_sends()). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
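The corruption described above can be illustrated with a minimal sketch. It assumes the message length and response length live in bits 28:25 and 24:20 of a 32-bit descriptor and that the message-specific opcode lives in bits 5:0; all function names here (`set_bits`, `generic_desc`, `overlapping_specific_desc`, `disjoint_specific_desc`) are hypothetical stand-ins, not Mesa's helpers.

```cpp
#include <cassert>
#include <cstdint>

// Place `value` into bits [high:low] of a 32-bit descriptor.
// Only meant for fields narrower than 32 bits.
inline uint32_t set_bits(uint32_t value, unsigned high, unsigned low)
{
    const uint32_t mask = (1u << (high - low + 1)) - 1u;
    assert(value <= mask);
    return value << low;
}

// Read bits [high:low] back out of a descriptor.
inline uint32_t get_bits(uint32_t desc, unsigned high, unsigned low)
{
    const uint32_t mask = (1u << (high - low + 1)) - 1u;
    return (desc >> low) & mask;
}

// Generic helper: the only place that should write mlen and rlen
// (bit positions are an assumption of this sketch).
inline uint32_t generic_desc(uint32_t mlen, uint32_t rlen)
{
    return set_bits(mlen, 28, 25) | set_bits(rlen, 24, 20);
}

// Message-specific helper that also writes mlen: the overlapping pattern.
inline uint32_t overlapping_specific_desc(uint32_t opcode, uint32_t mlen)
{
    return set_bits(opcode, 5, 0) | set_bits(mlen, 28, 25);
}

// Message-specific helper that leaves the length fields alone.
inline uint32_t disjoint_specific_desc(uint32_t opcode)
{
    return set_bits(opcode, 5, 0);
}

inline uint32_t desc_mlen(uint32_t desc)
{
    return get_bits(desc, 28, 25);
}
```

If an optimization shrinks the message length from 4 to 3 after the message-specific part was built, OR-ing the two parts yields 0b0100 | 0b0011 = 0b0111, so the combined descriptor reports a length of 7. With the disjoint variant the combined descriptor reports 3, as intended.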
2024-03-29intel/brw/xehp+: Replace lsc_msg_desc_dest_len()/lsc_msg_desc_src0_len() with helpers to do the computation.Francisco Jerez5-41/+77
We cannot rely on the immediate message descriptor having accurate values for mlen and rlen at the IR level, since they are updated at codegen time via 'inst->mlen' and 'inst->size_written', which could end up with values inconsistent with the message descriptor if e.g. the split sends optimization had an effect. Instead, define helpers that do the computation without relying on the message descriptor, and use the pre-existing brw_message_desc_mlen()/brw_message_desc_rlen() helpers (fully equivalent to the lsc helpers deleted here) during disassembly. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2024-03-29intel/brw/xe2: Update uniform handling to account for 512b physical registersIan Romanick1-2/+17
Rework: * Jordan: Drop FINISHME (s-b Caio) * Jordan: Use reg_unit() in asserts rather than a ver check (s-b Caio) * Ian: Make use of reg_unit() in round_components_to_whole_registers() Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
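As a rough illustration of rounding to whole 512b physical registers, here is a minimal sketch. It assumes a 32-byte logical register unit, 32-bit components, and that Xe2 corresponds to a version number of 20 or higher with two logical units per physical register; `reg_unit_for` and `whole_register_units` are hypothetical names, not the functions touched by the commit.

```cpp
#include <cassert>

// Assumed size in bytes of one logical register unit (256 bits).
constexpr unsigned REG_SIZE = 32;

// Hypothetical stand-in for a reg_unit()-style query: how many logical
// units make up one physical register (assumed 2 on Xe2, i.e. ver >= 20).
inline unsigned reg_unit_for(int ver)
{
    return ver >= 20 ? 2 : 1;
}

// Number of logical register units needed to hold `components` 32-bit
// values, rounded up so that only whole physical registers are used.
inline unsigned whole_register_units(int ver, unsigned components)
{
    const unsigned unit = reg_unit_for(ver);
    const unsigned components_per_phys = REG_SIZE * unit / 4;
    assert(components_per_phys > 0);
    const unsigned phys_regs =
        (components + components_per_phys - 1) / components_per_phys;
    return phys_regs * unit;
}
```

Under these assumptions, 9 components need two 256-bit registers on older hardware and one 512-bit register (still two logical units) on Xe2, while a single component on Xe2 already consumes two logical units.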
2024-03-29intel/brw/xe2: Update brw_nir_analyze_ubo_ranges to account for 512b physical registersIan Romanick1-8/+23
Rework: * Jordan: Use `REG_SIZE * reg_unit` (Suggested by Caio) Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
2024-03-29intel/brw: Add a src array for the common case in fs_instCaio Oliveira2-15/+59
In the common case, fs_inst will have up to 4 sources (the HW instructions have up to 3, and our representation of SENDs has 4). Embed such an array into the fs_inst, and use it whenever applicable instead of allocating a new array. Also change the code to reuse the allocated src array when resizing to a smaller length. Between the changes above and the reduced amount of initializing fs_regs, this reduces fossil-db time by around 2% for Borderlands 3 and Rise of the Tomb Raider, and around 1.5% for Total War Warhammer 3. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28379>
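The embedded-array idea can be sketched as a small-buffer optimization. This is a minimal illustration under assumptions, not the actual fs_inst code: `Inst`, `Reg`, `kBuiltinSources` and every member name are hypothetical, and only the behaviour named in the message (inline storage for up to 4 sources, reuse of existing storage when shrinking) is modelled.

```cpp
#include <cassert>
#include <cstddef>

struct Reg {
    int nr = 0;
};

// Number of sources stored inline (4, per the commit message).
constexpr std::size_t kBuiltinSources = 4;

class Inst {
public:
    explicit Inst(std::size_t sources) { resize_sources(sources); }
    ~Inst() { release_heap(); }
    Inst(const Inst &) = delete;
    Inst &operator=(const Inst &) = delete;

    void resize_sources(std::size_t n)
    {
        if (n > capacity_) {
            // Grow: allocate, copy the old sources, drop any old heap array.
            Reg *fresh = new Reg[n];
            for (std::size_t i = 0; i < count_; i++)
                fresh[i] = src_[i];
            release_heap();
            src_ = fresh;
            capacity_ = n;
        } else {
            // Fits in the current storage (inline or heap): reuse it and
            // reset any slots that become visible again.
            for (std::size_t i = count_; i < n; i++)
                src_[i] = Reg{};
        }
        count_ = n;
    }

    Reg &source(std::size_t i)
    {
        assert(i < count_);
        return src_[i];
    }

    std::size_t sources() const { return count_; }
    bool uses_builtin() const { return src_ == builtin_; }

private:
    void release_heap()
    {
        if (src_ != builtin_)
            delete[] src_;
    }

    Reg builtin_[kBuiltinSources];
    Reg *src_ = builtin_;
    std::size_t capacity_ = kBuiltinSources;
    std::size_t count_ = 0;
};
```

An instruction created with three sources never touches the heap; growing to six allocates once, and shrinking back to two keeps that allocation instead of reallocating.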
2024-03-29intel/brw: Remove vestiges of sources on IF opcode, only valid on Gfx6Caio Oliveira2-6/+0
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28379>
2024-03-29intel/brw: Rearrange fs_inst fieldsKenneth Graunke1-53/+63
For better packing, and to make all the small fields easier to hash and compare en masse. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28379>
2024-03-29rpi/ci: another batch of flakesEric Engestrom3-0/+14
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28481>
2024-03-29nvk: Advertise VK_KHR_maintenance6Faith Ekstrand2-1/+10
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27796>
2024-03-29nvk: Add support for version 2 of all descriptor binding commandsValentine Burley2-63/+85
Signed-off-by: Valentine Burley <valentine.burley@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27796>
2024-03-29nvk: Support VkBindMemoryStatusKHRFaith Ekstrand2-0/+10
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27796>
2024-03-29intel/brw/xe2+: Use phys_nr and phys_subnr in DPAS encodingIan Romanick1-8/+8
Suggested-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28404>
2024-03-29intel/brw/xe2+: DPAS must be SIMD16 nowIan Romanick3-8/+13
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28404>
2024-03-29nir: intel/brw: Change the order of sources for nir_dpas_intelIan Romanick3-13/+17
It was by pure luck that all sources (and the result) of nir_dpas_intel had the same number of components. It is possible to support matrix sizes where the accumulator matrix and the result matrix are larger (e.g., 16x8 * 8x16 = 16x16). This breaks all of the assumptions of NIR's infrastructure for code generating intrinsics. Fix this by making the accumulator matrix be the first source. The accumulator and the result will always have the same dimensions (due to rules of matrix multiplication) and the same type (due to restrictions of the cooperative matrix extension). This forces them to have the same number of components. This doesn't fix all the potential problems. NIR expects that all 0-sized sources will have the same number of components. This just ensures that the result has the correct number of components. Fixes: 6b14da33ad3 ("intel/fs: nir: Add nir_intrinsic_dpas_intel") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28404>
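The dimension argument can be checked with a small sketch. It only models matrix shapes for D = A * B + C; `MatrixShape` and `product_shape` are hypothetical names, and whole-matrix element counts stand in for the per-source component counts NIR actually tracks.

```cpp
#include <cassert>

struct MatrixShape {
    unsigned rows;
    unsigned cols;
    unsigned elements() const { return rows * cols; }
};

// Shape of A * B. The accumulator C and the result D = A * B + C must
// both have this shape, whatever the shapes of A and B are.
inline MatrixShape product_shape(MatrixShape a, MatrixShape b)
{
    assert(a.cols == b.rows);
    return MatrixShape{a.rows, b.cols};
}
```

For the 16x8 * 8x16 example from the message, each input matrix has 128 elements while the accumulator and result have 256, so only the accumulator is guaranteed to match the result in size.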
2024-03-29intel/brw: Use enums for DPAS source regioningIan Romanick1-3/+12
Was previously passing 1, 1, 0 as the regioning. This generated incorrect disassembly because the encoding for a width of 1 is 0. Use the enums to ensure the correct values are used. Fixes: 1c92dad5cb7 ("intel/disasm: Disassembly support for DPAS") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28404>
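Why a raw 1 is the wrong value for a width of 1 can be shown with a minimal sketch. It assumes the width field stores log2 of the element count, which is consistent with the statement that a width of 1 encodes as 0; `RegionWidth`, `encode_width` and `width_in_elements` are hypothetical names rather than Mesa's enums.

```cpp
#include <cassert>

// Hypothetical enum: the encoded value is log2 of the element count.
enum class RegionWidth : unsigned { W1 = 0, W2 = 1, W4 = 2, W8 = 3, W16 = 4 };

// Decode an encoded width back into a number of elements.
inline unsigned width_in_elements(RegionWidth w)
{
    return 1u << static_cast<unsigned>(w);
}

// Map an element count to its encoding instead of passing raw integers.
inline RegionWidth encode_width(unsigned elements)
{
    switch (elements) {
    case 1:  return RegionWidth::W1;
    case 2:  return RegionWidth::W2;
    case 4:  return RegionWidth::W4;
    case 8:  return RegionWidth::W8;
    case 16: return RegionWidth::W16;
    default: break;
    }
    assert(!"unsupported region width");
    return RegionWidth::W1;
}
```

Under this assumption, writing the literal 1 into the width field is decoded as a width of 2, which is the kind of mismatch that named enum values prevent.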
2024-03-29intel/brw: Clear write_accumulator flag when changing the destinationIan Romanick1-0/+6
If the destination was the accumulator but is no longer, having the flag set is not correct. On Xe2 this also causes a validation error. v2: Reword the comment to be more clear. Suggested by Jordan. Fixes: efa4e4bc5fc ("intel/fs: Introduce regioning lowering pass.") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28404>
2024-03-29nak: Implement load_ubo with an indirect cbuf indexFaith Ekstrand2-4/+31
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28474>
2024-03-29nak: Plumb through LDC modesFaith Ekstrand3-2/+32
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28474>
2024-03-29ci_run_n_monitor: explain how to pass multiple targets without having to use regexesEric Engestrom1-1/+2
Fixes: 6825c67c991fc1fc6192 ("ci_run_n_monitor: allow passing multiple targets") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28461>
2024-03-29ci: don't run rustfmt on every core changeEric Engestrom1-1/+4
Only keep the two parts we want: disabling the job in the nightly pipeline, and running the job if the CI itself is modified. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28468>
2024-03-29wgl: The default swap interval is supposed to be 1Jesse Natalie1-0/+3
Per WGL_EXT_swap_control: > The default swap interval is 1. Reviewed-by: Giancarlo Devich <gdevich@microsoft.com> Reviewed-by: Sil Vilerino <sivileri@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28471>
2024-03-29radv/ci: dEQP-VK.spirv_assembly.type.vec4.i8.mod_geom Fail -> Crash on tahitiEric Engestrom1-1/+1
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28472>
2024-03-29radv/ci: another batch of flakesEric Engestrom2-0/+31
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28472>
2024-03-29zink: only check that CUBE_COMPATIBLE for images doesn't subtract flagsMike Blumenkrantz1-1/+1
The flags may change if, e.g., HOST_TRANSFER is enabled by adding CUBE. Fixes #10924 cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28460>
2024-03-29aco: avoid breaking clauses with waitcntRhys Perry2-1/+123
fossil-db (navi31):
Totals from 3573 (4.50% of 79395) affected shaders:
Instrs: 6172096 -> 6170009 (-0.03%); split: -0.04%, +0.01%
CodeSize: 31448052 -> 31439660 (-0.03%); split: -0.03%, +0.01%
Latency: 37317302 -> 37307935 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 6820967 -> 6819930 (-0.02%); split: -0.02%, +0.00%
VClause: 163424 -> 157705 (-3.50%)
SClause: 135441 -> 135295 (-0.11%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28433>
2024-03-29meson: fix link failure with llvm-18Karol Herbst1-1/+4
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10739 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10775 cc: mesa-stable Signed-off-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28267>
2024-03-28radv, aco: Remove the code that jumped to RADV's TCS epilogs.Timur Kristóf3-87/+2
The actual TCS epilog selection code is kept unchanged for now, we'll delete it when RadeonSI also gets rid of TCS epilogs. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28408>
2024-03-28radv: Completely delete TCS epilogs.Timur Kristóf10-232/+3
TCS epilogs are not needed anymore because the TCS can implement dynamic states by itself now. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28408>
2024-03-28ac/nir/tess: Emit tess factor stores based on new intrinsics.Timur Kristóf4-16/+46
This allows the TCS to read the primitive mode and whether TES reads the tess factors, from an SGPR arg, which lets it decide how to store them at runtime. For linked shaders, the conditions will be constant and NIR optimizations can delete the dead CF. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28408>
2024-03-28radv: Call nir_opt_dead_cf in radv_optimize_nir_algebraic.Timur Kristóf1-0/+1
In case lowering passes added dead CF. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28408>
2024-03-28radv: Implement new tess intrinsics.Timur Kristóf4-1/+25
For linked shaders, the information is available as constant, while for unlinked shaders, the info is in a SGPR arg. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28408>
2024-03-28radv: Copy TES primitive mode to TCS info.Timur Kristóf1-0/+1
Will be needed by the ABI lowering of the new intrinsic that tells the TCS the primitive type, if it's known. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28408>
2024-03-28radeonsi: Implement new intrinsics for monolithic shaders.Timur Kristóf1-0/+14
For now, only monolithic shaders will hit the code path that will generate these intrinsics. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28408>
2024-03-28nir: Add two new AMD specific tess intrinsics.Timur Kristóf2-0/+6
These will be needed to implement some tessellation dynamic states within the TCS as opposed to using an epilog. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28408>
2024-03-28d3d12: Support HEVC slice L0/L1 active number overrideSil Vilerino3-139/+127
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28424>
2024-03-28d3d12: Support H264 slice L0/L1 active number overrideSil Vilerino3-70/+96
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28424>
2024-03-28d3d12: Bump directx-headers dependency to v613Sil Vilerino2-2/+2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28424>
2024-03-28aco: always emit float mode for merged shaders compiled separatelyRhys Perry1-7/+11
We don't know what the float mode was by the end of the previous shader, so we should always set it. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28392>
2024-03-28intel/brw: minor rework to deduplicate variable assignmentRohan Garg1-3/+1
Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27235>
2024-03-28intel/brw: adjust the copy propagation pass to account for wider GRFs on Xe2+Rohan Garg1-1/+1
Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27235>
2024-03-28intel/brw: update disassembly for MATH pipeRohan Garg1-1/+2
Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27235>
2024-03-28intel/brw: Xe2+ can do SIMD16 for extended math on HF typesRohan Garg1-2/+2
BSpec 56797: Math operation rules when half-floats are used on both source and destination operands and both source and destinations are packed. The execution size must be 16. Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27235>
2024-03-28intel/brw: account for sources when determining if an operation uses half floatsRohan Garg1-4/+26
Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27235>
2024-03-28radv/ci: another batch of flakesEric Engestrom5-0/+18
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28450>
2024-03-28v3dv/ci: another batch of flakesEric Engestrom2-0/+17
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28449>
2024-03-28Revert "zink: store last pipeline directly for zink_gfx_program::last_pipeline"Juston Li4-5/+5
This reverts commit be8b7980e66f3526d7c1eb9b137772fb6fc90a96. Store the cache entry so that the fast path picks up the optimized pipeline when it's available from a background optimized_compile_job(). Observed traces where it would take the fast path back and forth using an unoptimized pipeline and never pick up the optimized pipeline, leading to a >50% fps drop. Signed-off-by: Juston Li <justonli@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28440>
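The difference between remembering a pipeline and remembering its cache entry can be sketched as follows. This is a simplified, single-threaded illustration under assumptions: `PipelineCacheEntry`, `PipelineHandle` and `best()` are hypothetical names, and a real background compile job would need proper synchronization, which is omitted here.

```cpp
#include <cassert>

// Hypothetical pipeline handle; 0 means "not compiled yet".
using PipelineHandle = int;

struct PipelineCacheEntry {
    PipelineHandle unoptimized = 0;
    PipelineHandle optimized = 0; // filled in later by a background job

    // Prefer the optimized pipeline once it exists.
    PipelineHandle best() const
    {
        assert(unoptimized != 0);
        return optimized != 0 ? optimized : unoptimized;
    }
};
```

A fast path that caches the handle returned by `best()` keeps using the unoptimized pipeline forever, whereas one that caches a pointer to the entry and calls `best()` on each use switches over as soon as the optimized pipeline is stored.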