~miku/mesa - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author	Files	Lines
2020-04-18	ir3/ra: Fix off-by-one issues with live-range extension	Connor Abbott	3	-2/+34
	The intersects() function assumes that inside each instruction values always die before they are defined, so that if the end of one range is the same instruction as the beginning of the next then they don't intersect. However, this isn't the case for values that become live at the beginning of a basic block, which become live before the first instruction, or instructions that die at the end of a basic block which die after the last instruction. For example, imagine that we have two values, A which is defined earlier in the block and B which is defined in the last instruction of the block and both die at the end of the basic block (e.g. are used in the next iteration of a loop). We would compute a range for A of, say, (10, 20) and for B of (20, 20) since each block's end_ip is the same as the ip of the last instruction, and RA would consider them to not interfere. There's a similar problem with values that become live at the beginning. The fix is to offset the block's start_ip and end_ip by one so that they don't correspond to any actual instruction. One way to think about this is that we're adding fake instructions at the beginning and end of a block where values become live & die. We could invert the order, so that values consumed by each instruction are considered dead at the end of the previous instruction, but then values that become dead at the beginning of the basic block would incorrectly have an empty live range, with a similar problem at the end of the basic block if we try to say that values are defined at the beginning of the next instruction. So the extra padding instructions are unavoidable. This fixes an accidental infinite loop in the shader for dEQP-VK.spirv_assembly.type.scalar.u32.switch_vert. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4614>
2020-04-17	tu: Use tu_cs_add_entries() with non-render-pass secondaries	Connor Abbott	1	-1/+1
	Even though vkCmdRenderPassBegin() isn't allowed inside a secondary command buffer, vkCmdDispatch() is, and we emit an IB with compute dispatches, which means that if the secondary command buffer records a vkCmdDispatch() then we'll have an IB inside an IB, which is illegal. Fixes hangs in e.g. dEQP-VK.api.command_buffers.record_simul_use_secondary_one_primary. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4605>
2020-04-16	turnip: image_view rework	Jonathan Marek	4	-379/+322
	Instead of exposing various layout functions, move image-related logic into tu_image.c and have the image_view pre-fill relevant register values. This changes the clear/blit code to use image_view. This will make it much easier to deal with aspect masks, in particular for planar formats and D32_S8. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4581>
2020-04-16	turnip: don't limit framebuffer size to image size	Jonathan Marek	1	-13/+0
	Minor cleanup, I couldn't find anything that suggests this should be done, and anv doesn't do it either. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4581>
2020-04-16	turnip: compute render_components/srgb_cntl at renderpass creation time	Jonathan Marek	3	-32/+30
	Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4581>
2020-04-16	tu: Align GMEM resolve blit scissor	Connor Abbott	1	-2/+1
	Even though we normally use the CP_BLIT path with resolves that aren't aligned, there's a special case when we're resolving the entire image and there's enough padding so that we can still use CP_EVENT_WRITE::BLIT when the render area isn't aligned. The hardware seems to not like unaligned scissors when not clearing, and sometimes hangs rather than silently round the scissor. This causes hangs in e.g. dEQP-VK.glsl.derivate.dfdx.texture.msaa4.float_highp. There was some concern that the CP_BLIT path might use this scissor also, but I confirmed that this isn't the case by setting it to 0 before resolving and then noting that CP_BLIT still works (but CP_EVENT_WRITE doesn't). Furthermore, this is actually impossible because of how the 2D engine is set up: it gets its own pair of register banks, which can be switched independently of the 3D register banks, so that 2D events (CP_BLIT) normally aren't synchronized relative to 3D events (CP_EVENT_WRITE, CP_DRAW_*, and CP_EXEC_CS) and therefore they can't share any registers except for non-pipelined registers like RB_CCU_CNTL that don't use the register bank mechanism at all. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4585>
2020-04-15	ir3: Handle load_ubo_ir3 when promoting to constants	Connor Abbott	1	-10/+30
	This restores support for promoting UBO loads to constant loads when using LDC. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4568>
2020-04-15	ir3: Fix LDC offset units	Connor Abbott	6	-11/+95
	I had missed that LDC actually uses vec4 units for its offset. This means that we have to create a new instruction, and lower it in ir3_nir_lower_io_offsets, similar to the existing SSBO instructions. Unfortunately we can't assume that loads are always vec4-aligned, so we have to use the alignment information that NIR gives us. Unfortunately, it's currently woefully inadequate, and will have to be fixed to give us good codegen in the future. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4568>
2020-04-15	freedreno/turnip: Update GRAS_LAYER_CNTL to GRAS_MAX_LAYER_INDEX	Brian Ho	2	-8/+3
	After some experimentation, I believe that GRAS_LAYER_CNTL is actually just a count register storing the number of layers in the render target. While debugging cube_array geometry tests, I noticed that the blob was setting an unknown 0x8 to LAYER_CNTL, so I checked the value of LAYER_CNTL for various layer sizes: 1: LAYER_CNTL=0 2: LAYER_CNTL=1 3: LAYER_CNTL=2 4: LAYER_CNTL=3 9: LAYER_CNTL=8 256: LAYER_CNTL=255 2000: LAYER_CNTL=1999 Seems like this register just stores a count of the largest layer that can be written to via gl_Layer. This commit updates the reg docs, freedreno's gs implementation, and turnip's gs implementation. Fixes dEQP-VK.geometry.layered.cube_array.* Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4541>
2020-04-15	turnip: Emit geometry shader descriptor consts	Brian Ho	1	-0/+9
	Without these consts, the geometry shader is unable to read from textures or uniforms. Fixes dEQP-VK.geometry.layered.*.readback Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4541>
2020-04-15	turnip: Correctly set layer stride for 3D images	Brian Ho	1	-2/+4
	Previously we were using layout.layer_size for the layer stride, but in Vulkan, you can alias a 3D image as an array of 2D images via the VK_IMAGE_CREATE_2D_ARRAY_COMPATIBLE_BIT flag. One reason to use this behavior is so the geometry shader can write to a specific depth in a 3D framebuffer with gl_Layer. Since the 3D image is not a true layered image, layer_size is 0. Instead, we can copy what freedreno does and use the slice size. Fixes dEQP-VK.geometry.layered.3d.* Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4541>
2020-04-14	freedreno/ir3: don't overwrite wrmask in ir3_SAM	Jonathan Marek	1	-2/+2
	Fixes (with other patches to allow these tests to run): dEQP-VK.ycbcr.query.size_lod.vertex.* Suggested-by: Rob Clark <robclark@gmail.com> Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4557>
2020-04-14	freedreno/ir3: fix emit_tex_info split_dest	Jonathan Marek	1	-2/+1
	Fixes a "free(): invalid next size (fast)" error in: dEQP-VK.glsl.texture_functions.query.texturequerylevels.* Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4557>
2020-04-14	ir3: Fix txs with bindless	Connor Abbott	1	-3/+3
	I missed that this had a micro-optimization to assume that there was only ever one source, which is no longer valid for the bindless model since we now have a bindless handle source. Remove the optimization to fix assertion failures with turnip. Fixes e.g. dEQP-VK.glsl.texture_functions.query.texturesize.sampler2d_fixed_vertex Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4548>
2020-04-13	freedreno/ir3/ra: cleanup some leftovers	Rob Clark	1	-6/+0
	Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>
2020-04-13	freedreno/ir3: rename depth->dce	Rob Clark	7	-68/+18
	Since DCE is the only remaining function of this pass, after the pre-RA scheduler rewrite. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>
2020-04-13	freedreno/ir3: better cleanup when removing unused instructions	Rob Clark	4	-10/+36
	Do a better job of pruning when removing unused instructions, including cleaning up dangling false-deps. This introduces a new ssa src pointer iterator, which makes it easy to clear links without having to think about whether it is a normal ssa src or a false-dep. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>
2020-04-13	freedreno/ir3/ra: handle array case for SFU select_reg opt	Rob Clark	1	-3/+10
	The src of the SFU instruction could also be array/reg (non-SSA). Handle this case too. The postsched cp pass makes this scenario more likely. Fixes: cc82521de4e ("freedreno/ir3: round-robin RA") Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>
2020-04-13	freedreno/ir3: add mov/cov stats	Rob Clark	3	-4/+12
	While not always avoidable, cov instructions are a useful thing to look at to see if we could fold into src. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>
2020-04-13	freedreno/ir3/postsched: avoid moving tex ahead of kill	Rob Clark	1	-0/+18
	Add extra dependencies of tex/mem instructions on previous kill instructions to avoid moving them ahead of kills. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>
2020-04-13	freedreno/ir3/postsched: remove some leftovers	Rob Clark	1	-9/+0
	These aren't used in postsched. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>
2020-04-13	freedreno/ir3/sched: awareness of partial liveness	Rob Clark	1	-1/+44
	Realize that certain instructions make a vecN live, and account for this, in hopes of scheduling the remaining components of the vecN sooner. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>
2020-04-13	freedreno/ir3: new pre-RA scheduler	Rob Clark	3	-396/+426
	This replaces the depth-first search scheduler with a more traditional ready-list scheduler. It primarily tries to reduce register pressure (number of live values), with the exception of trying to schedule kills as early as possible. (Earlier iterations of this scheduler had a tendency to push kills later, and in particular moving texture fetches which may not be necessary ahead of kills.) Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>
2020-04-13	freedreno/ir3: fix location of inserted mov's	Rob Clark	1	-1/+11
	If the group pass must insert a mov to resolve conflicts, avoid the mov appearing after the meta:collect whose src it is. The current pre-RA scheduler doesn't really care about the initial instruction order, but the new one will in some cases. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>
2020-04-13	freedreno/ir3: simplify grouping pass	Rob Clark	1	-31/+18
	Since bdf6b7018cedf95b554e21953d5a1935d3067ce7 the logic only needs to handle grouping collect srcs, So remove the now unnecessary indirection. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>
2020-04-13	freedreno/ir3: make falsedep use's optional	Rob Clark	3	-4/+6
	Add option when collecting uses to control whether they include falsedeps or not. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>
2020-04-13	freedreno/ir3: spiff out disasm a bit	Rob Clark	1	-5/+17
	for verbose mode, print also the instruction "cycle" (which takes into account (rptN) and (nopN)) in addition to instruction offset. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>
2020-04-13	freedreno/computerator: support bindless sampler instructions	Jonathan Marek	2	-1/+4
	Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4526>
2020-04-13	freedreno/computerator: support nop prefix	Jonathan Marek	2	-1/+6
	Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4526>
2020-04-13	freedreno/ir3: CSE the up/downconversion of SEL's cond's size.	Eric Anholt	3	-7/+22
	Not many programs hit this, but if you were, say, selecting between vec4s, you'd convert the cond 4 times. instructions in affected programs: 2957 -> 2717 (-8.12%) nops in affected programs: 989 -> 899 (-9.10%) non-nops in affected programs: 1968 -> 1818 (-7.62%) dwords in affected programs: 3232 -> 2752 (-14.85%) last-baryf in affected programs: 102 -> 90 (-11.76%) full in affected programs: 5 -> 4 (-20.00%) sstall in affected programs: 329 -> 329 (0.00%) (ss) in affected programs: 86 -> 105 (22.09%) (sy) in affected programs: 14 -> 12 (-14.29%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4516>
2020-04-13	freedreno/ir3: Stop doing b2n on the SEL condition.	Eric Anholt	2	-3/+11
	SEL_B32 (and presumably B16) checks for 0 or nonzero in the condition (tested by just stuffing a uniform's value into it), so there's no need to do ir3_b2n() on it, or any preceding ir3_n2b(). instructions in affected programs: 664444 -> 659927 (-0.68%) nops in affected programs: 267898 -> 266312 (-0.59%) non-nops in affected programs: 420260 -> 417329 (-0.70%) dwords in affected programs: 144032 -> 137568 (-4.49%) last-baryf in affected programs: 10801 -> 10321 (-4.44%) full in affected programs: 2003 -> 2002 (-0.05%) sstall in affected programs: 76670 -> 77405 (0.96%) (ss) in affected programs: 4515 -> 4525 (0.22%) (sy) in affected programs: 612 -> 604 (-1.31%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4516>
2020-04-10	freedreno: Fix leak of binning shader variants.	Eric Anholt	1	-0/+2
	The v->binning variant is never added to shader->variants, so just free each one as we free the nonbinning variant. Noticed from drm-shim mode running out of open fds, since each bo ends up with an fd. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4502>
2020-04-10	freedreno/ir3: Fix sz vs class confusion	Kristian H. Kristensen	2	-6/+24
	Add bounds checking to make sure we don't silently access out of bounds again. Fixes: 90f7d12236c ("freedreno/ir3/ra: pick higher numbered scalars in first pass") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4503>
2020-04-09	tu: Implement descriptor set update templates	Connor Abbott	2	-4/+125
	Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
2020-04-09	tu: Add missing code for immutable samplers	Connor Abbott	1	-0/+20
	Actually fill out the samplers, based on the radv implementation. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
2020-04-09	tu: Emit CP_LOAD_STATE6 for descriptors	Connor Abbott	5	-5/+304
	This restores the pre-loading of descriptor state, using the new SS6_BINDLESS method that allows us to pre-load bindless resources. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
2020-04-09	tu: Switch to the bindless descriptor model	Connor Abbott	7	-868/+673
	Under the bindless model, there are 5 "base" registers programmed with a 64-bit address, and sam/ldib/ldc and so on each specify a base register and an offset, in units of 16 dwords. The base registers correspond to descriptor sets in Vulkan. We allocate a buffer at descriptor set creation time, hopefully outside the main rendering loop, and then switching descriptor sets is just a matter of programming the base registers differently. Note, however, that some kinds of descriptors need to be patched at command recording time, in particular dynamic UBO's and SSBO's, which need to be patched at CmdBindDescriptorSets time, and input attachments which need to be patched at draw time based on the the pipeline that's bound. We reserve the fifth base register (which seems to be unused by the blob driver) for these, creating a descriptor set on-the-fly and combining all the dynamic descriptors from all the different descriptor sets. This way, we never have to copy the rest of the descriptor set at draw time like the blob seems to do. I mostly chose to do this because the infrastructure was already there in the form of dynamic_descriptors, and other drivers (at least radv) don't cheat either when implementing this. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
2020-04-09	ir3: Rewrite UBO push analysis to support bindless	Connor Abbott	3	-59/+107
	Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
2020-04-09	ir3: Plumb through bindless support	Connor Abbott	10	-99/+520
	Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
2020-04-09	ir3: LDC also has a destination	Connor Abbott	1	-1/+1
	Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
2020-04-09	ir3: Also don't propagate immediate offset with LDC	Connor Abbott	1	-3/+3
	Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
2020-04-09	ir3: Plumb through support for a1.x	Connor Abbott	11	-67/+164
	This will need to be used in some cases for the upcoming bindless support, plus ldc.k instructions which push data from a UBO to const registers. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
2020-04-09	ir3: Add bindless instruction encoding	Connor Abbott	3	-103/+277
	Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
2020-04-09	freedreno/a6xx: Add registers for the bindless model	Connor Abbott	2	-0/+39
	In Vulkan, descriptors for samplers, SSBO's, etc. are collected into descriptor sets, and shaders can use multiple descriptor sets. At command-recording time, users can swap out only some of the descriptor sets, and the driver is supposed to do the minimum amount necessary to update any internal binding tables, knowing that only some of the descriptors have changed. With the old binding model, focused on GL, where there are separate tables for each type of resource, we can do somewhat better than now by preserving descriptors from lower descriptor sets when switching higher descriptor sets. However we still have to copy around descriptors before each draw. At least for a6xx, qualcomm went further, essentially copying the Vulkan binding model as an alternate way to load resources. There's an array of registers (actually an array for compute and one for everything else), where each register holds a pointer to a descriptor set that can contain various different descriptor types. The descriptors are padded out to 16 dwords, so that every instruction can use an index instead of a dword offset. It's called "bindless", I think, because it can also be used to implement the old GL bindless extensions (presumably it allows more samplers and textures than the old model). This commit adds the register and cmdstream parts. Next up will be the instruction encoding. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
2020-04-09	freedreno/a6xx: Add UBO size field	Connor Abbott	1	-1/+1
	Verified with the vulkan blob, which uses ldc and UBO descriptors, and turnip will too soon. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
2020-04-09	tu: ir3: Emit push constants directly	Connor Abbott	6	-26/+86
	Carve out some space at the beginning for push constants, and push them directly, rather than remapping them to a UBO and then relying on the UBO pushing code. Remapping to a UBO is easy now, where there's a single table of UBO's, but with the bindless model it'll be a lot harder. I haven't removed all the code to move the remaining UBO's over by 1, though, because it's going to all get rewritten with bindless anyways. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
2020-04-09	tu: Dump out shader assembly when requested	Connor Abbott	1	-0/+14
	We don't use the ir3 variant machinery, so we have to do this ourselves. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
2020-04-09	turnip: new clear/blit implementation with shader path fallback	Jonathan Marek	15	-1990/+2578
	The shader path is used to implement the following cases: * stencil aspect mask on D24S8 (for image_to_buffer,buffer_to_image) * clear/copy msaa destination (2D engine can't have msaa dest) Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
2020-04-09	turnip: add vk_format_is_snorm/is_float	Jonathan Marek	1	-0/+12
	Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
2020-04-09	turnip: rework format helpers	Jonathan Marek	6	-29/+41
	* Take tile_mode as input directly * tu6_format_gmem to tu6_base_format, use may not be limited to GMEM * Add new helpers that will return the correct tile_mode as for image level as part of the format. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>