Age | Commit message | Author | Files | Lines |
|
I had 3.x putting swizzling in the texture state only for 16-bit texture
returns, and in the shader for 32-bit. This may be due to having mixed up
the return channel setup on 3.x back before I had moved it into the
compiler. On 4.x, the non-border-color texwrap tests are passing nicely
with both 16 and 32-bit returns with swizzling in the texture state.
|
|
|
|
Now that the actions are reused for centroid and nonperspective, give them
a more generic name.
|
|
The fxcd/fycd instructions now return half-integer pixel centers when not
doing sample-rate shading.
|
|
Revealed that I was writing past the TSDA, not the Z buffer as I expected.
|
|
|
|
We no longer have the small depth-specific output format enum, and instead
depth is just at the end of the output image format enum.
|
|
The RGBX8 formats were dropped from V3D 4.x, but we don't really need them
anyway (we already handle other non-alpha formats by forcing A to 1).
|
|
|
|
This is a major performance boost on all of V3D, but is required on V3D
4.x where shaders are always either 2- or 4-threaded.
|
|
This required extending the CL submit ioctl, because the tile alloc/state
buffer setup has moved from the BCL to register writes.
|
|
|
|
This required moving the register accesses to a separate v3dx file, since
the register definitions for each V3D version collide. It seems that
initializing the v3d_hw from a file dictating 3.3
(v3d_simulator_wrapper.cpp) is safe, though.
|
|
The TLB load/store path is rebuilt in this version. There is no longer a
single-byte resolved store or the 3-byte extended store. Instead, you get
to always use general loads/stores (which, honestly, was tempting even in
previous versions).
|
|
I accidentally emitted this into the RCL instead of the per-tile generic
list, so we wouldn't get tiles after the first cleared.
|
|
This is going to get more complicated with V3D 4.1 support, which redoes
all the TLB packets.
|
|
To conditionally compile cl_emit() macros per V3D version, we need it to
expand to whatever V3D we're building for. This required emitting #define
V3D_VERSION 33 in all our currently 3.3-only code.
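
A minimal sketch of how a per-version define can steer macro expansion. This is not the driver's actual cl_emit() machinery: the PASTE helpers, the PACKET_LENGTH macro, and the EXAMPLE packet with its lengths are all hypothetical names invented for illustration.

```c
/* Sketch only: every name below except V3D_VERSION is hypothetical. */

/* A version-specific source file would define this before including any
 * shared header; default to 3.3 so the sketch builds on its own. */
#ifndef V3D_VERSION
#define V3D_VERSION 33
#endif

/* Two-level paste so that V3D_VERSION is expanded to its value first. */
#define PASTE2(a, b) a##b
#define PASTE(a, b) PASTE2(a, b)

/* Expands to V3D33_ when V3D_VERSION is 33, V3D41_ when it is 41. */
#define V3D_PREFIX PASTE(PASTE(V3D, V3D_VERSION), _)

/* Picks the definition belonging to the version being built. */
#define PACKET_LENGTH(packet) PASTE(V3D_PREFIX, packet##_length)

/* Made-up per-version packet sizes standing in for generated headers. */
enum {
        V3D33_EXAMPLE_length = 3,
        V3D41_EXAMPLE_length = 5,
};
```

With V3D_VERSION set to 33, PACKET_LENGTH(EXAMPLE) resolves to V3D33_EXAMPLE_length, so the same shared source line compiles differently for each version.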
|
|
This creates two new internal dependencies, idep_nir_headers and
idep_nir. The former encapsulates the generation of nir_opcodes.h and
nir_builder_opcodes.h and adding src/compiler/nir as an include path.
This ensures that any target that needs nir headers will have the
includes and that the generated headers will be generated before the
target is built. The second, idep_nir, includes the first and
additionally links to libnir.
This is intended to make it easier to avoid race conditions in the build
when using nir, since the number of consumers for libnir and its
headers is quite high.
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
|
|
I found that we were getting GPU hangs on most tests rendering to them,
and the simulator was assertion failing.
|
|
We were trying to load/store the logical width/height number of compressed
blocks. As long as the textures were large, single-level, and the
load/store at (0,0), it kind of worked.
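
One reading of the message above is that pixel extents and block counts were being confused. As a hedged illustration of the conversion involved (the helper names are hypothetical and this is not the driver's code), the number of compressed blocks along an axis is the pixel extent divided by the block size, rounded up:

```c
#include <stdint.h>

/* Integer division rounding up; d must be non-zero. */
static inline uint32_t div_round_up(uint32_t n, uint32_t d)
{
        return (n + d - 1) / d;
}

/* Number of compressed blocks needed to cover `pixels` texels along one
 * axis, for a format whose blocks are `block_size` texels wide or tall. */
static inline uint32_t extent_in_blocks(uint32_t pixels, uint32_t block_size)
{
        return div_round_up(pixels, block_size);
}
```

For a format with 4x4 blocks, a 128-texel row needs 32 blocks, while a small mip level of 5 texels still needs 2, which is why small or multi-level textures expose the difference.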
|
|
Fixes overflow that caused failure in
dEQP-GLES3.functional.texture.filtering.2d.sizes.128x128_linear.
|
|
Apparently the other funcs will have observable differences when early Z
is enabled.
Fixes (new) simulator assertion failures in
dEQP-GLES3.functional.rasterizer_discard.basic.clear_depth.
|
|
|
|
|
|
This means that with no flatshading we'll emit the single-byte
ZERO_ALL_FLAT_SHADE_FLAGS, and otherwise emit a set of FLAT_SHADE_FLAGS to
get all the bits we need set.
There's a _SET enum in the packet we could use to possibly set entire
ranges of the bitfield without using another packet, but this at least
fixes the conformance failure.
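
A rough sketch of the selection logic described above, under stated assumptions: the flags are held as an array of words with one FLAT_SHADE_FLAGS packet per word, and all-zero words can be skipped. The struct, enum, and function names are hypothetical, and skipping zero words is only valid if the hardware state for those groups is already clear.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two packets named in the message. */
enum packet_kind {
        PKT_ZERO_ALL_FLAT_SHADE_FLAGS,
        PKT_FLAT_SHADE_FLAGS,
};

struct packet {
        enum packet_kind kind;
        unsigned word_index; /* which group of varyings the bits cover */
        uint32_t bits;       /* flat-shade bits for that group */
};

/* Writes the packets to emit into `out` and returns how many there are.
 * `out` must have room for at least max(num_words, 1) entries. */
static inline size_t emit_flat_shade_flags(const uint32_t *words,
                                           size_t num_words,
                                           struct packet *out)
{
        size_t n = 0;

        /* One packet per group that has any flat-shaded varying. */
        for (size_t i = 0; i < num_words; i++) {
                if (words[i] == 0)
                        continue;
                out[n++] = (struct packet){ PKT_FLAT_SHADE_FLAGS,
                                            (unsigned)i, words[i] };
        }

        /* No flat shading at all: the single zero-everything packet. */
        if (n == 0)
                out[n++] = (struct packet){ PKT_ZERO_ALL_FLAT_SHADE_FLAGS,
                                            0, 0 };

        return n;
}
```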
|
|
In updating the simulator, behavior changed slightly so that our old code
wasn't getting glxgears's flatshading interpolated right. Emit flat
shading code just like we would for a normal flat-shaded varying, by
passing a flag in the shader key for glShadeModel(GL_FLAT) state and
customizing the color inputs based on that.
|
|
It seems that the HW team has decided that it's the only supported mode,
and it's the mode I actually meant to be using but forgot. Our table of
return_32_bit should have matched the default non-OVRTMUOUT behavior, so
this change should be invisible.
However, the change revealed that some of my return_size checks for swizzling
were a bit confused in the shadow case, so I had to move them to draw time
once we have both the sampler and the view together.
Fixes assertion failures in the updated simulator, where the non-OVRTMUOUT
support has been removed.
|
|
The compiler decides how many LDTMUs we're going to emit, and that must
match the P1 flags. This brings the return channel counting to a single
place (so all that's passed into the compiler is "how many return channels
you may request from this texture's format"), and was a necessary step for
shadow samplers once we stop using OVRTMUOUT=0.
|
|
This means that we get a single copy of it emitted, instead of once at the
start of each tile (though it's still executed once per tile). Fixes
assertion failures with the updated simulator.
|
|
In newer versions they've removed the C interface, so make one here. This
also isolates the Mesa codebase from the simulator codebase, so we don't
have conflicts over things like "unreachable".
|
|
Most piglit textures happened to work out by RGBW not changing in that
bit, but it did cause failures in RGBA16F fbo-generatemipmap-formats.
|
|
I wrote this early in driver development, and our UIF handling is much
better now.
|
|
The mb_tile_layout table was just the utile_w/h times two, so reuse the
utile code instead.
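
A small sketch of the relationship described above, where the larger tile dimensions are derived from the utile dimensions rather than kept in a second table. The function names are hypothetical, and the utile sizes are an assumption of this sketch (chosen so that each utile holds 64 bytes), not values taken from the message.

```c
#include <stdint.h>

/* Assumed utile width in pixels for a given bytes-per-pixel value.
 * Returns 0 for a cpp this sketch does not cover. */
static inline uint32_t utile_width(uint32_t cpp)
{
        switch (cpp) {
        case 1:
        case 2:
                return 8;
        case 4:
        case 8:
                return 4;
        case 16:
                return 2;
        default:
                return 0;
        }
}

/* Assumed utile height in pixels, paired with utile_width() above. */
static inline uint32_t utile_height(uint32_t cpp)
{
        switch (cpp) {
        case 1:
                return 8;
        case 2:
        case 4:
                return 4;
        case 8:
        case 16:
                return 2;
        default:
                return 0;
        }
}

/* The former table entries: simply twice the utile size on each axis. */
static inline uint32_t mb_width(uint32_t cpp)
{
        return 2 * utile_width(cpp);
}

static inline uint32_t mb_height(uint32_t cpp)
{
        return 2 * utile_height(cpp);
}
```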
|
|
Apparently gallium's u_blitter wants depth from at least the .z component,
and other swizzling appears to apply on top of that. Fixes
fbo-generatemipmap-formats failures with depth formats.
|
|
This matches freedreno's behavior.
|
|
|
|
There may be some more RCL work to be done (I think I need to split my Z/S
stores when doing separate stencil), but this gets piglit's "texwrap
GL_ARB_depth_buffer_float" working.
v2: Unwrap the z32f_wrapper before calling the helper, rather than having
the helper have a callback.
v3: Rebase on Rob Clark's u_transfer_helper instead
|
|
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
|
|
This allows us to encapsulate the compiler and linkage requirements of
each driver in a reusable way. The result will be that each target that
needs a specific driver can simply add `driver_<name>` to its
dependencies line and the necessary libraries and compiler args will be
added. This will allow for a lot of code de-duplication between gallium
targets.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
|
|
The HW doesn't add the base level anywhere (the min/max lod clamping is
what does base level), so we need to add it manually in this case.
Fixes piglit tex-miplevel-selection *Lod 2D.
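
In scalar terms, the adjustment amounts to the following. This is a sketch with a hypothetical function name; the driver does the equivalent in generated shader code rather than on the CPU.

```c
/* The LOD handed to the hardware for an explicit-LOD lookup: the
 * shader's LOD is relative to the base level, so offset it manually
 * because the hardware does not add the base level itself. */
static inline float hw_lod_for_explicit_lod(float shader_lod,
                                            unsigned base_level)
{
        return shader_lod + (float)base_level;
}
```

For example, a shader asking for LOD 1 on a texture whose base level is 2 should sample mip level 3.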
|
|
Fixes piglit array-texture.
|
|
Fixes ext_transform_feedback-generatemipmap prims_generated.
|
|
After the first output, we were padding by an extra size of the previous
output. Fixes piglit ext_transform_feedback-output-type mat4x3[2] and
friends.
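
A sketch of the intended layout, assuming the outputs are meant to be tightly packed in the buffer: each output starts exactly where the previous one ended, with no extra padding. The function name and the use of dword units are assumptions of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Fills offsets[i] with the starting offset (in dwords) of output i when
 * the outputs are tightly packed, and returns the total size in dwords. */
static inline uint32_t pack_tf_outputs(const uint32_t *sizes, size_t count,
                                       uint32_t *offsets)
{
        uint32_t next = 0;

        for (size_t i = 0; i < count; i++) {
                offsets[i] = next;
                /* Advance by this output's size exactly once. */
                next += sizes[i];
        }

        return next;
}
```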
|
|
The HW was computing an implicit height for the surface based on the image
size, but that may be smaller than the surface with ARB_fbo mismatched
sizes. In that case, we need to tell it about the pad, either with the
little 4-bit field in the RT config, or the extended field in
CLEAR_COLORS_PART3.
Fixes piglit arb_framebuffer_object-mixed-buffer-sizes.
|
|
Fixes tex-miplevel-selection GL2:texture() 1D
|
|
Otherwise, the simulator would complain in tex-miplevel-selection that the
min/max clamp was out of order. The actual HW seems to have clamped to
the max anyway.
|
|
We were overflowing, because of all the little 4k allocations for CLs that
were getting expanded to 128kb in the simulator due to the GMP alignment.
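
The arithmetic behind the overflow can be sketched with a generic power-of-two round-up (align_up is an illustrative helper, not a function from the simulator): every 4 KiB allocation ends up occupying a full 128 KiB, a 32x inflation.

```c
#include <stdint.h>

/* Rounds `value` up to the next multiple of `alignment`, which must be
 * a power of two. */
static inline uint64_t align_up(uint64_t value, uint64_t alignment)
{
        return (value + alignment - 1) & ~(alignment - 1);
}
```

Here align_up(4096, 128 * 1024) yields 131072, so a few hundred small CL allocations are enough to consume tens of megabytes of simulator memory.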
|
|
The original spec I had didn't expose integer textures and suggested that
you use unfiltered floats. Now there are proper formats for them.
Fixes 16- and 32-bit texwrap integer tests in piglit, and
dEQP-GLES3.functional.fbo.completeness.renderable.renderbuffer.color0.rgb10_a2ui.
|
|
When we tried to clear color while storing depth, it assertion failed
about basically not having enough information to decide which color RT to
clear. It turns out the STORE_GENERAL picks the buffer according to the
color buffer being stored, or all of them if NONE. If you're doing depth,
it doesn't know which to pick.
|
|
This centralizes the calculation in the surface, instead of in each
load/store.
|