Age | Commit message (Collapse) | Author | Files | Lines |
|
Note: this causes spurious regressions in some current piglit tests,
because the tests incorrectly assume that there is no denorm support for
doubles. I'm going to send out a fix for those tests as well.
|
|
The LLVM intrinsic has existed for a long time. The current name was
established in LLVM 3.9.
|
|
|
|
|
|
|
|
|
|
|
|
The status quo is quite the mess:
1. tgsi_exec will do a per-channel computation, and store the dst[0]
result (significand) correctly for each channel. The dst[1] result
(exponent) will be written to the first bit set in the writemask.
So per-component calculation only works partially.
2. r600 will only do a single computation. It will replicate the
exponent but not the significand.
3. The docs pretend that there's per-component calculation, but even
get dst[0] and dst[1] confused.
4. Luckily, st_glsl_to_tgsi only ever emits single-component instructions,
and kind-of assumes that everything is replicated, generating this for
the dvec4 case:
DFRACEXP TEMP[0].xy, TEMP[1].x, CONST[0][0].xyxy
DFRACEXP TEMP[0].zw, TEMP[1].y, CONST[0][0].zwzw
DFRACEXP TEMP[2].xy, TEMP[1].z, CONST[0][1].xyxy
DFRACEXP TEMP[2].zw, TEMP[1].w, CONST[0][1].zwzw
Settle on the simplest behavior, which is single-component calculation
with replication, document it, and adjust tgsi_exec and r600.
|
|
Sourcing the exponent for the zw destination pair from Z is consistent
with both tgsi_exec and gallivm. In practice, st_glsl_to_tgsi always
generates per-channel instructions anyway.
|
|
|
|
GLSL ES requires both, and while GLSL explicitly doesn't require correct
overflow handling, it does appear to require handling input inf/denorms
correctly.
Fixes dEQP-GLES31.functional.shaders.builtin_functions.precision.ldexp.*
Cc: mesa-stable@lists.freedesktop.org
|
|
Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
|
|
enclosing_scope already contains enclosing_scope_first_read.
What we really want to check here -- not for correctness, but
for speed -- is whether last_read_scope already contains
enclosing_scope.
Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
|
|
Fixes various piglit tests on Stoney, see the comment.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
This assertion is triggered on Stoney in Piglit
./bin/framebuffer-blit-levels {draw,read} stencil -auto -fbo
and similar tests. It should be harmless -- just relax it until
we can get internal clarification.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
buffers
The GLSL rules for interpolateAtSample are unfortunate:
"Returns the value of the input interpolant variable at
the location of sample number sample. If
multisample buffers are not available, the input
variable will be evaluated at the center of the pixel.
If sample sample does not exist, the position used to
interpolate the input variable is undefined."
This fix will fallback to monolithic shader compilation when
interpolateAtSample is used without multisampling.
One alternative would be to always upload 16 sample positions,
filling the buffer up with repetition when the actual number of
samples is less, and then ANDing the sample ID with 0xf. However,
that punishes all well-behaving users of interpolateAtSample,
when in reality, only conformance tests should be affected by
the issue.
Fixes
dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.non_multisample_buffer.*
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
gl_SampleMaskIn is supposed to contain set bits only for the samples that
are covered by the current fragment shader invocation, but the VGPR
initialization hardware loads the set of all bits that are covered at the
current pixel.
Fixes various tests in
dEQP-GLES31.functional.shaders.sample_variables.sample_mask_in.*
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
We need to keep the workaround for older firmware, though.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
If the last operation happens to be a non-draw, such as a
transfer_map that triggers a decompress blit, there may be
interesting messages left in the driver log.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
Add InstanceStrideEnable field and rename InstanceDataStepRate to
InstanceAdvancementState in INPUT_ELEMENT_DESC structure.
Add stubs for handling InstanceStrideEnable in FetchJit::JitLoadVertices()
and FetchJit::JitGatherVertices() and assert if they are triggered.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Make more robust to handle strange strange configurations like a vmware
exported 4-way numa X 1-core configuration.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Add new field in SWR_BACKEND_STATE::vertexClipCullOffset to specify the
start of the clip/cull section of the vertex header. Removed use of
hardcoded slot from binner.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Moved from from SWR_RASTSTATE to SWR_BACKEND_STATE.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
SwrStallBE stalls the backend threads until all work submitted before
the stall has finished. The frontend threads can continue to make
forward progress.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
The function is only called from one place, which is hidden behind
the same `#ifdef DEBUG`.
Fixes: ca73c3358c91434e68ab "glsl: Mark functions static"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
|
|
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
|
|
Per the spec:
"Resetting a command buffer is an operation that discards any
previously recorded commands and puts a command buffer in the
initial state."
As far I'm concerned, that flag can be changed by calling
VkCmdPushConstants() (or any other functions which update it),
so it should be cleared as well.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
|
|
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
|
|
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
|
|
GFX9 changes how images are layed out, so this needs updating.
Fixes: dEQP-VK.query_pool.statistics_query.*
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
For the comp_swap case this was overflowing and crashing
sometimes.
Fixes:
dEQP-VK.image.atomic_operations.compare_exchange.*
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
This field covers the whole resource.
Fixes:
dEQP-VK.pipeline.image.suballocation.sampling_type.combined.view_type.3d.format.*
dEQP-VK.texture.filtering.3d.combinations.*
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
As GFX9 can't handle 1D depth textures, radeonsi and
apparantly pro just update all 1D textures to 2D,
and work around it.
This ports the workarounds from radeonsi.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Work out the width/height from the level manually, as on GFX9
we won't minify the iview width/height.
This fixes:
dEQP-VK.api.image_clearing.core.clear_color_image* on gfx9
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
We are looking up the execution type prior to checking how many sources
we have. This leads to looking for a type for src1 on MOV instructions
which is bogus. On BDW+, the src1 register type overlaps with the
64-bit immediate and causes us problems.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
|
|
This reverts commit 1cda9a2fee05effd9c64bd773bc6005281593662.
It works now.
|
|
We can't use it anyway in fast clears, and on GFX9 it seems to
actually hange the card if we specify it.
Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
|
|
The current DCC init routine doesn't account for initializing a
single layer or level. Multilayer seems hard for small textures on
pre-GFX9 as tre metadata for the layers can be interleaved. For
GFX9 multilevel textures are a problem for similar reasons.
So just disable this for now, until we handle the texture modes
correctly.
Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
|