Age | Commit message (Collapse) | Author | Files | Lines |
|
BuilderSWR::swr_gs_llvm_fetch_input() (and consequently
swr_gs_llvm_fetch_input()) did not handle the case where
is_vindex_indirect or is_aindex_indirect is set.
Implement it, using the code in draw_llvm.c as a guideline.
Fixes the following piglit tests:
dynamic_input_array_index (crash)
gs-input-array-vec4-index-rd
vs-output-array-vec4-index-wr-before-gs
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Allows for call-stack and exception handling for jitted functions.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Using a gather for elements less than 32-bits in size can cause
pagefaults when loading the last elements in a page-aligned-sized
buffer.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
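
The page-fault scenario described above can be illustrated with a little arithmetic. This is a minimal sketch, not the driver's code: it assumes a gather that issues one 32-bit load per element, and the helper name `overread_bytes` is hypothetical.

```cpp
#include <cstddef>

// Hypothetical helper: number of bytes read past the end of a buffer when the
// element at `index` (each element `elem_bytes` wide) is fetched with a load
// that is `load_bytes` wide. Returns 0 when the load stays inside the buffer.
constexpr std::size_t overread_bytes(std::size_t buffer_bytes,
                                     std::size_t elem_bytes,
                                     std::size_t index,
                                     std::size_t load_bytes)
{
    return index * elem_bytes + load_bytes > buffer_bytes
               ? index * elem_bytes + load_bytes - buffer_bytes
               : 0;
}
```

With a 4096-byte buffer of 16-bit elements, a 32-bit load of the last element (index 2047) reads 2 bytes past the end. If the buffer ends exactly on a page boundary and the next page is unmapped, those extra bytes are what fault.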
|
|
Results in far smaller and useful IR output.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Bake in USE_SIMD16_BUILDER code paths (for USE_SIMD16_SHADER defined),
remove USE_SIMD16_BUILDER define, remove deprecated pseudo-SIMD16 code
paths.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
This enables the SWR driver, but doesn't actually hook it up to any of
the targets yet. I felt like this patch was big and complicated enough
without adding that.
v2: - Fix typo 'delemeited' -> 'delimited' (Eric E)
- Fix typo 'errror' -> 'error' (Eric E)
- Use variables to hold files instead of looking above the current
meson build (Eric E)
- Use foreach loops to reduce the number of unique generators
- Add comment about why some generators have names and some are just
added to a list
v3: - Remove trailing whitespace
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
|
|
Should be 0x80000000 instead of 0x8000000.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
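
The two constants above differ by one hex digit, which moves the set bit from position 31 to position 27. The sketch below makes that visible. The commit text does not say what the constant is used for, so the names here are hypothetical; the only claim is about the values themselves.

```cpp
#include <cstdint>

// Hypothetical names, for illustration only.
constexpr std::uint32_t kIntended    = 0x80000000u; // 8 hex digits: bit 31
constexpr std::uint32_t kMissingZero = 0x8000000u;  // 7 hex digits: bit 27

// Index of the highest set bit (0 for inputs 0 and 1).
constexpr int bit_index(std::uint32_t v)
{
    return v <= 1 ? 0 : 1 + bit_index(v >> 1);
}
```

Bit 31 is the top bit of a 32-bit word, so a mask that is one zero short selects a bit in the middle of the value instead.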
|
|
LLVM API change.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104381
Tested-by: Laurent Carlier <lordheavym@gmail.com>
Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
|
|
When calculating buffer offsets for client buffers account for info.index_bias.
Fixes the following piglit tests:
arb_draw_elements_base_vertex-drawelements-user_varrays
arb_draw_elements_base_vertex-negative-index-user_varrays
Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
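
A rough sketch of the offset calculation described above, assuming the offset into a client buffer is derived from the lowest referenced index times the vertex stride. The function name and parameters are hypothetical and are not taken from the driver source.

```cpp
#include <cstdint>

// Hypothetical helper: byte offset of the first referenced vertex in a client
// (user-pointer) vertex buffer. The bias is signed, so the sum is formed in
// 64-bit signed arithmetic before multiplying by the stride.
constexpr std::int64_t client_buffer_offset(std::uint32_t min_index,
                                            std::int32_t index_bias,
                                            std::uint32_t stride)
{
    return (static_cast<std::int64_t>(min_index) + index_bias) *
           static_cast<std::int64_t>(stride);
}
```

Ignoring the bias would place the window at `min_index * stride`, which is the wrong region of the user array whenever the draw supplies a nonzero base vertex, including a negative one.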
|
|
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Replace use of x86 intrinsic with general llvm IR instruction.
Generates the same final assembly.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Add BASE_NUMA_NODE, BASE_CORE, BASE_THREAD parameters to
SwrCreateContext.
Add optional SWR_API_THREADING_INFO parameter to SwrCreateContext to
control reservation of API threads.
Add SwrBindApiThread() function to allow binding of API threads to
reserved HW threads.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Also widen the 16-bit and 8-bit integer vertex component gathers to SIMD16.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Move out of binner/clipper; hand them down from the frontend code instead.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Ease future code maintenance, prepare for folding simd8 and simd16 versions.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Simplifies calling code, gets gather function interface closer to llvm's
masked_gather.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Widen vertex gather/storage to SIMD16 for all component types.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
binner's GatherScissors() will be turned into a real gather in the not
too distant future.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
This patch fixes piglit tex3d-maxsize by correcting 4 things:
The total_size calculation was using 32-bit math, therefore a >4GB
allocation request overflowed and was not returning false (unsupported).
Changed AlignedMalloc arguments from "unsigned int" to size_t, to handle
>4GB allocations.
Added error checking on texture allocations to fail gracefully.
Finally, temporarily decreased supported max texture size from 4GB to 2GB.
The gallivm texture-sampler needs some additional work to correctly handle
larger than 2GB textures (offsets to LLVMBuildGEP are signed).
I'm working on a follow-on patch to allow up to 4GB textures, as this is
useful in HPC visualization applications.
Fixes piglit tex3d-maxsize.
v2: Updated patch description to clarify ">4GB".
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
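
The 32-bit overflow described above can be reproduced in a few lines. This is a minimal sketch under stated assumptions: the function names, the parameter list (width, height, depth, bytes per texel), and the exact form of the 2GB check are hypothetical and do not come from the driver.

```cpp
#include <cstdint>

// 32-bit math: the product wraps modulo 2^32, so a huge request can look small.
constexpr std::uint32_t total_size_32(std::uint32_t w, std::uint32_t h,
                                      std::uint32_t d, std::uint32_t bpp)
{
    return w * h * d * bpp;
}

// 64-bit math: widen the first operand so the whole product is computed in 64 bits.
constexpr std::uint64_t total_size_64(std::uint32_t w, std::uint32_t h,
                                      std::uint32_t d, std::uint32_t bpp)
{
    return static_cast<std::uint64_t>(w) * h * d * bpp;
}

// Assumed limit for illustration: 2GB, matching the temporary cap in the commit text.
constexpr std::uint64_t kMaxTextureBytes = 1ull << 31;

constexpr bool is_supported(std::uint32_t w, std::uint32_t h,
                            std::uint32_t d, std::uint32_t bpp)
{
    return total_size_64(w, h, d, bpp) <= kMaxTextureBytes;
}
```

A 2048x2048x2048 texture at 4 bytes per texel needs 2^35 bytes. The 32-bit product wraps to 0, so a size check based on it passes, while the 64-bit product correctly exceeds the limit and the request is rejected.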
|
|
Environment variable KNOB_MAX_WORKER_THREADS allows the user to override
default thread creation and thread binding. A previous commit that adjusted
Linux CPU topology handling caused setting this KNOB to bind all threads to a
single core.
This patch restores correct functionality of override.
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
|
|
gen_BackendPixelRate*.cpp depends on gen_ar_eventhandler.hpp.
Fix missing dependency.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
gen_rasterizer*.cpp depends on gen_ar_eventhandler.hpp.
Account for new dependency.
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
|
|
Keep non-default simd8 frontend code running for comparison purposes.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Disabled for now.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
General cleanup, and prep work for possibly moving to llvm masked
gather intrinsic.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Needed to ensure alignment for avx512.
Fixes address sanitizer crash.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Fixes piglit glsl-1.20:vs-clip-vertex-primitives and
glsl-1.30:vs-clip-distance-primitives.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Widen fetch shader to SIMD16, enable SIMD16 types in the jitter,
and provide EXTRACT/INSERT SIMD8 <-> SIMD16 utility functions.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
Speed up simd16 frontend (default) on avx/avx2 platforms;
fixes performance regression caused by switch to simdlib.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Cc: mesa-stable@lists.freedesktop.org
|
|
Speed up avx512 platforms; fixes performance regression caused
by switch to simdlib.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Cc: mesa-stable@lists.freedesktop.org
|
|
State validation is performed during clear and draw calls. Validation
during clear was still accessing vertex buffer state. When the currently
set vertex buffers are client arrays, this could lead to accessing freed
memory. Such is the case with the VMD application.
Previously, vertex buffer validation depended on a dirty bit or the
draw info indicating an indexed draw. This required special handling for
clears. Even so, vertex buffer validation still occurred during clears,
which was unnecessary and wrong.
Now, only minimal validation is performed during clear, deferring the
remainder to the next draw. By setting the dirty bit in swr_draw_vbo
for indexed draws, vertex buffer validation depends only on a
single dirty bit.
This fixes a bug exposed by the VMD application when changing models.
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
|
|
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|