~nh/mesa - nh's Mesa repository; mostly radeonsi related development

Age	Commit message	Author	Files	Lines
2018-01-10	swr: Handle indirect indices in GS	George Kyriazis	1	-8/+39
	BuilderSWR::swr_gs_llvm_fetch_input() (and consequently swr_gs_llvm_fetch_input()) did not handle the case where is_vindex_indirect or is_aindex_indirect is set. Implement it, using the code in draw_llvm.c as a guideline. Fixes the following piglit tests: dynamic_input_array_index (crash), gs-input-array-vec4-index-rd, vs-output-array-vec4-index-wr-before-gs. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-10	swr/rast: switch win32 jit format to COFF	Tim Rowley	1	-2/+2
	Allows for call-stack and exception handling for jitted functions. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-10	swr/rast: don't use 32-bit gathers for elements < 32-bits in size	Tim Rowley	1	-1/+60
	Using a gather for elements less than 32-bits in size can cause pagefaults when loading the last elements in a page-aligned-sized buffer. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
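	The overrun described above can be worked through with a little arithmetic. This is a minimal sketch, not code from the SWR sources: the helper name and its parameters are hypothetical, and it assumes tightly packed elements (stride equal to element size) and a 4096-byte page.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of SWR): number of bytes a load of
// `load_size` bytes reads past the end of a buffer when it fetches the
// element at `index`, assuming elements are tightly packed.
constexpr std::size_t bytes_past_end(std::size_t buffer_size,
                                     std::size_t elem_size,
                                     std::size_t index,
                                     std::size_t load_size)
{
    return (index * elem_size + load_size > buffer_size)
               ? index * elem_size + load_size - buffer_size
               : 0;
}

// One 4096-byte page, assumed here to be followed by an unmapped page.
constexpr std::size_t kPage = 4096;

// A 32-bit (4-byte) gather on the last 8-bit element reads 3 bytes too far.
static_assert(bytes_past_end(kPage, 1, 4095, 4) == 3, "8-bit element overruns");
// The last 16-bit element overruns by 2 bytes.
static_assert(bytes_past_end(kPage, 2, 2047, 4) == 2, "16-bit element overruns");
// A 32-bit element ends exactly at the buffer boundary.
static_assert(bytes_past_end(kPage, 4, 1023, 4) == 0, "32-bit element fits");
// Loading only the element's own size never overruns.
static_assert(bytes_past_end(kPage, 1, 4095, 1) == 0, "byte load fits");
```

	When the buffer size is a multiple of the page size, any nonzero overrun lands in the next page, which is why the fault only shows up on the last elements.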
2018-01-10	swr/rast: autogenerate named structs instead of literal structs	Tim Rowley	1	-8/+15
	Results in far smaller and more useful IR output. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-10	swr/rast: SIMD16 fetch shader jitter cleanup	Tim Rowley	1	-720/+368
	Bake in USE_SIMD16_BUILDER code paths (for USE_SIMD16_SHADER defined), remove USE_SIMD16_BUILDER define, remove deprecated pseudo-SIMD16 code paths. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-10	swr/rast: shuffle header files for msvc pre-compiled header usage	Tim Rowley	10	-88/+143
	Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-10	swr/rast: SIMD16 builder - cleanup naming (simd2 -> simd16)	Tim Rowley	5	-233/+239
	Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-08	meson: Build SWR driver	Dylan Baker	2	-0/+447
	This enables the SWR driver, but doesn't actually hook it up to any of the targets yet. I felt like this patch was big and complicated enough without adding that. v2: - Fix typo 'delemeited' -> 'delimited' (Eric E) - Fix typo 'errror' -> 'error' (Eric E) - Use variables to hold files instead of looking above the current meson build (Eric E) - Use foreach loops to reduce the number of unique generators - Add comment about why some generators have names and some are just added to a list v3: - Remove trailing whitespace Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-01-04	swr/rast: fix invalid sign masks in avx512 simdlib code	Tim Rowley	3	-3/+3
	Should be 0x80000000 instead of 0x8000000. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
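	To see why the missing zero matters: 0x8000000 is bit 27, not the bit-31 sign bit of a 32-bit lane. The sketch below is illustrative only; the constant and function names are hypothetical and do not come from the simdlib code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; not taken from simdlib.
constexpr std::uint32_t kSignMask = 0x80000000u; // bit 31: sign bit of a 32-bit lane
constexpr std::uint32_t kTypoMask = 0x8000000u;  // one zero short: bit 27

constexpr bool mask_hits(std::uint32_t bits, std::uint32_t mask)
{
    return (bits & mask) != 0;
}

// IEEE-754 single-precision bit patterns.
constexpr std::uint32_t kNegZeroBits = 0x80000000u; // -0.0f
constexpr std::uint32_t kPosOneBits  = 0x3F800000u; // +1.0f

static_assert(kSignMask == (1u << 31), "correct mask is bit 31");
static_assert(kTypoMask == (1u << 27), "typo mask is bit 27");

// The correct mask flags -0.0f as negative and +1.0f as non-negative.
static_assert(mask_hits(kNegZeroBits, kSignMask), "");
static_assert(!mask_hits(kPosOneBits, kSignMask), "");

// The typo mask gets both wrong: it misses -0.0f and flags +1.0f,
// because bit 27 falls inside the float exponent field.
static_assert(!mask_hits(kNegZeroBits, kTypoMask), "");
static_assert(mask_hits(kPosOneBits, kTypoMask), "");
```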
2018-01-03	swr/rast: fix MemoryBuffer build break for llvm-6	Tim Rowley	1	-0/+4
	LLVM api change. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104381 Tested-by: Laurent Carlier <lordheavym@gmail.com> Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-19	gallium: plumb context priority through to driver	Rob Clark	1	-0/+1
	Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-12-18	swr: Account for index_bias in offsets	George Kyriazis	1	-3/+3
	When calculating buffer offsets for client buffers, account for info.index_bias. Fixes the following piglit tests: arb_draw_elements_base_vertex-drawelements-user_varrays arb_draw_elements_base_vertex-negative-index-user_varrays Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15	swr/rast: Move more RTAI handling out of binner	Tim Rowley	2	-12/+2
	Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15	swr/rast: EXTRACT2 changed from vextract/vinsert to vshuffle	Tim Rowley	3	-61/+32
	Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15	swr/rast: Fix cache of API thread event manager	Tim Rowley	1	-1/+1
	Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15	swr/rast: Replace VPSRL with LSHR	Tim Rowley	4	-41/+4
	Replace use of x86 intrinsic with general llvm IR instruction. Generates the same final assembly. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15	swr/rast: Rework thread binding parameters for machine partitioning	Tim Rowley	7	-88/+322
	Add BASE_NUMA_NODE, BASE_CORE, BASE_THREAD parameters to SwrCreateContext. Add optional SWR_API_THREADING_INFO parameter to SwrCreateContext to control reservation of API threads. Add SwrBindApiThread() function to allow binding of API threads to reserved HW threads. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15	swr/rast: Pull RTAI gather & offset out of clip/bin code	Tim Rowley	7	-146/+203
	Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15	swr/rast: Remove no-op VBROADCAST of vID	Tim Rowley	1	-2/+2
	Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15	swr/rast: SIMD16 Fetch - Fully widen 32-bit integer vertex components	Tim Rowley	4	-17/+109
	Also widen the 16-bit and 8-bit integer vertex component gathers to SIMD16. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15	swr/rast: Replace INSERT2 vextract/vinsert with JOIN2 vshuffle	Tim Rowley	3	-105/+30
	Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15	swr/rast: SIMD16 Fetch - Fully widen 16-bit float vertex components	Tim Rowley	1	-7/+48
	Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15	swr/rast: SIMD16 Fetch - Fully widen 32-bit float vertex components	Tim Rowley	4	-32/+194
	Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15	swr/rast: Pass prim to ClipSimd	Tim Rowley	1	-5/+5
	Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15	swr/rast: Pull most of the VPAI manipulation out of the binner/clipper	Tim Rowley	7	-158/+177
	Move out of binner/clipper; hand them down from the frontend code instead. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15	swr/rast: Move GatherScissors to header	Tim Rowley	2	-127/+127
	Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15	swr/rast: Rewrite Shuffle8bpcGatherd using shuffle	Tim Rowley	1	-182/+62
	Ease future code maintenance, prepare for folding simd8 and simd16 versions. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15	swr/rast: Convert gather masks to Nx1bit	Tim Rowley	2	-40/+14
	Simplifies calling code, gets gather function interface closer to llvm's masked_gather. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15	swr/rast: WIP - Widen fetch shader to SIMD16	Tim Rowley	1	-27/+689
	Widen vertex gather/storage to SIMD16 for all component types. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15	swr/rast: Corrections to multi-scissor handling	Tim Rowley	1	-88/+88
	binner's GatherScissors() will be turned into a real gather in the not too distant future. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15	swr/rast: Binner fixes for viewport index offset handling	Tim Rowley	2	-2/+12
	Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-15	swr/rast: Remove unneeded copy of gather mask	Tim Rowley	2	-79/+23
	Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-13	swr: Correct texture allocation and limit max size to 2GB	Bruce Cherniak	2	-4/+10
	This patch fixes piglit tex3d-maxsize by correcting 4 things: The total_size calculation was using 32-bit math, therefore a >4GB allocation request overflowed and was not returning false (unsupported). Changed AlignedMalloc arguments from "unsigned int" to size_t, to handle >4GB allocations. Added error checking on texture allocations to fail gracefully. Finally, temporarily decreased supported max texture size from 4GB to 2GB. The gallivm texture-sampler needs some additional work to correctly handle larger than 2GB textures (offsets to LLVMBuildGEP are signed). I'm working on a follow-on patch to allow up to 4GB textures, as this is useful in HPC visualization applications. Fixes piglit tex3d-maxsize. v2: Updated patch description to clarify ">4GB". Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
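	The 32-bit overflow described above can be reproduced in isolation. This is a sketch under stated assumptions, not the driver's code: the function names, the parameter list (width, height, depth, bytes per texel), and the exact form of the 2GB limit check are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the size calculation; not the SWR code.
// 32-bit math: the product wraps modulo 2^32.
constexpr std::uint32_t total_size_32(std::uint32_t w, std::uint32_t h,
                                      std::uint32_t d, std::uint32_t bpp)
{
    return w * h * d * bpp;
}

// 64-bit math: widen the first operand so the whole product is 64-bit.
constexpr std::uint64_t total_size_64(std::uint32_t w, std::uint32_t h,
                                      std::uint32_t d, std::uint32_t bpp)
{
    return static_cast<std::uint64_t>(w) * h * d * bpp;
}

// Assumed limit for illustration: 2GB expressed as 2^31 bytes.
constexpr std::uint64_t kMaxTextureBytes = std::uint64_t(1) << 31;

constexpr bool size_supported(std::uint64_t total)
{
    return total <= kMaxTextureBytes;
}

// A 2048^3 texture at 4 bytes per texel needs 2^35 bytes (32GB).
static_assert(total_size_64(2048, 2048, 2048, 4) == (std::uint64_t(1) << 35), "");
// In 32-bit math the same product wraps to 0, so a size check would pass.
static_assert(total_size_32(2048, 2048, 2048, 4) == 0, "");
static_assert(size_supported(total_size_32(2048, 2048, 2048, 4)), "wrapped value slips through");
static_assert(!size_supported(total_size_64(2048, 2048, 2048, 4)), "64-bit value is rejected");
```

	With the wrapped value the request looks tiny and is accepted; only the 64-bit computation lets the check report the allocation as unsupported.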
2017-12-13	swr: Fix KNOB_MAX_WORKER_THREADS thread creation override.	Bruce Cherniak	1	-2/+1
	Environment variable KNOB_MAX_WORKER_THREADS allows the user to override default thread creation and thread binding. A previous commit to adjust Linux CPU topology caused setting this KNOB to bind all threads to a single core. This patch restores correct functionality of the override. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-12-06	swr/scons: Fix another intermittent build failure	George Kyriazis	1	-0/+1
	gen_BackendPixelRate*.cpp depends on gen_ar_eventhandler.hpp. Fix missing dependency. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-12-01	swr/scons: Fix intermittent build failure	George Kyriazis	1	-0/+1
	gen_rasterizer*.cpp depends on gen_ar_eventhandler.hpp. Account for new dependency. Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-11-20	swr/rast: Repair simd8 frontend code rot	Tim Rowley	1	-1/+1
	Keep non-default simd8 frontend code running for comparison purposes. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20	swr/rast: Implement AVX-512 GATHERPS in SIMD16 fetch shader	Tim Rowley	4	-29/+220
	Disabled for now. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20	swr/rast: Simplify GATHER* jit builder api	Tim Rowley	4	-48/+48
	General cleanup, and prep work for possibly moving to llvm masked gather intrinsic. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20	swr/rast: Add alignment to transpose targets	Tim Rowley	1	-8/+8
	Needed to ensure alignment for avx512. Fixes address sanitizer crash. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20	swr/rast: Cache eventmanager	Tim Rowley	3	-0/+9
	Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20	swr/rast: Enable AVX-512 targets in the jitter	Tim Rowley	2	-10/+0
	Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20	swr/rast: Points with clipdistance can't go through simplepoints path	Tim Rowley	1	-1/+2
	Fixes piglit glsl-1.20:vs-clip-vertex-primitives and glsl-1.30:vs-clip-distance-primitives. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20	swr/rast: Code style change (NFC)	Tim Rowley	1	-2/+7
	Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20	swr/rast: Widen fetch shader to SIMD16	Tim Rowley	5	-3/+151
	Widen fetch shader to SIMD16, enable SIMD16 types in the jitter, and provide EXTRACT/INSERT SIMD8 <-> SIMD16 utility functions. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-20	swr/rast: Support flexible vertex layout for DS output	Tim Rowley	2	-0/+3
	Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-11-14	swr/rast: Faster emulated simd16 permute	Tim Rowley	1	-23/+11
	Speed up simd16 frontend (default) on avx/avx2 platforms; fixes performance regression caused by switch to simdlib. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com> Cc: mesa-stable@lists.freedesktop.org
2017-11-14	swr/rast: Use gather instruction for i32gather_ps on simd16/avx512	Tim Rowley	1	-11/+1
	Speed up avx512 platforms; fixes performance regression caused by switch to simdlib. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com> Cc: mesa-stable@lists.freedesktop.org
2017-11-10	swr: Fixed an uncommon freed-memory access during state validation	Bruce Cherniak	2	-17/+25
	State validation is performed during clear and draw calls. Validation during clear was still accessing vertex buffer state. When the currently set vertex buffers are client arrays, this could lead to accessing freed memory. Such is the case with the VMD application. Previously, vertex buffer validation depended on a dirty bit or the draw info indicating an indexed draw. This required special handling for clears. But, vertex buffer validation still occurred which was unnecessary and wrong. Now, only minimal validation is performed during clear, deferring the remainder to the next draw. And, by setting the dirty bit in swr_draw_vbo for indexed draws, vertex buffer validation is only dependent upon a single dirty bit. This fixes a bug exposed by the VMD application when changing models. Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2017-11-09	util: move os_time.[ch] to src/util	Nicolai Hähnle	2	-2/+2
	Reviewed-by: Marek Olšák <marek.olsak@amd.com>