~nh/mesa - nh's Mesa repository; mostly radeonsi related development

Age	Commit message	Author	Files	Lines
2018-03-14	ac_surface: don't apply the 256-byte alignment to staging surfaces [user_stride-v2]	Nicolai Hähnle	1	-1/+4
	Having the over-alignment on staging surfaces breaks the user_stride mechanism. This whole thing is a hack. We should really have a generic mechanism for specifying minimum stride alignments. In the meantime, I'm not sure if this breaks radv with GFX6/GFX9 hybrid graphics (e.g., pre-gfx9 on Raven). Cc: Dave Airlie <airlied@redhat.com>
2018-03-14	radeonsi: implement transfer_map with user_stride	Nicolai Hähnle	1	-5/+28
	The stride ends up being aligned by AddrLib in ways that are inconvenient to express clearly, but basically, a stride that is aligned to both 64 pixels and 256 bytes will go through unchanged in practice.
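The rule of thumb in the message above can be written as a small predicate. This is a minimal sketch, not AddrLib's actual logic: the helper name `user_stride_passes_unchanged` and its parameters are hypothetical, and it only encodes the "aligned to both 64 pixels and 256 bytes" condition stated in the commit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not part of Mesa or AddrLib): returns true when a
 * requested row stride satisfies the rule of thumb from the commit message,
 * i.e. it is a multiple of 64 pixels and of 256 bytes. */
static bool user_stride_passes_unchanged(uint32_t stride_bytes,
                                         uint32_t bytes_per_pixel)
{
    /* A stride that is not a whole number of pixels cannot qualify. */
    if (bytes_per_pixel == 0 || stride_bytes % bytes_per_pixel != 0)
        return false;

    uint32_t stride_pixels = stride_bytes / bytes_per_pixel;

    return stride_pixels % 64 == 0 && stride_bytes % 256 == 0;
}
```

For a 4-byte-per-pixel format, a 1024-byte stride (256 pixels) satisfies both conditions, while a 128-byte stride of an 8-bit format is 64-pixel aligned but not 256-byte aligned.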
2018-03-14	radeonsi: fix failure paths of r600_texture_transfer_map	Nicolai Hähnle	1	-13/+12
	trans is zero-initialized, but trans->resource is set up immediately, so it needs to be dereferenced.
2018-03-14	st/dri: implement __DRI_IMAGE_TRANSFER_MAP_USER_STRIDE	Nicolai Hähnle	1	-6/+11

2018-03-14	gallium: add user_stride parameter to pipe_context::transfer_map	Nicolai Hähnle	33	-12/+53
	Allow callers to prescribe a desired stride for a transfer. Drivers are free to ignore this new parameter. There is no new capability because it's unclear how strict requirements on this feature should be expressed.
2018-03-14	gallium: use pipe_transfer_map_box inline helper	Nicolai Hähnle	26	-42/+58
	We will change pipe_context::transfer_map in a subsequent commit. Wrapping it in an inline function makes that subsequent change less noisy.
2018-03-14	dri_interface: add __DRI_IMAGE_TRANSFER_USER_STRIDE	Nicolai Hähnle	1	-3/+13
	Allow the caller to specify the row stride (in bytes) with which an image should be mapped. Note that completely ignoring USER_STRIDE is a valid implementation of mapImage. This is horrible API design. Unfortunately, cros_gralloc does indeed have a horrible API design -- in that arbitrary images should be allowed to be mapped with the stride that a linear image of the same width would have. There is no separate capability bit because it's unclear how stricter requirements should be defined.
2018-03-14	dri_interface: document error behavior of mapImage	Nicolai Hähnle	1	-0/+2
	This function is meant to return NULL on error, unlike some other APIs (such as mmap()), which return MAP_FAILED.
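The difference between the two error conventions is easy to get wrong in callers. The sketch below is illustrative only: `map_anonymous_or_null` is a hypothetical wrapper, not a Mesa function, that converts mmap()'s MAP_FAILED sentinel into the NULL-on-error convention that mapImage is documented to follow.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc in strict ISO modes */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical wrapper: maps anonymous memory and reports failure as NULL
 * rather than MAP_FAILED. Comparing mmap()'s result against NULL directly
 * would miss errors, because MAP_FAILED is ((void *)-1). */
static void *map_anonymous_or_null(size_t size)
{
    void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return ptr == MAP_FAILED ? NULL : ptr;
}
```

A caller of such a wrapper, like a caller of mapImage, only needs a single `if (!ptr)` check.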
2018-03-14	radeonsi/cik+: report 64KB local size for compute	Nicolai Hähnle	1	-2/+4

2018-03-14	HACK: ac_surface: disable an assertion triggered by radv on Carrizo	Nicolai Hähnle	1	-3/+4

2018-03-14	HACK: disable NIR-level optimizations	Nicolai Hähnle	1	-20/+20

2018-03-14	[AMD] dri3: Add adaptive_sync_enable driconf option	Nicolai Hähnle	4	-1/+70
	When enabled, this will request FreeSync via the hybrid amdgpu DDX's AMDGPU X11 protocol extension. Due to limitations in the DDX this will only work for applications that cover the entire X screen (which is important to keep in mind when you have a multi-monitor setup). v2: set adaptive_sync_enable = 0 by default
2018-03-14	radeonsi: asynchronous flushes don't have to wait for the submit thread	Nicolai Hähnle	1	-1/+13

2018-03-14	radeonsi: add a warning message for DCC disable failure	Nicolai Hähnle	1	-2/+8
	Pointed out by Coverity. CID: 1418608
2018-03-14	DBG add ac_emit_sethalt	Nicolai Hähnle	1	-0/+46

2018-03-14	DBG HACK add si_sethalt	Nicolai Hähnle	1	-0/+10

2018-03-14	DBG add no_tc_compatible_{mipmaps,flat}	Nicolai Hähnle	1	-0/+7

2018-03-14	DBG gallium/radeon: optionally check definedness of data written into IBs	Nicolai Hähnle	2	-1/+22

2018-03-14	nir/print: add const qualifiers to some more function arguments	Nicolai Hähnle	2	-5/+5

2018-03-14	WIP amd/addrtool	Nicolai Hähnle	3	-0/+725

2018-03-14	util/bitset: add BITSET_LAST_BIT	Nicolai Hähnle	1	-0/+21

2018-03-14	util/bitset: add BITSET_AND and BITSET_AND_NOT	Nicolai Hähnle	1	-0/+32

2018-03-14	ac: add ac_get_exec_mask helper function	Nicolai Hähnle	2	-0/+17
	TODO: this needs an optimization barrier to prevent hoisting / speculating
2018-03-14	gallium/radeon: add EarlyCSE pass	Nicolai Hähnle	1	-0/+1
	This helps to get rid of all the back-and-forth casting with 64-bit ints and doubles, and so exposes more optimization opportunities. TODO test with shader-db
2018-03-14	egl: add support for EGL_MESA_drm_image_formats	Nicolai Hähnle	2	-1/+22
	Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-14	st/mesa: use asynchronous flushes for glFlush	Nicolai Hähnle	1	-1/+1
	Having the gallium driver thread flush in the background should be sufficient for glFlush semantics. Various end-of-frame flushes (from st_context_flush and st/dri) still use a synchronous flush. We should eventually be able to transition those to asynchronous flushes as well by passing fences explicitly via the X protocol.
2018-03-14	spirv: Handle doubles when multiplying a mat by a scalar	Neil Roberts	1	-3/+3
	The code to handle mat multiplication by a scalar tries to pick either imul or fmul depending on whether the matrix is float or integer. However it was doing this by checking whether the base type is float. This was making it choose the int path for doubles (and presumably float16s). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
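The class of bug described above can be shown with a toy type enum. This is a sketch under stated assumptions: `enum base_type`, `enum mul_op` and both pick functions are hypothetical names for illustration and do not mirror the real spirv_to_nir code.

```c
#include <stdbool.h>

/* Hypothetical base types and opcodes, for illustration only. */
enum base_type { TYPE_INT, TYPE_UINT, TYPE_FLOAT16, TYPE_FLOAT, TYPE_DOUBLE };
enum mul_op { OP_IMUL, OP_FMUL };

/* Buggy variant: only 32-bit float selects fmul, so doubles and
 * float16s fall through to the integer path. */
static enum mul_op pick_mul_op_buggy(enum base_type type)
{
    return type == TYPE_FLOAT ? OP_FMUL : OP_IMUL;
}

/* True for every floating-point base type, regardless of bit size. */
static bool base_type_is_float(enum base_type type)
{
    return type == TYPE_FLOAT16 || type == TYPE_FLOAT || type == TYPE_DOUBLE;
}

/* Fixed variant: classify by "is any float type" instead of
 * comparing against one specific type. */
static enum mul_op pick_mul_op_fixed(enum base_type type)
{
    return base_type_is_float(type) ? OP_FMUL : OP_IMUL;
}
```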
2018-03-14	anv/entrypoints: VkGetDeviceProcAddr returns NULL for core instance commands	Iago Toral Quiroga	1	-1/+5
	af5f2322d0c64 addressed this for extension commands, but the spec mandates this behavior also for core API commands. From the Vulkan spec, Table 2. vkGetDeviceProcAddr behavior:
	  device   pname                        return
	  ----------------------------------------------------------
	  (..)
	  device   core device-level command    fp
	  (...)
	See that it specifically states "device-level". Since the vk.xml file doesn't state if core commands are instance or device level, we identify device level commands as the ones that take a VkDevice, VkQueue or VkCommandBuffer as their first parameter. Fixes test failures in new work-in-progress CTS tests. Also see the public issue: https://github.com/KhronosGroup/Vulkan-LoaderAndValidationLayers/issues/2323
	v2: - Include reference to github issue (Emil) - Rebased on top of Vulkan 1.1 changes.
	v3: - Remove the not in the condition and switch the then/else cases (Jason)
	Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
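The classification heuristic described in this commit can be sketched as a string check on the type of a command's first parameter. The function name `is_device_level_command` is hypothetical; the actual change lives in anv's entrypoint generator, and this C version only illustrates the rule.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: a command is treated as device-level when its
 * first parameter is one of the dispatchable device-level handle types
 * named in the commit message. */
static bool is_device_level_command(const char *first_param_type)
{
    static const char *const device_types[] = {
        "VkDevice", "VkQueue", "VkCommandBuffer",
    };

    if (first_param_type == NULL)
        return false;

    for (size_t i = 0; i < sizeof(device_types) / sizeof(device_types[0]); i++) {
        if (strcmp(first_param_type, device_types[i]) == 0)
            return true;
    }
    return false;
}
```

Under this rule a command taking a VkInstance or VkPhysicalDevice first is instance-level, so a device-level lookup for it would return NULL.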
2018-03-14	anv/entrypoints: dispatches to VkQueue are device-level	Iago Toral Quiroga	1	-2/+7
	v2: - Add trampoline functions (Jason) - Add an assertion for unhandled trampoline cases Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-14	radv: drop assert on bindingDescriptorCount > 0	Dave Airlie	1	-1/+0
	The spec is pretty clear that this can be 0, and that it operates as a reserved binding. Fixes: dEQP-VK.binding_model.descriptor_update.empty_descriptor.uniform_buffer Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-13	sched.h needs to be imported on Darwin/OSX targets.	Apple SWE	1	-0/+4
	sched_yield is used but the include reference on Darwin is missing. This patch conditionally guards on Darwin/OSX to import sched.h first. Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
2018-03-13	Add processor topology calculation implementation for Darwin/OSX targets.	Apple SWE	1	-0/+55
	The implementation for bootstrapping SWR on Darwin targets is based on the Linux version. Instead of reading the output of /proc/cpuinfo, sysctlbyname is used to determine the physical identifiers, processor identifiers, core counts and thread-processor affinities. With this patch, it is possible to use SWR as an alternate renderer on OSX to softpipe and llvmpipe. Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
2018-03-14	r600: fix abs for op3 sources	Roland Scheidegger	1	-54/+56
	If a src was referencing the same temp as the dst, the per-component copy code didn't work. e.g.
	  cndge r0.xy, r0.xx, |r2|, r3
	got expanded into
	  mov r12.x, |r2|
	  cndge r0.x, r0.x, r12, r3
	  mov r12.y, |r2|
	  cndge r0.y, r0.x, r12, r3
	hence for the second cndge r0.x was mistakenly the previous cndge result. Fix this by doing all the movs first, so there's no bogus alu.last in between. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=102905 Tested-by: <iive@yahoo.com> Reviewed-by: Dave Airlie <airlied@gmail.com>
2018-03-14	radv: mark all tess output for an indirect access.	Dave Airlie	1	-8/+13
	If a shader does a tcs store with an indirect access, we were only marking the first spot as used. For indirect access we always now mark all slots used by the variable. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464 Fixes: 94f9591995 (radv/ac: add support for TCS/TES inputs/outputs.) Signed-off-by: Dave Airlie <airlied@redhat.com>
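The marking rule can be illustrated with a 64-bit usage mask. This is a minimal sketch, assuming one bit per output slot and a variable that fits within the mask; `mark_output_slots` and its parameters are hypothetical names, not the radv implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns used_mask with the slots touched by a
 * store marked. A direct store marks only the addressed slot; an
 * indirect store marks every slot of the variable, because the index
 * is not known at compile time.
 * Assumes base_slot + num_slots <= 64 and const_index < num_slots. */
static uint64_t mark_output_slots(uint64_t used_mask, unsigned base_slot,
                                  unsigned num_slots, unsigned const_index,
                                  bool indirect)
{
    if (indirect) {
        /* Build a run of num_slots set bits, avoiding a shift by 64. */
        uint64_t range = num_slots >= 64 ? ~UINT64_C(0)
                                         : (UINT64_C(1) << num_slots) - 1;
        return used_mask | (range << base_slot);
    }

    return used_mask | (UINT64_C(1) << (base_slot + const_index));
}
```

For a three-slot variable starting at slot 4, an indirect store sets bits 4 to 6 (mask 0x70), whereas a direct store to element 1 sets only bit 5 (mask 0x20).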
2018-03-14	ac/nir: pass the nir variable through tcs loading.	Dave Airlie	4	-22/+15
	I was going to have to add another parameter to this monster, so we should just pass the nir_variable in; I can't find any reason this would be a bad idea. This is needed for the next fix. Fixes: 94f9591995 (radv/ac: add support for TCS/TES inputs/outputs.) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-14	radv: get correct offset into LDS for indexed vars.	Dave Airlie	1	-1/+1
	This seems more correct to me, since if we have an array of floats they'll be vec4 aligned, and if we do af[2], we want the const index to increase by 2 slots in the non compact case. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464 Fixes: 94f9591995 (radv/ac: add support for TCS/TES inputs/outputs.) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>
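The arithmetic behind this reasoning is small enough to spell out. The sketch below covers only the non-compact case mentioned in the message and assumes one vec4 slot (16 bytes) per array element; the function names and the 16-byte slot size are assumptions for illustration, and the compact case is deliberately not modeled.

```c
/* Assumed size of one vec4 slot in bytes (4 components x 4 bytes). */
#define VEC4_SLOT_BYTES 16u

/* Hypothetical helper: in the non-compact case each array element
 * occupies its own vec4-aligned slot, so indexing af[i] advances the
 * constant index by i whole slots. */
static unsigned indexed_slot_non_compact(unsigned base_slot,
                                         unsigned array_index)
{
    return base_slot + array_index;
}

/* Byte offset of a slot, under the 16-byte slot assumption above. */
static unsigned slot_byte_offset(unsigned slot)
{
    return slot * VEC4_SLOT_BYTES;
}
```

With a float array based at slot 3, `af[2]` lands in slot 5, which is 80 bytes from the start under these assumptions.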
2018-03-13	nir: lower_load_const_to_scalar fix for 8/16b types	Rob Clark	1	-4/+15
	Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-13	Update the documentation for meson	Dylan Baker	1	-13/+23
	Meson is pretty well tested and works in most configurations now, so we can remove the warning about it being unsuited for actual use. It's also worth documenting that meson 0.42.0 or greater is required. v2: - Minor rewording of supported platforms as suggested by Emil - Add two missing tags as reported by xmllint --html Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1) Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
2018-03-13	ac/nir: Use lower_vote_eq_to_ballot instead of ac_nir_lower_subgroups	Jason Ekstrand	6	-99/+1
	Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13	nir/subgroups: Add lowering for vote_ieq/vote_feq to a ballot	Jason Ekstrand	2	-0/+49
	This is based heavily on 97f10934edf8ac, "ac/nir: Add vote_ieq/vote_feq lowering pass." from Bas Nieuwenhuizen. This version is a bit more general since it's in common code. It also properly handles NaN due to not flipping the comparison for floats. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13	meson: don't use compiler.has_header	Dylan Baker	1	-1/+1
	Meson's compiler.has_header is completely useless: it only checks that a header exists, not whether it's usable. This creates problems if a header contains a conditional #error declaration, like so:
	  #if __x86_64__
	  # error "Doesn't work with x86_64!"
	  #endif
	compiler.has_header will return true in this case, even when compiling for x86_64. This is useless. Instead, we'll do a compile check so that any #error declarations will be treated as errors, and compilation will work. Fixes compilation on x32 architecture. Gentoo Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=649746 meson bug: https://github.com/mesonbuild/meson/issues/2246 Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-13	i965: Emit texture cache invalidates around blorp_copy	Jason Ekstrand	1	-0/+15
	This is a terrible hack but it fixes CTS regressions. It's still incredibly unclear exactly what is going wrong in the hardware to cause this to be an issue so this isn't a good fix by any means. However, it does fix tests so there is that. Fixes: fb0e9b5197 "i965: Track the depth and render caches separately" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103746 Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-13	broadcom/vc4: Fix simulator since the perfmon change.	Eric Anholt	1	-0/+1
	It would be nice to support perfmon with simulator, and might be a useful tool for regression testing performance (since the simulator would be deterministic).
2018-03-13	spirv: Silence compiler warning about undefined srcs[0]	Eric Anholt	1	-0/+1
	v2: Use assume() at the srcs[] definition instead. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-03-13	ac/nir: rename radeon_llvm_reg_index_soa() to ac_llvm_reg_index_soa()	Samuel Pitoiset	3	-12/+12
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13	ac/nir: remove some unnecessary includes and declarations	Samuel Pitoiset	2	-9/+1
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13	ac/nir: drop radv prefix from radv_lower_gather4_integer()	Samuel Pitoiset	1	-4/+4
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13	ac/nir: move ac_nir_compiler_options and friends to radv folder	Samuel Pitoiset	7	-89/+89
	Also replace ac_ by radv_. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13	ac: move ac_shader_info to radv folder	Samuel Pitoiset	11	-99/+63
	This is RADV specific code. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13	ac/nir: move ac_shader_variant_info and friends to radv folder	Samuel Pitoiset	7	-136/+139
	Also replace ac_ by radv_. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>