Age | Commit message | Author | Files | Lines
2018-03-14 | ac_surface: don't apply the 256-byte alignment to staging surfaces [user_stride-v2] | Nicolai Hähnle | 1 | -1/+4
Having the over-alignment on staging surfaces breaks the user_stride mechanism. This whole thing is a hack. We should really have a generic mechanism for specifying minimum stride alignments. In the meantime, I'm not sure if this breaks radv with GFX6/GFX9 hybrid graphics (e.g., pre-gfx9 on Raven).
Cc: Dave Airlie <airlied@redhat.com>
2018-03-14 | radeonsi: implement transfer_map with user_stride | Nicolai Hähnle | 1 | -5/+28
The stride ends up being aligned by AddrLib in ways that are inconvenient to express clearly, but basically, a stride that is aligned to both 64 pixels and 256 bytes will go through unchanged in practice.
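The rule of thumb in the message above can be written as a small predicate. This is a minimal sketch, not code from radeonsi or AddrLib: the helper name `stride_passes_unchanged` and its parameters are hypothetical, and it only encodes the "64 pixels and 256 bytes" condition stated in the commit message, not AddrLib's actual alignment logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not a radeonsi or AddrLib function): returns true
 * when a requested row stride meets the rule of thumb from the commit
 * message, i.e. it is a multiple of both 64 pixels and 256 bytes. */
static bool stride_passes_unchanged(uint32_t stride_bytes,
                                    uint32_t bytes_per_pixel)
{
    assert(bytes_per_pixel > 0);

    /* A stride that is not a whole number of pixels cannot be
     * 64-pixel aligned. */
    if (stride_bytes % bytes_per_pixel != 0)
        return false;

    uint32_t stride_pixels = stride_bytes / bytes_per_pixel;
    return stride_pixels % 64 == 0 && stride_bytes % 256 == 0;
}
```

For a 4-byte-per-pixel format, a 512-byte stride (128 pixels) satisfies both conditions, while a 256-byte stride of 8-byte pixels (32 pixels) does not.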
2018-03-14 | radeonsi: fix failure paths of r600_texture_transfer_map | Nicolai Hähnle | 1 | -13/+12
trans is zero-initialized, but trans->resource is set up immediately, so it needs to be dereferenced.
2018-03-14 | st/dri: implement __DRI_IMAGE_TRANSFER_MAP_USER_STRIDE | Nicolai Hähnle | 1 | -6/+11
2018-03-14 | gallium: add user_stride parameter to pipe_context::transfer_map | Nicolai Hähnle | 33 | -12/+53
Allow callers to prescribe a desired stride for a transfer. Drivers are free to ignore this new parameter. There is no new capability because it's unclear how strict requirements on this feature should be expressed.
2018-03-14 | gallium: use pipe_transfer_map_box inline helper | Nicolai Hähnle | 26 | -42/+58
We will change pipe_context::transfer_map in a subsequent commit. Wrapping it in an inline function makes that subsequent change less noisy.
2018-03-14 | dri_interface: add __DRI_IMAGE_TRANSFER_USER_STRIDE | Nicolai Hähnle | 1 | -3/+13
Allow the caller to specify the row stride (in bytes) with which an image should be mapped. Note that completely ignoring USER_STRIDE is a valid implementation of mapImage. This is horrible API design. Unfortunately, cros_gralloc does indeed have a horrible API design -- in that arbitrary images should be allowed to be mapped with the stride that a linear image of the same width would have. There is no separate capability bit because it's unclear how stricter requirements should be defined.
2018-03-14 | dri_interface: document error behavior of mapImage | Nicolai Hähnle | 1 | -0/+2
This function is meant to return NULL on error, unlike some other APIs (such as mmap()), which return MAP_FAILED.
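The difference between the two error conventions is easy to get wrong at call sites, so here is a minimal sketch of it. The wrapper name `map_or_null` is hypothetical and is not part of the DRI interface; it only illustrates converting mmap()'s MAP_FAILED sentinel into the NULL-on-error convention that the message documents for mapImage.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on strict glibc builds */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical wrapper (not a DRI function): maps anonymous memory and
 * reports failure as NULL instead of MAP_FAILED. */
static void *map_or_null(size_t size)
{
    void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    /* MAP_FAILED is (void *)-1, not NULL, so a plain NULL check on the
     * raw mmap() result would miss the error. */
    return ptr == MAP_FAILED ? NULL : ptr;
}
```

A caller of a NULL-returning function should test `ptr == NULL`, and a caller of mmap() should test `ptr == MAP_FAILED`; mixing the two checks silently accepts a failed mapping.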
2018-03-14 | radeonsi/cik+: report 64KB local size for compute | Nicolai Hähnle | 1 | -2/+4
2018-03-14 | HACK: ac_surface: disable an assertion triggered by radv on Carrizo | Nicolai Hähnle | 1 | -3/+4
2018-03-14 | HACK: disable NIR-level optimizations | Nicolai Hähnle | 1 | -20/+20
2018-03-14 | [AMD] dri3: Add adaptive_sync_enable driconf option | Nicolai Hähnle | 4 | -1/+70
When enabled, this will request FreeSync via the hybrid amdgpu DDX's AMDGPU X11 protocol extension. Due to limitations in the DDX this will only work for applications that cover the entire X screen (which is important to keep in mind when you have a multi-monitor setup).
v2: set adaptive_sync_enable = 0 by default
2018-03-14 | radeonsi: asynchronous flushes don't have to wait for the submit thread | Nicolai Hähnle | 1 | -1/+13
2018-03-14 | radeonsi: add a warning message for DCC disable failure | Nicolai Hähnle | 1 | -2/+8
Pointed out by Coverity. CID: 1418608
2018-03-14 | DBG add ac_emit_sethalt | Nicolai Hähnle | 1 | -0/+46
2018-03-14 | DBG HACK add si_sethalt | Nicolai Hähnle | 1 | -0/+10
2018-03-14 | DBG add no_tc_compatible_{mipmaps,flat} | Nicolai Hähnle | 1 | -0/+7
2018-03-14 | DBG gallium/radeon: optionally check definedness of data written into IBs | Nicolai Hähnle | 2 | -1/+22
2018-03-14 | nir/print: add const qualifiers to some more function arguments | Nicolai Hähnle | 2 | -5/+5
2018-03-14 | WIP amd/addrtool | Nicolai Hähnle | 3 | -0/+725
2018-03-14 | util/bitset: add BITSET_LAST_BIT | Nicolai Hähnle | 1 | -0/+21
2018-03-14 | util/bitset: add BITSET_AND and BITSET_AND_NOT | Nicolai Hähnle | 1 | -0/+32
2018-03-14 | ac: add ac_get_exec_mask helper function | Nicolai Hähnle | 2 | -0/+17
TODO: this needs an optimization barrier to prevent hoisting / speculating
2018-03-14 | gallium/radeon: add EarlyCSE pass | Nicolai Hähnle | 1 | -0/+1
This helps to get rid of all the back-and-forth casting with 64-bit ints and doubles, and so exposes more optimization opportunities.
TODO: test with shader-db
2018-03-14 | egl: add support for EGL_MESA_drm_image_formats | Nicolai Hähnle | 2 | -1/+22
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-14 | st/mesa: use asynchronous flushes for glFlush | Nicolai Hähnle | 1 | -1/+1
Having the gallium driver thread flush in the background should be sufficient for glFlush semantics. Various end-of-frame flushes (from st_context_flush and st/dri) still use a synchronous flush. We should eventually be able to transition those to asynchronous flushes as well by passing fences explicitly via the X protocol.
2018-03-14 | spirv: Handle doubles when multiplying a mat by a scalar | Neil Roberts | 1 | -3/+3
The code to handle mat multiplication by a scalar tries to pick either imul or fmul depending on whether the matrix is float or integer. However, it was doing this by checking whether the base type is float. This was making it choose the int path for doubles (and presumably float16s).
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
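The bug class described above, testing for one exact float type instead of the whole floating-point family, can be sketched as follows. The enums and function names are hypothetical stand-ins and are not the real GLSL type or NIR opcode definitions; the sketch only shows why an exact match sends doubles and float16s down the integer path.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; not the real GLSL base types or NIR opcodes. */
enum base_type { TYPE_INT, TYPE_UINT, TYPE_FLOAT16, TYPE_FLOAT, TYPE_DOUBLE };
enum mul_op { OP_IMUL, OP_FMUL };

/* True for every floating-point width, not only 32-bit float. */
static bool base_type_is_float(enum base_type type)
{
    return type == TYPE_FLOAT16 || type == TYPE_FLOAT || type == TYPE_DOUBLE;
}

/* The problematic pattern: only an exact 32-bit float match selects fmul,
 * so doubles and float16s fall through to imul. */
static enum mul_op pick_mul_exact_match(enum base_type type)
{
    return type == TYPE_FLOAT ? OP_FMUL : OP_IMUL;
}

/* The corrected pattern: classify the type family first. */
static enum mul_op pick_mul_by_family(enum base_type type)
{
    return base_type_is_float(type) ? OP_FMUL : OP_IMUL;
}
```

With these definitions, `pick_mul_exact_match(TYPE_DOUBLE)` yields the integer multiply, while `pick_mul_by_family(TYPE_DOUBLE)` yields the floating-point one.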
2018-03-14 | anv/entrypoints: VkGetDeviceProcAddr returns NULL for core instance commands | Iago Toral Quiroga | 1 | -1/+5
af5f2322d0c64 addressed this for extension commands, but the spec mandates this behavior also for core API commands. From the Vulkan spec, Table 2. vkGetDeviceProcAddr behavior:
    device   pname                       return
    ----------------------------------------------------------
    (..)
    device   core device-level command   fp
    (...)
See that it specifically states "device-level". Since the vk.xml file doesn't state if core commands are instance or device level, we identify device level commands as the ones that take a VkDevice, VkQueue or VkCommandBuffer as their first parameter.
Fixes test failures in new work-in-progress CTS tests.
Also see the public issue: https://github.com/KhronosGroup/Vulkan-LoaderAndValidationLayers/issues/2323
v2: - Include reference to github issue (Emil)
    - Rebased on top of Vulkan 1.1 changes.
v3: - Remove the not in the condition and switch the then/else cases (Jason)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
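The classification rule from the message above, stated as a standalone predicate. This is a sketch under assumptions: the function name `is_device_level_command` is hypothetical, and it works on the type name of a command's first parameter as a plain string rather than on anv's actual entrypoint generator data.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not anv code): a command is treated as device-level
 * when its first parameter is one of the three dispatchable handle types
 * named in the commit message. */
static bool is_device_level_command(const char *first_param_type)
{
    static const char *const device_level_types[] = {
        "VkDevice", "VkQueue", "VkCommandBuffer",
    };
    const size_t count =
        sizeof(device_level_types) / sizeof(device_level_types[0]);

    for (size_t i = 0; i < count; i++) {
        if (strcmp(first_param_type, device_level_types[i]) == 0)
            return true;
    }
    return false;
}
```

Under this rule a command whose first parameter is a VkInstance or VkPhysicalDevice is instance-level, so a device-level lookup for it would return NULL.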
2018-03-14 | anv/entrypoints: dispatches to VkQueue are device-level | Iago Toral Quiroga | 1 | -2/+7
v2: - Add trampoline functions (Jason)
    - Add an assertion for unhandled trampoline cases
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-14 | radv: drop assert on bindingDescriptorCount > 0 | Dave Airlie | 1 | -1/+0
The spec is pretty clear that this can be 0, and that it operates as a reserved binding.
Fixes: dEQP-VK.binding_model.descriptor_update.empty_descriptor.uniform_buffer
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-13 | sched.h needs to be imported on Darwin/OSX targets. | Apple SWE | 1 | -0/+4
sched_yield is used but the include reference on Darwin is missing. This patch conditionally guards on Darwin/OSX to import sched.h first.
Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
2018-03-13 | Add processor topology calculation implementation for Darwin/OSX targets. | Apple SWE | 1 | -0/+55
The implementation for bootstrapping SWR on Darwin targets is based on the Linux version. Instead of reading the output of /proc/cpuinfo, sysctlbyname is used to determine the physical identifiers, processor identifiers, core counts and thread-processor affinities. With this patch, it is possible to use SWR as an alternate renderer on OSX to softpipe and llvmpipe.
Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
2018-03-14 | r600: fix abs for op3 sources | Roland Scheidegger | 1 | -54/+56
If a src was referencing the same temp as the dst, the per-component copy code didn't work. e.g.
    cndge r0.xy, r0.xx, |r2|, r3
got expanded into
    mov r12.x, |r2|
    cndge r0.x, r0.x, r12, r3
    mov r12.y, |r2|
    cndge r0.y, r0.x, r12, r3
hence for the second cndge r0.x was mistakenly the previous cndge result. Fix this by doing all the movs first, so there's no bogus alu.last in between.
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=102905
Tested-by: <iive@yahoo.com>
Reviewed-by: Dave Airlie <airlied@gmail.com>
2018-03-14 | radv: mark all tess output for an indirect access. | Dave Airlie | 1 | -8/+13
If a shader does a tcs store with an indirect access, we were only marking the first spot as used. For indirect access we always now mark all slots used by the variable.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464
Fixes: 94f9591995 (radv/ac: add support for TCS/TES inputs/outputs.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-14 | ac/nir: pass the nir variable through tcs loading. | Dave Airlie | 4 | -22/+15
I was going to have to add another parameter to this monster, so we should just pass the nir_variable in; I can't find any reason this would be a bad idea. This is needed for the next fix.
Fixes: 94f9591995 (radv/ac: add support for TCS/TES inputs/outputs.)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-14 | radv: get correct offset into LDS for indexed vars. | Dave Airlie | 1 | -1/+1
This seems more correct to me, since if we have an array of floats they'll be vec4 aligned, and if we do af[2], we want the const index to increase by 2 slots in the non compact case.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464
Fixes: 94f9591995 (radv/ac: add support for TCS/TES inputs/outputs.)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
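The arithmetic behind the af[2] example can be sketched like this. It is an illustration only, not radv code: the function names are hypothetical, only the non-compact case from the message is modeled, and the 4-dwords-per-slot figure is an assumption that follows from each array element occupying its own vec4-aligned slot.

```c
/* Assumption: each non-compact array element occupies one vec4 slot,
 * i.e. 4 dwords, as implied by "vec4 aligned" in the commit message. */
#define DWORDS_PER_SLOT 4u

/* Hypothetical helper (not radv code): slot of element `index` of an
 * array whose first element lives in `base_slot`. Indexing by N moves
 * N whole slots forward. */
static unsigned element_slot(unsigned base_slot, unsigned index)
{
    return base_slot + index;
}

/* Dword offset of that element from the start of the slot area. */
static unsigned element_dword_offset(unsigned base_slot, unsigned index)
{
    return element_slot(base_slot, index) * DWORDS_PER_SLOT;
}
```

For an array that starts at slot 3, af[2] lands in slot 5, which is dword offset 20 under the stated assumption.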
2018-03-13 | nir: lower_load_const_to_scalar fix for 8/16b types | Rob Clark | 1 | -4/+15
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-13 | Update the documentation for meson | Dylan Baker | 1 | -13/+23
Meson is pretty well tested and works in most configurations now, so we can remove the warning about it being unsuited for actual use. It's also worth documenting that meson 0.42.0 or greater is required.
v2: - Minor rewording of supported platforms as suggested by Emil
    - Add two missing tags as reported by xmllint --html
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
2018-03-13 | ac/nir: Use lower_vote_eq_to_ballot instead of ac_nir_lower_subgroups | Jason Ekstrand | 6 | -99/+1
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 | nir/subgroups: Add lowering for vote_ieq/vote_feq to a ballot | Jason Ekstrand | 2 | -0/+49
This is based heavily on 97f10934edf8ac, "ac/nir: Add vote_ieq/vote_feq lowering pass." from Bas Nieuwenhuizen. This version is a bit more general since it's in common code. It also properly handles NaN due to not flipping the comparison for floats.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 | meson: don't use compiler.has_header | Dylan Baker | 1 | -1/+1
Meson's compiler.has_header is completely useless, it only checks that a header exists, not whether it's usable. This creates problems if a header contains a conditional #error declaration, like so:
> #if __x86_64__
> # error "Doesn't work with x86_64!"
> #endif
Compiler.has_header will return true in this case, even when compiling for x86_64. This is useless. Instead, we'll do a compile check so that any #error declarations will be treated as errors, and compilation will work.
Fixes compilation on x32 architecture.
Gentoo Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=649746
meson bug: https://github.com/mesonbuild/meson/issues/2246
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-13 | i965: Emit texture cache invalidates around blorp_copy | Jason Ekstrand | 1 | -0/+15
This is a terrible hack but it fixes CTS regressions. It's still incredibly unclear exactly what is going wrong in the hardware to cause this to be an issue so this isn't a good fix by any means. However, it does fix tests so there is that.
Fixes: fb0e9b5197 "i965: Track the depth and render caches separately"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103746
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-13 | broadcom/vc4: Fix simulator since the perfmon change. | Eric Anholt | 1 | -0/+1
It would be nice to support perfmon with simulator, and might be a useful tool for regression testing performance (since the simulator would be deterministic).
2018-03-13 | spirv: Silence compiler warning about undefined srcs[0] | Eric Anholt | 1 | -0/+1
v2: Use assume() at the srcs[] definition instead.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-03-13 | ac/nir: rename radeon_llvm_reg_index_soa() to ac_llvm_reg_index_soa() | Samuel Pitoiset | 3 | -12/+12
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 | ac/nir: remove some unnecessary includes and declarations | Samuel Pitoiset | 2 | -9/+1
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 | ac/nir: drop radv prefix from radv_lower_gather4_integer() | Samuel Pitoiset | 1 | -4/+4
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 | ac/nir: move ac_nir_compiler_options and friends to radv folder | Samuel Pitoiset | 7 | -89/+89
Also replace ac_ by radv_.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 | ac: move ac_shader_info to radv folder | Samuel Pitoiset | 11 | -99/+63
This is RADV specific code.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 | ac/nir: move ac_shader_variant_info and friends to radv folder | Samuel Pitoiset | 7 | -136/+139
Also replace ac_ by radv_.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>