Age | Commit message | Author | Files | Lines
2017-03-20 | glxglvnddispatch: Add missing dispatch for GetDriverConfig (HEAD, master) | Hans de Goede | 2 | -0/+19
Together with some fixes to xdriinfo, this fixes xdriinfo not working with glvnd. Since apps (xdriinfo) expect GetDriverConfig to work without needing to go through the dance to set up a glxcontext (which is a reasonable expectation IMHO), the dispatch for this ends up significantly different than any other dispatch function. This patch gets the job done, but I'm not really happy with how this patch turned out; suggestions for a better fix are welcome. Cc: Kyle Brenneman <kbrenneman@nvidia.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2017-03-20 | anv/genX: Solve the vkCreateGraphicsPipelines crash | Xu,Randy | 1 | -2/+2
The crash is due to NULL pColorBlendState, which is legal if the pipeline has rasterization disabled or if the subpass of the render pass the pipeline is created against does not use any color attachments. Test: Sample subpasses from LunarG can run without crash Signed-off-by: Xu,Randy <randy.xu@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-03-20 | radv: fix logic for when to flush on multiple CS emission | Dave Airlie | 1 | -8/+8
The current code always evaluated to true; we only want to flush on the first submit. Rename the variable to do_flush, and only emit on the first iteration. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-20 | spirv: Implement IsInf using an integer comparison | Jason Ekstrand | 1 | -1/+1
Since we already do fabs on the one source, we're guaranteed to get positive infinity if we get any infinity at all. Since +inf only has one IEEE 754 representation, we can use an integer comparison and avoid all of the ordered/unordered issues. Cc: Dave Airlie <airlied@redhat.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>
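The reasoning in this message can be illustrated outside the compiler. What follows is only a sketch in Python, not the NIR code from the commit: the helper names `f32_bits` and `is_inf` are made up for illustration, and fabs is modelled by clearing the sign bit of the 32-bit pattern.

```python
import struct

F32_POS_INF_BITS = 0x7F800000  # the only IEEE 754 binary32 encoding of +inf
F32_SIGN_MASK = 0x80000000


def f32_bits(x: float) -> int:
    """Return the binary32 bit pattern of x as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def is_inf(x: float) -> bool:
    # fabs(): clearing the sign bit maps -inf onto +inf.
    abs_bits = f32_bits(x) & ~F32_SIGN_MASK
    # Plain integer equality: there is no ordered/unordered distinction,
    # and every NaN pattern simply compares unequal to the +inf pattern.
    return abs_bits == F32_POS_INF_BITS
```

Because NaN has a non-zero mantissa, its bit pattern can never equal `0x7F800000`, which is why no special NaN handling is needed in this formulation.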
2017-03-20 | radv/meta: fix image clears for r4g4 format. | Dave Airlie | 1 | -0/+8
This just uses an 8-bit clear and packs the values. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>
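As a rough illustration of what "packs the values" could mean, here is a sketch in Python. It assumes the VK_FORMAT_R4G4_UNORM_PACK8 layout (R in bits 4..7, G in bits 0..3) and simple round-to-nearest UNORM conversion; the function names are hypothetical and this is not the radv code itself.

```python
def float_to_unorm4(x: float) -> int:
    """Convert a float in [0, 1] to a 4-bit UNORM value (0..15), clamping first."""
    clamped = min(max(x, 0.0), 1.0)
    return int(clamped * 15.0 + 0.5)


def pack_r4g4(r: float, g: float) -> int:
    """Pack two channels into one byte: R in the high nibble, G in the low nibble."""
    return (float_to_unorm4(r) << 4) | float_to_unorm4(g)
```

With the two channels folded into a single byte, the clear can be treated like a clear of an 8-bit single-channel image.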
2017-03-20 | Revert "radv: fallback to an in-memory cache when no pipline cache is provided" | Dave Airlie | 3 | -13/+6
This reverts commit 2845a108a9a8bd4b0e6e9b590c976452fb99eb10. This breaks VK-GL-CTS randomly. ./deqp-vk --deqp-case=dEQP-VK.texture.filtering.3d.formats.r4g4b4a4* bounces around here from 6/6 to 3/6 or 4/6 to hanging. Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-20 | mesa: disable glthread when glNewList() is called | Timothy Arceri | 1 | -1/+1
glNewList() swaps dispatch tables, and we don't have anything in place to handle that in glthread. Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2017-03-20 | radv: fix primitive reset index emission | Dave Airlie | 1 | -1/+1
This was meant to be checking the index type to get the correct index not the last emitted one. This fixes: dEQP-VK.pipeline.input_assembly.primitive_restart.index_type_uint32.triangle_strip_with_adjacency Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-20 | util/disk_cache: check rename result | Grazvydas Ignotas | 1 | -2/+6
I haven't seen this causing problems in practice, but for correctness we should also check if rename succeeded to avoid breaking accounting and leaving a .tmp file behind. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-20 | util/disk_cache: delete .tmp if target exists | Grazvydas Ignotas | 1 | -1/+3
At the time of the target file check, the .tmp file is already created and the file lock is held, so we should remove the .tmp, like in other error paths. With this, piglit no longer leaves a large number of empty .tmp files behind, which waste directory entries and may interfere with eviction. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-20 | util/disk_cache: fix stored_keys index | Grazvydas Ignotas | 1 | -2/+2
It seems there is a bug because:
- 20 bytes are compared, but only 1 byte stored_keys step is used
- entries can overlap each other by 19 bytes
- index_mmap is ~1.3M in size, but only first 64K is used
With this fix for Deus Ex:
- startup time (from launch to Feral logo): ~38s -> ~16s
- disk_cache_has_key() hit rate: ~50% -> ~96%
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
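The indexing problem described here can be sketched in a few lines of Python. This is an illustration under assumptions, not the Mesa code: the `KeyIndex` class and the choice of deriving the slot from the first two key bytes are hypothetical. Only the 20-byte key size and the 64K entry count follow from the message (64K * 20 bytes is about 1.3M).

```python
KEY_SIZE = 20            # bytes compared per key
INDEX_ENTRIES = 1 << 16  # 64K slots; 65536 * 20 = 1310720 bytes (~1.3M)


class KeyIndex:
    def __init__(self) -> None:
        self.stored_keys = bytearray(INDEX_ENTRIES * KEY_SIZE)

    @staticmethod
    def _slot(key: bytes) -> int:
        # Hypothetical slot selection: the first two key bytes give 0..65535.
        return int.from_bytes(key[:2], "little")

    def put_key(self, key: bytes) -> None:
        # Step by KEY_SIZE bytes per slot. Stepping by 1 byte would let
        # neighbouring entries overlap by 19 bytes and use only the first 64K.
        off = self._slot(key) * KEY_SIZE
        self.stored_keys[off:off + KEY_SIZE] = key

    def has_key(self, key: bytes) -> bool:
        off = self._slot(key) * KEY_SIZE
        return bytes(self.stored_keys[off:off + KEY_SIZE]) == key
```

With the slot index scaled by the key size, two keys that land in adjacent slots no longer clobber each other, which is consistent with the jump in hit rate reported above.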
2017-03-19 | nv30: create uploader after pipe->screen is set | Ilia Mirkin | 1 | -6/+6
Fixes crashes after recent upload rework. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-18 | nv50,nvc0: enable TEX_LZ and TXF_LZ | Ilia Mirkin | 3 | -4/+17
There should be minimal gain, if any, for nvc0, but nv50 may end up noticing more often that the lod argument is uniform. This, in turn, will remove the need for some unnecessary transformations, which were being hit due to the checks being done pre-ssa. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-18 | st/mesa: set result writemask based on ir type | Ilia Mirkin | 1 | -0/+1
This prevents textureQueryLevels, which maps as LODQ, from ending up with a xyzw writemask, which is illegal. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100061 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-18 | nvc0/ir: treat FMA like MAD for operand propagation | Karol Herbst | 1 | -0/+1
Helps mainly Feral-ported games, due to their use of fma()
shader-db changes:
total instructions in shared programs : 3901147 -> 3842505 (-1.50%)
total gprs used in shared programs : 471258 -> 467359 (-0.83%)
total local used in shared programs : 27405 -> 27361 (-0.16%)
total bytes used in shared programs : 35749888 -> 35214176 (-1.50%)
         local    gpr   inst  bytes
helped      17   1829   4091   4091
hurt         4     44      3      3
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2017-03-18 | util/disk_cache: pass predicate functions file stats directly (v4) | Alan Swanson | 1 | -34/+21
Since switching to LRU eviction the only user of these predicate functions now resolves directory entry stats itself so pass them directly saving calling fstat and strlen twice (and the expensive strlen is skipped entirely if access time is newer).
v2: Update for empty cache dir detection changes
v3: Fix passing string length to predicate with the +1 for NULL termination and also pass sb as pointer
v4: Missed ampersand for passing sb as pointer
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-18 | glsl: use set for copy propagation kills | Timothy Arceri | 1 | -37/+28
Previously each time we saw a variable we just created a duplicate entry in the list. This is particularly bad for loops where we add everything twice, and then throw nested loops into the mix and the list was growing exponentially. This stops the glsl-vs-unroll-explosion test which has 16 nested loops from reaching the test's mem usage limit in this pass. The test now hits the mem limit in opt_copy_propagation_elements() instead. I suspect this was also part of the reason this pass can be so slow with some shaders. Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
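The growth pattern can be reproduced with a toy model. The Python sketch below is not the GLSL pass: it only assumes, as the message states, that each loop body is effectively processed twice, and it uses a hypothetical single variable `x` written in the innermost body.

```python
def walk_loops(depth, record_kill):
    """Toy walk over `depth` nested loops; each loop body is visited twice."""
    if depth == 0:
        record_kill("x")  # innermost body assigns the (hypothetical) variable x
        return
    for _ in range(2):
        walk_loops(depth - 1, record_kill)


def kills_as_list(depth):
    kills = []
    walk_loops(depth, kills.append)  # every visit appends a duplicate entry
    return kills


def kills_as_set(depth):
    kills = set()
    walk_loops(depth, kills.add)     # duplicates collapse into one entry
    return kills
```

In this model, 16 nested loops leave 2^16 = 65536 duplicate entries in the list but a single entry in the set.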
2017-03-18 | st/dri: wait for thread to finish before unbinding context | Timothy Arceri | 1 | -0/+3
Fixes a bunch of piglit crashes that hit an assert() when trying to delete the framebuffer. The assert() was triggered because WinSysDrawBuffer was set to NULL before glDeleteFramebuffers() was called. Tested-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-18 | glsl: don't leak memory when trying to count loop iterations | Timothy Arceri | 1 | -2/+3
Suggested-by: Damian Dixon <damian.dixon@gmail.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99789
2017-03-17 | genxml: Make MI_STORE_DATA_IMM have a single 64-bit data field | Jason Ekstrand | 6 | -12/+6
This is way more convenient than having two separate dword fields. Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 | anv: Turn on inherited queries | Jason Ekstrand | 1 | -1/+1
It all just works since it's just a hardware register so we might as well turn it on. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 | anv: Implement pipeline statistics queries | Ilia Mirkin | 4 | -12/+226
In the end, pipeline statistics queries look a lot like occlusion queries only with between 1 and 11 begin/end pairs being generated instead of just the one. Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 | anv: Disable VF statistics for blorp and SOL memcpy | Jason Ekstrand | 4 | -3/+18
In order to get accurate statistics, we need to disable statistics for blits, clears, and the surface state memcpy at the top of each secondary command buffer. There are two possible approaches to this:
1) Disable before the blit/memcpy and re-enable afterwards
2) Move emitting 3DSTATE_VF_STATISTICS from initialization and make it part of pipeline state and then just disable statistics before blits and memcpy operations.
Emitting 3DSTATE_VF_STATISTICS should be fairly cheap so it doesn't really matter which path we take. We choose the second option as it's more consistent with the way the rest of the statistics are enabled and disabled.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 | anv/pipeline: Enable clipper statistics | Jason Ekstrand | 1 | -0/+1
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 | genxml: s/Clipper Statistics Enable/Statistics Enable/ | Jason Ekstrand | 5 | -5/+5
It's in 3DSTATE_CLIP, so it doesn't really need the extra detail. This matches what we do for VS, FS, etc. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 | anv/query: Rework store_query_result | Jason Ekstrand | 1 | -15/+24
The new version is a nice GPU parallel to cpu_write_query_result and it nicely handles things like dealing with 32 vs. 64-bit offsets in the destination buffer. Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 | anv/query: Break GPU query calculation into a helper | Jason Ekstrand | 1 | -12/+18
Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 | genxml: Add pipeline statistics registers on gen7+ | Jason Ekstrand | 4 | -0/+176
Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 | anv/query: Add a helper for writing a query pool result | Jason Ekstrand | 1 | -16/+17
Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 | anv/query: Use a variable-length slot size | Jason Ekstrand | 2 | -28/+33
Not all queries are the same. Even the two queries we support today require a different amount of data per slot. Once we introduce pipeline statistics queries, the size will vary wildly. Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 | anv/query: Move the available bits to the front | Jason Ekstrand | 2 | -28/+19
We're about to make slots variable-length and always having the available bits at the front makes certain operations substantially easier once we do that. Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
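To see why a leading availability word helps once slots vary in length, consider the following Python sketch. The layout is an assumption made for illustration (one 64-bit availability word followed by a per-query-type number of 64-bit values), and the helper names are hypothetical rather than taken from the anv sources.

```python
import struct


def slot_stride(values_per_query: int) -> int:
    """Bytes per slot: one 64-bit availability word, then the 64-bit values."""
    return 8 * (1 + values_per_query)


def write_slot(buf: bytearray, query: int, values: list) -> None:
    stride = slot_stride(len(values))
    # Availability always sits at offset 0 of the slot, whatever its length.
    struct.pack_into(f"<{1 + len(values)}Q", buf, query * stride, 1, *values)


def read_slot(buf: bytearray, query: int, values_per_query: int):
    stride = slot_stride(values_per_query)
    available, *values = struct.unpack_from(
        f"<{1 + values_per_query}Q", buf, query * stride)
    return bool(available), values
```

Code that only needs to test or reset availability can then address `query * stride` directly, without knowing how many values follow in the slot.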
2017-03-17 | anv/query: Let 32-bit values wrap | Jason Ekstrand | 1 | -2/+0
From the Vulkan 1.0.39 Specification: "If VK_QUERY_RESULT_64_BIT is not set and the result overflows a 32-bit value, the value may either wrap or saturate." So we can either clamp or wrap. Wrapping is both easier and what the user gets if they use vkCmdCopyQueryPoolResults and we should be consistent. We could make vkCmdCopyQueryPoolResults clamp but it's annoying and ends up burning extra batch for something the spec clearly doesn't require. Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
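The two behaviours the specification allows differ only in how a 64-bit result is narrowed. A minimal Python sketch, with hypothetical helper names:

```python
U32_MAX = 0xFFFFFFFF


def wrap_u32(value: int) -> int:
    """Keep only the low 32 bits, which is what a plain 32-bit store does."""
    return value & U32_MAX


def saturate_u32(value: int) -> int:
    """Clamp to the largest 32-bit value instead of wrapping."""
    return min(value, U32_MAX)
```

Wrapping falls out of simply writing the low dword, whereas saturating needs an extra comparison, which matches the message's point that clamping on the GPU would cost extra batch space.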
2017-03-17 | radeonsi: add new polaris12 pci id | Alex Deucher | 1 | -0/+1
Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: 17.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-17 | gallium/radeon: formalize that create_batch_query doesn't need pipe_context | Marek Olšák | 3 | -13/+12
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-17 | gallium/radeon: formalize that create_query doesn't need pipe_context | Marek Olšák | 3 | -32/+32
for threaded gallium Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-17 | gallium/radeon: reference pipe_resource in pipe_transfer | Marek Olšák | 2 | -2/+5
for threaded gallium Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-17 | radeonsi: compile all TGSI compute shaders asynchronously | Marek Olšák | 1 | -44/+81
required by threaded gallium Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-17 | radeonsi: require that compiler threads are enabled | Marek Olšák | 2 | -11/+13
threaded gallium can't use pipe_context's LLVM target machine, because create_shader_selector can be called from a non-driver thread. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-17 | trace: remove leftover assertions after pipe_resource wrapping removal | Marek Olšák | 1 | -6/+0
2017-03-17 | gallium/u_upload: make the first persistent mapping unsynchronized | Marek Olšák | 1 | -0/+1
This is simpler for drivers.
2017-03-17 | anv/device: init timestampPeriod from devinfo | Robert Bragg | 1 | -3/+1
Now that there's a timebase_scale in gen_device_info which is effectively the 'period' this switches anv_GetPhysicalDeviceProperties to using this common device info to initialize the timestampPeriod device limit. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-17 | i965: Allow a per gen timebase scale factor | Robert Bragg | 6 | -27/+114
Prior to Skylake the Gen HW timestamps were driven by a 12.5MHz clock with the convenient property of being able to scale by an integer (80) to nanosecond units. For Skylake the frequency is 12MHz, or a scale factor of 83.333333. This updates gen_device_info to track a floating point timebase_scale factor and makes corresponding _queryobj.c changes to no longer assume a scale factor of 80 works across all gens. Although the gen6_ code could have been left alone, the changes keep the code more comparable, and it now shares a few utility functions for scaling raw timestamps and calculating deltas. The utility for calculating deltas takes into account 32 or 36-bit overflow depending on the current kernel version. Note: this leaves the timestamp handling of ARB_query_buffer_object untouched, which continues to use an incorrect scale of 80 on Skylake for now. This is more awkward to solve since the scaling is currently done using a very limited uint64 ALU available to the command parser that doesn't support multiply or divide, where it's already taking a large number of instructions just to effectively multiply by 80. This fixes piglit arb_timer_query-timestamp-get on Skylake. v2: (Ken) Update timebase_scale for platforms past Skylake/Broxton too. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
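The arithmetic behind the scale factors, and the wrap-aware delta mentioned above, can be checked with a short Python sketch. The function names are hypothetical and this is not the i965 code; the clock frequencies and the 32/36-bit counter widths come from the message.

```python
def timebase_scale(clock_hz: float) -> float:
    """Nanoseconds per timestamp tick for a given clock frequency."""
    return 1e9 / clock_hz


def raw_timestamp_delta(t0: int, t1: int, bits: int) -> int:
    """Tick delta between two raw timestamps from a counter that wraps at `bits` bits."""
    return (t1 - t0) & ((1 << bits) - 1)


def ticks_to_ns(ticks: int, scale: float) -> int:
    """Scale raw ticks to nanoseconds, rounding to the nearest integer."""
    return round(ticks * scale)
```

A 12.5MHz clock gives exactly 80 ns per tick, while 12MHz gives 83.333... ns. Applying the old factor of 80 to one second of 12MHz ticks yields 960 ms, a 4% error.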
2017-03-17 | anv/device: Remove a use of a compound literal | Jason Ekstrand | 1 | -1/+1
Older versions of GCC don't like compound literals in static const variable declarations because they don't think it's an actual constant value. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 | i965: bounds checks while concatenating sysfs paths | Robert Bragg | 1 | -11/+32
This adds some missing return value checks for all uses of snprintf in brw_performance_query.c. This also switches a use of strncpy + strncat for snprintf for consistency and to avoid the chance of the strncpy leaving an unterminated string in the dest buffer if the src is too long. This issue with strncpy was picked up by Coverity. CID: 1402201 Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 | mesa: automake: add all headers to the tarball. | Emil Velikov | 1 | -0/+2
Fixes: d8d81fbc316 ("mesa: Add infrastructure for a worker thread to process GL commands.") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-17 | mapi: automake: add all python scripts to EXTRA_DIST | Emil Velikov | 1 | -0/+3
Otherwise it'll be missing in the tarball and make distcheck will fail. Fixes: 05dd4a1104e ("glapi: Generate GL API marshalling code from the XML.") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-17 | glapi: avoid using $< in non-suffix make rules | Jonathan Gray | 1 | -2/+2
Using $< in non-suffix make rules is a GNU extension. Explicitly use the name of the python script to fix the build on OpenBSD. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.velikov@collabore.com>
2017-03-17 | radv/ac: Fix shared memory offset calculation | Alex Smith | 1 | -1/+1
The index passed to get_shared_memory_ptr is an attribute slot index, i.e. the index of a vec4 within LDS. Therefore this must be scaled by sizeof(vec4) to give the LDS byte offset. Fixes: f4e499ec791 ("radv: add initial non-conformant radv vulkan driver") Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> CC: <mesa-stable@lists.freedesktop.org>
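The scaling described here is simple arithmetic. A sketch in Python, assuming a vec4 of four 32-bit components (16 bytes) and using a hypothetical helper name:

```python
VEC4_SIZE_BYTES = 4 * 4  # four 32-bit components per attribute slot


def lds_byte_offset(slot_index: int) -> int:
    """Convert an attribute slot index (in vec4 units) to a byte offset in LDS."""
    return slot_index * VEC4_SIZE_BYTES
```

Using the slot index directly as a byte offset would place slot 1 at byte 1 instead of byte 16, so consecutive slots would overlap.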
2017-03-17 | radv: Fix using more than 4 bound descriptor sets | James Legg | 1 | -1/+3
Avoid a buffer overflow in ac_nir_to_llvm.c's create_function when using more than 4 descriptor sets. radv claims support for 8. Cc: 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-17 | util/build-id: check dlpi_name before strstr call | Tapani Pälli | 1 | -0/+6
According to dl_iterate_phdr man page first object visited is the main program where dlpi_name is an empty string. This fixes segfault on Android when using build-id as identifier. Fixes: d4fa083e11f ("util: Add utility build-id code.") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>