Age | Commit message | Author | Files | Lines
2017-03-16 | i965: Replace OPT_V() with OPT(). (wip/nir-optimization-progress) | Matt Turner | 1 | -23/+19
We want to be able to check the progress of each pass and dump the NIR for debugging purposes if it changed.
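The idea of running each pass through a wrapper that observes its progress can be sketched as follows. This is a minimal illustration and not Mesa's actual OPT() macro: `struct shader`, `halve_if_even`, `run_pass`, and `RUN_PASS` are hypothetical names, and the debug dump is reduced to a single `fprintf`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a shader; the real code operates on NIR. */
struct shader {
   int value;
};

typedef bool (*pass_fn)(struct shader *);

/* A pass returns true only if it actually changed the shader. */
static bool
halve_if_even(struct shader *s)
{
   assert(s != NULL);
   if (s->value == 0 || s->value % 2 != 0)
      return false;
   s->value /= 2;
   return true;
}

/* Run one pass and, if it made progress, dump the state for debugging. */
static bool
run_pass(struct shader *s, pass_fn pass, const char *name)
{
   bool progress = pass(s);
   if (progress)
      fprintf(stderr, "%s made progress, value is now %d\n", name, s->value);
   return progress;
}

/* Stringify the pass name so the dump can say which pass ran. */
#define RUN_PASS(s, pass) run_pass((s), (pass), #pass)

/* Repeat until no pass reports progress; returns the number of rounds. */
static int
optimize(struct shader *s)
{
   int rounds = 0;
   bool progress;
   do {
      progress = false;
      progress |= RUN_PASS(s, halve_if_even);
      rounds++;
   } while (progress);
   return rounds;
}
```

A pass that returns void cannot take part in this scheme, which is presumably why the commits that follow make each pass return a bool.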
2017-03-16 | i965/fs: Return progress from demote_sample_qualifiers(). | Matt Turner | 1 | -1/+6
2017-03-16 | i965/fs: Return progress from move_interpolation_to_top(). | Matt Turner | 1 | -1/+6
And mark as static at the same time.
2017-03-16 | i965: Return progress from brw_nir_lower_uniforms(). | Matt Turner | 1 | -3/+3
2017-03-16 | nir: Return progress from nir_convert_from_ssa(). | Matt Turner | 2 | -7/+16
2017-03-16 | nir: Return progress from nir_lower_io(). | Matt Turner | 2 | -6/+15
2017-03-16 | nir: Return progress from nir_lower_regs_to_ssa(). | Matt Turner | 2 | -6/+10
And from nir_lower_regs_to_ssa_impl() as well.
2017-03-16 | nir: Return progress from nir_lower_samplers(). | Matt Turner | 2 | -12/+19
2017-03-16 | nir: Return progress from nir_lower_atomics(). | Matt Turner | 2 | -7/+13
2017-03-16 | nir: Return progress from nir_lower_clamp_color_outputs(). | Matt Turner | 2 | -10/+22
2017-03-16 | nir: Return progress from nir_lower_clip_fs(). | Matt Turner | 2 | -3/+5
2017-03-16 | nir: Return progress from nir_lower_clip_vs(). | Matt Turner | 2 | -5/+7
2017-03-16 | nir: Return progress from nir_move_vec_src_uses_to_dest(). | Matt Turner | 2 | -6/+18
2017-03-16 | nir: Return progress from nir_lower_to_source_mods(). | Matt Turner | 2 | -6/+29
2017-03-16 | nir: Return progress from nir_lower_clip_cull_distance_arrays(). | Matt Turner | 2 | -5/+14
2017-03-16 | nir: Return progress from nir_lower_var_copies(). | Matt Turner | 2 | -4/+16
2017-03-16 | nir: Return progress from nir_lower_load_const_to_scalar(). | Matt Turner | 2 | -7/+17
2017-03-16 | nir: Return progress from nir_lower_64bit_pack(). | Matt Turner | 2 | -4/+12
2017-03-16 | nir: Return progress from nir_lower_doubles(). | Matt Turner | 2 | -15/+21
2017-03-16 | nir: Return progress from nir_lower_vars_to_ssa(). | Matt Turner | 2 | -3/+7
2017-03-16 | nir: Fix syntax. | Matt Turner | 2 | -6/+6
et is not an abbreviation.
2017-03-16 | nir: Fix misspellings. | Matt Turner | 4 | -7/+7
2017-03-16 | nir: Stop using apostrophes to pluralize. | Matt Turner | 21 | -43/+43
2017-03-09 | i965: Rename brw_format_for_mesa_format() to brw_isl_format_for_mesa_format() | Anuj Phogat | 7 | -14/+14
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-09 | i965: Add more Haswell OA metrics sets | Robert Bragg | 1 | -1/+3403
This extends the brw_oa_hsw.xml to expose these additional queries:
- Compute Metrics Basic Gen7.5
- Compute Metrics Extended Gen7.5
- Memory Reads Distribution Gen7.5
- Memory Writes Distribution Gen7.5
- Metric set Sampler Balance
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-09 | i965: Expose OA counters via INTEL_performance_query | Robert Bragg | 2 | -13/+1158
This adds support for exposing basic Observation Architecture performance counters on Haswell. This support is based on the i915 perf kernel interface which is used to configure the OA unit, allowing Mesa to emit MI_REPORT_PERF_COUNT commands around queries to collect counter snapshots. To take into account the small chance that some of the 32bit counters could wrap around for long queries (~50 milliseconds for a GT3 Haswell @ 1.1GHz) the implementation also collects periodic metrics. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
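The reason periodic snapshots help with 32-bit wrap-around can be sketched as follows. This is an illustration under stated assumptions, not the code from this commit: `accumulate_counter` is a hypothetical helper, and it assumes the counter wraps at most once between consecutive snapshots, which is what sampling more often than the wrap period is meant to guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum the deltas between consecutive 32-bit counter snapshots into a
 * 64-bit total. Unsigned subtraction is modulo 2^32, so a single wrap
 * between two snapshots still yields the correct delta. */
static uint64_t
accumulate_counter(const uint32_t *snapshots, size_t count)
{
   assert(snapshots != NULL || count == 0);

   uint64_t total = 0;
   for (size_t i = 1; i < count; i++)
      total += (uint32_t)(snapshots[i] - snapshots[i - 1]);
   return total;
}
```

With only a start and an end snapshot, two or more wraps in between would be indistinguishable from none, so the total would silently be too small.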
2017-03-09 | exec_list: Add a foreach_list_typed_from macro | Robert Bragg | 1 | -0/+5
This allows iterating list nodes from a given start point instead of necessarily the list head. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
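A typed "iterate from this node" macro can be sketched as follows. This is a simplified, NULL-terminated singly linked list for illustration only; Mesa's exec_list uses a different node layout, and `struct node`, `struct item`, `node_data`, `foreach_typed_from`, `chain_items`, and `sum_from` are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Intrusive list node embedded in the containing structure. */
struct node {
   struct node *next;
};

struct item {
   int value;
   struct node link;
};

/* Recover the containing structure from a pointer to its embedded node. */
#define node_data(type, n, field) \
   ((type *)((char *)(n) - offsetof(type, field)))

/* Iterate typed entries beginning at `start` rather than at the list head. */
#define foreach_typed_from(type, var, field, start)                          \
   for (type *var = (start) ? node_data(type, start, field) : NULL;          \
        var != NULL;                                                         \
        var = var->field.next ? node_data(type, var->field.next, field)      \
                              : NULL)

/* Link an array of items into a list in array order. */
static void
chain_items(struct item *items, size_t count)
{
   assert(items != NULL && count > 0);
   for (size_t i = 0; i < count; i++)
      items[i].link.next = (i + 1 < count) ? &items[i + 1].link : NULL;
}

/* Sum the values of all items from `start` to the end of the list. */
static int
sum_from(struct node *start)
{
   int sum = 0;
   foreach_typed_from(struct item, it, link, start)
      sum += it->value;
   return sum;
}
```

Passing the node of the third item visits only the third and later items, which is the behavior the commit message describes.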
2017-03-09 | i965: Add script to gen code for OA counter queries | Robert Bragg | 3 | -2/+575
Avoiding lots of error prone boilerplate and easing our ability to add + maintain support for multiple OA performance counter queries for each generation:

This adds a python script to generate code for building up performance_queries from the metric sets and counters described in brw_oa_hsw.xml as well as functions to normalize each counter based on the RPN expressions given.

Although the XML file currently only includes a single metric set, the code generated assumes there could be many sets.

The metrics as described in XML get translated into C structures which are registered in a brw->perfquery.oa_metrics_table hash table keyed by the GUID of the metric set in XML.

v2: numerous python style improvements (Dylan)
v3: Makefile.am fixups (Emil)
v4: Pattern rule for codegen + orthogonal .c and .h rules (Robert)

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
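To make the RPN part concrete, here is a small stack-based evaluator. It is only a sketch: the script described above generates C code ahead of time rather than interpreting expressions at run time, and the token layout (`enum tok_kind`, `struct tok`), the function `eval_rpn`, and the choice to return 0 on division by zero are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum tok_kind { TOK_VALUE, TOK_ADD, TOK_SUB, TOK_MUL, TOK_DIV };

struct tok {
   enum tok_kind kind;
   uint64_t value; /* only meaningful for TOK_VALUE */
};

/* Evaluate an RPN token sequence: operands are pushed, and each operator
 * pops two operands and pushes the result. */
static uint64_t
eval_rpn(const struct tok *toks, size_t count)
{
   uint64_t stack[16];
   size_t depth = 0;

   for (size_t i = 0; i < count; i++) {
      if (toks[i].kind == TOK_VALUE) {
         assert(depth < 16);
         stack[depth++] = toks[i].value;
         continue;
      }

      assert(depth >= 2);
      uint64_t rhs = stack[--depth];
      uint64_t lhs = stack[--depth];
      uint64_t result = 0;

      switch (toks[i].kind) {
      case TOK_ADD: result = lhs + rhs; break;
      case TOK_SUB: result = lhs - rhs; break;
      case TOK_MUL: result = lhs * rhs; break;
      case TOK_DIV: result = rhs ? lhs / rhs : 0; break; /* assumed policy */
      default: assert(!"unknown operator"); break;
      }
      stack[depth++] = result;
   }

   assert(depth == 1);
   return stack[0];
}
```

For example, the sequence `600 100 MUL 1200 DIV` computes a percentage of 600 out of 1200. Generating a C function per counter, as the script does, avoids this interpretation cost at query time.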
2017-03-09 | i965: extend query/counter structs for OA queries | Robert Bragg | 2 | -1/+21
In preparation for generating code from brw_oa_hsw.xml for describing OA performance counter queries, this adds some OA-specific members to brw_perf_query that our generated code will initialize:
- The oa_metric_set_id is the ID we will pass to DRM_IOCTL_I915_PERF_OPEN; it is read via sysfs from /sys/class/drm/<card>/metrics/<guid>/id
- The oa_format is the OA report layout we will request from the kernel
- The accumulator offsets determine where the different groups of A, B and C counters are located within an intermediate 64bit 'accumulator' buffer.

Additionally, brw_perf_query_counter now has 64bit or float _read() callback members for OA counters.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-09 | i965: brw_context.h additions for OA unit query codegen | Robert Bragg | 1 | -0/+21
In preparation for generating code from the XML performance counter metadata, this makes some additions to brw_context.h for this code to be able to reference.

It adds a brw->perfquery.oa_metrics_table hash table for indexing built up query descriptions by the GUID that is expected to be advertised by the kernel (via sysfs) to be able to use that query.

It adds an 'OA_COUNTERS' brw_query_kind to be assigned to queries built up by generated code.

It adds a brw->perfquery.sys_vars structure to have a consistent place to represent the different system variables like $EuCoresTotalCount and $EuSlicesTotalCount that are referenced by OA counter normalization equations.

Although extending + referencing gen_device_info for these variables was considered, these are some of the (mostly minor) reasons for going with a dedicated structure:
- Currently we only need this info for the performance_query backend and it might be a bit tedious to go back and initialize the state for pre-Haswell devinfo structures.
- For $SubsliceMask, the requirement for how multiple per-slice masks are packed comes only from how the variables are referenced by availability tests in XML, and might not be a good general representation for tracking subslice masks if another use case arises.
- If we used gen_device_info then we'd likely want to avoid making assumptions about the C types during codegen and adding explicit casts, while that's not necessary with a dedicated struct with all members being uint64_t.
- This structure and the code for initializing it is currently shared (just through copy & paste) with a few other projects dealing with OA counters, and that's been convenient so far.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-09 | i965: XML description of Haswell OA metric set | Robert Bragg | 1 | -0/+998
In preparation for exposing Gen Observation Architecture performance counters via INTEL_performance_query this adds an XML description for an initial 'Render Metrics Basic Gen7.5' query and corresponding counters. The intention is to auto generate code for building a query from these counters as well as the code for normalizing the individual counters. Note that the upstream for this XML data is currently GPU Top: https://github.com/rib/gputop The files are maintained under gputop-data/ and they are themselves derived from files in an internal 'MDAPI XML' schema. There are scripts under gputop-scripts/ and make rules in gputop-data/Makefile.xml for maintaining these files. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-09 | nv50/ir: check for origin insn in findOriginForTestWithZero | Pierre Moreau | 1 | -0/+2
Function arguments do not have an "origin" instruction, causing a NULL-pointer dereference without this check. Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-09 | mesa/main: make use of lookup_samplerobj_locked() | Samuel Pitoiset | 1 | -11/+1
There is no need to check sampler == 0 twice. This removes now unused _mesa_lookup_samplerobj_locked(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-09 | mesa/main: inline {begin,end}_samplerobj_lookups() | Samuel Pitoiset | 1 | -16/+2
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-09 | glsl/blob: clear padding bytes | Grazvydas Ignotas | 1 | -3/+6
Since blob is intended for serializing data, it's not a good idea to leave padding holes with uninitialized data, which may leak heap contents and hurt compression if the blob is later compressed, like done by shader cache. Clear it. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
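Zeroing the padding when aligning a write can be sketched as follows. This is a fixed-size buffer for illustration, not the blob implementation: `struct buf`, `buf_align`, and `buf_write` are hypothetical names, and the alignment is assumed to be a power of two.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-capacity serialization buffer. */
struct buf {
   uint8_t data[64];
   size_t size;
};

/* Advance `size` to the next multiple of `alignment`, zero-filling the gap
 * so the serialized output never contains uninitialized bytes. */
static void
buf_align(struct buf *b, size_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

   size_t aligned = (b->size + alignment - 1) & ~(alignment - 1);
   assert(aligned <= sizeof(b->data));

   memset(b->data + b->size, 0, aligned - b->size);
   b->size = aligned;
}

/* Append `len` bytes at the requested alignment. */
static void
buf_write(struct buf *b, const void *src, size_t len, size_t alignment)
{
   buf_align(b, alignment);
   assert(b->size + len <= sizeof(b->data));
   memcpy(b->data + b->size, src, len);
   b->size += len;
}
```

Deterministic padding also means that two serializations of the same data compare equal byte for byte, in addition to the leak and compression concerns named in the commit message.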
2017-03-09 | util/disk_cache: fix size subtraction on 32bit | Grazvydas Ignotas | 1 | -3/+3
Negating size_t on 32bit produces a 32bit result. This was effectively adding values close to UINT_MAX to the cache size (the files are usually small) instead of intended subtraction. Fixes 'make check' disk_cache failures on 32bit. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
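The arithmetic behind this bug can be reproduced on any host with the sketch below. It is not the disk_cache code: `uint32_t` stands in for a 32-bit `size_t`, `shrink_buggy` and `shrink_fixed` are hypothetical names, and the fixed variant is one possible correction rather than necessarily the one applied in this commit.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy: the negation wraps within 32 bits and is then zero-extended, so
 * this adds (2^32 - file_size) to the 64-bit total instead of subtracting. */
static uint64_t
shrink_buggy(uint64_t total, uint32_t file_size)
{
   uint32_t negated = (uint32_t)(0u - file_size);
   return total + (uint64_t)negated;
}

/* One possible fix: widen to 64 bits before subtracting. */
static uint64_t
shrink_fixed(uint64_t total, uint32_t file_size)
{
   assert(total >= file_size);
   return total - (uint64_t)file_size;
}
```

With a total of 1000 and a file size of 100, the buggy version yields 1000 + 4294967196, while the fixed version yields 900. On a 64-bit target the same negation happens to work because `size_t` is already as wide as the accumulator.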
2017-03-09 | util/disk_cache: fix compressed size calculation | Grazvydas Ignotas | 1 | -1/+1
It incorrectly doubles the size on each iteration. Fixes: 85a9b1b5 "util/disk_cache: compress individual cache entries" Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-09 | glsl: builtin: always return clones of the builtins | Lionel Landwerlin | 3 | -8/+20
Builtins are created once and allocated using their own private ralloc context. When reparenting IR that includes builtins, we might steal bits of builtins. This is problematic because these builtins might now be freed when the shader that last includes them is disposed. This might also lead to inconsistent ralloc trees/lists if shaders are created on multiple threads.

Rather than including builtins directly into a shader's IR, we should include clones of them in the ralloc context of the shader that requires them. This fixes double free issues we've been seeing when running shader-db on a big multicore (72 threads) server.

v2: Also rename _mesa_glsl_find_builtin_function_by_name() to better reflect how this function is used. (Ken)
v3: Rename ctx to mem_ctx (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-08 | i965: Delete render ring prelude. | Kenneth Graunke | 2 | -10/+0
This was a hook I came up with when trying to do the initial performance counter work years ago. Nothing's used it for a long time, and the upcoming performance counter support doesn't want it either. So, goodbye render ring prelude.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-03-08 | swr: s/uint/enum pipe_render_cond_flag/ | Vinson Lee | 1 | -1/+1
Fix build error.

swr_context.cpp: In function ‘void swr_blit(pipe_context*, const pipe_blit_info*)’:
swr_context.cpp:336:44: error: invalid conversion from ‘uint {aka unsigned int}’ to ‘pipe_render_cond_flag’ [-fpermissive]
 ctx->render_cond_mode);
 ~~~~~^~~~~~~~~~~~~~~~

Fixes: b0d39384307d ("gallium: s/uint/enum pipe_render_cond_flag/ for set_render_condition()")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100133
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-03-09 | radv: Don't flush the CB before doing a fast clear eliminate. | Bas Nieuwenhuizen | 1 | -2/+0
The only way we write CMASK/DCC compressed textures through shaders is fast clears and CMASK/DCC inits, which have their own flushes. Hence the CB cache is always up to date. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 | radv: Don't emit cache flushes on subpass switch. | Bas Nieuwenhuizen | 3 | -6/+0
I think we should only flush right before an action (draw/dispatch etc.), as otherwise it is too easy to issue redundant flushes. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 | radv: Only flush for the needed stages, and before the flushes. | Bas Nieuwenhuizen | 1 | -6/+1
Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 | radv: Don't invalidate CB/DB for images that aren't modified outside CB/DB. | Bas Nieuwenhuizen | 1 | -9/+19
Without stores, the only writes are fast clears, transfers and metadata initialization, each of which has the appropriate invalidations already.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 | radv: Flush more caches after writes. | Bas Nieuwenhuizen | 1 | -3/+9
Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 | radv: Don't flush for fixed-function reading. | Bas Nieuwenhuizen | 1 | -1/+0
The data should always be in memory after a src flush. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 | radv: Invalidate the correct caches for CB/DB dst barriers. | Bas Nieuwenhuizen | 1 | -5/+11
Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 | radv: Determine cache flushes per object. | Bas Nieuwenhuizen | 1 | -17/+19
Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 | mesa/main: remove unused _mesa_new_texture_image() | Samuel Pitoiset | 2 | -20/+0
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-03-09 | radv/ac: fixup texture coord to have right number of channels. | Dave Airlie | 2 | -4/+4
Jason has patches to add validation to this area; this should fix radv shaders.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>