~cwabbott0/mesa - Connor's silly Mesa stuff.

Age	Commit message	Author	Files	Lines
2015-10-16	i965/sched: use liveness analysis for computing register pressure [i965-sched-conservative-v2]	Connor Abbott	1	-56/+244
	Previously, we were using some heuristics to try and detect when a write was about to begin a live range, or when a read was about to end a live range. We never used the liveness analysis information used by the register allocator, though, which meant that the scheduler's and the allocator's ideas of when a live range began and ended were different. Not only did this make our estimate of the register pressure benefit of scheduling an instruction wrong in some cases, but it was preventing us from knowing the actual register pressure when scheduling each instruction, which we want to have in order to switch to register pressure scheduling only when the register pressure is too high.
	This commit rewrites the register pressure tracking code to use the same model as our register allocator currently uses. We use the results of liveness analysis, as well as the compute_payload_ranges() function that we split out in the last commit. This means that we compute live ranges twice on each round through the register allocator, although we could speed it up by only recomputing the ranges and not the live in/live out sets after scheduling, since we only shuffle around instructions within a single basic block when we schedule.
	Shader-db results on bdw:
	total instructions in shared programs: 7130187 -> 7129880 (-0.00%)
	instructions in affected programs: 1744 -> 1437 (-17.60%)
	helped: 1
	HURT: 1
	total cycles in shared programs: 172535126 -> 172473226 (-0.04%)
	cycles in affected programs: 11338636 -> 11276736 (-0.55%)
	helped: 876
	HURT: 873
	LOST: 8
	GAINED: 0
2015-10-16	i965/fs: split out calculation of payload live ranges	Connor Abbott	2	-22/+31
	We'll need this for the scheduler too, since it wants to know when the live ranges of payload registers end in order to model them in our register pressure calculations.
2015-10-16	i965: dump scheduling cycle estimates	Connor Abbott	4	-9/+35
	The heuristic we're using is rather lame, since it assumes everything is non-uniform and loops execute 10 times, but it should be enough for measuring improvements in the scheduler that don't result in a change in the number of instructions. v2: - Switch loops and cycle counts to be compatible with older shader-db. - Make loop heuristic 10x to match with spilling code.
2015-10-16	i965: always run the post-RA scheduler	Connor Abbott	1	-2/+1
	Before, we would only do scheduling after register allocation if we spilled, despite the fact that the pre-RA scheduler was only supposed to be for register pressure and set the latencies of every instruction to 1. This meant that unless we spilled, which we rarely do, we never considered instruction latencies at all, and we usually never bothered to try and hide texture fetch latency. Although a later commit stops setting the latencies to 1, we still want to always run the post-RA scheduler since it's able to take the false dependencies that the register allocator creates into account, and it can be more aggressive than the pre-RA scheduler since it doesn't have to worry about register pressure at all. XXX perf data
2015-10-16	i965/sched: write-after-read dependencies are free	Connor Abbott	1	-4/+4
	Although write-after-write dependencies have the same latency as read-after-write dependencies due to how the register scoreboard works, write-after-read dependencies aren't checked by the EU at all, so they're purely a constraint on how the scheduler can order the instructions.
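	A minimal sketch of the latency model described above, assuming "free" means an edge latency of zero. The enum and the dep_latency() helper are hypothetical names for illustration and are not taken from the i965 scheduler.

```c
#include <assert.h>

/* Hypothetical dependency kinds; not the scheduler's real types. */
enum dep_kind {
   DEP_RAW, /* read-after-write */
   DEP_WAW, /* write-after-write */
   DEP_WAR  /* write-after-read */
};

/*
 * Latency to charge on a dependency edge. RAW and WAW both wait for the
 * producer because of the register scoreboard; WAR only constrains the
 * order of instructions, so it costs no cycles (assumed to be 0 here).
 */
static unsigned
dep_latency(enum dep_kind kind, unsigned producer_latency)
{
   switch (kind) {
   case DEP_RAW:
   case DEP_WAW:
      return producer_latency;
   case DEP_WAR:
      return 0;
   }
   assert(!"unknown dependency kind");
   return 0;
}
```

	The WAR edge still has to exist in the dependency graph so the write is not moved above the read; only its cost is dropped.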
2015-10-16	i965: fix cycle estimates when there's a pipeline stall	Connor Abbott	1	-7/+8
	The issue time for an instruction is how many cycles it takes to actually put it into the pipeline. If there's a pipeline stall that causes the instruction to be delayed, we should first take that into account to figure out when the instruction would start executing and then add the issue time. The old code had it backwards, and so we would underestimate the total time whenever we thought there would be a pipeline stall by up to the issue time of the instruction.
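	The ordering described above can be sketched as follows. The function and parameter names are hypothetical and the cycle values are made up; this only illustrates "stall first, then add the issue time" against the old backwards order.

```c
#include <assert.h>

/* Fixed order: wait out the stall, then spend the issue time. */
static unsigned
finish_time(unsigned clock, unsigned ready_time, unsigned issue_time)
{
   /* The instruction cannot start before its operands are ready. */
   unsigned start = clock > ready_time ? clock : ready_time;
   return start + issue_time;
}

/* Old order: add the issue time first, then apply the stall. */
static unsigned
finish_time_old(unsigned clock, unsigned ready_time, unsigned issue_time)
{
   unsigned t = clock + issue_time;
   return t > ready_time ? t : ready_time;
}

/* How many cycles the old estimate falls short by. */
static unsigned
underestimate(unsigned clock, unsigned ready_time, unsigned issue_time)
{
   unsigned fixed = finish_time(clock, ready_time, issue_time);
   unsigned old = finish_time_old(clock, ready_time, issue_time);

   assert(fixed >= old); /* the old estimate is never larger */
   return fixed - old;
}
```

	With no stall the two orders agree. With a stall, the shortfall grows with the length of the stall but never exceeds the issue time.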
2015-10-15	nir/glsl: Use shader_prog->Name for naming the NIR shader	Jason Ekstrand	1	-1/+1
	This has the better name to use. Apparently, sh->Name is usually 0. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-10-15	nir: Add helpers for creating variables and adding them to lists	Jason Ekstrand	4	-46/+99
	Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-10-15	nir/prog: Use nir_foreach_variable	Jason Ekstrand	1	-1/+1
	Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-10-15	mesa: wrap a ridiculously long line in es1_conversion.c	Brian Paul	1	-1/+19
	Reviewed-by: Eric Anholt <eric@anholt.net>
2015-10-15	mesa: add num_buffers() helper in blend.c	Brian Paul	1	-15/+22
	Reviewed-by: Eric Anholt <eric@anholt.net>
2015-10-15	mesa: optimize _UsesDualSrc blend flag setting	Brian Paul	1	-1/+6
	For glBlendFunc and glBlendFuncSeparate(), the _UsesDualSrc flag will be the same for all buffers, so no need to compute it N times. Reviewed-by: Eric Anholt <eric@anholt.net>
2015-10-15	mesa: fix incorrect error string in _mesa_BlendEquationiARB()	Brian Paul	1	-1/+1
	Reviewed-by: Eric Anholt <eric@anholt.net>
2015-10-15	mesa: move validate_blend_factors() call after no-change check	Brian Paul	1	-6/+6
	A redundant call to glBlendFuncSeparateiARB() is more likely than getting invalid values, so do the no-op check first. Reviewed-by: Eric Anholt <eric@anholt.net>
2015-10-15	mesa: optimize no-change check in _mesa_BlendEquationSeparate()	Brian Paul	1	-15/+26
	Reviewed-by: Eric Anholt <eric@anholt.net>
2015-10-15	mesa: optimize no-change check in _mesa_BlendEquation()	Brian Paul	1	-12/+23
	Same story as preceding change to _mesa_BlendFuncSeparate(). Reviewed-by: Eric Anholt <eric@anholt.net>
2015-10-15	mesa: optimize no-change check in _mesa_BlendFuncSeparate()	Brian Paul	1	-15/+28
	Streamline the checking for no state change in _mesa_BlendFuncSeparate() (and _mesa_BlendFunc()). If _BlendFuncPerBuffer is false, we only need to check the 0th buffer state. Move argument validation after the no-op check. I'm looking at an app that issues about 1000 redundant glBlendFunc() calls per frame! Reviewed-by: Eric Anholt <eric@anholt.net>
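	A rough sketch of the no-change check described above. The structs, field names and MAX_BUFFERS value are simplified stand-ins, not Mesa's real gl_context layout; per_buffer plays the role of _BlendFuncPerBuffer.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BUFFERS 8 /* placeholder limit for this sketch */

struct blend_state {
   unsigned src_rgb, dst_rgb, src_a, dst_a;
};

struct color_state {
   struct blend_state blend[MAX_BUFFERS];
   unsigned num_buffers;
   bool per_buffer; /* stand-in for _BlendFuncPerBuffer */
};

/*
 * Returns true if setting these factors on every buffer would change
 * nothing. When the state is not per-buffer, all buffers share the same
 * factors, so looking at buffer 0 is enough.
 */
static bool
blend_func_unchanged(const struct color_state *c, unsigned src_rgb,
                     unsigned dst_rgb, unsigned src_a, unsigned dst_a)
{
   unsigned n = c->per_buffer ? c->num_buffers : 1;

   assert(n <= MAX_BUFFERS);
   for (unsigned i = 0; i < n; i++) {
      const struct blend_state *b = &c->blend[i];
      if (b->src_rgb != src_rgb || b->dst_rgb != dst_rgb ||
          b->src_a != src_a || b->dst_a != dst_a)
         return false;
   }
   return true;
}
```

	A caller would run this check first and return early on a redundant call, leaving argument validation for the calls that actually change state.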
2015-10-15	mesa: short-cut new_state == _NEW_LINE in _mesa_update_state_locked()	Brian Paul	1	-1/+5
	We can skip to the end of _mesa_update_state_locked() if only the _NEW_LINE flag is set since none of the derived state depends on it (just like _NEW_CURRENT_ATTRIB). Note that we still call the ctx->Driver.UpdateState() function, of course. v2: use bitmask-based test, per Eric. Reviewed-by: Eric Anholt <eric@anholt.net>
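	One way the bitmask-based test mentioned in v2 could look, as a sketch only. The flag names drop the leading underscore and their bit values are placeholders, not Mesa's real definitions; NEW_COLOR is just an example of a flag that does affect derived state.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit assignments for this sketch only. */
#define NEW_CURRENT_ATTRIB (1u << 0)
#define NEW_LINE           (1u << 1)
#define NEW_COLOR          (1u << 2)

/*
 * True when every dirty bit belongs to the set that no derived state
 * depends on, so the derived-state updates can be skipped.
 */
static bool
can_skip_derived_state(uint32_t new_state)
{
   return (new_state & ~(NEW_CURRENT_ATTRIB | NEW_LINE)) == 0;
}
```

	Unlike an equality test against a single flag, the mask form also covers the case where both skippable flags are set at once.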
2015-10-15	mesa: remove FLUSH_VERTICES() in _mesa_MatrixMode()	Brian Paul	1	-1/+0
	Changing the matrix mode alone has no effect on rendering and does not need to trigger a flush or state validation. Reviewed-by: Eric Anholt <eric@anholt.net>
2015-10-15	mesa: android: Fix the incorrect path of sse_minmax.c	Chih-Wei Huang	1	-1/+1
	Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Fixes: 669cfc267a1 (android: mesa: fix the path of the SSE4_1 optimisations) Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-10-15	i965: android: add the i965_compile_FILES sources to the driver	Mauro Rossi	1	-0/+1
	i965_compile_FILES are needed otherwise we'll error out as below:
	target SharedLib: i915_dri (out/target/product/x86/obj/SHARED_LIBRARIES/i915_dri_intermediates/LINKED/i915_dri.so)
	external/mesa/src/mesa/drivers/dri/i965/brw_ir_fs.h:181: error: undefined reference to 'fs_inst::~fs_inst()'
	... ...
	external/mesa/src/mesa/drivers/dri/i965/intel_screen.c:1484: error: undefined reference to 'brw_compiler_create'
	collect2: error: ld returned 1 exit status
	build/core/shared_library.mk:81: recipe for target 'out/target/product/x86/obj/SHARED_LIBRARIES/i965_dri_intermediates/LINKED/i965_dri.so' failed
	make: *** [out/target/product/x86/obj/SHARED_LIBRARIES/i965_dri_intermediates/LINKED/i965_dri.so] Error 1
	[Emil Velikov: tweak commit message]
	Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-10-15	program: convert _mesa_init_gl_program() to take struct gl_program *	Emil Velikov	10	-67/+68
	Rather than accepting a void pointer, only to down and up cast around it, convert the function to take the base (struct gl_program) pointer. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-10-15	nir: include nir_instr_set.h in the tarball	Emil Velikov	1	-0/+1
	Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-10-15	glsl: Allow arrays of arrays in GLSL ES 3.10 and GLSL 4.30	Timothy Arceri	3	-18/+20
	V3: use a check__allowed style function for requirements checking rather than has_ which doesn't encapsulate the error message V2: add missing 's' to the extension name in error messages and add decimal place in version string Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
2015-10-15	glsl: allow for AoA in calculating offset to ubo start region	Timothy Arceri	1	-2/+1
	Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2015-10-15	glsl: build ubo name and indexing offset for AoA	Timothy Arceri	1	-30/+86
	V2: split out unrelated change as suggested by Samuel Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2015-10-15	glsl: link uniform block arrays of arrays	Timothy Arceri	3	-112/+229
	This adds support for setting up the UniformBlock structures for AoA and also adds support for resizing AoA blocks with a packed layout. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2015-10-15	glsl: Add AoA support when checking for non-const index	Timothy Arceri	1	-1/+1
	When checking for non-const indexing of interfaces take into account arrays of arrays Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2015-10-15	glsl: Add support for lowering interface block arrays of arrays	Timothy Arceri	1	-14/+38
	V2: make array processing functions static Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2015-10-15	glsl: add AoA support for an interface with unsized array members	Timothy Arceri	1	-4/+12
	Add support for setting the max access of an unsized member of an interface array of arrays. For example ifc[j][k].foo[i] where foo is unsized. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-10-15	glsl: add AoA support for linking interface blocks with unsized members	Timothy Arceri	2	-6/+7
	Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-10-15	glsl: avoid hitting assert for arrays of arrays	Timothy Arceri	1	-0/+6
	Also add TODO comment about adding proper support Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-10-15	glsl: add AoA support for atomic counters	Timothy Arceri	1	-23/+54
	This marks all counters in an AoA as active. For AoA all but the innermost array are treated as separate counters/uniforms. The Nvidia binary goes further and finds inactive counters in the AoA; in future we should do this too, but this gets things working for the time being. This change also removes the use of UniformHash for atomic counters, which avoids having to generate name strings used as hash keys. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-10-15	glsl: add std140 layout support for AoA	Timothy Arceri	1	-7/+8
	Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-10-15	i965: add arrays of arrays support for varyings	Timothy Arceri	2	-5/+3
	V2: get the correct vector elements value for outputs Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2015-10-15	glsl: calculate AoA uniform offset correctly for structs	Timothy Arceri	1	-1/+16
	This allows the correct offset to be calculated for use in indirect indexing of samplers. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2015-10-15	glsl: remove dead code in a single pass	Timothy Arceri	4	-17/+57
	Currently only one ir assignment is removed for each var in a single dead code optimisation pass. This means if a var has more than one assignment, then it requires all the glsl optimisations to be run again for each additional assignment to be removed. Another pass is also required to remove the variable itself. With this change all assignments and the variable are removed in a single pass.
	Some of the arrays of arrays conformance tests that were looping through 8 dimensions ended up with a var with hundreds of assignments.
	This change helps ES31-CTS.arrays_of_arrays.InteractionFunctionCalls1 go from around 3 min 20 sec -> 2 min
	ES31-CTS.arrays_of_arrays.InteractionFunctionCalls2 went from around 9 min 20 sec to 7 min 30 sec
	I had difficulty getting the public shader-db to give a consistent result with or without this change but the results seemed unchanged at between 15-20 seconds.
	Thomas Helland measured change with shader-db on his machine from approx 117 secs to 112 secs.
	V3: Simplify freeing of list as suggested by Ian, and spelling fixes.
	V2: Add assert to be sure references are counted before assignments.
	Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
	Tested-By: Thomas Helland <thomashelland90@gmail.com>
	Tested-by: Ian Romanick <ian.d.romanick@intel.com>
2015-10-15	glsl: don't allow gl_PerVertex to be redeclared as an array of arrays	Timothy Arceri	2	-1/+8
	V3: move patch after fixes to ast for AoA and add const to helper as suggested by Ian V2: move single dimensional array detection into a helper Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-10-15	glsl: check that only the outermost array is unsized	Timothy Arceri	1	-0/+22
	Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2015-10-15	glsl: allow AoA to be sized by initializer or constructor	Timothy Arceri	5	-41/+82
	V2: Split out unsized array validation to its own patch as suggested by Samuel. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2015-10-15	glsl: add support for initialising sampler AoA	Timothy Arceri	1	-34/+49
	Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-10-15	glsl: Add support for linking uniform arrays of arrays	Timothy Arceri	2	-6/+14
	V3: Fix setting of data.location for struct AoA UBO members
	V2: Handle arrays of arrays in the same way structures are handled
	The ARB_arrays_of_arrays spec doesn't give very many details on how AoA uniforms are intended to be implemented. However in the ARB_program_interface_query spec there are details that show AoA are intended to be handled in a similar way to structs.
	Issues 7 from the ARB_program_interface_query spec:
	We define rules consistent with our enumeration rules for other complex types. For existing one-dimensional arrays, we enumerate a single entry if the array is an array of basic types, or separate entries for each array element if the array is an array of structures. We follow similar rules here. For a uniform array such as:
	uniform vec4 a[5][4][3];
	we enumerate twenty different entries ("a[0][0][0]" through "a[4][3][0]"), each of which is treated as an array with three elements. This is morally equivalent to what you'd get if you worked around the limitation in current GLSL via:
	struct ArrayBottom { vec4 c[3]; };
	struct ArrayMid { ArrayBottom b[3]; };
	uniform ArrayMid a[5];
	which would enumerate "a[0].b[0].c[0]" through "a[4].b[3].c[0]".
	Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
	Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
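	The enumeration rule quoted above can be illustrated with a small standalone sketch. aoa_entry_count() and aoa_entry_name() are hypothetical helpers written for this example, not functions from the GLSL linker, and the limit of 8 dimensions is an arbitrary choice for the sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Number of enumerated entries: the product of all but the innermost
 * dimension, since the innermost array is reported as one entry. */
static unsigned
aoa_entry_count(const unsigned *dims, unsigned ndims)
{
   unsigned count = 1;

   for (unsigned i = 0; i + 1 < ndims; i++)
      count *= dims[i];
   return count;
}

/* Writes the name of entry number idx, e.g. "a[4][3][0]", into buf. */
static void
aoa_entry_name(char *buf, size_t size, const char *base,
               const unsigned *dims, unsigned ndims, unsigned idx)
{
   unsigned index[8] = {0};

   assert(ndims >= 1 && ndims <= 8);

   /* Split idx into one index per outer dimension, last one fastest. */
   for (unsigned i = ndims - 1; i-- > 0;) {
      index[i] = idx % dims[i];
      idx /= dims[i];
   }

   int len = snprintf(buf, size, "%s", base);
   for (unsigned i = 0; i + 1 < ndims && len >= 0 && (size_t)len < size; i++)
      len += snprintf(buf + len, size - len, "[%u]", index[i]);

   /* The innermost array is always named by its element 0. */
   if (len >= 0 && (size_t)len < size)
      snprintf(buf + len, size - len, "[0]");
}
```

	For dimensions {5, 4, 3} this gives 5 * 4 = 20 entries, matching the "twenty different entries" in the spec text, and a plain one-dimensional array collapses to a single entry.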
2015-10-14	i965: Don't hardcode FS in "validation failed!" message.	Kenneth Graunke	1	-1/+1
	Instead, print "Scalar VS" or "Scalar FS". Otherwise it's really confusing which stage is broken. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-10-14	glsl: Support uint index in lower_vector_insert	Jordan Justen	1	-1/+5
	The ES31-CTS.compute_shader.pipeline-compute-chain test case generates an unsigned index by using gl_LocalInvocationID.x and gl_LocalInvocationID.y as array indices. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-10-14	glsl: Support uint index in do_vec_index_to_cond_assign	Jordan Justen	1	-1/+3
	The ES31-CTS.compute_shader.pipeline-compute-chain test case generates an unsigned index by using gl_LocalInvocationID.x and gl_LocalInvocationID.y as array indices. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-10-14	i965/fs: Ignore compute shaders in brw_nir_lower_inputs	Jordan Justen	1	-0/+4
	The commit shown below caused compute shaders to hit the unreachable in the default of the switch block. Since compute shaders don't have any inputs, we can make brw_nir_lower_inputs a no-op for CS.
	commit 2953c3d76178d7589947e6ea1dbd902b7b02b3d4
	Author: Kenneth Graunke <kenneth@whitecape.org>
	Date: Fri Aug 14 15:15:11 2015 -0700
	i965/vs: Map scalar VS input locations properly; avoid tons of MOVs.
	Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
	Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-10-14	i965/fs: Simplify FS in brw_nir_lower_inputs to only support scalar mode	Jordan Justen	1	-1/+2
	Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-10-14	mesa: remove unused functions in program.c	Brian Paul	1	-51/+0
	replace_registers() and adjust_param_indexes() were unused. Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-10-14	mesa: minor indentation fix in _mesa_BindTextureUnit()	Brian Paul	1	-1/+1

2015-10-14	mesa: remove unused texUnit local in _mesa_BindTextureUnit()	Brian Paul	1	-7/+0
	The texture unit is error-checked before this and the texUnit var is unused, so remove it. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>