~nh/mesa - nh's Mesa repository; mostly radeonsi related development

Age	Commit message	Author	Files	Lines
2016-11-01	amd: fix a typo in PIXEL_PIPE_STAT_RESET definition	Marek Olšák	1	-1/+1
	Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-01	gallium/radeon: add enum radeon_micro_mode	Marek Olšák	3	-7/+14
	Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-01	gallium/radeon: make it clear that DRM 2.x.x fast clear constraint is CIK-only	Marek Olšák	1	-2/+2
	Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-01	gallium/radeon: remove r600_surface::level_info	Marek Olšák	3	-7/+6
	Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-01	gallium/radeon: add radeon_surf::is_linear	Marek Olšák	8	-13/+15
	Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-01	gallium/radeon: remove radeon_surf_level::pitch_bytes	Marek Olšák	13	-44/+48
	Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-01	gallium/radeon: don't call u_format helpers if we have that info already	Marek Olšák	2	-10/+8
	Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-01	gallium/radeon: replace radeon_surf_info::dcc_enabled with num_dcc_levels	Marek Olšák	6	-15/+19
	Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-01	radeonsi: add a driver query for counting CP DMA calls	Marek Olšák	4	-0/+13
	CP DMA calls are synchronous with regard to shaders, but can be made asynchronous if needed. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-01	radeonsi: add a driver query for shader cache hits	Marek Olšák	4	-1/+16
	This is an 8-month old patch. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-01	gbm: set up the interop extension for egl/drm	Marek Olšák	3	-0/+3
	breaking libgbm -> libEGL ABI? Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-01	nvc0: do not duplicate similar performance metrics	Samuel Pitoiset	1	-43/+7
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
2016-11-01	anv/device: Return DEVICE_LOST if execbuf2 fails	Jason Ekstrand	1	-6/+4
	This makes more sense than OUT_OF_HOST_MEMORY. Technically, you can recover from a failed execbuf2 but the batch you just submitted didn't fully execute so things are in an ill-defined state. The app doesn't want to continue from that point anyway. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-11-01	i965/gen8: Fix vertex attrib upload for dvec3/4 shader inputs	Antia Puentes	5	-22/+20
	The emission of vertex attributes corresponding to dvec3 and dvec4 vertex shader input variables was not correct when the <size> passed to the VertexAttribL* commands was <= 2. This was because we were using the vertex array size when emitting vertices to decide if we uploaded a 64-bit floating point attribute as 1 slot (128-bits) for sizes 1 and 2, or 2 slots (256-bits) for sizes 3 and 4. This caused problems when mapping the input variables to registers because, for deciding which registers contain the values uploaded for a certain variable, we use the size and type given to the variable in the shader, so we will be assigning 256-bits to dvec3/4 variables, even if we only uploaded 128-bits for them, which happened when the vertex array size was <= 2. The patch uses the shader information to only emit as 128-bits those 64-bit floating point variables that were declared as double or dvec2 in the vertex shader. Dvec3 and dvec4 variables will be always uploaded as 256-bits, independently of the <size> given to the VertexAttribL* command. From the ARB_vertex_attrib_64bit specification: "For the 64-bit double precision types listed in Table X.1, no default attribute values are provided if the values of the vertex attribute variable are specified with fewer components than required for the attribute variable. For example, the fourth component of a variable of type dvec4 will be undefined if specified using VertexAttribL3dv or using a vertex array specified with VertexAttribLPointer and a size of three." We are filling these unspecified components with zeros, which coincidentally is also what the GL44-CTS.vertex_attrib_binding.basic-inputL-case1 expects. v2: Do not use bitcount (Kenneth Graunke) Fixes: GL44-CTS.vertex_attrib_binding.basic-inputL-case1 test Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97287 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
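The rule described in this commit message can be sketched as follows. This is an illustration only, not the i965 code: `DoubleAttribUpload` and `pack_double_attrib` are hypothetical names, and the sketch assumes the slot count depends solely on the component count declared in the shader, with unspecified components filled with zeros as the message states.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Hypothetical result type for illustration (not a Mesa structure).
struct DoubleAttribUpload {
    unsigned slots;                // number of 128-bit slots emitted
    std::array<double, 4> values;  // only the first slots * 2 entries are uploaded
};

// shader_components: 1 = double, 2 = dvec2, 3 = dvec3, 4 = dvec4.
// array_size: the <size> given to the VertexAttribL* command.
DoubleAttribUpload pack_double_attrib(const double *src,
                                      unsigned array_size,
                                      unsigned shader_components)
{
    assert(shader_components >= 1 && shader_components <= 4);

    DoubleAttribUpload out{};  // value-initialized, so all values start at 0.0

    // The slot count follows the shader declaration, not the array size:
    // double/dvec2 take one slot (128 bits), dvec3/dvec4 take two (256 bits).
    out.slots = shader_components <= 2 ? 1u : 2u;

    // Copy what the application supplied; the remaining components stay zero.
    const unsigned n = std::min(array_size, out.slots * 2u);
    for (unsigned i = 0; i < n; ++i)
        out.values[i] = src[i];
    return out;
}
```

With a dvec4 shader input and an array size of 2, this sketch yields two slots with the last two components set to zero, which is the case the commit fixes.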
2016-11-01	radv: drop some unused cmask info members.	Dave Airlie	2	-8/+0
	These were assigned but never used. Inspired by similar patch in radeonsi. Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-10-31	intel: aubinator: fix printing missing gen option	Lionel Landwerlin	1	-2/+2
	Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-10-31	intel: aubinator: fix assumptions on amount of required data	Lionel Landwerlin	1	-1/+5
	We require 12 bytes of headers but in some cases we just need 4. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-10-31	intel: aubinator: don't print out blocks twice	Lionel Landwerlin	1	-1/+0
	Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-10-31	i965: Move gen8_disable_stages to brw_upload_initial_gpu_state	Nanley Chery	4	-56/+13
	3DSTATE_WM_CHROMAKEY isn't programmed anywhere else. 3DSTATE_WM_HZ_OP is programmed, then cleared by blorp during a HZ op, so repeatedly clearing it after every blorp execution is redundant. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-10-31	i965: Program 3DSTATE_AA_LINE_PARAMETERS in upload_invariant_state	Nanley Chery	3	-36/+10
	This packet is non-pipelined and doesn't ever change across emissions. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-10-31	st/omx/dec: disable tunnel for size different case	Leo Liu	3	-1/+11
	When the video coded size differs from the frame size, the result buffers must match the coded size, which is not compatible with the size the encoder requires. Simply disable tunneling in this case instead of converting frame by frame. Signed-off-by: Leo Liu <leo.liu@amd.com> Cc: 13.0 <mesa-stable@lists.freedesktop.org>
2016-10-31	st/omx/dec: result buffers size should match codec decoder size	Leo Liu	3	-19/+18
	Otherwise the kernel's check that the decoder size matches the buffer size fails. Signed-off-by: Leo Liu <leo.liu@amd.com> Cc: 13.0 <mesa-stable@lists.freedesktop.org>
2016-10-31	swr: [rasterizer] added EventHandlerFile constructor	George Kyriazis	1	-1/+6
	Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-10-31	swr: [rasterizer core] Frontend dependency work	George Kyriazis	3	-2/+18
	Add frontend dependency concept in the DRAW_CONTEXT, which allows serialization of frontend work if necessary. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-10-31	swr: [rasterizer core] Refactor/cleanup backends	George Kyriazis	2	-360/+351
	Used for common code reuse and simplification Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-10-31	swr: [rasterizer core] Remove deprecated simd intrinsics	George Kyriazis	4	-990/+1
	Used in abandoned all-or-nothing approach to converting to AVX512 Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-10-31	swr: [rasterizer archrast] Add thread tags to event files.	George Kyriazis	5	-4/+24
	This allows the post-processor to easily detect the API thread and to process frame information. The frame information is needed to optimize how data is processed from worker threads. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-10-31	glsl: use a non-malloc'd storage for short ir_variable names	Marek Olšák	3	-3/+22
	Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-31	glsl: use the linear allocator in opt_constant_propagation	Marek Olšák	1	-3/+11
	Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-31	glsl: use the linear allocator in opt_copy_propagation	Marek Olšák	1	-1/+6
	Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-31	glsl: use the linear allocator in opt_copy_propagation_elements	Marek Olšák	1	-4/+11
	Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-31	glsl: use the linear allocator in opt_dead_code_local	Marek Olšák	1	-3/+9
	Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-31	glsl: use the linear allocator in glsl_symbol_table	Marek Olšák	1	-8/+8
	no ralloc_free occurrences Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-31	glsl: use the linear allocator for ast_node and derived classes	Marek Olšák	6	-113/+114
	Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-31	glsl/lexer: use the linear allocator	Marek Olšák	3	-8/+12
	Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-31	glcpp: use the linear allocator for most objects	Marek Olšák	3	-118/+91
	v2: cosmetic changes Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
2016-10-31	ralloc: add a linear allocator as a child node of ralloc	Marek Olšák	2	-4/+433
	v2: remove goto, cosmetic changes Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-31	ralloc: remove memset from ralloc_size	Marek Olšák	1	-15/+11
	only do it in rzalloc_size as it was supposed to be Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
2016-10-31	ralloc: use rzalloc where it's necessary	Marek Olšák	11	-15/+19
	No change in behavior. ralloc_size is equivalent to rzalloc_size. That will change though. Calls not switched to rzalloc_size: ralloc_vasprintf; the glsl_type::name allocation (it's filled with snprintf); C++ classes where valgrind didn't show uninitialized values. I switched most of the non-glsl stuff to rzalloc without checking whether it's really needed. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-31	ralloc: add DECLARE_RZALLOC_CXX_OPERATORS	Marek Olšák	1	-2/+7
	Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whiteacpe.org>
2016-10-31	nir: zero allocated memory where needed	Juha-Pekka Heikkila	6	-7/+7
	Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-10-31	i965/fs: fill allocated memory with zeros where needed	Juha-Pekka Heikkila	2	-3/+3
	Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-10-31	i965/vec4: zero allocated memory where needed	Juha-Pekka Heikkila	1	-2/+2
	Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-10-31	glsl/glcpp: initialize all fields of glcpp_parser_t on creation	Tapani Pälli	1	-0/+3
	this fixes some of the regressions with "ralloc: remove memset from ralloc_size" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-10-31	glsl: Fix reading of uninitialized memory	Juha-Pekka Heikkila	2	-4/+4
	Switch to use memory allocations which zero memory for places where needed. v2: modify and rebase on top of Marek's series (Tapani) Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-10-31	glsl: initialize glsl_struct_field properly	Marek Olšák	2	-38/+6
	don't rely on ralloc doing memset Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whiteacpe.org>
2016-10-31	ralloc: don't memset ralloc_header, clear it manually	Marek Olšák	1	-1/+15
	time GALLIUM_NOOP=1 ./run shaders/private/alien_isolation/ >/dev/null
	Before (2 takes): real 0m8.734s 0m8.773s, user 0m34.232s 0m34.348s, sys 0m0.084s 0m0.056s
	After (2 takes): real 0m8.448s 0m8.463s, user 0m33.104s 0m33.160s, sys 0m0.088s 0m0.076s
	Average change in "real" time spent: -3.4%
	calloc should only do 2 things compared to malloc: - check for overflow of "n * size" - call memset
	I'm not sure if that explains the difference.
	v2: clear "parent" and "next" in the caller of add_child.
	Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1) Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
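The two extra steps that this message attributes to calloc can be sketched as a thin wrapper over malloc. This is a minimal illustration, not ralloc code and not how any particular libc implements calloc; `zeroed_alloc` is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical calloc-like helper built on malloc, for illustration only.
void *zeroed_alloc(std::size_t n, std::size_t size)
{
    // Step 1: reject requests where n * size would overflow size_t.
    if (size != 0 && n > SIZE_MAX / size)
        return nullptr;

    const std::size_t total = n * size;
    void *ptr = std::malloc(total);

    // Step 2: clear the block, which plain malloc does not do.
    if (ptr != nullptr)
        std::memset(ptr, 0, total);
    return ptr;
}
```

Memory returned by this helper is released with `std::free`, as with malloc.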
2016-10-30	clover: Implement clGetExtensionFunctionAddressForPlatform.	Serge Martin	3	-1/+21
	Add clGetExtensionFunctionAddressForPlatform (CL 1.2). Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-10-30	clover: Introduce CLOVER_EXTRA_*_OPTIONS environment variables	Vedran Miletić	1	-3/+7
	The options specified in the CLOVER_EXTRA_BUILD_OPTIONS shell variable are appended to the options specified by the OpenCL program in the clBuildProgram function call, if any. Analogously, the options specified in the CLOVER_EXTRA_COMPILE_OPTIONS and CLOVER_EXTRA_LINK_OPTIONS variables are appended to the options specified in clCompileProgram and clLinkProgram function calls, respectively. v2: * rename to CLOVER_EXTRA_COMPILER_OPTIONS * use debug_get_option * append to linker options as well v3: code cleanups v4: separate CLOVER_EXTRA_LINKER_OPTIONS options v5: * fix documentation typo * use CLOVER_EXTRA_COMPILER_OPTIONS in link stage v6: * separate in CLOVER_EXTRA_{BUILD,COMPILE,LINK}_OPTIONS * append options in cl{Build,Compile,Link}Program Signed-off-by: Vedran Miletić <vedran@miletic.net> Reviewed-by[v1]: Edward O'Callaghan <funfunctor@folklore1984.net> v7 [Francisco Jerez]: Slight simplification. Reviewed-by: Francisco Jerez <currojerez@riseup.net>
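The appending behaviour described here can be sketched as below. This is an assumption-laden illustration, not clover's code: `append_extra_options` is a hypothetical helper, `std::getenv` stands in for the `debug_get_option` lookup the message mentions, and joining with a single space is an assumption.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the application's options with the contents
// of the named environment variable appended, if that variable is set.
std::string append_extra_options(const std::string &app_opts,
                                 const char *env_name)
{
    assert(env_name != nullptr);

    // std::getenv is a stand-in for the driver's own option lookup.
    const char *extra = std::getenv(env_name);
    if (extra == nullptr || *extra == '\0')
        return app_opts;  // nothing to append

    if (app_opts.empty())
        return extra;     // the program passed no options of its own

    return app_opts + " " + extra;
}
```

Under these assumptions, a build call would use `append_extra_options(opts, "CLOVER_EXTRA_BUILD_OPTIONS")`, and the compile and link paths would use the `COMPILE` and `LINK` variables in the same way.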
2016-10-30	clover: Pass unquoted compiler arguments to Clang	Vedran Miletić	1	-4/+36
	OpenCL apps can quote arguments they pass to the OpenCL compiler, most commonly include paths containing spaces. If the Clang OpenCL compiler was called via a shell, the shell would split the arguments with respect to quotes and then remove quotes before passing the arguments to the compiler. Since we call Clang as a library, we have to split the argument with respect to quotes and then remove quotes before passing the arguments. v2: move to tokenize(), remove throwing of CL_INVALID_COMPILER_OPTIONS v3: simplify parsing logic, use more C++11 v4: restore error throwing, clarify a comment Signed-off-by: Vedran Miletić <vedran@miletic.net> Reviewed-by: Francisco Jerez <currojerez@riseup.net>
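The shell-like splitting described in this message can be sketched as follows. This is not clover's tokenize(): `split_quoted` is a hypothetical function, it handles only double quotes, and it throws `std::invalid_argument` on an unterminated quote as a stand-in for the CL_INVALID_COMPILER_OPTIONS error mentioned in the version notes.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical sketch: split an option string on whitespace, keep
// double-quoted runs inside one argument, and drop the quote characters.
std::vector<std::string> split_quoted(const std::string &opts)
{
    std::vector<std::string> args;
    std::string current;
    bool in_quotes = false;
    bool have_token = false;  // lets "" produce an empty argument

    for (char c : opts) {
        if (c == '"') {
            in_quotes = !in_quotes;  // the quote itself is not copied
            have_token = true;
        } else if (!in_quotes && (c == ' ' || c == '\t')) {
            if (have_token) {        // whitespace outside quotes ends a token
                args.push_back(current);
                current.clear();
                have_token = false;
            }
        } else {
            current += c;            // includes spaces inside quotes
            have_token = true;
        }
    }

    if (in_quotes)
        throw std::invalid_argument("unterminated quote in compiler options");
    if (have_token)
        args.push_back(current);
    return args;
}
```

For example, an option string containing `-I "/opt/my headers" -DFOO` splits into three arguments, with the include path kept whole and its quotes removed.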