path: root/src
Age | Commit message | Author | Files | Lines
2016-11-01 | amd: fix a typo in PIXEL_PIPE_STAT_RESET definition | Marek Olšák | 1 | -1/+1
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-01 | gallium/radeon: add enum radeon_micro_mode | Marek Olšák | 3 | -7/+14
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-01 | gallium/radeon: make it clear that DRM 2.x.x fast clear constraint is CIK-only | Marek Olšák | 1 | -2/+2
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-01 | gallium/radeon: remove r600_surface::level_info | Marek Olšák | 3 | -7/+6
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-01 | gallium/radeon: add radeon_surf::is_linear | Marek Olšák | 8 | -13/+15
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-01 | gallium/radeon: remove radeon_surf_level::pitch_bytes | Marek Olšák | 13 | -44/+48
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-01 | gallium/radeon: don't call u_format helpers if we have that info already | Marek Olšák | 2 | -10/+8
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-01 | gallium/radeon: replace radeon_surf_info::dcc_enabled with num_dcc_levels | Marek Olšák | 6 | -15/+19
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-01 | radeonsi: add a driver query for counting CP DMA calls | Marek Olšák | 4 | -0/+13
CP DMA calls are synchronous with regard to shaders, but can be made asynchronous if needed. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-01 | radeonsi: add a driver query for shader cache hits | Marek Olšák | 4 | -1/+16
This is an 8-month old patch. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-01 | gbm: set up the interop extension for egl/drm | Marek Olšák | 3 | -0/+3
breaking libgbm -> libEGL ABI? Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-01 | nvc0: do not duplicate similar performance metrics | Samuel Pitoiset | 1 | -43/+7
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
2016-11-01 | anv/device: Return DEVICE_LOST if execbuf2 fails | Jason Ekstrand | 1 | -6/+4
This makes more sense than OUT_OF_HOST_MEMORY. Technically, you can recover from a failed execbuf2 but the batch you just submitted didn't fully execute so things are in an ill-defined state. The app doesn't want to continue from that point anyway. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-11-01 | i965/gen8: Fix vertex attrib upload for dvec3/4 shader inputs | Antia Puentes | 5 | -22/+20
The emission of vertex attributes corresponding to dvec3 and dvec4 vertex shader input variables was not correct when the <size> passed to the VertexAttribL* commands was <= 2.

This was because we were using the vertex array size when emitting vertices to decide if we uploaded a 64-bit floating point attribute as 1 slot (128-bits) for sizes 1 and 2, or 2 slots (256-bits) for sizes 3 and 4. This caused problems when mapping the input variables to registers because, for deciding which registers contain the values uploaded for a certain variable, we use the size and type given to the variable in the shader, so we will be assigning 256-bits to dvec3/4 variables, even if we only uploaded 128-bits for them, which happened when the vertex array size was <= 2.

The patch uses the shader information to only emit as 128-bits those 64-bit floating point variables that were declared as double or dvec2 in the vertex shader. Dvec3 and dvec4 variables will always be uploaded as 256-bits, independently of the <size> given to the VertexAttribL* command.

From the ARB_vertex_attrib_64bit specification:

"For the 64-bit double precision types listed in Table X.1, no default attribute values are provided if the values of the vertex attribute variable are specified with fewer components than required for the attribute variable. For example, the fourth component of a variable of type dvec4 will be undefined if specified using VertexAttribL3dv or using a vertex array specified with VertexAttribLPointer and a size of three."

We are filling these unspecified components with zeros, which coincidentally is also what the GL44-CTS.vertex_attrib_binding.basic-inputL-case1 expects.

v2: Do not use bitcount (Kenneth Graunke)

Fixes: GL44-CTS.vertex_attrib_binding.basic-inputL-case1 test
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97287
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
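The rule described in this commit message can be sketched in a few lines. This is only an illustration of the idea, not the i965 code: the enum `DoubleInputType` and the helpers `upload_bits` and `pad_attribute` are hypothetical names introduced here.

```cpp
#include <algorithm>
#include <array>

// Hypothetical names: the shader-declared type of a 64-bit vertex input.
enum class DoubleInputType { Double = 1, Dvec2 = 2, Dvec3 = 3, Dvec4 = 4 };

// The upload size is chosen from the shader declaration only:
// double/dvec2 take one 128-bit slot, dvec3/dvec4 take two (256 bits).
constexpr unsigned upload_bits(DoubleInputType type) {
    return static_cast<unsigned>(type) <= 2 ? 128u : 256u;
}

// Copy the components the application supplied (the <size> argument of
// VertexAttribL*) and zero-fill the components it left unspecified.
std::array<double, 4> pad_attribute(const double *src, unsigned app_size) {
    std::array<double, 4> out{};  // value-initialized, so every entry is 0.0
    std::copy(src, src + std::min(app_size, 4u), out.begin());
    return out;
}
```

With this shape, a dvec4 input fed from a two-component vertex array still occupies 256 bits, and its third and fourth components read as zero.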
2016-11-01 | radv: drop some unused cmask info members. | Dave Airlie | 2 | -8/+0
These were assigned but never used. Inspired by similar patch in radeonsi. Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-10-31 | intel: aubinator: fix printing missing gen option | Lionel Landwerlin | 1 | -2/+2
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-10-31 | intel: aubinator: fix assumptions on amount of required data | Lionel Landwerlin | 1 | -1/+5
We require 12 bytes of headers but in some cases we just need 4. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-10-31 | intel: aubinator: don't print out blocks twice | Lionel Landwerlin | 1 | -1/+0
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-10-31 | i965: Move gen8_disable_stages to brw_upload_initial_gpu_state | Nanley Chery | 4 | -56/+13
3DSTATE_WM_CHROMAKEY isn't programmed anywhere else. 3DSTATE_WM_HZ_OP is programmed, then cleared by blorp during a HZ op, so repeatedly clearing it after every blorp execution is redundant. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-10-31 | i965: Program 3DSTATE_AA_LINE_PARAMETERS in upload_invariant_state | Nanley Chery | 3 | -36/+10
This packet is non-pipelined and doesn't ever change across emissions. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-10-31 | st/omx/dec: disable tunnel for size different case | Leo Liu | 3 | -1/+11
When the video coded size differs from the frame size, the result buffers must match the coded size, which is not compatible with the size the encoder requires. Simply disable the tunnel in this case instead of converting frame by frame. Signed-off-by: Leo Liu <leo.liu@amd.com> Cc: 13.0 <mesa-stable@lists.freedesktop.org>
2016-10-31 | st/omx/dec: result buffers size should match codec decoder size | Leo Liu | 3 | -19/+18
Otherwise the kernel's check that the decoder size matches the buffer size fails. Signed-off-by: Leo Liu <leo.liu@amd.com> Cc: 13.0 <mesa-stable@lists.freedesktop.org>
2016-10-31 | swr: [rasterizer] added EventHandlerFile constructor | George Kyriazis | 1 | -1/+6
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-10-31 | swr: [rasterizer core] Frontend dependency work | George Kyriazis | 3 | -2/+18
Add frontend dependency concept in the DRAW_CONTEXT, which allows serialization of frontend work if necessary. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-10-31 | swr: [rasterizer core] Refactor/cleanup backends | George Kyriazis | 2 | -360/+351
Used for common code reuse and simplification Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-10-31 | swr: [rasterizer core] Remove deprecated simd intrinsics | George Kyriazis | 4 | -990/+1
Used in abandoned all-or-nothing approach to converting to AVX512 Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-10-31 | swr: [rasterizer archrast] Add thread tags to event files. | George Kyriazis | 5 | -4/+24
This allows the post-processor to easily detect the API thread and to process frame information. The frame information is needed to optimize how data is processed from worker threads. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-10-31 | glsl: use a non-malloc'd storage for short ir_variable names | Marek Olšák | 3 | -3/+22
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-31 | glsl: use the linear allocator in opt_constant_propagation | Marek Olšák | 1 | -3/+11
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-31 | glsl: use the linear allocator in opt_copy_propagation | Marek Olšák | 1 | -1/+6
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-31 | glsl: use the linear allocator in opt_copy_propagation_elements | Marek Olšák | 1 | -4/+11
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-31 | glsl: use the linear allocator in opt_dead_code_local | Marek Olšák | 1 | -3/+9
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-31 | glsl: use the linear allocator in glsl_symbol_table | Marek Olšák | 1 | -8/+8
no ralloc_free occurrences Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-31 | glsl: use the linear allocator for ast_node and derived classes | Marek Olšák | 6 | -113/+114
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-31 | glsl/lexer: use the linear allocator | Marek Olšák | 3 | -8/+12
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-31 | glcpp: use the linear allocator for most objects | Marek Olšák | 3 | -118/+91
v2: cosmetic changes Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
2016-10-31 | ralloc: add a linear allocator as a child node of ralloc | Marek Olšák | 2 | -4/+433
v2: remove goto, cosmetic changes Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
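For readers unfamiliar with the term, a linear (bump) allocator hands out memory by advancing an offset inside a larger block and releases everything at once. The sketch below illustrates that general technique only; it is not the ralloc implementation, and the class `LinearArena`, its default block size, and its alignment policy are assumptions made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical bump allocator: allocations are carved out of big blocks and
// are never freed individually; all blocks die with the arena.
class LinearArena {
public:
    explicit LinearArena(std::size_t min_block_size = 2048)
        : min_block_size_(min_block_size) {}

    void *alloc(std::size_t size) {
        // Round up so that every returned pointer stays suitably aligned.
        constexpr std::size_t align = alignof(std::max_align_t);
        size = (size + align - 1) / align * align;

        // Start a new block when the current one cannot hold the request.
        if (blocks_.empty() || size > capacity_ - offset_) {
            const std::size_t cap = std::max(size, min_block_size_);
            std::unique_ptr<char[]> block(new char[cap]);  // not zeroed
            blocks_.push_back(std::move(block));
            capacity_ = cap;
            offset_ = 0;
        }
        char *p = blocks_.back().get() + offset_;
        offset_ += size;  // the "bump": no per-allocation header is written
        return p;
    }

    std::size_t block_count() const { return blocks_.size(); }

private:
    std::size_t min_block_size_;
    std::size_t capacity_ = 0;
    std::size_t offset_ = 0;
    std::vector<std::unique_ptr<char[]>> blocks_;
};
```

The trade-off is that a single object cannot be returned to the allocator, which is why this scheme suits short-lived compiler data such as the AST nodes and optimization-pass bookkeeping converted in the commits above.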
2016-10-31 | ralloc: remove memset from ralloc_size | Marek Olšák | 1 | -15/+11
only do it in rzalloc_size as it was supposed to be Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
2016-10-31 | ralloc: use rzalloc where it's necessary | Marek Olšák | 11 | -15/+19
No change in behavior. ralloc_size is equivalent to rzalloc_size. That will change though.

Calls not switched to rzalloc_size:
- ralloc_vasprintf
- glsl_type::name allocation (it's filled with snprintf)
- C++ classes where valgrind didn't show uninitialized values

I switched most of non-glsl stuff to rzalloc without checking whether it's really needed.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-31 | ralloc: add DECLARE_RZALLOC_CXX_OPERATORS | Marek Olšák | 1 | -2/+7
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-10-31 | nir: zero allocated memory where needed | Juha-Pekka Heikkila | 6 | -7/+7
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-10-31 | i965/fs: fill allocated memory with zeros where needed | Juha-Pekka Heikkila | 2 | -3/+3
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-10-31 | i965/vec4: zero allocated memory where needed | Juha-Pekka Heikkila | 1 | -2/+2
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-10-31 | glsl/glcpp: initialize all fields of glcpp_parser_t on creation | Tapani Pälli | 1 | -0/+3
this fixes some of the regressions with "ralloc: remove memset from ralloc_size" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-10-31 | glsl: Fix reading of uninitialized memory | Juha-Pekka Heikkila | 2 | -4/+4
Switch to zeroing memory allocations in the places that need them. v2: modify and rebase on top of Marek's series (Tapani) Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-10-31 | glsl: initialize glsl_struct_field properly | Marek Olšák | 2 | -38/+6
don't rely on ralloc doing memset Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-10-31 | ralloc: don't memset ralloc_header, clear it manually | Marek Olšák | 1 | -1/+15
time GALLIUM_NOOP=1 ./run shaders/private/alien_isolation/ >/dev/null

Before (2 takes):
real 0m8.734s 0m8.773s
user 0m34.232s 0m34.348s
sys 0m0.084s 0m0.056s

After (2 takes):
real 0m8.448s 0m8.463s
user 0m33.104s 0m33.160s
sys 0m0.088s 0m0.076s

Average change in "real" time spent: -3.4%

calloc should only do 2 things compared to malloc:
- check for overflow of "n * size"
- call memset
I'm not sure if that explains the difference.

v2: clear "parent" and "next" in the caller of add_child.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1)
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
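The two extra steps that the commit message attributes to calloc can be written out on top of malloc. This is a minimal sketch of that description, not how any particular C library implements calloc; the function name `zeroed_alloc` is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical calloc-like helper: malloc plus the two extra steps.
void *zeroed_alloc(std::size_t n, std::size_t size) {
    // Step 1: refuse requests where n * size would overflow size_t.
    if (size != 0 && n > SIZE_MAX / size)
        return nullptr;

    const std::size_t total = n * size;
    void *p = std::malloc(total != 0 ? total : 1);

    // Step 2: clear the whole allocation.
    if (p != nullptr)
        std::memset(p, 0, total);
    return p;
}
```

Real implementations may skip the memset when the memory comes fresh from the operating system and is already zero, so measured differences between the two calls can vary.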
2016-10-30 | clover: Implement clGetExtensionFunctionAddressForPlatform. | Serge Martin | 3 | -1/+21
Add clGetExtensionFunctionAddressForPlatform (CL 1.2). Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-10-30 | clover: Introduce CLOVER_EXTRA_*_OPTIONS environment variables | Vedran Miletić | 1 | -3/+7
The options specified in the CLOVER_EXTRA_BUILD_OPTIONS shell variable are appended to the options specified by the OpenCL program in the clBuildProgram function call, if any. Analogously, the options specified in the CLOVER_EXTRA_COMPILE_OPTIONS and CLOVER_EXTRA_LINK_OPTIONS variables are appended to the options specified in clCompileProgram and clLinkProgram function calls, respectively.

v2:
* rename to CLOVER_EXTRA_COMPILER_OPTIONS
* use debug_get_option
* append to linker options as well
v3: code cleanups
v4: separate CLOVER_EXTRA_LINKER_OPTIONS options
v5:
* fix documentation typo
* use CLOVER_EXTRA_COMPILER_OPTIONS in link stage
v6:
* separate in CLOVER_EXTRA_{BUILD,COMPILE,LINK}_OPTIONS
* append options in cl{Build,Compile,Link}Program

Signed-off-by: Vedran Miletić <vedran@miletic.net>
Reviewed-by[v1]: Edward O'Callaghan <funfunctor@folklore1984.net>

v7 [Francisco Jerez]: Slight simplification.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
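The append behavior described above could look roughly like the following. The changelog says the actual patch uses Mesa's debug_get_option; this sketch substitutes the standard `std::getenv` so that it stays self-contained, and the helper names `join_options` and `with_env_options` are hypothetical.

```cpp
#include <cstdlib>
#include <string>

// Append extra options to the options supplied by the OpenCL program.
std::string join_options(const std::string &opts, const std::string &extra) {
    if (extra.empty())
        return opts;
    return opts.empty() ? extra : opts + " " + extra;
}

// Look up an environment variable (for example CLOVER_EXTRA_BUILD_OPTIONS)
// and append its value, if it is set, to the program-supplied options.
std::string with_env_options(const std::string &opts, const char *env_name) {
    const char *extra = std::getenv(env_name);
    return join_options(opts, extra != nullptr ? extra : "");
}
```

Under these assumptions, a build step would call `with_env_options(opts, "CLOVER_EXTRA_BUILD_OPTIONS")`, and the compile and link steps would pass their own variable names.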
2016-10-30 | clover: Pass unquoted compiler arguments to Clang | Vedran Miletić | 1 | -4/+36
OpenCL apps can quote arguments they pass to the OpenCL compiler, most commonly include paths containing spaces. If the Clang OpenCL compiler was called via a shell, the shell would split the arguments with respect to quotes and then remove quotes before passing the arguments to the compiler. Since we call Clang as a library, we have to split the argument with respect to quotes and then remove quotes before passing the arguments.

v2: move to tokenize(), remove throwing of CL_INVALID_COMPILER_OPTIONS
v3: simplify parsing logic, use more C++11
v4: restore error throwing, clarify a comment

Signed-off-by: Vedran Miletić <vedran@miletic.net>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
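Quote-aware splitting of this kind can be sketched as a small state machine. The code below is an illustration, not clover's tokenize(): the function name `split_quoted` is hypothetical, accepting both single and double quotes is an assumption, and `std::invalid_argument` stands in for the CL_INVALID_COMPILER_OPTIONS error mentioned in the changelog.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Split an option string on whitespace, keeping quoted runs together and
// dropping the quote characters themselves.
std::vector<std::string> split_quoted(const std::string &options) {
    std::vector<std::string> tokens;
    std::string current;
    bool in_token = false;  // true once the current token has started
    char quote = 0;         // active quote character, or 0 outside quotes

    for (char c : options) {
        if (quote != 0) {
            if (c == quote)
                quote = 0;      // closing quote: drop it
            else
                current += c;   // spaces inside quotes are kept
        } else if (c == '"' || c == '\'') {
            quote = c;          // opening quote: drop it, start a token
            in_token = true;
        } else if (c == ' ' || c == '\t') {
            if (in_token) {
                tokens.push_back(current);
                current.clear();
                in_token = false;
            }
        } else {
            current += c;
            in_token = true;
        }
    }
    if (quote != 0)
        throw std::invalid_argument("unterminated quote in options");
    if (in_token)
        tokens.push_back(current);
    return tokens;
}
```

For example, the input `-I "/opt/my includes" -DFOO` yields the three arguments `-I`, `/opt/my includes`, and `-DFOO`.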