~dbaker/mesa

Age	Commit message	Author	Files	Lines
2015-11-06	radeon/uvd: fix VC-1 simple/main profile decode v2	Boyuan Zhang	2	-2/+7
	We just needed to set the extra width/height fields to get this working. v2 (chk): rebased, CC stable added, commit message added, fixed coding style Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
2015-11-06	st/vaapi: fix vaapi VC-1 simple/main corruption v2	Boyuan Zhang	1	-0/+2
	Apply the start code fix only to advanced profile. v2 (chk): add commit message Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
2015-11-06	st/va: add support for RGBX and BGRX in VPP	Julien Isorce	2	-18/+23
	Before it was only possible to convert a NV12 surface to RGBA or BGRA. This patch uses the same post processing function, "handleVAProcPipelineParameterBufferType", but adds definitions for RGBX and BGRX. This patch also makes vlVaQuerySurfaceAttributes more generic to avoid copying and pasting the same lines. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-11-06	vl/buffers: add RGBX and BGRX to the supported formats	Julien Isorce	1	-0/+18
	Useful if one wants to create RGBX or BGRX surfaces. The infrastructure is such that it required just a few definitions to support these formats. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-11-06	st/va: properly use brackets in vlVaAcquireBufferHandle's switch	Julien Isorce	1	-5/+4
	In "switch (mem_type)" the brackets were surrounding "case+default" instead of "case" only. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
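The bracket placement described above can be illustrated with a minimal sketch. The enum values, the function name and the placeholder handle value are hypothetical and are not taken from the driver's buffer.c; only the shape of the switch is the point.

```c
/* Hypothetical memory types, for illustration only. */
enum mem_type { MEM_TYPE_DRM_PRIME, MEM_TYPE_OTHER };

/* Returns 0 on success, -1 for an unsupported memory type.
 * The braces wrap only the body of the `case` that needs a local
 * variable; `default` stays outside them as its own label. */
int acquire_handle(enum mem_type mem_type, int *handle_out)
{
   switch (mem_type) {
   case MEM_TYPE_DRM_PRIME: {
      int handle = 42; /* placeholder value, not a real handle */
      *handle_out = handle;
      break;
   }
   default:
      return -1;
   }

   return 0;
}
```

With the braces scoped to the single case, the `default` label is visibly a sibling of the `case` label rather than nested inside its block.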
2015-11-06	st/va: properly indent buffer.c, config.c, image.c and picture.c	Julien Isorce	4	-56/+56
	Some lines were using 4 indentation spaces instead of 3. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-11-06	freedreno/a4xx: fix blend color	Rob Clark	1	-5/+9
	Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-11-06	freedreno: update generated headers	Rob Clark	6	-43/+54
	Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-11-06	freedreno: add a305 support	Guillaume Charifi	1	-0/+1
	Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-11-06	freedreno/ir3: Use nir_foreach_variable	Boyan Ding	1	-3/+3
	Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-11-06	nir: some small cleanups	Rob Clark	2	-14/+14
	The various cf nodes all get allocated w/ shader as their ralloc_parent, so let's make this more explicit. Plus a couple of other corrections/clarifications. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-11-06	nvc0: reintroduce BGRA4 format support	Ilia Mirkin	2	-3/+1
	Commit 342e68dc60 (nvc0: remove BGRA4 format support) removed the support to fix a WoW trace. However after further experimentation, I was able to get the blit to work by using a different "fake" format in the 2d engine. The reason why this worked on nv50 is that nv50 falls back to the 3d blit path in case either the src or the dst aren't "faithfully" supported, while nvc0 only does it for the dst format. RG8 is better supported by the nvc0 2d engine than R16. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-11-05	mesa: report enum name in glClientActiveTexture() error string	Brian Paul	1	-1/+2
	As we do for glActiveTexture(). Trivial.
2015-11-05	st/va: fix memory leak on error in vlVaCreateSurfaces2	Julien Isorce	1	-3/+9
	Found by coverity: CID #1337953 Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-11-05	st/va: indent vlVaQuerySurfaceAttributes and vlVaCreateSurfaces2	Julien Isorce	1	-283/+283
	Some lines were using 4 indentation spaces instead of 3. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-11-05	i965: Fix scalar VS float[] and vec2[] output arrays.	Kenneth Graunke	4	-2/+17
	The scalar VS backend has never handled float[] and vec2[] outputs correctly (my original code was broken). Outputs need to be padded out to vec4 slots. In fs_visitor::nir_setup_outputs(), we tried to process each vec4 slot by looping from 0 to ALIGN(type_size_scalar(type), 4) / 4. However, this is wrong: type_size_scalar() for a float[2] would return 2, or for vec2[2] it would return 4. This looked like a single slot, even though in reality each array element would be stored in separate vec4 slots. Because of this bug, outputs[] and output_components[] would not get initialized for the second element's VARYING_SLOT, which meant emit_urb_writes() would skip writing them. Nothing used those values, and dead code elimination threw a party. To fix this, we introduce a new type_size_vec4_times_4() function which pads array elements correctly, but still counts in scalar components, generating correct indices in store_output intrinsics. Normally, varying packing avoids this problem by turning varyings into vec4s. So this doesn't actually fix any Piglit or dEQP tests today. However, if varying packing is disabled, things would be broken. Tessellation shaders can't use varying packing, so this fixes various tcs-input Piglit tests on a branch of mine. v2: Shorten the implementation of type_size_4x to a single line (caught by Connor Abbott), and rename it to type_size_vec4_times_4() (renaming suggested by Jason Ekstrand). Use type_size_vec4 rather than using type_size_vec4_times_4 and then dividing by 4. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
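The slot arithmetic in the message above can be worked through with a small sketch. The struct and function names below are hypothetical simplifications (arrays of scalars or vectors only), not the driver's type_size_* functions; they only reproduce the numbers quoted in the message.

```c
/* Hypothetical, simplified type: an array of `length` elements, each
 * with `components` scalar components (1 for float, 2 for vec2, ...). */
struct array_type {
   unsigned components;
   unsigned length;
};

#define ALIGN_UP(v, a) (((v) + (a) - 1) / (a) * (a))

/* Scalar size: components counted tightly packed. */
unsigned size_scalar(struct array_type t)
{
   return t.components * t.length;
}

/* Size in vec4 slots: every array element occupies its own slot. */
unsigned size_vec4(struct array_type t)
{
   return t.length;
}

/* Padded size, still counted in scalar components. */
unsigned size_vec4_times_4(struct array_type t)
{
   return size_vec4(t) * 4;
}

/* Slot count as the old loop bound computed it. */
unsigned old_slot_count(struct array_type t)
{
   return ALIGN_UP(size_scalar(t), 4) / 4;
}
```

For float[2] the old bound gives ALIGN_UP(2, 4) / 4 = 1 slot and for vec2[2] it gives ALIGN_UP(4, 4) / 4 = 1 slot, while both actually occupy 2 vec4 slots (8 padded components), which is why the second element was never set up.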
2015-11-05	llvmpipe: disable texture cache	Roland Scheidegger	1	-1/+1
	There are some weird problems with 8-wide vectors.
2015-11-05	nouveau: send back a debug message when waiting for a fence to complete	Ilia Mirkin	10	-16/+30
	Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-11-05	nv50,nvc0: provide debug messages with shader compilation stats	Ilia Mirkin	11	-9/+28
	Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-11-05	nouveau: add support for sending debug messages via KHR_debug	Ilia Mirkin	5	-0/+26
	Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-11-05	st/clover: provide a path for drivers to call through to pfn_notify	Ilia Mirkin	4	-4/+36
	Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> [ Francisco Jerez: Clean up clover::context interface by passing around a function object. ]
2015-11-05	st/mesa: set debug callback for debug contexts	Ilia Mirkin	1	-0/+57
	Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>
2015-11-05	gallium: expose a debug message callback settable by context owner	Ilia Mirkin	6	-0/+82
	This will allow gallium drivers to send messages to KHR_debug endpoints Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-05	st/mesa: account for texture views when doing CopyImageSubData	Ilia Mirkin	1	-0/+8
	Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-05	i965/fs: Do not mark used surfaces in FS_OPCODE_GET_BUFFER_SIZE	Iago Toral Quiroga	2	-4/+4
	Do it in the visitor, like we do for other opcodes. v2: use const, get rid of useless surf_index temporary (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-11-05	i965/vec4: Do not mark used surfaces in VS_OPCODE_GET_BUFFER_SIZE	Iago Toral Quiroga	2	-5/+5
	Do it in the visitor, like we do for other opcodes. v2: use const, get rid of useless surf_index temporary (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-11-05	i965/vec4: Do not mark used direct surfaces in VS_OPCODE_PULL_CONSTANT_LOAD	Iago Toral Quiroga	3	-13/+8
	Right now the generator marks direct surfaces as used but leaves marking of indirect surfaces to the caller. Just make the callers handle marking in both cases for consistency. v2: Use const, do not add unnecessary temporary (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-11-05	i965/fs: Do not mark used direct surfaces in UNIFORM_PULL_CONSTANT_LOAD	Iago Toral Quiroga	2	-11/+1
	Right now the generator marks direct surfaces as used but leaves marking of indirect surfaces to the caller. Just make the callers handle marking in both cases for consistency. Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-11-05	i965/fs: Do not mark direct used surfaces in VARYING_PULL_CONSTANT_LOAD	Iago Toral Quiroga	3	-13/+8
	Right now the generator marks direct surfaces as used but leaves marking of indirect surfaces to the caller. Just make the callers handle marking in both cases for consistency. v2: Use const and remove useless surf_index temporary (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-11-05	i965/skl+: Enable support for 16x multisampling	Neil Roberts	2	-1/+10
	Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-11-05	mesa/meta: Use interpolateAtOffset for 16x MSAA copy blit	Neil Roberts	1	-2/+37
	Previously there was a problem in i965 where if 16x MSAA is used then some of the sample positions are exactly on the 0 x or y axis. When the MSAA copy blit shader interpolates the texture coordinates at these sample positions it was possible that it would jump to a neighboring texel due to rounding errors. It is likely that these positions would be used on 16x MSAA because that is where they are defined to be in D3D. To fix that this patch makes it use interpolateAtOffset in the blit shader whenever 16x MSAA is used and the GL_ARB_gpu_shader5 extension is available. This forces it to interpolate the texture coordinates at the pixel center to avoid these problematic positions. This fixes ext_framebuffer_multisample-unaligned-blit and ext_framebuffer_multisample-clip-and-scissor-blit with 16x MSAA on SKL+. v2: Use interpolateAtOffset instead of interpolateAtSample v3: Always try to enable GL_ARB_gpu_shader5 in the shader [Ian Romanick] Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-11-05	meta/blit: Always try to enable GL_ARB_sample_shading	Neil Roberts	1	-14/+2
	Previously this extension was only enabled when blitting between two multisampled buffers. However I don't think it does any harm to just enable it all the time. The ‘enable’ option is used instead of ‘require’ so that the shader will still compile if the extension isn't available in the cases where it isn't used. This will make the next patch simpler because it wants to add another optional extension. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-11-05	meta: Support 16x MSAA in the multisample scaled blit shader	Neil Roberts	4	-11/+49
	v2: Fix the x_scale in the shader. Remove the doubts in the commit message. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-11-05	i965/meta: Support 16x MSAA in the meta stencil blit	Neil Roberts	1	-5/+17
	The destination rectangle is now drawn at 4x4 the size and the shader code to calculate the sample number is adjusted accordingly. Acked-by: Ben Widawsky <ben@bwidawsk.net>
2015-11-05	i965/fs/skl+: Fix calculating gl_SampleID for 16x MSAA	Neil Roberts	1	-1/+7
	In order to accommodate 16x MSAA, the starting sample pair index is now 3 bits rather than 2 on SKL+. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-11-05	i965: Support allocating the MCS buffer for 16x MSAA	Neil Roberts	1	-0/+6
	When 16 samples are used the MCS buffer needs 64 bits per pixel. Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
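The 64 bits per pixel figure can be reproduced with a short calculation. This is a sketch under stated assumptions, not the driver's code: it assumes each sample needs enough bits to index one of num_samples slots, and that the total is rounded up to a power-of-two storage size of at least 8 bits. The function name is hypothetical.

```c
/* Hypothetical helper: MCS storage bits per pixel for a sample count.
 * Assumption: ceil(log2(num_samples)) bits per sample, with the total
 * rounded up to a power-of-two size of at least 8 bits. */
unsigned mcs_bits_per_pixel(unsigned num_samples)
{
   unsigned bits_per_sample = 0;
   while ((1u << bits_per_sample) < num_samples)
      bits_per_sample++;

   unsigned raw_bits = bits_per_sample * num_samples;

   unsigned storage_bits = 8;
   while (storage_bits < raw_bits)
      storage_bits *= 2;

   return storage_bits;
}
```

Under these assumptions 16 samples need 4 bits each, so 16 x 4 = 64 bits per pixel, matching the message; 8 samples come to 24 bits, padded to 32.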
2015-11-05	i965: Support calculating the bits needed to set up 16x MSAA	Neil Roberts	1	-1/+1
	The gen7_surface_msaa_bits function already returns the right values for 16 samples but it just needs its assert to be relaxed. Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-11-05	i965/fs: Add a sampler program key for whether the texture is 16x MSAA	Neil Roberts	3	-1/+16
	When 16x MSAA is used for sampling with texelFetch the compiler needs to use a different instruction which passes more arguments for the MCS data. Previously on skl+ it was unconditionally using this new instruction. However since 16x MSAA is probably going to be pretty rare, it is probably worthwhile to avoid using this instruction for the other sample counts. In order to do that this patch adds a new member to brw_sampler_prog_key_data to track when a sampler refers to a buffer with 16 samples. Note that this isn't done for the vec4 backend because it wouldn't change how many registers it uses. Acked-by: Ben Widawsky <ben@bwidawsk.net>
2015-11-05	i965/vec4/skl+: Use ld2dms_w instead of ld2dms	Neil Roberts	3	-2/+18
	In order to support 16x MSAA, skl+ has a wider version of ld2dms that takes two parameters for the MCS data. The MCS data in the response still fits in a single register so we just need to ensure we copy both values rather than just the lower one. Acked-by: Ben Widawsky <ben@bwidawsk.net>
2015-11-05	i965/fs/skl+: Use ld2dms_w instead of ld2dms	Neil Roberts	6	-5/+60
	In order to support 16x MSAA, skl+ has a wider version of ld2dms that takes two parameters for the MCS data. The MCS data retrieved from the ld_mcs instruction already returns 4 or 8 registers and is documented to return zeroes for the mcsh value when the sample count is less than 16. v2: Use get_lowered_simd_width to fall back to SIMD8 instructions when the message length would be too long in SIMD16. Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-11-05	i965: Program 16x MSAA sample positions.	Neil Roberts	3	-7/+34
	This is the standard pattern used by the other 3D graphics API. BDW has slots for these values, but they aren't actually used until SKL. Even though the documentation for BDW says they must be zero, it doesn't seem to cause any harm to program them anyway. The comment above for the 8x sample positions says that the hardware implements centroid interpolation by picking the centre-most sample that is inside the primitive. That implies that it might be worthwhile to pick a pattern that includes 0.5,0.5. However by experimentation this doesn't seem to actually be the case. With the sample positions in this patch, if I modify the piglit test below so that it instead reports the centroid position, it reports 0.492188,0.421875 which doesn't match any of the positions. If I modify the sample positions so that they include one at exactly 0.5,0.5 it doesn't help and it reports another position which is even further from the center for some reason. arb_gpu_shader5-interpolateAtSample-different Kenneth Graunke experimented with some other patterns that have a higher standard deviation but I think after some discussion it was decided that it would be better to pick the same pattern as the other graphics API in case there are games that rely on this pattern. (Based on a patch by Kenneth Graunke) Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben at bwidawsk.net>
2015-11-05	i965: Handle 16x MSAA in IMS dimension munging code.	Kenneth Graunke	1	-2/+6
	Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Neil Roberts <neil@linux.intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-11-05	nir: Rename nir_live_variables.c to nir_liveness.c.	Kenneth Graunke	2	-1/+1
	It doesn't actually operate on variables. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-11-05	nir: Rename live_variables to live_ssa_defs.	Kenneth Graunke	7	-14/+14
	This computes liveness of SSA values, not nir_variables. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-11-05	i965/vec4: select predicate based on writemask for sel emissions	Alejandro Piñeiro	1	-1/+17
	Equivalent to commit 8ac3b525c but with sel operations. In this case we select the PredCtrl based on the writemask. This patch helps in cases like this:
	1: cmp.l.f0.0 vgrf40.0.x:F, vgrf0.zzzz:F, vgrf7.xxxx:F
	2: cmp.nz.f0.0 null:D, vgrf40.xxxx:D, 0D
	3: (+f0.0) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD
	In this case, cmod propagation can't optimize instruction #2, because instructions #1 and #2 have different writemasks, and we can't directly update the writemask of instruction #2 because our code thinks that the sel at instruction #3 reads all four channels of the flag, when it actually only reads .x. So, with this patch, the previous case becomes this:
	1: cmp.l.f0.0 vgrf40.0.x:F, vgrf0.zzzz:F, vgrf7.xxxx:F
	2: cmp.nz.f0.0 null:D, vgrf40.xxxx:D, 0D
	3: (+f0.0.x) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD
	Now only the x channel of the flag is used, allowing dead code elimination to update the writemask of the second instruction:
	1: cmp.l.f0.0 vgrf40.0.x:F, vgrf0.zzzz:F, vgrf7.xxxx:F
	2: cmp.nz.f0.0 null.x:D, vgrf40.xxxx:D, 0D
	3: (+f0.0.x) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD
	So now cmod propagation can simplify out #2:
	1: cmp.l.f0.0 vgrf40.0.x:F, attr18.wwww:F, vgrf7.xxxx:F
	2: (+f0.0.x) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD
	Shader-db numbers:
	total instructions in shared programs: 6235835 -> 6228008 (-0.13%)
	instructions in affected programs: 219850 -> 212023 (-3.56%)
	total loops in shared programs: 1979 -> 1979 (0.00%)
	helped: 1192
	HURT: 0
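Selecting the predicate control from the writemask might look roughly like the sketch below. The enum and function names are hypothetical stand-ins, not the driver's identifiers. The assumption is that a single-channel writemask maps to a predicate control reading only that flag channel, and anything else keeps the normal per-channel predicate.

```c
/* Hypothetical writemask bits, one per channel. */
enum writemask {
   WM_X = 1 << 0,
   WM_Y = 1 << 1,
   WM_Z = 1 << 2,
   WM_W = 1 << 3
};

/* Hypothetical predicate controls. */
enum pred_ctrl {
   PRED_NORMAL,      /* each channel reads its own flag channel */
   PRED_REPLICATE_X, /* all channels read flag channel x */
   PRED_REPLICATE_Y,
   PRED_REPLICATE_Z,
   PRED_REPLICATE_W
};

/* Pick a predicate control for a sel whose destination writes the
 * channels in `writemask`. */
enum pred_ctrl pred_ctrl_for_writemask(unsigned writemask)
{
   switch (writemask) {
   case WM_X: return PRED_REPLICATE_X;
   case WM_Y: return PRED_REPLICATE_Y;
   case WM_Z: return PRED_REPLICATE_Z;
   case WM_W: return PRED_REPLICATE_W;
   default:   return PRED_NORMAL;
   }
}
```

In the example from the message, the sel writes only .x, so it would end up with the x-only predicate, which corresponds to the (+f0.0.x) form shown above.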
2015-11-04	nouveau: relax fence emit space assert	Ilia Mirkin	3	-3/+3
	We also have the "reserved for kick" space available. Some of my earlier changes can probably be removed, but this is a quick fix for some of the rarer fallout. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>
2015-11-04	vc4: When the create ioctl fails, free our cache and try again.	Eric Anholt	1	-5/+24
	This greatly increases the pressure you can put on the driver before create fails. Ultimately we need to let the kernel take control of our cached BOs and just take them from us (and other clients) directly, but this is a very easy patch for the moment. Cc: "11.0" <mesa-stable@lists.freedesktop.org>
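The retry strategy described above can be sketched as follows. Everything here is a hypothetical model: fake_device stands in for the kernel's allocation budget plus the userspace BO cache, and none of the function names are the vc4 driver's own.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel and the userspace BO cache. */
struct fake_device {
   unsigned free_slots; /* BOs the "kernel" can still hand out */
   unsigned cached_bos; /* idle BOs held in the userspace cache */
};

/* Models the create ioctl: fails once the budget is exhausted. */
static bool create_ioctl(struct fake_device *dev)
{
   if (dev->free_slots == 0)
      return false;
   dev->free_slots--;
   return true;
}

/* Releases every cached BO back to the "kernel". */
static void free_cache(struct fake_device *dev)
{
   dev->free_slots += dev->cached_bos;
   dev->cached_bos = 0;
}

/* Try to allocate; on failure, drop the cache once and try again. */
bool bo_alloc(struct fake_device *dev)
{
   bool cleared_cache = false;

retry:
   if (create_ioctl(dev))
      return true;

   if (!cleared_cache) {
      free_cache(dev);
      cleared_cache = true;
      goto retry;
   }

   return false;
}
```

The cache is flushed at most once per allocation, so a genuine out-of-memory condition still fails instead of looping.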
2015-11-04	vc4: Print the rounded shader size in debug output.	Eric Anholt	1	-1/+1
	It's surprising to see "0kb" printed for debug on short shaders, while 4kb alignment won't be surprising.
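The difference between the two printouts comes down to integer division versus rounding up to the allocation granularity. A minimal sketch, assuming the 4kb alignment mentioned in the message; the macro and function names are hypothetical.

```c
/* Assumed allocation granularity, per the "4kb alignment" above. */
#define BO_ALIGNMENT 4096u

/* Size in kb by plain integer division: short shaders report 0. */
unsigned size_kb_truncated(unsigned bytes)
{
   return bytes / 1024;
}

/* Size in kb after rounding up to the allocation granularity. */
unsigned size_kb_rounded(unsigned bytes)
{
   unsigned aligned = (bytes + BO_ALIGNMENT - 1) / BO_ALIGNMENT * BO_ALIGNMENT;
   return aligned / 1024;
}
```

A 200-byte shader would print as 0kb with the first function and 4kb with the second.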
2015-11-04	vc4: Fix dumping the size of BOs allocated/cached.	Eric Anholt	1	-2/+2
	60MB of cached BOs are a lot less scary than 600MB.
2015-11-04	mesa/tests: add glBufferStorageEXT to ES 3.1 dispatch list	Ilia Mirkin	1	-0/+3
	I thought that aliased functions didn't need to be added, but that might only be if the function aliases something in the same {desktop,ES} space. Resolves the dispatch sanity test failure. Fixes: 13b19aa81 (mesa: expose support for GL_EXT_buffer_storage) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92824 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>