~johnharr/scheduler - i915 GPU scheduler patches

Age	Commit message	Author	Files	Lines
2016-06-28	drm/i915: Native syncLibDRM	John Harrison	1	-0/+9

2016-04-14	drm/i915: Add wrapper for context priority interface	John Harrison	1	-0/+5
	There is an EGL extension to set execution priority per context. This can be implemented via the i915 per context priority parameter. This patch adds a wrapper to connect the two together in a way that can be updated as necessary without breaking one side or the other. Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
2015-12-14	intel: Add support for softpin	Michał Winiarski	1	-0/+1
	Softpin allows userspace to take greater control of GPU virtual address space and eliminates the need for relocations. It can also be used to mirror addresses between GPU and CPU (shared virtual memory). Calls to drm_intel_bo_emit_reloc are still required to build the list of drm_i915_gem_exec_objects at exec time, but no entries in relocs are created. Self-relocs don't make any sense for softpinned objects and can indicate a programming error, so they are forbidden. Softpinned objects are marked by an asterisk in debug dumps. Cc: Thomas Daniel <thomas.daniel@intel.com> Cc: Kristian Høgsberg <krh@bitplanet.net> Cc: Zou Nanhai <nanhai.zou@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Ben Widawsky <ben@bwidawsk.net> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2015-12-14	intel: 48b ppgtt support (EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag)	Michel Thierry	1	-0/+1
	Gen8+ supports 48-bit virtual addresses, but some objects must always be allocated inside the 32-bit address range. Specifically, any resource used with flat/heapless (0x00000000-0xfffff000) General State Heap (GSH) or Instruction State Heap (ISH) must be in a 32-bit range, because the General State Offset and Instruction State Offset are limited to 32 bits. The i915 driver has been modified to provide a flag to set when the 4GB limit is not necessary in a given bo (EXEC_OBJECT_SUPPORTS_48B_ADDRESS). 48-bit range will only be used when explicitly requested. Callers to the existing drm_intel_bo_emit_reloc function should set the use_48b_address_range flag beforehand, in order to use full ppgtt range. v2: Make set/clear functions nops on pre-gen8 platforms, and use them internally in emit_reloc functions (Ben) s/48BADDRESS/48B_ADDRESS/ (Dave) v3: Keep set/clear functions internal, no-one needs to use them directly. v4: Don't set 48bit-support flag in emit reloc, check for ppgtt type before enabling set/clear function, print full offsets in debug statements, using port of lower_32_bits and upper_32_bits from linux kernel (Michał) References: http://lists.freedesktop.org/archives/intel-gfx/2015-July/072612.html Cc: Ben Widawsky <ben@bwidawsk.net> Cc: Michał Winiarski <michal.winiarski@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
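The v4 note above mentions ports of the kernel's lower_32_bits and upper_32_bits helpers for printing full offsets. A minimal sketch of what such ports might look like follows; the range check fits_in_32bit_range is a hypothetical helper added for illustration and is not part of libdrm.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace equivalents of the kernel's lower_32_bits()/upper_32_bits(). */
static inline uint32_t lower_32_bits(uint64_t n)
{
    return (uint32_t)n;
}

static inline uint32_t upper_32_bits(uint64_t n)
{
    return (uint32_t)(n >> 32);
}

/*
 * Hypothetical helper (not a libdrm function): true when the range
 * [offset, offset + size) lies entirely below the 4 GiB boundary.
 * Written so that offset + size is never computed and cannot overflow.
 */
static inline bool fits_in_32bit_range(uint64_t offset, uint64_t size)
{
    const uint64_t limit = UINT64_C(1) << 32;

    return offset < limit && size <= limit - offset;
}
```

Splitting an offset this way lets a debug statement print both halves with 32-bit format specifiers, regardless of how wide unsigned long is on the platform.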
2015-08-10	intel: wrap intel_bufmgr.h C code for C++ compilation/linking	Tapani Pälli	1	-0/+8
	We need this include in porting changes for the OpenGL ES conformance suite. v2: remove c_plusplus usage Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-18	intel: Export total subslice and EU counts	Jeff McGee	1	-0/+3
	Update kernel interface with new I915_GETPARAM ioctl entries for subslice total and EU total. Add a wrapping function for each parameter. Userspace drivers need these values when constructing GPGPU commands. This kernel query method is intended to replace the PCI ID-based tables that userspace drivers currently maintain. The kernel driver can employ fuse register reads as needed to ensure the most accurate determination of GT config attributes. This first became important with Cherryview in which the config could differ between devices with the same PCI ID. The kernel detection of these values is device-specific. Userspace drivers should continue to maintain ID-based tables for older devices which return ENODEV when using this query. v2: remove unnecessary include of <stdbool.h> and increment the I915_GETPARAM indices to match updated kernel patch. For: VIZ-4636 Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Jeff McGee <jeff.mcgee@intel.com> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
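The fallback described above (ask the kernel first, fall back to an ID-based table when the query reports ENODEV) can be sketched as follows. Everything here is hypothetical: the callback type, the two stand-in query functions, the table layout and its values are invented for illustration and do not describe real hardware or the actual libdrm wrappers.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical query callback: returns 0 and fills *value, or a negative errno. */
typedef int (*example_query_fn)(int *value);

/* Hypothetical ID-based table entry; the values used with it are made up. */
struct example_devid_entry {
    int devid;
    int eu_total;
};

/* Stand-in for a kernel that supports the query (the value 24 is invented). */
static int example_query_supported(int *value)
{
    *value = 24;
    return 0;
}

/* Stand-in for an older device where the query fails with ENODEV. */
static int example_query_unsupported(int *value)
{
    (void)value;
    return -ENODEV;
}

/* Prefer the kernel's answer; consult the table only on -ENODEV. */
static int example_get_eu_total(example_query_fn query, int devid,
                                const struct example_devid_entry *table,
                                size_t table_len, int *eu_total)
{
    int ret = query(eu_total);
    size_t i;

    if (ret != -ENODEV)
        return ret; /* success, or an unrelated error to report as-is */

    for (i = 0; i < table_len; i++) {
        if (table[i].devid == devid) {
            *eu_total = table[i].eu_total;
            return 0;
        }
    }
    return -ENODEV;
}
```

Keeping the table lookup behind the ENODEV check means newer devices always get the fuse-derived value from the kernel, while older devices keep working from the static table.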
2014-09-17	intel: Add support for userptr objects	Tvrtko Ursulin	1	-0/+5
	Allow userptr objects to be created and used via libdrm_intel. At the moment tiling and mapping to GTT aperture is not supported due to hardware limitations across different generations and uncertainty about its usefulness. v2: Improved error handling in feature detection per review comments. v3: Rebase on top of the drm_public addition, minor whitespace addition. Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> (v3) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (v1,v2)
2014-01-20	intel: Create a new drm_intel_bo offset64 field.	Kenneth Graunke	1	-3/+9
	The existing 'offset' field is unfortunately typed as 'unsigned long', which is unfortunately only 4 bytes with a 32-bit userspace. Traditionally, the hardware has only supported 32-bit virtual addresses, so even though the kernel uses a __u64, the value would always fit. However, Broadwell supports 48-bit addressing. So with a 64-bit kernel, the card virtual address may be too large to fit in the 'offset' field. Ideally, we would change the type of 'offset' to be a uint64_t---but this would break the libdrm ABI. Instead, we create a new 'offset64' field to hold the full 64-bit value from the kernel, and store the 32-bit truncation in the existing 'offset' field, for compatibility. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
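The compatibility scheme described above can be illustrated with a small sketch. The struct below is a hypothetical stand-in that models only the two offset fields; it is not the real drm_intel_bo layout, and the setter name is invented.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two offset fields discussed above. */
struct example_bo {
    unsigned long offset; /* legacy field, 4 bytes wide on 32-bit userspace */
    uint64_t offset64;    /* full 64-bit address reported by the kernel */
};

/* Store the kernel's value in both fields, as the commit message describes. */
static void example_bo_set_offset(struct example_bo *bo, uint64_t kernel_offset)
{
    bo->offset64 = kernel_offset;
    /* This cast truncates only where unsigned long is narrower than 64 bits. */
    bo->offset = (unsigned long)kernel_offset;
}
```

Existing callers that read offset keep compiling and keep seeing the same layout, while new callers that need the full 48-bit address read offset64 instead.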
2013-11-15	intel: Add support for GPU reset status query ioctl	Ian Romanick	1	-0/+5
	I would have just used the drmIoctl interface directly in Mesa, but the ioctl needs some data from the drm_intel_context that is not exposed outside libdrm. This ioctl is in the drm-intel-next tree as b635991. v2: Update based on Mika's kernel work. v3: Fix compile failures from last-minute typos. Sigh. v4: Import the actual changes from the kernel i915_drm.h. Only comments on some fields of drm_i915_reset_stats differed. There are still some deltas between the kernel i915_drm.h and the one in libdrm, but those can be resolved in other patches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v3] Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08	Revert "intel: Add support for GPU reset status query ioctl"	Dave Airlie	1	-5/+0
	This reverts commit 6335e1d28c422050024bcf4100c4fb3a5bac2afb. No taxation without representation, in other words no userspace without kernel stuff being in a stable location, either drm-next but I'll accept drm-intel-next for intel specific stuff.
2013-11-07	intel: Add support for GPU reset status query ioctl	Ian Romanick	1	-0/+5
	I would have just used the drmIoctl interface directly in Mesa, but the ioctl needs some data from the drm_intel_context that is not exposed outside libdrm. v2: Update based on Mika's kernel work. v3: Fix compile failures from last-minute typos. Sigh. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-06-10	intel/aub: Implement a way to specify the output .aub filename	Damien Lespiau	1	-0/+3
	Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-10	intel: Add a function for the new register read ioctl.	Eric Anholt	1	-0/+3
	Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2012-07-20	intel: add prime interface for getting/setting a prime bo. (v4)	Dave Airlie	1	-0/+4
	This adds interfaces for the X driver to use to create a prime handle from a buffer, and create a bo from a handle. v2: use Chris's suggested naming (well from at least for consistency) v3: git commit --amend fail v4: fix as per Chris's suggestions, group assignments, add get tiling Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-06-27	intel/context: new execbuf interface for contexts	Ben Widawsky	1	-0/+5
	To support this we extract the common execbuf2 functionality to be called with, or without contexts. The context'd execbuf does not support some of the dri1 stuff. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-06-27	intel/context: Add drm_intel_context type	Ben Widawsky	1	-0/+1
	Add an opaque type representing a HW context. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-06-27	intel: wait render timeout implementation	Ben Widawsky	1	-0/+1
	int drm_intel_gem_bo_wait(drm_intel_bo *bo, uint64_t timeout_ns)
	This should bump the libdrm version. We're waiting for context support so we can do both features in one bump.
	v2: don't return remaining timeout amount; use get param and fallback for older kernels
	v3: only doing getparam at init; prototypes now have a signed input value
	v4: update comments; fall back to correct polling behavior with new userspace and old kernel
	v5: since the drmIoctl patch was not well received, return appropriate values in this function instead. As Daniel pointed out, the polling case (timeout == 0) should also return -ETIME.
	Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-10	intel: Add the ability to supply annotations for .aub files.	Paul Berry	1	-0/+10
	This patch adds a new function, drm_intel_bufmgr_gem_set_aub_annotations(), which can be used to annotate the type and subtype of data stored in various sections of each buffer. This data is used to populate type and subtype fields when generating the .aub file, which improves the ability of later debugging tools to analyze the contents of the .aub file. If drm_intel_bufmgr_gem_set_aub_annotations() is not called, then we fall back to the old set of annotations (annotate the portion of the batchbuffer that is executed as AUB_TRACE_TYPE_BATCH, and everything else as AUB_TRACE_TYPE_NOTYPE). Reviewed-by: Eric Anholt <eric@anholt.net>
2012-03-10	intel: Add support for (possibly) unsynchronized maps.	Eric Anholt	1	-0/+2
	This improves the performance of Mesa's GL_MAP_UNSYNCHRONIZED_BIT path in GL_ARB_map_buffer_range. Improves Unigine Tropics performance at 1024x768 by 2.30482% +/- 0.0492146% (n=61) v2: Fix comment grammar. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-03-09	intel: Bump the copyright dates on the bufmgr files.	Eric Anholt	1	-1/+1
	We've been hacking these constantly.
2012-03-09	intel: Add .aub file output support.	Eric Anholt	1	-0/+14
	This will allow the driver to capture all of its execution state to a file for later debugging. intel_gpu_dump is limited in that it only captures batchbuffers, and Mesa's captures, while more complete, still capture only a portion of the state involved in execution. This is a squash commit of a long series of hacking as we tried to get the resulting traces to work in the internal simulator. It contains contributions by Yuanhan Liu and Kenneth Graunke. v2: Drop the MI_FLUSH_ENABLE setup. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-03-09	intel: Add support for overriding the PCI ID via an environment variable	Kenneth Graunke	1	-0/+1
	For example: export INTEL_DEVID_OVERRIDE=0x162 If this variable is set, don't actually submit the batchbuffer to the GPU, it probably contains commands for the wrong generation of hardware. v2: Introduce a getter for the overridden devid, and avoid getenv per exec. Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Eric Anholt <eric@anholt.net>
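The v2 note above mentions a getter that avoids calling getenv on every exec. A minimal sketch of that idea follows, assuming the variable holds a 16-bit PCI device ID in any base that strtol accepts; the function names are hypothetical and the real libdrm parsing may differ.

```c
#include <stdlib.h>

/*
 * Hypothetical parser: returns the device ID, or 0 when the value is
 * missing, malformed, or outside the 16-bit PCI device ID range.
 */
static int parse_devid_override(const char *value)
{
    char *end;
    long id;

    if (value == NULL || *value == '\0')
        return 0;

    id = strtol(value, &end, 0); /* base 0 accepts "0x162" and "354" alike */
    if (*end != '\0' || id <= 0 || id > 0xffff)
        return 0;

    return (int)id;
}

/*
 * Hypothetical cached getter: reads the environment once and reuses the
 * result afterwards. Not thread-safe; a real implementation would more
 * likely do this once at buffer manager initialization.
 */
static int get_devid_override(void)
{
    static int cached = -1;

    if (cached < 0)
        cached = parse_devid_override(getenv("INTEL_DEVID_OVERRIDE"));
    return cached;
}
```

A caller could then treat a nonzero result as the signal to skip the actual batchbuffer submission, since the batch was likely built for different hardware.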
2012-01-04	intel: Add an interface for setting the output file for decode.	Eric Anholt	1	-0/+2
	Consumers often want to choose stdout vs stderr, and for testing I want to output to an open_memstream file. Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
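The testing use case above relies on the POSIX open_memstream function, which returns a FILE * backed by a growing heap buffer. The sketch below shows the pattern with a hypothetical example_decode function standing in for the real decoder; only open_memstream and the standard stdio calls are real APIs.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical decoder: writes one line per dword to a caller-chosen stream. */
static void example_decode(FILE *out, const unsigned int *data, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        fprintf(out, "0x%08x\n", data[i]);
}

/*
 * Capture the decoder's output in a heap string via open_memstream().
 * Returns NULL on failure; the caller frees the result.
 */
static char *example_decode_to_string(const unsigned int *data, size_t count)
{
    char *buf = NULL;
    size_t len = 0;
    FILE *out = open_memstream(&buf, &len);

    if (out == NULL)
        return NULL;

    example_decode(out, data, count);

    /* The buffer pointer and length are final only after fclose() succeeds. */
    if (fclose(out) != 0) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

Because the decoder takes a FILE *, the same function serves stdout, stderr, and in-memory capture for tests without any special casing.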
2011-12-29	intel: Get intel_decode.c minimally building.	Eric Anholt	1	-0/+12
	My plan is to use this drm_intel_dump_batchbuffer() interface for the current GPU tools, and the current Mesa batch dumping usage, while eventually building more interesting interfaces for other uses. Warnings are currently suppressed by using a helper lib with CFLAGS set manually, because the code is totally not ready for libdrm's warnings setup. Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Eugeni Dodonov <eugeni@dodonov.net>
2011-12-05	intel: Add an interface to limit vma caching	Chris Wilson	1	-0/+2
	There is a per-process limit on the number of vma that the process can keep open, so we cannot keep an unlimited cache of unused vma's (besides keeping track of all those vma in the kernel adds considerable overhead). However, in order to work around inefficiencies in the kernel it is beneficial to reuse the vma, so keep a MRU cache of vma. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
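A bounded cache of this kind can be modelled in a few lines. The sketch below is a hypothetical, simplified model that tracks integer handles in most-recently-used order and evicts the least recently used entry once a limit is reached; it is not the libdrm implementation, which manages real mappings.

```c
#include <assert.h>
#include <string.h>

#define EXAMPLE_CACHE_CAP 16

/* Hypothetical model: handles ordered from most to least recently used. */
struct example_vma_cache {
    int handles[EXAMPLE_CACHE_CAP];
    int count;
    int limit; /* must satisfy 0 < limit <= EXAMPLE_CACHE_CAP */
};

/*
 * Mark `handle` as most recently used, inserting it if absent.
 * Returns the evicted handle when the limit forced an eviction, else -1.
 * A real cache would unmap the evicted entry at that point.
 */
static int example_vma_cache_touch(struct example_vma_cache *c, int handle)
{
    int i, evicted = -1;

    assert(c->limit > 0 && c->limit <= EXAMPLE_CACHE_CAP);

    for (i = 0; i < c->count; i++)
        if (c->handles[i] == handle)
            break;

    if (i == c->count) { /* not cached yet */
        if (c->count == c->limit) {
            evicted = c->handles[c->count - 1]; /* least recently used */
            c->count--;
        }
        i = c->count++;
    }

    /* Shift entries [0, i) down one slot and put the handle in front. */
    memmove(&c->handles[1], &c->handles[0], (size_t)i * sizeof c->handles[0]);
    c->handles[0] = handle;
    return evicted;
}
```

The limit bounds how many mappings stay open at once, which is the point of the interface, while recently used entries remain available for reuse.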
2011-10-28	intel: Add an interface for removing relocs after they're added.	Eric Anholt	1	-0/+2
	This lets us replace the current inner drawing loop of mesa:
	    for each prim {
	        compute bo list
	        if (check_aperture_space(bo list)) {
	            batch_flush()
	            compute bo list
	            if (check_aperture_space(bo list)) {
	                whine_about_batch_size()
	                fall back;
	            }
	        }
	        upload state to BOs
	    }
	with this inner loop:
	    for each prim {
	    retry:
	        upload state to BOs
	        if (check_aperture_space(batch)) {
	            if (!retried) {
	                reset_to_last_prim()
	                batch_flush()
	            } else {
	                if (batch_flush())
	                    whine_about_batch_size()
	                goto retry;
	            }
	        }
	    }
	This avoids having to implement code to walk over certain sets of GL state twice (the "compute bo list" step). While it's not a performance improvement, it's a significant win in code complexity: about -200 lines, and one place to make mistakes related to aperture space instead of N places to forget some BO we should have included.
	Note how if we do a reset in the new loop, we immediately flush. We don't need to check aperture space -- the kernel will tell us if we actually ran out of aperture or not. And if we did run out of aperture, it's because either the single prim was too big, or because check_aperture was wrong at the point of setting up the last primitive.
	Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
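The "reset to last prim" step in the loop above depends on being able to drop relocations that were added after a saved point. The sketch below models that rollback with a hypothetical fixed-size relocation list; the struct and function names are invented for illustration and only mirror the idea of the interface this commit adds.

```c
#include <assert.h>

#define EXAMPLE_MAX_RELOCS 8

/* Hypothetical batch model: just a bounded list of relocation targets. */
struct example_batch {
    int relocs[EXAMPLE_MAX_RELOCS];
    int reloc_count;
};

/* Append one relocation; returns -1 when the batch is full. */
static int example_emit_reloc(struct example_batch *b, int target)
{
    if (b->reloc_count == EXAMPLE_MAX_RELOCS)
        return -1;
    b->relocs[b->reloc_count++] = target;
    return 0;
}

/* Drop every relocation added at or after index `start`. */
static void example_clear_relocs(struct example_batch *b, int start)
{
    assert(start >= 0 && start <= b->reloc_count);
    b->reloc_count = start;
}

/*
 * Emit all relocations for one primitive. On failure, roll back to the
 * state before this primitive so the batch ends at the last complete one
 * and can be flushed before retrying.
 */
static int example_emit_prim(struct example_batch *b, const int *targets, int n)
{
    int saved = b->reloc_count;
    int i;

    for (i = 0; i < n; i++) {
        if (example_emit_reloc(b, targets[i]) != 0) {
            example_clear_relocs(b, saved);
            return -1;
        }
    }
    return 0;
}
```

With rollback available, state is uploaded optimistically and undone only when it does not fit, so the separate "compute bo list" pass is no longer needed.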
2011-06-04	intel: Add interface to query aperture sizes.	Chris Wilson	1	-0/+2
	Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-19	intel: Export CONSTANT_BUFFER addressing mode	Chris Wilson	1	-1/+1
	Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-25	intel: Add a forward declaration of struct drm_clip_rect	Chris Wilson	1	-2/+4
	... so that intel_bufmgr.h can be compiled standalone. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-08-26	Avoid use of c++ reserved keyword "virtual" when using a C++ compiler.	Eric Anholt	1	-1/+5
	Avoids requiring nasty hacks around libdrm headers in the new C++ parts of Mesa drivers.
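One common way to keep a header usable from both languages is to select the field name with the preprocessor. The sketch below shows that pattern on a hypothetical struct; the alternative name virt and the accessor function are assumptions for illustration, not a statement of what the header actually contains.

```c
/* Hypothetical struct: the same storage gets a C++-safe name under C++. */
struct example_bo {
    unsigned long size;
#ifdef __cplusplus
    void *virt;    /* "virtual" is a reserved keyword in C++ */
#else
    void *virtual; /* original name, kept so existing C callers still build */
#endif
};

/* Accessor that hides the naming difference from shared code. */
static void *example_bo_cpu_pointer(const struct example_bo *bo)
{
#ifdef __cplusplus
    return bo->virt;
#else
    return bo->virtual;
#endif
}
```

Since only the member's name changes and not its type or position, the struct layout is identical in both languages and objects can be passed freely between C and C++ translation units.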
2010-06-06	intel: Add support for kernel multi-ringbuffer API.	Zou Nan hai	1	-0/+3
	This introduces a new API to exec on BSD ring buffer, for H.264 VLD decoding. Signed-off-by: Xiang Hai hao <haihao.xiang@intel.com> Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
2010-05-11	intel: query whether a buffer is reusable.	Chris Wilson	1	-0/+1
	Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-03-02	libdrm/intel: execbuf2 support	Jesse Barnes	1	-0/+5
	This patch to libdrm adds support for the new execbuf2 ioctl. If detected, it will be used instead of the old ioctl. By using the new drm_intel_bufmgr_gem_enable_fenced_relocs(), you can indicate that any time a fence register is actually required for a relocation target you will call drm_intel_bo_emit_reloc_fence instead of drm_intel_bo_emit_reloc, which will reduce fence register pressure. Signed-off-by: Eric Anholt <eric@anholt.net>
2009-11-20	Merge remote branch 'origin/master' into libdrm	Kristian Høgsberg	1	-0/+1

2009-11-17	Move libdrm/ up one level	Kristian Høgsberg	1	-0/+212