|
This commit enables all the remaining glamor rendering functions.
Tested with the latest glamor master branch; it passes rendercheck.
One thing that needs to be pointed out is the handling of pictures.
Pictures support many different color formats, but glamor's
textures only support a few. The most common scenario is that
we create a pixmap with a given color depth and then attach it
to a picture which has a specific color format of the same
depth. But there is no way to change a texture's internal
format after the texture has been allocated. If you try,
OpenGL will allocate a new texture, and then the glamor side
and the UXA side become inconsistent. So for all picture-related
operations we can't fall back to the UXA path directly, even
for a rather straightforward operation. For get_image, AddTraps
and so on, we therefore have to add wrapper functions so that
they jump into glamor first.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Since we can not keep an unlimited number of vma cached due to the hard
per-process limits on the number of mappings and recreating mappings is
slow due to excruciatingly slow GTT pagefaults, we need to compromise
and keep a small MRU cache of inactive mmaps.
This uses the new API in libdrm-2.4.29 to specify the limit upon the VMA
cache maintained by libdrm.
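
As a rough illustration of the compromise described above, here is a minimal,
self-contained model of a bounded MRU cache of idle mappings. It is a sketch
under assumptions, not the libdrm implementation: struct vma_cache,
vma_cache_init(), vma_cache_add(), vma_cache_take() and VMA_CACHE_MAX are
hypothetical names, and mappings are reduced to integer handles.

```c
#include <assert.h>
#include <stddef.h>

#define VMA_CACHE_MAX 64 /* hypothetical upper bound for this model */

struct vma_cache {
    int handles[VMA_CACHE_MAX]; /* handles[0] is the most recently idled */
    size_t count;               /* idle mappings currently cached */
    size_t limit;               /* how many idle mappings may stay cached */
};

void vma_cache_init(struct vma_cache *c, size_t limit)
{
    assert(limit > 0 && limit <= VMA_CACHE_MAX);
    c->count = 0;
    c->limit = limit;
}

/*
 * Park a mapping that has just gone idle. Returns the handle that had to be
 * unmapped to stay within the limit, or -1 if nothing was evicted.
 */
int vma_cache_add(struct vma_cache *c, int handle)
{
    int evicted = -1;
    size_t i;

    if (c->count == c->limit)
        evicted = c->handles[--c->count]; /* drop the stalest entry */

    for (i = c->count; i > 0; i--)
        c->handles[i] = c->handles[i - 1];
    c->handles[0] = handle;
    c->count++;
    return evicted;
}

/* Reuse a cached mapping if present; returns 1 on a hit, 0 on a miss. */
int vma_cache_take(struct vma_cache *c, int handle)
{
    size_t i, j;

    for (i = 0; i < c->count; i++) {
        if (c->handles[i] != handle)
            continue;
        for (j = i; j + 1 < c->count; j++)
            c->handles[j] = c->handles[j + 1];
        c->count--;
        return 1;
    }
    return 0;
}
```

A miss in vma_cache_take() corresponds to the slow path in which the mapping
has to be recreated and faulted in again.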
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
libdrm expires its bo 2s after entry into the cache, but we need to free
a buffer to trigger the reaper. So schedule a timer event to trigger 3s
after the last rendering is submitted to free any resident bo during
long periods of idleness.
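
The re-arming behaviour described above can be sketched as follows. This is
only a model under assumptions: struct idle_timer, idle_timer_note_submit()
and idle_timer_fire() are hypothetical names, and time is passed in as plain
milliseconds rather than coming from the server's timer facility.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IDLE_EXPIRE_MS 3000 /* 3s after the last submission, as stated above */

struct idle_timer {
    bool armed;
    uint64_t deadline_ms;
};

/* Called whenever rendering is submitted: push the deadline out again. */
void idle_timer_note_submit(struct idle_timer *t, uint64_t now_ms)
{
    assert(t);
    t->armed = true;
    t->deadline_ms = now_ms + IDLE_EXPIRE_MS;
}

/*
 * Polled from the event loop. Returns true exactly once per idle period,
 * at which point the caller would free a resident bo to kick the reaper.
 */
bool idle_timer_fire(struct idle_timer *t, uint64_t now_ms)
{
    assert(t);
    if (!t->armed || now_ms < t->deadline_ms)
        return false;
    t->armed = false;
    return true;
}
```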
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The paranoia wasn't in vain.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
More paranoia is good for the soul.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This is principally to catch the cases of compositing after a fresh
PutImage.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
A poor cousin to vmap is to instead allocate snooped bo and use a CPU
mapping for zero-copy uploads into GPU resident memory. For maximum
performance, we still need tiled GPU buffers so CPU bo are only useful
in situations where we are frequently migrating data.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
In the happy scenario where the pixmap only resides upon the GPU we can
forgo the CPU allocation entirely. The goal is to reduce the number of
needless mmaps performed by the system memory allocator and reduce
overall memory consumption.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Since the VMA may be reaped at any time whilst the mapping is idle.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
In benchmarking firefox this performs worse - it would appear the
sources are indeed used more often than not.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
For large render targets, we prefer to use tiled bo in order to avoid
severe performance degradation. However, if we don't have a GPU bo but
do have a CPU bo and the operation would be untiled, then simply use the
CPU bo.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
In theory we should be able to disable dual-stream mode and so be
subject to much looser restrictions (such as the pitch need only be
dword aligned). However, achieving single-stream mode seems quite
difficult!
Reported-by: Paul Neumann <paul104x@yahoo.de>
References: https://bugs.freedesktop.org/show_bug.cgi?id=43706
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Our goal is to achieve "single-stream" rendering where the entire
RenderCache is allocated to the colour buffer (rather than split between
colour and depth). In theory all that is required is for the pipeline
not to reference the depth buffer at all, however it is not made clear
when that evaluation is made.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
If the src replaces the dst, it could just be a much larger pixmap!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The binding table is intended to be after all the surface descriptions,
so make sure we write it with the appropriate offset into the buffer.
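
The intended layout can be illustrated with a small sketch. Everything here is
an assumption made for illustration: SURFACE_STATE_BYTES is a hypothetical
padded size for one surface description, emit_binding_table() is a
hypothetical helper, and the real driver's sizes and entry encoding may
differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SURFACE_STATE_BYTES 32 /* hypothetical size of one surface description */

/*
 * Surface descriptions occupy the start of the buffer; the binding table is
 * written immediately after the last one. Each table entry holds the byte
 * offset of the corresponding surface description.
 * Returns the byte offset of the binding table within the buffer.
 */
size_t emit_binding_table(uint32_t *buf, size_t buf_dwords, size_t n_surfaces)
{
    size_t table = n_surfaces * SURFACE_STATE_BYTES / sizeof(uint32_t);
    size_t i;

    assert(table + n_surfaces <= buf_dwords);
    for (i = 0; i < n_surfaces; i++)
        buf[table + i] = (uint32_t)(i * SURFACE_STATE_BYTES);
    return table * sizeof(uint32_t);
}
```

Writing the table at offset 0 instead would overwrite the first surface
description, which is the kind of mistake the message above describes.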
Fixes regression from 699888a64 (uxa/video: Use the common bo
allocations and upload)
Reported-by: Cyril Brulebois <kibi@debian.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43704
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
A VMA cache appears unavoidable thanks to compiz and an excruciatingly
slow GTT pagefault, though it does look like it will be ineffectual
during everyday usage. Compiz (and presumably other compositing
managers) appears to be undoing all the pagefault minimisation as
demonstrated on gen5 with large XPutImage. It also appears the CPU to
memory bandwidth ratio plays a crucial role in determining whether
going straight to GTT or through the CPU cache is a win - so no trivial
heuristic.
x11perf -putimage10 -putimage500 on i5-2467m:
  Before:
    bare:    1,150,000   2,410
    compiz:    438,000   2,670
  After:
    bare:    1,190,000   2,730
    compiz:    437,000   2,690
  UXA:
    bare:      658,000   2,670
    compiz:    389,000   2,520
On i3-330m:
  Before:
    bare:      537,000   1,080
    compiz:    263,000     398
  After:
    bare:      606,000   1,360
    compiz:    203,000     985
  UXA:
    bare:      294,000   1,070
    compiz:    197,000     821
On pnv:
  Before:
    bare:      179,000     213
    compiz:    106,000     123
  After:
    bare:      181,000     246
    compiz:    103,000     197
  UXA:
    bare:      114,000     312
    compiz:     75,700     191
Reported-by: Michael Larabel <Michael@phoronix.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Missed from the previous patch.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Try to avoid a few more unnecessary context switches.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
If we are copying over the entire source onto the destination, just copy
across the GPU bo. This is often used for caching images as pixmaps.
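
One way to express the "entire source onto the destination" test is sketched
below. It is an illustration under assumptions, not the driver's check:
struct box, struct extent and copy_covers_everything() are hypothetical, and
the sketch additionally requires both pixmaps to have identical dimensions.

```c
#include <assert.h>
#include <stdbool.h>

struct box { int x1, y1, x2, y2; };   /* x2 and y2 are exclusive */
struct extent { int width, height; };

/*
 * True when the copy region covers the whole destination, nothing is
 * shifted (dx == dy == 0) and both pixmaps have the same size, so the
 * destination could simply take over the contents of the source bo.
 */
bool copy_covers_everything(struct extent src, struct extent dst,
                            struct box region, int dx, int dy)
{
    assert(region.x1 <= region.x2 && region.y1 <= region.y2);

    if (src.width != dst.width || src.height != dst.height)
        return false;
    if (dx != 0 || dy != 0)
        return false;
    return region.x1 <= 0 && region.y1 <= 0 &&
           region.x2 >= dst.width && region.y2 >= dst.height;
}
```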
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
We have to be careful not to assume that exposed bo are under our full
control, and in particular not to assert their state. :(
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
gen2/3 have a restriction that the 3D pipeline cannot render to a pixmap
with a pitch less than 8/16 respectively. Rather than mandating all
pixmaps to be created with a stride greater than 16, fixup the bo for
the rare occasions when it is necessary.
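
The fixup-on-demand idea can be sketched like this. The helper names are
hypothetical, the generation is passed as a plain integer (2 or 3), and the
limits are taken directly from the message above in whatever unit the pitch
is expressed in.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimum render-target pitch per the message: 8 on gen2, 16 on gen3. */
static int min_render_pitch(int gen)
{
    assert(gen >= 2);
    if (gen == 2)
        return 8;
    if (gen == 3)
        return 16;
    return 0; /* no such restriction modelled for later generations */
}

/* Only pixmaps that are actually rendered to with the 3D pipeline need this. */
bool needs_pitch_fixup(int gen, int pitch)
{
    return pitch < min_render_pitch(gen);
}

/* Pitch to use for the replacement bo when a fixup is required. */
int fixed_pitch(int gen, int pitch)
{
    int min = min_render_pitch(gen);
    return pitch < min ? min : pitch;
}
```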
Reported-by: Paul Neumann <paul104x@yahoo.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43688
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
And share it between the timer and the expiration function, just to
simplify the code.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
A late addition to be flexible for compiling on different systems
heralded its doom.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
For the long interval events (such as expiring the caches), we do not
need precise timing and so can use a coarse timer to allow the system
to coalesce and reduce wakeup events.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
If a pixmap lies around for a couple of minutes not being used, it is
unlikely to be used again in the near future. Reap the GPU buffers of
any of those idle pixmaps (copying to a more compact buffer in system
memory) in order to free up resources for use elsewhere. Any object
that is exposed via DRI is obviously exempt from this reaping.
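
The reaping rule can be written as a small predicate. This is a sketch under
assumptions: struct pixmap_info and should_reap_gpu_bo() are hypothetical, and
PIXMAP_IDLE_MS picks two minutes as one reading of "a couple of minutes".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PIXMAP_IDLE_MS (2 * 60 * 1000) /* hypothetical threshold */

struct pixmap_info {
    uint64_t last_used_ms;  /* timestamp of the last operation on the pixmap */
    bool has_gpu_bo;        /* a GPU buffer is currently attached */
    bool exported_via_dri;  /* shared with a client, so never reaped */
};

/* True when the GPU bo should be copied back to system memory and freed. */
bool should_reap_gpu_bo(const struct pixmap_info *p, uint64_t now_ms)
{
    assert(now_ms >= p->last_used_ms);

    if (!p->has_gpu_bo || p->exported_via_dri)
        return false;
    return now_ms - p->last_used_ms >= PIXMAP_IDLE_MS;
}
```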
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Fixes regression from e0066e77e026b0dd0daa0c3765473c7d63aa6753
(uxa: Simplify Composite solid acceleration for spans by only clipping
once) [2.15.901]
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43649
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
In order to avoid inconsistent usage of coherency domains and to avoid
completely unnecessary clflushing during video playback, use the same
buffer allocation and upload functions as the rest of the driver.
Reported-by: Christophe Roland <roll68@gmail.com>
Bugzilla: http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=60;bug=651316
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
A typo confused left and right, rejecting true vertical edges, and worse
might have incurred false positives.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The operations when setting dpms on should be in the order opposite
of what's done when setting dpms off.
This is because of potentially conflicting effects:
~ drmModeConnectorSetProperty() enables/disables the backlight driver.
  Some backlight drivers such as intel_backlight set the backlight to 0
  when disabled and to max when enabled.
~ intel_output_dpms_backlight() saves the backlight value when turning
  DPMS off and restores it when turning DPMS on.
Here's the current order of operations:
  xset dpms force off (backlight is nonzero)
    drmModeConnectorSetProperty(DPMSModeOff)
      kernel: disable backlight, backlight=0
    intel_output_dpms_backlight(DPMSModeOff)
      save backlight value (0) <-- it has been set to 0 by kernel
      set backlight to 0
  xset dpms force on
    drmModeConnectorSetProperty(DPMSModeOn)
      kernel: enable backlight, backlight=max
    intel_output_dpms_backlight(DPMSModeOn)
      set backlight to saved value (0)
The correct way to do this would be to reverse the operations during
xset dpms force off:
  intel_output_dpms_backlight(DPMSModeOff)
    save backlight value (nonzero)
    set backlight to 0
  drmModeConnectorSetProperty(DPMSModeOff)
    kernel: disable backlight, backlight=0
This restores the saved nonzero backlight value during the force on.
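
The effect of the ordering can be checked with a small stand-alone model. It
is a sketch, not the driver code: kernel_set_dpms() and
driver_dpms_backlight() are hypothetical stand-ins for the two steps named
above, and BACKLIGHT_MAX is an arbitrary scale.

```c
#include <assert.h>
#include <stdbool.h>

#define BACKLIGHT_MAX 100 /* hypothetical scale */

struct output {
    int backlight;       /* current hardware level */
    int saved_backlight; /* level remembered across DPMS off */
};

/* Stand-in for the kernel side: 0 when disabled, max when enabled. */
static void kernel_set_dpms(struct output *o, bool on)
{
    o->backlight = on ? BACKLIGHT_MAX : 0;
}

/* Stand-in for the driver's save (on off) and restore (on on). */
static void driver_dpms_backlight(struct output *o, bool on)
{
    if (on) {
        o->backlight = o->saved_backlight;
    } else {
        o->saved_backlight = o->backlight;
        o->backlight = 0;
    }
}

/* DPMS off runs the two steps in the opposite order to DPMS on. */
void output_dpms(struct output *o, bool on)
{
    assert(o);
    if (on) {
        kernel_set_dpms(o, true);
        driver_dpms_backlight(o, true);
    } else {
        driver_dpms_backlight(o, false); /* save before the kernel zeroes it */
        kernel_set_dpms(o, false);
    }
}
```

Swapping the two calls in the off branch reproduces the problem: the saved
value becomes 0 and the backlight stays dark after DPMS on.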
Signed-off-by: Simon Que <sque@chromium.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|