From 63f78fcab26ae6ccaa81a1fe4dc1a510db59f1a2 Mon Sep 17 00:00:00 2001 From: Stephane Marchesin Date: Sun, 11 Mar 2012 16:42:33 -0700 Subject: More DRM & X driver changes. --- linuxgraphicsdrivers.lyx | 123 +++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 102 insertions(+), 21 deletions(-) diff --git a/linuxgraphicsdrivers.lyx b/linuxgraphicsdrivers.lyx index 9e7189b..a9a1700 100644 --- a/linuxgraphicsdrivers.lyx +++ b/linuxgraphicsdrivers.lyx @@ -5285,48 +5285,104 @@ When designing a Linux graphics driver aiming for more than simple framebuffer \begin_layout Section \lang english -Hardware sharing +DRM batch buffer submission model \end_layout \begin_layout Standard \lang english -Multiplexing of the card command fifo - For cards which only feature a single - hardware command submission fifo, it has to be shared between multiple - user space components. - In that case, this is achieved by the DRM module. +At the core of the DRM design is the DRM_GEM_EXECBUFFER ioctl, which lets + a user space application submit a batch buffer to the kernel, which in + turn puts it on the GPU. + This ioctl allows many things like sharing the hardware, managing memory + and enforcing memory protection. +\end_layout + +\begin_layout Subsection + +\lang english +Hardware sharing \end_layout \begin_layout Standard \lang english -Prevent simultaneous access to the same hw +One of the duties of the DRM is to multiplex the GPU itself between multiple + user space processes. + +\lang american +Given that the GPU holds graphics state, a problem arises when multiple + applications use the same GPU: if nothing is done, the applications can + stomp over each other's state. + Depending on the hardware at hand, there are two major cases: +\end_layout + +\begin_layout Itemize +When the GPU features hardware state tracking, the hardware sharing is simpler + since each application can send to a separate context, and the GPU tracks + each application's state itself.
+ This is the way the nouveau driver works. +\end_layout + +\begin_layout Itemize + +\lang english +When the GPU doesn't have multiple hardware contexts, the common way of + multiplexing the hardware is to reemit the state at the beginning of each + batch buffer; it's the way the intel and radeon drivers multiplex the GPU. + Note that this duty of reemitting the state falls entirely on user space. + If the user space doesn't reemit the state at the beginning of each batch + buffer, the state from other DRM processes will leak onto it. \end_layout \begin_layout Standard \lang english -Share video memory +The DRM also prevents simultaneous access to the same hardware. \end_layout -\begin_layout Section +\begin_layout Subsection \lang english -Security +Memory management and security \end_layout \begin_layout Standard \lang english -Prevent arbitrary DMAs to memory. - IF the hardware does not feature memory protection, you have to check the - command stream before submitting it to the GPU. +The kernel has the ability to move memory areas around to handle high memory + pressure situations. + Depending on the hardware, there are two ways to achieve it: \end_layout -\begin_layout Section +\begin_layout Itemize +If the hardware has complete memory protection and virtualization, then + it is possible to page in memory resources into the GPU as they get allocated + and isolate them per-process. + Therefore not much is required to support memory protection of GPU memory. +\end_layout + +\begin_layout Itemize \lang english -Memory management +When the hardware doesn't have memory protection, this can still be achieved + entirely in the kernel, in a way where the user space is completely oblivious + to it. + In order to allow relocations to work for a user space process which is + otherwise unaware of them, the command submission ioctl will rewrite the + command buffers in the kernel by replacing all the hardware offsets with + their current locations.
+ This is possible since the kernel knows about the current position of all + memory buffers. + +\begin_inset Newline newline +\end_inset + +To prevent access to arbitrary GPU memory, the command submission ioctl + can also check that each of these offsets is owned by the calling process, + and reject the batch buffer if it isn't. + This way it is possible to implement memory protection on hardware which + doesn't have that functionality otherwise. \end_layout \begin_layout Standard @@ -5335,6 +5391,12 @@ Memory management GEM, TTM \end_layout +\begin_layout Standard + +\lang english +Share video memory +\end_layout + \begin_layout Section \lang english @@ -5544,6 +5606,12 @@ PreInit \begin_layout Standard +\lang english +This function is in charge of the initialization. +\end_layout + +\begin_layout Standard + \lang english \begin_inset ERT status open @@ -5801,6 +5869,12 @@ FreeScreen \begin_layout Standard +\lang english +Cleans up what ScreenInit set up. +\end_layout + +\begin_layout Standard + \lang english \begin_inset ERT status open @@ -6518,6 +6592,11 @@ PrepareAccess PrepareAccess makes the pixmap accessible from the CPU. This includes mapping it into memory, copying it from unmappable video memory, untiling the pixmap... + What exactly this does is very dependent on the GPU, but the core of + the matter is that you must provide a piece of CPU-accessible memory which + is stored in a linear form. + This can be achieved by either mapping GPU memory into the CPU domain with + a linear view, or by doing a copy from GPU to CPU memory. \end_layout \begin_layout Paragraph @@ -6530,7 +6609,13 @@ FinishAccess \lang english FinishAccess is called once the pixmap is done being accessed, and must - do the opposite of PrepareAccess. + undo what PrepareAccess did to make the pixmap usable by the GPU again. +\end_layout + +\begin_layout Paragraph + +\lang english +A note about EXA performance \end_layout \begin_layout Standard @@ -6554,16 +6639,12 @@ EXA Pixmap migration.
\begin_layout Standard \lang english -As a side effect, it is often better to profile before implemnting specific +As a side effect, it is often better to profile before implementing specific EXA composite() functions, and look at the common calling patterns; a very - common example is antialiased fonts (and they will show different calling + common example is antialiased fonts (which will also show different calling patterns if subpixel rendering is enabled or not). \end_layout -\begin_layout Section -Pixman -\end_layout - \begin_layout Standard \lang english -- cgit v1.2.3