-rw-r--r-- linuxgraphicsdrivers.lyx 123
1 files changed, 102 insertions, 21 deletions
diff --git a/linuxgraphicsdrivers.lyx b/linuxgraphicsdrivers.lyx
index 9e7189b..a9a1700 100644
--- a/linuxgraphicsdrivers.lyx
+++ b/linuxgraphicsdrivers.lyx
@@ -5285,48 +5285,104 @@ When designing a Linux graphics driver aiming for more than simple framebuffer
\begin_layout Section
\lang english
-Hardware sharing
+DRM batch buffer submission model
\end_layout
\begin_layout Standard
\lang english
-Multiplexing of the card command fifo - For cards which only feature a single
- hardware command submission fifo, it has to be shared between multiple
- user space components.
- In that case, this is achieved by the DRM module.
+At the core of the DRM design is the DRM_GEM_EXECBUFFER ioctl, which lets
+ a user space application submit a batch buffer to the kernel, which in
+ turn puts it on the GPU.
+ This ioctl allows many things like sharing the hardware, managing memory
+ and enforcing memory protection.
+\end_layout
+
+\begin_layout Subsection
+
+\lang english
+Hardware sharing
\end_layout
\begin_layout Standard
\lang english
-Prevent simultaneous access to the same hw
+One of the duties of the DRM is to multiplex the GPU itself between multiple
+ user space processes.
+
+\lang american
+Given that the GPU holds graphics state, a problem arises when multiple
+ applications use the same GPU: if nothing is done, the applications can
+ stomp over each other's state.
+ Depending on the hardware at hand, there are two major cases:
+\end_layout
+
+\begin_layout Itemize
+When the GPU features hardware state tracking, the hardware sharing is simpler
+ since each application can send to a separate context, and the GPU tracks
+ each application's state itself.
+ This is the way the nouveau driver works.
+\end_layout
+
+\begin_layout Itemize
+
+\lang english
+When the GPU doesn't have multiple hardware contexts, the common way of
+ multiplexing the hardware is to reemit the state at the beginning of each
+ batch buffer; it's the way the intel and radeon drivers multiplex the GPU.
+ Note that this duty of reemitting the state relies on user space entirely.
+ If the user space doesn't reemit the state at the beginning of each batch
+ buffer, the state from other DRM processes will leak onto it.
\end_layout
\begin_layout Standard
\lang english
-Share video memory
+The DRM also prevents simultaneous access to the same hardware.
\end_layout
-\begin_layout Section
+\begin_layout Subsection
\lang english
-Security
+Memory management and security
\end_layout
\begin_layout Standard
\lang english
-Prevent arbitrary DMAs to memory.
- IF the hardware does not feature memory protection, you have to check the
- command stream before submitting it to the GPU.
+The kernel has the ability to move memory areas around to handle high memory
+ pressure situations.
+ Depending on the hardware, there are two ways to achieve it:
\end_layout
-\begin_layout Section
+\begin_layout Itemize
+If the hardware has complete memory protection and virtualization, then
+ it is possible to page memory resources into the GPU as they get allocated
+ and isolate them per process.
+ Therefore not much is required to support memory protection of GPU memory.
+\end_layout
+
+\begin_layout Itemize
\lang english
-Memory management
+When the hardware doesn't have memory protection, this can still be achieved
+ entirely in the kernel, in a way where the user space is completely oblivious
+ to it.
+ In order to allow relocations to work for a user space process which is
+ otherwise unaware of them, the command submission ioctl will rewrite the
+ command buffers in the kernel by replacing all the hardware offsets with
+ their current locations.
+ This is possible since the kernel knows about the current position of all
+ memory buffers.
+
+\begin_inset Newline newline
+\end_inset
+
+To prevent access to arbitrary GPU memory, the command submission ioctl
+ can also check that each of these offsets is owned by the calling process,
+ and reject the batch buffer if it isn't.
+ This way it is possible to implement memory protection on hardware which
+ doesn't have that functionality otherwise.
\end_layout
\begin_layout Standard
@@ -5335,6 +5391,12 @@ Memory management
GEM, TTM
\end_layout
+\begin_layout Standard
+
+\lang english
+Share video memory
+\end_layout
+
\begin_layout Section
\lang english
@@ -5545,6 +5607,12 @@ PreInit
\begin_layout Standard
\lang english
+This function is in charge of the initialization.
+\end_layout
+
+\begin_layout Standard
+
+\lang english
\begin_inset ERT
status open
@@ -5802,6 +5870,12 @@ FreeScreen
\begin_layout Standard
\lang english
+This function cleans up what ScreenInit set up.
+\end_layout
+
+\begin_layout Standard
+
+\lang english
\begin_inset ERT
status open
@@ -6518,6 +6592,11 @@ PrepareAccess
PrepareAccess makes the pixmap accessible from the CPU.
This includes mapping it into memory, copying it from unmappable video
memory, untiling the pixmap...
+ What exactly this does depends heavily on the GPU, but the core of
+ the matter is that you must provide a piece of CPU-accessible memory which
+ is stored in a linear form.
+ This can be achieved by either mapping GPU memory into the CPU domain with
+ a linear view, or by doing a copy from GPU to CPU memory.
\end_layout
\begin_layout Paragraph
@@ -6530,7 +6609,13 @@ FinishAccess
\lang english
FinishAccess is called once the pixmap is done being accessed, and must
- do the opposite of PrepareAccess.
+ undo what PrepareAccess did to make the pixmap usable by the GPU again.
+\end_layout
+
+\begin_layout Paragraph
+
+\lang english
+A note about EXA performance
\end_layout
\begin_layout Standard
@@ -6554,16 +6639,12 @@ EXA Pixmap migration.
\begin_layout Standard
\lang english
-As a side effect, it is often better to profile before implemnting specific
+As a side effect, it is often better to profile before implementing specific
EXA composite() functions, and look at the common calling patterns; a very
- common example is antialiased fonts (and they will show different calling
+ common example is antialiased fonts (which will also show different calling
patterns if subpixel rendering is enabled or not).
\end_layout
-\begin_layout Section
-Pixman
-\end_layout
-
\begin_layout Standard
\lang english