<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
<link href="wayland.css" rel="stylesheet" type="text/css">
<script type="text/javascript" src="generated-toc.js"></script>
<title>Wayland</title>
</head>
<body>
<h1><a href="/"><img src="wayland.png" alt="Wayland logo"></a></h1>
<div id="generated-toc" class="generate_from_h2"></div>
<h2>Wayland Architecture</h2>
<p>A good way to understand the wayland architecture and how it is
different from X is to follow an event from the input device to the
point where the change it affects appears on screen.</p>
<p>This is where we are now with X:</p>
<p><img src="x-architecture.png" alt="X architecture diagram"></p>
<ol>
<li>
The kernel gets an event from an input device and sends it to X
through the evdev input driver. The kernel does all the hard work
here by driving the device and translating the different device
specific event protocols to the linux evdev input event standard.
</li>
<li>
The X server determines which window the event affects and sends
it to the clients that have selected for the event in question on
that window. The X server doesn't actually know how to do this
right, since the window location on screen is controlled by the
compositor and may be transformed in a number of ways that the X
server doesn't understand (scaled down, rotated, wobbling, etc).
</li>
<li>
The client looks at the event and decides what to do. Often the
UI will have to change in response to the event - perhaps a check
box was clicked or the pointer entered a button that must be
highlighted. Thus the client sends a rendering request back to
the X server (a minimal Xlib sketch of this round trip follows the
list).
</li>
<li>
When the X server receives the rendering request, it sends it to
the driver to let it program the hardware to do the rendering.
The X server also calculates the bounding region of the rendering,
and sends that to the compositor as a <em>damage event</em>.
</li>
<li>
The damage event tells the compositor that something changed in
the window and that it has to recomposite the part of the screen
where that window is visible. The compositor is responsible for
rendering the entire screen contents based on its scenegraph and
the contents of the X windows. Yet, it has to go through the X
server to render this.
</li>
<li>
The X server receives the rendering requests from the compositor
and either copies the compositor back buffer to the front buffer
or does a pageflip. In the general case, the X server has to do
this step so it can account for overlapping windows (which may
require clipping) and determine whether or not it can page flip.
However, for a compositor, which is always fullscreen, this is
another unnecessary context switch.
</li>
</ol>
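<p>
To make the client's side of this concrete, here is a minimal
sketch of the round trip in steps 2 through 4, written against
plain Xlib. The display, window and GC are assumed to have been
created during application setup; this is an illustration, not the
only way X clients render.
</p>
<pre>
#include &lt;X11/Xlib.h&gt;

/* Sketch: react to a button press by sending a rendering request
 * back to the X server (steps 2-4 above). */
void handle_events(Display *dpy, Window win, GC gc)
{
    XEvent ev;

    /* Step 2: select for the events we care about on this window. */
    XSelectInput(dpy, win, ButtonPressMask | ExposureMask);

    for (;;) {
        /* The X server routes the event to us. */
        XNextEvent(dpy, &amp;ev);

        /* Step 3: decide what to do; here, highlight a box. */
        if (ev.type == ButtonPress) {
            /* Step 4: a rendering request goes back to the
             * X server, which hands it to the driver. */
            XFillRectangle(dpy, win, gc, 10, 10, 100, 30);
            XFlush(dpy);
        }
    }
}
</pre>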
<p>
As suggested above, there are a few problems with this approach.
The X server doesn't have the information to decide which window
should receive the event, nor can it transform the screen
coordinates to window-local coordinates. And even though X has
handed responsibility for the final painting of the screen to the
compositing manager, X still controls the front buffer and
modesetting. Most of the complexity that the X server used to
handle is now available in the kernel or self contained libraries
(KMS, evdev, mesa, fontconfig, freetype, cairo, Qt, etc). In
general, the X server is now just a middle man that introduces an
extra step between applications and the compositor and an extra step
between the compositor and the hardware.
</p>
<p>
In wayland the compositor <em>is</em> the display server. We
transfer the control of KMS and evdev to the compositor. The
wayland protocol lets the compositor send the input events directly
to the clients and lets the client send the damage event directly to
the compositor:
</p>
<p><img src="wayland-architecture.png" alt="Wayland architecture diagram"></p>
<ol>
<li>
The kernel gets an event and sends it to the compositor. This is
similar to the X case, which is great, since we get to reuse all
the input drivers in the kernel.
</li>
<li>
The compositor looks through its scenegraph to determine which
window should receive the event. The scenegraph corresponds to
what's on screen and the compositor understands the
transformations that it may have applied to the elements in the
scenegraph. Thus, the compositor can pick the right window and
transform the screen coordinates to window-local coordinates by
applying the inverse transformations. The types of transformation
that can be applied to a window are restricted only by what the
compositor can do, as long as it can compute the inverse
transformation for the input events. (A client-side sketch of
receiving such an event follows the list.)
</li>
<li>
As in the X case, when the client receives the event, it updates
the UI in response. But in the wayland case, the rendering
happens in the client, and the client just sends a request to the
compositor to indicate the region that was updated.
</li>
<li>
The compositor collects damage requests from its clients and then
recomposites the screen. The compositor can then directly issue
an ioctl to schedule a pageflip with KMS.
</li>
</ol>
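<p>
A hedged sketch of what step 2 looks like from the client: the
wl_pointer motion event already arrives in surface-local
coordinates, because the compositor applied the inverse transform.
Only the motion handler is fleshed out; the other core (version 1)
handlers are stubs, and the wl_pointer is assumed to come from a
wl_seat bound during registry setup.
</p>
<pre>
#include &lt;wayland-client.h&gt;

static void
pointer_motion(void *data, struct wl_pointer *pointer,
               uint32_t time, wl_fixed_t sx, wl_fixed_t sy)
{
    /* Surface-local coordinates: the compositor already undid any
     * scaling, rotation or wobbling it applies on screen. */
    double x = wl_fixed_to_double(sx);
    double y = wl_fixed_to_double(sy);
    /* Update the UI, render, then send damage (step 3). */
}

/* The remaining core events, stubbed out for brevity. */
static void
pointer_enter(void *data, struct wl_pointer *pointer, uint32_t serial,
              struct wl_surface *surface, wl_fixed_t sx, wl_fixed_t sy) { }
static void
pointer_leave(void *data, struct wl_pointer *pointer, uint32_t serial,
              struct wl_surface *surface) { }
static void
pointer_button(void *data, struct wl_pointer *pointer, uint32_t serial,
               uint32_t time, uint32_t button, uint32_t state) { }
static void
pointer_axis(void *data, struct wl_pointer *pointer, uint32_t time,
             uint32_t axis, wl_fixed_t value) { }

static const struct wl_pointer_listener pointer_listener = {
    pointer_enter, pointer_leave, pointer_motion,
    pointer_button, pointer_axis,
};

void
listen_for_input(struct wl_pointer *pointer)
{
    wl_pointer_add_listener(pointer, &amp;pointer_listener, NULL);
}
</pre>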
<h2>Wayland Rendering</h2>
<p>
One of the details I left out in the above overview is how clients
actually render under wayland. By removing the X server from the
picture we also removed the mechanism by which X clients typically
render. But there's another mechanism that we're already using with
DRI2 under X: <em>direct rendering</em>. With direct rendering, the
client and the server share a video memory buffer. The client links
to a rendering library such as OpenGL that knows how to program the
hardware and renders directly into the buffer. The compositor in
turn can take the buffer and use it as a texture when it composites
the desktop. After the initial setup, the client only needs to tell
the compositor which buffer to use and when and where it has
rendered new content into it.
</p>
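<p>
In protocol terms, "tell the compositor which buffer to use and
when and where it has rendered new content" maps onto three
requests on the surface. A minimal sketch, assuming the wl_surface
and wl_buffer were created during setup:
</p>
<pre>
#include &lt;wayland-client.h&gt;

/* Publish new content; (x, y, w, h) is the repainted region. */
void
publish_frame(struct wl_surface *surface, struct wl_buffer *buffer,
              int32_t x, int32_t y, int32_t w, int32_t h)
{
    wl_surface_attach(surface, buffer, 0, 0);  /* which buffer */
    wl_surface_damage(surface, x, y, w, h);    /* where it drew */
    wl_surface_commit(surface);                /* when: now */
}
</pre>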
<p>
This leaves an application with two ways to update its window
contents:
</p>
<ol>
<li>
Render the new content into a new buffer and tell the compositor
to use that instead of the old buffer. The application can
allocate a new buffer every time it needs to update the window
contents, or it can keep two (or more) buffers around and cycle
between them. The buffer management is entirely under application
control (a sketch of this strategy follows the list).
</li>
<li>
Render the new content into the buffer that it previously told the
compositor to use. While it's possible to just render directly
into the buffer shared with the compositor, this might race with
the compositor. What can happen is that repainting the window
contents could be interrupted by the compositor repainting the
desktop. If the application gets interrupted just after clearing
the window but before rendering the contents, the compositor will
texture from a blank buffer. The result is that the application
window will flicker between a blank window and half-rendered
content. The traditional way to avoid this is to render the new
content into a back buffer and then copy from there into the
compositor surface. The back buffer can be allocated on the fly
and just big enough to hold the new content, or the application
can keep a buffer around. Again, this is under application
control.
</li>
</ol>
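<p>
As a concrete illustration of the first strategy, here is a hedged
sketch that keeps two wl_shm buffers and cycles between them. The
wl_shm global, the shared memory file descriptor and the pixel
format are assumptions made for the example; a real application
would also track wl_buffer release events before reusing a buffer.
</p>
<pre>
#include &lt;wayland-client.h&gt;

#define WIDTH  640
#define HEIGHT 480
#define STRIDE (WIDTH * 4)  /* 4 bytes per ARGB8888 pixel */

static struct wl_buffer *buffers[2];

/* Strategy 1: two buffers carved out of one wl_shm pool.  'fd' is
 * assumed to refer to a shared memory file of at least
 * 2 * STRIDE * HEIGHT bytes, and 'shm' to the wl_shm global bound
 * during registry setup. */
void
create_buffers(struct wl_shm *shm, int fd)
{
    struct wl_shm_pool *pool =
        wl_shm_create_pool(shm, fd, 2 * STRIDE * HEIGHT);

    for (int i = 0; i &lt; 2; i++)
        buffers[i] = wl_shm_pool_create_buffer(
            pool, i * STRIDE * HEIGHT,
            WIDTH, HEIGHT, STRIDE, WL_SHM_FORMAT_ARGB8888);

    wl_shm_pool_destroy(pool);  /* the buffers keep the pool alive */
}

/* Render into the buffer the compositor is not reading, then swap. */
void
update_window(struct wl_surface *surface)
{
    static int next = 0;

    /* ... draw the new contents into buffers[next]'s memory ... */
    wl_surface_attach(surface, buffers[next], 0, 0);
    wl_surface_damage(surface, 0, 0, WIDTH, HEIGHT);
    wl_surface_commit(surface);
    next = 1 - next;
}
</pre>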
<p>
In either case, the application must tell the compositor which area
of the surface holds new contents. When the application renders
directly to the shared buffer, the compositor needs to be notified
that there is new content. But even when exchanging buffers, the
compositor doesn't assume anything changed, and needs a request from
the application before it will repaint the desktop. The idea is that
even if an application passes a new buffer to the compositor, only a
small part of the buffer may be different, like a blinking cursor or
a spinner.
</p>
<h2>Hardware Enabling for Wayland</h2>
<p>
Typically, hardware enabling includes modesetting/display and
EGL/GLES2. On top of that, Wayland needs a way to share buffers
efficiently between processes. There are two sides to that, the
client side and the server side.
</p>
<p>
On the client side we've defined a Wayland EGL platform. In the EGL
model, that consists of the native types (EGLNativeDisplayType,
EGLNativeWindowType and EGLNativePixmapType) and a way to create
those types. In other words, it's the glue code that binds the EGL
stack and its buffer sharing mechanism to the generic Wayland API.
The EGL stack is expected to provide an implementation of the
Wayland EGL platform. The full API is in
the <a href="https://cgit.freedesktop.org/wayland/wayland/tree/src/wayland-egl.h">wayland-egl.h</a>
header. The open source implementation in the mesa EGL stack is
in <a href="https://cgit.freedesktop.org/mesa/mesa/tree/src/egl/wayland/wayland-egl/wayland-egl.c">wayland-egl.c</a>
and <a href="https://cgit.freedesktop.org/mesa/mesa/tree/src/egl/drivers/dri2/platform_wayland.c">platform_wayland.c</a>.
</p>
<p>
Under the hood, the EGL stack is expected to define a
vendor-specific protocol extension that lets the client side EGL
stack communicate buffer details with the compositor in order to
share buffers. The point of the wayland-egl.h API is to abstract
that away and just let the client create an EGLSurface for a Wayland
surface and start rendering. The open source stack uses
the <a href="https://cgit.freedesktop.org/mesa/mesa/tree/src/egl/wayland/wayland-drm/wayland-drm.xml">drm</a>
Wayland extension, which lets the client discover the drm device to
use and authenticate and then share drm (GEM) buffers with the
compositor.
</p>
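<p>
Seen from the application, the glue is small. A hedged sketch of
creating an EGLSurface for a wl_surface (config selection kept
minimal, error handling omitted; the wl_surface is assumed to come
from the usual wl_compositor setup):
</p>
<pre>
#include &lt;wayland-client.h&gt;
#include &lt;wayland-egl.h&gt;
#include &lt;EGL/egl.h&gt;

EGLSurface
create_egl_surface(struct wl_display *display, struct wl_surface *surface,
                   int width, int height, EGLDisplay *egl_display_out)
{
    EGLDisplay egl_display;
    EGLConfig config;
    EGLint num_configs;
    EGLint attribs[] = {
        EGL_RENDERABLE_TYPE, EGL_OPENGL_ES2_BIT,
        EGL_NONE
    };

    /* The wl_display doubles as the EGLNativeDisplayType. */
    egl_display = eglGetDisplay((EGLNativeDisplayType) display);
    eglInitialize(egl_display, NULL, NULL);
    eglChooseConfig(egl_display, attribs, &amp;config, 1, &amp;num_configs);

    /* The wl_egl_window is the EGLNativeWindowType glue object. */
    struct wl_egl_window *native =
        wl_egl_window_create(surface, width, height);

    *egl_display_out = egl_display;
    return eglCreateWindowSurface(egl_display, config,
                                  (EGLNativeWindowType) native, NULL);
}
</pre>
<p>
After this the client can make the surface current, render with
GLES2 and call eglSwapBuffers; the swap is what hands the finished
buffer to the compositor via the vendor-specific extension.
</p>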
<p>
The server side of Wayland is the compositor and core UX for the
vertical, typically integrating the task switcher, app launcher and
lock screen in one monolithic application. The server runs on top of a
modesetting API (kernel modesetting, OpenWF Display or similar) and
composites the final UI using a mix of EGL/GLES2 compositing and
hardware overlays if available. Enabling modesetting, EGL/GLES2 and
overlays is something that should be part of standard hardware
bringup. The extra requirement for Wayland enabling is
the <a href="https://cgit.freedesktop.org/mesa/mesa/tree/docs/specs/WL_bind_wayland_display.spec">EGL_WL_bind_wayland_display</a>
extension that lets the compositor create an EGLImage from a generic
Wayland shared buffer. It's similar to
the <a href="http://www.khronos.org/registry/egl/extensions/KHR/EGL_KHR_image_pixmap.txt">EGL_KHR_image_pixmap</a>
extension to create an EGLImage from an X pixmap.
</p>
<p>
The extension has a setup step where you have to bind the EGL
display to a Wayland display. Then as the compositor receives
generic Wayland buffers from the clients (typically when the client
calls eglSwapBuffers), it will be able to pass the struct wl_buffer
pointer to eglCreateImageKHR as the EGLClientBuffer argument and
with EGL_WAYLAND_BUFFER_WL as the target. This will create an
EGLImage, which can then be used by the compositor as a texture or
passed to the modesetting code to use as an overlay plane. Again,
this is implemented by the vendor specific protocol extension, which
on the server side will receive the driver specific details about
the shared buffer and turn that into an EGLImage when the
compositor calls eglCreateImageKHR.
</p>
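<p>
A hedged sketch of that compositor-side sequence, fetching the
extension entry points through eglGetProcAddress (texture filtering
setup and error handling omitted):
</p>
<pre>
#include &lt;EGL/egl.h&gt;
#include &lt;EGL/eglext.h&gt;
#include &lt;GLES2/gl2.h&gt;
#include &lt;GLES2/gl2ext.h&gt;
#include &lt;wayland-server.h&gt;

static PFNEGLBINDWAYLANDDISPLAYWLPROC bind_display;
static PFNEGLCREATEIMAGEKHRPROC create_image;
static PFNGLEGLIMAGETARGETTEXTURE2DOESPROC image_target_texture;

/* Setup step: bind the EGL display to the Wayland display. */
void
init_bind_display(EGLDisplay egl_display, struct wl_display *display)
{
    bind_display = (PFNEGLBINDWAYLANDDISPLAYWLPROC)
        eglGetProcAddress("eglBindWaylandDisplayWL");
    create_image = (PFNEGLCREATEIMAGEKHRPROC)
        eglGetProcAddress("eglCreateImageKHR");
    image_target_texture = (PFNGLEGLIMAGETARGETTEXTURE2DOESPROC)
        eglGetProcAddress("glEGLImageTargetTexture2DOES");

    bind_display(egl_display, display);
}

/* Per client buffer: turn the wl_buffer into an EGLImage and use
 * it as a GL texture for compositing.  As described above, the
 * wl_buffer pointer is passed as the EGLClientBuffer. */
GLuint
texture_from_buffer(EGLDisplay egl_display, struct wl_buffer *buffer)
{
    GLuint texture;
    EGLImageKHR image =
        create_image(egl_display, EGL_NO_CONTEXT,
                     EGL_WAYLAND_BUFFER_WL,
                     (EGLClientBuffer) buffer, NULL);

    glGenTextures(1, &amp;texture);
    glBindTexture(GL_TEXTURE_2D, texture);
    image_target_texture(GL_TEXTURE_2D, (GLeglImageOES) image);

    return texture;
}
</pre>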
</body>
</html>