[[!toc ]] 


# Background of API dispatch in Mesa

The operation of nearly every function in the OpenGL API depends on the state of the currently bound context.  Even if all state in two contexts is identical, the operation can differ between them.  Consider a couple of common examples:

* One context is created for direct rendering (requiring that function calls be directed into the local driver) and the other context is created for indirect rendering (requiring that function calls be converted to GLX protocol and sent to the server).
* Both contexts are created for direct rendering, but one context is created on screen 0 of the display, which happens to be on a card by manufacturer Foo, and the other context is created on screen 1, which happens to be on a card by manufacturer Bar.  In this case, function calls are directed to a _different_ local driver for each context.

To implement this, Mesa uses something similar to the virtual function table for a C++ class.  Each context has an associated _dispatch table_.  This table contains a function pointer for every function in the API.  When a context is bound with `glXMakeCurrent` (or a similar function), a pointer to its dispatch table is stored in a global variable.  When an application calls an API function, such as `glVertex3f`, it is actually calling a generic _dispatch stub_ in Mesa (dispatch stubs that are directly called are also referred to as _static dispatch stubs_).  This dispatch stub fetches a pointer to the current context's dispatch table, looks up a pointer to the desired function in that table, and, finally, calls the function.

It is also important to note that, since the current context is set per-thread in a multithreaded application, the dispatch pointer is also stored per-thread. 
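
The mechanism described above can be sketched in a few lines of C.  This is only an illustration, not Mesa's code: the names `demo_table`, `demo_make_current`, and `demo_Vertex3f` are hypothetical, the table has a single slot, the current-table pointer is a plain global rather than per-thread, and the entry point returns an `int` purely so that the chosen "driver" can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced dispatch table: one slot per API function. */
struct demo_table {
    int (*Vertex3f)(float x, float y, float z);
};

/* Two stand-in "drivers" implementing the same entry point differently. */
static int direct_Vertex3f(float x, float y, float z)
{
    (void)x; (void)y; (void)z;
    return 1;   /* pretend the call went to the local driver */
}

static int indirect_Vertex3f(float x, float y, float z)
{
    (void)x; (void)y; (void)z;
    return 2;   /* pretend the call was encoded as GLX protocol */
}

static const struct demo_table direct_table   = { direct_Vertex3f };
static const struct demo_table indirect_table = { indirect_Vertex3f };

/* Table of the currently bound context (a single global in this sketch). */
static const struct demo_table *current_table = NULL;

/* Stand-in for glXMakeCurrent: binding a context installs its table. */
static void demo_make_current(const struct demo_table *table)
{
    assert(table != NULL);
    current_table = table;
}

/* Dispatch stub: fetch the current table, look up the slot, call through. */
static int demo_Vertex3f(float x, float y, float z)
{
    return current_table->Vertex3f(x, y, z);
}
```

Calling `demo_Vertex3f` after `demo_make_current(&direct_table)` reaches `direct_Vertex3f`; after rebinding to `indirect_table`, the same call reaches `indirect_Vertex3f`, without the caller changing.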

The [[Linux OpenGL ABI|http://oss.sgi.com/projects/ogl-sample/ABI/index.html]] requires that certain functions be statically exported by the system libGL.  However, nearly any useful implementation will want to expose more functionality.  Since applications cannot depend on these functions being statically available, another method is needed to access them.  Initially implemented as an [[extension|http://oss.sgi.com/projects/ogl-sample/registry/ARB/get_proc_address.txt]] and later incorporated into [[GLX 1.4|http://www.opengl.org/documentation/spec.html]], `glXGetProcAddressARB` (or `glXGetProcAddress`) is used to get a pointer to a named API function.  GLX requires that pointers returned by `glXGetProcAddressARB` be _context independent_.  This means that calling `glXGetProcAddressARB((const GLubyte *) "glVertex3f")` will return the same value no matter what context is bound [^1]. 

In addition, `glXGetProcAddressARB` can be called when _no_ context is bound.  Since Mesa is capable of loading drivers with unknown functionality, Mesa has no way to know a priori that a requested function, such as `glWillNeverExist`, doesn't exist.  For this reason, Mesa's implementation of `glXGetProcAddressARB` will _never_ return `NULL` for a well-formed API function name.  Other libGL implementations that only operate with a limited set of known drivers (e.g., Nvidia's closed-source libGL) can know which functions will never exist and may return `NULL`.

Mesa's implementation of `glXGetProcAddressARB` performs two important steps when called with a function name that is currently unknown.  It first assigns an offset in the dispatch table to the new function.  Once the location of the function pointer in the dispatch table is known, Mesa generates a _dynamic dispatch stub_ for the function.  A pointer to this stub is returned by `glXGetProcAddressARB`.  The name of the function, the assigned offset, and the dispatch stub pointer are all stored in a table used internally by Mesa.
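
The first of these steps, assigning an offset to a previously unknown name, can be sketched as follows.  This is a minimal illustration under stated assumptions: `demo_entry`, `demo_get_offset`, the table sizes, and the two pre-registered names with offsets 0 and 1 are all hypothetical, and the generation of the dynamic dispatch stub itself is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_STATIC_SLOTS 2   /* slots whose offsets are fixed at build time */
#define DEMO_MAX_SLOTS    8   /* total capacity, including dynamic slots */

/* One row of the internal name table (the stub pointer is left out). */
struct demo_entry {
    char name[32];
    int  offset;
};

static struct demo_entry demo_entries[DEMO_MAX_SLOTS] = {
    { "glVertex3f",  0 },
    { "glVertex3fv", 1 },
};
static int demo_entry_count = DEMO_STATIC_SLOTS;

/* Return the dispatch offset for name.  An unknown name is assigned the
 * next free offset and remembered, so later lookups return the same value.
 * Returns -1 if the table is full or the name does not fit. */
static int demo_get_offset(const char *name)
{
    assert(name != NULL);

    for (int i = 0; i < demo_entry_count; i++) {
        if (strcmp(demo_entries[i].name, name) == 0)
            return demo_entries[i].offset;
    }

    if (demo_entry_count == DEMO_MAX_SLOTS ||
        strlen(name) >= sizeof demo_entries[0].name)
        return -1;

    struct demo_entry *e = &demo_entries[demo_entry_count];
    strcpy(e->name, name);          /* length was checked above */
    e->offset = demo_entry_count;
    demo_entry_count++;
    return e->offset;
}
```

Because the assignment is recorded, repeated requests for the same name yield the same offset regardless of which context, if any, is bound.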


# Threading models supported by Mesa

In terms of API dispatch, Mesa currently supports four different threading models.  The compile-time choice of threading model dictates the implementation of the dispatch stubs.  The core difference lies in how the global dispatch pointer is stored and retrieved.  The threading model is selected by defining one of `PTHREADS`, `SOLARIS_THREADS`, `WIN32_THREADS`, `USE_XTHREADS`, or `BEOS_THREADS`.  When `PTHREADS` is selected, `GLX_USE_TLS` can also be defined.  If none of these values is defined, Mesa uses a single-threaded mode of operation.


## Single-threaded

In single-threaded mode, a single global variable is used to store the dispatch table pointer.  This also results in the simplest possible dispatch stubs.

    void glVertex3fv( const GLfloat * v )
    {
        (*_glapi_Dispatch->Vertex3fv)( v );
    }

## Non-TLS threading models

Mesa implements a generic wrapper function, called `_glapi_get_dispatch`, that is used to get the per-thread dispatch pointer.  The naive implementation of a dispatch stub is shown below. 

    void glVertex3fv( const GLfloat * v )
    {
        const struct _glapi_table * d = _glapi_get_dispatch();
        (*d->Vertex3fv)( v );
    }

This implementation is very simple, but it results in poor performance in the single-threaded case.  Single-threaded applications are by far more common than multi-threaded ones, so it makes sense to do some optimization for that case.  The old variable `_glapi_Dispatch` continues to exist, but its semantics are slightly modified.  When the implementation of `glXMakeCurrent` detects that a new thread is setting a current context, `_glapi_Dispatch` is set to `NULL` and the true dispatch table pointer is stored in some piece of thread-local storage.  This allows the dispatch function to use the state of `_glapi_Dispatch` to determine whether the application is single- or multi-threaded.

    void glVertex3fv( const GLfloat * v )
    {
        const struct _glapi_table * d = (_glapi_Dispatch != NULL)
            ? _glapi_Dispatch : _glapi_get_dispatch();
        (*d->Vertex3fv)( v );
    }

The result is vastly improved single-threaded performance with a small penalty to multi-threaded performance. 
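
The interplay between the global fast-path pointer and the per-thread slot can be sketched with POSIX thread-specific data.  This is an illustration only: `demo_Dispatch`, `demo_set_dispatch`, and `demo_current` are hypothetical names, and the detection of a second thread is reduced to a flag passed in by the caller, which is not how Mesa decides.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct demo_table { int id; };

/* Fast-path pointer: valid while single-threaded, NULL once a second
 * thread has bound a context. */
static const struct demo_table *demo_Dispatch = NULL;

/* Per-thread slot that always holds the true table pointer. */
static pthread_key_t  demo_key;
static pthread_once_t demo_key_once = PTHREAD_ONCE_INIT;

static void demo_key_init(void)
{
    int rc = pthread_key_create(&demo_key, NULL);
    assert(rc == 0);
    (void)rc;
}

/* Slow path: fetch the calling thread's table pointer. */
static const struct demo_table *demo_get_dispatch(void)
{
    pthread_once(&demo_key_once, demo_key_init);
    return pthread_getspecific(demo_key);
}

/* Bind a table for the calling thread.  The multithreaded flag is an
 * assumption of this sketch, standing in for real thread detection. */
static void demo_set_dispatch(const struct demo_table *t, int multithreaded)
{
    pthread_once(&demo_key_once, demo_key_init);
    pthread_setspecific(demo_key, t);
    demo_Dispatch = multithreaded ? NULL : t;
}

/* What a dispatch stub does before indexing into the table. */
static const struct demo_table *demo_current(void)
{
    return (demo_Dispatch != NULL) ? demo_Dispatch : demo_get_dispatch();
}
```

In the single-threaded state `demo_current` costs one comparison and one load; only after the fast-path pointer has been cleared does it pay for the thread-specific lookup.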


### Pthreads optimization

Pthreads is by far the most common threading model used by Mesa.  Some of the overhead of `_glapi_get_dispatch` can be avoided by directly calling `pthread_getspecific` from the dispatch stub.  This helps the multi-threaded case slightly but has no impact on the single-threaded case. 

    void glVertex3fv( const GLfloat * v )
    {
        const struct _glapi_table * d = (_glapi_Dispatch != NULL)
            ? _glapi_Dispatch : pthread_getspecific( & _gl_DispatchTSD );
        (*d->Vertex3fv)( v );
    }

## TLS

Thread-local storage on Linux provides compiler-supported, per-thread variables.  Once a variable is defined as being per-thread, it can be accessed in C code like any global variable.  However, the compiler (and linker) will perform some magic behind the scenes to ensure that each thread has its own data.  On x86 this requires the use of a slightly more expensive addressing mode to access the TLS variables.  The performance penalty of this addressing mode in the single-threaded case has been measured to be comparable to the penalty of the `_glapi_Dispatch` test in the non-TLS case.  The performance advantage of the TLS access versus the call to either `_glapi_get_dispatch` or `pthread_getspecific` is quite large.

The name of the dispatch table pointer is changed in the TLS case to prevent conflicts between a TLS libGL and a non-TLS DRI driver. 


    void glVertex3fv( const GLfloat * v )
    {
        (*_glapi_tls_Dispatch->Vertex3fv)( v );
    }
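
The per-thread behaviour of such a variable can be demonstrated with a small sketch.  It assumes a C11 compiler (`_Thread_local`; GCC's older `__thread` keyword behaves the same way here) and POSIX threads; `demo_tls_Dispatch`, `demo_worker`, and `demo_run_on_thread` are hypothetical names, not Mesa symbols.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct demo_table { int id; };

/* Each thread gets its own copy of this pointer, initially NULL. */
static _Thread_local const struct demo_table *demo_tls_Dispatch = NULL;

/* Thread body: check that the thread starts with a NULL pointer, bind the
 * table passed in, and return what this thread now sees. */
static void *demo_worker(void *arg)
{
    int started_null = (demo_tls_Dispatch == NULL);
    demo_tls_Dispatch = arg;
    return started_null ? (void *)demo_tls_Dispatch : NULL;
}

/* Run demo_worker on a second thread and return its result. */
static const struct demo_table *demo_run_on_thread(const struct demo_table *t)
{
    pthread_t thread;
    void *result = NULL;
    int rc = pthread_create(&thread, NULL, demo_worker, (void *)t);
    assert(rc == 0);
    rc = pthread_join(thread, &result);
    assert(rc == 0);
    (void)rc;
    return result;
}
```

If the calling thread has already stored its own table in `demo_tls_Dispatch`, that value is unchanged after `demo_run_on_thread` returns, because the worker only ever wrote to its own copy.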

# Implementation of static dispatch functions in Mesa

In the Mesa source tree, the file gl_API.xml describes all of the known API functions.  In addition to describing the parameters to the function, each entry in gl_API.xml also declares a static offset in the dispatch table for that function. 
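
The relationship between a static offset and the position of a function pointer in the table can be sketched as follows.  The offsets, the three-slot table, and the names `demo_proc` and `demo_slot_bytes` are hypothetical and chosen only for illustration; every slot is given the same generic pointer type to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Generic function pointer type used for every slot in this sketch. */
typedef void (*demo_proc)(void);

/* Hypothetical offsets, standing in for a generated list of defines. */
#define DEMO_OFFSET_NewList   0
#define DEMO_OFFSET_EndList   1
#define DEMO_OFFSET_Vertex3fv 2
#define DEMO_SLOT_COUNT       3

/* Stand-in for a generated table structure: members in offset order. */
struct demo_table {
    demo_proc NewList;
    demo_proc EndList;
    demo_proc Vertex3fv;
};

/* Byte position of a slot, as a stub working from a raw table address
 * would compute it. */
static size_t demo_slot_bytes(int offset)
{
    assert(offset >= 0 && offset < DEMO_SLOT_COUNT);
    return (size_t)offset * sizeof(demo_proc);
}
```

With the members declared in offset order, `offsetof(struct demo_table, Vertex3fv)` equals `demo_slot_bytes(DEMO_OFFSET_Vertex3fv)` on common platforms, which is what lets C code using member names and stubs using numeric offsets agree on the same slot.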


## Python generator scripts

A series of Python scripts are used to generate both platform independent API files and platform dependent API files.  For the API dispatch code, the most significant scripts that generate platform independent files are summarized in the following table. 
[[!table header="no" class="mointable" data="""
**Script Name** | **Generated File** | **Notes**
gl_apitemp.py | glapitemp.h | C-code dispatch function templates
gl_offsets.py | glapioffsets.h | List of defines of dispatch offsets
gl_procs.py | glprocs.h | Tables used by `glXGetProcAddressARB`
gl_table.py | glapitable.h | C structure definition of the dispatch table
"""]]

There are also several scripts that generate platform dependent files.  In all cases the generated files are assembly language versions of the dispatch stubs for a particular platform. 
[[!table header="no" class="mointable" data="""
**Script Name** | **Generated File** | **Notes**
gl_ppc_asm.py | ppc/glapi_ppc.S | PowerPC dispatch stubs
gl_SPARC_asm.py | sparc/glapi_sparc.S | SPARC (32-bit and 64-bit) dispatch stubs
gl_x86_asm.py | x86/glapi_x86.S | x86 (32-bit) dispatch stubs
gl_x86-64_asm.py | x86-64/glapi_x86-64.S | x86-64 dispatch stubs
"""]]


## Platform specifics


### x86


### x86-64


### SPARC (32-bit and 64-bit)


# Implementation of dynamic dispatch functions in Mesa


[^1] This may not be the same as the address of the static dispatch stub.