# Support for UNOIDL Registry Formats

`Library_unoidl` contains the `unoidl::Manager` and `unoidl::Provider` implementations
for the following registry formats:

* The new `UNOIDL` binary `types.rdb` format.
* The legacy binary `types.rdb` format (based on the "store" and "registry"
  modules).
* A source-file format, reading (multiple) `UNOIDL` entity definitions directly
  from a single `.idl` source file.
* A source-tree format, reading `UNOIDL` entity definitions directly from a tree
  of `.idl` source files rooted at a given directory.  (Where an entity named
  `foo.bar.Baz` is expected in a file named `foo/bar/Baz.idl` within that tree.)
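
The source-tree naming rule can be sketched in a few lines of Python. This is only an illustration of the mapping described above; the helper name `entity_to_idl_path` is hypothetical and not part of `Library_unoidl`.

```python
from pathlib import PurePosixPath


def entity_to_idl_path(name: str) -> PurePosixPath:
    """Map a dotted UNOIDL entity name to its expected path below the tree root."""
    parts = name.split(".")
    if not all(parts):
        # Reject empty names and empty segments such as "foo..Baz".
        raise ValueError(f"malformed entity name: {name!r}")
    return PurePosixPath(*parts).with_suffix(".idl")


print(entity_to_idl_path("foo.bar.Baz"))
```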

(While `.idl` files still contain `#include` directives for legacy idlc, the source-
based formats ignore any preprocessing directives starting with `#` in the `.idl`
files.)  `unoidl::Manager::addProvider` transparently detects the registry format
for a given URI and instantiates the corresponding provider implementation.

`Executable_unoidl-write` is a helper tool to convert from any of the registry
formats to the `UNOIDL` format.  It is used at build-time to compile `UNOIDL` format
`.rdb` files from source `.idl` files.  Those `.rdb` files are either used at
build-time only, or included in installation sets (in `URE` or `program/types/`),
or shipped as part of bundled extensions that are created during the build
rather than merely included as pre-built `.oxt` files.

`Executable_unoidl-read` is a helper tool to convert from any of the registry
formats to the source-file format.  It can be used manually after a LibreOffice
version update to create new reference registries for `Executable_unoidl-check`.

`Executable_unoidl-check` is a helper tool to check that one registry is
backwards-compatible with another registry.  It is used at build-time to detect
inadvertent breakage of the udkapi and offapi APIs.

## Specification of the New UNOIDL types.rdb Format

The format uses byte-oriented, platform-independent, binary files.  Multi-byte
quantities are stored LSB first, without alignment requirements.  Offsets are
32-bit, effectively limiting the overall file size to 4 GB, but that is not
considered a limitation in practice (and avoids unnecessary bloat compared to
64-bit offsets).

Annotations can be added for (non-module) entities and certain parts of such
entities (e.g., both for an interface type definition and for a direct method of
an interface type definition; the idea is that it can be added for direct parts
that form a "many-to-one" relationship; there is a tradeoff between generality
of concept and size of representation, esp. for the C++ representation types in
namespace `unoidl`) and consist of arbitrary sequences of name/value strings.
Each name/value string is encoded as a single UTF-8 string containing a name (an
arbitrary sequence of Unicode code points not containing `U+003D EQUALS SIGN`),
optionally followed by `U+003D EQUALS SIGN` and a value (an arbitrary sequence of
Unicode code points).  The only annotation name currently in use is "deprecated"
(without a value).
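
As a rough sketch of the name/value encoding just described, the following Python splits one annotation string at the first `U+003D EQUALS SIGN`. The function name `parse_annotation` and the sample name `key` are hypothetical; only `deprecated` is named in this document.

```python
from typing import Optional, Tuple


def parse_annotation(raw: bytes) -> Tuple[str, Optional[str]]:
    """Split a UTF-8 annotation into (name, value); value is None if absent."""
    text = raw.decode("utf-8")
    # The name cannot contain '=', so the first '=' ends it; the value may
    # contain further '=' characters.
    name, sep, value = text.partition("=")
    return name, (value if sep else None)


print(parse_annotation(b"deprecated"))
print(parse_annotation(b"key=a=b"))
```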

The following definitions are used throughout:

* `UInt16`: 2-byte value, LSB first
* `UInt32`: 4-byte value, LSB first
* `UInt64`: 8-byte value, LSB first
* Offset: `UInt32` value, counting bytes from start of file
* `NUL`-Name: zero or more non-`NUL` US-ASCII bytes followed by a `NUL` byte
* Len-String: `UInt32` number of characters, with `0x80000000` bit 0, followed by
   that many US-ASCII (for `UNOIDL` related names) or UTF-8 (for annotations)
   bytes
* Idx-String: either an Offset (with `0x80000000` bit 1) of a Len-String, or a
   Len-String
* Annotations: `UInt32` number `N` of annotations followed by `N * Idx-String`
* Entry: Offset of `NUL`-Name followed by Offset of payload
* Map: zero or more Entries
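
The definitions above translate fairly directly into small reader functions. The Python below is a minimal sketch, not the `Library_unoidl` implementation. All function names are hypothetical. It assumes that the length in a Len-String counts bytes, and that the target of an Idx-String offset is obtained by clearing the `0x80000000` bit.

```python
import struct

HIGH_BIT = 0x80000000


def read_uint32(buf: bytes, pos: int):
    """Return (value, next position) for a 4-byte LSB-first integer."""
    return struct.unpack_from("<I", buf, pos)[0], pos + 4


def read_nul_name(buf: bytes, pos: int):
    """Read US-ASCII bytes up to, and consuming, the terminating NUL byte."""
    end = buf.index(b"\0", pos)
    return buf[pos:end].decode("ascii"), end + 1


def read_len_string(buf: bytes, pos: int, encoding: str = "ascii"):
    """Read a UInt32 length (high bit clear) followed by that many bytes."""
    length, pos = read_uint32(buf, pos)
    if length & HIGH_BIT:
        raise ValueError("Len-String length must have the 0x80000000 bit clear")
    return buf[pos:pos + length].decode(encoding), pos + length


def read_idx_string(buf: bytes, pos: int, encoding: str = "ascii"):
    """Read either an inline Len-String or an Offset pointing at one."""
    word, after = read_uint32(buf, pos)
    if word & HIGH_BIT:
        # Indirect form: follow the offset, but resume right after the word.
        text, _ = read_len_string(buf, word & 0x7FFFFFFF, encoding)
        return text, after
    return read_len_string(buf, pos, encoding)


def read_annotations(buf: bytes, pos: int):
    """Read a UInt32 count N followed by N UTF-8 Idx-Strings."""
    count, pos = read_uint32(buf, pos)
    items = []
    for _ in range(count):
        text, pos = read_idx_string(buf, pos, "utf-8")
        items.append(text)
    return items, pos
```

Each reader returns the decoded value together with the position of the next unread byte, so calls can be chained while walking a payload.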

The file starts with an 8-byte header, followed by information about the root
map (`unoidl-write` generates files in a single depth-first pass, so the root map
itself is at the end of the file):

* 7-byte magic header `UNOIDL\xFF`
* version byte 0
* version byte 0
* Offset of root Map
* `UInt32` number of entries of root Map
...

Files generated by `unoidl-write` follow that with a

    "\0** Created by LibreOffice " LIBO_VERSION_DOTTED " unoidl-write **\0"

banner (cf. `config_host/config_version.h.in`), as a debugging aid.  (Old versions
used `reg2unoidl` instead of `unoidl-write` in that banner.)
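
The header layout can be exercised with a short Python sketch. It assumes the root Map offset and entry count sit directly after the 8-byte header, at byte positions 8 and 12. The function names are hypothetical. The banner is never parsed, because a reader only follows offsets.

```python
import struct

MAGIC = b"UNOIDL\xff"


def read_header(buf: bytes):
    """Validate magic and version, return (root Map offset, entry count)."""
    if buf[:7] != MAGIC:
        raise ValueError("not a UNOIDL types.rdb file")
    if buf[7] != 0:
        raise ValueError(f"unsupported version byte {buf[7]}")
    root_offset, root_count = struct.unpack_from("<II", buf, 8)
    return root_offset, root_count


def read_map_entries(buf: bytes, offset: int, count: int):
    """Return [(name, payload offset)] for a Map of `count` 8-byte Entries."""
    entries = []
    for i in range(count):
        name_off, payload_off = struct.unpack_from("<II", buf, offset + 8 * i)
        end = buf.index(b"\0", name_off)  # NUL-Name terminator
        entries.append((buf[name_off:end].decode("ascii"), payload_off))
    return entries
```

A caller would typically pass the result of `read_header` straight into `read_map_entries` and then decode each payload according to its kind byte.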

Layout of per-entry payload in the root or a module Map:

* kind byte:

    * 0: module
        * followed by:
            * `UInt32` number `N1` of entries of Map
            * `N1 * Entry`

    * otherwise:
        * `0x80` bit: 1 if published
        * `0x40` bit: 1 if annotated
        * `0x20` bit: flag (may only be 1 for certain kinds, see below)
        * remaining bits:

            * 1: enum type
                * followed by:
                    * `UInt32` number `N1` of members
                    * `N1 * tuple` of:
                        * `Idx-String`
                        * `UInt32`
                        * if annotated: Annotations

            * 2: plain struct type (with base if flag is 1)
                * followed by:
                    * if "with base": `Idx-String`
                    * `UInt32` number `N1` of direct members
                    * `N1 * tuple` of:
                        * `Idx-String` name
                        * `Idx-String` type
                        * if annotated: Annotations

            * 3: polymorphic struct type template
                * followed by:
                    * `UInt32` number `N1` of type parameters
                    * `N1 * Idx-String`
                    * `UInt32` number `N2` of members
                    * `N2 * tuple` of:
                        * kind byte: `0x01` bit is 1 if parameterized type
                        * `Idx-String` name
                        * `Idx-String` type
                        * if annotated: Annotations

            * 4: exception type (with base if flag is 1)
                * followed by:
                    * if "with base": `Idx-String`
                    * `UInt32` number `N1` of direct members
                    * `N1 * tuple` of:
                        * `Idx-String` name
                        * `Idx-String` type
                        * if annotated: Annotations

            * 5: interface type
                * followed by:
                    * `UInt32` number `N1` of direct mandatory bases
                    * `N1 * tuple` of:
                        * `Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N2` of direct optional bases
                    * `N2 * tuple` of:
                        * `Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N3` of direct attributes
                    * `N3 * tuple` of:
                        * kind byte:
                            * `0x02` bit: 1 if read-only
                            * `0x01` bit: 1 if bound
                        * `Idx-String` name
                        * `Idx-String` type
                        * `UInt32` number `N4` of get exceptions
                        * `N4 * Idx-String`
                        * `UInt32` number `N5` of set exceptions
                        * `N5 * Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N6` of direct methods
                    * `N6 * tuple` of:
                        * `Idx-String` name
                        * `Idx-String` return type
                        * `UInt32` number `N7` of parameters
                        * `N7 * tuple` of:
                            * direction byte: 0 for in, 1 for out, 2 for in-out
                            * `Idx-String` name
                            * `Idx-String` type
                        * `UInt32` number `N8` of exceptions
                        * `N8 * Idx-String`
                        * if annotated: Annotations

            * 6: typedef
                * followed by:
                    * `Idx-String`

            * 7: constant group
                * followed by:
                    * `UInt32` number `N1` of entries of Map
                    * `N1 * Entry`

            * 8: single-interface--based service (with default constructor if flag is 1)
                * followed by:
                    * `Idx-String`
                    * if not "with default constructor":
                        * `UInt32` number `N1` of constructors
                        * `N1 * tuple` of:
                            * `Idx-String`
                            * `UInt32` number `N2` of parameters
                            * `N2 * tuple` of
                                * kind byte: `0x04` bit is 1 if rest parameter
                                * `Idx-String` name
                                * `Idx-String` type
                            * `UInt32` number `N3` of exceptions
                            * `N3 * Idx-String`
                            * if annotated: Annotations

            * 9: accumulation-based service
                * followed by:
                    * `UInt32` number `N1` of direct mandatory base services
                    * `N1 * tuple` of:
                        * `Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N2` of direct optional base services
                    * `N2 * tuple` of:
                        * `Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N3` of direct mandatory base interfaces
                    * `N3 * tuple` of:
                        * `Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N4` of direct optional base interfaces
                    * `N4 * tuple` of:
                        * `Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N5` of direct properties
                    * `N5 * tuple` of:
                        * `UInt16` kind:
                            * `0x0100` bit: 1 if optional
                            * `0x0080` bit: 1 if removable
                            * `0x0040` bit: 1 if maybedefault
                            * `0x0020` bit: 1 if maybeambiguous
                            * `0x0010` bit: 1 if readonly
                            * `0x0008` bit: 1 if transient
                            * `0x0004` bit: 1 if constrained
                            * `0x0002` bit: 1 if bound
                            * `0x0001` bit: 1 if maybevoid
                        * `Idx-String` name
                        * `Idx-String` type
                        * if annotated: Annotations

            * 10: interface-based singleton
                * followed by:
                    * `Idx-String`

            * 11: service-based singleton
                * followed by:
                    * `Idx-String`

        * if annotated, followed by: Annotations
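
To make the bit layout of the kind byte concrete, here is a small Python sketch that splits it into its flags and kind number. The names `ENTITY_KINDS` and `decode_kind_byte` are hypothetical. The sketch takes "remaining bits" to mean the low five bits (`0x1F`).

```python
ENTITY_KINDS = {
    1: "enum type",
    2: "plain struct type",
    3: "polymorphic struct type template",
    4: "exception type",
    5: "interface type",
    6: "typedef",
    7: "constant group",
    8: "single-interface-based service",
    9: "accumulation-based service",
    10: "interface-based singleton",
    11: "service-based singleton",
}


def decode_kind_byte(kind: int) -> dict:
    """Split a root/module Map kind byte into its kind name and flag bits."""
    if kind == 0:
        # A module carries no published/annotated/flag bits.
        return {"kind": "module", "published": False,
                "annotated": False, "flag": False}
    number = kind & 0x1F
    if number not in ENTITY_KINDS:
        raise ValueError(f"unknown kind {number}")
    return {
        "kind": ENTITY_KINDS[number],
        "published": bool(kind & 0x80),
        "annotated": bool(kind & 0x40),
        "flag": bool(kind & 0x20),
    }


# A published plain struct type with a base (flag bit set):
print(decode_kind_byte(0x80 | 0x20 | 2))
```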

Layout of per-entry payload in a constant group Map:

* kind byte:
    * `0x80` bit: 1 if annotated
    * remaining bits:

        * 0: `BOOLEAN`
            * followed by value byte, 0 represents false, 1 represents true

        * 1: `BYTE`
            * followed by value byte, representing values with two's complement

        * 2: `SHORT`
            * followed by `UInt16` value, representing values with two's complement

        * 3: `UNSIGNED SHORT`
            * followed by `UInt16` value

        * 4: `LONG`
            * followed by `UInt32` value, representing values with two's complement

        * 5: `UNSIGNED LONG`
            * followed by `UInt32` value

        * 6: `HYPER`
            * followed by `UInt64` value, representing values with two's complement

        * 7: `UNSIGNED HYPER`
            * followed by `UInt64` value

        * 8: `FLOAT`
            * followed by 4-byte value, representing values in ISO/IEC 60559 binary32 format,
              LSB first

        * 9: `DOUBLE`
            * followed by 8-byte value, representing values in ISO/IEC 60559 binary64 format,
              LSB first

* if annotated, followed by: Annotations
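
The constant payloads map neatly onto the format codes of Python's `struct` module, which gives a compact sketch of a decoder. The table name `CONSTANT_KINDS` and the function `read_constant` are hypothetical. Annotations that may follow the value are not parsed here.

```python
import struct

# Kind number -> (UNOIDL type name, little-endian struct format code).
CONSTANT_KINDS = {
    0: ("BOOLEAN", "<B"),
    1: ("BYTE", "<b"),
    2: ("SHORT", "<h"),
    3: ("UNSIGNED SHORT", "<H"),
    4: ("LONG", "<i"),
    5: ("UNSIGNED LONG", "<I"),
    6: ("HYPER", "<q"),
    7: ("UNSIGNED HYPER", "<Q"),
    8: ("FLOAT", "<f"),
    9: ("DOUBLE", "<d"),
}


def read_constant(buf: bytes, pos: int):
    """Return (type name, value, annotated, next position) for one constant."""
    kind = buf[pos]
    annotated = bool(kind & 0x80)
    name, fmt = CONSTANT_KINDS[kind & 0x7F]
    (value,) = struct.unpack_from(fmt, buf, pos + 1)
    if name == "BOOLEAN":
        value = bool(value)  # 0 represents false, 1 represents true
    return name, value, annotated, pos + 1 + struct.calcsize(fmt)


# A BYTE constant holding 0xFF decodes to -1 under two's complement:
print(read_constant(bytes([1, 0xFF]), 0))
```

If `annotated` is true, the returned position is where the trailing Annotations begin.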