summaryrefslogtreecommitdiff
path: root/DEPSOLVE.txt
blob: fce8698b919214bf2eb175dd3b8eacbcdae50c8a (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
YUM vs RAZOR
------------

At a very high level, yum's depsolver does something roughly
equivalent to:

    - For each package being installed or removed

	- For each relevant property (provides, requires, conflicts,
          obsoletes):

	    - Figure out what additional packages need to be added to
	      or removed from the system to satisfy this property

which ends up being roughly O(N^2 * M) where N is the total number of
properties and M is the number of packages being acted on.

(I just figured that out off the top of my head, and I'm not totally
familiar with the yum code, so it may be wrong.)

Razor's depsolver is something like:

    - do {

	- For each property to be added to or removed from the system:

	    - Figure out what packages need to be added to or removed
	      from the system to satisfy this property

    - } until we stop adding/remove more packages

with the key being that it's very easy to find the PROVIDES
corresponding to a REQUIRES and vice versa, because the property
arrays are sorted, and so all properties with the same "name" will be
adjacent to one another in the array, allowing many dependencies to be
satisified in essentially constant time. (Actually... we've been
calling it constant, but it's really O(log N) for heavily-depended-on
packages, because the more packages you have, the more variations on
"requires foo", "requires foo = 1.1", "requires foo > 1.0", etc you're
going to have to scan through.)

Ideally though, each iteration of the inner loop body happens in
constant time, and thus the inner loop as a whole is O(N), and thus
the depsolver as a whole is O(N * M) (or at least, less than
O(N * M * log N).


FILE DEPENDENCIES
-----------------

Whenever we add a package with a file REQUIRES to a razor_set, we also
add a PROVIDES for that file to the package or packages which provide
that file. This means that if we later add another package that
requires the same file (eg, /bin/sh or /usr/bin/perl), we can resolve
its file requirement exactly like we would resolve a property
requirement, in nearly constant time.

When adding a *new* file requirement (ie, a requirement on a file that
no existing package depends on), we still have to scan through the
file tree, which is O(log N) in the number of files.

(AFAICT, there's no reason yum couldn't do the same optimization.
Also, AFAICT, yum currently sticks property dependencies and file
dependencies into the same hash table, so that if any package in the
transaction has a file dependency, it causes *property* dependencies
to become slower to resolve as well...)


THE RULES
---------

This is what we have figured out for transaction-solving rules;
neither yum nor rpm's algorithm seems to be explained in full
anywhere...

    1. Every requested install in the initial package set must be
       satisfied as either a new install or an update:

	- if the requested package name is the name of an upstream
          package:

	    - if there is not a corresponding already-installed
              package, then install the upstream package

	    - else if the upstream package is newer than the
              already-installed package, then update the package

	    - else it's an error (UP_TO_DATE)

	- else if the requested package name is the name of an
          already-installed package:

	    - if there is an upstream package that obsoletes the
              already-installed package, then behave as though the
              user had requested that that package be installed
              instead.

	    - else it's an error (UP_TO_DATE or INSTALL_UNAVAILABLE?)

	- else it's an error (INSTALL_UNAVAILABLE)

    2. Every requested removal in the initial package set must be
       satisfied as a removal. If any requested package name is not
       the name of an installed package, it's an error
       (REMOVE_NOT_INSTALLED)

    REQUIRES processing:

    3. If a package being installed or updated-to REQUIRES a property
       that is not provided by any installed or to-be-installed
       package, we need to find an installable package that provides
       that property. If we find one, install/update it. If not, it's
       an error (UNSATISFIABLE). (If we find an upstream package
       providing the property that corresponds to a system package
       that's being removed, then it's a CONTRADICTION.)

    4. If an already-installed package REQUIRES a property which is
       only provided by a package that is being removed, then that
       package needs to be removed as well.

    5. If an already-installed package REQUIRES a property which is
       only provided by a package that is being upgraded or obsoleted
       (to a new package which does not provide that property), then:

	- if there is an update for the installed package, then update
          the installed package

	- else if there is another installable package that provides
          the required property, then install that.

	- else it's an error (UNSATISFIABLE?)

    CONFLICTS processing

    6. If a package being installed or updated-to CONFLICTS with a
       property provided by an installed package:

	- if there is an update for the installed package, which the
          new package does not conflict with, then update the
          installed package.

	- else it's an error (NEW_CONFLICT)

    7. If an already-installed package CONFLICTS with a property
       provided by a to-be-installed package:

	- if there is an update for the installed package, which does
          not conflict with the new package, then update the installed
          package.

	- else it's an error (NEW_CONFLICT)

    8. If a package being installed or updated-to CONFLICTS with a
       property provided by a to-be-installed package, then it's an
       error (CONTRADICTION).

    OBSOLETES processing. NOTE: OBSOLETES are only matched against
    package names, not against arbitrary provided properties

    9. If a package being installed or updated-to OBSOLETES an
       installed package, then obsolete that package. (ie, remove it,
       but treat it as updated for purposes of dangling REQUIRES).

   10. If an already-installed package OBSOLETES a to-be-installed
       package, then it's an error. (ALREADY_OBSOLETE)

   11. If a package being installed or updated-to OBSOLETES another
       package being installed or updated-to, then it's an error
       (CONTRADICTION).



THE DEPSOLVER
-------------

We start with two razor_sets, system and upstream, and a list of
requested installations and removals.

    FIXME: what about multiple upstream repos? Having to deal with
    arbitrary numbers of razor_sets is possible, but will probably be
    messy... It might be easier to either store all upstream repo data
    in a single .repo file, or else merge all upstream .repo files
    together into a single razor_set at startup. (Or some combination
    of those.)

We create a bit array of the packages in each set, indicating which
ones are installed; the system bitarray starts out all 1s, and the
upstream bitarray all 0s. Each bit is only allowed to change state
once during the transaction; an installed package can be removed, or
an uninstalled package installed, but trying to reinstall a removed
package, or uninstall a newly-installed package is an error. This
means the packages break down into four categories:

    - installed       (1 bit in the system bit array)
    - to-be-removed   (0 bit in the system bit array)
    - to-be-installed (1 bit in the upstream bit array)
    - installable     (0 bit in the upstream bit array)


Depsolver algorithm:

    - Create new razor_transaction_packages ("rtp"s) for each
      requested install or remove. These will be "unresolved", because
      we haven't yet found the razor_packages that correspond to them.

    - while there are new rtps:

	- sort the new rtps

	- Walk the system property list, upstream property list, and
          new rtp list in parallel, and:

	    - For each uninstalled PROVIDES:

		- If the property is a valid package name (that is,
                  either it's a package providing its own name, or it
                  has a matching OBSOLETES), and it matches the name
                  of a new rtp of type INSTALL or FORCED_UPDATE with
                  an unresolved new_package:

		    - If the upstream package has the same version as
		      the system package, we have an UP_TO_DATE error
		      (FIXME: not quite right. This doesn't deal with
		      the case where we try to update an application
		      because of a library update, and it turns out
		      there's no new version of the application, but
		      there IS a compat package containing the old
		      version of the library.)

		    - Otherwise, set the rtp's new_package to point to
		      the package providing this property and set the
		      appropriate bit in the upstream bit array.

	    - For each to-be-installed non-file REQUIRES:

		- See if there's an installed or to-be-installed
		  package that PROVIDES that property.

		- If not, see if there's an installable package that
		  PROVIDES that property, and create a new INSTALL rtp
		  for it if so.

		- If not, see if there's a to-be-removed package that
		  PROVIDES that property. (If we find such a package,
		  we have a CONTRADICTION error.)

		- If none of the above, then we have an UNSATISFIABLE
                  error

	    - For each to-be-installed file REQUIRES:

		- (We create fake file PROVIDES to match file REQUIRES
                  when importing/merging razor sets, so if there is
                  already another installed package that REQUIRES this
                  file, there will be a PROVIDES listed for it as well.)

		- See if there's an installed package that PROVIDES
                  that file.

		- If not, do a binary search of the system file tree
                  looking to see if some installed package provides
                  that file but does not have a PROVIDES for it.

		- If not, see if there's an installable package that
		  PROVIDES that property, and create a new INSTALL rtp
		  for it if so.

		- (If we actually work with multiple upstream
                  razor_sets, then we will need to search the upstream
                  file trees at this point, because it's possible that
                  a package in one upstream repo would require a file
                  in another upstream repo. But if we merge the
                  multiple upstream repos into a single razor_set at
                  some point, then we would not need to do that,
                  because it would be guaranteed that we would have
                  already created a fake PROVIDES if any package
                  provides the file.)

		- If no installed or installable package provides the
		  file, see if there's a to-be-removed package that
		  provides the file. (If we find such a package, we
		  have a CONTRADICTION error.)

		- If none of the above, then we have an UNSATISFIABLE
                  error

	    - For each to-be-installed PROVIDES:

		- Check if the new PROVIDES conflicts with an
		  installed CONFLICTS. If so, create a new
		  FORCED_UPDATE rtp for the installed package, so we
		  can try to upgrade it to a non-conflicting version.
		  (If we can't, we'll have an OLD_CONFLICT error.)

		- Check if the new PROVIDES conflicts with an
                  installed OBSOLETES *and* the PROVIDES property
                  corresponds to the name of its package. (That is,
                  OBSOLETES are only matched against package names,
                  not arbitrary provided properties.) If so, we have
                  an ALREADY_OBSOLETE error.

		- Check if the new PROVIDES conflicts with a
		  to-be-installed CONFLICTS. If so, we have a
		  CONTRADICTION error.

	    - For each to-be-installed CONFLICTS:

		- Basically the reverse of the previous case: check if
		  the new CONFLICTS conflicts with an installed
		  PROVIDES. If so, create a new FORCED_UPDATE rtp for
		  the installed package, so we can try to upgrade it
		  to a non-conflicting version. (If we can't, we'll
		  have an NEW_CONFLICT error.)

		- Check if the new CONFLICTS conflicts with a
		  to-be-installed PROVIDES. If so, we have a
		  CONTRADICTION error.

	    - For each to-be-installed OBSOLETES:

		- Check if there's an installed package that PROVIDES
		  that property. If so, create an OBSOLETED rtp for
		  the installed package.

		- If not, check if there's a to-be-installed package
		  that PROVIDES that property. If so, we have a
		  CONTRADICTION error.


	    - For each installed PROVIDES:

		- If the property is a valid package name (that is,
                  it's a package providing its own name), and it
                  matches the name of a new rtp with an unresolved
                  old_package, then set the rtp's old_package to point
                  to the package providing this property and clear the
                  appropriate bit in the system bit array.

	    - For each to-be-removed PROVIDES:

		- If there's also an identical to-be-installed
		  PROVIDES, we're ok and can skip this

		- Otherwise, for each installed REQUIRES of this
                  property:

		    - Look for some other installed or to-be-installed
		      property that satisfies the REQUIRES.

		    - If there isn't one, then for each installed
		      package in this REQUIRES's package list:

			- If the PROVIDES was lost because the old
			  package was REMOVEd (not FORCED_UPDATE or
			  OBSOLETED), then create a new REMOVE rtp for
			  this package.

			- Otherwise, create a new FORCED_UPDATE rtp
                          for this package.

		- (We don't need to look at to-be-installed REQUIRES
		  of this property, because if there are any, they
		  will cause a CONTRADICTION error when we try to
		  re-satisfy them the next time through.)