Dependency Solver
At a very high level, yum's depsolver does something roughly
equivalent to:
  - For each package being installed or removed
    - For each relevant property (provides, requires, conflicts,
      obsoletes):
      - Figure out what additional packages need to be added to
        or removed from the system to satisfy this property
which ends up being roughly O(N^2 * M) where N is the total number of
properties and M is the number of packages being acted on.
(I just figured that out off the top of my head, and I'm not totally
familiar with the yum code, so it may be wrong.)
Razor's depsolver is something like:
  - do {
    - For each property to be added to or removed from the system:
      - Figure out what packages need to be added to or removed
        from the system to satisfy this property
  - } until we stop adding/removing packages
with the key being that it's very easy to find the PROVIDES
corresponding to a REQUIRES and vice versa, because the property
arrays are sorted, and so all properties with the same "name" will be
adjacent to one another in the array, allowing many dependencies to be
satisfied in essentially constant time. (Actually... we've been
calling it constant, but it's really O(log N) for heavily-depended-on
packages, because the more packages you have, the more variations on
"requires foo", "requires foo = 1.1", "requires foo > 1.0", etc you're
going to have to scan through.)
Ideally though, each iteration of the inner loop body happens in
constant time, and thus the inner loop as a whole is O(N), and thus
the depsolver as a whole is O(N * M) (or at least, less than
O(N * M * log N)).
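
As a rough illustration of why sorted property arrays help, here is a minimal Python sketch (razor itself is not written this way; the tuple layout and the name match_requires are made up for this example). It walks a sorted REQUIRES list and a sorted PROVIDES list in parallel, so each REQUIRES only has to look at the adjacent run of PROVIDES with the same name:

```python
def match_requires(requires, provides):
    """Pair each REQUIRES name with the packages that PROVIDE it.

    Both inputs are lists of (name, package) tuples sorted by name.
    This layout is an assumption for the sketch, not razor's real format.
    """
    matches = {}
    i = 0
    for name, _pkg in requires:
        # Skip provides that sort before the name we are looking for.
        # Because both lists are sorted, i never moves backwards.
        while i < len(provides) and provides[i][0] < name:
            i += 1
        # Collect the adjacent run of provides sharing this name.
        j = i
        found = []
        while j < len(provides) and provides[j][0] == name:
            found.append(provides[j][1])
            j += 1
        matches[name] = found
    return matches


requires = [("bar", "app"), ("foo", "app"), ("zlib", "app")]
provides = [
    ("bar", "bar-2.0"),
    ("foo", "foo-1.1"),
    ("foo", "foo-compat-1.2"),
    ("libc", "glibc"),
]
result = match_requires(requires, provides)
```

The run of "foo" entries is the part that grows for heavily-depended-on packages, which is where the extra scanning cost mentioned above comes from.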
FILE DEPENDENCIES
-----------------
Whenever we add a package with a file REQUIRES to a razor_set, we also
add a PROVIDES for that file to the package or packages which provide
that file. This means that if we later add another package that
requires the same file (eg, /bin/sh or /usr/bin/perl), we can resolve
its file requirement exactly like we would resolve a property
requirement, in nearly constant time.
When adding a *new* file requirement (ie, a requirement on a file that
no existing package depends on), we still have to scan through the
file tree, which is O(log N) in the number of files.
(AFAICT, there's no reason yum couldn't do the same optimization.
Also, AFAICT, yum currently sticks property dependencies and file
dependencies into the same hash table, so that if any package in the
transaction has a file dependency, it causes *property* dependencies
to become slower to resolve as well...)
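
A minimal Python sketch of the file PROVIDES idea, under some simplifying assumptions: the class FileIndex and its fields are hypothetical, each file has a single owner, the cached PROVIDES live in a dict instead of razor's sorted property array, and the PROVIDES is created at lookup time rather than when the package is added to the set. The first lookup of a path searches the sorted file list; later lookups of the same path hit the cached PROVIDES:

```python
from bisect import bisect_left


class FileIndex:
    """Hypothetical stand-in for a razor_set's file tree plus file PROVIDES."""

    def __init__(self, files):
        # files: iterable of (path, owning package); kept sorted for bisect.
        self.files = sorted(files)
        self.paths = [path for path, _ in self.files]
        self.file_provides = {}  # path -> package, filled in on first use
        self.tree_searches = 0   # counts how often we took the slow path

    def provider_of(self, path):
        # Fast path: an earlier requirement already created a PROVIDES.
        if path in self.file_provides:
            return self.file_provides[path]
        # Slow path: O(log N) binary search of the sorted file list.
        self.tree_searches += 1
        i = bisect_left(self.paths, path)
        if i < len(self.paths) and self.paths[i] == path:
            self.file_provides[path] = self.files[i][1]
            return self.files[i][1]
        return None


index = FileIndex([("/bin/sh", "bash"), ("/usr/bin/perl", "perl")])
first = index.provider_of("/bin/sh")   # searches the file list
second = index.provider_of("/bin/sh")  # answered from the cached PROVIDES
```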
THE RULES
---------
This is what we have figured out for transaction-solving rules;
neither yum nor rpm's algorithm seems to be explained in full
anywhere...
1. Every requested install in the initial package set must be
   satisfied as either a new install or an update:
   - if the requested package name is the name of an upstream
     package:
     - if there is not a corresponding already-installed
       package, then install the upstream package
     - else if the upstream package is newer than the
       already-installed package, then update the package
     - else it's an error (UP_TO_DATE)
   - else if the requested package name is the name of an
     already-installed package:
     - if there is an upstream package that obsoletes the
       already-installed package, then behave as though the
       user had requested that that package be installed
       instead.
     - else it's an error (UP_TO_DATE or INSTALL_UNAVAILABLE?)
   - else it's an error (INSTALL_UNAVAILABLE)
2. Every requested removal in the initial package set must be
   satisfied as a removal. If any requested package name is not
   the name of an installed package, it's an error
   (REMOVE_NOT_INSTALLED).
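
Rules 1 and 2 can be sketched as a pair of small Python functions. Everything here is illustrative: the function names, the dict-based package sets, and the use of plain tuples for versions are assumptions (real rpm version comparison is more involved), and the obsoleters map is assumed to point only at upstream package names. Where the rule is unsure which error applies, the sketch arbitrarily picks UP_TO_DATE:

```python
def resolve_install(name, installed, upstream, obsoleters):
    """Apply rule 1 to one requested install.

    installed, upstream: dict of package name -> version tuple.
    obsoleters: dict of installed package name -> name of an upstream
    package that obsoletes it (assumed to exist in upstream).
    """
    if name in upstream:
        if name not in installed:
            return ("install", name)
        if upstream[name] > installed[name]:
            return ("update", name)
        return ("error", "UP_TO_DATE")
    if name in installed:
        if name in obsoleters:
            # Behave as though the obsoleting package had been requested.
            return resolve_install(obsoleters[name], installed, upstream,
                                   obsoleters)
        # The rule leaves this open (UP_TO_DATE or INSTALL_UNAVAILABLE?).
        return ("error", "UP_TO_DATE")
    return ("error", "INSTALL_UNAVAILABLE")


def resolve_remove(name, installed):
    """Apply rule 2 to one requested removal."""
    if name in installed:
        return ("remove", name)
    return ("error", "REMOVE_NOT_INSTALLED")


installed = {"foo": (1, 0), "old": (1, 0), "cur": (2, 0)}
upstream = {"foo": (1, 1), "new": (1, 0), "cur": (2, 0), "fresh": (0, 1)}
obsoleters = {"old": "new"}
```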
REQUIRES processing:
3. If a package being installed or updated-to REQUIRES a property
   that is not provided by any installed or to-be-installed
   package, we need to find an installable package that provides
   that property. If we find one, install/update it. If not, it's
   an error (UNSATISFIABLE). (If we find an upstream package
   providing the property that corresponds to a system package
   that's being removed, then it's a CONTRADICTION.)
4. If an already-installed package REQUIRES a property which is
   only provided by a package that is being removed, then that
   package needs to be removed as well.
5. If an already-installed package REQUIRES a property which is
   only provided by a package that is being upgraded or obsoleted
   (to a new package which does not provide that property), then:
   - if there is an update for the installed package, then update
     the installed package
   - else if there is another installable package that provides
     the required property, then install that.
   - else it's an error (UNSATISFIABLE?)
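
The difference between rules 4 and 5 is only in why the provider went away. A small Python sketch of that decision follows; the function name, the string values for the reason, and the boolean/optional arguments are all invented for this example and stand in for lookups the real depsolver would do itself:

```python
def fix_dangling_requires(pkg, lost_because, has_update, alternative_provider):
    """Decide what to do with an installed package whose REQUIRES lost
    its only provider.

    lost_because: "removed", "updated" or "obsoleted" (hypothetical labels).
    has_update: True if upstream has a newer version of pkg.
    alternative_provider: name of another installable package providing
    the property, or None.
    """
    if lost_because == "removed":
        # Rule 4: the dependent package goes away with its provider.
        return ("remove", pkg)
    # Rule 5: the provider was upgraded or obsoleted.
    if has_update:
        return ("update", pkg)
    if alternative_provider is not None:
        return ("install", alternative_provider)
    return ("error", "UNSATISFIABLE")
```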
CONFLICTS processing:
6. If a package being installed or updated-to CONFLICTS with a
   property provided by an installed package:
   - if there is an update for the installed package, which the
     new package does not conflict with, then update the
     installed package.
   - else it's an error (NEW_CONFLICT)
7. If an already-installed package CONFLICTS with a property
   provided by a to-be-installed package:
   - if there is an update for the installed package, which does
     not conflict with the new package, then update the installed
     package.
   - else it's an error (NEW_CONFLICT)
8. If a package being installed or updated-to CONFLICTS with a
   property provided by a to-be-installed package, then it's an
   error (CONTRADICTION).
OBSOLETES processing. NOTE: OBSOLETES are only matched against
package names, not against arbitrary provided properties.
9. If a package being installed or updated-to OBSOLETES an
   installed package, then obsolete that package. (ie, remove it,
   but treat it as updated for purposes of dangling REQUIRES).
10. If an already-installed package OBSOLETES a to-be-installed
    package, then it's an error. (ALREADY_OBSOLETE)
11. If a package being installed or updated-to OBSOLETES another
    package being installed or updated-to, then it's an error
    (CONTRADICTION).
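
To make the "package names only" point concrete, here is a small Python sketch of rules 9 and 11 (rule 10 is left out). The function name and the dict layout (package name mapped to its set of provided properties) are assumptions for illustration. Only the dict keys, i.e. the package names, are ever compared against the OBSOLETES target:

```python
def process_obsoletes(target, installed, to_be_installed):
    """Handle one OBSOLETES entry of a to-be-installed package.

    installed, to_be_installed: dict of package name -> set of provided
    property names (hypothetical layout). Provided properties are
    deliberately ignored; only package names can match.
    """
    if target in installed:
        # Rule 9: remove it, but treat it as updated for dangling REQUIRES.
        return ("obsolete", target)
    if target in to_be_installed:
        # Rule 11: two packages in the same transaction disagree.
        return ("error", "CONTRADICTION")
    return ("ok", None)


installed = {"sendmail": {"sendmail", "smtpdaemon"}}
to_be_installed = {"exim": {"exim", "smtpdaemon"}}
```

With this data, an OBSOLETES on "sendmail" obsoletes the installed package, while an OBSOLETES on "smtpdaemon" matches nothing, even though both packages provide that property.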
THE DEPSOLVER
-------------
We start with two razor_sets, system and upstream, and a list of
requested installations and removals.
FIXME: what about multiple upstream repos? Having to deal with
arbitrary numbers of razor_sets is possible, but will probably be
messy... It might be easier to either store all upstream repo data
in a single .rzdb file, or else merge all upstream .rzdb files
together into a single razor_set at startup. (Or some combination
of those.)
We create a bit array of the packages in each set, indicating which
ones are installed; the system bitarray starts out all 1s, and the
upstream bitarray all 0s. Each bit is only allowed to change state
once during the transaction; an installed package can be removed, or
an uninstalled package installed, but trying to reinstall a removed
package, or uninstall a newly-installed package is an error. This
means the packages break down into four categories:
- installed (1 bit in the system bit array)
- to-be-removed (0 bit in the system bit array)
- to-be-installed (1 bit in the upstream bit array)
- installable (0 bit in the upstream bit array)
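
A minimal Python sketch of the "each bit may change only once" rule. The class name PackageBits and the helper category are hypothetical, and a real implementation would presumably pack the bits instead of using Python lists:

```python
class PackageBits:
    """One bit per package, where each bit may change state at most once."""

    def __init__(self, count, initial):
        self.bits = [initial] * count
        self.changed = [False] * count

    def flip(self, index):
        if self.changed[index]:
            # e.g. reinstalling a removed package, or uninstalling a
            # newly-installed one
            raise ValueError("package %d already changed state" % index)
        self.bits[index] = not self.bits[index]
        self.changed[index] = True


def category(is_system, bit):
    """Map a bit to one of the four categories listed above."""
    if is_system:
        return "installed" if bit else "to-be-removed"
    return "to-be-installed" if bit else "installable"


system = PackageBits(3, True)     # system bit array starts out all 1s
upstream = PackageBits(2, False)  # upstream bit array starts out all 0s
system.flip(0)                    # remove system package 0
upstream.flip(1)                  # install upstream package 1
```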
Depsolver algorithm:
  - Create new razor_transaction_packages ("rtp"s) for each
    requested install or remove. These will be "unresolved", because
    we haven't yet found the razor_packages that correspond to them.
  - while there are new rtps:
    - sort the new rtps
    - Walk the system property list, upstream property list, and
      new rtp list in parallel, and:
      - For each uninstalled PROVIDES:
        - If the property is a valid package name (that is,
          either it's a package providing its own name, or it
          has a matching OBSOLETES), and it matches the name
          of a new rtp of type INSTALL or FORCED_UPDATE with
          an unresolved new_package:
          - If the upstream package has the same version as
            the system package, we have an UP_TO_DATE error
            (FIXME: not quite right. This doesn't deal with
            the case where we try to update an application
            because of a library update, and it turns out
            there's no new version of the application, but
            there IS a compat package containing the old
            version of the library.)
          - Otherwise, set the rtp's new_package to point to
            the package providing this property and set the
            appropriate bit in the upstream bit array.
      - For each to-be-installed non-file REQUIRES:
        - See if there's an installed or to-be-installed
          package that PROVIDES that property.
        - If not, see if there's an installable package that
          PROVIDES that property, and create a new INSTALL rtp
          for it if so.
        - If not, see if there's a to-be-removed package that
          PROVIDES that property. (If we find such a package,
          we have a CONTRADICTION error.)
        - If none of the above, then we have an UNSATISFIABLE
          error.
      - For each to-be-installed file REQUIRES:
        - (We create fake file PROVIDES to match file REQUIRES
          when importing/merging razor sets, so if there is
          already another installed package that REQUIRES this
          file, there will be a PROVIDES listed for it as well.)
        - See if there's an installed package that PROVIDES
          that file.
        - If not, do a binary search of the system file tree
          looking to see if some installed package provides
          that file but does not have a PROVIDES for it.
        - If not, see if there's an installable package that
          PROVIDES that property, and create a new INSTALL rtp
          for it if so.
        - (If we actually work with multiple upstream
          razor_sets, then we will need to search the upstream
          file trees at this point, because it's possible that
          a package in one upstream repo would require a file
          in another upstream repo. But if we merge the
          multiple upstream repos into a single razor_set at
          some point, then we would not need to do that,
          because it would be guaranteed that we would have
          already created a fake PROVIDES if any package
          provides the file.)
        - If no installed or installable package provides the
          file, see if there's a to-be-removed package that
          provides the file. (If we find such a package, we
          have a CONTRADICTION error.)
        - If none of the above, then we have an UNSATISFIABLE
          error.
      - For each to-be-installed PROVIDES:
        - Check if the new PROVIDES conflicts with an
          installed CONFLICTS. If so, create a new
          FORCED_UPDATE rtp for the installed package, so we
          can try to upgrade it to a non-conflicting version.
          (If we can't, we'll have an OLD_CONFLICT error.)
        - Check if the new PROVIDES conflicts with an
          installed OBSOLETES *and* the PROVIDES property
          corresponds to the name of its package. (That is,
          OBSOLETES are only matched against package names,
          not arbitrary provided properties.) If so, we have
          an ALREADY_OBSOLETE error.
        - Check if the new PROVIDES conflicts with a
          to-be-installed CONFLICTS. If so, we have a
          CONTRADICTION error.
      - For each to-be-installed CONFLICTS:
        - Basically the reverse of the previous case: check if
          the new CONFLICTS conflicts with an installed
          PROVIDES. If so, create a new FORCED_UPDATE rtp for
          the installed package, so we can try to upgrade it
          to a non-conflicting version. (If we can't, we'll
          have a NEW_CONFLICT error.)
        - Check if the new CONFLICTS conflicts with a
          to-be-installed PROVIDES. If so, we have a
          CONTRADICTION error.
      - For each to-be-installed OBSOLETES:
        - Check if there's an installed package that PROVIDES
          that property. If so, create an OBSOLETED rtp for
          the installed package.
        - If not, check if there's a to-be-installed package
          that PROVIDES that property. If so, we have a
          CONTRADICTION error.
      - For each installed PROVIDES:
        - If the property is a valid package name (that is,
          it's a package providing its own name), and it
          matches the name of a new rtp with an unresolved
          old_package, then set the rtp's old_package to point
          to the package providing this property and clear the
          appropriate bit in the system bit array.
      - For each to-be-removed PROVIDES:
        - If there's also an identical to-be-installed
          PROVIDES, we're ok and can skip this.
        - Otherwise, for each installed REQUIRES of this
          property:
          - Look for some other installed or to-be-installed
            property that satisfies the REQUIRES.
          - If there isn't one, then for each installed
            package in this REQUIRES's package list:
            - If the PROVIDES was lost because the old
              package was REMOVEd (not FORCED_UPDATE or
              OBSOLETED), then create a new REMOVE rtp for
              this package.
            - Otherwise, create a new FORCED_UPDATE rtp
              for this package.
        - (We don't need to look at to-be-installed REQUIRES
          of this property, because if there are any, they
          will cause a CONTRADICTION error when we try to
          re-satisfy them the next time through.)
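
The order of checks for a to-be-installed non-file REQUIRES can be sketched in Python as below. This is only an outline of the four-way decision, not the real walk over the sorted property lists: the function name, the result tuples, and the use of sets and a dict keyed by property name are all assumptions made for the example.

```python
def resolve_requires(prop, installed, to_be_installed, installable,
                     to_be_removed):
    """Classify one to-be-installed non-file REQUIRES.

    installed, to_be_installed, to_be_removed: sets of provided property
    names in each category. installable: dict of property name -> name of
    an installable package providing it. All hypothetical layouts.
    """
    if prop in installed or prop in to_be_installed:
        return ("satisfied", None)
    if prop in installable:
        # Queue a new INSTALL rtp; it is resolved on the next pass.
        return ("new_rtp", ("INSTALL", installable[prop]))
    if prop in to_be_removed:
        return ("error", "CONTRADICTION")
    return ("error", "UNSATISFIABLE")


installed = {"libc.so.6"}
to_be_installed = {"foo"}
installable = {"libbar.so.1": "libbar"}
to_be_removed = {"libold.so.0"}
```

Each new rtp returned this way feeds the outer "while there are new rtps" loop, which is why the solver keeps iterating until a pass produces no new rtps.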