Dependency Solver At a very high level, yum's depsolver does something roughly equivalent to: - For each package being installed or removed - For each relevant property (provides, requires, conflicts, obsoletes): - Figure out what additional packages need to be added to or removed from the system to satisfy this property which ends up being roughly O(N^2 * M) where N is the total number of properties and M is the number of packages being acted on. (I just figured that out off the top of my head, and I'm not totally familiar with the yum code, so it may be wrong.) Razor's depsolver is something like: - do { - For each property to be added to or removed from the system: - Figure out what packages need to be added to or removed from the system to satisfy this property - } until we stop adding/remove more packages with the key being that it's very easy to find the PROVIDES corresponding to a REQUIRES and vice versa, because the property arrays are sorted, and so all properties with the same "name" will be adjacent to one another in the array, allowing many dependencies to be satisified in essentially constant time. (Actually... we've been calling it constant, but it's really O(log N) for heavily-depended-on packages, because the more packages you have, the more variations on "requires foo", "requires foo = 1.1", "requires foo > 1.0", etc you're going to have to scan through.) Ideally though, each iteration of the inner loop body happens in constant time, and thus the inner loop as a whole is O(N), and thus the depsolver as a whole is O(N * M) (or at least, less than O(N * M * log N). FILE DEPENDENCIES ----------------- Whenever we add a package with a file REQUIRES to a razor_set, we also add a PROVIDES for that file to the package or packages which provide that file. This means that if we later add another package that requires the same file (eg, /bin/sh or /usr/bin/perl), we can resolve its file requirement exactly like we would resolve a property requirement, in nearly constant time. When adding a *new* file requirement (ie, a requirement on a file that no existing package depends on), we still have to scan through the file tree, which is O(log N) in the number of files. (AFAICT, there's no reason yum couldn't do the same optimization. Also, AFAICT, yum currently sticks property dependencies and file dependencies into the same hash table, so that if any package in the transaction has a file dependency, it causes *property* dependencies to become slower to resolve as well...) THE RULES --------- This is what we have figured out for transaction-solving rules; neither yum nor rpm's algorithm seems to be explained in full anywhere... 1. Every requested install in the initial package set must be satisfied as either a new install or an update: - if the requested package name is the name of an upstream package: - if there is not a corresponding already-installed package, then install the upstream package - else if the upstream package is newer than the already-installed package, then update the package - else it's an error (UP_TO_DATE) - else if the requested package name is the name of an already-installed package: - if there is an upstream package that obsoletes the already-installed package, then behave as though the user had requested that that package be installed instead. - else it's an error (UP_TO_DATE or INSTALL_UNAVAILABLE?) - else it's an error (INSTALL_UNAVAILABLE) 2. Every requested removal in the initial package set must be satisfied as a removal. If any requested package name is not the name of an installed package, it's an error (REMOVE_NOT_INSTALLED) REQUIRES processing: 3. If a package being installed or updated-to REQUIRES a property that is not provided by any installed or to-be-installed package, we need to find an installable package that provides that property. If we find one, install/update it. If not, it's an error (UNSATISFIABLE). (If we find an upstream package providing the property that corresponds to a system package that's being removed, then it's a CONTRADICTION.) 4. If an already-installed package REQUIRES a property which is only provided by a package that is being removed, then that package needs to be removed as well. 5. If an already-installed package REQUIRES a property which is only provided by a package that is being upgraded or obsoleted (to a new package which does not provide that property), then: - if there is an update for the installed package, then update the installed package - else if there is another installable package that provides the required property, then install that. - else it's an error (UNSATISFIABLE?) CONFLICTS processing 6. If a package being installed or updated-to CONFLICTS with a property provided by an installed package: - if there is an update for the installed package, which the new package does not conflict with, then update the installed package. - else it's an error (NEW_CONFLICT) 7. If an already-installed package CONFLICTS with a property provided by a to-be-installed package: - if there is an update for the installed package, which does not conflict with the new package, then update the installed package. - else it's an error (NEW_CONFLICT) 8. If a package being installed or updated-to CONFLICTS with a property provided by a to-be-installed package, then it's an error (CONTRADICTION). OBSOLETES processing. NOTE: OBSOLETES are only matched against package names, not against arbitrary provided properties 9. If a package being installed or updated-to OBSOLETES an installed package, then obsolete that package. (ie, remove it, but treat it as updated for purposes of dangling REQUIRES). 10. If an already-installed package OBSOLETES a to-be-installed package, then it's an error. (ALREADY_OBSOLETE) 11. If a package being installed or updated-to OBSOLETES another package being installed or updated-to, then it's an error (CONTRADICTION). THE DEPSOLVER ------------- We start with two razor_sets, system and upstream, and a list of requested installations and removals. FIXME: what about multiple upstream repos? Having to deal with arbitrary numbers of razor_sets is possible, but will probably be messy... It might be easier to either store all upstream repo data in a single .rzdb file, or else merge all upstream .rzdb files together into a single razor_set at startup. (Or some combination of those.) We create a bit array of the packages in each set, indicating which ones are installed; the system bitarray starts out all 1s, and the upstream bitarray all 0s. Each bit is only allowed to change state once during the transaction; an installed package can be removed, or an uninstalled package installed, but trying to reinstall a removed package, or uninstall a newly-installed package is an error. This means the packages break down into four categories: - installed (1 bit in the system bit array) - to-be-removed (0 bit in the system bit array) - to-be-installed (1 bit in the upstream bit array) - installable (0 bit in the upstream bit array) Depsolver algorithm: - Create new razor_transaction_packages ("rtp"s) for each requested install or remove. These will be "unresolved", because we haven't yet found the razor_packages that correspond to them. - while there are new rtps: - sort the new rtps - Walk the system property list, upstream property list, and new rtp list in parallel, and: - For each uninstalled PROVIDES: - If the property is a valid package name (that is, either it's a package providing its own name, or it has a matching OBSOLETES), and it matches the name of a new rtp of type INSTALL or FORCED_UPDATE with an unresolved new_package: - If the upstream package has the same version as the system package, we have an UP_TO_DATE error (FIXME: not quite right. This doesn't deal with the case where we try to update an application because of a library update, and it turns out there's no new version of the application, but there IS a compat package containing the old version of the library.) - Otherwise, set the rtp's new_package to point to the package providing this property and set the appropriate bit in the upstream bit array. - For each to-be-installed non-file REQUIRES: - See if there's an installed or to-be-installed package that PROVIDES that property. - If not, see if there's an installable package that PROVIDES that property, and create a new INSTALL rtp for it if so. - If not, see if there's a to-be-removed package that PROVIDES that property. (If we find such a package, we have a CONTRADICTION error.) - If none of the above, then we have an UNSATISFIABLE error - For each to-be-installed file REQUIRES: - (We create fake file PROVIDES to match file REQUIRES when importing/merging razor sets, so if there is already another installed package that REQUIRES this file, there will be a PROVIDES listed for it as well.) - See if there's an installed package that PROVIDES that file. - If not, do a binary search of the system file tree looking to see if some installed package provides that file but does not have a PROVIDES for it. - If not, see if there's an installable package that PROVIDES that property, and create a new INSTALL rtp for it if so. - (If we actually work with multiple upstream razor_sets, then we will need to search the upstream file trees at this point, because it's possible that a package in one upstream repo would require a file in another upstream repo. But if we merge the multiple upstream repos into a single razor_set at some point, then we would not need to do that, because it would be guaranteed that we would have already created a fake PROVIDES if any package provides the file.) - If no installed or installable package provides the file, see if there's a to-be-removed package that provides the file. (If we find such a package, we have a CONTRADICTION error.) - If none of the above, then we have an UNSATISFIABLE error - For each to-be-installed PROVIDES: - Check if the new PROVIDES conflicts with an installed CONFLICTS. If so, create a new FORCED_UPDATE rtp for the installed package, so we can try to upgrade it to a non-conflicting version. (If we can't, we'll have an OLD_CONFLICT error.) - Check if the new PROVIDES conflicts with an installed OBSOLETES *and* the PROVIDES property corresponds to the name of its package. (That is, OBSOLETES are only matched against package names, not arbitrary provided properties.) If so, we have an ALREADY_OBSOLETE error. - Check if the new PROVIDES conflicts with a to-be-installed CONFLICTS. If so, we have a CONTRADICTION error. - For each to-be-installed CONFLICTS: - Basically the reverse of the previous case: check if the new CONFLICTS conflicts with an installed PROVIDES. If so, create a new FORCED_UPDATE rtp for the installed package, so we can try to upgrade it to a non-conflicting version. (If we can't, we'll have an NEW_CONFLICT error.) - Check if the new CONFLICTS conflicts with a to-be-installed PROVIDES. If so, we have a CONTRADICTION error. - For each to-be-installed OBSOLETES: - Check if there's an installed package that PROVIDES that property. If so, create an OBSOLETED rtp for the installed package. - If not, check if there's a to-be-installed package that PROVIDES that property. If so, we have a CONTRADICTION error. - For each installed PROVIDES: - If the property is a valid package name (that is, it's a package providing its own name), and it matches the name of a new rtp with an unresolved old_package, then set the rtp's old_package to point to the package providing this property and clear the appropriate bit in the system bit array. - For each to-be-removed PROVIDES: - If there's also an identical to-be-installed PROVIDES, we're ok and can skip this - Otherwise, for each installed REQUIRES of this property: - Look for some other installed or to-be-installed property that satisfies the REQUIRES. - If there isn't one, then for each installed package in this REQUIRES's package list: - If the PROVIDES was lost because the old package was REMOVEd (not FORCED_UPDATE or OBSOLETED), then create a new REMOVE rtp for this package. - Otherwise, create a new FORCED_UPDATE rtp for this package. - (We don't need to look at to-be-installed REQUIRES of this property, because if there are any, they will cause a CONTRADICTION error when we try to re-satisfy them the next time through.)