audit universal resolver's robustness with respect to false positives/negatives with routines like "are these two marker expressions disjoint" #5562
Labels: internal (a refactor or improvement that is not user-facing)
Right now, the universal resolver relies on an internal routine, `marker::is_disjoint`, to report whether two marker expressions could ever evaluate to `true` for the same marker environment. If they can't, then they are considered disjoint.
At present, we use this to determine whether to create forks between conflicting dependency specifications. For example:
It is indeed the case that `sys_platform == 'linux'` and `sys_platform == 'windows'` can literally never be active for the same marker environment, and thus they are considered disjoint. This in turn "allows" the universal resolver to fork. (Although we are very likely going to need to allow forks even in cases of non-disjointness, as explored in #4732.)
But we do use disjointness checking elsewhere. For example, when creating a fork inside the resolver, we remove any dependencies whose marker expressions are disjoint with that fork's marker expression. This is because those dependencies can never appear in environments that use that fork. But if disjointness checking is wrong, then this could lead to incorrect results.
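To make the pruning step concrete, here is a minimal sketch. It is not the resolver's actual code: the `Marker` and `Dependency` types, the `prune_for_fork` helper, and this toy `is_disjoint` are all hypothetical stand-ins, and the real `marker::is_disjoint` signature may look quite different. The toy marker model only knows `sys_platform == "<value>"` and an always-true marker.

```rust
/// Toy marker model (an assumption for illustration): either always true,
/// or a single `sys_platform == "<value>"` comparison.
#[derive(Debug, Clone, PartialEq)]
enum Marker {
    True,
    SysPlatformEq(String),
}

/// Toy disjointness check: two markers are reported disjoint only when both
/// pin `sys_platform` to different values. Everything else is treated as
/// possibly overlapping.
fn is_disjoint(a: &Marker, b: &Marker) -> bool {
    match (a, b) {
        (Marker::SysPlatformEq(x), Marker::SysPlatformEq(y)) => x != y,
        _ => false,
    }
}

/// Hypothetical dependency record: a name, a version specifier, and a marker.
#[derive(Debug, Clone)]
struct Dependency {
    name: String,
    specifier: String,
    marker: Marker,
}

/// Keep only the dependencies that could still apply inside a fork, i.e.
/// those whose marker is NOT reported disjoint with the fork's marker.
fn prune_for_fork<'a>(deps: &'a [Dependency], fork: &Marker) -> Vec<&'a Dependency> {
    deps.iter()
        .filter(|dep| !is_disjoint(&dep.marker, fork))
        .collect()
}
```

In this sketch, pruning a `sys_platform == 'linux'` fork drops a dependency marked `sys_platform == 'windows'` and keeps unmarked ones. It also shows why this caller is sensitive to the direction of any error: if `is_disjoint` wrongly returned `true`, the filter would silently drop a dependency that the fork actually needs.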
What is known is that our disjointness checking is wrong today. Specifically, I believe perfect disjointness checking is NP-complete, so right now we only use heuristics. What is not quite known is whether its "wrongness" is limited to false positives, limited to false negatives, or includes both. Moreover, I do not believe the callers of disjointness checking have been audited for which kind of error they are robust to. Sometimes false negatives are okay, in the sense that they might just lead to not being able to produce a resolution, for example.
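One way such an audit could be approached is to compare the heuristic against a brute-force check over a small, finite set of marker environments. The sketch below is only an illustration under stated assumptions: the `Env`, `Key`, `Marker`, and `Verdict` types, the two-field environment, the `universe` of values, and the deliberately weak `heuristic_disjoint` are all hypothetical. It treats "disjoint" as the positive answer, so a false positive means "reported disjoint, but some environment satisfies both", and a false negative means "reported overlapping, but no environment satisfies both".

```rust
/// Hypothetical two-field marker environment.
#[derive(Debug, Clone, Copy)]
struct Env {
    sys_platform: &'static str,
    os_name: &'static str,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Key {
    SysPlatform,
    OsName,
}

/// Toy marker expressions: equality, inequality, and conjunction.
#[derive(Debug, Clone)]
enum Marker {
    Eq(Key, &'static str),
    Ne(Key, &'static str),
    And(Box<Marker>, Box<Marker>),
}

impl Marker {
    /// Evaluate the marker against one concrete environment.
    fn eval(&self, env: &Env) -> bool {
        let get = |key: &Key| match key {
            Key::SysPlatform => env.sys_platform,
            Key::OsName => env.os_name,
        };
        match self {
            Marker::Eq(k, v) => get(k) == *v,
            Marker::Ne(k, v) => get(k) != *v,
            Marker::And(a, b) => a.eval(env) && b.eval(env),
        }
    }
}

/// Assumed finite universe of environments. Real marker values are
/// open-ended, so this is only "ground truth" relative to this set.
fn universe() -> Vec<Env> {
    let mut envs = Vec::new();
    for &sys_platform in ["linux", "windows", "darwin"].iter() {
        for &os_name in ["posix", "nt"].iter() {
            envs.push(Env { sys_platform, os_name });
        }
    }
    envs
}

/// Brute force: disjoint iff no environment in the universe satisfies both.
fn exactly_disjoint(a: &Marker, b: &Marker, envs: &[Env]) -> bool {
    !envs.iter().any(|env| a.eval(env) && b.eval(env))
}

/// Deliberately weak heuristic: only recognizes `k == x` vs `k == y`.
fn heuristic_disjoint(a: &Marker, b: &Marker) -> bool {
    match (a, b) {
        (Marker::Eq(k1, v1), Marker::Eq(k2, v2)) => k1 == k2 && v1 != v2,
        _ => false,
    }
}

#[derive(Debug, PartialEq)]
enum Verdict {
    Agree,
    FalsePositive,
    FalseNegative,
}

/// Classify how the heuristic's answer relates to the brute-force answer.
fn audit(a: &Marker, b: &Marker, envs: &[Env]) -> Verdict {
    match (heuristic_disjoint(a, b), exactly_disjoint(a, b, envs)) {
        (true, false) => Verdict::FalsePositive,
        (false, true) => Verdict::FalseNegative,
        _ => Verdict::Agree,
    }
}
```

With this toy heuristic, `sys_platform == 'linux'` versus `sys_platform != 'linux'` is classified as a false negative, because the heuristic does not understand `!=`. A heuristic that only ever answers "disjoint" when it has a proof, as this one does, can produce false negatives but not false positives. Whether the real routine has that property is exactly the open question.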
This is also something of a moving target, given #4732 and the work that @ibraheemdev is doing on marker normalization.