-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tracking issue for proptest resolve errors. #6258
Comments
Some investigations so far. |
Thinking about it some more, the Update: It looks like it has found a case where the |
During #6130 saw https://ci.appveyor.com/project/rust-lang-libs/cargo/builds/20073278/job/junrs4vbwm6n67bc with |
https://travis-ci.org/rust-lang/cargo/jobs/451581176 |
Looking at the shrunk case with The pure worst underlying problem, is a case like:
We have already picked a incompatible version for each of the dependencies, I have a bunch of larger ideas for how to optimize this, but I should probably get |
This fixes most of the cases fount in rust-lang#6258, in practice if not in theory.
Use a ad hoc trie to avoid slow cases This adds a test for the pure case described [here](#6258 (comment)). When added, this test will time out in debug (~20 sec release) with an `N` of 3. Looking with a profiler, it is spending most of its time in [`Vec.contains`](https://github.com/rust-lang/cargo/blob/d08fd375db0014f097ee3fbdc1c595b632669f6a/src/cargo/core/resolver/conflict_cache.rs#L84). Keeping a deduplicated list with contains is `O(n^2)`, so let's change things to be `Ord` so we can use a btree to deduplicate. Now the profiler claimed that we are spending our time [searching that `Vec`](https://github.com/rust-lang/cargo/blob/d08fd375db0014f097ee3fbdc1c595b632669f6a/src/cargo/core/resolver/conflict_cache.rs#L66). So time to work on that "vaporware", the simplest thing I could think of that lets us check for is_active in better then `O(n)` is nested hashmaps. Each hashmap has keys of a shared Id, and values of hashmaps representing all the conflicts that use that Id. When searching if the key is not active then we need not look at any of its descendants. There are lots of ways to make this better but even this much gets the test to pass with `N` of 3. So maybe we can justify the improvements with `N` of 4? No, eavan in debug `N` of 4 hits the [50k limit](https://github.com/rust-lang/cargo/blob/d08fd375db0014f097ee3fbdc1c595b632669f6a/src/cargo/core/resolver/types.rs#L55), so the that is as far as we need to go on the conflict_cache.
That pure worst case is now in the tests (#6283), and that test is no longer timing out. However that test is still a very efficient way to make the resolver do a lot of work. Bump up the A way to fix it, probably the way, is to more efficiently encode that "a dependency will fail if one of its dependencies cannot be resolved." This is currently done simply by copying the details of the problem from the dependant to the parent. This is simple because the The thought that is in my head at the moment is to make |
I don't know where to save this and don't want to lose it. Proptest does a good job of minimizing the input, but not a great job. It is often true that you can remove a crate (edit: or dep) from the example without changing the error, even after Proptest has done its best. I have found the following script very helpful for finding the extra parts: let the_erroring_case = |input: &[_]| {
let reg = registry(input.to_vec());
let _ = resolve_and_validated(pkg_id("root"), vec![dep("s")], ®);
};
use proptest::strategy::ValueTree;
use proptest::test_runner::TestRunner;
use std::panic::{self, AssertUnwindSafe};
'outer: loop {
let indexes: Vec<usize> = (0..input.len()).collect();
let shuffle = Just(indexes)
.prop_shuffle()
.new_tree(&mut TestRunner::default())
.unwrap()
.current();
for i in shuffle {
let mut new_input = input.clone();
new_input.remove(i);
// the erroring case
if let Err(_) = panic::catch_unwind(AssertUnwindSafe(|| {
the_erroring_case(&new_input);
})) {
input = new_input;
println!("{:?}", PrettyPrintRegistry(input.clone()));
continue 'outer;
}
let mut new_input = input.clone();
let summary = &input[i];
let deps_indexes: Vec<usize> = (0..summary.dependencies().len()).collect();
let deps_shuffle = Just(deps_indexes)
.prop_shuffle()
.new_tree(&mut TestRunner::default())
.unwrap()
.current();
for d in deps_shuffle {
new_input[i] = remove_dep(&summary, d);
// the erroring case
if let Err(_) = panic::catch_unwind(AssertUnwindSafe(|| {
the_erroring_case(&new_input);
})) {
input = new_input;
println!("{:?}", PrettyPrintRegistry(input.clone()));
continue 'outer;
}
}
}
println!("All passed, let's make sure the original still failed.");
the_erroring_case(&input);
} |
https://travis-ci.com/rust-lang/cargo/jobs/177800477 found in #6664 I'm rather unfamiliar with the cargo codebase, but I don't think I've done anything to the resolver. |
https://travis-ci.com/rust-lang/cargo/jobs/184413405 Could be a performance regression in master? |
The profile does show time spent on maintaining the new data structures, but setting The profile also shows a very deep call stack in So I think we got unlucky, but we should keep an eye out for getting unlucky more often do to a regression. |
It seems that a rustfmt build test failed on that as well. I am recording it here in case it gives additional information:
|
Noting a failure in rust-lang/rust#59244 (comment). |
Resolver: A dep is equivalent to one of the things it can resolve to. This is a series of small changes to the resolver, each one on its own is not worth the cherne, but somehow all together we can add a new optimization rule. The result is that the test in #6283 is no longer exponencial (still a large polynomial, cubick?) and so N can be bumped from 3 to 20. This also means that we pass with all the cases reported in #6258. Resolution is NP-Hard, so we are moving the slow cases around. To reduce the chance that we will be flooded by new bad cases I run the 4 proptests overnight, and they did not find a new exponencial case. I would recommend reviewing this commit by commit. As each change is pretty simple on its own, but the mixed diff is harder to follow. This is submitted as one big PR as that is @alexcrichton's preference. A special thanks to @nex3, our conversation was important in convincing me that several of these changes would be needed even in an eventual PubGrub based system. And, the question "why would PubGrub not have a problem with #6283" was wat crystallized this optimization opportunity in my mind.
I got a few errors here: https://travis-ci.org/ehuss/cargo/jobs/514869843 Do these mean it simply ran out of time (i.e., it is running too slowly)? |
No, that is the algorithm doing more work than we thought a test could generate. |
I do not think it included #6776, so hopefully that's it. |
I let it shrink overnight. So I can confirm that it repros on master with 6776. I can also confirm that it repros before 6776, so at least it is not a regression. |
Sorry I forgot to write up the findings. The shrunk and cleaned version is at Eh2406@aedcd27 witch ended up looking like a vertion of #6283 but with each crate have a version that can't be activated for unrelated reasons. This sneaks by the optimization in #6776, which check whether all the versions that satisfy a dependency have the same route problem. To fix this we need to check if the versions have any problem. Specifically check if that version has dependencies that can't be activated. This is an existing optimization, but the existing version is set up to work only just after the parent has been activated, and we need a version that works in the abstract. Unfortunately, build_deps and resolve_features are tied to mutating a |
Caching the dependencies There are 2 sources of facts for the resolver: 1. The `Registry` tells us for a Dependency what versions are available to fulfil it. 2. The `Summary` tells us for a version (and features) what dependencies need to be fulfilled for it to be activated. The `Registry` was cached with a `RegistryQueryer` back in #5112. This adds a `DepsCache` to cache the calculation of which dependencies are activated by features. In the happy path `flag_activated` means that we don't get to reuse `build_deps`, but the more we backtrack the more time we save. In pathological cases like #6258 (comment), I have measured this as a 10% improvement with release. This also means that `build_deps` can be run in a context free way, which may be useful in a follow up PR to solve #6258 (comment).
Small things This has two small changes that are worth saving from my most recent attempt to speedup #6258 (comment): 1. This removes the `ConflictCache::contains` added in #6776. Replacing it with a more general and easier to explain system based on the existing `ConflictCache::find_conflicting`. 2. This adds code to print the used part of the input for failing resolver tests. This is very helpful when 1. The proptest shrinking algorithm is interrupted, at least you have the smallest one found so far. 2. Hand minimizing, remove a dep and it will tell you all the packages that are no longer needed for the test to fail.
#8393 (comment)
|
Hit in a recent PR:
|
Recent failure in #11366:
|
Recent failure in https://github.com/rust-lang/cargo/actions/runs/9170255271/job/25212187538?pr=13687
|
I had a failure on a PR when I didn't change the resolver:
|
On rare occasions, the proptest resolver tests fail. I'm adding this issue as a place where they can be noted to help investigate what is wrong.
The text was updated successfully, but these errors were encountered: