Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: merge dependencies for better error messages #163

Merged
merged 8 commits into from
Dec 17, 2023
Merged

feat: merge dependencies for better error messages #163

merged 8 commits into from
Dec 17, 2023

Conversation

Eh2406
Copy link
Member

@Eh2406 Eh2406 commented Nov 30, 2023

This is the "simplest thing that could possibly work", when adding a new dependency incompatibility we check to see if there is an existing incompatibility with the same parent package and dependent package and dependent VS, if so we construct a new incompatibility that represents the merging of the two incompatibilities. We remove the old incompatibility from all lookup tables and at the new one. This effectively brings us on par with Dart for #19. When it applies this can massively reduce the number of incompatibilities the algorithm scans through. So it may help a little bit with #135, depending on whether there real problem contain this exact case. The union of versions done when merging incompatibilities here is definitely a candidate for #163.

The big advantage of this is the improvement to error messages. In fact this is split up into several commits the first one as a test case with the old error message and the second one shows the new error message. Our error messages tend to have "holes" constructed by discovering the individual versions don't work. When each version doesn't work for a different reason then we are in a complicated mess and the "holes" are the least of the users problems. In this case, the only reason we discovered the issues one at a time is that we loaded the data one version at a time, which is not useful information for the user to see. I consider this improvement to error messages as justifying a lot of "half-baked" details about this approach.

Unfortunately this is pure overhead when the optimization does not apply. Most problematically this does not apply if we never load more than one version of a package, that is to say when we easily construct a solution. All told this adds a 5% to 20% cost on our benchmarks. After several days of experimenting this slowdown is caused not by any of the inefficient/half-baked things that I would like to improve. :-( Instead it is caused by constructing, populating, and dropping the additional hash table.

Can you think of a more space efficient way to find the incompatibility to merge with?
Do you think the performance cost is worth the error message improvement?
What is the performance cost on your preferred workload?

@mpizenberg
Copy link
Member

The merge_incompatibilities function is a bit hard to read (or maybe it's getting too late ^^). If you could re-explain the approach it would help.

In my point of view, improving error messages and some worst case scenarios, at the cost of degrading the easy solutions is worth it.

@Eh2406
Copy link
Member Author

Eh2406 commented Dec 1, 2023

You are not wrong. I got distracted by the performance problems and forgot to clean up the code. Sorry.

@Eh2406
Copy link
Member Author

Eh2406 commented Dec 1, 2023

Rebased and added a commit splitting things up and added some comments. Hopefully this is clearer.

@Eh2406
Copy link
Member Author

Eh2406 commented Dec 1, 2023

I have a variation that does not have a dedicated map for dependencies, but instead scans the full list of incompatibilities for matches. In the happy path this is much faster, as there are less allocations. Unsurprisingly, larger examples grind to a halt spending all their time checking incompatibilities that are not dependencies.

I'm still thinking about ways to have our cake and eat it too. Something that only allocates if we are in a large case where it would be more efficient, and uses scanning when inputs are small. But I have not thought of result that is efficient in both cases and simple to implement.

@mpizenberg
Copy link
Member

mpizenberg commented Dec 5, 2023

I haven't taken the time to re-read the code yet. However I've watched packaging con 2023. This talk https://youtu.be/z0AB4g2KDX8?si=tpz1DOEYdLXVmPJw about explainability of errors in spack could be relevant. One idea is that overhead can only be run in a second stage, if an error was found. This is akin to the "simplify" that was recently merged, but I guess something like this could also be done for what is proposed here. Something like follows:

  1. solve without overhead
  2. if that's a success, fine nothing more to do
  3. if that's an error, run a simplification step on all recorded incompatibilities and rerun the solver with simplified recorded incompats already in.

With the knowledge that (3) brings I guess that second run should be fairly fast, and with simplified incompatibilities maybe it takes better shortcuts and give a better erroring path overall?

@Eh2406
Copy link
Member Author

Eh2406 commented Dec 5, 2023

I tested some of the larger examples I generated from cargo use cases. They do not hit this pattern. The tool I have for generating these examples is very out of date, I did not look to see if the tool could be updated to meet this pattern.

This also gave me the opportunity to measure the absolute overhead of this approach, which is quite small. The relative overhead is high because Elm if the only input to have this pattern in my benchmark suite and it has very little backtracking.

I'd like to hear from @zanieb or @konstin, about whether there real world use case it's this code path. If it does, we can always improve performance later.

@konstin
Copy link
Member

konstin commented Dec 13, 2023

There is a nice speedup:

Benchmark 1: ./main
  Time (mean ± σ):      8.028 s ±  0.439 s    [User: 7.956 s, System: 0.076 s]
  Range (min … max):    7.550 s …  8.956 s    10 runs
 
  Warning: Ignoring non-zero exit code.
 
Benchmark 2: ./branch
  Time (mean ± σ):      5.673 s ±  0.288 s    [User: 5.355 s, System: 0.321 s]
  Range (min … max):    5.411 s …  6.238 s    10 runs
 
  Warning: Ignoring non-zero exit code.
 
Summary
  ./branch ran
    1.42 ± 0.11 times faster than ./main

@konstin
Copy link
Member

konstin commented Dec 13, 2023

The error message becomes 5 lines instead of 2000

@zanieb
Copy link
Member

zanieb commented Dec 13, 2023

Are the lines unreadably long? That's my primary concern, although perhaps we can handle splitting lines into readable pieces separately.

I'm making good progress on packaging scenarios for testing error messages at https://github.com/zanieb/packse

@Eh2406
Copy link
Member Author

Eh2406 commented Dec 13, 2023

I believe, as of this PR, the lines are still long. However, they get much shorter if simplify is run on them. That can be done in error reporting with the code as it currently is. Alternatively simplify can be added to the algorithm to apply it in at least this case. Witch I plan to play with in a follow up PR.

@mpizenberg can / do you want to look at this before it gets merged?

@mpizenberg
Copy link
Member

I haven't heard about long lines before. Is this something discussed in the office hours?

I can have another look at this Friday if you are ok waiting two more days. Otherwise I trust you.

@mpizenberg
Copy link
Member

I believe, as of this PR, the lines are still long. However, they get much shorter if simplify is run on them.

I'm curious about an example for this if possible.

@Eh2406
Copy link
Member Author

Eh2406 commented Dec 13, 2023

I haven't heard about long lines before. Is this something discussed in the office hours?

I think the test added in this PR is a great example. In the before case there were O(#vertions) lines which were O(#vertions) long. After there are 2 lines but the lines are still O(#vertions). The test case has #vertions == 6, but the real world case has hundreds or thousands of versions.

I believe, as of this PR, the lines are still long. However, they get much shorter if simplify is run on them.

I'm curious about an example for this if possible.

The line

Because there is no available version for bar and foo 1 | 2 | 3 | 4 | 5 depends on bar, foo 1 | 2 | 3 | 4 | 5 is forbidden.

Become something like

Because there is no available version for bar and foo * depends on bar, foo * is forbidden.

The line

And because there is no version of foo in <1 | >1, <2 | >2, <3 | >3, <4 | >4, <5 | >5 and root 1 depends on foo, root 1 is forbidden.

Become something like

And because there is no version of foo in (the empty set) and root 1 depends on foo, root 1 is forbidden.

I would like to defer conversation about future improvements to future PR's/issues, so that we can focus on getting this improvement merged.

I can have another look at this Friday if you are ok waiting two more days. Otherwise I trust you.

Happy to wait! If I don't get an update, I will merge on Monday. If you spot something after that open an issue and will get it fixed.

@@ -30,6 +30,8 @@ pub struct State<P: Package, VS: VersionSet, Priority: Ord + Clone> {
/// and will stay that way until the next conflict and backtrack is operated.
contradicted_incompatibilities: rustc_hash::FxHashSet<IncompId<P, VS>>,

dependencies: Map<(P, P), SmallVec<IncompId<P, VS>>>,
Copy link
Member

@mpizenberg mpizenberg Dec 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you already tested it but I'm gonna ask just in case. Considering we only merge dependants where the dependency is exactly the same, would it make sense to store the dependency in the key? It does increase memory so it's probably worse, but it would prevent from looping through all incompats for the packages pair.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose these package pairs are generally small anyway since you opted for SmallVec so yeah maybe it's a bad trade off. But still curious to hear from if you tried it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering we only merge dependants where the dependency is exactly the same, would it make sense to store the dependency in the key? It does increase memory so it's probably worse, but it would prevent from looping through all incompats for the packages pair.

I would absolutely love to do this. Unfortunately VS is neither Hash nor Ord, so I don't know of an efficient data structure for using them as a key.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right, Ord would not make sense for sets. Hash could make sense. Maybe we can make a note of this, to explore in the future if it turns out some real-world use case end up being in the worst case scenario for this (many different ranges for a package pair, causing O(n²) complexity).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Realistically, this can only happen for a package pair where there are many versions of each, their versions are bumped in sync. Something like two packages from the same workspace maybe if versions are strictly specified. Or for things like features (in our "pubgrub limitations" guide section) where we pair exactly a feature version to the bare crate version. Hum, maybe not so unlikely ...

Copy link
Member Author

@Eh2406 Eh2406 Dec 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. "Features" is why none of the cargo benchmarks hit this code path.

I'm not too afraid of the O(n²) overhead here. Every time we merge dependencies we follow it up by doing unit_propagation, which does a linear scan over an even larger collection of incompatibilities. (Well... unit_propagation has "previously_contradicted" as an optimization. So theoretically speaking unit_propagation is not guaranteed to be slower, but so far it has been in practice.) That is to say it has the same O(n²) shaped, with a larger n. If this becomes a hotspot in a resolution that is slow we can investigate a fixed then.

Copy link
Member

@mpizenberg mpizenberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a few cosmetic changes, changing some names and comments but otherwise I've nothing to add. Looks good to me if you confirm there isn't a big performance hit.

@Eh2406
Copy link
Member Author

Eh2406 commented Dec 17, 2023

I will rerun and annotate my usual benchmarks at some point today.

@Eh2406
Copy link
Member Author

Eh2406 commented Dec 17, 2023

large_cases/elm_str_SemanticVersion.ron
                        time:   [218.63 ms 218.81 ms 219.01 ms]
                        change: [+20.302% +20.450% +20.593%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high mild
large_cases/large_case_u16_NumberVersion.ron
                        time:   [9.6057 ms 9.6465 ms 9.6910 ms]
                        change: [+7.7815% +8.2632% +8.7441%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high severe
large_cases/zuse_str_SemanticVersion.ron
                        time:   [226.41 ms 226.75 ms 227.10 ms]
                        change: [+15.258% +15.512% +15.774%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

Also looks hobo_str_SemanticVersion is a ~2% regression, although it is hard to tell as noize is larger.

The optimization can hit in elm. But there is very little backtracking. So it doesn't get used much. All the overhead is building the map. It is not expensive (44 ms) but it still 20% overhead.

The randomly generated large_case also does not use the optimization much. The map is cheaper to construct, as hash of u16 is cheaper then hash of str.

zuse never utilizes the optimization, due to the way package names were generated.

Overall I wish the performance impact was smaller, but we can always work on it in follow-up PR's.

@Eh2406 Eh2406 added this pull request to the merge queue Dec 17, 2023
Merged via the queue into dev with commit d97491e Dec 17, 2023
5 checks passed
@Eh2406 Eh2406 deleted the merge_dep branch December 17, 2023 19:21
konstin pushed a commit to astral-sh/pubgrub that referenced this pull request Dec 18, 2023
* bad error test

* feat: merge direct dependencies

* separate function for clearer documentation

* Update doc comments for merge_dependency

* Rename merge_dependency into merge_dependants

* Use american english dependent instead of dependant

* Rename dependencies into merged_dependencies

* Typo mergeed -> merged

---------

Co-authored-by: Matthieu Pizenberg <[email protected]>
zanieb pushed a commit that referenced this pull request Dec 18, 2023
zanieb added a commit to astral-sh/pubgrub that referenced this pull request Dec 21, 2023
* Update test following pubgrub-rs#163

* Fix compiler warning

---------

Co-authored-by: konstin <[email protected]>
konstin added a commit to astral-sh/pubgrub that referenced this pull request Dec 28, 2023
* Update test following pubgrub-rs#163

* Fix compiler warning

---------

Co-authored-by: konstin <[email protected]>
konstin added a commit to astral-sh/pubgrub that referenced this pull request Jan 4, 2024
* Update test following pubgrub-rs#163

* Fix compiler warning

---------

Co-authored-by: konstin <[email protected]>
@Eh2406 Eh2406 mentioned this pull request Jul 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants