-
Notifications
You must be signed in to change notification settings - Fork 701
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cabal test
doesn't always solve for tests (unless --enable-tests
or tests: True
is given)
#7883
Comments
It's not the assumption I was using :) Remember this comment, in which I said that I think the solver is allowed to decide to pick a plan which does not include tests even if one exists. For example, if suppose the solver is both allowed to decide whether to enable tests and whether to enable the benchmarks. The solver might try four combinations:
If the solver tries them in that order and picks the first one which succeeds, you'll observe the current bug if 1 fails and 2 and 3 succeed, because 2 is picked even though 3 succeeds. And we can't fix this by checking 3 before 2, because then we'd just have the same bug but for benchmarks! The reality is much worse, because there are many more dimensions of choice, such as the version of every recursive dependency. So many that it's not feasible to check every allowed plan in which the tests are enabled. Instead, a heuristic must be used, to examine more promising branches of the possibility tree before others. And if that heuristic makes it look like disabling the tests is a promising avenue, the solver can easily find a plan in the disabled-tests branch without having completely explored the enabled-tests branch, and thus choose to disable the tests even if a successful plan exists somewhere in that unexplored enable-tests branch. |
@gelisam: you are right and I even responded to that comment, but then promptly repressed that, because it's too painful to believe. In fact, the cabal's blurb in my bug's description is correct and takes into account the possibility of solver not cooperating and that's exactly the blurb you changed in #7834. So, #7834 and related are not threatened. So the problem in my case is that I have 1 library, 1 exe, 1 testsuite and the solver is not certain that the exe and the test are not excluding each other? And so it picks a plan where exe can be built and ignores the testsuite? Could we tweak cabal to prioritize tests (respectively, benchmarks) over exes and libraries and everything, when @grayjay: do you think it's easy to ask the solver to do that and cabal just fails to ask, or is solver not able to prioritize things, apart of the brute force mode where it tries exhaustively and fails, if it can't build something (e.g., --enable-tests)? |
There is a very clear description of the algorithm in this Well-Typed post on Qualified Goals in the Cabal Solver. There is a very clear phase where such preferences are applied: the "Heuristics and Preferences " section in which the tree branches are reordered. I think it would be instructive to create a variant of your repro case in which the search tree is small enough that we can make diagrams like the ones in the post, so we can see at which node the choice of enabling tests or not is made, and discuss at which node we would like to make that choice instead. |
I think this issue is the same as #5079. The problem is that, by default, cabal runs the solver with only a preference to enable tests and benchmarks. The preference is only applied locally by sorting nodes in the search tree, which means that the solver may disable tests in order to satisfy a different preference, such as the preference to avoid deprecated versions. We could apply preferences globally by implementing a feature like #2860 and give the testing preference higher priority, but it would only reduce the frequency of the current issue, not prevent it. There would still be cases where the solver silently disables tests without --enable-tests, such as the conflict between tests and benchmarks in #7883 (comment) or a conflict between two test suites.
I think we should continue using the same default solver options for both "build" and "test" in order to avoid unstable install plans. I think that it would be very confusing for users to get completely different versions of dependencies depending on the cabal command that was run. |
Doh, thank you very much for the explanations. I apologise, I couldn't get around to generating that small test case and tried to do that before responding. But I feel this is an important usability point, so I don't want to just close as a duplicate.
@grayjay: do you mean that if "test" can't be solved using a superset of dependencies generated for "build", it always fails (without |
The question implies that Since both of those behaviors are undesirable, what is actually happening is that all the commands (except cabal-install) use the same plan. So it doesn't matter if you call I was careful to say "involves", not "includes", because a plan does not just pick version numbers for the dependencies, it also makes decisions about flags, whether to enable tests, etc. That's how we nevertheless manage to get plans which don't include the tests. |
OK, you convinced me we are doing the right thing. I take back my proposal to try harder to include tests when the solver notices it's run from I guess, that leaves me with a request to try harder (ideally guarantee) that the plan we find is a local maximum in terms of how many components are included. E.g., in my (non-minimized) real life case above, it was possible to add a testsuite (without resorting to changing flags or picking deprecated deps), but the chosen plan included none. Local maximum in the sense, that if it's possible to add one more component (without removing any that are already included) by changing only dependencies (and not to deprecated versions) then it's added. Perhaps what that entails is that the preference to include even a single new component should be stronger than to keep optimal dependencies (newest possible, or already found), even if that means downgrading absolutely all dependencies to stone age versions (barring deprecated versions). Perhaps that's impossible, though. I'm reading the blog post and it underlines that decisions are local, while I require global guarantees, which may imply an exhaustive search/backtracking. I think the solver never backtracks over preferences, e.g., a preference to not exclude a component? It only backtracks over constraint-like things, e.g., the Or how bad would it be to treat component inclusion as hard constraints? (The difference vs Please tell me how I'm mixing up things again. ;) |
The new value does not change the current behavior in any way, it just better documents the fact that they are three options rather than just |
If we limit our attention to plans which include all the components, then Haskellers will be happier because they are the ones who tend to be surprised by the current behaviour, but non-Haskellers, who need the most help, will be greated with an inscrutable "cannot find build plan" error more often, because the solver will not be allowed to work harder by looking for a build plan which can build the executable they care about but not the tests they don't know how to run anyway. |
Instead of adding even more values to |
But aren't newcomers mostly running
Agreed.
Good to know, thank you. But in a week all I remember will be 'cabal build' again. ;) OTOH, if I stick something that corresponds to that in my |
Oh right, good point about |
I like this idea. If I understand correctly, the first field is exactly the same as the existing boolean
I also like this idea. In my opinion, we should remove the third value, where tests are only enabled with a preference. I think that the testing preference adds complexity without much benefit, because it doesn't make any useful guarantees (and can't make useful guarantees without behaving like a constraint). Setting the default to true will give |
It sounds like we're iterating to the idea that the default is |
@gbaz; I'm happy we all agree about the outline of the solution, even though it's not backward compatible, because we remove the current default (the non-binding promise to try and include stuff in the plan), so it can't be the defaults any more. Regarding the details, @gelisam and me thought about an extra field in the project file that would mean "build it" so that the |
Aren't they already the same? I don't have an opinion as to whether it is better to add a flag changing the meaning of |
They are, but @gbaz proposes to make Let me think. Can we express
Seems to check out. The only thing missing is overriding "tests: True" ("plan, build") on command line to "plan, don't build". Is that important? |
I like the simplicity of having With that approach, we can still express all the combinations:
Wait, what? I thought we all agreed that |
I realize that I'm asking you for quite a number of combinations, so I should be willing to do that amount of work myself. Here is the behaviour I expect for the proposal I support: All of the following combinations have the effect "plan libs, exes, tests, and benches; build libs and exes":
All of the following combinations have the effect "plan libs, exes, tests, and benches; build and run tests":
All of the following combinations have the effect "plan libs, exes, and benches; build libs and exes":
All of the following combinations have the effect of failing with an error message explaining that tests are disabled:
|
I don't oppose that, but in that case the user can't request via a project file to have tests planned, built (type-checked) and not run via any subsequent |
Yes, thanks for explaining. Given that motivation, I think it's important to list the current behaviour under all the combinations, to make sure all those behaviours can easily be obtained under the new proposal. All of the following combinations have the effect "plan libs and exes, and try to include tests if possible; build libs and exes":
All of the following combinations have the effect "plan libs and exes; build libs and exes":
All of the following combinations have the effect "plan libs, exes, and tests; build libs, exes, and tests":
All of the following combinations have the effect "plan libs and exes, and try to include tests if possible; build and run tests, failing if the tests were not included in the plan":
All of the following combinations have the effect "plan libs, exes, and tests; build and run tests":
All of the following combinations have the effect "plan libs and exes; fail because tests are disabled":
|
Here's the subset of those behaviours we can accomplish with the proposal I support. All the commands assume that
I think your only complaint about the above is that |
Actually, I didn't even remember one can set it in You are right, this is my only complaint (if you generalize to project files and ignoring benchmarks, which behave analogously). To reword: the biggest complaint is that if somebody had The minor and last complaint is that a common idiom "typecheck everything" that previously required only ( |
I wanted to mention that there is one advantage to continuing to use three values for the "tests" configuration option over splitting it into two independent options (other than backwards compatibility). The three value design avoids the possibility of setting the "plan tests" option to false and the "build tests" option to true, which is a combination that doesn't make sense. In the three value design I am imagining, |
Ooh, making invalid states unrepresentable, I like that! Let me again exhaustively list the behaviours for this proposal. All of the following combinations have the effect "plan libs and exes; build libs and exes":
All of the following combinations have the effect "plan libs, exes, and tests; build libs and exes":
All of the following combinations have the effect "plan libs, exes, and tests; build libs, exes, and tests":
All of the following combinations have the effect "plan libs, exes, and tests; build and run tests":
All of the following combinations have the effect "plan libs and exes; fail because tests are disabled":
|
This is brilliant. Let's do it. |
See #5079 for previous discussion of this issue. |
cabal test
doesn't always build tests (unless --enable-tests
or tests: True
is given)
cabal test
doesn't always build tests (unless --enable-tests
or tests: True
is given)cabal test
doesn't always solve for tests (unless --enable-tests
or tests: True
is given)
@ulysses4ever's exec summary:
this issue (and its variations) has been discussed on multiple occasions (new-test should imply --enable-tests #5079, 'cabal new-build test:test-suite' doesn't force tests to be enabled. #3923)
a desirable converging point in the design space seems to be, as of now (as put by grayjay):
the main counterpoint to the idea above (as put by fgaz):
it feels like we get a better UX if we follow the idea still, and the answer to the counterpoint (as put by Mikolaj):
anecdotally, many power users already put
tests: True
in their global config and deal with the above mentioned problem on the case by case basis; we just enshrine this practice in the default;our best hope for implementing something like that is, currently, [WIP] Only buildable tests #7829 but it looks like it needs a new champion...
Describe the bug
I'm dog-fooding current dev cabal from master branch.
Solver is not cooperating. Cabal says "solver picked a plan that does not include the test suites" despite there being such a plan, which can be seen when forcing with 'tests: True'.
To Reproduce
Clone
Mikolaj/mostly-harmless@3feb586
Run the following (it seems, a fresh package store is not required after all)
It shows
but
cabal build -w ghc-9.2.1 --enable-tests
(as well ascabal test -w ghc-9.2.1
with 'tests: True' incabal.project.local
) goes through fine and tests run. However, after removingcabal.project.local
,cabal test -w ghc-9.2.1
fails again the same.The same attempts with ghc-8.10.7 work fine.
Expected behavior
The solver should pick the right plan without forcing. Forcing would ideally be used only to show the error that prevents the solver from finding a plan. That's the assumption under which
@gelisam andme operated in #7834 and related tickets. Now it's all put into question.The text was updated successfully, but these errors were encountered: