Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Continuous integration failed to detect bug in solver Hackage benchmarks #9495

Closed
grayjay opened this issue Dec 6, 2023 · 11 comments
Closed

Comments

@grayjay
Copy link
Collaborator

grayjay commented Dec 6, 2023

#6447 added a short run of the solver Hackage benchmark to CI to test that it can run without errors. However, I noticed a bug where the benchmark fails to run when the cabal config files have not been initialized (#9494). This issue is for checking whether CI is still correctly testing the benchmark.

@Mikolaj
Copy link
Member

Mikolaj commented Dec 6, 2023

For a start, does the test run when testing cabal locally according to the relevant instructions?

@grayjay
Copy link
Collaborator Author

grayjay commented Dec 6, 2023

@Mikolaj Which instructions are you referring to? CONTRIBUTING.md doesn't mention the Hackage benchmark, though it probably should. The benchmark should be able to run locally after #9494.

grayjay added a commit to grayjay/cabal that referenced this issue Dec 6, 2023
@grayjay
Copy link
Collaborator Author

grayjay commented Dec 6, 2023

#9500 (8541514) showed that CI still passes when the benchmark calls die at the start.

@grayjay
Copy link
Collaborator Author

grayjay commented Dec 7, 2023

c332032 showed that CI fails when the benchmark doesn't compile.

EDIT: The step "Validate build" of the job "Validate ubuntu-latest ghc-9.2.8" failed, though all other "Validate" jobs were cancelled.

grayjay added a commit to grayjay/cabal that referenced this issue Dec 7, 2023
grayjay added a commit to grayjay/cabal that referenced this issue Dec 7, 2023
@grayjay
Copy link
Collaborator Author

grayjay commented Dec 7, 2023

5ca936b showed that CI passes when the unit tests have a failure. The three CI runs mean that CI builds the Hackage benchmark but doesn't correctly run it or its unit tests.

@grayjay
Copy link
Collaborator Author

grayjay commented Dec 7, 2023

I'm not sure how to debug the CI next. It looks like validate.sh still attempts to run the benchmarks when the --solver-benchmarks flag is specified:

cabal/validate.sh

Lines 210 to 211 in a1cbd89

--solver-benchmarks)
BENCHMARKS=true

if $BENCHMARKS; then STEPS="$STEPS solver-benchmarks-tests solver-benchmarks-run"; fi

cabal/validate.sh

Lines 464 to 477 in a1cbd89

step_solver_benchmarks_tests() {
print_header "solver-benchmarks: test"
CMD="$($CABALLISTBIN solver-benchmarks:test:unit-tests)"
(cd Cabal && timed $CMD) || exit 1
}
step_solver_benchmarks_run() {
print_header "solver-benchmarks: run"
SOLVEPKG=Chart-diagrams
CMD="$($CABALLISTBIN solver-benchmarks:exe:hackage-benchmark) --cabal1=$CABAL --cabal2=$($CABALLISTBIN cabal-install:exe:cabal) --trials=5 --packages=$SOLVEPKG --print-trials"
(cd Cabal && timed $CMD) || exit 1
}

The --solver-benchmarks flag is used in two places, in .docker/validate-8.8.4.dockerfile

RUN sh ./validate.sh --doctest --solver-benchmarks --complete-hackage -w ghc-8.8.4 -v

, pair "8.8.4" $ Z "ghc-8.8.4" "8.8.4-bionic" False True False True "--doctest --solver-benchmarks --complete-hackage"

and .github/workflows/validate.yml

GHC_FOR_SOLVER_BENCHMARKS: '9.2.8'

if [[ ${{ matrix.ghc }} == ${{ env.GHC_FOR_SOLVER_BENCHMARKS }} ]]; then
FLAGS="$FLAGS --solver-benchmarks"
fi

@Mikolaj
Copy link
Member

Mikolaj commented Dec 7, 2023

EDIT: The step "Validate build" of the job "Validate ubuntu-latest ghc-9.2.8" failed, though all other "Validate" jobs were cancelled.

That's normal. When one job fails, others get cancelled. So this case is fine.

@Mikolaj
Copy link
Member

Mikolaj commented Dec 7, 2023

Good detective work. Yes, I confirm in the raw log from https://github.com/haskell/cabal/actions/runs/7109262010/job/19402640597?pr=9494 that benchmarks are built for GHC 9.2.8. However, they are not tested nor run, because validate.sh is no longer run as whole, to execute all the steps, but instead validate.yml decides which steps to execute and calls validate.sh for each separately. Unfortunately, it seems somebody did not cover the benchmark test and run steps in validate.yml (nor the time-summary step that should come after; no idea if it's still useful).

So this looks like an omission in validate.yml and it's hard to detect because of the duplication with validate.sh, where the benchmark steps are not omitted. I think we should just add the relevant steps to validate.yml, running them conditionally, probably just with the if [[ ${{ matrix.ghc }} == ${{ env.GHC_FOR_SOLVER_BENCHMARKS }} ]]; condition.

@grayjay
Copy link
Collaborator Author

grayjay commented Jan 18, 2024

@Mikolaj Thanks for the advice! I made the change that you suggested in #9625, and now CI fails with the error about a missing config file from #9494.

@mergify mergify bot closed this as completed in c0dcbde Jan 26, 2024
mergify bot added a commit that referenced this issue Jan 26, 2024
Add solver Hackage benchmarks to GitHub Actions (fixes #9495)
grayjay added a commit to grayjay/cabal that referenced this issue Jan 27, 2024
grayjay added a commit to grayjay/cabal that referenced this issue Jan 27, 2024
grayjay added a commit to grayjay/cabal that referenced this issue Jan 27, 2024
@grayjay
Copy link
Collaborator Author

grayjay commented Jan 27, 2024

I tried introducing a failure into the benchmark unit tests now that #9625 has been merged (5655b89), and it caused the CI to fail, as expected.

@Mikolaj
Copy link
Member

Mikolaj commented Jan 27, 2024

I tried introducing a failure into the benchmark unit tests now that #9625 has been merged (5655b89), and it caused the CI to fail, as expected.

Amazing! Success!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants