Test suite clean up #3385

Open
wants to merge 54 commits into
base: master
Member

@JDBetteridge JDBetteridge commented Feb 2, 2024

Description

This PR started as an experiment to "cheaply" speed up the test suite by having mpiexec wrap pytest, rather than having pytest fork a subprocess that calls mpiexec (which is also problematic for other reasons).
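
To make the difference concrete, here is a minimal sketch of the two launch shapes. It assumes a generic `mpiexec -n` launcher (the branch itself uses mpispawn, whose command line is not reproduced here). The helper names and test ids are hypothetical; only the `parallel[N] and not broken` marker expression is taken from the CI invocation in this PR.

```python
import sys


def inner_launch_argv(nprocs, test_id):
    """Old shape: a serial pytest run forks this command once per parallel test."""
    return ["mpiexec", "-n", str(nprocs), sys.executable, "-m", "pytest", test_id]


def outer_launch_argv(nprocs, test_dir="tests"):
    """New shape: one mpiexec per world size wraps a single pytest run."""
    marker = f"parallel[{nprocs}] and not broken"
    return ["mpiexec", "-n", str(nprocs), sys.executable,
            "-m", "pytest", "-m", marker, "-v", test_dir]


# Hypothetical test ids, for illustration only.
tests = ["tests/a.py::test_x", "tests/b.py::test_y", "tests/c.py::test_z"]

inner = [inner_launch_argv(2, t) for t in tests]  # one MPI launch per test
outer = [outer_launch_argv(2)]                    # one MPI launch in total
```

With the launcher on the outside, interpreter start-up and imports are paid once per world size rather than once per parallel test, which is where the "cheap" speed-up would come from.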

This PR now also carries several test suite fixes that should be merged back to master, including:

  • Adding comm arguments to function calls that need them.
  • Freeing comms that are created.
  • Disabling a test that pollutes the tape.
  • "Fixing" ensemble parallel tests by using the simple partitioner (just in tests, Ensemble needs a proper fix!)
  • This work had to be rebased on JDBetteridge/update caching #3730 and uses PyOP2 #724 and FInAT #134 due to the deadlocks that they call.
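
As an illustration of the "freeing comms that are created" item, the sketch below wraps a duplicated communicator in a context manager so that it is always freed, even if the test body raises. It is not code from this PR. It relies only on the `Dup()` and `Free()` methods that mpi4py communicators provide, and `RecordingComm` is a hypothetical stand-in so the example runs without MPI.

```python
from contextlib import contextmanager


@contextmanager
def temporary_comm(parent):
    """Duplicate `parent` for the duration of the block and always free the copy."""
    comm = parent.Dup()
    try:
        yield comm
    finally:
        comm.Free()


class RecordingComm:
    """Hypothetical stand-in for an MPI communicator that records Free() calls."""

    def __init__(self):
        self.freed = False

    def Dup(self):
        return RecordingComm()

    def Free(self):
        self.freed = True


parent = RecordingComm()
with temporary_comm(parent) as comm:
    pass  # a test would pass `comm` explicitly to the calls that need it
```

In a real test the parent would be something like mpi4py's `MPI.COMM_WORLD`, and the duplicated `comm` would be handed to the function calls that take a comm argument.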

We need to consider what aspects of this experiment we want to incorporate back into master.

Some timings for the actual speed-up (the original intention):

Results

(Real only)

Master

This week's scheduled execution:

Total (inc install): 50m 45s

This branch

With fixed caches, mpispawn, fixed FInAT hashes and pytest-split based on a timed execution.
NB: We tweak vertexonly/test_poisson_inverse_conductivity.py to only do 3 iterations (see diff)
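
For readers unfamiliar with duration-based splitting, the idea is to use recorded per-test timings (such as the `.test_durations` file touched in this PR) to balance tests across groups. The sketch below is a generic greedy "longest test to least-loaded group" scheme, shown as an assumption about the general technique rather than pytest-split's own implementation; the test names and timings are made up.

```python
import heapq


def split_by_duration(durations, n_groups):
    """Assign tests to groups, longest first, always into the least-loaded group."""
    heap = [(0.0, index) for index in range(n_groups)]  # (total seconds, group index)
    groups = [[] for _ in range(n_groups)]
    by_longest = sorted(durations.items(), key=lambda item: item[1], reverse=True)
    for test_id, seconds in by_longest:
        load, index = heapq.heappop(heap)
        groups[index].append(test_id)
        heapq.heappush(heap, (load + seconds, index))
    return groups


# Made-up timings in seconds, for illustration only.
durations = {"test_a": 9.0, "test_b": 5.0, "test_c": 4.0, "test_d": 2.0}
groups = split_by_duration(durations, 2)
```

Here the two groups end up with 11 and 9 seconds of work respectively, which is closer to balanced than an alphabetical split (14 and 6 seconds).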

Serial: 17m51s
2: 2m59s
3: 6m43s
4: 45s
6: 19s
7: 48s
8: 12s
Total (inc install): 46m 6s

Important: this branch only runs a maximum of 12 ranks/threads!
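
The figures above can be checked with a few lines of arithmetic. The sketch below assumes the labels 2 to 8 are MPI world sizes and that the listed stages run one after another; neither is stated explicitly in the results, and the helper name `to_seconds` is illustrative.

```python
import re


def to_seconds(text):
    """Convert a string such as '17m51s', '46m 6s' or '45s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?\s*(?:(\d+)s)?", text.strip())
    minutes, seconds = (int(group) if group else 0 for group in match.groups())
    return 60 * minutes + seconds


# Timings copied from the results above.
stages = {"serial": "17m51s", "2": "2m59s", "3": "6m43s",
          "4": "45s", "6": "19s", "7": "48s", "8": "12s"}

test_total = sum(to_seconds(value) for value in stages.values())
master_total = to_seconds("50m 45s")
branch_total = to_seconds("46m 6s")
saved = master_total - branch_total
percent_saved = round(100 * saved / master_total, 1)
```

Under those assumptions the test stages account for 29m37s of the 46m 6s total, and the branch saves 4m39s (about 9%) over this week's master run.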

Contributor

connorjward commented Feb 2, 2024

This is cool, but isn't it a bad idea to effectively remove test coverage? If CI doesn't run all the tests no one will.

I can see this being useful in the context of a bigger change where we run the test suite with a number of Firedrake configurations and only one of them would run these slow tests.

@JDBetteridge JDBetteridge force-pushed the JDBetteridge/faster_tests branch 2 times, most recently from 9d5f056 to df4aea3 on September 12, 2024 13:04

github-actions bot commented Sep 12, 2024

| Tests | Passed ✅ | Skipped ⏭️ | Failed ❌ |
| --- | --- | --- | --- |
| Firedrake complex (8067 ran) | 6423 passed | 1644 skipped | 0 failed |

github-actions bot commented Sep 12, 2024

| Tests | Passed ✅ | Skipped ⏭️ | Failed ❌ |
| --- | --- | --- | --- |
| Firedrake real (8042 ran) | 7224 passed | 818 skipped | 0 failed |

@JDBetteridge JDBetteridge marked this pull request as ready for review September 24, 2024 14:23
@connorjward connorjward changed the title from "Mark and skip slow tests" to "Test suite clean up" Oct 3, 2024
Contributor

@connorjward connorjward left a comment

Generally very happy with this.

.github/workflows/build.yml (outdated)
.test_durations (outdated)
firedrake/parameters.py (outdated)
firedrake/slate/slac/compiler.py
firedrake/tsfc_interface.py (outdated)
firedrake/tsfc_interface.py (outdated)
tests/demos/test_demos_run.py
tests/output/test_io_mesh.py (outdated)
tests/slate/test_hdg_poisson.py (outdated)
@JDBetteridge JDBetteridge linked an issue Oct 10, 2024 that may be closed by this pull request
  python "$(which firedrake-clean)"
  python -m pip install \
- pytest-xdist pytest-timeout ipympl
+ pytest-xdist pytest-timeout ipympl pytest-split
+ pip install git+https://github.com/JDBetteridge/mpispawn
Contributor

Before we merge this we should probably put this into the firedrakeproject organisation, or pin to a version or something?

.github/workflows/build.yml (outdated)
Contributor

@connorjward connorjward left a comment

Leaving notes for someone (most likely me) to refer to in future. In summary:

  • Need to rebase/merge in master.
  • Tweaks to Makefile and build.yml.

-o faulthandler_timeout=1860 \
--junit-xml=firedrake2_\$MPISPAWN_TASK_ID1.xml \
-m "parallel[\$MPISPAWN_WORLD_SIZE] and not broken" \
-v tests
Contributor

TODO: "dogfood" (bleh) Makefile and use a matrix to massively cut down on boilerplate

.PHONY: test_smoke
test_smoke:
	@echo " Running the bare minimum smoke tests"
	@python -m pytest -k "poisson_strong or stokes_mini or dg_advection" -v tests/regression/
Contributor

It would be better to use MPI on the "outside" here for the parallel tests so this can be run to check things on HPC

endif
# Requires pytest and pytest-mpi only
.PHONY: test_serial
test_serial:
Contributor

Terrible name! This runs all the parallel tests too!


# Requires pytest and pytest-mpi only
.PHONY: test_smoke
test_smoke:
Contributor

bikeshedding: I prefer make smoke_tests or make smoketests

done

.PHONY: _test_large_world_test
_test_large_world_tests:
Contributor

I'm not sure why we have small_world and large_world tests separately.

@@ -159,7 +159,11 @@ def collect_tsfc_kernel_data(self, mesh, tsfc_coefficients, tsfc_constants, wrap

  # Pick the constants associated with a Tensor()/TSFC kernel
  tsfc_constants = tuple(tsfc_constants[i] for i in kinfo.constant_numbers)
- kernel_data.extend([(c, c.name) for c in wrapper_constants if c in tsfc_constants])
+ kernel_data.extend([
Contributor

This needs to merge in master as these changes are now merged.

Successfully merging this pull request may close these issues.

INSTALL: Tests not passing on fresh install M2 Mac
4 participants