Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes for test_training.py failures #245

Conversation

GenevieveBuckley
Copy link
Collaborator

We've been seeing CI failures related to the test_training.py script.

I've investigated, and found:

  1. The major cause was that micro-sam genrally expects the device variable to be a string (eg: "cpu", "cuda", or "mps"), but the training script had it set as a PyTorch object device = torch.device(type="cpu").
  2. There was another problem, not noticeable on the CI, where we did not correctly set the backend device for everything. Having some things on the cpu, and some on a different pytorch backend means we can't perform the computations. This problem was not noticeable on the CI since it is cpu only, but it was very noticeable when I tried to run the tests locally (where I have another backend available that is higher-priority than the cpu).

This PR addresses both issues. Most lines of code relate to problem 2.

pytest test/test_training.py now passes locally on my machine. The CI job should run the tests for this branch once I submit the PR.

@GenevieveBuckley
Copy link
Collaborator Author

Here is an example of a CI job which fails partly because of the test_training.py problem this PR hopes to fix.
https://github.com/GenevieveBuckley/micro-sam/actions/runs/6636342678/job/18028669022?pr=3

 FAILED test/test_training.py::TestTraining::test_training - RuntimeError: Unsupported device: cpu
Please choose from 'cpu', 'cuda', or 'mps'.
Full details (click to expand)


Run aganders3/headless-gui@v1
running on linux
install linux_pkgs
starting XVfb
xvfb pid=5591, display=0
running linux_setup
/usr/bin/bash -l -c pytest
============================= test session starts ==============================
platform linux -- Python 3.11.6, pytest-7.4.3, pluggy-1.3.0 -- /home/runner/micromamba/envs/sam/bin/python
cachedir: .pytest_cache
PyQt5 5.15.10 -- Qt runtime 5.15.2 -- Qt compiled 5.15.2
rootdir: /home/runner/work/micro-sam/micro-sam
configfile: pyproject.toml
plugins: napari-0.4.18, qt-4.2.0, cov-4.1.0, npe2-0.7.2, napari-plugin-engine-0.2.0
collecting ... collected 22 items

  /home/runner/micromamba/envs/sam/lib/python3.11/site-packages/timm/models/layers/__init__.py:49: DeprecationWarning: Importing from timm.models.layers is deprecated, please import via timm.layers
    warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", DeprecationWarning)

../../../micromamba/envs/sam/lib/python3.11/site-packages/timm/models/registry.py:4
  /home/runner/micromamba/envs/sam/lib/python3.11/site-packages/timm/models/registry.py:4: DeprecationWarning: Importing from timm.models.registry is deprecated, please import via timm.models
    warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.models", DeprecationWarning)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html

---------- coverage: platform linux, python 3.11.6-final-0 -----------
Coverage XML written to file coverage.xml

============================= slowest 10 durations =============================
49.07s call     test/test_instance_segmentation.py::TestInstanceSegmentation::test_tiled_automatic_mask_generator
15.54s call     test/test_instance_segmentation.py::TestInstanceSegmentation::test_automatic_mask_generator
11.16s setup    test/test_instance_segmentation.py::TestInstanceSegmentation::test_automatic_mask_generator
8.15s call     test/test_gui.py::test_annotator_2d
5.46s call     test/test_util.py::TestUtil::test_tiled_prediction
2.34s call     test/test_sam_annotator/test_widgets.py::test_embedding_widget
2.03s call     test/test_prompt_based_segmentation.py::TestPromptBasedSegmentation::test_segment_from_mask_non_square
1.45s setup    test/test_prompt_based_segmentation.py::TestPromptBasedSegmentation::test_segment_from_box
0.69s call     test/test_prompt_generators.py::TestPromptGenerators::test_iterative_prompt_generator
0.55s call     test/test_prompt_based_segmentation.py::TestPromptBasedSegmentation::test_segment_from_mask
=========================== short test summary info ============================
FAILED test/test_gui.py::test_annotator_2d - TypeError: 'module' object is not callable
FAILED test/test_training.py::TestTraining::test_training - RuntimeError: Unsupported device: cpu
Please choose from 'cpu', 'cuda', or 'mps'.
======= 2 failed, 16 passed, 4 skipped, 2 warnings in 108.94s (0:01:48) ========
Error: The process '/usr/bin/bash' failed with exit code 1
The process '/usr/bin/bash' failed with exit code 1

@codecov
Copy link

codecov bot commented Oct 25, 2023

Codecov Report

❗ No coverage uploaded for pull request base (dev@4fd9ce2). Click here to learn what that means.
The diff coverage is n/a.

@@          Coverage Diff           @@
##             dev     #245   +/-   ##
======================================
  Coverage       ?   38.34%           
======================================
  Files          ?       30           
  Lines          ?     3987           
  Branches       ?        0           
======================================
  Hits           ?     1529           
  Misses         ?     2458           
  Partials       ?        0           

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@GenevieveBuckley
Copy link
Collaborator Author

The CI passes now, hooray! 🎉 🎉

Copy link
Contributor

@constantinpape constantinpape left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @GenevieveBuckley.
This looks good and I will merge this now so that we can go ahead with the rest.

Eventually we should add a few more unit-tests for the device handling and update the type annotations so that both strs and torch devices are allowed. I will take care of this later.

@constantinpape constantinpape merged commit 4561969 into computational-cell-analytics:dev Oct 25, 2023
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants