
Embedding widget for napari plugin #235

Conversation

@GenevieveBuckley (Collaborator)

Closes #203
A simple widget to compute image embeddings via the micro-sam napari plugin.

I've put the widget stuff in a separate file for now. Not sure if that's the right long term choice or not, but we can always change it later.

@constantinpape (Contributor) left a comment

I had a closer look now and also tried to run the widget.

Two things I noticed:

  • The ndim computation for RGB data is wrong.
  • We should check that the global IMAGE_EMBEDDINGS variable is updated.

I also checked out the branch, reinstalled (via pip install -e) and tried to run the widget, but starting the widget from the plugins menu fails with:

RuntimeError: Failed to import command at 'micro_sam.sam_annotator._widgets:embedding_widget': /lib/x86_64-linux-gnu/libtinfo.so.6: version `NCURSES6_TINFO_6.2.20211010' not found (required by /home/pape/software/conda/miniconda3/envs/sam/lib/python3.10/site-packages/vigra/../../../././libncurses.so.6)

@GenevieveBuckley : should this work already?

I've put the widget stuff in a separate file for now. Not sure if that's the right long term choice or not, but we can always change it later.

I think that's good and makes more sense than micro_sam.sam_annotator.util, where I put these things before.

@GenevieveBuckley (Collaborator, Author) commented Oct 23, 2023

@GenevieveBuckley : should this work already?

Almost. Our biggest problem is that the global IMAGE_EMBEDDINGS variable is not accessible outside our thread worker.

  1. The error you describe here is odd; I've never seen it before. It almost looks like something is wrong with your vigra installation? I can't reproduce this.
  2. Yes, the progress bar works now. (Later, I thought we might change it to match this example, but that is NOT working right now, despite me using the development version of magicgui, which should have fixed the bug. It's fine, we have a working solution.)
  3. Yes, writing the output embeddings to file looks like it's working well. I will fix the test case by sorting the lists before asserting they are equal, so that the os.listdir outputs ['.zattrs', '.zgroup', 'features'] and ['.zgroup', '.zattrs', 'features'] compare as the same.
  4. What is NOT working well is the global IMAGE_EMBEDDINGS variable. According to Python, it doesn't exist when I try to access it from outside the thread worker that sets it (e.g. in the main embedding_widget function, or in the test case).

How do we fix problem 4, above?

  • Option 1: We could try to understand how global variables behave across the thread worker and fix the scoping. (This would be ideal, but I don't understand this well and am not confident.)
  • Option 2: Just remove the thread worker, and have the embedding computation block interaction with napari while it runs. (A good short-term solution, especially if we can't figure out option 1 quickly.)
  • Option 3: We could refactor the code to not use global variables. (This would be a very large undertaking, and I'd prefer we picked a quicker solution here, even if refactoring might be in the long-term plan.)
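For reference, the scoping problem behind option 1 is a plain Python rule rather than anything napari-specific: an assignment inside a function (including a thread worker) creates a function-local name unless the function declares `global`. A minimal sketch with the standard `threading` module (napari's `thread_worker` wraps the same mechanism; the variable name mirrors the plugin's `IMAGE_EMBEDDINGS` for illustration only):

```python
import threading

IMAGE_EMBEDDINGS = None

def worker_without_global():
    # Without a `global` declaration this assignment creates a
    # function-local name; the module-level variable is untouched.
    IMAGE_EMBEDDINGS = {"features": [1, 2, 3]}  # noqa: F841

def worker_with_global():
    global IMAGE_EMBEDDINGS  # rebind the module-level name instead
    IMAGE_EMBEDDINGS = {"features": [1, 2, 3]}

t = threading.Thread(target=worker_without_global)
t.start(); t.join()
print(IMAGE_EMBEDDINGS)  # None: the global was never updated

t = threading.Thread(target=worker_with_global)
t.start(); t.join()
print(IMAGE_EMBEDDINGS)  # {'features': [1, 2, 3]}
```

Note that even with the `global` declaration, the main thread must wait for the worker to finish before reading the value, which is part of why the options above are being weighed.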

@GenevieveBuckley (Collaborator, Author)

I think we have one more problem, actually.

When you click the "Compute Embeddings" button a second time, nothing happens. That's ok: it checks the zarr, finds the embeddings already there, and doesn't re-compute anything. That's good, even if it is a little disconcerting that there is no user feedback that the operation was successful (maybe we should have a "Finished" dialog box pop up or something).

BUT, when you change the input image layer but forget to update the output zarr filepath, it doesn't seem to check whether the embedding contents of the zarr directory match the input image. I thought PR #67 fixed this. I don't see an error or warning message anywhere, and it returns without computing anything, just like before (except now this is not the behaviour we want).

@GenevieveBuckley (Collaborator, Author)

Odd test failure here; something is strange. It passes locally on my machine, but on the GitHub Actions CI we get:

FAILED test/test_training.py::TestTraining::test_training - RuntimeError: Unsupported device: cpu
Please choose from 'cpu', 'cuda', or 'mps'.

"cpu" should always be available. What is going on?

@constantinpape (Contributor) commented Oct 23, 2023

Hi @GenevieveBuckley,

there's quite a lot to follow up on, so I will structure it a bit:

Running the plugin

I am not sure where the error with vigra comes from, but it indeed seems to be some problem in my environment(s) (tested in two now) that happens when vigra is imported multiple times. I can fix it for now by commenting out a few imports that we don't need for testing the plugin. Let's ignore this for now; we can look into it more if the issue persists once we want to merge this (or even later, once we have a more advanced state on dev). You can see what I had to change in the proof-of-concept singleton PR (see below).

Global state

I played with this a bit now, and you're right, this is tricky to do with the global variables. I don't think this is a big issue going forward, though, since we want to get rid of the global variables anyway! I wrote a quick proof of concept for replacing them with a singleton class, and that works!
See GenevieveBuckley#3. For testing that it works I have used this script: https://github.com/computational-cell-analytics/micro-sam/blob/wip-singleton/check_embed_widget.py and when pressing p I can check that the image_embeddings are computed (and it indeed works!).
So I suggest as next steps:

  • I will implement the global state cleanly tomorrow with a singleton pattern (see also Refactor global state #201). I talked to Paul and he's too busy for this right now, but it is actually not that much work, and it makes most sense for me to do it anyway because I know what state should go in. So I can do it tomorrow.
  • Once that is done you can update the PR here using the singleton instead.
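For readers following along, the singleton idea can be sketched as follows. This is an illustrative toy, not the actual AnnotatorState implemented in #240; the attribute names are assumptions:

```python
class AnnotatorState:
    """Toy singleton: every call to AnnotatorState() returns one shared object."""
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            # Initialize the shared state once, on first construction.
            cls._instance.image_embeddings = None
            cls._instance.image_shape = None
        return cls._instance

# A thread worker can set state on "its" instance ...
worker_state = AnnotatorState()
worker_state.image_embeddings = {"features": [0.1, 0.2]}

# ... and any other code (widget, test case) sees the same object.
widget_state = AnnotatorState()
print(widget_state is worker_state)   # True
print(widget_state.image_embeddings)  # {'features': [0.1, 0.2]}
```

Because every caller gets the same instance, the thread worker and the widget no longer need a shared module-level global.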

Cached embeddings

Indeed, this should be addressed by #67, and you should at least see a warning when the wrong file is used. But maybe the warning is suppressed because it happens in the thread worker?
You can also pass the wrong_file_callback that is called when this happens. You could use that to investigate further, e.g. crash in that function to make sure the code path is triggered.
(The idea behind this callback is to give the user the option to provide a new path, e.g. by opening a GUI where it can be entered, see https://github.com/computational-cell-analytics/micro-sam/blob/master/micro_sam/sam_annotator/gui_utils.py. I am not sure if this fits with the napari plugin logic; we can also change this a bit if needed.)
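The underlying idea of the check from #67 is to store a signature of the input data alongside the cached embeddings and compare it on reload. A minimal, self-contained sketch of that comparison (the real code stores the signature in the zarr attributes; the function names here are illustrative):

```python
import hashlib

import numpy as np

def data_signature(image: np.ndarray) -> str:
    # Hash the raw image bytes; this mirrors the idea of a data signature
    # stored next to the cached embeddings.
    return hashlib.sha1(image.tobytes()).hexdigest()

def embeddings_match(stored_attrs: dict, image: np.ndarray) -> bool:
    # Compare the stored signature against the current input image.
    return stored_attrs.get("data_signature") == data_signature(image)

image = np.zeros((8, 8), dtype="uint8")
attrs = {"data_signature": data_signature(image)}
assert embeddings_match(attrs, image)          # same image: reuse the cache
assert not embeddings_match(attrs, image + 1)  # different image: recompute
```

If the comparison runs inside a thread worker, any warning it emits may never surface in the GUI, which would be consistent with the behaviour described above.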

Test error

That is weird. Unfortunately I can't run the test locally right now to check it because of

$ pytest -k test_annotator_2d
ERROR: usage: pytest [options] [file_or_dir] [file_or_dir] [...]
pytest: error: unrecognized arguments: --cov=micro_sam --cov-report
  inifile: /home/pape/Work/my_projects/micro-sam/pyproject.toml
  rootdir: /home/pape/Work/my_projects/micro-sam

Could you document what's needed to run the tests now? Thanks!

@constantinpape (Contributor)

I implemented the singleton state in #240. This already works and if you want you can merge it here or rebase onto that PR @GenevieveBuckley. I only need to update it tomorrow for the 3d and tracking annotator, but that should not interfere with the changes here.

@GenevieveBuckley (Collaborator, Author)

Could you document what's needed to run the tests now? Thanks!
pytest: error: unrecognized arguments: --cov=micro_sam --cov-report

It looks like you don't have pytest-cov installed in your environment. Can you check that?

We install a couple of pytest plugins (pytest-cov, pytest-qt) into our CI test environment.

You're right, documenting this in a contributing guide is a good idea.

@constantinpape (Contributor)

Hi @GenevieveBuckley,
a few updates from my side:

  • I have finished the state refactoring, see Implement singleton state and use it in the 2d annotator #240 for details. If you agree with that solution you can merge that PR and use the AnnotatorState here. See my prototype implementation in WIP refactor state in singleton GenevieveBuckley/micro-sam#3 for how it can be used and tested here.
  • I installed pytest-cov and can run the tests now locally, thanks!
  • I created Developer documentation #242 for the developer documentation.
  • The test fails because of https://github.com/computational-cell-analytics/micro-sam/blob/master/test/test_training.py#L80 : instead of using device = "cpu" we use device = torch.device("cpu") there, so the comparisons all fail. I can reproduce this locally, not sure why it works for you locally. We should make sure to also support torch.device in these functions and add unit-tests specifically for this as well.
    • We can either do this in the current PR.
    • Or do it in a follow-up PR, and for now just change to device = "cpu" in test_training.py as well. (I could take care of this, so no problem if you'd rather not do it now to focus on the other parts of this PR. Then please just create an issue about this and assign it to me. Description can be vague, I know what to do, just need an issue to keep track of it...)
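A sketch of the kind of normalization the follow-up would add: accept both a plain string and a torch.device, which exposes its kind via the `.type` attribute (e.g. torch.device("cpu").type == "cpu"). The helper name is hypothetical, and a stand-in class replaces torch.device so the snippet runs without torch installed:

```python
class FakeDevice:
    """Stand-in for torch.device in this sketch (exposes only `.type`)."""
    def __init__(self, type_):
        self.type = type_

def normalize_device(device) -> str:
    """Normalize a str or torch.device-like object to a plain device string."""
    name = device if isinstance(device, str) else getattr(device, "type", None)
    if name not in ("cpu", "cuda", "mps"):
        raise RuntimeError(
            f"Unsupported device: {device}\n"
            "Please choose from 'cpu', 'cuda', or 'mps'."
        )
    return name

print(normalize_device("cpu"))             # cpu
print(normalize_device(FakeDevice("cpu"))) # cpu
```

With this in place, comparisons like `device == "cpu"` would be replaced by comparisons on the normalized string, so both spellings pass the same checks.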

@GenevieveBuckley (Collaborator, Author)

I think this is ready now - I'm very happy with how it has turned out!

  • I've used your singleton AnnotatorState in the embedding widget. It's beautiful, and seems to work really well (🚀 Implement singleton state and use it in the 2d annotator #240)
  • I've added an extra section to the test, so now it also checks the AnnotatorState values in-memory, as well as the results that are saved to disk.
  • I've made some small changes so that test_training.py passes, so we don't have to worry about that test failing. Later, we can make a proper fix for Accept str and torch.device in device handling; add unittests #246, but for now we don't have to worry about it.
  • The tests pass locally on my machine, our remote CI test job has passed, and I can open napari and manually interact with the embedding widget - everything looks like it makes sense.

@GenevieveBuckley (Collaborator, Author)

Extra notes:

  1. I haven't tried to use the 'custom weights' option (mostly because I don't have any custom weights available). So that is untested. If somebody else could take a look at that, it would be helpful. This could be a follow up issue, I don't think it has to block this PR.

  2. I've put in a bit of code to make sure the save_path directory exists and is empty. It prints an error message to the terminal if the directory is not empty.
    A follow up piece of work for this might be trying to figure out why we don't see the hash comparison working (hint: we think it's because that happens in the thread worker), and/or trying to make a pop up dialog box to make the error more obvious than just printing to a terminal the user might not see.

  3. For future work, we might like to investigate pop up dialog boxes more generally. It could be good for both error messages, and for informing the user when the embedding computation has completed (good for user feedback in both very short, and very long running computations). Possibly low priority, though.

@constantinpape (Contributor) left a comment

Thanks @GenevieveBuckley! I ran this again and it works nicely. I also tested with custom weights and it seems to work. To fully check all of this we will need to integrate with the other widgets, but we can go ahead with this now.

I found a few small things that are not related to the core logic. I will just go ahead and fix these so that we can move ahead.

And I will create a few follow-up issues for some of the things you mentioned and some things I noticed.

@constantinpape (Contributor) left a comment

While going through the code again I noticed an issue with the core logic. I think it is easy to fix by removing the check, but I won't do it myself, since I may be overlooking some intention behind adding it.

    raise NotADirectoryError(
        f"The user selected 'save_path' is not a directory: {save_path}"
    )
    if len(os.listdir(save_path)) > 0:
@constantinpape (Contributor)

I think this defeats the purpose of caching the embeddings: the idea is that a user can close the annotation tool and open it again later while pointing to the same save_path, so that the embeddings don't have to be recomputed.
This check prevents that and makes caching the embeddings useless.

@constantinpape (Contributor)

This also relates to your comment from above:

I've put in a bit of code to make sure the save_path directory exists and is empty. It prints an error message to the terminal if the directory is not empty.

Printing the error here is not a good idea. Maybe you intended this because of the problem with the hash comparison, but I don't want to print an error for the correct behavior of using already cached embeddings.
We could use a warning instead for now and remove it once we figure out the issue with the hash.

@GenevieveBuckley (Collaborator, Author) commented Nov 7, 2023

That's reasonable - I've changed the code to fix this problem.
Now it works in all situations:

  1. Creating new image embeddings,
  2. Returning/loading existing image embeddings if the data_signature of the save directory matches our input image, and
  3. Making an error popup in the napari viewer if something goes wrong (data signature does not match, save directory is not a zarr array or empty folder, anything else unexpected that raises an actual error).

codecov bot commented Nov 7, 2023

Codecov Report

Merging #235 (ca3c77e) into dev (ec7adad) will increase coverage by 1.41%.
Report is 34 commits behind head on dev.
The diff coverage is 52.32%.

@@            Coverage Diff             @@
##              dev     #235      +/-   ##
==========================================
+ Coverage   38.34%   39.76%   +1.41%     
==========================================
  Files          30       33       +3     
  Lines        3987     4170     +183     
==========================================
+ Hits         1529     1658     +129     
- Misses       2458     2512      +54     
Files                                  Coverage Δ
micro_sam/__init__.py                  100.00% <ø> (ø)
micro_sam/training/sam_trainer.py      92.10% <100.00%> (ø)
micro_sam/training/util.py             84.48% <100.00%> (-0.27%) ⬇️
micro_sam/visualization.py             14.13% <ø> (-0.93%) ⬇️
micro_sam/instance_segmentation.py     54.79% <85.71%> (+0.29%) ⬆️
micro_sam/util.py                      67.58% <91.66%> (+2.35%) ⬆️
micro_sam/precompute_state.py          22.22% <0.00%> (-0.97%) ⬇️
micro_sam/_vendored.py                 67.05% <62.50%> (-1.00%) ⬇️
micro_sam/inference.py                 90.00% <90.00%> (ø)
micro_sam/sam_annotator/_state.py      87.23% <87.23%> (ø)
... and 7 more


    save_path=str(save_path),
    ndim=ndim,
)
state.image_shape = image_data.shape
@GenevieveBuckley (Collaborator, Author)

Is the image_shape attribute supposed to include the RGB dimension in the case of a colour image?
Or does this need to be:

if image.rgb:
    state.image_shape = image_data.shape[:-1]
else:
    state.image_shape = image_data.shape

@constantinpape (Contributor)

The shape is supposed to only include the spatial dimensions. So we indeed need to add the condition for rgb images. Thanks for catching this!
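A one-line illustration of the fix: for an RGB layer the channel axis is last, so the spatial shape is obtained by dropping it (array sizes here are just example values):

```python
import numpy as np

# napari stores an rgb image as (y, x, channels)
rgb_image = np.zeros((512, 512, 3), dtype="uint8")
image_shape = rgb_image.shape[:-1]  # keep only the spatial dimensions
print(image_shape)  # (512, 512)
```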

@@ -552,7 +579,7 @@ def precompute_image_embeddings(
         continue
     # check whether the key signature does not match or is not in the file
     if key not in f.attrs or f.attrs[key] != val:
-        warnings.warn(
+        raise RuntimeError(
@GenevieveBuckley (Collaborator, Author)

Upgrading this from a user warning to an actual error lets users see a little pop-up at the bottom of the napari viewer when an error happens. I think this is sufficient for what we need.

@constantinpape (Contributor)

That's good! I think it's good to update this to a real error in any case, since having different embeddings will lead to nonsensical results. I think we had a warning there in the beginning because we were still testing this functionality, and then never got around to updating it to an error.

@constantinpape (Contributor) left a comment

Thanks for the changes @GenevieveBuckley! This looks good now. The only thing left to do is updating how the image_shape is set.

@GenevieveBuckley (Collaborator, Author)

Thanks for the changes @GenevieveBuckley! This looks good now. The only thing left to do is updating how the image_shape is set.

Updated!

@constantinpape merged commit 3c26798 into computational-cell-analytics:dev on Nov 8, 2023
3 checks passed