Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

2) ISS spot finding refactor #1518

Merged
merged 1 commit into from
Oct 9, 2019
Merged

Conversation

shanaxel42
Copy link
Collaborator

@shanaxel42 shanaxel42 commented Aug 29, 2019

This PR refactors the ISS pipeline to use the new FindSpots, DecodeSpots workflow.

Main changes:

  • created a new version of blob.py that returns SpotFindingResults
  • created a new version of PerRoundMaxChannel that takes as input SpotFindingResults
  • moved the measure_spot_intensities (across rounds and chs) logic to a spot_finding_utils file
  • created a decoding_utils file with a build_spot_traces_exact_match method that creates spot traces from SpotFindingResults

depends on #1517

test plan: test_iss.py works...probably need more

@codecov-io
Copy link

codecov-io commented Aug 29, 2019

Codecov Report

Merging #1518 into master will increase coverage by 0.54%.
The diff coverage is 93.8%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1518      +/-   ##
==========================================
+ Coverage   87.48%   88.03%   +0.54%     
==========================================
  Files         145      149       +4     
  Lines        5058     5165     +107     
==========================================
+ Hits         4425     4547     +122     
+ Misses        633      618      -15
Impacted Files Coverage Δ
starfish/core/spots/DecodeSpots/trace_builders.py 100% <100%> (ø)
starfish/core/spots/DetectSpots/blob.py 95% <100%> (+0.12%) ⬆️
...spots/DecodeSpots/per_round_max_channel_decoder.py 100% <100%> (ø)
starfish/core/spots/DecodeSpots/__init__.py 100% <100%> (ø) ⬆️
starfish/test/full_pipelines/api/test_iss_api.py 100% <100%> (ø) ⬆️
starfish/core/spots/FindSpots/_base.py 83.33% <100%> (+18.62%) ⬆️
starfish/core/spots/FindSpots/__init__.py 100% <100%> (ø) ⬆️
starfish/core/spots/FindSpots/blob.py 89.13% <89.13%> (ø)
...tarfish/core/spots/FindSpots/spot_finding_utils.py 94.11% <94.11%> (ø)
... and 7 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 505f73c...cf170a5. Read the comment docs.

@shanaxel42 shanaxel42 force-pushed the saxelrod-add-logging-support-spot-results branch 2 times, most recently from 69e0b02 to 5eae3cc Compare August 29, 2019 16:46
@shanaxel42 shanaxel42 force-pushed the saxelrod-iss-new-spot-finding branch from 1cca1cd to 6f2cff4 Compare August 29, 2019 16:55
@shanaxel42 shanaxel42 force-pushed the saxelrod-iss-new-spot-finding branch from 742ac39 to 8dfc730 Compare August 29, 2019 18:40
@shanaxel42 shanaxel42 force-pushed the saxelrod-add-logging-support-spot-results branch from ad12586 to a484e73 Compare August 30, 2019 18:31
@shanaxel42 shanaxel42 force-pushed the saxelrod-iss-new-spot-finding branch from 8dfc730 to a3770cc Compare August 30, 2019 18:37
Copy link
Member

@ambrosejcarr ambrosejcarr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes me a bit nervous that this doesn't delete the existing code that's being replaced. I feel like that PR should exist before this stack lands.

@@ -30,22 +30,23 @@ def iss_pipeline(fov, codebook):
filt = Filter.WhiteTophat(masking_radius, is_volume=False)
filtered = filt.run(registered, verbose=True, in_place=False)

# detect spots using laplacian of gaussians approach
p = DetectSpots.BlobDetector(
lp = FindSpots.BlobDetector(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find both the new and old naming conventions for the instance of blob detector confusing. I typically use the convention of the first letters of the algorithm (here: bd) but would be open to discussion of the right convention for us to stick with.


# parameters to define the allowable gaussian sizes (parameter space)
p = DetectSpots.BlobDetector(
lp = FindSpots.BlobDetector(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see above comment.

@@ -0,0 +1,23 @@
from itertools import product
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in file name. decoding_uitls.py -> decoding_utils.py

from starfish.core.types import Axes, Features, SpotFindingResults


def build_spot_traces_exact_match(spot_results: SpotFindingResults):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs more docs. :-)

@shanaxel42 shanaxel42 force-pushed the saxelrod-add-logging-support-spot-results branch from a484e73 to 4b53ae7 Compare September 17, 2019 21:51
@shanaxel42 shanaxel42 force-pushed the saxelrod-iss-new-spot-finding branch from a3770cc to bb8334b Compare September 17, 2019 21:52
@shanaxel42 shanaxel42 force-pushed the saxelrod-add-logging-support-spot-results branch from 4b53ae7 to ed717ad Compare September 17, 2019 23:43
@shanaxel42 shanaxel42 force-pushed the saxelrod-iss-new-spot-finding branch 2 times, most recently from df427e7 to d6bffec Compare September 18, 2019 18:51
@shanaxel42 shanaxel42 force-pushed the saxelrod-add-logging-support-spot-results branch from 8cb311e to d7c7372 Compare September 18, 2019 19:30
@shanaxel42 shanaxel42 force-pushed the saxelrod-iss-new-spot-finding branch 3 times, most recently from c4011ba to 5d6b51a Compare September 18, 2019 20:27
@shanaxel42 shanaxel42 force-pushed the saxelrod-add-logging-support-spot-results branch from 4c3fc56 to 82b52da Compare September 18, 2019 20:44
@shanaxel42 shanaxel42 force-pushed the saxelrod-iss-new-spot-finding branch 3 times, most recently from e6d6b21 to 5997651 Compare September 18, 2019 21:04
@shanaxel42 shanaxel42 force-pushed the saxelrod-add-logging-support-spot-results branch from 3fa68c1 to ccf0b98 Compare September 18, 2019 21:09
@shanaxel42 shanaxel42 force-pushed the saxelrod-iss-new-spot-finding branch 3 times, most recently from dab16e9 to 9f13f39 Compare September 19, 2019 17:04
@shanaxel42 shanaxel42 force-pushed the saxelrod-add-logging-support-spot-results branch from ccf0b98 to ed7b64d Compare September 19, 2019 17:58
@shanaxel42 shanaxel42 force-pushed the saxelrod-iss-new-spot-finding branch from 29bee54 to c24de8b Compare September 23, 2019 22:53
@shanaxel42 shanaxel42 force-pushed the saxelrod-add-logging-support-spot-results branch from f090a90 to bf35640 Compare September 24, 2019 16:49
@shanaxel42 shanaxel42 force-pushed the saxelrod-iss-new-spot-finding branch 2 times, most recently from 280c0e3 to 7256eaf Compare September 25, 2019 00:20
@shanaxel42 shanaxel42 force-pushed the saxelrod-add-logging-support-spot-results branch from bf35640 to c220e44 Compare September 25, 2019 20:36
@shanaxel42 shanaxel42 force-pushed the saxelrod-iss-new-spot-finding branch 4 times, most recently from d5f7f68 to fb9b662 Compare September 26, 2019 23:09
@shanaxel42 shanaxel42 changed the title ISS spot finding refactor 2) ISS spot finding refactor Oct 1, 2019
@shanaxel42 shanaxel42 force-pushed the saxelrod-add-logging-support-spot-results branch from c220e44 to 8338ae1 Compare October 1, 2019 14:16
@shanaxel42 shanaxel42 force-pushed the saxelrod-iss-new-spot-finding branch from fb9b662 to 120f989 Compare October 1, 2019 14:16
@shanaxel42 shanaxel42 force-pushed the saxelrod-iss-new-spot-finding branch from 120f989 to 330e102 Compare October 3, 2019 17:42
@shanaxel42 shanaxel42 changed the base branch from saxelrod-add-logging-support-spot-results to master October 3, 2019 17:45
Copy link
Collaborator

@ttung ttung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find measure_spot_intensity and measure_spot_intensities confusing. They do appear to do different things, but they have such similar names that I think it'll be endlessly confusing. They don't even have the same contract.

Also, will there be non-exact spot traces? I recall that there are. Is it fair to say we will never swap them out?

dots_max_projector = Filter.Reduce((Axes.ROUND, Axes.ZPLANE), func="max", module=Filter.Reduce.FunctionSource.np)
dots_max = dots_max_projector.run(dots)
# locate spots in a reference image
# spots = lp.run(reference_image=dots_max, image_stack=registered_imgs)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove this.

# EPY: END code

# EPY: START code
# EPY: ESCAPE # EPY: START code
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should just be removed.

from starfish.core.types import Axes, Features, SpotFindingResults


def build_spot_traces_exact_match(spot_results: SpotFindingResults):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def build_spot_traces_exact_match(spot_results: SpotFindingResults):
def build_spot_traces_exact_match(spot_results: SpotFindingResults) -> IntensityTable:

is there a use for this elsewhere?

)
indices = product(ch_labels, round_labels)
for c, r in indices:
intensity_table.loc[dict(c=c, r=r)] = \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can probably write this without the backslash, which is always a bit cumbersome to have in the code:

        intensity_table.loc[dict(c=c, r=r)] = spot_results.get_spots_for_round_ch(
            {Axes.ROUND: r, Axes.CH: c}).data[Features.INTENSITY]


def build_spot_traces_exact_match(spot_results: SpotFindingResults):
# create IntensityTable with same x/y/z info accross all r/ch
spot_attributes = spot_results.get_spots_for_round_ch({Axes.ROUND: 0, Axes.CH: 0})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the spots guaranteed to be in r=0,c=0? Or is the spot results table for this being generated from a max projected image?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no this line just grabs the spot attributes from the first tile (since there all in the same location) then fills in the intensity for the rest of the rounds/chs and those locations.

from starfish.core.types import Axes, Features, SpotFindingResults


def build_spot_traces_exact_match(spot_results: SpotFindingResults):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def build_spot_traces_exact_match(spot_results: SpotFindingResults):
def build_spot_traces_exact_match(spot_results: SpotFindingResults) -> IntensityTable:

starfish/core/spots/FindSpots/blob.py Show resolved Hide resolved
notebooks/py/ISS.py Outdated Show resolved Hide resolved
round_labels = spot_results.round_labels
intensity_table = IntensityTable.zeros(
spot_attributes=spot_attributes,
ch_labels=ch_labels,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note that #1600 does reverse the order of round and ch to match the rest of the codebase. these are kwargs, so they'll be fine, but it may be good to fix it anyway.

intensities = iss.intensities

# assert that the number of spots detected is 99
assert intensities.sizes[Features.AXIS] == 99
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why was this removed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no longer a intensities variable defined in ISS.py. just spots then decoded. I added back the assert statement for decoded though

@shanaxel42
Copy link
Collaborator Author

shanaxel42 commented Oct 7, 2019

I find measure_spot_intensity and measure_spot_intensities confusing. They do appear to do different things, but they have such similar names that I think it'll be endlessly confusing. They don't even have the same contract.

Both methods currently exist in detect.py, measure_spot_intensities handles looping through rounds and chs and measure_spot_intensity handles the actual measuring on a single tile. Should I rename them?

Also, will there be non-exact spot traces? I recall that there are. Is it fair to say we will never swap them out?

Yes, see #1592 for an example of nearest neighbor spot traces and how you can choose which strategy you prefer

@shanaxel42 shanaxel42 force-pushed the saxelrod-iss-new-spot-finding branch 2 times, most recently from 1049703 to 372b2be Compare October 7, 2019 16:33
@ttung
Copy link
Collaborator

ttung commented Oct 7, 2019

Both methods currently exist in detect.py, measure_spot_intensities handles looping through rounds and chs and measure_spot_intensity handles the actual measuring on a single tile. Should I rename them?

Yeah, I think it's likely to cause some confusion. I generally like to err on the side of descriptive function names.

Yes, see #1592 for an example of nearest neighbor spot traces and how you can choose which strategy you prefer

Cool, I'll check that out.

@shanaxel42 shanaxel42 requested a review from ttung October 7, 2019 18:27
Copy link
Collaborator

@ttung ttung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of small style nits, but otherwise lgtm.

@@ -56,7 +56,8 @@
# EPY: START markdown
### Visualize Codebook
#
#The ISS codebook maps each barcode to a gene.This protocol asserts that genes are encoded with a length 4 quatenary barcode that can be read out from the images. Each round encodes a position in the codeword. The maximum signal in each color channel (columns in the above image) corresponds to a letter in the codeword. The channels, in order, correspond to the letters: 'T', 'G', 'C', 'A'.
#The ISS codebook maps each barcode to a gene. This protocol asserts that genes are encoded with
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should unwrap this. Exception to our normal 100col limit, because it makes the jupyter notebook render oddly.

"""
intensities = build_spot_traces_exact_match(spots)
transfer_physical_coords_to_intensity_table(intensity_table=intensities, spots=spots)
return self.codebook.decode_per_round_max(intensities)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of scope: I think Codebook.decode_per_round_max should be brought over to this module. thoughts?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea that's actually something I've been thinking during this refactor. Especially since any new decoding method shouldn't have to add code to the Codebook class, I can make that PR

"""
# create IntensityTable with same x/y/z info accross all r/ch
spot_attributes = list(spot_results.values())[0]
ch_labels, round_labels = spot_results.ch_labels, spot_results.round_labels
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can inline this. Just:

    intensity_table = IntensityTable.zeros(
        spot_attributes=spot_attributes,
        round_labels=spot_results.round_labels,
        ch_labels=spot_results.ch_labels,
    )

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

haaaa that's what you meant by inline. I inlined this one ch_labels, round_labels = spot_results.ch_labels, spot_results.round_labels

ch_labels=ch_labels,
)
for r, c in spot_results.keys():
intensity_table.loc[dict(c=c, r=r)] = spot_results[{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meta question: I wonder if this can be done with more parallelism.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

potentially...if you first merge all the spotattribute tables..this might be some pandas sorcery I need @dganguli 's help on

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't need to happen now.


Parameters
----------
min_sigma : float
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
min_sigma : float
min_sigma : Number

If two spots have more than this fraction of overlap, the spots are combined
(default = 0.5)
measurement_type : str ['max', 'mean']
name of the function used to calculate the intensity for each identified spot area
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
name of the function used to calculate the intensity for each identified spot area
name of the function used to calculate the intensity for each identified spot area (default: max)

between `min_sigma` and `max_sigma`.
threshold : float
The absolute lower bound for scale space maxima. Local maxima smaller
than thresh are ignored. Reduce this to detect blobs with less
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
than thresh are ignored. Reduce this to detect blobs with less
than ``threshold`` are ignored. Reduce this to detect blobs with less

measurement_type : str ['max', 'mean']
name of the function used to calculate the intensity for each identified spot area
detector_method: str ['blob_dog', 'blob_doh', 'blob_log']
name of the type of detection method used from skimage.feature, default: blob_log
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
name of the type of detection method used from skimage.feature, default: blob_log
name of the type of detection method used from skimage.feature (default: blob_log)

than thresh are ignored. Reduce this to detect blobs with less
intensities.
is_volume: bool
If True, pass 3d volumes (x, y, z) to func, else pass 2d tiles (x, y) to func. (default
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If True, pass 3d volumes (x, y, z) to func, else pass 2d tiles (x, y) to func. (default
If True, pass 3d volumes (x, y, z) to func, else pass 2d tiles (x, y) to func. (default:

True)
overlap : float [0, 1]
If two spots have more than this fraction of overlap, the spots are combined
(default = 0.5)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(default = 0.5)
(default: 0.5)

@shanaxel42 shanaxel42 force-pushed the saxelrod-iss-new-spot-finding branch 2 times, most recently from 344d0eb to e6c6431 Compare October 8, 2019 22:48
@shanaxel42 shanaxel42 force-pushed the saxelrod-iss-new-spot-finding branch from e6c6431 to cf170a5 Compare October 8, 2019 22:56
@shanaxel42 shanaxel42 merged commit 2d1f9e6 into master Oct 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants