-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
2) ISS spot finding refactor #1518
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1518 +/- ##
==========================================
+ Coverage 87.48% 88.03% +0.54%
==========================================
Files 145 149 +4
Lines 5058 5165 +107
==========================================
+ Hits 4425 4547 +122
+ Misses 633 618 -15
Continue to review full report at Codecov.
|
69e0b02
to
5eae3cc
Compare
1cca1cd
to
6f2cff4
Compare
742ac39
to
8dfc730
Compare
ad12586
to
a484e73
Compare
8dfc730
to
a3770cc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes me a bit nervous that this doesn't delete the existing code that's being replaced. I feel like that PR should exist before this stack lands.
@@ -30,22 +30,23 @@ def iss_pipeline(fov, codebook): | |||
filt = Filter.WhiteTophat(masking_radius, is_volume=False) | |||
filtered = filt.run(registered, verbose=True, in_place=False) | |||
|
|||
# detect spots using laplacian of gaussians approach | |||
p = DetectSpots.BlobDetector( | |||
lp = FindSpots.BlobDetector( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find both the new and old naming conventions for the instance of blob detector confusing. I typically use the convention of the first letters of the algorithm (here: bd
) but would be open to discussion of the right convention for us to stick with.
notebooks/py/ISS.py
Outdated
|
||
# parameters to define the allowable gaussian sizes (parameter space) | ||
p = DetectSpots.BlobDetector( | ||
lp = FindSpots.BlobDetector( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see above comment.
@@ -0,0 +1,23 @@ | |||
from itertools import product |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in file name. decoding_uitls.py -> decoding_utils.py
from starfish.core.types import Axes, Features, SpotFindingResults | ||
|
||
|
||
def build_spot_traces_exact_match(spot_results: SpotFindingResults): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs more docs. :-)
a484e73
to
4b53ae7
Compare
a3770cc
to
bb8334b
Compare
4b53ae7
to
ed717ad
Compare
df427e7
to
d6bffec
Compare
8cb311e
to
d7c7372
Compare
c4011ba
to
5d6b51a
Compare
4c3fc56
to
82b52da
Compare
e6d6b21
to
5997651
Compare
3fa68c1
to
ccf0b98
Compare
dab16e9
to
9f13f39
Compare
ccf0b98
to
ed7b64d
Compare
29bee54
to
c24de8b
Compare
f090a90
to
bf35640
Compare
280c0e3
to
7256eaf
Compare
bf35640
to
c220e44
Compare
d5f7f68
to
fb9b662
Compare
c220e44
to
8338ae1
Compare
fb9b662
to
120f989
Compare
120f989
to
330e102
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find measure_spot_intensity and measure_spot_intensities confusing. They do appear to do different things, but they have such similar names that I think it'll be endlessly confusing. They don't even have the same contract.
Also, will there be non-exact spot traces? I recall that there are. Is it fair to say we will never swap them out?
notebooks/py/ISS.py
Outdated
dots_max_projector = Filter.Reduce((Axes.ROUND, Axes.ZPLANE), func="max", module=Filter.Reduce.FunctionSource.np) | ||
dots_max = dots_max_projector.run(dots) | ||
# locate spots in a reference image | ||
# spots = lp.run(reference_image=dots_max, image_stack=registered_imgs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please remove this.
notebooks/py/ISS.py
Outdated
# EPY: END code | ||
|
||
# EPY: START code | ||
# EPY: ESCAPE # EPY: START code |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should just be removed.
from starfish.core.types import Axes, Features, SpotFindingResults | ||
|
||
|
||
def build_spot_traces_exact_match(spot_results: SpotFindingResults): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def build_spot_traces_exact_match(spot_results: SpotFindingResults): | |
def build_spot_traces_exact_match(spot_results: SpotFindingResults) -> IntensityTable: |
is there a use for this elsewhere?
) | ||
indices = product(ch_labels, round_labels) | ||
for c, r in indices: | ||
intensity_table.loc[dict(c=c, r=r)] = \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can probably write this without the backslash, which is always a bit cumbersome to have in the code:
intensity_table.loc[dict(c=c, r=r)] = spot_results.get_spots_for_round_ch(
{Axes.ROUND: r, Axes.CH: c}).data[Features.INTENSITY]
|
||
def build_spot_traces_exact_match(spot_results: SpotFindingResults): | ||
# create IntensityTable with same x/y/z info accross all r/ch | ||
spot_attributes = spot_results.get_spots_for_round_ch({Axes.ROUND: 0, Axes.CH: 0}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the spots guaranteed to be in r=0,c=0? Or is the spot results table for this being generated from a max projected image?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no this line just grabs the spot attributes from the first tile (since there all in the same location) then fills in the intensity for the rest of the rounds/chs and those locations.
from starfish.core.types import Axes, Features, SpotFindingResults | ||
|
||
|
||
def build_spot_traces_exact_match(spot_results: SpotFindingResults): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def build_spot_traces_exact_match(spot_results: SpotFindingResults): | |
def build_spot_traces_exact_match(spot_results: SpotFindingResults) -> IntensityTable: |
round_labels = spot_results.round_labels | ||
intensity_table = IntensityTable.zeros( | ||
spot_attributes=spot_attributes, | ||
ch_labels=ch_labels, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note that #1600 does reverse the order of round and ch to match the rest of the codebase. these are kwargs, so they'll be fine, but it may be good to fix it anyway.
intensities = iss.intensities | ||
|
||
# assert that the number of spots detected is 99 | ||
assert intensities.sizes[Features.AXIS] == 99 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why was this removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no longer a intensities
variable defined in ISS.py. just spots
then decoded
. I added back the assert statement for decoded
though
Both methods currently exist in detect.py, measure_spot_intensities handles looping through rounds and chs and measure_spot_intensity handles the actual measuring on a single tile. Should I rename them?
Yes, see #1592 for an example of nearest neighbor spot traces and how you can choose which strategy you prefer |
1049703
to
372b2be
Compare
Yeah, I think it's likely to cause some confusion. I generally like to err on the side of descriptive function names.
Cool, I'll check that out. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lots of small style nits, but otherwise lgtm.
notebooks/py/ISS.py
Outdated
@@ -56,7 +56,8 @@ | |||
# EPY: START markdown | |||
### Visualize Codebook | |||
# | |||
#The ISS codebook maps each barcode to a gene.This protocol asserts that genes are encoded with a length 4 quatenary barcode that can be read out from the images. Each round encodes a position in the codeword. The maximum signal in each color channel (columns in the above image) corresponds to a letter in the codeword. The channels, in order, correspond to the letters: 'T', 'G', 'C', 'A'. | |||
#The ISS codebook maps each barcode to a gene. This protocol asserts that genes are encoded with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should unwrap this. Exception to our normal 100col limit, because it makes the jupyter notebook render oddly.
""" | ||
intensities = build_spot_traces_exact_match(spots) | ||
transfer_physical_coords_to_intensity_table(intensity_table=intensities, spots=spots) | ||
return self.codebook.decode_per_round_max(intensities) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
out of scope: I think Codebook.decode_per_round_max
should be brought over to this module. thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea that's actually something I've been thinking during this refactor. Especially since any new decoding method shouldn't have to add code to the Codebook class, I can make that PR
""" | ||
# create IntensityTable with same x/y/z info accross all r/ch | ||
spot_attributes = list(spot_results.values())[0] | ||
ch_labels, round_labels = spot_results.ch_labels, spot_results.round_labels |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can inline this. Just:
intensity_table = IntensityTable.zeros(
spot_attributes=spot_attributes,
round_labels=spot_results.round_labels,
ch_labels=spot_results.ch_labels,
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
haaaa that's what you meant by inline. I inlined this one ch_labels, round_labels = spot_results.ch_labels, spot_results.round_labels
ch_labels=ch_labels, | ||
) | ||
for r, c in spot_results.keys(): | ||
intensity_table.loc[dict(c=c, r=r)] = spot_results[{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meta question: I wonder if this can be done with more parallelism.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
potentially...if you first merge all the spotattribute tables..this might be some pandas sorcery I need @dganguli 's help on
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't need to happen now.
|
||
Parameters | ||
---------- | ||
min_sigma : float |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
min_sigma : float | |
min_sigma : Number |
If two spots have more than this fraction of overlap, the spots are combined | ||
(default = 0.5) | ||
measurement_type : str ['max', 'mean'] | ||
name of the function used to calculate the intensity for each identified spot area |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name of the function used to calculate the intensity for each identified spot area | |
name of the function used to calculate the intensity for each identified spot area (default: max) |
between `min_sigma` and `max_sigma`. | ||
threshold : float | ||
The absolute lower bound for scale space maxima. Local maxima smaller | ||
than thresh are ignored. Reduce this to detect blobs with less |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
than thresh are ignored. Reduce this to detect blobs with less | |
than ``threshold`` are ignored. Reduce this to detect blobs with less |
measurement_type : str ['max', 'mean'] | ||
name of the function used to calculate the intensity for each identified spot area | ||
detector_method: str ['blob_dog', 'blob_doh', 'blob_log'] | ||
name of the type of detection method used from skimage.feature, default: blob_log |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name of the type of detection method used from skimage.feature, default: blob_log | |
name of the type of detection method used from skimage.feature (default: blob_log) |
than thresh are ignored. Reduce this to detect blobs with less | ||
intensities. | ||
is_volume: bool | ||
If True, pass 3d volumes (x, y, z) to func, else pass 2d tiles (x, y) to func. (default |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If True, pass 3d volumes (x, y, z) to func, else pass 2d tiles (x, y) to func. (default | |
If True, pass 3d volumes (x, y, z) to func, else pass 2d tiles (x, y) to func. (default: |
True) | ||
overlap : float [0, 1] | ||
If two spots have more than this fraction of overlap, the spots are combined | ||
(default = 0.5) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(default = 0.5) | |
(default: 0.5) |
344d0eb
to
e6c6431
Compare
e6c6431
to
cf170a5
Compare
This PR refactors the ISS pipeline to use the new FindSpots, DecodeSpots workflow.
Main changes:
depends on #1517
test plan: test_iss.py works...probably need more