Attempted fix for Issues 26 and 28 by writing CSVs to different files #29
Thank you, @JoshuaHess12 |
This definitely addresses #26. However, #28 still seems to be an issue. When doing
However, when quantifying multiple masks with
It seems that there is "cross-talk" between masks, where the cell position is taken from
Steps to reproduce:
$ sed -n -e 1p -e '2171,2174p' cytoOnly/exemplar-001_cytoRingMask.csv | cut -d ',' -f 1,11-15 | \
sed "s/,/\t/g" | sed 's/\(\.[0-9][0-9]\)[0-9]*/\1/g'
CellID FDX1 CD357 CD1D X_centroid Y_centroid
2170 2247.09 1482.88 1064.37 1003.11 1301.05
2172 3153.31 1322.06 1337.24 1569.82 1303.80
2173 3660.94 1150.97 1252.94 675.15 1304.17
2174 2779.16 1468.5 1096.03 815.46 1301.94
$ sed -n -e 1p -e '2171,2174p' both/exemplar-001_cytoRingMask.csv | cut -d ',' -f 1,11-15 | \
sed "s/,/\t/g" | sed 's/\(\.[0-9][0-9]\)[0-9]*/\1/g'
CellID FDX1 CD357 CD1D X_centroid Y_centroid
2170 2247.09 1482.88 1064.37 1000.95 1301.36
2171 3153.31 1322.06 1337.24 1040.01 1298.77
2172 3660.94 1150.97 1252.94 1569.5 1303.27
2173 2779.16 1468.5 1096.03 675.52 1304.18
|
Following up on the above, the likely culprit is in the following:
Here, but then get concatenated to all other tables:
This concatenation assumes that the same set of cells is present in every mask. Unfortunately, this assumption is violated when a cell has zero area (as in the cytoplasm example above). A suggested fix is to fully isolate the processing of a single mask file, including the extraction of Cell IDs. The outer loop can then call the corresponding function with a single mask at a time, which will ensure that no "cross-talk" between masks happens. |
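The failure mode described in this comment can be reproduced on toy data. The sketch below is not the project's code: the helper names (`label_ids`, `mean_intensity_by_label`) and the one-row masks are hypothetical, chosen only to show what happens when Cell IDs extracted from one mask are paired by position with measurements from another mask that is missing a cell.

```python
def label_ids(mask):
    """Return the sorted non-zero labels present in a 2-D label mask."""
    return sorted({v for row in mask for v in row if v != 0})


def mean_intensity_by_label(mask, image):
    """Mean pixel intensity per label, keyed by the labels found in this mask."""
    sums, counts = {}, {}
    for mask_row, image_row in zip(mask, image):
        for label, value in zip(mask_row, image_row):
            if label != 0:
                sums[label] = sums.get(label, 0.0) + value
                counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sorted(sums)}


# Hypothetical toy data: cell 2 has a nucleus but zero cytoplasm area.
nucleus_mask = [[1, 0, 2, 0, 3]]
cyto_mask = [[1, 0, 0, 0, 3]]
image = [[10, 0, 20, 0, 30]]

cyto_means = mean_intensity_by_label(cyto_mask, image)

# Failure mode: IDs come from the nucleus mask and are paired by position
# with cytoplasm measurements, so cell 3's value lands on cell 2.
shared_ids = label_ids(nucleus_mask)
misaligned = dict(zip(shared_ids, cyto_means.values()))

# Isolated processing: IDs come from the same mask as the measurements.
isolated = cyto_means

print(misaligned)  # {1: 10.0, 2: 30.0}
print(isolated)    # {1: 10.0, 3: 30.0}
```

Keying every measurement by the label read from its own mask is what makes the per-mask function safe to call from an outer loop, one mask at a time.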
I think the processing of each mask is already uncoupled in the for loop -- there isn't any crosstalk between the masks with the way this pull request exports the CSVs. The CellIDs are mismatched because regionprops in Python automatically enumerates the CellIDs for us by sweeping from left to right across the image. If there is no cytoplasm object for a cell, then all the other CellIDs for the cytoplasm mask will be shifted up by a value of one in the CellID column of the cytoplasm CSV compared to the nucleus CSV file. I think one way to fix this would be to do a 1-nearest neighbor assignment from the other CSV files to the nuclei CSV file based on their spatial coordinates. If we assume that the cytoplasm of each cell is always going to be closest to its own nucleus then this may work. We could relabel all other CellID rows in the mismatched CSVs according to the index of their nearest neighbor in the nuclei CSV. |
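The 1-nearest-neighbor idea proposed in this comment could look roughly like the sketch below. It is an illustration only: the function name `relabel_by_nearest_nucleus`, the data layout, and the toy centroids are assumptions, and it relies on the comment's own premise that each object lies closest to its own nucleus.

```python
import math


def relabel_by_nearest_nucleus(nuclei, objects):
    """Map each object centroid to the ID of the closest nucleus centroid.

    nuclei:  dict of CellID -> (x, y) centroid from the nuclei table
    objects: list of (x, y) centroids from another mask's table
    """
    relabeled = []
    for centroid in objects:
        # Brute-force search over all nuclei (fine for a sketch, slow at scale).
        nearest_id = min(
            nuclei, key=lambda cell_id: math.dist(nuclei[cell_id], centroid)
        )
        relabeled.append(nearest_id)
    return relabeled


# Hypothetical centroids: cell 2 has no cytoplasm object.
nuclei = {1: (10.0, 10.0), 2: (50.0, 10.0), 3: (90.0, 10.0)}
cyto_centroids = [(11.0, 9.0), (89.0, 11.0)]

new_ids = relabel_by_nearest_nucleus(nuclei, cyto_centroids)
print(new_ids)  # [1, 3]
```

The brute-force search costs one distance computation per nucleus per object; for tables with thousands of cells, a spatial index such as a k-d tree would be the usual replacement. Two objects can also map to the same nucleus when the closeness premise fails, which this sketch does not detect.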
Wait, you may be right @ArtemSokolov . Sorry about that. I will look at this a little more. |
Thanks for looking into it, @JoshuaHess12
So, I actually had this concern before also, but I verified with Clarence that
I think the end goal is just to ensure that the output |
@ArtemSokolov No problem! I think this makes sense now. I moved the extraction of Cell IDs inside the loop so that it gets executed separately for each mask. Let me know if the latest commit addresses the issue. |
Works great, @JoshuaHess12! I can confirm that |
Adjusted code to write separate CSVs for each input mask rather than concatenating quantification output into a single CSV file.
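A minimal sketch of writing one CSV per mask, assuming each mask has already been quantified into its own list of rows. The function name `write_mask_tables` and the `<image>_<mask>.csv` naming pattern are assumptions modeled on the file names shown in the conversation, not the code in this pull request.

```python
import csv
from pathlib import Path


def write_mask_tables(tables, image_name, out_dir):
    """Write one CSV per mask.

    tables: dict of mask name -> list of row dicts (one dict per cell)
    Returns the list of paths written, in the order of `tables`.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for mask_name, rows in tables.items():
        if not rows:
            # Nothing was quantified for this mask; skip it in this sketch.
            continue
        path = out_dir / f"{image_name}_{mask_name}.csv"
        with path.open("w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
        written.append(path)
    return written
```

Because each table is written to its own file, a mask that is missing a cell simply produces a file with one fewer row, and no row from one mask is ever aligned by position against rows from another.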