Add type annotations #656

Merged
merged 35 commits into from
Apr 29, 2023

jwodder
Member

@jwodder jwodder commented Mar 29, 2023

Closes #653.


Problems encountered so far with applying type annotations:

  • heudiconv.dicoms.group_dicoms_into_seqinfos(): If grouping == "custom" and custom_grouping is a callable, the return value is the result of applying custom_grouping() to some arguments; otherwise, the return value is something else. Normally, this could be annotated by using overloads and Literal["custom"], but whenever group_dicoms_into_seqinfos() is called in the heudiconv code, grouping (if present) is always passed as a variable rather than a string literal, and so Literal is unusable here.
  • The docstring for heudiconv.dicoms.create_seqinfo() states that the first argument is of type nibabel.nicom.dicomwrappers.MosaicWrapper, yet here the supplied argument (obtained from validate_dicom()) is just a nibabel.nicom.dicomwrappers.Wrapper.
  • SeqInfo.series_id is clearly meant to be a str, yet there are several places in heudiconv/heuristics/example.py where this field is compared against or assigned to an int variable.
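
To illustrate the first point, here is a minimal sketch of the overload pattern and of why it stops helping once `grouping` arrives as a plain `str` variable. The function `group()`, the helper `by_length()`, and both return shapes are hypothetical stand-ins, not heudiconv's actual signatures.

```python
from typing import Callable, Dict, List, Literal, Optional, Union, overload

Flat = Dict[str, List[str]]    # hypothetical "normal" return shape
Custom = Dict[int, List[str]]  # hypothetical shape produced by custom_grouping


@overload
def group(
    files: List[str],
    grouping: Literal["custom"],
    custom_grouping: Callable[[List[str]], Custom],
) -> Custom:
    ...


@overload
def group(
    files: List[str],
    grouping: str,
    custom_grouping: Optional[Callable[[List[str]], Custom]] = None,
) -> Union[Flat, Custom]:
    ...


def group(files, grouping, custom_grouping=None):
    # A callable wins when grouping == "custom"; otherwise everything
    # lands in a single bucket keyed by the grouping name.
    if grouping == "custom" and callable(custom_grouping):
        return custom_grouping(files)
    return {grouping: list(files)}


def by_length(files: List[str]) -> Custom:
    # Hypothetical custom grouping: bucket file names by their length.
    out: Custom = {}
    for f in files:
        out.setdefault(len(f), []).append(f)
    return out


# A string literal selects the first overload, so a checker infers Custom.
a = group(["a.dcm", "bb.dcm"], "custom", by_length)

# A str variable only matches the second overload, so a checker can infer
# no more than Union[Flat, Custom], even though the value is "custom".
mode: str = "custom"
b = group(["a.dcm", "bb.dcm"], mode, by_length)
```

Both calls behave identically at runtime; only the statically inferred type differs, which is the limitation described above.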

@codecov

codecov bot commented Mar 29, 2023

Codecov Report

Patch coverage: 84.89% and project coverage change: +0.39 🎉

Comparison is base (502bf49) 81.48% compared to head (c03a5ae) 81.87%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #656      +/-   ##
==========================================
+ Coverage   81.48%   81.87%   +0.39%     
==========================================
  Files          41       41              
  Lines        3899     4116     +217     
==========================================
+ Hits         3177     3370     +193     
- Misses        722      746      +24     
Impacted Files Coverage Δ
heudiconv/info.py 100.00% <ø> (ø)
heudiconv/heuristics/example.py 7.69% <9.25%> (+3.46%) ⬆️
heudiconv/tests/anonymize_script.py 42.85% <28.57%> (-11.69%) ⬇️
heudiconv/heuristics/uc_bids.py 15.62% <33.33%> (+8.72%) ⬆️
heudiconv/heuristics/studyforrest_phase2.py 23.07% <45.45%> (+10.03%) ⬆️
heudiconv/cli/monitor.py 34.40% <50.00%> (+2.54%) ⬆️
heudiconv/heuristics/multires_7Tbold.py 21.73% <53.33%> (+7.45%) ⬆️
heudiconv/tests/test_monitor.py 46.87% <56.09%> (+4.19%) ⬆️
heudiconv/heuristics/bids_with_ses.py 11.90% <71.42%> (+6.77%) ⬆️
heudiconv/convert.py 85.53% <78.57%> (-1.76%) ⬇️
... and 29 more

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@yarikoptic
Member

I just merged a small PR which introduced conflicts, please resolve.

@jwodder jwodder added the internal Changes only affect the internal API label Apr 10, 2023
@jwodder jwodder force-pushed the gh-653 branch 8 times, most recently from 83f86c4 to dcabd66 Compare April 12, 2023 12:53
@jwodder
Member Author

jwodder commented Apr 12, 2023

@nipy/team-heudiconv I am currently facing the following blockers to applying type annotations:

  • I believe I may have found a bug, and I'm not sure how to address it. I've determined that the ses argument to heudiconv.convert.prep_conversion() is of type str | int | None (See explanation below), yet here ses is passed to sanitize_label(), which only accepts strs and will error if given an int. What should be done about this?

  • Some of the code in heudiconv/convert.py extracts fields from BIDS JSON (sidecar?) files, yet I cannot determine what types these fields are supposed to be. Specifically, the fields in question are:

    Just what are these fields?


Explanation for why ses appears to be of type str | int | None: prep_conversion() is only called at one point, here, where the ses argument is set to session, which is last assigned to here and here.

  • At the first location, session is assigned from an element of a key returned by get_study_sessions(), and all of these keys are of type StudySessionInfo, and so session must accept all types defined for StudySessionInfo.session.

    • At this location in get_study_sessions(), StudySessionInfo.session is initialized from either session_ or session.

      • session_ is a "key" from the values returned by get_extracted_dicoms() and thus is of type Optional[int].

      • session is an argument to get_study_sessions(), which is only called at two locations, both in heudiconv/main.py.

        • At this location in workflow(), session comes from the argument of the same name to workflow(), which is of type Optional[str].
        • At this location in process_extra_commands(), session comes from the argument of the same name to process_extra_commands(), which (despite what the docstring says) is of type Optional[str], as process_extra_commands() is called with the same session as passed to workflow() and thus must include all of its types.

        Therefore, the session argument to get_study_sessions() must be of type Optional[str].

      Therefore, the type of StudySessionInfo.session is str | int | None.

  • At the second location, session is assigned from session_manual when it's non-None, and this variable is equal to the original session argument passed to workflow() and thus of type Optional[str].

Therefore, the session variable passed as the ses argument to prep_conversion() will be either str | int | None or str (i.e., just str | int | None), and so ses in prep_conversion() has the same types.
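
One possible way to reconcile a `str | int | None` value with a function that only accepts `str` is to normalize it first. The sketch below is only an illustration of that idea, not the resolution adopted in this PR: `sanitize_label_standin()` is a hypothetical stand-in for a str-only sanitizer, and `normalize_session()` is an invented helper name.

```python
import re
from typing import Optional, Union


def sanitize_label_standin(label: str) -> str:
    # Hypothetical stand-in for a str-only sanitizer: keep alphanumerics.
    # re.sub() raises TypeError when handed an int instead of a str.
    return re.sub(r"[^a-zA-Z0-9]", "", label)


def normalize_session(ses: Union[str, int, None]) -> Optional[str]:
    # Collapse the str | int | None union to Optional[str] before sanitizing.
    if ses is None:
        return None
    return sanitize_label_standin(str(ses))
```

With this shape, an integer session such as `2` becomes the string `"2"` before it ever reaches the sanitizer, and `None` is passed through untouched.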

@jwodder
Member Author

jwodder commented Apr 12, 2023

@nipy/team-heudiconv Also, side question: In this code:

for pub, priv in DICOM_FIELDS_TO_TEST.items():
    # ensure missing public tag
    with pytest.raises(AttributeError):
        dcm.pub

is dcm.pub supposed to be getattr(dcm, pub)? That would seem to make more sense.
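
The difference between the two spellings can be shown with a small sketch. `FakeDicom` is a hypothetical stand-in for a DICOM dataset object, and `PatientID` is just an example attribute name.

```python
class FakeDicom:
    # Hypothetical stand-in for a DICOM dataset with one attribute set.
    def __init__(self) -> None:
        self.PatientID = "sub-01"


dcm = FakeDicom()
pub = "PatientID"

# dcm.pub looks up an attribute literally named "pub", whatever the loop
# variable holds, so it raises AttributeError on this object.
try:
    dcm.pub
    literal_lookup_failed = False
except AttributeError:
    literal_lookup_failed = True

# getattr(dcm, pub) looks up the attribute whose name is stored in `pub`.
dynamic_value = getattr(dcm, pub)
```

For an object that has no attribute literally named `pub`, a `pytest.raises(AttributeError)` block around `dcm.pub` passes no matter which tags are actually present, so it does not test what the comment says it tests.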

@jwodder
Member Author

jwodder commented Apr 12, 2023

@nipy/team-heudiconv Yet another issue: In the following code:

heudiconv/heudiconv/bids.py

Lines 711 to 717 in 2c6d228

from nibabel import load as nb_load
nifti_file = glob(remove_suffix(json_file, ".json") + ".nii*")
assert len(nifti_file) == 1
nifti_file = nifti_file[0]
nifti_header = nb_load(nifti_file).header
key_info = [nifti_header.get_best_affine(), nifti_header.get_data_shape()[:3]]

the header.get_best_affine() and header.get_data_shape() methods are not present on all possible return types of nibabel.load(); as far as I can tell, they are only present when nibabel.load() returns an instance of one of the following types:

  • Nifti1Pair
  • Nifti1Image
  • AnalyzeImage
  • MGHImage

Exactly what type is nb_load(nifti_file) expected to return here?
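
Whatever the answer, the usual way to satisfy a type checker in this situation is to narrow the union before calling methods that only some members provide. The sketch below uses hypothetical stand-in classes (`ImageA`, `ImageB`, and their headers) rather than nibabel's real types, so it only illustrates the narrowing pattern.

```python
from typing import Tuple, Union


class HeaderWithShape:
    # Hypothetical header type that offers get_data_shape().
    def get_data_shape(self) -> Tuple[int, ...]:
        return (64, 64, 32, 100)


class PlainHeader:
    # Hypothetical header type without that method.
    pass


class ImageA:
    header = HeaderWithShape()


class ImageB:
    header = PlainHeader()


def spatial_shape(img: Union[ImageA, ImageB]) -> Tuple[int, ...]:
    # Narrow the union first; after the isinstance check a type checker
    # knows img.header is a HeaderWithShape and accepts the method call.
    if not isinstance(img, ImageA):
        raise TypeError(f"unsupported image type: {type(img).__name__}")
    return img.header.get_data_shape()[:3]
```

The explicit `TypeError` also replaces a later, less informative `AttributeError` if an unexpected image type ever shows up.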

@jwodder
Member Author

jwodder commented Apr 12, 2023

@nipy/team-heudiconv This should hopefully be the last typing issue I post: The prov_file argument to heudiconv.dicoms.embed_metadata_from_dicoms() can apparently only be str, yet in this call to the function, the provided prov_file argument can be either a str or None. What should be done about this?

@jwodder
Member Author

jwodder commented Apr 25, 2023

@yarikoptic I still need a resolution for the issues mentioned in my top comment.

@yarikoptic
Member

  • heudiconv.dicoms.group_dicoms_into_seqinfos(): If grouping == "custom" and custom_grouping is a callable, the return value is the result of applying custom_grouping() to some arguments; otherwise, the return value is something else. Normally, this could be annotated by using overloads and Literal["custom"], but whenever group_dicoms_into_seqinfos() is called in the heudiconv code, grouping (if present) is always passed as a variable rather than a string literal, and so Literal is unusable here.

ATM it seems that none of the shipped heuristics defines a custom grouping:

❯ git grep 'def grouping'
docs/heuristics.rst:    def grouping(files, dcmfilter, seqinfo):

and it is somewhat documented at https://heudiconv.readthedocs.io/en/latest/heuristics.html?highlight=grouping#grouping-string-or-grouping-files-dcmfilter-seqinfo and was introduced in #359. The expectation is that if it is a callable, we get the same dict[SeqInfo, list[str]] (a mapping from SeqInfo to the list of DICOM files), so I guess we should cast the result into that. And flatten seems to be ignored. FWIW, I am not aware of any heuristic which actually used that feature.

  • The docstring for heudiconv.dicoms.create_seqinfo() states that the first argument is of type nibabel.nicom.dicomwrappers.MosaicWrapper, yet here the supplied argument (obtained from validate_dicom()) is just a nibabel.nicom.dicomwrappers.Wrapper.

I think the docstring is overspecified and it should be just a Wrapper.

  • SeqInfo.series_id is clearly meant to be a str, yet there are several places in heudiconv/heuristics/example.py where this field is compared against or assigned to an int variable.

example.py also has an explicit str() of its value... I think it is just generally inconsistent there; since it is just an example (i.e. not really used), it can be, and likely is, buggy. I am trying to wrap my head around it...

@jwodder
Member Author

jwodder commented Apr 25, 2023

@yarikoptic Specifying that the custom_grouping argument to group_dicoms_into_seqinfos() must return dict[SeqInfo, list[str]] doesn't solve the problem, as group_dicoms_into_seqinfos() can return either that or dict[Optional[str], dict[SeqInfo, list[str]]] depending on the value of flatten — yet the function is called with a custom_grouping and flatten=True here, and it's called with a custom_grouping and flatten=False here. The only way for mypy to be sure what type it's getting would be if the values of flatten, grouping, and custom_grouping were all statically known.

@yarikoptic
Member

but it seems that we can't know them statically, and moreover we "violate" it in the flatten=False case whenever a Callable custom_grouping is provided. So maybe for now just comment out the @overloads you defined for group_dicoms_into_seqinfos and add a comment that such clear/strict typing is not needed, since flatten is ignored in the case of a Callable custom_grouping, which is expected to return the other kind?

@yarikoptic
Member

FWIW, I just checked that this would not be a solution, since the code which calls group_dicoms_into_seqinfos would then not be able to tell one from the other...

since we would need just 1 value to %s!
…ere just number to be used

Not sure if that heuristic is even usable any longer, since no tests for it
were done etc.
@yarikoptic
Member

I've tuned up that example.py, fixed another minor bug (can go outside of this PR in principle) and also added that casting

-            return custom_grouping(files, dcmfilter, SeqInfo)
+            return cast(Dict[SeqInfo, List[str]],
+                        custom_grouping(files, dcmfilter, SeqInfo))

and mypy seems to be happy for me locally, so maybe casting is enough here?
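
One thing worth keeping in mind about this approach: `typing.cast()` only informs the type checker and does nothing at runtime. The sketch below demonstrates that with a hypothetical wrapper `call_custom()` and `str` keys standing in for `SeqInfo`.

```python
from typing import Any, Callable, Dict, List, cast


def call_custom(
    custom_grouping: Callable[..., Any], files: List[str]
) -> Dict[str, List[str]]:
    # cast() returns its argument unchanged and performs no validation;
    # it only tells the type checker what type to assume.
    return cast(Dict[str, List[str]], custom_grouping(files))


# A well-behaved callable returns the promised mapping.
good = call_custom(lambda files: {"all": files}, ["a.dcm"])

# A callable returning the wrong shape still passes through untouched.
bad = call_custom(lambda files: 42, ["a.dcm"])
```

So the cast documents the expectation placed on user-supplied grouping functions, but a heuristic that violates it would only fail later, wherever the result is used.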

@yarikoptic
Member

hm, on CI (where we have py 3.7) it still fails with the following

heudiconv/convert.py:55: error: Cannot find implementation or library stub for
module named "type_extensions"  [import]
            from type_extensions import TypedDict
    ^
heudiconv/convert.py:55: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
heudiconv/convert.py:57: error: Unexpected keyword argument "total" for
"__init_subclass__" of "object"  [call-arg]
        class PopulateIntendedForOpts(TypedDict, total=False):
        ^
.tox/typing/lib/python3.7/site-packages/mypy/typeshed/stdlib/builtins.pyi:113: note: "__init_subclass__" of "object" defined here
Found 2 errors in 1 file (checked 46 source files)
typing: exit 1 (14.48 seconds) /home/runner/work/heudiconv/heudiconv> mypy heudiconv pid=1840
.pkg: _exit> python /opt/hostedtoolcache/Python/3.7.16/x64/lib/python3.7/site-packages/pyproject_api/_backend.py True setuptools.build_meta
  typing: FAIL code 1 (58.26=setup[43.78]+cmd[14.48] seconds)
  evaluation failed :( (58.40 seconds)

maybe we should just move to a newer Python for type checking?
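
For context, `TypedDict` entered the standard `typing` module in Python 3.8, and the backport for older versions is published as `typing_extensions`, so the module name `type_extensions` in the log looks like a typo. The thread does not say how the failure was actually fixed; the sketch below shows one common version-guarded import, with a hypothetical `ExampleOpts` class and made-up keys.

```python
import sys

if sys.version_info >= (3, 8):
    from typing import TypedDict
else:
    # Backport for Python 3.7; requires the typing_extensions package.
    from typing_extensions import TypedDict


class ExampleOpts(TypedDict, total=False):
    # Hypothetical keys; total=False makes every key optional.
    criterion: str
    verbose: bool


# Omitting "verbose" is accepted because the TypedDict is not total.
opts: ExampleOpts = {"criterion": "First"}
```

mypy understands `sys.version_info` checks of this form, so each branch is only analysed for the Python version it targets.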

@yarikoptic
Member

cool, thanks for fixing! So we are all green -- take out of draft and let's invite others for possible review/feedback/training? @pvelasco @tsalo - interested in reviewing some type annotations ;) ?

@jwodder jwodder marked this pull request as ready for review April 26, 2023 19:41
read/write config - we know it must be dict
age - must be str or float, as we test, the rest -- not legit
apparently treat_age would have made None into "None" string which maybe_na would
not map to n/a.  So decided to make it more explicit and specific here.
I checked on sample that we do get even time as  "str" from dcmdata, so
should be matching
@yarikoptic
Member

@jwodder please check the last two small commits, where I have (I think) constrained the typing a little more. Let me know if you have reservations about that.

@yarikoptic
Member

ok -- Let's go. Thank you again @jwodder for all this monumental work!

@yarikoptic yarikoptic merged commit cc81d7d into master Apr 29, 2023
@yarikoptic yarikoptic deleted the gh-653 branch April 29, 2023 00:38
@yarikoptic yarikoptic added minor Increment the minor version when merged and removed internal Changes only affect the internal API labels May 8, 2023
@yarikoptic
Member

Although this is indeed internal, I want the next release to be at least a minor version bump due to the scale of this change, so I am labeling it as such.

@github-actions

github-actions bot commented May 8, 2023

🚀 PR was released in v0.13.0 🚀

Labels
minor Increment the minor version when merged released
Development

Successfully merging this pull request may close these issues.

initiate typing checks
2 participants