[REF, ENH] Add CBMAEstimator base class #232

Merged May 26, 2020 (14 commits)

@tsalo commented May 26, 2020

Closes #231, closes #194, and closes #171 (although studies with missing coordinates will probably still be a problem). References #170 and #195. Supersedes #179.

Regarding #195: this uses the masker throughout the CBMA methods, so masking is at least handled consistently.

Changes proposed in this pull request:

  • Use masker throughout CBMA methods. (masker handling #194 and Clean up masking/preprocessing in CBMAEstimator hierarchy #195)
  • Change ALE “n” to “sample_size”.
  • Remove "n"/"sample_size" from coordinates attribute. (Drop sample size from Dataset.coordinates #231)
  • Include "coordinates" in Dataset.get (Add coordinate support to Dataset.get #171)
    • This loops through Dataset IDs, splits up the coordinates DataFrame based on ID, and merges them again at the end. This is necessary for Dataset.get, but is also slow and a little pointless.
    • Nothing handles missing data yet (e.g., studies with no coordinates), but that will be added in the future.
  • Add CBMAEstimator base class.
    • This base class pulls the metadata required by the kernel transformer from the dataset’s metadata into the dataset’s coordinates DataFrame as part of _preprocess_input.
    • The modified coordinates DataFrame is now stored as Estimator.inputs_['coordinates']! This is great because we don't need to mess with the actual dataset.
  • Change IBMAEstimator to MetaEstimator. CBMAEstimator inherits from this to add masker and check input types.
    • Input type checking is not implemented yet for the CBMAEstimator (i.e., it checks against nothing), but the hook will be helpful once we figure out what to check. For now, only minimal input checking is performed (see above).
  • Wonderful references formatting in docstrings.
  • Update and refactor tests:
    • No more monkey-patching pytest for fixtures.
    • Drop custom datasets in CBMA tests because we now require more features and mocking up a dummy would be too much effort.
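To make the base-class change concrete, here is a minimal sketch of the idea, not the NiMARE implementation. It assumes plain pandas DataFrames in place of a Dataset, and the `fit` signature, the `_kernel_metadata` attribute, and the `id` column are hypothetical names chosen for illustration. Only `MetaEstimator`, `CBMAEstimator`, `_preprocess_input`, `inputs_['coordinates']`, and `sample_size` come from the description above.

```python
import pandas as pd


class MetaEstimator:
    """Hypothetical stand-in for the shared base class."""

    def fit(self, coordinates, metadata):
        # Assumed signature: the real estimators take a Dataset object.
        self.inputs_ = {}
        self._preprocess_input(coordinates, metadata)
        return self

    def _preprocess_input(self, coordinates, metadata):
        # Work on a copy so the caller's DataFrame is never modified.
        self.inputs_["coordinates"] = coordinates.copy()


class CBMAEstimator(MetaEstimator):
    # Hypothetical: metadata columns the kernel transformer needs.
    _kernel_metadata = ["sample_size"]

    def _preprocess_input(self, coordinates, metadata):
        super()._preprocess_input(coordinates, metadata)
        needed = metadata[["id"] + self._kernel_metadata]
        # Pull study-level metadata into the per-focus coordinates table.
        self.inputs_["coordinates"] = self.inputs_["coordinates"].merge(
            needed, on="id", how="left"
        )


# Toy data: two studies, three foci; sample size lives in the metadata only.
coordinates = pd.DataFrame(
    {"id": ["s1", "s1", "s2"], "x": [0, 10, -4], "y": [2, 8, 6], "z": [4, -2, 12]}
)
metadata = pd.DataFrame({"id": ["s1", "s2"], "sample_size": [20, 35]})

est = CBMAEstimator().fit(coordinates, metadata)
print(est.inputs_["coordinates"])
```

The merged table lives only on the estimator, so the original `coordinates` DataFrame still has no `sample_size` column after fitting.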

Notes

  • Need to squeeze masker.transform outputs. Unlike apply_mask, which returns (n_voxels,) arrays when only one 3D image is provided, transform returns (1, n_voxels) arrays.
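The shape difference can be illustrated with plain NumPy arrays. This is a sketch under the assumption that the arrays below stand in for the outputs of `masker.transform` and `apply_mask` on a single 3D image; no masker is actually called, and `n_voxels` is an arbitrary toy value.

```python
import numpy as np

n_voxels = 5

# Stand-in for masker.transform(img) on one 3D image: 2D, with a single row.
transformed = np.arange(n_voxels, dtype=float).reshape(1, n_voxels)

# Stand-in for apply_mask(img, mask) on the same image: 1D.
masked = np.arange(n_voxels, dtype=float)

# Squeezing drops the leading length-1 axis so both paths agree in shape.
squeezed = np.squeeze(transformed)

print(transformed.shape, squeezed.shape, masked.shape)
```

Note that `np.squeeze` without an `axis` argument removes every length-1 axis, so a one-voxel mask would collapse to a 0-d array; `np.squeeze(transformed, axis=0)` or `transformed[0]` is a stricter alternative if that case matters.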

tsalo added 14 commits May 24, 2020 23:27
- This base class pulls metadata required by kernel transformer from
dataset’s metadata into dataset’s coordinate dataframe as part of
_preprocess_input.
- Change ALE “n” to “sample_size”
- Change IBMAEstimator to MetaEstimator. CBMAEstimator inherits from
this.
- Need to squeeze masker.transform outputs. Unlike apply_mask, which
returns (n_voxels,) arrays when only one 3D image is provided,
transform returns (1, n_voxels) arrays.
- No more monkey-patching pytest for fixtures.
- Drop custom datasets in CBMA tests because we now require more
features and mocking up a dummy would be too much effort.
- Use masker throughout CBMA methods.

codecov bot commented May 26, 2020

Codecov Report

Merging #232 into master will decrease coverage by 0.07%.
The diff coverage is 92.76%.

@@            Coverage Diff             @@
##           master     #232      +/-   ##
==========================================
- Coverage   61.39%   61.32%   -0.08%     
==========================================
  Files          65       65              
  Lines        4536     4494      -42     
==========================================
- Hits         2785     2756      -29     
+ Misses       1751     1738      -13     
Impacted Files                    Coverage Δ
nimare/annotate/boltzmann.py      77.77% <ø> (ø)
nimare/annotate/cogat.py          69.87% <ø> (ø)
nimare/annotate/cogpo.py          80.00% <ø> (ø)
nimare/annotate/gclda.py          90.38% <ø> (ø)
nimare/annotate/lda.py            26.66% <ø> (ø)
nimare/annotate/text2brain.py     85.71% <ø> (ø)
nimare/annotate/word2brain.py     85.71% <ø> (ø)
nimare/decode/continuous.py       38.59% <ø> (ø)
nimare/decode/discrete.py         22.96% <ø> (ø)
nimare/decode/encode.py           82.75% <ø> (ø)
... and 23 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@tsalo merged commit 1c9084f into neurostuff:master May 26, 2020
@tsalo deleted the ref/dataset branch May 26, 2020 19:54