
[REF] Modularize metric calculation #591

Merged
merged 27 commits into from
Jun 14, 2021

Conversation

tsalo
Member

@tsalo tsalo commented Aug 7, 2020

Closes #501.

Changes proposed in this pull request:

  • Modularize metrics.
  • Add while loop to cluster-extent thresholding to maximize similarity in number of significant voxels between comparison maps.
  • Add signal-noise_z metric, which should ultimately replace signal-noise_t.
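The cluster-extent matching in the second bullet can be sketched roughly like this (a hypothetical illustration, not tedana's actual implementation; the function and parameter names are invented): iteratively adjust one map's threshold until its count of significant voxels matches a reference map's, with a maximum-iteration escape so the loop cannot run forever.

```python
import numpy as np

def match_suprathreshold_voxels(ref_map, comp_map, z_thresh=1.95,
                                max_iters=500, step=0.05):
    """Hypothetical sketch: adjust comp_map's threshold until its number of
    suprathreshold voxels approximates ref_map's count at z_thresh."""
    n_target = np.sum(np.abs(ref_map) > z_thresh)
    thresh = z_thresh
    for _ in range(max_iters):  # escape parameter: the loop cannot run forever
        n_comp = np.sum(np.abs(comp_map) > thresh)
        if n_comp == n_target:
            break
        # raise the threshold to admit fewer voxels, lower it to admit more
        thresh += step if n_comp > n_target else -step
    return thresh
```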

tsalo and others added 4 commits November 22, 2019 11:52
* Reorganize metrics.

* Some more work on organizing metrics.

* Add signal_minus_noise_z metric.

* Variable name change.

* Move comptable.

* Move T2* cap into decay function.

* Split up metric files.

* Adjust cluster-extent thresholding to match across maps.

* Partially address review.

Lots of great refactoring by @rmarkello.

* Make DICE broadcastable.

* Clean up signal-noise metrics and fix compute_countnoise.

* Simplify calculate_z_maps.

* Fix dice (thanks @rmarkello)

* Improve documentation.

* Fix import.

* Fix imports.

* Get modularized metrics mostly working.

* Fix bugs in metric calculations.

All metrics should be calculated on *masked* data. Any metric maps
should also be masked.

* Revert changes to decision tree.

* Finish reverting.

* Fix viz.

* Fix style issues

* More???

* Fix bug in generate_metrics.

* Add initial tests.

* Improve docstrings, add shape checks, and reorder functions.

* Fix assertions.

* Fix style issue.

* Add metric submodules to API docs.

* Improve reporting for T_to_Z transform.

* Fix bugs in new modularized dependence metrics.
* Reorganize metrics.

* Some more work on organizing metrics.

* Add signal_minus_noise_z metric.

* Variable name change.

* edited gitignore after testing

* Revert "edited gitignore after testing"

This reverts commit 28098e1.

* fix the +0 -0 absurdity and actually commit

* silly underscore go away

* style fix

* I swear I can spell things.

* Move comptable.

* more style

* Move T2* cap into decay function.

* Split up metric files.

* Adjust cluster-extent thresholding to match across maps.

* Partially address review.

Lots of great refactoring by @rmarkello.

* Make DICE broadcastable.

* Clean up signal-noise metrics and fix compute_countnoise.

* Simplify calculate_z_maps.

* Fix dice (thanks @rmarkello)

* use index, doesn't have component

* add the scree to the outputs

* Improve documentation.

* Fix import.

* Fix imports.

* Get modularized metrics mostly working.

* Fix bugs in metric calculations.

All metrics should be calculated on *masked* data. Any metric maps
should also be masked.

* Revert changes to decision tree.

* Finish reverting.

* Fix viz.

* Fix style issues

* More???

* Fix bug in generate_metrics.

* Add initial tests.

* Improve docstrings, add shape checks, and reorder functions.

* Fix assertions.

* Fix style issue.

* Add metric submodules to API docs.

* Improve reporting for T_to_Z transform.

* Fix bugs in new modularized dependence metrics.

* [MNT] Drop versioneer from requirements (#481)

* MNT: Drop versioneer from requirements

The versioneer package is not a requirement to use versioneer.

* MNT: Add versioneer to a new "dev" extra

* [ENH] Adds maPCA decomposition algorithm (#435)

* Create fix/decomp branch

* Started writing GIFT version in Python as we found it selects a reasonable number of components

* Advanced on new PCA computation

* Solved NAN issues and added testing data

* Solved est_indp_sp bug

* PCA is SOLVED! Hooray!

* PR request ready version of the PCA fix

* fix: avoid circular imports

* Bug fixes and started working on docstrings

* [FIX] Clean up some variables in GIFT PCA (#453)

* Added version argument (#415)

* Added version argument

* changed import of __version__

* [REF] No pytest_cache dir

* [ENH] Dockerfile for dev testing

* [DOC] Add docs for code testing to CONTRIBUTING

* [REF] Docker org can't have hyphens

Just use tedana/tedana-dev namespace

* Changed default export in gzipped nifti (#416)

* Changed default export in gzipped nifti

* Updated extension of files in tests

* [FIX] Update typo in .gitignore

Co-Authored-By: Joshua Teves <[email protected]>

* [REF, DOC] Use logging to generate reports (#424)

* Use logging for report/references generation.

* Improve ContextFilter docstring.

* Remove broken newlines.

* Make the style good.

* [ENH][TST] Overhauls CI completely (#418)

* Adds integration and five echo skipping

* Style fixes

* Updates config for CircleCI

* Attempts to fix YML

* [TEST] Update Dockerfile to match new integr tests

* [TEST] Fixes integration tests in Docker image

* [FIX] Remove intermediate IO files

* Resolves merge conflict, adds output check

* Some fixes

* [TEST] Updates dev_tool testing infra

* [TEST] Fixes pytest integration path checking

* [TEST] CircleCI uses Docker image to run tests

* [FIX] Minor dev_tool issues for CircleCI

* [TEST] Use variable for integration test filename

* Attempts to fix CircleCI style check

* Revert "Attempts to fix CircleCI style check"

This reverts commit 769f4b7.

* Attempt to fix tput call

* Adds checkout to code in YML

* [TEST] Integration tests run in parallel

* [TEST] Separate data downloads from Docker build

* [TEST] Update integration test data path

* [TEST] CircleCI uses good Docker

* [TEST] No version check in circleci

* [TEST] Checkout for get_data / style check

* Attempts to fix integration test inclusion

* [TEST] Checkout for get_data / style check

* [FIX] Fix circleci config hopefully

* [FIX] No / workdir for circleci machine

* [FIX] Use ~ for coverage in circleci

* Switches integration tests to truncated length data

* [FIX] Actually merge coverage files

* [FIX] Coverage cache path circleci

* [TEST] Integration test outputs in tests/data

* [FIX] circleci config bug

* [TEST] Major testing infra overhaul

Docker image considerably slimmed down (only test python 3.6 locally),
added new dev_requirements.txt to make conda yaml files obsolete, added
Makefile to make testing easier locally (if you aren't using the Docker
image), and moved integration test data downloads from a separate script
into the integration tests themselves

* [TEST] Massive CircleCI config regression

@tsalo had it right --- moving towards a fully Dockerized implementation
was not the way forward for a simple Python package.

* [TEST] Better integration testing?

At least, more equivalent to what was happening before, where we check
that ONLY the expected output files are generated (no more, no less)

* [FIX] CircleCI workflow issue

* [MNT] No flake8-putty

* [FLK] New flake8 error detected

* [TEST] Run style check separately

@leej3 said it's not fair to stop running tests for a few minor style
errors, and he's usually right so....

* [TEST] Py37 for all non-unit test stuff

* [WIP,TEST,DOC] Builds docs in CircleCI (#428)

* [REF] Fix up dev vs non-dev requirements.txt

* [TEST] Build docs in CircleCI

Also modifies the building of the py37 environment a bit since it's
shared between so many things

* [REF] Change tab-delimited files from txt extension to tsv (#429)

* Change comptables and log from txt to tsv.

* Prevents shell expansion in API (#431)

* enh: add stale-bot (#430)

* edited gitignore after testing

* Revert "edited gitignore after testing"

This reverts commit 28098e1.

* fixing typo and clarity

* cleaning up approach introduction

* more cleanup in approach

* adding initial plot to pubs page

* correcting plots

* removing comment parts

* getting live plots working on publication page

* clean up and adding more WIP details

* completing live TE plot from spreadsheet

* changing outputs to nii.gz

* adding TR and Vox dim subplots

* initial adding of output details

* fixing tables

* rearranging to considerations and recommendations

* separating out considerations section

* technically spelling things correct

* started fixing titles

* adding figures for what is ME page

* title clarification

* Add information about GE

* adding multiecho questions to the FAQ

* initial adding to multiecho

* multiecho page figures and explanations

* pipeline and physics updated

* typo

* fix headings

* possibly clean latex?

* we have eight echoes

* more details in approach

* title change

* adding software details

* clean up of multiecho

* additional considerations!

* typo

* update plots

* due credit in faq

* more details on multi echo, usage

* tedana does accept BRIK/HEAD

* someone wanted sentences separated

* life by a thousand little edits

* GrAmMaRs

* other software!

* links to api in approach

* white space?

* [ENH] Make png default (#427)

* Makes --png default, adds --nopng

* Fixes integration test for --nopng

* Adds correction to integration test outputs

* Update docs/faq.rst

Co-Authored-By: Elizabeth DuPre <[email protected]>

* MRI scanners != BBQ

* Update docs/approach.rst

Co-Authored-By: Elizabeth DuPre <[email protected]>

* Update docs/approach.rst

Co-Authored-By: Elizabeth DuPre <[email protected]>

* Apply suggestions from code review, thanks a million

Co-Authored-By: Elizabeth DuPre <[email protected]>

* echo cleanup, and kundu 3 banish

* noted png default

* completely changed the QC section

* pls wrk

Co-Authored-By: Elizabeth DuPre <[email protected]>

* docs: add jbteves as a contributor (#442)

* docs: update README.md

* docs: update .all-contributorsrc

* docs: add eurunuela as a contributor (#441)

* docs: update README.md

* docs: update .all-contributorsrc

* [ENH][TST] Feature/parallel integration (#437)

* Adjusts makefile to include three-echo and five-echo, remove integration
* Adjusts CircleCI config to do the same as makefile 
* Removes doc auto-building via CircleCI

* [FIX] Adds check for just one component (#405)

* Adds check for just one component

* [ENH] Consolidate stats unit tests and fix linting errors in tests (#421)

* Consolidates stats unit tests
* Changes to simpler assert for stats errors
* Oof linting

Co-Authored-By: Elizabeth DuPre <[email protected]>

* Clean up some variables.

* Fix output name.

* Fix bugs and use TruncatedSVD in GIFT.

* Normalization of data added

* doc: fix linting errors

* Second if statement in _checkOrder does not check for None values now, as these should never be given to the function.

Co-Authored-By: Elizabeth DuPre <[email protected]>

* Style editing to gift_pca.py

Co-Authored-By: Elizabeth DuPre <[email protected]>

* Style editing to gift_pca.py

Co-Authored-By: Elizabeth DuPre <[email protected]>

* Removed silly comment inherited from MATLAB

Co-Authored-By: Elizabeth DuPre <[email protected]>

* Removed print statement in gift_pca.py

Co-Authored-By: Elizabeth DuPre <[email protected]>

* Commented out TruncatedSVD, as it does not provide the functionality we are looking for. Also made some edits after Elizabeth's comments

* Removed unused import

* Changed some of the print statements to make them more user friendly

* Adds documentation

* Fixes PCA memory madness

* Fixes linter errors

* Now accounting for imprecision in kurtosis calculation

* Edited autocorr docstring

* GIFT PCA no longer uses the variance-normalized data, in order to get the same number of components as the original GIFT algorithm. This is something we should definitely discuss

* Corrections to style

* Changed test output expected

* Removed GIFT notebook

* Made MDL the default and added GIFT reference

* Removed GIFT testing data

* Changes as commented in PR review

* Style fix

* Changed variable names to make them more readable

* Style fix

* Style fix

* Update tedana/decomposition/gift_pca.py

Co-Authored-By: Ross Markello <[email protected]>

* Changes suggested in PR review

* More changes based on PR review suggestions

* Changes to workflows/tedana.py based on PR review suggestions

* Changes to decomposition/pca.py based on PR review suggestions

* Update tedana/decomposition/gift_pca.py

Co-Authored-By: Ross Markello <[email protected]>

* Update tedana/decomposition/gift_pca.py

Co-Authored-By: Ross Markello <[email protected]>

* Update tedana/decomposition/gift_pca.py

Co-Authored-By: Ross Markello <[email protected]>

* Update tedana/decomposition/gift_pca.py

Co-Authored-By: Ross Markello <[email protected]>

* Update tedana/decomposition/gift_pca.py

Co-Authored-By: Ross Markello <[email protected]>

* Update tedana/decomposition/gift_pca.py

Co-Authored-By: Ross Markello <[email protected]>

* Update tedana/decomposition/gift_pca.py

Co-Authored-By: Ross Markello <[email protected]>

* More review changes

* Style fixes

* Finished with PR review suggestions

* Update tedana/decomposition/gift.py

Co-Authored-By: Taylor Salo <[email protected]>

* Update tedana/decomposition/gift.py

Co-Authored-By: Taylor Salo <[email protected]>

* Renamed gift to ma_pca and added some math notes

* Added more maths-related comments

* Renamed ma_pca function

* Added unit test for maPCA and fixed a typo in ma_pca.py

* Added unit test for maPCA and fixed a typo in ma_pca.py

* Added unit test for ent_rate_sp inside maPCA

* Added test for constant input to ent_rate_sp

* Documentation fix

Co-Authored-By: Joshua Teves <[email protected]>

* Documentation fix

Co-Authored-By: Joshua Teves <[email protected]>

* Removed if statements in ent_rate_sp as they were never going to be accessed in our case. Also addressed some PR review comments.

* Style fix

* Update tedana/decomposition/ma_pca.py

Co-Authored-By: Joshua Teves <[email protected]>

* Update tedana/decomposition/ma_pca.py

Co-Authored-By: Joshua Teves <[email protected]>

* Addresses documentation suggestions

* [TST] Expands integration (#482)

* Expands integration testing to include mapca and several of its options

* [REF] Remove the sourceTEs option (#493)

* [REF] Removes mle option (#495)

* Removes mle option

* [REF] Remove confusing variable copy (#498)

* mmix -> mmix_new

* Change variable name.

* add new plot to all tests

* removing to_numpy(), plt. to fig.

* docs: add Islast as a contributor (#503)

* docs: update README.md [skip ci]

* docs: update .all-contributorsrc [skip ci]

* [DOC] Adds developer guideline start (#446)

* Adds developer guideline start
* Modifies CONTRIBUTING
* Fixes dead link, removes project board
* Adds more thorough developing guidelines
* Fix lack of linting
* Updates dev requirements
* Adds flake8 instructions
* Updates config with correct CI version
* Adds manual selection of artifacts through CircleCI UI
* Adds tar command
* Clarify some pytest stuff
* Offer alternative to Gitter
* Outlines steps for integration tests more clearly
* Adds filenaming convention for outputs files
* Ask not to rewrite history
* Adds two reviewer rule

Co-Authored-By: Elizabeth DuPre <[email protected]>
Co-Authored-By: Ross Markello <[email protected]>

* docs: add eurunuela as a contributor (#504)

* docs: update README.md [skip ci]

* docs: update .all-contributorsrc [skip ci]

* docs: add dowdlelt as a contributor (#505)

* docs: update README.md [skip ci]

* docs: update .all-contributorsrc [skip ci]

* [FIX] Make bi-sided clustering default (#331)

* Change default thresholding to bi-sided.

We should treat positively and negatively weighted voxels separately
when performing cluster-extent thresholding.

* Update license in zenodo

* [TST] Expand parameters tested in integration tests (#502)

* Add args to integration tests.

* Fix import error.

* Fix 5-echo test.

* Move rerun into another directory.

* Fix outputs.

* Fix...

* Dummy commit.

* [FIX, TST] Fix the failing tests (#518)

* Change requirements.

* Change env checksums.

* Improve documentation for debugging.

* [DOC] Improve documentation of computefeats2 (#458)

* adding more doc to computefeats2

* Added the default option add_const=False for user-friendliness

* changes following PR review & linter check

* Update tedana/stats.py

Co-Authored-By: Elizabeth DuPre <[email protected]>

* Update tedana/stats.py

Co-Authored-By: Elizabeth DuPre <[email protected]>

* Update tedana/stats.py

Co-Authored-By: Elizabeth DuPre <[email protected]>

Co-authored-by: Elizabeth DuPre <[email protected]>

* [TST] Remove testing directory from coverage tests (#496)

* Remove testing directory from coverage tests

* Removes some more tests

* Removes setup.py from coverage ignore

* Ignores info.py

* [ENH, REF] Allow re-running without manacc and reorder args (#508)

* Reorder tedana workflow arguments

Broadly reorder them from early to late in the workflow, but also
grouping related arguments together. Add a group for re-running the
workflow.

* Remove unnecessary check.

Data is a list from the CLI and io.load_data will convert a string to
list anyway. Plus io.load_data will raise an error if the file doesn’t
exist.

* Allow re-running workflow without manacc.

I.e., if manacc is not supplied, but ctab is, then use the
classifications from ctab.

* Address @dowdlelt's review.

* [ENH] Add t2smap argument for optional recalculated T2* map (#408)

* Add t2smap argument

Also, move rerunning-related arguments into their own subgroup and
allow rerunning *without* manacc (i.e., if manacc is not supplied, but
ctab is, then use the classifications from ctab.

* Update docs a bit.

* Fix log message.

* Reorganize workflow arguments to group related params together.

* Get T2* re-running working.

* Replace `.nii` in workflow with `.nii.gz`.

* Fix args.

* Add manacc test.

* Fix t2s naming.

* Add fit type back in.

* Fix bad merge.

* [REF] Add out_dir argument to filewriting functions (#500)

* Add out_dir argument to gscontrol functions.

* Add out_dir argument to io functions.

* Remove unnecessary chdir call.

* Fix removed files.

* Fix outputs.

* Fix io.filewrite.

* Fix filewrite.

* [FIX, DOC] Reorganize and improve documentation (#530)

* Fix typos.

* Miscellaneous doc cleanup.

* Fix PCA/ICA links (closes #480).

* Remove outdated output from table.

* Move acquisition-related sections to one page.

Also, move considerations into multi-echo page, move resources from
considerations into new “resources” page, and move publications into
new “acquisition” page.

* Add info about quantitative T2* mapping (closes #464).

* Add link to ME-fMRI sequences OSF project.

* Address review.

* Add @handwerkerd's recommendation for the protocols.

* Add @handwerkerd's text about T2* mapping.

* Fix links.

* Add comment about copy.

Using copy makes more sense here than re-implementing what I did in
#498.

* Darn so I forgot to save when I resolved the conflict.

* Capping is already in the decay module.

* Fix long line.

* Draft check_mask function.

* Fix mismatch between metrics and decision tree.

Also calculate metrics in the actual workflow.

And also documentation.

* Fix style problem.

* Add while loop escape parameter.

* Reduce stalebot aggression. (#535)

* [DOC] Add "is it maintained" badges (#536)

* [FIX] Remove colons from log filename (#542)

* Remove colons from log-file name.

* Update integration test regex for finding log files.

* [TST] Add tests for the workflow CLIs (#538)

* Change rerun tests to use the CLI.

* Add t2smap CLI test.

* FIX

* No but fix.

* Address @rmarkello's review.

* [ENH] Output and load T2* maps in seconds (#525)

* Output and load in T2* maps in seconds.

* Print info about T2* in seconds.

* Add comments to trigger CI.

* Maybe test t2smap arg.

* Add sec2millisec and millisec2sec functions.

* Apply @jbteves' suggestions from code review

Add millisec2sec and sec2millisec to t2smap.

Co-Authored-By: Joshua Teves <[email protected]>

* Add functions to API doc page.

* Fix bad import.

* Fix mistake.

Co-authored-by: Joshua Teves <[email protected]>

* [FIX,TST] Adds unit tests for ma_pca functions and fixes a bug in ma_pca (#549)

* Adds unit tests for ma_pca functions and fixes a bug in ma_pca

* Adds trivwin false check on test_check_order

* Update tedana/tests/test_mapca.py

Co-Authored-By: Taylor Salo <[email protected]>

Co-authored-by: Taylor Salo <[email protected]>

* [FIX] Use mask, when provided, in t2smap workflow. (#545)

* [DOC] Updates the TEDPCA section following changes in tedana v0.0.8 (#551)

* Adds missing reference

* Adds missing line break

* Reduced the math-wizardry of the MA PCA description

* Gives iid its full name

* Update docs/approach.rst

* [ENH] Control number of threads with threadpoolctl (#537)

* Let's give threadpoolctl a try.

* Group related imports together and sort by length like the artists we are.

* Apparently I have to import things if I want to use them.

* Switch to using argparse function for e5 integration test re-run.

* Revert.

* Apply suggestions from code review

Co-Authored-By: Ross Markello <[email protected]>

* Partially address review.

* Finish addressing review.

* Update tedana/workflows/t2smap.py

Co-Authored-By: Ross Markello <[email protected]>

Co-authored-by: Ross Markello <[email protected]>

* [DOC] Adds PCA documentation change that was missing (#554)

* [DOC] Updates the TEDPCA section following changes in tedana v0.0.8

* Adds missing reference

* Adds missing line break

* Reduced the math-wizardry of the MA PCA description

* Gives iid its full name

* Update docs/approach.rst

Co-Authored-By: Taylor Salo <[email protected]>

* Update docs/approach.rst

Co-Authored-By: Taylor Salo <[email protected]>

* Assess PR review comments

Co-authored-by: Taylor Salo <[email protected]>

* [FIX] Drop use of eimask (#555)

This is a quick bug-fix, but we will ultimately probably incorporate
the eimask into the initial adaptive masking at the beginning of the
workflow.

* [REF] Include daysUntilStale and daysUntilClose in stalebot message (#559)

* [ENH] Only use good echoes for metric calculation/optimal combination (#358)

* Only use good echoes for metric calculation and optimal combination.

* Change variable name.

* A bit more memory management.

* Fix.

* Fix tests.

* Fix style.

* Limit mask to adaptive_mask >= 3

* Fix make_adaptive_mask

* Fix test.

* Update outputs.

* Remove old file.

* Revert some variable name changes.

* Revert some more.

* Fix style issues with test.

* Add constant when calculating betas.

Removing this seemed to cause the breaking test in the last merge commit.

* Drop extra components from outputs file.

* Improve tedpca docstring.

Co-Authored-By: Elizabeth DuPre <[email protected]>

* Improve description of adaptive mask.

* Change empty to unstable.

* Update tedana/combine.py

Co-Authored-By: Elizabeth DuPre <[email protected]>

* Update tedana/combine.py

Co-Authored-By: Elizabeth DuPre <[email protected]>

* Update tedana/combine.py

Co-Authored-By: Elizabeth DuPre <[email protected]>

Co-authored-by: Elizabeth DuPre <[email protected]>

* [FIX] Use SNR for PAID instead of mean data (#560)

* [REF] Update t2smap workflow for fMRIPrep compatibility (#557)

* Add out-dir argument to t2smap workflow.

Also move logging setup from _main into t2smap_workflow.

* Change t2smap workflow output files.

* I hate tests.

* Oh my tests are the worst.

* Fix the CLI test as well.

* [FIX] Correct t2smap outputs (#562)

* Fix BEP001-compatible output names.

* Arbitrary change to allow PR.

* [FIX] Fix t2smap optimal combination (#566)

* Fix t2smap optcom

Introduced when I changed make_optcom’s arg from mask to adaptive mask
in #358.

* Add t2smap integration test.

* Update acquisition.rst to include information for Philips scanners (#579)

* Update acquisition.rst

Include description for Philips users

* docs: add mjversluis as a contributor (#580)

* docs: update README.md [skip ci]

* docs: update .all-contributorsrc [skip ci]

* Empty

Co-authored-by: allcontributors[bot] <46447321+allcontributors[bot]@users.noreply.github.com>

* [DOC] multi-echo reports (#457)

* Added html template for report

* doc: initial commit of reporting template

* doc: save out html for testing

* fix: add missing bokeh js script

* fix: unescape HTML ugly way

* fix: add missing cdn's

* enh: switch to generate_report structure

* sty: switch back to pure CSS from bootstrap

* fix: differently update template for content

* sty: add jupyter notebooks to git ignore

* enh:initial commit of dynamic plots

* fix: patch unpaired bracket

* enh:modularized dynamic kappa/rho plots

* enh:docstrings added to KappaRho_DynPlot.py

* Force figures default, fold viz into reporting module

* enh:TS and Spectrum are now dynamic plots

* sty: initial refactor of dynamic figures

* fix: re-introduce func to load comp_table

* sty: finalize rough refactor

* fix: patch linting errors

* doc: continue refactor

* Advanced on dynamic_figures

* Made the refactored code work

* fix: new year, new approach

* tmp: non-functioning reporting code

* Fixes bug

* Fixes bug

* Added new function to generate html report and removed tempita calls

* Added FFT and time series plots

* Made template substitute work and updated necessary bokeh version

* Changes html title

* remove time series, fft plots

* fix: remove unescapes

* fix: patch incorrect imports

* fix: add missing bokeh import

* fix: spelling mistake 🤦

* fix: reflect reorg in get_coeffs call

* fix: correct expected test outputs

* fix: don't designate file path when saving report

* fix: combine generate, html report functions

* fix: ignore missing columns in df.drop, to better handle multiple params

* Dynamic figures are now two rows, static figure is on the right. Removed the cute dog link and made adding report.txt text to bottom of report automatic

* Added new line after References in report.txt

* Removes BeautifulSoup dependency in favor of an object tag and changes some CSS styles

* enh: switch to grid layout, style About section, add links to navbar

* doc: update no-png to no-reports

* modification of setup.py to properly install all reporting directories

* Use relative path for static figures on report

* Add version check for bokeh figures

Co-authored-by: smoia <[email protected]>
Co-authored-by: Javier Gonzalez-Castillo <[email protected]>
Co-authored-by: eurunuela <[email protected]>

* Escape earlier.

* Fix inescapable while loop.

* Update outputs.

* Log in CI.

* Add smoke tests for metric functions.

* A little cleanup.

* Apply adaptive masking to metrics.

* Work around get_coeffs idiosyncrasy.

* Fix tests.

* Ugh, fix.

* Drop old metrics file.

Co-authored-by: Taylor Salo <[email protected]>
Co-authored-by: Logan <[email protected]>
Co-authored-by: Chris Markiewicz <[email protected]>
Co-authored-by: Eneko Uruñuela <[email protected]>
Co-authored-by: Joshua Teves <[email protected]>
Co-authored-by: allcontributors[bot] <46447321+allcontributors[bot]@users.noreply.github.com>
Co-authored-by: Ross Markello <[email protected]>
Co-authored-by: Cesar Caballero Gaudes <[email protected]>
Co-authored-by: mjversluis <[email protected]>
Co-authored-by: smoia <[email protected]>
Co-authored-by: Javier Gonzalez-Castillo <[email protected]>
Co-authored-by: eurunuela <[email protected]>
@tsalo changed the title from "[REF] Modularize metrics" to "[REF] Modularize metric calculation" on Aug 7, 2020
@codecov

codecov bot commented Aug 7, 2020

Codecov Report

Merging #591 (c1f80cc) into main (ad664ae) will decrease coverage by 0.53%.
The diff coverage is 91.39%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #591      +/-   ##
==========================================
- Coverage   92.81%   92.27%   -0.54%     
==========================================
  Files          25       27       +2     
  Lines        1879     2137     +258     
==========================================
+ Hits         1744     1972     +228     
- Misses        135      165      +30     
Impacted Files Coverage Δ
tedana/reporting/dynamic_figures.py 100.00% <ø> (ø)
tedana/workflows/tedana.py 90.30% <66.66%> (+0.46%) ⬆️
tedana/metrics/_utils.py 73.77% <73.77%> (ø)
tedana/utils.py 97.50% <85.71%> (-0.80%) ⬇️
tedana/selection/tedica.py 92.56% <88.88%> (-0.81%) ⬇️
tedana/metrics/dependence.py 93.25% <93.25%> (ø)
tedana/metrics/collect.py 95.10% <95.10%> (ø)
tedana/stats.py 96.62% <96.42%> (-0.10%) ⬇️
tedana/decay.py 97.24% <100.00%> (+0.07%) ⬆️
tedana/decomposition/pca.py 86.66% <100.00%> (ø)
... and 6 more

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ad664ae...c1f80cc. Read the comment docs.

@tsalo
Member Author

tsalo commented Aug 7, 2020

The only failing check is Codecov, because of a 0.73% decrease in coverage. So that's good, at least!

@tsalo
Member Author

tsalo commented Aug 21, 2020

Update from today's call: everyone, please try to both review this and test it out on your own code. Also, @jbteves and @handwerkerd have committed to reviewing.

@jbteves
Collaborator

jbteves commented Oct 7, 2020

Hey @tsalo, thanks so much for this, this is awesome!
While I was looking through this, one concern I have is that we'll have to maintain the order of metric calculation, and the checks for metrics, in the core part of the code. I think this could be difficult to maintain, especially if our plan is to add or delete metrics frequently. I know that we're generally opposed to using object-oriented approaches because they can be conceptually difficult, but this is a case where I think the code will actually be easier to follow if we use classes. I came up with something along these lines:

  • A metric superclass contains a class-specific table telling you which metrics are calculated and if so, what their values are
  • Each individual class inherits from the metric superclass (an easy template to follow) and has a constructor that checks whether that metric's value exists in the table; if not, it calculates its dependencies and then the metric itself
  • This allows you to use recursion to figure out dependencies, simplifying the core code to just specifying the top-level metrics you want
  • Modify [REF] Decision tree modularization #592 to iterate over the needed metrics and calculate them.
  • Would have to wipe out the metric values at the end of each component to avoid memory explosion, but that's easy to implement by simply pointing the metric to None, so the GC takes out the big variables
  • Maybe add a verbosity function which saves intermediates? Currently we're not saving some of these and it might be good to do so.

I would propose to do the work stated above if we decide it's a good idea. I'm open to other suggestions, but I am a little concerned about having to edit core code in order to add metrics, if that makes sense.
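The proposal above can be sketched minimally as follows (all names here are hypothetical, not tedana code): a base class resolves a metric's dependencies recursively before computing the metric itself, caching everything in a shared table that is cleared per component to keep memory bounded.

```python
class Metric:
    """Hypothetical base class; `values` is the shared table of computed metrics."""
    name = None
    dependencies = ()

    def compute(self, values):
        raise NotImplementedError

    def resolve(self, registry, values):
        # Skip anything already in the table; otherwise recursively
        # compute dependencies first, then this metric.
        if self.name in values:
            return values[self.name]
        for dep in self.dependencies:
            registry[dep].resolve(registry, values)
        values[self.name] = self.compute(values)
        return values[self.name]


class FT2(Metric):
    name = "f_t2"
    dependencies = ()

    def compute(self, values):
        return [4.0, 6.0]  # placeholder for a real F-statistic map


class Kappa(Metric):
    name = "kappa"
    dependencies = ("f_t2",)

    def compute(self, values):
        return sum(values["f_t2"]) / len(values["f_t2"])


registry = {cls.name: cls() for cls in (FT2, Kappa)}
values = {}
print(registry["kappa"].resolve(registry, values))  # prints 5.0; f_t2 computed first
values.clear()  # wipe per component so the GC can reclaim the big arrays
```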

Conflicts:
    tedana/tests/data/fiu_four_echo_outputs.txt
Resolved by removing some extra files and correctly renaming T1c->MIR.
@jbteves
Collaborator

jbteves commented Jan 25, 2021

@tsalo it looks like we end up losing several metric files. I talked to @handwerkerd and he thinks you chatted about that some time ago. Do you recall if you decided that it was okay to lose those metric files?
What would you think about adding an option to save used metrics for later review?
Lastly, would it be okay to push a few commits to this branch to bring it in line with master?

@stale stale bot removed the stale label Jan 25, 2021
@tsalo
Member Author

tsalo commented Jan 25, 2021

Sorry, what metric files?

Also yes, feel free to push changes.

@jbteves
Collaborator

jbteves commented Jan 25, 2021

Sorry for the ambiguity. In this branch, we're no longer writing the following:
[screenshot listing the metric files that are no longer written]

@tsalo
Member Author

tsalo commented Jan 25, 2021

Ah, I see. I believe all of those were only used for the figures in our documentation, but we can figure out how to output them again.

@jbteves
Collaborator

jbteves commented Jan 25, 2021

I don't see a need to do that right away. If we miss them we can bring them back. In the future, though, we should probably create an option to get metric maps.

Base automatically changed from master to main February 1, 2021 23:57
@jbteves
Collaborator

jbteves commented May 14, 2021

Alright, this is mostly in line with main. There is basically one hack that I had to make where we save the verbose outputs every time we collect the metrics, because the verbose outputs require a lot of information that doesn't live outside of that function, unlike the metrics which don't require echo-specific information after that step. This actually slows down the four-echo test a bit since that means each re-run requires a series of file writes. I can't think of a much better way of doing things without further modularization. However, this branch easily gets out of sync with main and I think it's better to get it in, since it touches so many parts of the code.
It additionally contains some modifications to RTD because for some reason our old configuration just stopped working. I copied the gist linked there into a javascript file to more closely match the setup specification on their documentation.
Additionally, unlike main I generate metric metadata from the component table rather than during metric creation because that ended up being a simpler way to handle the metadata while I was fixing the merge conflicts.

@jbteves
Collaborator

jbteves commented May 14, 2021

@tsalo and @handwerkerd do you mind taking a look?
Anyone else is welcome, but this is a very dense PR!

This was referenced May 18, 2021
@eurunuela
Collaborator

I have just compared the results with main and this PR on a motor task and a resting-state dataset. I used np.allclose() to check if all metrics were identical, and they were. I ran tedana with all the default options by the way.

Member

@handwerkerd handwerkerd left a comment


Looks ready to merge to me! I've looked through the code for clarity and run it on several distinct data sets and the results don't change between this PR & main!

metric_metadata["classification"] = {
"LongName": "Component classification",
"Description": (
"Classification from the manual classification procedure."
Member


Suggested change
"Classification from the manual classification procedure."
"Classification from the classification procedure."

The word manual was added here. Since this is the descriptor for any classification, I think saying it's just for the manual classification is incorrect.

@handwerkerd
Member

@emdupre & @eurunuela, Any chance one of you can give this a look-over so that we can approve this week? It would be really nice to just say this is merged, without any qualifiers, on the OHBM poster.

We could have me and @jbteves formally approve and @tsalo merge, but it would be nice to have a few more eyes. As far as I can tell, the resulting metric calculations are effectively unchanged. I've tested it on a few data sets with different voxel sizes, run lengths, and 3 vs 5 echoes. During my review, I saw some things that might be worth revisiting later, but were functional as is.

Please let us know if we should wait for your reviews or if you're ok with us merging.

@eurunuela
Collaborator

@emdupre & @eurunuela, Any chance one of you can give this a look-over so that we can approve this week? It would be really nice to just say this is merged, without any qualifiers, on the OHBM poster.

I'll review it before Saturday.

@emdupre
Member

emdupre commented Jun 9, 2021

I'll also aim to review this weekend, but if you have the necessary reviews don't feel compelled to wait on me !

@notZaki
Contributor

notZaki commented Jun 10, 2021

I compared this PR vs main on the 88 subjects in the Cambridge collection, using both tedpca='mdl' and tedpca='kundu'.
With mdl, the denoised data was exactly identical. With kundu, there were minor differences, but all were below 1e-7.

The pca_metrics.tsv file has more detail in the PR version. 👍
Edit: On the flip side, the PR branch is slightly slower and took ~20 minutes longer to process the 88 subjects with kundu. The time difference was smaller with mdl.

Collaborator

@eurunuela eurunuela left a comment


I'm happy with the changes. This is some excellent work you've done @jbteves @handwerkerd , thanks!

I only have very minor comments that can be addressed after this PR is merged.

"""
not_found = [k for k in requested_metrics if k not in dict_.keys()]
if not_found:
raise ValueError("Unknown metric(s): {}".format(", ".join(not_found)))
Collaborator


We should add a couple of unit tests to check all if statements here. I'd rather do it once we merge this PR.

print("Warning: {} not found".format(k))
required_metrics_new += new_metrics
if set(required_metrics) == set(required_metrics_new):
# There are no more parent metrics to calculate
Collaborator


How about a LGR.debug here?

RepLGR.info("The following metrics were calculated: {}.".format(", ".join(metrics)))

if not (data_cat.shape[0] == data_optcom.shape[0] == adaptive_mask.shape[0]):
raise ValueError(
Collaborator


Same as before, I think we need to make sure these errors are raised.

Comment on lines +113 to +156
METRIC_DEPENDENCIES = {
"kappa": ["map FT2", "map Z"],
"rho": ["map FS0", "map Z"],
"countnoise": ["map Z", "map Z clusterized"],
"countsigFT2": ["map FT2 clusterized"],
"countsigFS0": ["map FS0 clusterized"],
"dice_FT2": ["map beta T2 clusterized", "map FT2 clusterized"],
"dice_FS0": ["map beta S0 clusterized", "map FS0 clusterized"],
"signal-noise_t": ["map Z", "map Z clusterized", "map FT2"],
"variance explained": ["map optcom betas"],
"normalized variance explained": ["map weight"],
"d_table_score": [
"kappa",
"dice_FT2",
"signal-noise_t",
"countnoise",
"countsigFT2",
],
"map FT2": ["map Z", "mixing", "tes", "data_cat", "adaptive_mask"],
"map FS0": ["map Z", "mixing", "tes", "data_cat", "adaptive_mask"],
"map Z": ["map weight"],
"map weight": ["data_optcom", "mixing"],
"map optcom betas": ["data_optcom", "mixing"],
"map percent signal change": ["data_optcom", "map optcom betas"],
"map Z clusterized": ["map Z", "mask", "ref_img", "tes"],
"map FT2 clusterized": ["map FT2", "mask", "ref_img", "tes"],
"map FS0 clusterized": ["map FS0", "mask", "ref_img", "tes"],
"map beta T2 clusterized": [
"map FT2 clusterized",
"map optcom betas",
"countsigFT2",
"mask",
"ref_img",
"tes",
],
"map beta S0 clusterized": [
"map FS0 clusterized",
"map optcom betas",
"countsigFS0",
"mask",
"ref_img",
"tes",
],
}
Collaborator


Wouldn't it be more elegant to have this dictionary on a JSON file? I think it would make adding new metrics easier.
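One way this could look (a sketch only; the loader function and file name are hypothetical, and tedana does not currently ship such a JSON file):

```python
import json
import tempfile


def load_metric_dependencies(path):
    """Read the metric -> dependencies mapping from a JSON file.

    The file would contain exactly the METRIC_DEPENDENCIES dict above,
    e.g. {"kappa": ["map FT2", "map Z"], ...}.
    """
    with open(path) as fh:
        return json.load(fh)


# Round-trip demonstration with a temporary file standing in for the
# hypothetical packaged metric_dependencies.json
deps = {"kappa": ["map FT2", "map Z"], "rho": ["map FS0", "map Z"]}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as fh:
    json.dump(deps, fh)
    path = fh.name

assert load_metric_dependencies(path) == deps
```

Keeping the mapping in a data file would let contributors add a metric's dependencies without touching the resolver code.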

)

if "map percent signal change" in required_metrics:
LGR.info("Calculating percent signal change maps")
Collaborator


Same thing as before, we should add tests for the if statements.

Comment on lines +350 to +355
"A t-test was performed between the distributions of T2*-model "
"F-statistics associated with clusters (i.e., signal) and "
"non-cluster voxels (i.e., noise) to generate a z-statistic "
"(metric signal-noise_z) and p-value (metric signal-noise_p) "
"measuring relative association of the component to signal "
"over noise."
Collaborator


I think the wording in this sentence is a little complicated to follow.

Comment on lines +369 to +370
"The number of significant voxels not from clusters was "
"calculated for each component."
Collaborator


Wording

@jbteves
Collaborator

jbteves commented Jun 11, 2021

@eurunuela please also credit @tsalo as mostly we were adapting his branch.

@handwerkerd any strong feelings about addressing these changes now vs. later?

@eurunuela
Collaborator

@eurunuela please also credit @tsalo as mostly we were adapting his branch.

True! I had completely forgotten! Sorry about that @tsalo and thanks for your amazing work!

@jbteves
Collaborator

jbteves commented Jun 11, 2021

Since this is text I just want to clarify that I wasn't being short with you, I just want to make sure he gets due credit! It's a significant change that was very intricate.

@eurunuela
Collaborator

Since this is text I just want to clarify that I wasn't being short with you, I just want to make sure he gets due credit! It's a significant change that was very intricate.

Oh, don't worry! I'm actually grateful that you pointed it out :)

@handwerkerd
Member

These look like good changes, but they won't affect functionality and should be easy to do in a later PR. My preference would be to merge before Monday and open a new issue with these suggested changes.
If @emdupre might look at this over the weekend, I'd wait for her review. Otherwise, @tsalo, you want the honor of merging Sunday night?

required_metrics = required_metrics_new
escape_counter += 1
if escape_counter >= 10:
LGR.warning("dependency_resolver in infinite loop. Escaping early.")
Member


We should alert the user why this would happen with the LGR.
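One way to make that warning actionable (a sketch; `find_cycle` is a hypothetical helper, not part of tedana) would be to detect the dependency cycle explicitly and name the metrics involved instead of escaping after a fixed iteration count:

```python
def find_cycle(dependencies, start):
    """Return the first dependency cycle reachable from `start`, or None.

    `dependencies` maps each metric name to the names it depends on,
    in the same shape as the METRIC_DEPENDENCIES dict.
    """
    seen, stack = set(), []

    def visit(name):
        if name in stack:
            # Found a back-edge: return the cycle path for the log message
            return stack[stack.index(name):] + [name]
        if name in seen:
            return None
        seen.add(name)
        stack.append(name)
        for dep in dependencies.get(name, ()):
            cycle = visit(dep)
            if cycle:
                return cycle
        stack.pop()
        return None

    return visit(start)


deps = {"a": ["b"], "b": ["c"], "c": ["a"]}
print(find_cycle(deps, "a"))  # ['a', 'b', 'c', 'a']
```

The warning could then report the returned path, e.g. "metric dependencies form a cycle: a -> b -> c -> a", so the user knows which definitions to fix.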

Member

@emdupre emdupre left a comment


LGTM ! I would like to see a few more tests, but we can merge and then add those next ! 🚀

@handwerkerd
Member

@tsalo You willing to squash & merge today? We can then open another issue or two to start addressing all the comments here.

@tsalo
Member Author

tsalo commented Jun 14, 2021

I'll do it now then. Thanks.

@tsalo tsalo merged commit 2f338a2 into main Jun 14, 2021
@tsalo tsalo deleted the temp-modularization branch April 19, 2024 21:12
Labels
  • effort: high (More than 40h total work)
  • enhancement (issues describing possible enhancements to the project)
  • impact: high (Enhancement or functionality improvement that will affect most users)
  • priority: high (issues that would be really helpful if they were fixed already)
  • refactoring (issues proposing/requesting changes to the code which do not impact behavior)
Development

Successfully merging this pull request may close these issues.

Look for inefficiencies in the interaction between unmasking and spatial clustering
6 participants