configurable normalizations #68

edyoshikun · 2024-02-24T05:21:16Z

This PR adds configurable normalizations for both the source and the target channels, similar to the transformations/augmentations that are applied to the samples.
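
For readers who want a concrete picture, here is a minimal sketch of what "configurable normalizations" could look like. It is not the code from this PR. The class `NormalizeChannel`, the helper `compose`, the `norm_meta` field, and the channel names `Phase` and `Nuclei` are all hypothetical and exist only for illustration. The idea is that each normalization is declared as data (which channels, which statistics level, which statistic to subtract and divide by) and the declared list is applied in order, just like a list of augmentations.

```python
from dataclasses import dataclass


@dataclass
class NormalizeChannel:
    """Hypothetical transform: (x - subtrahend) / divisor for selected channels."""

    keys: list
    level: str = "dataset_statistics"  # assumed alternative: "fov_statistics"
    subtrahend: str = "mean"
    divisor: str = "std"

    def __call__(self, sample: dict) -> dict:
        out = dict(sample)  # shallow copy so the input sample is left untouched
        for key in self.keys:
            # Statistics are assumed to be precomputed and stored with the sample.
            stats = sample["norm_meta"][key][self.level]
            sub, div = stats[self.subtrahend], stats[self.divisor]
            out[key] = [(v - sub) / div for v in sample[key]]
        return out


def compose(transforms):
    """Apply a list of transforms in order; an empty list is a no-op."""

    def apply(sample):
        for transform in transforms:
            sample = transform(sample)
        return sample

    return apply


# Declared like a config: source and target get different normalizations.
normalizations = [
    NormalizeChannel(keys=["Phase"], level="fov_statistics"),
    NormalizeChannel(
        keys=["Nuclei"],
        level="dataset_statistics",
        subtrahend="median",
        divisor="iqr",
    ),
]

sample = {
    "Phase": [1.0, 2.0, 3.0],
    "Nuclei": [10.0, 20.0, 30.0],
    "norm_meta": {
        "Phase": {"fov_statistics": {"mean": 2.0, "std": 1.0}},
        "Nuclei": {"dataset_statistics": {"median": 20.0, "iqr": 10.0}},
    },
}
normalized = compose(normalizations)(sample)
```

Because `compose([])` returns the sample unchanged, an empty list works as a "no normalization" default without special-casing `None`.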


edyoshikun · 2024-02-24T19:03:37Z

@ziw-liu I just tested this with a new dataset using the preprocessing, and the normalizations can be applied.
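
As a rough illustration of the preprocessing step mentioned here, the sketch below computes per-FOV and dataset-wide statistics for each channel and attaches both to every FOV, so a normalization can pick either level later. This is an assumption about the general approach, not the implementation in `viscy/utils/meta_utils.py`. The function names `summarize` and `preprocess`, the FOV names, and the metadata layout are hypothetical, and pooling every pixel value is a simplification.

```python
import statistics


def summarize(values):
    """Summary statistics for one channel (needs at least two values)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),
        "median": statistics.median(values),
        "iqr": q3 - q1,
    }


def preprocess(fovs):
    """fovs maps FOV name -> channel name -> list of pixel values.

    Returns per-FOV metadata holding both FOV-level and dataset-level
    statistics for every channel.
    """
    channels = sorted({ch for fov in fovs.values() for ch in fov})
    dataset_stats = {}
    for ch in channels:
        # Pool values across all FOVs that contain this channel.
        pooled = [v for fov in fovs.values() for v in fov.get(ch, [])]
        dataset_stats[ch] = summarize(pooled)
    return {
        name: {
            ch: {
                "fov_statistics": summarize(pixels),
                "dataset_statistics": dataset_stats[ch],
            }
            for ch, pixels in fov.items()
        }
        for name, fov in fovs.items()
    }


# Two hypothetical FOVs with a single channel each.
fovs = {
    "A/1/0": {"Phase": [0.0, 2.0, 4.0]},
    "A/1/1": {"Phase": [6.0, 8.0, 10.0]},
}
meta = preprocess(fovs)
```

Storing the dataset-level statistics alongside each FOV duplicates a few numbers, but it lets a dataset class normalize a sample using only that FOV's metadata.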

* refactor data loading into its own module
* update type annotations
* move the logging module out
* move old logging into utils
* rename tests to match module name
* bump torch
* draft fcmae encoder
* add stem to the encoder
* wip: masked stem layernorm
* wip: patchify masked features for linear
* use mlp from timm
* hack: POC training script for FCMAE
* fix mask for fitting
* remove training script
* default architecture
* fine-tuning options
* fix cli for finetuning
* draft combined data module
* fix import
* manual validation loss reduction
* update linting new black version has different rules
* update development guide
* update type hints
* bump iohub
* draft ctmc v1 dataset
* update tests
* move test_data
* remove path conversion
* configurable normalizations (#68)
  * inital commit adding the normalization.
  * adding dataset_statistics to each fov to facilitate the configurable augmentations
  * fix indentation
  * ruff
  * test preprocessing
  * remove redundant field
  * cleanup
  ---------
  Co-authored-by: Ziwen Liu <[email protected]>
* fix ctmc dataloading
* add example ctmc v1 loading script
* changing the normalization and augmentations default from None to empty list.
* invert intensity transform
* concatenated data module
* subsample videos
* livecell dataset
* all sample fields are optional
* fix multi-dataloader validation
* lint
* fixing preprocessing for varying array shapes (i.e aics dataset)
* update loading scripts
* fix CombineMode
* compose normalizations for predict and test stages
* black
* fix normalization in example config
* fix collate when multi-sample transform is not used
* ddp caching fixes
* fix caching when using combined loader
* move log values to GPU before syncing Lightning-AI/pytorch-lightning#18803
* removing normalize_source from configs.
* typing fixes
* fix test data path
* fix test dataset
* add docstring for ConcatDataModule
* format

---------
Co-authored-by: Eduardo Hirata-Miyasaki <[email protected]>

* refactor data loading into its own module
* update type annotations
* move the logging module out
* move old logging into utils
* rename tests to match module name
* bump torch
* draft fcmae encoder
* add stem to the encoder
* wip: masked stem layernorm
* wip: patchify masked features for linear
* use mlp from timm
* hack: POC training script for FCMAE
* fix mask for fitting
* remove training script
* default architecture
* fine-tuning options
* fix cli for finetuning
* draft combined data module
* fix import
* manual validation loss reduction
* update linting new black version has different rules
* update development guide
* update type hints
* bump iohub
* draft ctmc v1 dataset
* update tests
* move test_data
* remove path conversion
* configurable normalizations (#68)
  * inital commit adding the normalization.
  * adding dataset_statistics to each fov to facilitate the configurable augmentations
  * fix indentation
  * ruff
  * test preprocessing
  * remove redundant field
  * cleanup
  ---------
  Co-authored-by: Ziwen Liu <[email protected]>
* fix ctmc dataloading
* add example ctmc v1 loading script
* changing the normalization and augmentations default from None to empty list.
* invert intensity transform
* concatenated data module
* subsample videos
* livecell dataset
* all sample fields are optional
* fix multi-dataloader validation
* lint
* fixing preprocessing for varying array shapes (i.e aics dataset)
* update loading scripts
* fix CombineMode
* always use untrainable head for FCMAE
* move log values to GPU before syncing Lightning-AI/pytorch-lightning#18803
* custom head
* ddp caching fixes
* fix caching when using combined loader
* compose normalizations for predict and test stages
* black
* fix normalization in example config
* fix normalization in example config
* prefetch more in validation
* fix collate when multi-sample transform is not used
* ddp caching fixes
* fix caching when using combined loader
* typing fixes
* fix test dataset
* fix invert transform
* add ddp prepare flag for combined data module
* remove redundant operations
* filter empty detections
* pass trainer to underlying data modules in concatenated
* hack: add test dataloader for LiveCell dataset
* test datasets for livecell and ctmc
* fix merge error
* fix merge error
* fix mAP default for over 100 detections
* bump torchmetric
* fix combined loader training for virtual staining task
* fix non-combined data loader training
* add fcmae to graph script
* fix type hint
* format
* add back convolutiuon option for fcmae head

---------
Co-authored-by: Eduardo Hirata-Miyasaki <[email protected]>


…bel-free images (#70)

* refactor data loading into its own module
* update type annotations
* move the logging module out
* move old logging into utils
* rename tests to match module name
* bump torch
* draft fcmae encoder
* add stem to the encoder
* wip: masked stem layernorm
* wip: patchify masked features for linear
* use mlp from timm
* hack: POC training script for FCMAE
* fix mask for fitting
* remove training script
* default architecture
* fine-tuning options
* fix cli for finetuning
* draft combined data module
* fix import
* manual validation loss reduction
* update linting new black version has different rules
* update development guide
* update type hints
* bump iohub
* draft ctmc v1 dataset
* update tests
* move test_data
* remove path conversion
* configurable normalizations (#68)
  * inital commit adding the normalization.
  * adding dataset_statistics to each fov to facilitate the configurable augmentations
  * fix indentation
  * ruff
  * test preprocessing
  * remove redundant field
  * cleanup
  ---------
  Co-authored-by: Ziwen Liu <[email protected]>
* fix ctmc dataloading
* add example ctmc v1 loading script
* changing the normalization and augmentations default from None to empty list.
* invert intensity transform
* concatenated data module
* subsample videos
* livecell dataset
* all sample fields are optional
* fix multi-dataloader validation
* lint
* fixing preprocessing for varying array shapes (i.e aics dataset)
* update loading scripts
* fix CombineMode
* added model and annotation code draft
* chnaged to simple unet model
* start with lesser augmentations
* added readme file
* added tensorboard logging
* added validation step
* chnaged to viscy 2d unet
* used crossentropyloss with one-hot encoding
* added sample image logging
* attempt to build magicgui annotation
* renamed infection annotation tool
* added normalization and augmentations
* added model testing code
* removed annotation refiner
* corrected conversion of class to int
* corrected prediction module
* cleaned up the code and comments for the LightningUNet
* removed confusion matrix code, finding runtime error with model
* moved scripts to viscy.scripts.infection_phenotyping module to enable imports across scripts
* combine the lightning modules for training and prediction, fix the DDP exception
* all the stubs for computing and logging confusion matrix per cell
* separated training and test scripts
* lightning module
* corrected test cm compute
* corrected test module
* separated test and prediction scripts
* changed confusion matrix compute
* fix merge error
* split 2D and 2.5D model scripts
* added covnext script
* fix model input parameter
* update input file
* add augmentations
* refactor infection_classification code to viscy/applications
* changes made for BJ5 classification
* format code
* add explicit packaging list
* rename testing script
* update readme
* move function to preprocessing
* format code
* formatting
* histogram with dask
* fix index and test
* fix import
* black
* fix float comp
* clean up headers
* clean up import
* add argument to change number of classes

---------
Co-authored-by: Ziwen Liu <[email protected]>
Co-authored-by: Eduardo Hirata-Miyasaki <[email protected]>
Co-authored-by: Shalin Mehta <[email protected]>
Co-authored-by: Ziwen Liu <[email protected]>

inital commit adding the normalization.

74fa77c

ziw-liu reviewed Feb 24, 2024


examples/configs/fit_example.yml (outdated, resolved)

edyoshikun added 3 commits February 23, 2024 21:40

adding dataset_statistics to each fov to facilitate the configurable augmentations

9ed7cd0

fix indentation

529340d

ruff

61f9a9f

ziw-liu reviewed Feb 24, 2024


viscy/utils/meta_utils.py (outdated, resolved)

ziw-liu added 4 commits February 23, 2024 23:57

Merge branch 'fcmae' into normalization_roi

154ad31

test preprocessing

0a15f31

remove redundant field

c13400c

cleanup

e215bc6

edyoshikun marked this pull request as ready for review February 24, 2024 18:41

ziw-liu approved these changes Feb 26, 2024


ziw-liu merged commit 74e7db3 into fcmae Feb 26, 2024
3 checks passed

ziw-liu deleted the normalization_roi branch February 26, 2024 17:31
