configurable normalizations #68

Merged: 8 commits merged into fcmae from normalization_roi on Feb 26, 2024

Conversation

edyoshikun (Contributor)

This PR adds configurable normalizations for both the source and target channels, similar to the transformations/augmentations that are applied to the samples.
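
For illustration, a minimal sketch of what this enables. The transform and argument names below are assumptions inferred from this PR's description, not quoted from the API:

```python
# Sketch only: `NormalizeSampled` and its arguments are assumptions inferred
# from this PR, which lets normalizations be declared like the augmentations.
from viscy.data.hcs import HCSDataModule
from viscy.transforms import NormalizeSampled

data = HCSDataModule(
    data_path="plate.zarr",  # hypothetical OME-Zarr dataset
    source_channel="Phase",
    target_channel=["Nuclei"],
    z_window_size=1,
    batch_size=32,
    normalizations=[
        NormalizeSampled(
            keys=["Phase"],
            level="fov_statistics",  # or "dataset_statistics"
            subtrahend="mean",
            divisor="std",
        )
    ],
    augmentations=[],  # augmentations are configured the same way
)
```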

Review thread on viscy/utils/meta_utils.py (outdated, resolved)
edyoshikun marked this pull request as ready for review on February 24, 2024 18:41
edyoshikun (Contributor, Author)

@ziw-liu just tested with a new dataset using the preprocessing, and we can do the normalizations.
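
The preprocessing step mentioned here is what computes the statistics that the normalizations read. A rough sketch of the idea, assuming a helper in viscy/utils/meta_utils.py (the file reviewed above); the function name and signature are assumptions for illustration only:

```python
# Assumed helper (name and signature are illustrative, not verbatim API):
# scan each FOV, compute intensity statistics, and write per-FOV and
# dataset-level values into the OME-Zarr metadata for lookup at load time.
from viscy.utils.meta_utils import generate_normalization_metadata

generate_normalization_metadata(
    "plate.zarr",  # hypothetical dataset path
    num_workers=4,
    channel_names=["Phase", "Nuclei"],
)
```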

ziw-liu merged commit 74e7db3 into fcmae on Feb 26, 2024
3 checks passed
ziw-liu deleted the normalization_roi branch on February 26, 2024 17:31
ziw-liu added a commit that referenced this pull request Apr 8, 2024
* refactor data loading into its own module

* update type annotations

* move the logging module out

* move old logging into utils

* rename tests to match module name

* bump torch

* draft fcmae encoder

* add stem to the encoder

* wip: masked stem layernorm

* wip: patchify masked features for linear

* use mlp from timm

* hack: POC training script for FCMAE

* fix mask for fitting

* remove training script

* default architecture

* fine-tuning options

* fix cli for finetuning

* draft combined data module

* fix import

* manual validation loss reduction

* update linting
new black version has different rules

* update development guide

* update type hints

* bump iohub

* draft ctmc v1 dataset

* update tests

* move test_data

* remove path conversion

* configurable normalizations (#68)

* initial commit adding the normalization.

* adding dataset_statistics to each fov to facilitate the configurable augmentations

* fix indentation

* ruff

* test preprocessing

* remove redundant field

* cleanup

---------

Co-authored-by: Ziwen Liu <[email protected]>

* fix ctmc dataloading

* add example ctmc v1 loading script

* change the normalization and augmentation defaults from None to an empty list.

* invert intensity transform

* concatenated data module

* subsample videos

* livecell dataset

* all sample fields are optional

* fix multi-dataloader validation

* lint

* fix preprocessing for varying array shapes (e.g. the AICS dataset)

* update loading scripts

* fix CombineMode

* compose normalizations for predict and test stages

* black

* fix normalization in example config

* fix collate when multi-sample transform is not used

* ddp caching fixes

* fix caching when using combined loader

* move log values to GPU before syncing (see the sketch after this log)
Lightning-AI/pytorch-lightning#18803

* removing normalize_source from configs.

* typing fixes

* fix test data path

* fix test dataset

* add docstring for ConcatDataModule

* format

---------

Co-authored-by: Eduardo Hirata-Miyasaki <[email protected]>
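
The "move log values to GPU before syncing" entry above works around Lightning-AI/pytorch-lightning#18803: with the NCCL backend, requesting a cross-rank sync on a CPU-resident value fails. A minimal sketch of the pattern, with an illustrative module and loss:

```python
import torch.nn.functional as F
from lightning.pytorch import LightningModule

class VSModule(LightningModule):  # illustrative module name
    def validation_step(self, batch, batch_idx):
        loss = F.mse_loss(batch["pred"], batch["target"])  # hypothetical fields
        # NCCL cannot reduce CPU tensors (Lightning-AI/pytorch-lightning#18803),
        # so move the value onto this module's device before syncing ranks.
        self.log("loss/validate", loss.to(self.device), sync_dist=True)
```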
ziw-liu added a commit that referenced this pull request Jun 11, 2024
* refactor data loading into its own module

* update type annotations

* move the logging module out

* move old logging into utils

* rename tests to match module name

* bump torch

* draft fcmae encoder

* add stem to the encoder

* wip: masked stem layernorm

* wip: patchify masked features for linear

* use mlp from timm

* hack: POC training script for FCMAE

* fix mask for fitting

* remove training script

* default architecture

* fine-tuning options

* fix cli for finetuning

* draft combined data module

* fix import

* manual validation loss reduction

* update linting
new black version has different rules

* update development guide

* update type hints

* bump iohub

* draft ctmc v1 dataset

* update tests

* move test_data

* remove path conversion

* configurable normalizations (#68)

* initial commit adding the normalization.

* adding dataset_statistics to each fov to facilitate the configurable augmentations

* fix indentation

* ruff

* test preprocessing

* remove redundant field

* cleanup

---------

Co-authored-by: Ziwen Liu <[email protected]>

* fix ctmc dataloading

* add example ctmc v1 loading script

* change the normalization and augmentation defaults from None to an empty list.

* invert intensity transform

* concatenated data module

* subsample videos

* livecell dataset

* all sample fields are optional

* fix multi-dataloader validation

* lint

* fix preprocessing for varying array shapes (e.g. the AICS dataset)

* update loading scripts

* fix CombineMode

* always use untrainable head for FCMAE

* move log values to GPU before syncing
Lightning-AI/pytorch-lightning#18803

* custom head

* ddp caching fixes

* fix caching when using combined loader

* compose normalizations for predict and test stages

* black

* fix normalization in example config

* fix normalization in example config

* prefetch more in validation

* fix collate when multi-sample transform is not used

* ddp caching fixes

* fix caching when using combined loader

* typing fixes

* fix test dataset

* fix invert transform

* add ddp prepare flag for combined data module

* remove redundant operations

* filter empty detections

* pass trainer to underlying data modules in concatenated

* hack: add test dataloader for LiveCell dataset

* test datasets for livecell and ctmc

* fix merge error

* fix merge error

* fix mAP default for over 100 detections (see the sketch after this log)

* bump torchmetric

* fix combined loader training for virtual staining task

* fix non-combined data loader training

* add fcmae to graph script

* fix type hint

* format

* add back convolution option for fcmae head

---------

Co-authored-by: Eduardo Hirata-Miyasaki <[email protected]>
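
The "fix mAP default for over 100 detections" entry refers to the COCO convention baked into torchmetrics: mean average precision is scored on at most 100 detections per image by default, which undercounts crowded microscopy FOVs. A sketch of raising the cap (the threshold values here are illustrative, not the ones used in this commit):

```python
from torchmetrics.detection import MeanAveragePrecision

# torchmetrics defaults to max_detection_thresholds=[1, 10, 100] (COCO),
# capping evaluation at 100 detections per image; raise the last threshold
# for dense scenes. The values below are illustrative.
map_metric = MeanAveragePrecision(max_detection_thresholds=[1, 100, 10000])
```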
edyoshikun added a commit that referenced this pull request Jun 12, 2024
edyoshikun added a commit that referenced this pull request Jun 12, 2024
edyoshikun added a commit that referenced this pull request Jun 12, 2024
edyoshikun added a commit that referenced this pull request Jun 12, 2024
edyoshikun added a commit that referenced this pull request Jun 12, 2024
edyoshikun added a commit that referenced this pull request Jun 12, 2024
edyoshikun added a commit that referenced this pull request Jun 18, 2024
ziw-liu added a commit that referenced this pull request Jul 10, 2024
…bel-free images (#70)

* refactor data loading into its own module

* update type annotations

* move the logging module out

* move old logging into utils

* rename tests to match module name

* bump torch

* draft fcmae encoder

* add stem to the encoder

* wip: masked stem layernorm

* wip: patchify masked features for linear

* use mlp from timm

* hack: POC training script for FCMAE

* fix mask for fitting

* remove training script

* default architecture

* fine-tuning options

* fix cli for finetuning

* draft combined data module

* fix import

* manual validation loss reduction

* update linting
new black version has different rules

* update development guide

* update type hints

* bump iohub

* draft ctmc v1 dataset

* update tests

* move test_data

* remove path conversion

* configurable normalizations (#68)

* initial commit adding the normalization.

* adding dataset_statistics to each fov to facilitate the configurable augmentations

* fix indentation

* ruff

* test preprocessing

* remove redundant field

* cleanup

---------

Co-authored-by: Ziwen Liu <[email protected]>

* fix ctmc dataloading

* add example ctmc v1 loading script

* change the normalization and augmentation defaults from None to an empty list.

* invert intensity transform

* concatenated data module

* subsample videos

* livecell dataset

* all sample fields are optional

* fix multi-dataloader validation

* lint

* fix preprocessing for varying array shapes (e.g. the AICS dataset)

* update loading scripts

* fix CombineMode

* added model and annotation code draft

* changed to simple UNet model

* start with fewer augmentations

* added readme file

* added tensorboard logging

* added validation step

* changed to viscy 2D UNet

* used CrossEntropyLoss with one-hot encoding (see the sketch after this log)

* added sample image logging

* attempt to build magicgui annotation

* renamed infection annotation tool

* added normalization and augmentations

* added model testing code

* removed annotation refiner

* corrected conversion of class to int

* corrected prediction module

* cleaned up the code and comments for the LightningUNet

* removed confusion matrix code, finding runtime error with model

* moved scripts to viscy.scripts.infection_phenotyping module to enable imports across scripts

* combine the lightning modules for training and prediction, fix the DDP exception

* all the stubs for computing and logging confusion matrix per cell

* separated training and test scripts

* lightning module

* corrected test cm compute

* corrected test module

* separated test and prediction scripts

* changed confusion matrix compute

* fix merge error

* split 2D and 2.5D model scripts

* added ConvNeXt script

* fix model input parameter

* update input file

* add augmentations

* refactor infection_classification code to viscy/applications

* changes made for BJ5 classification

* format code

* add explicit packaging list

* rename testing script

* update readme

* move function to preprocessing

* format code

* formatting

* histogram with dask

* fix index and test

* fix import

* black

* fix float comp

* clean up headers

* clean up import

* add argument to change number of classes

---------

Co-authored-by: Ziwen Liu <[email protected]>
Co-authored-by: Eduardo Hirata-Miyasaki <[email protected]>
Co-authored-by: Shalin Mehta <[email protected]>
Co-authored-by: Ziwen Liu <[email protected]>
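
The "used CrossEntropyLoss with one-hot encoding" entry can be illustrated as follows: since PyTorch 1.10, cross_entropy also accepts class probabilities as the target, so a one-hot class map works directly (the shapes below are illustrative):

```python
import torch
from torch.nn.functional import cross_entropy, one_hot

logits = torch.randn(4, 3, 64, 64)         # (batch, classes, H, W)
labels = torch.randint(0, 3, (4, 64, 64))  # integer class index per pixel
# one_hot appends the class dim last; cross_entropy expects it after batch.
target = one_hot(labels, num_classes=3).permute(0, 3, 1, 2).float()
loss = cross_entropy(logits, target)       # probabilistic-target form
```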