This repository is for sharing code related to DynaMorph: self-supervised learning of morphodynamic states of live cells. The related preprint is here.
We summarize the components of the DynaMorph pipeline and the structure of this repository below.
DynaMorph is developed and tested under Python 3.7. The packages below are required.
For U-Net segmentation:
- TensorFlow == 2.1
- segmentation-models == 1.0.1
For preprocessing, patching, latent-space encoding, and latent-space training
To install this codebase, you must first install git (see getting-started-installing-git).
Then you can clone this repository onto your computer:
git clone https://github.com/czbiohub/dynamorph.git
If you are missing a particular dependency (listed in /requirements/default.txt), you can install it using pip:
pip install <missing package>
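To find out which dependencies are missing before running pip, a small check along the following lines may help. This is a minimal sketch and not part of the repository: the helper names `parse_requirement_names` and `find_missing` are hypothetical, the example text is a stand-in for the real requirements file, and the check assumes that each package is imported under its own name, which does not hold for every package.

```python
import importlib.util
import re


def parse_requirement_names(text):
    """Return package names from requirements-style text, dropping comments and version pins."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue  # skip blanks, comments and pip options such as "-r other.txt"
        # Keep everything before the first version specifier, extras marker or space.
        name = re.split(r"[<>=!~\[; ]", line, maxsplit=1)[0]
        if name:
            names.append(name)
    return names


def find_missing(names):
    """Return the names that cannot be located as importable modules.

    Assumption: the import name is the package name with '-' replaced by '_'.
    """
    missing = []
    for name in names:
        module = name.replace("-", "_")
        try:
            found = importlib.util.find_spec(module) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


# Hypothetical contents, not the repository's actual requirements file.
example = """
json
definitely-not-installed-pkg==1.0
"""
print(find_missing(parse_requirement_names(example)))
```

To check the real file, read /requirements/default.txt from the cloned repository and pass its text to `parse_requirement_names`.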
Example data to test the pipeline can be downloaded here.
The pipeline is currently in beta and we are actively working to make the pipeline easy to use. If you encounter any bugs, please report them via Issues on this repository.
DynaMorph utilizes a broad set of deep learning and machine learning tools to analyze cell imaging data. The pipeline folder contains wrapper methods for easy access to DynaMorph's functionality.
We also maintain example scripts (run_preproc.py, run_segmentation.py, run_patch.py and run_VAE.py) to facilitate parallelization of data processing.
See the sections below for the functionality this repository provides.
DynaMorph starts with raw image files from cell imaging experiments and sequentially applies a set of segmentation and encoding tools. Below we briefly introduce the main processing steps.
Start from any microscopy data (panel A); the file format is .tif, either single-page series or multi-page stacks acquired with Micro-Manager. In the DynaMorph paper, we used Phase and Retardance images measured with Quantitative Label-Free Imaging with Phase and Polarization microscopy as the input. Then use a segmentation model of your choice to generate semantic segmentation maps from the input (panel C).
Instance segmentation in this work is based on clustering; related methods can be found in SingleCellPatch/extract_patches.py. Cell tracking methods can be found in SingleCellPatch/generate_trajectories.py.
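To make the idea of turning a probability map into cell instances concrete, the sketch below groups foreground pixels by spatial adjacency (4-connected components). This is a deliberately simple stand-in for clustering, not the method in SingleCellPatch/extract_patches.py, which may differ. The function name `label_instances`, the 0.5 threshold and the synthetic input are all assumptions made for illustration.

```python
from collections import deque

import numpy as np


def label_instances(foreground_prob, threshold=0.5):
    """Assign an integer ID to each 4-connected group of foreground pixels."""
    mask = foreground_prob > threshold
    labels = np.zeros(mask.shape, dtype=np.int32)  # 0 means background
    height, width = mask.shape
    next_id = 0
    for row, col in zip(*np.nonzero(mask)):
        if labels[row, col]:
            continue  # already assigned to an earlier group
        next_id += 1
        labels[row, col] = next_id
        queue = deque([(row, col)])
        while queue:  # breadth-first flood fill over neighbouring foreground pixels
            r, c = queue.popleft()
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                inside = 0 <= nr < height and 0 <= nc < width
                if inside and mask[nr, nc] and not labels[nr, nc]:
                    labels[nr, nc] = next_id
                    queue.append((nr, nc))
    return labels


# Synthetic probability map with two separate "cells".
prob = np.zeros((6, 6))
prob[0:2, 0:2] = 0.9
prob[4:6, 3:6] = 0.8
labels = label_instances(prob)
print(labels.max())  # number of instances found: 2
```

Touching cells would be merged by this approach, which is one reason a real pipeline is likely to need a more careful clustering step.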
To generate segmentation and tracking from scratch, follow the steps below:
2. (optional) train a segmentation model to provide per-pixel class probabilities, see scripts in NNsegmentation/run.py
3. prepare inputs as 5-D numpy arrays of shape (Ntime frames, Nchannels, Nslices, height, width), see run_preproc.py for an example
4. apply the trained segmentation model for semantic segmentation, see method pipeline.segmentation.segmentation or run_segmentation.py
5. use the predicted class probability maps for instance segmentation, see method pipeline.segmentation.instance_segmentation or run_segmentation.py
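The 5-D layout required in step 3 can be illustrated as follows. This is a minimal sketch using random data in place of real images. The helper `to_5d` and the sizes are hypothetical, and the sketch assumes a single z-slice per frame; see run_preproc.py for how the repository actually assembles its inputs.

```python
import numpy as np


def to_5d(frames):
    """Stack nested [time][channel] 2-D images into (T, C, Z, H, W) with Z = 1."""
    stack = np.stack([np.stack(channels, axis=0) for channels in frames], axis=0)
    return stack[:, :, np.newaxis, :, :]  # insert a singleton slice axis


# Hypothetical sizes, chosen only for illustration.
n_frames, n_channels, height, width = 3, 2, 64, 64
rng = np.random.default_rng(0)

# Stand-ins for per-frame, per-channel images (for example Phase and Retardance).
frames = [
    [rng.random((height, width)) for _ in range(n_channels)]
    for _ in range(n_frames)
]

arr = to_5d(frames)
print(arr.shape)  # (3, 2, 1, 64, 64)
```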
DynaMorph uses a VQ-VAE to encode and reconstruct cell image patches; the resulting latent vectors are used as morphology descriptors.
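For readers unfamiliar with VQ-VAE, the sketch below shows the vector-quantization step that gives the model its name: each encoder output is replaced by its nearest entry in a learned codebook. It illustrates the general technique only and is not DynaMorph's model code. The function name `quantize`, the toy codebook and the 2-D latents are assumptions made for illustration.

```python
import numpy as np


def quantize(latents, codebook):
    """Map each latent vector to its nearest codebook entry (Euclidean distance)."""
    # Pairwise squared distances, shape (n_latents, n_codes).
    diffs = latents[:, np.newaxis, :] - codebook[np.newaxis, :, :]
    sq_dist = np.sum(diffs ** 2, axis=-1)
    indices = np.argmin(sq_dist, axis=1)
    return indices, codebook[indices]


# Toy codebook with three 2-D entries and three encoder outputs.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]])
latents = np.array([[0.1, -0.2], [0.9, 1.2], [-0.8, 1.7]])

indices, quantized = quantize(latents, codebook)
print(indices)  # [0 1 2]
```

In a trained model the codebook is learned jointly with the encoder and decoder; here it is fixed by hand.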
To extract single cell patches and employ morphology encoding, follow the steps below:
6. extract cell patches based on instance segmentation, see method pipeline.patch_VAE.extract_patches or run_patch.py -m 'extract_patches'
7. extract cell trajectories based on instance segmentation, see method pipeline.patch_VAE.extract_patches or run_patch.py -m 'build_trajectories'
9. assemble cell patches generated from step 7 to model-compatible datasets, see method pipeline.patch_VAE.assemble_VAE or run_VAE.py -m 'assemble'
10. generate latent representations for cell patches using trained VAE models, see method pipeline.patch_VAE.process_VAE or run_VAE.py -m 'process'
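The patch extraction in step 6 can be pictured as cropping a fixed-size window around each segmented cell. The sketch below is one simple way to do this and is not the implementation behind pipeline.patch_VAE.extract_patches. The helper `extract_patch`, the patch size and the zero-padding at image borders are assumptions made for illustration.

```python
import numpy as np


def extract_patch(image, center, size):
    """Crop a size x size window centered on (row, col), zero-padding at the borders."""
    half = size // 2
    padded = np.pad(image, half, mode="constant")
    row, col = center
    # Padding shifts every pixel by `half`, so the window around (row, col)
    # starts at index (row, col) in the padded image.
    return padded[row:row + size, col:col + size]


# Synthetic image and instance label map containing one "cell" with ID 1.
image = np.arange(64, dtype=float).reshape(8, 8)
labels = np.zeros((8, 8), dtype=int)
labels[1:4, 1:4] = 1

# Centroid of the cell, rounded to the nearest pixel.
centroid = tuple(np.argwhere(labels == 1).mean(axis=0).round().astype(int))
patch = extract_patch(image, centroid, size=5)
print(patch.shape)  # (5, 5)
```

Because of the padding, cells near the image edge still produce full-size patches, which keeps the model input shape uniform.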
The dataset accompanying this repository is large and currently available upon request for demonstration.
The scripts run_preproc.py, run_segmentation.py, run_patch.py, run_VAE.py and run_training.py provide a command-line interface to run each module. For details, run a script with the -h option.
Each CLI requires a configuration file (.yaml format) that contains parameters for each stage. Please see the example: configs/config_example.yml
To run the dynamorph pipeline, data should first be assembled into 5-D numpy arrays (step 3).
Semantic segmentation (step 4) and instance segmentation (step 5):
python run_segmentation.py -m "segmentation" -c <path-to-your-config-yaml>
python run_segmentation.py -m "instance_segmentation" -c <path-to-your-config-yaml>
Extract patches from segmentation results (step 6), then connect them into trajectories (step 7):
python run_patch.py -m "extract_patches" -c <path-to-your-config-yaml>
python run_patch.py -m "build_trajectories" -c <path-to-your-config-yaml>
Train a DNN model (VQ-VAE) to learn a representation of your image data (step 8):
python run_training.py -c <path-to-your-config-yaml>
Transform image patches into the VQ-VAE latent space by running inference (steps 9 and 10):
python run_VAE.py -m "assemble" -c <path-to-your-config-yaml>
python run_VAE.py -m "process" -c <path-to-your-config-yaml>
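Steps 4 through 10 above can also be chained from a small Python driver. The following is a sketch rather than a script shipped with the repository: `build_commands` and `run_all` are hypothetical helpers, and the sketch assumes that a single configuration file serves every stage and that it is run from the repository root. Only the script names and the -m and -c flags shown above are used.

```python
import subprocess
import sys

# (script, mode) pairs in the order listed above; run_training.py is called without -m.
STAGES = [
    ("run_segmentation.py", "segmentation"),
    ("run_segmentation.py", "instance_segmentation"),
    ("run_patch.py", "extract_patches"),
    ("run_patch.py", "build_trajectories"),
    ("run_training.py", None),
    ("run_VAE.py", "assemble"),
    ("run_VAE.py", "process"),
]


def build_commands(config_path, python=sys.executable):
    """Return one argument list per pipeline stage."""
    commands = []
    for script, mode in STAGES:
        cmd = [python, script]
        if mode is not None:
            cmd += ["-m", mode]
        cmd += ["-c", config_path]
        commands.append(cmd)
    return commands


def run_all(config_path):
    """Run every stage in order, stopping at the first failure."""
    for cmd in build_commands(config_path):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)


# Print the commands without executing them.
for cmd in build_commands("configs/config_example.yml"):
    print(" ".join(cmd))
```

Calling `run_all` with the path to your own configuration file executes the stages; `check=True` makes a failing stage raise an error so that later stages do not run on incomplete outputs.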
Reduce the dimension of latent vectors for visualization by fitting a PCA or UMAP model to the data. For PCA:
python run_dim_reduction.py -m "pca" -c <path-to-your-config-yaml>
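As a rough picture of what fitting PCA to latent vectors involves, the sketch below projects vectors onto their top principal components using a singular value decomposition. It is a generic PCA on random data, not the code behind run_dim_reduction.py. The function name `pca_project`, the latent dimension of 16 and the sample count are assumptions made for illustration.

```python
import numpy as np


def pca_project(vectors, n_components=2):
    """Project row vectors onto their top principal components."""
    centered = vectors - vectors.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values ** 2 / np.sum(singular_values ** 2)
    return centered @ vt[:n_components].T, explained[:n_components]


# Random stand-ins for latent vectors, with one deliberately dominant direction.
rng = np.random.default_rng(0)
latents = rng.normal(size=(200, 16))
latents[:, 0] *= 10.0

projected, explained = pca_project(latents)
print(projected.shape)  # (200, 2)
```

The second return value gives the fraction of variance captured by each retained component, which is a quick check on how much structure a 2-D plot preserves.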
To cite DynaMorph, please use the bibtex entry below:
@article{wu2020dynamorph,
title={DynaMorph: learning morphodynamic states of human cells with live imaging and sc-RNAseq},
author={Wu, Zhenqin and Chhun, Bryant B and Schmunk, Galina and Kim, Chang and Yeh, Li-Hao and Nowakowski, Tomasz J and Zou, James and Mehta, Shalin B},
journal={bioRxiv},
year={2020},
publisher={Cold Spring Harbor Laboratory}
}
If you have any questions regarding this work or code in this repo, feel free to raise an issue or reach out to us through:
- Zhenqin Wu [email protected]
- Bryant Chhun [email protected]
- Syuan-Ming Guo [email protected]
- Shalin Mehta [email protected]