This is a pipeline for the processing of imaging mass cytometry (IMC) data.
It is largely based on Vito Zanotelli's pipeline.
It performs image preprocessing and filtering, uses ilastik for semi-supervised pixel classification, and uses CellProfiler for image segmentation and quantification of single cells.
The pipeline can be used in standalone mode or with imcrunner to process multiple samples in parallel and in a distributed way, for example on a local computer, on the cloud, or on a high performance computing (HPC) cluster.
This is made possible by divvy, a lightweight computing configuration manager.
Requires:
- Python >= 3.7
- One of: docker, singularity, conda or cellprofiler in a local installation.
Install with:
pip install imcpipeline
Make sure pip is up to date. Development and testing are done only on Linux. If anyone is interested in maintaining this repository for macOS or Windows, feel free to submit a PR.
You can run a demo dataset using the --demo flag:
imcpipeline --demo
The pipeline will try to use a local cellprofiler installation, docker, or singularity, in that order, and picks the first one that is available.
Output files are written to an imcpipeline_demo_data directory.
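The fallback order described above can be illustrated with a minimal sketch. This is not the pipeline's actual implementation; the function name pick_backend and the idea of probing the PATH with shutil.which are assumptions made only for illustration.

```python
import shutil

# Preference order as stated in the text above.
BACKENDS = ("cellprofiler", "docker", "singularity")


def pick_backend(candidates=BACKENDS, which=shutil.which):
    """Return the first candidate executable found on PATH, or None.

    Hypothetical helper: `which` is injectable so the lookup can be
    replaced in tests without touching the real PATH.
    """
    for name in candidates:
        if which(name) is not None:
            return name
    return None
```

Calling pick_backend() with no arguments checks the current PATH, which is one quick way to see which option would likely be picked up on a given machine.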
To run the pipeline on real data, one simply needs to specify input and output directories. A trained ilastik model can be provided; if it is not, the user will be prompted to train one.
imcpipeline \
--container docker \
--ilastik-model model.ilp \
-i input_dir -o output_dir
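When the pipeline is launched from a Python script rather than a shell, the same command can be assembled as an argument list. The sketch below only uses the options shown in the command above; the helper name build_command is hypothetical and not part of imcpipeline.

```python
import shlex


def build_command(input_dir, output_dir, container=None, ilastik_model=None):
    """Assemble an imcpipeline argument list (hypothetical helper)."""
    cmd = ["imcpipeline"]
    if container is not None:
        cmd += ["--container", container]
    if ilastik_model is not None:
        cmd += ["--ilastik-model", ilastik_model]
    cmd += ["-i", input_dir, "-o", output_dir]
    return cmd


cmd = build_command(
    "input_dir", "output_dir", container="docker", ilastik_model="model.ilp"
)
# Quote each argument so the printed line can be pasted into a shell.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list can be passed directly to subprocess.run(cmd, check=True), which avoids shell quoting issues with paths that contain spaces.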
If docker or singularity is not available, one could, for example, use a conda or virtualenv environment that is activated only for the cellprofiler command, like this:
imcpipeline \
--cellprofiler-exec \
"source ~/.miniconda2/bin/activate && conda activate cellprofiler && cellprofiler" \
--ilastik-model model.ilp \
-i input_dir -o output_dir
To run only one step for a single sample, use the -s/--step argument:
imcpipeline \
--step segmentation \
-i input_dir -o output_dir
Or provide more than one consecutive step in the same way:
imcpipeline \
--step predict,segmentation \
-i input_dir -o output_dir
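The requirement that multiple steps be consecutive can be checked before launching a run. The following is a sketch under stated assumptions, not the pipeline's own validation: parse_steps is a hypothetical helper, and EXAMPLE_STEPS is a placeholder ordering in which only predict and segmentation come from the examples above. See the documentation for the real list of steps.

```python
# Placeholder step order: "step_before" and "step_after" are made-up names;
# only "predict" and "segmentation" appear in the examples above.
EXAMPLE_STEPS = ("step_before", "predict", "segmentation", "step_after")


def parse_steps(arg, all_steps):
    """Split a comma-separated step string and check the steps are consecutive."""
    requested = [s.strip() for s in arg.split(",") if s.strip()]
    if not requested:
        raise ValueError("no steps given")
    unknown = [s for s in requested if s not in all_steps]
    if unknown:
        raise ValueError(f"unknown steps: {unknown}")
    # The requested steps must match a contiguous slice of the full order.
    start = all_steps.index(requested[0])
    expected = list(all_steps[start:start + len(requested)])
    if requested != expected:
        raise ValueError(f"steps are not consecutive: {requested}")
    return requested
```

With this placeholder ordering, "predict,segmentation" is accepted, while "step_before,segmentation" raises a ValueError because it skips a step.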
To run the pipeline for various samples in a specific computing configuration (more details in the documentation):
imcrunner \
--divvy-configuration slurm \
metadata.csv \
--container docker \
--ilastik-model model.ilp \
-i input_dir -o output_dir
For additional details on the pipeline, see the documentation.
- Vito Zanotelli's pipeline;
- A similar pipeline implemented in Nextflow.