Merge pull request #60 from AllenCell/preprocessing-docs
Clarify that image_preprocessing is only for punctate structures
pgarrison authored Nov 26, 2024
2 parents d2be7ef + 8704064 commit ef0da96
Showing 3 changed files with 27 additions and 12 deletions.
3 changes: 2 additions & 1 deletion .pre-commit-config.yaml
@@ -88,12 +88,13 @@ repos:

   # md formatting
   - repo: https://github.com/executablebooks/mdformat
-    rev: 0.7.14
+    rev: 0.7.16
     hooks:
       - id: mdformat
         args: ["--number"]
         additional_dependencies:
           - mdformat-gfm
+          - mdformat-gfm-alerts
           - mdformat-tables
           - mdformat_frontmatter
           # - mdformat-toc
26 changes: 16 additions & 10 deletions docs/PREPROCESSING.md
@@ -2,32 +2,36 @@

Preprocessing is divided into three steps that use two different virtual environments.

-1. Alignment, masking, and registration (`image_preprocessing` virtual environment)
+1. Punctate structures: Alignment, masking, and registration (`image_preprocessing` virtual environment)
2. Punctate structures: Generate pointclouds (main virtual environment)
3. Polymorphic structures: Generate SDFs (main virtual environment)

# Configure input data

1. Datasets are hosted on quilt. Download raw data at the following links

-* [cellPACK synthetic dataset](https://open.quiltdata.com/b/allencell/tree/aics/morphology_appropriate_representation_learning/cellPACK_single_cell_punctate_structure/)
-* [DNA replication foci dataset](https://open.quiltdata.com/b/allencell/packages/aics/nuclear_project_dataset_4)
-* [WTC-11 hIPSc single cell image dataset v1](https://staging.allencellquilt.org/b/allencell/tree/aics/hipsc_single_cell_image_dataset/)
-* [Nucleolar drug perturbation dataset](https://open.quiltdata.com/b/allencell/tree/aics/NPM1_single_cell_drug_perturbations/)
+- [cellPACK synthetic dataset](https://open.quiltdata.com/b/allencell/tree/aics/morphology_appropriate_representation_learning/cellPACK_single_cell_punctate_structure/)
+- [DNA replication foci dataset](https://open.quiltdata.com/b/allencell/packages/aics/nuclear_project_dataset_4)
+- [WTC-11 hIPSc single cell image dataset v1](https://staging.allencellquilt.org/b/allencell/tree/aics/hipsc_single_cell_image_dataset/)
+- [Nucleolar drug perturbation dataset](https://open.quiltdata.com/b/allencell/tree/aics/NPM1_single_cell_drug_perturbations/)

-> [!NOTE]
-> Ensure to download all the data in the same folder where the repo was cloned!
+> [!NOTE]
+> Ensure to download all the data in the `benchmarking_representations` folder.
-# Alignment, masking, and registration
-1. Edit the data paths in the file `/subpackages/image_preprocessing/config/config.yaml` to point to your copies of the data.
+# Punctate structures: Alignment, masking, and registration
+
+1. Edit the data paths in the file `subpackages/image_preprocessing/config/config.yaml` to point to your copies of the data.
2. Follow the [installation and usage instructions](/subpackages/image_preprocessing/README.md) to create the `image_preprocessing` virtual environment and run the Snakefile.

# Switch to main virtual environment

1. Deactivate the `image_preprocessing` virtual environment (if applicable).
2. Follow the [installation instructions](./USAGE.md) (everything before "Usage") for the main virtual environment.

# Punctate structures: Generate pointclouds
-Edit the data paths in the following file to match the location of your copy of the data, then run it.
+
+Edit the data paths in the following file to match the location of the outputs of the alignment, masking, and registration step, then run it.
+
```
src
└── br
@@ -38,7 +42,9 @@ src
```

# Polymorphic structures: Generate SDFs
+
Edit the data paths in the following files to match the location of your copy of the data, then run both.
+
```
src
└── br
10 changes: 9 additions & 1 deletion subpackages/image_preprocessing/README.md
@@ -5,20 +5,28 @@ Code for alignment, masking, and registration of 3D single cell images.
# Installation

Move to this `image_preprocessing` directory.
+
```bash
cd subpackages/image_preprocessing
```

Install dependencies.
+
```bash
conda create --name preprocessing_env python=3.10
conda activate preprocessing_env
pip install -r requirements.txt
pip install -e .
```

+# Configuration
+
+Edit the data paths in the file `subpackages/image_preprocessing/config/config.yaml` to point to your copies of the data.
+
# Usage
+
Once data is downloaded and config files are set up, run preprocessing scripts.
+
```bash
snakemake -s Snakefile --cores all
-```
+```
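
The Configuration step added to the README relies on hand-edited data paths in `config.yaml`, so a mistyped path would only surface once Snakemake is already running. A pre-flight check could catch that earlier. The following is a minimal sketch and not part of this commit. It assumes the config has already been loaded into a nested `dict` (for example with PyYAML's `yaml.safe_load`), and that path-valued keys end in `_path`. That suffix convention, the function names, and the example keys are all hypothetical; the real keys in `config.yaml` may differ.

```python
from pathlib import Path
from typing import Any, Iterator


def iter_strings(node: Any, prefix: str = "") -> Iterator[tuple[str, str]]:
    """Yield (dotted_key, value) for every string leaf in a nested config."""
    if isinstance(node, dict):
        for key, value in node.items():
            dotted = f"{prefix}.{key}" if prefix else str(key)
            yield from iter_strings(value, dotted)
    elif isinstance(node, list):
        # List items inherit the key of the list that contains them.
        for value in node:
            yield from iter_strings(value, prefix)
    elif isinstance(node, str):
        yield prefix, node


def find_missing_paths(config: dict, suffix: str = "_path") -> list[tuple[str, str]]:
    """Return (key, path) pairs whose key ends in `suffix` but whose path does not exist."""
    return [
        (key, value)
        for key, value in iter_strings(config)
        if key.endswith(suffix) and not Path(value).expanduser().exists()
    ]


if __name__ == "__main__":
    # Hypothetical config contents, for illustration only.
    example = {"dataset": {"raw_path": "/definitely/not/here", "name": "demo"}}
    for key, value in find_missing_paths(example):
        print(f"missing: {key} -> {value}")
```

Running a check like this before `snakemake -s Snakefile --cores all` gives a list of unresolved paths up front instead of a failure partway through the workflow.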
