Clarify that image_preprocessing is only for punctate structures #60

Merged 2 commits on Nov 26, 2024
3 changes: 2 additions & 1 deletion .pre-commit-config.yaml
@@ -88,12 +88,13 @@ repos:

# md formatting
- repo: https://github.com/executablebooks/mdformat
rev: 0.7.14
rev: 0.7.16
hooks:
- id: mdformat
args: ["--number"]
additional_dependencies:
- mdformat-gfm
- mdformat-gfm-alerts
- mdformat-tables
- mdformat_frontmatter
# - mdformat-toc
26 changes: 16 additions & 10 deletions docs/PREPROCESSING.md
@@ -2,32 +2,36 @@

Preprocessing is divided into three steps that use two different virtual environments.

1. Alignment, masking, and registration (`image_preprocessing` virtual environment)
1. Punctate structures: Alignment, masking, and registration (`image_preprocessing` virtual environment)
2. Punctate structures: Generate pointclouds (main virtual environment)
3. Polymorphic structures: Generate SDFs (main virtual environment)

# Configure input data

1. Datasets are hosted on quilt. Download raw data at the following links

* [cellPACK synthetic dataset](https://open.quiltdata.com/b/allencell/tree/aics/morphology_appropriate_representation_learning/cellPACK_single_cell_punctate_structure/)
* [DNA replication foci dataset](https://open.quiltdata.com/b/allencell/packages/aics/nuclear_project_dataset_4)
* [WTC-11 hIPSc single cell image dataset v1](https://staging.allencellquilt.org/b/allencell/tree/aics/hipsc_single_cell_image_dataset/)
* [Nucleolar drug perturbation dataset](https://open.quiltdata.com/b/allencell/tree/aics/NPM1_single_cell_drug_perturbations/)
- [cellPACK synthetic dataset](https://open.quiltdata.com/b/allencell/tree/aics/morphology_appropriate_representation_learning/cellPACK_single_cell_punctate_structure/)
- [DNA replication foci dataset](https://open.quiltdata.com/b/allencell/packages/aics/nuclear_project_dataset_4)
- [WTC-11 hIPSc single cell image dataset v1](https://staging.allencellquilt.org/b/allencell/tree/aics/hipsc_single_cell_image_dataset/)
- [Nucleolar drug perturbation dataset](https://open.quiltdata.com/b/allencell/tree/aics/NPM1_single_cell_drug_perturbations/)

> [!NOTE]
> Ensure to download all the data in the same folder where the repo was cloned!
> [!NOTE]
> Ensure to download all the data in the `benchmarking_representations` folder.

# Alignment, masking, and registration
1. Edit the data paths in the file `/subpackages/image_preprocessing/config/config.yaml` to point to your copies of the data.
# Punctate structures: Alignment, masking, and registration

1. Edit the data paths in the file `subpackages/image_preprocessing/config/config.yaml` to point to your copies of the data.
2. Follow the [installation and usage instructions](/subpackages/image_preprocessing/README.md) to create the `image_preprocessing` virtual environment and run the Snakefile.
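
Before running the Snakefile, it can help to confirm that the locations written into `config.yaml` actually exist. The following is a minimal sketch, not part of the repository: the helper name `find_missing_paths` and the example folder names are hypothetical, and it assumes you copy the configured paths into a list by hand rather than parsing the YAML file.

```python
from pathlib import Path


def find_missing_paths(paths):
    """Return the entries of `paths` that do not exist on disk, as strings."""
    return [str(p) for p in paths if not Path(p).expanduser().exists()]


if __name__ == "__main__":
    # Hypothetical locations: replace with the paths you wrote into config.yaml.
    configured_paths = [
        "./cellPACK_single_cell_punctate_structure",
        "./nuclear_project_dataset_4",
    ]
    for missing in find_missing_paths(configured_paths):
        print(f"Missing: {missing}")
```

An empty result means every listed path was found. Anything printed points at a path that should be fixed in the config before starting the run.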

# Switch to main virtual environment

1. Deactivate the `image_preprocessing` virtual environment (if applicable).
2. Follow the [installation instructions](./USAGE.md) (everything before "Usage") for the main virtual environment.
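
A quick way to confirm step 1 is to look at the `CONDA_DEFAULT_ENV` variable, which conda sets to the name of the active environment. The sketch below assumes the preprocessing environment was created with the name `preprocessing_env`, as in the `image_preprocessing` README. The function name is hypothetical, and the check only detects conda environments, so it says nothing about how the main virtual environment is managed.

```python
import os


def image_preprocessing_env_active(environ=None, env_name="preprocessing_env"):
    """Return True if conda reports `env_name` as the active environment."""
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV") == env_name


if __name__ == "__main__":
    if image_preprocessing_env_active():
        print("preprocessing_env is still active; run `conda deactivate` first.")
    else:
        print("preprocessing_env is not the active conda environment.")
```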

# Punctate structures: Generate pointclouds
Edit the data paths in the following file to match the location of your copy of the data, then run it.

Edit the data paths in the following file to match the location of the outputs of the alignment, masking, and registration step, then run it.

```
src
└── br
@@ -38,7 +42,9 @@ src
```

# Polymorphic structures: Generate SDFs

Edit the data paths in the following files to match the location of your copy of the data, then run both.

```
src
└── br
10 changes: 9 additions & 1 deletion subpackages/image_preprocessing/README.md
@@ -5,20 +5,28 @@ Code for alignment, masking, and registration of 3D single cell images.
# Installation

Move to this `image_preprocessing` directory.

```bash
cd subpackages/image_preprocessing
```

Install dependencies.

```bash
conda create --name preprocessing_env python=3.10
conda activate preprocessing_env
pip install -r requirements.txt
pip install -e .
```

# Configuration

Edit the data paths in the file `subpackages/image_preprocessing/config/config.yaml` to point to your copies of the data.

# Usage

Once data is downloaded and config files are set up, run preprocessing scripts.

```bash
snakemake -s Snakefile --cores all
```
```