Merge branch 'main' into data-conventions
mzouink authored Nov 12, 2024
2 parents 64a508b + 975b8b8 commit 4a4312b
Showing 4 changed files with 207 additions and 3 deletions.
4 changes: 2 additions & 2 deletions docs/source/conf.py
@@ -19,8 +19,8 @@
# -- Project information -----------------------------------------------------

project = "DaCapo"
copyright = "2024, Caroline Malin-Mayor, Jeff Rhoades, Marwan Zouinkhi, William Patton, David Ackerman, Jan Funke"
author = "Caroline Malin-Mayor, Jeff Rhoades, Marwan Zouinkhi, William Patton, David Ackerman, Jan Funke"
copyright = "2024, William Patton, Jeff Rhoades, Marwan Zouinkhi, David Ackerman, Caroline Malin-Mayor, Jan Funke"
author = "William Patton, Jeff Rhoades, Marwan Zouinkhi, David Ackerman, Caroline Malin-Mayor, Jan Funke"


# -- General configuration ---------------------------------------------------
4 changes: 3 additions & 1 deletion docs/source/index.rst
@@ -10,12 +10,14 @@
overview
install
notebooks/minimal_tutorial
unet_architectures
tutorial
docker
aws
cosem_starter
roadmap
autoapi/index
cli

.. include:: ../../README.md
   :parser: myst_parser.sphinx_
77 changes: 77 additions & 0 deletions docs/source/roadmap.rst
@@ -0,0 +1,77 @@
.. _sec_roadmap:

Road Map
========

Overview
--------

+-----------------------------------+------------------+-------------------------------+
| Task | Priority | Current State |
+===================================+==================+===============================+
| Write Documentation | High | Started with a long way to go |
+-----------------------------------+------------------+-------------------------------+
| Simplify configurations | High | First draft complete |
+-----------------------------------+------------------+-------------------------------+
| Develop Data Conventions | High | First draft complete |
+-----------------------------------+------------------+-------------------------------+
| Improve Blockwise Post-Processing | Low | Not Started |
+-----------------------------------+------------------+-------------------------------+
| Simplify Array handling | High | Almost done (Up/Down sampling)|
+-----------------------------------+------------------+-------------------------------+

Detailed Road Map
-----------------

- [ ] Write Documentation
  - [ ] tutorials: no more than three, simple and continuously tested (with GitHub Actions; a small U-Net on CPU could work)
- [x] Basic tutorial: train a U-Net on a toy dataset
- [ ] Parametrize the basic tutorial across tasks (instance/semantic segmentation).
- [ ] Improve visualizations. Move some simple plotting functions to DaCapo.
- [ ] Add a pure pytorch implementation to show benefits side-by-side
- [ ] Track performance metrics (e.g., loss, accuracy, etc.) so we can make sure we aren't regressing
- [ ] semantic segmentation (LM and EM)
- [ ] instance segmentation (LM or EM, can be simulated)
- [ ] general documentation of CLI, also API for developers (curate docstrings)
- [x] Simplify configurations
  - [x] Deprecate old configs
- [x] Add simplified config for simple cases
- [x] can still get rid of `*Config` classes
- [x] Develop Data Conventions
- [x] document conventions
- [ ] convenience scripts to convert dataset into our convention (even starting from directories of PNG files)
- [ ] Improve Blockwise Post-Processing
- [ ] De-duplicate code between “in-memory” and “block-wise” processing
- [ ] have only block-wise algorithms, use those also for “in-memory”
- [ ] no more “in-memory”, this is just a run with a different Compute Context
- [ ] Incorporate `volara` into DaCapo (embargo until January)
- [ ] Improve debugging support (logging of chain of commands for reproducible runs)
- [ ] Split long post-processing steps into several smaller ones for composability (e.g., support running each step independently if we want to support choosing between `waterz` and `mutex_watershed` for fragment generation or agglomeration)
- [x] Incorporate `funlib.persistence` adaptors.
- [x] all of those can be adapters:
- [x] Binarize Labels into Mask
- [x] Scale/Shift intensities
- [ ] Up/Down sample (if easily possible)
- [ ] DVID source
- [x] Datatype conversions
- [x] everything else
- [x] simplify array configs accordingly
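Several of the adapter operations listed above (binarizing labels into a mask, scaling/shifting intensities, datatype conversion) can be illustrated with a minimal numpy sketch. The function names here are hypothetical stand-ins for illustration only, not the `funlib.persistence` adapter API:

```python
import numpy as np

# Hypothetical stand-ins for the adapter operations named in the roadmap;
# the real funlib.persistence adapters may expose a different API.

def binarize_labels(labels: np.ndarray) -> np.ndarray:
    """Turn an integer label volume into a binary mask (foreground = label > 0)."""
    return (labels > 0).astype(np.uint8)

def scale_shift(intensities: np.ndarray, scale: float, shift: float) -> np.ndarray:
    """Linear intensity transform: out = intensities * scale + shift."""
    return intensities * scale + shift

labels = np.array([[0, 3], [7, 0]])
mask = binarize_labels(labels)  # [[0, 1], [1, 0]]

raw = np.array([0, 128, 255], dtype=np.uint8)
# Datatype conversion plus scale/shift: normalize uint8 intensities to [0, 1].
normalized = scale_shift(raw.astype(np.float32), 1.0 / 255.0, 0.0)
```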

Can Have
--------

- [ ] Support other stats stores. Too much time, effort, and code was put into the stats, and they still don’t provide a very nice interface:
- [ ] defining variables to store
- [ ] efficiently batch writing, storing and reading stats to both files and mongodb
- [ ] visualizing stats.
- [ ] Jeff and Marwan suggest MLFlow instead of WandB
- [ ] Support for slurm clusters
- [ ] Support for cloud computing (AWS)
- [ ] Lazy loading of dependencies (import takes too long)
- [ ] Support bioimage model spec for model dissemination

Non-Goals (for v1.0)
--------------------

- custom dash board
- GUI to run experiments
125 changes: 125 additions & 0 deletions docs/source/unet_architectures.rst
@@ -0,0 +1,125 @@
UNet Models
===========

This section explains how to configure and use UNet models in DaCapo. Several configurations for different types of UNet architectures are demonstrated below.

Overview
--------

UNet is a popular architecture for image segmentation tasks, particularly in biomedical imaging. DaCapo provides support for configuring various types of UNet models with customizable parameters.

Examples
--------

Here are some examples of UNet configurations:

1. **Upsample UNet**

   .. code-block:: python

      from dacapo.experiments.architectures import CNNectomeUNetConfig
      from funlib.geometry import Coordinate

      architecture_config = CNNectomeUNetConfig(
          name="upsample_unet",
          input_shape=Coordinate(216, 216, 216),
          eval_shape_increase=Coordinate(72, 72, 72),
          fmaps_in=1,
          num_fmaps=12,
          fmaps_out=72,
          fmap_inc_factor=6,
          downsample_factors=[(2, 2, 2), (3, 3, 3), (3, 3, 3)],
          constant_upsample=True,
          upsample_factors=[(2, 2, 2)],
      )

2. **Yoshi UNet**

   .. code-block:: python

      yoshi_unet_config = CNNectomeUNetConfig(
          name="yoshi-unet",
          input_shape=Coordinate(188, 188, 188),
          eval_shape_increase=Coordinate(72, 72, 72),
          fmaps_in=1,
          num_fmaps=12,
          fmaps_out=72,
          fmap_inc_factor=6,
          downsample_factors=[(2, 2, 2), (2, 2, 2), (2, 2, 2)],
          constant_upsample=True,
          upsample_factors=[],
      )

3. **Attention Upsample UNet**

   .. code-block:: python

      attention_upsample_config = CNNectomeUNetConfig(
          name="attention-upsample-unet",
          input_shape=Coordinate(216, 216, 216),
          eval_shape_increase=Coordinate(72, 72, 72),
          fmaps_in=1,
          num_fmaps=12,
          fmaps_out=72,
          fmap_inc_factor=6,
          downsample_factors=[(2, 2, 2), (3, 3, 3), (3, 3, 3)],
          constant_upsample=True,
          upsample_factors=[(2, 2, 2)],
          use_attention=True,
      )

4. **2D UNet**

   .. code-block:: python

      architecture_config = CNNectomeUNetConfig(
          name="2d_unet",
          input_shape=(2, 132, 132),
          eval_shape_increase=(8, 32, 32),
          fmaps_in=2,
          num_fmaps=8,
          fmaps_out=8,
          fmap_inc_factor=2,
          downsample_factors=[(1, 4, 4), (1, 4, 4)],
          kernel_size_down=[[(1, 3, 3)] * 2] * 3,
          kernel_size_up=[[(1, 3, 3)] * 2] * 2,
          constant_upsample=True,
          padding="valid",
      )

5. **UNet without Batch Normalization**

   .. code-block:: python

      architecture_config = CNNectomeUNetConfig(
          name="unet_norm",
          input_shape=Coordinate(216, 216, 216),
          eval_shape_increase=Coordinate(72, 72, 72),
          fmaps_in=1,
          num_fmaps=2,
          fmaps_out=2,
          fmap_inc_factor=2,
          downsample_factors=[(2, 2, 2), (3, 3, 3), (3, 3, 3)],
          constant_upsample=True,
          upsample_factors=[],
          batch_norm=False,
      )

Configuration Parameters
------------------------

- **name**: A unique identifier for the configuration.
- **input_shape**: The shape of the input data.
- **eval_shape_increase**: Increase in shape during evaluation.
- **fmaps_in**: Number of input feature maps.
- **num_fmaps**: Number of feature maps in the first layer.
- **fmaps_out**: Number of output feature maps.
- **fmap_inc_factor**: Factor by which feature maps increase in each layer.
- **downsample_factors**: Factors by which the input is downsampled at each layer.
- **upsample_factors**: Factors by which the input is upsampled at each layer.
- **constant_upsample**: Whether to use constant upsampling.
- **use_attention**: Whether to use attention mechanisms.
- **batch_norm**: Whether to use batch normalization.
- **padding**: Padding mode for convolutional layers.
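To make `fmap_inc_factor` concrete: assuming each downsampling level multiplies the channel count by this factor (the usual behavior for this family of U-Nets; check the DaCapo source for the exact semantics), the per-level feature map counts can be computed directly:

```python
def fmaps_per_level(num_fmaps: int, fmap_inc_factor: int, num_levels: int) -> list:
    """Channel count at each encoder level, assuming the channel count
    is multiplied by fmap_inc_factor at every downsampling step."""
    return [num_fmaps * fmap_inc_factor ** level for level in range(num_levels)]

# With num_fmaps=12 and fmap_inc_factor=6, as in the "upsample_unet"
# example (3 downsampling steps -> 4 levels):
levels = fmaps_per_level(12, 6, 4)
print(levels)  # [12, 72, 432, 2592]
```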

This page serves as a reference for configuring UNet models in DaCapo; adjust the parameters to suit your dataset and task.
