diff --git a/.github/README.md b/.github/README.md
index 32475d8..a17eaa0 100644
--- a/.github/README.md
+++ b/.github/README.md
@@ -1,73 +1,127 @@
-## Domain Adaptation and Uncertainty Quantification for Gravitational Lens Modeling
+# Domain Adaptation and Uncertainty Quantification for Gravitational Lens Modeling
 ![status](https://img.shields.io/badge/License-MIT-lightgrey)
----
+
 This project combines Domain Adaptation (DA) with neural network (NN) Uncertainty Quantification (UQ) in the context of strong gravitational lens parameter prediction. We hope that this work helps take a step towards more accurate applications of deep learning models to real observed datasets, especially when the latter have limited labels. We predict the Einstein radius $\theta_\mathrm{E}$ from simulated multi-band images of strong gravitational lenses. To our knowledge, this is the first work in which domain adaptation and uncertainty quantification are combined, including for regression on an astrophysics dataset.
+
-### UQ: Mean-variance Estimation (MVE)
-For UQ, we use a mean-variance estimation (MVE) NN to predict the Einstein radius $\theta_\mathrm{E}$ and its aleatoric uncertainty $\sigma_\mathrm{al}$.
-Scientific analysis requires an estimate of uncertainty on measurements. We adopt an approach known as mean-variance estimation, which seeks to estimate the variance and control regression by minimizing the beta negative log-likelihood loss.
+## UQ: Mean-variance Estimation (MVE)
+
+For UQ, we use a mean-variance estimation (MVE) NN to predict the Einstein radius $\theta_\mathrm{E}$ and its aleatoric uncertainty $\sigma_\mathrm{al}$. Scientific analysis requires an estimate of uncertainty on measurements; we adopt mean-variance estimation, which estimates the variance alongside the regression target by minimizing the beta negative log-likelihood loss.
+
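To make the MVE objective concrete, below is a minimal PyTorch-style sketch of a beta negative log-likelihood loss. The output names (`mu`, `log_var`) and the value `beta=0.5` are illustrative assumptions, not taken from this repository's code.

```python
import torch

def beta_nll_loss(mu, log_var, target, beta=0.5):
    """Beta negative log-likelihood for mean-variance estimation (MVE).

    mu, log_var : predicted mean and log aleatoric variance of theta_E
    target      : true Einstein radius
    beta = 0 recovers the standard Gaussian NLL; beta > 0 re-weights samples
    by their predicted variance so that high-variance examples are not ignored.
    """
    var = torch.exp(log_var)
    nll = 0.5 * (log_var + (target - mu) ** 2 / var)
    weight = var.detach() ** beta  # stop-gradient on the re-weighting factor
    return (weight * nll).mean()

# usage sketch: mu, log_var = model(images); loss = beta_nll_loss(mu, log_var, theta_E)
```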
-### Unsupervised Domain Adaptation (UDA)
+## Unsupervised Domain Adaptation (UDA)
+
+Applying deep learning in science fields like astronomy can be difficult. When models trained on simulated data are applied to real data, the models frequently underperform because simulations rarely perfectly capture the full complexity of real data. Enter domain adaptation (DA), a framework for adapting a model trained on data from one domain so that it performs well on data from another.
-In this work, we use unsupervised DA (UDA), where the target data The DA technique used in this work use Maximum Mean Discrepancy (MMD) Loss to train a network to being embeddings of labeled "source" data gravitational lenses in line with unlabeled "target" gravitational lenses. With source and target datasets made similar, training on source datasets can be used with greater fidelity on target datasets.
-Unuspervised DA aligns an unlabelled "target" dataset with a labeled "source" dataset, so that predictions can be performed on both with accuracy. That target domain has a domain shift that must be aligned. In our case, we add realistic astrophysical survey-like noise to strong lensing images in the target dataset, but no noise in the source dataset.
+In this work, we use unsupervised DA (UDA), which aligns a labeled "source" dataset with an unlabeled "target" dataset that exhibits a domain shift, so that predictions can be made accurately on both. The DA technique used in this work is the Maximum Mean Discrepancy (MMD) loss, which trains the network to bring embeddings of labeled "source" gravitational lenses in line with those of unlabeled "target" gravitational lenses. Once the two domains are aligned in the embedding space, a model trained on the source dataset can be applied to the target dataset with greater fidelity. In our case, we add realistic astrophysical survey-like noise to the strong lensing images in the target dataset but no noise to those in the source dataset.
+
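For concreteness, here is a minimal sketch of a multi-bandwidth RBF-kernel MMD penalty between source and target embeddings. The bandwidth values and the loss weighting shown in the comment are illustrative assumptions, not the repository's exact implementation.

```python
import torch

def mmd_rbf(source, target, bandwidths=(1.0, 2.0, 4.0)):
    """Squared maximum mean discrepancy between source and target embeddings.

    source : (n, d) tensor of latent features from labeled (noise-free) images
    target : (m, d) tensor of latent features from unlabeled (noisy) images
    """
    def kernel_mean(x, y):
        d2 = torch.cdist(x, y) ** 2  # pairwise squared distances
        return sum(torch.exp(-d2 / (2.0 * bw ** 2)) for bw in bandwidths).mean()

    return kernel_mean(source, source) + kernel_mean(target, target) \
        - 2.0 * kernel_mean(source, target)

# usage sketch (weight lam is illustrative):
# total_loss = beta_nll_loss(mu_src, log_var_src, theta_E_src) + lam * mmd_rbf(z_src, z_tgt)
```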
-![plot](../src/training/MVEUDA/figures/isomap_final.png)
-#### Coded By: [Shrihan Agarwal](https://github.com/ShrihanSolo)
----
-### Datasets
-Both source and target datasets are generated using ```deeplenstronomy```. Below, we show a single 3-band image simulated using the no-noise source dataset and DES-like noise target dataset as a comparison.
-![plot](../src/training/MVEUDA/figures/source_example.png)
+## Datasets
+
+Both source and target datasets are generated using `deeplenstronomy`. In the figure below, we show a single simulated strong lens in three bands ($g$, $r$, $z$) without noise (source domain; upper panel) and with DES-like noise (target domain; lower panel). The datasets (images and labels) can be downloaded from the project's zenodo site: [zenodo: Neural network prediction of strong lensing systems with domain adaptation and uncertainty quantification](https://zenodo.org/records/13647416).
+
+<!-- figure: a simulated strong lens in g, r, z without noise (source domain, upper panel) and with DES-like noise (target domain, lower panel) -->
+
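For readers who prefer to generate the images programmatically, a minimal sketch using `deeplenstronomy`'s documented `make_dataset` entry point is shown below, pointed at the same configuration files used by the `gen_sim.py` script described later; treat the exact behavior (in-memory vs. on-disk output) as version-dependent and check the package docs.

```python
# Minimal sketch (run in the environment created from deeplenstronomy_env.yml):
# build the source and target datasets from the same configs used by gen_sim.py.
from deeplenstronomy.deeplenstronomy import make_dataset

for config in ("src/sim/config/source_config.yaml",
               "src/sim/config/target_config.yaml"):
    dataset = make_dataset(config)  # simulate the images and labels described in the config
    print("simulated dataset from", config)
```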
-![plot](../src/training/MVEUDA/figures/target_example.png)
-The datasets with these images, as well as the training labels, can be downloaded from zenodo: https://zenodo.org/records/13647416.
----
-### Installation
-#### Clone
-Clone the package using:
-> git clone https://github.com/deepskies/AdaptiveMVEforLensModeling
-into any directory. Then, install the environments.
+## Installation
+
+Clone the package into any directory:
+
+> git clone https://github.com/deepskies/DomainAdaptiveMVEforLensModeling
+
+Create environments with `conda` for training and for simulation, respectively:
+
+> conda env create -f training_env.yml
+
+> conda env create -f deeplenstronomy_env.yml
+
+The `training_env.yml` environment is required for training the `pytorch` neural network model, and `deeplenstronomy_env.yml` is required for simulating strong lensing datasets with `deeplenstronomy`. Note that there is a sky brightness-related bug in the PyPI 0.0.2.3 version of `deeplenstronomy`; updating to the latest version is required to reproduce the results. This setup works on Linux but has not been tested on Mac or Windows.
+
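Because of the sky-brightness bug mentioned above, it is worth confirming which `deeplenstronomy` version actually got installed. A quick check, run in the environment created from `deeplenstronomy_env.yml`:

```python
# The installed version should be newer than the buggy 0.0.2.3 PyPI release.
from importlib.metadata import version

print(version("deeplenstronomy"))
```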
-#### Environments
-This works on linux, but has not been tested for mac, windows.
-We recommend using conda. Install the environments in `envs/` using conda with the following command:
-> conda env create -f training_env.yml.
-> conda env create -f deeplenstronomy_env.yml
-The `training_env.yml` is required for training the Pytorch model, and `deeplenstronomy_env.yml` for simulating strong lensing datasets using `deeplenstronomy`. Note that there is a sky brightness-related bug in the PyPI 0.0.2.3 version of deeplenstronomy, and an update to the latest version will be required for reproduction of results.
+## Reproducing the Paper Results
+
+### Acquiring the Dataset
+
+* __Option A: Generate the Dataset__
+    * Navigate to `src/sim/notebooks/`.
+    * Generate a source/target data pair in the `src/data/` directory by running `gen_sim.py` on `src/sim/config/source_config.yaml` and `target_config.yaml`:
+    * > python gen_sim.py src/sim/config/source_config.yaml target_config.yaml
+
+* __Option B: Download the Dataset__
+    * Zip files of the dataset are available at https://zenodo.org/records/13647416.
+    * Unzip the downloads and move or copy the directories `mb_paper_source_final` and `mb_paper_target_final` into the `src/data/` directory.
+
+### Training the Model
+
+* __MVE-Only__
+    * Open `src/training/MVEonly/MVE_noDA_RunA.ipynb` (or Runs B, C, D, E).
+    * Activate the `neural` conda environment.
+    * Run the notebook to train the model.
+    * New runs are stored in the adjacent `models/` directories.
+
+* __MVE-UDA__
+    * Follow an identical procedure to the above, except that the base path is `src/training/MVEUDA/`.
----
-### Repository Structure
-The repository structure is below.
+
+### Visualizing the Paper Results
+
+* To generate the results in the paper, use the notebook `src/training/MVEUDA/ModelVizPaper.ipynb`.
+    * Final figures from this notebook are stored in `src/training/MVEUDA/figures/`.
+    * Saved PyTorch models of the runs are provided in `src/training/MVE*/paper_models/` (see the loading sketch below).
+
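As a sketch of how the provided checkpoints might be inspected, the snippet below loads one of the saved runs with PyTorch. The file name is a placeholder and the model class is assumed to be defined in the training notebooks; this is not the repository's exact loading code.

```python
import torch

# Placeholder path: substitute any checkpoint found under src/training/MVE*/paper_models/.
ckpt_path = "src/training/MVEUDA/paper_models/run_a.pth"

# Load on CPU; the stored object may be a state_dict or a full model,
# depending on how the training notebooks saved it.
checkpoint = torch.load(ckpt_path, map_location="cpu")
print(type(checkpoint))

# If it is a state_dict, rebuild the network class from the training notebooks
# (class name assumed here) and load the weights before evaluating:
# model = MVEModel()
# model.load_state_dict(checkpoint)
# model.eval()
```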