Static Documentation for v0.1 (#118)

* Make the clay model work on lightning cli
* Update Mean & std for larger chunk of data
* Update model & trainer to work on multi gpu setup
* Fix logging images to wandb
* Add specification docs
* Add trainer details
* Add streamlit UI to test the embeddings (#108)

  UI supports vector search & arithmetic for embeddings generated from CLAY.

  ---------

  Co-authored-by: SRM <[email protected]>
* Use rasterio.plot to show images & filter by year, tile & idx
* Fix half:0 error by not explicitly setting float16 to cube[pixels]
* Add documentation to create the pixel reconstruction from the model
* Add documention explaining location embeddings
* Add shuffle as an argument to CLAY
* Add documentation to showcase interpolation between images in embedding space
* Adapt data pipeline to accept custom date range and local path for storing tiles
* Add example script to run clay over custom AOI.
* add sections to TOC
* basic use from root README
* Install from root REAMDE
* how to build/preview docs
* more narrative on release notes
* more narrative on intro
* rename intro to index, better SEO
* reorder for clarity
* fixed links
* fixed links
* renamed file for clarity
* added links and consistent title
* typo
* Closes #64, as much as I can imagine.
* results explanation
* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci
* shorter line
* Making ruff happy
* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci
* Making ruff happy
* Moved shuffle to an optional keyword. CHECK
* Recovered from 0145e55
* Clearing outputs for notebooks. Should not merge into main been PR is squashed.
* tested to run
* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci
* tested to run
* Update docs/_toc.yml

  Co-authored-by: Wei Ji <[email protected]>
* assume root folder

  Co-authored-by: Wei Ji <[email protected]>
* Update src/model_clay.py

  Co-authored-by: Wei Ji <[email protected]>
* Update src/model_clay.py

  Co-authored-by: Wei Ji <[email protected]>
* Update src/model_clay.py

  Co-authored-by: Wei Ji <[email protected]>
* Update docs/README

  Co-authored-by: Wei Ji <[email protected]>
* Update docs/installation.md

  Co-authored-by: Wei Ji <[email protected]>
* Let pre-commit enforce 512KB file size limit in entire repo

  Make sure that large files don't get commited to git. Xref https://github.com/pre-commit/pre-commit-hooks/tree/v4.5.0#check-added-large-files
* Remove extra args.update lines
* Set execute_notebooks to cache

  Don't force running all notebooks on the docs build, only run those that don't have output cells already.
* Remove PNG files in docs/assets, replace with web version

  Hosting the PNG images directly via GitHub's UI, instead of in the git repo.
* Only build files in toc and rename docs/README to docs/README.md

  Don't build files that aren't in the Table of Contents (https://jupyterbook.org/en/stable/structure/configure.html#disable-building-files-that-arent-in-the-table-of-contents). Can then rename README to README.md without raising a warning like `WARNING: document isn't included in any toctree`.
* Use huggingface ckpt, fix header, and rm sample.gif in interpolation nb

  Some minor changes to the clay-v0-interpolation.ipynb file, and remove the large sample.gif file. Also gitignoring .gif and .png files.
* Fix some link typos and edit wording on main index page
* Remove run_region notebook

  Will be handled in #122
* Fix typos in model specification

  Should be 10 bands of Sentinel-2, not 13.
* Rename tutorial notebooks
* Exclude tutorial notebooks from execution during build.
* Make tutorial title consistent

---------

Co-authored-by: SRM <[email protected]>
Co-authored-by: Soumya Ranjan Mohanty <[email protected]>
Co-authored-by: Daniel Wiesmann <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Wei Ji <[email protected]>
Co-authored-by: Daniel Wiesmann <[email protected]>
Clay-foundation · Jan 16, 2024 · 959dfbd · 959dfbd
1 parent 8309c50
commit 959dfbd
Showing 15 changed files with 1,584 additions and 23 deletions.
diff --git a/.gitignore b/.gitignore
@@ -18,6 +18,8 @@ logs/
 # Data files and folders
 data/**
 !data/**/
+**/*.gif
+**/*.png
 
 # Distribution / packaging
 .Python

diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -5,6 +5,7 @@ repos:
   rev: v4.5.0
   hooks:
     - id: check-added-large-files
+      args: [ '--maxkb=512', '--enforce-all' ]
     - id: check-yaml
     - id: end-of-file-fixer
     - id: trailing-whitespace
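
Note on the hook arguments above: `--maxkb=512` sets the size limit and `--enforce-all` applies it to all files the hook is given, not only newly added ones. The check can be approximated locally with the minimal Python sketch below. It is an illustration only, not the hook's implementation: the helper name `find_large_files` is hypothetical, and it walks the filesystem rather than asking git which files are tracked.

```python
import math
from pathlib import Path


def find_large_files(root: str, max_kb: int = 512) -> list[str]:
    """Return relative paths under `root` whose size exceeds `max_kb` kilobytes.

    Hypothetical helper: a filesystem walk standing in for the real hook,
    which operates on the files pre-commit passes to it.
    """
    too_large = []
    for path in sorted(Path(root).rglob("*")):
        relative = path.relative_to(root)
        # Skip directories and anything inside the .git folder.
        if not path.is_file() or ".git" in relative.parts:
            continue
        # Round the size up to whole kilobytes before comparing to the limit.
        size_kb = math.ceil(path.stat().st_size / 1024)
        if size_kb > max_kb:
            too_large.append(str(relative))
    return too_large
```

Calling `find_large_files(".")` from the repository root lists candidate files that would likely trip the 512 KB limit.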

diff --git a/docs/README.md b/docs/README.md
@@ -0,0 +1,20 @@
+# Clay Model Documentation
+
+This documentation uses [Jupyter Book](https://jupyterbook.org/intro.html).
+
+Install it with:
+```bash
+pip install -U jupyter-book
+```
+
+Then build it with:
+```bash
+jupyter-book build docs/
+```
+
+You can preview the site locally with:
+```bash
+python -m http.server --directory docs/_build/html
+```
+
+There is a GitHub Action at `.github/workflows/deploy-docs.yml` that builds the site and pushes it to GitHub Pages.
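
Note on the preview step in `docs/README.md`: the same local preview can be started from Python instead of the command line. The sketch below is a rough equivalent using only the standard library. It assumes the commands are run from the repository root, so that `jupyter-book build docs/` writes the HTML to `docs/_build/html`; the function name `make_preview_server` and the default port are illustrative choices, not part of the repository.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_preview_server(directory: str = "docs/_build/html", port: int = 8000):
    """Create an HTTP server that serves static files from `directory`.

    Hypothetical helper; the default directory assumes a build run from
    the repository root.
    """
    # Bind the handler to the build directory instead of the current folder.
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    # Listen on localhost only, since this is just a local preview.
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


# Usage (blocks until interrupted with Ctrl+C):
#   make_preview_server().serve_forever()
```

Passing `port=0` lets the operating system pick a free port, which is then available as `server.server_port`.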
diff --git a/docs/_config.yml b/docs/_config.yml
@@ -4,11 +4,14 @@
 title: Clay Foundation Model
 author: Clay Foundation
 logo: logo.png
+only_build_toc_files: true
 
-# Force re-execution of notebooks on each build.
+# Execute notebooks and cache their outputs; re-run only when code cells change.
 # See https://jupyterbook.org/content/execute.html
 execute:
-  execute_notebooks: force
+  execute_notebooks: cache
+  exclude_patterns:
+    - clay-v0-*.ipynb
 
 # Define the name of the latex output file for PDF builds
 latex:
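
Note on `exclude_patterns` in `docs/_config.yml`: the glob `clay-v0-*.ipynb` is meant to keep the three tutorial notebooks from being executed during the docs build. The sketch below shows which filenames such a glob selects, using Python's `fnmatch` module. It only illustrates the pattern; how Jupyter Book itself resolves the pattern against paths may differ, and the last notebook name is a made-up example for contrast.

```python
from fnmatch import fnmatchcase

# Pattern copied from the execute.exclude_patterns entry in the diff above.
EXCLUDE_PATTERNS = ["clay-v0-*.ipynb"]


def is_excluded(filename: str, patterns=EXCLUDE_PATTERNS) -> bool:
    """Return True if `filename` matches any exclusion glob (case-sensitive)."""
    return any(fnmatchcase(filename, pattern) for pattern in patterns)


notebooks = [
    "clay-v0-reconstruction.ipynb",
    "clay-v0-location-embeddings.ipynb",
    "clay-v0-interpolation.ipynb",
    "another-notebook.ipynb",  # hypothetical name, not in the repository
]

# Only notebooks that do not match the pattern remain candidates for execution.
to_execute = [nb for nb in notebooks if not is_excluded(nb)]
```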

diff --git a/docs/_toc.yml b/docs/_toc.yml
@@ -2,12 +2,20 @@
 # Learn more at https://jupyterbook.org/customize/toc.html
 
 format: jb-book
-root: intro
+root: index
 parts:
+- caption: Release notes
+  chapters:
+    - title: Software release notes
+      file: changelog
+    - title: Model release notes
+      file: specification
 - caption: Getting Started
   chapters:
     - title: Installation
       file: installation
+    - title: Basic Use
+      file: basic_use
 - caption: Data Preparation
   chapters:
     - title: Creating datacubes
@@ -22,13 +30,19 @@ parts:
       file: model_embeddings
     - title: Finetuning
       file: model_finetuning
-- caption: Reference Documentation
+- caption: Tutorials
   chapters:
-    - title: Changelog
-      file: changelog
+    - title: Generative AI for pixel reconstruction
+      file: clay-v0-reconstruction
+    - title: Create location embeddings
+      file: clay-v0-location-embeddings
+    - title: Interpolating images in embedding space
+      file: clay-v0-interpolation
 - caption: About Clay
   chapters:
     - title: GitHub
       url: https://github.com/Clay-foundation
     - title: LinkedIn
       url: https://www.linkedin.com/company/made-with-clay
+    - title: Website
+      url: https://madewithclay.org
diff --git a/docs/basic_use.md b/docs/basic_use.md
@@ -0,0 +1,42 @@
+# Basic Use
+
+### Running jupyter lab
+
+    mamba activate claymodel
+    python -m ipykernel install --user --name claymodel  # register the env as a Jupyter kernel
+    jupyter kernelspec list --json                       # check that the kernel is installed
+    jupyter lab &
+
+
+### Running the model
+
+The neural network model can be run via
+[LightningCLI v2](https://pytorch-lightning.medium.com/introducing-lightningcli-v2supercharge-your-training-c070d43c7dd6).
+To check out the different options available, and look at the hyperparameter
+configurations, run:
+
+    python trainer.py --help
+    python trainer.py test --print_config
+
+To quickly test the model on one batch in the validation set:
+
+    python trainer.py validate --trainer.fast_dev_run=True
+
+To train the model for a hundred epochs:
+
+    python trainer.py fit --trainer.max_epochs=100
+
+To generate embeddings from the pretrained model's encoder on 1024 images
+(stored as a GeoParquet file with spatiotemporal metadata):
+
+    python trainer.py predict --ckpt_path=checkpoints/last.ckpt \
+                              --data.batch_size=1024 \
+                              --data.data_dir=s3://clay-tiles-02 \
+                              --trainer.limit_predict_batches=1
+
+More options can be found using `python trainer.py fit --help`, or at the
+[LightningCLI docs](https://lightning.ai/docs/pytorch/2.1.0/cli/lightning_cli.html).
+
+## Advanced
+
+See the [README](https://github.com/Clay-foundation/model/blob/v0.0.1/README.md) at the repository root for more details.
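
Note on the `predict` invocation in `docs/basic_use.md`: when the command needs to be launched from a script with different checkpoints or batch sizes, it can help to assemble the argument list in Python. The sketch below is one possible way to do that. The function name `build_predict_command` is hypothetical; the flags and default values are copied from the command shown in the doc, and nothing here inspects `trainer.py` itself.

```python
import shlex


def build_predict_command(
    ckpt_path: str = "checkpoints/last.ckpt",
    batch_size: int = 1024,
    data_dir: str = "s3://clay-tiles-02",
    limit_predict_batches: int = 1,
) -> list[str]:
    """Assemble the `trainer.py predict` command as an argument list.

    Hypothetical helper; flags mirror the example in docs/basic_use.md.
    """
    return [
        "python", "trainer.py", "predict",
        f"--ckpt_path={ckpt_path}",
        f"--data.batch_size={batch_size}",
        f"--data.data_dir={data_dir}",
        f"--trainer.limit_predict_batches={limit_predict_batches}",
    ]


command = build_predict_command()
# Print a shell-quoted version for inspection before running anything.
print(shlex.join(command))

# To actually run it from the repository root (assumes the environment is set up):
#   import subprocess
#   subprocess.run(command, check=True)
```

Keeping the arguments as a list avoids shell quoting problems when paths contain spaces.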
diff --git a/docs/changelog.md b/docs/changelog.md
@@ -1,6 +1,11 @@
-# Changelog
+(software_release)=
+# Code Model release v0.0.1
+
+This changelog is a summary of the changes to the source code of the Clay model.
+Released on 2024/01/12.
+
+> For release notes for the trained model, see [](model_release)
 
-## Release v0.0.1 (2024/01/12)
 
 ### 💫 Highlights