
Product backlog #1

Open
19 of 30 tasks
JLrumberger opened this issue Dec 11, 2023 · 1 comment
JLrumberger commented Dec 11, 2023

General

  • Prepare scaling plots by the end of February. Y-axis: the speedup we get when running one epoch through the model on 2, 4, 6, 8, and 10 GPUs
  • Find out how many samples we have in the plankton dataset (Lorenz) → 3,423,255 samples
  • Implement NaViT to enable training with different resolutions (no padding)
  • Test restarting training from a checkpoint (with or without optimizer state? see the sketch below this list)
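
A minimal sketch of what the checkpoint-restart test could look like; the toy model, optimizer, and path below are placeholders, not names from our codebase:

```python
import torch
from torch import nn, optim

# Toy stand-ins so the snippet runs on its own; swap in the real model/optimizer.
model = nn.Linear(4, 2)
optimizer = optim.SGD(model.parameters(), lr=0.1)
epoch, resume_optimizer = 3, True
CKPT_PATH = "last.pt"  # placeholder path

# Save model weights, optimizer state, and the epoch counter together.
torch.save(
    {"model": model.state_dict(), "optimizer": optimizer.state_dict(), "epoch": epoch},
    CKPT_PATH,
)

# Resume: restoring the optimizer state keeps momentum/Adam statistics intact;
# skipping it ("without optimizer state") restarts the optimizer from scratch.
ckpt = torch.load(CKPT_PATH, map_location="cpu")
model.load_state_dict(ckpt["model"])
if resume_optimizer:
    optimizer.load_state_dict(ckpt["optimizer"])
start_epoch = ckpt["epoch"] + 1
print(f"resuming at epoch {start_epoch}")
```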

Speed

  • Improve GPU utilization
  • Turn the benchmark notebook into a Python script and run the benchmark on the cluster (Lucas)
  • Improve data loading / prefetching / augmentation library / data format (LMDB)
  • If enough CPU power is available: move augmentations back to the CPU to save GPU memory
  • Try ffcv (Jerome)
  • Check whether we can change the crop size if we don't use pretrained models
  • Change the dataloader from a Python for-loop to a pandas dataframe concat
  • Benchmark DDP vs. FSDP (DDP is slightly faster, but not crucially so)
  • Investigate the slowdown with more data (i.e. 1/5 of the dataset runs faster than the full dataset)
  • Slice the input to use only one dimension (Nora)
  • Use the torch profiler (may need PyTorch 2.1 because of a bug with 2.0.0; see the sketch below this list)
  • Cache the dataset to RAM (Jerome)
  • Investigate the FSDP chunk number (Jerome)
  • Check whether non-power-of-two batch sizes slow down training (would be useful for optimizing GPU memory usage)
  • Merge Lorenz's pandas loop
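
For the torch profiler item, a sketch of profiling a few training steps to separate data-loading from compute time; the model, optimizer, and dataloader below are toy stand-ins for the real training objects:

```python
import torch
from torch import nn, optim
from torch.profiler import ProfilerActivity, profile, schedule
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-ins; replace with the real model, optimizer, and plankton dataloader.
model = nn.Linear(32, 10)
optimizer = optim.SGD(model.parameters(), lr=0.1)
dataloader = DataLoader(TensorDataset(torch.randn(256, 32)), batch_size=32)

activities = [ProfilerActivity.CPU]
if torch.cuda.is_available():
    activities.append(ProfilerActivity.CUDA)

# Skip one step, warm up for one, then record three steps.
with profile(activities=activities,
             schedule=schedule(wait=1, warmup=1, active=3),
             record_shapes=True) as prof:
    for step, (batch,) in enumerate(dataloader):
        loss = model(batch).mean()
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        prof.step()
        if step >= 5:
            break

print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=10))
```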

Evaluation

  • Train a fully supervised model (ViT, CNN) to get an upper bound
  • Evaluate (k-NN) using checkpoints from models trained on plankton data vs. ImageNet-pretrained models (see the sketch below this list)
  • Linear evaluation (Jerome)
  • Add a confusion matrix (Nora)
  • Look at wrongly classified images (Nora)
  • Clustering of embeddings (look at shapes; use https://biigle.de/ ?)
  • Visualization of attention maps / intermediate features (Lucas is interested in looking at that)
  • Unsupervised clustering algorithm
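
For the k-NN evaluation and confusion-matrix items, a sketch that assumes frozen-backbone embeddings have already been extracted; the random arrays below stand in for those features:

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.neighbors import KNeighborsClassifier

# Placeholder embeddings; in practice these would come from a frozen backbone
# (plankton-pretrained vs. ImageNet-pretrained) run over the train/val splits.
rng = np.random.default_rng(0)
train_feats, train_labels = rng.normal(size=(1000, 384)), rng.integers(0, 5, 1000)
val_feats, val_labels = rng.normal(size=(200, 384)), rng.integers(0, 5, 200)

knn = KNeighborsClassifier(n_neighbors=20, metric="cosine")
knn.fit(train_feats, train_labels)
preds = knn.predict(val_feats)

print("k-NN accuracy:", accuracy_score(val_labels, preds))
print(confusion_matrix(val_labels, preds))  # rows: true class, columns: predicted
```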

Misc

  • Try a masked autoencoder instead of DINOv2 and compare its performance against DINOv2
  • Add NaViT rescaling of the positional embedding (PE) and masking of padded tokens, or only PE rescaling in the style of timm (see the sketch below this list)
  • Add Ray hyperparameter tuning
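
For the PE-rescaling part of the NaViT item, a sketch of the usual interpolate-the-positional-embedding trick; the function name and defaults here are assumptions, not timm's API:

```python
import torch
import torch.nn.functional as F

def rescale_pos_embed(pos_embed, new_grid, num_prefix_tokens=1):
    """Bicubically resample a ViT positional embedding to a new patch grid.

    pos_embed has shape (1, num_prefix_tokens + H*W, dim); prefix tokens
    (e.g. the CLS token) are passed through unchanged.
    """
    prefix, grid = pos_embed[:, :num_prefix_tokens], pos_embed[:, num_prefix_tokens:]
    old_size = int(grid.shape[1] ** 0.5)
    dim = grid.shape[-1]
    grid = grid.reshape(1, old_size, old_size, dim).permute(0, 3, 1, 2)
    grid = F.interpolate(grid, size=new_grid, mode="bicubic", align_corners=False)
    grid = grid.permute(0, 2, 3, 1).reshape(1, new_grid[0] * new_grid[1], dim)
    return torch.cat([prefix, grid], dim=1)

# e.g. go from a 14x14 grid (224 px / patch 16) to 20x12 for a non-square input
pe = torch.randn(1, 1 + 14 * 14, 384)
print(rescale_pos_embed(pe, (20, 12)).shape)  # torch.Size([1, 241, 384])
```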

NoraKrbr commented Dec 11, 2023

Tasks

We should try to make only minimal code changes!

First Group

  • Dataset pre-processing: create NumPy files, make all images the same size, and fill with padding (@JLrumberger)
  • Build a Dataset class for WHOI (kwargs should include an option to provide multiple paths)
  • Adapt the make_dataset function (in loaders)
  • Adapt where make_dataset is called in train.py (e.g. if it's not a string, concatenate the datasets; see the sketch below this list)
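
A sketch of how make_dataset could accept either a single path or a list of paths and concatenate the per-path datasets; WHOIDataset below is a placeholder stub, not our real class:

```python
from pathlib import Path
from torch.utils.data import ConcatDataset, Dataset

class WHOIDataset(Dataset):
    """Placeholder dataset; the real implementation would load and transform images."""
    def __init__(self, root, **kwargs):
        self.files = sorted(Path(root).glob("*.npy"))
    def __len__(self):
        return len(self.files)
    def __getitem__(self, idx):
        return self.files[idx]

def make_dataset(paths, **kwargs) -> Dataset:
    # Accept a single path (string) or a list of paths; in the latter case the
    # per-path datasets are concatenated so train.py still gets one object.
    if isinstance(paths, str):
        return WHOIDataset(paths, **kwargs)
    return ConcatDataset([WHOIDataset(p, **kwargs) for p in paths])
```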

Second Group

Other
