Feature/auto batch size find #426
Conversation
@@ -93,6 +100,10 @@ def train(cfg: DictConfig, data=None) -> Tuple[dict, dict]:
        log.info("Logging hyperparameters!")
        utils.log_hyperparameters(object_dict)

+       if use_batch_tuner:
This seems related to the new block starting on line 62. Could it move up there for readability? Could the two also be a single condition? They seem logically bound.
Good point. I have it split up because the data arguments need to be changed if the batch size tuner is used, but then the actual batch tuning requires an initialized datamodule. I can rearrange things to make them a little closer, but that might be purely aesthetic as they can't be fully grouped.
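For readers without the full diff, here is a rough sketch of the two-block arrangement being discussed, assuming a Hydra/Lightning-style `train()` like the one in the hunk above. The config path `cfg.data.batch_size`, the instantiation calls, and the use of Lightning's `Tuner.scale_batch_size` (Lightning >= 2.0 API) are assumptions for illustration, not the PR's exact code.

```python
import hydra
from lightning.pytorch.tuner import Tuner


def train(cfg, data=None):
    use_batch_tuner = cfg.get("use_batch_tuner", False)

    # Block 1 (early): the tuner should start from batch_size=1, so the data
    # arguments are patched *before* the datamodule is instantiated.
    if use_batch_tuner:
        cfg.data.batch_size = 1  # illustrative config path

    datamodule = hydra.utils.instantiate(cfg.data)
    model = hydra.utils.instantiate(cfg.model)
    trainer = hydra.utils.instantiate(cfg.trainer)

    ...

    # Block 2 (later): the actual tuning needs the initialized datamodule and
    # trainer, which is why the two blocks cannot be fully merged.
    if use_batch_tuner:
        Tuner(trainer).scale_batch_size(
            model, datamodule=datamodule, mode="power", init_val=1
        )
```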
I would weight my review low, not having much context.
Is a new kwarg added by this change? It would be good to see some unit test coverage for that, but only if a test suite exists already.
No new kwarg is added - just changing the default value of an existing kwarg.
Super useful! It's definitely annoying to try to find the largest batch size that fits in memory.
What does this PR do?
The auto batch size finder increases the batch size by powers of 2 (starting with batch_size=1) by setting the datamodule's `batch_size` attribute. This PR allows just the dataframe datamodule (I think the only one used by the plugin) to respond to its `batch_size` attribute being changed. In order to do this, we store the initial batch size (set to 1) and check for changes to the batch size to update the train dataloaders.
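A minimal sketch of the mechanism described above, assuming a LightningDataModule along the lines of the plugin's dataframe datamodule. Only `batch_size` is taken from the PR description; the class name, dataset argument, and helper attributes are illustrative.

```python
import lightning.pytorch as pl
from torch.utils.data import DataLoader


class DataframeDataModule(pl.LightningDataModule):
    """Illustrative datamodule that rebuilds its train loader when `batch_size` changes."""

    def __init__(self, train_dataset, batch_size: int = 1):
        super().__init__()
        self.train_dataset = train_dataset
        self.batch_size = batch_size      # the batch size finder mutates this attribute
        self._loader_batch_size = None    # batch size the cached loader was built with
        self._train_loader = None

    def train_dataloader(self) -> DataLoader:
        # Rebuild the loader whenever the tuner (or anything else) changed batch_size.
        if self._train_loader is None or self._loader_batch_size != self.batch_size:
            self._train_loader = DataLoader(
                self.train_dataset, batch_size=self.batch_size, shuffle=True
            )
            self._loader_batch_size = self.batch_size
        return self._train_loader
```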
Before submitting
- Did you test your PR locally with the `pytest` command?
- Did you run pre-commit hooks with the `pre-commit run -a` command?

Did you have fun?
Make sure you had fun coding 🙃