Some algorithm templates in Auto3DSeg don't support validation skipping #7777

Closed
mingxin-zheng opened this issue May 16, 2024 · 0 comments · Fixed by #7778
Describe the bug

If a datalist contains 3 folds of data, it is debatable whether running a 4th fold should be allowed.

For example, suppose we split the data into 3 groups: #0, #1, and #2.
The 1st experiment holds out #0 for validation and trains on #1 and #2.
The 2nd experiment holds out #1 for validation and trains on #0 and #2.
The 3rd experiment holds out #2 for validation and trains on #0 and #1.
The question is whether a 4th fold should be allowed, one that holds out nothing and trains on #0, #1, and #2.
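
The splits described above can be written out explicitly. The following is a minimal sketch in plain Python, not Auto3DSeg code; `fold_splits` and its `allow_no_validation` flag are hypothetical names used only for illustration.

```python
def fold_splits(num_groups, allow_no_validation=False):
    """Return (validation_group, training_groups) pairs for k-fold cross-validation."""
    groups = list(range(num_groups))
    # One experiment per group: hold that group out, train on the rest.
    splits = [(held_out, [g for g in groups if g != held_out]) for held_out in groups]
    if allow_no_validation:
        # Extra experiment: nothing is held out, so every group is used for training.
        splits.append((None, groups))
    return splits


for val, train in fold_splits(3, allow_no_validation=True):
    print(f"validation={val}, training={train}")
```

With `allow_no_validation=True` the last pair has no validation group at all, which is the case the issue is asking about.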

The comment in the code says this is allowed:

Auto3DSeg allows no validation set, so the maximum fold number is max_fold + 1

In practice, however, it causes an error in the DiNTS template.

To Reproduce
Steps to reproduce the behavior:

  1. Create a datalist with 4 folds
  2. Run AutoRunner.
  3. Set `num_fold` to 5
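
For step 1, a datalist with 4 folds might be built as sketched below. This assumes the common layout of a `training` list whose items carry a `fold` key; the file names and the `make_datalist` helper are hypothetical and not taken from the issue.

```python
import json


def make_datalist(num_cases, num_folds):
    """Build a datalist dict whose training items are assigned round-robin to folds."""
    training = [
        {
            "image": f"case_{i:03d}_image.nii.gz",  # hypothetical file name
            "label": f"case_{i:03d}_label.nii.gz",  # hypothetical file name
            "fold": i % num_folds,
        }
        for i in range(num_cases)
    ]
    return {"training": training, "testing": []}


datalist = make_datalist(num_cases=8, num_folds=4)
folds = sorted({item["fold"] for item in datalist["training"]})
print(json.dumps(datalist["training"][0], indent=2))
print("folds present:", folds)
```

Here the fold indices run from 0 to 3, so asking for 5 folds requests one more experiment than there are groups to hold out.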

Expected behavior
Consistent behavior between the documentation and the algorithm results.

Additional context

16/May/2024:06:39:18,6511159,4cc2678c-1244-4d1b-ac3d-b47cfb7da171-dgx:
dints_2 - training ...:   0%|          | 0/1 [00:00<?, ?round/s]
16/May/2024:06:39:18,6511159,4cc2678c-1244-4d1b-ac3d-b47cfb7da171-dgx:
dints_2 - training ...: 100%|██████████| 1/1 [00:35<00:00, 35.25s/round]
16/May/2024:06:39:18,6511159,4cc2678c-1244-4d1b-ac3d-b47cfb7da171-dgx:
dints_2 - training ...: 100%|██████████| 1/1 [00:35<00:00, 35.25s/round]
16/May/2024:06:39:18,6511159,4cc2678c-1244-4d1b-ac3d-b47cfb7da171-dgx: dints_2 - validation at original spacing/resolution
16/May/2024:06:39:18,6511159,4cc2678c-1244-4d1b-ac3d-b47cfb7da171-dgx: 2024-05-16 06:32:56,886 - WARNING - dints_2 - training: finished
16/May/2024:06:39:18,6511159,4cc2678c-1244-4d1b-ac3d-b47cfb7da171-dgx: 2024-05-16 06:32:58,570 - INFO - The keys num_warmup_epochs cannot be found in the /shared/orgs/iasixjqzw1hj/users/9550eff3-7258-5c36-96a0-5f8d3b030ad8/jobs/4cc2678c-1244-4d1b-ac3d-b47cfb7da171/auto3dseg_v0.0.8/dints_3/configs/hyper_parameters.yaml for training. Skipped overriding key num_warmup_epochs.
16/May/2024:06:39:18,6511159,4cc2678c-1244-4d1b-ac3d-b47cfb7da171-dgx: 2024-05-16 06:32:58,571 - INFO - ['python', '/shared/orgs/iasixjqzw1hj/users/9550eff3-7258-5c36-96a0-5f8d3b030ad8/jobs/4cc2678c-1244-4d1b-ac3d-b47cfb7da171/auto3dseg_v0.0.8/dints_3/scripts/train.py', 'run', "--config_file='/shared/orgs/iasixjqzw1hj/users/9550eff3-7258-5c36-96a0-5f8d3b030ad8/jobs/4cc2678c-1244-4d1b-ac3d-b47cfb7da171/auto3dseg_v0.0.8/dints_3/configs/hyper_parameters.yaml,/shared/orgs/iasixjqzw1hj/users/9550eff3-7258-5c36-96a0-5f8d3b030ad8/jobs/4cc2678c-1244-4d1b-ac3d-b47cfb7da171/auto3dseg_v0.0.8/dints_3/configs/hyper_parameters_search.yaml,/shared/orgs/iasixjqzw1hj/users/9550eff3-7258-5c36-96a0-5f8d3b030ad8/jobs/4cc2678c-1244-4d1b-ac3d-b47cfb7da171/auto3dseg_v0.0.8/dints_3/configs/network.yaml,/shared/orgs/iasixjqzw1hj/users/9550eff3-7258-5c36-96a0-5f8d3b030ad8/jobs/4cc2678c-1244-4d1b-ac3d-b47cfb7da171/auto3dseg_v0.0.8/dints_3/configs/network_search.yaml,/shared/orgs/iasixjqzw1hj/users/9550eff3-7258-5c36-96a0-5f8d3b030ad8/jobs/4cc2678c-1244-4d1b-ac3d-b47cfb7da171/auto3dseg_v0.0.8/dints_3/configs/transforms_infer.yaml,/shared/orgs/iasixjqzw1hj/users/9550eff3-7258-5c36-96a0-5f8d3b030ad8/jobs/4cc2678c-1244-4d1b-ac3d-b47cfb7da171/auto3dseg_v0.0.8/dints_3/configs/transforms_train.yaml,/shared/orgs/iasixjqzw1hj/users/9550eff3-7258-5c36-96a0-5f8d3b030ad8/jobs/4cc2678c-1244-4d1b-ac3d-b47cfb7da171/auto3dseg_v0.0.8/dints_3/configs/transforms_validate.yaml'", '--training#num_epochs_per_validation=1', '--training#num_images_per_batch=2', '--training#num_epochs=1']
16/May/2024:06:39:18,6511159,4cc2678c-1244-4d1b-ac3d-b47cfb7da171-dgx: 2024/05/16 06:33:05 INFO mlflow.tracking.fluent: Experiment with name 'Auto3DSeg' does not exist. Creating a new experiment.
16/May/2024:06:39:18,6511159,4cc2678c-1244-4d1b-ac3d-b47cfb7da171-dgx:
dints_3 - training ...:   0%|          | 0/1 [00:00<?, ?round/s]
16/May/2024:06:39:18,6511159,4cc2678c-1244-4d1b-ac3d-b47cfb7da171-dgx:
dints_3 - training ...:   0%|          | 0/1 [00:43<?, ?round/s]
16/May/2024:06:39:18,6511159,4cc2678c-1244-4d1b-ac3d-b47cfb7da171-dgx: Traceback (most recent call last):
16/May/2024:06:39:18,6511159,4cc2678c-1244-4d1b-ac3d-b47cfb7da171-dgx:   File "/shared/orgs/iasixjqzw1hj/users/9550eff3-7258-5c36-96a0-5f8d3b030ad8/jobs/4cc2678c-1244-4d1b-ac3d-b47cfb7da171/auto3dseg_v0.0.8/dints_3/scripts/train.py", line 1002, in <module>
16/May/2024:06:39:18,6511159,4cc2678c-1244-4d1b-ac3d-b47cfb7da171-dgx:     fire.Fire()
16/May/2024:06:39:18,6511159,4cc2678c-1244-4d1b-ac3d-b47cfb7da171-dgx:   File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 143, in Fire
16/May/2024:06:39:18,6511159,4cc2678c-1244-4d1b-ac3d-b47cfb7da171-dgx:     component_trace = _Fire(component, args, parsed_flag_args, context, name)
16/May/2024:06:39:18,6511159,4cc2678c-1244-4d1b-ac3d-b47cfb7da171-dgx:   File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 477, in _Fire
16/May/2024:06:39:18,6511159,4cc2678c-1244-4d1b-ac3d-b47cfb7da171-dgx:     component, remaining_args = _CallAndUpdateTrace(
16/May/2024:06:39:18,6511159,4cc2678c-1244-4d1b-ac3d-b47cfb7da171-dgx:   File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 693, in _CallAndUpdateTrace
16/May/2024:06:39:18,6511159,4cc2678c-1244-4d1b-ac3d-b47cfb7da171-dgx:     component = fn(*varargs, **kwargs)
16/May/2024:06:39:18,6511159,4cc2678c-1244-4d1b-ac3d-b47cfb7da171-dgx:   File "/shared/orgs/iasixjqzw1hj/users/9550eff3-7258-5c36-96a0-5f8d3b030ad8/jobs/4cc2678c-1244-4d1b-ac3d-b47cfb7da171/auto3dseg_v0.0.8/dints_3/scripts/train.py", line 767, in run
16/May/2024:06:39:18,6511159,4cc2678c-1244-4d1b-ac3d-b47cfb7da171-dgx:     logger.debug(f"evaluation metric - class {_c + 1}: {metric[2 * _c] / metric[2 * _c + 1]}")
16/May/2024:06:39:18,6511159,4cc2678c-1244-4d1b-ac3d-b47cfb7da171-dgx: ZeroDivisionError: float division by zero
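
The failing line divides `metric[2 * _c]` by `metric[2 * _c + 1]`, which suggests a flat array of per-class (sum, count) pairs; that layout is an inference from the traceback, not something the issue states. Under that assumption, a guarded version of the division could look like the sketch below. `per_class_average` is a hypothetical helper, and this is not the fix that was merged.

```python
def per_class_average(metric, num_classes):
    """Average each class, assuming metric is [sum_0, count_0, sum_1, count_1, ...]."""
    averages = []
    for c in range(num_classes):
        total, count = metric[2 * c], metric[2 * c + 1]
        # With an empty validation set the count stays 0, so report None
        # instead of dividing by zero.
        averages.append(total / count if count > 0 else None)
    return averages


print(per_class_average([3.0, 2.0, 0.0, 0.0], num_classes=2))
```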
KumoLiu added a commit that referenced this issue May 17, 2024
Fixes #7777.

### Description

Lower the maximum `num_fold` allowed for user inputs in Auto3DSeg
AutoRunner

### Types of changes
- [x] Non-breaking change (fix or new feature that would not break
existing functionality).
- [x] In-line docstrings updated.

Signed-off-by: Mingxin Zheng <[email protected]>
Co-authored-by: YunLiu <[email protected]>
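
The change described in the commit above, lowering the maximum `num_fold` accepted from the user, could be sketched as an input check like the one below. This is an illustration only: `validate_num_fold` is a hypothetical helper, the exact limit enforced by AutoRunner is not stated in the commit message, and the sketch assumes the limit is the number of distinct folds in the datalist.

```python
def validate_num_fold(num_fold, datalist):
    """Reject fold counts that the datalist cannot support (hypothetical helper)."""
    folds_in_data = {item["fold"] for item in datalist["training"]}
    # Assumption: no extra "no validation" fold is permitted.
    max_allowed = len(folds_in_data)
    if not 1 <= num_fold <= max_allowed:
        raise ValueError(
            f"num_fold must be between 1 and {max_allowed}, got {num_fold}"
        )
    return num_fold


datalist = {"training": [{"fold": i % 4} for i in range(8)]}
print(validate_num_fold(4, datalist))
try:
    validate_num_fold(5, datalist)
except ValueError as err:
    print(err)
```

Failing early with a clear message avoids launching a training run that only crashes once validation is attempted.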