Multiple fixes on benchmark ensembling problems #6414
/build
/integration-test
Signed-off-by: heyufan1995 <[email protected]>
Hi @heyufan1995, the original behavior in the first version of the design was (somewhat) to look for "best_metrics"; this was then changed in the skip algo train PR:
FYI, the current PR doesn't pass the integration test: https://github.com/Project-MONAI/MONAI/actions/runs/4766813105/jobs/8474307100
@mingxin-zheng I checked the current MONAI dev branch: a "trained" algo should have an AlgoKeys.SCORE value in its algorithm pickle file; otherwise it is treated as untrained. I added logic here so that, if AlgoKeys.SCORE is not in the algorithm pickle (which is the case when training is done outside autorunner), algo.get_score is used to read the score from progress.yaml. If progress.yaml exists and contains a score, the algo is considered trained. So I don't think there is a conflict with skipping algo training. The risk is that an algo that trained for some epochs and recorded a validation score, but whose training then failed, will still be considered "trained". One way to handle this is to write a FINISH flag file directly from algo.train, rather than setting a score in the pickle file after training in autorunner._train_algo_in_sequence.
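To make the FINISH flag idea concrete, here is a minimal standalone sketch. It is not MONAI code: the marker file name `FINISHED`, the `progress.log` file, and the `train`, `mark_finished`, and `is_trained` helpers are all hypothetical stand-ins. The point is only that the marker is written as the last step of a successful run, so a run that logged scores and then crashed is not mistaken for a finished one.

```python
from pathlib import Path

FINISH_FLAG = "FINISHED"  # hypothetical marker file name, not a MONAI convention


def mark_finished(output_dir):
    """Create the marker file; meant to be the last step of a successful run."""
    path = Path(output_dir) / FINISH_FLAG
    path.touch()
    return path


def is_trained(output_dir):
    """An algo counts as trained only if the marker file exists."""
    return (Path(output_dir) / FINISH_FLAG).is_file()


def train(output_dir, epochs, fail_at=None):
    """Stand-in for algo.train: logs progress per epoch and may crash midway."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    for epoch in range(epochs):
        if fail_at is not None and epoch == fail_at:
            raise RuntimeError("simulated training failure")
        # Progress is recorded every epoch, so it can exist without a finished run.
        with open(out / "progress.log", "a") as f:
            f.write(f"epoch={epoch}\n")
    # Reached only if every epoch completed.
    mark_finished(out)
```

With this layout, a crashed run leaves `progress.log` behind but no marker, which is exactly the case the score-based check cannot distinguish.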
@wyli I looked at the test results, it says "
But in this PR I changed this line to "assert len(history) == 3" to avoid this assertion error.
Signed-off-by: monai-bot <[email protected]>
Do you suggest writing it to the progress.yaml file, @heyufan1995?
/build
Integration test: https://github.com/Project-MONAI/MONAI/actions/runs/4772828611
Fixes # .
Description
Fixed the problem with import_bundle_history for algos trained outside autorunner. When training happens outside autorunner, algo_object.pkl does not contain the score metadata, so import_bundle_history does not recognize the algo as trained. It now reads the score from progress.yaml instead.
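The fallback described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `SCORE_KEY` stands in for AlgoKeys.SCORE, `algo_meta` stands in for the metadata loaded from algo_object.pkl, `progress_entries` stands in for the already-parsed contents of progress.yaml, and the `val_score` key is a hypothetical schema.

```python
SCORE_KEY = "score"  # stand-in for AlgoKeys.SCORE (assumption)


def resolve_score(algo_meta, progress_entries):
    """Return the score from the pickle metadata, else the best logged score.

    algo_meta: dict standing in for the metadata stored in algo_object.pkl.
    progress_entries: list of dicts standing in for parsed progress.yaml;
    the "val_score" key is a hypothetical schema used for illustration.
    """
    if SCORE_KEY in algo_meta:
        return algo_meta[SCORE_KEY]
    scores = [e["val_score"] for e in progress_entries if "val_score" in e]
    return max(scores) if scores else None


def is_trained(algo_meta, progress_entries):
    """Treat the algo as trained if a score can be found in either place."""
    return resolve_score(algo_meta, progress_entries) is not None
```

An algo trained inside autorunner takes the first branch; one trained outside it is picked up through the progress entries, and one with neither is still reported as untrained.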
Fixed the OOM problem during ensembling: inference moves to the CPU if an OOM occurs. Prediction tensors are also no longer appended to a list and returned; each prediction is saved separately and its save path is returned.
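A rough, framework-free sketch of this pattern is shown below. It is not the PR's implementation: `infer_with_fallback` and `ensemble_to_disk` are hypothetical helpers, `infer` is any user-supplied callable, devices are plain strings, and the built-in `MemoryError` stands in for the out-of-memory error a GPU framework would raise. Predictions are pickled here purely for illustration.

```python
import pickle
from pathlib import Path


def infer_with_fallback(infer, item, devices=("cuda", "cpu"), oom_error=MemoryError):
    """Try each device in order and move on to the next one on an OOM error.

    oom_error defaults to MemoryError as a stand-in; a real pipeline would
    pass the exception type its framework raises on GPU out-of-memory.
    """
    if not devices:
        raise ValueError("at least one device is required")
    last_error = None
    for device in devices:
        try:
            return infer(item, device)
        except oom_error as err:
            last_error = err  # remember the failure and try the next device
    raise last_error


def ensemble_to_disk(items, infer, output_dir):
    """Save each prediction to its own file and return only the file paths."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for idx, item in enumerate(items):
        pred = infer_with_fallback(infer, item)
        path = out / f"pred_{idx}.pkl"
        with open(path, "wb") as f:
            pickle.dump(pred, f)
        paths.append(path)
        del pred  # drop the reference so predictions do not accumulate in memory
    return paths
```

Returning paths instead of a list of predictions keeps peak memory bounded by a single prediction, at the cost of one extra disk round trip per case.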
Types of changes
./runtests.sh -f -u --net --coverage
./runtests.sh --quick --unittests --disttests
make html command in the docs/ folder.