fix last step is not executed #236
If we have 730 steps, the DLIO benchmark only executes up to step 729. The bug also persists when the user specifies `total_training_steps`.
Looks good
If `total_training_steps` is not specified, it defaults to -1, so we need to check whether it is > 0.
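A minimal sketch of the check described above, not the actual DLIO code: `resolve_max_steps` and `run_epoch` are hypothetical helpers, and it assumes a non-positive `total_training_steps` means "no limit". The loop breaks only after the final step has run, so 730 steps execute as 1 through 730.

```python
def resolve_max_steps(steps_per_epoch: int, total_training_steps: int = -1) -> int:
    """Return how many steps to run (hypothetical helper, not DLIO's API)."""
    # Assumption: -1 (or any non-positive value) means the user set no limit.
    if total_training_steps > 0:
        return min(steps_per_epoch, total_training_steps)
    return steps_per_epoch


def run_epoch(batches, max_steps: int) -> list:
    """Run steps 1..max_steps inclusive and return the executed step numbers."""
    executed = []
    for step, _batch in enumerate(batches, start=1):
        executed.append(step)  # stand-in for the real compute and I/O work
        if step >= max_steps:  # break only after the last step has run
            break
    return executed


steps_default = run_epoch(range(730), resolve_max_steps(730))
steps_limited = run_epoch(range(730), resolve_max_steps(730, total_training_steps=100))
```

With the default of -1 the last executed step is 730; with a limit of 100 the last executed step is 100.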
There is one bug. if |
The CI is failing. Can you run pytest and check?
I think the last commit fixed the CI @hariharan-devarajan
The changes look good.
- We make sure we complete the step correctly.
- For DALI, we need to handle the stop iteration gracefully.
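One way to picture the graceful stop handling mentioned above, as a sketch only: it uses a plain Python iterator in place of a DALI loader, and `run_steps` is a hypothetical helper rather than code from this PR. If the loader runs out before `max_steps`, the loop ends cleanly instead of raising.

```python
def run_steps(loader, max_steps: int) -> int:
    """Consume up to max_steps batches; return how many steps completed."""
    iterator = iter(loader)
    completed = 0
    while completed < max_steps:
        try:
            _batch = next(iterator)
        except StopIteration:
            # The loader ran out of data early: stop without raising.
            break
        completed += 1  # counted only after the batch was actually fetched
    return completed


short_run = run_steps(range(5), max_steps=10)    # loader exhausted first
full_run = run_steps(range(730), max_steps=730)  # every step executed
```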
@zhenghh04 This is ready for merge as well.
Fix: #235