Not benefiting from checkpointing #297
Comments
Thanks for submitting the issue @mstfldmr. Do you have a simple example I can use to try and repro the issue? I can also try and repro this using our basic example, but it might be good to get closer to your current setup as well.
@owenvallis I'm sorry, I can't share the full code because it has some confidential pieces we developed. This was how I configured checkpointing:
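[The snippet itself was not captured in this copy of the page. Below is a minimal sketch of a typical `tf.keras.callbacks.ModelCheckpoint` setup, not the author's actual code; the model, data, filepath, and options are all assumptions.]

```python
import numpy as np
import tensorflow as tf

# Toy stand-in for the confidential model -- purely illustrative.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(8,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

x = np.random.rand(256, 8).astype("float32")
y = np.random.rand(256, 1).astype("float32")

# Hypothetical reconstruction of the checkpoint callback; the real
# arguments from the issue are unknown.
checkpoint_cb = tf.keras.callbacks.ModelCheckpoint(
    filepath="checkpoints/ckpt-{epoch:02d}",  # assumed path pattern
    save_weights_only=True,  # saves layer weights only, NOT optimizer state
    verbose=1,
)

model.fit(x, y, epochs=3, callbacks=[checkpoint_cb])
```

[Note that with `save_weights_only=True` the optimizer's slot variables (e.g. Adam's moment estimates) are not written to the checkpoint, which by itself can make the first resumed epochs look like training restarted.]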
and how I loaded a checkpoint back:
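[Again, the original snippet is missing; this is a plausible sketch of restoring the latest weights-only checkpoint, with the directory name assumed.]

```python
import tensorflow as tf

# Hypothetical reconstruction: find and load the most recent checkpoint.
latest = tf.train.latest_checkpoint("checkpoints")  # assumed directory
if latest is not None:
    model.load_weights(latest)  # `model` as built in the sketch above
```

[`model.load_weights` restores layer weights but not the optimizer's slot variables, so even a successful restore can show a loss bump when training resumes.]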
@owenvallis could you reproduce it?
Hi @mstfldmr, sorry for the delay here. I'll try and get to this this week.
Looking into this now. It also looks like there is a breaking change in 2.8 where they removed […]. Which optimizer were you using? Was it Adam?
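[For reference, one way to take the Keras optimizer weights API out of the picture is to checkpoint the model and optimizer together with `tf.train.Checkpoint`. This is a sketch under the same assumed toy model, not anything prescribed in this thread.]

```python
import tensorflow as tf

# Track model AND optimizer so slot variables (e.g. Adam/RAdam moments)
# are saved and restored along with the weights.
ckpt = tf.train.Checkpoint(model=model, optimizer=model.optimizer)
manager = tf.train.CheckpointManager(ckpt, directory="full_ckpts", max_to_keep=3)

manager.save()  # e.g. once per epoch, from a custom callback

# To resume later:
status = ckpt.restore(manager.latest_checkpoint)
status.assert_existing_objects_matched()  # sanity-check the restore
```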
@owenvallis yes, it was tfa.optimizers.RectifiedAdam.
Original issue (@mstfldmr):

Hello,
I save checkpoints with:
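[The code block that followed here did not survive in this copy of the page; per the comments above, it was the checkpoint-saving setup, reconstructed as a hedged sketch earlier in the thread.]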
After loading the latest checkpoint and continuing training, I would expect the loss value to be close to the loss value at the last checkpoint.
However, the loss does not continue from where it left off. It looks like training is simply starting from scratch and not benefiting from the checkpoints.
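[One quick check, assuming the toy sketches above: evaluate immediately after loading, before any new training steps. If the restore worked, this loss should be near the checkpointed loss; a near-initial value points at the weights rather than the optimizer state.]

```python
# Evaluate right after load_weights(), before calling fit() again.
loss_after_restore = model.evaluate(x, y, verbose=0)
print("loss after restore:", loss_after_restore)
```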