Convert RNN generation to TF v2. #1978
Conversation
TF v2 models save only two files, not three.
If no checkpoint has been saved yet, validation and data generation will fail, since both load the saved weights into the rebatched model. A better design would be to do validation, generation, and all the TensorBoard work in callbacks, but that first requires converting to a dataset representation and doing automatic batching.
Fixes #1540
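The callback design mentioned above could be sketched roughly as follows. This is a minimal illustration only, not code from this PR: `GenerationCallback` and `generate_fn` are hypothetical names, and the real validation/generation logic would replace the placeholder.

```python
import tensorflow as tf


class GenerationCallback(tf.keras.callbacks.Callback):
    """Hypothetical callback that runs sample generation after training epochs,
    instead of interleaving generation with a hand-written training loop."""

    def __init__(self, generate_fn, every_n_epochs=1):
        super().__init__()
        self.generate_fn = generate_fn  # user-supplied generation routine (assumed)
        self.every_n_epochs = every_n_epochs

    def on_epoch_end(self, epoch, logs=None):
        # Generation only runs on a model that has completed at least one
        # epoch, avoiding the "no checkpoint saved yet" failure described above.
        if (epoch + 1) % self.every_n_epochs == 0:
            self.generate_fn(self.model, epoch)
```

Such a callback would be passed to `model.fit(..., callbacks=[GenerationCallback(...)])` alongside the stock `tf.keras.callbacks.TensorBoard` callback, which is the direction the discussion below points toward.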
/gcbrun
This is crazy and LGTM! Let's see what makes the CI unhappy
/gcbrun
I think the pylint issue on Travis can be fixed by pylint-dev/pylint#1542 (comment). I cannot see the GCP build log, so I don't know what failed there. Should I instead try to convert the code?
I disabled it inline; hopefully this will fix the Travis build.
OK, Travis passes now. Should we try another /gcbrun?
/gcbrun
Still fails :( But I cannot see the log.
Using
Thanks. I'll try to replicate locally; it's possible this was caused by the change in model format.
/gcbrun
/gcbrun
Awesome!!! Thanks @inferno-chromium for helping to debug!
We can merge this today, but if anyone deploys, I won't be able to monitor the errors as I'm OOO today.
I'm OK with waiting :)
TF v2 offers ways to handle validation and TensorBoard through model callbacks. We can also do the batching using `tf.data` instead of writing the loop manually. I'm trying it this way now to avoid making too many changes at once, but I will probably send another PR later to convert to the recommended way.
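The `tf.data` batching mentioned above could look something like this. A minimal sketch, assuming the training corpus can be represented as an in-memory array (the `sequences` array and batch size are placeholders, not values from this PR):

```python
import numpy as np
import tensorflow as tf

# Placeholder for the actual training corpus; assumed to be a flat token array.
sequences = np.arange(1000, dtype=np.int64)

# tf.data handles batching and pipelining, replacing the manual rebatching loop.
dataset = (
    tf.data.Dataset.from_tensor_slices(sequences)
    .batch(32, drop_remainder=True)   # automatic, fixed-size batches
    .prefetch(tf.data.AUTOTUNE)       # overlap preprocessing with training
)

first_batch = next(iter(dataset))
```

A dataset built this way can be passed directly to `model.fit(dataset, ...)`, which is what makes the callback-based validation and TensorBoard handling described above possible.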