Need help in saving and loading model to resume training #2

BashCache · 2022-04-20T09:18:03Z

We have started training the model following the instructions given. We are using CUDA 9.0 and Python 3.6.8 with 1 GPU. Training for 5 ticks took nearly 45 minutes. We do not have the hardware to keep training running throughout the day, so we would like your help with the following:

1. How long will it take to train the model?
2. We are assuming that the network-snapshot pickle file stores our model weights for each tick. Is this assumption right? If not, could you please explain what network-snapshot saves?
3. We are planning to run 10 ticks at a time, store the model, and resume training from the subsequent tick. How do we load the model to continue training, and where in training_loop.py do we need to make changes?
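
To make the first and third questions concrete, here is a rough sketch of the bookkeeping we have in mind. It is only an illustration under our own assumptions, not code from the repository: the file-name pattern `network-snapshot-<number>.pkl` and the helper names `find_latest_snapshot` and `estimate_minutes` are hypothetical, and the time estimate assumes every tick takes as long as the ones we measured (5 ticks in 45 minutes), which may not hold over a full run.

```python
import re
from pathlib import Path
from typing import Optional, Tuple

# Assumed naming scheme for snapshot files; the repository may use a different one.
SNAPSHOT_RE = re.compile(r"network-snapshot-(\d+)\.pkl$")


def find_latest_snapshot(run_dir: str) -> Optional[Tuple[Path, int]]:
    """Return (path, counter) for the snapshot with the highest counter, or None."""
    best = None
    for path in Path(run_dir).glob("network-snapshot-*.pkl"):
        match = SNAPSHOT_RE.search(path.name)
        if match is None:
            continue  # skip files that do not carry a numeric counter
        counter = int(match.group(1))
        if best is None or counter > best[1]:
            best = (path, counter)
    return best


def estimate_minutes(total_ticks: int,
                     measured_ticks: int = 5,
                     measured_minutes: float = 45.0) -> float:
    """Linear extrapolation from the measured ticks; assumes constant time per tick."""
    return total_ticks * measured_minutes / measured_ticks
```

With these helpers, each session would look up the newest snapshot in the run directory, pass its path and counter to whatever resume mechanism training_loop.py offers, and train for 10 more ticks. At the measured rate, `estimate_minutes(10)` gives 90 minutes per session.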

It would be a great help to our project if you could shed light on the above points as soon as possible. We appreciate your time. Thank you! :)
