[ASK] xDeepFM - Help on saving model checkpoint to Azure ML output directory #970
Comments
Maybe @eedeleon can help here?
@miguelgfierro: I have a solution to this problem. It involves adding a "/" to the save_path string as shown below: This code is located in base_model.py. I have validated that this works on the Azure Machine Learning service. I'm in the process of validating that the change works on local notebooks as well. Will you accept a pull request for this?
@miguelgfierro: Just finished testing this change on the local xDeepFM notebook in 00_quick_start: contents of the local folder on my C: drive -
Thanks @elogicaadith, I added a comment. Can you please take a look?
…ng os.path.join
Description
I've taken the xDeepFM deep dive notebook and adapted it so that it can run in Azure Machine Learning Service. I would like Azure to capture the model checkpoints and associated files so that I can download the best run and visualize training in TensorBoard, as well as restore the model at a later point in time. Currently, I do not see any of the model files captured (AML Service needs these to be in the outputs directory).
It appears that MODEL_DIR is used in a concatenation above. Should MODEL_DIR be passed in the form of a string such as './outputs' or as an os.path.join type construct?
When I tried the above on my local machine, the summaries were placed in the summaries directory under the outputs directory, as I would expect. However, the model files were placed directly in the outputs directory, with "model" prefixed to their file names.
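A "model" prefix on files sitting directly in the outputs directory is what plain string concatenation produces when the directory string has no trailing separator. The sketch below is only an illustration of that effect, not the code in base_model.py: the function names and the "epoch_" file-name pattern are assumptions made for the example.

```python
import os


def concat_path(model_dir: str, epoch: int) -> str:
    # Plain concatenation: only correct if model_dir already ends
    # with a path separator. (Hypothetical naming pattern.)
    return model_dir + "epoch_" + str(epoch)


def joined_path(model_dir: str, epoch: int) -> str:
    # os.path.join adds the separator when it is missing and does
    # not double it when it is already there.
    return os.path.join(model_dir, "epoch_" + str(epoch))


# Directory string without a trailing separator.
model_dir = os.path.join(".", "outputs", "model")

print(concat_path(model_dir, 1))           # file lands in outputs, name starts with "model"
print(joined_path(model_dir, 1))           # file lands inside outputs/model
print(joined_path(model_dir + os.sep, 1))  # same result with a trailing separator
```

Either appending a separator to MODEL_DIR or building the path with os.path.join avoids the fused name; os.path.join works whether or not the caller remembered the trailing separator.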
When I run this in Azure ML Service, summary files and model files are not available for download. My hunch is that the relative directory must be off.
Any tips on how to set up MODEL_DIR correctly in order to get the files placed in the outputs directory and how to set this up for running in Azure ML Service would be welcome.
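One way to rule out a relative-directory problem is to create the directories explicitly and print the absolute paths they resolve to at run time. This is a minimal sketch under assumptions: the helper name prepare_output_dirs and the "model" subdirectory are hypothetical, while "summaries" under "outputs" follows the layout described above. It does not call any Azure ML API; it only prepares paths relative to the working directory of the script.

```python
import os


def prepare_output_dirs(base_dir: str = "./outputs"):
    # Hypothetical helper: build both directories under base_dir
    # and create them if they do not exist yet.
    model_dir = os.path.join(base_dir, "model")          # assumed subdirectory name
    summaries_dir = os.path.join(base_dir, "summaries")
    for directory in (model_dir, summaries_dir):
        os.makedirs(directory, exist_ok=True)
    return model_dir, summaries_dir


MODEL_DIR, SUMMARIES_DIR = prepare_output_dirs()

# Printing the resolved paths shows where files will really be written,
# which helps when the working directory differs between local and remote runs.
print("working directory:", os.getcwd())
print("MODEL_DIR resolves to:", os.path.abspath(MODEL_DIR))
print("SUMMARIES_DIR resolves to:", os.path.abspath(SUMMARIES_DIR))
```

If the printed paths in the remote run's log do not sit under the run's outputs directory, the relative path is the likely culprit.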
Other Comments